Biologically inspired binaural sound source localization and tracking for mobile robots

Author: Calmes, Laurent
Contributors: Lakemeyer, Gerhard
Language: English
Year of publication: 2009
Subject:
Source: Aachen : Publikationsserver der RWTH Aachen University XII, 129 S. : Ill., graph. Darst. (2009). = Aachen, Techn. Hochsch., Diss., 2009
Description: This thesis proposes biologically inspired methods of binaural sound source localization for mobile robots. We also propose a method for modulating a robot's attention, inspired by the barn owl, and finally a tracking system that enables a robot to track sound-emitting objects. Regarding sound source localization, the best understood and most thoroughly evaluated method is the one based on interaural time differences (ITDs). There is a simple reason for this. Interaural time differences are influenced mainly by the inter-microphone distance, provided there is no major obstruction between the microphones. Such an obstruction would make the sound waves bend around the structure and thus increase path length and ITD in a frequency-specific manner. With no obstruction between the microphones and under the far-field assumption, the interaural time difference relates to azimuth through a simple equation that additionally requires only the inter-microphone distance (constant) and the speed of sound (which can be regarded as constant). Under these conditions, it is easy to adapt ITD localization to different hardware platforms. The method we use for ITD-based localization relies on detecting phase coincidence for individual frequencies in the frequency domain, followed by integration across frequencies to eliminate phase ambiguities. Overall, the results are excellent. Broadband signals can be localized with an accuracy of ±2°. Localization of pure tones is erratic, as was to be expected. The only unexpected behavior was a low accuracy in localizing 100 Hz – 1 kHz bandpass noise. Simulations in which the room acoustics could be controlled showed that this is caused by sound reflections from the environment. In larger rooms or, equivalently, rooms with a lower direct-to-reverberant ratio, localization precision of broadband signals also degrades significantly, which becomes evident in experiments on a real robot.
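The ITD ideas described above can be illustrated with a minimal sketch. It assumes the standard far-field relation ITD = d · sin(azimuth) / c and a simple cosine coincidence score summed over frequencies; neither is confirmed here as the exact formulation used in the thesis. All function names, the 343 m/s speed of sound, and the 1° search grid are hypothetical choices made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed constant (illustrative value)


def itd_from_azimuth(azimuth_deg, mic_distance, c=SPEED_OF_SOUND):
    """Far-field ITD in seconds for a source at the given azimuth."""
    return mic_distance * math.sin(math.radians(azimuth_deg)) / c


def azimuth_from_itd(itd, mic_distance, c=SPEED_OF_SOUND):
    """Invert the far-field relation; the ratio is clamped to [-1, 1]."""
    ratio = max(-1.0, min(1.0, itd * c / mic_distance))
    return math.degrees(math.asin(ratio))


def wrap_phase(phi):
    """Wrap a phase angle into the interval [-pi, pi]."""
    return math.atan2(math.sin(phi), math.cos(phi))


def localize_by_phase_coincidence(freqs, ipds, mic_distance, step_deg=1.0):
    """Return the candidate azimuth (degrees) whose predicted interaural
    phase differences best coincide with the measured ones.

    freqs: frequencies in Hz; ipds: measured interaural phase differences
    in radians (wrapped). Each frequency alone is ambiguous; summing the
    coincidence scores over frequencies favours the one ITD on which
    all frequencies agree.
    """
    best_az, best_score = 0.0, -math.inf
    steps = int(round(180.0 / step_deg))
    for i in range(steps + 1):
        az = -90.0 + i * step_deg
        tau = itd_from_azimuth(az, mic_distance)
        # Cosine of the phase mismatch is 1 when prediction and
        # measurement coincide (modulo 2*pi) at that frequency.
        score = sum(
            math.cos(ipd - 2.0 * math.pi * f * tau)
            for f, ipd in zip(freqs, ipds)
        )
        if score > best_score:
            best_az, best_score = az, score
    return best_az


# Synthetic example: a broadband source at 30 degrees, microphones 0.2 m apart.
true_tau = itd_from_azimuth(30.0, 0.2)
test_freqs = [500.0, 1500.0, 2500.0, 4000.0, 6000.0]
test_ipds = [wrap_phase(2.0 * math.pi * f * true_tau) for f in test_freqs]
estimate = localize_by_phase_coincidence(test_freqs, test_ipds, 0.2)
```

With a single high frequency, several candidate ITDs within the physically possible range can produce the same wrapped phase, which is consistent with the erratic pure-tone localization reported above; combining several frequencies removes that ambiguity in this idealized, reflection-free setting.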
All in all, care has to be taken with the acoustic environment in which ITD-based localization is deployed in order to achieve the best performance. Sound source localization based on interaural level differences (ILDs) relies on the acoustical properties of the microphone mount assembly and supporting structures. This means that adapting ILD localization to a new platform is more difficult. It requires mounting the microphones and then calibrating the whole setup to record the resulting azimuth-, elevation- and frequency-dependent ILD values, which can then be used by the sound source localization algorithm. This is an elaborate, time-consuming procedure that has to be repeated every time something changes in the way the microphones are mounted, or if the microphones themselves are changed. Experiments with artificial owl ruffs illustrate this: even small changes in the ruff can have a huge impact on the ILDs (and, to a lesser degree, on the ITDs). The method for ILD-based sound source localization relies on a neuronal model of the barn owl's auditory intensity pathway. Specifically, the neuronal responses in the VLVp and the ICc ls, as well as the connections between these areas, are modeled. The results of the experiments with the algorithm are encouraging. First tests showed that the system was able to accurately localize broadband sound sources in the range of -30° to +30°. More elaborate experiments with artificial ruffs confirmed these results. Furthermore, with the correct acoustic design of the artificial ruff, it is possible to use the ILDs for various purposes, for example localization in elevation and/or verification and correction of the ITD-based azimuth estimates. With the attentional module, which is based on a neuronal saliency map, it is possible to pre-activate a robot's attention for a specific region of interest. Using a robotic pan-tilt unit, we successfully reproduced attentional latency experiments that had been performed with barn owls.
The system we propose can, however, easily be generalized to modulate the robot's attention (in several instances) at various levels, from the basic sensor level up to the planning level. The combined sound source and dynamic object tracking, based on Markov chain Monte Carlo data association (MCMCDA), had some problems accurately tracking simulated entities. Although the general viability of the method could be shown, the algorithm still has several shortcomings. MCMCDA with a virtual sensor is able to correctly track sound sources alone and objects alone, but combining both modalities in one track proved to be difficult. As long as individual entities are in clearly distinct positions, correct tracks are produced, but if they approach each other or, even worse, cross paths, tracking breaks down. This seems to be caused mainly by the lack of distance information in the sound source localization modality. As long as these shortcomings are not addressed, it makes little sense to test the method on a real robot. This is why the MCMCDA experiments in this thesis were limited to simulations.
Database: OpenAIRE