Showing 1 - 10 of 48 results for search: '"Bear, Helen L."'
Sound scene geotagging is a new topic of research which has evolved from acoustic scene classification. It is motivated by the idea of audio surveillance. Not content with only describing a scene in a recording, a machine which can locate where the r
External link:
http://arxiv.org/abs/2110.04585
Author:
Heise, David; Bear, Helen L.
We analyse multi-purpose audio using tools to visualise similarities within the data that may be observed via unsupervised methods. The success of machine learning classifiers is affected by the information contained within system inputs, so we inves
External link:
http://arxiv.org/abs/2110.04584
In this paper we investigate the importance of the extent of memory in sequential self attention for sound recognition. We propose to use a memory controlled sequential self attention mechanism on top of a convolutional recurrent neural network (CRNN
External link:
http://arxiv.org/abs/2005.06650
Polyphonic Sound Event Detection (SED) in real-world recordings is a challenging task because of the dynamic polyphony level, intensity, and duration of sound events. Current polyphonic SED systems fail to model the temporal structure of sound events
External link:
http://arxiv.org/abs/1907.05122
The majority of sound scene analysis work focuses on one of two clearly defined tasks: acoustic scene classification or sound event detection. Whilst this separation of tasks is useful for problem definition, they inherently ignore some subtleties of
External link:
http://arxiv.org/abs/1905.00979
Acoustic Scene Classification (ASC) and Sound Event Detection (SED) are two separate tasks in the field of computational sound scene analysis. In this work, we present a new dataset with both sound scene and sound event labels and use this to demonst
External link:
http://arxiv.org/abs/1904.10408
Author:
Bear, Helen L.
This thesis is about improving machine lip-reading, that is, the classification of speech from only visual cues of a speaker. Machine lip-reading is a niche research problem in both areas of speech processing and computer vision. Current challenges f
External link:
http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.687930
Author:
Nolasco, Inês; Terenzi, Alessandro; Cecchi, Stefania; Orcioni, Simone; Bear, Helen L.; Benetos, Emmanouil
The absence of the queen in a beehive is a very strong indicator of the need for beekeeper intervention. Manually searching for the queen is an arduous recurrent task for beekeepers that disrupts the normal life cycle of the beehive and can be a sour
External link:
http://arxiv.org/abs/1811.06330
Lipreading is a difficult gesture classification task. One problem in computer lipreading is speaker-independence. Speaker-independence means to achieve the same accuracy on test speakers not included in the training set as speakers within the traini
External link:
http://arxiv.org/abs/1810.10597
Author:
Bear, Helen L.; Benetos, Emmanouil
We present a new extensible and divisible taxonomy for open set sound scene analysis. This new model allows complex scene analysis with tangible descriptors and perception labels. Its novel structure is a cluster graph such that each cluster (or subs
External link:
http://arxiv.org/abs/1809.10047