Search results - "KASHINO, Kunio"

Report

Exploring Pre-trained General-purpose Audio Representations for Heart Murmur Detection

Authors: Niizumi, Daisuke; Takeuchi, Daiki; Ohishi, Yasunori; Harada, Noboru; Kashino, Kunio

To reduce the need for skilled clinicians in heart sound interpretation, recent studies on automating cardiac auscultation have explored deep learning approaches. However, despite the demands for large data for deep learning, the size of the heart so

External link: http://arxiv.org/abs/2404.17107

Report

Masked Modeling Duo: Towards a Universal Audio Pre-training Framework

Authors: Niizumi, Daisuke; Takeuchi, Daiki; Ohishi, Yasunori; Harada, Noboru; Kashino, Kunio

Self-supervised learning (SSL) using masked prediction has made great strides in general-purpose audio representation. This study proposes Masked Modeling Duo (M2D), an improved masked prediction SSL, which learns by predicting representations of mas

External link: http://arxiv.org/abs/2404.06095

Report

Deep Attentive Time Warping

Authors: Matsuo, Shinnosuke; Wu, Xiaomeng; Atarsaikhan, Gantugs; Kimura, Akisato; Kashino, Kunio; Iwana, Brian Kenji; Uchida, Seiichi

Similarity measures for time series are important problems for time series classification. To handle the nonlinear time distortions, Dynamic Time Warping (DTW) has been widely used. However, DTW is not learnable and suffers from a trade-off between r

External link: http://arxiv.org/abs/2309.06720

Report

Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement

Authors: Takeuchi, Daiki; Ohishi, Yasunori; Niizumi, Daisuke; Harada, Noboru; Kashino, Kunio

We proposed Audio Difference Captioning (ADC) as a new extension task of audio captioning for describing the semantic differences between input pairs of similar but slightly different audio clips. The ADC solves the problem that conventional audio ca

External link: http://arxiv.org/abs/2308.11923

Report

Masked Modeling Duo for Speech: Specializing General-Purpose Audio Representation to Speech using Denoising Distillation

Authors: Niizumi, Daisuke; Takeuchi, Daiki; Ohishi, Yasunori; Harada, Noboru; Kashino, Kunio

Self-supervised learning general-purpose audio representations have demonstrated high performance in a variety of tasks. Although they can be optimized for application by fine-tuning, even higher performance can be expected if they can be specialized

External link: http://arxiv.org/abs/2305.14079

Report

Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input

Authors: Niizumi, Daisuke; Takeuchi, Daiki; Ohishi, Yasunori; Harada, Noboru; Kashino, Kunio

Masked Autoencoders is a simple yet powerful self-supervised learning method. However, it learns representations indirectly by reconstructing masked input patches. Several methods learn representations directly by predicting representations of masked

External link: http://arxiv.org/abs/2210.14648

Report

Reflectance-Oriented Probabilistic Equalization for Image Enhancement

Authors: Wu, Xiaomeng; Sun, Yongqing; Kimura, Akisato; Kashino, Kunio

Despite recent advances in image enhancement, it remains difficult for existing approaches to adaptively improve the brightness and contrast for both low-light and normal-light images. To solve this problem, we propose a novel 2D histogram equalizati

External link: http://arxiv.org/abs/2209.06406

Report

Reflectance-Guided, Contrast-Accumulated Histogram Equalization

Authors: Wu, Xiaomeng; Kawanishi, Takahito; Kashino, Kunio

Existing image enhancement methods fall short of expectations because with them it is difficult to improve global and local image contrast simultaneously. To address this problem, we propose a histogram equalization-based method that adapts to the da

External link: http://arxiv.org/abs/2209.06405

Report

ConceptBeam: Concept Driven Target Speech Extraction

Authors: Ohishi, Yasunori; Delcroix, Marc; Ochiai, Tsubasa; Araki, Shoko; Takeuchi, Daiki; Niizumi, Daisuke; Kimura, Akisato; Harada, Noboru; Kashino, Kunio

We propose a novel framework for target speech extraction based on semantic information, called ConceptBeam. Target speech extraction means extracting the speech of a target speaker in a mixture. Typical approaches have been exploiting properties of

External link: http://arxiv.org/abs/2207.11964

Report

Introducing Auxiliary Text Query-modifier to Content-based Audio Retrieval

Authors: Takeuchi, Daiki; Ohishi, Yasunori; Niizumi, Daisuke; Harada, Noboru; Kashino, Kunio

The amount of audio data available on public websites is growing rapidly, and an efficient mechanism for accessing the desired data is necessary. We propose a content-based audio retrieval method that can retrieve a target audio that is similar to bu

External link: http://arxiv.org/abs/2207.09732
