Showing 1 - 10 of 570 for search: '"Wu, Junyi"'
Deep learning has made remarkable progress recently, largely due to the availability of large, well-labeled datasets. However, training on such datasets elevates costs and computational demands. To address this, various techniques like coreset selection…
External link:
http://arxiv.org/abs/2407.07268
Unlike Object Detection, the Visual Grounding task necessitates the detection of an object described by complex free-form language. To simultaneously model such complex semantic and visual representations, recent state-of-the-art studies adopt transformer…
External link:
http://arxiv.org/abs/2407.03243
Published in:
CSCW2024
Due to the significant differences in physical conditions and living environments of people with disabilities, standardized assistive technologies (ATs) often fail to meet their needs. Modified ATs, especially DIY (Do It Yourself) ATs, are a popular s…
External link:
http://arxiv.org/abs/2406.09467
Author:
Yu, Zhewen, Sreeram, Sudarshan, Agrawal, Krish, Wu, Junyi, Montgomerie-Corcoran, Alexander, Zhang, Cheng, Cheng, Jianyi, Bouganis, Christos-Savvas, Zhao, Yiren
Deep Neural Networks (DNNs) excel at learning hierarchical representations from raw data, such as images, audio, and text. To compute these DNN models with high performance and energy efficiency, they are usually deployed onto customized hardware…
External link:
http://arxiv.org/abs/2406.03088
The recent introduction of Diffusion Transformers (DiTs) has demonstrated exceptional capabilities in image generation by using a different backbone architecture, departing from traditional U-Nets and embracing the scalable nature of transformers. De…
External link:
http://arxiv.org/abs/2405.16005
To interpret Vision Transformers, post-hoc explanations assign salience scores to input pixels, providing human-understandable heatmaps. However, whether these interpretations reflect the true rationales behind the model's output is still underexplored.
External link:
http://arxiv.org/abs/2404.01415
While Transformers have rapidly gained popularity in various computer vision applications, post-hoc explanations of their internal mechanisms remain largely unexplored. Vision Transformers extract visual information by representing image regions as t…
External link:
http://arxiv.org/abs/2403.14552
The practical deployment of diffusion models still suffers from high memory and time overhead. While quantization paves the way for compression and acceleration, existing methods unfortunately fail when the models are quantized to low bits. In this…
External link:
http://arxiv.org/abs/2402.03666
Published in:
In Microbial Pathogenesis October 2024 195
Published in:
In Petroleum September 2024 10(3):539-547