Showing 1 - 10 of 530 for search: '"Wang, Zhaowen"'
Accurate text segmentation results are crucial for text-related generative tasks, such as text image generation, text editing, text removal, and text style transfer. Recently, some scene text segmentation methods have made significant progress in seg
External link:
http://arxiv.org/abs/2408.00106
Author:
Argaw, Dawit Mureja, Yoon, Seunghyun, Heilbron, Fabian Caba, Deilamsalehy, Hanieh, Bui, Trung, Wang, Zhaowen, Dernoncourt, Franck, Chung, Joon Son
Long-form video content constitutes a significant portion of internet traffic, making automated video summarization an essential research problem. However, existing video summarization datasets are notably limited in their size, constraining the effe
External link:
http://arxiv.org/abs/2404.03398
Author:
Xing, Linzi, Tran, Quan, Caba, Fabian, Dernoncourt, Franck, Yoon, Seunghyun, Wang, Zhaowen, Bui, Trung, Carenini, Giuseppe
Video topic segmentation unveils the coarse-grained semantic structure underlying videos and is essential for other video understanding tasks. Given the recent surge in multi-modal, relying solely on a single modality is arguably insufficient. On the
External link:
http://arxiv.org/abs/2312.00220
Tail Gini functional is a measure of tail risk variability for systemic risks, and has many applications in banking, finance and insurance. Meanwhile, there is growing attention on asymptotic independent pairs in quantitative risk management. This pap
External link:
http://arxiv.org/abs/2309.06428
Author:
Liu, Ying-Tian, Zhang, Zhifei, Guo, Yuan-Chen, Fisher, Matthew, Wang, Zhaowen, Zhang, Song-Hai
Automatic generation of fonts can be an important aid to typeface design. Many current approaches regard glyphs as pixelated images, which present artifacts when scaling and inevitable quality losses after vectorization. On the other hand, existing v
External link:
http://arxiv.org/abs/2305.10462
Author:
Ji, Jiabao, Zhang, Guanhua, Wang, Zhaowen, Hou, Bairu, Zhang, Zhifei, Price, Brian, Chang, Shiyu
Scene text editing is a challenging task that involves modifying or inserting specified texts in an image while maintaining its natural and realistic appearance. Most previous approaches to this task rely on style-transfer models that crop out text r
External link:
http://arxiv.org/abs/2304.05568
The goal of multimodal summarization is to extract the most important information from different modalities to form output summaries. Unlike unimodal summarization, the multimodal summarization task explicitly leverages cross-modal information to
External link:
http://arxiv.org/abs/2303.07284
Author:
Wang, Zhaowen
The increasing demand for high-capacity and high-speed I/Os is pushing wireline and optical transceivers to a higher aggregate data rate. Multiple lanes of transceivers are monolithically integrated on a single system on chip (SoC), bringing more str
Livestream videos have become a significant part of online learning, where design, digital marketing, creative painting, and other skills are taught by experienced experts in the sessions, making them valuable materials. However, Livestream tutorial
External link:
http://arxiv.org/abs/2210.05840
Author:
Qiu, Jielin, Zhu, Jiacheng, Xu, Mengdi, Dernoncourt, Franck, Bui, Trung, Wang, Zhaowen, Li, Bo, Zhao, Ding, Jin, Hailin
Multimedia summarization with multimodal output (MSMO) is a recently explored application in language grounding. It plays an essential role in real-world applications, i.e., automatically generating cover images and titles for news articles or provid
External link:
http://arxiv.org/abs/2210.04722