Showing 1 - 10 of 96 for search: '"Morariu, Vlad"'
Author:
Suri, Manan, Mathur, Puneet, Dernoncourt, Franck, Jain, Rajiv, Morariu, Vlad I, Sawhney, Ramit, Nakov, Preslav, Manocha, Dinesh
Document structure editing involves manipulating localized textual, visual, and layout components in document images based on the user's requests. Past works have shown that multimodal grounding of user requests in the document image and identifying
External link:
http://arxiv.org/abs/2410.16472
Author:
Jiang, Yue, Lutteroth, Christof, Jain, Rajiv, Tensmeyer, Christopher, Manjunatha, Varun, Stuerzlinger, Wolfgang, Morariu, Vlad
Designing adaptive documents that are visually appealing across various devices and for diverse viewers is a challenging task. This is due to the wide variety of devices and different viewer requirements and preferences. Alterations to a document's c
External link:
http://arxiv.org/abs/2410.15504
Author:
Biswas, Sanket, Jain, Rajiv, Morariu, Vlad I., Gu, Jiuxiang, Mathur, Puneet, Wigington, Curtis, Sun, Tong, Lladós, Josep
While the generation of document layouts has been extensively explored, comprehensive document generation encompassing both layout and content presents a more complex challenge. This paper delves into this advanced domain, proposing a novel approach
External link:
http://arxiv.org/abs/2406.08354
Author:
Basu, Samyadeep, Rezaei, Keivan, Kattakinda, Priyatham, Rossi, Ryan, Zhao, Cherry, Morariu, Vlad, Manjunatha, Varun, Feizi, Soheil
Identifying layers within text-to-image models which control visual attributes can facilitate efficient model editing through closed-form updates. Recent work leveraging causal tracing shows that early Stable-Diffusion variants confine knowledge prim
External link:
http://arxiv.org/abs/2405.01008
Mixed-media tutorials, which integrate videos, images, text, and diagrams to teach procedural skills, offer more browsable alternatives than timeline-based videos. However, manually creating such tutorials is tedious, and existing automated solutions
External link:
http://arxiv.org/abs/2403.08049
Author:
Chu, Zhendong, Zhang, Ruiyi, Yu, Tong, Jain, Rajiv, Morariu, Vlad I, Gu, Jiuxiang, Nenkova, Ani
To achieve state-of-the-art performance, one still needs to train NER models on large-scale, high-quality annotated data, an asset that is both costly and time-intensive to accumulate. In contrast, real-world applications often resort to massive low-
External link:
http://arxiv.org/abs/2310.16790
Text-to-Image Diffusion Models such as Stable-Diffusion and Imagen have achieved unprecedented quality of photorealism with state-of-the-art FID scores on MS-COCO and other generation benchmarks. Given a caption, image generation requires fine-graine
External link:
http://arxiv.org/abs/2310.13730
Author:
Wang, Zilong, Gu, Jiuxiang, Tensmeyer, Chris, Barmpalios, Nikolaos, Nenkova, Ani, Sun, Tong, Shang, Jingbo, Morariu, Vlad I.
Document images are a ubiquitous source of data where the text is organized in a complex hierarchical structure ranging from fine granularity (e.g., words), medium granularity (e.g., regions such as paragraphs or figures), to coarse granularity (e.g.
External link:
http://arxiv.org/abs/2211.14958
Author:
Gu, Jiuxiang, Kuen, Jason, Morariu, Vlad I., Zhao, Handong, Barmpalios, Nikolaos, Jain, Rajiv, Nenkova, Ani, Sun, Tong
Document intelligence automates the extraction of information from documents and supports many business applications. Recent self-supervised learning methods on large-scale unlabeled document datasets have opened up promising directions towards reduc
External link:
http://arxiv.org/abs/2204.10939
We introduce Dessurt, a relatively simple document understanding transformer capable of being fine-tuned on a greater variety of document tasks than prior methods. It receives a document image and task string as input and generates arbitrary text aut
External link:
http://arxiv.org/abs/2203.16618