Showing 1 - 10 of 253 for search: '"Khan, Zaid A."'
The process of creating training data to teach models is currently driven by humans, who manually analyze model weaknesses and plan how to create data that improves a student model. Recent approaches using LLMs as annotators reduce human effort, but
External link:
http://arxiv.org/abs/2410.06215
Author:
Khan, Zaid, Fu, Yun
The goal of selective prediction is to allow a model to abstain when it may not be able to deliver a reliable prediction, which is important in safety-critical contexts. Existing approaches to selective prediction typically require access to the i
External link:
http://arxiv.org/abs/2404.10193
Visual program synthesis is a promising approach to exploit the reasoning abilities of large language models for compositional computer vision tasks. Previous work has used few-shot prompting with frozen LLMs to synthesize visual programs. Training a
External link:
http://arxiv.org/abs/2404.04627
Visual question answering (VQA) has traditionally been treated as a single-step task where each question receives the same amount of effort, unlike natural human question-answering strategies. We explore a question decomposition strategy for VQA to o
External link:
http://arxiv.org/abs/2310.17050
Finetuning a large vision-language model (VLM) on a target dataset after large-scale pretraining is a dominant paradigm in visual question answering (VQA). Datasets for specialized tasks such as knowledge-based VQA or VQA in non-natural-image domains
External link:
http://arxiv.org/abs/2306.03932
Author:
Khan, Zaid, Fu, Yun
Contrastive vision-language models (e.g. CLIP) are typically created by updating all the parameters of a vision model and language model through contrastive training. Can such models be created by a small number of parameter updates to an already-tra
External link:
http://arxiv.org/abs/2303.11866
Author:
Khan, Zaid A.
Telehealth is an online health care system that is extensively used in the current pandemic situation. Our proposed technique is considered a fog computing-based attack detection architecture to protect IoT Telehealth Networks. As for IoT Telehealth
External link:
http://hdl.handle.net/1828/13798
Self-supervised vision-language pretraining from pure images and text with a contrastive loss is effective, but ignores fine-grained alignment due to a dual-stream architecture that aligns image and text representations only on a global level. Earlie
External link:
http://arxiv.org/abs/2203.14395
The stochastic nature of public transport systems leads to headway variability and bus bunching, causing both operator and passenger cost to increase significantly. Traditional strategies to counter bus bunching, including bus-holding, stop-skipping,
External link:
http://arxiv.org/abs/2202.06039
Author:
Zhang, Kangkang, Han, Xiaomeng, Fu, Yanfeng, Khan, Zaid, Zhang, Biaojin, Bi, Junguo, Hu, Liyong, Luo, Lijun
Published in:
In Plant Physiology and Biochemistry November 2024 216