Showing 1 - 10 of 14,392 results for search: '"Data Pipeline"'
Creating high-quality, large-scale datasets for large language models (LLMs) often relies on resource-intensive, GPU-accelerated models for quality filtering, making the process time-consuming and costly. This dependence on GPUs limits accessibility
External link:
http://arxiv.org/abs/2411.11289
Published in:
New England Manipulation Symposium 2024
Training data is an essential resource for creating capable and robust vision systems which are integral to the proper function of many robotic systems. Synthesized training data has been shown in recent years to be a viable alternative to manually c
External link:
http://arxiv.org/abs/2411.06166
Author:
Parikh, Arav, Dori-Hacohen, Shiri
A fundamental step in the patent application process is the determination of whether there exist prior patents that are novelty destroying. This step is routinely performed by both applicants and examiners, in order to assess the novelty of proposed
External link:
http://arxiv.org/abs/2407.12193
In the case of compute-intensive machine learning, efficient operating system scheduling is crucial for performance and energy efficiency. This paper conducts a comparative study of FIFO (First-In-First-Out) and RR (Round-Robin) scheduling policies w
External link:
http://arxiv.org/abs/2409.15704
Author:
Archibald, Taylor, Martinez, Tony
Document semantic segmentation is a promising avenue that can facilitate document analysis tasks, including optical character recognition (OCR), form classification, and document editing. Although several synthetic datasets have been developed to dis
External link:
http://arxiv.org/abs/2404.19259
Data pipelines play an indispensable role in tasks such as machine learning modeling and developing data products. With the increasing diversification and complexity of data sources, as well as the rapid growth of data volumes, building an efficient
External link:
http://arxiv.org/abs/2402.12916
Author:
Martell, Matthew, Terry, Nick, Sengupta, Ribhu, Salazar, Chris, Errett, Nicole A., Miles, Scott B., Wartman, Joseph, Choe, Youngjun
Street View Images (SVI) are a common source of valuable data for researchers. Researchers have used SVI data for estimating pedestrian volumes, demographic surveillance, and to better understand built and natural environments in cityscapes. However,
External link:
http://arxiv.org/abs/2401.13087
Published in:
Journal of Big Data, Vol 11, Iss 1, Pp 1-42 (2024)
Serverless computing has gained significant popularity due to its scalability, cost-effectiveness, and ease of deployment. With the exponential growth of data, organizations face the challenge of efficiently processing and analyzing vast amo
External link:
https://doaj.org/article/1195b9a1e2984826b82bc54ef1bac4b8
Author:
Yeh, Chun-Hsiao, Cheng, Ta-Ying, Hsieh, He-Yen, Lin, Chuan-En, Ma, Yi, Markham, Andrew, Trigoni, Niki, Kung, H. T., Chen, Yubei
Recent text-to-image diffusion models are able to learn and synthesize images containing novel, personalized concepts (e.g., their own pets or specific items) with just a few examples for training. This paper tackles two interconnected issues within
External link:
http://arxiv.org/abs/2402.15504