The TerraByte Client: providing access to terabytes of plant data

Author: Beck, Michael A., Bidinosti, Christopher P., Henry, Christopher J., Ajmani, Manisha
Year of publication: 2022
Subject:
Document type: Working Paper
Description: In this paper we demonstrate the TerraByte Client, a software tool for downloading user-defined plant datasets from a data portal hosted at Compute Canada. To that end, the client offers two key functionalities: (1) It gives the user an overview of the available data and a quick way to visually check samples of that data. For this, the client receives the results of database queries and displays the number of images that fulfill the search criteria. Furthermore, a sample can be downloaded within seconds to confirm that the data suits the user's needs. (2) The user can then download the specified data to their own drive. This data is prepared in chunks server-side and sent to the user's end system, where it is automatically extracted into individual files. The first chunks of data are available for inspection after a brief waiting period of a minute or less, depending on available bandwidth and type of data. The TerraByte Client has a full graphical user interface for ease of use and uses end-to-end encryption. The user interface is built on top of a low-level client. This architecture, combined with the client being released as open source, makes it possible for users to develop their own user interface or to use the client's functionality directly. An example of direct usage would be downloading specific data on demand within a larger application, such as one that trains machine learning models.
Database: arXiv
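
The chunked transfer described in the abstract (data packed into chunks server-side, then unpacked into individual files as each chunk arrives) can be illustrated with a small sketch. This is not the TerraByte Client's code or API: the ZIP chunk format, the function names `make_chunk` and `extract_chunks`, and the file names are all assumptions chosen for illustration, and the "server" is simulated with in-memory archives rather than a network connection.

```python
import io
import tempfile
import zipfile
from pathlib import Path
from typing import Dict, Iterable, Iterator, List


def make_chunk(files: Dict[str, bytes]) -> bytes:
    """Pack files into one in-memory ZIP archive.

    Stands in for a chunk prepared server-side (the ZIP format is an assumption).
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def extract_chunks(chunks: Iterable[bytes], dest: Path) -> Iterator[List[str]]:
    """Extract each chunk as soon as it arrives and yield the file names it held.

    Because this is a generator, files from the first chunk are on disk and
    available for inspection before later chunks have been processed.
    """
    dest.mkdir(parents=True, exist_ok=True)
    for chunk in chunks:
        with zipfile.ZipFile(io.BytesIO(chunk)) as zf:
            zf.extractall(dest)
            yield zf.namelist()


def demo() -> List[str]:
    """Simulate two incoming chunks and return the extracted file names, sorted."""
    # Hypothetical image files; real chunks would carry actual plant images.
    chunks = [
        make_chunk({"img_0001.png": b"a", "img_0002.png": b"b"}),
        make_chunk({"img_0003.png": b"c"}),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        dest = Path(tmp) / "dataset"
        for names in extract_chunks(chunks, dest):
            print("chunk extracted:", names)
        return sorted(p.name for p in dest.iterdir())


if __name__ == "__main__":
    print(demo())
```

Writing the extraction step as a generator mirrors the behaviour the abstract describes, where the first chunks can be inspected while the remainder of the download is still in progress. A larger application, such as a training loop, could consume the yielded file names in the same way.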