Cost-effective Deployment of BERT Models in Serverless Environment
Authors: | Andrej Švec, Katarína Benešová, Marek Suppa |
---|---|
Language: | English |
Year of publication: | 2021 |
Subject: | Development environment; FOS: Computer and information sciences; Computer Science - Machine Learning; Computer Science - Computation and Language; Computer science; Distributed computing; Sentiment analysis; Latency (audio); 02 engineering and technology; 010501 environmental sciences; 01 natural sciences; Domain (software engineering); Machine Learning (cs.LG); Software deployment; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Production (economics); Overhead (computing); Computation and Language (cs.CL); 0105 earth and related environmental sciences |
Source: | NAACL-HLT (Industry Papers) |
Description: | In this study, we demonstrate the viability of deploying BERT-style models to serverless environments in a production setting. Since the freely available pre-trained models are too large to be deployed in this way, we utilize knowledge distillation and fine-tune the models on proprietary datasets for two real-world tasks: sentiment analysis and semantic textual similarity. As a result, we obtain models that are tuned for a specific domain and deployable in serverless environments. The subsequent performance analysis shows that this solution yields latency levels acceptable for production use and that it is also a cost-effective approach for small-to-medium-sized deployments of BERT models, all without any infrastructure overhead. Comment: NAACL-HLT 2021 Industry Track, camera-ready. |
Database: | OpenAIRE |
External link: |
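
The description mentions knowledge distillation as the way the models are shrunk enough for serverless deployment, but this record does not say which distillation objective was used. As a rough illustration only, the sketch below assumes the common soft-target formulation: a temperature-softened KL term between teacher and student outputs, mixed with ordinary cross-entropy on the true label. The function names and the `temperature` and `alpha` defaults are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of raw logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Assumed soft-target distillation loss for a single example.

    alpha weights the teacher-matching (soft) term; 1 - alpha weights
    the cross-entropy on the ground-truth label (hard term).
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions.
    soft = sum(pt * math.log(pt / ps)
               for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Standard cross-entropy of the student at temperature 1.
    hard = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft term's gradient scale comparable
    # across temperatures.
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard


# Example: a two-class sentiment head where the student partly
# disagrees with the teacher.
loss = distillation_loss([0.2, 1.1], [-1.0, 2.5], label=1)
```

In a real training loop the same quantity would be computed on batched tensors by a deep learning framework; the plain-Python version is only meant to make the two terms of the objective explicit.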