Showing 1 - 10 of 14 for search: '"Alham Fikri Aji"'
Author:
Tirana Noor Fatyanosa, Haryo Akbarianto Wibowo, Made Nindyatama Nityasya, Alham Fikri Aji, Radityo Eko Prasojo
Published in:
Aji, A F, Nityasya, M N, Wibowo, H A, Prasojo, R E & Fatyanosa, T 2021, BERT Goes Brrr: A Venture Towards the Lesser Error in Classifying Medical Self-Reporters on Twitter. In Proceedings of the Sixth Social Media Mining for Health (SMM4H) Workshop and Shared Task. Mexico City, Mexico, pp. 58–64, 6th Social Media Mining for Health (SMM4H) Shared Tasks at NAACL 2021, 10/06/21. https://doi.org/10.18653/v1/2021.smm4h-1.9
This paper describes our team's submission for the Social Media Mining for Health (SMM4H) 2021 shared task. We participated in three subtasks: classifying adverse drug effects, COVID-19 self-reports, and COVID-19 symptoms. Our system is based on BERT models…
Author:
Suci Fitriany, Made Nindyatama Nityasya, Haryo Akbarianto Wibowo, Radityo Eko Prasojo, Derry Tanti Wijaya, Afra Feyza Akyürek, Alham Fikri Aji
Published in:
ACL/IJCNLP (Findings)
Author:
Alham Fikri Aji, Kenneth Heafield
Published in:
Aji, A F & Heafield, K 2020, Compressing Neural Machine Translation Models with 4-bit Precision. In Proceedings of the Fourth Workshop on Neural Generation and Translation. Seattle, pp. 35–42, The 4th Workshop on Neural Generation and Translation, Seattle, Washington, United States, 10/07/20. https://doi.org/10.18653/v1/2020.ngt-1.4
Proceedings of the Fourth Workshop on Neural Generation and Translation
NGT@ACL
Quantization is one way to compress Neural Machine Translation (NMT) models, especially for edge devices. This paper pushes quantization from the 8 bits seen in current work on machine translation down to 4 bits. Instead of fixed-point quantization, we use…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5a3105a7fb4fb4e7ec15e729ebec9d1c
https://hdl.handle.net/20.500.11820/ec6f46c9-625c-4771-8f03-2bcdc6940cf9
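The abstract's contrast with fixed-point quantization can be illustrated with a toy logarithmic-style quantizer, where each value snaps to a sign times a power of two relative to a per-tensor scale. This is a hypothetical sketch for intuition only, not necessarily the paper's exact scheme (the function name and level layout are assumptions):

```python
import math

def quantize_log(values, bits=4):
    """Toy logarithmic quantizer: snap each value to sign * scale * 2**e,
    where e is one of 2**(bits - 1) non-positive integer exponents.
    Illustrative only -- practical NMT quantizers also retrain or tune the scale."""
    scale = max(abs(v) for v in values) or 1.0  # per-tensor scale; guard all-zero input
    levels = 2 ** (bits - 1)                    # exponent choices, e.g. 8 for 4 bits
    out = []
    for v in values:
        if v == 0.0:
            out.append(0.0)
            continue
        sign = 1.0 if v > 0 else -1.0
        e = round(math.log2(abs(v) / scale))    # nearest power-of-two exponent
        e = max(-(levels - 1), min(0, e))       # clip to the representable range
        out.append(sign * scale * 2.0 ** e)
    return out
```

Because levels are spaced logarithmically, small values keep proportionally fine resolution while large values are coarser, which tends to suit the heavy-tailed weight distributions of trained networks.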
Published in:
ACL
Aji, A F, Bogoychev, N, Heafield, K & Sennrich, R 2020, In Neural Machine Translation, What Does Transfer Learning Transfer? In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. pp. 7701–7710, 2020 Annual Conference of the Association for Computational Linguistics, Virtual conference, Washington, United States, 5/07/20. https://doi.org/10.18653/v1/2020.acl-main.688
Transfer learning improves quality for low-resource machine translation, but it is unclear what exactly it transfers. We perform several ablation studies that limit information transfer, then measure the quality impact across three language pairs to…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::9423758fb59eecbf6119fc004b222f38
https://doi.org/10.5167/uzh-188224
Author:
Alham Fikri Aji, Tatag Aziz Prawiro, Rahmad Mahendra, Haryo Akbarianto Wibowo, Muhammad Ihsan, Suci Fitriany, Radityo Eko Prasojo
Published in:
IALP
In its daily use, the Indonesian language is riddled with informality, that is, deviations from the standard in terms of vocabulary, spelling, and word order. On the other hand, currently available Indonesian NLP models are typically developed with the…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f62dc682e310d57e442c90a2665a6efa
Published in:
Aji, A F, Heafield, K & Bogoychev, N 2019, Combining Global Sparse Gradients with Local Gradients in Distributed Neural Network Training. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing. Hong Kong, pp. 3624–3629, 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Hong Kong, Hong Kong, 3/11/19. https://doi.org/10.18653/v1/D19-1373
EMNLP/IJCNLP (1)
One way to reduce network traffic in multi-node data-parallel stochastic gradient descent is to exchange only the largest gradients. However, doing so damages the gradient and degrades the model’s performance. Transformer models degrade dramatically…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::6bbec575d61434a17135499985c455bd
https://www.pure.ed.ac.uk/ws/files/129170063/Combining_Global_Sparse_AJI_DOA04112019_VOR_CC_BY.pdf
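The "exchange only the largest gradients" idea can be sketched as a top-k split: the k largest-magnitude components are sent across the network, and the residual stays behind. This is a simplified illustration of separating a global (exchanged) part from a local part, with hypothetical names; the paper's full algorithm is in the text above:

```python
def topk_split(grad, k):
    """Split a gradient vector into a sparse part holding the k
    largest-magnitude entries (exchanged between nodes) and a dense
    residual (kept for local use). Simplified illustration."""
    top = set(sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k])
    sent = [g if i in top else 0.0 for i, g in enumerate(grad)]     # goes over the wire
    local = [g - s for g, s in zip(grad, sent)]                     # stays on this node
    return sent, local
```

By construction `sent + local` reconstructs the original gradient exactly, so nothing is discarded; the question the paper studies is how to use the residual without hurting convergence.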
Author:
Kenneth Heafield, Alham Fikri Aji
Published in:
NGT@EMNLP-IJCNLP
Aji, A F & Heafield, K 2019, Making Asynchronous Stochastic Gradient Descent Work for Transformers. In Proceedings of the 3rd Workshop on Neural Generation and Translation (WNGT 2019). Hong Kong, pp. 80–89, The 3rd Workshop on Neural Generation and Translation, Hong Kong, Hong Kong, 4/11/19. https://doi.org/10.18653/v1/D19-5608
Asynchronous stochastic gradient descent (SGD) is attractive from a speed perspective because workers do not wait for synchronization. However, the Transformer model converges poorly with asynchronous SGD, resulting in substantially lower quality compared…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::3f1e8eea080aaf885f4dfd4ed64a0ed6
http://arxiv.org/abs/1906.03496
Author:
Alham Fikri Aji, Young Jin Kim, Roman Grundkiewicz, Kenneth Heafield, Marcin Junczys-Dowmunt, Hany Hassan, Nikolay Bogoychev
Published in:
NGT@EMNLP-IJCNLP
Proceedings of the 3rd Workshop on Neural Generation and Translation
Kim, Y J, Junczys-Dowmunt, M, Hassan, H, Aji, A F, Heafield, K, Grundkiewicz, R & Bogoychev, N 2019, From Research to Production and Back: Ludicrously Fast Neural Machine Translation. In Proceedings of the 3rd Workshop on Neural Generation and Translation (WNGT 2019). Hong Kong, pp. 280–288, The 3rd Workshop on Neural Generation and Translation, Hong Kong, Hong Kong, 4/11/19. https://doi.org/10.18653/v1/D19-5632
This paper describes the submissions of the “Marian” team to the WNGT 2019 efficiency shared task. Taking our dominating submissions to the previous edition of the shared task as a starting point, we develop improved teacher-student training via…
Published in:
Bogoychev, N, Junczys-Dowmunt, M, Heafield, K & Aji, A 2018, Accelerating Asynchronous Stochastic Gradient Descent for Neural Machine Translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels, Belgium, pp. 2991–2996, 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 31/10/18. <http://aclweb.org/anthology/D18-1332>
University of Edinburgh-PURE
EMNLP
In order to extract the best possible performance from asynchronous stochastic gradient descent, one must increase the mini-batch size and scale the learning rate accordingly. To achieve further speedup, we introduce a technique that delays gradient updates…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::692104b82156a0d1cb132e7c613de28f
https://www.pure.ed.ac.uk/ws/files/75718984/Accelerating_Asynchronous_Stochastic_Gradient_Descent_for_Neural_Machine_Translation.pdf
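Delaying gradient updates, as mentioned in the abstract, can be sketched as buffering gradients for several steps and applying them as one combined update, which emulates a larger effective batch. This is a hypothetical scalar-parameter sketch, not the paper's exact formulation:

```python
def delayed_step(param, buffer, grad, delay, lr):
    """Buffer incoming gradients and apply them as one combined update
    every `delay` steps, emulating a larger effective mini-batch.
    Hypothetical sketch for a single scalar parameter."""
    buffer.append(grad)
    if len(buffer) < delay:
        return param              # update postponed; keep computing
    update = sum(buffer)          # combined gradient over `delay` steps
    buffer.clear()
    return param - lr * update
```

Because workers keep producing gradients while the update is postponed, communication happens less often; the learning rate then has to be rescaled to match the larger effective batch, as the abstract notes.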
Author:
Kemal Kurniawan, Alham Fikri Aji
Published in:
IALP
Previous work on Indonesian part-of-speech (POS) tagging is hard to compare, as it is not evaluated on a common dataset. Furthermore, despite the success of neural network models for English POS tagging, they are rarely explored for Indonesian…