Showing 1 - 10 of 625 for search: '"Tyukin A."'
We introduce a novel physics-informed approach for accurately modeling aggregation kinetics which provides a comprehensive solution in a single run by outputting all model parameters simultaneously, a clear advancement over traditional single-output
External link:
http://arxiv.org/abs/2410.06050
The inference demand for LLMs has skyrocketed in recent months, and serving models with low latencies remains challenging due to the quadratic input length complexity of the attention layers. In this work, we investigate the effect of dropping MLP an
External link:
http://arxiv.org/abs/2407.15516
Author:
Sutton, Oliver J., Zhou, Qinghua, Wang, Wei, Higham, Desmond J., Gorban, Alexander N., Bastounis, Alexander, Tyukin, Ivan Y.
We reveal the theoretical foundations of techniques for editing large language models, and present new methods which can do so without requiring retraining. Our theoretical insights show that a single metric (a measure of the intrinsic dimension of t
External link:
http://arxiv.org/abs/2406.12670
Author:
Tyukin, Georgy
Large Language Models are growing in size, and we expect them to continue to do so, as larger models train quicker. However, this increase in size will severely impact inference costs. Therefore model compression is important, to retain the performan
External link:
http://arxiv.org/abs/2404.05741
Author:
Liu, Ying, Guo, Liucheng, Makarov, Valeri A., Gorban, Alexander, Mirkes, Evgeny, Tyukin, Ivan Y.
Automated hand gesture recognition has long been a focal point in the AI community. Traditionally, research in this field has predominantly focused on scenarios with access to a continuous flow of hand images. This focus has been driven by the wide
External link:
http://arxiv.org/abs/2403.15421
Published in:
Physical Review E 97(5) 052308, 2018
Social learning is widely observed in many species. Less experienced agents copy successful behaviors, exhibited by more experienced individuals. Nevertheless, the dynamical mechanisms behind this process remain largely unknown. Here we assume that a
External link:
http://arxiv.org/abs/2402.02226
Author:
Tyukin, Ivan Y., Tyukina, Tatiana, van Helden, Daniel, Zheng, Zedong, Mirkes, Evgeny M., Sutton, Oliver J., Zhou, Qinghua, Gorban, Alexander N., Allison, Penelope
We present a new methodology for handling AI errors by introducing weakly supervised AI error correctors with a priori performance guarantees. These AI correctors are auxiliary maps whose role is to moderate the decisions of some previously construct
External link:
http://arxiv.org/abs/2402.00899
We report a novel approach for the efficient computation of solutions of a broad class of large-scale systems of non-linear ordinary differential equations, describing aggregation kinetics. The method is based on a new take on the dimensionality redu
External link:
http://arxiv.org/abs/2312.06857
Published in:
Artificial Neural Networks and Machine Learning ICANN 2023. Lecture Notes in Computer Science, vol 14254, pp 516-529. Springer, Cham
High dimensional data can have a surprising property: pairs of data points may be easily separated from each other, or even from arbitrary subsets, with high probability using just simple linear classifiers. However, this is more of a rule of thumb t
External link:
http://arxiv.org/abs/2311.07579
Author:
Bastounis, Alexander, Gorban, Alexander N., Hansen, Anders C., Higham, Desmond J., Prokhorov, Danil, Sutton, Oliver, Tyukin, Ivan Y., Zhou, Qinghua
In this work, we assess the theoretical limitations of determining guaranteed stability and accuracy of neural networks in classification tasks. We consider a classical distribution-agnostic framework and algorithms minimising empirical risks and poten
External link:
http://arxiv.org/abs/2309.07072