Showing 1 - 10 of 236
for search: '"Handschuh, Siegfried"'
In training neural networks, it is common practice to use partial gradients computed over batches, typically very small subsets of the training set. This approach is motivated by the argument that such a partial gradient is close to the true one, with p…
External link:
http://arxiv.org/abs/2410.16523
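As a rough, self-contained sketch of the partial-gradient idea this abstract describes (a toy NumPy least-squares problem, not the paper's setup), a mini-batch gradient can be compared against the full-batch one:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem: fit w so that X @ w approximates y.
X = rng.normal(size=(1000, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=1000)

def gradient(X_sub, y_sub, w):
    # Gradient of 0.5 * mean((X_sub @ w - y_sub)**2) with respect to w.
    return X_sub.T @ (X_sub @ w - y_sub) / len(y_sub)

w = np.zeros(5)
full_grad = gradient(X, y, w)                     # "true" gradient over all samples
idx = rng.choice(len(y), size=32, replace=False)  # a random batch of 32 samples
batch_grad = gradient(X[idx], y[idx], w)          # partial gradient over the batch

# The partial gradient is a noisy but unbiased estimate of the full one.
print(np.linalg.norm(batch_grad - full_grad) / np.linalg.norm(full_grad))
```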
Transformers are a widespread and successful model architecture, particularly in Natural Language Processing (NLP) and Computer Vision (CV). The essential innovation of this architecture is the Attention Mechanism, which solves the problem of extract…
External link:
http://arxiv.org/abs/2410.13732
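For orientation, the Attention Mechanism this abstract refers to is usually written as scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V; the sketch below shows that standard formulation (the snippet does not reveal which variant the paper analyzes):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Standard attention: softmax(Q K^T / sqrt(d_k)) V.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # pairwise query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted sum of the values

# Toy example: 4 tokens with dimension 8.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)   # (4, 8)
```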
This research dissects financial equity research reports (ERRs) by mapping their content into categories. There is insufficient empirical analysis of the questions answered in ERRs. In particular, it is not understood how frequently certain information…
External link:
http://arxiv.org/abs/2407.18327
Published in:
Proceedings of the 15th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR2023
Deep neural networks have a strong record of success and are thus viewed as the best architecture choice for complex applications. For a long time, their main shortcoming was the vanishing gradient, which prevented the numerical optimization algorith…
External link:
http://arxiv.org/abs/2309.08414
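A minimal illustration (not from the paper) of why the vanishing gradient arises: with sigmoid activations, backpropagation multiplies the gradient by sigma'(z) * w at every layer, and since sigma'(z) <= 0.25 the product shrinks roughly exponentially with depth:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
depth = 30
x = rng.normal()  # scalar input to a chain of 1-unit sigmoid layers
grad = 1.0
for _ in range(depth):
    w = rng.normal()   # scalar weight of this layer
    z = w * x
    x = sigmoid(z)     # forward pass through the layer
    # Chain rule: multiply by the local derivative sigma'(z) * w,
    # where sigma'(z) = sigma(z) * (1 - sigma(z)) <= 0.25.
    grad *= sigmoid(z) * (1.0 - sigmoid(z)) * w

print(f"gradient magnitude after {depth} sigmoid layers: {abs(grad):.2e}")
```

With ReLU-style activations the local derivative is 0 or 1, which is one reason modern architectures largely avoid this effect.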
Sentences with complex syntax act as a major stumbling block for downstream Natural Language Processing applications, whose predictive quality deteriorates with sentence length and complexity. The task of Text Simplification (TS) may remedy…
External link:
http://arxiv.org/abs/2308.00425
This research article analyzes the language of the official statements released by the Federal Open Market Committee (FOMC) after its scheduled meetings, in order to gain insight into their impact on financial markets and econom…
External link:
http://arxiv.org/abs/2304.10164
Determining an appropriate number of attention heads, on the one hand, and the number of transformer-encoders, on the other, is an important choice for Computer Vision (CV) tasks using the Transformer architecture. Computing experiments confirmed the…
External link:
http://arxiv.org/abs/2209.07221
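In a standard Transformer encoder these two hyperparameters appear directly in the configuration. A hypothetical PyTorch sketch (the dimensions and values are illustrative, not the paper's choices):

```python
import torch
import torch.nn as nn

d_model, n_heads, n_layers = 64, 4, 6  # illustrative values, not the paper's

# nhead sets the number of attention heads per layer;
# num_layers sets the number of stacked transformer-encoders.
layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

x = torch.randn(2, 16, d_model)  # (batch, 16 patch/token embeddings, d_model)
print(encoder(x).shape)          # torch.Size([2, 16, 64])
```

Note that d_model must be divisible by nhead, which couples the two choices.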
The commitment to single-precision floating-point arithmetic is widespread in the deep learning community. To evaluate whether this commitment is justified, the influence of computing precision (single and double precision) on the optimization performance…
External link:
http://arxiv.org/abs/2209.07219
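A sketch of how such an evaluation can be set up (a toy quadratic problem, not the paper's experiment): run the same gradient-descent loop in single and double precision and compare the final residuals; the single-precision run typically stalls near its precision floor while the double-precision run keeps improving:

```python
import numpy as np

def run_gd(dtype, steps=5000):
    # Same convex quadratic problem, solved in the requested precision.
    rng = np.random.default_rng(0)
    A = rng.normal(size=(20, 20))
    A = (A.T @ A + np.eye(20)).astype(dtype)  # symmetric positive definite
    b = rng.normal(size=20).astype(dtype)
    lr = dtype(1.0 / np.linalg.eigvalsh(A.astype(np.float64)).max())  # stable step size
    x = np.zeros(20, dtype=dtype)
    for _ in range(steps):
        x = x - lr * (A @ x - b)      # gradient step on 0.5 * x^T A x - b^T x
    return np.linalg.norm(A @ x - b)  # residual norm at the final iterate

print("float32 residual:", run_gd(np.float32))
print("float64 residual:", run_gd(np.float64))
```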
Author:
Gubelmann, Reto, Handschuh, Siegfried
In this article, we explore the shallow heuristics used by transformer-based pre-trained language models (PLMs) that are fine-tuned for natural language inference (NLI). To do so, we construct our own dataset based on syllogisms, and we evaluate a nu…
External link:
http://arxiv.org/abs/2201.07614
In this article, we explore the potential of transformer-based language models (LMs) to correctly represent normative statements in the legal domain, taking tax law as our use case. In our experiment, we use a variety of LMs as bases for both word- and …
External link:
http://arxiv.org/abs/2108.11215