Showing 1 - 10 of 21,520 for search: '"Transformer Architecture"'
Characterizing the expressive power of the Transformer architecture is critical to understanding its capacity limits and scaling law. Recent works provide circuit complexity bounds for Transformer-like architectures. On the other hand, Rotary Position
External link:
http://arxiv.org/abs/2411.07602
Convolutional neural networks and attention mechanisms have greatly benefited remote sensing change detection (RSCD) because of their outstanding discriminative ability. Existing RSCD methods often follow a paradigm of using a non-interactive Siamese
External link:
http://arxiv.org/abs/2412.17247
Large Language Models (LLMs) based on transformers achieve cutting-edge results on a variety of applications. However, their enormous size and processing requirements make deployment on devices with constrained resources extremely difficult. Among va
External link:
http://arxiv.org/abs/2412.05225
This paper describes a memory-efficient transformer model designed to substantially reduce memory usage and execution time, by orders of magnitude, while keeping the model's performance near that of the original model. Recently, new arc
External link:
http://arxiv.org/abs/2501.00042
Despite the popularity and widespread use of semi-structured data formats such as JSON, end-to-end supervised learning applied directly to such data remains underexplored. We present ORIGAMI (Object RepresentatIon via Generative Autoregressive Modell
External link:
http://arxiv.org/abs/2412.17348
Spiking Neural Networks have attracted significant attention in recent years due to their distinctive low-power characteristics. Meanwhile, Transformer models, known for their powerful self-attention mechanisms and parallel processing capabilities, h
External link:
http://arxiv.org/abs/2412.13553
MolMiner: Transformer architecture for fragment-based autoregressive generation of molecular stories
Deep generative models for molecular discovery have become a very popular choice in new high-throughput screening paradigms. These models have been developed inheriting from the advances in natural language processing and computer vision, achieving e
External link:
http://arxiv.org/abs/2411.06608
We present a sensor-agnostic spectral transformer as the basis for spectral foundation models. To that end, we introduce a Universal Spectral Representation (USR) that leverages sensor meta-data, such as sensing kernel specifications and sensing wave
External link:
http://arxiv.org/abs/2411.05714
In the past three years, there has been significant interest in hyperspectral imagery (HSI) classification using vision Transformers for analysis of remotely sensed data. Previous research predominantly focused on the empirical integration of convolu
External link:
http://arxiv.org/abs/2409.09244
In recent years, point cloud analysis methods based on the Transformer architecture have made significant progress, particularly in the context of multimedia applications such as 3D modeling, virtual reality, and autonomous systems. However, the high
External link:
http://arxiv.org/abs/2408.05508