Showing 1 - 10 of 12 for search: '"Ionut Cosmin Duta"'
We propose contextual convolution (CoConv) for visual recognition. CoConv is a direct replacement of the standard convolution, which is the core component of convolutional neural networks. CoConv is implicitly equipped with the capability of incorpor
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::95c4a23182c1e0e4458d43b645b9836b
http://arxiv.org/abs/2108.07387
Published in:
ICPR
Residual networks (ResNets) represent a powerful type of convolutional neural network (CNN) architecture, widely adopted and used in various tasks. In this work we propose an improved version of ResNets. Our proposed improvements address all three ma
Authors:
Nicu Sebe, Bogdan Ionescu, Jasper Uijlings, Kiyoharu Aizawa, Ionut Cosmin Duta, Alexander G. Hauptmann
Published in:
Multimedia Tools and Applications. 76:22445-22472
Feature extraction and encoding represent two of the most crucial steps in an action recognition system. For building a powerful action recognition pipeline it is important that both steps are efficient and at the same time provide reliable performan
Authors:
Fumin Shen, Ling Shao, Zhen Wei, Heng Tao Shen, Jingyi Zhang, Ionut Cosmin Duta, Fan Zhu, Li Liu, Xing Xu
Published in:
ACM Multimedia
In the literature of video analysis, most research, such as retrieval and recognition, hypothesizes that each input video contains at least one complete semantic entity, e.g. an activity, action, or event. However, this hypothesis does not hold in ma
Published in:
CVPR
We introduce Spatio-Temporal Vector of Locally Max Pooled Features (ST-VLMPF), a super vector-based encoding method specifically designed for local deep features encoding. The proposed method addresses an important problem of video understanding: how
Published in:
ICMR
For an action recognition system a decisive component is represented by the feature encoding part which builds the final representation that serves as input to a classifier. One of the shortcomings of the existing encoding approaches is the fact that
Published in:
MultiMedia Modeling ISBN: 9783319518107
MMM (1)
Encoding is one of the key factors for building an effective video representation. In recent work, super vector-based encoding approaches are highlighted as one of the most powerful representation generators. Vector of Locally Aggregated Descrip
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::a356e671237ef288439e46ec9cdf8b6d
https://doi.org/10.1007/978-3-319-51811-4_30
Published in:
ICPR
The encoding method is an important factor for an action recognition pipeline. One of the key points for the encoding method is the assignment step. A very widely used super-vector encoding method is the vector of locally aggregated descriptors (VLAD
Authors:
Ionut Cosmin Duta, Kiyoharu Aizawa, Tuan Anh Nguyen, Bogdan Ionescu, Jasper Uijlings, Nicu Sebe, Alexander G. Hauptmann
Published in:
CBMI
Besides appearance information, the video contains temporal evolution, which represents an important and useful source of information about its content. Many video representation approaches are based on the motion information within the video. The co
Published in:
ICME
In this paper we introduce a new video description framework that replaces traditional Bag-of-Words with a combination of Fisher Kernels (FK) and Vector of Locally Aggregated Descriptors (VLAD). The main contributions are: (i) a fast algorithm to den