Showing 1 - 10 of 110 for search: '"Boullé, Marc"'
Author:
Hue, Carine, Boullé, Marc
We study supervised classification for datasets with a very large number of input variables. The naïve Bayes classifier is attractive for its simplicity, scalability and effectiveness in many real data applications. When the strong naïve Bayes as
External link:
http://arxiv.org/abs/2409.11100
Variable selection or importance measurement of input variables to a machine learning model has become the focus of much research. It is no longer enough to have a good model, one also must explain its decisions. This is why there are so many intelli
External link:
http://arxiv.org/abs/2307.16718
Author:
Boullé, Marc
Histograms are among the most popular methods used in exploratory analysis to summarize univariate distributions. In particular, irregular histograms are good non-parametric density estimators that require very few parameters: the number of bins with
External link:
http://arxiv.org/abs/2306.05786
Published in:
Computational Statistics and Data Analysis, 2023, 180, pp.107668
G-Enum histograms are a new fast and fully automated method for irregular histogram construction. By framing histogram construction as a density estimation problem and its automation as a model selection task, these histograms leverage the Minimum De
External link:
http://arxiv.org/abs/2212.13524
Published in:
Advances in Knowledge Discovery and Management, 834, Springer International Publishing, pp.23-41, 2019, Studies in Computational Intelligence
Co-clustering is a class of unsupervised data analysis techniques that extract the existing underlying dependency structure between the instances and variables of a data table as homogeneous blocks. Most of those techniques are limited to variables o
External link:
http://arxiv.org/abs/2212.11728
Published in:
Advances in Knowledge Discovery and Management, 834, Springer International Publishing, pp.3-22, 2019, Studies in Computational Intelligence
Co-clustering is a data mining technique used to extract the underlying block structure between the rows and columns of a data matrix. Many approaches have been studied and have shown their capacity to extract such structures in continuous, binary or
External link:
http://arxiv.org/abs/2212.11725
Supervised learning of time series data has been extensively studied for the case of a categorical target variable. In some application domains, e.g., energy, environment and health monitoring, it occurs that the target variable is numerical and the
External link:
http://arxiv.org/abs/2103.10247
Published in:
Extraction et gestion des connaissances 2018, Jan 2018, Paris, France. Revue des Nouvelles Technologies de l'Information, RNTI-E-34, pp.275-280, 2018, Actes de la 18ème Conférence Internationale Francophone sur l'Extraction et gestion des connaissances (EGC'2018)
We propose a MAP Bayesian approach to perform and evaluate a co-clustering of mixed-type data tables. The proposed model infers an optimal segmentation of all variables then performs a co-clustering by minimizing a Bayesian model selection cost funct
External link:
http://arxiv.org/abs/1902.02056
Published in:
Computational Statistics and Data Analysis, April 2023, 180
Author:
Boullé, Marc (AUTHOR) marc.boulle@orange.com
Published in:
Intelligent Data Analysis. 2024, Vol. 28 Issue 5, p1347-1394. 48p.