Showing 1 - 10 of 22 for search: '"Citovsky, Gui"'
Author:
Fahrbach, Matthew, Ramalingam, Srikumar, Zadimoghaddam, Morteza, Ahmadian, Sara, Citovsky, Gui, DeSalvo, Giulia
We propose a novel subset selection task called min-distance diverse data summarization ($\textsf{MDDS}$), which has a wide variety of applications in machine learning, e.g., data sampling and feature selection. Given a set of points in a metric spac
External link:
http://arxiv.org/abs/2405.18754
Author:
Ye, Ke, Jiang, Heinrich, Rostamizadeh, Afshin, Chakrabarti, Ayan, DeSalvo, Giulia, Kagy, Jean-François, Karydas, Lazaros, Citovsky, Gui, Kumar, Sanjiv
Pre-training large language models is known to be extremely resource intensive and often times inefficient, under-utilizing the information encapsulated in the training text sequences. In this paper, we present SpacTor, a new training procedure consi
External link:
http://arxiv.org/abs/2401.13160
Author:
Citovsky, Gui, DeSalvo, Giulia, Kumar, Sanjiv, Ramalingam, Srikumar, Rostamizadeh, Afshin, Wang, Yunjuan
We present a subset selection algorithm designed to work with arbitrary model families in a practical batch setting. In such a setting, an algorithm can sample examples one at a time but, in order to limit overhead costs, is only able to update its s
External link:
http://arxiv.org/abs/2301.12052
Author:
Citovsky, Gui, DeSalvo, Giulia, Gentile, Claudio, Karydas, Lazaros, Rajagopalan, Anand, Rostamizadeh, Afshin, Kumar, Sanjiv
The ability to train complex and highly effective models often requires an abundance of training data, which can easily become a bottleneck in cost, time, and computational resources. Batch active learning, which adaptively issues batched queries to
External link:
http://arxiv.org/abs/2107.14263
Author:
Sumengen, Baris, Rajagopalan, Anand, Citovsky, Gui, Simcha, David, Bachem, Olivier, Mitra, Pradipta, Blasiak, Sam, Liang, Mason, Kumar, Sanjiv
Hierarchical Agglomerative Clustering (HAC) is one of the oldest but still most widely used clustering methods. However, HAC is notoriously hard to scale to large data sets as the underlying complexity is at least quadratic in the number of data poin
External link:
http://arxiv.org/abs/2105.11653
Author:
Vainstein, Danny, Chatziafratis, Vaggos, Citovsky, Gui, Rajagopalan, Anand, Mahdian, Mohammad, Azar, Yossi
Recently, Hierarchical Clustering (HC) has been considered through the lens of optimization. In particular, two maximization objectives have been defined. Moseley and Wang defined the \emph{Revenue} objective to handle similarity information given by
External link:
http://arxiv.org/abs/2101.10639
Author:
Menon, Aditya Krishna, Rajagopalan, Anand, Sumengen, Baris, Citovsky, Gui, Cao, Qin, Kumar, Sanjiv
Hierarchical clustering is a widely used approach for clustering datasets at multiple levels of granularity. Despite its popularity, existing algorithms such as hierarchical agglomerative clustering (HAC) are limited to the offline setting, and thus
External link:
http://arxiv.org/abs/1909.09667
Author:
Arkin, Esther M., Banik, Aritra, Carmi, Paz, Citovsky, Gui, Jia, Su, Katz, Matthew J., Mayer, Tyler, Mitchell, Joseph S. B.
Given $n$ pairs of points, $\mathcal{S} = \{\{p_1, q_1\}, \{p_2, q_2\}, \dots, \{p_n, q_n\}\}$, in some metric space, we study the problem of two-coloring the points within each pair, red and blue, to optimize the cost of a pair of node-disjoint netw
External link:
http://arxiv.org/abs/1710.00876
In this paper we study a natural special case of the Traveling Salesman Problem (TSP) with point-locational-uncertainty which we will call the {\em adversarial TSP} problem (ATSP). Given a metric space $(X, d)$ and a set of subsets $R = \{R_1, R_2, .
External link:
http://arxiv.org/abs/1705.06180
Author:
Arkin, Esther M., Banik, Aritra, Carmi, Paz, Citovsky, Gui, Katz, Matthew J., Mitchell, Joseph S.B., Simakov, Marina
Published in:
Discrete Applied Mathematics, 11 December 2018, 250:75-86