Výsledky vyhledávání - "Takezawa, Yuki"

Report

Polyak Meets Parameter-free Clipped Gradient Descent

Autor: Takezawa, Yuki, Bao, Han, Sato, Ryoma, Niwa, Kenta, Yamada, Makoto

Gradient descent and its variants are de facto standard algorithms for training machine learning models. As gradient descent is sensitive to its hyperparameters, we need to tune the hyperparameters carefully using a grid search, but it is time-consum

Externí odkaz: http://arxiv.org/abs/2405.15010

Zobrazit plný text záznamu

Report

PhiNets: Brain-inspired Non-contrastive Learning Based on Temporal Prediction Hypothesis

Autor: Ishikawa, Satoki, Yamada, Makoto, Bao, Han, Takezawa, Yuki

SimSiam is a prominent self-supervised learning method that achieves impressive results in various vision tasks under static environments. However, it has two critical issues: high sensitivity to hyperparameters, especially weight decay, and unsatisf

Externí odkaz: http://arxiv.org/abs/2405.14650

Zobrazit plný text záznamu

Report

An Empirical Study of Self-supervised Learning with Wasserstein Distance

Autor: Yamada, Makoto, Takezawa, Yuki, Houry, Guillaume, Dusterwald, Kira Michaela, Sulem, Deborah, Zhao, Han, Tsai, Yao-Hung Hubert

In this study, we delve into the problem of self-supervised learning (SSL) utilizing the 1-Wasserstein distance on a tree structure (a.k.a., Tree-Wasserstein distance (TWD)), where TWD is defined as the L1 distance between two tree-embedded vectors.

Externí odkaz: http://arxiv.org/abs/2310.10143

Zobrazit plný text záznamu

Report

Embarrassingly Simple Text Watermarks

Autor: Sato, Ryoma, Takezawa, Yuki, Bao, Han, Niwa, Kenta, Yamada, Makoto

We propose Easymark, a family of embarrassingly simple yet effective watermarks. Text watermarking is becoming increasingly important with the advent of Large Language Models (LLM). LLMs can generate texts that cannot be distinguished from human-writ

Externí odkaz: http://arxiv.org/abs/2310.08920

Zobrazit plný text záznamu

Report

Necessary and Sufficient Watermark for Large Language Models

Autor: Takezawa, Yuki, Sato, Ryoma, Bao, Han, Niwa, Kenta, Yamada, Makoto

In recent years, large language models (LLMs) have achieved remarkable performances in various NLP tasks. They can generate texts that are indistinguishable from those written by humans. Such remarkable performance of LLMs increases their risk of bei

Externí odkaz: http://arxiv.org/abs/2310.00833

Zobrazit plný text záznamu

Report

Beyond Exponential Graph: Communication-Efficient Topologies for Decentralized Learning via Finite-time Convergence

Autor: Takezawa, Yuki, Sato, Ryoma, Bao, Han, Niwa, Kenta, Yamada, Makoto

Decentralized learning has recently been attracting increasing attention for its applications in parallel computation and privacy preservation. Many recent studies stated that the underlying network topology with a faster consensus rate (a.k.a. spect

Externí odkaz: http://arxiv.org/abs/2305.11420

Zobrazit plný text záznamu

Report

Momentum Tracking: Momentum Acceleration for Decentralized Deep Learning on Heterogeneous Data

Autor: Takezawa, Yuki, Bao, Han, Niwa, Kenta, Sato, Ryoma, Yamada, Makoto

SGD with momentum is one of the key components for improving the performance of neural networks. For decentralized learning, a straightforward approach using momentum is Distributed SGD (DSGD) with momentum (DSGDm). However, DSGDm performs worse than

Externí odkaz: http://arxiv.org/abs/2209.15505

Zobrazit plný text záznamu

Report

Approximating 1-Wasserstein Distance with Trees

Autor: Yamada, Makoto, Takezawa, Yuki, Sato, Ryoma, Bao, Han, Kozareva, Zornitsa, Ravi, Sujith

Wasserstein distance, which measures the discrepancy between distributions, shows efficacy in various types of natural language processing (NLP) and computer vision (CV) applications. One of the challenges in estimating Wasserstein distance is that i

Externí odkaz: http://arxiv.org/abs/2206.12116

Zobrazit plný text záznamu

Report

Theoretical Analysis of Primal-Dual Algorithm for Non-Convex Stochastic Decentralized Optimization

Autor: Takezawa, Yuki, Niwa, Kenta, Yamada, Makoto

In recent years, decentralized learning has emerged as a powerful tool not only for large-scale machine learning, but also for preserving privacy. One of the key challenges in decentralized learning is that the data distribution held by each node is

Externí odkaz: http://arxiv.org/abs/2205.11979

Zobrazit plný text záznamu

Report

Communication Compression for Decentralized Learning with Operator Splitting Methods

Autor: Takezawa, Yuki, Niwa, Kenta, Yamada, Makoto

In decentralized learning, operator splitting methods using a primal-dual formulation (e.g., the Edge-Consensus Learning (ECL)) has been shown to be robust to heterogeneous data and has attracted significant attention in recent years. However, in the

Externí odkaz: http://arxiv.org/abs/2205.03779

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání