NTT Scientists Co-author 11 Papers Selected for NeurIPS 2021

Papers Address Machine Learning, Deep Learning, Optimization, Generative Modeling and Other Topics

Sunnyvale, Calif., and Tokyo – December 6, 2021 – NTT Research, Inc. and NTT R&D, divisions of NTT Corp. (TYO:9432), today announced that 11 papers co-authored by researchers from several of their laboratories were selected for presentation at NeurIPS 2021, the 35th annual conference of the Neural Information Processing Systems Foundation. The conference takes place from Dec. 6 to Dec. 14. Scientists from the NTT Research Physics & Informatics (PHI) Lab and Cryptography & Information Security (CIS) Lab are presenting four papers. Scientists from NTT Corp.’s Computer and Data Science (CD), Human Informatics (HI), Social Informatics (SI) and Communication Science (CS) Labs are presenting seven papers.

The papers from NTT Research were co-authored by Drs. Sanjam Garg, Jess Riedel and Hidenori Tanaka. The papers from NTT R&D were co-authored by Drs. Yasunori Akagi, Naoki Marumo, Hideaki Kim, Takeshi Kurashima, Hiroyuki Toda, Daiki Chijiwa, Shin’ya Yamaguchi, Yasutoshi Ida, Kenji Umakoshi, Tomohiro Inoue, Shinsaku Sakaue, Kengo Nakamura, Futoshi Futami, Tomoharu Iwata, Naonori Ueda, Masahiro Nakano, Yasuhiro Fujiwara, Akisato Kimura, Takeshi Yamada and Atsutoshi Kumagai. These papers address issues related to deep learning, generative modeling, graph learning, kernel methods, machine learning, meta-learning and optimization. One paper falls in the datasets and benchmarks track (“RAFT: A Real-World Few-Shot Text Classification Benchmark”) and two were selected as spotlights (“Pruning Randomly Initialized Neural Networks with Iterative Randomization” and “Fast Bayesian Inference for Gaussian Cox Processes via Path Integral Formulation”). For titles, co-authors (with NTT affiliations), abstracts and times, see the following list:

  • “A Separation Result Between Data-oblivious and Data-aware Poisoning Attacks,” Samuel Deng, Sanjam Garg (CIS Lab), Somesh Jha, Saeed Mahloujifar, Mohammad Mahmoody and Abhradeep Guha Thakurta. Most poisoning attacks require full knowledge of the training data. This leaves open the possibility of achieving the same attack results using poisoning attacks that do not have full knowledge of the clean training set. The results of this theoretical study of that problem show that the data-aware and data-oblivious settings are fundamentally different: the same attack or defense results are not achievable in both scenarios. Dec. 7, 8:30 AM (PT)
  • “RAFT: A Real-World Few-Shot Text Classification Benchmark,” Neel Alex, Eli Lifland, Lewis Tunstall, Abhishek Thakur, Pegah Maham, C. Jess Riedel (PHI Lab), Emmie Hine, Carolyn Ashurst, Paul Sedille, Alexis Carlier, Michael Noetel and Andreas Stuhlmüller – datasets and benchmarks track. Large pre-trained language models have shown promise for few-shot learning, but existing benchmarks are not designed to measure progress in applied settings. The Real-world Annotated Few-shot Tasks (RAFT) benchmark focuses on naturally occurring tasks and uses an evaluation setup that mirrors deployment. Baseline evaluations on RAFT reveal that current techniques struggle in several areas. Human baselines show that some classification tasks are difficult for non-expert humans. Yet even non-expert human baseline F1 scores exceed GPT-3 by an average of 0.11. The RAFT datasets and leaderboard will track which model improvements translate into real-world benefits. Dec. 7, 8:30 AM (PT)
  • “Non-approximate Inference for Collective Graphical Models on Path Graphs via Discrete Difference of Convex Algorithm,” Yasunori Akagi (HI Labs), Naoki Marumo (CS Labs), Hideaki Kim (HI Labs), Takeshi Kurashima (HI Labs) and Hiroyuki Toda (HI Labs). The Collective Graphical Model (CGM) is a probabilistic approach to the analysis of aggregated data. One of the most important operations in CGM is maximum a posteriori (MAP) inference of unobserved variables. This paper proposes a novel method for MAP inference for CGMs on path graphs that neither approximates the objective function nor relaxes the constraints. The method is based on the discrete difference of convex algorithm and minimum convex cost flow algorithms. Experiments show that the proposed method delivers higher quality solutions than the conventional approach. Dec. 8, 12:30 AM (PT)
  • “Pruning Randomly Initialized Neural Networks with Iterative Randomization,” Daiki Chijiwa (CD Labs), Shin’ya Yamaguchi (CD Labs), Yasutoshi Ida (CD Labs), Kenji Umakoshi (SI Labs) and Tomohiro Inoue (SI Labs) – spotlight paper. This paper develops a novel approach to training neural networks. In contrast to conventional weight optimization (e.g., SGD), this approach does not directly optimize network weights; instead, it iterates weight pruning and randomization. The authors prove that this approach has the same approximation power as the conventional one. Dec. 8, 12:30 AM (PT)
  • “Permuton-Induced Chinese Restaurant Process,” Masahiro Nakano (CS Labs), Yasuhiro Fujiwara (CS Labs), Akisato Kimura (CS Labs), Takeshi Yamada (CS Labs) and Naonori Ueda (CS Labs). This paper proposes a probabilistic model for finding clusters in relational data, including networks and graphs, that does not require manual tuning of the model complexity (e.g., the number of clusters). The proposed model is a Bayesian nonparametric model, a kind of stochastic process with infinite complexity. One of its notable advantages is its ability to represent itself accurately with a finite, variable number of parameters, depending on the size and quality of the input data. Dec. 8, 4:30 PM (PT)
  • “Meta-Learning for Relative Density-Ratio Estimation,” Atsutoshi Kumagai (CD Labs), Tomoharu Iwata (CS Labs) and Yasuhiro Fujiwara (CS Labs). This paper proposes a meta-learning method for relative density-ratio estimation (DRE), which can accurately perform relative DRE from a few examples by using multiple different datasets. This method can improve performance even with small data in various applications such as outlier detection and domain adaptation. Dec. 9, 12:30 AM (PT)
  • “Beyond BatchNorm: Towards a General Understanding of Normalization in Deep Learning,” E.S. Lubana, R.P. Dick and H. Tanaka (PHI Lab). Inspired by BatchNorm, there has been an explosion of normalization layers in deep learning. A multitude of beneficial properties in BatchNorm explains its success. However, given the pursuit of alternative normalization layers, these properties need to be generalized so that any given layer’s success/failure can be accurately predicted. This work advances towards that goal by extending known properties of BatchNorm in randomly initialized deep neural networks (DNNs) to several recently proposed normalization layers. Dec. 9, 12:30 AM (PT)
  • “Fast Bayesian Inference for Gaussian Cox Processes via Path Integral Formulation,” Hideaki Kim (HI Labs) – spotlight paper. This paper proposes a novel Bayesian inference scheme for Gaussian Cox processes by exploiting a physics-inspired path integral formulation. The proposed scheme does not rely on domain discretization, scales linearly with the number of observed events, has a lower complexity than the state-of-the-art variational Bayesian schemes with respect to the number of inducing points, and is applicable to a wide range of Gaussian Cox processes with various types of link functions. This scheme is especially beneficial under the multi-dimensional input setting, where the number of inducing points tends to be large. Dec. 9, 4:30 PM (PT)
  • “Noether’s Learning Dynamics: The Role of Kinetic Symmetry Breaking in Deep Learning,” Hidenori Tanaka (PHI Lab) and Daniel Kunin. This paper develops a theoretical framework to study the “geometry of learning dynamics” in neural networks and reveals a key mechanism of explicit symmetry breaking behind the efficiency and stability of modern neural networks. It models the discrete learning dynamics of gradient descent using a continuous-time Lagrangian formulation, identifies “kinetic symmetry breaking” (KSB), generalizes Noether’s theorem to take KSB into account, and derives “Noether’s Learning Dynamics” (NLD). Finally, it applies NLD to neural networks with normalization layers to reveal how KSB introduces a mechanism of implicit adaptive optimization. Dec. 9, 4:30 PM (PT)

Designated co-authors of these papers will participate in the event through poster and short recorded presentations. Registration for the conference provides access to all interactive elements of this year’s program. Last year, NeurIPS 2020 accepted papers co-authored by Drs. Tanaka, Iwata and Nakano.

About NTT Research

NTT Research opened its offices in July 2019 as a new Silicon Valley startup to conduct basic research and advance technologies that promote positive change for humankind. Currently, three labs are housed at NTT Research facilities in Sunnyvale: the Physics and Informatics (PHI) Lab, the Cryptography and Information Security (CIS) Lab, and the Medical and Health Informatics (MEI) Lab. The organization aims to upgrade reality in three areas: 1) quantum information, neuroscience and photonics; 2) cryptography and information security; and 3) medical and health informatics. NTT Research is part of NTT, a global technology and business solutions provider with an annual R&D budget of $3.6 billion.

###

NTT and the NTT logo are registered trademarks or trademarks of NIPPON TELEGRAPH AND TELEPHONE CORPORATION and/or its affiliates. All other referenced product names are trademarks of their respective owners. © 2021 NIPPON TELEGRAPH AND TELEPHONE CORPORATION

NTT Research Contact:
Chris Shaw
Vice President, Global Marketing
NTT Research
+1-312-888-5412
chris.shaw@ntt-research.com

Media Contact:
Stephen Russell
Wireside Communications®
For NTT Research
+1-804-362-7484
srussell@wireside.com