It is that time of the year! NeurIPS, the most prestigious conference in AI, is happening this week in Vancouver, Canada. Once again, it has grown to the largest it has ever been, with over 4000 papers accepted to the main conference, 56 workshops, 14 tutorials, and 8 keynote talks. As part of our annual tradition at Zeta Alpha, we've curated a short guide to help you navigate the conference, highlighting some of the areas and research that caught our attention. Let's have a look!
Before we begin:
If you like technical deep dives into research papers and staying up to date with the latest advancements in AI R&D, you'll also love our monthly talk show, Trends in AI, where we cover all the breakthroughs, releases, and trending papers of the month. Sign up to join us live at the next one!
To start processing this enormous amount of information, we have created a semantic map of the published papers as a visualization of the conference's content. We then categorized the papers into clusters based on their semantic similarity, using a bit of our secret sauce involving Large Language Model agents to automatically label the sub-areas.
Pro tip: To freely navigate the graph and discover the papers that pique your interest the most within each cluster, consider viewing it in full-screen mode (or opening it in a separate tab).
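For readers curious how a map like this comes together, here is a minimal sketch of an embed, cluster, and label pipeline. It is not our actual production setup: the embedding model, the clustering choice, and the `llm_call` helper below are assumptions made purely for illustration.

```python
# Illustrative embed -> cluster -> label pipeline. NOT our exact production
# setup: the embedding model, the clustering method, and the `llm_call`
# helper are assumptions made for the sake of the example.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

def cluster_papers(texts: list[str], n_clusters: int = 12):
    """Embed paper titles/abstracts and group them by semantic similarity."""
    model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
    embeddings = model.encode(texts, normalize_embeddings=True)
    labels = KMeans(n_clusters=n_clusters, random_state=0).fit_predict(embeddings)
    return embeddings, labels

def label_cluster(sample_titles: list[str], llm_call) -> str:
    """Ask an LLM for a short sub-area name; `llm_call` is a hypothetical wrapper."""
    prompt = ("Suggest a short research-area name for a cluster with these papers:\n"
              + "\n".join(f"- {t}" for t in sample_titles))
    return llm_call(prompt)
```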
As the high-level overview presented above can still be overwhelming, we have taken a closer look at some of the major research areas and selected a few papers worth reading in more detail (and visiting in person if you are attending the conference!). Additionally, we have organized them in a Zeta Alpha collection for your convenience, and you can also browse all 72 accepted orals or have a go at the full collection of 4032 accepted main papers.
You'll notice that most of these papers are already well-known, with some having already garnered dozens of citations and inspired subsequent research, as they've been publicly available on arXiv for a while now. Nevertheless, this overview also serves as a compilation of some of the most influential works for 2024, with their publication at NeurIPS further underscoring their impact.
Throughout this guide, you can use Zeta Alpha to explore more of the papers that will be presented at NeurIPS this year. To do this, click the "🔎 Find Similar" button next to each featured title. This will take you to our platform, where you can discover similar work using neural search.
⚠️ Disclaimer ⚠️ Naturally, this cannot be a fully comprehensive guide, given the sheer number of papers we're working with here, but we hope this is a useful entry point to the conference.
1. Graph Neural Networks
Highlight paper:
Exploitation of a Latent Mechanism in Graph Contrastive Learning: Representation Scattering, by D. He, L. Shan, J. Zhao, H. Zhang, Z. Wang, W. Zhang - 🔎 Find Similar
This paper explores Graph Contrastive Learning (GCL) and identifies a common mechanism called representation scattering, which enhances the performance of various GCL frameworks. The authors introduce a new framework, Scattering Graph Representation Learning (SGRL), which incorporates a representation scattering mechanism and a topology-based constraint mechanism to improve representation diversity and prevent excessive scattering. Empirical evaluations on several benchmarks showcase SGRL's effectiveness and superiority over existing GCL methods.
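As a rough intuition for what "scattering" means, the toy regularizer below pushes node embeddings away from their batch centroid to encourage diversity. It is an illustrative simplification only, not SGRL's actual mechanism or its topology-based constraint.

```python
import torch
import torch.nn.functional as F

def scattering_regularizer(z: torch.Tensor) -> torch.Tensor:
    """Toy 'representation scattering' term: push normalized node embeddings
    away from their batch centroid to increase diversity. Illustrative only;
    SGRL's actual mechanism and its topology-based constraint are more involved."""
    z = F.normalize(z, dim=-1)
    center = z.mean(dim=0, keepdim=True)
    # Maximize the average distance from the center -> minimize its negation.
    return -(z - center).norm(dim=-1).mean()
```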
2. Machine Learning Optimization
Highlight paper:
Convolutional Differentiable Logic Gate Networks, by F. Petersen, H. Kuehne, C. Borgelt, J. Welzel, S. Ermon - 🔎 Find Similar
This paper introduces an approach to learning logic gate networks through differentiable relaxation, allowing for faster and more efficient inference than conventional neural networks. The authors extend this idea with deep logic gate tree convolutions, logical OR pooling, and residual initializations, enabling the scaling of logic gate networks. Their model achieves 86.29% accuracy on CIFAR-10 using only 61 million logic gates, significantly improving over the state-of-the-art while being 29 times smaller.
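The core trick behind differentiable logic gate networks is to relax each gate into a softmax-weighted mixture of all 16 two-input Boolean functions, so the choice of gate can be learned by gradient descent. The snippet below is a minimal sketch of that idea; the paper's convolutional tree structure, OR pooling, and residual initializations are omitted.

```python
import torch
import torch.nn as nn

def soft_gates(a, b):
    """Real-valued relaxations of the 16 two-input Boolean functions,
    interpreting a, b in [0, 1] as probabilities."""
    return torch.stack([
        torch.zeros_like(a), a * b, a - a * b, a,
        b - a * b, b, a + b - 2 * a * b, a + b - a * b,
        1 - (a + b - a * b), 1 - (a + b - 2 * a * b), 1 - b, 1 - b + a * b,
        1 - a, 1 - a + a * b, 1 - a * b, torch.ones_like(a),
    ], dim=-1)

class DifferentiableLogicGate(nn.Module):
    """One learnable gate: a softmax over the 16 candidate functions.
    A minimal sketch of the idea, not the authors' optimized implementation."""
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(16))

    def forward(self, a, b):
        probs = torch.softmax(self.logits, dim=0)      # soft choice of gate
        return (soft_gates(a, b) * probs).sum(dim=-1)  # expected gate output
```

After training, each gate is discretized to its most probable Boolean function, which is what makes inference so cheap.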
→ Runner Up: The Road Less Scheduled
3. Attention & Transformer Variants
Highlight paper:
xLSTM: Extended Long Short-Term Memory, by M. Beck, K. Pöppel, M. Spanring, A. Auer, O. Prudnikova, M. K. Kopp, G. Klambauer, J. Brandstetter, S. Hochreiter - 🔎 Find Similar
This work proposes a few architectural modifications to the traditional LSTM model to address its limitations, particularly in language modeling. The authors introduce two new variants: the sLSTM, which includes scalar memory and exponential gating, and the mLSTM, which features matrix memory and a covariance update rule for improved parallelization. These modifications improve the LSTM's performance and scalability, making it competitive with modern models like Transformers and State Space Models. The xLSTM models demonstrate favorable results in both efficiency and scaling when compared to state-of-the-art methods.
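To give a flavor of the exponential gating and normalizer state, here is a simplified single step of an sLSTM-style cell. Recurrent weight matrices, the mLSTM's matrix memory, and the block structure are left out, and the exact formulation in the paper differs in details.

```python
import torch

def slstm_step(i_tilde, f_tilde, z_tilde, o_tilde, c, n, m):
    """One simplified sLSTM-style step with exponential gating, a normalizer
    state, and log-space stabilization. All arguments are tensors of the same
    shape; gate pre-activations are assumed to be computed elsewhere."""
    m_new = torch.maximum(f_tilde + m, i_tilde)   # log-space stabilizer state
    i = torch.exp(i_tilde - m_new)                # exponential input gate
    f = torch.exp(f_tilde + m - m_new)            # exponential forget gate
    c_new = f * c + i * torch.tanh(z_tilde)       # cell state
    n_new = f * n + i                             # normalizer state
    h = torch.sigmoid(o_tilde) * (c_new / n_new)  # normalized hidden state
    return h, c_new, n_new, m_new
```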
4. Time Series & Neural Dynamics
Highlight paper:
Probabilistic Emulation of a Global Climate Model with Spherical DYffusion, by S. R. Cachay, B. Henn, O. Watt-Meyer, C. S. Bretherton, R. Yu - 🔎 Find Similar
This paper introduces Spherical DYffusion, a conditional generative model for probabilistic emulation of global climate models. It integrates the dynamics-informed diffusion framework (DYffusion) with the Spherical Fourier Neural Operator architecture to produce accurate and physically consistent climate simulations. The model achieves stable 100-year simulations at 6-hourly timesteps with low computational overhead, outperforming existing methods in climate model emulation and demonstrating promising ensembling capabilities.
5. 3D Scene Understanding
Highlight paper:
Humanoid Locomotion as Next Token Prediction, by I. Radosavovic, J. Rajasegaran, B. Shi, B. Zhang, S. Kamat, K. Sreenath, T. Darrell, J. Malik - 🔎 Find Similar
This work from UC Berkeley presents a novel approach to humanoid control by framing it as a next-token prediction problem, similar to language modeling. The authors train a causal transformer model to predict sensorimotor sequences in a modality-aligned manner, allowing the model to handle data with missing modalities. The model is trained on diverse data sources, including neural network policies, model-based controllers, motion capture, and YouTube videos, and it enables a humanoid robot to walk zero-shot in real-world environments, such as the streets of San Francisco, even with limited training data.
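One way to picture the modality-aligned objective is a standard next-token loss that simply masks out positions whose modality is missing (e.g., actions in trajectories mined from YouTube). The sketch below is illustrative and not the authors' implementation; the tokenization of sensorimotor data is assumed to happen elsewhere.

```python
import torch
import torch.nn.functional as F

def masked_next_token_loss(pred, target, modality_mask):
    """Next-token loss over interleaved sensorimotor tokens that ignores
    positions whose modality is unobserved. Illustrative sketch only.
    pred: (B, T, V) logits; target: (B, T) token ids;
    modality_mask: (B, T) with 1 where the target modality is observed."""
    loss = F.cross_entropy(pred.transpose(1, 2), target, reduction="none")  # (B, T)
    return (loss * modality_mask).sum() / modality_mask.sum().clamp(min=1)
```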
6. Large Language Models
Highlight paper:
Rho-1: Not All Tokens Are What You Need, by Z. Lin, Z. Gou, Y. Gong, X. Liu, Y. Shen, R. Xu, C. Lin, Y. Yang, J. Jiao, N. Duan, W. Chen - 🔎 Find Similar
This paper introduces RHO-1, a language model that uses Selective Language Modeling to improve pretraining efficiency by focusing on the most useful tokens. Unlike traditional models that apply loss to all tokens, RHO-1 selectively trains on tokens that align with the desired distribution. This approach significantly improves few-shot accuracy and overall performance, achieving state-of-the-art results on the MATH dataset with far fewer pretraining tokens than other models.
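Conceptually, Selective Language Modeling scores each token by its excess loss relative to a reference model and only backpropagates through the top-scoring fraction. Here is a hedged sketch of that idea; the keep ratio and other details are illustrative, not the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def selective_lm_loss(model_logits, ref_logits, targets, keep_ratio=0.6):
    """Selective Language Modeling sketch: train only on tokens whose loss
    exceeds a reference model's loss the most. Illustrative hyperparameters.
    model_logits, ref_logits: (B, T, V); targets: (B, T)."""
    per_tok = F.cross_entropy(model_logits.transpose(1, 2), targets, reduction="none")
    with torch.no_grad():
        ref_tok = F.cross_entropy(ref_logits.transpose(1, 2), targets, reduction="none")
        excess = per_tok.detach() - ref_tok                  # (B, T) excess loss
        k = max(1, int(keep_ratio * excess.numel()))
        threshold = excess.flatten().topk(k).values.min()    # k-th largest score
        mask = (excess >= threshold).float()
    return (per_tok * mask).sum() / mask.sum()
```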
7. Domain Generalization
Highlight paper:
Many-Shot In-Context Learning, by R. Agarwal, A. Singh, L. M. Zhang, B. Bohnet, L. Rosias, S. C. Y. Chan, B. Zhang, A. Anand, Z. Abbas, A. Nova, J. D. Co-Reyes, E. Chu, F. Behbahani, A. Faust, H. Larochelle - 🔎 Find Similar
This work from Google DeepMind explores the potential of many-shot in-context learning (ICL) in large language models, using the expanded context windows to include hundreds or thousands of examples. The authors find significant performance improvements across various tasks when transitioning from few-shot to many-shot ICL. They introduce "Reinforced ICL" using model-generated rationales and "Unsupervised ICL" using only domain-specific inputs, both effective in this new regime. Additionally, they show empirically that many-shot ICL can overcome pretraining biases, learn high-dimensional functions, and perform comparably to fine-tuning.
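As a rough illustration of Reinforced ICL, the sketch below fills a long context with model-generated rationales that are kept only when they arrive at the known correct answer. The `generate` helper, the prompt format, and the correctness check are placeholders, not the paper's setup.

```python
def build_reinforced_icl_prompt(problems, answers, generate, shots=500):
    """Sketch of 'Reinforced ICL': pack a many-shot prompt with model-generated
    rationales filtered by final-answer correctness. `generate` is a hypothetical
    LLM call; the prompt format and filter are assumptions for illustration."""
    examples = []
    for problem, answer in zip(problems, answers):
        rationale = generate(f"Solve step by step:\n{problem}")
        if rationale.strip().endswith(str(answer)):       # crude correctness filter
            examples.append(f"Problem: {problem}\nSolution: {rationale}")
        if len(examples) >= shots:
            break
    return "\n\n".join(examples)
```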
8. Multimodal Language Models
Highlight paper:
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs, by S. Tong, E. L. Brown II, P. Wu, S. Woo, A. J. Iyer, S. C. Akula, S. Yang, J. Yang, M. Middepogu, Z. Wang, X. Pan, R. Fergus, Y. LeCun, S. Xie - 🔎 Find Similar
The paper introduces Cambrian-1, a family of vision-centric multimodal large language models (MLLMs). It addresses the gap between language models and visual representation learning by evaluating various visual representations for MLLMs. The study introduces a new benchmark, CV-Bench, and proposes the Spatial Vision Aggregator to improve visual grounding. Cambrian-1 achieves state-of-the-art performance and provides comprehensive resources, including model weights, code, datasets, and detailed instruction tuning and evaluation recipes.
9. Text-to-Image Generation
Highlight paper:
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction, by K. Tian, Y. Jiang, Z. Yuan, B. Peng, L. Wang - 🔎 Find Similar
This work introduces Visual AutoRegressive modeling (VAR), a new paradigm for image generation that uses a coarse-to-fine "next-scale prediction" approach instead of the traditional raster-scan "next-token prediction." VAR significantly improves image generation performance, surpassing diffusion transformers in metrics like Fréchet Inception Distance (FID) and Inception Score (IS) on ImageNet, while also being 20 times faster. The model demonstrates strong zero-shot generalization capabilities and adheres to scaling laws similar to those observed in LLMs.
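The generation loop behind next-scale prediction can be summarized as: predict an entire token map at each resolution, conditioned on upsampled versions of all coarser maps. The sketch below conveys only the control flow; the `transformer` callable and the multi-scale quantizer it implies are hypothetical stand-ins for the paper's components.

```python
import torch
import torch.nn.functional as F

def next_scale_generation(transformer, scales=(1, 2, 4, 8, 16)):
    """Coarse-to-fine sketch of next-scale prediction. `transformer` is a
    hypothetical module that returns an (h, h, V) logit map given the
    upsampled coarser token maps; sampling is greedy for simplicity."""
    token_maps = []
    for h in scales:
        # Upsample every coarser token map to the current resolution.
        context = [F.interpolate(m.float()[None, None], size=(h, h), mode="nearest")
                   .long().squeeze() for m in token_maps]
        logits = transformer(context, out_size=h)   # (h, h, V) logits for this scale
        token_maps.append(logits.argmax(dim=-1))    # predict the whole map at once
    return token_maps
```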
10. Reinforcement Learning
Highlight paper:
Iterative Reasoning Preference Optimization, by R. Y. Pang, W. Yuan, H. He, K. Cho, S. Sukhbaatar, J. E. Weston - 🔎 Find Similar
This paper presents an iterative approach to improve reasoning tasks in language models by optimizing preferences between competing Chain-of-Thought candidates. The method uses a modified Direct Preference Optimization loss with an additional negative log-likelihood term. The approach shows significant improvements in accuracy on datasets like GSM8K, MATH, and ARC-Challenge for the Llama-2-70B-Chat model, outperforming other similarly-sized models.
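The training objective can be summarized as a DPO term over a winning and a losing chain-of-thought plus a length-normalized NLL term on the winner. The sketch below assumes summed sequence log-probabilities as inputs, and the coefficients are illustrative rather than the paper's exact values.

```python
import torch
import torch.nn.functional as F

def dpo_plus_nll_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected,
                      chosen_len, beta=0.1, alpha=1.0):
    """DPO preference loss over a winning/losing chain-of-thought pair plus an
    extra NLL term on the winner, in the spirit of Iterative Reasoning
    Preference Optimization. Inputs are summed sequence log-probabilities
    under the policy and reference models; alpha and beta are illustrative."""
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    dpo = -F.logsigmoid(margin)             # standard DPO term
    nll = -logp_chosen / chosen_len         # length-normalized NLL on the winner
    return dpo + alpha * nll
```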
→ Runner Up: Span-Based Optimal Sample Complexity for Weakly Communicating and General Average Reward MDPs
If you prefer a more guided approach, you can find a 6-minute overview of these papers on YouTube:
And with that, we conclude our coverage of NeurIPS 2024! We hope this guide has helped you stay up to date with the latest developments and current trends in the field of AI. Enjoy Discovery!
Did we miss anything? Let us know on our social channels: @ZetaVector.