LLaMA 3 has sent shockwaves through the open-source community, followed by new vision-language models, LLMs geared for the enterprise, and a new contender for the multimodal throne from Reka AI. Join us for an overview of the latest breakthroughs and developments in AI R&D, along with a curated list of the month's top 10 trending research papers.
News Articles
OpenAI Researchers, Including Ally of Sutskever, Fired for Alleged Leaking
Adobe Firefly used thousands of Midjourney images in training its 'ethical AI' model
Intel Introduces Gaudi 3 AI Accelerator: Going Bigger and Aiming Higher In AI Market
Eric Schmidt-backed Augment, a GitHub Copilot rival, launches out of stealth with $252M
Model Releases
Meta: LLaMA 3
Mistral AI: Mixtral 8x22B
Snowflake: Snowflake Arctic
Reka: Reka Core
Microsoft: Phi-3
WizardLM: WizardLM-2 (currently unavailable)
EleutherAI: Pile-T5
Apple: OpenELM
Hugging Face: Idefics2
AI21 Labs: Jamba-Instruct
Trending AI papers for May 2024
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders - P. BehnamGhader et al. (McGill University, ServiceNow) - 8 Apr. 2024
PromptReps: Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval - S. Zhuang et al. (CSIRO, U. Waterloo, U. Queensland) - 29 Apr. 2024
DUQGen: Effective Unsupervised Domain Adaptation of Neural Rankers by Diversifying Synthetic Query Generation - R. Chandradevan et al. (Emory University) - 3 Apr. 2024
Better & Faster Large Language Models via Multi-token Prediction - F. Gloeckle et al. (FAIR) - 30 Apr. 2024
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models - P. Verga et al. (Cohere) - 29 Apr. 2024
Many-Shot In-Context Learning - R. Agarwal et al. (Google DeepMind) - 16 Apr. 2024
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention - T. Munkhdalai et al. (Google) - 10 Apr. 2024
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions - E. Wallace et al. (OpenAI) - 19 Apr. 2024
Chinchilla Scaling: A replication attempt - T. Besiroglu et al. (Epoch AI) - 15 Apr. 2024
Long-form music generation with latent diffusion - Z. Evans et al. (Stability AI) - 16 Apr. 2024
And a few runners-up:
LongEmbed: Extending Embedding Models for Long Context Retrieval - D. Zhu et al. (Peking University, Microsoft) - 18 Apr. 2024
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length - X. Ma et al. (USC, Meta AI) - 12 Apr. 2024
ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models - J. Baek et al. (KAIST, Microsoft Research) - 11 Apr. 2024
Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws - Z. Allen-Zhu & Y. Li (FAIR, MBZUAI) - 8 Apr. 2024
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences - C. Rosset et al. (Microsoft Research) - 4 Apr. 2024
You can find an annotated collection of these papers in Zeta Alpha, allowing you to effortlessly discover relevant literature and dive deeper into any topic that interests you!
The full-length recording of our May 2024 Trends in AI webinar is available on our YouTube channel! For a short video overview describing the trending papers, check this out:
To join us live for the next installment of the Trends in AI webinar, sign up here.