There are many important LLMs, and keeping track of them all can be challenging given how fast they're developed. This list was updated a few months after it was first made, and it already includes new versions and additions.
The term "best" is subjective here: this list comprises the most notable, intriguing, and popular LLMs and LMMs (large multimodal models), not necessarily the top performers. The focus is mainly on LLMs you can actually use, not just ones featured in research papers, to keep things applicable and useful.
Before we start, it's important to mention that many AI apps don't specify the LLMs they use. Sometimes we can guess or find hints in their marketing, but often we can't. That's why "Undisclosed" appears in the table: it means we're not aware of any major apps using the LLM, though some might be.
GPT
- Developer: OpenAI
- Parameters: More than 175 billion
- Access: API
OpenAI's Generative Pre-trained Transformer (GPT) models kickstarted the latest AI hype cycle. There are three main models currently available: GPT-3.5-turbo, GPT-4, and GPT-4-turbo. There's also a multimodal version called GPT-4 with Vision, or GPT-4V. All the different versions of GPT are general-purpose AI models with an API, and they're used by a diverse range of companies (including Microsoft, Duolingo, Stripe, Descript, Dropbox, and Zapier) to power countless different tools. Still, ChatGPT is probably the most popular demo of its powers.
Gemini
- Developer: Google
- Parameters: Nano available in 1.8 billion and 3.25 billion versions; others unknown
- Access: API
Google Gemini is a family of AI models from Google. The three models—Gemini Nano, Gemini Pro, and Gemini Ultra—are designed to operate on different devices, from smartphones to dedicated servers. While capable of generating text like an LLM, the Gemini models are also natively able to handle images, audio, video, code, and other kinds of information.
Gemini Pro also powers AI features throughout Google's apps, like Docs and Gmail, as well as Google's chatbot, which is confusingly also called Gemini (formerly Bard). Gemini 1.5 Pro is available to developers through Google AI Studio or Vertex AI, and Gemini Nano and Ultra are due out later in 2024.
Google Gemma
- Developer: Google
- Parameters: 2 billion and 7 billion
- Access: Open
Google Gemma is a family of open AI models from Google based on the same research and technology it used to develop Gemini. It's available in two sizes: 2 billion parameters and 7 billion parameters.
Llama 3
- Developer: Meta
- Parameters: 8 billion, 70 billion, and 400 billion (unreleased)
- Access: Open
Llama 3 is a family of open LLMs from Meta, the parent company of Facebook and Instagram. In addition to powering most AI features throughout Meta's apps, it's one of the most popular and powerful open LLMs, and you can download the source code yourself from GitHub. Because it's free for research and commercial uses, a lot of other LLMs use Llama 3 as a base.
There are 8 billion and 70 billion parameter versions available now, and a 400 billion parameter version is still in training. Meta's previous model family, Llama 2, is still available in 7 billion, 13 billion, and 70 billion parameter versions.
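To get a feel for what these parameter counts mean if you want to run an open model yourself, here is a minimal back-of-the-envelope sketch. It assumes memory is dominated by the weights (parameters times bytes per parameter) and ignores activations, caches, and runtime overhead, so real requirements will be higher. The function name and the precision table are illustrative, not part of any Meta tooling.

```python
# Rough weights-only memory estimate: parameters x bytes per parameter.
# Assumption: activations, KV cache, and framework overhead are ignored,
# so treat the result as a lower bound rather than a hardware requirement.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str = "fp16") -> float:
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * BYTES_PER_PARAM[precision]
    return total_bytes / 1e9


if __name__ == "__main__":
    # The three Llama 3 sizes mentioned above.
    for size in (8, 70, 400):
        row = ", ".join(
            f"{precision}: {weight_memory_gb(size, precision):.0f} GB"
            for precision in BYTES_PER_PARAM
        )
        print(f"{size}B parameters -> {row}")
```

By this estimate, the weights of an 8 billion parameter model at 16-bit precision take up about 16 GB, which is why the smaller versions are the ones people tend to run on their own hardware.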
Vicuna
- Developer: LMSYS Org
- Parameters: 7 billion, 13 billion, and 33 billion
- Access: Open
Vicuna is an open chatbot built off Meta's Llama LLM. It's widely used in AI research and as part of Chatbot Arena, a chatbot benchmark operated by LMSYS.
Claude 3
- Developer: Anthropic
- Parameters: Unknown
- Access: API
Claude 3 is arguably one of the most important competitors to GPT. Its three models—Haiku, Sonnet, and Opus—are designed to be helpful, honest, harmless, and crucially, safe for enterprise customers to use. As a result, companies like Slack, Notion, and Zoom have all partnered with Anthropic.
Like all the other proprietary LLMs, Claude 3 is only available as an API, though it can be further trained on your data and fine-tuned to respond how you need. You can also connect Claude to Zapier so you can automate Claude from all your other apps.
Stable Beluga and StableLM 2
- Developer: Stability AI
- Parameters: 1.6 billion, 7 billion, 12 billion, 13 billion, and 70 billion
- Access: Open
Stability AI is the group behind Stable Diffusion, one of the best AI image generators. They've also released a handful of open LLMs based on Llama, including Stable Beluga and StableLM 2, although they're nowhere near as popular as the image generator.
Coral
- Developer: Cohere
- Parameters: Unknown
- Access: API
Like Claude 3, Cohere's Coral LLM is designed for enterprise users. It similarly offers an API and allows organizations to train versions of its model on their own data, so it can accurately respond to specific queries from employees and customers.
Falcon
- Developer: Technology Innovation Institute
- Parameters: 1.3 billion, 7.5 billion, 40 billion, and 180 billion
- Access: Open
Falcon is a family of open LLMs that has consistently performed well on various AI benchmarks. It has models with up to 180 billion parameters and can outperform older models like GPT-3.5 in some tasks. It's released under a permissive Apache 2.0 license, so it's suitable for commercial and research use.
DBRX
- Developer: Databricks and Mosaic
- Parameters: 132 billion
- Access: Open
Databricks' DBRX LLM is the successor to Mosaic's MPT-7B and MPT-30B LLMs. It's one of the most powerful open LLMs. Interestingly, it's not built on top of Meta's Llama model, unlike a lot of other open models.
DBRX matches or surpasses previous-generation closed LLMs like GPT-3.5 on most benchmarks, and unlike them, it's available under an open license.
Mixtral 8x7B and 8x22B
- Developer: Mistral
- Parameters: 45 billion and 141 billion
- Access: Open
Mistral's Mixtral 8x7B and 8x22B models use a mixture-of-experts design, a set of sub-systems of which only a few are active for any given token, to efficiently outperform larger models. Despite using significantly fewer parameters at a time (and thus being capable of running faster or on less powerful hardware), they're able to beat other models like Llama 2 and GPT-3.5 in some benchmarks. They're also released under an Apache 2.0 license.
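The sub-system idea is usually called a sparse mixture of experts: a small router scores several expert networks for each token, and only the top-scoring few actually run. The toy sketch below illustrates that routing step with eight experts and two active per token, which is how Mixtral is generally described. Everything else is an assumption for illustration: the "experts" here just scale their input, the router weights are random, and none of the names come from Mistral's code.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    # Hypothetical stand-in: a real expert is a feed-forward network.
    return lambda x: [scale * v for v in x]


def route_top_k(x, router_weights, experts, k=2):
    """Run x through only the k highest-scoring experts and mix their outputs."""
    # Router: one linear score per expert.
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    # Gate weights are normalized over the selected experts only.
    gates = softmax([scores[i] for i in top])
    out = [0.0] * len(x)
    for gate, idx in zip(gates, top):
        expert_out = experts[idx](x)
        out = [o + gate * v for o, v in zip(out, expert_out)]
    return out, top


if __name__ == "__main__":
    random.seed(0)
    dim, n_experts = 4, 8
    experts = [make_expert(i + 1) for i in range(n_experts)]
    router = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(n_experts)]
    token = [0.5, -1.0, 0.25, 2.0]
    output, used = route_top_k(token, router, experts)
    print("experts used:", used)
    print("output:", output)
```

Because only two of the eight experts do any work for a given token, the compute per token is a fraction of what the total parameter count suggests, which is the efficiency the paragraph above refers to.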
Mistral has also released a more direct GPT competitor called Mistral Large that's available through cloud computing platforms.
XGen-7B
- Developer: Salesforce
- Parameters: 7 billion
- Access: Open
Salesforce's XGen-7B isn't an especially powerful or popular open model—it performs about as well as other open models with seven billion parameters. But I still think it's worth including because it highlights how many large tech companies have AI and machine learning departments that can just develop and launch their own LLMs.
Grok
- Developer: xAI
- Parameters: Unknown
- Access: Chatbot and open
Grok, an AI model and chatbot trained on data from X (formerly Twitter), doesn't really warrant a place on this list on its own merits, as it's neither very popular nor particularly good. Still, I'm listing it here because it was developed by xAI, the AI company founded by Elon Musk. While it might not be making waves in the AI scene, it's still getting plenty of media coverage, so it's worth knowing it exists.