So you’re trying to find the perfect LLM that’s gonna make your life easier, but there are just so many options out there, right? Like, where do you even start? Well, we’ve done the digging for you and narrowed it down to the top ten 7-9 billion parameter models that are actually worth your time. Let’s dive in and find the one that’s gonna make your language processing dreams come true!
Contact me if you think some other model should be on the list.
LLaMA3-iterative-DPO-final
Based on the Llama 3 8B model, LLaMA3-iterative-DPO-final sets a new standard for instruct models in its class. This state-of-the-art model outperforms all similarly sized models, and even many larger ones, on key benchmarks like Alpaca-Eval-V2, MT-Bench, and Chat-Arena-Hard. Its Llama 3 base was trained on over 15 trillion tokens, while the alignment stage relies entirely on open-source datasets, and the result excels in chat tasks and academic benchmarks alike. The model is trained with a simple, efficient online RLHF recipe, making it cheaper and easier to tune than traditional approaches. Try it out and see the difference for yourself!
Meta-Llama-3-8B-Instruct
The Meta-Llama-3-8B-Instruct model is a powerhouse for dialogue applications, standing out as one of the best in its class. Fine-tuned specifically for conversations, it excels in helpfulness and safety. Trained on a massive 15 trillion tokens, it outperforms many open-source chat models on key industry benchmarks. For instance, it scores an impressive 68.4 on the MMLU 5-shot test, leaving predecessors like Llama 2 7B far behind. Whether you’re building a chatbot or any interactive AI, the Llama-3-8B-Instruct model is a top-notch choice.
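To give a feel for what “fine-tuned for conversations” means under the hood, here’s a minimal sketch of the chat markup Llama 3 Instruct expects. The function name `build_llama3_prompt` is just illustrative; in real code you’d let the `transformers` tokenizer’s `apply_chat_template` do this, but the special tokens below follow Llama 3’s documented chat format.

```python
# Sketch: hand-rolling the Llama 3 Instruct prompt markup.
# (Illustrative only -- prefer tokenizer.apply_chat_template in practice.)

def build_llama3_prompt(messages):
    """Format [{"role": ..., "content": ...}] dicts into a raw prompt."""
    prompt = "<|begin_of_text|>"
    for m in messages:
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Leave an open assistant header so the model generates the reply.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

demo = build_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hi!"},
])
```

Each turn is wrapped in header tokens and terminated with `<|eot_id|>`, which is why generation code for Llama 3 typically treats `<|eot_id|>` as a stop token.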
MAmmoTH2-8B-Plus
Built on the Llama 3 8B framework, MAmmoTH2-8B-Plus excels at reasoning thanks to cutting-edge instruction tuning. By cleverly gathering 10 million instruction-response pairs from a vast web corpus, the MAmmoTH2 series achieves top-notch performance on reasoning benchmarks. Further fine-tuning with public datasets gives MAmmoTH2-Plus the edge, setting new records in reasoning and chatbot benchmarks. While it may lack some of the personality of the original Llama 3, it offers a budget-friendly and innovative approach to enhancing LLM reasoning abilities.
Yi-1.5-9B-Chat
Yi-1.5-9B-Chat takes the already impressive Yi model to the next level. With an additional 500B tokens of high-quality training data and 3M diverse fine-tuning samples, it’s no surprise that it outperforms its predecessor in areas like coding, math, and instruction-following. While it still shines in language understanding and commonsense reasoning, I’ve got to admit that I still have a soft spot for LLaMA 3 – there’s something about its friendly, non-judgmental vibe that just feels more personal.
WizardLM-2-7B
WizardLM-2 7B packs a punch, achieving performance comparable to models ten times its size. This makes it ideal for users with limited computational resources. While smaller than its counterparts, WizardLM-2 7B outshines Qwen1.5-14B-Chat and Starling-LM-7B-beta, and holds its own against the mighty Qwen1.5-32B-Chat.
Important Note: WizardLM-2 7B is heavily censored and may not be suitable for all tasks.
FuseChat-7B-VaRM
Another contender in the local LLM ring is FuseChat-7B-VaRM, and this one’s a beast! It mashes together three super strong chat LLMs (NH2-Mixtral-8x7B, NH2-Solar-10.7B, and OpenChat-3.5-7B) to create a real conversation champion.
Here’s what makes FuseChat-7B-VaRM stand out from the crowd:
- Blows the Doors Off Benchmarks: FuseChat-7B-VaRM scored a whopping 8.22 on the MT-Bench test, leaving other 7B and even some 34B models like Starling-7B and Yi-34B-Chat in the dust. It even crushes popular choices like GPT-3.5 (March) and Claude-2.1, and comes really close to the mighty Mixtral-8x7B-Instruct.
- Fusion with Flair: This LLM isn’t a one-trick pony. It combines different strengths from various architectures and scales to become a well-rounded pro at all things conversation.
- Less Censorship, More Fun: FuseChat-7B-VaRM isn’t afraid to let loose a bit. Compared to other local LLMs, it has fewer restrictions, which can make for more interesting and open chats.
Dolphin-2.8-mistral-7b-v02
Looking for an AI that can code, chat, and explain complex topics? Look no further than Dolphin-2.8-Mistral-7b-v02! This innovative model combines the power of the unreleased Mistral-7b-v02 base model with the fine-tuned capabilities of Dolphin-2.8.
Key Features:
- Conversational Coding: Write scripts, debug code, and get coding help – all through natural conversation.
- Clear Instructions: Need help fixing something? Dolphin can explain complex tasks in a way that’s easy to understand.
- Uncensored: The training data is filtered to remove alignment and bias while maintaining informative content.
- 32k Context Window: Keeps track of a significant conversation history for a more natural flow of interaction.
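Dolphin models are trained on the ChatML prompt format, so that 32k context window is filled with turns that look like the sketch below. The helper name `build_chatml_prompt` is illustrative; a tokenizer’s `apply_chat_template` would normally produce this markup for you.

```python
# Sketch: the ChatML markup Dolphin models expect.
# (Illustrative only -- prefer tokenizer.apply_chat_template in practice.)

def build_chatml_prompt(messages):
    """Format [{"role": ..., "content": ...}] dicts as ChatML."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
             for m in messages]
    # Open an assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)
```

Because every turn is closed with `<|im_end|>`, that token doubles as the stop token when you generate with ChatML-trained models.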
Mistral-7B-Instruct-v0.2
The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is a powerful tool for natural language processing tasks, fine-tuned from the Mistral-7B-v0.2 generative text model. This variant was trained on a diverse range of publicly available conversation datasets, enabling it to generate human-like responses to given prompts. With its ability to understand context and respond appropriately, the Mistral-7B Instruct model is ideal for applications such as chatbots, virtual assistants, and language translation software.
One of the key advantages of the Mistral-7B Instruct model is its ease of use. Despite its impressive capabilities, it does not require extensive training or expertise to operate effectively. In fact, the model’s creators demonstrate its potential with a simple, straightforward fine-tuning process.
Gemma 1.1 7B
Gemma 1.1 7B (IT) builds upon the original instruction-tuned Gemma release, offering significant advancements in several key areas. Here’s a breakdown of its strengths:
- Enhanced Quality and Factuality: Trained using a novel RLHF method, Gemma 1.1 delivers more accurate and reliable outputs, making it ideal for tasks requiring factual grounding.
- Stronger Instruction Following: This iteration excels at understanding and adhering to specific instructions, ensuring your prompts are executed precisely.
- Improved Multi-Turn Conversation: A previous bug in multi-turn conversations has been fixed, leading to smoother and more coherent interactions.
- Varied Response Openings: Replies no longer constantly begin with the same repetitive “Sure,”, making conversations flow more naturally.
Important Caveat: It’s essential to acknowledge that Gemma 1.1 prioritizes safety by employing a strong content filter. This filtering can be quite strict, sometimes hindering the model’s ability to address creative tasks effectively. In certain instances, this filtering might render the model almost unusable for tasks requiring unrestricted creative freedom.
OpenChat 3.5 Gemma
OpenChat 3.5 Gemma stands out as the most capable Gemma-based model available, achieving performance that rivals the Mistral-based OpenChat and surpassing both Gemma-7b and Gemma-7b-it. This powerful generalist model is trained with OpenChat’s C-RLFT on the openchat-3.5-0106 dataset.
- Exceptional performance: OpenChat 3.5 Gemma delivers unmatched capabilities among Gemma models.
- Generalist expertise: Handles a wide range of tasks with proficiency.
- Coding prowess: Proven ability to assist with coding tasks.
- Reasoning skills: Offers decent reasoning abilities to tackle various problems.
- Versatile assistant: Can be a competent assistant for a multitude of tasks.