3 Open-Source LLMs With the Longest Context Lengths


Tired of chatbots that forget what you just said? Open-source LLMs (large language models) now offer much longer context windows, meaning the amount of text, measured in tokens, that a model can take into account at once. The longer the window, the more of a conversation or document the model can keep in view. This article looks at three open-source LLMs with some of the longest context lengths available.

LWM-Text-1M-Chat

The LWM-Text-1M-Chat model is part of a fully open-source family of 7B-parameter models that can process long text documents and videos of over 1M tokens. The family is designed to understand both human textual knowledge and the physical world, with the goal of enabling broader AI capabilities for assisting humans.

Training uses the RingAttention technique to scale to long sequences, and the context size is increased gradually from 4K to 1M tokens. Training also uses masked sequence packing to mix sequences of different lengths, loss weighting to balance language and vision, and a model-generated QA dataset for long-sequence chat.
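To make the idea of masked sequence packing concrete, here is a minimal Python sketch. It is a generic illustration, not LWM's actual training code: the function name packed_causal_mask and the example lengths are assumptions made for this article. When several short sequences are packed into one long training example, the attention mask has to stop tokens from attending to tokens that belong to a different sequence.

```python
def packed_causal_mask(lengths):
    """Build an attention mask for sequences packed back to back.

    mask[i][j] is True when token i may attend to token j, i.e. both
    tokens come from the same original sequence and j is not after i.
    """
    # Label each packed position with the index of its source sequence.
    segment_ids = [seg for seg, n in enumerate(lengths) for _ in range(n)]
    total = len(segment_ids)
    return [
        [segment_ids[i] == segment_ids[j] and j <= i for j in range(total)]
        for i in range(total)
    ]


# Two hypothetical sequences of 3 and 2 tokens packed into one example.
mask = packed_causal_mask([3, 2])
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

Each printed row is one token, and an "x" marks a position it is allowed to attend to. The second sequence never sees the first, even though both share the same packed example.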

  • LWM-Text-1M-Chat: Up to 1 million tokens

dolphin-2.6-mistral-7b-dpo-laser

Dolphin 2.6 Mistral 7b – DPO Laser is an open-source language model developed by Cognitive Computations. It is based on Mistral-7b and has a context length of 16k tokens. It is a special release of Dolphin-DPO based on the LASER paper and implementation, and it was trained using a noise reduction technique based on singular value decomposition (SVD). According to its developers, it scores higher than its predecessors, Dolphin 2.6 and Dolphin 2.6-DPO, and in theory it should produce more robust outputs.
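The core idea behind SVD-based noise reduction can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the procedure used to build the Dolphin model: the helper low_rank_approximation, the toy matrix sizes, and the chosen rank are all hypothetical, and the real method also has to decide which layers to reduce and by how much. The sketch factors a matrix with SVD, keeps only the largest singular components, and rebuilds the matrix from them.

```python
import numpy as np


def low_rank_approximation(weight, rank):
    """Keep only the `rank` largest singular components of a 2-D matrix."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    # Scale the first `rank` columns of u by their singular values,
    # then multiply by the matching rows of vt to rebuild the matrix.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


rng = np.random.default_rng(0)
# Hypothetical "weight matrix": a rank-2 signal plus small random noise.
signal = rng.normal(size=(64, 2)) @ rng.normal(size=(2, 32))
noisy = signal + 0.01 * rng.normal(size=(64, 32))

denoised = low_rank_approximation(noisy, rank=2)
error_before = np.linalg.norm(noisy - signal)
error_after = np.linalg.norm(denoised - signal)
print("error before:", error_before)
print("error after: ", error_after)
```

In this toy setup the low-rank version ends up closer to the clean signal than the noisy matrix was, which is the intuition behind treating the small singular components as noise.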

The model is uncensored: its training dataset was filtered to remove alignment and bias, which makes the model more compliant with requests. Because of this, its developers advise implementing your own alignment layer before exposing the model as a service.

  • dolphin-2.6-mistral-7b-dpo-laser: 16,000 tokens

OpenChat-3.5-0106-128k

OpenChat-3.5-0106-128k is an open-source language model developed by the OpenChat team and extended by CallComply. It is part of the OpenChat series and has a context length of 128k tokens. According to its developers, the release outperforms ChatGPT (March) and Grok-1, and it shows a 15-point improvement in coding over OpenChat-3.5.

The model introduces two modes, a combined Coding + Generalist mode and a Mathematical Reasoning mode, along with experimental support for Evaluator and Feedback capabilities.

  • OpenChat-3.5-0106-128k: 128,000 tokens
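
As a quick way to compare the three models, the following Python sketch checks which of them could hold a prompt of a given size. The context lengths are the figures listed in this article; the helper models_that_fit and the reserved_for_reply default are assumptions made for illustration. Real token counts depend on each model's tokenizer, so treat the numbers as rough guides.

```python
# Context lengths as listed in this article, in tokens.
CONTEXT_LENGTHS = {
    "LWM-Text-1M-Chat": 1_000_000,
    "OpenChat-3.5-0106-128k": 128_000,
    "dolphin-2.6-mistral-7b-dpo-laser": 16_000,
}


def models_that_fit(prompt_tokens, reserved_for_reply=1_000):
    """Return the listed models whose context can hold the prompt plus a reply."""
    # The reply budget is a hypothetical default, not a model requirement.
    needed = prompt_tokens + reserved_for_reply
    return [name for name, limit in CONTEXT_LENGTHS.items() if limit >= needed]


print(models_that_fit(12_000))   # a long article
print(models_that_fit(200_000))  # a book-length document
```

With these figures, a 12,000-token prompt fits all three models, while a 200,000-token prompt fits only LWM-Text-1M-Chat.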
SK is a versatile writer deeply passionate about anime, evolution, storytelling, art, AI, game development, and VFX. His writings transcend genres, exploring these interests and more. Dive into his captivating world of words and explore the depths of his creative universe.