Top 17 Best Open Source Coding LLMs

Large language models (LLMs) are a type of artificial intelligence (AI) that are trained on massive datasets of text and code. This allows them to generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way.

In recent years, there has been a growing interest in developing LLMs for coding applications. This is because LLMs can be used to generate code, debug code, and even write entire programs.

Open Source Coding LLMs

phi-1

Phi-1, developed by Microsoft, is a compact yet powerful open source coding model with 1.3 billion parameters. Designed for Python programming, it's known for its efficiency on resource-constrained hardware. Trained on diverse data sources, including StackOverflow and code contests, Phi-1 achieves over 50% accuracy on Python coding tasks (HumanEval pass@1). However, it is largely limited to the specific Python packages seen during training, and there's a slight risk of it replicating scripts found online. Phi-1 offers a valuable solution for Python developers, balancing power and resource efficiency.
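A minimal usage sketch with Hugging Face transformers is below. It assumes the public microsoft/phi-1 checkpoint; the prompt and decoding settings are illustrative, not a prescribed recipe:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load phi-1; older transformers releases may additionally need trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1")
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1", torch_dtype=torch.float32)

# phi-1 was trained mostly on Python, so a signature-plus-docstring prompt works well.
prompt = 'def fibonacci(n: int) -> int:\n    """Return the n-th Fibonacci number."""\n'
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```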

WizardCoder-Python-34B-V1.0 (7B, 13B, 34B)

The WizardCoder-Python-34B-V1.0 is a language-specialized variant built on Code Llama's Python model, which was itself further trained on an extensive 100 billion tokens of Python code. Its exceptional performance shows in the HumanEval pass@1 results reported on its model card: a score of 73.2, ahead of GPT-4's March 2023 reported score (67.0), ChatGPT-3.5 (72.5), and Claude 2 (71.2).

With Python being a highly benchmarked language for code generation and Python along with PyTorch playing pivotal roles in the AI community, the specialization of this model offers an added layer of utility. The WizardCoder-Python-34B-V1.0 is poised to become an indispensable tool for developers and AI practitioners, enabling more efficient and accurate code generation for Python-based projects. Its finely-tuned expertise in Python programming makes it an excellent choice for those seeking precise and effective coding solutions.
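The WizardCoder family expects an Alpaca-style prompt. The sketch below follows the template from the model card; the checkpoint name is the public Hugging Face id, and device_map="auto" assumes the accelerate library plus enough GPU memory (or a quantized load) for a 34B model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Alpaca-style prompt template used by the WizardCoder family (per the model card).
PROMPT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)

model_id = "WizardLM/WizardCoder-Python-34B-V1.0"  # 34B: needs large GPUs or quantization
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer(
    PROMPT.format(instruction="Write a Python function that reverses a linked list."),
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```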

Phind-CodeLlama-34B-v1

Phind-CodeLlama-34B-v1 is an impressive open-source coding language model that builds upon the foundation of CodeLlama-34B. Fine-tuned on a proprietary Phind dataset of instruction-answer pairs, it achieves a remarkable 67.6% pass@1 on HumanEval. A sibling fine-tune of CodeLlama-34B-Python, trained on the same internal dataset, reaches an even higher 69.5% pass@1.

Phind applied OpenAI's decontamination methodology to its training data, lending credibility to these results. The model also takes a distinctive structural approach, training on instruction-answer pairs instead of mere code completion examples. It was trained for two epochs on around 80,000 carefully curated programming problems and solutions. Leveraging DeepSpeed ZeRO 3 and Flash Attention 2, training took just three hours on 32 A100-80GB GPUs with a sequence length of 4096 tokens, placing Phind-CodeLlama-34B-v1 at the forefront of efficient, effective open-source coding models.

WizardCoder-15B

WizardCoder-15B is an innovative large language model (LLM) designed specifically for coding tasks. Developed using the Evol-Instruct method, WizardCoder focuses on adapting prompts to the domain of code-related instructions. This specialized training approach ensures that the model excels at understanding and generating code instructions accurately.

The foundation of WizardCoder-15B lies in the fine-tuning of the Code LLM, StarCoder, which has been widely recognized for its exceptional capabilities in code-related tasks. By utilizing a newly created instruction-following training set, WizardCoder has been tailored to provide unparalleled performance and accuracy when it comes to coding.

Although WizardCoder-15B is primarily designed for coding tasks and not intended for conversation or other logic-based tasks, its coding capabilities are on par with, if not better than, renowned LLMs like ChatGPT, Bard, and Claude. Whether you need assistance in code generation, understanding programming concepts, or debugging, WizardCoder-15B is a reliable and powerful tool that can significantly enhance your coding experience.

Code Llama (7B, 13B, 34B)

Code Llama offers a trio of powerful open source coding language models, available in 7B, 13B, and 34B parameter configurations. Each variant is trained on an impressive 500B tokens of code and code-related data, ensuring robust performance. Notably, the 7B and 13B models possess fill-in-the-middle (FIM) capabilities, facilitating seamless code insertion and immediate code completion tasks.
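As a sketch of the fill-in-the-middle capability, the snippet below uses the Hugging Face integration, where the tokenizer for the 7B/13B base checkpoints expands a <FILL_ME> marker into the model's prefix/suffix/middle special tokens. The prompt and decoding settings are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# <FILL_ME> marks the hole the model should fill (here, a missing docstring).
prompt = 'def remove_non_ascii(s: str) -> str:\n    """ <FILL_ME>\n    return result\n'
input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(model.device)
generated = model.generate(input_ids, max_new_tokens=64)

# Decode only the newly generated "middle" tokens and splice them back in.
filling = tokenizer.batch_decode(generated[:, input_ids.shape[1]:], skip_special_tokens=True)[0]
print(prompt.replace("<FILL_ME>", filling))
```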

These models cater to diverse requirements. The 7B model is suitable for single GPU deployment, prioritizing efficiency. The 34B model, while slightly more resource-intensive, delivers superior coding assistance, enhancing code quality. Furthermore, Code Llama extends its functionality through two specialized variations: Code Llama โ€“ Python and Code Llama โ€“ Instruct.

Code Llama โ€“ Python stands out as a language-specific adaptation, fine-tuned on an extensive 100B tokens of Python code. This specialization enhances its aptitude for generating Python code, a widely used language in the coding landscape. As Python and PyTorch hold significance in the AI community, this tailored model is poised to provide enhanced utility and performance for users.

NewHope

NewHope is a powerful open-source coding LLM based on llama-2-13b. It covers a range of programming languages, including Python, C++, Java, JavaScript, and Go. Through fine-tuning, its developers report that NewHope reaches roughly 99% of GPT-4's programming capability. Performance was evaluated on HumanEval using the official OpenAI evaluation script, where it compared favorably on the Pass@1 metric against other state-of-the-art models tracked on PapersWithCode. Embrace NewHope for robust and efficient coding assistance in your development projects.

OctoCoder

OctoCoder is an advanced AI coding language model, boasting an impressive 15.5 billion parameters. This model is the result of refining StarCoder through instruction tuning on the CommitPackFT and OASST datasets, as described in the OctoPack research paper. OctoCoder is a polyglot, proficient in over 80 programming languages, making it a versatile tool for a wide range of coding tasks. While its extensive capabilities are remarkable, it's worth noting that a model of this size has substantial resource requirements, though it can run on a range of hardware with appropriate configuration.
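OctoCoder's model card uses a plain "Question: ... Answer:" instruction format. A hedged loading sketch (public bigcode/octocoder checkpoint; the prompt and settings are illustrative) might look like:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/octocoder"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto")

# Instruction format from the model card: "Question: ... Answer:".
prompt = "Question: Write a Rust function that sums a slice of i64 values.\n\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```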

Redmond-Hermes-Coder 15B

Redmond-Hermes-Coder 15B is an advanced open source coding language model that has been fine-tuned on a comprehensive dataset of over 300,000 instructions. Developed by Nous Research in collaboration with Teknium, Karan4D, Redmond AI, and other contributors, this model represents the forefront of code generation technology.

Built upon the WizardCoder framework, Redmond-Hermes-Coder 15B outperforms its predecessor, the Llama-based Nous-Hermes model, in coding proficiency. While it may not match WizardCoder's scores on pure code benchmarks such as HumanEval, it showcases strong capabilities in generating high-quality code.

Preliminary experiments have shown that Redmond-Hermes-Coder 15B achieved a score of 39% on HumanEval, while WizardCoder attained a score of 57%. However, ongoing efforts are being made to further improve its performance.

What sets Redmond-Hermes-Coder 15B apart is its versatility beyond coding tasks. In various non-code applications, including writing assignments, it has demonstrated promising results, even surpassing WizardCoder. This broadens its utility and makes it a valuable tool for both coding and writing endeavors.

DeciCoder 1B

DeciCoder 1B stands out as one of the smallest AI coding models covered in this article. Boasting just 1 billion parameters, it's designed specifically for code completion tasks. The model was trained on the Python, Java, and JavaScript subsets of the Starcoder Training Dataset. It employs Grouped Query Attention, which cuts memory use and speeds up inference, and it was trained using a Fill-in-the-Middle objective with a context window of 2048 tokens.

What's intriguing is that its architecture is the result of Deci's Neural Architecture Search-based technology, AutoNAC. This model's strength lies not just in its capabilities but also in its accessibility: it's optimized to run comfortably on most everyday hardware, ensuring ease of use for developers.
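A minimal loading sketch, assuming the public Deci/DeciCoder-1b checkpoint (which ships custom modeling code, hence trust_remote_code=True); the prompt and generation settings are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Deci/DeciCoder-1b"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16, trust_remote_code=True
)

# DeciCoder is a completion model: give it the start of a function, it continues.
prompt = "def binary_search(arr, target):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```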

StableCode-Completion-Alpha-3B-4K

StableCode-Completion-Alpha-3B-4K is a compact open source LLM with 3 billion parameters. It uses a decoder-only architecture and is designed exclusively for code completion.

This model has been trained on a diverse set of programming languages, drawn from those that ranked highest in the Stack Overflow Developer Survey. Stable and efficient, StableCode-Completion-Alpha-3B-4K handles single- and multi-line code completion tasks with context windows of up to 4,096 tokens.
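A hedged usage sketch, assuming the stabilityai/stablecode-completion-alpha-3b-4k checkpoint on Hugging Face; the prompt and sampling settings are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "stabilityai/stablecode-completion-alpha-3b-4k"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.float16, device_map="auto"
)

# Completion-style prompt: the model continues the partial source file.
inputs = tokenizer("import torch\nimport torch.nn as nn\n", return_tensors="pt").to(model.device)
tokens = model.generate(**inputs, max_new_tokens=48, temperature=0.2, do_sample=True)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```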

StableCode-Instruct-Alpha-3B

StableCode-Instruct-Alpha-3B stands out as a remarkable open source coding language model with its 3 billion parameter decoder-only architecture. What sets this model apart is its focus on instruction-tuned code generation across a diverse spectrum of programming languages, offering an open alternative in a landscape long dominated by proprietary coding models.

Like its completion-focused sibling, it was trained on the programming languages that ranked highest in the Stack Overflow developer survey, which speaks to its relevance for the developer community. Unlike its proprietary counterparts, the model is open source, enabling developers to engage, contribute, and enhance its capabilities collaboratively.

CodeGen2.5-7B

CodeGen2.5-7B is an advanced open-source autoregressive language model designed for program synthesis. It builds upon its predecessor, CodeGen2, and is trained on StarCoderData, a massive dataset of 1.4 trillion tokens. Despite its smaller size, CodeGen2.5-7B delivers results competitive with larger models like StarCoderBase-15.5B.

This LLM offers infilling capabilities, allowing it to generate code to fill in missing or incomplete sections. It supports multiple programming languages, making it versatile for developers. Additionally, it has been further trained on Python tokens to enhance its language proficiency, resulting in the CodeGen2.5-7B-mono variant.
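The sketch below illustrates the sentinel-token infilling format described in the CodeGen2/2.5 model cards: <mask_1> marks the hole, and the trailing <|endoftext|><sep><mask_1> asks the model to emit the missing span. The checkpoint name and prompt are illustrative, and the custom tokenizer needs trust_remote_code=True:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Salesforce/codegen25-7b-mono"
tokenizer = AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

def infill_prompt(prefix: str, suffix: str) -> str:
    # Sentinel format: prefix <mask_1> suffix <|endoftext|> <sep> <mask_1>
    return prefix + "<mask_1>" + suffix + "<|endoftext|>" + "<sep>" + "<mask_1>"

prompt = infill_prompt("def count_words(text):\n    ", "\n    return counts\n")
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0]))
```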

For research purposes, CodeGen2.5-7B-instruct is available, which continues training from CodeGen2.5-7B-mono using instruction data. However, it's important to note that this variant is intended only for research and is not licensed for general use.

CodeGen2.5-7B and the mono variant are released under the Apache-2.0 license, enabling developers and researchers to utilize and expand upon the models. With its strong performance and support for multiple programming languages, CodeGen2.5-7B is a top choice for program synthesis tasks.

Replit-code-v1-3b

Replit-code-v1-3b is a 2.7B parameter causal language model focused on code completion. It was developed by Replit in partnership with MosaicML, and it is trained on a subset of the Stack Dedup v1.2 dataset. The training dataset contains 175B tokens, which were repeated over 3 epochs. In total, replit-code-v1-3b has been trained on 525B tokens (~195 tokens per parameter).

Replit-code-v1-3b is powered by state-of-the-art LLM techniques, such as:

  • Flash Attention for fast training and inference
  • ALiBi positional embeddings to support variable context length at inference time
  • The LionW optimizer, among other techniques

Replit-code-v1-3b is available for free under the CC BY-SA 4.0 license. It can be used for a variety of tasks, including:

  • Code completion
  • Code generation
  • Code debugging
  • Code linting
  • Code documentation
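As a quick illustration of code completion with this model, here is a hedged sketch following the pattern of the model card (checkpoint replit/replit-code-v1-3b; it ships custom model and tokenizer code, hence trust_remote_code=True; the prompt and sampling settings are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "replit/replit-code-v1-3b"
tokenizer = AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True)

# Completion-style prompt: the model continues the function body.
prompt = "def sieve_of_eratosthenes(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.2,
    top_p=0.95,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```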

FalCoder

FalCoder is an impressive open source coding LLM built by fine-tuning the Falcon-7b base model on the CodeAlpaca 20k instructions dataset. It was tuned with the QLoRA method using the PEFT library. While FalCoder may not match the capabilities of larger models, it still delivers commendable results considering its size. It is a versatile option worth trying before making a decision, as it might be the perfect fit for your coding needs.
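For context, a minimal QLoRA-style setup of the kind used here might look like the sketch below: 4-bit quantized base weights via bitsandbytes plus trainable LoRA adapters via PEFT. The base checkpoint, hyperparameters, and target module names are assumptions for illustration, not FalCoder's actual training recipe:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Quantize the frozen base model to 4-bit NF4 to fit on modest hardware.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b", quantization_config=bnb_config, device_map="auto"
)

# Attach small trainable LoRA adapters; only these are updated during fine-tuning.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query_key_value"],  # Falcon's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # typically well under 1% of total parameters
```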

StarCoder

StarCoder is a 15.5B parameter model trained on 80+ programming languages from The Stack (v1.2), with opt-out requests excluded. The model uses Multi Query Attention, a context window of 8192 tokens, and was trained using the Fill-in-the-Middle objective on 1 trillion tokens. Because it was trained on GitHub code, it is not an instruction model, and commands like "Write a function that computes the square root." do not work well. However, by using the Tech Assistant prompt you can turn it into a capable technical assistant.

The Tech Assistant prompt, published by the BigCode team, is a few-shot prompt containing example exchanges between a user and a helpful assistant. You prepend it to your own question (for example, "How do I write a function that prints the Fibonacci sequence?"), and the primed model continues the dialogue with a relevant answer instead of treating your question as code to complete.
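A sketch of this pattern is below. The real Tech Assistant prompt is much longer; the abbreviated TA_PROMPT here is a hypothetical stand-in, and the decoding settings are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Abbreviated, illustrative few-shot "tech assistant" prompt (the published
# BigCode prompt contains many more example exchanges).
TA_PROMPT = (
    "Below are a series of dialogues between a user and a helpful technical assistant.\n"
    "-----\n"
    "Human: How do I sort a list in Python?\n"
    "Assistant: Use the built-in sorted() function, e.g. sorted([3, 1, 2]).\n"
    "-----\n"
)

checkpoint = "bigcode/starcoder"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto")

question = "Human: How do I write a function that prints the Fibonacci sequence?\nAssistant:"
inputs = tokenizer(TA_PROMPT + question, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```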

StarCoder is still under development, but it has the potential to be a powerful tool for developers. It can be used to generate code, debug code, and even write entire programs. StarCoder can also be used as a technical assistant, providing developers with answers to their questions and helping them to solve problems.

Here are some of the features of StarCoder:

  • It can generate code in over 80 programming languages.
  • It can debug code.
  • It can write entire programs.
  • It can be used as a technical assistant.

Here are some of the strengths and weaknesses of StarCoder:

  • Strengths:
    • A powerful, broadly capable tool for developers.
    • Generates, debugs, and completes code across 80+ languages.
    • Doubles as a technical assistant when prompted appropriately.
  • Weaknesses:
    • It is still under development.
    • It is not as accurate as some other LLMs on certain tasks.
    • As a base model, it can be difficult to use without careful prompting.

CodeGen

CodeGen is a family of autoregressive language models for program synthesis. It was developed by Salesforce and is based on the Transformer architecture. CodeGen is trained on a massive dataset of code and natural language descriptions, and it can generate code from natural language descriptions.

CodeGen has been shown to be effective at generating code from natural language descriptions. In the original paper, CodeGen models were competitive with OpenAI's Codex on the HumanEval benchmark, and Salesforce also introduced a multi-turn programming benchmark (MTPB) to evaluate step-by-step program synthesis.

CodeGen is a promising tool for program synthesis. Generating code directly from natural language descriptions can save developers significant time and effort, and its multi-turn, step-by-step approach to specification makes the resulting code easier to steer and understand.
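A minimal usage sketch with the smallest Python-tuned checkpoint (Salesforce/codegen-350M-mono); the comment-style prompt and settings are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# The "mono" checkpoints are further trained on Python; 350M is the smallest size.
checkpoint = "Salesforce/codegen-350M-mono"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# A natural-language description written as a comment steers the completion.
prompt = "# write a function that returns the factorial of a number\ndef"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```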

Here are some of the features of CodeGen:

  • It can generate code from natural language descriptions.
  • It is trained on a massive dataset of code and natural language descriptions.
  • It is available in multiple sizes (350M to 16B parameters) and data variants (nl, multi, and mono).

Here are some of the strengths and weaknesses of CodeGen:

  • Strengths:
    • Effective program synthesis from natural language prompts.
    • Open checkpoints in several sizes that run with standard tooling.
    • Supports multi-turn, step-by-step code synthesis.
  • Weaknesses:
    • It is still under development, so generated code can be inaccurate and should be reviewed.
    • Getting good results requires some familiarity with prompting and with the target programming language.

CodeT5+

CodeT5+ is a large language model (LLM) that is specifically designed for code understanding and generation. It is based on the T5 architecture, but it has been extended to incorporate code-specific knowledge. CodeT5+ is trained on a massive dataset of code and natural language, which allows it to learn the relationships between code and its meaning.
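As a sketch, the smaller CodeT5+ checkpoints load as standard seq2seq T5 models. The example assumes the public Salesforce/codet5p-770m checkpoint; the prompt and settings are illustrative:

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

checkpoint = "Salesforce/codet5p-770m"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = T5ForConditionalGeneration.from_pretrained(checkpoint)

# Encoder-decoder setup: the encoder reads the code, the decoder generates a continuation.
inputs = tokenizer("def print_hello_world():", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```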

Features

CodeT5+ has a number of features that make it well-suited for code understanding and generation tasks. These features include:

  • A large vocabulary of code tokens
  • The ability to understand the structure of code
  • The ability to generate code that is both correct and idiomatic
  • The ability to translate between code and natural language

Strengths

CodeT5+ has a number of strengths, including:

  • It can be used for a wide range of code understanding and generation tasks.
  • It is able to learn the relationships between code and its meaning.
  • It can generate code that is both correct and idiomatic.
  • It can translate between code and natural language.

Weaknesses

CodeT5+ has a few weaknesses, including:

It can be computationally expensive to train and use.
It is not as accurate as some other LLMs for some tasks.
It is still under development, so it may not be able to handle all code understanding and generation tasks.
