```
python3 -m fastchat.serve.cli --model-path meta-llama/Llama-2-7b-chat-hf
python3 -m fastchat.serve.cli --model-path lmsys/vicuna-7b-v1.3
python3 -m fastchat.serve.cli --model-path ~/model_weights/RWKV-4-Raven-7B-v11x-Eng99%-Other1%-20230429-ctx8192.pth
python3 -m fastchat.serve.cli --model-path mosaicml/mpt-7b-chat
```
Any `peft` adapter trained on top of a supported model can be loaded by specifying the path to the adapter in the model path. Note: if loading multiple `peft` models, you can have them share the base model weights by setting the environment variable `PEFT_SHARE_BASE_WEIGHTS=true` in any model worker.

You can use `--debug` to see the actual prompt sent to the model.
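The effect of `PEFT_SHARE_BASE_WEIGHTS` can be sketched as a cache keyed by the base model path, so several adapters reuse a single copy of the base weights. This is a simplified illustration, not FastChat's actual loader; `load_base_model` and the placeholder model object are hypothetical:

```python
import os

# Hypothetical cache of already-loaded base models, keyed by path.
_BASE_CACHE = {}

def load_base_model(base_path):
    """Load a base model, reusing a cached copy when the
    PEFT_SHARE_BASE_WEIGHTS environment variable is set to true."""
    share = os.environ.get("PEFT_SHARE_BASE_WEIGHTS", "false").lower() == "true"
    if share and base_path in _BASE_CACHE:
        return _BASE_CACHE[base_path]
    # Placeholder for an expensive real weight load.
    model = {"weights": f"loaded from {base_path}"}
    if share:
        _BASE_CACHE[base_path] = model
    return model

os.environ["PEFT_SHARE_BASE_WEIGHTS"] = "true"
a = load_base_model("llama-2-7b")
b = load_base_model("llama-2-7b")  # second adapter reuses the same base object
print(a is b)  # True when sharing is enabled
```

Without the environment variable, each adapter would load its own copy of the base weights, multiplying memory use per worker.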
FastChat uses the `Conversation` class to handle prompt templates and the `BaseModelAdapter` class to handle model loading. To support a new model:

1. Implement a conversation template and use `register_conv_template` to add it. Please also add a link to the official reference code if possible.
2. Implement a model adapter and use `register_model_adapter` to add it.
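The template-registration pattern above can be illustrated with a minimal sketch. These are simplified stand-ins, not FastChat's actual `Conversation` class or registry; the fields and the `get_conv_template` helper are assumptions for illustration:

```python
from dataclasses import dataclass, field

# Simplified stand-in for a conversation/prompt template.
@dataclass
class Conversation:
    name: str
    system: str
    roles: tuple
    sep: str
    messages: list = field(default_factory=list)

    def append_message(self, role, text):
        self.messages.append((role, text))

    def get_prompt(self):
        # Render the system message plus each turn, joined by the separator.
        parts = [self.system]
        for role, text in self.messages:
            parts.append(f"{role}: {text}")
        return self.sep.join(parts) + self.sep

# Minimal registry, analogous in spirit to register_conv_template.
_TEMPLATES = {}

def register_conv_template(template):
    _TEMPLATES[template.name] = template

def get_conv_template(name):
    # Return a fresh copy so per-request messages don't leak into the registry.
    t = _TEMPLATES[name]
    return Conversation(t.name, t.system, t.roles, t.sep)

register_conv_template(Conversation(
    name="demo",
    system="A chat between a user and an assistant.",
    roles=("USER", "ASSISTANT"),
    sep="\n",
))

conv = get_conv_template("demo")
conv.append_message(conv.roles[0], "Hello!")
print(conv.get_prompt())
```

Registering templates by name lets the serving code look up the right prompt format for each model at request time instead of hard-coding it.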