Model Support
Supported models
- meta-llama/Llama-2-7b-chat-hf
  - example: python3 -m fastchat.serve.cli --model-path meta-llama/Llama-2-7b-chat-hf
- Vicuna, Alpaca, LLaMA, Koala
  - example: python3 -m fastchat.serve.cli --model-path lmsys/vicuna-7b-v1.3
- BAAI/AquilaChat-7B
- BAAI/bge-large-en
- baichuan-inc/baichuan-7B
- BlinkDL/RWKV-4-Raven
  - example: python3 -m fastchat.serve.cli --model-path ~/model_weights/RWKV-4-Raven-7B-v11x-Eng99%-Other1%-20230429-ctx8192.pth
- bofenghuang/vigogne-2-7b-instruct
- bofenghuang/vigogne-2-7b-chat
- camel-ai/CAMEL-13B-Combined-Data
- codellama/CodeLlama-7b-Instruct-hf
- databricks/dolly-v2-12b
- FlagAlpha/Llama2-Chinese-13b-Chat
- FreedomIntelligence/phoenix-inst-chat-7b
- FreedomIntelligence/ReaLM-7b-v1
- h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b
- internlm/internlm-chat-7b
- lcw99/polyglot-ko-12.8b-chang-instruct-chat
- lmsys/fastchat-t5-3b-v1.0
- mosaicml/mpt-7b-chat
  - example: python3 -m fastchat.serve.cli --model-path mosaicml/mpt-7b-chat
- Neutralzz/BiLLa-7B-SFT
- nomic-ai/gpt4all-13b-snoozy
- NousResearch/Nous-Hermes-13b
- openaccess-ai-collective/manticore-13b-chat-pyg
- OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5
- VMware/open-llama-7b-v2-open-instruct
- Phind/Phind-CodeLlama-34B-v2
- project-baize/baize-v2-7b
- Qwen/Qwen-7B-Chat
- Salesforce/codet5p-6b
- StabilityAI/stablelm-tuned-alpha-7b
- THUDM/chatglm-6b
- THUDM/chatglm2-6b
- tiiuae/falcon-40b
- tiiuae/falcon-180B-chat
- timdettmers/guanaco-33b-merged
- togethercomputer/RedPajama-INCITE-7B-Chat
- WizardLM/WizardLM-13B-V1.0
- WizardLM/WizardCoder-15B-V1.0
- HuggingFaceH4/starchat-beta
- Any EleutherAI pythia model such as pythia-6.9b
- Any peft adapter trained on top of a model above. To activate, the model path must contain "peft". Note: if you load multiple peft models, you can have them share the base model weights by setting the environment variable PEFT_SHARE_BASE_WEIGHTS=true in any model worker.
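  - example (a sketch: the adapter directory name below is hypothetical, but note that it contains "peft" so the adapter loading path activates):
PEFT_SHARE_BASE_WEIGHTS=true python3 -m fastchat.serve.model_worker --model-path ~/model_weights/peft-vicuna-adapter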
How to support a new model
To support a new model in FastChat, you need to correctly handle its prompt template and model loading. The goal is to make the following command run with the correct prompts:
python3 -m fastchat.serve.cli --model-path [YOUR_MODEL_PATH]
You can run this example command to learn the code logic:
python3 -m fastchat.serve.cli --model-path lmsys/vicuna-7b-v1.3
You can add --debug to see the actual prompt sent to the model.
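For instance, a full debugging invocation with the Vicuna example above looks like this:
python3 -m fastchat.serve.cli --model-path lmsys/vicuna-7b-v1.3 --debug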
Steps
FastChat uses the Conversation class to handle prompt templates and the BaseModelAdapter class to handle model loading.
1. Implement a conversation template for the new model at fastchat/conversation.py. You can follow existing examples and use register_conv_template to add a new one. Please also add a link to the official reference code if possible.
2. Implement a model adapter for the new model at fastchat/model/model_adapter.py. You can follow existing examples and use register_model_adapter to add a new one. A sketch of both registrations follows this list.
3. (Optional) Add the model name to the "Supported models" section above and add more information in fastchat/model/model_registry.py.
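As a rough sketch of steps 1 and 2, the snippet below registers a conversation template and a matching adapter for a hypothetical model named my-new-model. The roles, separator style, and system message are illustrative assumptions; copy the real values from your model's reference code, and check the Conversation fields in fastchat/conversation.py for your FastChat version.

```python
# Sketch only: "my-new-model" and all template values are hypothetical.
from fastchat.conversation import (
    Conversation,
    SeparatorStyle,
    get_conv_template,
    register_conv_template,
)
from fastchat.model.model_adapter import BaseModelAdapter, register_model_adapter

# Step 1 (fastchat/conversation.py): register the prompt template.
register_conv_template(
    Conversation(
        name="my-new-model",
        system_message="You are a helpful assistant.",
        roles=("USER", "ASSISTANT"),
        sep_style=SeparatorStyle.ADD_COLON_SINGLE,
        sep="\n",
    )
)


# Step 2 (fastchat/model/model_adapter.py): register the model adapter.
class MyNewModelAdapter(BaseModelAdapter):
    """Adapter for the hypothetical my-new-model checkpoints."""

    def match(self, model_path: str) -> bool:
        # Claim model paths that mention the model name; the base
        # class then handles standard Hugging Face loading.
        return "my-new-model" in model_path.lower()

    def get_default_conv_template(self, model_path: str) -> Conversation:
        # Pair the adapter with the template registered in step 1.
        return get_conv_template("my-new-model")


register_model_adapter(MyNewModelAdapter)
```

Overriding only match and get_default_conv_template is enough when the default Hugging Face loading in BaseModelAdapter works for your model; override load_model as well if the model needs custom loading code.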
After these steps, the new model should be compatible with most FastChat features, such as CLI, web UI, model worker, and OpenAI-compatible API server. Please do some testing with these features as well.
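For example, one quick end-to-end test of the OpenAI-compatible API server is to launch the standard three-process stack, one command per terminal (replace the placeholder with your model path):
python3 -m fastchat.serve.controller
python3 -m fastchat.serve.model_worker --model-path [YOUR_MODEL_PATH]
python3 -m fastchat.serve.openai_api_server --host localhost --port 8000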