```
python3 -m fastchat.serve.cli --model-path meta-llama/Llama-2-7b-chat-hf
python3 -m fastchat.serve.cli --model-path lmsys/vicuna-7b-v1.3
python3 -m fastchat.serve.cli --model-path ~/model_weights/RWKV-4-Raven-7B-v11x-Eng99%-Other1%-20230429-ctx8192.pth
python3 -m fastchat.serve.cli --model-path mosaicml/mpt-7b-chat
```
Any `peft` adapter trained on top of a supported model can be loaded by specifying the path to the adapter in the model path. Note: if loading multiple `peft` models, you can have them share the base model weights by setting the environment variable `PEFT_SHARE_BASE_WEIGHTS=true` in any model worker.

You can use `--debug` to see the actual prompt sent to the model.
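The effect of `PEFT_SHARE_BASE_WEIGHTS` can be sketched as a cache keyed by the base model path, so several adapters reuse a single copy of the base weights. This is a simplified illustration, not FastChat's actual loader; `load_base_model` and the placeholder model object are hypothetical:

```python
import os

# Hypothetical cache of already-loaded base models, keyed by path.
_BASE_CACHE = {}

def load_base_model(base_path):
    """Load a base model, reusing a cached copy when the
    PEFT_SHARE_BASE_WEIGHTS environment variable is set to true."""
    share = os.environ.get("PEFT_SHARE_BASE_WEIGHTS", "false").lower() == "true"
    if share and base_path in _BASE_CACHE:
        return _BASE_CACHE[base_path]
    # Placeholder for an expensive real weight load.
    model = {"weights": f"loaded from {base_path}"}
    if share:
        _BASE_CACHE[base_path] = model
    return model

os.environ["PEFT_SHARE_BASE_WEIGHTS"] = "true"
a = load_base_model("llama-2-7b")
b = load_base_model("llama-2-7b")  # second adapter reuses the same base object
print(a is b)  # True when sharing is enabled
```

Without the environment variable, each adapter would load its own copy of the base weights, multiplying memory use per worker.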
FastChat uses the `Conversation` class to handle prompt templates and the `BaseModelAdapter` class to handle model loading. To support a new model:

1. Implement a conversation template and use `register_conv_template` to add it. Please also add a link to the official reference code if possible.
2. Implement a model adapter and use `register_model_adapter` to add it.
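The template-registration pattern above can be illustrated with a minimal sketch. These are simplified stand-ins, not FastChat's actual `Conversation` class or registry; the fields and the `get_conv_template` helper are assumptions for illustration:

```python
from dataclasses import dataclass, field

# Simplified stand-in for a conversation/prompt template.
@dataclass
class Conversation:
    name: str
    system: str
    roles: tuple
    sep: str
    messages: list = field(default_factory=list)

    def append_message(self, role, text):
        self.messages.append((role, text))

    def get_prompt(self):
        # Render the system message plus each turn, joined by the separator.
        parts = [self.system]
        for role, text in self.messages:
            parts.append(f"{role}: {text}")
        return self.sep.join(parts) + self.sep

# Minimal registry, analogous in spirit to register_conv_template.
_TEMPLATES = {}

def register_conv_template(template):
    _TEMPLATES[template.name] = template

def get_conv_template(name):
    # Return a fresh copy so per-request messages don't leak into the registry.
    t = _TEMPLATES[name]
    return Conversation(t.name, t.system, t.roles, t.sep)

register_conv_template(Conversation(
    name="demo",
    system="A chat between a user and an assistant.",
    roles=("USER", "ASSISTANT"),
    sep="\n",
))

conv = get_conv_template("demo")
conv.append_message(conv.roles[0], "Hello!")
print(conv.get_prompt())
```

Registering templates by name lets the serving code look up the right prompt format for each model at request time instead of hard-coding it.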