Users can serve a model with Inferflow by editing a model specification file. We provide predefined specification files for the following popular or representative models:
- Aquila (aquila_chat2_34b)
- Baichuan (baichuan2_7b_chat, baichuan2_13b_chat)
- BERT (bert-base-multilingual-cased)
- BLOOM (bloomz_3b)
- ChatGLM (chatglm2_6b)
- DeepSeek (deepseek_moe_16b_base)
- Facebook M2M100 (facebook_m2m100_418m)
- Falcon (falcon_7b_instruct, falcon_40b_instruct)
- FuseLLM (fusellm_7b)
- Gemma (gemma_2b_it)
- InternLM (internlm-chat-20b)
- Llama 2 (llama2_7b, llama2_7b_chat, llama2_13b_chat)
- MiniCPM (minicpm_2b_dpo_bf16)
- Mistral (mistral_7b_instruct)
- Mixtral (mixtral_8x7b_instruct_v0.1)
- OpenLLaMA (open_llama_3b)
- OPT (opt_350m, opt_13b, opt_iml_max_30b)
- Orion (orion_14b_chat)
- Phi-2 (phi_2)
- Qwen (qwen1.5_7b_chat)
- XVERSE (xverse_13b_chat)
- Yi (yi_6b, yi_34b_chat)