diff --git a/docs/en/user_guides/models.md b/docs/en/user_guides/models.md index c93550a..1075987 100644 --- a/docs/en/user_guides/models.md +++ b/docs/en/user_guides/models.md @@ -1 +1,100 @@ # Prepare Models + +To support the evaluation of new models in OpenCompass, there are several ways: + +1. HuggingFace-based models +2. API-based models +3. Custom models + +## HuggingFace-based Models + +In OpenCompass, we support constructing evaluation models directly from HuggingFace's +`AutoModel.from_pretrained` and `AutoModelForCausalLM.from_pretrained` interfaces. If the model to be +evaluated follows the typical generation interface of HuggingFace models, there is no need to write code. You +can simply specify the relevant configurations in the configuration file. + +Here is an example configuration file for a HuggingFace-based model: + +```python +# Use `HuggingFace` to evaluate models supported by AutoModel. +# Use `HuggingFaceCausalLM` to evaluate models supported by AutoModelForCausalLM. +from opencompass.models import HuggingFaceCausalLM + +models = [ + dict( + type=HuggingFaceCausalLM, + # Parameters for `HuggingFaceCausalLM` initialization. + path='huggyllama/llama-7b', + tokenizer_path='huggyllama/llama-7b', + tokenizer_kwargs=dict(padding_side='left', truncation_side='left'), + max_seq_len=2048, + batch_padding=False, + # Common parameters shared by various models, not specific to `HuggingFaceCausalLM` initialization. + abbr='llama-7b', # Model abbreviation used for result display. + max_out_len=100, # Maximum number of generated tokens. + batch_size=16, # The size of a batch during inference. + run_cfg=dict(num_gpus=1), # Run configuration to specify resource requirements. + ) +] +``` + +Explanation of some of the parameters: + +- `batch_padding=False`: If set to False, each sample in a batch will be inferred individually. If set to True, + a batch of samples will be padded and inferred together. For some models, such padding may lead to + unexpected results. If the model being evaluated supports sample padding, you can set this parameter to True + to speed up inference. +- `padding_side='left'`: Perform padding on the left side. Not all models support padding, and padding on the + right side may interfere with the model's output. +- `truncation_side='left'`: Perform truncation on the left side. The input prompt for evaluation usually + consists of both the in-context examples prompt and the input prompt. If the right side of the input prompt + is truncated, it may cause the input of the generation model to be inconsistent with the expected format. + Therefore, if necessary, truncation should be performed on the left side. + +During evaluation, OpenCompass will instantiate the evaluation model based on the `type` and the +initialization parameters specified in the configuration file. Other parameters are used for inference, +summarization, and other processes related to the model. For example, in the above configuration file, we will +instantiate the model as follows during evaluation: + +```python +model = HuggingFaceCausalLM( + path='huggyllama/llama-7b', + tokenizer_path='huggyllama/llama-7b', + tokenizer_kwargs=dict(padding_side='left', truncation_side='left'), + max_seq_len=2048, +) +``` + +## API-based Models + +Currently, OpenCompass supports API-based model inference for the following: + +- OpenAI (`opencompass.models.OpenAI`) +- More coming soon + +Let's take the OpenAI configuration file as an example to see how API-based models are used in the +configuration file. + +```python +from opencompass.models import OpenAI + +models = [ + dict( + type=OpenAI, # Using the OpenAI model + # Parameters for `OpenAI` initialization + path='gpt-4', # Specify the model type + key='YOUR_OPENAI_KEY', # OpenAI API Key + max_seq_len=2048, # The max input number of tokens + # Common parameters shared by various models, not specific to `OpenAI` initialization. + abbr='GPT-4', # Model abbreviation used for result display. + max_out_len=512, # Maximum number of generated tokens. + batch_size=1, # The size of a batch during inference. + run_cfg=dict(num_gpus=0), # Resource requirements (no GPU needed) + ), +] +``` + +# Custom Models + +If the above methods do not support your model evaluation requirements, you can refer to +[Supporting New Models](../advanced_guides/new_model.md) to add support for new models in OpenCompass. diff --git a/docs/zh_cn/user_guides/models.md b/docs/zh_cn/user_guides/models.md index d24ab7b..71f4199 100644 --- a/docs/zh_cn/user_guides/models.md +++ b/docs/zh_cn/user_guides/models.md @@ -1 +1,91 @@ # 准备模型 + +要在 OpenCompass 中支持新模型的评测,有以下几种方式: + +1. 基于 HuggingFace 的模型 +2. 基于 API 的模型 +3. 自定义模型 + +## 基于 HuggingFace 的模型 + +在 OpenCompass 中,我们支持直接从 Huggingface 的 `AutoModel.from_pretrained` 和 +`AutoModelForCausalLM.from_pretrained` 接口构建评测模型。如果需要评测的模型符合 HuggingFace 模型通常的生成接口, +则不需要编写代码,直接在配置文件中指定相关配置即可。 + +如下,为一个示例的 HuggingFace 模型配置文件: + +```python +# 使用 `HuggingFace` 评测 HuggingFace 中 AutoModel 支持的模型 +# 使用 `HuggingFaceCausalLM` 评测 HuggingFace 中 AutoModelForCausalLM 支持的模型 +from opencompass.models import HuggingFaceCausalLM + +models = [ + dict( + type=HuggingFaceCausalLM, + # 以下参数为 `HuggingFaceCausalLM` 的初始化参数 + path='huggyllama/llama-7b', + tokenizer_path='huggyllama/llama-7b', + tokenizer_kwargs=dict(padding_side='left', truncation_side='left'), + max_seq_len=2048, + batch_padding=False, + # 以下参数为各类模型都有的参数,非 `HuggingFaceCausalLM` 的初始化参数 + abbr='llama-7b', # 模型简称,用于结果展示 + max_out_len=100, # 最长生成 token 数 + batch_size=16, # 批次大小 + run_cfg=dict(num_gpus=1), # 运行配置,用于指定资源需求 + ) +] +``` + +对以上一些参数的说明: + +- `batch_padding=False`:如为 False,会对一个批次的样本进行逐一推理;如为 True,则会对一个批次的样本进行填充, + 组成一个 batch 进行推理。对于部分模型,这样的填充可能导致意料之外的结果;如果评测的模型支持样本填充, + 则可以将该参数设为 True,以加速推理。 +- `padding_side='left'`:在左侧进行填充,因为不是所有模型都支持填充,在右侧进行填充可能会干扰模型的输出。 +- `truncation_side='left'`:在左侧进行截断,评测输入的 prompt 通常包括上下文样本 prompt 和输入 prompt 两部分, + 如果截断右侧的输入 prompt,可能导致生成模型的输入和预期格式不符,因此如有必要,应对左侧进行截断。 + +在评测时,OpenCompass 会使用配置文件中的 `type` 与各个初始化参数实例化用于评测的模型, +其他参数则用于推理及总结等过程中,与模型相关的配置。例如上述配置文件,我们会在评测时进行如下实例化过程: + +```python +model = HuggingFaceCausalLM( + path='huggyllama/llama-7b', + tokenizer_path='huggyllama/llama-7b', + tokenizer_kwargs=dict(padding_side='left', truncation_side='left'), + max_seq_len=2048, +) +``` + +# 基于 API 的模型 + +OpenCompass 目前支持以下基于 API 的模型推理: + +- OpenAI(`opencompass.models.OpenAI`) +- Coming soon + +以下,我们以 OpenAI 的配置文件为例,模型如何在配置文件中使用基于 API 的模型。 + +```python +from opencompass.models import OpenAI + +models = [ + dict( + type=OpenAI, # 使用 OpenAI 模型 + # 以下为 `OpenAI` 初始化参数 + path='gpt-4', # 指定模型类型 + key='YOUR_OPENAI_KEY', # OpenAI API Key + max_seq_len=2048, # 最大输入长度 + # 以下参数为各类模型都有的参数,非 `OpenAI` 的初始化参数 + abbr='GPT-4', # 模型简称 + run_cfg=dict(num_gpus=0), # 资源需求(不需要 GPU) + max_out_len=512, # 最长生成长度 + batch_size=1, # 批次大小 + ), +] +``` + +# 自定义模型 + +如果以上方式无法支持你的模型评测需求,请参考 [支持新模型](../advanced_guides/new_model.md) 在 OpenCompass 中增添新的模型支持。