English | 简体中文
PaddleClas supports rapid service deployment through PaddleHub. Currently, only image classification deployment is supported; image recognition deployment will be supported in the future.
- 1. Introduction
- 2. Prepare the environment
- 3. Download the inference model
- 4. Install the service module
- 5. Start service
- 6. Send prediction requests
- 7. User-defined service module modification
## 1. Introduction

The hubserving service deployment package `clas` contains 3 required files and 1 optional file; the directory structure is as follows:

```shell
deploy/hubserving/clas/
├── __init__.py # Empty file, required
├── config.json # Configuration file, optional; passed in as a parameter when starting the service with a configuration file
├── module.py # The main module, required; contains the complete logic of the service
└── params.py # Parameter file, required; includes model path, pre- and post-processing parameters, and other parameters
```
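For orientation, the sketch below shows roughly what `params.py` could expose; it is a hypothetical illustration, not the actual file. The `inference_model_dir`, `topk`, and `class_id_map_file` fields are the ones referenced later in this document; the `read_params` helper and the default values are assumptions.

```python
# Hypothetical sketch of params.py; see the real file for the actual contents.
from types import SimpleNamespace

def read_params():
    cfg = SimpleNamespace()
    # model directory (see "Download the inference model" below)
    cfg.inference_model_dir = "../inference/"
    # number of top-k classification results to return
    cfg.topk = 5
    # label / class id mapping file (path is an assumption)
    cfg.class_id_map_file = "../ppcls/utils/imagenet1k_label_list.txt"
    return cfg
```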
## 2. Prepare the environment

```shell
# Install paddlehub, version 2.1.0 is recommended
python3.7 -m pip install paddlehub==2.1.0 --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
```
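To confirm the installation, you can check the installed version, for example:

```python
# Print the installed paddlehub version (expected: 2.1.0)
import paddlehub
print(paddlehub.__version__)
```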
## 3. Download the inference model

Before installing the service module, you need to prepare the inference model and put it in the correct path. The default model path is:

- Classification inference model structure file: `PaddleClas/inference/inference.pdmodel`
- Classification inference model weight file: `PaddleClas/inference/inference.pdiparams`
**Notice**:

- The model file path can be viewed and modified in `PaddleClas/deploy/hubserving/clas/params.py`: `"inference_model_dir": "../inference/"`
- Model files (including `.pdmodel` and `.pdiparams`) must be named `inference`.
- We provide a large number of pre-trained models based on the ImageNet-1k dataset. For the model list and download addresses, see the Model Library Overview, or you can use your own trained and converted models.
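Before moving on, a short script like the one below can verify that the model files are named and placed as expected (the directory mirrors the default `inference_model_dir` above; adjust it if you changed the path):

```python
import os

model_dir = "../inference/"  # default "inference_model_dir" from params.py
for name in ("inference.pdmodel", "inference.pdiparams"):
    path = os.path.join(model_dir, name)
    print(path, "->", "found" if os.path.exists(path) else "MISSING")
```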
## 4. Install the service module

- In the Linux environment, the installation example is as follows:

  ```shell
  cd PaddleClas/deploy
  # Install the service module:
  hub install hubserving/clas/
  ```

- In the Windows environment (the folder separator is `\`), the installation example is as follows:

  ```shell
  cd PaddleClas\deploy
  # Install the service module:
  hub install hubserving\clas\
  ```
## 5. Start service

### 5.1 Start with command line parameters

This method only supports prediction using the CPU. Start command:

```shell
hub serving start \
--modules clas_system \
--port 8866
```
This completes the deployment of the service API, using the default port number 8866.
**Parameter description**:

| parameters | uses |
| --- | --- |
| --modules/-m | [required] PaddleHub Serving pre-installed model, listed as multiple `Module==Version` key-value pairs<br>*When no `Version` is specified, the latest version is selected by default* |
| --port/-p | [optional] Service port, default is `8866` |
| --use_multiprocess | [optional] Whether to enable concurrent mode; the default is single-process mode. This mode is recommended for multi-core CPU machines<br>*The Windows operating system only supports single-process mode* |
| --workers | [optional] The number of concurrent tasks specified in concurrent mode; the default is `2*cpu_count-1`, where `cpu_count` is the number of CPU cores |

For more deployment details, see PaddleHub Serving Model One-Click Service Deployment.
### 5.2 Start with configuration file

This method only supports prediction using the CPU or GPU. Start command:

```shell
hub serving start -c config.json
```

The format of `config.json` is as follows:
```json
{
  "modules_info": {
    "clas_system": {
      "init_args": {
        "version": "1.0.0",
        "use_gpu": true,
        "enable_mkldnn": false
      },
      "predict_args": {
      }
    }
  },
  "port": 8866,
  "use_multiprocess": false,
  "workers": 2
}
```
**Parameter description**:

- The configurable parameters in `init_args` are consistent with the `_initialize` function interface in `module.py`. Among them:
  - When `use_gpu` is `true`, the GPU is used to start the service.
  - When `enable_mkldnn` is `true`, MKL-DNN acceleration is used.
- The configurable parameters in `predict_args` are consistent with the `predict` function interface in `module.py`.
**Notice**:

- When starting the service with the configuration file, the parameter settings in the configuration file are used and other command-line parameters are ignored.
- If you use GPU prediction (i.e., `use_gpu` is set to `true`), you need to set the `CUDA_VISIBLE_DEVICES` environment variable to specify the GPU card number before starting the service, such as: `export CUDA_VISIBLE_DEVICES=0`.
- `use_gpu` cannot be `true` at the same time as `use_multiprocess`.
- **When both `use_gpu` and `enable_mkldnn` are `true`, `enable_mkldnn` is ignored and the GPU is used.**
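Because these constraints are easy to violate, the following sketch checks a `config.json` before the service is started; it assumes the file layout shown in the example above.

```python
import json

# Load the service configuration (layout as in the example above)
with open("hubserving/clas/config.json") as f:
    cfg = json.load(f)

init_args = cfg["modules_info"]["clas_system"]["init_args"]

# use_gpu and use_multiprocess must not both be true
if init_args.get("use_gpu") and cfg.get("use_multiprocess"):
    raise ValueError("use_gpu cannot be true at the same time as use_multiprocess")

# enable_mkldnn is ignored when use_gpu is also true
if init_args.get("use_gpu") and init_args.get("enable_mkldnn"):
    print("Warning: enable_mkldnn will be ignored because use_gpu is true")
```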
For example, to start the service with GPU card No. 3:

```shell
cd PaddleClas/deploy
export CUDA_VISIBLE_DEVICES=3
hub serving start -c hubserving/clas/config.json
```
## 6. Send prediction requests

After configuring the server, you can use the following command to send a prediction request and get the prediction result:

```shell
cd PaddleClas/deploy
python3.7 hubserving/test_hubserving.py \
--server_url http://127.0.0.1:8866/predict/clas_system \
--image_file ./hubserving/ILSVRC2012_val_00006666.JPEG \
--batch_size 8
```
**Predicted output**:

```log
The result(s): class_ids: [57, 67, 68, 58, 65], label_names: ['garter snake, grass snake', 'diamondback, diamondback rattlesnake, Crotalus adamanteus', 'sidewinder, horned rattlesnake, Crotalus cerastes', 'water snake', 'sea snake'], scores: [0.21915, 0.15631, 0.14794, 0.13177, 0.12285]
The average time of prediction cost: 2.970 s/image
The average time cost: 3.014 s/image
The average top-1 score: 0.110
```
**Script parameter description**:

- **server_url**: Service address, in the format `http://[ip_address]:[port]/predict/[module_name]`.
- **image_file**: Test image path, which can be a single image path or an image collection directory path.
- **batch_size**: [optional] Make predictions in batches of `batch_size`; the default is `1`.
- **resize_short**: [optional] Resize the image by its short edge during preprocessing; the default is `256`.
- **crop_size**: [optional] The size of the center crop during preprocessing; the default is `224`.
- **normalize**: [optional] Whether to normalize during preprocessing; the default is `True`.
- **to_chw**: [optional] Whether to transpose to `CHW` order during preprocessing; the default is `True`.
**Note**: If you use Transformer-series models, such as `DeiT_***_384`, `ViT_***_384`, etc., please pay attention to the input data size of the model: you need to specify `--resize_short=384 --crop_size=384`.
**Return result format description**: The returned result is a list, including the top-k classification results, the corresponding scores, and the prediction time for each image, as follows:

```text
list: return result
└── list: first image result
    ├── list: the top-k classification results, sorted in descending order of score
    ├── list: the scores corresponding to the top-k classification results, sorted in descending order of score
    └── float: the image classification time, in seconds
```
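For illustration, a client could unpack one image's result from this structure as sketched below. The example result literal is fabricated for demonstration, and the way the surrounding JSON wraps this list depends on `module.py`, so treat the shape shown here as an assumption based on the description above.

```python
# Hypothetical per-image result following the documented structure:
# [top-k class ids, corresponding scores, prediction time in seconds]
first_image_result = [
    [57, 67, 68, 58, 65],                           # top-k classification results
    [0.21915, 0.15631, 0.14794, 0.13177, 0.12285],  # corresponding scores
    0.5,                                            # classification time (s)
]

class_ids, scores, cost = first_image_result
for class_id, score in zip(class_ids, scores):
    print(f"class_id={class_id}, score={score:.5f}")
print(f"classification time: {cost:.3f}s")
```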
## 7. User-defined service module modification

If you need to modify the service logic, perform the following steps:

1. Stop the service:

   ```shell
   hub serving stop --port/-p XXXX
   ```

2. Modify the code as needed in the corresponding files, such as `module.py` and `params.py`. After modifying `module.py`, it needs to be reinstalled (`hub install hubserving/clas/`) and deployed. Before deploying, you can use the `python3.7 hubserving/clas/module.py` command to quickly test the code ready for deployment.

3. Uninstall the old service package:

   ```shell
   hub uninstall clas_system
   ```

4. Install the newly modified service package:

   ```shell
   hub install hubserving/clas/
   ```

5. Restart the service:

   ```shell
   hub serving start -m clas_system
   ```
**Notice**: Common parameters can be modified in `PaddleClas/deploy/hubserving/clas/params.py`:

- To replace the model, modify the model file path parameter: `"inference_model_dir":`
- To change the number of `top-k` results returned during post-processing, modify: `'topk':`
- To change the mapping file between labels and class ids used during post-processing, modify: `'class_id_map_file':`
In order to avoid unnecessary delay and to support prediction with `batch_size`, data preprocessing logic (including `resize`, `crop`, and other operations) is performed on the client side. Therefore, the related data preprocessing code in `PaddleClas/deploy/hubserving/test_hubserving.py` (L41-L47 and L51-L76) needs to be modified accordingly.
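For reference when making such modifications, the sketch below reproduces the preprocessing steps implied by the script parameters in Section 6 (resize by short edge, center crop, normalize, transpose to `CHW`). It is an independent illustration using PIL and NumPy, not the exact code from `test_hubserving.py`, and the ImageNet mean/std values are an assumption.

```python
import numpy as np
from PIL import Image

def preprocess(img_path, resize_short=256, crop_size=224, normalize=True, to_chw=True):
    img = Image.open(img_path).convert("RGB")
    # resize so that the short edge equals resize_short, keeping the aspect ratio
    w, h = img.size
    scale = resize_short / min(w, h)
    img = img.resize((round(w * scale), round(h * scale)))
    # center crop to crop_size x crop_size
    w, h = img.size
    left, top = (w - crop_size) // 2, (h - crop_size) // 2
    img = img.crop((left, top, left + crop_size, top + crop_size))
    arr = np.asarray(img).astype("float32") / 255.0
    if normalize:
        # ImageNet mean/std, commonly used by PaddleClas models (an assumption here)
        mean = np.array([0.485, 0.456, 0.406], dtype="float32")
        std = np.array([0.229, 0.224, 0.225], dtype="float32")
        arr = (arr - mean) / std
    if to_chw:
        arr = arr.transpose((2, 0, 1))  # HWC -> CHW
    return arr
```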