billju/onnxruntime_transformers

onnxruntime_transformers

Transformers for production runtime: 3x faster on CPU, with neither PyTorch nor TensorFlow included.

convert models to onnx

install converter

pip install optimum[exporters]

convert embedding model to onnx

optimum-cli export onnx --task sentence-similarity --model "infgrad/stella-base-zh-v3-1792d" bert_embed

convert sentence correction model to onnx

optimum-cli export onnx --task fill-mask --model "shibing624/macbert4csc-base-chinese" bert_csc

convert ner model to onnx

optimum-cli export onnx --task token-classification --model "shibing624/bert4ner-base-chinese" bert_ner
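
The three exports above could also be driven from a single script. The following is only a sketch: `build_export_command`, `missing_export_files`, and `export_all` are hypothetical helper names, not part of this repository or of optimum. The list of expected files is an assumption based on the two paths that the inference example reads (`tokenizer.json` and `model.onnx`); the actual export may write additional files.

```python
import subprocess
from pathlib import Path

# (task, model id, output directory) triples copied from the commands above.
EXPORTS = [
    ("sentence-similarity", "infgrad/stella-base-zh-v3-1792d", "bert_embed"),
    ("fill-mask", "shibing624/macbert4csc-base-chinese", "bert_csc"),
    ("token-classification", "shibing624/bert4ner-base-chinese", "bert_ner"),
]

# Assumption: these are the two files the inference example loads.
EXPECTED_FILES = ("tokenizer.json", "model.onnx")


def build_export_command(task, model, output_dir):
    """Return the optimum-cli argument list for one export (hypothetical helper)."""
    return ["optimum-cli", "export", "onnx", "--task", task, "--model", model, output_dir]


def missing_export_files(output_dir):
    """Return the expected file names that are absent from output_dir."""
    directory = Path(output_dir)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]


def export_all():
    """Run every export in turn and report directories that look incomplete."""
    for task, model, output_dir in EXPORTS:
        # check=True raises CalledProcessError if optimum-cli exits with an error.
        subprocess.run(build_export_command(task, model, output_dir), check=True)
        missing = missing_export_files(output_dir)
        if missing:
            print(f"{output_dir}: missing {', '.join(missing)}")
```

Calling `export_all()` runs the three exports one after another. It needs `optimum-cli` on the PATH and network access to download the models.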

inference with onnx

generate embeddings

from onnxruntime_transformers import OnnxruntimeTransformers

# Load the tokenizer and ONNX model written by the embedding export above.
encoder = OnnxruntimeTransformers("./bert_embed/tokenizer.json", "./bert_embed/model.onnx")

# Encode a batch of sentences.
embeddings = encoder.encode([
    "how are you",
    "I'm fine thank you, and you?",
])
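
A common next step is to compare two embeddings with cosine similarity. The sketch below assumes that `encode` returns one numeric vector per input sentence, which this README does not confirm. `cosine_similarity` is a hypothetical helper written in plain Python, and the vectors are stand-ins for illustration, not real model output.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Stand-in vectors; real embeddings would come from encoder.encode(...).
first = [1.0, 0.0, 1.0]
second = [1.0, 1.0, 0.0]
score = cosine_similarity(first, second)
```

With real embeddings, the call would presumably look like `cosine_similarity(embeddings[0], embeddings[1])`, with values closer to 1 indicating more similar sentences.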
