scripts/vsmlrt.py: document fp16 behaviour of the ort_cuda backend
WolframRhodium committed Apr 20, 2024
1 parent 954733e commit 61682d2
Showing 1 changed file with 12 additions and 0 deletions.
scripts/vsmlrt.py
@@ -78,6 +78,18 @@ class ORT_CUDA:
basic performance tuning:
set fp16 = True (on RTX GPUs)
Semantics of `fp16`:
Enabling `fp16` will use a built-in quantization that converts an fp32 onnx to an fp16 onnx.
If the input video is of half-precision floating-point format,
the generated fp16 onnx will use fp16 input.
The output format can be controlled by the `output_format` option (0 = fp32, 1 = fp16).
Disabling `fp16` will not use the built-in quantization.
However, if the onnx file itself uses fp16 for computation,
the actual computation will be done in fp16.
In this case, the input video format should match the input format of the onnx,
and the output format is inferred from the onnx.
"""

device_id: int = 0
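A minimal usage sketch (not part of the commit) illustrating the two `fp16` paths the docstring describes. It assumes the `vsmlrt.inference` helper and the `Backend.ORT_CUDA` dataclass accept the options named in the docstring (`device_id`, `fp16`, `output_format`); the model filenames are hypothetical placeholders.

```python
import vapoursynth as vs
from vsmlrt import Backend, inference

core = vs.core

# Half-precision input clip; RGBH is VapourSynth's 16-bit half-float RGB format.
clip = core.std.BlankClip(format=vs.RGBH, width=640, height=360)

# Case 1: fp16=True. vsmlrt quantizes the fp32 onnx to fp16 internally.
# Because the clip is RGBH (half precision), the generated fp16 onnx uses
# fp16 input; output_format=1 requests fp16 output (0 would keep fp32).
fp16_backend = Backend.ORT_CUDA(device_id=0, fp16=True, output_format=1)
out1 = inference(clip, network_path="model_fp32.onnx", backend=fp16_backend)

# Case 2: fp16=False with an onnx that already computes in fp16.
# No quantization is applied, computation still runs in fp16, the clip
# format must match the onnx input (fp16 here), and the output format
# is inferred from the onnx.
raw_backend = Backend.ORT_CUDA(device_id=0, fp16=False)
out2 = inference(clip, network_path="model_fp16.onnx", backend=raw_backend)
```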
