Solve zipformer streaming GPU inference #961

Open
wants to merge 1 commit into base: master

Conversation

@whaozl commented Mar 23, 2023

No description provided.

@yaozengwei (Collaborator) commented

The script jit_trace_export.py exports the model with torch.jit.trace. Why replace it with torch.jit.script?

@yfyeung (Collaborator) left a comment

Please explain the reason for this modification.

@whaozl (Author) commented Mar 23, 2023

@yaozengwei Because torch.jit.trace on the encoder model raises RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

See k2-fsa/sherpa#346.

So only the encoder needs to be exported with torch.jit.script; the decoder and joiner are still exported with torch.jit.trace.
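
A minimal sketch of what this split amounts to, assuming the model is loaded as in jit_trace_export.py and exposes encoder, decoder, and joiner submodules; the example-input shapes and output filenames below are illustrative, not the exact PR diff:

```python
import torch

# Hypothetical sketch, not the exact PR change: script the encoder, keep
# tracing for decoder and joiner. `model` is assumed to be the streaming
# zipformer transducer loaded as in jit_trace_export.py.

# Encoder: torch.jit.script keeps device handling symbolic, so the exported
# module is not tied to the device used at export time.
encoder = torch.jit.script(model.encoder)
encoder.save("encoder_jit_script.pt")

# Decoder and joiner: unchanged, still exported with torch.jit.trace.
y = torch.zeros(10, 2, dtype=torch.int64)           # (batch, context) tokens, placeholder shape
decoder = torch.jit.trace(model.decoder, y)
decoder.save("decoder_jit_trace.pt")

enc_out = torch.rand(10, 512, dtype=torch.float32)  # placeholder dims
dec_out = torch.rand(10, 512, dtype=torch.float32)
joiner = torch.jit.trace(model.joiner, (enc_out, dec_out))
joiner.save("joiner_jit_trace.pt")
```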

@yaozengwei (Collaborator) commented

Could you try converting the model to the cuda device instead of cpu when doing the jit.trace export? In that case we would also need to create the example inputs on the cuda device, where the current export script does
x = torch.zeros(1, T, 80, dtype=torch.float32)
I wonder whether we need to export the model on the cuda device when we want to run the model on the cuda device.
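
A hedged sketch of that suggestion, assuming the same model object and 80-dim features as the existing export script; the streaming states and the exact encoder entry point are elided here and would follow jit_trace_export.py:

```python
import torch

device = torch.device("cuda:0")
model.to(device)
model.eval()

# Build the example inputs on the same cuda device before tracing, so the
# trace is recorded entirely with cuda tensors. T is an illustrative number
# of frames; 80 is the feature dimension used by the export script.
T = 39
x = torch.zeros(1, T, 80, dtype=torch.float32, device=device)
x_lens = torch.full((1,), T, dtype=torch.int64, device=device)

traced_encoder = torch.jit.trace(model.encoder, (x, x_lens))
traced_encoder.save("encoder_jit_trace_cuda.pt")
```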

See https://pytorch.org/docs/stable/jit.html#frequently-asked-questions
[Screenshot of the linked PyTorch JIT FAQ section]

@whaozl (Author) commented Mar 23, 2023

@yaozengwei

There are two scenarios:

1. Converting the model to the cuda device instead of cpu when doing the jit.trace export (model.to("cuda:0")) fails with either
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
or
torch.jit._trace.TracingCheckError: Tracing failed sanity checks!
ERROR: Graphs differed across invocations!
because of https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/pruned_transducer_stateless7_streaming/zipformer.py#L2280-L2281, even after modifying

rows = torch.arange(start=time1 - 1, end=-1, step=-1)

to

rows = torch.arange(start=time1 - 1, end=-1, step=-1).cuda()

2. When the model is exported on cpu and sherpa online (https://github.com/k2-fsa/sherpa/blob/master/sherpa/cpp_api/bin/online-recognizer.cc) runs inference on GPU, it fails with
Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

So the best solution is to change how the encoder is exported and use torch.jit.script.
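
For reference, a self-contained illustration (not the icefall code; it assumes a CUDA device is available) of how a torch.arange created on cpu at trace time ends up baked into the traced graph and then collides with cuda inputs:

```python
import torch

def toy_forward(x: torch.Tensor) -> torch.Tensor:
    time1 = x.shape[1]
    # As in zipformer.py#L2280-L2281: no device argument, so the tensor is
    # created on the device in use at trace time (CPU here).
    rows = torch.arange(start=time1 - 1, end=-1, step=-1, dtype=x.dtype)
    return x + rows

traced = torch.jit.trace(toy_forward, torch.zeros(1, 5))

traced(torch.zeros(1, 5))                  # fine on CPU
traced(torch.zeros(1, 5, device="cuda"))   # RuntimeError: Expected all tensors to be
                                           # on the same device, but found at least
                                           # two devices, cuda:0 and cpu!
```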

@yaozengwei (Collaborator) commented

Could you successfully export the model if you make the change rows = torch.arange(start=time1 - 1, end=-1, step=-1).cuda()?

The reason we export with jit.trace instead of jit.script is that some inference frameworks require it.

@whaozl (Author) commented Mar 23, 2023

When I use rows = torch.arange(start=time1 - 1, end=-1, step=-1).cuda(), the export still fails.

So I use torch.jit.script for the encoder model instead; sherpa online then runs successfully with use_gpu.

@yaozengwei (Collaborator) commented

OK. So the exported encoder that you are running on the cuda device is the jit.script version.
