Track token scores #571
Conversation
Yes, it is a useful feature and will benefit some users who want to get a confidence for each token. In addition to the modified_beam_search, could you also update the greedy_search?
(force-pushed from f8d2abd to 6c5d358)
Okay, I extended it also for the greedy_search.
(force-pushed from 6c5d358 to a4bc688)
Hi Fangyun @csukuangfj, I manually tested it with a CPU-compiled sherpa-onnx through the Python API. In the GitHub test workflow I see a segmentation fault for Online-CTC in the linux-gpu test. Could you also run the tests locally to investigate?
Thanks! I will have time after today. Will look into it tomorrow.
I see the error. It is a use-after-free error. Please see my comments about the code.
```cpp
/// Helper for `OnlineRecognizerResult::AsJsonString()`
template<typename T>
const std::string& VecToString(const std::vector<T>& vec, int32_t precision = 6) {
```

Suggested change:

```diff
- const std::string& VecToString(const std::vector<T>& vec, int32_t precision = 6) {
+ std::string VecToString(const std::vector<T>& vec, int32_t precision = 6) {
```
```cpp
/// Helper for `OnlineRecognizerResult::AsJsonString()`
template<>  // explicit specialization for T = std::string
const std::string& VecToString<std::string>(const std::vector<std::string>& vec,
```

Suggested change:

```diff
- const std::string& VecToString<std::string>(const std::vector<std::string>& vec,
+ std::string VecToString<std::string>(const std::vector<std::string>& vec,
```
Please never return a temporary reference from a function.
Aha, thank you.
My bad. I remember in some context returning a local object as a const reference worked, but here I did it badly.
https://stackoverflow.com/questions/13318257/const-reference-to-temporary-vs-return-value-optimization
Best, K.
> I remember in some context returning a local object as a const reference worked

Yes, you are right. But in this case the reference is to the return value of a method called on a temporary object. Think twice: there are two indirections.

Suppose `obj` is a local variable in a function. It is totally fine to call `return obj;` to return a const reference. But it is not valid to invoke `return obj.someMethod();`, since `obj` is destroyed when the function returns.
Yes, you are right.
The code is fixed now. I tested the JSON output from Python, and it looks fine.
Also, lm_probs and context_scores are now filled only if an LM or ContextGraph is used; otherwise they are empty lists, to save bandwidth.
Thank you for finding the bug,
Karel
(force-pushed from 0ab7a40 to e442564)
I see that you have removed WIP. Do you think it is ready for review and merge?
(force-pushed from e442564 to a164e13)
(force-pushed from 48e4a6a to 4955fe8)
Yes, it is almost ready to be merged. It is ready for the review.
Ok, the workflow tests seem fine, including the linter style-check... It is ready...
I will have a second look after dinner. Thanks!
Thanks again!
Left some minor comments. Otherwise, it looks good to me.
```cpp
std::ostringstream oss;
oss << "[ ";
std::string sep = "";
for (auto item : vec) {
```
Suggested change:

```diff
- for (auto item : vec) {
+ for (const auto &item : vec) {
```
```cpp
LogSoftmax(p_logit, vocab_size);  // renormalize probabilities,
                                  // save time by doing it only for
                                  // emitted symbols
float *p_logprob = p_logit;       // rename p_logit as p_logprob,
```
Suggested change:

```diff
- float *p_logprob = p_logit;       // rename p_logit as p_logprob,
+ const float *p_logprob = p_logit; // rename p_logit as p_logprob,
```
(force-pushed from 4955fe8 to 2cba75f)
Okay, the two suggestions are implemented. Thank you!
Thanks!
* add export of per-token scores (ys, lm, context) for the best path of the modified-beam-search decoding of the transducer
* refactoring the JSON export of OnlineRecognitionResult, extending the pybind11 API of OnlineRecognitionResult
* export per-token scores also for greedy-search (online-transducer); export un-scaled lm_probs (modified-beam-search, online-transducer); polishing
* fill lm_probs/context_scores only if LM/ContextGraph is present (make Result smaller)
Hello,
I modified the code so that the per-token scores are accessible via the Python API
(ys_probs, lm_probs, context_scores).
They are tracked during the modified-beam-search decoding of online transducers,
and exported for the best hypothesis.
Would you be interested in having this functionality in the codebase?
These can be used as confidences on the client side,
but I have not yet evaluated how "useful" these numbers are.
I am ready to make changes as well; I'd like to open the discussion.
Best regards
K. Veselý