Hi friends 👋. Bumblebee is an amazing project, and I'm excited about the prospect of integrating it into my Phoenix LiveView web app.
Description of Problem
`speech_to_text_whisper_chunk` only outputs the raw text, start time, and end time for each chunk. There is nothing comparable to (and no easy way to replicate) the per-segment `avg_logprob` that the Python-native Whisper API provides.
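For context, in the Python Whisper implementation `avg_logprob` is essentially the mean of the per-token log-probabilities over a segment. A minimal sketch of the idea (the function name and sample values below are hypothetical, not part of any existing API):

```python
import math

def avg_logprob(token_logprobs):
    """Mean per-token log-probability for one segment.
    Values closer to 0 mean the model was more confident."""
    return sum(token_logprobs) / len(token_logprobs)

# Hypothetical per-token log-probabilities for one transcribed segment
segment_logprobs = [-0.05, -0.10, -1.20, -0.02]

score = avg_logprob(segment_logprobs)   # -0.3425
confidence = math.exp(score)            # geometric-mean token probability, ~0.71
```

Exposing something like the per-token log-probabilities (or even just this per-chunk mean) from the serving pipeline would be enough to reconstruct the Python behavior.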
Opportunity Statement (example use case)
AI-generated transcripts are getting better, but they still often need to be cleaned by a human before use in a professional or research setting. Human cleaning can be performed much more efficiently if attention is directed to the places where the model was least confident in its output.
For example, I'd like to use the confidences/probabilities to return transcripts to users in a .docx format, where tokens/segments with low confidence are highlighted.
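To illustrate the use case, here is a rough sketch of the kind of post-processing this would enable: flag any segment whose confidence falls below a threshold so a downstream renderer (e.g. a .docx writer) can highlight it. The segment shape and threshold are hypothetical:

```python
def flag_low_confidence(segments, threshold=-1.0):
    """Mark segments whose avg_logprob falls below a (hypothetical)
    threshold so a renderer can highlight them for human review."""
    return [
        {**seg, "needs_review": seg["avg_logprob"] < threshold}
        for seg in segments
    ]

# Hypothetical transcript segments with per-segment confidence scores
segments = [
    {"text": "Hello there,", "avg_logprob": -0.2},
    {"text": "mumble mumble", "avg_logprob": -1.8},
]
flagged = flag_low_confidence(segments)
# flagged[0]["needs_review"] is False; flagged[1]["needs_review"] is True
```

None of this is hard once the scores exist; the missing piece is just getting the confidence values out of the serving pipeline.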
I would really like Whisper to produce subtitles that satisfy constraints (e.g., at most 36 characters per line), so supporting an option that returns `word_timestamps` would be very nice.