Chat Streaming doesn't work with audio modalities #398

Closed
vbandi opened this issue Jan 10, 2025 · 1 comment · Fixed by #399 · May be fixed by RageAgainstThePixel/com.openai.unity#323
Labels: bug

vbandi commented Jan 10, 2025

Bug Report

Overview

The gpt-4o-audio-preview-2024-12-17 model supports audio as an output modality, but this doesn't seem to work when streaming chat completions.

To Reproduce

Steps to reproduce the behavior:

With the code below, the streamed chunks contain no Delta and no audio data.

async Task GPTSpeech()
{
    var client = new OpenAIClient();
    var speaker = new SpeakerOutput();

    var chatRequest = new ChatRequest([new Message(Role.System, "Count from 1 to 10. Whisper please")],
        audioConfig: new AudioConfig(Voice.Nova), model: "gpt-4o-audio-preview-2024-12-17");  // Doesn't seem to work... OpenAI Lib issue??
        
    await foreach (var chunk in client.ChatEndpoint.StreamCompletionEnumerableAsync(chatRequest))
    {
        if (chunk.FirstChoice.Delta is not null)
            Console.Write(chunk.FirstChoice.Delta.Content);

        if (chunk.FirstChoice.Message?.AudioOutput is not null)
            Console.WriteLine(chunk.FirstChoice.Message.AudioOutput.Data.Length);
    }

    Console.WriteLine("Done.");
    Console.ReadKey();
}

However, streaming still works when the ChatRequest is created without the audio configuration:

    var chatRequest = new ChatRequest([new Message(Role.System, "Count from 1 to 10. Whisper please")]);

Expected behavior

Chunks should contain text and/or audio content when the model is generating audio.
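
As a rough illustration of that expectation, here is a minimal, library-independent sketch of how a consumer might accumulate both kinds of content from a stream. `StreamChunk`, `StreamAccumulator`, and `FakeStream` are hypothetical names invented for this example and are not OpenAI-DotNet types. The sketch also assumes each chunk carries its audio as an independently decodable base64 string, which may not match what the library actually exposes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Threading.Tasks;

// Hypothetical stand-in for one streamed chunk; not an OpenAI-DotNet type.
public record StreamChunk(string? Text, string? AudioBase64);

public static class StreamAccumulator
{
    // Collects the text and the decoded audio bytes from a stream of chunks.
    public static async Task<(string Text, byte[] Audio)> CollectAsync(
        IAsyncEnumerable<StreamChunk> chunks)
    {
        var text = new StringBuilder();
        using var audio = new MemoryStream();

        await foreach (var chunk in chunks)
        {
            if (chunk.Text is not null)
                text.Append(chunk.Text);

            if (chunk.AudioBase64 is not null)
            {
                // Assumption: every chunk is valid base64 on its own.
                var bytes = Convert.FromBase64String(chunk.AudioBase64);
                audio.Write(bytes, 0, bytes.Length);
            }
        }

        return (text.ToString(), audio.ToArray());
    }

    // Simulated stream, used only for illustration.
    public static async IAsyncEnumerable<StreamChunk> FakeStream()
    {
        yield return new StreamChunk("1, ", Convert.ToBase64String(new byte[] { 1, 2 }));
        await Task.Yield();
        yield return new StreamChunk("2", Convert.ToBase64String(new byte[] { 3 }));
        yield return new StreamChunk(null, null); // an empty chunk, like the ones in this report
    }
}
```

Calling `await StreamAccumulator.CollectAsync(StreamAccumulator.FakeStream())` on the simulated stream yields the text "1, 2" and three audio bytes. In the failing case described above, both results would stay empty.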

vbandi added the bug label Jan 10, 2025
StephenHodgson added the help wanted, good first issue, and enhancement labels and removed the bug label Jan 10, 2025
StephenHodgson self-assigned this Jan 10, 2025
StephenHodgson linked a pull request Jan 11, 2025 that will close this issue
StephenHodgson added a commit that referenced this issue Jan 12, 2025
StephenHodgson (Member) commented

This ended up being broken for both the async and the Enumerable streaming paths.

StephenHodgson changed the title from "StreamCompletionEnumerableAsync doesn't work with audio" to "Chat Streaming doesn't work with audio modalities" Jan 12, 2025
StephenHodgson added the bug label and removed the enhancement, help wanted, and good first issue labels Jan 12, 2025