Streaming a Response in Foundation Models
- lioneldude
- Jun 27
- 1 min read
Foundation Models is a new framework introduced at WWDC25.
From the documentation:
The Foundation Models framework provides access to Apple’s on-device large language model that powers Apple Intelligence to help you perform intelligent tasks specific to your use case. The text-based on-device model identifies patterns that allow for generating new text that’s appropriate for the request you make, and it can make decisions to call code you write to perform specialized tasks.
Generate text content based on requests you make. The on-device model excels at a diverse range of text generation tasks, like summarization, entity extraction, text understanding, refinement, dialog for games, generating creative content, and more.
Define the session as:
let session = LanguageModelSession()
When the user asks a question, you would usually use:
let prompt = "Tell me an interesting fact about using an on device large language model"
let response = try await session.respond(to: prompt)
However, respond(to:) must await the full answer from the session before returning, and that is not the best user experience: there is no indication of what is happening until the response arrives.
In the example below, I used streamResponse(to:) instead.
Here is how streamResponse can be used in an askQuestion() function:

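A minimal sketch of such a function, assuming a SwiftUI-style view model where `output` is an observable property driving the UI (the property name and error handling are illustrative, and the exact shape of the streamed element may differ between framework versions):

```swift
import FoundationModels

@MainActor
func askQuestion() async {
    let session = LanguageModelSession()
    let prompt = "Tell me an interesting fact about using an on device large language model"
    do {
        // streamResponse(to:) returns an async sequence of partial results.
        let stream = session.streamResponse(to: prompt)
        // Each element is a snapshot of the response generated so far,
        // so assigning it to the UI shows the text growing as it arrives.
        for try await partial in stream {
            output = partial
        }
    } catch {
        output = "Something went wrong: \(error.localizedDescription)"
    }
}
```

Because the UI updates on every partial snapshot, the user sees text appearing immediately instead of waiting for the complete response.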
Foundation Models looks promising and should help developers build apps that fully leverage the on-device large language model.