The HasteKit SDK supports text-to-speech generation through LLM providers such as OpenAI and Gemini.
Generate speech
Generate audio from text input. The response contains raw binary audio data and content type information.
import (
	"context"
	"fmt"
	"log"
	"os"

	"github.com/hastekit/hastekit-sdk-go/pkg/gateway/llm/speech"
)

// client is an initialized HasteKit SDK client.
resp, err := client.NewSpeech(context.Background(), &speech.Request{
	Model: "OpenAI/tts-1",
	Input: "Hello, this is a test of the text-to-speech system.",
	Voice: "alloy",
})
if err != nil {
	log.Fatal(err)
}

// Save audio to file
err = os.WriteFile("output.mp3", resp.Audio, 0644)
if err != nil {
	log.Fatal(err)
}

// Access content type
fmt.Println("Content-Type:", resp.ContentType)
Response
The response contains the audio data as a byte slice ([]byte) and the content type (e.g., audio/mpeg, audio/opus).
// Access raw audio bytes
audioData := resp.Audio
fmt.Printf("Audio size: %d bytes\n", len(audioData))
// Access content type
fmt.Println("Content-Type:", resp.ContentType)
// The audio data may be gzip-compressed, which is automatically handled by the SDK
Streaming Speech
For real-time applications, the SDK supports streaming speech generation. NewStreamingSpeech returns a channel that yields audio chunks as they are generated.
import (
	"context"
	"fmt"

	"github.com/hastekit/hastekit-sdk-go/pkg/gateway/llm/speech"
	"github.com/hastekit/hastekit-sdk-go/pkg/utils"
)

stream, err := client.NewStreamingSpeech(context.Background(), &speech.Request{
	Model: "OpenAI/tts-1",
	Input: "This is a streaming text-to-speech example.",
	Voice: "alloy",
	Speed: utils.Ptr(1), // Speed parameter enables streaming
})
if err != nil {
	panic(err)
}

// Collect the base64-encoded audio chunks as they arrive
var audioChunks []string
for chunk := range stream {
	// Handle audio delta chunks
	if chunk.OfAudioDelta != nil {
		audioChunks = append(audioChunks, chunk.OfAudioDelta.Audio)
	}
	// Handle completion chunk with usage statistics
	if chunk.OfAudioDone != nil {
		fmt.Printf("\nCompleted! Tokens used: %d\n", chunk.OfAudioDone.Usage.TotalTokens)
	}
}
Request Configuration
| Parameter | Type | Description |
|---|---|---|
| Input | string | The text input to convert to speech. |
| Model | string | The speech model to use (e.g., "tts-1", "tts-1-hd"). The examples on this page prefix the provider name, as in "OpenAI/tts-1". |
| Voice | string | The voice to use (e.g., "alloy", "echo", "fable", "onyx", "nova", "shimmer"). |
| Instruction | *string | Optional. Instructions for the voice style or pronunciation. |
| ResponseFormat | *string | Optional. Audio format: "mp3", "opus", "aac", "flac", or "pcm". Default is "mp3". |
| Speed | *int | Optional. Playback speed of the generated audio. When set, enables streaming mode. Typical range: 0.25 to 4.0. |
| StreamFormat | *string | Optional. Format for streaming responses (e.g., "pcm"). |
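Checking a request on the client side before sending it can catch simple mistakes early. The sketch below is one way to do that, not SDK behavior: speechRequest is a local stand-in that mirrors a few fields from the table above, and validate and knownFormats are hypothetical names. The format list is the one given in the ResponseFormat row.

```go
package main

import (
	"errors"
	"fmt"
)

// speechRequest is a local stand-in mirroring fields from the table above.
// It is an assumption for illustration, not the SDK's speech.Request type.
type speechRequest struct {
	Input          string
	Model          string
	Voice          string
	ResponseFormat *string
}

// knownFormats holds the audio formats listed in the ResponseFormat row.
var knownFormats = map[string]bool{
	"mp3": true, "opus": true, "aac": true, "flac": true, "pcm": true,
}

// validate performs client-side checks before a request is sent.
func validate(r speechRequest) error {
	if r.Input == "" {
		return errors.New("input must not be empty")
	}
	if r.Model == "" {
		return errors.New("model must not be empty")
	}
	if r.Voice == "" {
		return errors.New("voice must not be empty")
	}
	// A nil ResponseFormat is valid: the documented default ("mp3") applies.
	if r.ResponseFormat != nil && !knownFormats[*r.ResponseFormat] {
		return fmt.Errorf("unsupported response format %q", *r.ResponseFormat)
	}
	return nil
}

func main() {
	format := "ogg" // not in the documented list
	req := speechRequest{
		Input:          "Hello",
		Model:          "OpenAI/tts-1",
		Voice:          "alloy",
		ResponseFormat: &format,
	}
	if err := validate(req); err != nil {
		fmt.Println("invalid request:", err)
	}
}
```

Voice names are deliberately not checked against a fixed list here, since the set of voices may differ per provider and model.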
Response Structure
| Field | Type | Description |
|---|---|---|
| Audio | []byte | Raw binary audio data. Automatically decompressed if gzip-compressed. |
| ContentType | string | MIME type of the audio (e.g., "audio/mpeg", "audio/opus"). |
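Because the audio format depends on the request, the ContentType field can be used to pick a file extension instead of hard-coding "output.mp3". The following is a minimal sketch under assumptions: extensionFor is a hypothetical helper, only audio/mpeg and audio/opus are content types named in this document, and the other map entries are guesses at what a provider might return.

```go
package main

import (
	"fmt"
	"mime"
)

// extensions maps MIME types to file extensions. The first two types appear
// in this document; the remaining entries are assumptions.
var extensions = map[string]string{
	"audio/mpeg": ".mp3",
	"audio/opus": ".opus",
	"audio/aac":  ".aac",
	"audio/flac": ".flac",
}

// extensionFor returns a file extension for the given Content-Type value,
// falling back to ".bin" when the type is missing or unknown.
func extensionFor(contentType string) string {
	// ParseMediaType strips parameters such as "; codecs=opus" and
	// lowercases the media type.
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return ".bin"
	}
	if ext, ok := extensions[mediaType]; ok {
		return ext
	}
	return ".bin"
}

func main() {
	for _, ct := range []string{"audio/mpeg", "audio/opus; codecs=opus", "text/plain"} {
		fmt.Printf("%s -> output%s\n", ct, extensionFor(ct))
	}
}
```

With a real response, the file name could then be built as "output" + extensionFor(resp.ContentType) before calling os.WriteFile.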
Streaming Response Chunks
When using streaming, the response channel yields ResponseChunk objects:
ChunkAudioDelta
Contains incremental audio data during streaming.
| Field | Type | Description |
|---|---|---|
| Type | ChunkTypeAudioDelta | Chunk type identifier: "speech.audio.delta". |
| Audio | string | Base64-encoded audio chunk data. |
ChunkAudioDone
Sent when streaming is complete; includes usage statistics.
| Field | Type | Description |
|---|---|---|
| Type | ChunkTypeAudioDone | Chunk type identifier: "speech.audio.done". |
| Usage | Usage | Token usage statistics. |
Usage
| Field | Type | Description |
|---|---|---|
| InputTokens | int | Number of input tokens processed. |
| OutputTokens | int | Number of output tokens generated. |
| TotalTokens | int | Total tokens used. |
Example: Complete Speech Generation
package main

import (
	"context"
	"log"
	"os"

	hastekit "github.com/hastekit/hastekit-sdk-go"
	"github.com/hastekit/hastekit-sdk-go/pkg/gateway"
	"github.com/hastekit/hastekit-sdk-go/pkg/gateway/llm"
	"github.com/hastekit/hastekit-sdk-go/pkg/gateway/llm/speech"
	"github.com/hastekit/hastekit-sdk-go/pkg/utils"
)

func main() {
	// Initialize SDK client
	client, err := hastekit.New(&hastekit.ClientOptions{
		ProviderConfigs: []gateway.ProviderConfig{
			{
				ProviderName:  llm.ProviderNameOpenAI,
				BaseURL:       "",
				CustomHeaders: nil,
				ApiKeys: []*gateway.APIKeyConfig{
					{
						Name:   "Key 1",
						APIKey: os.Getenv("OPENAI_API_KEY"),
					},
				},
			},
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	// Generate speech
	resp, err := client.NewSpeech(context.Background(), &speech.Request{
		Model:          "OpenAI/tts-1",
		Input:          "Hello! This is a text-to-speech example using HasteKit SDK.",
		Voice:          "alloy",
		ResponseFormat: utils.Ptr("mp3"),
	})
	if err != nil {
		log.Fatal(err)
	}

	// Save audio file
	err = os.WriteFile("output.mp3", resp.Audio, 0644)
	if err != nil {
		log.Fatal(err)
	}

	log.Printf("Audio generated successfully! Size: %d bytes, Type: %s\n",
		len(resp.Audio), resp.ContentType)
}
Supported Providers
| Provider | Speech |
|---|---|
| OpenAI | ✅ |
| Gemini | ✅ |
| Anthropic | ❌ |
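The examples on this page write the model as "OpenAI/tts-1", which suggests a "provider/model" form. Assuming that convention holds, the sketch below splits such a string and checks the provider against the table above. splitModel and speechSupport are hypothetical names, and the exact spelling of the provider prefix for providers other than OpenAI is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// speechSupport mirrors the provider table above. The key spellings for
// Gemini and Anthropic are assumed to match the table's display names.
var speechSupport = map[string]bool{
	"OpenAI":    true,
	"Gemini":    true,
	"Anthropic": false,
}

// splitModel splits "provider/model" at the first slash.
// ok is false when either part is missing.
func splitModel(id string) (string, string, bool) {
	provider, model, found := strings.Cut(id, "/")
	if !found || provider == "" || model == "" {
		return "", "", false
	}
	return provider, model, true
}

func main() {
	provider, model, ok := splitModel("OpenAI/tts-1")
	if !ok {
		fmt.Println("model must be written as provider/model")
		return
	}
	fmt.Printf("provider=%s model=%s speech=%t\n", provider, model, speechSupport[provider])
}
```

A check like this can fail fast with a clear message before a request reaches a provider that does not offer speech generation.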