Getting Started with Speech-to-Text
Convert your pre-recorded audio files to text using the SpeechLytics Speech-to-Text API. This guide walks you through the complete workflow, from authentication to retrieving your transcription results.
Prerequisites
- Valid SpeechLytics account credentials
- Audio file in a supported format (WAV, MP3, M4A, etc.)
- Bearer token (obtained from authentication endpoint)
Step 1: Get Authentication Token
First, obtain a Bearer token using your credentials:
```bash
curl -X POST "https://api.example.com/api/v1/auth/token" \
  -H "Content-Type: application/json" \
  -d '{
    "Username": "your_username",
    "Password": "your_password"
  }'
```
Response:
```json
{
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "expires": "2025-11-28T23:59:59Z",
  "eventId": "evt_12345"
}
```
Save this token for use in subsequent API calls. See Authentication Guide for more details.
Step 2: Prepare Your Audio File
Convert your audio file to Base64 format. Here are examples for different platforms:
Linux
```bash
base64 -w 0 your_audio_file.wav > audio_base64.txt
```
macOS
```bash
base64 -i your_audio_file.wav -o audio_base64.txt
```
Note: GNU `base64` (Linux) wraps output at 76 characters by default; `-w 0` disables wrapping. The BSD version on macOS does not wrap and uses `-i`/`-o` for the input and output files.
Windows PowerShell
```powershell
[Convert]::ToBase64String([IO.File]::ReadAllBytes("your_audio_file.wav")) | Out-File -Encoding utf8 audio_base64.txt
```
Python
```python
import base64

with open('your_audio_file.wav', 'rb') as f:
    audio_base64 = base64.b64encode(f.read()).decode('utf-8')
print(audio_base64)
```
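Keep in mind that Base64 inflates the payload by about a third (4 output bytes for every 3 input bytes), which matters for large recordings. A small sketch that encodes a file and sanity-checks the round-trip (`encode_audio` is a hypothetical helper):

```python
import base64

def encode_audio(path):
    """Read an audio file and return its Base64 string."""
    with open(path, "rb") as f:
        raw = f.read()
    encoded = base64.b64encode(raw).decode("utf-8")
    # Sanity check: decoding must reproduce the original bytes.
    assert base64.b64decode(encoded) == raw
    return encoded
```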
Step 3: Upload and Transcribe Your Audio
Send your audio file to the transcription endpoint:
```bash
curl -X POST "https://api.example.com/api/v1/transcribe" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "DataBase64": "SUQzBAAAAAAAI1NUVEUAAAA...",
    "Filename": "call_recording.wav",
    "Language": "Auto",
    "Metadata": "agent_id=123;call_type=inbound",
    "HasPriority": false,
    "CheckFilenameExistence": false
  }'
```
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| DataBase64 | string | Yes | Audio file encoded in Base64 format |
| Filename | string | Yes | Name of the audio file (used for reference) |
| Language | enum | No | Language of the audio. Options: Auto, ENGLISH, SPANISH, etc. (100+ languages supported). Default: Auto |
| Metadata | string | No | Custom metadata (e.g., call ID, agent info) |
| HasPriority | boolean | No | Set to true for faster processing. Default: false |
| CheckFilenameExistence | boolean | No | Check if filename already exists. Default: false |
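The `Metadata` field in the example is a single `key=value;key=value` string. If you track metadata as a dict, a small helper (hypothetical, not part of the API) keeps the formatting consistent:

```python
def build_metadata(fields):
    """Join a dict into the key=value;key=value form used by Metadata."""
    return ";".join(f"{key}={value}" for key, value in fields.items())

# Building a request body with the helper (DataBase64 truncated for brevity):
payload = {
    "DataBase64": "SUQzBAAAAAAAI1NUVEUAAAA...",
    "Filename": "call_recording.wav",
    "Language": "Auto",
    "Metadata": build_metadata({"agent_id": 123, "call_type": "inbound"}),
}
```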
Response
```json
{
  "id": 123456789,
  "status": "0 - Queued"
}
```
Response Fields
| Field | Type | Description |
|---|---|---|
| id | integer | Unique transcription ID for tracking |
| status | enum | Current processing status (0 - Queued, 1 - InProgress, 2 - Processed, 3 - Failed, etc.) |
Step 4: Check Transcription Status
Use the transcription ID to check processing status:
```bash
curl -X GET "https://api.example.com/api/v1/transcripts/123456789/status" \
  -H "Authorization: Bearer YOUR_TOKEN"
```
Response:
```json
{
  "id": 123456789,
  "status": 2,
  "statusDescription": "Processed",
  "score": 95.5,
  "inScope": true,
  "audioType": 1,
  "audioTypeDescription": "Stereo",
  "duration": 180,
  "name": "call_recording.wav",
  "created": "2025-11-28T10:00:00Z",
  "modified": "2025-11-28T10:05:00Z",
  "transcription": {
    "language": "en",
    "leftChannel": [...],
    "rightChannel": [...],
    "bothChannels": [...]
  },
  "keywords": [...],
  "topics": [...],
  "sentiments": [...],
  "callSummary": {...}
}
```
Status Values
| Code | Description | Meaning |
|---|---|---|
| 0 | Queued | Waiting to be processed |
| 1 | InProgress | Currently being processed |
| 2 | Processed | Successfully completed |
| 3 | Failed | Processing failed |
| 4 | QuotaLimit | Account quota exceeded |
| 5 | NotFound | Transcription ID not found |
| 6 | KeywordMatch | Processing completed with keyword matches |
| 7 | Live | Currently processing live transcription |
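When polling, it helps to encode the table above and decide which codes are terminal. A sketch (treating Processed, Failed, QuotaLimit, NotFound, and KeywordMatch as terminal is an assumption; confirm against your account's behavior):

```python
# Status codes from the table above.
STATUS_NAMES = {
    0: "Queued",
    1: "InProgress",
    2: "Processed",
    3: "Failed",
    4: "QuotaLimit",
    5: "NotFound",
    6: "KeywordMatch",
    7: "Live",
}

# Assumption: these codes mean the state will not change and polling can stop.
TERMINAL = {2, 3, 4, 5, 6}

def is_done(status_code):
    """Return True when the transcription will not change state anymore."""
    return status_code in TERMINAL
```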
Step 5: Retrieve Full Transcripts
Get a list of all your transcripts with filters:
```bash
curl -X GET "https://api.example.com/api/v1/transcripts?Page=1&Rows=10&DateFrom=2025-11-01&DateTo=2025-11-30" \
  -H "Authorization: Bearer YOUR_TOKEN"
```
Query Parameters
| Parameter | Type | Description |
|---|---|---|
| Page | integer | Page number (1-indexed) |
| Rows | integer | Number of rows per page (max: 100) |
| DateFrom | string | Filter by start date (ISO 8601 format) |
| DateTo | string | Filter by end date (ISO 8601 format) |
| FileName | string | Filter by filename |
| Topic | string | Filter by topic |
| Sentiment | string | Filter by sentiment |
| Tag | string | Filter by tag |
| Content | string | Search in transcription content |
| Cluster | string | Filter by cluster |
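The `Page`/`Rows` parameters lend themselves to a generator that walks every page. A sketch with `requests` (the stop condition of an empty page is an assumption, since the list response shape is not shown in this guide; `iter_transcripts` is a hypothetical helper):

```python
import requests

API_BASE = "https://api.example.com"

def iter_transcripts(token, **filters):
    """Yield transcripts page by page until an empty page is returned."""
    headers = {"Authorization": f"Bearer {token}"}
    page = 1
    while True:
        params = {"Page": page, "Rows": 100, **filters}
        resp = requests.get(f"{API_BASE}/api/v1/transcripts",
                            headers=headers, params=params, timeout=30)
        resp.raise_for_status()
        items = resp.json()
        if not items:  # assumed: empty page means no more results
            break
        yield from items
        page += 1
```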
Complete Workflow Example
Python
```python
import requests
import base64
import time

API_BASE = "https://api.example.com"
USERNAME = "your_username"
PASSWORD = "your_password"

# Step 1: Get token
auth_response = requests.post(
    f"{API_BASE}/api/v1/auth/token",
    json={"Username": USERNAME, "Password": PASSWORD}
)
token = auth_response.json()['token']
headers = {"Authorization": f"Bearer {token}"}

# Step 2: Prepare audio
with open('call_recording.wav', 'rb') as f:
    audio_base64 = base64.b64encode(f.read()).decode('utf-8')

# Step 3: Upload audio
transcribe_response = requests.post(
    f"{API_BASE}/api/v1/transcribe",
    headers=headers,
    json={
        "DataBase64": audio_base64,
        "Filename": "call_recording.wav",
        "Language": "Auto",
        "Metadata": "agent_id=123"
    }
)
transcription_id = transcribe_response.json()['id']
print(f"Transcription started with ID: {transcription_id}")

# Step 4: Poll for completion
while True:
    status_response = requests.get(
        f"{API_BASE}/api/v1/transcripts/{transcription_id}/status",
        headers=headers
    )
    status = status_response.json()
    if status['status'] == 2:  # Processed
        print("Transcription completed!")
        print(f"Transcription: {status['transcription']}")
        break
    elif status['status'] == 3:  # Failed
        print("Transcription failed!")
        break
    else:
        print(f"Status: {status['statusDescription']}")
        time.sleep(5)  # Wait 5 seconds before checking again
```
C#
```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        var client = new HttpClient();
        var apiBase = "https://api.example.com";

        // Step 1: Get token
        var tokenPayload = JsonSerializer.Serialize(new
        {
            Username = "your_username",
            Password = "your_password"
        });
        var tokenContent = new StringContent(tokenPayload, Encoding.UTF8, "application/json");
        var tokenResponse = await client.PostAsync($"{apiBase}/api/v1/auth/token", tokenContent);
        var tokenData = await tokenResponse.Content.ReadAsStringAsync();
        var tokenJson = JsonDocument.Parse(tokenData);
        var token = tokenJson.RootElement.GetProperty("token").GetString();
        client.DefaultRequestHeaders.Add("Authorization", $"Bearer {token}");

        // Step 2: Prepare audio
        var audioBytes = File.ReadAllBytes("call_recording.wav");
        var audioBase64 = Convert.ToBase64String(audioBytes);

        // Step 3: Upload audio
        var transcribePayload = JsonSerializer.Serialize(new
        {
            DataBase64 = audioBase64,
            Filename = "call_recording.wav",
            Language = "Auto",
            Metadata = "agent_id=123"
        });
        var transcribeContent = new StringContent(transcribePayload, Encoding.UTF8, "application/json");
        var transcribeResponse = await client.PostAsync($"{apiBase}/api/v1/transcribe", transcribeContent);
        var transcribeData = await transcribeResponse.Content.ReadAsStringAsync();
        var transcribeJson = JsonDocument.Parse(transcribeData);
        var transcriptionId = transcribeJson.RootElement.GetProperty("id").GetInt64();
        Console.WriteLine($"Transcription started with ID: {transcriptionId}");

        // Step 4: Poll for completion
        while (true)
        {
            var statusResponse = await client.GetAsync($"{apiBase}/api/v1/transcripts/{transcriptionId}/status");
            var statusData = await statusResponse.Content.ReadAsStringAsync();
            var statusJson = JsonDocument.Parse(statusData);
            var status = statusJson.RootElement.GetProperty("status").GetInt32();
            if (status == 2)
            {
                Console.WriteLine("Transcription completed!");
                break;
            }
            else if (status == 3)
            {
                Console.WriteLine("Transcription failed!");
                break;
            }
            await Task.Delay(5000);
        }
    }
}
```
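The fixed 5-second sleep in the workflow examples works, but a timeout plus gentle backoff avoids both hammering the API and polling forever. A sketch of such a loop (`wait_for_transcript` is a hypothetical wrapper around the status endpoint):

```python
import time
import requests

def wait_for_transcript(api_base, headers, transcription_id,
                        timeout=600, initial_delay=5, max_delay=60):
    """Poll the status endpoint until a terminal state or the timeout."""
    deadline = time.monotonic() + timeout
    delay = initial_delay
    while time.monotonic() < deadline:
        resp = requests.get(
            f"{api_base}/api/v1/transcripts/{transcription_id}/status",
            headers=headers, timeout=30,
        )
        resp.raise_for_status()
        status = resp.json()
        if status["status"] in (2, 3):  # Processed or Failed
            return status
        time.sleep(delay)
        delay = min(delay * 2, max_delay)  # exponential backoff
    raise TimeoutError(f"Transcription {transcription_id} did not finish in {timeout}s")
```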
Supported Languages
The Speech-to-Text API supports automatic language detection as well as explicit language selection for 100+ languages including:
- English, Spanish, French, German, Italian, Portuguese
- Chinese (Simplified & Traditional), Japanese, Korean
- Russian, Polish, Dutch, Swedish, Danish, Norwegian
- Hindi, Arabic, Hebrew, Turkish, Thai
- And many more...
For a complete list, see the Language enum in the API Reference.
Next Steps
- Learn about Features - Silence detection, quality metrics, channel analysis
- Explore Audio Intelligence - Extract insights from your transcriptions
- Check Insights - Access analytics on your call data
- View API Reference - Complete API documentation
Troubleshooting
Common Issues
Issue: 401 Unauthorized
- Verify your Bearer token is valid and not expired
- Check the Authorization header format
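A common pattern for the 401 case is to re-authenticate once and retry the failed call. A sketch (`fetch_token` stands in for the Step 1 token request; `request_with_reauth` is a hypothetical helper):

```python
import requests

def request_with_reauth(method, url, fetch_token, token, **kwargs):
    """Send a request; on 401, refresh the token once and retry."""
    headers = {"Authorization": f"Bearer {token}", **kwargs.pop("headers", {})}
    resp = requests.request(method, url, headers=headers, **kwargs)
    if resp.status_code == 401:
        token = fetch_token()  # re-run the Step 1 token request
        headers["Authorization"] = f"Bearer {token}"
        resp = requests.request(method, url, headers=headers, **kwargs)
    return resp
```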
Issue: Large file takes too long
- Set `HasPriority: true` for faster processing
- Consider splitting very large files
Issue: Transcript has low quality
- Ensure your audio file is clear and not corrupted
- Try explicit language specification instead of `Auto`
- Check audio levels and background noise
For more help, contact support.