OpenAI Realtime Translation API Demo
A spend-capped browser test desk for OpenAI's new realtime translation API, with mic input, WebRTC, pause-and-resume cues, and visible latency checks.
What this proves
Realtime voice demos only matter if the latency, interruption behavior, and deployment path survive a real mic test.
What This Proves
OpenAI's dedicated realtime translation model is the release signal, but the practical ShipWithTez question is different:
Can a small team use existing credits and a normal cloud account to test the workflow today?
This build answers that in two passes. First, I tried Azure OpenAI because the Microsoft credits were already available. Microsoft says gpt-realtime-2, gpt-realtime-translate, and gpt-realtime-whisper are rolling into Foundry, and the Azure deployment API recognizes the 2026-05-06 model names.
The catch: this Visual Studio subscription still returns SpecialFeatureOrQuotaIdRequired for those models. So the working fallback is gpt-realtime-1.5 until quota or feature access opens up.
The second pass uses direct OpenAI API access for the dedicated translation model, but with a strict build budget. The app reserves the full worst-case session cost before it mints a short-lived client secret, defaults to a $20 cap, over-reserves at $0.10 per minute, and auto-stops browser sessions after the configured session window. For a true provider-enforced ceiling, the OpenAI project budget should also be set to $20.
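The reservation math behind that cap fits in a few lines. This is a minimal sketch, assuming a ledger that tracks reserved spend in integer cents (class and method names are illustrative, not the app's actual code); the numbers match the post: a $20 cap and $0.10-per-minute over-reservation.

```typescript
// Sketch of the local budget ledger: reserve the worst-case session cost
// before minting a client secret, and refuse once the cap is exhausted.
const CAP_CENTS = 2000;        // $20 hard cap
const RATE_CENTS_PER_MIN = 10; // $0.10/min, deliberately over-reserved

class BudgetLedger {
  private reservedCents = 0;

  // Record the worst-case reservation; return false to refuse the mint.
  reserve(sessionMinutes: number): boolean {
    const worstCase = sessionMinutes * RATE_CENTS_PER_MIN;
    if (this.reservedCents + worstCase > CAP_CENTS) return false;
    this.reservedCents += worstCase;
    return true;
  }

  // Release the unused part of a reservation when a session ends early.
  settle(reservedMinutes: number, actualMinutes: number): void {
    const refund = (reservedMinutes - actualMinutes) * RATE_CENTS_PER_MIN;
    this.reservedCents = Math.max(0, this.reservedCents - refund);
  }

  remainingCents(): number {
    return CAP_CENTS - this.reservedCents;
  }
}
```

Keeping the ledger in cents avoids floating-point drift; the provider-side project budget remains the true ceiling, as noted above.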
What I Built
The desk has four layers:
- a server route that can mint Azure realtime secrets or OpenAI realtime translation secrets
- a local OpenAI budget ledger that refuses to mint once the $20 reservation cap is exhausted
- a browser WebRTC client that sends microphone audio directly to the realtime model
- a test surface for target language, pause behavior, latency, transcripts, and event logs
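The latency side of that test surface boils down to timestamping a handful of events and reporting deltas. A minimal sketch, where the event names are made up for illustration rather than taken from any API:

```typescript
// Sketch of the latency tracker behind the test surface: stamp key events
// once, then report the deltas the demo cares about.
type EventName = "mic.start" | "first.token" | "first.audio" | "pause" | "resume";

class LatencyLog {
  private stamps = new Map<EventName, number>();

  mark(name: EventName, atMs: number = Date.now()): void {
    // Keep only the first occurrence so deltas measure the initial response.
    if (!this.stamps.has(name)) this.stamps.set(name, atMs);
  }

  // Milliseconds between two marked events, or null if either is missing.
  delta(from: EventName, to: EventName): number | null {
    const a = this.stamps.get(from);
    const b = this.stamps.get(to);
    return a === undefined || b === undefined ? null : b - a;
  }
}
```

In a run, delta("mic.start", "first.audio") is the headline number; a null result flags an event that never fired, which is itself worth logging.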
The server route is gated by an env flag so the public page does not mint tokens by accident. On this Mac it can use Azure CLI auth for Azure, or OPENAI_API_KEY for direct OpenAI without exposing permanent keys to the browser.
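The gate described above can be expressed as a pure decision function the route calls before minting anything. This is a sketch under stated assumptions: the flag and variable names (REALTIME_DEMO_ENABLED, AZURE_CLI_AUTH) are placeholders I made up, except OPENAI_API_KEY, which the text names.

```typescript
// Sketch of the mint gate: the public page never gets a secret unless the
// demo flag is on and a server-side credential exists for the chosen provider.
type Provider = "azure" | "openai";
type MintDecision = { allowed: true } | { allowed: false; reason: string };

function canMint(
  env: Record<string, string | undefined>,
  provider: Provider,
): MintDecision {
  // Assumed flag name; the point is that the route is off by default.
  if (env.REALTIME_DEMO_ENABLED !== "1") {
    return { allowed: false, reason: "demo flag is off" };
  }
  if (provider === "openai" && !env.OPENAI_API_KEY) {
    return { allowed: false, reason: "no OPENAI_API_KEY on the server" };
  }
  if (provider === "azure" && !env.AZURE_CLI_AUTH) {
    // Stand-in for "Azure CLI auth is available"; the real check would be richer.
    return { allowed: false, reason: "no Azure credential" };
  }
  return { allowed: true };
}
```

Keeping the decision pure makes it trivially testable and keeps permanent keys on the server side of the route, with only short-lived client secrets crossing to the browser.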
The deployment test is part of the artifact: Azure accepts gpt-realtime-1.5, but the newest Azure-hosted realtime models currently require additional subscription access. Direct OpenAI is the fastest path to test gpt-realtime-translate today.
Why It Fits ShipWithTez
The post is not "look at the new API."
The useful story is:
I tried to turn the announcement into a working operator demo with the credits I already had. Here is what worked, what Azure exposes today, and where access still blocks the ideal version.
That is much stronger for SWT users than another stack walkthrough because it shows the real adoption path:
- use official cloud credits
- deploy the closest available realtime model
- switch to OpenAI direct when cloud quota is blocked
- cap the experiment before it can turn into surprise spend
- test latency with human speech
- verify interruption and resume behavior
- call out the quota/access block instead of hiding it
What I Would Add Next
- Set the OpenAI project budget to $20, then run one real mic test through gpt-realtime-translate.
- Swap the Azure deployment to gpt-realtime-translate when this subscription gets access.
- Add a saved latency run with first-token, first-audio, and pause-resume timings.
- Add a browser tab audio mode for translating webinars or support calls.
- Package the Azure CLI setup as a short operator runbook.
- Turn the best run into the LinkedIn post and Instagram carousel.