It appears OpenAI’s Realtime Voice API, introduced per week in the past, is taking the world by storm. Builders are going berserk on X, sharing their creations utilizing the realtime voice API.
The brand new providing from the Sam Altman-led AI powerhouse permits apps to have pure, real-time conversations with their customers. Ever since its announcement, every new day has introduced new potentialities. Watching these demos would make AI assistants or different standard chatbots appear puny.
Listed here are some wild examples shared on X by builders.
Speech to Picasso
This unbelievable use case brings forth a voice-controlled portray app. Jordan Singer, who as per his X bio is the founding father of Mainframe, a generative computing firm, shared his new creation with OpenAI’s realtime voice API on X. Singer calls it Teledraw, an experimental drawing app that may be a fusion of real-time voice and picture fashions. It explores modern interfaces through the use of the most recent latent consistency fashions which permits customers to create artwork by voice instructions. Singer confirmed the distinctive UI, which mimics a cellphone name, pushing the boundaries of interactive know-how.
🎨 new applied sciences necessitate new interfaces
with real-time latent consistency fashions, right here’s a unique sort of drawing app: pic.twitter.com/XwNKzt2vF0
— Jordan Singer (@jsngr) December 3, 2023
PDF thoughts reader
One other X consumer, Marcus Schiesser, who calls himself a tech fanatic, has created a voice chat for paperwork. Generally known as Voice Chat PDF, the device is constructed utilizing OpenAI Realtime API, Llama Index, and Subsequent.js. The app permits customers to speak with their very own paperwork. The demo shared by Schiesser exhibits the characteristic utilizing a doc on bodily mailing requirements, highlighting how a consumer can work together with content material utilizing voice in real-time.
Need to chat over your personal paperwork utilizing the brand new @OpenAI Realtime API?
You are able to do so now utilizing Voice Chat PDF, constructed utilizing @llama_index and @nextjs.
The video under exhibits an instance utilizing a doc about bodily mailing requirements.
📄 https://t.co/Oq6GCdvIrM pic.twitter.com/GmAaLbSo7L
— Marcus Schiesser (@MarcusSchiesser) October 4, 2024
Assistant for mock interviews
Kenn Ejima, former head of Japan Quora, shared an AI interviewer who conducts mock interviews, basically quizzing folks on their resume. The brand new mock interview app lets customers observe interview abilities by importing their CVs or resumes for AI-driven questions. It at the moment helps Stanford MBA purposes and permits one free trial each 24 hours. It’s constructed with Remix, Render, Quadrant, and Cloudflare R2.
🚀 Simply launched! 🚀
Apply your interview abilities with our 2-minute mock interview app utilizing @OpenAI’s new Realtime API.
🎤 Add your CV, and let the AI interviewer ask about your expertise.
Attempt it for FREE! pic.twitter.com/5fcPG5UfhJ
— Kenn Ejima (@kenn) October 11, 2024
Voice-controlled browser
Software program engineer Sawyer Hood shared a voice-controlled browser on X. With this browser, one merely must open and say out loud what they’re trying to find. The browser is constructed utilizing OpenAI’s Realtime API and lets customers navigate the web by voice instructions. The system deploys a customized DOM format for dependable web page understanding, avoiding the intricacies of uncooked HTML. The browser is at the moment in improvement and based on Hood, the browser goals to supply seamless voice-based net interactions.
The open ai realtime api is sick! I hooked it as much as management my browser so I might browse the net with my voice 🤯 pic.twitter.com/sCsNOz1OXr
— Sawyer Hood (@sawyerhood) October 4, 2024
Your buying and selling assistant
Wily Douhard, a developer, has made a voice assistant that may monitor the value of a number of shares utilizing your voice. Douhard has created one thing generally known as Chainlit Realtime which helps WebSockets for real-time audio interactions by integrating OpenAI’s Realtime Voice API. This app exhibits how builders can construct responsive assistants that stream audio instructions and responses seamlessly.
🎙️Chainlit Realtime is right here! 🎙️
That includes first-class WebSocket help for realtime audio interactions in Chainlit purposes.
We’ve added help for @OpenAI real-time API to unlock an entire new UX for devs constructing clever, responsive assistants. pic.twitter.com/RxEUtqOGyI
— willy douhard (@willy_douhard) October 4, 2024
Your realtime-anime buddy
Bryan Pratte, founding father of Hallway.AI, confirmed how OpenAI’s Realtime API when mixed with ExpressionEngine, can convey anime characters to life. Primarily based on the demo, this integration appears to allow real-time voice interactions with animated characters. It presents an immersive expertise as seen within the demo under.
OpenAI Realtime API + ExpressionEngine opens up an entire new world. Chat with @join_hallway characters coming in sizzling! pic.twitter.com/oYckyuEilu
— bryan pratte (@btp4z7) October 1, 2024
On October 1, OpenAI launched the Realtime API that enables builders to construct purposes with stay interactions. This API helps speech-to-text, text-to-speech, and real-time dialog skills which makes it doable to create dynamic assistants and voice experiences. With audio and textual content being streamed forwards and backwards, the Realtime API permits for extremely responsive purposes.
In response to OpenAI, this API has been designed to be used instances like digital assistants, stay collaboration instruments, and interactive instructional apps. The Realtime API makes use of OpenAI’s highly effective language fashions which supply seamless real-time conversations that improve consumer engagement and interplay throughout a variety of use instances.