Google Speech to Text API

Gnani.ai Launches Indic Speech-To-Text Model Under IndiaAI Mission

Gnani.ai has launched Vachana STT, a speech-to-text model built for Indian languages, under the IndiaAI Mission. The startup ...

GitHub

princeton-ddss/speech-recognition-inference

This repo provides a command-line tool for performing automatic speech-to-text tasks (i.e., "transcription") using open source models from Hugging Face Hub. For interactive tasks, it allows users to ...

TestingCatalog

xAI launches Grok Voice Agent API for real-time voice apps

xAI introduces the Grok Voice Agent API, offering developers real-time speech capabilities and configurable voice options for ...

Analytics Insight

How to Use Gemini Live API Native Audio in Vertex AI: Step-by-Step Guide

Overview: Real-time voice interaction is becoming a defining feature of next-generation AI applications. From conversational ...

eWeek

Google Rolls Out Gemini 2.5 Flash Native Audio for Natural Voice Interactions

Google updates Gemini 2.5 Flash Native Audio for smoother voice chats, stronger instruction following, and live speech translation in Translate and Gemini Live.

YourStory

Google’s Gemini audio models get sharper voice agents, live speech translation

Gemini 2.5 Flash Native Audio improves function calling, instruction following and multi‑turn dialogue. A new live speech ...

11d

5 Best Free Speech-to-Text APIs in 2025 Compared & Tested

Top free transcription APIs for 2025: pick accurate, scalable results for your app or AI project. Validate AI quality and ...

12d

Gemini 2.5 Text-to-Speech Update Brings Realistic AI Voices

Google has updated its Gemini text-to-speech technology, giving developers natural AI voices with pacing, tone, and multi-speaker support.

13d

Google’s Big Gemini AI Updates: AI Models, Search, Preferred Sources and More From the Week

Google upgrades Gemini speech features and expands new Search tools with publisher links and preferred sources.

13d

Google expands Gemini 2.5 Text-to-Speech with enhanced expressivity, context-aware pacing, and multilingual support

Google has announced updates to its Gemini 2.5 Flash and Gemini 2.5 Pro Text-to-Speech (TTS) preview models. The improvements ...

TestingCatalog

Google expands Gemini TTS with 24 languages, lifelike voices

What's new? Google announced Gemini 2.5 Flash and Gemini 2.5 Pro TTS preview models via the Gemini API in Google AI Studio; ...

IEEE

Speech-to-Text and Text-to-Speech Recognition Using Deep Learning

Abstract: Speech-to-Text (STT) and Text-to-Speech (TTS) recognition technologies have witnessed significant advancements in recent years, transforming various industries and applications. STT allows ...
