Today marks an exciting moment for the developer community as xAI officially introduces the Grok Voice ...
This repo provides a command-line tool for performing automatic speech-to-text tasks (i.e., "transcription") using open source models from Hugging Face Hub. For interactive tasks, it allows users to ...
Abstract: Image caption generation is one of the challenging tasks in the field of artificial intelligence. It is used to generate a textual description for a given picture. But due to the recent ...
Kokoro Web is powered by hexgrad/Kokoro-82M, an open-weight 82 million parameter Text-to-Speech model available on Hugging Face. Despite its lightweight architecture, it delivers comparable quality to ...
Abstract: Speech recognition plays a vital role in technology. The proposed work introduces a web application that leverages state-of-the-art technologies for audio-to-text recognition and ...