MAI released models that can transcribe voice into text, as well as generate audio and images, after the group's formation six ...
Microsoft AI has made its in-house models for transcription, speech generation, and image generation available on Foundry.
Relatively light at just 2 billion parameters, the model is meant for use with consumer-grade GPUs for those who want to self ...
Speechify just launched a native Windows app that employs locally stored models to enable dictation and transcription across ...
Microsoft says that MAI-Image-2 is at least twice as fast as its previous-generation image generator. The second new model ...
Microsoft's New AI Models Go Beyond Just Text ...
Voice AI startup raises $10 million to expand globally, invest in R&D and build advanced multilingual AI models under India’s ...
The results, drawn from thousands of spontaneous voice conversations across more than 60 languages, reveal capability gaps ...
OpenAI just happens to offer its own speech recognition, speech generation, and text-to-image models. Microsoft's models are available through Foundry (formerly Azure AI Studio), a platform to develop ...
Alibaba’s Qwen 3.5 Omni brings true real-time omnimodal AI to the frontier race: voice cloning, 10-hour audio, real-time ...
Mistral AI has made a move that has surprised the AI world. The French startup has released a new open-source voice model.
The Paris-based Mistral AI SAS today announced the release of Voxtral TTS, its first text-to-speech artificial intelligence ...