As ChatGPT (built on GPT-3.5 architecture) continues to make waves, OpenAI has launched the second version of Whisper, an open-sourced multilingual speech recognition model.
OpenAI released Whisper, which could translate and transcribe speech from 97 diverse languages. However, it wasn't fine-tuned to a specific dataset and didn't surpass other models.
Challenges
Users observed that the prediction is often biased to integer timestamps, and blurring the predicted distribution may help.
Potential Risk
The model could be used to automate surveillance or identify individual speakers, but the company hopes it will be used for beneficial purposes.
留言