IBM Watson SpeechToTextV1 error – Python

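I have been trying to use the IBM Watson Speech to Text API. It works for short audio files, but not for audio files of around 5 minutes. It gives me the following error:

"watson {'code_description': 'Bad Request', 'code': 400, 'error': 'No speech detected for 30s.'}"

I am using a Watson trial account. Is there a limitation on trial accounts, or is there a bug in the code below?

Python code: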

from watson_developer_cloud import SpeechToTextV1

speech_to_text = SpeechToTextV1(
    username='XXX', 
    password='XXX',
    x_watson_learning_opt_out=False
)

with open('trial.flac', 'rb') as audio_file:
    result = speech_to_text.recognize(
        audio_file, content_type='audio/flac', model='en-US_NarrowbandModel',
        timestamps=False, word_confidence=False, continuous=True)
    print(result)

Any help is appreciated!

Best answer: For the recognize API you are trying to use, see the Implementation Notes in the Speech to Text API Explorer:

Implementation Notes

Sends audio and returns transcription results for
a sessionless recognition request. Returns only the final results; to
enable interim results, use session-based requests or the WebSocket
API. The service imposes a data size limit of 100 MB. It automatically
detects the endianness of the incoming audio and, for audio that
includes multiple channels, downmixes the audio to one-channel mono
during transcoding.

Streaming mode

For requests to transcribe live
audio as it becomes available or to transcribe multiple audio files
with multipart requests, you must set the Transfer-Encoding header to
chunked to use streaming mode. In streaming mode, the server closes
the connection (status code 408) if the service receives no data chunk
for 30 seconds and the service has no audio to transcribe for 30
seconds. The server also closes the connection (status code 400) if no
speech is detected for inactivity_timeout seconds of audio (not
processing time); use the inactivity_timeout parameter to change the
default of 30 seconds.

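There are two factors here. First, there is a data size limit of 100 MB, so I would make sure you are not sending files larger than that to the Speech to Text service. Second, as you can see, the server closes the connection and returns a 400 error if no speech is detected for the number of seconds defined by inactivity_timeout. The default appears to be 30 seconds, which matches the error you are seeing above.

I suggest making sure there is valid speech within the first 30 seconds of your file and/or increasing the inactivity_timeout parameter to see whether the problem persists. To make things easier, you can test the failing file and other sound files in your browser using the API Explorer: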

Speech to Text API Explorer
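
As a rough sketch of the two checks above, something like the following could wrap the call from the question. The helper names fits_size_limit and transcribe, the 120-second value, and reading "100 MB" as 100 * 10**6 bytes are all illustrative assumptions. It also assumes that your installed version of the SDK accepts inactivity_timeout as a keyword argument to recognize(); check the signature in your version before relying on it.

```python
import os

# Assumption: "100 MB" is read conservatively as 100 * 10**6 bytes.
MAX_AUDIO_BYTES = 100 * 10**6


def fits_size_limit(path, limit=MAX_AUDIO_BYTES):
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit


def transcribe(speech_to_text, path, inactivity_timeout=120):
    """Send one FLAC file to the service with a longer inactivity timeout.

    `speech_to_text` is expected to be a SpeechToTextV1 client like the one
    in the question. Passing `inactivity_timeout` to recognize() is an
    assumption about the installed SDK version.
    """
    if not fits_size_limit(path):
        raise ValueError('audio file exceeds the 100 MB limit: %s' % path)
    with open(path, 'rb') as audio_file:
        return speech_to_text.recognize(
            audio_file,
            content_type='audio/flac',
            model='en-US_NarrowbandModel',
            continuous=True,
            inactivity_timeout=inactivity_timeout,
        )
```

With the client from the question, this would be called as print(transcribe(speech_to_text, 'trial.flac')). If the 400 error goes away with a larger timeout, the file most likely contains a stretch of more than 30 seconds without detectable speech.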
