This dataset is designed to train AI models to better understand spoken Spanish and to support more natural interaction in speech recognition applications. It contains diverse real-life dialogues with high-accuracy transcriptions. Connected-speech features such as linking between words and elision are annotated, and punctuation follows the rhythm of spoken Spanish. Utterances are transcribed as complete sentences, which exposes models to complex verb forms and improves recognition robustness.
Language
Spanish
Data Style
Conversational Style
Bit Depth
16-bit
Channels
2
Total Audio Duration
5000+ hours
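As a quick sanity check on delivered audio, the listed format (16 bits per sample, 2 channels) can be verified per file. The following is a minimal sketch, assuming the audio is delivered as uncompressed WAV files, which this listing does not state; the function name `check_wav` is hypothetical, and the sample rate is only read, not checked, because the listing does not specify one.

```python
import wave

# Values taken from the listing above: 16 bits per sample, 2 channels.
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bits = 2 bytes
EXPECTED_CHANNELS = 2


def check_wav(path):
    """Read a WAV header and compare it with the listed format.

    Assumption: files are PCM WAV. Other containers (FLAC, MP3, ...)
    would need a different reader.
    """
    with wave.open(path, "rb") as wav:
        sample_width = wav.getsampwidth()   # bytes per sample
        channels = wav.getnchannels()
        frames = wav.getnframes()
        rate = wav.getframerate()           # not given in the listing

    return {
        "bits_per_sample": sample_width * 8,
        "channels": channels,
        "sample_rate_hz": rate,
        "duration_seconds": frames / rate,
        "matches_spec": (
            sample_width == EXPECTED_SAMPLE_WIDTH_BYTES
            and channels == EXPECTED_CHANNELS
        ),
    }
```

Summing `duration_seconds` over all files gives a rough cross-check against the stated total audio duration.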
ISO/IEC 27001 & ISO/IEC 27701:2019 compliant
Audio, text, image, and video multi-modal data
Conversational, scripted, and spontaneous data covering extensive domains
Quality assured by expert review