We provide a wide range of training data, including command datasets and interaction datasets. These datasets were collected in real-world scenarios and cover different languages, dialects, genders, and ages. We also record domestic noise datasets, since household background noise affects recognition as well. We ensure consistency and stability through strict criteria for speakers, recording environments, devices, annotation specifications, acceptance inspection, and more.
We have delivered both single-language and mixed multi-language datasets to our customers. These customized datasets help our customers enhance their AI or machine learning models so that the models operate smoothly even in adverse environments.