AISHELL-WakeUp-1 Chinese and English Wake-up Words Speech Database

The AISHELL-WakeUp-1 database contains 3,936,003 wake-up word speech files totaling 1,561.12 hours, recorded in China in Chinese and English with 16-channel circular PDM microphone array boards and a Hi-Fi microphone.

  • Database language: Chinese and English

  • Recording area: China

  • Wake-up words for recording: “Hi, Mia” in English and “你好,米雅” (ni hao, mi ya) in Chinese

  • Speakers: 254 participants

  • Environment: Real home environment

  • Device setup: 7 recording positions, comprising:

    1) Six 16-channel circular microphone arrays (16 kHz, 16-bit) for far-field recording;

    2) One Hi-Fi microphone (44.1 kHz, 16-bit) for close-talk recording.

The AISHELL-WakeUp-1 database was transcribed by professional speech annotators and passed a strict quality inspection; its word accuracy is 100%. It can be used for research on voiceprint recognition, wake-up word recognition, and related tasks.
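The headline figures above imply very short clips, which is what one would expect for isolated wake-up words. The following is a minimal arithmetic sketch using only the two totals quoted in the description; the function name is illustrative and not part of any AISHELL tooling.

```python
# Totals quoted in the database description.
TOTAL_HOURS = 1561.12
NUM_FILES = 3_936_003


def average_duration_seconds(total_hours: float, num_files: int) -> float:
    """Return the mean length of one speech file in seconds."""
    return total_hours * 3600.0 / num_files


avg = average_duration_seconds(TOTAL_HOURS, NUM_FILES)
print(f"average file length: {avg:.2f} s")  # about 1.43 s
```

Note that this is an average per file; the description does not say how files are distributed across channels or recording positions.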



Speech & Speaker Recognition evaluation




Kaldi recipe (merged with Kaldi system): HI-MIA Recipe

Non-Open Source








A far-field text-dependent speaker verification database for AISHELL Speaker Verification Challenge 2019



About this resource:

The data is used in the AISHELL Speaker Verification Challenge 2019. It is extracted from the larger databases AISHELL-WakeUp-1 and AISHELL-2019B-EVAL.



The contents are the wake-up words "Hi, Mia" in both Chinese and English. The data was collected in a real home environment using microphone arrays and a Hi-Fi microphone. The collection process and the development of a baseline system are described in the accompanying paper. The data used in the challenge comes from one Hi-Fi microphone and from 16-channel circular microphone arrays at distances of 1, 3, and 5 meters, and it contains only the Chinese wake-up words. The whole set is divided into train (254 people), dev (42 people), and test (44 people) subsets. The test subset is provided with paired target/non-target answers for evaluating verification results.
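Target/non-target trial labels of the kind described above are commonly scored with the equal error rate (EER). The source does not state which metric the challenge used, so the following is only a sketch under that assumption. It takes a hypothetical list of (score, is_target) pairs, where a higher score means the system believes the two recordings share a speaker.

```python
from itertools import groupby
from typing import List, Tuple


def equal_error_rate(trials: List[Tuple[float, bool]]) -> float:
    """Approximate the EER from (score, is_target) verification trials."""
    n_tar = sum(1 for _, is_target in trials if is_target)
    n_non = len(trials) - n_tar
    if n_tar == 0 or n_non == 0:
        raise ValueError("need at least one target and one non-target trial")

    # Sweep the threshold from high to low; a trial is accepted when its
    # score is at or above the threshold.
    ordered = sorted(trials, key=lambda t: t[0], reverse=True)
    acc_tar = acc_non = 0

    # Threshold above every score: nothing accepted, so FAR = 0 and FRR = 1.
    best_gap, eer = 1.0, 0.5
    for _, group in groupby(ordered, key=lambda t: t[0]):
        # Trials with identical scores cross the threshold together.
        for _, is_target in group:
            if is_target:
                acc_tar += 1
            else:
                acc_non += 1
        far = acc_non / n_non          # false-accept rate
        frr = 1.0 - acc_tar / n_tar    # false-reject rate
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


# Made-up scores for illustration only.
demo_trials = [(0.9, True), (0.8, True), (0.7, False),
               (0.6, True), (0.4, False), (0.2, False)]
demo_eer = equal_error_rate(demo_trials)
```

The function reports the midpoint of the false-accept and false-reject rates at the threshold where they are closest, which is a simple approximation when the two curves do not cross exactly at an observed score.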

File naming rule

e.g. SV0001_2_10_N0001.wav

Field          Value                                                            Field meaning
<speakerID>    SV0001-SV0254                                                    Speaker ID
(2nd field)    1-6: circular microphone array ID (far-field); 7: Hi-Fi mic ID   Point position ID
<micID>        2-17                                                             Microphone ID of an array
<speed>        Slow, Normal, Fast                                               Speaking speed
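The naming rule above can be turned into a small parser. This is a sketch inferred from the single example `SV0001_2_10_N0001.wav`: the regular expression, the mapping of the letters S/N/F to Slow/Normal/Fast, and the reading of the trailing digits as an utterance index are all assumptions, since the source does not spell them out. The names `HiMiaName`, `parse_name`, and `is_far_field` are hypothetical.

```python
import re
from typing import NamedTuple

# Assumed pattern: <speakerID>_<position>_<micID>_<speed letter><digits>.wav
_NAME_RE = re.compile(r"^(SV\d{4})_(\d+)_(\d+)_([SNF])(\d+)\.wav$")

# Assumption: the letter before the digits abbreviates the speaking speed.
_SPEEDS = {"S": "Slow", "N": "Normal", "F": "Fast"}


class HiMiaName(NamedTuple):
    speaker_id: str
    position_id: int   # 1-6: microphone array, 7: Hi-Fi microphone
    mic_id: int
    speed: str
    utterance: str     # trailing digits; assumed to be an utterance index


def parse_name(filename: str) -> HiMiaName:
    """Split a file name into its fields, raising ValueError if it does not fit."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    speaker, position, mic, speed, utterance = match.groups()
    position_id = int(position)
    if not 1 <= position_id <= 7:
        raise ValueError(f"position ID out of range 1-7: {position_id}")
    return HiMiaName(speaker, position_id, int(mic), _SPEEDS[speed], utterance)


def is_far_field(name: HiMiaName) -> bool:
    """Positions 1-6 are the far-field arrays; position 7 is the Hi-Fi mic."""
    return name.position_id <= 6


example = parse_name("SV0001_2_10_N0001.wav")
```

Grouping parsed names by `speaker_id` and `position_id` is one way to select, for instance, only close-talk recordings for enrollment; whether that matches a given experimental protocol is up to the user.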

License: Apache License v.2.0

Train/Dev Sets

1. Randomly add silence frames at the beginning or end for SVC-2019 test set.

2. Unprocessed raw data.

Test Set



HI-MIA-CW is a supplemental database to the HI-MIA wake-up database. Using the same setup as HI-MIA, we recorded a further 16,434 audio samples. Their text consists of HI-MIA confusion words in Chinese, which serve as negative samples for the wake-up words "Hi, Mia" (ni hao mi ya). The text details can be found in the paper and in the transcription file (transcription.txt). Each audio sample was recorded in a real home environment with a high-fidelity microphone (48 kHz, 16-bit) and then resampled to 16 kHz to build the database. The database contains 35 speakers, none of whom appear in the earlier HI-MIA database. It aims to promote research on wake-up word detection by providing negative samples that let researchers test how a detection system performs when it encounters confusable words.
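Because every HI-MIA-CW recording is a negative sample, one simple way to use it is to measure how often a detector fires on the confusion words. The sketch below assumes a hypothetical detector that outputs one confidence score per audio file and triggers when the score reaches a threshold; neither the detector nor the scores come from the source.

```python
from typing import Sequence


def false_accept_rate(scores: Sequence[float], threshold: float) -> float:
    """Fraction of negative samples whose score reaches the trigger threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    accepted = sum(1 for score in scores if score >= threshold)
    return accepted / len(scores)


# Made-up detector scores on four confusion-word clips, for illustration only.
demo_scores = [0.10, 0.40, 0.55, 0.90]
demo_far = false_accept_rate(demo_scores, threshold=0.5)
```

Sweeping the threshold and recording the rate at each value gives a picture of how robust the detector is to near-miss phrases.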


License: CC BY-SA 4.0