AISHELL-6A
Open-source data to advance artificial intelligence
AISHELL-Stammertalk: A Mandarin Stuttered Speech Dataset
Open-sourced: June 2024
AISHELL-Stammertalk is a Mandarin stuttered speech dataset, intended as a high-quality, professionally annotated resource for research on stuttered speech detection, recognition, and related speech technologies. It consists of recordings from 70 native Mandarin-speaking adults who stutter (AWS), 46 male and 24 female, with a total duration of 48.8 hours. Each participant took part in a recording session lasting up to roughly one hour, comprising two parts: free conversation and voice command reading.
Conversation: Interviews were conducted over online meeting platforms such as Zoom or Tencent Meeting, with the aim of capturing spontaneous speech on diverse topics. The interviewer, one of the two authors, asked questions from a prepared list and was free to introduce impromptu questions as needed.
Voice command reading: Each participant read a set of 200 voice commands covering two scenarios, car navigation and smart home device interaction. To ensure variety, a new set of 200 commands was introduced for every 25 participants, giving a total of 600 unique commands in the dataset. In this part, participants were encouraged to use the Voluntary Stuttering technique, deliberately introducing stuttering to elicit a wider range of stuttering events.
The annotation guidelines define five types of stuttering, each marked with its own label:
[]: Word/phrase repetition. Marks a repeated whole character, multi-character sequence, or word. A repetition of a single sound is marked with /r instead.
/b: Block. Marks an obvious long block or a short break in the speech, such as gasps for air or stuttered pauses.
/p: Prolongation. Marks an elongated phoneme.
/r: Sound repetition. Marks a repeated sound, such as a single consonant or vowel, that does not constitute a whole character.
/i: Interjection. Marks filler characters caused by stuttering, e.g., ‘嗯’, ‘啊’, or ‘呃’. Interjections that sound natural and do not disrupt the flow of speech are not marked.
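The five labels above can be tallied per transcript with a few lines of Python. The following is only a minimal sketch. It assumes that the slash labels appear inline as `/b`, `/p`, `/r`, and `/i`, and that each bracketed span counts as one word/phrase repetition. The exact placement of labels in the released transcripts is not described here, and the sample line and both function names are hypothetical.

```python
import re
from collections import Counter

# Slash labels from the guidelines: block, prolongation,
# sound repetition, interjection.
SLASH_LABELS = ("b", "p", "r", "i")

# Assumption: one bracketed span marks one word/phrase repetition event.
BRACKET_RE = re.compile(r"\[[^\[\]]*\]")
# Assumption: slash labels are written inline as "/b", "/p", "/r", "/i".
SLASH_RE = re.compile(r"/([bpri])")


def count_stutter_events(transcript: str) -> Counter:
    """Count each of the five annotated stuttering types in one transcript."""
    counts = Counter({"[]": 0, **{f"/{label}": 0 for label in SLASH_LABELS}})
    counts["[]"] += len(BRACKET_RE.findall(transcript))
    for label in SLASH_RE.findall(transcript):
        counts[f"/{label}"] += 1
    return counts


def strip_labels(transcript: str) -> str:
    """Remove slash labels and bracket characters, keeping the spoken text."""
    without_slash = SLASH_RE.sub("", transcript)
    return without_slash.replace("[", "").replace("]", "")


# Hypothetical annotated line, written for illustration only.
example = "我[我]想/p去嗯/i北/r京"
print(dict(count_stutter_events(example)))
print(strip_labels(example))
```

Counts like these could serve as a quick sanity check on label distribution before training a detection model. Whether repeated text should be kept or dropped when building ASR references is a separate design choice that this sketch does not settle: it keeps all spoken characters.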

The StutteringSpeech Challenge has two objectives: 1) stuttering event detection and 2) automatic speech recognition (ASR) of stuttered speech.
The challenge dataset differs from the AISHELL-Stammertalk dataset only in how the training and test sets are split; the data itself is identical. For details, see: https://stutteringspeech.org/
Contact us
Business cooperation: bd@aishelldata.com
Technical support: tech@aishelldata.com
Phone: +86-010-80225006
Address: D5-A501, Haidian Dayue Information Technology Park, Haidian District, Beijing