This is a list of SSLI's Linguistic Data Consortium (LDC) holdings. Under our contract with the LDC, this data may be accessed only by members of registered departments at the University of Washington.


Category 1993

Catalog # Title Media Number of Media Data Type
LDC93S12 HCRC Map Task Corpus Disc 8 Speech
LDC93T3A Tipster Information Retrieval Disc 3 Text
LDC1993S3A Resource Management Continuous Speech Database (RM1) Speaker-Dependent Training Data Disc 1 Speech
LDC1993T01 Association for Computational Linguistics (ACL) Data Collection Initiative (DCI) Disc 1 Text
LDC93S8 Credit Card Corpus Disc 1 Speech
LDC93T4 Switchboard-1 Transcripts Disc 1 Text

Category 1994

Catalog # Title Media Number of Media Data Type
LDC94S18 Spelled and Spoken Word Telephone Corpus Disc 1 Speech
LDC94T4A United Nations Parallel Text Corpus Disc 3 Text
LDC1994V01 Association for Computational Linguistics European Corpus Initiative Multilingual Corpus 1 Disc 1 Speech

Category 1995

Catalog # Title Media Number of Media Data Type
LDC95T9 Spanish Language News Corpus Disc 1 Text
LDC95T20 Hansard Corpus Parallel Text in English and French Disc 1 Text
LDC1995V03 Penn Treebank Project Disc 1 Speech
LDC95T11 European Languages News Corpus Disc 1 Text
LDC1995T8 Japanese Business News Text Disc 1 Text
LDC95S25 TRAINS Spoken Dialog Corpus Disc 1 Speech
LDC1995T13 Mandarin Chinese News Text Corpus Disc 1 Text

Category 1996

Catalog # Title Media Number of Media Data Type
LDC96S46 CALLFRIEND American English Non-Southern Dialect Disc 3 Speech
LDC96L14 Celex 2 Disc 1 Lexicon
LDC96S52 CALLFRIEND Hindi Disc 3 Speech
LDC96S47 CALLFRIEND American English Southern Disc 3 Speech
LDC96S35 CALLHOME Spanish Speech Disc 1 Speech

Category 1997

Catalog # Title Media Number of Media Data Type
LDC97S45 CALLHOME Egyptian Arabic Disc 3 Speech

Category 1998

Catalog # Title Media Number of Media Data Type
LDC1998S73 1997 Mandarin Broadcast News Speech (Hub-4NE) Disc 8 Speech
LDC1998S76 1998 Speaker Recognition Benchmark (sid98e1f) Disc 6 Speech

Category 1999

Catalog # Title Media Number of Media Data Type
LDC1999T42 Penn Treebank Project Release 3 Disc 1 Text
LDC1999T40 Portuguese Newswire Text Corpus Disc 1 Text
LDC1999S80 1997 Speaker Recognition Benchmark Disc 6 Speech
LDC1999T41 Spanish Newswire Text Corpus Volume 2 Disc 1 Text
LDC99T34 Japanese Business News Text Supplement Disc 1 Text
LDC1999S81 1999 Speaker Recognition Benchmark NIST Speech Disc R55-1.2 Training Data Disc 5 Speech

Category 2000

Catalog # Title Media Number of Media Data Type
LDC2000S86 1998 HUB-4 Broadcast News Evaluation English Test Material Disc 1 Speech
LDC2000S85 Santa Barbara Corpus of Spoken American English Part 1 Disc 3 Speech
LDC2000T52 TREC Mandarin Text Retrieval Conference Mandarin Newswire Disc 1 Text
LDC2000T43 BLLIP 1987-1989 WSJ Corpus Release 1 Disc 2 Text
LDC2000S88 1999 HUB-4 Broadcast News Evaluation English Test Material Disc 1 Speech
LDC2000T50 Hong Kong Hansards Parallel Text Disc 1 Text
LDC2000T45 Korean Newswire Disc 1 Text

Category 2001

Catalog # Title Media Number of Media Data Type
LDC2001S95 TDT3 Mandarin Audio Disc 13 Speech
LDC2001S91 1997 HUB-4 Broadcast News Evaluation Non-English Test Material Disc 1 Speech
LDC2001T55 Arabic Newswire A Corpus Disc 1 Text
LDC2001T57 Topic Detection and Tracking (TDT) 2 Multilanguage Text Corpus Version 4.0 Disc 1 Text
LDC2001S06 Speech in Noisy Environments (SPINE2) Part 2 Disc 2 Speech
LDC2001S97 2000 NIST Speaker Recognition Evaluation Disc 8 Speech
LDC2001T58 TDT3 Multilanguage Text Version 2.0 Disc 1 Text
LDC2001T62 CETEMPublico Version 1.7 Disc 1 Text
LDC2001S93 Topic Detection and Tracking (TDT) 2 Mandarin Audio Disc 6 Speech
LDC2001S04 Speech in Noisy Environments (SPINE2) Part 1 Disc 3 Speech
LDC2001S16 Grassfields Bantu Fieldwork Ngomba Tone Paradigms Disc 1 Speech
LDC2001S08 Speech in Noisy Environments 2 (SPINE2) Part 3: Evaluation Data Disc 3 Speech

Category 2002

Catalog # Title Media Number of Media Data Type
LDC2002S11 1997 HUB4 English Evaluation Speech and Transcripts Disc 1 Speech
LDC2002V2 Callhome Arabic Speech and Transcripts for JHU 2002 Workshop Disc 1 Speech
LDC2002L27 Chinese-English Translation Lexicon v3.0 Disc 1 Lexicon
LDC2002T01 Multiple-Translation Chinese Corpus Disc 1 Text

Category 2003

Catalog # Title Media Number of Media Data Type
LDC2003S06 Santa Barbara Corpus of Spoken American English Part 2 Disc 1 Text
LDC2003T16 SummBank 1.0 Disc 4 Text
LDC2003L01 Grassfields Bantu Fieldwork: Dschang Lexicon Disc 1 Lexicon
LDC2003T12 Arabic Gigaword Disc 1 Text
LDC2003T17 Mandarin Multi-Pack Translation Chinese Part 2 (SSLI Data) Disc 1 Text
LDC2003T09 Chinese Gigaword Disc 1 Text
LDC2003T15 SLX Corpus of Classic Sociolinguistic Interviews Disc 1 Text
LDC2003T20 ANC First Release Disc 1 Text
LDC2003E14 FBIS Multilanguage Texts Disc 2 Text
LDC2003S02 Grassfields Bantu Fieldwork: Dschang Tone Paradigms Disc 1 Speech
LDC2003E01 Mandarin Name Entities (SSLI Data) Disc 1 Text

Category 2004

Catalog # Title Media Number of Media Data Type
LDC2004E12 UN Chinese English Parallel Text Disc 1 Text
LDC2004E01 NIST Pilot Meeting Corpus Speech Disc 8 Speech
LDC2004S11 2002 Rich Transcription Broadcast News and Conversational Telephone Speech Disc 1 Speech
LDC2004T02 Arabic Treebank Part 2, v2.0 Disc 1 Text
LDC2004S02 ICSI Meeting Speech Disc 9 Speech
LDC2004T07 Multiple-Translation Chinese (MTC) Part 3 Disc 1 Text
LDC2004E04 ISL Meeting Corpus Speech Disc 9 Speech
LDC2004T16 2001 Communicator Dialogue Act Tagged Disc 1 Text
LDC2004T19 Fisher English Training Speech Part 1 Transcripts Disc 1 Text
LDC2004T11 Arabic Treebank Part 3, v1.0 Disc 1 Text
LDC2004T09 ACE 2003 Multilingual Training Data Disc 1 Text
LDC2004T08 Hong Kong Parallel Text Disc 1 Text
LDC2004E67 RT-04F STT Chinese CTS Development Data Speech Disc 1 Speech
LDC2004S10 Santa Barbara Corpus of Spoken American English, Part 3 Disc 1 Speech

Category 2005

Catalog # Title Media Number of Media Data Type
LDC2005T01 Chinese Treebank 5.0 Disc 1 Text
LDC2005S22 Articulation Index Disc 2 Speech
LDC2005S15 HKUST Mandarin Telephone Speech, Part 1 Disc 2 Speech
LDC2005T14 Chinese Gigaword Second Edition Disc 1 Text
LDC2005T34 Chinese-English Named Entity Lists v1.0 Disc 1 Text
LDC2005T23 Chinese Proposition Bank 1.0 Disc 1 Text
LDC2005T10 Chinese English News Magazine Parallel Text Disc 1 Text
LDC2005S11 TDT4 Multilingual Broadcast News Speech Corpus Disc 12 Speech
LDC2005T08 Discourse Graphbank Disc 1 Text
LDC2005T33 BBN Pronoun Coreference and Entity Type Corpus Disc 1 Text
LDC2005T30 Arabic Treebank: Part 4 v1.0 (MPG Annotation) Disc 1 Text
LDC2005S26 CSLU: 22 Languages Corpus Disc 2 Speech
LDC2005T02 Arabic Treebank Part 1 v3.0 (POS with full vocalization and syntactic analysis) Disc 1 Text
LDC2005S30 West Point Company G3 American English Speech Data Corpus Disc 1 Speech
LDC2005S07 Levantine Arabic QT Training Data, Set 3 Speech Disc 1 Speech
LDC2005S28 West Point Croatian Speech Corpus Disc 1 Speech
LDC2005T09 ACE 2004 Multilingual Training Corpus Disc 1 Text
LDC2005L01 Mawukakan Lexicon Disc 1 Lexicon
LDC2005S14 Levantine Arabic QT Training Data Set 4, Speech and Transcripts Disc 2 Speech
LDC2005T03 Levantine Arabic QT Training Data, Set 3 Transcripts Disc 1 Text
LDC2005T16 TDT4 Multilingual Broadcast News Speech Corpus, Text and Annotations Disc 1 Text
LDC2005S13 Fisher English Training Part 2 Speech Disc 7 Speech
LDC2005S08 BBN and AUB DARPA Babylon Levantine Arabic Speech and Transcripts Disc 2 Speech
LDC2005T32 HKUST Mandarin Telephone Transcript Data, Part 1 Disc 1 Text
LDC2005T13 CCGbank Disc 1 Text
LDC2005T24 MDE RT-04 Training Data, Text and Annotations Disc 1 Text
LDC2005T06 Chinese News Translation Text, Part 1 Disc 1 Text
LDC2005S25 Santa Barbara Corpus of Spoken American English, Part 4 Disc 1 Speech
LDC2005T20 Arabic Treebank: Part 3 (full corpus) v2.0 (MPG and Syntactic Analysis) Disc 1 Text
LDC2005T07 ACE Time Normalization (TERN) 2004 English Training Data Disc 1 Text
LDC2005G04 FBIS Arabic Release v1.0 Disc 8 Speech
LDC2005S16 MDE RT-04 Training Data Speech Disc 2 Speech

Category 2006

Catalog # Title Media Number of Media Data Type
LDC2006S26 CSLU: Speaker Recognition v1.1 Disc 1 Speech
LDC2006S15 CSLU: Spelled and Spoken Words Disc 1 Speech
LDC2006T19 TDT5 Topics and Annotations Disc 1 Text
LDC2006S31-d NIST 2003 Language Recognition Development Data II Disc 1 Speech
LDC2006S36 West Point Korean Speech Disc 2 Speech
LDC2006S45 Iraqi Arabic Conversational Telephone Speech Disc 1 Speech
LDC2006T14 Korean Broadcast News Transcripts Disc 1 Text
LDC2006S30 Speech Controlled Computing Disc 5 Speech
LDC2006S31-e 1996 Language Recognition Evaluation Disc 1 Speech
LDC2006T12 Spanish Gigaword First Edition Disc 1 Text
LDC2006T17 French Gigaword First Edition Disc 1 Text
LDC2006S33 Middle East Technical University Turkish Microphone Speech v1.0 Disc 1 Speech
LDC2006S37 West Point Heroico Spanish Speech Disc 1 Speech
LDC2006S31 NIST 2003 Language Recognition Evaluation Disc 1 Speech
LDC2006T13 Web 1T 5-gram Version 1 Disc 6 Text
LDC2006S01 CSLU: Voices Version 1.0 Disc 1 Speech
LDC2006T09 Korean Treebank Annotations Version 2.0 Disc 1 Text
LDC2006S42 Korean Broadcast News Speech Disc 1 Speech
LDC2006T20 Arabic Broadcast News, Transcripts Disc 1 Text
LDC2006T03 Korean Propbank Disc 1 Text
LDC2006T10 English-Arabic Treebank v1.0 Disc 1 Text
LDC2006S16 CSLU: Spoltech Brazilian Portuguese v1.0 Disc 1 Speech
LDC2006S34 Russian through Switched Telephone Network (RuSTeN) Disc 1 Speech
LDC2006T15 Gulf Arabic Conversational Telephone Speech Transcripts Disc 1 Text
LDC2006S43 Gulf Arabic Conversational Telephone Speech Disc 1 Speech
LDC2006S44 2004 NIST Speaker Recognition Evaluation Disc 6 Speech
LDC2006T07 Levantine Arabic QT Training Data Set 5 Transcripts Disc 1 Text
LDC2006S39 CSLU: Names Release 1.3 Disc 1 Speech
LDC2006S14 CSLU: Stories Release 1.2 Disc 1 Speech
LDC2006T06 ACE 2005 Multilingual Training Corpus Disc 1 Text
LDC2006T04 Multiple-Translation Chinese (MTC) Part 4 Disc 1 Text
LDC2006S13 N4 NATO Native and Non-Native Speech Disc 1 Speech
LDC2006T18 TDT5 Multilingual Text Disc 1 Text
LDC2006T01 The Prague Dependency Treebank 2.0 Disc 1 Text
LDC2006T16 Iraqi Arabic Conversational Telephone Transcripts Disc 1 Text
LDC2006S46 Arabic Broadcast News Speech Disc 1 Speech
LDC2006S29 Levantine Arabic QT Training Data Set 5 Speech Disc 2 Speech
LDC2006S35 CSLU: Multilanguage Telephone Corpus v1.2 Disc 1 Speech

Category 2007

Catalog # Title Media Number of Media Data Type
LDC2007T02 English Chinese Translation Treebank v1.0 Disc 1 Text
LDC2007T07 English Gigaword Third Edition Disc 2 Text
LDC2007S10 2003 NIST Rich Transcription Evaluation Data Disc 1 Speech
LDC2007E22 Agile Arabic Web Texts Disc 1 Text
LDC2007S05 CSLU Yes-No v1.2 Disc 1 Speech
LDC2007T21 OntoNotes v1.0 Disc 1 Text
LDC2007T38 Chinese Gigaword Third Edition Disc 1 Text
LDC2007E21 CMU and interACT MSA Al-Jazeera, Akhbar, Akhbar-ElYoum LM data Disc 1 Text
LDC2007T08 ISI Arabic-English Automatically Extracted Parallel Text Disc 1 Text
LDC2007T01 Levantine Arabic Conversational Telephone Speech Transcripts Disc 1 Text
LDC2007S08 CSLU Foreign Accented English v1.2 Disc 1 Speech
LDC2007V01 TRECVID 2005 Keyframes and Transcripts Disc 1 Video
LDC2007S13 CSLU Apple Words and Phrases Disc 1 Speech
LDC2007S03 ARL Urdu Speech Database Training Data Disc 8 Speech
LDC2007T04 Fisher Levantine Arabic Conversational Telephone Speech Transcripts Disc 1 Text
LDC2007E28 BNO3 CTMs from IBM Disc 1 Speech?
LDC2007V02 TRECVID 2003 Keyframes and Transcripts Disc 1 Video
LDC2007T40 Arabic Gigaword Third Edition Disc 1 Text
LDC2007T09 ISI Chinese-English Automatically Extracted Parallel Text Disc 1 Text
LDC2007S02 Fisher Levantine Arabic Conversational Telephone Speech Disc 1 Speech
LDC2007S12 2004 Spring NIST Rich Transcription (RT-04S) Evaluation Data Disc 1 Speech
LDC2007S11 2004 Spring NIST Rich Transcription (RT-04S) Development Data Disc 1 Speech
LDC2007T03 Tagged Chinese Gigaword Disc 1 Text
LDC2007G01 TRANSTAC Phase 1 Speech Disc 6 Speech
LDC2007S01 Levantine Arabic Conversational Telephone Speech Disc 1 Speech
LDC2007G03 TRANSTAC Phase 2 Speech Disc 6 Speech
LDC2007S18 CSLU Kids' Speech v1.1 Disc 3 Speech

Category 2008

Catalog # Title Media Number of Media Data Type
LDC2008L01 An English Dictionary of the Tamil Verb Disc 1 Lexicon
LDC2008S01 CSLU: Portland Cellular Telephone Speech v1.3 Disc 1 Speech
LDC2008T07 Chinese Proposition Bank 2.0 (CPB2.0) Disc 1 Text
LDC2008S02 CSLU: National Cellular Telephone Speech Release 2.3 Disc 1 Speech
LDC2008T17 CALLHOME Mandarin Chinese Transcripts XML Version Disc 1 Text
LDC2008S06 CSLU: Alphadigit Version 1.3 Disc 2 Speech
LDC2008S05 2005 NIST Language Recognition Evaluation Disc 1 Speech
LDC2008T05 Penn Discourse Treebank Version 2 Disc 1 Text
LDC2008L03 Global Yoruba Lexical Database v1.0 Disc 1 Lexicon
LDC2008T21 PennBioIE Oncology 1.0 Disc 1 Text
LDC2008S09 Characterizing Individual Speakers (CHAINS) Disc 1 Speech
LDC2008T04 OntoNotes Release 2.0 Disc 1 Text
LDC2008T25 AQUAINT-2 Information-Retrieval Text Research Collection Disc 1 Text
LDC2008S04 West Point Brazilian Portuguese Speech Disc 1 Speech
LDC2008S07 CSLU: ISOLET Spoken Letter Database Disc 1 Speech
LDC2008T20 PennBioIE CYP 1.0 Disc 1 Text
LDC2008S03 STC-TIMIT 1.0 Disc 1 Speech
LDC2008T03 ACE 2005 English SpatialML Annotations Disc 1 Text

Category 2009

Catalog # Title Media Number of Media Data Type
LDC2009T01 English CTS Treebank with Structural Metadata Disc 1 Text
LDC2009T05 2008 NIST Metrics for Machine Translation (MetricsMATR08) Development Data Disc 1 Text
LDC2009T14 Tagged Chinese Gigaword v2.0 Disc 1 Text
LDC2009L01 An English Dictionary of the Tamil Verb, Second Edition Disc 1 Lexicon
LDC2009T07 Unified Linguistic Annotation Text Collection Disc 1 Text
LDC2009S01 CSLU: Numbers Version 1.3 Disc 1 Speech

Category GALE

Catalog # Title Media Number of Media Data Type
LDC2007E102 GALE Phase 3 Release 1 Web Text v1.0 Disc 6 Speech
LDC2009E13 GALE Phase 4 Release 2 Broadcast Audio HDD 1 Speech
LDC2006E77 GALE Y1 Q3 Release Web Text Collection v1.0 Disc 1 Text
r105_1_1 GALE Translation Dry Run Evaluation - NIST Speech Disc Disc 1 Speech
LDC2008E53 GALE Phase 4 Release 1 Broadcast Audio HDD 1 Speech
LDC2008T18 GALE Phase 1 Chinese Broadcast News Parallel Text - Part 3 Disc 1 Text
GALE-cd007 GALE-cd007 Disc 1 Speech?
LDC2008E41 GALE Phase 3 Release 2 - Web Text Disc 4 Text
LDC2007E04 GALE Phase 2 Release 1 Web Text Disc 1 Text
GALE-chinese-text GALE Chinese Text Disc 1 Text
GALE-cd006 GALE-cd006 Disc 1 Speech?
LDC2007E32 GALE DEV07 Supplement Audio C, D Disc 1 Speech
LDC2008T02 GALE Phase 1 Arabic Blog Parallel Text Disc 1 Text
LDC2006E94 GALE Year 1 Quarter 4 Release - Arabic Treebank Disc 1 Text
LDC2005E62 GALE Kickoff Release: Broadcast News Audio v1.0 - Chinese Disc 4 Speech
LDC2006E91 GALE Year 1 Quarter 4 Release - Transcripts Disc 1 Text
GALEY1Q2 GALE Year 1 Quarter 2 Disc 1 Speech
LDC2007E44 GALE Phase 2 Release 2 Web Text Disc 2 Text
LDC2009T06 GALE Phase 1 Chinese Broadcast Conversation Parallel Text - Part 2 Disc 1 Text
LDC2006E88 GALE Y1 Web 1T 5-gram Version 1 Disc 6 Text
LDC2008T06 GALE Phase 1 Chinese Blog Parallel Text Disc 1 Text
LDC2005E60 GALE Kickoff Release VOA Arabic Broadcast News Audio Disc 1 Speech
LDC2007T24 GALE Phase 1 Arabic Broadcast News Parallel Text - Part 1 Disc 1 Text
r109_1_1 GALE-06 Evaluation Data - NIST Speech Disc Disc 1 Speech
LDC2006G07 GALE Y1 BBN Iraqi Broadcast Conversation Corpus Disc 5 Speech
LDC2005E81 GALE Y1 Q1 Release Web Text Collection v1.0 Disc 1 Text
LDC2007E15 GALE Phase 2 DevTest - Broadcast Audio v1.0 Arabic Disc 2 Speech
LDC2007E60 GALE Phase 3 DevTest Broadcast Audio v1.0 Disc 2 Speech
GALE-cd005 GALE-cd005 Disc 1 Speech?
SSLI-SAUDIO-tdf GALE 3.5 English Corpus .tdf Files Disc 1 Speech
LDC2006E21 GALE Y1 Distillation Evaluation Audio - Arabic Disc 3 Speech
GALE041 GALE Year 1 Quarter 1 HDD 1 Text?
LDC2009T02 GALE Phase 1 Chinese Broadcast Conversation Parallel Text - Part 1 Disc 1 Text
LDC2008T09 GALE Phase 1 Arabic Broadcast News Parallel Text Part 2 Disc 1 Text
LDC2007E43 GALE Phase 2 Release 2 Broadcast Audio HDD 1 Speech
LDC2007E01 GALE Phase 2 Distillation Evaluation Supplemental English Broadcast Audio HDD 1 Speech
LDC2007_32_A GALE DEV07 Supplement Audio A Disc 1 Speech
LDC2009T15 GALE Phase 1 Chinese Newsgroup Parallel Text - Part 1 Disc 1 Text
GALE-cd004 GALE-cd004 Disc 1 Speech?
LDC2007T23 GALE Phase 1 Chinese Broadcast News Parallel Text - Part 1 Disc 1 Text
LDC2006E32 GALE Y1 Q2 Release Web Text Collection v1.0 Disc 1 Text
GALEQ4030 GALE Year 1 Quarter 4 HDD 1 Text?
2006-dryrun GALE Year 1 Dry Run 2005-12 to 2006-01 Disc 1 Text?
SSLI-SAUDIO-wav GALE 3.5 English SAUDIO .wav Files Disc 1 Speech
LDC2007E99 GALE Phase 3 Release 1 Broadcast Audio HDD 1 Speech
Gale35GoNoGo GALE Phase 3.5 Evaluation Go-No-Go Translation Evaluation HDD 1 Speech
GALE-cd001 GALE-cd001 Disc 1 Speech?
GALE-cd002 GALE-cd002 Disc 1 Speech?
LDC2005E76 GALE Kickoff Release 2: Levantine Arabic CTS Audio Disc 1 Speech
LDC2006E90 GALE Y1 Q4 Web Text Collection v1.0 Disc 1 Text
LDC2009T03 GALE Phase 1 Arabic Newsgroup Parallel Text - Part 1 Disc 1 Text
LDC2007E03 GALE Phase 2 Release 1 Broadcast Audio HDD 1 Speech
GALEQ233 GALE Year 1 Quarter 2 HDD 1 Text?
LDC2007T20 GALE Phase 1 Distillation Training Disc 1 Text
LDC2008T08 GALE Phase 1 Chinese Broadcast News Parallel Text Part 2 Disc 1 Text
GALE-cd008 GALE-cd008 Disc 1 Speech?
LDC2008E38 GALE Phase 3 Release 2 Broadcast Audio HDD 1 Speech
GALEQ3036 GALE Year 1 Quarter 3 HDD 1 Text?
LDC2009E25 Patch for GALE Phase 4 Release 2-Broadcast Audio Disc 1 Speech
LDC2005E61 GALE Kickoff Release: Broadcast Conversation Audio v1.0 - Chinese Disc 2 Speech
LDC2009E69 GALE Phase 4 DevTest Audio Source Snippets Disc 2 Audio
GALE-cd003 GALE-cd003 Disc 1 Speech?
LDC2007E02 GALE Phase 2 Distillation Evaluation - Supplemental Multilingual Newswire Disc 1 Text
LDC2009E14 GALE Phase 4 Release 2-Web Text Disc 3 Text

Category other

Catalog # Title Media Number of Media Data Type
LID03e1 2003 NIST Language Recognition Evaluation March 2003 Evaluation Disc 1 Speech
R87_8.1 NIST RT-03 Spring Evaluation Data - Arabic Conversational Telephone Speech Disc 1 Speech
R95_4.1 NIST RT-04 Fall Evaluation Data - Mandarin Conversational Telephone Speech Disc 1 Speech
26-6.1 DARPA Radio Broadcast News Continuous Speech Recognition Corpus Hub-4 Data 3 Disc 1 Speech
2-1.1 DARPA Resource Management Continuous Speech Database (RM1) Speaker-Dependent Training Data Disc 1 Disc 1 Speech
R87_2.1 NIST RT-03 Spring Evaluation Data - English Conversational Telephone Speech Disc 1 Speech
3-2.2 DARPA Extended Resource Management Continuous Speech Speaker-Dependent Corpus (RM2) Disc 2 Disc 1 Speech
2-2.1 DARPA Resource Management Continuous Speech Database (RM1) Speaker-Dependent Training Data Disc 2 Disc 1 Speech
R87_1.1 NIST RT-03 Spring Evaluation Data - English Broadcast News Disc 1 Speech
26-2.1 DARPA Radio Broadcast News Continuous Speech Recognition Corpus Hub-4 Data 2 Disc 1 Speech
CAToolkit Conversational Agent Toolkit and CU Communicator Spoken Dialog System Disc 1 Software
R87_3.1 NIST RT-03 Spring Evaluation Data - Mandarin Broadcast News and Conversational Telephone Disc 1 Speech
R79_1_1 Audio Files from 2001 Communicator Data Collection Disc 6 Speech
AMTA-2008-Proceedings Proceedings of the Eighth Conference of the Association for Machine Translation in the Americas Disc 1 Text
26-1.1 DARPA Radio Broadcast News Continuous Speech Recognition Corpus Hub-4 Data 1 Disc 1 Speech
Hub4-98T24 Hub4-98T24 Transcript Disc 1 Text
HUB5DEV2001 Hub-5 Development 2001 Disc 1 Speech
2-4.1 DARPA Resource Management Continuous Speech Database (RM1) Development Test and Evaluation Test Data Disc 1 Speech
R87_5.1 NIST RT-03 Spring Evaluation Data - Mandarin Conversational Telephone Speech Disc 1 Speech
R87_3b.2 NIST RT-03 Spring Evaluation Data - Mandarin Conversational Telephone Speech Disc 1 Speech
R87_2.2 NIST RT-03 Spring Evaluation Data - English Conversational Telephone Speech Disc 1 Speech
SID96e1f SWITCHBOARD 1996 Speaker Recognition Benchmark Disc 3 Speech
3-1.2 DARPA Extended Resource Management Continuous Speech Speaker-Dependent Corpus (RM2) Disc 1 Disc 1 Speech
LDY2007V01 TRECVID 2005 Keyframes and Transcripts Disc 1 Video
R87_4.1 NIST RT-03 Spring Evaluation Data - Arabic Broadcast News and Conversational Telephone Speech Disc 1 Speech
2-5.1 DARPA Resource Management Continuous Speech Database (RM1) Isolated and Spelled Word Data Disc 1 Speech
HUB-4NE 1997 Spanish Broadcast News Speech Corpus Disc 9 Speech
27-2.1 1996 Hub-4 Continuous Speech Recognition Broadcast News Development Test Material Part 2 Disc 1 Speech
LTC2007T36 Chinese Treebank 6.0 (CTB6.0) Disc 1 Text
R87_4b.2 NIST RT-03 Spring Evaluation Data - Arabic Conversational Telephone Speech Disc 1 Speech
CSLU-22LANG-1.1 CSLU: 22 Language v1.1 Disc 9 Speech
2-3.1 DARPA Resource Management Continuous Speech Database (RM1) Speaker-Independent Training Data Disc 1 Speech
27-1.1 1996 Hub-4 Continuous Speech Recognition Broadcast News Development Test Material Part 1 Disc 1 Speech