IEEE Transactions on Speech and Audio Processing (TSAP) and ICASSP EDICS
------------------------------------------------------------------------

Last updated: $Date: 2005/10/28 07:44:16 $ time-zone UTC
Version: $Revision: 1.21 $

Outline:
  1) constraints that the EDICS must follow.
  2) the EDICS themselves.
  3) edit history of the EDICS since June 2004 and whose comments
     the edits were based on.

This file contains the latest IEEE TSAP/ICASSP EDICS. EDICS stands for
"Editors' Information Classification Scheme", and it is important that
we as a community develop a good set of EDICS for us all to use: it
helps both speed up reviewer assignment and ensure reviewer quality for
papers submitted to both conferences (ICASSP, ASRU) and journals (TSAP).

Two issues motivated the effort to update the EDICS for speech
processing (SP) and spoken language processing (SLP). First, it was
necessary to expand the EDICS in the spoken language area to cover
recent advances and trends. Second, there was a desire to integrate the
TSAP EDICS with the ICASSP EDICS so that there is only one set of EDICS
for all transactions, conferences, and workshops.

This is a draft, and we want feedback to improve the EDICS. All
feedback from you will be incorporated to the extent possible, but
there are a few constraints we all must follow:

  1) The broad EDICS in each category should number fewer than
     about 20.

  2) The inner level should also have no more than about 20
     sub-categories.

  3) At the very top level, there should be a "Speech Processing"
     category and a "Spoken Language Processing" category. This is a
     constraint from TSAP; together with "Audio and Electroacoustics",
     these define the three top-level categories. (If you are thinking
     that this makes three levels of category nesting, you are
     correct.) The reason is that Manuscript Central (the web service
     that IEEE uses) only allows submitters to choose from the top 2
     levels for paper assignment to associate editors (yes, changing
     the web page would be a good idea, but that would need to be
     IEEE-wide).

  4) Try to ensure that major topics in any given sub-EDIC (say, 1.1,
     or 1.2, etc.) can be reviewed by the same reviewer, i.e., someone
     with background in all the sub-EDIC areas.

  5) Note that the numberings themselves are subject to change (in
     fact, I think the numberings will be different for ICASSP and
     TSAP), so focus only on the categories/sub-categories.

Please send me feedback on the EDICS (but only for the SP and SLP
categories, not for the Audio and Electroacoustics (AE) category, see
below). The most recent version of this file (as I update it based on
your suggestions) will always stay here at the link:

   http://ssli.ee.washington.edu/~bilmes/TSAP-ICASSP-EDICS.txt

At the bottom of this file, I am keeping a history which you can also
check to see if your suggestions have yet been incorporated (I will do
my very best to incorporate all suggestions, as long as they do not
conflict with others' suggestions :). I'll probably not respond to each
suggestion by email (but each will eventually appear in the history
below).
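
The size constraints above (about 20 entries per level) and the
consistency of the numbering can be checked mechanically. The following
is only a minimal sketch in Python, not part of the EDICS process: the
function name check_edics, the list-of-pairs representation, and the
choice to treat "about 20" as a hard limit of 20 are all assumptions
made for illustration.

```python
import re

MAX_PER_LEVEL = 20  # assumption: "about 20" is treated as a hard limit of 20


def check_edics(category):
    """List numbering and size problems in one top-level category.

    `category` is a list of (heading, sub_categories) pairs in file order,
    e.g. ("12: Lexical Modeling and Access", ["12.1: ...", "12.2: ..."]).
    """
    problems = []
    if len(category) > MAX_PER_LEVEL:
        problems.append(f"{len(category)} EDICS exceeds {MAX_PER_LEVEL}")
    previous = None
    for heading, subs in category:
        match = re.match(r"\s*(\d+)\s*:", heading)
        if match is None:
            problems.append(f"unnumbered heading: {heading!r}")
            continue
        number = int(match.group(1))
        # Headings are expected to count up by one within a category.
        if previous is not None and number != previous + 1:
            problems.append(f"heading {number} follows {previous}")
        previous = number
        if len(subs) > MAX_PER_LEVEL:
            problems.append(f"EDICS {number} has {len(subs)} sub-categories")
        # Sub-categories are expected to be numbered <heading>.1, <heading>.2, ...
        for position, sub in enumerate(subs, start=1):
            expected = f"{number}.{position}:"
            if not sub.strip().startswith(expected):
                problems.append(f"expected {expected} but found {sub.strip()!r}")
    return problems


# Hypothetical two-entry excerpt with one deliberately wrong sub-number.
demo = [
    ("1: First topic", ["1.1: a", "1.2: b"]),
    ("2: Second topic", ["2.1: c", "4.2: d"]),
]
for problem in check_edics(demo):
    print(problem)
```

Run on each top-level category in turn, a check of this kind would flag
a sub-category whose prefix does not match its parent heading, as well
as any level that grows past the size limit.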

These EDICS were last updated
$Id: TSAP-ICASSP-EDICS.txt,v 1.21 2005/10/28 07:44:16 bilmes Exp $

Here are the EDICS:

-- begin of EDICS ---

Speech Processing (SPE-)

   1: Speech Production (SPE-SPRD)
      1.1: Physical models of the vocal production system
      1.2: Bioacoustics and Medical Acoustics
      1.3: Singing and properties of the musical voice

   2: Speech Perception and Psychoacoustics (SPE-SPER)
      2.1: Models of Speech Perception
      2.2: Hearing and Psychoacoustics
      2.3: Physiological models and applications thereof
      2.4: Audiology applications

   3: Speech Analysis (SPE-ANLS)
      3.1: Spectral and other time-frequency analysis techniques
      3.2: Segmental and suprasegmental analysis
      3.3: Distortion measures
      3.4: Extraction of non-linguistic information (e.g., gender,
           stress, etc.)
      3.5: Voice/speech disorders
      3.6: Speaker localization (space) (e.g., in meetings)
      3.7: Speaker diarization (time) (e.g., in meetings)
      3.8: Speaker clustering (e.g., in Broadcast news)

   4: Speech Synthesis and Generation, including TTS (SPE-SYNT)
      4.1: Segmental-Level and/or concatenative synthesis
      4.2: Signal Processing/Statistical Model for synthesis
      4.3: Articulatory Synthesis
      4.4: Parametric Synthesis
      4.5: Prosody, Emotional, and Expressive Synthesis
      4.6: Text-to-phoneme conversion
      4.7: Voice Quality/Morphing
      4.8: Audio/Visual speech synthesis
      4.9: Multilingual synthesis
      4.10: Quality assessment/evaluation metrics in synthesis
      4.11: Tools and data for speech synthesis
      4.12: Text processing for speech synthesis (text normalization,
            syntactic and semantic analysis)

   5: Speech Coding (SPE-CODI)
      5.1: Narrow-band and wide-band Speech Coding
      5.2: Theory and techniques for signal coding (e.g., waveform,
           transform)
      5.3: Modulation and source/channel coding
      5.4: Quantization and compression
      5.5: Robust coding for noisy channels
      5.6: Coding for Voice Over IP (VOIP)
      5.7: Quality assessment/evaluation metrics (e.g., PESQ) in coding
      5.8: New applications of VOIP

   6: Speech Enhancement (SPE-ENHA)
      6.1: Control and reduction of channel noise (e.g., reverb, room
           response)
      6.2: Perceptual enhancement of non-noisy speech
      6.3: Speech enhancement for humans with hearing impairments
      6.4: Non-acoustic microphones for enhancement
      6.5: Bandwidth expansion
      6.6: Noise Reduction

   7: Acoustic Modeling for Automatic Speech Recognition (SPE-RECO)
      7.1: Feature Extraction
      7.2: Low-level feature modeling - Gaussians & beyond
      7.3: Pronunciation modeling at the acoustic level
      7.4: State clustering and novel state definitions
      7.5: Prosody and other speech characteristics
      7.6: Dialect, accent, and idiolect at the acoustic level
      7.7: Discriminative Acoustic Training Methods for ASR
      7.8: Articulatory and physiological modeling
      7.9: Feature Transformation and Normalization

   8: Robust Speech Recognition (SPE-ROBU)
      8.1: Features specifically for robust ASR (noise, channel, etc.)
      8.2: Model/backend based robust ASR
      8.3: Confidence measures and rejection
      8.4: Speech Activity/End-point detection
      8.5: Barge-in
      8.6: Non-acoustic microphones for ASR

   9: Speech Adaptation/Normalization (SPE-ADAP)
      9.1: Speaker adaptation and normalization (e.g., VTLN)
      9.2: Speaker adapted training methods
      9.3: Environmental/Channel adaptation
      9.4: Idiolect adaptation
      9.5: Register and/or dialect adaptation

   10: General Topics in Speech Recognition (SPE-GASR)
      10.1: Distributed Speech Recognition - Client/Server methods
      10.2: Alternative Statistical/Machine Learning Methods (e.g.,
            no HMMs)
      10.3: Word spotting
      10.4: Metadata (e.g., emotion, speaker, accent) extraction from
            acoustics
      10.5: New algorithms, computational strategies, data-structures
            for ASR
      10.6: Multi-modal (such as audio-visual) speech recognition
      10.7: Corpora, annotation, and other resources
      10.8: Algorithm approximation methods in ASR
      10.9: Structured classification approaches

   11: Multilingual Recognition and Identification (SPE-MULT)
      11.1: Language (LID) and dialect (DID) identification
      11.2: Multilingual Speech recognition
      11.3: Processing of non-native accents

   12: Lexical Modeling and Access (SPE-LEXI)
      12.1: Pronunciation modeling at the lexical level
      12.2: Dialect, accent, and idiolect at the lexical level
      12.3: Multilingual aspects (e.g., unit selection)
      12.4: Automatic lexicon learning

   13: Large Vocabulary Continuous Recognition/Search (SPE-LVCR)
      13.1: Decoding algorithms and implementation
      13.2: Lattices
      13.3: Multi-pass strategies
      13.4: Miscellaneous Topics

   14: Speaker Recognition and Characterization (SPE-SPKR)
      14.1: Features and characteristics for speaker recognition
      14.2: Robustness to variable and degraded channels
      14.3: Verification, identification, segmentation, and clustering
      14.4: Speaker characterization and adaptation
      14.5: Speaker recognition with speech recognition
      14.6: Speaker confidence estimation
      14.7: Multimodal and multimedia human speaker recognition
      14.8: Corpora, annotation, evaluation, and other resources
      14.9: Higher-level knowledge in speaker recognition

   15: Resource constrained speech recognition (SPE-RCSR)
      15.1: Low-power speech recognition
      15.2: Reduced computation speech recognition
      15.3: ASR techniques for highly portable/mobile devices

Spoken Language Processing (SLP-)

   1: Spoken Language Understanding (SLP-UNDE)
      1.1: Paralinguistic (emotion, age, gender, rate, etc.) information
      1.2: Nonlinguistic (meaning external to language) information,
           gestures, etc.
      1.3: Semantic classification
      1.4: Question/answering from speech
      1.5: Entity extraction from speech
      1.6: Spoken document summarization
      1.7: Detecting linguistic/discourse structure (e.g.,
           disfluencies, sentence/topic boundaries, speech acts)
      1.8: Relation to and interpretation of sign language

   2: Human Spoken Language Acquisition, Development and Learning
      (SLP-LADL)
      2.1: Language acquisition, development, and learning models
      2.2: Computer aids for language learning
      2.3: Attributes and modeling techniques for assessment of
           language fluency

   3: Spoken and Multimodal Dialog Systems and Applications (SLP-SMMD)
      3.1: Spoken and multimodal dialog systems, applications, and
           architectures
      3.2: Stochastic Learning for dialog modeling
      3.3: Response Generation
      3.4: Technologies for the aged
      3.5: Evaluation metrics and standards
      3.6: Speech/voice-based human-computer interfaces (HCI)
      3.7: Speech HCI for individuals with impairments (blindness,
           etc.) and universal access (UA)
      3.8: Other applications

   4: Speech data mining and Document Retrieval (SLP-SMIR)
      4.1: Analysis and Evaluations for mining spoken data
      4.2: Search/retrieval of speech documents
      4.3: Mining heterogeneous speech and multimedia data
      4.4: Speech data mining theory, algorithms, and methods
      4.5: Core machine learning algorithms for data mining
      4.6: Topic spotting and classification
      4.7: Pattern discovery and prediction from data
      4.8: Applications and tools for speech data mining

   5: Machine Translation of Speech (SLP-SSMT)
      5.1: Semi-automatic and data driven methods
      5.2: Speech processing for MTS
      5.3: Corpora, annotation, and other resources
      5.4: Interlingua and transfer approaches
      5.5: Integration of speech and linguistic processing
      5.6: Machine transliteration for named entities
      5.7: Evaluation metrics (e.g., BLEU)
      5.8: Systems and applications for MTS

   6: Language Modeling, for Speech and SLP (SLP-LANG)
      6.1: N-grams, their generalizations and smoothing methods.
      6.2: Language Model Adaptation
      6.3: Grammar based language modeling
      6.4: Maxent and feature based language modeling
      6.5: Dialect, accent, and idiolect at the language level
      6.6: Discriminative LM Training Methods
      6.7: Other approaches to LMs
      6.8: Structured classification approaches

   7: Spoken language resources and annotation (SLP-REAN)
      7.1: General corpora, annotation, and other resources

Audio and Electroacoustics (AUD-)

   Note: I am not currently taking comments on these, but they are
   here for completeness.

   1: Room Acoustics and Acoustic System Modeling (AUD-ROOM)
   2: Transducers (AUD-TRAN)
   3: Loudspeaker and Microphone Array Signal Processing (AUD-LMAP)
   4: Active Noise Control (AUD-ANCO)
   5: Echo Cancellation (AUD-ECHO)
   6: Auditory Modeling and Hearing Aids (AUD-AUDI)
      Aids for the handicapped; auditory modeling and psychoacoustics,
      multi-channel biological modeling, binaural hearing;
      multi-channel medical aids (cochlear implants, hearing aids).
   7: Broadband and Perceptual Coding (AUD-ACOD)
      Low bit-rate audio coding; high quality audio coding; lossless
      audio coding; joint source-channel coding; parametric audio
      coding; audio coding theory.
   8: Applications to Music (AUD-MUSI)
      Music analysis and synthesis; room acoustics for music
      performance and reproduction; music processing systems, hardware
      and software; content identification.
   9: Spatial and Multichannel Audio (AUD-SMCA)
      3-D Sound reproduction; head-related transfer function.
   10: Audio for Multimedia (AUD-AUMM)
   11: Network Audio (AUD-NWAU)
   12: Hardware and Software Systems (AUD-HWSW)
   13: Consumer and Professional Audio (AUD-CAUD)

   Extras (when I do get suggestions for AE, they are placed here):
   14: Audio Classification

-- end of EDICS ---

History:

- 5/2004: Jeff, Mazin and Isabel all agree that the current
  ICASSP/TSAP EDICS are quite out of date, and agree that they should
  ideally be unified.
- 6/2004: Jeff creates new set of EDICS
- 6/2004: Isabel created new EDICS for TSAP, Jeff integrated them into
  these EDICS.
- 6/2004: Mazin edits based on constraints mentioned at top of this
  file.
- 6/2004: Jeff sends out to STC and asks for rapid response from
  people in order to make it in for ICASSP'05
- 6/2004: Jeff integrates comments from Michael Picheny, Alan Black,
  Timothy Hazen, and Ivan Bulyko.
- 6/2004: Mazin does a final pass to make sure EDICS match the
  ICASSP'05 constraints.
- a year passes.
- 6/2005: Mazin updates the EDICS based on an attempt to merge
  ICASSP/TSAP and get the ball rolling again.
- 6/2005: Isabel mentions that TSAP needs the top level categories, so
  Mazin updates again. Mazin writes: "The difference between the
  ICASSP and the T-SAP will remain that ICASSP will have a third level
  and T-SAP will just state those levels exactly without numbering.
  That may be fine though if SPS is against explicitly defining a
  third level."
- 6/2005: Isabel sends an updated version
- 6/2005: Mazin updates them again
- 7/2005: Jeff takes EDIC lock again.
- 7/2005: integrates 6/2004 comments from Nick Campbell (that missed
  the ICASSP'05 deadline).
- 7/2005: integrates 6/2004 comments from Jean-Pierre Martens (that
  missed the ICASSP'05 deadline)
- 7/2005: updates based on Alex Acero's comments from 6/2005.
- 7/2005: incorporates Joseph Campbell's comments from 6/2005.
- 7/2005: incorporated suggestions from Isabel.
- 7/18/2005: Solicitation from STC based on version 1.11.
- 7/19/2005: updated based on new Michael Picheny comments from today.
- 7/20/2005: incorporated Bill Byrne's comments from today.
- 7/20/2005: integrated some (but not all yet) of Mari Ostendorf's
  comments.
- 8/25/2005: added several more categories (comments from myself).
- 8/25/2005: integrated rest of Mari's 7/21/05 comments
- 8/25/2005: fully integrated Isabel's comments from 7/6/05
- 8/25/2005: included AE category by Isabel's suggestion from 8/08/05
- 8/25/2005: included Alex Acero's comments from 7/27/05
- 8/26/2005: included Climent Nadeu's comments from today.
- 10/27/2005: included new Mari labels for TSAP (e.g., things like
  SPE-RECO, SPE-ROBU, etc.)
- 10/27/2005: included more Isabel suggestions from today.

- EOF -