Resource Pool

We aim to offer useful tools and datasets in the field of Language and Speech Communication Science. This page will be updated regularly as further relevant resources become available.

  • Speech Databases
    • Perceptimatic Dataset: The Perceptimatic Dataset links give access to several datasets composed of stimuli in French, English, Brazilian Portuguese, Turkish, Estonian, and German, with files and resources available for download.
  • Human Perception Experimental Data
    • List
  • Neural Datasets
    • List
  • Psycholinguistic resources
    • English Lexicon Project: The English Lexicon Project, supported by the National Science Foundation, is a repository of lexical characteristics and behavioral data from visual lexical decision and naming studies covering 40,481 words and 40,481 nonwords. The data were collected at six universities, from over 815 subjects in the lexical decision experiment and 443 subjects in the naming experiment.
  • English Corpora
    • English-Corpora.org: A corpus (plural: corpora) is a highly structured text collection that supports sophisticated searches for exploring language nuances, such as variation between genres, between dialects, and over time. Unlike general-purpose search engines such as Google, corpora provide researchers, learners, and teachers with extensive data on words, phrases, and grammatical structures beyond what textbooks or dictionaries cover. English-Corpora.org’s corpora, used by over 85,000 users monthly, offer detailed “word sketches” for the top 60,000 English words. These sketches include definitions, genre-specific frequencies, synonyms, collocates, related topics, clusters, concordance lines, and links to external resources such as dictionary entries, pronunciation, images, videos, and translations into 100+ languages.
    • English-Corpora.org: Introduction Video
  • Phenxtoolkit.org: Research Domain – Speech, Language and Hearing
    • The PhenX Toolkit is an online repository that houses crucial measures associated with complex diseases, phenotypic traits, and environmental factors. These measures undergo meticulous selection by expert working groups, ensuring a consensus-driven approach. The Toolkit provides Standard Measurement Protocols for various research domains.
    • Under the Research Domain of Speech, Language, and Hearing, the scope includes:
      • Apraxia/Speech/Sound Disorder (articulation disorders)
      • Audiogram
      • Central Auditory Processing
      • Dysphagia
      • Dysarthria
      • Dyslexia/Reading Disorder
      • Family History (Family History of Speech and Language Impairment, Personal and Family History of Hearing Loss)
      • Late Language Emergence (Early Childhood Speech and Language Assessment)
      • Morphosyntactic/Syntactic Impairments
      • Noise-induced Hearing Loss
      • Nonsyndromic Hearing Loss
      • Otitis Media/Ear Infections
      • Pitch-perception Disorders
      • Presbycusis
      • Semantic Impairments
      • Stuttering/Cluttering
      • Tinnitus
      • Verbal Memory
      • Vertigo
      • Vocal Cord Function
      • Velopharyngeal Incompetence (VPI)