Research


Open Brain AI & Computerized Clinical Language Assessment

Themis Themistocleous Charalambos (2023). Computational Language Assessment Using AI: Open Brain AI.

OpenBrainAI.com: [WEB PLATFORM]
OpenBrainAI.com: [PDF]
Themis Themistocleous Charalambos, Bronte Ficek, Kimberly Webster, Dirk-Bart den Ouden, Argye E. Hillis, Kyrana Tsapkini (2021). Automatic subtyping of individuals with Primary Progressive Aphasia. Journal of Alzheimer’s Disease, https://doi.org/10.3233/JAD-201101

This Machine Learning model performs differential diagnosis of patients with PPA, using combined acoustic and linguistic information elicited automatically. The end-to-end automated machine learning approach enables clinicians and researchers to provide an easy, quick, and inexpensive classification of patients with PPA.

  • Journal of Alzheimer's Disease, version: [PDF]
  • Link to GitHub page with source code: [CODE]
Themistocleous Charalambos, Eckerström Marie, and Dimitrios Kokkinakis (2018). Identification of Mild Cognitive Impairment (MCI) from Speech in Swedish using Deep Sequential Neural Networks. Frontiers in Neurology. doi: 10.3389/fneur.2018.00975.

To this day, there is no cure for dementia but early-stage treatment can delay the progression of MCI; thus, the development of valid tools for identifying early cognitive changes is of great importance. In this study, we provide an automated machine learning method, using Deep Neural Network Architectures, that aims to identify MCI. The Deep Neural Network Architecture proposed here constitutes a method that contributes to the early diagnosis of cognitive decline, quantifies the progression of the condition, and enables suitable therapeutics.

Frontiers in Neurology 2018 version: [PDF]

Link to GitHub page with source code: [CODE]

Computerized Clinical Discourse Analysis

Speech production characteristics (i.e., voice quality and speech fluency) in patients with MCI and healthy individuals (Themistocleous, Eckerström, & Kokkinakis, 2020) and grammar, namely Part of Speech Production, differences in patients with PPA (Themistocleous, Webster, Afthinos, & Tsapkini, 2020).

Themistocleous Charalambos, Webster Kim, Afthinos Alexandros, & Tsapkini Kyrana (2020). Part of Speech Production in Patients With Primary Progressive Aphasia: An Analysis Based on Natural Language Processing. American Journal of Speech-Language Pathology. https://doi.org/10.1044/2020_AJSLP-19-00114

Primary progressive aphasia (PPA) is a neurodegenerative disorder characterized by a progressive decline of language functions. Its symptoms are grouped into three PPA variants: nonfluent PPA, logopenic PPA, and semantic PPA. Grammatical deficiencies differ depending on the PPA variant.
Using an automated analysis of a short picture description task, this study showed that content versus function words can distinguish patients with nonfluent PPA, semantic PPA, and logopenic PPA variants. Verbs were less important as distinguishing features of patients with different PPA variants than earlier thought. Finally, the study showed that among the most important distinguishing features of PPA variants were elaborative speech elements, such as adjectives and adverbs.
  • American Journal of Speech-Language Pathology version: [PDF]
  • Link to GitHub page with source code: [CODE]

Assessment of Treatment Efficacy

The effects of Transcranial Direct Current Stimulation (tDCS) over the left Inferior Frontal Gyrus in patients with Apraxia of Speech (AOS) on consonant and vowel duration and speech fluency. tDCS coupled with speech intervention results in significantly shorter vowels and consonants than sham (Themistocleous, Webster, & Tsapkini, 2021).

Themistocleous Charalambos, Webster Kimberly, Tsapkini Kyrana (2021). Effects of tDCS on Sound Duration in Patients with Apraxia of Speech in Primary Progressive Aphasia. Brain Sciences, 11(3):335-553. https://www.mdpi.com/2076-3425/11/3/335.

Transcranial direct current stimulation (tDCS) over the left inferior frontal gyrus (IFG) was found to improve oral and written naming in post-stroke and primary progressive aphasia (PPA), speech fluency in stuttering, a developmental speech-motor disorder, and apraxia of speech (AOS) symptoms in post-stroke aphasia. This paper addressed the question of whether tDCS over the left IFG coupled with speech therapy improves sound duration in patients with apraxia of speech (AOS) symptoms in non-fluent PPA (nfvPPA/AOS) more than sham.
Segmental duration was significantly shorter after tDCS than after sham, and tDCS gains generalized to untrained words. The effects of tDCS were sustained for two months post-treatment in both trained and untrained sounds. Taken together, these results demonstrate that tDCS over the left IFG may facilitate speech production by reducing segmental duration. The results provide preliminary evidence that tDCS may maximize the efficacy of speech therapy in patients with nfvPPA/AOS.
  • Brain Sciences version: [PDF]

Computational Assessment Tools

Computational Assessment Tasks and Tools for Scoring Language Performance in Patients with Aphasia.

Automatic Scoring Tools for Automatic Scoring of Spelling and Phonology

These two tools enable the automatic scoring of phonology and of spelling performance. They run in the terminal. To download these apps visit:

Citing the apps
  • To cite the spelling assessment app, please cite the paper: Themistocleous, Charalambos, Neophytou, Kyriaki, Rapp, Brenda, & Tsapkini, Kyrana (2020). A Tool for Automatic Scoring of Spelling Performance. Journal of Speech, Language, and Hearing Research. doi:10.1044/2020_JSLHR-20-00177
  • To cite the Phonological Assessment App: Themistocleous, Charalambos (2021). A Tool for Automatic Scoring of Phonological distance.
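
As an illustration of what automatic scoring of this kind can look like, the sketch below scores a spelling response by its normalized Levenshtein (edit) distance from the target word. This is a minimal sketch under stated assumptions, not the code of the apps themselves: the function names levenshtein and spelling_score, and the choice to normalize by the length of the longer string, are made up for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def spelling_score(target: str, response: str) -> float:
    """Hypothetical score in [0, 1]: 1.0 for an exact match,
    lower values as more edits separate response from target."""
    longest = max(len(target), len(response))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(target, response) / longest
```

For example, spelling_score("elephant", "elefant") gives 0.75, because two edits separate the strings and the longer one has eight letters. Applying the same function to IPA transcriptions instead of spellings would give a rough phonological distance, again only as an assumption about how such a score could be built.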
Word Repetition App

The Word Repetition App is used by clinicians to test short-term memory. The application plays lists of semantically related words (clothes, appliances, etc.) and asks the participant to repeat them. During the repetition phase the software records the participant's production. The process repeats five times unless the clinician changes the number of repetitions.

Word Repetition Span App

The Word Repetition Span task may consist of word repetition or non-word repetition. Both spans begin at list length 2 and reach ceiling at list length 5. For this task, the computer says a string of words or non-words for the participant to repeat in order. The clinician then records the participant's responses in the order they were given. This task assesses the ability to repeat single words and single non-words. Depending on the accuracy and types of errors, it is sensitive to deficits of input and output processing.

NLP Editor

NLP Editor is a simple text editor that enables the linguistic analysis of texts. The current version allows users to convert text in Greek to IPA, count words, etc.

  • To access the source code of the project, visit: NLP Editor.
Phonetics IPA

Phonetics IPA is a platform for the linguistic analysis of texts in a Text-Editor-like environment. The current version allows users to

  • type text in Standard Modern Greek orthography and convert it to IPA.
  • type text in Cypriot Greek orthography and convert it to IPA.
  • create lists of words in reverse order for dictionaries.
  • analyze texts using the implemented regular expressions engine (currently only in the Windows version).

It was originally written in C# for Microsoft Windows; these versions are no longer under development. (There is also a newer version written in Python for testing; see details below.)

You may cite the software as follows:

Themistocleous Charalambos (2017). IPAGreek: Computational Greek Phonology. [Computer program]. Version 3.0, retrieved 21 August 2017 from https://charalambosthemistocleous.com

Themistocleous, Charalambos (2011). Computational Greek Phonology: IPAGreek. The 10th International Conference of Greek Linguistics. Komotini, Greece.
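
The reverse-order word lists mentioned among the features above can be sketched in a few lines of Python. This is only an illustration, not the IPAGreek implementation: the function name reverse_dictionary_order is hypothetical, and the sketch compares plain Unicode code points, whereas a real Greek reverse dictionary would probably need accent-aware collation.

```python
def reverse_dictionary_order(words):
    """Sort words by their spelling read right to left,
    so that words sharing an ending end up next to each other."""
    return sorted(words, key=lambda word: word[::-1])


# English words are used here only to keep the example easy to check.
sample = ["walking", "table", "talking", "cable", "sing"]
print(reverse_dictionary_order(sample))
# → ['cable', 'table', 'talking', 'walking', 'sing']
```

Sorting on the reversed string groups "-able" and "-ing" words together, which is what makes a reverse dictionary useful for studying suffixes and rhymes.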

Machine Learning and Sociolinguistics

Identification of speakers' dialect from vowels (Themistocleous, 2017a), fricatives (Themistocleous, 2017b; Themistocleous, Savva, & Aristodemou, 2016), stops (Themistocleous, 2016a), and sonorants (Themistocleous, 2019; Themistocleous, Fyndanis, & Tsapkini, 2022).

Themistocleous Charalambos (2019). Dialect Classification from a Single Sonorant Sound Using Deep Neural Networks. Frontiers in Communication. doi: 10.3389/fcomm.2019.00064.

Listeners do not require long productions of speech to identify the accent of a speaker; often a single sound suffices. This study shows that using machine learning and information from a speech segment, namely a single sonorant sound /m n l r/, it is possible to distinguish two dialects of Greek: Athenian Greek and Cypriot Greek. In future research, we will explore this approach further to identify medical conditions that influence speech production.

  • Frontiers in Communication 2019 version: [PDF]
  • Link to GitHub page with source code: [CODE]
Themistocleous Charalambos (2017). Dialect classification using vowel acoustic parameters. Speech Communication 94, 13–22.

This study provides a classification model of two Modern Greek dialects, namely Athenian Greek and Cypriot Greek, using information from speech. To this end, a large corpus of vowels from 45 speakers of Athenian Greek and Cypriot Greek was collected. The findings show that duration and the zeroth coefficient of F2, F3 and F4 contribute more to the classification of the dialect than the other measurements; they also show that formant dynamics are important for the classification of dialect.
Link to the paper: [PDF]
Themistocleous Charalambos (2017). The Nature of Phonetic Gradience across a Dialect Continuum: Evidence from Modern Greek Vowels. Phonetica 74, 157–172.

This study investigates the acoustic properties of vowels in Athenian Greek (AG) and Cypriot Greek. The findings show that (1) stressed vowels are more peripheral than unstressed vowels, (2) AG unstressed /i a u/ vowels are more raised than the corresponding CG vowels, (3) AG unstressed vowels are shorter than CG unstressed vowels, and (4) AG /i·u/ are more rounded than the corresponding CG vowels.

Link to the paper: [PDF]
Themistocleous Charalambos (2017). Effects of two linguistically proximal varieties on the spectral and coarticulatory properties of fricatives: Evidence from Athenian Greek and Cypriot Greek. Frontiers in Psychology.

The central thesis of this paper is that cross-dialectal studies of fricatives' acoustic structure can reveal patterns that distinguish speakers of different dialectal groups. The findings provide a solid evidence base for the manifestation of dialectal information in the acoustic structure of fricatives.

Frontiers 2017 Version: [PDF]
Themistocleous Charalambos (2016). The bursts of stops can convey dialectal information. Journal of the Acoustical Society of America EL 140(4), EL334–EL340.

This study investigates the effects of the dialect of the speaker on the spectral properties of stop bursts. Forty-five female speakers (20 Standard Modern Greek and 25 Cypriot Greek speakers) participated in this study. The spectral properties of stop bursts were calculated from the burst spectra and analyzed using spectral moments. The findings show that besides linguistic information, i.e., the place of articulation and the stress, the speech signals of bursts can encode social information, i.e., the dialects. A classification model using decision trees showed that skewness and standard deviation make a major contribution to the classification of bursts across dialects.
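
For readers unfamiliar with spectral moments, the sketch below shows one common way to compute the four moments (centre of gravity, standard deviation, skewness, kurtosis) from a spectrum. It is a minimal sketch, not the analysis code of the paper: the function name spectral_moments is hypothetical, the input is assumed to be a list of frequencies with matching non-negative spectral values (for instance power values), and kurtosis is reported as excess kurtosis.

```python
import math


def spectral_moments(frequencies, values):
    """Treat the spectrum as a probability distribution over frequency and
    return (centre_of_gravity, standard_deviation, skewness, excess_kurtosis)."""
    total = sum(values)
    if total <= 0:
        raise ValueError("the spectrum must contain some energy")
    # Normalize so that the weights sum to 1.
    weights = [v / total for v in values]
    mean = sum(f * w for f, w in zip(frequencies, weights))
    variance = sum(w * (f - mean) ** 2 for f, w in zip(frequencies, weights))
    sd = math.sqrt(variance)
    if sd == 0:
        # All energy sits at a single frequency; higher moments are undefined.
        return mean, 0.0, 0.0, 0.0
    third = sum(w * (f - mean) ** 3 for f, w in zip(frequencies, weights))
    fourth = sum(w * (f - mean) ** 4 for f, w in zip(frequencies, weights))
    skewness = third / sd ** 3
    excess_kurtosis = fourth / sd ** 4 - 3.0
    return mean, sd, skewness, excess_kurtosis
```

A spectrum whose energy is concentrated at low frequencies with a tail toward high frequencies yields positive skewness, while a symmetric spectrum yields a skewness of zero. Values like these could then serve as input features for a decision tree classifier.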

Link to the JASA paper: [PDF]

Statistical Models of Prosody

The production of prenuclear and nuclear pitch accents (Themistocleous, 2011, 2016b) and their grammar.

Themistocleous Charalambos (2016). Seeking an anchorage: Evidence from the tonal alignment of the Cypriot Greek prenuclear pitch accent. Language and Speech, 59(4):433–461.

By exploring the timing of the Cypriot Greek L*+H prenuclear pitch accent, this study tested the predictions of three hypotheses about tonal alignment: the invariance hypothesis, the segmental anchoring hypothesis, and the segmental anchorage hypothesis. The findings on the alignment of the high tone (H) are both intriguing and unexpected: the alignment of the H depends on the number of unstressed syllables that follow the prenuclear pitch accent. The 'wandering' of the H over multiple syllables is extremely rare among languages, and casts doubt on the invariance hypothesis and the segmental anchoring hypothesis, as well as indicating the need for a modified version of the segmental anchorage hypothesis. To address the alignment of the H, we suggest that it aligns within a segmental anchorage (the area that follows the prenuclear pitch accent) in such a way as to protect the paradigmatic contrast between the L*+H prenuclear pitch accent and the L+H* nuclear pitch accent.

Link to the Language and Speech paper: [PDF]
Themistocleous Charalambos (2014). Edge-Tone Effects and Prosodic Domain Effects on Final Lengthening. Linguistic Variation 14(1), 129–160.

This study reports two experiments that investigate edge-tone and domain-specific effects on final lengthening. The study shows that in Cypriot Greek the following occur: (a) lengthening applies primarily to the syllable nucleus, not the syllable onset, which suggests variety-specific effects of lengthening; (b) lengthening depends on the edge-tones, namely, polar questions trigger more lengthening than statements and wh-questions; (c) lengthening provides support for at least two distinct prosodic domains above the phonological word, the intonational phrase and the intermediate phrase; greater lengthening is associated with the former and less lengthening with the latter; (d) finally, syllable duration depends on the syllable's distance from the boundary. By pointing to the distinct lengthening effects of edge-tones and domain boundaries, these findings highlight the application of different lengthening devices.

Link to the PDF paper: [PDF]

Machine Learning and Language Acquisition

Identification of language acquisition (clitic acquisition) in multidialectal children.

Grohmann Kleanthes, Papadopoulou Elena and Themistocleous Charalambos (2017). Acquiring Clitic Placement in Bilectal Settings: Interactions between Social Factors. Frontiers in Communication.

The C5.0 machine-learning algorithm was employed to model the interaction of sociolinguistic factors on the development of clitic placement in bidialectal children. The model shows that speakers acquire the relevant features very early, yet compartmentalization of form and function according to style emerges only as they engage in the larger speech community.

Frontiers 2017 Version: [PDF]

Research Tools

Online Tools for the Study of Vocabulary in Dialects; an implementation for Cypriot Greek.

Online Cypriot Greek Dictionary

Cypriot Greek Dictionary. This is an online dictionary of Cypriot Greek with text-to-speech, developed as part of the ‘Syntychies’ research program. You can search for specific words using basic regular expressions. The environment was written in C#.

Presented in EURALEX 2012:

Themistocleous Charalambos, Marianna Katsoyannou, Spyros Armosti, and Kyriaci Christodoulou (2012). Cypriot Greek Lexicography: A Reverse Dictionary of Cypriot Greek. Paper presented at the 15th European Association for Lexicography (EURALEX) Conference, Oslo, Norway, 7 – 11 August 2012.

Link to the EURALEX paper: [PDF]

Themistocleous Charalambos, Marianna Katsoyannou, Spyros Armosti, and Kyriaci Christodoulou (2012). Cypriot Greek Lexicography: An online lexical database. Paper presented at the 15th European Association for Lexicography (EURALEX) Conference, Oslo, Norway, 7 – 11 August 2012.

Link to the EURALEX software demonstration paper: [PDF]

See also other lexicography projects.

Keyboard Layouts

Keyboard layouts for Windows and macOS. Often we need to type symbols that do not exist in the standard keyboard layouts. One way to solve this is to assign specific symbols within individual applications, such as Microsoft Word or LibreOffice Writer, but most applications do not offer this option. The best way to address the issue is to install a dedicated keyboard layout.

There are three keyboard layouts: one for writing Cypriot Greek, which includes the characters needed to produce the post-alveolars; one for accessing IPA symbols; and one for adding symbols when working with historical manuscripts (paleography).

You may find more updated information about Keyboard layouts [here].

  • Cypriot Greek Keyboard (macOS): A keyboard layout that facilitates writing Cypriot Greek text. [Click here to download].
  • Cypriot Greek Keyboard (Windows): A keyboard layout that facilitates writing Cypriot Greek text. [Click here to download].
  • IPA Keyboard Layout (Windows): A keyboard layout that facilitates writing texts with IPA symbols (mainly for Greek). [Click here to download].
  • Keyboard for paleographers (Windows): This keyboard layout includes special symbols used in paleography. [Click here to download].

Tools for Getting Things Done in the Lab - Lab Management Tools and Praat Scripts for Sound Analysis.

Project Management - GTD

This is Python code that generates folder structures for organizing your academic projects (you can modify it according to your needs).

Visit [Create Project Template] to access this code.
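
To give a flavor of the idea without reproducing the linked code, here is a minimal sketch of a folder-structure generator. Everything in it is an assumption made for illustration: the function name create_project, the DEFAULT_LAYOUT list, and the folder names are hypothetical and may differ from the actual template.

```python
from pathlib import Path

# Hypothetical layout for an academic project; edit it to suit your needs.
DEFAULT_LAYOUT = (
    "data/raw",
    "data/processed",
    "scripts",
    "analysis",
    "manuscript",
    "presentations",
)


def create_project(root, name, layout=DEFAULT_LAYOUT):
    """Create the project folder `name` under `root` with all subfolders
    listed in `layout`, and return the project path.
    Folders that already exist are left untouched."""
    project = Path(root) / name
    for subfolder in layout:
        # parents=True builds intermediate folders such as "data";
        # exist_ok=True makes repeated calls safe.
        (project / subfolder).mkdir(parents=True, exist_ok=True)
    return project
```

Calling create_project(".", "my_study") would create a my_study folder in the current directory with the subfolders listed above. Because existing folders are not overwritten, the function can be run again after adding new entries to the layout.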
Praat Scripts

Visit [GitHub repository] to download scripts for opening, saving, and manipulating sounds and Praat objects. I will be updating this repository as soon as I have something new to add.