Speech disfluency taxonomy

Every problem we diagnose — and how we fix it.

VMU is more than a pretty interface. It is an acoustic and linguistic analysis engine. Below is the full clinical taxonomy of the speech defects it detects, named with scientific terminology rather than the vague label "verbal parasites".

What we detect

Every item below is a measurable acoustic or linguistic signal. The AI counts, classifies and timestamps each instance.

Speech disfluencies

Phonation & fluency disorders

Filled pauses (hesitation markers)

Hesitation phenomena
uh · um · er · ehm · э-э · м-м

Non-lexical vocalizations inserted while the brain assembles the next clause. Drains perceived authority.
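As a rough illustration of counting and timestamping filled pauses, the sketch below scans a word-level transcript for the forms listed above. The `Word` record and its fields are assumptions made for this example, not the engine's actual data model.

```python
from dataclasses import dataclass

# Hypothetical word record; a real transcript may use a different shape.
@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float

# Filled-pause forms taken from the examples in this section.
FILLED_PAUSES = {"uh", "um", "er", "ehm", "э-э", "м-м"}

def find_filled_pauses(words):
    """Return (start_time, token) for every filled pause in the transcript."""
    hits = []
    for w in words:
        token = w.text.lower().strip(".,!?…")
        if token in FILLED_PAUSES:
            hits.append((w.start, token))
    return hits

words = [
    Word("Um,", 0.0, 0.3), Word("I", 0.4, 0.5), Word("think", 0.5, 0.8),
    Word("uh", 0.9, 1.2), Word("yes", 1.3, 1.6),
]
print(find_filled_pauses(words))  # [(0.0, 'um'), (0.9, 'uh')]
```

An exact-match lookup like this only works if the transcript is truly verbatim; many speech recognizers drop "uh" and "um" by default.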

Verbal stuttering / dysphemia

Developmental stuttering
I-I-I think · so-so-so

Involuntary repetition of phonemes, syllables or words. Often stress-triggered.

Whole-word repetitions

Iterative disfluency
the the · and and

Whole-word loops as the speaker buys cognitive time.
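One simple way to catch loops like "the the" is to look for runs of identical adjacent tokens. The function below is a minimal sketch under that assumption; the name `find_word_repetitions` is illustrative only.

```python
def find_word_repetitions(tokens):
    """Return (index, word, run_length) for each run of identical adjacent words."""
    runs = []
    i = 0
    while i < len(tokens):
        j = i + 1
        # Extend the run while the next token repeats the current one.
        while j < len(tokens) and tokens[j].lower() == tokens[i].lower():
            j += 1
        if j - i > 1:
            runs.append((i, tokens[i].lower(), j - i))
        i = j
    return runs

print(find_word_repetitions("the the cat and and and sat".split()))
# [(0, 'the', 2), (3, 'and', 3)]
```

Grammatical doubles such as "had had" or "that that" would also be flagged, so a real classifier would likely need pause timing or syntax to tell them apart.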

Self-corrections & restarts

Reformulation disfluency
I went— I mean, we went…

Mid-sentence reset that signals an unprepared mental model.

Long silent pauses

Unfilled (silent) pauses
…………

Pauses above 1.2 s mid-clause read as uncertainty.
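The 1.2 s rule can be sketched as a gap check between consecutive word timestamps. The tuple layout and the use of sentence-final punctuation as a stand-in for "not mid-clause" are assumptions of this example; real clause detection would need a parser.

```python
PAUSE_THRESHOLD_S = 1.2  # threshold stated in the text

def find_long_pauses(words, threshold=PAUSE_THRESHOLD_S):
    """words: list of (text, start_s, end_s) sorted by time.

    Returns (pause_start_s, pause_length_s) for each long gap that does not
    follow sentence-final punctuation.
    """
    pauses = []
    for prev, nxt in zip(words, words[1:]):
        if prev[0].endswith((".", "?", "!")):
            continue  # pause at a sentence boundary, not mid-clause
        gap = nxt[1] - prev[2]
        if gap > threshold:
            pauses.append((prev[2], round(gap, 3)))
    return pauses

words = [("so", 0.0, 0.2), ("we", 0.3, 0.5), ("decided", 2.0, 2.6)]
print(find_long_pauses(words))  # [(0.5, 1.5)]
```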

Filler words & verbal parasites

Lexical fillers (discourse particles)

Discourse markers

Pragmatic fillers
like · you know · basically · literally · actually · so · I mean · right

Empty connective tissue that dilutes propositional content.
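Because some fillers span two words ("you know", "I mean"), a plain token lookup is not enough. The sketch below uses one regular expression with longer phrases listed first; it is a naive illustration, not the engine's classifier.

```python
import re
from collections import Counter

# Filler list taken from the examples above.
FILLERS = ["you know", "i mean", "basically", "literally",
           "actually", "like", "so", "right"]

# Longest phrases first so multi-word fillers are tried before shorter ones.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(f) for f in sorted(FILLERS, key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)

def count_fillers(text):
    """Count case-insensitive filler matches in a transcript string."""
    return Counter(m.group(1).lower() for m in _PATTERN.finditer(text))

counts = count_fillers("So, like, you know, it basically works. I mean, it actually works.")
print(sum(counts.values()))  # 6
```

Words such as "like", "so" and "right" are often legitimate content words ("I like it", "turn right"), so pure string matching will overcount; telling the filler use from the lexical use needs context.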

Slavic verbal parasites

Lexical fillers (RU/UK)
короче · типа · как бы · в общем · это самое · ну · значит · понимаешь

High-frequency lexical fillers typical of Russian and Ukrainian speech.

Vague quantifiers

Approximator hedges
kind of · sort of · maybe-ish · в принципе

Hedging tokens that signal low conviction.

Syntactic & grammatical errors

Morphosyntactic deviation

Case agreement errors

Inflectional disagreement
wrong case endings in UK/RU · subject-verb mismatch in EN

Endings, gender, number or case do not agree across the clause.

Illogical / non-cohesive speech

Cohesion breakdown
topic drift · dangling references

Sentences do not link — listener loses the thread.

Anacoluthon

Mid-sentence syntax break
starts one structure, ends in another

Common under cognitive load; reduces clarity score.

Register & social risk

Sociolinguistic register failures

Profanity & taboo lexicon

Taboo language
explicit, vulgar, blasphemous lexemes

Detected and flagged; configurable strictness per audience (kids, B2B, public).
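Configurable strictness could be modeled as a severity floor per audience. Everything in the sketch below is hypothetical: the three severity tiers, the floor values, and the placeholder lexicon entries are assumptions for illustration.

```python
from enum import Enum

class Audience(Enum):
    KIDS = "kids"
    B2B = "b2b"
    PUBLIC = "public"

# Hypothetical severity tiers: 1 = mild, 2 = vulgar, 3 = explicit.
# Placeholder tokens stand in for a real lexicon.
LEXICON = {"darn": 1, "vulgar_word": 2, "explicit_word": 3}

# Assumed minimum severity that gets flagged for each audience.
MIN_FLAGGED_SEVERITY = {Audience.KIDS: 1, Audience.B2B: 2, Audience.PUBLIC: 3}

def flag_profanity(tokens, audience):
    """Return the tokens whose severity meets the audience's floor."""
    floor = MIN_FLAGGED_SEVERITY[audience]
    return [t for t in tokens if LEXICON.get(t.lower(), 0) >= floor]

sample = ["darn", "vulgar_word", "hello"]
print(flag_profanity(sample, Audience.KIDS))    # ['darn', 'vulgar_word']
print(flag_profanity(sample, Audience.PUBLIC))  # []
```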

Inappropriate slang

Register mismatch
street register in formal context

Wrong register for the social setting.

Prosody & paralinguistic stress

Prosodic & paralinguistic markers

Anxiety / nervousness

Vocal stress signature
↑ pitch · ↑ jitter · ↑ shimmer

Acoustic markers of autonomic arousal, detected from pitch and from cycle-to-cycle period and amplitude perturbation.
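Jitter and shimmer are commonly reported as "local" perturbation: the mean absolute difference between consecutive glottal cycles, divided by the mean. The sketch below assumes cycle periods and peak amplitudes have already been extracted by a pitch tracker; the input values are made up for illustration.

```python
def local_perturbation(values):
    """Mean absolute difference between consecutive values, relative to their mean.

    Applied to cycle periods this gives local jitter; applied to per-cycle
    peak amplitudes it gives local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))

periods_ms = [5.0, 5.1, 4.9, 5.0]       # illustrative cycle lengths
amplitudes = [0.80, 0.76, 0.84, 0.80]   # illustrative peak amplitudes

jitter = local_perturbation(periods_ms)
shimmer = local_perturbation(amplitudes)
print(f"jitter={jitter:.2%} shimmer={shimmer:.2%}")
```

Both values are ratios, so they can be compared across speakers with different pitch and loudness.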

Excessive speech rate

Tachylalia
> 180 WPM sustained

Listener comprehension drops; sounds defensive.
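The text gives 180 WPM as the limit but does not define "sustained". The sketch below assumes non-overlapping 10-second windows, which is an arbitrary choice for illustration.

```python
WPM_LIMIT = 180   # limit stated in the text
WINDOW_S = 10.0   # assumed window length for "sustained"

def fast_windows(word_starts, window_s=WINDOW_S, limit=WPM_LIMIT):
    """Return (window_start_s, wpm) for each non-overlapping window above the limit."""
    if not word_starts:
        return []
    flagged = []
    t = 0.0
    last = word_starts[-1]
    while t <= last:
        n = sum(1 for s in word_starts if t <= s < t + window_s)
        wpm = n * 60.0 / window_s
        if wpm > limit:
            flagged.append((t, wpm))
        t += window_s
    return flagged

# 40 words in the first 10 s (240 WPM), then 20 words in the next 10 s (120 WPM).
starts = [i * 0.25 for i in range(40)] + [10 + i * 0.5 for i in range(20)]
print(fast_windows(starts))  # [(0.0, 240.0)]
```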

Monotone delivery

Reduced prosodic range
flat F0 contour

Loss of melodic contour kills persuasion.
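A flat F0 contour can be quantified as a pitch span in semitones. The sketch below assumes an F0 track in hertz where 0 marks unvoiced frames, and uses the 10th to 90th percentile span to reduce the effect of tracking outliers; no cutoff for "monotone" is given in the text, so none is hard-coded here.

```python
import math

def f0_range_semitones(f0_hz):
    """Semitone span between roughly the 10th and 90th percentile of voiced F0."""
    voiced = sorted(f for f in f0_hz if f > 0)  # 0 = unvoiced frame (assumed convention)
    if len(voiced) < 2:
        return 0.0
    lo = voiced[round(0.1 * (len(voiced) - 1))]
    hi = voiced[round(0.9 * (len(voiced) - 1))]
    return 12.0 * math.log2(hi / lo)

flat = [120.0] * 20
lively = [100.0] * 5 + [0.0] * 3 + [200.0] * 5
print(f0_range_semitones(flat))    # 0.0
print(f0_range_semitones(lively))  # 12.0
```

Semitones are used instead of hertz so that speakers with low and high voices are measured on the same perceptual scale.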

Methodology

Two-mode correction protocol

Detection is step zero. Real change happens in two complementary loops — passive and active.

Mode 1 · Live haptic guard

Passive · always-on · zero cognitive overhead

The phone listens in the background. Every time you say a filler, a parasite, or a flagged word, your device vibrates with a single tactile pulse. No screen, no shame: just a private somatic signal, delivered right after the slip, that is meant to weaken the habit through operant conditioning over 14–30 days.

  • On-device VAD + keyword spotting (privacy-preserving)
  • Per-category sensitivity: fillers, profanity, repetitions
  • Daily count, streak, downward trend graph
  • Silent mode for meetings
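The feature list above can be sketched as a small state object: per-category switches, a running count, and a silent flag. The class below is a hypothetical illustration, and it assumes that silent mode keeps counting while suppressing the vibration.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class HapticGuard:
    # Per-category switches (hypothetical representation of "sensitivity").
    enabled: dict = field(default_factory=lambda: {
        "fillers": True, "profanity": True, "repetitions": True,
    })
    silent: bool = False  # assumed: count detections but do not vibrate
    counts: Counter = field(default_factory=Counter)

    def on_detection(self, category):
        """Record a detection; return True if the device should pulse once."""
        if not self.enabled.get(category, False):
            return False
        self.counts[category] += 1
        return not self.silent

guard = HapticGuard()
print(guard.on_detection("fillers"))    # True
guard.silent = True
print(guard.on_detection("profanity"))  # False
print(guard.counts["profanity"])        # 1
```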

How the engine actually works

STEP_01

Capture

The browser or device microphone streams 16 kHz mono PCM into the analyzer.

STEP_02

Acoustic features

VAD, F0, jitter, shimmer, spectral tilt — extracted per 25 ms frame.
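At 16 kHz, a 25 ms frame is 400 samples. The sketch below cuts a signal into such frames and applies a simple energy-based VAD. The non-overlapping framing and the 0.02 RMS threshold are assumptions for illustration, and energy thresholding is only one of several VAD techniques.

```python
import math

SAMPLE_RATE = 16_000                         # Hz, from the capture step
FRAME_MS = 25                                # frame length from the text
FRAME_LEN = SAMPLE_RATE * FRAME_MS // 1000   # 400 samples

def frame_rms(samples):
    """Split into non-overlapping 25 ms frames and return the RMS of each."""
    out = []
    for i in range(0, len(samples) - FRAME_LEN + 1, FRAME_LEN):
        frame = samples[i:i + FRAME_LEN]
        out.append(math.sqrt(sum(x * x for x in frame) / FRAME_LEN))
    return out

def energy_vad(samples, threshold=0.02):
    """Mark a frame as speech when its RMS exceeds the (assumed) threshold."""
    return [rms > threshold for rms in frame_rms(samples)]

silence = [0.0] * FRAME_LEN
tone = [0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(FRAME_LEN)]
print(energy_vad(silence + tone))  # [False, True]
```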

STEP_03

Transcription

Gemini 2.5 native audio returns verbatim transcript with timestamps.

STEP_04

Linguistic parse

A morphosyntactic analyzer detects agreement errors, register mismatches and cohesion breaks.

STEP_05

Classification

Every disfluency tagged by category from the taxonomy above.

STEP_06

Correction loop

Haptic guard fires in real time; tutor session generates personalized drills.

Record 60 seconds. Get your full diagnosis.

Try the live demo