User Manual

Comprehensive guide for students, teachers, and administrators.
Press F1 for Help

Technical Requirements

High-quality audio recording is essential for accurate vowel analysis. Please review these settings before starting.

Recommended Browsers

  • Google Chrome (Desktop/Android): Best Choice. Most stable Web Audio API support.
  • Microsoft Edge: Excellent alternative.
  • Firefox: Fully supported.
  • AVOID: Opera Mini, UC Browser, and In-App Browsers (Facebook/Line/Instagram).

Audio Modes & Environment

High Fidelity Mode (Recommended)

By default, the app uses High Fidelity Mode. This captures your raw voice without any artificial enhancement. This is critical for accurate analysis but comes with a requirement:

Quiet Room Required: Because noise suppression is disabled, the system will hear fans, AC units, and background chatter. Please ensure you are in a silent environment.

Critical Settings: Disable "Data Savers"

Important

"Data Saver", "Turbo Mode", or "Lite Mode" features route traffic through proxy servers. This blocks the microphone or compresses audio, making analysis impossible.

  • Chrome (Android): Go to Settings > Lite Mode (if available) > Select OFF.
  • Opera Mobile / Mini: Go to Settings > Data Saver > Select OFF.
  • Android System: Go to Settings > Network & Internet > Data Saver and make sure it is OFF.

Hardware Recommendation

Wired Headset

Best consistency (44.1 kHz/48 kHz). No connection drops.

Bluetooth/AirPods

Often drops to "Hands-Free" quality (8 kHz), which is too low for accurate analysis.


Registration & Access

The system distinguishes between Students and Teachers during the registration process.

For Students

  • Go to the Register page.
  • Fill in your Username, First/Last Name, and Password.
  • Required: Enter your exact 10-digit Student ID.
  • Leave the "Invite Code" field blank.
  • Check the consent box and submit.

For Teachers

  • Obtain an Invite Code from a system administrator.
  • Go to the Register page.
  • Fill in your Username, First/Last Name, and Password.
  • Enter the code in the "Invite Code" field.
  • Note: When a valid code is detected, the "Student ID" requirement is automatically removed.
  • Submit the form to create your Instructor account.

Account Security & Recovery

Forgot Password?
If you lose access to your account, click the "Forgot Password?" link on the login page. Enter your registered email address to receive a secure reset link. This link expires in 10 minutes.

Account Lockout Policy:
To protect your account, the system will temporarily lock access after 5 consecutive failed login attempts. You must wait 15 minutes before trying again.
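
The lockout rule can be sketched as follows. This is a minimal illustration, not the application's actual code: the class name `LoginGuard` and its methods are hypothetical, and only the two thresholds (5 attempts, 15 minutes) come from the policy above.

```python
from datetime import datetime, timedelta

MAX_FAILED_ATTEMPTS = 5                    # from the policy above
LOCKOUT_DURATION = timedelta(minutes=15)   # from the policy above


class LoginGuard:
    """Hypothetical helper tracking consecutive failed logins for one account."""

    def __init__(self):
        self.failed_attempts = 0
        self.locked_until = None

    def is_locked(self, now):
        """True while the lockout window is still running."""
        return self.locked_until is not None and now < self.locked_until

    def record_failure(self, now):
        """Count a failed login and start the lockout on the fifth in a row."""
        self.failed_attempts += 1
        if self.failed_attempts >= MAX_FAILED_ATTEMPTS:
            self.locked_until = now + LOCKOUT_DURATION
            self.failed_attempts = 0

    def record_success(self):
        """A successful login clears the counter and any lock."""
        self.failed_attempts = 0
        self.locked_until = None
```

In this sketch a successful login resets the counter, which is what makes the rule apply to consecutive failures only.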

Password Requirements:
Passwords must be at least 8 characters long and include uppercase, lowercase, numbers, and special symbols. You cannot reuse your last 3 passwords.
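
As an illustration of these rules, the following sketch checks a candidate password. It is not the application's validator: the function name `password_problems` is hypothetical, "special symbols" is assumed to mean ASCII punctuation, and plain strings stand in for the password history only to keep the example short.

```python
import string

MIN_LENGTH = 8       # from the requirements above
HISTORY_DEPTH = 3    # from the requirements above


def password_problems(candidate, previous_passwords=()):
    """Return a list of rule violations; an empty list means the password passes.

    previous_passwords is ordered oldest to newest. A real system would
    compare password hashes rather than plain strings.
    """
    problems = []
    if len(candidate) < MIN_LENGTH:
        problems.append("too short")
    if not any(c.isupper() for c in candidate):
        problems.append("no uppercase letter")
    if not any(c.islower() for c in candidate):
        problems.append("no lowercase letter")
    if not any(c.isdigit() for c in candidate):
        problems.append("no number")
    # Assumption: "special symbol" means any ASCII punctuation character.
    if not any(c in string.punctuation for c in candidate):
        problems.append("no special symbol")
    if candidate in list(previous_passwords)[-HISTORY_DEPTH:]:
        problems.append("reused recent password")
    return problems
```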


Student Guide

The student interface is designed for a structured learning path. You cannot skip ahead.

Learning Phases

  • Pre-Test: The first phase. You must record all 20 words from the list to establish a baseline. You cannot switch to Post-Test until this is complete.
  • Post-Test: Unlocked after completing the Pre-Test. This phase serves as your training and final assessment.
  • Strict Mode: The system enforces order. Ensure you complete each word's recording before moving to the next phase.

1. Check Selected Word

The app automatically selects the first incomplete word for you ("Smart Resume"). You can also choose a different word from the left sidebar manually. Correctly pronounced words show a green checkmark.

2. Listen & Practice

Click the "Listen to Model" button to hear the reference pronunciation. The blue waveform represents the native pronunciation.

3. Record & Submit

Click the "Record" button to start.

Important Tips:

  • Auto-Stop: The system stops automatically when you stop speaking. Articulate ending consonants clearly (e.g. the "t" in "cat").
  • No Noise: Do not smack your lips or click your tongue before speaking, as this triggers the recording early.

After recording, listen to your playback. If satisfied, click SAVE.
Once saved, the Next ( > ) button will activate. Click it to move to the next word.
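
The two tips above follow from how level-based auto-stop typically behaves. The sketch below is an assumption about the general technique, not the app's actual detector: the function names, the threshold of 0.02, and the three-frame silence window are all illustrative.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of samples in the range -1.0 to 1.0."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def find_stop_frame(frames, threshold=0.02, silent_frames_needed=3):
    """Return the index of the frame at which recording would stop, or None.

    Recording counts as started at the first frame at or above the
    threshold, and stops once `silent_frames_needed` consecutive quiet
    frames follow it. All three values are illustrative assumptions.
    """
    started = False
    quiet_run = 0
    for index, frame in enumerate(frames):
        if rms(frame) >= threshold:
            started = True
            quiet_run = 0
        elif started:
            quiet_run += 1
            if quiet_run >= silent_frames_needed:
                return index
    return None
```

Under these assumptions, a lip smack followed by a pause starts and then ends the recording before the word is spoken, and a very soft final consonant is treated as silence.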

4. Analyze Feedback

After submitting, you will see your score, expressed as the Bark distance between the acoustic features of the vowel in your pronunciation and the vowel in the reference pronunciation.

Understanding the Bark score and colors:
Green: Bark score less than 1.5. Excellent!
Orange: Bark score from 1.5 up to, but not including, 3.0. Good.
Red: Bark score greater than or equal to 3.0. Try Again.
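
The color bands can be written as a small function. This is a sketch of the thresholds listed above; the function name `score_color` is hypothetical.

```python
def score_color(bark_distance):
    """Map a Bark distance to the feedback color described above."""
    if bark_distance < 1.5:
        return "green"    # Excellent
    if bark_distance < 3.0:
        return "orange"   # Good
    return "red"          # Try again
```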

5. Recommendations

After submitting, you will see recommendations for improvement if your score is more than 1.5. The system analyzes several formants (resonant frequencies in your speech) and compares them with the native pronunciation. Recommendations might include suggestions to open your mouth more or close it slightly, or to move your tongue forward or backward.
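
One way such recommendations could be derived is sketched below. It relies only on the relationships given in the glossary (higher F1 means a more open jaw, higher F2 means a more forward tongue). The function name `recommend`, the use of Bark units for the inputs, and the 0.5 tolerance are assumptions, not the app's actual rules.

```python
def recommend(student_f1, student_f2, ref_f1, ref_f2, tolerance=0.5):
    """Suggest articulation changes from formant differences (values in Bark).

    The tolerance is a hypothetical dead zone inside which no advice is given.
    """
    tips = []
    if student_f1 < ref_f1 - tolerance:
        tips.append("Open your mouth more.")        # F1 too low: jaw too closed
    elif student_f1 > ref_f1 + tolerance:
        tips.append("Close your mouth slightly.")   # F1 too high: jaw too open
    if student_f2 < ref_f2 - tolerance:
        tips.append("Move your tongue forward.")    # F2 too low: tongue too far back
    elif student_f2 > ref_f2 + tolerance:
        tips.append("Move your tongue backward.")   # F2 too high: tongue too far forward
    return tips
```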

6. Troubleshooting: Data Logging

If you experience technical issues, you can enable the LOG checkbox located at the bottom of the recording card.
This sends detailed diagnostic data (browser version, audio sample rate, errors) to the server to help the administrator debug your problem.


Teacher Guide

The Teacher Dashboard provides a high-level overview of class performance and individual student diagnostics.

Navigation & Filtering

  • Search: Use the search bar at the top of the Student Directory to filter by name or ID.
  • Pagination: The table displays 10 students per page. Use the "Next" and "Previous" buttons to navigate large classes.

Reading the Dashboard Table

Columns:

  • Progress: Two bars showing Pre-Test and Post-Test completion percentage. 100% = 20 words submitted.
  • VTLN Alpha: Vocal Tract Length Normalization factor. A numeric value representing the acoustic scaling applied to the student's voice. Typical values: Children (0.8-0.9), Women (1.0), Men (1.1-1.2).
  • Flags:
    • Outlier: Indicates one or more recordings have a Bark distance > 5.0 (review required).
    • Deep Voice: Indicates the system applied specialized logic for low-frequency formants (F2 < 1500 Hz).
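
The Progress and Outlier values can be reproduced from the figures given above. The sketch below uses hypothetical function names and covers only those two values; the Deep Voice logic is not specified in enough detail here to illustrate.

```python
WORDS_PER_PHASE = 20            # 100% = 20 words submitted
OUTLIER_THRESHOLD_BARK = 5.0    # Outlier flag threshold


def progress_percent(words_submitted):
    """Completion percentage for one phase (Pre-Test or Post-Test)."""
    return min(words_submitted, WORDS_PER_PHASE) * 100 // WORDS_PER_PHASE


def has_outlier(bark_distances):
    """True if at least one recording exceeds the outlier threshold."""
    return any(d > OUTLIER_THRESHOLD_BARK for d in bark_distances)
```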

Student Details

Clicking on a student's name opens their detailed profile. You can:

  • View their full submission history timeline.
  • Filter: Use the dropdown menu to toggle between "All", "Pre-Test Only", or "Post-Test Only" submissions.
  • Listen to their specific audio files to verify quality.
  • See the exact VTLN normalization factor and Bark score for each recording.
  • See correction and outlier flags for each recording.
  • View the vowel formant comparison chart for each recording.

Research Panel

The Acoustic Research Dashboard serves as a comprehensive tool for analyzing the phonetic accuracy of student pronunciations. Its primary purpose is to visualize the discrepancies between student speech and native reference models.

Analysis Visualizations

1. Vowel Space (F1 vs F2)

Maps each vowel sound based on tongue height (F1) and backness (F2). This scatter plot reveals clustering patterns and how distinct a student's vowels are from each other.

2. Error Distribution (Bark)

A histogram showing the frequency of different error magnitudes. It helps identify if the class generally struggles (high error peak) or performs well (low error peak).
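
A histogram of this kind can be computed by grouping Bark distances into fixed-width bins. The sketch below is illustrative only; the function name `error_histogram` and the 0.5 Bark bin width are assumptions.

```python
import math
from collections import Counter


def error_histogram(bark_distances, bin_width=0.5):
    """Count recordings per error bin; keys are the lower edge of each bin."""
    counts = Counter(
        math.floor(d / bin_width) * bin_width for d in bark_distances
    )
    return dict(sorted(counts.items()))
```

A peak in the lowest bins suggests the class performs well; a peak in the higher bins suggests a general difficulty.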

3. F1 Correlation (Height)

Plots Student F1 against Reference F1 with a linear trend line. Deviations from the diagonal indicate consistent issues with jaw opening (e.g., mouth too closed).

4. F2 Correlation (Backness)

Plots Student F2 against Reference F2 with a linear trend line. Deviations here highlight systematic errors in tongue advancement (e.g., tongue too far back).
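
The trend line in both correlation plots can be obtained with an ordinary least-squares fit. The sketch below shows the standard calculation; the function name `fit_trend_line` is hypothetical, and whether the dashboard fits its line this way is an assumption.

```python
def fit_trend_line(reference, student):
    """Least-squares line student = slope * reference + intercept."""
    n = len(reference)
    mean_x = sum(reference) / n
    mean_y = sum(student) / n
    sxx = sum((x - mean_x) ** 2 for x in reference)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(reference, student))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

A slope near 1 with a negative intercept on the F1 plot, for example, means the student's F1 is consistently below the reference, which corresponds to a mouth that is too closed.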

5. Raw Data Table

The table provides granular data for every single recording submitted.

Columns:

  • Student: The Username and ID of the speaker.
  • Word / Vowel: The target word and its primary stressed vowel (IPA) being analyzed.
  • F1 / F2 (Student): The normalized first and second formants extracted from the student's recording.
  • F1 / F2 (Ref): The target formants from the native speaker reference model.
  • Dist (Bark): The perceptual distance between the student's vowel and the reference. Lower is better.
  • Alpha: The VTLN scaling factor used.

Admin Guide

1. Critical System Operations (CLI)

Some operations are restricted to the server command line for security.

# Create a new Admin user
flask create-admin [username] [password]

# Reset a user's password (respects history)
flask reset-password [username]

# Delete any user (including admins)
flask delete-user [username]

# Bulk Reset Passwords (Teacher/Student only)
flask bulk-reset --role [student|teacher]

# Examples:
flask create-admin superuser Pass123!
flask reset-password student1
flask bulk-reset --role student

2. Web Control Panel Features

User Management

Located in the "Users" summary card. The directory lists 10 users per page.

  • Edit Role: You can promote a 'Student' to 'Teacher' or 'Admin'. An 'Admin' cannot be changed to 'Student' or 'Teacher'.
  • Assign Student ID: Link a physical university ID to a user account.
  • Manage Invite Codes:
    • Generate: Click "Generate Code" to create a random 10-character key.
    • Distribute: Give this code to instructors for registration.
    • Delete: Remove unused codes to prevent unauthorized access. (Used codes are locked and can only be deleted together with the teacher they are assigned to.)
  • Delete User: WARNING: This is a destructive action. It permanently deletes the user account AND all their uploaded audio files/submissions from the disk. This cannot be undone.
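
For reference, a random 10-character key like the one produced by "Generate Code" can be created with Python's standard `secrets` module. This is a sketch, not the application's generator: the function name and the alphabet (uppercase letters and digits) are assumptions.

```python
import secrets
import string

CODE_LENGTH = 10                                     # from the description above
ALPHABET = string.ascii_uppercase + string.digits    # assumption


def generate_invite_code():
    """Return a random code drawn from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(CODE_LENGTH))
```

Using `secrets` rather than `random` matters here because invite codes grant instructor access and should not be predictable.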

Word Management

Manage the list of words.

  • Fetch / Auto-Generate: Enter a word to auto-fetch its IPA and Cambridge pronunciation audio.
  • Waveform Editor: The reference audio is visualized as a waveform.
  • Manual Correction:
    • Download: Save the current reference audio to your disk.
    • Upload: Replace it with your own custom MP3 (e.g., if the auto-generated one is poor).
  • Delete Word: Cascades to delete all student submissions for this word.

System Configuration & Health

Control global application settings via toggle switches.

  • Maintenance Mode: When enabled, only Admins can access the site. All other users are blocked. Use for updates.
  • Registration Open: Toggle OFF to block new account creation.
  • Enable Logging: When enabled, the client application sends detailed usage logs to the server. Useful for debugging specific user issues.
  • System Logs: Button to download the full server-side log file (`pronounce.log`).

Health Metrics (Footer)

  • DB Size: Current size of the database (Storage usage).
  • CPU Load: Real-time server processor usage percentage.

Glossary of Scientific Terms

Formants (F1 & F2)

Resonant frequencies of the vocal tract that determine vowel quality.

F1 (First Formant): Correlates with jaw opening. Higher F1 = Lower jaw (e.g., "ae" in 'cat').
F2 (Second Formant): Correlates with tongue advancement. Higher F2 = Tongue forward (e.g., "iy" in 'seed').

Bark Scale

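A frequency scale that approximates how the human ear resolves frequency. Unlike the Hertz (Hz) scale, which is linear, the Bark scale is roughly linear at low frequencies (below about 500 Hz) and approximately logarithmic above that, making it better for measuring perceptual differences in speech.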
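
As an illustration, the conversion from Hz to Bark can be done with the widely used Traunmüller (1990) approximation, shown here without its corrections for the extreme low and high ends of the scale. Whether this app uses this particular formula is an assumption; the function name `hz_to_bark` is hypothetical.

```python
def hz_to_bark(frequency_hz):
    """Convert a frequency in Hz to Bark (Traunmüller 1990 approximation)."""
    return 26.81 * frequency_hz / (1960.0 + frequency_hz) - 0.53
```

With this formula, a 100 Hz step near 500 Hz covers more of the Bark scale than the same step near 3000 Hz, which reflects the ear's coarser resolution at high frequencies.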

VTLN (Alpha)

Vocal Tract Length Normalization. Men, women, and children have different vocal tract lengths, which shifts their formant frequencies.

Comparison logic uses an "Alpha" factor to mathematically scale a student's voice to match the reference speaker model, allowing for accurate grading regardless of the speaker's age or gender.
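
A minimal sketch of the idea, assuming the simplest form of VTLN in which each formant is multiplied by Alpha. This assumption agrees with the typical values listed in the Teacher Guide (children below 1.0, men above 1.0), but the app's actual warping function may differ; the function name is hypothetical.

```python
def normalize_formants(f1_hz, f2_hz, alpha):
    """Scale raw formants by alpha (assumed linear warping)."""
    return f1_hz * alpha, f2_hz * alpha
```

For example, under this assumption a child's formants are scaled down with an Alpha of about 0.8 to 0.9, while an adult male's are scaled up with an Alpha of about 1.1 to 1.2.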

Euclidean Distance

The straight-line distance between two points. In this app, it measures the "error" between your vowel pronunciation and the target vowel in the F1-F2 space.

Low Distance (< 1.5 Bark): Excellent match.
High Distance (>= 3.0 Bark): Significant deviation.
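
The calculation can be sketched as follows. The function names are hypothetical, and the Hz-to-Bark step uses the Traunmüller (1990) approximation as an assumption about the conversion; the distance itself is the standard Euclidean formula over the two formants.

```python
import math


def hz_to_bark(frequency_hz):
    """Convert Hz to Bark (Traunmüller 1990 approximation, an assumption here)."""
    return 26.81 * frequency_hz / (1960.0 + frequency_hz) - 0.53


def bark_distance(student_hz, reference_hz):
    """Euclidean distance in F1-F2 space after converting both points to Bark.

    Each argument is an (F1, F2) pair in Hz.
    """
    d_f1 = hz_to_bark(student_hz[0]) - hz_to_bark(reference_hz[0])
    d_f2 = hz_to_bark(student_hz[1]) - hz_to_bark(reference_hz[1])
    return math.hypot(d_f1, d_f2)
```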