Features
Version 1.5.0
Introduced a rudimentary Graphical User Interface (GUI)
Customizable settings including preferred encodings, CSV separators, and GUI layout options
Corpus Loading
Integration of a speaker JSON file for defining speaker-independent variables
Capability to load pre-existing datasets
Visual representation of the loaded dataset and defined variables
Dataset Creation
Option to continue from where you left off in previous sessions
Structured error reporting that lets dataset generation continue after a problem is found. Issues are categorized by severity: info, warning, and error.
Built-in stop button for halting dataset generation when needed
GUI window to define regex rules, n-gram context, and data generation fields
Optimized dataset creation process, now 25 times faster
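The severity-based error reporting listed above can be illustrated with a minimal sketch. This is not the tool's actual implementation: the `Severity`, `Issue`, and `process_lines` names, the `speaker: utterance` line format, and the specific checks are all assumptions made for illustration. The point is only that problems are collected and categorized rather than raised, so processing continues.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # The three levels named in the feature list.
    INFO = 1
    WARNING = 2
    ERROR = 3


@dataclass
class Issue:
    severity: Severity
    line_no: int
    message: str


def process_lines(lines):
    """Parse hypothetical 'speaker: utterance' lines, collecting issues instead of raising."""
    rows, issues = [], []
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            issues.append(Issue(Severity.INFO, line_no, "empty line skipped"))
            continue
        if ":" not in line:
            # The line is unusable, but the loop keeps going.
            issues.append(Issue(Severity.ERROR, line_no, "no speaker prefix"))
            continue
        speaker, utterance = line.split(":", 1)
        if not utterance.strip():
            issues.append(Issue(Severity.WARNING, line_no, "empty utterance"))
        rows.append((speaker.strip(), utterance.strip()))
    return rows, issues


rows, issues = process_lines(["A: hello", "", "no prefix here", "B:"])
for issue in issues:
    print(issue.severity.name, issue.line_no, issue.message)
```

Returning the issues alongside the rows lets a caller decide afterwards whether any error-level entry should invalidate the run.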
Automatic Analysis
Integrated window for automatic analysis tasks
Support for various analyses including:
k-means clustering
Pairwise frequency computation
Cross-tabulation
t-test statistical analysis
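As a rough illustration of what a cross-tabulation over a generated dataset computes, here is a small standard-library sketch. The rows, the column names (`age_group`, `variant`), and the `cross_tabulate` helper are hypothetical and do not reflect the tool's own data model or code.

```python
from collections import Counter

# Hypothetical rows shaped like a generated dataset (column names are assumptions).
rows = [
    {"speaker": "A", "age_group": "young", "variant": "t"},
    {"speaker": "A", "age_group": "young", "variant": "glottal"},
    {"speaker": "B", "age_group": "old", "variant": "t"},
    {"speaker": "B", "age_group": "old", "variant": "t"},
    {"speaker": "C", "age_group": "young", "variant": "glottal"},
]


def cross_tabulate(rows, row_key, col_key):
    """Count how often each (row_key, col_key) value pair occurs."""
    counts = Counter((r[row_key], r[col_key]) for r in rows)
    row_labels = sorted({r[row_key] for r in rows})
    col_labels = sorted({r[col_key] for r in rows})
    # Counter returns 0 for unseen pairs, so every cell is filled.
    return {rl: {cl: counts[(rl, cl)] for cl in col_labels} for rl in row_labels}


table = cross_tabulate(rows, "age_group", "variant")
for label, cells in table.items():
    print(label, cells)
```

The same `Counter` over value pairs is also the core of a pairwise frequency count; the cross-tabulation only arranges those counts into a grid.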
Version 1.0.0
Corpus Loading
Support for loading multiple corpus files
Compatibility with International Phonetic Alphabet (IPA) characters and UTF-16 encoding
Recognition of different speakers within a single file
Variable Definition
Define custom variables depending on specific research questions
Support for both independent (e.g. age, gender, nationality) and dependent variables, defined in JSON files
Flexibility with broad annotation rules using regular expressions
Dataset Creation
Creation of structured datasets in CSV format
Automated data cleaning and preprocessing for extracted annotations
Detailed output files, including annotation information, missed annotations, and corpus statistics
Efficient error-tracking for consistency and accuracy
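To make the path from regex annotation rules to a CSV dataset concrete, the sketch below extracts annotations from a transcript line and writes them as CSV. The annotation syntax `word[$VARIABLE.variant]`, the field names, and the `extract_rows` helper are assumptions chosen for illustration; the tool's real annotation format and output columns may differ.

```python
import csv
import io
import re

# Hypothetical annotation format: [$VARIABLE.variant] directly after the annotated word.
ANNOTATION = re.compile(r"(?P<token>\w+)\[\$(?P<variable>\w+)\.(?P<variant>\w+)\]")


def extract_rows(speaker, text):
    """Return one dataset row per annotation found in the text."""
    return [
        {
            "speaker": speaker,
            "token": m["token"],
            "variable": m["variable"],
            "variant": m["variant"],
        }
        for m in ANNOTATION.finditer(text)
    ]


rows = extract_rows("S1", "I bought a bottle[$T.glottal] of water[$T.released] today")

# Write to an in-memory buffer; a real run would open a file instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["speaker", "token", "variable", "variant"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Named groups keep the rule readable and map directly onto CSV columns, so changing the annotation syntax only requires editing the pattern.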