I made a neat little program that scans any text file and fixes typos based on a wordlist I've defined of misspelled words and their correct replacements. As I've been building the wordlist, I've realized that what I'm actually building is a Hive Dictionary for ASR transcripts.
This significantly increases the quality of the transcripts, which makes it possible to generate higher-quality data (like summaries) from them.
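
For anyone curious how a tool like this could work, here is a minimal Python sketch. It is not the actual program: the `CORRECTIONS` entries and the function names `fix_text` and `fix_file` are made up for illustration, and it assumes whole-word, case-insensitive matching, which may differ from what the real tool does.

```python
import re

# Hypothetical sample entries; the real wordlist is not shown in the post.
# Keys are what the ASR tends to produce, values are the intended spelling.
CORRECTIONS = {
    "hive": "Hive",
    "leo finance": "LeoFinance",
    "block chain": "blockchain",
}


def fix_text(text, corrections=CORRECTIONS):
    """Return text with every whole-word match of a key replaced by its value."""
    if not corrections:
        return text
    lookup = {wrong.lower(): right for wrong, right in corrections.items()}
    # Longest keys first so multi-word entries win over shorter overlaps.
    keys = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


def fix_file(path, corrections=CORRECTIONS, encoding="utf-8"):
    """Fix a text file in place and return the corrected text."""
    with open(path, encoding=encoding) as f:
        original = f.read()
    fixed = fix_text(original, corrections)
    if fixed != original:
        with open(path, "w", encoding=encoding) as f:
            f.write(fixed)
    return fixed
```

The `\b` word boundaries keep the replacement from touching longer words that merely contain an entry, so "archive" is left alone even though it ends in "hive".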
RE: LeoThread 2024-04-02 14:32