When using Limecraft to automatically transcribe audio into timed text, a 'Custom Dictionary' helps you lower the Word Error Rate (WER) and minimise the effort needed for manual post-editing. In this article, we explain how to configure and use custom dictionaries to achieve maximum efficiency.




What is a Custom Dictionary?

When using Audio Transcription, a custom dictionary or glossary significantly enhances accuracy, particularly for specialised terminology, brand names, and proper nouns, which a standard Automatic Speech Recognition (ASR) engine typically finds hard to recognise.


Generic ASR models often struggle with industry-specific terminology, proper names, and technical jargon, leading to errors and extensive manual work. By using a tailored glossary or 'Custom Dictionary', you ensure correct spelling of brand names, company references, and domain-specific vocabulary. 
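To make the effect of a single misrecognised term concrete, the sketch below computes the Word Error Rate in the usual way: the word-level edit distance (substitutions, deletions and insertions) divided by the number of words in the reference transcript. The function name and the sample sentences are illustrative only and are not part of Limecraft.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# A brand name split into two words counts as one substitution plus one insertion.
reference = "welcome to the limecraft workspace"
hypothesis = "welcome to the lime craft workspace"
print(word_error_rate(reference, hypothesis))
```

In this example one brand name that the engine does not know accounts for two of the five reference words, which is the kind of error a dictionary entry is meant to prevent.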



1. Configuring a Custom Dictionary

Before you can use glossaries or custom dictionaries to improve transcription accuracy, you first need to configure one. Go to Workspace Settings > Transcriber and scroll down to the 'Dictionaries' section, as seen below.


Note: Limecraft supports a range of Automatic Speech Recognition (ASR) engines, not all of which support custom dictionaries. If the 'Dictionaries' section is not visible, your workspace might be set up with an engine that does not support them.


Limecraft screenshot illustrating how to configure glossaries or custom dictionaries

To create a new dictionary, select ‘Add new dictionary’, which gives you the following screen: 

Limecraft screenshot illustrating how to create a new Custom Dictionary or Glossary for using during audio transcription

Start creating a new Custom Dictionary by adding a descriptive name for the dictionary and the applicable language. The domain is optional. 


Limecraft screenshot fragment illustrating how to configure a Custom Dictionary









Next, type or paste the terms or words into the input field at the bottom of the page. A dictionary entry can be a single word or a phrase that you expect to appear as-is in the spoken text of your material.


Each line in this input field should contain a single dictionary entry. You can specify up to 1000 entries in a single dictionary. Don't forget to confirm by using ‘Save dictionary’. 
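If your terms come from another source, such as a spreadsheet column or a script, a small helper can tidy them before you paste them in. The sketch below is only an illustration: `prepare_entries` is a hypothetical helper, not a Limecraft function, and the only facts it takes from this article are the one-entry-per-line layout and the limit of 1000 entries. Dropping duplicates and collapsing whitespace are our own choices.

```python
MAX_ENTRIES = 1000  # limit per dictionary, as stated in this article


def prepare_entries(raw_terms):
    """Return text for the input field: one entry per line, no blanks or duplicates."""
    seen = set()
    entries = []
    for term in raw_terms:
        cleaned = " ".join(term.split())  # trim and collapse internal whitespace
        if cleaned and cleaned not in seen:  # comparison is case-sensitive
            seen.add(cleaned)
            entries.append(cleaned)
    if len(entries) > MAX_ENTRIES:
        raise ValueError(f"{len(entries)} entries exceed the limit of {MAX_ENTRIES}")
    return "\n".join(entries)


print(prepare_entries(["Limecraft", "  Word Error Rate ", "", "Limecraft"]))
```

The printed text can be pasted directly into the input field before selecting 'Save dictionary'.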

Limecraft screenshot fragment illustrating how to configure a custom dictionary for using during automatic speech recognition

If you navigate back to the Transcriber settings, you’ll now see a table containing one row for each dictionary you created. 

Limecraft screenshot fragment illustrating the overview of custom dictionaries, giving access to a menu to edit, remove or export custom dictionaries


On the right side of each dictionary, there is a menu ("...") which allows you to edit, remove or export the dictionary.

 

2. Exporting and Importing Custom Dictionaries


When editing a Custom Dictionary, you have the option to export the contents as a list of words, or to import a similar list.


2.1 Exporting a dictionary


Limecraft screenshot fragment illustrating how to export a custom dictionary


You can export a dictionary as a JSON file or as a CSV (Comma Separated Values) file.


Limecraft screenshot fragment illustrating how you can export Custom Dictionaries as a JSON or CSV file

2.2 Importing a Custom Dictionary


It is also possible to import Custom Dictionaries. Simply select the file and choose whether or not to replace all entries that are already in the dictionary.


When importing a CSV dictionary file that was not created by Limecraft in the first place:

  • the first row is assumed to contain header labels

  • the column with header label “content” should contain the term


To avoid issues, however, it is best to start from a CSV file exported from a Limecraft dictionary and edit that.
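If you do need to build a CSV file yourself, the two rules above can be sketched as follows. This is a minimal example, not an official Limecraft tool: `terms_to_csv` is a hypothetical helper, and the only details taken from this article are the header row and the `content` column. Whether the importer expects other columns or a particular encoding is not covered here.

```python
import csv
import io


def terms_to_csv(terms):
    """Build CSV text with a header row and a single 'content' column."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["content"])  # header label named in this article
    for term in terms:
        writer.writerow([term])  # terms containing commas are quoted automatically
    return buffer.getvalue()


csv_text = terms_to_csv(["Limecraft", "Word Error Rate", "Smith, John"])
print(csv_text)

# Read the text back to confirm every term sits under the 'content' header.
rows = list(csv.DictReader(io.StringIO(csv_text)))
print([row["content"] for row in rows])
```

To save the result, write `csv_text` to a file opened with `newline=""`; the choice of UTF-8 as the file encoding would be an assumption on our part.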


Limecraft screenshot fragment illustrating how to import Custom Dictionaries


3. Automatic Speech to Text Transcription using a Custom Dictionary


To use a Custom Dictionary during audio transcription, open the transcriber for the clip as shown below, select the language, and then select the Custom Dictionary you configured for that language (see section 1 above).


Limecraft screenshot illustrating how to select a Custom Dictionary for automatic speech-to-text transcription

4. Automatic subtitling with a dictionary


Custom Dictionaries can also be used when creating subtitles with Automatic Transcription.

Limecraft screenshot fragment illustrating the use of Custom Dictionaries when creating subtitles using automatic speech-to-text transcription