Transcribing clips
Convert your interview recordings into a readable format
This guide walks you through transcribing audio and video clips in Leapfrog and using the resulting transcripts in your research projects.
Prerequisites
- An active Leapfrog workspace.
- An audio or video file ready for transcription.
Step 1: Create a New Document
To begin, create a new document in your Leapfrog workspace:
- Open your workspace in the Leapfrog web app.
- Click the New Document button located in the top right corner of your screen.
- Name your document and confirm. You will be automatically redirected to the document’s page.
Now you are ready to start adding media files for transcription.
Step 2: Upload Audio or Video Files
To add media files, use the toolbar in the document editor:
- In the document editor, click Upload audio or video to get started.
- Choose the audio or video file you want to upload from your computer.
- Wait for the upload to complete. Once the file is uploaded, you’ll see a button to start the transcription.
Step 3: Start the Transcription
- Click the Transcribe button to begin the transcription process.
- Configure your transcription settings. You can customize the following options:
  - Redaction: Automatically redact sensitive information, such as names or addresses.
  - Exclude filler words: Remove common filler words like “uh,” “um,” and “mhmm” to produce a cleaner transcript.
  - Language selection: Choose the language of the audio or video file for transcription. You can manually select a language or use Autoselect to detect it automatically.
Supported Languages
Leapfrog supports transcription in the following languages:
English, Chinese, German, Spanish, Russian, Korean, French, Japanese, Portuguese, Turkish, Polish, Catalan, Dutch, Arabic, Swedish, Italian, Indonesian, Hindi, Finnish, Vietnamese, Hebrew, Ukrainian, Greek, Malay, Czech, Romanian, Danish, Hungarian, Tamil, Norwegian, Thai, Urdu, Croatian, Bulgarian, Lithuanian, Latin, Maori, Malayalam, Welsh, Slovak, Telugu, Persian, Latvian, Bengali, Serbian, Azerbaijani, Slovenian, Kannada, Estonian, Macedonian, Breton, Basque, Icelandic, Armenian, Nepali, Mongolian, Bosnian, Kazakh, Albanian, Swahili, Galician, Marathi, Punjabi, Sinhala, Khmer, Shona, Yoruba, Somali, Afrikaans, Occitan, Georgian, Belarusian, Tajik, Sindhi, Gujarati, Amharic, Yiddish, Lao, Uzbek, Faroese, Haitian Creole, Pashto, Turkmen, Norwegian Nynorsk, Maltese, Sanskrit, Luxembourgish, Burmese, Tibetan, Tagalog, Malagasy, Assamese, Tatar, Hawaiian, Lingala, Hausa, Bashkir, Javanese, Sundanese.
Model Size Selection
You can also select different model sizes based on your needs. Leapfrog offers multiple models for transcription, ranging from smaller and faster to larger and more accurate:
- tiny: The smallest and fastest model.
- base: A step up from tiny in size and accuracy.
- small: Offers a balance between speed and accuracy.
- medium: The default model, providing good overall performance.
- large: The most accurate model available, using OpenAI’s Whisper large-v2 as its default.
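If your team wants to use the same settings for every interview, it can help to write the choices down in a structured form. The sketch below is only an illustration: `TranscriptionSettings` and its field names are hypothetical and are not part of any Leapfrog API. The model names and the `medium` default come from the list above; the defaults for redaction and filler words are assumptions.

```python
from dataclasses import dataclass

# Model sizes in the order listed above, smallest to largest.
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


@dataclass(frozen=True)
class TranscriptionSettings:
    """Hypothetical record of the options chosen before transcribing."""

    model: str = "medium"               # the guide names medium as the default
    language: str = "Autoselect"        # or a language from the supported list
    redaction: bool = False             # assumed default, not documented
    exclude_filler_words: bool = False  # assumed default, not documented

    def __post_init__(self) -> None:
        # Reject model names that are not in the documented list.
        if self.model not in MODEL_SIZES:
            raise ValueError(f"unknown model size: {self.model!r}")


# Example: a fast draft pass versus a careful final pass.
draft = TranscriptionSettings(model="tiny")
final = TranscriptionSettings(model="large", redaction=True)
```

Keeping the record immutable (`frozen=True`) makes it safe to share one settings object across a whole study.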
Supported File Types
Leapfrog supports the following file formats for audio and video transcription:
- Video: .mp4, .webm, .ogv, .avi, .mov, .mkv
- Audio: .mp3, .wav, .ogg, .aac, .webm, .flac
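Before uploading a batch of recordings, you may want to check file extensions against the lists above. The following is a minimal sketch that runs on your own computer; `media_kind` is a hypothetical helper name, and the check looks only at the extension, not at the file contents.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the lists above.
VIDEO_EXTENSIONS = {".mp4", ".webm", ".ogv", ".avi", ".mov", ".mkv"}
AUDIO_EXTENSIONS = {".mp3", ".wav", ".ogg", ".aac", ".webm", ".flac"}


def media_kind(filename: str) -> Optional[str]:
    """Return 'video', 'audio', 'audio or video', or None if unsupported."""
    ext = Path(filename).suffix.lower()
    in_video = ext in VIDEO_EXTENSIONS
    in_audio = ext in AUDIO_EXTENSIONS
    if in_video and in_audio:
        return "audio or video"  # .webm appears in both lists
    if in_video:
        return "video"
    if in_audio:
        return "audio"
    return None


for name in ["interview.MP4", "call.flac", "clip.webm", "notes.docx"]:
    print(name, "->", media_kind(name))
```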
Example of Redaction
Here’s an example of how redaction might work in a transcript:
| Original | Redacted |
|---|---|
| “Hi, my name is Jane Doe and I live at 123 Main Street.” | “Hi, my name is [NAME_1] and I live at [LOCATION_1].” |
| “My bank account number is 1234567890.” | “My bank account number is [ACCOUNT_NUMBER_1].” |
| “You can contact me at john.doe@example.com.” | “You can contact me at [EMAIL_ADDRESS_1].” |
Leapfrog automatically identifies and redacts sensitive personal information to protect privacy.
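To make the placeholder format concrete, here is a rough sketch of numbered-placeholder redaction for two of the categories in the table. It is not how Leapfrog detects sensitive information, which this guide does not describe. The regular expressions and the `redact` function are assumptions for illustration, and names or street addresses would need more than simple patterns.

```python
import re

# Illustrative patterns only; real detection is likely far more thorough.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ACCOUNT_NUMBER": re.compile(r"\b\d{8,12}\b"),
}


def redact(text: str) -> str:
    """Replace each match with a numbered placeholder such as [EMAIL_ADDRESS_1]."""
    for label, pattern in PATTERNS.items():
        seen = {}  # maps each distinct matched value to its placeholder number

        def placeholder(match):
            value = match.group(0)
            # Reuse the same number when the same value appears again.
            index = seen.setdefault(value, len(seen) + 1)
            return f"[{label}_{index}]"

        text = pattern.sub(placeholder, text)
    return text


print(redact("You can contact me at john.doe@example.com."))
print(redact("My bank account number is 1234567890."))
```

Numbering per distinct value means a repeated email address keeps the same placeholder throughout the transcript, so readers can still follow who or what is being referred to.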
Step 4: Configure Filler Words
Filler words such as “uh” and “um” are common in spoken language but can clutter transcripts. You can choose to include or exclude them from your transcription.
What Are Filler Words?
Filler words are non-essential phrases that people often use in conversation to pause or think aloud. Common filler words include:
- uh
- um
- mhmm
- uh-huh
You can customize your transcription process to either keep these words for a more natural flow or remove them for a cleaner result.
Example with and without Filler Words
| Original | With Filler Words | Without Filler Words |
|---|---|---|
| “Uh, so you’re looking for, uh, something specific, um, like a precise model.” | “Uh, so you’re looking for, uh, something specific, um, like a precise model.” | “So you’re looking for something specific, like a precise model.” |
To adjust filler word settings:
- Enable Filler Words to keep them in the transcript.
- Disable Filler Words to remove them and improve readability.
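If you already have a transcript that still contains filler words, a similar cleanup can be approximated outside Leapfrog. The sketch below is an assumption-laden illustration, not Leapfrog's implementation: `strip_fillers` is a hypothetical helper that removes the four filler words listed above, together with a comma that directly follows them.

```python
import re

# Filler words from the list above; longer entries first so "uh-huh" wins over "uh".
FILLER_WORDS = ["uh-huh", "mhmm", "uh", "um"]

# Match a whole filler word, an optional trailing comma, and trailing whitespace.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(w) for w in FILLER_WORDS) + r")\b,?\s*",
    flags=re.IGNORECASE,
)


def strip_fillers(text: str) -> str:
    """Remove filler words and re-capitalize the first character."""
    cleaned = _FILLER_RE.sub("", text).strip()
    return cleaned[:1].upper() + cleaned[1:]


print(strip_fillers("Um, I think, uh, it works."))  # I think, it works.
```

Word boundaries keep words such as “umbrella” intact. Punctuation handling is deliberately simple, so a comma that preceded a filler word stays in place and a filler at the end of a sentence can leave a dangling comma; a production tool would need more careful cleanup.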
Additional Options: Language Selection
Leapfrog allows you to transcribe audio or video in multiple languages. Simply choose the language in the transcription settings or enable the Autoselect option to detect the language automatically.
What’s Next?
Once your transcription is complete, you can start analyzing the data by:
- Tagging and coding the transcript to identify key themes and patterns.
- Collaborating with your team to refine your insights.
To learn more about these steps, visit our Tagging and Coding documentation page.
Now that you’ve learned how to transcribe audio and video, move on to our guide for analyzing transcripts and extracting insights.