avert_transcription_style_guide_1.0
This guide explains customer expectations for transcript quality and the metrics system, our way
of ensuring freelancer quality.
We trust you to deliver high-quality work. Our clients rely on your accurate and timely
transcription as a crucial part of their daily work.
Quality expectations fall into two main categories: Accuracy and Formatting.
Accuracy - Can you correctly hear and transcribe what words were said and who said them?
Formatting - Can you correctly communicate those words and notations in a way that is easily
Both categories have major errors and minor errors, which are the most common reasons
Errors in your work may lead to lowered metrics. Avert requires transcriptionists to maintain
certain metrics to remain active. Take special care in proofing your work before submission.
5 – Excellent; Near perfect – May contain a few minor errors that do not alter the meaning of
the original audio.
4 – Good; Customer ready – Errors are more frequent or noticeable but do not change the
meaning of the original audio.
3 – Fair; Not customer ready – Errors are present that would lead to customer confusion. This
includes wrong words, additions or omissions that change the meaning of the original audio.
2 – Poor; Not customer ready – Transcript reflects severe carelessness or lack of understanding
of the Style Guide.
1 – Very Poor; Unusable – Transcript is a poor representation of the original audio, verbatim was
not used when requested, the transcript is incomplete or content is omitted, or a Line draft was
left unedited.
All transcripts MUST label the Speaker. Speaker labels must always have a space after the colon.
Use only square brackets, [ ], around tags, never parentheses ( ), curly brackets { }, or angle
brackets < >.
Important Criteria:
Non-verbatim transcripts must be cleaned up, not verbatim: no "uhm," "arrh," etc. (See the
section on Verbatim / Non-Verbatim below.)
Use standard English.
DO NOT paraphrase.
NEVER type anything that is not spoken in the audio, including your comments or the job
number or title.
Other Major Rules:
Do not make up words. There are two ways in which we mean this:
Do not spell words phonetically. All words should be spell checked and must be actual English
words, unless the speaker was deliberately making up words, such as "what awesome
majorness!"
Otherwise, do not include words just because they sound similar to the syllables that were
spoken.
Transcribe contractions as spoken by the speaker. If the speaker says “She’ll,” write
“She’ll”; if she says “She will,” write “She will.”
Read your transcript before submitting, as if you were reading an article or story. If the
words you used do not make sense in each sentence, they are probably not the words
the speaker was saying.
Tag any words that you are uncertain about or can’t make out with [?], e.g., uncertain[?].
Use all the information you have available. There are a few major ways in which you can
get extra information, and we ask that you use them:
ALWAYS read the Extra Comment area to see if it contains clues such as the speakers’ names
or the correct spelling of certain terms mentioned in the audio!
The audio itself can give you new information. For example, if at the end, the
interviewer says, "Thanks, Lucy, for this tea!", and the interviewee’s response clearly
indicates that she is Lucy, then you must go back and relabel the interviewee as Lucy:
throughout the transcript.
For both verbatim and non-verbatim, when one person is speaking and another says
nothing but "uh-huh" or “mm-hmm” in the meantime, leave out all those murmuring
noises as long as they aren’t an answer to a question from the speaker.
Speaker Labels:
• Each change in speakers should be placed on a new line. Add a blank line before the changed
speaker.
• Standard format: A complete speaker label includes a colon after the label as well as a space
after that colon. Do this: “Woman 2: “
• When to use: whenever the speakers change, or whenever something happens on a separate
line (like [laughter]) in the middle of a person speaking (even if the same person keeps speaking
after).
• Order of preference for labels: Use names whenever possible, then roles, then use gender as
the last resort.
• Full names: When you have information about a speaker’s full name (from Extra Comment or
because they state their name or are announced by name), use that the first time they appear
in your audio chunk. After that, use only their first name if known, or last name if first name not
known. Do not use full name.
• Descriptiveness: Make each speaker’s label as informative as possible about the person’s role
in the audio. Except in the case of large groups (see special subsection, later in this section),
labels must be useful for telling one person from another. Woman 1: is acceptable, but
Interviewer: or Host: is much better. Other roles that may apply (use your judgment):
Congregant:, Audience
• Adding gender: Use Male and Female only as adjectives for roles, never by themselves. Only
mention gender at all if people of different genders have the same role in the audio. Like this:
Male Host: , Female Host: , but two female hosts would just be Host 1: and Host 2: .
• Adding numbers and cutting down on clutter: Always use numbers with "Man" or “Woman”
labels. Do not use numbers if the speaker has a role other than just “Man” or “Woman,” unless
the audio includes two or more people of the same gender who are playing that same role. Like
this:
Woman 1: and Man 2: , or (if there are two male hosts and one female one)
• Audience: is the label for an audience as a whole, unless they are gathered in a church or
other place of worship, which makes them a Congregation:.
• If there are already two or more other speakers in your audio, don’t worry about telling the
audience or congregation apart. Each one will just be Audience Member: or Congregant: with
• However, if there is only one main speaker on the audio, then be more detailed in specifying
the first two group members who speak. We prefer you do this by mentioning gender (if they
have different genders from each other): Female Audience Member:.
If they are both the same gender, then add a number to their labels instead:
Audience Member 2: .
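The label format rules above can be checked mechanically. The following is a minimal sketch,
not an official Avert tool: the function name check_speaker_label and the regular expression
are assumptions made for illustration. It covers only three rules from this section (a colon
followed by one space, numbers on "Man" and "Woman" labels, and "Male" or "Female" never
standing alone) and is meant for lines that are expected to start with a label.

```python
import re

# Label text, an optional number, a colon, exactly one space, then the first word.
_LABEL = re.compile(r"^(?P<label>[A-Z][A-Za-z.' -]*?)(?: (?P<num>\d+))?: (?=\S)")


def check_speaker_label(line: str) -> list[str]:
    """Return the problems found in the speaker label that starts one line."""
    match = _LABEL.match(line)
    if match is None:
        return ["missing label, or no single space after the colon"]

    problems = []
    label = match.group("label")
    if label in {"Man", "Woman"} and match.group("num") is None:
        problems.append('"Man" and "Woman" labels always need a number')
    if label in {"Male", "Female"}:
        problems.append('"Male" and "Female" are only adjectives for roles')
    return problems


print(check_speaker_label("Woman 2: Hello."))   # no problems
print(check_speaker_label("Man: Hi."))          # number is missing
print(check_speaker_label("Jerry:Correct."))    # no space after the colon
```

A check like this only catches formatting slips. Choosing the most informative label (name,
then role, then gender) still depends on listening to the audio.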
TAGS
• [foreign word]: A word (or words, tagged [foreign words]) was spoken in a language other than
English.
• [?]: This is your best guess about the word or words, but it does not really make sense in the
sentence, so you would like someone to give it a close look. (He hurt his knee playing[?]
Monopoly.)
• When there are no real speakers: [background sounds only], [background conversations] or
[silence] are completely OK to use. If the audio file is completely silent, email us at
support@averttranscription.com. NEVER SUBMIT AN EMPTY FILE.
• Signs that an audio file may be corrupted include: all static; high-pitched squealing;
high-speed, high-pitched voices; etc.
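Because tags must use square brackets only, a quick scan can flag tags typed with the wrong
bracket type. This is a minimal sketch under stated assumptions: KNOWN_TAGS is a partial list
collected from this guide, and find_wrong_bracket_tags is a hypothetical helper name, not part
of any Avert tooling.

```python
import re

# Tag names taken from this guide; the set is illustrative, not exhaustive.
KNOWN_TAGS = {
    "?", "inaudible", "laughter", "pause", "silence",
    "foreign word", "foreign words",
    "background sounds only", "background conversations",
}

# Anything wrapped in parentheses, curly brackets, or angle brackets.
_WRONG = re.compile(r"\(([^()]*)\)|\{([^{}]*)\}|<([^<>]*)>")


def find_wrong_bracket_tags(text: str) -> list[str]:
    """Return known tags that were written with something other than [ ]."""
    hits = []
    for match in _WRONG.finditer(text):
        inner = next(g for g in match.groups() if g is not None)
        if inner.strip().lower() in KNOWN_TAGS:
            hits.append(match.group(0))
    return hits


print(find_wrong_bracket_tags("He stopped (laughter) and {pause} then [silence]."))
```

Ordinary parenthetical remarks are left alone because only bracketed text that matches a known
tag name is reported.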
Verbatim Transcripts
Do not summarize: write down exactly what you hear. A verbatim transcript is prepared by
transferring each and every utterance, including those that are non-verbal, such as [pauses],
[laughter], [silence], and [throat clearing], exactly as delivered in the audio file.
In a verbatim transcript, false starts, repetitions, and grammatical errors are copied faithfully,
without being tidied up or made more concise. The reader receives a true copy of an event, with
the words transcribed exactly the way they were spoken, which gives the transcript a realistic,
movie-dialogue feel. Such a transcript is quite helpful when an interview is being documented or
serves as a testimonial for legal purposes, as the thought process is implied through verbal
cues, such as repeated words or phrases, or awkward hesitations.
Transcribe every utterance, including repetitive phrasing, false starts, filler words like "um,"
"uh," "er," etc., and every "I mean," "you know," etc. (There are grammatically correct uses of
those and similar words/phrases, even in non-verbatim.)
All slang (e.g., "gonna," "kinda," "sorta," "coz") should be retained exactly as spoken. In other
words, do not make any kind of grammatical corrections to the language.
When multiple speakers are involved, a verbatim transcript indicates segments in which there is
an overlap of voices.
Non-Verbatim Transcripts
Businesses that want meetings transcribed, or academicians who want to provide their lectures
to students in written form, would not want something like a verbal nod included in the
transcript. They would rather have a clean transcript that is more reader-friendly.
A non-verbatim or “intelligent” transcription, rather than typing the words exactly the way they
are spoken, captures the fundamental meaning behind them. Errors in grammar are rectified
and words or sounds that don't contribute to the underlying message are removed. If fillers or
repetitions occur naturally in the speakers' speech patterns, they are simply removed by the
transcriptionist. In other instances, paraphrasing of a statement is required which conveys the
same idea, but more succinctly. A non-verbatim transcript can be published online without
edits, or it can serve as a marketing piece.
All slang should be changed to its proper spelling, e.g., "gonna" changed to "going to", "kinda"
changed to "kind of", and "cuz" (or "coz") changed to "because".
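The slang rule for non-verbatim transcripts amounts to a small lookup table. The sketch below
assumes only the three mappings named in this guide; the SLANG table and the expand_slang
function are illustrative names, and a real word list would need to be longer and reviewed by
a person.

```python
import re

# Mappings named in this guide; extend only with forms the guide approves.
SLANG = {
    "gonna": "going to",
    "kinda": "kind of",
    "cuz": "because",
    "coz": "because",
}

# Match whole words only, in any letter case.
_PATTERN = re.compile(r"\b(" + "|".join(SLANG) + r")\b", re.IGNORECASE)


def expand_slang(text: str) -> str:
    """Replace known slang with its standard spelling, keeping initial capitals."""
    def replace(match: re.Match) -> str:
        word = match.group(0)
        full = SLANG[word.lower()]
        return full.capitalize() if word[0].isupper() else full

    return _PATTERN.sub(replace, text)


print(expand_slang("I'm gonna leave cuz it's kinda late."))
# I'm going to leave because it's kind of late.
```

This should never be run on a verbatim transcript, where slang is retained exactly as spoken.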
• Where possible, break compound sentences into smaller ones. Long sentences should be
broken into fragments.
• Insert a blank line between paragraphs. Also, start a new paragraph at every speaker change.
• Follow correct grammar. All sentences should start with a capital letter and have the correct
punctuation.
• If a single speaker speaks throughout without a speaker change, break the text into different
paragraphs as the topics change. A new paragraph does not need a speaker label.
Transcription Examples:
Tom: Correct, because you can see, I am labeled with just my first name now. If you don’t know
my first name, then you may name me by my last name. Mr Blingford[?] or Mrs Blingford[?] if
female.
Man 1: Hi, I'm a new speaker. No one ever mentions my name, so the transcriber going to
[inaudible] give me a descriptive name. Here, the only information that can be gathered on me
is that I'm male. So in this case I'm Man 1.
Man 2: Now, there are two identified males. No one ever mentions my name either, I am
identified later than Man 1, so I’d be Man 2.
[END]
Jerry: Correct, umm cuz you can see, I am labeled with just my first name now. If you don’t
know my first name, then you may name me by my last name. Mr Atkins[?] or Mrs Atkins[?] if
female.
Man 1: Hi, I'm a new speaker. No one ever mentions my name, so the transcriber gonna
[inaudible] give me a descriptive name. Here, the only information that can be gathered on me
is that I'm male. So in this case I'm Man 1.
Man 2: Now, there are two identified males. No one ever mentions my name either, I am
identified later
[END]
A false start is the beginning of an utterance that the speaker abandons before completing it.
Example: “He was… uh… He was not as good as he seems.” The beginning, “He was… uh…,”
would be edited out of a non-verbatim transcript, depending on the context. In a
Verbatim transcription, write only the part of the word that was spoken, followed by a dash to
show that it was cut off.
Example:
VERBATIM:
Man 2: O-okay. W-, what have you done then? Did you, uh, bring her to the clinic?
NON-VERBATIM:
Man 2: Okay. What have you done then? Did you bring her to the clinic?
Stutters
Stuttering, also called stammering, is a speech disorder in which the flow of speech is
disrupted by involuntary repetitions and prolongations of sounds. Stammers are removed in
Non-Verbatim transcripts but retained in Verbatim transcripts. Examples:
VERBATIM: W-w-w-w-w-well, I-I, uh, I th-th-thought that sh-she uh, s-she, uh, she left
a-a-already.
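Removing stutters for a non-verbatim transcript can be sketched as a pattern replacement. This
is an illustration only: remove_stutters is a hypothetical helper, and it assumes a stutter is
written as one to three characters followed by a hyphen, repeated, and then the full word.
Fillers such as "uh" and repeated words are not handled here and still need a human pass.

```python
import re

# One or more short fragments ending in "-", followed by a word that starts
# with the last fragment, e.g. "th-th-thought" or "W-w-w-well".
_STUTTER = re.compile(r"\b((?:(\w{1,3})-)+)(\2\w*)", re.IGNORECASE)


def remove_stutters(text: str) -> str:
    """Collapse stuttered fragments into the completed word."""
    def replace(match: re.Match) -> str:
        fragments, word = match.group(1), match.group(3)
        # Keep the capital letter if the first stuttered fragment had one.
        if fragments[0].isupper():
            word = word[0].upper() + word[1:]
        return word

    return _STUTTER.sub(replace, text)


line = "W-w-w-w-w-well, I-I, uh, I th-th-thought that sh-she uh, s-she, uh, she left a-a-already."
print(remove_stutters(line))
# Well, I, uh, I thought that she uh, she, uh, she left already.
```

The output still contains fillers and repetitions, so it is an intermediate step toward a
non-verbatim line rather than a finished one.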
In Non-Verbatim transcriptions, please note the silence if it is abnormally long (more than 1
minute).
In Verbatim transcriptions, please indicate any pause longer than 10 seconds using [pause],
which can appear anywhere in the sentence. Short pauses of 2-10 seconds can be indicated using
an ellipsis (…). Example:
VERBATIM
David: Did you see Francis point the gun to Mrs. Gomez?
Francis: [sighs]
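The pause thresholds above can be summarized as a small decision function. This is a sketch
under two assumptions that the guide does not state outright: pauses shorter than 2 seconds
are left unmarked, and the [silence] tag is what "note the silence" means in a non-verbatim
transcript. The name pause_notation is hypothetical.

```python
def pause_notation(seconds: float, verbatim: bool) -> str:
    """Return the notation for a pause of the given length, or "" for none."""
    if verbatim:
        if seconds > 10:
            return "[pause]"   # longer than 10 seconds
        if seconds >= 2:
            return "…"         # short pause, 2-10 seconds
        return ""              # assumption: shorter pauses are not marked
    # Non-verbatim: only abnormally long silences (over 1 minute) are noted.
    return "[silence]" if seconds > 60 else ""


print(pause_notation(12, verbatim=True))    # [pause]
print(pause_notation(4, verbatim=True))     # …
print(pause_notation(90, verbatim=False))   # [silence]
```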
When two or more persons are speaking at the same time, remove the overtalk in Non-Verbatim
transcripts. Try to get as much from each speaker as possible.
In Verbatim, show exactly where the speech was interrupted by the other speaker(s). Use
ellipses after the last word spoken before the interruption happened.
VERBATIM:
NON-VERBATIM:
Example 2:
If a patient is talking and the doctor interrupts with a question that is answered, do the following:
Non-Verbatim Example:
The doctor interjected before the patient finished speaking, but we do not want to split the
sentence into two paragraphs. Therefore, the patient's sentence is completed, and then the
doctor's question is inserted into a new paragraph.
RETAIN ALL SLANG (‘cus, y’all, dunno) in Verbatim transcriptions. You can also use slang in
Non-Verbatim, but sometimes slang may be revised for clarity (‘cus to “because”).
Numbers:
• Numbers: Phone numbers, street addresses, zip codes, dates, years, units of measurement, and
numbers from 0 to 9 should be written as numerals. ALL other numbers, such as fractions,
decimals, months, etc., should be written ONLY in words. (Time and money are special cases.)
• Time: If an exact time is mentioned, write it as “8:11 a.m.” If the speaker says “o’clock,”
write it as spoken: “eight o’clock.” If the speaker doesn’t mention an exact time, write it in
words: “Let’s have dinner at nine.” Days: A.D. 2010, the 1980s, the ‘90s, 21st century.
• Spell out units of measurement, such as “inches,” “feet,” “yards,” “miles,” “ounces,”
“pounds,” and “tablespoons.” However, if spoken in shortened form, symbols should be used.
Example:
• Use numerals and the percent sign to indicate all percentages except at the beginning of a
sentence.
Examples:
• Use the numeral plus the lowercase “th,” “st,” or “nd” when a day of the month is mentioned
by itself (no month is referred to). Example:
• When the day precedes the month, use the numeral plus the lowercase “th,” “st,” or “nd” if
the ending is spoken. Example: My birthday is on the 8th of May.
• Use the numeral alone when the day follows the month. Example: I will get back to you on
September 16.
• When the month, day, and year are spoken, use the numeral alone for the day, even if an
ending
• Use the numeral plus “cents” or “¢” for amounts under one dollar. Examples: I need 15 cents. I
• Use the dollar sign plus the numeral for dollar amounts under one million. For whole-dollar
amounts of one million and greater, spell out “million,” “billion,” etc.
Examples:
• Use the word “dollars” when describing a range of up to ten dollars.
• Use the dollar sign and numerals when transcribing a range of currency over ten dollars.
At Avert, we require you to upload the transcribed file on our template. The specifications to
use are as follows:
1. Ensure the font type is Arial Black, font size 12, and alignment 3.08.
3. There should be one space after each speaker/ separating two speakers.