
AntConc (Windows, MacOS, Linux)

Build 4.3.1
Laurence Anthony, Ph.D.
Center for English Language Education in Science and Engineering, School of Science and
Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
July 29, 2024

Introduction
AntConc is a freeware, multiplatform tool for carrying out corpus linguistics research, introducing corpus
methods, and doing data-driven language learning. It runs on any computer running Microsoft Windows (built
on Win 10), MacOS (built on Mac Catalina), and Linux (built on Linux Mint). It is developed in Python and Qt
using the PyInstaller compiler to generate executables for the different operating systems. It uses SQLite as the
underlying database.

Getting Started
Windows - Installer
Double click the AntConc.exe file and follow the instructions to install the application into your Programs folder.
You can delete the .exe file when you are finished. You can start the application via the Start Menu.

Windows - Portable
Unzip the AntConc.zip file into a folder of your choice. In the AntConc folder, double click the AntConc.exe file to
launch the program.

Macintosh OS X
Double click the AntConc.dmg file to create an AntConc disk image on your desktop. Open the disk image and
drag and drop the AntConc app onto the Applications folder (or into another location if you desire). You can then
launch the app by double clicking on the icon in the Applications folder or Launchpad.

Linux
Decompress the AntConc.tar.gz file into a folder of your choice. In the AntConc folder, double click the AntConc.sh
file to launch the software. On the command line, type ./AntConc.sh to launch the software.
Overview of Tools
AntConc contains ten tools that can be accessed either by clicking on their 'tabs' in the tool window, using
CTRL+TAB to toggle through the tools, or using the key combination CTRL + Tool Number (e.g., CTRL +1 for
KWIC, CTRL +2 for Plot) to select a specific tool.

KWIC (Key-Word-In-Context) Tool

This tool shows search results in a concordance
or 'KWIC' (Key-Word-In-Context) format. This
allows you to see how words and phrases are
commonly used in a corpus of texts.

Plot Tool
This tool shows concordance search results
plotted in a 'barcode' format, with the length of
the text normalized to the width of the bar and
each hit shown as a vertical line within the bar.
This allows you to see the position where search
results appear in the individual texts of a corpus.

File Tool
This tool shows the contents of individual texts.
This allows you to investigate in more detail the
results generated in other tools of AntConc.

Cluster Tool
This tool shows contiguous (together in a
sequence) word patterns based on the search
condition. This allows you to see common
phrases that appear in the target texts.
N-Gram Tool
This tool scans the entire corpus for all 'N'-sized
clusters (e.g., 2-word clusters, 3-word clusters,
…). This allows you to find common expressions
in a corpus.

Collocate Tool
This tool shows words that appear frequently
within a certain distance of the search term (i.e.,
collocates). This allows you to find which words
co-occur with other words in a corpus.

Word List Tool

This tool counts all the words in the corpus and
presents them in an ordered list. This allows you
to find which words are the most frequent in a
corpus.

Keyword List Tool

This tool shows words that appear unusually
frequently in the target corpus in comparison
with the words in the reference corpus based on
a statistical measure (i.e., 'keywords'). These
words can be considered to be characteristic of
the target corpus. The settings can also be
changed to show words that appear unusually
infrequently in the target corpus compared with
the reference corpus (i.e., 'negative keywords').
Wordcloud Tool
This tool visualizes the results generated by
KWIC, File, Cluster, N-Gram, Collocate, Word,
and Keyword tools as well as a “Scratchpad” of
plain text in the form of a ‘word cloud’.
Wordclouds are often used as aesthetically
pleasing visualizations, where words are laid out
in a viewing area or ‘themed’ image mask and
sized according to a property (e.g., word
frequency). Care should be taken when using
wordclouds for linguistic analysis, as the
visualization necessitates distorting word sizes
to fit the viewing area.
Chat AI
This tool gives you access to closed-source,
open-source, and open-weights Large Language
Models (LLMs) through a chat-like interface.
When 'chatting' with the LLMs, you can choose
to interact with the LLM directly, or you can
supply the LLM with results generated by other
AntConc tools (e.g., the KWIC tool) and ask it to
utilize that information in its responses.

KWIC ('Key-Word-In-Context') Tool

This tool shows search results in a concordance or
'KWIC' (Key-Word-In-Context) format. This allows
you to see how words and phrases are commonly
used in a corpus of texts.

The following steps produce a set of concordance lines from a corpus and demonstrate the main features of this tool.
1) Select a corpus using the "Corpus Manager"
available from the File menu. Alternatively,
create a quick corpus by choosing the "Open File(s) as Quick Corpus" from the file menu. The files
contained in the corpus are shown in the left frame of the main window under "Target Corpus".
2) Enter a search query in the search box. See the 'SEARCH OPTIONS' section in this document for an
explanation of the "Words", "Case", and "Regex" search term options.
3) Choose the size of the results set to be presented using the "Result Set" combobox widget.
4) Choose the number of words to be displayed on either side of the search term using the "Context Size"
spinbox widget.
5) Click on the "Start" button to start the search and wait for the results to be displayed.
6) Use the "Sort options" to rearrange the concordance lines by row ID, file name, or the position of the
word. The first widget allows you to quickly order the concordance lines by the words to the right or left of
the center word, choose no ordering, or use a custom order. The next three widgets allow you to
choose the order parameters: 1L, 2L... are words to the left of the target word, 'C' is the center word,
and 1R, 2R... are words to the right of the center word. The final widget allows you to order the results
by the frequency of the pattern determined by the sorting parameters (the "Order by freq" option) or
alphabetically (the "Order by value" option). The default "Order by freq" option is strongly
recommended as it will allow you to easily identify the most commonly occurring patterns in the target
corpus. After adjusting the sort options, click on the "Start" button to regenerate the concordance lines.
7) The total number of concordance lines generated (Total Hits) is shown at the top of the tool window.
When no hits are found, a warning will be shown on the screen.
8) Double-clicking on any cell in the results window will cause the software to jump to the File tool (see the
relevant section of this document) where you can view the hit exactly as it appears in context in the
original file.
9) If you want to filter the results, select the desired rows, and then press the "Delete" key to remove the
selected rows or press "SHIFT + Delete" to keep the selected rows and remove all the others.
10) Advanced searches are available with this tool. Several menu preferences are also available with this
tool. (See the relevant sections in this document for explanations).
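
The KWIC steps above can be illustrated with a minimal Python sketch. It is not AntConc's implementation: the tokenizer (lowercased `\w+` runs), the function names, and the sample text are all assumptions made for illustration. It shows a "Context Size" of two words and an ordering by the frequency of the 1R word (the first word to the right of the center word).

```python
import re
from collections import Counter

def kwic(text, query, context_size=3):
    """Return (left, center, right) token lists for every hit of `query`."""
    tokens = re.findall(r"\w+", text.lower())  # assumed tokenizer
    target = query.lower()
    hits = []
    for i, token in enumerate(tokens):
        if token == target:
            left = tokens[max(0, i - context_size):i]
            right = tokens[i + 1:i + 1 + context_size]
            hits.append((left, token, right))
    return hits

def first_right(hit):
    """Return the 1R word of a hit, or an empty string at the end of the text."""
    return hit[2][0] if hit[2] else ""

def order_by_freq_1r(hits):
    """Most frequent 1R patterns first, ties broken alphabetically."""
    freq = Counter(first_right(h) for h in hits)
    return sorted(hits, key=lambda h: (-freq[first_right(h)], first_right(h)))

sample = "The cat sat on the mat. The dog sat on the rug. The cat ran."
lines = order_by_freq_1r(kwic(sample, "the", context_size=2))
for left, center, right in lines:
    print(f"{' '.join(left):>10} | {center} | {' '.join(right)}")
```

Ordering by frequency brings the two "the cat" lines to the top, which is the kind of pattern the "Order by freq" option is meant to make visible.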
Plot Tool
This tool shows concordance search results plotted
in a 'barcode' format, with the length of the text
normalized to the width of the bar and each hit
shown as a vertical line within the bar. This allows
you to see the position where search results appear
in the individual texts of a corpus. An example of
the use of the Plot Tool is in determining where
specific content words appear in a technical paper,
or where an actor or story character appears
through a play or novel.

The following steps produce a set of plot results from a corpus and demonstrate the main features of this tool.
1) Select a corpus using the "Corpus Manager" available from the File menu. Alternatively, create a quick
corpus by choosing the "Open File(s) as Quick Corpus" from the file menu. The files contained in the
corpus are shown in the left frame of the main window under "Target Corpus".
2) Enter a search query in the search box. See the 'SEARCH OPTIONS' section in this document for an
explanation of the "Words", "Case", and "Regex" search term options.
3) Choose the size of the results set to be presented using the "Result Set" combobox widget.
4) Use the "Plot Zoom" widget to control the size of the plot and the degree of detail to be shown.
5) Click on the "Start" button to start the search and wait for the results to be displayed.
6) Use the "Sort by" option to rearrange the plots according to the various parameters shown.
7) The total number of hits and total number of plots are shown at the top of the tool window. When no
hits are found, a warning will be shown on the screen.
8) Double-clicking on any cell in the results window will cause the software to jump to the File tool (see the
relevant section of this document) where you can view the hit exactly as it appears in context in the
original file.
9) By checking the "Overlay" option and choosing an appropriate color (by clicking on the color box),
existing results can be overlaid with new results for different searches. This allows you to see how
different search queries are related and/or overlap.
10) Advanced searches are available with this tool. Several menu preferences are also available with this
tool. (See the relevant sections in this document for explanations).
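
The 'barcode' idea can be sketched in a few lines of Python. This is an illustration under assumptions (a simple `\w+` tokenizer, a text-only bar, hypothetical function names), not the way AntConc draws its plots. Each hit position is divided by the text length, so texts of different lengths map onto bars of the same width.

```python
import re

def plot_positions(text, query):
    """Return hit positions normalized to the range [0, 1)."""
    tokens = re.findall(r"\w+", text.lower())  # assumed tokenizer
    target = query.lower()
    return [i / len(tokens) for i, t in enumerate(tokens) if t == target]

def barcode(positions, width=40):
    """Render normalized positions as a text bar, one '|' per hit column."""
    bar = ["-"] * width
    for p in positions:
        bar[min(int(p * width), width - 1)] = "|"
    return "".join(bar)

text = "a b c d"
positions = plot_positions(text, "c")
print(barcode(positions, width=4))
```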
File Tool
This tool shows the text of individual files. This
allows you to investigate in more detail the results
generated in other tools of AntConc.

The following steps produce a view of the original file and demonstrate the main features of this tool.

1) Select a corpus using the "Corpus Manager" available from the File menu. Alternatively,
create a quick corpus by choosing the "Open
File(s) as Quick Corpus" from the file menu. The files contained in the corpus are shown in the left frame
of the main window under "Target Corpus".
2) Double-click a file in the “Target Corpus” list on the left of the main window to view its contents.
Alternatively, select a file in the "Target Corpus" list and click "Start" in the tool interface. The File tool
will automatically be selected, and the contents of the file will be shown.
3) To highlight search query results in the display, enter a search query and click "Start". See the relevant
section in this document explaining the "Words", "Case", and "Regex" search term options. Words in the
file that match the query will be automatically highlighted.
4) Use the "Hit Location" widget to jump to different hits in the file. Alternatively, use the keyboard
shortcut for your operating system (see the SHORTCUTS section).
5) Advanced searches are available with this tool. Several menu preferences are also available with this
tool. (See the relevant sections in this document for explanations).

Cluster Tool
This tool shows adjacent word groups based on the
search condition. This allows you to see how words
and phrases are commonly used in a corpus of texts.
In some cases, this tool can be seen as summarizing
the results generated in the KWIC tool.

The following steps produce a set of clusters and demonstrate the main features of this tool.

1) Select a corpus using the "Corpus Manager" available from the File menu. Alternatively, create a quick corpus by choosing the "Open File(s) as Quick
Corpus" from the file menu. The files contained in the corpus are shown in the left frame of the main
window under "Target Corpus".
2) Enter a search query in the search box. See the 'SEARCH OPTIONS' section in this document for an
explanation of the "Words", "Case", and "Regex" search term options.
3) Choose the various parameters to filter the number of clusters to be shown: cluster size (number of
words in the cluster), minimum cluster frequency, and minimum cluster range (number of files).
4) Click the "Start" button to start the search and wait for the results to be displayed.
5) Use the "Sort by" option to rearrange the ordering of the results.
6) Use the "Search Term Position" options to determine if the results will show clusters that start with the
search query terms ("On Left"), end with the search query terms ("On Right") or can either start or end
with the search query terms ("On Left/Right").
7) The total number of cluster types ("Cluster Types") and combined total count of all the cluster tokens
("Cluster Tokens") are shown at the top of the tool window. When no hits are found, a warning will be
shown on the screen.
8) Double-click on any cell in the results window to cause the software to jump to the KWIC tool (see the
relevant section of this document) where you can view concordance lines for that cluster across the
whole corpus.
9) Advanced searches are available with this tool. Several menu preferences are also available with this
tool. (See the relevant sections in this document for explanations).
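
The cluster parameters described above (cluster size, minimum frequency, minimum range, and search term position) can be sketched as follows. The function name, the `position` values, and the tokenizer are assumptions for illustration and do not reflect AntConc's internal code.

```python
import re
from collections import Counter

def clusters(texts, query, size=2, min_freq=1, min_range=1, position="left"):
    """Map each cluster to (frequency, range), where range = number of files."""
    target = query.lower()
    freq = Counter()
    files = {}
    for doc_id, text in enumerate(texts):
        tokens = re.findall(r"\w+", text.lower())  # assumed tokenizer
        for i in range(len(tokens) - size + 1):
            gram = tokens[i:i + size]
            on_left = gram[0] == target
            on_right = gram[-1] == target
            if ((position == "left" and on_left)
                    or (position == "right" and on_right)
                    or (position == "both" and (on_left or on_right))):
                key = " ".join(gram)
                freq[key] += 1
                files.setdefault(key, set()).add(doc_id)
    return {c: (n, len(files[c])) for c, n in freq.most_common()
            if n >= min_freq and len(files[c]) >= min_range}

corpus = ["in the house in the garden", "out of the house"]
result = clusters(corpus, "the", size=2)
print(result)
```

Here `result` plays the role of the results table: the number of keys corresponds to "Cluster Types" and the sum of the frequencies to "Cluster Tokens".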

N-Gram Tool
This tool scans the entire corpus for all 'N'-sized
clusters (e.g., 2-word clusters, 3-word clusters, …).
This allows you to find common expressions in a
corpus.

The following steps produce a set of n-grams and demonstrate the main features of this tool.

1) Select a corpus using the "Corpus Manager" available from the File menu. Alternatively,
create a quick corpus by choosing the "Open File(s) as Quick Corpus" from the file menu. The files
contained in the corpus are shown in the left frame of the main window under "Target Corpus".
2) Choose the various parameters to filter the number of n-grams to be shown: n-gram size (number of
words), open slots (number of slots in the n-gram that can take multiple values), minimum n-gram
frequency, and minimum n-gram range (number of files).
3) Click on the "Start" button to start the search and wait for the results to be displayed. If a search query is
entered, only n-grams that match the query will be shown. See the 'SEARCH OPTIONS' section in this
document for an explanation of the "Words", "Case", and "Regex" search term options.
4) Use the "Sort by" option to rearrange the ordering of the results.
5) The total number of n-gram types ("N-Gram types") and combined total count of all the n-gram tokens
("N-Gram Tokens") are shown at the top of the tool window. When no hits are found, a warning will be
shown on the screen.
6) Double-click on any cell in the results window to cause the software to jump to the KWIC tool (see the
relevant section of this document) where you can view
concordance lines for that n-gram across the whole corpus.
7) For entries that contain open slots, Shift + Double-click on the
"Type" entry in the results window to show the variants that can
fit in the open slots via the "Open Slot Viewer" and two associated
statistics that show the degree of variation for the slot. The *_TT
value is the type/token ratio for the slot, and the *_ent value is
the Entropy value for the slot.
8) Advanced searches are available with this tool. Several menu
preferences are also available with this tool. (See the relevant
sections in this document for explanations).
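
The two open-slot statistics can be illustrated with a short sketch. This document does not state how AntConc computes them, so the following assumes the usual definitions: the type/token ratio is the number of distinct fillers divided by the number of filler occurrences, and the entropy is the Shannon entropy (in bits) of the filler distribution. The `*` slot marker, the function name, and the tokenizer are also assumptions.

```python
import math
import re
from collections import Counter

def open_slot_stats(text, pattern):
    """Return (fillers, type/token ratio, entropy) for one '*' slot in `pattern`."""
    tokens = re.findall(r"\w+", text.lower())  # assumed tokenizer
    n = len(pattern)
    slot = pattern.index("*")
    fillers = Counter()
    for i in range(len(tokens) - n + 1):
        gram = tokens[i:i + n]
        if all(p == "*" or p == g for p, g in zip(pattern, gram)):
            fillers[gram[slot]] += 1
    total = sum(fillers.values())
    if total == 0:
        return fillers, 0.0, 0.0
    ttr = len(fillers) / total
    # Assumed definition: Shannon entropy in bits over the filler distribution.
    entropy = -sum((c / total) * math.log2(c / total) for c in fillers.values())
    return fillers, ttr, entropy

text = "the end of the day the end of time the rest of it the top of it"
fillers, ttr, entropy = open_slot_stats(text, ["the", "*", "of"])
print(dict(fillers), ttr, entropy)
```

Under these assumptions, a slot that always takes the same word has an entropy of 0, and the value grows as the fillers become more varied and more evenly distributed.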
Collocate Tool
This tool shows words that appear frequently
within a certain distance of the search term (i.e.,
collocates). This allows you to find which words
co-occur with other words in a corpus.

The following steps produce a set of collocates and demonstrate the main features of this tool.

1) Select a corpus using the "Corpus Manager" available from the File menu. Alternatively,
create a quick corpus by choosing the "Open
File(s) as Quick Corpus" from the file menu. The files contained in the corpus are shown in the left frame
of the main window under "Target Corpus".
2) Choose the various parameters to filter the types of collocates to be shown: window span (possible
positions left and right of the search query terms, where the collocate can appear), minimum collocate
frequency, and minimum collocate range (number of files).
3) Enter a search query in the search box. See the 'SEARCH OPTIONS' section in this document for an
explanation of the "Words", "Case", and "Regex" search term options.
4) Click on the "Start" button to start the search and wait for the results to be displayed.
5) Use the "Sort by" option to rearrange the ordering of the results.
6) The total number of collocate types ("Collocate Types") and combined total count of all the collocate
tokens ("Collocate Tokens") are shown at the top of the tool window. When no hits are found, a warning
will be shown on the screen.
7) Double-click on any cell in the results window to cause the software to jump to the KWIC tool (see the
relevant section of this document) where you can view concordance lines for that collocate across the
whole corpus.
8) Advanced searches are available with this tool. Several menu preferences are also available with this
tool. (See the relevant sections in this document for explanations).
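
The window span and minimum frequency settings can be illustrated with a minimal sketch that counts the words appearing within a fixed number of positions to the left and right of each hit. It uses raw co-occurrence counts only; any statistical measure AntConc applies is not reproduced here, and the function name and tokenizer are assumptions.

```python
import re
from collections import Counter

def collocates(text, query, left=5, right=5, min_freq=1):
    """Count words within `left`/`right` positions of each hit of `query`."""
    tokens = re.findall(r"\w+", text.lower())  # assumed tokenizer
    target = query.lower()
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != target:
            continue
        window = tokens[max(0, i - left):i] + tokens[i + 1:i + 1 + right]
        counts.update(window)
    return [(w, n) for w, n in counts.most_common() if n >= min_freq]

text = "strong tea is good and strong tea is cheap"
result = collocates(text, "tea", left=1, right=1)
print(result)
```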

Word Tool
This tool counts all the words in the corpus and
presents them in an ordered list. This allows you to
find which words are the most frequent in a corpus.

The following steps produce a word list and demonstrate the main features of this tool.

1) Select a corpus using the "Corpus Manager" available from the File menu. Alternatively,
create a quick corpus by choosing the "Open
File(s) as Quick Corpus" from the file menu. The files contained in the corpus are shown in the left frame
of the main window under "Target Corpus".
2) Click on the "Start" button to start the processing and wait for the results to be displayed. If a search
query is entered, only words that match the query will be shown. See the 'SEARCH OPTIONS' section in
this document for an explanation of the "Words", "Case", and "Regex" search term options.
3) Use the "Sort by" option to rearrange the ordering of the results.
4) The total number of word types ("Word types") and combined total count of all the word tokens ("Word
Tokens") are shown at the top of the tool window. When no hits are found, a warning will be shown on
the screen.
5) Double-click on any cell in the results window to cause the software to jump to the KWIC tool (see the
relevant section of this document) where you can view concordance lines for that word across the whole
corpus.
6) Advanced searches are available with this tool. Several menu preferences are also available with this
tool. (See the relevant sections in this document for explanations).
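
The difference between word types and word tokens can be seen in a minimal word list sketch. The tokenizer and the function name are assumptions for illustration, and AntConc's own token definition may differ.

```python
import re
from collections import Counter

def word_list(texts):
    """Return (ranked list, number of types, number of tokens) for a corpus."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"\w+", text.lower()))  # assumed tokenizer
    # Most frequent first, ties broken alphabetically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked, len(counts), sum(counts.values())

ranked, n_types, n_tokens = word_list(["The cat and the dog", "the bird"])
for word, freq in ranked:
    print(f"{freq:>3}  {word}")
print("types:", n_types, "tokens:", n_tokens)
```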

Keyword Tool
This tool shows words that appear unusually
frequently in the target corpus in comparison with
the words in the reference corpus based on a
statistical measure (i.e., 'keywords'). These words
can be considered to be characteristic of the target
corpus. The settings can also be changed to show
words that appear unusually infrequently in the
target corpus compared with the reference corpus
(i.e., 'negative keywords').

The following steps produce a keyword list and demonstrate the main features of this tool.

1) Create a quick corpus by choosing the "Open File(s) as Quick Corpus" option from the file menu.
Alternatively, choose the "Corpus Manager" option from the file menu and make sure the "Target
Corpus" option is selected. Then, select one of the available corpora or create your own from raw files or
a word list (see the instructions under the Corpus Manager section of this help page for how to do this).
This corpus will then serve as the target corpus for your analysis. The files contained in the corpus will be
shown in the top left frame of the main window under "Target Corpus".
2) Choose a reference corpus by opening the "Corpus Manager" option from the file menu and checking
the "Reference Corpus" option. Next, as in step 1, select one of the available corpora or create your own
from raw files or a word list. The files contained in the corpus will be shown in the bottom left frame of
the main window under "Reference Corpus".
3) Click on the "Start" button to start the processing and wait for the results to be displayed. If a search
query is entered, only words that match the query will be shown. See the 'SEARCH OPTIONS' section in
this document for an explanation of the "Words", "Case", and "Regex" search term options.
4) Use the "Sort by" option to rearrange the ordering of the results.
5) The total number of keyword types ("Keyword types") and combined total count of all the keyword tokens
("Keyword Tokens") are shown at the top of the tool window. When no hits are found, a warning will be
shown on the screen.
6) Double-click on any cell in the results window to cause the software to jump to the KWIC tool (see the
relevant section of this document) where you can view concordance lines for that keyword across the
whole corpus.
7) Advanced searches are available with this tool. Several menu preferences are also available with this
tool. (See the relevant sections in this document for explanations).
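
This section refers only to "a statistical measure" without naming it. As an illustration, the sketch below uses the log-likelihood statistic, a measure commonly used for keyness in corpus linguistics. The choice of measure, the function name, and the signed return value (positive for keywords, negative for 'negative keywords') are assumptions, not a description of AntConc's settings.

```python
import math

def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Signed log-likelihood keyness of a word (assumed measure).

    freq_* are the word's frequencies; size_* are the corpus sizes in tokens.
    """
    total = freq_target + freq_ref
    expected_target = size_target * total / (size_target + size_ref)
    expected_ref = size_ref * total / (size_target + size_ref)
    ll = 0.0
    if freq_target > 0:
        ll += freq_target * math.log(freq_target / expected_target)
    if freq_ref > 0:
        ll += freq_ref * math.log(freq_ref / expected_ref)
    ll *= 2
    # Negative sign marks words that are relatively rarer in the target corpus.
    overused = freq_target / size_target >= freq_ref / size_ref
    return ll if overused else -ll

print(log_likelihood(50, 1000, 10, 1000))  # unusually frequent in the target
print(log_likelihood(10, 1000, 50, 1000))  # unusually infrequent in the target
```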
Wordcloud Tool
This tool visualizes the results generated by KWIC,
File, Cluster, N-Gram, Collocate, Word, and Keyword
tools as well as a “Scratchpad” of plain text in the
form of a ‘word cloud’. This is a grouping of words
where the sizing of the words reflects a property of
those words (e.g., frequency).
Wordclouds are often used as aesthetically pleasing
visualizations, where words are laid out in a viewing
area or ‘themed’ image mask and sized according to a
property (e.g., word frequency). Care should be taken
when using wordclouds for linguistic analysis, as the visualization necessitates distorting word sizes to fit the
viewing area.

The following steps produce a word cloud and demonstrate the main features of this tool.

1. Choose a “Source” for the word cloud. This can be a “Scratchpad” of plain text (accessible as the first
option in the “Appearance” list) or the output of the KWIC, File, Cluster, N-Gram, Collocate, Word, or
Keyword tools.
2. Choose the properties from the source to display as “labels” and “values” in the word cloud. For the KWIC,
File and Scratchpad sources, the properties of “Type” (Word) and “Freq” (Frequency) are chosen
automatically.
3. Choose the image size. This will determine how many words can be placed in the image. Note that the
image will be automatically scaled to the window display size.
4. Choose the maximum number of words to display in the word cloud. Depending on the number of items
from the source and the other settings (e.g., the minimum font size), this value might not be reached.
5. Check the “Use stopwords” checkbox to remove stopwords from the word cloud.
6. Check the “Repeat words” checkbox to fill as much remaining space as possible in the word cloud image
with existing words. Depending on other settings (e.g., the minimum font size), not all the space will be
used.
7. Click “Start” to generate the word cloud.

Appearance Option
The appearance of the word cloud can be adjusted through the following settings:
Scratchpad: This is a free writing area. If the scratchpad is chosen as the source, the words here (and their
frequencies of occurrence) will be used to plot the wordcloud.
Mask settings: Use these settings to determine if the word cloud should be ‘masked’ and if so, which mask to
use. Additional masks (.png and .svg files) can be added to the list by clicking the “Add” button.
The included masks with .svg extensions are kindly provided by Font Awesome
(fontawesome.com).
Color settings: Choose how to color the word cloud. Three options are available. If “Color theme” is chosen,
you can pick a color theme from the dropdown list of options and the range value that
determines which color(s) in the theme are used for which values in the word cloud. If “Text
color” is chosen, you can pick a specific color from the colors available on the system. If “Mask
color” is chosen, the colors of the words in the word cloud will match the colors used in the
original mask image. The image background can be set to a specific color or made transparent.
Font settings: Choose which font family and font size to use in the word cloud. If the “Allow squeezing”
option is selected, new words to be added to the word cloud that cannot fit in the remaining
space will be incrementally “squeezed” (reduced in font size) until they fit the space available.
This “squeezing” effect will distort the appearance of a word but will usually result in a more
aesthetically pleasing result. It is also used when the “Repeat words” option is chosen to fill as
much of the remaining space as possible. The “Scaling Factor” setting determines the weighting
given to the value (e.g., frequency) of the word relative to the rank of that value. At 1.0, only the
value is considered. At 0.0, only the rank is considered. The “H(orizontal)/V(ertical) Ratio”
setting determines the probability that a word is plotted horizontally rather than vertically. At 1.0, all
words will be plotted horizontally. At 0.0, all words will be plotted vertically.

ChatAI Tool
This tool gives you access to closed-source, open-
source, and open-weights Large Language Models
(LLMs) through a chat-like interface. When 'chatting'
with the LLMs, you can choose to interact with the
LLM directly, or you can supply the LLM with results
generated by other AntConc tools (e.g., the KWIC
tool) and ask it to utilize that information in its
responses.

The following steps demonstrate a typical interaction with ChatAI.

1. If you want to use an LLM provided by OpenAI, in the ChatAI tool settings menu, paste in your OpenAI API
key. If you have already entered your API key into your environment settings (on Windows), the key will
appear here automatically. Note that you will be charged by OpenAI for all API interactions according to
their pricing scheme: https://openai.com/pricing. If you want to use an open-source or open-weights LLM,
you can skip this step.
2. In the ChatAI tool settings menu...
a. Select the default LLM that you want to use. Click "Update Models" to refresh the list with the
most recent models available.
b. Select the "System Prompt" setting that you want to use during your interactions. The default
system prompt is a typical standard: "You are a helpful assistant."
3. In the main ChatAI interface window...
a. Set the maximum number of tokens ("Max tokens") to be generated by the LLM. Deactivating this
option will allow the LLM to generate its maximum output tokens.
b. Set the temperature (Temp) of the LLM. A setting of 0.0 will cause the LLM to always choose the
most probable next predicted token. A setting of 2.0 will cause the LLM to choose from a wider
range of probabilities for the next token, making it produce more creative, unpredictable
responses.
c. Check the "Stream" option to allow the LLM to show the response as it is generated one token at a
time. When unchecked, the complete response will be shown after it is generated.
d. Set the LLM that you want to use.
e. Choose whether to use the output from another AntConc tool by activating the "Source" option
and selecting which tool to use.
f. Set the "Context Policy" to either "Remove previous context" so that each prompt submission will
be treated as the start of a completely new conversation or "Maintain rolling context window" so
that as much of the previous context as possible will be remembered until the LLM context size is
reached. At this point, the earliest messages (prompts and responses) will be deleted from the
context window before the new prompt is added. Note that the system prompt is never deleted.
4. Type in your prompt in the text area. Save your prompt by clicking the "Save Prompt" button if you think
you will use it again. All saved prompts are available in the "Prompt History" list. If you choose to supply
results from another AntConc tool, they will be automatically appended to the end of the prompt (unseen).
As the LLM will remember the results supplied, it is recommended that you deactivate the "Source" option
after supplying the results once to reduce the number of tokens sent to the LLM.
5. Click the "Submit" button or press "CTRL + Return" to send your prompt to the LLM.
6. Wait for the LLM to respond. The response will appear in the upper window of the tool.
7. Continue to chat with the LLM based on the previous responses.
8. Click "Clear" to cause the LLM to forget the current conversation and start a new one.
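
The "Maintain rolling context window" policy described in step 3f can be sketched as follows. This is an illustration only: the function name is hypothetical, messages are plain strings, and tokens are approximated by whitespace-separated words, whereas a real LLM uses its own tokenizer. The sketch keeps the system prompt, drops the earliest messages until the total fits the context size, and then adds the new prompt.

```python
def build_context(system_prompt, history, new_prompt, context_size, count_tokens=None):
    """Return the messages to send: system prompt, kept history, new prompt."""
    # Assumed token counter: whitespace-separated words.
    count = count_tokens or (lambda message: len(message.split()))
    kept = list(history)

    def total():
        return count(system_prompt) + sum(count(m) for m in kept) + count(new_prompt)

    # Delete the earliest messages first; the system prompt is never deleted.
    while kept and total() > context_size:
        kept.pop(0)
    return [system_prompt] + kept + [new_prompt]

system = "You are a helpful assistant."
history = ["one two three", "four five", "six"]
context = build_context(system, history, "seven eight", context_size=10)
print(context)
```

Passing an empty `history` corresponds to the "Remove previous context" policy, where each prompt starts a new conversation.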

SEARCH OPTIONS
Search queries can be composed of full words or word fragments, with or without wildcards. The basic syntax
roughly follows the Common Elementary Query Language (CEQL) (see https://cwb.sourceforge.io/ceql.php for
more details). Searches are “case insensitive” by default and can be made “case sensitive” by activating
the "Case" search term option. Searches can also be made using full regular expressions by
activating the "Regex" option. With the "Regex" option, each word-level regular expression needs to be
separated by whitespace. To make regular expressions case-sensitive, select the "Case" option. For details on how
to use regular expressions, consult one of the many texts on the subject, e.g., Mastering Regular Expressions
(O'Reilly & Associates Press) or type "regular expressions" in a web search engine to find many sites on the
subject (e.g., http://www.regular-expressions.info/quickstart.html). AntConc supports Perl regular expressions
including Unicode character classes, e.g., \p{Letter}, even though the software is built using Python.
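
The idea of whitespace-separated, word-level regular expressions can be sketched as follows. This is a simplified illustration, not AntConc's search engine: the function name and tokenizer are assumptions, and the sketch uses Python's standard `re` module, which does not accept Unicode property classes such as \p{Letter}. Each whitespace-separated part of the query is compiled separately and must match one whole token in sequence.

```python
import re

def regex_search(text, query, case_sensitive=False):
    """Match a sequence of word-level regexes against consecutive tokens."""
    flags = 0 if case_sensitive else re.IGNORECASE  # mirrors the "Case" option
    patterns = [re.compile(part, flags) for part in query.split()]
    if not patterns:
        return []
    tokens = re.findall(r"\w+", text)  # assumed tokenizer
    hits = []
    for i in range(len(tokens) - len(patterns) + 1):
        window = tokens[i:i + len(patterns)]
        if all(p.fullmatch(t) for p, t in zip(patterns, window)):
            hits.append(" ".join(window))
    return hits

text = "She walked and He talked while walking"
print(regex_search(text, r"s?he \w+ed"))
print(regex_search(text, r"s?he \w+ed", case_sensitive=True))
```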

ADVANCED SEARCHES
By clicking on the "Advanced Search" button (available in all tools), more complex searches become possible.
• The "Search Query List" option allows you to import a set of search queries. You can do this in one of three
ways: 1) Type each individual search query in the entry box and click "Add"; 2) Drag and drop the list of
search queries into the viewer below the entry box; 3) Copy and paste the list of search queries into the
viewer below the entry box. When dragging and dropping or pasting your list of queries, each line will be
treated as a separate search query. This feature allows you to use a large set of search queries without
having to retype them each time. Any search query accepted in the main interface can be used.
• The "Context Search List" option (not available in all tools) allows you to define search queries that must
match within a certain context window around the main search term(s). For example, to search for
"student" or "students" appearing within three words to the left or right of the word "university", add
"university" to the "Search Query List", and then add "student" and "students" to the "Context Search
List". Finally, set the "Context Search List" "Window Span" to "From 3L" and "To 3R".
• The "SQL Search List" option (not available in all tools) allows you to adjust query conditions by applying
conditions directly on tables in the database that stores the corpus through a series of 'join' operations. For
example, the following entry in the list will join a custom "genres" table in the database (created through
the Corpus Manager) with the main "corpus" table, applying a condition that the "genre" column entry for
the file must be marked as "academic". The two tables are joined via the common "doc_id" column:
["genres", "genre = 'academic'", "doc_id"]
The format for the list entry should be a JSON array, with three components, "table" + "condition" + "join
column".
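The join described in the "SQL Search List" entry above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table layouts (a "corpus" table with "doc_id" and "token" columns), the sample rows, and the build_query function are assumptions made for illustration only; the actual tables and query builder used inside AntConc may differ.

```python
import json
import sqlite3

def build_query(entry_json: str) -> str:
    """Turn a ["table", "condition", "join column"] entry into a SELECT.

    Hypothetical helper for illustration; not AntConc's own code.
    """
    table, condition, join_column = json.loads(entry_json)
    return (
        f'SELECT corpus.* FROM corpus '
        f'JOIN "{table}" ON corpus."{join_column}" = "{table}"."{join_column}" '
        f'WHERE {condition}'
    )

# Toy in-memory database with an assumed, simplified schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE corpus (doc_id INTEGER, token TEXT);
    CREATE TABLE genres (doc_id INTEGER, genre TEXT);
    INSERT INTO corpus VALUES (1, 'hypothesis'), (1, 'data'), (2, 'dragon');
    INSERT INTO genres VALUES (1, 'academic'), (2, 'fiction');
""")

# The same list entry as shown in the text above.
entry = '["genres", "genre = \'academic\'", "doc_id"]'
rows = conn.execute(build_query(entry)).fetchall()
print(rows)  # only tokens from documents marked as "academic"
```

In this sketch, only the tokens from document 1 are returned, because document 2 is marked as "fiction" in the custom "genres" table.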

MENU OPTIONS
Menu options are divided into four groups: "File", "Edit", "Settings", and "Help". The options available in each
group will be described below.
<FILE>
• Open File(s) as 'Quick Corpus'...
o This option is for quickly creating a temporary corpus. You will be asked to choose the files you
want to add to your corpus. Then, the software will create a "temp" corpus using the default
settings of the Corpus Manager and load this for immediate use. Any existing "temp" corpus
will be overwritten.
• Open Corpus Manager...
o This option opens the Corpus Manager, where you can choose prebuilt corpora from the
default library, add or delete corpora from a user library, or create custom corpora from raw
files. See the Corpus Manager section for complete details.
• Swap Target/Reference Corpora
o This option swaps the target and reference corpora allowing easy comparisons.
• Clear Tool/All Tools/All Tools and Files
o These options will reset the interface.
• Save Current Tab Results...
o This option allows results displayed in the main interface to be exported to a file. Note
that hidden columns will also be included. (Direct copying and pasting of results from the
interface are also possible.)
• Save Current Tab Database Tables...
o This option allows complete tables of results from the corpus database to be exported to a set
of .csv files. All relevant information about results can be found in these files.
• Import Settings From File.../Export Settings To File...
o These options allow the state of the software to be saved and reloaded at a different time.
• Restore Default Settings
o This option resets the state of the software to when it was first installed. All custom settings
are lost.
<EDIT>
• Select All
o This option selects all results in the results window. The same effect can be achieved using the
standard keyboard shortcut for "Select All". See the SHORTCUTS section for more details.
• Copy
o This option copies any selected text in the results window. The same effect can be achieved using the
standard keyboard shortcut for "Copy". See the SHORTCUTS section for more details.

<SETTINGS>
• Global Settings (applied to all tools in the interface)
o Colors - This setting decides the main highlight color (e.g., for highlighted words in the File tool)
and the color indicators for the Corpus Manager Pre-Built Corpus Library.
o Files - This setting decides how the paths to files are shown. Also, this setting determines which
file types are used as defaults in File Open dialogs and file drop options. This setting is also
used to decide if encoding errors are shown or ignored when creating corpora from raw files.
o Fonts - This setting decides the font family, size, and style of the font for the main interface.
o Language Direction - This setting decides how to display results (especially for the KWIC
concordance tool) depending on the language direction. For example, choose the default "Left-to-right"
option for languages such as English. Choose the "Right-to-left" option for languages
such as Arabic. Check the 'Arabic' checkbox for smooth processing of Arabic in the Wordcloud
tool.
o Restore Settings - This setting decides if the settings will be automatically saved and restored
when AntConc is restarted.
o Searches - This setting lists all wildcards available in the system (note that these cannot be
edited).
o Statistics - This setting decides how values are displayed through normalization and floating-
point precision settings.
o Tags - This setting decides how word information is displayed in the interface depending on the
corpus currently loaded. For a fully tagged corpus, the options will be "Type", "POS",
"Type+POS", and "headword" (lemma).
o Tool Filters - This setting decides if only words in the selected file will be shown or hidden in
the respective tools. When the "Hide words in file" option is chosen, the selected file serves as
a "stop list".
• Tool Settings
o KWIC
▪ Display Type - decides which word-level information (type, POS, headword) will be
displayed.
▪ Display Options - decides the colors used to highlight the sort order.
▪ Other Options
• Choose to show or hide the file names in the display.
• Choose to show or hide the search term in the display. This option is useful for
allowing instructors to quiz students on possible words to fit the gap.
o Plot
▪ View Style - decides which view to use (table/graphic or graphic)
▪ Display Options - decides how results are displayed.
▪ Statistics - decides the parameters for determining the dispersion measure.
▪ Other Options - decides various parameters for sizing/displaying the plot graphs.
o File
▪ Display Type - decides which word-level information (type, POS, headword) will be
displayed.
o Cluster
▪ Display Type - decides which word-level information (type, POS, headword) will be
displayed.
▪ Display Options - decides what information is shown in the results window.
▪ Filter Options - decides if clusters can only span across whitespace boundaries or can
include other characters (e.g., punctuation)
o N-Gram
▪ Display Type - decides which word-level information (type, POS, headword) will be
displayed.
▪ Display Options - decides what information is shown in the results window.
▪ Filter Options - decides if n-grams can only span across whitespace boundaries or can
include other characters (e.g., punctuation)
o Collocate
▪ Display Type - decides which word-level information (type, POS, headword) will be
displayed.
▪ Likelihood Measure + Threshold
• Choose the statistic and cut-off point (threshold) for inclusion of words in the
collocates list. Words below the cut-off-point are deemed to appear frequently
together with the query term by chance.
▪ Effect Size Measure + Threshold
• Choose the statistic used to determine the strength of the relationship between the
query term and collocate and a cut-off point (threshold) for inclusion of words
that meet the minimum effect size.
o Word
▪ Display Type - decides which word-level information (type, POS, headword) will be
displayed. Options that show information in brackets will collate that information and
present it as a family. See the examples below for an illustration for what happens with
POS tagged data (i.e., data for which the type and POS information are provided) and
lemmatized data (i.e., data for which the type and headword information are provided)
• Type: this option will combine words with same type but different POS tags or
lemma headwords into a single entry and sum the frequencies and calculate
the range values for this entry
• Type+POS: this option will treat words with the same type but different POS
tags as different entries. The frequencies and range values will be
independent.
• Type+[POS]: this option will combine words with the same type but different
POS tags into a single entry and sum the frequencies and calculate the range
values for this entry. The option will also show all the POS variants and their
separate frequency counts that combine to make up the total.
• Type+Headword: this option will treat words with the same type but different
headword (lemma) tags as different entries. The frequencies and range values
will be independent.
• Headword: this option will combine all words from the same lemma family into
a single headword entry and sum the frequencies and calculate the range
values for this entry
• Headword+[Type]: this option will combine words from the same lemma family
into a single entry and sum the frequencies and calculate the range values for
this entry. The option will also show all the lemma family variants and their
separate frequency counts that combine to make up the total.
▪ Display Options - decides what information is shown in the results window.
o Keyword
▪ Display Type - decides which word-level information (type, POS, headword) will be
displayed. See the entry for Word for an explanation of what the options represent.
▪ Display Options - decides what information is shown in the results window.
▪ Negative Keywords - decides whether to show words that appear unusually
infrequently in the target corpus compared with the reference corpus.
▪ Likelihood Measure + Threshold - decides the statistic and cut-off point for inclusion of
words in the keyword list. Words below this cut-off-point are deemed to appear
frequently in the target corpus compared with the reference corpus by chance.
▪ Effect Size Measure + Threshold - decides the measure used to determine the strength of
keyness and a cut-off point for inclusion of words that meet the minimum effect size.
[Appropriate effect size measures are still being debated in the field, so the default
setting is to show all values for this measure. With the default settings, keywords are
ranked according to their likelihood measure scores. This equates to ranking keywords
according to p-values, which raises several questions/problems. However, it is the
current standard in the field and results tend to show that ranking by likelihood leads
to more intrinsically intuitive results than those generated when an effect size measure
is used. The current selection of likelihood measures and effect size measures is
inspired by the work of Andrew Hardie of Lancaster University.]
o Wordcloud
▪ Color Theme Options - decides what color themes are available in the tool controller.
• Perceptually uniform sequential
o These themes have incremental changes in lightness and often
saturation of color that are perceived to be uniform. This makes them
suitable to represent changes in frequency or other values.
• Sequential
o These themes have incremental changes in lightness and often
saturation of color. This makes them suitable to represent changes in
frequency or other values.
• Qualitative
o These themes contain miscellaneous colors. This makes them
unsuitable for most cases. The exception is when you want to produce
visually appealing results with no connection between the color and
the value being represented.
▪ Mask Theme Options - decides what mask themes are available in the tool controller.
o ChatAI
▪ API Settings - decides the way ChatAI utilizes APIs.
• Show API warning on first use
o Show a warning when first using the ChatAI tool to remind the user
that they will be charged by OpenAI for all API interactions.
• OpenAI API Key
o An area to input your OpenAI API key. This is required to use LLMs
provided by OpenAI. Note that you will be charged by OpenAI for all
API interactions according to their pricing scheme:
https://openai.com/pricing.
▪ Model Settings
• Default Model
o This is the default model shown in the main interface.
▪ Prompt Settings
• System prompt
o This is the system prompt that will guide the LLM during a single chat
interaction. It can be used to strengthen the skills of the LLM (e.g.,
explain what expertise it is supposed to have), determine the style of
interaction (e.g., formal or informal), and even in what way it should
proceed (e.g., solving problems slowly in a stepwise fashion).
<HELP>
• Show Help Page
o This option shows the help guide as a PDF file.
• Show License
o This option shows the license agreement that you agree to when using the software.
• Show Version History
o This option shows the complete history of releases, detailing new features, bug fixes, and
major updates.
• About AntConc
o This option shows the release version, release date, copyright information, and
acknowledgments for the software.

Corpus Manager

The Corpus Manager is a multi-purpose tool used to load and save pre-built corpus databases, create and save
a new corpus from raw (.txt, .srt, …), Word, or PDF files, or create and save a new corpus from a simple or
advanced word list. The three different scenarios are explained below.
Choosing/Saving a pre-built corpus Building/Saving a corpus from raw files Building/Saving a corpus from a word list

Choosing/Saving a pre-built corpus database

Choose the "Corpus Database" corpus option.
1. This option shows a list of pre-built corpus
databases available in the "Corpus Library" of
AntConc in the left windowpane in a tree layout.
a) The list shows all “Default” corpora that
are available with AntConc via an online
repository. The list also shows all “User”
corpora that you have created in the
Corpus Manager (see below) or loaded
into the library. To hide corpora in the
online repository that have not been downloaded,
uncheck the “Show online corpora” box.
b) By default, lists of corpora are collapsed to their top
level. Clicking on the list arrow will expand the list to see
all the entries. Clicking the arrow again will collapse the
list back to its initial state.
c) Corpora that are installed and ‘ready’ to use are marked
green. Others that are ‘available’ for download from the
online repository appear unmarked. Orange
indicates that the list has some marked and unmarked
corpora.
d) To download a corpus from the online repository, click its
indicator. The indicator will change from ‘available’
to ‘ready to download’. Next, activate the “Connect
online” checkbox and click the “Update” button. The
library list will refresh and show the new status of all the
corpora.
e) To delete a ‘ready’ corpus from the library, click on its indicator. The indicator will change from
‘ready’ to ‘delete’. Next, click the “Update” button. The library list will refresh and show
the new status of all the corpora.
f) To select a “Target Corpus”, first select the “Target Corpus” tab on the right-hand side of the window. Next,
double click the name of the corpus (not the indicator). The corpus indicator will change from
‘ready’ to ‘Target’. The details of the corpus will appear in the “Target Corpus” tab.
g) To select a “Reference Corpus”, first select the “Reference Corpus” tab on the right-hand side of the
window. Next, double click the name of the corpus (not the indicator). The corpus indicator will
change from ‘ready’ to ‘Reference’. The details of the corpus will appear in the “Reference
Corpus” tab.
2. Pre-built corpus databases (e.g., created by other users of AntConc) can be loaded directly into the Corpus
Database Library by clicking the “Add Database File(s)” or “Add Database Dir” buttons and selecting the
relevant file(s) or folder. All corpora loaded this way will appear in the “User” list.
3. The entire Corpus Database Library can be exported for backup by clicking the “Export Library” button or
restored from a backup by clicking the “Import Library” button. When restoring a backup library, current
corpora with the same name will be overwritten with the backup, but other corpora will remain
unchanged. This feature is useful for users on MacOS and Linux systems who want to create a backup of
the Corpus Database Library before updating AntConc to a new version. This is because the operating
system will delete all files before the update is made.
4. Any single corpus can be saved to a new location by activating it as “Target” or “Reference” corpus and
then clicking on the "Save" button at the bottom of the right windowpane and saving the file.
5. Click the "Return to Main Window" button at the bottom right of the Corpus Manager to return to the
main window and start using a selected corpus. The files in the corpus will be shown in the left pane of the
main window.

Building/Saving a corpus from raw files

Choose the "Raw Files" corpus option. Several 'builder
options' will appear in the left windowpane. Follow the
steps below:
1. Choose a name for your custom corpus. A default name
is provided.
2. Choose the files to be included in your corpus.
a. Use the "Add File(s)" or "Add Directory" options
to choose your raw files. You can choose from a
wide variety of text formats, including plain text (.txt, .srt, .sub), table files (.csv, .tsv),
HTML/XHTML/XML (.html, .xhtml, .xml), WORD (.doc, .docx), PDF (.pdf), and EPUB (.epub) files.
3. Adjust one or more of the basic settings as necessary (OPTIONAL)
a. Decide the indexer used to process the raw files.
i. For simple files with no annotation or part-of-speech (POS) tagging, the
"simple_word_indexer" will work well.
ii. For simple files that have been part-of-speech (POS) tagged, the
"simple_word_pos_headword_indexer" will work well.
iii. For simple files that have been part-of-speech (POS) tagged using the Biber Tagger, the
"simple_word_bibertag_indexer" will work well.
iv. For other files with more complex structures, different indexers will become available over
time.
b. Choose the character encoding of the files.
i. The default option (UTF-8) is the standard in the field. Many other encodings are also
available in the "Other" options.
▪ If you are unsure of what encoding to use, you are recommended to initially
choose the default UTF-8 option. Later, if you see an encoding error when trying to
process your files or find that your files appear corrupted in the various tool
displays, it probably means that the encoding is wrong, and you should determine
the correct encoding to use. For Word (.docx) and PDF (.pdf) files, the default
encoding should generally be fine.
ii. For more information on the Unicode standards see:
▪ http://www.cs.tut.fi/~jkorpela/unicode/guide.html
▪ http://www.unicode.org/
▪ http://www.unicode.org/Public/5.0.0/ucd/UCD.html
▪ http://www.unicode.org/Public/UNIDATA/PropList.txt
▪ http://www.unicode.org/charts/
c. Decide the definition of a token (word) in the corpus.
i. In some cases, you may only want to include tokens (words) comprised of the letters a-zA-Z
in your corpus, whereas other times, you might want to include tokens comprised of
letters, numbers, apostrophes etc. The token definition determines what tokens your
corpus is comprised of.
ii. If you click on the "Show Token Definition Settings" button, a new window will open where
you can choose your definition. AntConc offers three ways to choose a token definition:
"Character Classes", "User-Defined Characters", and "User-Defined Regex":
1. "Character Classes": This is the default option and is the most comprehensive. Click
the various options to add characters to the definition. These classes are fully
Unicode compliant, meaning that they can handle data in any language, including
all European languages and Asian languages. For example, the default option
"Letters" refers to 'letters' in the broadest sense, including all English letters (a to z,
A to Z) and all Japanese and Chinese 'letter' characters.
2. "User-Defined Characters": This is a simple option whereby all characters you type
in the text edit box will be included in the final token definition.
3. "User-Defined Regex": This is another simple option whereby all characters that
match the given regex (regular expression) will be included in the final token
definition.
iii. If you activate the "Ignore header" option, you can choose a starting and ending tag for
your corpus file headers. Any text between these two tags will be ignored when the
corpus is created.
iv. If you activate the "Ignore footer" option, you can choose a starting tag for your corpus
file footers. The footer tag and any text appearing after the tag will be ignored when the
corpus is created.
v. If you activate the "Ignore non-embedded tags" option, you can choose a starting and
ending tag for elements in your corpus files that you want to ignore. Any text between
these two tags will be ignored when the corpus is created.
vi. If you activate the "Ignore embedded tags" option, you can choose a tag marker for token
elements in your corpus files that you want to ignore. Any text appearing after the tag
marker will be ignored when the corpus is created.
vii. Using the "Token Testing Area", you can test your token definition by typing or
copy/pasting a text into the left-hand text box, clicking the "Test" button, and checking in
the right-hand text box to see what tokens will be generated.
viii. After defining your corpus token definition, click "Apply".
d. Choose how to process rows of data in your files. The “One text per file” option (default) will
process each file as a single corpus text. The “One text per row” option will treat each row in each
file as an individual corpus text. This option is useful when the raw files are composed of tabular data
and each row needs to be treated separately.
4. Adjust one or more of the advanced settings as necessary (OPTIONAL)
a. Choose metadata tables (if available)
i. If you click on "Add File(s)" or "Add Directory", you can choose optional metadata tables
that will be stored as SQLite database tables together with your raw corpus data. The
information in these metadata tables must be aligned with the column names used in the
existing tables of the corpus. To understand the default table structure, open the corpus
database in an SQLite database reader (e.g., https://sqlitebrowser.org/) and view the
different tables.
ii. Once the corpus is built, you will be able to form search queries on the main corpus using
values in these tables as conditional elements.
b. Choose a headword/lemma/grouping list (if available)
i. If you click on "Add File", you can choose an optional headword/lemma/grouping list that
will be used to map words to headwords, lemmas, or grouping categories. Existing
headwords/lemmas/groupings (e.g., those generated by a POS tagger) will be overwritten
by these headwords/lemmas/groupings. The format of the headword/lemma/grouping
list is as follows (where -> represents a tab character):
headword->family member 1->family member 2->family member 3….
ii. Once the corpus is built, you will be able to form search queries on the main corpus using
these headword/lemma/grouping terms as conditional elements.
5. Create the corpus
a. To complete the corpus building process, click the "Create" button.
b. Once the corpus is created, a basic description will be displayed in the right windowpane.
c. The corpus will be available in the "Corpus Library" and can be saved to another location by clicking
on the "Save" button next to the “Active Corpus Database” label.
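The headword/lemma/grouping list format described in step 4b (one headword followed by its tab-separated family members on each line) can be illustrated with a short Python sketch. The function name and the sample entries are hypothetical, and mapping each headword to itself is an assumption made here for convenience; this is not AntConc's own loader.

```python
def parse_headword_list(text: str) -> dict[str, str]:
    """Map each family member (and the headword itself) to its headword."""
    mapping: dict[str, str] = {}
    for line in text.splitlines():
        # Split on tabs and drop empty fields caused by stray tabs.
        fields = [field.strip() for field in line.split("\t") if field.strip()]
        if not fields:
            continue  # skip blank lines
        headword, members = fields[0], fields[1:]
        mapping[headword] = headword  # assumption: a headword maps to itself
        for member in members:
            mapping[member] = headword
    return mapping

# Two made-up lemma families in the tab-separated format.
sample = "be\tam\tis\tare\twas\twere\ngo\tgoes\twent\tgone\n"
headwords = parse_headword_list(sample)
print(headwords["went"])
print(headwords["is"])
```

A mapping of this kind is what allows words such as "went" and "goes" to be grouped under the single headword "go".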

Building/Saving a corpus from a word list

Choose the "Word List" corpus option. Several 'builder
options' will appear on the top-left of the window. Follow
the steps below:
1. Choose a name for your custom corpus. A default name
is provided.
2. Choose which type of word list you want to load.
a. Choose the “Simple Word List” option to build a
corpus from a simple list of type, frequency,
and optional range (document frequency)
values. The format for the simple word list is a
header row: “type”, “freq”, and optional
“range”, followed by rows for each type in the
corpus. AntConc will estimate the size of the
corpus as the sum of the type frequencies and
the total number of texts in the corpus from
the maximum range (document frequency)
value. If no range values are included, the
range for all types will be set as 1.
b. Choose the “Advanced Word List” option to
build a corpus from three files that describe the complete corpus (the “Corpus information file”),
the word list metadata (the “Word list information file”), and the frequency information of types in
the corpus (the “Word list data file”). The format of these files can be found by generating a word
list using the demo corpus or one of the corpora from the online corpus repository and saving the
results. In the saved .zip file, you will find examples of the table formats that you need to use.
Alternatively, open the corpus database in an SQLite database reader (e.g.,
https://sqlitebrowser.org/) and view the different tables. An SQLite database reader should allow
you to export the tables as CSV/TSV files that you can use as templates for your own wordlist
corpus creation.
3. To load the necessary file(s), click the "Add File" button for each file type and choose a file in CSV or TSV
format.
4. Create the corpus.
a. To complete the corpus building process, click the "Create" button.
b. Once the corpus is created, a basic description will be displayed in the right windowpane.
c. The corpus will appear in the "Corpus Library".
[Note that a wordlist corpus can only be used as a reference corpus for use when generating keyword
lists. Using a wordlist-based corpus in other tools will result in an error.]
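The estimates described for the "Simple Word List" option (corpus size as the sum of the type frequencies, number of texts as the maximum range value, and a range of 1 when no range values are given) can be sketched in Python as follows. The comma-separated layout, the function name, and the sample values are assumptions for illustration only.

```python
import csv
import io

def summarize_simple_word_list(csv_text: str) -> tuple[int, int]:
    """Return (estimated corpus size in tokens, estimated number of texts)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    total_tokens = 0
    max_range = 1  # used as-is when no "range" column is present
    for row in reader:
        total_tokens += int(row["freq"])
        # A missing or empty range value is treated as 1.
        max_range = max(max_range, int(row.get("range") or 1))
    return total_tokens, max_range

# Made-up word lists, with and without the optional "range" column.
with_range = "type,freq,range\nthe,120,10\nof,60,9\ncorpus,15,4\n"
without_range = "type,freq\nthe,120\nof,60\n"

print(summarize_simple_word_list(with_range))
print(summarize_simple_word_list(without_range))
```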

Shortcuts
Here is a list of useful shortcuts (including some of the useful standard shortcuts on the operating system).
• CTRL/COMMAND + TAB: Toggles clockwise through the different tools in the tab bar.
• ALT + Tool Number: Selects a specific tool (e.g., ALT+1 for the KWIC Tool, ALT+2 for the Plot tool).
• SHIFT + CTRL/COMMAND + TAB: Toggles anti-clockwise through the different tools in the tab bar.
• CTRL/COMMAND + C: Copies the currently selected text.
• CTRL/COMMAND + A: Selects all text in the window.
• F4 (Win): Reveals the complete list of options in a 'combobox' widget (e.g., the search history in the
search query box).
• ARROW KEYS: For any 'combobox' widgets (e.g., the KWIC search query box) or 'spinbox' widgets (e.g.,
the KWIC context size), the 'UP' and 'DOWN' arrow keys on the keyboard can be used to change the
value of the option.
• CTRL/COMMAND + O: Opens the "Corpus Manager".
• CTRL/COMMAND + G: Opens the "Global Settings".
• CTRL/COMMAND + T: Opens the "Tool Settings".
• CTRL/COMMAND + H: Toggles the "(H)ide" or "Show" setting for the KWIC tool file name column.
• CTRL/COMMAND + '+': Zooms in the "Plot Tool" display.
• CTRL/COMMAND + '-': Zooms out the "Plot Tool" display.
• CTRL/COMMAND + F: Searches for the next hit in the "File Tool" display.
• SHIFT + CTRL/COMMAND + F: Searches for the previous hit in the "File Tool" display.
• CTRL/COMMAND + SHIFT + T: Swaps/Toggles the target and reference corpora in the main display.

Citing/Referencing
Use the following method to cite and reference AntConc according to the APA style guide:

Anthony, L. (YEAR OF RELEASE). AntConc (Version VERSION NUMBER) [Computer Software]. Tokyo, Japan:
Waseda University. Available from https://www.laurenceanthony.net/software.html

For example, if you download AntConc 4.0.0, which was released in 2021, you will cite/reference it as follows:
Anthony, L. (2021). AntConc (Version 4.0.0) [Computer Software]. Tokyo, Japan: Waseda University. Available
from https://www.laurenceanthony.net/software.html

STATISTICS
Below is a list of statistics used in AntConc. The notation used here is taken from the work of Evert (2004: 36-
37). Explanations are provided in Anthony (2023).
References:
Anthony, L. (2023). Common statistics used in corpus linguistics. Available at
https://laurenceanthony.net/resources/statistics/common_statistics_used_in_corpus_linguistics.pdf.
Evert, S. (2004). The Statistics of Word Cooccurrences: Word Pairs and Collocations. Unpublished Ph.D. thesis.
University of Stuttgart. (Published 2005; available online at
http://elib.uni-stuttgart.de/opus/volltexte/2005/2371/.)

Effect Size Statistics

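Statistic    Equation
Dice    $\mathit{Dice\ coefficient} = \dfrac{2 O_{11}}{R_1 + C_1}$
LogDice    $\mathit{LogDice} = 14 + \log_2\left(\dfrac{2 O_{11}}{R_1 + C_1}\right)$
Log Ratio (if $O_{21} = 0$, a value of 0.5 is used)    $\mathit{Log\ Ratio} = \log_2\left(\dfrac{R_2 O_{11}}{R_1 O_{21}}\right)$
MI    $\mathit{MI} = \log_2\left(\dfrac{O_{11}}{E_{11}}\right)$
MI2    $\mathit{MI2} = \log_2\left(\dfrac{(O_{11})^2}{E_{11}}\right)$
MI3    $\mathit{MI3} = \log_2\left(\dfrac{(O_{11})^3}{E_{11}}\right)$
MS    $\mathit{MS} = \min\left\{\dfrac{O_{11}}{R_1}, \dfrac{O_{11}}{C_1}\right\}$
Mu    $\mathit{Mu\ value} = \dfrac{O_{11}}{E_{11}}$
RRF (if $O_{21} = 0$, a value of 0.5 is used)    $\mathit{RRF} = \dfrac{R_2 O_{11}}{R_1 O_{21}}$
DRF    $\mathit{DRF} = \dfrac{O_{11}}{R_1} - \dfrac{O_{21}}{R_2}$
Z-Score    $z = \dfrac{O_{11} - E_{11}}{\sqrt{E_{11}}}$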
T-score    $\mathit{T\text{-}score} = \dfrac{O_{11} - E_{11}}{\sqrt{O_{11}}}$

Likelihood Statistics
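Statistic    Equation
Chi-Squared (X2) (4 term)    $\chi^2 = \dfrac{(O_{11} - E_{11})^2}{E_{11}} + \dfrac{(O_{12} - E_{12})^2}{E_{12}} + \dfrac{(O_{21} - E_{21})^2}{E_{21}} + \dfrac{(O_{22} - E_{22})^2}{E_{22}}$
Chi-Squared (X2) (4 term) (Yates Correction)    $\chi^2_{\mathit{Yates}} = \dfrac{(|O_{11} - E_{11}| - 0.5)^2}{E_{11}} + \dfrac{(|O_{12} - E_{12}| - 0.5)^2}{E_{12}} + \dfrac{(|O_{21} - E_{21}| - 0.5)^2}{E_{21}} + \dfrac{(|O_{22} - E_{22}| - 0.5)^2}{E_{22}}$
Log-Likelihood (G2) (4 term)    $\mathit{Log\ Likelihood} = 2\left(O_{11}\ln\dfrac{O_{11}}{E_{11}} + O_{21}\ln\dfrac{O_{21}}{E_{21}} + O_{12}\ln\dfrac{O_{12}}{E_{12}} + O_{22}\ln\dfrac{O_{22}}{E_{22}}\right)$
Log-Likelihood (G2) (2 term)    $\mathit{Log\ Likelihood} = 2\left(O_{11}\ln\dfrac{O_{11}}{E_{11}} + O_{21}\ln\dfrac{O_{21}}{E_{21}}\right)$
Text Dispersion (4 term) (O and E for range values)    $\mathit{Text\ Dispersion} = 2\left(O_{11}\ln\dfrac{O_{11}}{E_{11}} + O_{21}\ln\dfrac{O_{21}}{E_{21}} + O_{12}\ln\dfrac{O_{12}}{E_{12}} + O_{22}\ln\dfrac{O_{22}}{E_{22}}\right)$
Text Dispersion (2 term) (O and E for range values)    $\mathit{Text\ Dispersion} = 2\left(O_{11}\ln\dfrac{O_{11}}{E_{11}} + O_{21}\ln\dfrac{O_{21}}{E_{21}}\right)$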
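To show how some of the statistics listed above are calculated, here is a minimal Python sketch for a single 2x2 contingency table. It assumes the definitions in Evert's notation, where R1 and R2 are the row totals, C1 and C2 are the column totals, N is the grand total, and each expected value is E_ij = R_i * C_j / N. The function names and counts are made up for illustration, and the handling of zero values (replacing a zero O21 with 0.5 and skipping cells where O = 0) is an assumed reading, not a description of AntConc's internal code.

```python
import math

def expected_frequencies(o11, o12, o21, o22):
    """Return row/column totals and expected values for a 2x2 table."""
    r1, r2 = o11 + o12, o21 + o22      # row totals
    c1, c2 = o11 + o21, o12 + o22      # column totals
    n = r1 + r2                        # grand total
    e11, e12 = r1 * c1 / n, r1 * c2 / n
    e21, e22 = r2 * c1 / n, r2 * c2 / n
    return (r1, r2, c1, c2), (e11, e12, e21, e22)

def log_dice(o11, r1, c1):
    return 14 + math.log2(2 * o11 / (r1 + c1))

def mi(o11, e11):
    return math.log2(o11 / e11)

def log_ratio(o11, o21, r1, r2):
    # Assumed reading of the note in the table: a zero O21 is replaced by 0.5.
    o21 = o21 if o21 > 0 else 0.5
    return math.log2((r2 * o11) / (r1 * o21))

def log_likelihood(observed, expected):
    # Cells with O = 0 are skipped, since O * ln(O / E) tends to 0 as O -> 0.
    return 2 * sum(o * math.log(o / e)
                   for o, e in zip(observed, expected) if o > 0)

# Made-up counts for illustration only.
o11, o12, o21, o22 = 30, 970, 70, 8930
(r1, r2, c1, c2), (e11, e12, e21, e22) = expected_frequencies(o11, o12, o21, o22)

ll4 = log_likelihood((o11, o21, o12, o22), (e11, e21, e12, e22))  # 4 term
ll2 = log_likelihood((o11, o21), (e11, e21))                      # 2 term

print(f"E11 = {e11:.1f}")
print(f"LogDice = {log_dice(o11, r1, c1):.3f}")
print(f"MI = {mi(o11, e11):.3f}")
print(f"Log Ratio = {log_ratio(o11, o21, r1, r2):.3f}")
print(f"Log-Likelihood (4 term) = {ll4:.2f}, (2 term) = {ll2:.2f}")
```

The Text Dispersion statistics use the same log-likelihood calculation, with the O and E values taken from range (document frequency) counts instead of token counts.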

Notes
• If you have any suggestions for improving the software or notice any bugs, please post them in the
AntConc Discussion Group (https://groups.google.com/g/antconc). Indeed, many of the improvements
and updates made to the software have been due to the comments of users around the world, for
which I am very grateful. The AntConc Discussion Group is also a good place to discuss how you are
using the software and any challenges that you face.
• If you find the software useful in your research, teaching, or learning, you may consider making a small
donation to support the future development of this tool. A link to the donation page can be found
here: https://www.laurenceanthony.net/software/antconc/
• You may also be interested in becoming an AntConc patron. Depending on the level of support, this
option will give you priority support with direct access to the developer (Laurence Anthony), and
various other benefits. A link to the donation page can be found here:
https://www.patreon.com/antlab

Acknowledgements
I would like to say thank you to the users of AntConc who have taken the trouble to post feedback on the
software and make suggestions for improvements and/or changes. A very special thank you goes to all those
who have very generously supported the project either through single donations via PayPal or becoming a
Patreon supporter. A complete list of individual acknowledgments can be found in the Help menu - "About
AntConc" menu.

The development of AntConc has been supported by a Japan Society for Promotion of Science (JSPS) Grant-in-
Aid for Scientific Research (C): No. 23501115, a Japan Society for Promotion of Science (JSPS) Grant-in-Aid for
Young Scientists (B): No. 18700658, a Japan Society for Promotion of Science (JSPS) Grant-in-Aid for Young
Scientists (B): No. 16700573, and a WASEDA University Grant for Special Research Projects: No. 2004B-861.

Known Issues
None at present.

