Help
Help
Build 4.3.1
Laurence Anthony, Ph.D.
Center for English Language Education in Science and Engineering, School of Science and
Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
July 29, 2024
Introduction
AntConc is a freeware, multiplatform tool for carrying out corpus linguistics research, introducing corpus
methods, and doing data-driven language learning. It runs on any computer running Microsoft Windows (built
on Win 10), MacOS (built on Mac Catalina), and Linux (built on Linux Mint). It is developed in Python and Qt
using the PyInstaller compiler to generate executables for the different operating systems. It uses SQLite as the
underlying database.
Getting Started
Windows - Installer
Double click the AntConc.exe file and follow the instructions to install the application into your Programs folder.
You can delete the .exe file when you are finished. You can start the application via the Start Menu.
Windows - Portable
Unzip the AntConc.zip file into a folder of your choice. In the AntConc folder, double click the AntConc.exe file to
launch the program.
Macintosh OS X
Double click the AntConc.dmg file to create an AntConc disk image on your desktop. Open the disk image and
drag and drop the AntConc app onto the Applications folder (or into another location if you desire). You can then
launch the app by double clicking on the icon in the Applications folder or Launchpad.
Linux
Decompress the AntConc.tar.gz file into a folder of your choice. In the AntConc folder, double click the AntConc.sh
file to launch the software. On the command line, type ./AntConc.sh to launch the software.
Overview of Tools
AntConc contains nine tools that can be accessed either by clicking on their 'tabs' in the tool window, using
CTRL+TAB to toggle through the tools, or using the key combination CTRL + Tool Number (e.g., CTRL +1 for
KWIC, CTRL +2 for Plot) to select a specific tool.
Plot Tool
This tool shows concordance search results
plotted in a 'barcode' format, with the length of
the text normalized to the width of the bar and
each hit shown as a vertical line within the bar.
This allows you to see the position where search
results appear in the individual texts of a corpus.
File Tool
This tool shows the contents of individual texts.
This allows you to investigate in more detail the
results generated in other tools of AntConc.
Cluster Tool
The tool shows contiguous (together in a
sequence) word patterns based on the search
condition. This allows you to see common
phrases that appear in the target texts.
N-Gram Tool
This tool scans the entire corpus for all 'N'-sized
clusters (e.g., 2-word clusters, 3-word clusters,
…). This allows you to find common expressions
in a corpus.
Collocate Tool
This tool shows words that appear frequently
within a certain distance of the search term (i.e.,
collocates). This allows you to find which words
co-occur with other words in a corpus.
The following steps produce a set of plot results from a corpus and demonstrate the main features of this tool.
1) Select a corpus using the "Corpus Manager" available from the File menu. Alternatively, create a quick
corpus by choosing the "Open File(s) as Quick Corpus" from the file menu. The files contained in the
corpus are shown in the left frame of the main window under "Target Corpus".
2) Enter a search query in the search box. See the 'SEARCH OPTIONS' section in this document for an
explanation of the "Words", "Case", and "Regex" search term options.
3) Choose the size of the results set to be presented using the "Result Set" combobox widget.
4) Use the "Plot Zoom" widget to control the size of the plot and the degree of detail to be shown.
5) Click on the "Start" button to start the search and wait for the results to be displayed.
6) Use the "Sort by" option to rearrange the plots according to the various parameters shown.
7) The total number of hits and total number of plots are shown at the top of the tool window. When no
hits are found, a warning will be shown on the screen.
8) Double-clicking on any cell in the results window will cause the software to jump to the File tool (see the
relevant section of this document) where you can view the hit exactly as it appears in context in the
original file.
9) By checking the "Overlay" option and choosing an appropriate color (by clicking on the color box),
existing results can be overlaid with new results for different searches. This allows you to see how
different search queries are related and/or overlap.
10) Advanced searches are available with this tool. Several menu preferences are also available with this
tool. (See the relevant sections in this document for explanations).
File Tool
This tool shows the text of individual files. This
allows you to investigate in more detail the results
generated in other tools of AntConc.
Cluster Tool
The tool shows adjacent word groups based on the
search condition. This allows you to see how words
and phrases are commonly used in a corpus of texts.
In some cases, this tool can be seen as summarizing
the results generated in the KWIC tool.
N-Gram Tool
This tool scans the entire corpus for all 'N'-sized
clusters (e.g., 2-word clusters, 3-word clusters, …).
This allows you to find common expressions in a
corpus.
Word Tool
This tool counts all the words in the corpus and
presents them in an ordered list. This allows you to
find which words are the most frequent in a corpus.
Keyword Tool
This tool shows words that appear unusually
frequently in the target corpus in comparison with
the words in the reference corpus based on a
statistical measure (i.e., 'keywords'). These words
can be considered to be characteristic of the target
corpus. The settings can also be changed to show
words that appear unusually infrequently in the
target corpus compared with the reference corpus
(i.e., 'negative keywords'.
The following steps produce a keyword list and demonstrate the main features of this tool.
1) Create a quick corpus by choosing the "Open File(s) as Quick Corpus" option from the file menu.
Alternatively, choose the "Corpus Manager" option from the file menu and make sure the "Target
Corpus" option is selected. Then, select one of the available corpora or create your own from raw files or
a word list (see the instruction under the Corpus Manager section of this help page for how to do this).
This corpus will then serve as the target corpus for your analysis. The files contained in the corpus will be
shown in the top left frame of the main window under "Target Corpus".
2) Choose a reference corpus by opening the "Corpus Manager" option from the file menu and checking
the "Reference Option" option. Next, as in step 1, select one of the available corpora or create your own
from raw files or a word list. The files contained in the corpus will be shown in the bottom left frame of
the main window under "Reference Corpus".
3) Click on the "Start" button to start the processing and wait for the results to be displayed. If a search
query is entered, only words that match the query will be shown. See the 'SEARCH OPTIONS' section in
this document for an explanation of the "Words", "Case", and "Regex" search term options.
4) Use the "Sort by" option to rearrange the ordering of the results.
5) The total number keyword types ("Keyword types") and combined total count of all the word tokens
("Keyword Tokens") are shown at the top of the tool window. When no hits are found, a warning will be
shown on the screen.
6) Double-click on any cell in the results window to cause the software to jump to the KWIC tool (see the
relevant section of this document) where you can view concordance lines for that keyword across the
whole corpus.
7) Advanced searches are available with this tool. Several menu preferences are also available with this
tool. (See the relevant sections in this document for explanations).
Wordcloud Tool
This tool visualizes the results generated by KWIC,
File, Cluster, N-Gram, Collocate, Word, and Keyword
tools as well as a “Scratchpad” of plain text in the
form of a ‘word cloud’. This is a grouping of words
where the sizing of the words reflects a property of
those words (e.g., frequency).
Wordclouds are often used as aesthetically pleasing
visualizations, where words are laid out in a viewing
area or ‘themed’ image mask and sized according to a
property (e.g., word frequency). Care should be taken
when using wordclouds for linguistic analysis, as the visualization necessitates distorting word sizes to fit the
viewing area.
The following steps produce a word cloud and demonstrate the main features of this tool.
1. Choose a “Source” for the word cloud. This can be a “Scratchpad” of plain text (accessible as the first
option in the “Appearance” list, or the output of the KWIC, File, Cluster, N-Gram, Collocate, Word, and
Keyword tools.
2. Choose the properties from the source to display as “labels” and “values” in the word cloud. For the KWIC,
File and Scratchpad sources, the properties of “Type” (Word) and “Freq” (Frequency) are chosen
automatically.
3. Choose the image size. This will determine how many words can be placed in the image. Note that the
image will be automatically scaled to the window display size.
4. Choose the maximum number of words to display in the word cloud. Depending on the number of items
from the source and the other settings (e.g., the minimum font size), this value might not be reached.
5. Check the “Use stopwords” checkbox to remove stopwords from the word cloud.
6. Check the “Repeat words” checkbox to fill as much remaining space as possible in the word cloud image
with existing words. Depending on other settings (e.g., the minimum font size), not all the space will be
used.
7. Click “Start” to generate the word cloud.
Appearance Option
The appearance of the word cloud can be adjusted through the following settings:
Scratchpad: This is a free writing area. If the scratchpad is chosen as the source, the words here (and their
frequencies of occurrence) will be used to plot the wordcloud.
Mask settings: Use these settings to determine if the word cloud should be ‘masked’ and if so, which mask to
use. Additional masks (.png and .svg files) can be added to the list by clicking the “Add” button.
The included masks with .svg extensions are kindly provided by Font Awesome
(fontawesome.com).
Color settings: Choose how to color the word cloud. Three options are available. If “Color theme” is chosen,
you can pick a color theme from the dropdown list of options and the range value that
determines which color(s) in the theme are used for which values in the word cloud. If “Text
color” is chosen, you can pick a specific color from the colors available on the system. If “Mask
color” is chosen, the colors of the words in the word cloud will match the colors used in the
original mask image. The image background can be set to a specific color or made transparent.
Font settings: Choose which font family and font size to use in the word cloud. If the “Allow squeezing”
option is selected, new words to be added to the word cloud that cannot fit in the remaining
space will be incrementally “squeezed” (reduced in font size) until they fit the space available.
This “squeezing” effect will distort the appearance of a word but will usually result in a more
aesthetically pleasing result. It is also used when the “Repeat words” option is chosen to fill as
much of the remaining space as possible. The “Scaling Factor” setting determines the weighting
given to the value (e.g., frequency) of the word to the ranking of that value. At 1.0, only the
value is considered. At 0.0, only the rank is considered. The “H(orizontal)/V(ertical) Ratio”
setting determines the probability of horizontal words plotted over vertical words. At 1.0, all
words will be plotted horizontally. At 0.0, all words will be plotted vertically.
ChatAI Tool
This tool gives you access to closed-source, open-
source, and open-weights Large Language Models
(LLMs) through a chat-like interface. When 'chatting'
with the LLMs, you can choose to interact with the
LLM directly, or you can supply the LLM with results
generated by other AntConc tools (e.g., the KWIC
tool) and ask it to utilize that information in its
responses.
1. If you want to use an LLM provided by OpenAI, in the ChatAI tool settings menu, paste in your Open API
key. If you have already entered your API key into your environment settings (on Windows), the key will
appear here automatically. Note that you will be charged by OpenAI for all API interactions according to
their pricing scheme: https://openai.com/pricing. If you want to use an open source or open weights LLM
model, you can skip this step.
2. In the ChatAI tool settings menu...
a. Select the default LLM that you want to use. Clip "Update Models" to refresh the list of available
models to reflect the most recent models available.
b. Select the "System Prompt" setting that you want to use during your interactions. The default
system prompt is a typical standard: " You are a helpful assistant."
3. In the main ChatAI interface window...
a. Set the maximum number of tokens ("Max tokens") to be generated by the LLM. Deactivating this
option will allow the LLM to generate its maximum output tokens.
b. Set the temperature (Temp) of the LLM. A setting of 0.0 will cause the LLM to always choose the
most probable next predicted token. A setting of 2.0 will cause the LLM to choose from a wider
range of probabilities for the next token, making it produce more creative, unpredictable
responses.
c. Check the "Stream" option to allow the LLM to show the response as it is generated one token at a
time. When unchecked, the complete response will be shown after it is generated.
d. Set the LLM that you want to use.
e. Choose whether to use the output from another AntConc tool by activating the "Source" option
and selecting which tool to use.
f. Set the "Context Policy" to either "Remove previous context" so that each prompt submission will
be treated as the start of a completely new conversation or "Maintain rolling context window" so
that as much of the previous context as possible will be remembered until the LLM context size is
reached. At this point, the earliest messages (prompts and responses) will be deleted from the
context window before the new prompt is added. Note that the system prompt is never deleted.
4. Type in your prompt in the text area. Save your prompt by clicking the "Save Prompt" button if you think
you will use it again. All save prompts are available in the "Prompt History" list. If you choose to supply
results from another AntConc tool, they will be automatically appended to the end of the prompt (unseen).
As the LLM will remember the results supplied, after you supply the results once, you are recommended to
deactivate the "Source" option to reduce the number of tokens supplied to the LLM.
5. Click the "Submit" button or click "CTRL Return" to send your prompt to the LLM.
6. Wait for the LLM to respond. The response will appear in the upper window of the tool.
7. Continue to chat with the LLM based on the previous responses.
8. Click "Clear" to cause the LLM to forget the current conversation and start a new one.
SEARCH OPTIONS
Search queries can be composed of full words or word fragments, with or without wildcards. The basic syntax
roughly follows the Common Elementary Query Language (CEQL). See https://cwb.sourceforge.io/ceql.php for
more details). Searches can be either “case insensitive” (default) or “case sensitive” by activating or
deactivating the "Case" search term option. Searches can also be made using full regular expressions by
activating the "Regex" option. With the "regex" option, each word-level regular expression needs to be
separated by whitespace. To make regex expressions case-aware, select the "Case" option. For details on how
to use regular expressions, consult one of the many texts on the subject, e.g., Mastering Regular Expressions
(O'Reilly & Associates Press) or type "regular expressions" in a web search engine to find many sites on the
subject (e.g., http://www.regular-expressions.info/quickstart.html). AntConc supports Perl regular expressions
including Unicode character classes, e.g., \p{Letter}, even though the software is built using Python.
ADVANCED SEARCHES
By clicking on the "Advanced Search" button (available in all tools), more complex searches become possible.
• The "Search Query List" option allows you to import a set of search queries. You can do this in one of three
ways: 1) Type each individual search query in the entry box and click "Add"; 2) Drag and drop the list of
search queries into the viewer below the entry box; 3) Copy and paste the list of search queries into the
viewer below the entry box. When dragging and dropping or pasting your list of queries, each line will be
treated as a separate search query. This feature allows you to use a large set of search queries without
having to retype them each time. Any search query accepted in the main interface can be used.
• The "Context Search List" option (not available in all tools) allows you to define search queries that must
match within a certain context window around the main search term(s). For example, to search for
"student" or "students" appearing at least three words to the left or right of the word "university," add
"university" to the "Search Query List", and then add "student" and "students" to the "Context Search List"
list. Finally, set the "Context Search List" "Window Span" as "From 3L" and "To 3R".
• The "SQL Search List" option (not available in all tools) allows you to adjust query conditions by applying
conditions directly on tables in the database that stores the corpus through a series of 'join' operations. For
example, the following entry in the list will join a custom "genres" table in the database (created through
the Corpus Manager) with the main "corpus" table, applying a condition that the "genre" column entry for
the file must be marked as "academic". The two tables are joined via the common "doc_id" column:
["genres", "genre = 'academic'", "doc_id"]
The format for the list entry should be a JSON array, with three components, "table" + "condition" + "join
column".
MENU OPTIONS
Menu options are divided into three groups, "File", "Edit", "Settings" and "Help". The options available in each
group will be described below.
<FILE>
• Open File(s) as 'Quick Corpus'...
o This option is for quickly creating a temporary corpus. You will be asked to choose the files you
want to add to your corpus. Then, the software will create a "temp " corpus using the default
settings of the Corpus Manager and load this for immediate use. Any existing "temp" corpus
will be overwritten.
• Open Corpus Manager...
o This option opens the Corpus Manager, where you can choose prebuilt corpora from the
default library, add or delete corpora from a user library, or create custom corpora from raw
files. See the Corpus Manager for complete details.
• Swap Target/Reference Corpora
o This option swaps the target and reference corpora allowing easy comparisons.
• Clear Tool/All Tools/All Tools and Files
o These options will reset the interface.
• Save Current Tab Results...
o This option allows results displayed in the main interface to be exported in a file format. Note
that hidden columns will also be included. (Direct copying and pasting of results from the
interface are also possible.)
• Save Current Tab Database Tables...
o This option allows complete tables of results from the corpus database to be exported to a set
of .csv files. All relevant information about results can be found in these files.
• Import Settings From File.../Export Settings To File...
o These options allow the state of the software to be saved and reloaded at a different time.
• Restore Default Settings
o This option resets the state of the software to when it was first installed. All custom settings
are lost.
<EDIT>
• Select All
o This option selects all results in the results window. The same effect can be achieved using the
standard keyboard shortcut for "Select All". See the SHORTCUTS section for more details.
• Copy
o This option selects any text in the results window. The same effect can be achieved using the
standard keyboard shortcut for "Copy". See the SHORTCUTS section for more details.
<SETTINGS>
• Global Settings (applied to all tools in the interface)
o Colors - This setting decides the main highlight color (e.g., for highlighted words in the File tool)
and the color indicators for the Corpus Manager Pre-Built Corpus Library.
o Files - This setting decides how the paths to files are shown. Also, this setting determines which
file types are used as defaults in File Open dialogs and file drop options. This setting is also
used to decide if encoding errors are shown or ignored when creating corpora from raw files.
o Fonts - This setting decides the font family, size, and style of the font for the main interface.
o Language Direction - This setting decides how to display results (especially for the KWIC
concordance tool) depending on the language direction. For example, choose the default "Left-
to-right" option for language such as English. Choose the "Right-to-left" option for languages
such as Arabic. Check the 'Arabic' checkbox for smooth processing of Arabic in the Wordcloud
tool.
o Restore Settings - This setting decides if the settings will be automatically saved and restored
when AntConc is restarted.
o Searches - This setting lists all wildcards available in the system (note that these cannot be
edited).
o Statistics - This setting decides how values are displayed through normalization and floating-
point precision settings.
o Tags - This setting decides how word information is displayed in the interface depending on the
corpus currently loaded. For a fully tagged corpus, the options will be "Type", "POS",
"Type+POS", and "headword" (lemma).
o Tool Filters - This setting decides if only words in the selected file will be shown or hidden in
the respective tools. When the "Hide words in file" option is chosen, the selected file serves as
a "stop list".
• Tool Settings
o KWIC
▪ Display Type - decides which word-level information (type, POS, headword) will be
displayed.
▪ Display Options - decides the colors used to highlight the sort order.
▪ Other Options
• Choose to show or hide the file names in the display.
• Choose to show or hide the search term in the display. This option is useful for
allowing instructors to quiz students on possible words to fit the gap.
o Plot
▪ View Style - decides which view to use (table/graphic or graphic)
▪ Display Options - decides how results are displayed.
▪ Statistics - decides the parameters for determining the dispersion measure.
▪ Other Options - decides various parameters for sizing/displaying the plot graphs.
o File
▪ Display Type - decides which word-level information (type, POS, headword) will be
displayed.
o Cluster
▪ Display Type - decides which word-level information (type, POS, headword) will be
displayed.
▪ Display Options - decides what information is shown in the results window.
▪ Filter Options - decides if clusters can only span cross whitespace boundaries or can
include other characters (e.g., punctuation)
o N-Gram
▪ Display Type - decides which word-level information (type, POS, headword) will be
displayed.
▪ Display Options - decides what information is shown in the results window.
▪ Filter Options - decides if clusters can only span cross whitespace boundaries or can
include other characters (e.g., punctuation)
o Collocate
▪ Display Type - decides which word-level information (type, POS, headword) will be
displayed.
▪ Likelihood Measure + Threshold
• Choose the statistic and cut-off point (threshold) for inclusion of words in the
collocates list. Words below the cut-off-point are deemed to appear frequently
together with the query term by chance.
▪ Effect Size Measure + Threshold
• Choose the statistic used determine the strength of relationship between the
query term and collocate and a cut-off point (threshold) for inclusion of words
that meet the minimum effect size.
o Word
▪ Display Type - decides which word-level information (type, POS, headword) will be
displayed. Options that show information in brackets will collate that information and
present it as a family. See the examples below for an illustration for what happens with
POS tagged data (i.e., data for which the type and POS information are provided) and
lemmatized data (i.e., data for which the type and headword information are provided)
• Type: this option will combine words with same type but different POS tags or
lemma headwords into a single entry and sum the frequencies and calculate
the range values for this entry
• Type+POS: this option will treat words with the same type but different POS
tags as different entries. The frequencies and range values will be
independent.
• Type+[POS]: this option will combine words with the same type but different
POS tags into a single entry and sum the frequencies and calculate the range
values for this entry. The option will also show all the POS variants and their
separate frequency counts that combine to make up the total.
• Type+Headword: this option will treat words with the same type but different
headword (lemma) tags as different entries. The frequencies and range values
will be independent.
• Headword: this option will combine all words from the same lemma family into
a single headword entry and sum the frequencies and calculate the range
values for this entry
• Headword+[Type]: this option will combine words from the same lemma family
into a single entry and sum the frequencies and calculate the range values for
this entry. The option will also show all the lemma family variants and their
separate frequency counts that combine to make up the total.
▪ Display Options - decides what information is shown in the results window.
o Keyword
▪ Display Type - decides which word-level information (type, POS, headword) will be
displayed. See the entry for Word for an explanation of what the options represent.
▪ Display Options - decides what information is shown in the results window.
▪ Negative Keywords - decides to show words in the target corpus that appear unusually
infrequently in the target corpus compared with the target corpus.
▪ Likelihood Measure + Threshold - decides the statistic and cut-off point for inclusion of
words in the keyword list. Words below this cut-off-point are deemed to appear
frequently in the target corpus compared with the reference corpus by chance.
▪ Effect Size Measure + Threshold - decides the measure used determine the strength
keyness and a cut-off point for inclusion of words that meet the minimum effect size.
[Appropriate effect size measures are still being debated in the field, so the default
setting is to show all values for this measure. With the default settings, keywords are
ranked according to their likelihood measure scores. This equates to ranking keywords
according to p-values, which raises several questions/problems. However, it is the
current standard in the field and results tend to show that ranking by likelihood leads
to more intrinsically intuitive results than those generated when an effect size measure
is used. The current selection of likelihood measures and effect size measures are
inspired by the work of Andrew Hardie of Lancaster University.]
o Wordcloud
▪ Color Theme Options - decides what color themes are available in the tool controller.
• Perceptually uniform sequential
o These themes have incremental changes in lightness and often
saturation of color that are perceived to be uniform. This makes them
suitable to represent changes in frequency or other values.
• Sequential
oThese themes have incremental changes in lightness and often
saturation of color. This makes them suitable to represent changes in
frequency or other values.
• Qualitative
o These themes contain miscellaneous colors. This makes them
unsuitable for most cases. The exception is when you want to produce
visually appealing results with no connection between the color and
the value being represented.
▪ Mask Theme Options - decides what mask themes are available in the tool controller.
o ChatAI
▪ API Settings - decides the way ChatAI utilizes APIs.
• Show API warning on first use
o Show a warning when first using the ChatAI tool to remind the user
that they will be charged by OpenAI for all API interactions.
• OpenAI API Key
o An area to input your OpenAI API key. This is required to use LLMs
provided by OpenAI. Note that you will be charged by OpenAI for all
API interactions according to their pricing scheme:
https://openai.com/pricing.
▪ Model Settings
• Default Model
o This is the default model shown in the main interface.
▪ Prompt Settings
• System prompt
o This is the system prompt that will guide the LLM during a single chat
interaction. It can be used to strengthen the skills of the LLM (e.g.,
explain what expertise it is supposed to have), determine the style of
interaction (e.g., formal or informal), and even what in way it should
proceed (e.g., solving problems slowly in a stepwise fashion).
< HELP >
• Show Help Page
o This option shows the help guide as a PDF file.
• Show License
o This option shows the license agreement that you agree to when using the software.
• Show Version History
o This option shows the complete history of releases, detailing new features, bug fixes, and
major updates.
• About AntConc
o This option shows the release version, release date, copyright information, and
acknowledgments for the software.
Corpus Manager
The Corpus Manager is a multi-purpose tool used to load and save pre-built corpus databases, create, and save
a new corpus from raw (.txt, .srt., …), Word, of PDF files, or create and save a new corpus from a simple of
advanced word list. The three different scenarios are explained below.
Choosing/Saving a pre-built corpus Building/Saving a corpus from raw files Building/Saving a corpus from a word list
Shortcuts
Here is a list of useful shortcuts (including some of the useful standard shortcuts on the operating system).
• CTRL/COMMAND + TAB: Toggles clockwise through the different tools in the tab bar.
• ALT + Tool Number: Selects a specific tool (e.g., ALT+1 for the KWIC Tool, ALT+2 for the Plot tool).
• SHIFT + CTRL/COMMAND + TAB: Toggles anti-clockwise through the different tools in the tab bar.
• CTRL/COMMAND + C: Copies the currently selected text.
• CTRL/COMMAND + A: Selects all text in the window.
• F4 (Win): Reveals the complete list of options in a 'combobox' widget (e.g., the search history in the
search query box).
• ARROW KEYS: For any 'combobox' widgets (e.g., the KWIC search query box) or 'spinbox' widgets (e.g.,
the KWIC context size), the 'UP' and 'DOWN' arrow keys on the keyboard can be used to change the
value of the option.
• CTRL/COMMAND + O: Opens the "Corpus Manager".
• CTRL/COMMAND + G: Opens the "Global Settings".
• CTRL/COMMAND + T: Opens the "Tool Settings".
• CTRL/COMMAND + H: Toggles the view "(H)ide" of "Show" setting for the KWIC tool file name column.
• CTRL/COMMAND + '+': Zooms in the "Plot Tool" display.
• CTRL/COMMAND + '-': Zooms out the "Plot Tool" display.
• CTRL/COMMAND + F: Searches for the next hit in the "File Tool" display.
• SHIFT + CTRL/COMMAND + F: Searches for the previous hit in the "File Tool" display.
• CTRL/COMMAND + SHIFT + T: Swaps/Toggles the target and reference corpora in the main display.
Citing/Referencing
Use the following method to cite and reference AntConc according to the APA style guide:
Anthony, L. (YEAR OF RELEASE). AntConc (Version VERSION NUMBER) [Computer Software]. Tokyo, Japan:
Waseda University. Available from https://www.laurenceanthony.net/software.html
For example, if you download AntConc 4.0.0, which was released in 2021, you will cite/reference it as follows:
Anthony, L. (2021). AntConc (Version 4.0.0) [Computer Software]. Tokyo, Japan: Waseda University. Available
from https://www.laurenceanthony.net/software.html
STATISTICS
Below is a list of statistics used in AntConc. The notation used here is taken from the work of Evert (2004: 36-
37). Explanations are provided in Anthony (2023).
References:
Anthony, L. (2023). Common statistics used in corpus linguistics. Available at
https://laurenceanthony.net/resources/statistics/common_statistics_used_in_corpus_linguistics.pdf.
Evert, S. 2004. The Statistics of Word Cooccurrences: Word Pairs and Collocations. Unpublished Ph.D. thesis.
University of Stuttgart. (Published 2005; available online at
http://elib.unistuttgart.de/opus/volltexte/2005/2371/.)
𝑂11 𝑂
DRF 𝐷𝑅𝐹 = ⁄𝑅 − 21⁄𝑅
1 2
Likelihood Statistics
Statistic Equation
Chi-Squared (X2) (𝑂11 − 𝐸11 )2 (𝑂12 − 𝐸12 )2 (𝑂21 − 𝐸21 )2 (𝑂22 − 𝐸22 )2
χ2 = + + +
(4 term) 𝐸11 𝐸12 𝐸21 𝐸22
Chi-Squared (X2)
(|𝑂11 − 𝐸11 | − 0.5)2 (|𝑂12 − 𝐸12 | − 0.5)2 (|𝑂21 − 𝐸21 | − 0.5)2 (|𝑂22 − 𝐸22 | − 0.5)2
(4 term) (Yates χ2 (𝑌𝑎𝑡𝑒𝑠) = + + +
Correction) 𝐸11 𝐸12 𝐸21 𝐸22
Log-Likelihood 𝑂11 𝑂21 𝑂12 𝑂22
𝐿𝑜𝑔 𝐿𝑖𝑘𝑒𝑙𝑖ℎ𝑜𝑜𝑑 = 2 (𝑂11 ln ( ) + 𝑂21 ln ( ) + 𝑂12 ln ( ) + 𝑂22 ln ( ))
(G2) (4 term) 𝐸11 𝐸21 𝐸12 𝐸22
Log-Likelihood 𝑂11 𝑂21
𝐿𝑜𝑔 𝐿𝑖𝑘𝑒𝑙𝑖ℎ𝑜𝑜𝑑 = 2 (𝑂11 ln ( ) + 𝑂21 ln ( ))
(G2) (2 term) 𝐸11 𝐸21
Text Dispersion
(4 term) 𝑂11 𝑂21 𝑂12 𝑂22
𝑇𝑒𝑥𝑡 𝐷𝑖𝑠𝑝𝑒𝑟𝑠𝑖𝑜𝑛 = 2 (𝑂11 ln ( ) + 𝑂21 ln ( ) + 𝑂12 ln ( ) + 𝑂22 ln ( ))
(O and E for 𝐸11 𝐸21 𝐸12 𝐸22
range values)
Text Dispersion
(2 term) 𝑂11 𝑂21
𝑇𝑒𝑥𝑡 𝐷𝑖𝑠𝑝𝑒𝑟𝑠𝑖𝑜𝑛 = 2 (𝑂11 ln ( ) + 𝑂21 ln ( ))
(O and E for 𝐸11 𝐸21
range values)
Notes
• If you have any suggestions for improving the software or notice any bugs, please post them in the
AntConc Discussion Group (https://groups.google.com/g/antconc). Indeed, many of the improvements
and updates made to the software have been due to the comments of users around the world, for
which I am very grateful. The AntConc Discussion Group is also a good place to discuss how you are
using the software and any challenges that you face.
• If you find the software useful in your research, teaching, or learning, you may consider making a small
donation to support the future development of this tool. A link to the donation page can be found
here: https://www.laurenceanthony.net/software/antconc/
• You may also be interested in becoming an AntConc patron. Depending on the level of support, this
option will give you priority support with direct access to the developer (Laurence Anthony), and
various other benefits. A link to the donation page can be found here:
https://www.patreon.com/antlab
Acknowledgements
I would like to say thank you to the users of AntConc who have taken the trouble to post feedback on the
software and make suggestions for improvements and/or changes. A very special thank you goes to all those
who have very generously supported the project either through single donations via PayPal or becoming a
Patreon supporter. A complete list of individual acknowledgments can be found in the Help menu - "About
AntConc" menu.
The development of AntConc has been supported by a Japan Society for Promotion of Science (JSPS) Grant-in-
Aid for Scientific Research (C): No. 23501115, a Japan Society for Promotion of Science (JSPS) Grant-in-Aid for
Young Scientists (B): No. 18700658, a Japan Society for Promotion of Science (JSPS) Grant-in-Aid for Young
Scientists (B): No. 16700573, and a WASEDA University Grant for Special Research Projects: No. 2004B-861.
Known Issues
None at present.