2_notes (3)
Overview
This lecture delves into the intricacies of Large Language Models (LLMs), which are sophisticated
artificial intelligence systems that excel in processing and generating human language. LLMs are
transforming how we interact with technology, enabling more natural conversations and more effective
communication with machines.
Human Language
Human language is fundamentally a tool that facilitates communication and is a cornerstone of human
progress and innovation. It is not only a means of expressing ideas but also a meta-tool that enables the
creation, sharing, and refinement of various other tools. The complexity of human language arises from
its numerous irregularities, exceptions, idioms, and evolving meanings, which can pose challenges for
both learners and AI systems. Additionally, language is inherently ambiguous, as its meaning can shift
based on context, tone, and cultural nuances. This richness allows for creativity in expression, leading
to the emergence of new words and slang as society evolves. Furthermore, language has layers that
encapsulate emotional, cultural, and social norms, making it a dynamic and multifaceted system. As AI
models attempt to mimic human language, they must navigate these complexities to generate relevant
and coherent text.
Types of LLMs
Several prominent LLMs have emerged in recent years, each with unique features and capabilities.
Notable examples include DeepSeek-R1 by DeepSeek, GPT-4o by OpenAI, Qwen2.5 by Alibaba, and
Llama3.3 by Meta. Each of these models employs advanced techniques and architectures to enhance
performance and adaptability.
LLMs can be grouped by how they are accessed: downloadable models (Deepseek-R1, Qwen2.5,
Llama3.3) and API-based models (GPT-4o). Downloadable models can be installed and run on your
own computer, giving you more control and customization options, but they require a powerful machine
to operate. On the other hand, API-based models are hosted on the servers of the provider, meaning you
can use them over the internet through an application programming interface (API). This method is
easier to integrate into apps and doesn’t require you to manage any hardware, but it usually comes with
usage fees and can be slower because every request depends on an internet connection.
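To make the API-based access pattern concrete, here is a minimal sketch of building a request for a hosted model. The endpoint URL and field names loosely follow the chat-completions convention many providers use, but they are placeholders for illustration, not any specific vendor’s exact API.

```python
import json

# Illustrative only: the URL and payload fields below are placeholders
# modeled on the common chat-completions shape, not a real provider's API.
API_URL = "https://api.example.com/v1/chat/completions"

def build_request(prompt: str, model: str = "gpt-4o") -> str:
    """Serialize a minimal chat request; an HTTP client would POST this to API_URL."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 100,
    }
    return json.dumps(payload)
```

An application would send this JSON over HTTPS and read the generated text out of the response, so no local hardware beyond an internet connection is needed.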
Prediction Mechanisms
There are two primary mechanisms for predicting the next word in a text sequence: deterministic (not
commonly used) and stochastic (commonly used) prediction. Deterministic next word prediction means
that given the same input, the model will always predict the same next word, which can be useful in
controlled environments. In contrast, stochastic next word prediction introduces variability by utilizing
probabilities to determine the likelihood of various possible continuations. For instance, when presented
with the phrase "To be or not to ___," the model might predict "be" with a 75% probability, while other
options like "do" or "say" have lower probabilities. This stochastic nature allows LLMs to generate
diverse and contextually rich text, making their outputs more dynamic and engaging. Adjusting the
probabilities can control the randomness of the predictions, which can be particularly useful in
applications where creativity is desired.
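The stochastic prediction described above can be sketched in a few lines, using the "To be or not to ___" example. The 75% probability for "be" comes from the text; the probabilities for "do" and "say" are assumed filler values, and a real model would assign probabilities over its entire vocabulary.

```python
import random

def sample_next_word(rng: random.Random) -> str:
    """Draw one continuation for 'To be or not to ___' from a toy distribution."""
    candidates = ["be", "do", "say"]
    # 0.75 for "be" is from the lecture; the other two values are assumed.
    probabilities = [0.75, 0.15, 0.10]
    return rng.choices(candidates, weights=probabilities, k=1)[0]

rng = random.Random(0)  # fixing the seed makes the randomness reproducible
word = sample_next_word(rng)
```

Each call may return a different word, which is exactly what makes the model's output diverse rather than fixed.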
Training Data
The quality and diversity of training data are critical to the success of LLMs. High-quality training data
must be large in volume and diverse in content, encompassing a wide range of topics and language uses
to ensure the model learns effectively. Most LLM providers do not disclose the exact datasets used, but
they typically consist of vast amounts of text harvested from the internet, including books, articles, and
websites. This data must be carefully curated to avoid biases and maintain relevance. For example,
Hugging Face’s FineWeb dataset comprises 3 billion web pages from 39 million domains,
demonstrating the scale required for effective training. The removal of personally identifiable
information (PII) and irrelevant content is also essential to ensure ethical use and compliance with data
privacy standards.
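One curation step mentioned above, removing PII, can be illustrated with a toy filter. The regular expression below only catches simple email addresses; real pipelines such as the one behind FineWeb use far more elaborate rules, so treat this as a simplified sketch.

```python
import re

# Toy PII scrubber: replaces email addresses with a placeholder token.
# Real curation pipelines handle many more PII categories (names,
# phone numbers, addresses) with much more robust detection.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def scrub_pii(text: str) -> str:
    """Replace anything that looks like an email address with [EMAIL]."""
    return EMAIL_RE.sub("[EMAIL]", text)
```

Running every harvested document through filters like this, before training, is part of keeping the dataset compliant with privacy standards.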
Important Terms
• Tokens are the fundamental units of text that the model processes, typically representing words
or subwords. Tokenization is the process of breaking text down into these units, which enables
the model to analyze and generate text more efficiently. Effective tokenization helps the model
manage vocabulary size and handle rare or complex words by breaking them into manageable
pieces, allowing for better performance across diverse language inputs.
• The transformer model is a crucial architecture that powers many LLMs, utilizing attention
mechanisms to determine which parts of the input data are most relevant during processing.
This attention-based approach allows LLMs to consider the relationships between words in a
sentence, improving their ability to generate coherent and contextually appropriate text.
Attention works by computing a score for each word in relation to others, determining which
words should be emphasized when making predictions. This is particularly beneficial for
capturing long-range dependencies in language, where the meaning of a word can be influenced
by others that are far apart in the text. For example, in the sentence “The cat that chased the
mouse was very quick,” the attention mechanism helps the model understand that “cat” and
“quick” are related despite being separated by several words.
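Subword tokenization can be made concrete with a toy greedy longest-match tokenizer over a hand-picked vocabulary. Real tokenizers (e.g., byte-pair encoding) learn their vocabularies from data, so this is only a sketch of the idea that rare words get split into known pieces.

```python
def tokenize(word: str, vocab: set[str]) -> list[str]:
    """Greedily split a word into the longest vocabulary pieces, left to right."""
    tokens = []
    i = 0
    while i < len(word):
        j = len(word)
        # shrink the candidate piece until it appears in the vocabulary
        while j > i and word[i:j] not in vocab:
            j -= 1
        if j == i:  # no piece matches: the word cannot be tokenized
            return ["[UNK]"]
        tokens.append(word[i:j])
        i = j
    return tokens

# Toy vocabulary chosen by hand for this example
vocab = {"un", "break", "able", "the", "cat"}
pieces = tokenize("unbreakable", vocab)  # a rare word split into known subwords
```

Even though "unbreakable" is not in the vocabulary, the model can still represent it as three familiar pieces.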
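The attention computation described in the bullet above can also be sketched with toy vectors. The 2-dimensional "word vectors" here are made-up numbers, not real embeddings; the point is the mechanism of scoring, normalizing, and weighting.

```python
import math

def softmax(xs: list[float]) -> list[float]:
    """Turn raw scores into probabilities that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention over toy vectors.

    Scores each key against the query, normalizes the scores into
    weights, and returns the weighted sum of the values.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)  # how strongly each word is emphasized
    output = [sum(w * v[i] for w, v in zip(weights, values))
              for i in range(len(values[0]))]
    return weights, output
```

A key that points in the same direction as the query receives the highest weight, which is how "cat" can be emphasized when predicting words related to "quick" even across a long gap.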
Customizations of LLMs
Customizing LLMs can significantly enhance their performance and adapt them to specific tasks. Two
common customization options include adjusting the temperature and the maximum length of generated
text. The temperature controls the randomness of predictions; a lower temperature (e.g., 0.2) results in
more deterministic and focused outputs, while a higher temperature (e.g., 1.0) yields more diverse and
creative responses. This allows users to strike a balance between coherence and creativity based on their
needs.
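The effect of temperature can be shown by rescaling a model’s raw scores (logits) before converting them to probabilities. The three logits below are made-up numbers for three candidate words; only the rescaling step reflects how temperature actually works.

```python
import math

def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Divide logits by the temperature, then normalize to probabilities."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                            # made-up candidate scores
focused = softmax_with_temperature(logits, 0.2)     # low T: nearly deterministic
creative = softmax_with_temperature(logits, 1.0)    # high T: flatter, more diverse
```

At temperature 0.2 almost all probability mass lands on the top candidate, while at 1.0 the lower-scoring candidates keep a realistic chance of being sampled.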
The maximum length parameter dictates how many tokens the model will generate in response to a
prompt. Setting an appropriate maximum length is crucial, as it affects the completeness and relevance
of the output. Too short a length may truncate valuable information, while too long a length can lead to
irrelevant or overly verbose responses. By fine-tuning these parameters, users can tailor LLM outputs
to better suit specific applications, whether for casual conversation, technical writing, or creative
storytelling.
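A toy generation loop shows how the maximum-length parameter bounds the output. The next_token argument is a stand-in for a real model’s sampler; here any callable that returns a token will do.

```python
def generate(prompt_tokens: list[str], next_token, max_length: int,
             stop_token: str = "<eos>") -> list[str]:
    """Generate tokens until the model emits a stop token or hits max_length."""
    output = list(prompt_tokens)
    while len(output) - len(prompt_tokens) < max_length:
        token = next_token(output)
        if token == stop_token:  # the model decided the response is complete
            break
        output.append(token)
    return output[len(prompt_tokens):]  # return only the newly generated part
```

If max_length is too small the loop cuts the response off mid-thought; if it is very large, generation only stops when the stand-in model emits the stop token, which is how overly verbose outputs arise.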