UNIT 4
UDP is a simple protocol and it has some very important uses, such as client-server interactions and
multimedia, but for most Internet applications, reliable, sequenced delivery is needed. UDP cannot
provide this, so another protocol is required. It is called TCP and is the main workhorse of the Internet.
Let us now study it in detail.
Introduction to TCP
❖ TCP service is obtained by having both the sender and receiver create endpoints called SOCKETS.
Each socket has a socket number (address) consisting of the IP address of the host and a 16-bit number
local to that host, called a “PORT” ( = TSAP ).
❖ To obtain TCP service a connection must be explicitly established between a socket on the sending
machine and a socket on the receiving machine
❖ All TCP connections are full duplex and point to point i.e., multicasting or broadcasting is not
supported.
❖ A TCP connection is a byte stream, not a message stream, i.e., message boundaries are not preserved
end to end and the receiver may get the data in chunks of any size.
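The byte-stream property can be observed on a loopback connection. The following is a minimal sketch, not a definitive demonstration: the helper names (receive_exactly, demo_byte_stream), the use of 127.0.0.1 and the four 512-byte writes are all assumptions chosen for illustration.

```python
import socket
import threading


def receive_exactly(conn, total):
    """Read until `total` bytes have arrived; return the chunks as delivered."""
    chunks = []
    received = 0
    while received < total:
        chunk = conn.recv(4096)
        if not chunk:  # peer closed the connection early
            break
        chunks.append(chunk)
        received += len(chunk)
    return chunks


def demo_byte_stream():
    """Send four 512-byte writes over TCP and report how they were delivered."""
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: let the OS pick
    port = listener.getsockname()[1]
    messages = [bytes([65 + i]) * 512 for i in range(4)]  # b"AAA...", b"BBB...", ...

    def sender():
        with socket.create_connection(("127.0.0.1", port)) as client:
            for message in messages:
                client.sendall(message)  # four separate writes

    thread = threading.Thread(target=sender)
    thread.start()
    conn, _ = listener.accept()
    with conn:
        chunks = receive_exactly(conn, 4 * 512)
    thread.join()
    listener.close()
    return messages, chunks


if __name__ == "__main__":
    sent, got = demo_byte_stream()
    print("sizes written :", [len(m) for m in sent])
    print("sizes received:", [len(c) for c in got])
```

The received sizes may or may not match the written sizes (for example one 2048-byte chunk instead of four 512-byte chunks). Only the byte order and content are guaranteed, so an application that needs messages has to add its own framing.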
Sockets: A socket may be used for multiple connections at the same time. In other words, two or more
connections may terminate at the same socket. Connections are identified by the socket identifiers at
both ends. Some of the sockets are listed below:
❖ A key feature of TCP, and one which dominates the protocol design, is that every byte on a TCP
connection has its own 32-bit sequence number.
❖ When the Internet began, the lines between routers were mostly 56-kbps leased lines, so a host
blasting away at full speed took over 1 week to cycle through the sequence numbers.
❖ The basic protocol used by TCP entities is the sliding window protocol.
❖ When the segment arrives at the destination, the receiving TCP entity sends back a segment (with
data if any exist, otherwise without data) bearing an acknowledgement number equal to the next
sequence number it expects to receive.
❖ If the sender's timer goes off before the acknowledgement is received, the sender transmits the
segment again.
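The sequence-number figures above can be checked with a little arithmetic. This is a small sketch under stated assumptions: the function names wrap_time_days and next_ack are hypothetical, and the line is assumed to carry nothing but TCP payload bytes at full rate.

```python
SEQUENCE_SPACE = 2 ** 32  # every byte has its own 32-bit sequence number


def wrap_time_days(line_rate_bps):
    """Days needed to send 2**32 bytes at the given line rate (bits per second)."""
    seconds = SEQUENCE_SPACE * 8 / line_rate_bps
    return seconds / 86_400


def next_ack(segment_seq, payload_len):
    """Acknowledgement number for a segment: the next sequence number expected.

    Sequence numbers wrap around, so the sum is taken modulo 2**32.
    """
    return (segment_seq + payload_len) % SEQUENCE_SPACE


if __name__ == "__main__":
    print(round(wrap_time_days(56_000), 1))  # 7.1 days on a 56-kbps line
    print(next_ack(1000, 500))               # 1500
    print(next_ack(2 ** 32 - 10, 20))        # 10, after wrapping around
```

At 56 kbps the result is a little over 7 days, which agrees with the "over 1 week" figure quoted above.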
Source port: this is a 16 bit field that specifies the port number of the sender.
Destination port: this is a 16 bit field that specifies the port number of the receiver.
Sequence number: the sequence number is a 32 bit field that identifies the position of the segment's first
data byte in the byte stream. When you establish a new TCP connection (3 way handshake) then the initial
sequence number is a random 32 bit value. The receiver uses this sequence number to put the data in order
and to send back an acknowledgment.
Acknowledgment number: this 32 bit field is used by the receiver to request the next TCP segment. Its
value is the sequence number of the next byte the receiver expects, i.e., the sequence number of the last
byte received in order plus 1.
Data Offset: this is the 4 bit data offset field, also known as the header length. It gives the length of the
TCP header in 32 bit words, so that we know where the actual data begins.
RSV(Reserved): these are 3 bits for the reserved field. They are unused and are always set to 0.
Flags: there are 9 bits for flags, also called control bits. We use them to establish connections, send
data and terminate connections:
URG: urgent pointer. When this bit is set, the urgent pointer field is valid and the data it marks should be
treated as priority over other data.
ACK: the acknowledgment bit. When this bit is set, the acknowledgment number field is valid.
PSH: this is the push function. This tells the receiving TCP to deliver the data to the application
immediately, and tells the sender that we don't want to wait to fill the entire TCP segment.
RST: this resets the connection. When you receive this you have to terminate the connection right away.
This is only used when there are unrecoverable errors and it's not a normal way to finish the TCP
connection.
SYN: we use this for the initial three way handshake and it's used to set the initial sequence number.
FIN: this finish bit is used to end the TCP connection. TCP is full duplex so both parties will have to use
the FIN bit to end the connection. This is the normal way to end a connection.
ECN-Echo (ECE): It is used to echo back the congestion indication (i.e. signal the sender to reduce the
amount of information it sends).
CWR (Congestion Window Reduced): It's used by the sender to acknowledge that the congestion-indication
echo was received. Basically ECE and CWR are used to warn senders of congestion in the network, thereby
avoiding packet drops and retransmissions.
Window: the 16 bit window field specifies how many bytes the receiver is willing to receive. It is used for
flow control: the receiver tells the sender how much more data it can currently accept, expressed as a
number of bytes starting at the sequence number in the acknowledgment field.
Checksum: 16 bits are used for a checksum that covers the TCP header, the data and a pseudo-header
taken from the IP header, so the receiver can detect corrupted segments.
Urgent pointer: these 16 bits are used when the URG bit has been set. The urgent pointer indicates
where the urgent data ends.
Options: this field is optional and can be anywhere between 0 and 320 bits.
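The fixed 20-byte part of the header described above can be unpacked with Python's struct module. This is only a sketch under assumptions: the names parse_tcp_header and build_syn are hypothetical helpers, the port and sequence values are made up, the checksum is left at 0 rather than computed, and the ACK flag (referred to in the connection-establishment text) is included in the flag table.

```python
import struct

# Ports (2 x 16 bit), sequence and acknowledgment numbers (2 x 32 bit),
# data offset + reserved + flags (16 bit), window, checksum, urgent pointer.
HEADER_FORMAT = "!HHIIHHHH"  # "!" selects network (big-endian) byte order
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 20 bytes

# Bit positions of the flags within the low 9 bits of the fifth field.
FLAG_BITS = {
    "FIN": 0x001, "SYN": 0x002, "RST": 0x004, "PSH": 0x008,
    "ACK": 0x010, "URG": 0x020, "ECE": 0x040, "CWR": 0x080,
}


def parse_tcp_header(raw):
    """Decode the fixed part of a TCP header into a dictionary."""
    (src, dst, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack(HEADER_FORMAT, raw[:HEADER_SIZE])
    data_offset = offset_flags >> 12   # top 4 bits, counted in 32-bit words
    flags = offset_flags & 0x1FF       # low 9 bits
    return {
        "source_port": src,
        "destination_port": dst,
        "sequence_number": seq,
        "acknowledgment_number": ack,
        "header_bytes": data_offset * 4,
        "flags": [name for name, bit in FLAG_BITS.items() if flags & bit],
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


def build_syn(src_port, dst_port, initial_seq):
    """Build a bare SYN header without options (checksum not computed)."""
    offset_flags = (5 << 12) | FLAG_BITS["SYN"]  # 5 words = 20-byte header
    return struct.pack(HEADER_FORMAT, src_port, dst_port,
                       initial_seq, 0, offset_flags, 65535, 0, 0)


if __name__ == "__main__":
    print(parse_tcp_header(build_syn(50000, 80, 1000)))
```

A data offset of 5 means no options are present. Larger values indicate that options follow the fixed part, up to the 320 bits mentioned above.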
❖ To establish a connection, one side, say, the server, passively waits for an incoming connection by
executing the LISTEN and ACCEPT primitives, either specifying a specific source or nobody in
particular.
❖ The other side, say, the client, executes a CONNECT primitive, specifying the IP address and port to
which it wants to connect, the maximum TCP segment size it is willing to accept, and optionally some
user data (e.g., a password).
❖ The CONNECT primitive sends a TCP segment with the SYN bit on and ACK bit off and waits for a
response.
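In the Berkeley socket API, the LISTEN, ACCEPT and CONNECT primitives map onto listen(), accept() and connect(). The sketch below assumes a loopback address and an OS-chosen port, and the function name demo_connections is hypothetical. It also illustrates the earlier point that several connections can terminate at the same server socket and are told apart by the socket identifiers at both ends.

```python
import socket


def demo_connections():
    """Open two connections to one listening socket and return their endpoints."""
    # LISTEN: passive open on the server side (port 0 lets the OS choose).
    server = socket.create_server(("127.0.0.1", 0))
    server_address = server.getsockname()

    # CONNECT: each call performs an active open, starting with a SYN segment.
    client_a = socket.create_connection(server_address)
    client_b = socket.create_connection(server_address)

    # ACCEPT: hand each established connection to the application.
    conn_1, _ = server.accept()
    conn_2, _ = server.accept()

    # Each connection is identified by (local socket, remote socket).
    endpoints = [(conn.getsockname(), conn.getpeername())
                 for conn in (conn_1, conn_2)]

    for sock in (conn_1, conn_2, client_a, client_b, server):
        sock.close()
    return server_address, endpoints


if __name__ == "__main__":
    address, endpoints = demo_connections()
    print("server socket:", address)
    for local, remote in endpoints:
        print("connection:", local, "<->", remote)
```

Both connections share the same server-side IP address and port. Only the client-side port differs, which is enough to keep them apart.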
❖ Although TCP connections are full duplex, to understand how connections are released it is best to
think of them as a pair of simplex connections.
❖ Each simplex connection is released independently of its sibling. To release a connection, either party
can send a TCP segment with the FIN bit set, which means that it has no more data to transmit.
❖ When the FIN is acknowledged, that direction is shut down for new data. Data may continue to flow
indefinitely in the other direction, however.
❖ When both directions have been shut down, the connection is released. Normally, four TCP segments
are needed to release a connection, one FIN and one ACK for each direction. However, it is possible
for the first ACK and the second FIN to be contained in the same segment, reducing the total count to
three.
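The independent release of the two directions can be tried with a half-close. In this sketch (the function name demo_half_close, the loopback address and the payload are assumptions), the client shuts down only its sending direction, which causes a FIN to be sent, and still receives data flowing the other way.

```python
import socket
import threading


def read_until_eof(conn):
    """Collect data until recv() returns b'', i.e. the peer has sent its FIN."""
    data = b""
    while True:
        chunk = conn.recv(1024)
        if not chunk:
            return data
        data += chunk


def demo_half_close():
    listener = socket.create_server(("127.0.0.1", 0))
    address = listener.getsockname()

    def server():
        conn, _ = listener.accept()
        with conn:
            request = read_until_eof(conn)    # client -> server is now shut down
            conn.sendall(b"got " + request)   # server -> client is still open
        # Leaving the with-block closes the socket and releases this direction.

    thread = threading.Thread(target=server)
    thread.start()
    with socket.create_connection(address) as client:
        client.sendall(b"hello")
        client.shutdown(socket.SHUT_WR)       # no more data to transmit: send FIN
        reply = read_until_eof(client)        # data can still arrive
    thread.join()
    listener.close()
    return reply


if __name__ == "__main__":
    print(demo_half_close())  # b'got hello'
```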
THE APPLICATION LAYER
1. Web Browsers: Users access and interact with web pages using browsers like Chrome, Firefox, and
Internet Explorer, which fetch and render content formatted with markup and scripts.
2. Hyperlinks: Hyperlinks (text, icons, or images) connect web pages. Clicking a link prompts the
browser to load a new page, enabling seamless navigation across servers worldwide.
3. HTTP Protocol: Web pages are delivered via HTTP over TCP. This client-server communication
fetches and displays content on demand.
4. Static vs. Dynamic Pages: Static pages display fixed content, while dynamic pages are generated on
demand by a program and can personalize content in real time based on user behavior or preferences.
5. Multi-server Integration: A single web page may combine resources from multiple servers (e.g.,
main content, embedded videos, and tracking), unified by the browser into one display.
CLIENT SIDE
❖ The client side of the Web is handled by the browser, which displays web pages and manages user
interactions like clicking links. Each page is identified by a URL, which tells the browser the
protocol, server location, and specific resource to access.
❖ When a user clicks a hyperlink, the browser retrieves the page by resolving the domain name via
DNS, connecting to the server using TCP, and sending an HTTP request. The server responds with
the requested content, and the browser may make additional requests for images, videos, or scripts
from other servers.
❖ The browser then renders the page by combining all elements and showing the complete interface
to the user. Once loading is complete, idle connections are closed to conserve resources.
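The steps a browser takes to retrieve a page (resolve the name, connect with TCP, send an HTTP request, read the response) can be sketched as follows. To keep the example self-contained it fetches from a tiny local server built with Python's http.server. The names HelloHandler, fetch and demo_fetch, the page body and the use of HTTP/1.0 are assumptions for illustration, and rendering is left out entirely.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit


class HelloHandler(BaseHTTPRequestHandler):
    """Stand-in web server that returns one fixed page."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(url):
    """Fetch a URL the way the text describes; return (status line, body)."""
    parts = urlsplit(url)
    host, port = parts.hostname, parts.port or 80
    # Step 1: resolve the host name (DNS lookup for real domain names).
    family, socktype, proto, _, sockaddr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM)[0]
    # Step 2: connect to the server using TCP.
    with socket.socket(family, socktype, proto) as conn:
        conn.connect(sockaddr)
        # Step 3: send the HTTP request.
        request = f"GET {parts.path or '/'} HTTP/1.0\r\nHost: {parts.netloc}\r\n\r\n"
        conn.sendall(request.encode("ascii"))
        # Step 4: read the response until the server closes the connection.
        response = b""
        while chunk := conn.recv(4096):
            response += chunk
    head, _, body = response.partition(b"\r\n\r\n")
    return head.split(b"\r\n")[0].decode("ascii"), body


def demo_fetch():
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch(f"http://127.0.0.1:{server.server_port}/index.html")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status_line, body = demo_fetch()
    print(status_line)
    print(body.decode("ascii"))
```

A real browser would go on to parse the returned HTML and issue further requests for the images, videos or scripts it references.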
SERVER SIDE
❖ Accept Connection: Server accepts a TCP connection from the client (usually on port 80 for HTTP
or port 443 for HTTPS).
❖ Parse Request: It reads the URL path and determines the resource (file or program).
❖ Access Resource: It checks cache or fetches the file from disk or executes a program for dynamic
content.
❖ Send Response: The content is returned to the client along with headers (e.g., MIME type).
❖ Log & Close: Server logs the request, then either keeps the connection open for reuse or closes it
after the response.
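The five server-side steps above can be sketched with a plain TCP socket. This is a deliberately simplified illustration, not how a production web server is written: the PAGES dictionary stands in for the disk or cache, the names serve_one_request and demo_server are hypothetical, only one request is handled, and the connection is always closed rather than reused.

```python
import http.client
import socket
import threading

# Hypothetical in-memory "disk": path -> (content, MIME type).
PAGES = {"/index.html": (b"<h1>hi</h1>", "text/html")}


def serve_one_request(listener, log):
    conn, peer = listener.accept()                      # 1. accept connection
    with conn:
        request = b""
        while b"\r\n\r\n" not in request:               # read the request headers
            chunk = conn.recv(4096)
            if not chunk:
                break
            request += chunk
        request_line = request.split(b"\r\n", 1)[0].decode("ascii")
        method, path, _version = request_line.split(" ")  # 2. parse request
        if path in PAGES:                               # 3. access resource
            body, mime = PAGES[path]
            status = "200 OK"
        else:
            body, mime, status = b"not found", "text/plain", "404 Not Found"
        headers = (f"HTTP/1.0 {status}\r\n"
                   f"Content-Type: {mime}\r\n"
                   f"Content-Length: {len(body)}\r\n"
                   "Connection: close\r\n\r\n")
        conn.sendall(headers.encode("ascii") + body)    # 4. send response
        log.append((peer[0], method, path, status))     # 5. log, then close


def demo_server(path="/index.html"):
    log = []
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]
    thread = threading.Thread(target=serve_one_request, args=(listener, log))
    thread.start()
    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    client.request("GET", path)
    response = client.getresponse()
    result = (response.status, response.getheader("Content-Type"), response.read())
    client.close()
    thread.join()
    listener.close()
    return result, log


if __name__ == "__main__":
    print(demo_server())
```

For dynamic content, step 3 would run a program to produce the body instead of looking it up.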
❖ Definition and Structure: A URL uniquely identifies a resource on the Internet. It includes three
parts: the protocol (e.g., http), the domain name or IP address (e.g., www.example.com), and
the path to the specific file or resource (e.g., /page.html).
❖ Multiple Protocol Support: URLs are flexible and support various protocols beyond HTTP, such
as https (secure web pages), ftp (file transfer), mailto (email), file (local files), rtsp
(media streaming), sip (calls), and about (browser info).
❖ Integration into Browsers: Web browsers use URLs to access many different types of resources
and services, integrating functions like file access, media streaming, and email into one interface.
❖ Limitation of URL: A URL always points to a specific location, which can be a drawback when
content is replicated across servers. There's no built-in way to request a page without specifying its
location.
❖ From URL to URI: To address the above limitation, URLs were expanded into URIs (Uniform
Resource Identifiers). URIs include both URLs (which provide location) and URNs (Uniform
Resource Names, which provide names without location).
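The three-part structure of a URL can be inspected with urlsplit from Python's standard library. The helper name describe_url is an assumption, and the sample URLs are illustrative only.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the three parts named in the text."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,   # e.g. http, https, ftp, mailto
        "host": parts.hostname,     # domain name or IP address (None if absent)
        "path": parts.path,         # resource on that host
    }


if __name__ == "__main__":
    print(describe_url("http://www.example.com/page.html"))
    print(describe_url("mailto:user@example.com"))
```

The mailto example has no host part at all, which shows that not every scheme names a server location in the same way.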
COOKIES
❖ The web’s original design treats each page request independently, with no memory of past
interactions, which limits personalized experiences.
❖ Servers cannot reliably track users by IP address due to shared devices, NAT, and dynamic IPs.
❖ Cookies solve this by storing small pieces of data (up to 4 KB) on the client, linking browser
sessions to the server for personalization.
❖ Cookies contain fields like Domain, Path, Content (name=value), Expiry, and Secure flag (used for
HTTPS-only).
❖ Cookies enable functions like login sessions, shopping carts, preferences, and user tracking, but
also raise privacy concerns due to cross-site tracking and profiling.
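The cookie fields listed above can be assembled into a Set-Cookie header with Python's http.cookies module. This is a sketch under assumptions: the helper names build_set_cookie and read_cookie, the cookie name and value, the domain and the expiry date are all invented for illustration, and the 4 KB check is a simple length test on the whole header.

```python
from http.cookies import SimpleCookie

MAX_COOKIE_BYTES = 4096  # the "up to 4 KB" limit mentioned in the text


def build_set_cookie(name, value, domain, path, expires, secure=True):
    """Return the Set-Cookie header line a server would send."""
    cookie = SimpleCookie()
    cookie[name] = value                 # Content (name=value)
    cookie[name]["domain"] = domain      # Domain
    cookie[name]["path"] = path          # Path
    cookie[name]["expires"] = expires    # Expiry
    cookie[name]["secure"] = secure      # Secure flag: HTTPS only
    header = cookie.output()
    if len(header) > MAX_COOKIE_BYTES:
        raise ValueError("cookie too large")
    return header


def read_cookie(cookie_header, name):
    """Extract one value from the Cookie header the browser sends back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return cookie[name].value


if __name__ == "__main__":
    print(build_set_cookie("session", "abc123", "example.com", "/",
                           "Wed, 01 Jan 2031 00:00:00 GMT"))
    print(read_cookie("session=abc123; theme=dark", "session"))
```

On later requests the browser returns only the name=value pairs, which is how the server links the request to the earlier session.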