
Troubleshooting OSI Layers 4−7

In this two-part white paper series, learn to quickly locate and resolve problems across the
OSI layers using the Troubleshooting Cheat Sheet.

The root causes of application anomalies don't stop at Layer 3 of the Open Systems Interconnection (OSI) model. In fact, some of the most difficult-to-diagnose service issues are rooted in, or manifest themselves at, the Transport Layer (Layer 4) or higher.

Since the Transport Layer is responsible for building and maintaining sessions between devices, it serves to connect the lower layers (1-3) to the higher layers (5-7). It can also be the point of referred pain when the real source of the problem lies elsewhere.

A methodical process for resolving service issues that reside at Layers 4-7 can clear up ambiguities and accelerate problem resolution. Mike Motta, NI University instructor and troubleshooting expert, and Tony Fortunato, Senior Network Performance Specialist and Instructor with The Technology Firm, place typical user complaints into three categories:

• Slow network
• Inability to access network resources
• Application-specific issues

Based upon the answers to the questions outlined in the following Troubleshooting Cheat Sheet, you’ll gain a better understanding of
the symptoms and be able to isolate the issue to the correct layer of the OSI model.

Complaint: Slow Network

• What to ask: What type of application is being used? Is it web-based? Is it commercial, or a homegrown application?
  What it means: Determines whether the person is accessing local or external resources.

• What to ask: How long does it take the user to copy a file from the desktop to the mapped network drive and back?
  What it means: Verifies they can send data across the network to a server, and allows you to evaluate the speed and response of the DNS server.

• What to ask: How long does it take to ping the server of interest?
  What it means: Validates they can ping the server and obtain the response time.

• What to ask: If the time is slow for a local server, how many hops are needed to reach the server?
  What it means: Confirms the number of hops taking place. Look at switch and server port connections, speed to the client, and any errors.

Complaint: Inability to Access Network Resources

• What to ask: What task is the user attempting to perform?
  What it means: Indicates whether the action is limited to a specific resource, such as a mapped drive on a server, or to multiple network resources.

• What to ask: What type of application is the user attempting to access?
  What it means: Similar to the above question on application type, this may point to a problem with multiple internal servers.

Complaint: Application-Specific Issues

• What to ask: What's the 3-way handshake time?
  What it means: Identifies potential points where a slowdown might be occurring.

• What to ask: What's the server processing time?
  What it means: Points to whether the server is taking too long to process data.

• What to ask: How much data are you pulling from the application and sending across the network?
  What it means: Assesses whether the application is sending data in an expected and efficient way.
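Several of these questions are timing measurements the troubleshooter can take directly from the affected client. As a rough illustration (not part of the original cheat sheet), the following Python sketch times a DNS lookup and runs the system ping command against a server of interest; the hostname is a placeholder.

```python
import socket
import subprocess
import time

HOST = "fileserver.example.com"  # placeholder: the server the user is complaining about

# Time the name lookup; slow DNS responses are a common "slow network" culprit.
start = time.perf_counter()
addrinfo = socket.getaddrinfo(HOST, None)
dns_ms = (time.perf_counter() - start) * 1000.0
print(f"DNS lookup for {HOST}: {dns_ms:.1f} ms -> {addrinfo[0][4][0]}")

# Ping the server of interest to check reachability and round-trip time.
# Note: "-c" (count) is the Linux/macOS flag; Windows ping uses "-n" instead.
result = subprocess.run(["ping", "-c", "4", HOST], capture_output=True, text=True)
print(result.stdout)
```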

For additional information on application-specific issues, check out Mike Motta’s Tech Tips.



Continuing Up the Stack: The 4 Layers

With these questions answered, working through the OSI model is a straightforward process. With the exception of Layer 1, each layer of the OSI model relies on the next lower layer to provide services as specified. Requests drop down and are completed, as every layer interacts with the next layer, both above and below.

When dealing with different layers, understanding how each delivers data and functions impacts how you will troubleshoot.

Layer Highlights and Functions

Transport Layer
• End-to-end connection and connectionless data delivery management
• Ensures reliable packet delivery despite network congestion and errors
• Congestion avoidance and data transmission flow control

Note: The majority of the functions below for Layers 5-7 are logically combined into the "Application Layer" for the purposes of discussion in this paper. Some capabilities, the application connectivity of the Session Layer in particular, can be thought of as residing within Layer 4.

Session Layer
• Establishes, manages, and terminates application connections
• Manages data transfer permissions and records upper layer errors

Presentation Layer
• Application, system, and network independent data formatting
• Data conversion, compression, encryption, and decryption

Application Layer
• Supports application and end-user processes
• Offers application services
• User identification, authentication, and privacy

Second Steps to Troubleshooting Success: Transport Layer Checklist

1. Is the TCP three-way handshake successful?
The TCP three-way handshake is the foundation on which the application session is built. "Without it, your service is DOA before it even gets out of the gate," says Mike Motta.

At this point the session client-server connection is complete and the application layer activities can begin. Occasionally, the session request is delayed or fails because the application socket or port is too busy to support it (or because there are lower layer network issues). In this case, rather than responding with a SYN-ACK, the server will reply with an RST (or simply ignore the SYN).

Below is an example of an abnormal three-way handshake. You can see the client needs to say "Are you there?" (transmitting a SYN) three times before the server finally responds (with a SYN-ACK). Definitely not a good way to start a conversation (and if seen repeatedly it should be investigated).

Assuming the three-way handshake is successful, Motta states that, "One important piece of data you can glean is the total network roundtrip time. It's measured from the SYN to the ACK and is useful for assessing the ability of the underlying network to service devices, as it does not include any application processing overhead. Think of it as the 'best time' you can expect to attain from a responsiveness standpoint between client and server. Given this, it can be useful in measuring the infrastructure's ability to support lower-latency applications."
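One quick way to approximate that SYN-to-ACK round-trip time without opening a protocol analyzer is to time a TCP connect: the call completes when the three-way handshake does, so no application processing is included. A minimal sketch, with a placeholder host and port:

```python
import socket
import time

HOST, PORT = "appserver.example.com", 443  # placeholders for the server and service port

samples = []
for _ in range(5):
    start = time.perf_counter()
    # connect() completes when the three-way handshake (SYN, SYN-ACK, ACK) finishes,
    # so the elapsed time approximates the network round trip with no app overhead.
    with socket.create_connection((HOST, PORT), timeout=5):
        pass
    samples.append((time.perf_counter() - start) * 1000.0)

print(f"TCP handshake time to {HOST}:{PORT} -> "
      f"min {min(samples):.1f} ms, avg {sum(samples)/len(samples):.1f} ms, max {max(samples):.1f} ms")
```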



Client application requests can then begin; below is an example with HTTP:

2. Is TCP Repeating Itself Too Often?
The Pain of Excessive Retransmissions
It's important to emphasize here that, as a connection-based transport, TCP is a highly robust protocol that can tolerate the re-sending of packets. In fact, even the healthiest network will drop some packets.

TCP is also great at shielding the application from this activity at low levels of packet loss, unless it becomes excessive. The potential root causes of too many retransmissions are varied, from problems with TCP itself (e.g. checksum or sequence number generation errors) to degraded physical connectivity (very common, due to bad cabling or a bad switch port, etc.).

Retransmissions: Real or Fast?
"Retransmissions come in two varieties," says Tony Fortunato. "The first I will refer to as Real. In this case the TCP timeout value is reached before the data is received and the packet must be retransmitted (after considerable delay), which may impact app performance."

"The second is a Fast retransmission. In this scenario, the receiving TCP stack continues to resend duplicate ACKs to the sender for the last contiguous sequenced packet number received as each new out-of-order packet arrives. This is much better, as the sender can retransmit the packet that is assumed lost without waiting for the timeout to occur. This usually minimizes the impact to the app and vividly illustrates the power of TCP as it sorts out an occasional lost packet," says Fortunato.

Assuming neither of these is an issue, the next most frequent causes are an overloaded link or a TCP server stack busy condition.

The former is caused when a switch or router is simply overwhelmed with the amount of traffic passing through it. At some point the deluge exceeds the processing capability and packets must be discarded. Likewise, if the application server workload exceeds its processing capability, its TCP stack will ignore or delay a client request. This will be viewed by the client as packets dropped or lost, and a retransmission request will be initiated.

The solution in both cases is to reduce the load on the devices and/or increase the switch/router or server processing performance to support the heightened workloads.

Did You Know: The three top communication disruptors are excessive retransmissions, flow control issues, and congestion.

Below is an example of multiple retransmissions for a web session:

There will always be some lost packets (and hence retransmissions). As long as these are not excessive, other frequent TCP issues are probably related to flow and congestion control.
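Before moving on to flow and congestion control, note that the two varieties Fortunato describes can be picked out of a capture you have already decoded: a sequence number re-sent long after its first transmission points to a timeout (Real) retransmission, while a run of duplicate ACKs from the receiver is the trigger for a fast retransmission. The sketch below is illustrative only and assumes you have exported per-packet timestamps, sequence numbers, and ACK numbers from your analyzer:

```python
# Illustrative detection of the two retransmission types from exported packet fields.
# sent: (timestamp_seconds, seq_number) for each data segment the sender transmitted.
# acks: ack_number for each pure ACK the receiver returned, in arrival order.

def find_timeout_retransmissions(sent, min_gap=0.2):
    """Flag sequence numbers re-sent after a long gap (suggesting an RTO expired)."""
    first_seen = {}
    flagged = []
    for ts, seq in sent:
        if seq in first_seen and ts - first_seen[seq] >= min_gap:
            flagged.append((seq, round(ts - first_seen[seq], 3)))
        first_seen.setdefault(seq, ts)
    return flagged

def find_fast_retransmit_triggers(acks, threshold=3):
    """Flag ACK values duplicated three or more times in a row (fast-retransmit trigger)."""
    triggers, run, last = [], 0, None
    for ack in acks:
        run = run + 1 if ack == last else 0
        if run == threshold:
            triggers.append(ack)
        last = ack
    return triggers

# Example data: segment 3000 is re-sent 0.4 s later; ACK 2000 repeats four times.
sent = [(0.00, 1000), (0.01, 2000), (0.02, 3000), (0.42, 3000)]
acks = [1000, 2000, 2000, 2000, 2000, 3000]
print("Timeout (Real) retransmissions:", find_timeout_retransmissions(sent))
print("Fast-retransmit triggers (dup ACKs):", find_fast_retransmit_triggers(acks))
```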



3. Is TCP Transmitting Too Slowly?
Incorrect Flow Control Can Throttle App Performance
Incorrect window size can have a significant impact on application performance. Window size is the end-to-end TCP flow control mechanism used to ensure the sender does not transmit data faster than the receiver can receive and process. Its value is expressed in bytes.

The graphic below shows that the receiving device's window size is 64,588 bytes. The window size should be large enough to deliver sufficient amounts of data to the service or application for acceptable user performance without exceeding the receiver's TCP buffering capacity (otherwise packets will be discarded and need to be retransmitted).

Once the sender reaches the maximum amount of data advertised in the window size, it must wait for an ACK with an updated window size from the receiver before proceeding with more data.

Generally, assuming a reasonable value, window sizing is not an issue on a local network. However, higher levels of network latency, often associated with poor WAN performance, can starve an app.

Care and Feeding of Applications
Considering the round-trip time of the network, whenever a sender transmits data and the window size reaches zero, it must stop and wait for its data to:

• Traverse the network
• Reach the receiver
• Get the receiver's ACK (with an updated window size)

As the travel time grows longer due to latency, applications can be left waiting for TCP to request more data. The solution here can vary. If possible, and the application can support it, simply increase the window size.

You can also re-architect the application host deployment so as to reduce the latency between tiers. Of course, lowering network latency is another way to solve the issue if your budget supports the associated costs in higher WAN speeds and/or faster infrastructure devices.

Closely related to window sizing are the topics of chatty apps and application read/write buffer sizing.

3 ways to resolve latency:
• Increase the window size
• Re-architect app deployment
• Upgrade infrastructure

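The starvation effect described above can be put into numbers: with a fixed window, a sender can have at most one window of unacknowledged data in flight per round trip, so throughput is capped at roughly the window size divided by the round-trip time. A quick sketch using the 64,588-byte window from the earlier example (the RTT values are illustrative):

```python
# Throughput ceiling for a fixed TCP window: at most one window per round trip.
WINDOW_BYTES = 64588  # advertised window size from the capture example above

for rtt_ms in (1, 10, 50, 100):          # illustrative LAN-to-WAN round-trip times
    ceiling_bps = WINDOW_BYTES * 8 / (rtt_ms / 1000.0)
    print(f"RTT {rtt_ms:3d} ms -> max ~{ceiling_bps / 1e6:6.1f} Mbit/s")
```

At a 100 ms WAN round trip, that window caps a single connection at roughly 5 Mbit/s no matter how fast the links are, which is why increasing the window size or reducing latency are the levers listed above.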


4. Is TCP Experiencing Roadblocks?
Congestion Control Can Point to the Solution
Network congestion can significantly degrade performance. Fortunately, recent additions to the TCP standards revolve around congestion control. As network complexity and speeds continue to grow, the ability of TCP to effectively manage congestion has increased.

There are four algorithms that are used simultaneously:

• Slow start
• Congestion avoidance
• Fast retransmit
• Fast recovery

If the previously discussed concepts fail to improve performance, and assuming you suspect the issue remains within the transport layer, please refer to RFC 5681 for additional details.

Optimize TCP Performance
Fortunato and Motta offer three frequently overlooked ways to improve your TCP transport layer performance.

Segment Size
The optimal segment size (not counting TCP header and packet overhead) is 1460 bytes (assuming 10/100 Ethernet). Motta calls out anything smaller as a potential waste of network resources, because sending lots of small packets exacerbates the connection-oriented nature of TCP (which in MS Windows requires an ACK for every two packets or 200 ms).

"I constantly remind my clients that if their app can support it, this is the segment size that best leverages their network assets and offers a great way to improve service performance," says Motta.

Windows Scaling
Since the window size control field is limited to no more than 65,535 bytes, a TCP window scale option (defined in RFC 1323) can be used to increase the maximum window size up to a gigabyte. This is a great way to significantly increase TCP throughput.

"However, before doing this be sure to confirm your devices can support the added required buffering, otherwise it can corrupt your drivers," says Fortunato. "Also, your app must be able to handle the increased amount of possible incoming data."

Selective Acknowledgement
The TCP protocol as initially designed can lead to inefficiencies because of the cumulative ACK scheme. Basically, this means that a receiver is unable to say it received later data if it failed to receive earlier bytes. Thousands of bytes may be received, but if the first 1,000 bytes are missing, all the data may have to be re-sent.

RFC 2018 defines an optional selective acknowledgement (SACK), enabling the receiver to ACK discontinuous chunks of packets that were successfully received. The receiver communicates the beginning and end of a contiguous range of packets that it has (via the sequence numbers), allowing the sender to simply resend the lost packets. "I've worked with many clients that have not implemented this great TCP feature," says Fortunato. "To me it's a simple way to improve overall service performance, especially since even the most robust networks will drop packets."

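Whether SACK, window scaling, and a modern congestion control algorithm are actually enabled is easy to verify on the hosts you manage. The sketch below is Linux-only, reading the settings exposed under /proc/sys/net/ipv4; other operating systems expose the equivalents through their own tools.

```python
# Linux-only: report the TCP features discussed above from /proc/sys/net/ipv4.
from pathlib import Path

SETTINGS = {
    "tcp_sack": "Selective acknowledgement (RFC 2018)",
    "tcp_window_scaling": "Window scale option (RFC 1323)",
    "tcp_congestion_control": "Congestion control algorithm in use",
}

for name, description in SETTINGS.items():
    path = Path("/proc/sys/net/ipv4") / name
    value = path.read_text().strip() if path.exists() else "not available"
    print(f"{name:24} {value:10} # {description}")
```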


Application Layers
At this point in the debug process, you've hopefully eliminated any Layers 1-4 issues. Due to the complexity and number of applications in a modern data center, it is important to realize that the concepts provided here are limited to basic fundamentals.

HTTP, a protocol used for many application front-ends and of course for web-based traffic, is used to illustrate the process of debugging an app issue.

Application Read/Write Block Buffer Sizing
Block buffer sizing is an important metric that is often overlooked or not visible to the network engineer (who frequently is not familiar with app details).

The block buffer size is the maximum amount of data in bytes that the app can support queued up. This information is what's passed to the stack for use in TCP window sizing. If the value is too low, it will effectively throttle the overall app performance, potentially dramatically. What looks on the surface like a slow network (and will show up as zero window problems) is actually a pure application layer issue.

Chatty Apps
If an application requires packet sizing that is appreciably lower than the optimal segment size mentioned above (1460 bytes), it is considered chatty. After multiple clients complained about degraded performance, even after checking out their networks and finding them fine, Motta explained the effects of chatty applications.

"When clients re-architect their application deployments, applications that once worked fine when hosted all in-house (with low latency) suddenly act up when increased network latencies (associated with WAN links) bring the application's low payload limits to the surface," Motta says.

Motta recommends updating the app to support larger segments. If this is not an option, reduce the network latency to levels that will enable the service to run within acceptable user performance parameters.

Fast Facts: HTTP status codes are divided into five groups:
• 1XX – Informational
• 2XX – Success
• 3XX – Redirection
• 4XX – Client Error
• 5XX – Server Error

Status Codes
Digging into application-specific issues requires the ability to perform payload-level analysis to assess exactly how an app is responding to users' requests. Depending on the service, reason, error, status, or condition, codes can be captured and translated to infer how the application is performing. There are a number of performance monitoring vendors that offer these capabilities.

Check out a comprehensive list to see individual values. Here are the most common:

Status Code: 200 OK
What it means: A successful HTTP request; your world would be happier if this was the only code you ever saw.
What to do about it: Nothing.

Status Code: 304 Not Modified
What it means: The client already has the information (which has not changed), so the web server does not need to retransmit it; this is also a good response.
What to do about it: Nothing.

Status Code: 404 Not Found
What it means: The user has requested a URL that does not exist.
What to do about it: Nothing, unless you see too many of them to the same URL; then it could be someone attempting to gain access, either innocently (e.g. perhaps an incorrect URL was provided to customers?) or for nefarious reasons. Further investigation may be merited.

Status Code: 500 Internal Server Error
What it means: Something bad has happened inside the web server; "Houston, we have a problem" with a system.
What to do about it: Get in touch with the web server team. If more than a few of these are noted, it could be an early warning of underlying problems.
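Even without a dedicated monitoring product, the status code distribution is easy to pull from a web server access log. The sketch below assumes a common/combined log format, where the status code is the field immediately after the quoted request line, and the log path is a placeholder:

```python
# Tally HTTP status codes from an access log (common/combined log format assumed).
import re
from collections import Counter

LOG_PATH = "/var/log/nginx/access.log"  # placeholder: adjust for your web server

# In common/combined log format the status code follows the quoted request,
# e.g.  "GET /index.html HTTP/1.1" 200 1043
STATUS_RE = re.compile(r'" (\d{3}) ')

counts = Counter()
with open(LOG_PATH) as log:
    for line in log:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)] += 1

total = sum(counts.values()) or 1
for code, count in counts.most_common():
    print(f"{code}: {count:7d} ({100.0 * count / total:.1f}%)")
# A rising share of 404s to one URL, or of 5xx codes, is worth investigating.
```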



Below are examples of web server responses. The first is the expected 200 OK. This means all is well.

On the other hand, as you can see in the diagram below, a 500 Internal Server Error is bad news.

As long as you primarily see 200 OKs (and of course assuming acceptable Layers 1-4 health), your web-based service should be delivering acceptable user performance. The one caveat is that you will want to set up adequate baseline response times to ensure that data is received in sufficient time to maintain high levels of customer satisfaction.
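Establishing that baseline does not require anything exotic; a scheduled check that records the status code and total fetch time for a representative page is enough to tell normal from degrading. A minimal standard-library sketch, with a placeholder URL and threshold:

```python
# Record the status code and total response time for a web front-end,
# and compare against a locally chosen baseline threshold.
import time
import urllib.request

URL = "https://app.example.com/health"   # placeholder: a representative page or health check
BASELINE_MS = 500.0                      # placeholder: agreed acceptable response time

start = time.perf_counter()
with urllib.request.urlopen(URL, timeout=10) as response:
    body = response.read()
    status = response.status
elapsed_ms = (time.perf_counter() - start) * 1000.0

flag = "OK" if status == 200 and elapsed_ms <= BASELINE_MS else "INVESTIGATE"
print(f"{URL}: HTTP {status}, {len(body)} bytes in {elapsed_ms:.0f} ms -> {flag}")
```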

While most applications use similar types of messaging to communicate their ability to effectively service clients' requests, the format and meanings will be unique. The key is to work with your application teams to understand what these various codes are, and when errors are detected, to make the specific group aware so the issue can be quickly resolved.

Conclusion

The life of the network engineer or administrator gets more interesting and challenging every day. As the first responder, network staff will often play a pivotal role in fixing the problem. What that means is they will need to have a solid understanding of all layers of the OSI model, correcting those issues that reside within the network and assisting the application or systems teams when possible.

Also, modern applications and the more complex hosting strategies that distribute app tiers around the world make detailed transport and application layer awareness ever more important. Use the methodologies and suggestions in this paper as a starting point to ensure that your network resources are ready to support today's modern apps and solve problems when they occur.

Contact Us: +1 844 GO VIAVI (+1 844 468 4284)
To reach the Viavi office nearest you, visit viavisolutions.com/contacts.
viavisolutions.com

© 2015 Viavi Solutions Inc. Product specifications and descriptions in this document are subject to change without notice.
troubleshootingosilayers4−7-wp-ec-ae
30176217 901 0315
