CYB 402_Fault Tolerant Routing (Lect.6)
A fault-tolerant routing system is a network infrastructure that can withstand failures in its
routing mechanisms while maintaining uninterrupted communication between devices.
The Fault Tolerant Router (daemon) uses multipath routing among multiple Internet
connections to keep you connected, even when some connections go down.
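One way to picture multipath routing over several uplinks is the minimal sketch below. It is an illustration only, not the FT Router's actual mechanism: the function name `pick_uplink`, the flow identifier string, and the uplink names are all assumptions made for the example. Each flow is hashed onto one of the uplinks that are currently up, so traffic moves to the surviving connections when one goes down.

```python
import zlib

def pick_uplink(flow_id, uplinks):
    """Map a flow to one of the uplinks that are currently up.

    flow_id: hypothetical string identifying a flow (e.g. "src->dst:port").
    uplinks: dict of uplink name -> True (up) / False (down).
    """
    alive = sorted(name for name, up in uplinks.items() if up)
    if not alive:
        raise RuntimeError("no uplink available")
    # A stable hash keeps all packets of one flow on the same uplink
    # for as long as the set of live uplinks does not change.
    return alive[zlib.crc32(flow_id.encode()) % len(alive)]

# Hypothetical state: two providers, the second one is currently down.
state = {"provider_a": True, "provider_b": False}
chosen = pick_uplink("10.0.0.5->203.0.113.9:443", state)
```

With `provider_b` down, every flow lands on `provider_a`; once both are up, flows are spread across the two.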
Stable Internet connections (uplinks) are mission critical in many enterprises. Unfortunately,
they often break down. If you want redundant uplinks through two or more
providers, you will typically turn to the Border Gateway Protocol (BGP). This solution
can be fairly expensive, though, because providers charge dearly for enterprise-level
connections. With a few restrictions, you can achieve redundancy far less expensively by opting
for Linux and the Fault Tolerant (FT) Router.
A Linux host typically sends its packets with the help of a routing table. All packets that do not
belong to a specific route follow the default route, which usually leads to the Internet. If this link
fails, all the users on the inside are cut off from the Internet. The reasons for failure can be many,
including bulldozers digging up cables, Layer 2 or 3 software failures, or routers that fail one hop
downstream on the provider's network.
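The lookup described above can be sketched in a few lines. This is a simplified model rather than the Linux kernel's implementation, and the networks and interface labels in `ROUTES` are hypothetical: the most specific matching route wins, and the default route (0.0.0.0/0) catches everything else.

```python
import ipaddress

# Hypothetical routing table: (destination network, next hop / interface label).
ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), "eth1 (LAN)"),
    (ipaddress.ip_network("10.0.0.0/8"), "eth2 (internal)"),
    (ipaddress.ip_network("0.0.0.0/0"), "eth0 (default route, uplink)"),
]

def lookup(destination, routes=ROUTES):
    """Return the next hop of the most specific route containing destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if addr in net]
    if not matches:
        return None  # no route at all, not even a default
    # Longest prefix wins; the /0 default only wins when nothing else matches.
    return max(matches, key=lambda m: m[0].prefixlen)[1]
```

Calling `lookup("8.8.8.8")` falls through to the default route, while `lookup("8.8.8.8", ROUTES[:2])` returns `None`, which models the situation where the default route is gone and inside users lose Internet access.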
To avoid hard disk failures, administrators have relied on RAID for a long time; in the simplest
case, this means simply doubling the number of disks in a mirroring RAID. This isn't quite as
easy for access lines. In the classic setup for this scenario, at least two Internet providers
safeguard the network; that is, your own network has two uplinks.
The administrator needs to announce the network's routes to the rest of the world using BGP (on
internal networks, an interior routing protocol such as OSPF, or Open Shortest Path First, can
serve the same purpose). If one link fails, the protocols notice this and stop sending packets
over the dead link.
The protocols detect failures automatically. If connectivity at the Internet Protocol (IP) layer fails even though the link
is working perfectly at the lowest level, the routing protocol notices this through active
monitoring. Although it can take a while, at least the changeover happens without intervention
(i.e., without forcing the administrator out of bed in the middle of the night).
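Active monitoring of this kind can be sketched as a counter of failed probes. The class name `LinkMonitor` and the threshold of three failures are assumptions for illustration, not values taken from BGP, OSPF, or the FT Router. Waiting for several consecutive failures avoids reacting to a single lost probe, which is also one reason the changeover can take a while.

```python
class LinkMonitor:
    """Declare a link down after several consecutive failed probes."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures  # hypothetical threshold
        self.failures = 0
        self.up = True

    def record(self, probe_ok):
        """Feed one probe result; return the resulting link state."""
        if probe_ok:
            # Any successful probe resets the counter and restores the link.
            self.failures = 0
            self.up = True
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.up = False
        return self.up

monitor = LinkMonitor(max_failures=3)
states = [monitor.record(ok) for ok in (True, False, False, False, True)]
```

In this run the link stays up through the first two failed probes, is marked down on the third, and comes back on the next successful probe.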
Summary
Fault-Tolerant Routing (FTR) is a routing technique that ensures reliable data transmission in a
network by adapting to faults or failures in the network topology. It aims to find alternative paths
to maintain connectivity and minimize disruptions.
By implementing Fault-Tolerant Routing, networks can ensure reliable data transmission,
minimize disruptions, and maintain connectivity even in the presence of faults or failures.
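Finding an alternative path after a failure can be illustrated with a shortest-path search that skips failed links. The sketch below uses Dijkstra's algorithm on a small hypothetical topology (`GRAPH`, with made-up link costs); it is one common way to compute such paths, not a method prescribed by these notes.

```python
import heapq

def shortest_path(graph, source, target, failed=frozenset()):
    """Dijkstra over graph {node: {neighbor: cost}}, skipping failed links.

    failed: set of frozenset({a, b}) pairs naming links that are down.
    Returns (cost, path), or None if target is unreachable.
    """
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, weight in graph[node].items():
            if neighbor in visited or frozenset((node, neighbor)) in failed:
                continue
            heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None

# Hypothetical four-node topology with two disjoint paths from A to D.
GRAPH = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "D": 1},
    "C": {"A": 4, "D": 1},
    "D": {"B": 1, "C": 1},
}

normal = shortest_path(GRAPH, "A", "D")
rerouted = shortest_path(GRAPH, "A", "D", failed={frozenset(("B", "D"))})
```

With all links up, traffic takes A, B, D at cost 2. When the B to D link fails, the search falls back to A, C, D at cost 5, so connectivity is kept at the price of a longer path.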
Key aspects of Fault-Tolerant Routing:
1. Network redundancy: Duplicate paths and links to ensure connectivity.
2. Real-time monitoring: Continuously monitor network status and detect faults.
3. Adaptive routing: Dynamically adjust routing decisions based on network changes.
4. Path diversity: Use multiple paths to avoid single points of failure.
5. Fast failure detection: Quickly identify and respond to faults.
Application areas of Fault-Tolerant Routing include telecommunication networks.
A unified architectural approach extends a well-known hardware fault-tolerance approach without
violating the fundamental hardware fault-tolerance design principles, and it provides a possible
solution to the problem of correlated software errors.
Integrated Hardware/Software Fault Tolerance (IHSFT) combines hardware and software
techniques to achieve fault tolerance in a system. This approach integrates redundant hardware
components with software-based fault detection, isolation, and recovery mechanisms to ensure
system reliability and availability.
Integrating hardware and software fault tolerance techniques provides a robust and reliable
solution for systems requiring high availability and reliability.
Key aspects of Integrated Hardware/Software Fault Tolerance (IHSFT):
1. Hardware redundancy: Duplicate critical hardware components.
2. Software fault tolerance: Use software techniques to detect and recover from faults.
3. Integrated design: Combine hardware and software fault tolerance techniques.
4. Fault detection and isolation: Identify and isolate faults quickly.
5. Recovery and reconfiguration: Restore system functionality after a fault.
The above can be taken as steps toward creating a robust fault-tolerant system.
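The combination of redundancy with software fault detection listed above can be illustrated by a majority voter, the idea behind triple modular redundancy. This is a minimal sketch under stated assumptions: the function name `run_redundant` is hypothetical, replicas are modeled as plain Python callables, and their outputs are assumed to be hashable.

```python
from collections import Counter

def run_redundant(replicas, x):
    """Run every replica on x and return the majority result.

    A replica that raises is treated as failed and casts no vote.
    Raises RuntimeError when no value is backed by a strict majority
    of all replicas, i.e. when the fault cannot be masked.
    """
    outputs = []
    for replica in replicas:
        try:
            outputs.append(replica(x))
        except Exception:
            pass  # fault detected and isolated: this replica is ignored
    if outputs:
        value, count = Counter(outputs).most_common(1)[0]
        if count * 2 > len(replicas):
            return value
    raise RuntimeError("no majority: fault cannot be masked")

def square(x):
    return x * x

def faulty_square(x):
    return x * x + 1  # models a replica producing a wrong result

result = run_redundant([square, square, faulty_square], 3)
```

Two correct replicas outvote the faulty one, so the caller still receives 9. With only one correct replica out of three, the voter refuses to answer rather than return a possibly wrong value.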
Techniques used in IHSFT:
1. Hardware redundancy: Duplicate hardware components (e.g., CPUs, memories).
2. Software redundancy: Use redundant software components (e.g., processes, threads).
3. Error detection and correction: Use codes (e.g., ECC) to detect and correct errors.
4. Fault tolerance protocols: Implement protocols with built-in fault tolerance (e.g., TCP,
which retransmits lost segments; UDP offers no such recovery by itself).
5. Self-diagnosis and repair: Use software to diagnose and repair hardware faults.
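Error detection and correction with a code can be made concrete with the classic Hamming(7,4) code, chosen here only as a small example of an error-correcting code; the notes do not say which code is meant. Four data bits are protected by three parity bits, and any single flipped bit can be located and corrected.

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    """Correct at most one flipped bit.

    Returns (data bits, error position), where position is 1-based
    and 0 means no error was detected.
    """
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s3  # syndrome read as the position of the bad bit
    if pos:
        c[pos - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], pos

codeword = hamming74_encode([1, 0, 1, 1])
corrupted = list(codeword)
corrupted[4] ^= 1  # simulate a single-bit fault at position 5
data, error_pos = hamming74_decode(corrupted)
```

The decoder recovers the original data bits and reports position 5 as the faulty bit. Two or more flipped bits exceed what this code can correct.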
Benefits of IHSFT:
1. Improved system reliability
2. Increased availability
3. Reduced downtime
4. Enhanced performance
5. Better fault coverage
6. Simplified maintenance
Applications of IHSFT:
1. Mission-critical systems (e.g., aerospace, healthcare)
2. High-availability systems (e.g., data centers, financial services)
3. Safety-critical systems (e.g., transportation, energy)
4. Real-time systems (e.g., process control, robotics)
5. IoT devices (e.g., industrial sensors, smart home devices)