ITNS1
Han et al
ABSTRACT
A computer network intrusion detection and prevention system consists of collecting network traffic data, discovering user behavior patterns as intrusion detection rules, and applying these rules to prevent malicious use and misuse. Many commercial off-the-shelf (COTS) products have been developed to perform each of these tasks. In this paper, the component-based software engineering approach is exploited to integrate these COTS products as components into a computerized system that automatically discovers intrusion detection rules from network traffic data and sets up IPTables to prevent potential future attacks. The component-based software architecture of this kind of system is designed, COTS components are analyzed and selected, adaptor components to connect the COTS products are developed, the system implementation is illustrated, and a preliminary system experiment is presented.
Keywords: Component-based software engineering, Software reuse, Data
mining, Network security, Intrusion detection and prevention.
1- INTRODUCTION
As network attacks have increased in number and severity over the past years, intrusion detection and prevention has become a necessary addition to the security infrastructure of most organizations.
A computer network intrusion detection and prevention system (IDPS) consists of collecting network traffic data, discovering user behavior patterns as intrusion detection rules, and applying these rules to prevent malicious use and misuse [1]. Many commercial off-the-shelf (COTS) products have been developed to perform each of these tasks. However, most of these COTS products require users to set up the necessary rules manually before the products work well. Unfortunately, figuring out these rules is a challenging task. The objective of this paper is to compose these products to create an automatic intrusion detection and prevention system.
One of the most promising solutions to this objective today is the component-based software development approach. This approach is based on the idea that software systems can be developed by selecting appropriate off-the-shelf
2- RELATED WORK
The problem of network intrusion detection has been studied extensively in computer security and has received much attention in the machine learning and data mining communities. Many different approaches have been developed and implemented to detect anomalies and/or misuses. These approaches can be summarized as follows [3, 4, 5, 6, 7, 8, 9, 10, 11]:
Using data mining techniques over system audit data to extract consistent and useful patterns of program and user behavior, and to build classifiers that can recognize anomalies.
Using temporal association rules (a data mining technique that uses time concepts) in terms of multiple time granularities. The temporal association rules technique generates fuzzy and classical rules.
Using short sequences of system calls performed by running programs as discriminators between normal and abnormal operating characteristics.
Distributing the detection task among multiple independent entities (autonomous agents) working collectively.
a component is, which features it supports, how it can be composed, etc. With
respect to the used component model and supporting tools, the component
systems can be divided by several points of view, e.g. who drives the component system (industry driven vs. academia-driven systems), type of composition
(flat vs. hierarchical component models), target domain (enterprise application,
embedded applications, general purpose component models), supported parts
of application lifecycle (design-only system vs. systems supporting complete
development cycle). Each of these systems has its pros and cons and each of
them brings different advantages to CBSE [12].
CBSE can be implemented using either a pipeline or a flow-based paradigm. A flow-based implementation defines the application as a network of black box processes, which exchange data across predefined connections by message passing [10]. These black box processes can be reconnected endlessly to form different applications without having to be changed internally. This implementation is thus naturally component-based, and the focus is on the application data and the transformations applied to it to produce the desired outputs.
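The flow-based idea can be illustrated with a minimal sketch, assuming that Python generators stand in for the black box processes and that chaining them stands in for the predefined connections. The process names (capture, preprocess, count_large) and the packet fields are hypothetical and do not belong to any COTS product discussed in this paper.

```python
from typing import Iterable, Iterator, Tuple


def capture(raw_packets: Iterable[dict]) -> Iterator[dict]:
    """Black box process 1: emit packets one at a time (stand-in for a sniffer)."""
    for packet in raw_packets:
        yield packet


def preprocess(packets: Iterable[dict]) -> Iterator[Tuple[int, int]]:
    """Black box process 2: reduce each packet to a flat (ip_len, dport) record."""
    for packet in packets:
        yield (packet["ip_len"], packet["dport"])


def count_large(records: Iterable[Tuple[int, int]], threshold: int) -> int:
    """Black box process 3: a trivial consumer that counts long packets."""
    return sum(1 for ip_len, _ in records if ip_len > threshold)


def run_pipeline(raw_packets: Iterable[dict], threshold: int = 377) -> int:
    # The connections are expressed purely by chaining; no process is
    # changed internally when the network is rewired.
    return count_large(preprocess(capture(raw_packets)), threshold)
```

Because each process only consumes and produces a stream, any of them could be replaced by another process with the same input and output shape without touching the others.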
A component in CBSE is usually a software unit that can be independently deployed and composed, without modification, with other components to create a software system according to a composition standard, and that may be subject to composition by third parties [15]. One of the most useful ways to consider a component is as a standalone service provider. Thus a reusable component emphasises two characteristics: the component is an independent executable entity, and the services offered by a component are made available through an interface, with all interactions taking place through that interface. Each component can be defined by its two related interfaces [12]: a requires interface, which defines the services from the component's environment that it uses and that must be provided by other components in the system; and a provides interface, which defines the services provided by the component to other components. The provides interface is essentially the component API and defines the methods that can be called by a user of the component.
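As a minimal sketch of the two interfaces, assuming Python's typing.Protocol is used to state them explicitly, the component below declares what it requires from its environment and what it provides to others. All class and method names (PacketSource, LearningDataProvider, DataAdaptor, InMemorySource) are hypothetical illustrations, not part of the system described here.

```python
from typing import List, Protocol


class PacketSource(Protocol):
    """Requires interface: a service the component expects from its environment."""

    def fetch_packets(self) -> List[dict]: ...


class LearningDataProvider(Protocol):
    """Provides interface: the service the component offers to other components."""

    def learning_records(self) -> List[str]: ...


class DataAdaptor:
    """A component that satisfies LearningDataProvider and requires a PacketSource."""

    def __init__(self, source: PacketSource) -> None:
        self._source = source  # all interaction goes through the interface

    def learning_records(self) -> List[str]:
        return [f'{p["ip_len"]},{p["label"]}' for p in self._source.fetch_packets()]


class InMemorySource:
    """A trivial stand-in for a component that provides packets."""

    def __init__(self, packets: List[dict]) -> None:
        self._packets = list(packets)

    def fetch_packets(self) -> List[dict]:
        return self._packets
```

Any object with a matching fetch_packets method can be plugged into DataAdaptor, which is the property that makes composition by third parties possible.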
Most intrusion detection systems have been designed and developed as compositions of various components based on their functionalities [16, 17]. However, very few publications have studied or practiced the development of intrusion detection systems using component-based software engineering. Yau and Zhang developed a framework for computer network intrusion detection, assessment and prevention [18], which consists of three components: an intrusion detection component that uses the audit data collected from multiple network nodes and services, and an intrusion assessment component and an intrusion prevention component that both use security dependency relations and ripple effect analysis. However, the authors did not compose any COTS products, nor did they consider the interfaces between components. Kemmerer and Vigna [19] proposed a component-based architecture, WebSTAT, to compose the domain-independent STAT runtime with a number of language extensions, event providers, attack scenarios, and response functions. However, their concern is to integrate knowledge from different domains. Blumenthal et al. [20] developed
[Figure: processing steps and the data passed between them: Data capture (traffic data), Data preprocess (learning data), Induce detection rules (detection rules), Intrusion prevention.]
1. The first step of intrusion detection and prevention is to normalize the network flow and capture the network traffic data, such as the packet data;
Among the five components in this design architecture, three components (Traffic data collection, Intrusion pattern recognition, and Intrusion prevention) can be selected from COTS products, while the Data and Pattern format adaptor components should be designed and developed accordingly to solve the problems of incompatibility between the COTS components selected.
exist to perform this task, such as Snort [7], Tcpdump, Dsniff [27]. In our design,
Snort has been selected.
Snort is an open source network intrusion detection system that is capable of performing real-time traffic analysis and packet logging on IP networks, including protocol analysis and content searching/matching. Snort can be used to detect a variety of attacks and probes, such as buffer overflows, stealth port scans, CGI attacks, SMB probes, OS fingerprinting attempts, and much more. Snort uses a flexible language to describe traffic that it should collect or pass, as well as a detection engine that utilizes a modular plugin architecture. Snort also has a modular real-time alerting capability, incorporating alerting and logging plugins for syslog, ASCII text files, UNIX sockets, databases (MySQL/PostgreSQL/Oracle/ODBC), or XML.
Snort has many functions, but we currently use it only for logging packet information. Snort uses rules to match packet information and payload in order to classify packets. Snort Version 2.4.3 (Build 26), developed by Sourcefire, Inc. [7], is selected, and its Sniffer function is used. We use MySQL [28] as the packet data storage, although Snort can also be set up to use MS SQL Server, PostgreSQL, Oracle, or a file format.
One can see that the requires interface of Snort is the network packets, while its provides interface is a MySQL database. The Snort database contains four tables that record information about network packets using the tcp, udp, icmp, and ip protocols, and two other important tables, acid_event and opt. The acid_event table consolidates all the logs of alerts that Snort captures and orders them by the cid field, while opt holds the optional data that can be part of the TCP/IP protocol. The Snort rules are set to collect all packets.
The requires interface of See5/C5.0 consists of two files: the .data file (the training data set) and the .names file (the describer file). The .names file starts with the target attribute, followed by an arbitrary list of attributes that are defined either implicitly or explicitly. The .data file is composed of data fields separated by commas, with ? and N/A as filler fields representing missing data and not applicable data, respectively.
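The two input files can be sketched as follows. This is a minimal illustration, assuming that packet records arrive as Python dictionaries; the helper names (names_file_text, data_line) and the NOT_APPLICABLE sentinel are hypothetical and are not the adaptor implemented in this paper.

```python
from typing import Dict, List, Tuple

# Hypothetical sentinel marking an attribute that does not apply to a packet,
# e.g. a UDP port on a TCP packet.
NOT_APPLICABLE = object()


def names_file_text(target: str, attributes: List[Tuple[str, str]]) -> str:
    """Build .names content: the target attribute first, then one definition per line."""
    lines = [f"{target}."]
    lines += [f"{name}: {spec}." for name, spec in attributes]
    return "\n".join(lines) + "\n"


def data_line(record: Dict[str, object], attribute_order: List[str]) -> str:
    """Build one comma-separated .data line in the attribute order of the .names file."""
    fields = []
    for name in attribute_order:
        value = record.get(name)
        if value is NOT_APPLICABLE:
            fields.append("N/A")   # attribute is not applicable to this case
        elif value is None:
            fields.append("?")     # value is missing
        else:
            fields.append(str(value))
    return ",".join(fields)
```

For example, a UDP packet with no recorded TCP flags would produce a line in which the TCP field is written as N/A, while a value that was simply not captured is written as ?.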
The provides interface of See5/C5.0 is a decision tree. The following is an example of such a decision tree:
ip_len <= 377: 2 (269728/120)
ip_len > 377:
:...udp_dport <= 1187: 2 (908)
    udp_dport > 1187: 1 (8547)
See5/C5.0 can also be set to output a set of decision rules. The following are examples of these decision rules:
Rule 1: (8547, lift 32.2)
class 1 [1.000]
Each rule specifies what to do with a packet that matches. This is called a target, which may be a jump to a user-defined chain in the same table.
The requires interface of IPTables is a set of firewall rules. These rules can be set from the console or a system file, e.g. /etc/sysconfig in CentOS. These firewall rules have the following format:
-A INPUT -m <matching options> <matching options option>
<matching option parameter> <target -j REJECT|ACCEPT|...>
For example:
-A INPUT -p tcp --src-port 80 -m length --length 100 -j REJECT
There are many matching options available for IPTables.
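To make the rule format concrete, the following sketch turns the threshold conditions of one See5/C5.0 decision rule into a single rule string. It is an illustration under stated assumptions, not the pattern adaptor of this paper: the attribute-to-match table, the function name to_firewall_rule, and the choice of REJECT as the default target are all hypothetical. Only the length match (-m length --length) and the port options --sport and --dport are used.

```python
from typing import Dict, List, Optional, Tuple

Condition = Tuple[str, str, int]  # (attribute, "<=" or ">", threshold)

# Hypothetical table: attribute -> (protocol or None, option template, largest value)
MATCH_TEMPLATES: Dict[str, Tuple[Optional[str], str, int]] = {
    "ip_len": (None, "-m length --length {lo}:{hi}", 65535),
    "tcp_sport": ("tcp", "--sport {lo}:{hi}", 65535),
    "tcp_dport": ("tcp", "--dport {lo}:{hi}", 65535),
    "udp_sport": ("udp", "--sport {lo}:{hi}", 65535),
    "udp_dport": ("udp", "--dport {lo}:{hi}", 65535),
}


def to_firewall_rule(conditions: List[Condition], target: str = "REJECT",
                     chain: str = "INPUT") -> str:
    """Collapse threshold conditions into ranges and emit one rule string."""
    ranges: Dict[str, List[int]] = {}
    for attribute, operator, value in conditions:
        if attribute not in MATCH_TEMPLATES:
            raise ValueError(f"no match option known for {attribute}")
        bounds = ranges.setdefault(attribute, [0, MATCH_TEMPLATES[attribute][2]])
        if operator == "<=":
            bounds[1] = min(bounds[1], value)
        elif operator == ">":
            bounds[0] = max(bounds[0], value + 1)
        else:
            raise ValueError(f"unsupported operator {operator}")

    protocols = {MATCH_TEMPLATES[a][0] for a in ranges} - {None}
    if len(protocols) > 1:
        raise ValueError("a single rule cannot match two protocols")

    parts = [f"-A {chain}"]
    if protocols:
        parts.append(f"-p {protocols.pop()}")
    # Protocol-specific options directly follow -p; generic matches come last.
    for attribute in sorted(ranges, key=lambda a: (MATCH_TEMPLATES[a][0] is None, a)):
        lo, hi = ranges[attribute]
        parts.append(MATCH_TEMPLATES[attribute][1].format(lo=lo, hi=hi))
    parts.append(f"-j {target}")
    return " ".join(parts)
```

With the conditions ip_len <= 80 and udp_dport > 1900, the sketch yields "-A INPUT -p udp --dport 1901:65535 -m length --length 0:80 -j REJECT". Which class of a decision rule should map to which target is a design decision that this sketch leaves to the caller.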
IPTables does not have an explicit provides interface, but provides a packet-filtering service. A firewall rule specifies criteria for a packet and a target. If the packet does not match, the next rule in the chain is examined; if it does match, then the next rule is specified by the value of the target, which can be the name of a user-defined chain or one of the special values ACCEPT, DROP, QUEUE, or RETURN. ACCEPT means to let the packet through. DROP means to drop the packet on the floor. QUEUE means to pass the packet to user space (if supported by the kernel). RETURN means stop traversing this chain and resume at the next rule in the previous (calling) chain. If the end of a built-in chain is reached or a rule in a built-in chain with target RETURN is matched, the target specified by the chain policy determines the fate of the packet.
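The traversal just described can be modelled in a few lines. This is a simplified simulation for illustration only, assuming that each rule is a (predicate, target) pair and that chains are held in a dictionary; the names traverse and filter_packet are hypothetical, and the sketch does not call IPTables itself.

```python
from typing import Callable, Dict, List, Optional, Tuple

Rule = Tuple[Callable[[dict], bool], str]  # (match predicate, target)
TERMINAL = {"ACCEPT", "DROP", "QUEUE"}


def traverse(chains: Dict[str, List[Rule]], name: str, packet: dict) -> Optional[str]:
    """Return a terminal verdict, or None when control goes back to the caller."""
    for matches, target in chains[name]:
        if not matches(packet):
            continue                  # no match: examine the next rule
        if target in TERMINAL:
            return target             # the fate of the packet is decided
        if target == "RETURN":
            return None               # stop traversing this chain
        verdict = traverse(chains, target, packet)  # jump to a user-defined chain
        if verdict is not None:
            return verdict
        # The user-defined chain ended or returned: resume at the next rule here.
    return None


def filter_packet(chains: Dict[str, List[Rule]], builtin: str,
                  policy: str, packet: dict) -> str:
    """Apply the chain policy when the built-in chain ends or returns."""
    verdict = traverse(chains, builtin, packet)
    return policy if verdict is None else verdict
```

In this model a packet that matches no rule in a user-defined chain falls back to the calling chain, and only the built-in chain consults its policy.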
Figure 3 shows the implementation architecture of our component-based intrusion detection and prevention system with requires and provides interfaces of
each component.
[Figure 3: Implementation architecture. Components: SNORT (fed from the Internet), Data adaptor, See5/C5.0, Pattern adaptor, IPTables Firewall. Data exchanged: packet data, converted data, detection decision tree, prevention firewall rules.]
6- THE ADAPTOR COMPONENTS DESIGN
mysql,
user=snort
password=password
host=localhost
}
This rule type creates an alert called nonalert to capture incoming traffic into a
database called snorttest.
The following rules use the rule type nonalert defined above to log normal traffic to the database:
pass tcp 0.0.0.0 any <> $HOME_NET any
pass udp 0.0.0.0 any <> $HOME_NET any
pass icmp 0.0.0.0 any <> $HOME_NET any
pass ip 0.0.0.0 any <> $HOME_NET any
nonalert tcp $EXTERNAL_NET any <> $HOME_NET any (msg:"Traffic CSRL 1.0 tcp"; classtype:not-suspicious; sid: 1000001;)
APPENDIX A
A-1 ATTRIBUTES USED IN See5
Decision attribute:
sig_class_id: integers 1 to 34 | description of alert
Condition attributes:
sig_priority: continuous
ip_ver: continuous | version of IP, commonly 4
ip_hlen: continuous
ip_tos: continuous
ip_len: continuous
ip_flags: continuous
bits.
last part;added1,2
ip_off: N/A,0
ip_proto: continuous | Protocol <=list
opt_proto: continuous | what is in data
opt_code: continuous
opt_len: continuous
tcp_off: 0,5 to 12
tcp_flags: N/A,0,2,4,16,17,18,20,24,25,194,196.
tcp_urp: N/A,0,1,512,7424,64630,23410,45075
icmp_type: continuous.
icmp_code: N/A,0,1,2,3,4,10,13
icmp_seq: continuous
Decision tree:
ip_flags > 0: 1 (1157)
ip_flags <= 0:
:...ip_len > 357:
    :...ip_src <= 3.520088e+009:
    :   :...tcp_sport <= 1104: 2 (36356/14)
    :   :   tcp_sport > 1104:
    :   :   :...tcp_dport <= 285:
    :   :       :...ip_len <= 1272: 2 (370/4)
    :   :       :   ip_len > 1272:
    :   :       :   :...ip_src <= 3.284041e+009: 1 (37/2)
    :   :       :       ip_src > 3.284041e+009: 2 (2)
    :   :       tcp_dport > 285:
    :   :       :...ip_len > 479: 2 (54/2)
    :   :           ip_len <= 479:
    :   :           :...ip_len > 374: 1 (424)
    :   :               ip_len <= 374:
    :   :               :...ip_len <= 370: 2 (8)
    :   :                   ip_len > 370: 1 (98)
    :   ip_src > 3.520088e+009:
    :   :...ip_ttl <= 107:
    :       :...tcp_sport <= 46209: 2 (22/2)
    :       :   tcp_sport > 46209: 1 (3)
    :       ip_ttl > 107:
    :       :...ip_ttl <= 178: 1 (496)
    :           ip_ttl > 178:
    :           :...ip_len > 425:
    :               :...tcp_dport <= 285: 2 (10)
    :               :   tcp_dport > 285: 1 (7)
    :               ip_len <= 425:
    :               :...tcp_sport > 1915: 1 (196)
    :                   tcp_sport <= 1915:
    :                   :...ip_len <= 400: 2 (14)
    :                       ip_len > 400: 1 (25)
    ip_len <= 357:
    :...ip_len <= 78: 2 (153436/6)
        ip_len > 78:
        :...ip_len <= 81:
            :...tcp_dport <= 5190:
            :   :...ip_ttl <= 127: 2 (1584)
            :   :   ip_ttl > 127:
            :   :   :...ip_dst > 1.438688e+009: 2 (923/43)
            :   :       ip_dst <= 1.438688e+009:
            :   :       :...ip_dst <= 1.156555e+009: 1 (13)
            :   :           ip_dst > 1.156555e+009: 2 (181/7)
    udp_dport > 1900
    -> class 0 [0.999]
Rule 4: (2435/5, lift 4.2)
    ip_dst <= 3.381627e+009
    udp_len <= 47
    -> class 0 [0.998]
Rule 5: (210, lift 4.2)
    ip_len <= 80
    udp_dport > 1900
    -> class 0 [0.995]
Rule 6: (265, lift 1.3)
    ip_src <= 2.161575e+008
    udp_dport <= 138
    -> class 1 [0.996]
Rule 7: (196, lift 1.3)
    ip_id > 64056
    ip_id <= 64242
    udp_dport > 138
    -> class 1 [0.995]
Rule 8: (78671/2358, lift 1.3)
    udp_dport > 138
    -> class 1 [0.970]
Default class: 1
Evaluation on training data (100274 cases):
        Rules
  ----------------
    No      Errors
     8    8( 0.0%)   <<
Attribute usage:
    100% udp_dport
     22% ip_src
     20% ip_len
     20% udp_sport
      2% ip_dst
      2% udp_len
APPENDIX B
The firewall rules transformed from the decision rules generated by See5 are listed below:
# Firewall configuration written by system-config-securitylevel
# Manual customization of this file is not recommended.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:RH-Firewall-1-INPUT - [0:0]
:CSRL-RULES - [0:0]
:CSRL-CH01 - [0:0]
:CSRL-CH02 - [0:0]
:CSRL-CH03 - [0:0]
:CSRL-CH04 - [0:0]
:CSRL-CH05 - [0:0]
:CSRL-CH06 - [0:0]
:CSRL-CH07 - [0:0]
:CSRL-CH08 - [0:0]
:CSRL-CH09 - [0:0]
:CSRL-CH10 - [0:0]
:CSRL-CH11 - [0:0]
-A INPUT -j RH-Firewall-1-INPUT
-A FORWARD -j RH-Firewall-1-INPUT
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -j CSRL-RULES
ACKNOWLEDGMENT
This paper is based on work supported by the National Science Foundation
(NSF) through grant CNS-0540592 and NGA. Any opinions, findings, and conclusions or recommendations expressed in the paper are those of the authors
and do not necessarily reflect the views of the NSF.
REFERENCES
[1] R. G. Bace, Intrusion Detection, Sans Publishing, 1999.
[2] G. Pour, Component-Based Software Development Approach: New
Opportunities and Challenges, Proceedings Technology of Object-Oriented
Languages, 1998. TOOLS 26., pp. 375-383.
[3] H. Debar and A. Wespi, Aggregation and Correlation of Intrusion-Detection
Alerts. Proc. of RAID, 2001.
[4] P. Ning, Intrusion Detection in Distributed Systems: An Abstraction-Based
Approach, Springer, 2003.
[5] D. Wagner and D. Dean, Intrusion Detection via Static Analysis,
Proceedings of the 2001 IEEE Symposium on Security and Privacy, 2002.
[6] C. Y. Chung, M. Gertz, and K. Levitt, Discovery of Multi-Level Security
Policies, The Fourteenth Annual IFIP WG 11.3 Working Conference on
Database Security, 2000
[7] M. Roesch, The Story of Snort: Past, Present and Future. 2005, http://www.net-security.org/article.php?id=860.
[8] L. L. Peterson and S. B. Davie, Computer Networks: A Systems Approach,
Morgan Kaufmann Publishers, San Francisco, CA, 2003.
[9] J. Han, K. Kowalski, M. Beheshti, Detecting Network Intrusions Based on
a Generalized Rough Set Model, Proc. of International Symposium on
Telecommunications, 247-252, September 10-12, 2005, Shiraz, Iran.
[10] J. Gomez and D. Dasgupta, Evolving Fuzzy Classifiers for Intrusion
Detection, Proc. of IEEE Workshop on Information Assurance, New York,
June, 2002.
[11] L. Kuang and M. Zulkernine, An Anomaly Intrusion Detection Method using
the CSI-KNN Algorithm, pp. 921-926, The 23rd Annual ACM Symposium on
Applied Computing, Fortaleza, Brazil, 2008.