WSF: An HTTP-level Firewall For Hardening Web Servers: Threat Model
validation applies, an attacker may input "tom, 99), (mary" in the $username field; the user creation command is then generated as:

    INSERT INTO USER(name, id) VALUES(tom, 99), (mary, 100)

Because many database systems, such as MySQL, allow users to insert multiple records in a single statement, this SQL command lets the attacker insert two records instead of the expected one. The cause of this SQL injection attack is a security bug: the user input validation is insufficient.

Level of Protection

WSF helps to protect against a wide range of common vulnerabilities with the following three mechanisms:

1. To prevent unauthorized access to web files, WSF provides a language for specifying a fine-grained access control policy and enforcing it at the perimeter of a web server. With this language, web administrators can classify web clients into a variety of roles and specify their access permissions to web objects at granularities ranging from directories to files. In addition, rather than allowing all files in the /cgi-bin directory to be executed by web clients, WSF allows a web application to be invoked only if it is explicitly specified as executable to web clients, which effectively prevents the bypass execution attack.
2. To thwart abuse of web applications, WSF proposes an input validity specification language that lets developers specify the valid input patterns instead of enumerating all possible malicious inputs, which substantially simplifies the input validation task.
3. WSF also collects user behavior statistics on a per-user/per-IP basis. The behavior statistics can be used to detect abnormal web activities and heuristically change the access policy to proactively delay or block the requests from malicious users.

The rest of the paper is organized as follows. In Section 2, we describe related work. In Section 3, we illustrate the architecture and design of WSF. In Section 4, the implementation details are presented. In Section 5, we evaluate the WSF system. Finally, Section 6 presents our conclusions.

2. Related work

Most web protection mechanisms fall into two primary categories: intrusion detection/prevention systems and vulnerability assessment systems.

Intrusion Detection/Prevention Systems

Most intrusion detection/prevention systems deployed to protect a website work at the network level or the application level. Network-based intrusion detection systems such as Snort [3] can analyze network traffic to detect web intrusions. However, network-based intrusion detection is vulnerable to insertion and evasion attacks [4]. In addition, the network IDS needs to model how the application interprets the operations, but this is almost an impossible task without receiving feedback from the application.

To address these problems of network-based IDSs, several application-level IDSs have been proposed.

Mod_security [5] filters HTTP requests that match specified attack signatures. However, it does not provide fine-grained access control, and is less effective in preventing unauthorized accesses such as bypass execution. In addition, it is hard to keep attack signatures up to date and to enumerate all possible malicious request patterns.

David Scott and Richard Sharp proposed the Security Gateway [6] to support CGI input validation based on application-level security policies, which is similar to WSF's input validity specification. The difference between WSF and Security Gateway is that WSF supports fine-grained access control and collects user behavior statistics that can be used to detect abnormal web behaviors and adjust the access policy heuristically.

WebSTAT [7, 8] detects intrusions against a web server by analyzing its logs. Like WSF, it also uses behavior statistics to infer abnormal activities. However, while WebSTAT allows an administrator to associate actions with the intermediate steps of an attack, it is hard to stop one malicious connection without interrupting other valid connections, because WebSTAT is independent of the web server. For the same reason, WebSTAT does not prevent unauthorized access to web files. In contrast, WSF works as a module of the Apache web server: it sits in line and stops malicious requests on site.

Vulnerability Assessment Systems

Various vulnerability scanners, such as ISS Internet Scanner [9], Saint [10], LibWhisker [11], Nikto [12, 13] and Nessus [14], help assess a web system for loopholes before attackers find them. They primarily rely on attack-signature-based checking, so they often raise false alarms or fail to detect critical vulnerabilities [15].

3. Design of WSF

3.1 System Overview

Figure 1. The architecture of WSF

As shown in Figure 1, WSF consists of the input and output filters. The input filter deeply inspects the incoming HTTP requests to reject invalid web accesses. The output filter collects
the status of outgoing responses. Response status information helps infer user behavior patterns.

WSF maintains a per-user security context. A security context in WSF is indexed either by the user's IP address or by a user ID (if the user has authenticated to the web service). We defer the description of how to extract a user ID from web traffic to Section 4.2. The security context contains the user's past behavior statistics, such as the number of invalid requests, the number of failed requests, and the number of requests during a specified time interval. All those behavior statistics are updated by the input and output filters.

The input filter deploys three engines: the security context checking engine, the access right checking engine, and the CGI input validation engine. These engines check the incoming requests one by one. An incoming request is forwarded to the protected web server only if it passes the checks of all three engines.

The security-context checking engine examines the user ID and the IP address of the request to see if requests from the IP address or the user ID should be blocked or delayed. Administrators can use the security-context checking engine to temporarily block a user's access to the web server if the user's statistical behavior, recorded in the security context, violates specified limits (e.g., too many failed authentication requests within a short interval). Therefore, the security context essentially works as a credit history report that helps WSF monitor a client's abnormal behavior pattern and adjust its access policy accordingly.

The access right checking engine checks the requested URI against the access right policy. With the access right control, WSF can limit authenticated or unauthenticated users to only specified web files/services and prevent unauthorized access to sensitive files that are left accidentally in public web directories. The access right checking engine provides finer-grained control than the standard access control imposed by web servers. Section 3.2 gives more details about the access right checking engine.

Finally, if the request is intended to invoke a CGI program, the request is checked by the CGI input validation engine. The CGI input validation engine checks the parameters carried in the CGI request against the input validity specifications. Only requests with valid inputs can be sent to the web server. The CGI input validation helps mitigate many buffer overflow attacks and SQL injection attacks that compromise web systems by sending malicious parameters to CGI programs. More details are presented in Section 3.3.

The output filter checks the status of outgoing replies and updates the behavior statistics in the security context. In addition, the output filter helps the input filter to track the user information and generates the user tracking tag for each source.

3.2 Access Control Policy

WSF defines an access control policy language that allows administrators to explicitly define the access rights to web entries, including normal data files and CGI programs. An access rule is a mapping as follows:

    Web_Entry Web_User : Access_Right

The web entry defines the object to which the access rule applies. It can be a specific file, a class of files with a wildcard pathname, or a directory. The web user defines the subject that is allowed to access the web entry. It can be a specific user or a web group. The access right defines the authorization under which a web user can access a web entry. The access right mapping means that the web_entry can be accessed only by the web_user, and only under the access_right authorization.

An access policy usually includes three parts:

1. Definition of valid user set and user groups
2. Definition of default accessible file types
3. Definition of access right rules of web entries

The first part defines the valid user set and user groups.

The second part contains the default accessible file types (e.g., *.html and *.jpg files) for the web system. The accessible file types can be defined by file type extensions or certain file name patterns. By default, only common web file types are included, which helps prevent unauthorized accesses to sensitive files, such as creditcard.dat, that are left in the public web directory.

The third part specifies the access rights of users to web entries. An access right policy may include multiple access rules. Each rule defines the access right of one URI entry. A URI entry can be defined as a specific file, a class of files with a wildcard pathname, or a directory. Wildcards are allowed only in the file name, to represent multiple files with a similar name pattern. If an access rule is defined for a directory, the rule applies to all files and sub-directories under this directory that are not associated with access rules of their own. In other words, if no access rule is defined for a directory or a file, permissions are inherited from the parent directory. The access right rules are prioritized as follows:

    root directory < sub-directory (level 1) < sub-directory (level 2) ... < a class of files < single file

The access rule of the root directory has the lowest priority and access rules of single files have the highest priority. Rules with higher priority take precedence in policy enforcement.

CGI programs are treated differently. Each accessible CGI program must be explicitly specified to be executable. No wildcard is allowed in the access right rules for CGI programs. By default, only the CGI programs that are explicitly configured as executable can be requested to run by web clients. Thus, if a helper program, say "user_management.pl", is supposed to be invoked only by other trusted CGI programs, it will not be put in the access
right policy. Any attempt to directly invoke such a helper program via a URI will then be blocked by WSF.

3.3 CGI Input Validity Specification

Because the inputs to CGI programs are complex, fixed attack signatures are often not flexible enough to tell a valid input from invalid ones.

To deal with this problem, WSF provides a fine-grained way to specify constraints on the inputs of CGI programs. We use an example to describe how a validity specification works. Suppose we have a user login script /cgi-bin/login.cgi that only accepts parameters transferred with the POST method; the expected input in the user name field is a string of 3-8 letters or digits, and the expected valid password is a string of 6-15 letters or digits. No special character is allowed in the username and password parameters. The validity specification can be defined as follows:

    <Rule>
      <URI> /cgi-bin/login.cgi </URI>
      <Method> POST </Method>
      <Parameter>
        <Name> username </Name>
        <Value> ^[a-zA-Z0-9]{3,8}$ </Value>
      </Parameter>
      <Parameter>
        <Name> password </Name>
        <Value> ^[a-zA-Z0-9]{6,15}$ </Value>
      </Parameter>
      <SIG_CHECKING> NO </SIG_CHECKING>
    </Rule>

The URI section contains the URI of the CGI program.

The Method section configures which methods are allowed for this URI. The methods most often used are GET and POST. Other HTTP methods such as PUT and TRACK must be used carefully, as they may introduce vulnerabilities such as cross-site scripting attacks [16].

The Parameter section defines the validity specifications for the parameters of this CGI program. Each possible parameter must have a Parameter definition. The validity specification of each parameter consists of two parts: the parameter name and the parameter value. The parameter name field is the name of the parameter to be checked, while the parameter value field gives the valid parameter value pattern. The valid parameter value pattern is defined with a regular expression. If there is no restriction on a parameter, the valid parameter value pattern can be empty. Based on the configured validity pattern, the input validation checking engine can then check whether the user inputs carried in a CGI request are valid. Note that only parameters listed in this section are regarded as valid and checked against the corresponding validity specification. For parameters whose names are not on the valid parameter list, the input validation engine will
directly regard them as malicious. This mechanism effectively prevents many buffer overflow attacks such as the Code Red I and II attacks [17].

To reduce the risk of misconfiguration, the validity specifications can be tested with known attack signatures to see whether known attacks can slip through the protection of the validity specifications. Currently, WSF uses signatures extracted from the Snort attack signature database [3] to check the validity specification.

As the above example shows, the rule clearly defines what inputs the developers expect. The CGI program, at a minimum, must take care of inputs that satisfy the above specification. Any other, unexpected inputs are blocked by this specification directly at the firewall. This mechanism does not require developers to enumerate all possible invalid input patterns. Instead, web application developers only need to express their intention of valid inputs with regular expressions, which substantially simplifies the input validation procedure.

3.4 User Behavior Auditing

Figure 2. WSF Security Context

As a complementary mechanism, WSF also supports tracking and auditing of web user behaviors. WSF maintains a security context for each web client. The security context is indexed with the client's user ID if the client is an authenticated user. If the client is an anonymous guest, the security context is indexed with the client's IP address. As Figure 2 shows, the WSF security context contains three parts of user security information:

1. Index of the security context (User ID or IP address);
2. Behavior statistics;
3. Access control decision based on the behavior pattern.

WSF uses the index of the security context, the IP address for an unauthenticated user and the User ID for an authenticated user, to locate a user's security context.

The behavior statistics part contains cumulative user behavior patterns, measured over multiple configurable time intervals on a per-user/per-IP basis:

- The number of received requests. This data is collected by the input filter.
- The number of bytes sent out. This data is collected by the output filter.
- The number of invalid requests. This data is collected by the checking engines in the input filter. Any request that violates WSF security policies is counted as an invalid request.
- The number of failed requests. This data is collected by the output filter. Any request with an HTTP status code
that does not fall in the range from 200 to 307 is counted as a failed request.
- The number of failed authentication requests. This field helps to prevent brute-force password guessing attacks. It is collected by the output filter.

The user behavior statistics help to detect abnormal behavior patterns and proactively adjust access control policies. For example, excessive authentication failures of a specific user may indicate that a hostile party is mounting a brute-force password guessing attack or that this user has forgotten the password. To thwart password guessing attacks, web administrators can configure WSF to suspend this user's further authentication requests for several seconds once the number of failed authentications exceeds the specified threshold.

4. Implementation Details

4.1 Modularized WSF

The Apache modularized architecture processes web traffic using the same idea as Unix command line filters: ps -ax | grep "apache.*httpd" | wc -l. The basic idea is to treat the information processing flow as an information stream. Apache modules can be inserted into the stream and organized as a module chain. Each module receives the data from the upstream module, processes the data, and then forwards the processed data to the next module in the chain. In this way, data in the stream can be manipulated independently of how it is generated.

Following the same idea, WSF is implemented as an Apache module that intercepts each incoming request, checks it, and decides whether to let the request go to the next module. One advantage of deploying WSF as an Apache module is that the existing Apache code can be leveraged to reduce the implementation complexity. Another benefit is that WSF sits behind the SSL module and can monitor the decoded web traffic.

4.2 User Behavior Tracking

To collect a user's behavior statistics, WSF first needs to identify a web client. If the client is anonymous, WSF only needs to identify it by the client's source IP. If a client is an authenticated web user, WSF has to identify the user's ID to enforce the corresponding access policy.

To track the user identity, WSF requires the web administrator to fill out a login template that tells WSF the user ID field and the successful authentication flag (e.g., a session cookie). With the login template, WSF's input and output filters cooperate with each other to track the user information. The input filter identifies the user authentication requests and extracts user information from the requests. With the extracted user information, the input filter generates a login memo to mark this request as an authentication request and save the extracted user information. The WSF output filter keeps checking whether an outgoing message carries the login memo. If it does, the output filter then searches for the successful authentication flags that are defined in the login template. If no success flag is found, the output filter regards
the login request as failed. It simply forwards the outgoing message to the client and updates the security context corresponding to the client's IP address. If the success flag is found in the response message, WSF infers that this is a successful authentication. The user associated with this authentication request becomes an authenticated user. WSF then generates a unique WSF cookie as the user identification tag. The WSF cookie is carried with this user's further requests and used by the WSF system to track this user's activities. If no valid WSF cookie is found in an incoming HTTP request, WSF always regards the request sender as an anonymous user.

5. System Evaluation

5.1 Security Evaluation

To evaluate the effectiveness of the WSF system, we copied all files on our department website and deployed a parallel website as the testbed. Multiple attacks, including Bypass Execution, Random File Access, and SQL Injection, were mounted against the testing website. The simulation results showed that WSF can effectively mitigate various web attacks.

5.2 Performance Evaluation

Figure 3. Throughput comparisons (throughput in bytes/s versus request file size in KBytes, for WSF with small cache, WSF with large cache, and Apache with no firewall)

To evaluate the performance of WSF, we set up the simulation environment as follows: the web server is a Pentium IV PC with a 1.8GHz CPU and 256MB memory, with Linux 2.5.75 and Apache 2.0.40 installed. Three Pentium III PCs with 850MHz CPUs and 256MB memory work as web clients. Because standard web benchmark tools such as WebStone do not support testing of authenticated web sessions that carry WSF cookies, we developed a benchmark tool that is similar to WebStone but supports authenticated web sessions. In the benchmark experiments, each of the three client hosts runs 8 threads that send HTTP requests as fast as they can. Each thread sends 2000 HTTP requests sequentially: a request is not sent out until the reply to the previous request is received. In the simulation, we deployed access rules for 3394 web files and validity specifications for 150 CGI programs. The number of CGI validity specification rules has little effect on performance, because
the rules are indexed with CGI program pathnames and each CGI program is governed by one rule.

Figure 3 shows the throughput comparison of a web server with and without WSF support. We can see that when the requested file size is large, the Apache server with WSF achieves performance comparable to an Apache server without WSF. However, when the requested file size is small, the performance penalty is easy to see. The reason is that WSF is primarily CPU-bound. Most of its time is spent performing regular expression matching against client requests and updating behavior statistic records. When the file size is large, the file transmission time is dominant and the WSF cost is relatively small. If the file size is small, the CPU time used by WSF becomes non-negligible and thus reduces the Apache server performance. However, as our prototype is completely un-optimized, we believe there is considerable room to improve system performance. For example, Figure 3 also shows that by increasing the cache size used to hold security contexts, WSF can achieve higher throughput. This indicates that the size of the memory allocated for caching security contexts can affect the system performance significantly. Upon receiving requests from a new client, the security context checking engine needs to load the client's security context from the database into the cache. If the cache is full, some clients' security contexts have to be written back to the database. Those database I/O operations increase the system overhead. The larger the cache is, the higher the cache hit rate is, and the fewer database accesses are required. Therefore, a large cache helps to improve the performance of WSF.

6. Conclusion

WSF proposes a policy-based framework to provide perimeter security for web services. With proper policies, WSF can help to thwart unauthorized accesses to sensitive system files and achieve flexible, role-based access control. To prevent attackers from sending maliciously manipulated requests to CGI programs, WSF allows administrators to explicitly define the input validity specification for each accessible CGI program. Instead of inferring all possible attacks from known attack signatures, WSF checks incoming requests against the input validity specification, which simplifies the procedure for determining whether a user input is valid. In addition, WSF collects user behavior statistics, which help web administrators to detect abnormal user behaviors and proactively adjust the access control policies.

References:
1. BBC News, Web attacks on the rise, 2002. http://news.bbc.co.uk/1/hi/sci/tech/1930832.stm
2. Anley, C., Advanced SQL Injection In SQL Server Applications, 2002. http://www.nextgenss.com/papers/advanced_sql_injection.pdf
3. Roesch, M.S. Lightweight Intrusion Detection for Networks. In Proc. of the USENIX LISA '99 Conference. November 1999.
4. Ptacek, T.H. and T.N. Newsham, Insertion, Evasion and Denial of Service: Eluding Network Intrusion Detection. January 1998, Secure Networks.
5. Ristic, I., Introducing mod_security, 2003. http://www.onlamp.com/pub/a/apache/2003/11/26/mod_security.html
6. Scott, D. and R. Sharp. Abstracting Application-Level Web Security. In Proceedings of the Eleventh International Conference on World Wide Web (WWW'2002). 2002.
7. Vigna, G., et al. A Stateful Intrusion Detection System for World-Wide Web Servers. In Proceedings of the 19th Annual Computer Security Applications Conference. 2003.
8. Kruegel, C. and G. Vigna, Anomaly detection of web-based attacks. In Proceedings of the 10th ACM Conference on Computer and Communications Security. 2003. ACM Press: Washington D.C., USA. p. 251-261.
9. ISS, ISS Internet Scanner, 2004. http://www.iss.net/products_services/enterprise_protection/vulnerability_assessment/scanner_internet.php
10. SAINT Corp., SAINT vulnerability scanner. http://www.saintcorporation.com/products/saint_engine.html
11. rfp.labs, libwhisker. http://www.wiretrip.net/rfp/index.asp
12. Nikto, Nikto 1.32. http://www.cirt.net/code/nikto.shtml
13. Symantec Corp., Symantec NetRecon. http://enterprisesecurity.symantec.com/products/products.cfm?ProductID=46
14. Nessus, NESSUS Scanner, 2004. http://www.nessus.org/
15. Forristal, J. and G. Shipley, Vulnerability Assessment Scanners, 2001. http://www.nwc.com/1201/1201f1b1.html
16. CERT Center, Microsoft Internet Information Server (IIS) vulnerable to cross-site scripting via HTTP TRACK method, 2004. http://www.kb.cert.org/vuls/id/288308
17. CERT Advisory, "Code Red" Worm Exploiting Buffer Overflow In IIS Indexing Service DLL, 2001. http://www.cert.org/advisories/CA-2001-19.html
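
As an illustration of the input validity specification of Section 3.3, the login rule for /cgi-bin/login.cgi can be sketched in Python. This is only a sketch under stated assumptions, not the WSF implementation (which is an Apache module): the names ValidityRule, LOGIN_RULE and is_valid_request are hypothetical, and the rule is held in memory rather than parsed from the rule file.

```python
import re
from dataclasses import dataclass


@dataclass
class ValidityRule:
    """Hypothetical in-memory form of one <Rule> block from Section 3.3."""
    uri: str
    methods: frozenset   # HTTP methods allowed for this URI
    parameters: dict     # parameter name -> regex; "" means unrestricted


# The login example from Section 3.3, transcribed by hand.
LOGIN_RULE = ValidityRule(
    uri="/cgi-bin/login.cgi",
    methods=frozenset({"POST"}),
    parameters={
        "username": r"^[a-zA-Z0-9]{3,8}$",
        "password": r"^[a-zA-Z0-9]{6,15}$",
    },
)


def is_valid_request(rule, method, params):
    """Return True only if the method and every parameter satisfy the rule."""
    if method not in rule.methods:
        return False
    for name, value in params.items():
        # Parameters that are not listed in the rule are treated as malicious.
        if name not in rule.parameters:
            return False
        pattern = rule.parameters[name]
        # An empty pattern means there is no restriction on the value.
        if pattern and re.fullmatch(pattern, value) is None:
            return False
    return True
```

With this sketch, the injection input from the SQL example ("tom, 99), (mary") is rejected because it contains characters outside the username pattern, without any attack signature being listed. Whether a request that omits a listed parameter should be rejected is not stated in the text, so the sketch does not check for it.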
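
The rule priority of Section 3.2 (single file over a class of files over the nearest enclosing directory, with inheritance from the parent directory) can be sketched as a lookup function. This is a minimal sketch, not the WSF policy engine: the function name resolve_access, the dictionary representation, the trailing "/" used to mark directory entries, and the access-right labels are all hypothetical, and the sketch assumes that a wildcard rule covers only files in its own directory.

```python
import fnmatch
import posixpath


def resolve_access(rules, path):
    """Return the access right that governs path, or None if no rule applies.

    rules maps a web entry to an access-right label. Directory entries end
    with "/", wildcards may appear only in the file-name part of an entry.
    """
    # 1. A rule for the single file has the highest priority.
    if path in rules:
        return rules[path]

    directory, filename = posixpath.split(path)

    # 2. Next, a class of files (wildcard in the file name, same directory).
    for entry, right in rules.items():
        entry_dir, entry_name = posixpath.split(entry)
        if ("*" in entry_name and entry_dir == directory
                and fnmatch.fnmatchcase(filename, entry_name)):
            return right

    # 3. Otherwise inherit from the nearest directory that has a rule,
    #    walking from the deepest sub-directory up to the root.
    current = directory
    while True:
        key = current if current.endswith("/") else current + "/"
        if key in rules:
            return rules[key]
        if current in ("/", ""):
            return None
        current = posixpath.dirname(current)


# Hypothetical policy used only to exercise the lookup.
EXAMPLE_RULES = {
    "/": "public",
    "/docs/": "staff",
    "/docs/*.html": "public",
    "/docs/secret.html": "admin",
}
```

In this example, /docs/secret.html resolves to its own rule, other .html files in /docs resolve to the wildcard rule, and files in sub-directories of /docs inherit the /docs/ rule.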
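
The security context of Section 3.4 and the suspension of authentication requests after repeated failures can be sketched as follows. This is a minimal sketch under stated assumptions, not the WSF data structure: the class and function names, the threshold of 3 failures, and the 5-second suspension are hypothetical, time is passed in explicitly, and the per-interval bookkeeping described in the text is omitted.

```python
from dataclasses import dataclass


@dataclass
class SecurityContext:
    """Hypothetical per-client record: index, statistics, and a decision."""
    index: str                     # user ID, or IP address for anonymous clients
    received_requests: int = 0     # updated by the input filter
    bytes_sent: int = 0            # updated by the output filter
    invalid_requests: int = 0      # updated by the input-filter engines
    failed_requests: int = 0       # updated by the output filter
    failed_auth_requests: int = 0  # updated by the output filter
    suspended_until: float = 0.0   # access control decision (assumed form)


def record_response(ctx, status, bytes_sent, auth_succeeded=None, now=0.0,
                    max_failed_auth=3, suspend_seconds=5.0):
    """Output-filter side: update statistics from one outgoing response.

    auth_succeeded is None for ordinary requests, and True or False when the
    response answers an authentication request (success flag found or not).
    """
    ctx.bytes_sent += bytes_sent
    # A status code outside the range 200..307 counts as a failed request.
    if not 200 <= status <= 307:
        ctx.failed_requests += 1
    if auth_succeeded is False:
        ctx.failed_auth_requests += 1
        # Suspend further authentication once the threshold is exceeded.
        if ctx.failed_auth_requests > max_failed_auth:
            ctx.suspended_until = now + suspend_seconds


def is_suspended(ctx, now):
    """Input-filter side: should authentication requests be held back now?"""
    return now < ctx.suspended_until
```

A caller would look up the context by user ID or IP address, call record_response for each outgoing reply, and consult is_suspended before forwarding a new authentication request.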