fuzzing
Suman Jana
Fuzzing                   Dynamic symbolic execution
Lower coverage            Higher coverage
Lower false positives     Higher false positives
Higher false negatives    Lower false negatives
Blackbox fuzzing
Figure: Random input -> Test program -> ?
Figure: Seed input -> Mutated input -> Run test program
Example: fuzzing a PDF viewer
• Google for .pdf (about 1 billion results)
• Crawl pages to build a corpus
• Use a fuzzing tool (or script)
– Collect seed PDF files
– Mutate a seed file
– Feed the mutated file to the program
– Record whether it crashed (and the input that crashed it)
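The steps above can be sketched as a short script. This is a minimal illustration, not a production fuzzer: the names `mutate`, `crashed`, and `fuzz` are hypothetical, the viewer command is passed in by the caller, and crash detection assumes a POSIX system where a negative return code means the process was killed by a signal.

```python
import random
import subprocess
from pathlib import Path


def mutate(data, rng, max_writes=8):
    """Return a copy of data with a few randomly chosen bytes overwritten."""
    if not data:
        return data
    buf = bytearray(data)
    for _ in range(rng.randint(1, max_writes)):
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def crashed(cmd, path, timeout=5.0):
    """Run cmd on the file; treat death by signal (POSIX) as a crash."""
    try:
        proc = subprocess.run(
            list(cmd) + [str(path)],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False  # hangs are ignored in this sketch
    return proc.returncode < 0


def fuzz(cmd, seeds, out_dir, iterations, rng):
    """Mutate seed files, feed them to cmd, and keep only crashing inputs."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    crashes = []
    for i in range(iterations):
        seed = Path(rng.choice(seeds))
        candidate = out_dir / f"case_{i}.pdf"
        candidate.write_bytes(mutate(seed.read_bytes(), rng))
        if crashed(cmd, candidate):
            crashes.append(candidate)  # record the input that crashed it
        else:
            candidate.unlink()
    return crashes
```

A call might look like `fuzz(["some-pdf-viewer"], list_of_seed_paths, "crashes", 1000, random.Random(0))`, where `some-pdf-viewer` stands in for whatever program is under test.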
Mutation-based fuzzing
• Super easy to set up and automate
• Little or no file format knowledge is required
• Limited by initial corpus
• May fail for protocols with checksums, or those
that depend on challenge-response
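The checksum limitation can be seen on a toy format. The sketch below assumes a made-up message layout (a 4-byte big-endian CRC32 followed by the payload); `pack`, `parse`, and `flip_payload_bit` are illustrative names. Blind bit flips are rejected at the checksum test, so the parsing code behind it is never exercised unless the fuzzer knows enough about the format to recompute the checksum.

```python
import random
import struct
import zlib


def pack(payload):
    """Toy format (an assumption): CRC32 of the payload, then the payload."""
    return struct.pack(">I", zlib.crc32(payload)) + payload


def parse(message):
    """Reject the message unless the stored CRC32 matches the payload."""
    if len(message) < 4:
        raise ValueError("truncated")
    (stored,) = struct.unpack(">I", message[:4])
    payload = message[4:]
    if zlib.crc32(payload) != stored:
        raise ValueError("bad checksum")
    return payload


def flip_payload_bit(message, rng):
    """Flip one random bit after the 4-byte checksum field."""
    buf = bytearray(message)
    pos = rng.randrange(4, len(buf))
    buf[pos] ^= 1 << rng.randrange(8)
    return bytes(buf)


def accepted(message):
    try:
        parse(message)
        return True
    except ValueError:
        return False


rng = random.Random(0)
seed = pack(b"hello fuzzing")
naive = [flip_payload_bit(seed, rng) for _ in range(100)]
repaired = [pack(m[4:]) for m in naive]  # format-aware: recompute the CRC

naive_ok = sum(accepted(m) for m in naive)
repaired_ok = sum(accepted(m) for m in repaired)
print(naive_ok, repaired_ok)
```

Because CRC32 detects every single-bit error, none of the naive mutants pass the check, while all of the repaired ones do.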
Enhancement II:
Generation-Based Fuzzing
• Test cases are generated from some description of
the input format: RFC, documentation, etc.
– Using specified protocols/file format info
– E.g., SPIKE by Immunity
• Anomalies are added to each possible spot in the
inputs
• Knowledge of the protocol should give better results
than random fuzzing
Figure: Input spec (e.g., RFC) -> Generated inputs -> Run test program -> ?
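As a rough illustration of generating inputs from a format description, the sketch below uses a made-up three-field message spec and a small table of anomalies per field type. `Field`, `SPEC`, `ANOMALIES`, and `generate` are hypothetical names for this example and do not reflect how SPIKE or any other tool is implemented. Each generated message is valid except for one spot.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    kind: str        # "u16" or "bytes"
    default: object  # value used when the field is not being attacked


# Hypothetical input spec: version, length, body.
SPEC = [
    Field("version", "u16", 1),
    Field("length", "u16", 5),
    Field("body", "bytes", b"hello"),
]

# Anomalous values to try at each spot, chosen per field type.
ANOMALIES = {
    "u16": [0, 0xFFFF, 0x7FFF, 0x8000],
    "bytes": [b"", b"A" * 4096, b"%n%n%n%n", b"\x00" * 16],
}


def encode(field, value):
    """Serialize one field value according to its kind."""
    if field.kind == "u16":
        return struct.pack(">H", value)
    return bytes(value)


def generate(spec):
    """Yield (label, message) pairs: one valid message, then one anomaly per spot."""
    yield "valid", b"".join(encode(f, f.default) for f in spec)
    for i, target in enumerate(spec):
        for anomaly in ANOMALIES[target.kind]:
            parts = [
                encode(f, anomaly if j == i else f.default)
                for j, f in enumerate(spec)
            ]
            yield f"{target.name}:{anomaly!r}"[:40], b"".join(parts)


cases = list(generate(SPEC))
```

Since every other field keeps its default, the target's early validity checks are more likely to pass, and the anomaly reaches the code that handles that field.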
Figure (coverage-guided fuzzing loop): Seed inputs -> Input queue -> Next input -> Mutation -> Execute against instrumented target -> Branch/edge coverage increased? -> If so, add mutant to the queue
Periodically culls the queue without affecting total coverage
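The loop in the figure can be sketched in a few lines. This is a toy model under stated assumptions: `target` is a hand-instrumented function that returns the set of branch IDs it took (a real fuzzer obtains this from compiler or binary instrumentation), and `mutate`, `cull`, and `fuzz` are illustrative names, not any tool's API. Seeds are assumed to be non-empty.

```python
import random


def target(data):
    """Toy instrumented target: returns the set of branch IDs it executed."""
    cov = {"entry"}
    if len(data) > 0 and data[0] == ord("F"):
        cov.add("F")
        if len(data) > 1 and data[1] == ord("U"):
            cov.add("FU")
            if len(data) > 2 and data[2] == ord("Z"):
                cov.add("FUZ")
    return cov


def mutate(data, rng):
    """Insert, overwrite, or delete one random byte."""
    buf = bytearray(data)
    choice = rng.randrange(3)
    if choice == 0 or not buf:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif choice == 1:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def cull(queue):
    """Keep, for each covered branch, the shortest input that reaches it."""
    best = {}
    for data, cov in queue:
        for edge in cov:
            if edge not in best or len(data) < len(best[edge][0]):
                best[edge] = (data, cov)
    kept = []
    for entry in best.values():
        if entry not in kept:
            kept.append(entry)
    return kept


def fuzz(seeds, iterations, rng):
    """Coverage-guided loop: mutants that add coverage join the queue."""
    queue = [(s, target(s)) for s in seeds]
    seen = set().union(*(cov for _, cov in queue))
    for i in range(iterations):
        parent, _ = queue[i % len(queue)]   # next input from the queue
        child = mutate(parent, rng)
        cov = target(child)                 # execute the instrumented target
        if not cov <= seen:                 # coverage increased?
            seen |= cov
            queue.append((child, cov))
        if i % 1000 == 999:
            queue = cull(queue)             # total coverage is unchanged
    return queue, seen
```

Culling keeps at least one input per covered branch, so the union of coverage over the queue is the same before and after.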
AFL
• Instrument the binary at compile-time
• Regular mode: instrument assembly
• Recent addition: LLVM compiler instrumentation mode
• Provide 64K counters representing all edges in the app
• Hashtable keeps track of the number of executions of each edge
– 8 bits per edge (number of executions: 1, 2, 3, 4-7, 8-15, 16-31,
32-127, 128+)
– Imprecise (edges may collide) but very efficient
• AFL-fuzz is the driver process, the target app runs as
separate process(es)
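The edge counters can be modelled as follows. This sketch follows the scheme described in AFL's technical notes (each basic block has a random ID, and the edge index is the current ID XORed with the previous ID shifted right by one), but the `CoverageMap` class and `bucket` function are illustrative Python, not AFL's actual C code. Encoding each bucket as a single bit is likewise how this sketch chooses to represent the ranges listed above.

```python
MAP_SIZE = 1 << 16  # 64K one-byte counters


class CoverageMap:
    """Toy model of a shared edge-hit-count map."""

    def __init__(self):
        self.counters = bytearray(MAP_SIZE)
        self.prev = 0

    def visit(self, block_id):
        """Called on entry to each basic block with that block's ID."""
        idx = (block_id ^ self.prev) % MAP_SIZE
        self.counters[idx] = (self.counters[idx] + 1) & 0xFF  # 8-bit wrap
        # The shift makes A->B and B->A map to different indices.
        self.prev = block_id >> 1


def bucket(count):
    """Map a raw hit count to its bucket: 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+."""
    if count == 0:
        return 0
    if count == 1:
        return 1
    if count == 2:
        return 2
    if count == 3:
        return 4
    if count <= 7:
        return 8
    if count <= 15:
        return 16
    if count <= 31:
        return 32
    if count <= 127:
        return 64
    return 128


cmap = CoverageMap()
for block in (0x1234, 0x5678):   # block IDs chosen by hand for the example
    cmap.visit(block)
```

Two different edges can hash to the same index, which is the imprecision noted above. Bucketing means a loop running 5 times and one running 6 times look the same, while 3 versus 4 iterations counts as new behavior.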
Data-flow-guided fuzzing
• Intercept the data flow, analyze the inputs of
comparisons
– Incurs extra overhead
• Modify the test inputs, observe the effect on
comparisons
• Prototype implementations in libFuzzer and
go-fuzz
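A toy version of the idea: the target's comparisons go through a tracing helper that records both operands, and the fuzzer rewrites the input so that the observed operand becomes the expected one. `Trace`, `target`, `candidates`, and `solve` are hypothetical names for this sketch; libFuzzer and go-fuzz implement comparison tracing through compiler instrumentation and differ in detail.

```python
class Trace:
    """Records the operands of every intercepted comparison."""

    def __init__(self):
        self.pairs = []

    def eq(self, observed, expected):
        self.pairs.append((bytes(observed), bytes(expected)))
        return observed == expected


def target(data, trace):
    """Toy target guarded by two magic values; returns the depth reached."""
    if not trace.eq(data[0:4], b"FUZZ"):
        return 0
    if not trace.eq(data[4:8], b"BUGS"):
        return 1
    return 2


def candidates(data, pairs):
    """For each failed comparison, replace the observed bytes in the input."""
    out = []
    for observed, expected in pairs:
        if observed and observed != expected:
            pos = data.find(observed)
            if pos >= 0:
                out.append(data[:pos] + expected + data[pos + len(observed):])
    return out


def solve(data, max_rounds=8):
    """Modify the input, observe the effect on comparisons, and repeat."""
    best = data
    for _ in range(max_rounds):
        trace = Trace()
        depth = target(best, trace)
        improved = False
        for cand in candidates(best, trace.pairs):
            if target(cand, Trace()) > depth:
                best, improved = cand, True
                break
        if not improved:
            break
    return best


result = solve(b"AAAABBBB")
```

Random byte mutation would need to guess each 4-byte magic value by chance, whereas the traced comparisons reveal them directly, at the cost of the extra overhead of recording every comparison.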
Fuzzing challenges
• How to seed a fuzzer?
– Seed inputs must cover different branches
– Remove duplicate seeds covering the same
branches
– Small seeds are better (Why?)
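One simple way to act on these seed-selection points is a greedy pass over the candidates, smallest first, keeping a seed only if it covers a branch not yet covered. The sketch below assumes a `coverage` callable that returns the set of branches an input reaches; `toy_coverage` is a stand-in for running an instrumented target, and `minimize_corpus` is a hypothetical name. The greedy result is not guaranteed to be the smallest possible corpus.

```python
def minimize_corpus(seeds, coverage):
    """Keep seeds that add new branches, preferring smaller seeds."""
    kept, covered = [], set()
    for seed in sorted(seeds, key=len):
        cov = coverage(seed)
        if not cov <= covered:      # covers at least one new branch
            kept.append(seed)
            covered |= cov
    return kept


def toy_coverage(data):
    """Stand-in for an instrumented run: which toy branches does data hit?"""
    branches = set()
    if data.startswith(b"%PDF"):
        branches.add("header")
    if b"obj" in data:
        branches.add("object")
    if b"stream" in data:
        branches.add("stream")
    return branches


seeds = [
    b"%PDF obj",
    b"%PDF obj padding padding",   # same branches as the first, but larger
    b"%PDF stream",
    b"obj",
]
corpus = minimize_corpus(seeds, toy_coverage)
```

Here the padded seed is dropped because it reaches no branch that the smaller seeds miss.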