Fuzzing Random Ubuntu Packages with Mayhem - Part 1
About Mayhem
Mayhem is a cloud (or on-premises) fuzzing solution created by ForAllSecure. It has some great features that make fuzzing more approachable for software developers with little fuzzing experience. I almost think of it as a big red easy button for fuzzing. For experienced fuzz testers, it's quite nice for a number of reasons.
- It takes care of the backend for me. I don't have to worry about setting up infrastructure like ClusterFuzz or networking a bunch of virtual machines together.
- As a corollary of the first point, I am not using my own resources. I still have all of the cores available on my Mac, and I'm not paying for CI/CD minutes or anything. Mayhem isn't free unless you're working on open-source projects, though.
- It abstracts network calls. With tools like AFL, your fuzz harness needs to convert calls that read from the network (think recv to stdin). Mayhem takes care of this for you.
- It supports Docker containers or native applications.
- It supports running native code (with or without source) or cross-architecture targets running with QEMU.
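To make the recv-to-stdin point concrete, here is a minimal sketch in C of what that conversion tends to look like when you do it by hand for a file- or stdin-driven fuzzer. The names handle_request and serve_one are hypothetical and do not come from any real project. The idea is only that the parsing logic takes a file descriptor, so a harness can hand it standard input where a server would hand it a socket.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical parser: returns 1 if the request starts with "PING". */
int handle_request(const char *buf, size_t len)
{
    return len >= 4 && memcmp(buf, "PING", 4) == 0;
}

/*
 * Reads one request from fd and parses it. A server would pass a connected
 * socket here; a fuzz harness passes STDIN_FILENO, since read() works on both.
 * Returns -1 on a read error, otherwise the parser's result.
 */
int serve_one(int fd)
{
    char buf[4096];
    ssize_t n = read(fd, buf, sizeof buf);
    if (n < 0)
        return -1;
    return handle_request(buf, (size_t)n);
}
```

A harness built this way would simply call serve_one(STDIN_FILENO) from main. With Mayhem, per the point above, you can skip this rewrite.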
Random Package #1
Enough about Mayhem. I just scrolled down the list of available Ubuntu packages for Jammy Jellyfish (22.04 LTS), looking for something easy to harness (it was a lazy weekend, after all). I came across the word "assembler" and knew I had something simple. crasm is a cross assembler for 6800/6801/6803/6502/65C02/Z80 and is hosted on GitHub. At the time of harnessing, it was last updated on February 15th, 2021.
Compilation
I cloned the repository from GitHub and compiled it with the AFL compiler: CC=afl-clang-lto CXX=afl-clang-lto++ make. This produced the compiled amd64 ELF in /src/crasm. Easy. I wrapped this up and threw it on Docker Hub (which will come in handy later) as a public image.
Test Cases
For fuzzing, you usually want seed test cases: valid inputs the fuzzer can mutate, so it doesn't have to discover the input format from scratch. This saves the fuzzer a lot of time. Like, a lot. Fortunately for us, crasm includes its own test suite under /test. By default, Mayhem looks for these test cases in the folder testsuite. For reference, documents for AFL usually just call this /input.
Usage
The next thing I always do before starting a fuzzing campaign is to use the program and see how it accepts input and reacts. We can invoke crasm just by calling the executable without any flags.
./crasm
No input!
Syntax: crasm [-slx] [-o SCODEFILE] INPUTFILE
Crasm 1.10 known CPUs:
6800 6801 6803
6500 6502 65C02
Z80
If we provide the program with one of the test cases described above, we get the output dumped to stdout. This looks really easy to fuzz.
./crasm /testsuite/copy.6800.asm
Pass #1
Pass #2
Crasm 1.10: page 1
1 ;;; Author: Leon Bottou
2 ;;; Public Domain.
3
4 cpu 6800
5
8000 6 * = $8000
7
0040 8 begin = $40
...
Successful assembly...
Last address 803b (32827)
Code length 78 (120)
Crasm 1.10: page 2
0040 Abs BEGIN
^8013 Abs COPY
0042 Abs DEST
0044 Abs LEN
Harness and Mayhemfile
Since the crasm program just reads from a file passed in as an argument, we don't really need a harness. To run a fuzzing job, Mayhem uses a YAML file called Mayhemfile. This file specifies how to run the program, environment variables (like LD_PRELOAD), how long to wait for the program to respond, etc. A full list of Mayhemfile options is provided in the Mayhem documentation. Mayhem provides a command line tool (aptly called mayhem) that can automatically generate a Mayhemfile for you with mayhem init. We can then modify the generated file like so.
# Mayhem by <https://forallsecure.com>
# Mayhemfile: configuration file for testing your target with Mayhem
# Format: YAML 1.1
# Project name that the target belongs to
project: crasm
# Target name (should be unique within the project)
target: crasm
# Base image to run the binary in.
image: whatthefuzz/crasm-afl:1.0.0
# Turns on extra test case processing (completing a run will take longer)
advanced_triage: false
# List of commands used to test the target
cmds:
  # Command used to start the target, "@@" is the input file
  # (when "@@" is omitted Mayhem defaults to stdin inputs)
  - cmd: /crasm/src/crasm @@
    env: {}
    # Max size in bytes of the test size.
    max_length: 65536
So what are we looking at? ForAllSecure does a good job of providing descriptions for the keys. I'll specifically point out the image key. It tells Mayhem to pull down a public Docker image that contains the executable. This is super helpful since the host I usually use is an arm64 Mac. The cmds.cmd key has a value that is similar to what we used above in Usage. The only difference is the @@, which the fuzzer substitutes with each test case as it runs (i.e. the target reads input from a file; if it read from stdin, we would omit this).
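If the @@ convention is new to you, the sketch below shows the idea in C: the marker in the command string is swapped for the path of the current test case before the target is launched. This is purely illustrative and is not how Mayhem (or AFL) implements it. The function name expand_input_marker and the path /tmp/case1 are made up for the example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Illustrative only: expands the first "@@" in cmd to path and writes the
 * result into out. Returns 0 on success, -1 if there is no "@@" or the
 * result does not fit in cap bytes.
 */
int expand_input_marker(const char *cmd, const char *path, char *out, size_t cap)
{
    const char *marker = strstr(cmd, "@@");
    if (marker == NULL)
        return -1;

    /* Copy everything before the marker, then the path, then the rest. */
    int prefix_len = (int)(marker - cmd);
    int n = snprintf(out, cap, "%.*s%s%s", prefix_len, cmd, path, marker + 2);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

Calling expand_input_marker("/crasm/src/crasm @@", "/tmp/case1", out, sizeof out) fills out with "/crasm/src/crasm /tmp/case1", which has the same shape as the command we ran by hand in Usage.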
Start Fuzzing!
With all of this, we can kick off the fuzzing job by running mayhem run . from the directory with our Mayhemfile.
From experience, it sometimes takes a while to get a job running. Mayhem provides an event log that helps you get things right (this was my seventh run). Whether it's a typo or you left something enabled that you shouldn't have, keep trying. You'll get it.
Bugs
I went to grab a drink while Mayhem did its thing. I came back to two bugs: a divide by zero and a NULL pointer dereference. The nice thing about Mayhem is that it comes with its secret-sauce symbolic execution engine, which helps it find interesting test cases faster. It's also pretty great that Mayhem automatically labels the defects with CWEs.
Fixing the Bugs
This wouldn't be a cool project if we didn't at least fix the bugs we found. First, let's compile the target without instrumentation (i.e. don't use the AFL clang compiler) and add debugging symbols. In crasm/src we can modify the flags like so: CFLAGS = -O0 -Wall -g. The -g flag gives us debugging symbols, and -O0 turns off optimizations so that variables and line numbers stay intact in the debugger. Then we can compile like so:
$ clang --version
Apple clang version 14.0.0 (clang-1400.0.29.202)
Target: x86_64-apple-darwin22.2.0
Thread model: posix
$ CC=clang make
...
Then, we verify that we get a crash with the crashing test cases provided by Mayhem. If we run mayhem sync . in the directory with our Mayhemfile, we download the corpus of test files Mayhem generated (crashing and non-crashing).
$ mayhem sync .
$ ls -al ./defects
4ed6eacf6ec3c24f587ec3321b5fd739480c96a7679c8108f2f6034f07ecaff4
517d1b402d585fdb0458f96802a616419b9112bdc119a2393c35e034576a0c62
# Make sure to use the non-instrumented crasm to test!
$ crasm ./defects/4ed6eacf6ec3c24f587ec3321b5fd739480c96a7679c8108f2f6034f07ecaff4
Pass #1
[1] 58286 segmentation fault ~/Developer/crasm/src/crasm
$ crasm ./defects/517d1b402d585fdb0458f96802a616419b9112bdc119a2393c35e034576a0c62
Pass #1
[1] 58369 floating point exception ~/Developer/crasm/src/crasm
Success (or failure?)! We still have crashing test cases. Now we can triage. We can run the program under lldb (Linux folks can use gdb, just substitute the -- with --args) with the following:
$ lldb -- ./crasm ./4ed6eacf6ec3c24f587ec3321b5fd739480c96a7679c8108f2f6034f07ecaff4
(lldb) run
Process 99307 launched: '/crasm/src/crasm' (x86_64)
Pass #1
Process 99307 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
frame #0: 0x00000001000037b3 crasm`Xasc(modifier=0, label="msgb", mnemo="asc", oper=0x0000000000000000) at pseudos.c:221:15
218 register char delimiter;
219
220 s = oper;
-> 221 delimiter = *s;
222
223 if (delimiter != '\'' && delimiter != '\"')
224 {
Target 0: (crasm) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
* frame #0: 0x00000001000037b3 crasm`Xasc(modifier=0, label="msgb", mnemo="asc", oper=0x0000000000000000) at pseudos.c:221:15
frame #1: 0x0000000100002d6c crasm`asmline(s="asc", status=3) at crasm.c:562:7
frame #2: 0x00000001000027b1 crasm`pass(n=1) at crasm.c:274:9
frame #3: 0x0000000100002490 crasm`crasm(flag=138) at crasm.c:180:3
frame #4: 0x0000000100002292 crasm`main(argc=0, argv=0x00007ff7bfeff440) at crasm.c:147:5
frame #5: 0x00007ff812381310 dyld`start + 2432
So this is our null pointer dereference. It's easy to see why with the source.
int Xasc(int modifier, char* label, char* mnemo, char* oper)
{
  register char* s;
  register char r;
  register char delimiter;
  s = oper;
  delimiter = *s;
  ...
Whatever called this function (which we can see with the backtrace) passed in a NULL value for oper. Looking at asmline (crasm.c:562), we can see the offending line:
if (status & 2)
{
(*labmnemo->ptr)(labmnemo->modifier, label, mnemo, oper);
}
}
Lots of bugs come from calls made through function pointers, as in this case. Anyway, we're going to recommend a minimum viable patch to at least prevent the dereference. Here, we check that the character pointer oper is not NULL. The other arguments aren't used but are likely needed because every handler called through the function pointer has to share the same signature.
int Xasc(int modifier, char* label, char* mnemo, char* oper)
{
  if (oper == NULL)
  {
    error("Need an operand");
  }
  ...
After re-compilation, we verify that we don't get a segmentation fault, just more errors.
$ crasm ./4ed6eacf6ec3c24f587ec3321b5fd739480c96a7679c8108f2f6034f07ecaff4
< assembly output omitted>
ERRORS: 5
WARNINGS: 0
No code generated...
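For anyone who wants to play with this bug class outside of crasm, here is a small self-contained sketch of the same pattern: a table of handlers that share one signature, called through a function pointer, with the NULL check living inside the handler. Everything in it (handle_asc, dispatch, the return codes) is hypothetical. In particular it returns an error code where the real patch calls crasm's error(), whose behavior I am not reproducing here.

```c
#include <stddef.h>
#include <string.h>

/* Every handler shares one signature so it can live in the dispatch table. */
typedef int (*handler_fn)(int modifier, char *label, char *mnemo, char *oper);

/* Hypothetical handler modeled on Xasc: returns -1 instead of dereferencing NULL. */
int handle_asc(int modifier, char *label, char *mnemo, char *oper)
{
    (void)modifier; (void)label; (void)mnemo; /* unused, but part of the signature */
    if (oper == NULL)
        return -1;                            /* stands in for "Need an operand" */
    char delimiter = *oper;                   /* safe: oper is non-NULL here */
    return (delimiter == '\'' || delimiter == '"') ? 0 : -2;
}

struct entry { const char *name; handler_fn fn; };

static const struct entry table[] = { { "asc", handle_asc } };

/* Looks up mnemo and calls its handler through the function pointer. */
int dispatch(char *mnemo, char *oper)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (strcmp(table[i].name, mnemo) == 0)
            return table[i].fn(0, NULL, mnemo, oper);
    return -3; /* unknown mnemonic */
}
```

The caller in this sketch has no idea whether a given handler tolerates a missing operand, so the guard sits in the handler, which is the same placement as the patch above.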
After fixing this, I took a look at the pending pull requests (from 2019) on the author's repository and saw that another individual had spotted a similar bug in an adjacent function, Xdc.
Test Case #2 - Divide by Zero
Looking at the next test case:
$ lldb -- ./crasm ./517d1b402d585fdb0458f96802a616419b9112bdc119a2393c35e034576a0c62
(lldb) target create "./crasm"
Current executable set to '/crasm/src/crasm' (x86_64).
(lldb) settings set -- target.run-args "./517d1b402d585fdb0458f96802a616419b9112bdc119a2393c35e034576a0c62"
(lldb) r
Process 2564 launched: '/crasm/src/crasm' (x86_64)
Pass #1
Process 2564 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_ARITHMETIC (code=EXC_I386_DIV, subcode=0x0)
frame #0: 0x00000001000078f4 crasm`opdiv(presult=0x0000000100017468, parg=0x00007ff7bfefefa0) at operator.c:415:18
412 presult->flags |= parg->flags;
413 checktype(presult, L_ABSOLUTE);
414 checktype(parg, L_ABSOLUTE);
-> 415 presult->value /= parg->value;
416 }
417
418 void oprlist(struct result* presult, struct result* parg)
Target 0: (crasm) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_ARITHMETIC (code=EXC_I386_DIV, subcode=0x0)
* frame #0: 0x00000001000078f4 crasm`opdiv(presult=0x0000000100017468, parg=0x00007ff7bfefefa0) at operator.c:415:18
frame #1: 0x0000000100006455 crasm`parse2(expr="ed/maica", presult=0x0000000100017468) at parse.c:152:7
frame #2: 0x00000001000062b4 crasm`parse(expr="ed/maica") at parse.c:233:3
frame #3: 0x0000000100009357 crasm`findmode(oper="aciam/de", pvalue=0x00007ff7bfeff068) at cpu6800.c:99:11
frame #4: 0x0000000100009214 crasm`standard(code=202, label=0x0000000000000000, mnemo="orab", oper="aciam/de") at cpu6800.c:163:9
frame #5: 0x0000000100002d3c crasm`asmline(s="orab aciam/de", status=3) at crasm.c:562:7
frame #6: 0x0000000100002781 crasm`pass(n=1) at crasm.c:274:9
frame #7: 0x0000000100002460 crasm`crasm(flag=138) at crasm.c:180:3
frame #8: 0x0000000100002262 crasm`main(argc=0, argv=0x00007ff7bfeff440) at crasm.c:147:5
frame #9: 0x00007ff812381310 dyld`start + 2432
Pretty easy to spot the bug there: nothing checks the divisor, parg->value, before the division. It's also a pretty easy fix to check the divisor before we divide (I'm not going for gold here, just a minimum viable patch), like so:
void opdiv(struct result* presult, struct result* parg)
{
  presult->flags |= parg->flags;
  checktype(presult, L_ABSOLUTE);
  checktype(parg, L_ABSOLUTE);
  if (parg->value != 0) {
    presult->value /= parg->value;
  }
}
Re-compiling and rerunning the test case shows that the issue is resolved. We still get errors, but no floating-point exceptions!
./crasm ./517d1b402d585fdb0458f96802a616419b9112bdc119a2393c35e034576a0c62
...
ERRORS: 4
WARNINGS: 0
No code generated...
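As an aside, the "floating point exception" message is a bit misleading. On x86, an integer division by zero raises a hardware exception that the OS reports as SIGFPE, even though no floating point is involved. The same trap fires for the most negative integer divided by -1, because the result doesn't fit. Below is a minimal sketch of a guard that covers both cases. The name checked_div and the use of long are my assumptions for the example; the actual type of the value field in crasm isn't shown above.

```c
#include <limits.h>

/*
 * Divides *value by divisor in place. Returns 0 on success, or -1 (leaving
 * *value untouched) for the two cases that trap on x86: a zero divisor and
 * LONG_MIN / -1, whose result does not fit in a long.
 */
int checked_div(long *value, long divisor)
{
    if (divisor == 0)
        return -1;
    if (*value == LONG_MIN && divisor == -1)
        return -1;
    *value /= divisor;
    return 0;
}
```

Returning a status lets the caller report a proper "division by zero" error to the user, which is probably nicer than silently skipping the division.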
I submitted a pull request for both bugs, and it was merged within twenty minutes. Props to the author. After the merge, I submitted bug reports to the Ubuntu package repository to alert them to the possible security issues in one of their packages.
What's Next?
Mayhem limits OSS fuzz jobs to five minutes, so I continued with AFL++ (a super easy conversion). Maybe we'll shake out more bugs?