About MemLock

Uncontrolled memory consumption is a kind of critical software security weaknesses. It can also become a security-critical vulnerability when attackers can take control of the input to consume a large amount of memory and launch a Denial-of-Service attack. However, detecting such vulnerability is challenging, as the state-of-the-art fuzzing techniques focus on the code coverage but not memory consumption. To this end, we propose a memory usage guided fuzzing technique, named MemLock, to generate the excessive memory consumption inputs and trigger uncontrolled memory consumption bugs. The fuzzing process is guided with memory consumption information so that our approach is general and does not require any domain knowledge. We perform a thorough evaluation for MemLock on 14 widely-used real-world programs. Our experiment results show that MemLock substantially outperforms the state-of-the-art fuzzing techniques, including AFL, AFLfast, PerfFuzz, FairFuzz, Angora and QSYM, in discovering memory consumption bugs. During the experiments, we discovered many previously unknown memory consumption bugs and received 15 new CVEs.

Site Map

This website contains the following pages:


  1. About MemLock
    • MemLock is a dedicated fuzzing technique to automatically discover memory consumption related bugs, thorough generating inputs on excessive memory consumption.
  2. Memory Consumption Related Bugs
    • We focus on three types of memory consumption related bugs: "CWE-674: uncontrolled-recursion", "CWE-789: uncontrolled-memory-allocation", and "CWE-401: memory leak".
  3. MemLock's Solution
    • We guide the fuzzing process using the coverage information as well as memory consumption information.
  4. Practical Security Impact of MemLock
    • With MemLock, we have discovered many previously unknown security-critical vulnerabilities. These vulnerabilities were not previously reported and rated as a medium-security risk. We informed the maintainers, and Mitre assigned 17 CVEs. Our work leads to most of these bugs being patched.
  5. Evaluation Dataset
    • We select evaluation benchmarks considering several factors, e.g., popularity, frequency of being tested, development activeness, and functional diversity. We also open the detail information of our benchmark program, including the POC files and the initial seeds, at github: MemLock_Benchmark to facilitate the research in this area.
  6. Experiment Result and Raw Data
    • In this page, we provide the Detail experimental result, together with the raw experimental data and additional analysis. We consider both unique bugs and unique crashes each fuzzer finds in the fuzzing process. Since MemLock is to stress the space complexity issues of programs, we also distill the memory consumption of each seed in the pool. Finally, we analyze the impact of MemLock's guidance mechanism on both coverage and execution speed. We also apply statistical tests to the experimental results. We use the Mann Whitney U-test to determine the stochastic ranking of all the compared fuzzers, and the A12 statistics to show the possibility that one fuzzer is better than the other according to all the runs.
  7. Quality of seeds on Memory Consumption
    • In this page, we use experiments to observe the test input distribution according to memory consumption. MemLock seeks to generate test inputs that consume more and more memory, which is quite different from other fuzzers.
  8. Additional Evaluation for Overhead
    • Since MemLock intentionally keeps seeds that increase memory consumption, it is likely that test executions over time will take longers, thus reducing the efficiency of the fuzzer. Thus, we also analyze the impact of MemLock's guidance mechanism on both coverage and execution speed.
  9. Case Study: CVE-2019-6293
    • An issue was discovered in the function mark_beginning_as_normal in nfa.c in flex 2.6.4. There is a stack exhaustion problem caused by the mark_beginning_as_normal function making recursive calls to itself in certain scenarios involving lots of '*' characters. Remote attackers could leverage this vulnerability to cause a denial-of-service.
  10. Case Study: CVE-2018-17985
    • An issue was discovered in cp-demangle.c in GNU libiberty, as distributed in GNU Binutils 2.31. There is a stack consumption problem caused by the cplus_demangle_type function making recursive calls to itself in certain scenarios involving many 'P' characters.
  11. Case Study: CVE-2018-4868
    • The Exiv2::Jp2Image::readMetadata function in jp2image.cpp in Exiv2 0.26 allows remote attackers to cause a denial of service (excessive memory allocation) via a crafted file.
  12. Fuzzing Related Work
    • We have examined hundreds of recently published papers on fuzz testing located by perusing top-conference proceedings and other quality venues, and studied their approach. We made a simple comb and summary. Our investigation of fuzzing can also be seen in the summary: https://wcventure.github.io/FuzzingPaper/.
  13. Video of the presentation
    • We show a video of the presentation.
  14. Video of Tool Demo for Explanation
    • We show a video of our tool demo.
  15. Link
    • MemLock is available now.

Memory Consumption Related Bugs

As the definition in CWE 400, "The software does not properly control the allocation and maintence of a limited resource thereby enabling an actor to influence the amount of resources consumed, eventually leading to the exhaustion of available resources". If an attacker can trigger the allocation of these limited resources, but the number or size of the resources is not controlled, then the attacker could cause a denial of service that consumes all available resources. This would prevent valid users from accessing the software, and it could potentially have an impact on the surrounding environment. For example, a memory exhaustion attack against an application could slow down the application as well as its host operating system.

However, memory consumption bugs are essentially different from memory corruption vulnerabilities, because they are primarily in relation to the problem of uncontrolled memory resource consumption. In our work, we focus on three types of memory consumption related bugs: uncontrolled-recursion, uncontrolled-memory-allocation, and memory leak.

    CWE-674: Uncontrolled Recursion: The product does not properly control the amount of recursion that takes place, which consumes excessive resources, such as allocated memory or the program stack.
    CWE-789: Uncontrolled Memory Allocation: The product allocates memory based on an untrusted size value, but it does not validate or incorrectly validates the size, allowing arbitrary amounts of memory to be allocated.
    CWE-401: Memory Leak: The software does not sufficiently track and release allocated memory after it has been used, which slowly consumes remaining memory.

MemLock's Solution


We present MemLock to enhance grey-box fuzzing to find memory consumption bugs. The coverage information guides to explore different program paths, and the memory consumption information guides the program path to consume more memory. This input can be further mutated so that the newly generated input consumes more and more memory. After some mutations, MemLock is expected to generate an input whereby the memory consumption exceeds the available memory.

The Figure above shows the workflow of MemLock, which contains two main components: static analysis and fuzzing loop. In particular, the static analysis takes the program source code as the input, and analyzes three kinds of information: control flow graph, call graph, and memory operations. The static analysis in MemLock helps decide where to instrument and what to instrument. The control flow graph information is used to collect the branch coverage; the call graph information aids to instrument the function call entries and returns. Based on the memory operation statements, MemLock instruments the locations of memory allocation and free.

Once the program is instrumented, MemLock enters the continuous fuzzing loop to detect memory consumption bugs. Given the initial seeds, MemLock selects a seed s from the seed pool. As for the seed s, MemLock uses the mutation strategy to generate the new inputs (test cases). MemLock then runs the generated inputs against the instrumented program, and collects their memory consumption information and branch coverage information. If the generated seeds consume more memory or have new branch coverage, they are retained as interesting seeds. MemLock prioritizes these interesting seeds and adds them into the seed pool, again. MemLock repeats this process until reaching time or resource budget limits.



Read the MemLock paper for more details.




Practical Security Impact

With MemLock, we have responsibly disclosed several previously unknown security-critical vulnerabilities. These vulnerabilities were not previously reported and rated as a medium-security risk. We informed the maintainers, and Mitre assigned 17 CVEs for these issues. Among these 17 CVEs, 9 CVEs are uncontrolled-recursion vulnerabilities, 6 CVEs are vulnerabilities due to uncontrolled-memory-allocation issues, and 2 CVEs are about memory leak vulnerabilities. An attacker might leverage these vulnerabilities to launch an attack, by providing well-conceived inputs that trigger excessive memory consumption. Our report made these vulnerable programs being patched. At the time of writing, 12 of these vulnerabilities have been patched.

Our bug report has received wide acclaim in the community. We are confident that MemLock is effective and viable in practice.


We investigated an uncontrolled recursive vulnerability we identified. Users can directly control the depth of recursive function calls. Then, the developers also confirm the existence of this vulnerability and recognizes that uncontrolled recursion is bad behavior. And the developers reply to us, "Thanks for presenting your analysis of the defect. It is trivially handled by the user as you say bug a default recursion limit would be nice. Do you plan to implement that feature to fight off the CVE hounds? Thanks". Finally, We help developers fix this vulnerability by restricting recursion.


In another example, we found an uncontrolled memory allocation vulnerability through MemLock. The conditions for triggering this vulnerability are very harsh, and a large value is required to fail the memory allocation. After the memory allocation failed, the program crashed directly. We worked with the author to artificially analyze the vulnerability. Finally, we draw some conclusions. The program would crash under two situations: 1. try to allocate zero sized data; 2. try to allocate 1TB or more memory. The autore reply, "I have added a workaround for the latter case, but it is recommended to check the retrun value of malloc. You can Contribute PR".


Other Examples


In a word, MemLock is well received by the community, and we are confident that MemLock is effective and viable in practice.


To ensure anonymity, the CVEs ID are not made available in the review phase. We will make in available soon, togeter with a set of case studies.


Security-critical Vulnerabilities(15 CVEs)

CVEs Vulnerability Type Vulnerability Description
CVE-2019-6293 CWE-674: Uncontrolled Recursion An issue was discovered in the function mark_beginning_as_normal in nfa.c in flex 2.6.4. There is a stack exhaustion problem caused by the mark_beginning_as_normal function making recursive calls to itself in certain scenarios involving lots of '*' characters. Remote attackers could leverage this vulnerability to cause a denial-of-service.
CVE-2019-6292 CWE-674: Uncontrolled Recursion An issue was discovered in singledocparser.cpp in yaml-cpp (aka LibYaml-C++) 0.6.2. Stack Exhaustion occurs in YAML::SingleDocParser, and there is a stack consumption problem caused by recursive stack frames: HandleCompactMap, HandleMap, HandleFlowSequence, HandleSequence, HandleNode. Remote attackers could leverage this vulnerability to cause a denial-of-service via a cpp file.
CVE-2019-6291 CWE-674: Uncontrolled Recursion An issue was discovered in the function expr6 in eval.c in Netwide Assembler (NASM) through 2.14.02. There is a stack exhaustion problem caused by the expr6 function making recursive calls to itself in certain scenarios involving lots of '!' or '+' or '-' characters. Remote attackers could leverage this vulnerability to cause a denial-of-service via a crafted asm file.
CVE-2019-6290 CWE-674: Uncontrolled Recursion An infinite recursion issue was discovered in eval.c in Netwide Assembler (NASM) through 2.14.02. There is a stack exhaustion problem resulting from infinite recursion in the functions expr, rexp, bexpr and cexpr in certain scenarios involving lots of '{' characters. Remote attackers could leverage this vulnerability to cause a denial-of-service via a crafted asm file.
CVE-2018-18701 CWE-674: Uncontrolled Recursion An issue was discovered in cp-demangle.c in GNU libiberty, as distributed in GNU Binutils 2.31. There is a stack consumption vulnerability resulting from infinite recursion in the functions next_is_type_qual() and cplus_demangle_type() in cp-demangle.c. Remote attackers could leverage this vulnerability to cause a denial-of-service via an ELF file, as demonstrated by nm.
CVE-2018-18700 CWE-674: Uncontrolled Recursion An issue was discovered in cp-demangle.c in GNU libiberty, as distributed in GNU Binutils 2.31. There is a stack consumption vulnerability resulting from infinite recursion in the functions d_name(), d_encoding(), and d_local_name() in cp-demangle.c. Remote attackers could leverage this vulnerability to cause a denial-of-service via an ELF file, as demonstrated by nm.
CVE-2018-18484 CWE-674: Uncontrolled Recursion An issue was discovered in cp-demangle.c in GNU libiberty, as distributed in GNU Binutils 2.31. Stack Exhaustion occurs in the C++ demangling functions provided by libiberty, and there is a stack consumption problem caused by recursive stack frames: cplus_demangle_type, d_bare_function_type, d_function_type.
CVE-2018-17985 CWE-674: Uncontrolled Recursion An issue was discovered in cp-demangle.c in GNU libiberty, as distributed in GNU Binutils 2.31. There is a stack consumption problem caused by the cplus_demangle_type function making recursive calls to itself in certain scenarios involving many 'P' characters.
CVE-2019-7704 CWE-789: Uncontrolled Memory Allocation wasm::WasmBinaryBuilder::readUserSection in wasm-binary.cpp in Binaryen 1.38.22 triggers an attempt at excessive memory allocation, as demonstrated by wasm-merge and wasm-opt.
CVE-2019-7698 CWE-789: Uncontrolled Memory Allocation An issue was discovered in AP4_Array::EnsureCapacity in Core/Ap4Array.h in Bento4 1.5.1-627. Crafted MP4 input triggers an attempt at excessive memory allocation, as demonstrated by mp42hls, a related issue to CVE-2018-20095.
CVE-2019-7148 CWE-789: Uncontrolled Memory Allocation An attempted excessive memory allocation was discovered in the function read_long_names in elf_begin.c in libelf in elfutils 0.174. Remote attackers could leverage this vulnerability to cause a denial-of-service via crafted elf input, which leads to an out-of-memory exception. NOTE: The maintainers believe this is not a real issue, but instead a "warning caused by ASAN because the allocation is big. By setting ASAN_OPTIONS=allocator_may_return_null=1 and running the reproducer, nothing happens."
CVE-2018-20652 CWE-789: Uncontrolled Memory Allocation An attempted excessive memory allocation was discovered in the function tinyexr::AllocateImage in tinyexr.h in tinyexr v0.9.5. Remote attackers could leverage this vulnerability to cause a denial-of-service via crafted input, which leads to an out-of-memory exception.
CVE-2018-18483 CWE-789: Uncontrolled Memory Allocation The get_count function in cplus-dem.c in GNU libiberty, as distributed in GNU Binutils 2.31, allows remote attackers to cause a denial of service (malloc called with the result of an integer-overflowing calculation) or possibly have unspecified other impact via a crafted string, as demonstrated by c++filt.
CVE-2018-20657 CWE-401: Memory Leak The demangle_template function in cplus-dem.c in GNU libiberty, as distributed in GNU Binutils 2.31.1, has a memory leak via a crafted string, leading to a denial of service (memory consumption), as demonstrated by cxxfilt, a related issue to CVE-2018-12698.
CVE-2018-20002 CWE-401: Memory Leak The _bfd_generic_read_minisymbols function in syms.c in the Binary File Descriptor (BFD) library (aka libbfd), as distributed in GNU Binutils 2.31, has a memory leak via a crafted ELF file, leading to a denial of service (memory consumption), as demonstrated by nm.


Evaluation Dataset

We also provide a repo containing the benchmarks, CVE information, as well as the inital seeds to replicate our experiments.

MemLock's evaluation Data Set: https://github.com/ICSE2020-MemLock/MemLock_Benchmark


Target Program Project Version SLoC Input Format Test Instruction Detail Information
mjs mjs 1.20.1 40,440 js mjs @@ https://github.com/ICSE2020-MemLock/MemLock_Benchmark/tree/master/mjs
cxxfilt Bintuils 2.31 1,757,662 string cxxfilt -t https://github.com/ICSE2020-MemLock/MemLock_Benchmark/tree/master/cxxfilt
nm Bintuils 2.31 1,757,662 ELF nm -C @@ https://github.com/ICSE2020-MemLock/MemLock_Benchmark/tree/master/nm
nasm NASM 2.14.03rc1 105,178 ASM nasm -f bin $POC -o /dev/null https://github.com/ICSE2020-MemLock/MemLock_Benchmark/tree/master/nasm
flex Flex 2.6.4 27,653 c/cpp flex @@ https://github.com/ICSE2020-MemLock/MemLock_Benchmark/tree/master/flex
prase Yaml-cpp 0.6.2 58,662 c/cpp parse @@ https://github.com/ICSE2020-MemLock/MemLock_Benchmark/tree/master/yaml-cpp
sassc Libsass 3.5.4 27849 text sassc @@ https://github.com/ICSE2020-MemLock/MemLock_Benchmark/tree/master/libsass
yara Yara 3.5.0 45,722 text yara @@ strings https://github.com/ICSE2020-MemLock/MemLock_Benchmark/tree/master/yara
readelf binutils 2.28 1,844,210 elf readelf -a @@ https://github.com/ICSE2020-MemLock/MemLock_Benchmark/tree/master/readelf
exiv2 Exiv2 0.26 84,430 jpeg exiv2 -pX @@ https://github.com/ICSE2020-MemLock/MemLock_Benchmark/tree/master/exiv2
opj_decompress OpenJPEG 2.3.0 243,555 jpeg opj_decompress -i @@ -o ./tmp.png https://github.com/ICSE2020-MemLock/MemLock_Benchmark/tree/master/openjpeg
mp42hls Bento4 v1.5.1-627 78,674 mp4 mp42hls @@ https://github.com/ICSE2020-MemLock/MemLock_Benchmark/tree/master/Bento4
listswf libming  0.4.8 92,100 swf listswf @@ https://github.com/ICSE2020-MemLock/MemLock_Benchmark/tree/master/libming
jasper jasper 2.0.14 44,925 jpeg jasper --input @@ --output test.bmp --output-format bmp https://github.com/ICSE2020-MemLock/MemLock_Benchmark/tree/master/jasper

Experiment Result

Unique Crashes Evaluation


To evaluate the effectiveness of fuzzers, a direct measurement is the number of unique crashes found by different fuzzers. It is believed that, more unique crashes usually indicate higher chances of covering more unique vulnerabilities.

The table below shows the number of unique crashes found by 7 different fuzzers within 24 hours in the benchmark programs. It is worth noting that, we only count the unique crashes related to memory consumption. Out of the 17 groups of experiments, MemLock performs best in the 10 (58.8%) group of experiments among 7 different fuzzers, as shown in column MemLock. In total, MemLock finds 2009 unique memory consumption-related crashes in the benchmark programs, improving by 59.2%, 70.5%, 76.9%, 98.1%, 40.5% and 66.7% respectively, compared to state-of-the-art fuzzers AFL, AFLfast, PerfFuzz, FairFuzz, Angora and QSYM. Especially, MemLock is able to find unique crashes in all benchmark programs, while other 6 state-of-the-art fuzzers may not find unique crashes in some benchmark programs. For example, other 5 state-of-the-art fuzzers cannot find any unique crashes in the program flex, however, MemLock can find 61 unique crashes within 24 hours.

Following Klees' recommendation, we also conduct the statistic test for the results. The A12 statistic measures the probability that one fuzzer outperforms another fuzzer, as shown in columns with the heading A12. Further, we use Mann-Whitney U to measure the statistical significance of performance gain. When significant, we mark the corresponding A12 values in the bold. Out of 102 A12 values in the table, 72 (70.6%) A12 values are bold and exceeding the conventionally large effect size (0.71). Thus, we can conclude that MemLock significantly outperforms other 5 state-of-the-art fuzzers in most benchmark programs.

As MemLock focuses on the space complexity issues, it prefers the longer test cases and thus may degrade the capability to find other types of vulnerabilities. For example, longer test cases tend to trigger uncontrolled-recursion issues. Therefore, we also evaluate the capability to find other types of crashes. In the benchmark programs, the state-of-the-art fuzzers MemLock, AFL, AFLfast, PerfFuzz, FairFuzz and QSYM find 77, 239, 228, 189, 276 and 236 other types of unique crashes, respectively.



Program Raw_Data Vulnerability Unique Crashes Statistical Test
MemLock AFL AFLfast PerfFuzz fairFuzz Angora QSYM AFL AFLfast PerfFuzz FairFuzz Angora QSYM
MemCrash Other MemCrash Other MemCrash Other MemCrash Other MemCrash Other MemCrash Other MemCrash Other p-value A12 p-value A12 p-value A12 p-value A12 p-value A12 p-value A12
mjs Download uncontrolled-recursion 114 16 36 184 31 172 88 106 12 219 0 0 30 181 0.0152 1.00 0.0152 1.00 0.0204 0.96 0.0152 1.00 0.0152 1.00 0.0152 1.00
cxxfilt Download uncontrolled-recursion 448 0 373 0 304 0 401 0 39 0 0 0 327 0 0.0152 1.00 0.0152 1.00 0.0561 0.88 0.0152 1.00 0.0152 1.00 0.0152 1.00
nm Download uncontrolled-recursion 127 0 12 0 21 0 17 0 0 0 0 0 20 0 0.0147 1.00 0.0152 1.00 0.0105 1.00 0.0105 1.00 0.0105 1.00 0.0142 1.00
nasm Download uncontrolled-recursion 132 0 6 0 4 0 40 0 0 0 0 0 4 0 0.0147 1.00 0.0147 1.00 0.0147 1.00 0.0101 1.00 0.0101 1.00 0.0142 1.00
flex Download uncontrolled-recursion 61 0 0 0 0 0 0 0 0 0 0 0 0 0 0.0101 1.00 0.0101 1.00 0.0101 1.00 0.0101 1.00 0.0101 1.00 0.0101 1.00
yaml-cpp Download uncontrolled-recursion 4 0 0 0 1 0 3 0 0 0 0 296 0 0 0.0105 1.00 0.0142 1.00 0.4404 0.56 0.0142 1.00 0.0105 1.00 0.0105 1.00
libsass Download uncontrolled-recursion 23 49 6 39 4 41 23 73 11 44 26 22 7 33 0.0142 1.00 0.0142 1.00 0.5000 0.53 0.0551 0.88 0.0101 0.25 0.0142 1.00
yara Download uncontrolled-recursion 156 0 34 0 33 0 65 0 0 0 13 0 31 1 0.0152 1.00 0.0152 1.00 0.0303 0.94 0.0147 1.00 0.0152 1.00 0.0147 1.00
readelf Download uncontrolled-memory-allocation 273 0 104 0 110 0 54 0 181 0 0 11 114 0 0.0061 1.00 0.0061 1.00 0.0060 1.00 0.0301 0.88 0.0152 1.00 0.0060 1.00
exiv2 Download uncontrolled-memory-allocation 10 5 11 8 11 8 6 9 15 6 13 6 8 14 0.0352 0.14 0.0675 0.20 0.0226 0.90 0.0058 0.00 0.0443 0.16 0.5000 0.52
openjpeg Download uncontrolled-memory-allocation 16 0 8 0 5 0 0 0 7 0 0 0 5 0 0.0694 0.80 0.0058 1.00 0.0036 1.00 0.4569 0.46 0.0152 1.00 0.0706 0.80
bento4 Download uncontrolled-memory-allocation 5 1 2 2 2 0 2 0 1 1 189 4 1 1 0.0014 1.00 0.0063 0.98 0.0036 1.00 0.0047 1.00 0.0038 0.00 0.0036 1.00
memory leak 145 78 72 61 125 290 74 0.0060 1.00 0.0056 1.00 0.0056 1.00 0.0061 1.00 0.0061 0.00 0.0060 1.00
libming Download uncontrolled-memory-allocation 18 0 20 0 18 0 17 0 20 0 3 0 16 0 0.0651 0.20 0.3342 0.60 0.2926 0.62 0.0116 0.20 0.0626 1.00 0.0060 0.80
memory leak 264 336 324 324 371 87 354 0.0061 0.00 0.0060 0.00 0.0061 0.00 0.0061 0.00 0.0060 1.00 0.0061 0.00
jasper Download uncontrolled-memory-allocation 3 6 2 6 3 7 0 1 3 6 2 4 2 6 0.0279 0.84 0.4102 0.56 0.0037 1.00 0.4106 0.56 0.0060 1.00 0.0124 0.92
memory leak 210 234 235 35 216 820 212 0.0177 0.08 0.0177 0.08 0.0059 1.00 0.3381 0.40 0.0061 0.00 0.4583 0.46
Total Number of Unique Crashes 2009 77 1262 239 1178 228 1136 189 1014 276 1430 343 1205 236
Improvement - +59.19% +70.54% +76.85% +98.13% +40.49% +66.72%


To better compare different fuzzers, we also use the plots to depict the performance over time in some benchmark programs, as shown in Figure below. It shows that MemLock has a steady and strong growth trend in finding unique crashes, and MemLock is also the first fuzzer that reported crashes.




Vulnerabilities Discovery


Although there is a weak correlation between the number of unique crashes and the number of vulnerabilities found during the fuzzing process. A more important measurement is the capability to find real-world vulnerabilities, as suggested by Klees. Thus, we compare the capability of MemLock to find unique vulnerabilities against baseline fuzzers in this section.

The table below shows the statistic results in 7 different fuzzers. The benchmark programs totally contain 34 vulnerabilities, out of which MemLock performs best in the 25 vulnerabilities among 6 state-of-the-art fuzzers, as shown in column MemLock. MemLock averagely takes about 5.4 hours to find each unique vulnerability, which is 2.15, 2.15, 2.20, 2.69, 3.76, 2.07 times faster than the state-of-the-art fuzzers AFL, AFlfast, PerfFuzz, FairFuzz, Angora and QSYM respectively. In particular, MemLock find 33 out of 34 unique vulnerabilities within 24 hours, while other fuzzers AFL, AFLfast, PerfFuzz, FairFuzz, Angora and QSYM only find 26, 28, 20, 17, 6 and 25, respectively. The three unique vulnerabilities (i.e., issue#106, CVE-2018-18701 and CVE-2019-6293) in mjs, nm, and flex can found by only MemLock within 24 hours. Therefore, it is proved that our memory-consumption guided strategy is very effective to find memory consumption-related bugs.

In addition, we also conduct the statistic test for real-world vulnerability evaluation. Out of 204 A12 values in the table, 139 (68.1%) A12 values are bold and exceeding the conventionally large effect size (0.71). Thus, MemLock significantly outperforms other 6 state-of-the-art fuzzers in finding unique vulnerabilities.



Program Vulnerability Type Exposing Time (hours) Statistical Test (p-value & A12)
MemLock AFL AFLfast PerfFuzz FairFuzz Angora QSYM AFL AFLfast PerfFuzz FairFuzz Angora QSYM
mjs issue#58 uncontrolled-recursion 0.5 0.3 0.4 0.2 0.4 T/O 0.3 0.1169 0.25 0.1561 0.25 0.0344 0.13 0.1547 0.25 0.0101 1.00 0.1168 0.22
issue#106 uncontrolled-recursion 13.7 T/O T/O T/O T/O T/O T/O 0.0019 1.00 0.0019 1.00 0.0019 1.00 0.0019 1.00 0.0019 1.00 0.0019 1.00
cxxfilt CVE-2018-9138 uncontrolled-recursion 0.3 7.2 10.1 0.5 T/O T/O 3.3 0.0147 1.00 0.0147 1.00 0.0902 0.81 0.0101 1.00 0.0101 1.00 0.0147 1.00
CVE-2018-9996 uncontrolled-recursion T/O 16.5 T/O T/O T/O T/O T/O 0.0105 0.00 nan 0.50 nan 0.50 nan 0.50 nan 0.50 nan 0.50
CVE-2018-17985 uncontrolled-recursion 0.2 1.1 4.5 0.2 1.9 T/O 1.4 0.0089 1.00 0.0132 1.00 0.2266 0.63 0.0127 1.00 0.0127 1.00 0.0132 1.00
CVE-2018-18484 uncontrolled-recursion 0.2 1 4.5 0.2 8 T/O 1.4 0.0132 1.00 0.0132 1.00 0.2266 0.63 0.0105 1.00 0.0132 1.00 0.0132 1.00
CVE-2018-18700 uncontrolled-recursion 0.2 1.2 4.6 0.3 12.6 T/O 1.4 0.0105 1.00 0.0105 1.00 0.0929 0.75 0.0105 1.00 0.0105 1.00 0.0105 1.00
nm CVE-2018-12641 uncontrolled-recursion 2.6 19.1 12.6 12.2 T/O T/O 12.8 0.0132 1.00 0.0151 1.00 0.0561 0.88 0.0105 1.00 0.0105 1.00 0.0331 0.88
CVE-2018-17985 uncontrolled-recursion 10.4 18.2 11.9 T/O T/O T/O 13.3 0.0281 0.81 0.4426 0.56 0.0105 1.00 0.0105 1.00 0.0105 1.00 0.3315 0.63
CVE-2018-18484 uncontrolled-recursion 9.9 16.4 17.1 T/O T/O T/O 14 0.0448 0.84 0.0448 0.84 0.0105 1.00 0.0105 1.00 0.0105 1.00 0.4422 0.75
CVE-2018-18700 uncontrolled-recursion 9.6 14.9 17.8 T/O T/O T/O T/O 0.3315 0.63 0.0461 0.88 0.0105 1.00 0.0105 1.00 0.0105 1.00 0.0105 1.00
CVE-2018-18701 uncontrolled-recursion 13.9 T/O T/O T/O T/O T/O T/O 0.0019 1.00 0.0019 1.00 0.0019 1.00 0.0019 1.00 0.0019 1.00 0.0019 1.00
CVE-2019-9070 uncontrolled-recursion 18.4 15.6 13.9 T/O T/O T/O 15.8 0.4419 0.56 0.4419 0.44 0.0105 1.00 0.0101 1.00 0.0101 1.00 0.4419 0.56
CVE-2019-9071 uncontrolled-recursion 12.4 T/O 14 T/O T/O T/O T/O 0.0344 0.88 0.2352 0.69 0.0344 0.88 0.0105 0.88 0.0344 1.00 0.0344 0.88
nasm CVE-2019-6290, (CVE-2018-1000886) uncontrolled-recursion 0.9 T/O 19 9 T/O T/O 17.6 0.0105 1.00 0.0152 1.00 0.0152 1.00 0.0105 1.00 0.0105 1.00 0.0147 1.00
CVE-2019-6291 uncontrolled-recursion 1.5 9 14 8.7 T/O T/O 7.5 0.0303 0.94 0.0152 1.00 0.0152 1.00 0.0105 1.00 0.0105 1.00 0.0152 1.00
flex CVE-2019-6293 uncontrolled-recursion 5.4 T/O T/O T/O T/O T/O T/O 0.0036 1.00 0.0036 1.00 0.0036 1.00 0.0036 1.00 0.0036 1.00 0.0036 1.00
prase CVE-2019-6292, (CVE-2019-6285) uncontrolled-recursion 0.4 T/O 18.4 0.9 T/O T/O T/O 0.0101 1.00 0.0127 1.00 0.0956 0.81 0.0101 1.00 0.0101 1.00 0.0101 1.00
CVE-2018-20573, (CVE-2018-20574) uncontrolled-recursion 6.1 T/O T/O 12.4 T/O T/O T/O 0.0334 0.88 0.0334 0.84 0.0843 0.84 0.0334 0.84 0.0105 1.00 0.0334 0.84
sassc CVE-2018-19837 uncontrolled-recursion 1.6 13.3 10.5 1.8 8.5 T/O 5 0.0451 0.88 0.0461 0.88 0.3325 0.63 0.0461 0.88 0.0105 1.00 0.0769 0.81
CVE-2018-20821 uncontrolled-recursion 0.1 5.7 6.5 0.1 9.5 T/O 7.4 0.0105 1.00 0.0105 1.00 nan 0.50 0.0105 1.00 0.0105 1.00 0.0105 1.00
CVE-2018-20822 uncontrolled-recursion 15.6 14.3 19.5 14.6 0.92 11.3 10.5 0.4388 0.50 0.4388 0.56 0.5000 0.47 0.4388 0.56 0.0105 0.00 0.4408 0.44
yara CVE-2017-9438, (CVE-2017-9304) uncontrolled-recursion 0.2 0.9 0.61 4.3 5.3 T/O 0.8 0.0128 1.00 0.0132 1.00 0.0325 0.91 0.0132 1.00 0.0105 1.00 0.0132 1.00
readelf CVE-2017-15996 uncontrolled-memory-allocation 0.2 0.3 0.2 0.5 0.3 T/O 0.3 0.0326 0.86 0.1821 0.68 0.0144 0.92 0.183 0.68 0.0105 1.00 0.0091 0.96
exiv2 CVE-2018-4868 uncontrolled-memory-allocation 0.1 0.1 0.1 0.1 0.1 0.1 0.1 nan 0.50 nan 0.50 nan 0.50 nan 0.50 nan 0.50 nan 0.50
mp42hls CVE-2018-20186 uncontrolled-memory-allocation 0.4 0.4 0.4 0.4 0.4 0.1 0.4 nan 0.50 nan 0.50 nan 0.50 nan 0.50 0.0105 0.00 nan 0.50
CVE-2019-7698, (CVE-2019-6966) uncontrolled-memory-allocation 14.6 T/O T/O T/O T/O 0.5 T/O 0.0190 1.00 0.0190 1.00 0.0190 1.00 0.0190 1.00 0.0105 0.00 0.0190 1.00
listswf CVE-2019-7581, (CVE-2018-7876) uncontrolled-memory-allocation 0.6 0.8 1.4 2 0.4 T/O 1.6 0.2009 0.68 0.0712 0.80 0.0296 0.88 0.2648 0.36 0.0105 1.00 0.0712 0.80
CVE-2019-7582, (CVE-2018-13251) uncontrolled-memory-allocation 0.1 0.1 0.1 0.1 0.1 0.1 0.1 nan 0.50 nan 0.50 nan 0.50 nan 0.50 nan 0.50 nan 0.50
issue#155 uncontrolled-memory-allocation 1.4 1 1.3 1.4 1.2 T/O 1.6 0.1728 0.30 0.2654 0.36 0.338 0.40 0.3766 0.42 0.0105 1.00 0.2654 0.64
opj_decompress CVE-2019-6988 uncontrolled-memory-allocation 7.8 15.1 11.1 T/O T/O T/O 15.3 0.0326 0.86 0.0417 0.84 0.0075 1.00 0.0075 1.00 0.0075 1.00 0.0449 0.81
CVE-2017-12982 uncontrolled-memory-allocation 4.5 11.4 10 T/O 11.9 T/O 10 0.0147 0.72 0.2016 0.60 0.0075 1.00 0.2648 0.64 0.0075 1.00 0.3566 0.50
jasper CVE-2016-8886 uncontrolled-memory-allocation 4.1 17 22.3 T/O 10.3 T/O 18.2 0.0285 0.88 0.0059 1.00 0.0075 1.00 0.5000 0.52 0.0075 1.00 0.0296 0.88
issue#207 uncontrolled-memory-allocation 1.7 2.2 3.6 T/O 2.2 15.9 4 0.338 0.62 0.5 0.68 0.0075 1.00 0.4582 0.68 0.0075 1.00 0.2654 0.64
umber of vulnerabilities revealed 33 26 28 20 17 6 25
Detectability Improvement - +26.92% +17.86% +65.00% +94.12% +450.00% +32.00%
Average Time Use 5.4 11.6 11.6 11.9 14.5 20.3 11.2
Time Improvement - 2.15X 2.15X 2.20X 2.69X 3.76X 2.07X



Memory Leakage


Memory leak bugs are a little different from uncontrolled-recursion and uncontrolled-memory-allocation bugs, because they may not lead to program crashes immediately. Only enough memory is leaked, it would produce Denial-of-Service (DoS) attack, for example, in a long time running programs (e.g., banking service). To evaluate the effectiveness of fuzzers in finding memory leak, we look into the number of total bytes leaked within 24 hours during 6 different state-of-the-art fuzzers.

The table shows the number of total bytes leaked for each fuzzer on different programs. We can see that \toolname shows an obvious advantage over other baseline fuzzers. The number of bytes leaked is improved by from 257% to 49633%, compared to other baseline fuzzers. This is because MemLock tries to maximize each allocation and generates inputs with high memory consumption. When the memory leak happens, those memory-consuming inputs will often cause more-bytes memory leakage. In particular, after manual analysis, we have found that the total leak size of AFL depends too much on the choice of the initial seeds. It lacks the capability of breaking through the limitation caused by the initial seed inputs.



Quality of seeds on Memory Consumption

Since MemLock seeks to generate test inputs that consume more and more memory. In this experiment, we evaluate the test input distribution according to memory consumption for each fuzzer we have tested. A fuzzer that maintains a seed pool with a larger proportion of high memory consumption inputs is considered to have a better chance of detecting memory consumption-related bugs.

The figure shows the input distribution based on memory consumption. In general, we can clearly see that MemLock can generate more seeds with higher memory consumption. This is because the guidance mechanisms in ExMen help to gradually add more and more memory consuming inputs into the seed pool. In particular, for the uncontrolled-recursion bugs (nm, nasm, flex and yara), MemLock generates a large number of inputs that hold more than 30,000 function calls in the call stack, while PerfFuzz generates only a few and AFL/AFLfast can hardly generate inputs that hold more than 10,000 function calls. The pattern is similar for uncontrolled-memory-allocation bugs (readelf, openjpeg, jasper and libming). MemLock can generate a considerable amount of inputs with high memory consumption while the inputs of the other fuzzers concentrate on the low memory consumption region. The results clearly demonstrate the effectiveness of the strategies of MemLock in generating inputs with high memory consumption.


Additional Evaluation for Overhead

To determine the runtime overhead caused by the guidance mechanisms and static analysis required for MemLock, we compare the code coverage, the execution speed and regular crashes for each baseline fuzzer.


Case Study: CVE-2019-6293

Vulnerable code in flex/src/nfa.c

To demonstrate the reason behind MemLock's superiority, we present the case of CVE-2019-6293, where can found by only MemLock in our experiments. It is an uncontrolled-recursion vulnerability in flex, which is a lexical analyzer generator. This CVE is triggered by the mark_beginning_as_normal function making recursive calls to itself in certain scenarios involving lots of '*' characters, resulting in stack-overflow. We can directly look at the snippet code of nfa.c, as shown in the Figure. Intuitively, in some cases, function mark_beginning_as_normal shows the behavior of recursive calls.

As a brief explanation, This function mark each "beginning" state in a machine as being a "normal" states, and the "beginning" states are the epsilon closure of the first state. transchar represents the transition character of NFA state, trans1 is the state that NFA state arrives after accepting tranchar, and trans2 is the state of NFA reachable through epsilon. To make this function call to itself, firstly, the input should be valid and can success to reach mark_beginning_as_normal function at line 3. Second, the execution should get into the true branch of the if condition on line 10. Third, to reach line 15, this requires the current state to reach another state through epsilon. To satisfy these three requirements, rules with many '*' characters need to be constructed in the input. AFL discard those input that can make mark_beginning_as_normal functions to call itself multiple times, as those inputs do not increase coverage. We can see the peak length of call stack in the Figure. AFL does not retain any seed over 5000 lengths. Comparing to AFL, MemLock intentionally keeps seeds that increase the peak length of call stack, and finally triggering stack-overflow. This explains the reason why MemLock can found the vulnerability, while other fuzzers can not detect it in all 5 runs.

More interestingly, MemLock takes only 5.4 hours on average to discover this vulnerability (CVE-2019-6293), while other fuzzers all fail. We investigate MemLock's mutation history and identify a key mutation step. After a large number of mutation for initial seed, the execution successfully entry the mark_beginning_as_normal function. Once an input causes a recursive call to itself many times, MemLock update such input in Seed Queue. It increases the recursive depth of mark_beginning_as_normal function by adding some '*' characters into the input context, through havoc operation. Usually MemLock retains some seeds from different paths to mark_beginning_as_normal function. MemLock retains a seed for each path that can produce the maximum peak length of call stack. If the splice operation take two inputs with a large number of '*' characters, there is a certain probability that the the peak length of call stack can be doubled instantaneously. After 5 hours, MemLock success in generating inputs that make the peak length of call stack exceed 25000, and finally triggering stack-overflow.

Case Study: CVE-2018-17985

Vulnerable code in cp-demangle.c in Binutils v2.31

We illustrate the limitations of existing coverage-based grey-box fuzzing techniques for detecting memory consumption bugs with two examples summarized from real-world vulnerabilities. We use the vulnerability CVE-2018-17985 in the Figure to demonstrate a uncontrolled-recursion bug.

In the Figure, the function cplus_demangle_type recursively calls itself in line 12 when the input contains the character 'P'. The depth of recursion depends on the number of character 'P's in the input. With a sufficiently large recursive depth, the execution would run out of stack memory, causing stack overflow. To trigger a stack overflow, the fuzzer would need to generate inputs containing a large number of character 'P's.

However, existing coverage-based grey-box fuzzers do not have enough awareness about the change in recursive depth and solely use coverage information to retain interesting inputs. Take AFL as an example, it is aware of repeatedly executed CFG edges but only in a coarse manner. To be specific, AFL adopts the concept of "loop bucket" to retain interesting inputs. The loop bucket cannot tell the fine-grained change in recursive depth. Specially, it does not differentiate the change when the recursive depth is greater than $255$. Nevertheless, this number is still very far from causing stack exhaustion, which normally requires tens of thousands of recursive depth.

Therefore, to expose uncontrolled-recursion effectively, grey-box fuzzers need to have precise awareness about the stack memory consumption of the target program when executing an input.

Case Study: CVE-2018-4868

Vulnerable code in jp2image.cpp in Exiv2 v0.26

The Figure demonstrates an uncontrolled-memory-allocation problem in exiv2. At line 11-12, when the program parses a subBox in readMetadata(), a length is extracted from the user inputs. Then the length is fed directly into DataBuf() at line 13. Finally, this value is used as the size of a memory allocation request at line 4. Note that the program does not check the size before allocating memory. By carefully handcrafting the input, an adversary can provide arbitrarily large values for subBox.length, leading to program crash (i.e., std::bad_alloc) or running out of memory.

To trigger this problem, the fuzzer would need to generate inputs with a large subBox.length. For this purpose, the fuzzer needs to collect information about the value of subBox.length to retain the interesting inputs that can incur a large memory consumption.

However, existing coverage-based grey-box fuzzers lack awareness about the value of subBox.length. Therefore, they cannot effectively generate inputs causing subBox.length to become larger. Take AFL as an example, let us assume AFL now holds a seed input a which incurs the subBox.length of 100 and causes the function to enter the while at line 11 and eventually return at line 16. After some mutations, AFL may generate another input b which incurs the subBox.length of 10000 and also causes the function to enter the while at line 11 and return at line 16. We can clearly see that comparing with a, b consumes much more memory and is closer to running out of memory. However, AFL will discard input $b$ and will not retain it as a seed because b does not bring new branch coverage. Consequently, AFL cannot detect this uncontrolled-memory-allocation problem effectively.

Therefore, to expose uncontrolled-memory-allocation effectively, grey-box fuzzers also need to have precise awareness about the amount of consumed heap memory of the target program when executing an input.

Video of Presentation

Video of Tool Demo