Fuzz Testing

What is fuzz testing?

Fuzz testing, or fuzzing, is an automated software testing technique that feeds large amounts of random input data into a system to expose flaws and vulnerabilities. A fuzzing tool generates the inputs, monitors the system, and logs errors such as crashes and data leaks. Input data can be random characters, malformed data, or extreme values.
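
As a rough illustration of these input kinds, the following Python sketch mixes extreme values with uniformly random ones and produces random byte strings. The names `random_uint256`, `random_bytes`, `EDGE_VALUES`, and the `edge_bias` parameter are hypothetical and chosen for this example; real fuzzers use their own, more sophisticated generation strategies.

```python
import random

UINT256_MAX = 2**256 - 1

# Boundary values that often trigger overflow or off-by-one bugs (illustrative choice).
EDGE_VALUES = [0, 1, 2**128, UINT256_MAX - 1, UINT256_MAX]


def random_uint256(rng, edge_bias=0.2):
    """Return an extreme value with probability edge_bias, else a uniform one."""
    if rng.random() < edge_bias:
        return rng.choice(EDGE_VALUES)
    return rng.randint(0, UINT256_MAX)


def random_bytes(rng, max_len=64):
    """Return random, possibly malformed, byte input of random length."""
    length = rng.randint(0, max_len)
    return bytes(rng.randrange(256) for _ in range(length))


rng = random.Random(0)  # fixed seed so a failing input can be reproduced
samples = [random_uint256(rng) for _ in range(1000)]
payload = random_bytes(rng)
```

Seeding the generator is a design choice that makes any failure reproducible, which matters when a crash has to be investigated later.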

The purpose of fuzz testing is to identify potential security, performance, or quality issues by observing how the system reacts to inputs.

How does fuzz testing work?

In traditional application development, fuzz testing focuses on deliberately making an application crash or behave unexpectedly in order to expose vulnerabilities. In smart contract development, fuzz testing does this too, but it also tests a contract’s logic and verifies that the contract maintains predefined properties (invariants). Traditional applications can have invariant tests as well, but they are especially critical for smart contracts, because violating an invariant could have catastrophic consequences for the financial wellbeing of users.

Importance of fuzz testing in smart contract development

Fuzz testing helps developers and security researchers strengthen the security of systems before vulnerabilities can be exploited.

Benefits of fuzz testing include:

Bug detection: Uncover vulnerabilities throughout the development process, saving time and resources.
Improved security: By identifying potential points of attack, developers can strengthen the security of smart contracts.
Performance optimization: Reveal performance bottlenecks.
Improved reliability: Verify that smart contracts, dApps, and other blockchain components like oracles and consensus mechanisms remain performant under abnormal, system-stressing conditions.

Fuzz run

Fuzz testing involves sending random inputs to a smart contract many times and checking that its invariants (properties) still hold. Each iteration involves:

Sending random data
Checking to see if the invariant/property holds
Resetting the contract state

Performing these three steps is considered one “fuzz run.” Typically, fuzz testing involves performing many fuzz runs. In most fuzzing systems, developers choose the number of “fuzz runs” to perform before they feel confident an invariant holds. Find more information below in the “how many fuzz runs are enough” section.
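
The three steps can be sketched in Python against a toy model. `ToyToken`, `invariant_holds`, and `fuzz` are hypothetical names invented for this illustration, and the model is only a stand-in for a deployed contract, not how any particular fuzzing tool is implemented. The reset is done by creating a fresh instance at the start of each run, which is equivalent to resetting at the end of the previous one.

```python
import random


class ToyToken:
    """Hypothetical stand-in for a contract; not a real on-chain API."""

    def __init__(self):
        self.total_supply = 0
        self.balances = {}

    def mint(self, account, amount):
        self.balances[account] = self.balances.get(account, 0) + amount
        self.total_supply += amount


def invariant_holds(token):
    # Property under test: total supply equals the sum of all balances.
    return token.total_supply == sum(token.balances.values())


def fuzz(runs, seed=0):
    """Perform `runs` fuzz runs; raise AssertionError on the first violation."""
    rng = random.Random(seed)
    for run in range(runs):
        token = ToyToken()               # reset: every run starts from fresh state
        account = rng.randrange(10)
        amount = rng.randrange(2**64)
        token.mint(account, amount)      # send random data
        # check the invariant, reporting the inputs that broke it
        assert invariant_holds(token), (run, account, amount)
    return runs


completed = fuzz(runs=256)  # the run count here is an arbitrary choice
```

Including the failing inputs in the assertion message means a violation can be replayed as an ordinary unit test.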

Types of fuzz testing for smart contracts

Stateless and stateful fuzz testing are the two most common approaches used to assess how smart contracts handle inputs and maintain reliability. They differ in whether the contract’s state is kept or reset between inputs during testing.

Stateless fuzz testing

Stateless fuzz testing sends random data to a function or contract and discards the results of previous tests, so each fuzz run is independent of earlier inputs and outputs. You can think of it as follows:

Random data sent to a contract in a transaction
Invariant checked
Contract state reset
Repeat

It’s called “stateless” fuzz testing because the contract’s state is reset after each transaction.

Stateless fuzzing can reveal straightforward, shallow issues like input validation errors or incorrect function behavior in isolation. However, it may miss complex issues that depend on state changes.

Stateful fuzz testing

In stateful fuzz testing, a fuzz run continues through multiple rounds of random data while the contract’s state persists between them. The number of times random data is sent to a smart contract within one run is called the “depth” of the fuzz run. If 200 separate transactions are sent in a single fuzz run, that stateful fuzz test is said to have a depth of 200.

A stateful fuzz run with a depth of 3 might look like:

Random data sent to a contract in a transaction
Invariant checked
Random data sent to a contract in a transaction
Invariant checked
Random data sent to a contract in a transaction
Invariant checked
Contract state reset

This whole sequence is then repeated for as many fuzz runs as the tester chooses.

Stateful fuzz testing helps identify problems that result from actions happening in a particular order. Examples include a contract that fails to maintain balances correctly after multiple transfers, issues caused by having too many or too few tokens, and reentrancy-style attacks, in which a contract function is called again before its previous execution has finished.
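
To illustrate why depth matters, here is a minimal Python sketch with a deliberately buggy toy model. `BuggyVault`, `fuzz_run`, and `count_failures` are hypothetical names made up for this example. The bug only shows up when a deposit is followed by a withdrawal, so a depth of 1 (equivalent to stateless fuzzing) cannot reach it.

```python
import random


class BuggyVault:
    """Hypothetical toy contract with a bug that needs two calls to trigger."""

    def __init__(self):
        self.balances = {}
        self.total = 0

    def deposit(self, user, amount):
        self.balances[user] = self.balances.get(user, 0) + amount
        self.total += amount

    def withdraw(self, user, amount):
        if amount > self.balances.get(user, 0):
            return  # models a revert: state is left unchanged
        self.balances[user] -= amount
        # BUG (deliberate): self.total is never decreased.


def invariant_holds(vault):
    return vault.total == sum(vault.balances.values())


def fuzz_run(rng, depth):
    """One fuzz run of `depth` random calls on persistent state."""
    vault = BuggyVault()  # state is reset only between runs
    for _ in range(depth):
        action = rng.choice([vault.deposit, vault.withdraw])
        action(rng.randrange(3), rng.randrange(1, 100))
        if not invariant_holds(vault):
            return False
    return True


def count_failures(runs, depth, seed=0):
    rng = random.Random(seed)
    return sum(not fuzz_run(rng, depth) for _ in range(runs))


stateless_failures = count_failures(runs=500, depth=1)
stateful_failures = count_failures(runs=500, depth=20)
```

With a depth of 1, every run makes a single call on an empty vault, so the invariant can never break. With a depth of 20, the same fuzzer reaches the deposit-then-withdraw sequence that exposes the bug.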

Fuzz testing tools for smart contracts

Several tools help security researchers perform fuzz tests.

Echidna (Solidity)
Foundry (Solidity)
Medusa (Solidity)
Hypothesis (Python, Vyper, Solidity)

How many fuzz runs are enough?

Fuzzers are constantly looking for “new fuzz states” they haven’t seen before (see below). Fuzzers such as Echidna and Medusa report the number of states explored. For best results, run the fuzzer until the number of states explored stops increasing, then set a 24-hour timer. If the number of states increases again, reset the 24-hour timer. Stopping only after a full 24 hours with no new states gives reasonable confidence that the contract’s states have been explored as thoroughly as is practical.
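
The stopping rule can be expressed as a small Python helper. `PlateauTimer` and `observe` are hypothetical names for this sketch, and how the explored-state count is read from a given fuzzer is tool-specific and not shown here; the readings are supplied by hand.

```python
PLATEAU_SECONDS = 24 * 60 * 60  # the 24-hour timer described above


class PlateauTimer:
    """Tracks how long the explored-state count has gone without increasing."""

    def __init__(self, plateau_seconds=PLATEAU_SECONDS):
        self.plateau_seconds = plateau_seconds
        self.best_count = -1        # so the first reading always starts the timer
        self.last_increase_at = None

    def observe(self, now, states_explored):
        """Record a reading; return True once it is reasonable to stop."""
        if states_explored > self.best_count:
            self.best_count = states_explored
            self.last_increase_at = now  # new state found: reset the timer
        return now - self.last_increase_at >= self.plateau_seconds


HOUR = 3600
timer = PlateauTimer()
# (time in seconds, states explored) pairs, invented for illustration
readings = [(0, 120), (1 * HOUR, 450), (10 * HOUR, 450),
            (20 * HOUR, 451), (30 * HOUR, 451), (44 * HOUR, 451)]
decisions = [timer.observe(t, n) for t, n in readings]
```

In this made-up trace, the new state found at hour 20 restarts the timer, so the rule only signals a stop at hour 44.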

“State” in fuzzing vs “state” in a smart contract

A "state" in fuzzing typically refers to a unique, observable condition of the program being tested, characterized by:

The specific code path or execution flow being followed.
The program's interactions with external systems or resources.
The program's output or behavior in response to inputs.
Any new or unique error conditions or exception handling paths triggered.

Consider a simple conditional:

if (a) {
  // State A
  do_stuff()
} else {
  // State B
  something_else()
}

Here, there are two “branches” or “code paths” that execution can follow. Each one, when it is first executed, would be considered a new “state” by the fuzzer. This is different from “storage” or “contract state.”

uint256 a = 1; // storage variable, initially 1
a = 2;         // a later assignment overwrites the stored value

In the above, there is a storage variable whose value changes. This is not what is meant by different fuzzer states. States in a fuzzer are about a different set of code being executed or a different code path being followed. “Contract states,” on the other hand, are about the storage values in a contract.

It's worth noting that the exact definition of a "state" can vary somewhat depending on the specific fuzzing tool or technique being used, as different fuzzers may track and categorize program behaviors in slightly different ways.
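
The difference can be made concrete with a small Python sketch. The function `set_value` and the manual `visited` set are hypothetical: real fuzzers typically track coverage through instrumentation rather than explicit bookkeeping like this.

```python
def set_value(storage, a, visited):
    """Toy function with two code paths but many possible storage values."""
    if a % 2 == 0:
        visited.add("even-branch")  # one fuzzer state
        storage["a"] = a
    else:
        visited.add("odd-branch")   # a second fuzzer state
        storage["a"] = a + 1


visited_paths = set()   # what a fuzzer would count as "states"
storage_values = set()  # distinct contract-state values seen
storage = {"a": 0}

for a in range(1000):
    set_value(storage, a, visited_paths)
    storage_values.add(storage["a"])

path_count = len(visited_paths)
value_count = len(storage_values)
```

After a thousand inputs the storage variable has taken hundreds of distinct values, yet only two code paths exist, so under this definition the fuzzer stops seeing new states after the first two inputs.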

Challenges of smart contract fuzz testing

Fuzz testing also has challenges:

Complexity: Fuzzing complex systems effectively requires a deep understanding of the underlying system. It is sometimes necessary to split the system into smaller modules, test the individual components first, and then test the complete system.
Completeness: Because inputs are randomly sampled and the space of possible inputs is enormous, it is computationally infeasible to explore every possible state. A passing fuzz suite therefore never proves with 100% certainty that an invariant can’t be broken.
Resource intensive: Fuzz testing can be resource intensive, especially with systems that involve a large number of transactions or computations. To overcome these challenges, researchers can optimize fuzzing algorithms, use cloud-based or distributed testing environments, and use incremental testing techniques.
False positives/negatives: Not all issues detected by fuzz testing are true vulnerabilities, and real issues might go undetected if the fuzzing process is not thorough enough. False positives occur when the tool flags issues that seem problematic but don’t pose a real security risk. This could be due to overly broad test cases, misinterpretation of results, tool limitations, or edge cases that don't occur in real-world scenarios. False negatives occur when real vulnerabilities are not detected due to inadequate test coverage, complex logic, or configuration limitations. To reduce both, researchers employ strategies such as enhanced test coverage, iterative testing, manual review and verification, and tool calibration.
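
To give a sense of scale for the completeness challenge, this Python sketch estimates the chance that purely uniform random sampling ever produces one specific input value. The function name `hit_probability` is hypothetical, and the uniform-sampling assumption is a simplification: real fuzzers bias their inputs, so this is a back-of-the-envelope bound rather than a model of any tool.

```python
import math


def hit_probability(space_size, attempts):
    """Chance that at least one of `attempts` uniform draws equals one specific value.

    Assumes space_size > 1 and independent, uniform draws.
    """
    # 1 - (1 - 1/space_size) ** attempts, computed with log1p/expm1 so that
    # tiny probabilities are not rounded to zero by floating point.
    return -math.expm1(attempts * math.log1p(-1.0 / space_size))


p_uint8 = hit_probability(2**8, 256)          # small space, 256 attempts
p_uint256 = hit_probability(2**256, 10**9)    # uint256 space, a billion attempts
```

For an 8-bit input, a few hundred attempts already make a hit more likely than not. For a 256-bit input, even a billion attempts leave the probability vanishingly small, which is why a bug that depends on one exact value can survive a long fuzzing campaign.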
