Written by

Patrick Collins

Published on

June 26, 2024

How to fix ‘Data location must be memory or calldata’ | Where can the EVM read and write data?

Learn where the EVM can read and write data, what calldata, memory, and storage are, and the best practices to know when writing your Solidity or Vyper smart contracts.

Introduction

You’ve probably seen this error in Solidity:

function doStuff(string stuff) public {
	// The above will not compile, throwing an error saying:
	// TypeError: Data location must be "memory" or "calldata" for parameter in
	// function, but none was given
}

Why do we get that?

What are memory and calldata in Solidity?

And finally, why does this image represent the EVM?

If you want to get a full understanding of what’s going on under the hood, be sure to check out the Cyfrin Updraft Assembly and Formal Verification course, which goes deeper than what we are going to cover here.

The evm.codes site does a great job of keeping up to date with EVM opcodes and what they do.

In order to really understand this article, we recommend you first understand what Bits and Bytes are.

Let’s dive in.

Where can the Ethereum Virtual Machine (EVM) access data?

The EVM can read data from and write data to the following locations:

Stack
Memory
Storage
Transient Storage
Calldata
Code
Return Data

The EVM can write data to, but not read data from, the following locations:

Logs

The EVM can read but not write data from the following locations:

Transaction data
Chain data
Gas data
A few other hyper-specific places

The EVM stack

The stack in the EVM world is a data structure where items can only be added or removed from the top. It has two main operations:

push: Add to the top of the stack
pop: Remove from the top of the stack

In this regard, you can think of the stack like a stack of pancakes.

In Solidity or Vyper, most of the time you create a variable inside a function, you’re placing a value onto the stack under the hood:

uint256 myNumber = 7;

This will place a temporary variable on the stack with a PUSHX opcode, where the number 7 is “pushed” onto the stack.

PUSH1 0x7 //0x7 is 7 in hex

When the EVM sees this, it automatically transforms 7 into its 32-byte form, padded with a bunch of leading 0s.

Objects can only be “pushed” onto the stack if they are 32 bytes or smaller. 7 is represented in 32 bytes as:

0x0000000000000000000000000000000000000000000000000000000000000007

The stack currently has a maximum limit of 1024 values, so in our pancake example, “1024 pancakes.” On top of that, opcodes such as DUP and SWAP can only reach the top 16 values. This is why many Solidity developers run into the infamous “stack too deep” error: their Solidity code results in more variables on the stack than the compiler can keep within reach.

The stack is temporary, and objects on it are destroyed after a transaction* is completed (*technically, after the call context ends). This is why a variable you create inside a function in Solidity or Vyper doesn’t persist after the transaction ends: the stack it lived on has been deleted.

function doStuff() public {
	// This variable is added onto the stack when someone calls `doStuff`
	// Since it's on the stack, after the function call ends, or
	// the transaction ends, the stack is deleted, therefore the
	// variable 7 is deleted as well
	uint256 myNumber = 7;
}

The stack is the cheapest place (gas-wise) to store and retrieve data, and it’s the only place in the EVM where operations can be done “on” the data, for example, addition, subtraction, multiplication, left shifts, etc. However, it’s not always the best place to store data.

*Note: The stack is technically deleted when the call context ends, but you can read more about that in evm.codes. For now, just assume that when the transaction ends, the stack is destroyed. We will explain the call context in the transient storage section of this article.
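To make the idea of doing operations “on” the stack a bit more concrete, here is a minimal sketch using inline assembly. The contract and function names are made up for illustration, and the opcode sequence in the comment is only roughly what a compiler would emit.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, for illustration only.
contract StackSketch {
    function addOnStack() external pure returns (uint256 result) {
        assembly {
            // Roughly: PUSH1 0x03, PUSH1 0x07, ADD
            // ADD pops two values off the top of the stack
            // and pushes their sum back onto it.
            result := add(7, 3)
        }
    }
}
```

Calling addOnStack returns 10. Neither 7 nor 3 ever touches memory or storage here; they only live on the stack for the duration of the call.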

EVM Memory

Now, the next temporary data location is memory. Memory, like the stack, is deleted after the transaction ends. There are times when the stack isn’t a good fit for the data we have, so instead we use memory.

Let's take a look at the following Solidity code:

uint8[3] memory myArray = [1,2,3];

An array, for example, wouldn’t fit onto the stack. For arrays, we need to store each of the elements (and, for dynamically sized arrays, the array length). So, under the hood, we call the MSTORE opcode to store data in the memory data structure of the EVM. You can read from your memory later by calling the MLOAD opcode.

You’ll notice that in order to store anything in a memory array, we need to first put the objects onto the stack. This is one of the reasons why storing data in memory is a bit more expensive gas-wise than storing data on the stack. There are additional reasons, including memory expansion gas costs, which you can learn more about via this link.

PUSH1 0x1
PUSH0       // Pushes 0 onto the stack
MSTORE      // This results in 0x1 being stored at location 0x0 in memory

Memory, like the stack, is deleted after the call context ends. (Just assume “call context” means the transaction if that’s confusing. It’s a slight lie, but for learning, it’s ok.)
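The same MSTORE / MLOAD round trip can be sketched from Solidity with inline assembly. This is an illustrative example with made-up names. It assumes Solidity’s convention of keeping a “free memory pointer” at memory offset 0x40, which is a compiler convention rather than an EVM rule.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, for illustration only.
contract MemorySketch {
    function storeAndLoad() external pure returns (uint256 loaded) {
        assembly {
            // Ask Solidity where unused memory starts
            let ptr := mload(0x40)
            // MSTORE: write the 32-byte word 0x1 at that offset
            mstore(ptr, 1)
            // MLOAD: read the 32-byte word back onto the stack
            loaded := mload(ptr)
        }
    }
}
```

Notice that both the offset and the value have to be on the stack before MSTORE can run, which is the extra work described above.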

Keep all this in mind for when we talk about calldata, since that is where we will discuss why we see the error from the start: Data location must be "memory" or "calldata".

Variables inside functions, like uint256 myNumber = 7, are always set up as a stack variable first, and depending on the compiler, they may also be stored in memory. Variables outside functions, aka “state variables,” are stored in storage.

EVM Storage

Now, unlike memory and the stack, storage persists permanently. When you store data as a state variable, it will be stored permanently. This is why when you create a public variable in Solidity, you are able to “get” that value by calling its getter function. However, creating a variable in a function sets it up as a temporary variable (in either memory or just on the stack).

contract MyContract {
	uint256 myStorageVar = 7;     // This is in storage

	function doStuff() public {
		uint256 myStackVar = 7; // This is on the stack
	}
}

Storing an object into storage uses the same opcode setup as memory, just instead of MSTORE or MLOAD we use SSTORE and SLOAD.

The above code with myStorageVar will likely compile to a string of opcodes that looks like this:

PUSH1 0x7
PUSH0
SSTORE        // This stores the number 7 at storage slot 0

Storing data into storage is the most expensive way (gas-wise) to store data in the EVM. Since we are storing the data permanently, all EVM nodes must have the data persist even after a transaction has ended. Because every node is required to do this “extra work” of storing the data permanently, writing to storage costs more gas.
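As a rough sketch of SSTORE and SLOAD in action, the hypothetical contract below reads and writes the slot of a state variable directly with inline assembly. It assumes myStorageVar is the first state variable, so the compiler places it in slot 0.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, for illustration only.
contract StorageSketch {
    uint256 myStorageVar = 7; // first state variable, so it lives in slot 0

    function readSlot() external view returns (uint256 value) {
        assembly {
            // SLOAD: read the 32-byte word at the variable's slot
            value := sload(myStorageVar.slot)
        }
    }

    function writeSlot(uint256 newValue) external {
        assembly {
            // SSTORE: overwrite the word at the variable's slot
            sstore(myStorageVar.slot, newValue)
        }
    }
}
```

A value written by writeSlot in one transaction is still there when readSlot is called in a later one, which is exactly what the stack and memory cannot offer.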

For the most part, storage is much simpler than memory, the stack, transient storage, and calldata. So let’s get into some of the more interesting places.

EVM Calldata

Now calldata is a little trickier to define since it’s a bit of an overloaded term. When we refer to calldata, we mean one of two things:

The Solidity keyword calldata
The EVM concept calldata

According to evm.codes, the calldata (as an EVM concept) is:

> The calldata region is the data sent to a transaction as part of a smart contract transaction. For example, when creating a contract, calldata would be the constructor code of the new contract. Calldata is immutable, and can be read with instructions CALLDATALOAD, CALLDATASIZE, and CALLDATACOPY.

Whenever we call a function, we send data to a contract in the form of calldata. So when the EVM needs to read the data we sent to a contract, it reads from the calldata. For example, with Foundry’s cast, I can send a transaction by defining my calldata. Or if I’m sending a transaction from MetaMask, I can see the calldata that is being sent by checking the hex tab.

Figure: A sample calldata example.

This is essentially the same as the Solidity keyword calldata, but we can make the definition a bit simpler when referring to the Solidity calldata keyword. In Solidity, only function parameters can be considered calldata because only functions can be called with calldata.

Calldata cannot be changed once sent in a transaction. It must be stored in another data structure (like the stack, memory, storage, etc) to be manipulated.
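To see the raw calldata from inside a contract, here is a small sketch (contract and function names are hypothetical). It reads the function selector and the first argument straight out of calldata with the CALLDATALOAD and CALLDATASIZE opcodes.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, for illustration only.
contract CalldataSketch {
    function inspect(uint256)
        external
        pure
        returns (bytes4 selector, uint256 firstArg, uint256 size)
    {
        // The first 4 bytes of calldata identify the function being called
        selector = msg.sig;
        assembly {
            // CALLDATALOAD: read 32 bytes starting right after the selector
            firstArg := calldataload(4)
            // CALLDATASIZE: total length of the calldata in bytes
            // (4 + 32 = 36 for a plain call to this function)
            size := calldatasize()
        }
    }
}
```

There is no “calldatastore” counterpart to these opcodes, which is the EVM-level reason calldata is read-only.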

Now that we’ve learned both about calldata and memory, let’s now return to the error we had when we started this article.

function doStuff(string stuff) public {
	// The above will not compile, throwing an error saying:
	// TypeError: Data location must be "memory" or "calldata" for parameter in
	// function, but none was given
}

In our function doStuff, we need to tell the Solidity compiler how it should handle the string stuff object. The string stuff object is a special object in Solidity, a string. Strings are not-so-secretly byte arrays. And since they are arrays, they can be larger than 32 bytes, so they cannot fit on the stack. So we need to tell the Solidity compiler whether the data that is coming in will be stored in memory or calldata.

If memory:

We can manipulate the stuff object (add to the string, save new strings, etc)
We can call the doStuff function with data stored in memory or calldata

If calldata:

We cannot manipulate the stuff object
We can only call the doStuff function with data stored as calldata
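The two options above can be sketched side by side. The function names below are made up for this example; the commented-out line shows the kind of write the compiler rejects for calldata.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, for illustration only.
contract DataLocationSketch {
    // memory: the argument is copied into memory, so we may change our copy
    function doStuffMemory(string memory stuff) public pure returns (string memory) {
        // Allowed because memory is writable.
        // Note: this reverts if `stuff` is an empty string.
        bytes(stuff)[0] = "X";
        return stuff;
    }

    // calldata: the argument is read in place and cannot be changed
    function doStuffCalldata(string calldata stuff) public pure returns (string memory) {
        // bytes(stuff)[0] = "X"; // would not compile: calldata is read-only
        return stuff; // copied calldata -> memory for the return value
    }
}
```

If a function only reads its argument, calldata avoids the copy into memory; if it needs to modify the argument, memory is the one to pick.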

Whenever we call a function from outside the blockchain (for example, calling transfer on an ERC20 contract and signing with your MetaMask or other browser wallet), that data is always sent as calldata. However, if a contract calls another function, it can pass the data as either calldata or memory.

Solidity is smart enough to convert calldata -> memory by storing the calldata into memory, but it cannot move data in memory into calldata. The calldata is part of the original transaction, and we cannot edit the original transaction data.

// Let's initially call this function from Metamask / A browser wallet
function calledFromMetamask(uint256[] calldata myArray) public {
	// calldata -> calldata
	calledFromFunctionCalldata(myArray);
	// calldata -> memory
	calledFromFunctionMemory(myArray);
}

function calledFromFunctionCalldata(uint256[] calldata myArray) public {
	// calldata -> calldata -> memory
	calledFromFunctionMemory(myArray);
}

function calledFromFunctionMemory(uint256[] memory myArray) public {
	// Uncommenting the line below will not compile because we have 
	// converted myArray from calldata -> memory

	// calledFromFunctionCalldata(myArray);
}

The distinction is important because it involves many tradeoffs for gas, and tells the compiler where to look for data.

Calldata is deleted once the transaction or calling context has ended, and can be considered a temporary data location like the stack and memory.

EVM Transient Storage

As of EIP-1153, there is an additional location that acts like storage but is temporary. Unlike the stack, memory, and calldata, which are deleted after the calling context ends, transient storage is deleted only after the whole transaction ends. Let’s learn what the “calling context” or “call context” is to understand this.

What is Call Context?

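Whenever a contract is called in a transaction, whether by the transaction itself or by an external call from another contract, a new “call context” is created. (Internal function calls are jumps within the same bytecode, so they stay in the caller’s context.) In the image above, you can see we’ve highlighted the area that is considered a “call context” which includes: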

Program Counter
Available gas
Stack
Memory

Essentially, these are isolated environments for code to store and manipulate data. It’s also why two functions can’t access each other’s local variables.

In the example below, it’s why these two functions can have the exact same variable name, but they will never overlap. Whenever you call doStuff or doMoreStuff they will each get their own call context, with their own stack, memory, calldata, etc.

function doStuff() public {
	uint256 myNumber = 7;
}

function doMoreStuff() public {
	uint256 myNumber = 8;
}

A call context is ended when a RETURN, STOP, INVALID, or REVERT opcode is reached, or when a transaction reverts.

Understanding this, we can now go back to transient storage. As of Solidity version 0.8.24, we can use the TSTORE and TLOAD opcodes in inline assembly (Yul).

modifier nonreentrant(bytes32 key) {
	assembly {
		if tload(key) { revert(0, 0) }
		tstore(key, 1)
	}
	_;
	assembly {
		tstore(key, 0)
	}
}

The TSTORE and TLOAD opcodes work exactly like the SSTORE and SLOAD storage opcodes, but instead of storing the data permanently, the data is stored for the entire duration of the transaction and deleted once the transaction ends.
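Here is a minimal sketch of that lifetime, using a hypothetical contract. It assumes a compiler of version 0.8.24 or later targeting the Cancun EVM version (on 0.8.24 itself that has to be selected explicitly), and slot 0 is an arbitrary choice for the example.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

// Hypothetical contract, for illustration only.
contract TransientSketch {
    function setAndGet(uint256 value) external returns (uint256 loaded) {
        assembly {
            // TSTORE: write `value` to transient slot 0
            tstore(0, value)
            // TLOAD: still readable anywhere later in the same transaction
            loaded := tload(0)
        }
    }

    function get() external view returns (uint256 loaded) {
        assembly {
            // Called in a separate transaction, this reads 0:
            // the transient slot was wiped when the earlier transaction ended
            loaded := tload(0)
        }
    }
}
```

If setAndGet and get were called within the same transaction (for example, from another contract), get would see the value, because transient storage outlives individual call contexts.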

Later in this article, there is a cheat sheet table to help illustrate the differences.

Code

One of the final places we can store data is in the contract itself, aka in the “code” location of the EVM. This is pretty straightforward, and it’s why variables labeled constant or immutable in Solidity cannot be changed.

uint256 constant MY_VAR = 7;
uint256 immutable i_myVar = 7;

Immutable and constant variables are stored directly in the contract code, which can never be changed.* According to the Solidity docs:

> The contract creation code generated by the compiler will modify the contract’s runtime code before it is returned by replacing all references to immutables with the values assigned to them.

This is why these values cannot be changed: they are stored in the contract bytecode itself.

*A contract can only be deleted with the SELFDESTRUCT opcode, and then that contract can later be replaced. This opcode, however, is surrounded by controversy and is slated for removal at some point.
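As a small sketch of the difference between the two keywords (all names here are hypothetical): a constant must be known at compile time, while an immutable can be assigned once in the constructor before the runtime code is finalized.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, for illustration only.
contract CodeSketch {
    // Known at compile time and baked into the bytecode
    uint256 constant MY_VAR = 7;
    // Assigned once during deployment, then baked into the runtime bytecode
    uint256 immutable i_deployTime;

    constructor() {
        i_deployTime = block.timestamp;
    }

    function values() external view returns (uint256, uint256) {
        // Neither read uses SLOAD: both values come from the contract code
        return (MY_VAR, i_deployTime);
    }

    // function change() external { i_deployTime = 1; } // would not compile
}
```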

EVM Data Structure cheat sheet

Data Structure	Temporary?	Can be modified?	When is it deleted?	Data size?
Storage	No	Yes	It's not	32 bytes per SSTORE up to almost any amount
Memory	Yes	Yes	After call context or transaction ends	32 bytes per MSTORE up to almost any amount
Stack	Yes	Yes	After call context or transaction ends	32 bytes a slot up to 1024 values
Transient Storage	Yes	Yes	After transaction ends	32 bytes per TSTORE up to almost any amount
Calldata	Yes	No	After call context or transaction ends	Up to almost any amount
Code	No	No	It's not*	Depends on the contract size limit of the specific EVM implementation

*Only when the SELFDESTRUCT opcode is called

Return data

One of the final places the EVM can read and write to is the return data location. According to evm.codes:

> The return data is the way a smart contract can return a value after a call. It can be set by contract calls through the RETURN and REVERT instructions and can be read by the calling contract with RETURNDATASIZE and RETURNDATACOPY.

Essentially, whenever you see the return keyword in a function that was called from outside the contract, the compiler emits the RETURN opcode to store data into the return data location.

function doStuff() public returns(uint256) {
	return uint256(7);
}

This can be read by other contracts that call this function with the CALL, STATICCALL, DELEGATECALL, CREATE, and a few other opcodes. The return data is a bit odd: calling the RETURN opcode ends the current call context and passes the resulting data to the parent calling context as return data. The data can then be accessed with RETURNDATASIZE and RETURNDATACOPY. There is only one return data, and calling these opcodes will return the return data of the most recently ended call context. Return data does not persist and can be easily overwritten by a sub context calling the RETURN opcode.

This means that, in a call context, there can only be one piece of return data at a time. However, this data can be larger than 32 bytes, so you can fit whole arrays and other large variables into the return data.
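To illustrate, the sketch below makes a low-level STATICCALL and then looks at the return data from the caller’s side. Both contracts and their names are made up for this example.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical callee, for illustration only.
contract Callee {
    function doStuff() external pure returns (uint256) {
        return 7;
    }
}

// Hypothetical caller, for illustration only.
contract ReturnDataSketch {
    function readReturnData(address callee)
        external
        view
        returns (uint256 size, uint256 value)
    {
        // The callee's RETURN ends its call context and fills our return data.
        // Solidity copies it into `data` (in memory) for us.
        (bool ok, bytes memory data) =
            callee.staticcall(abi.encodeWithSelector(Callee.doStuff.selector));
        assembly {
            // RETURNDATASIZE: length of the most recent sub-call's return data
            size := returndatasize()
        }
        require(ok, "call failed");
        value = abi.decode(data, (uint256));
    }
}
```

When callee is a deployed Callee, size is 32 (one ABI-encoded uint256) and value is 7. Making another external call before reading would replace the return data with that call’s result.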

Write, but not read

Logs

Logs are a data location in the EVM that contracts can only write to, never read from. In Solidity, this is done with the emit keyword.

event myEvent();
emit myEvent();
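A slightly fuller sketch, with hypothetical names, shows an event that carries data. Under the hood, emit compiles to one of the LOG0 through LOG4 opcodes, and parameters marked indexed become log topics.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, for illustration only.
contract LogSketch {
    // `sender` is indexed, so it is stored as a topic;
    // `number` goes into the log's data section
    event NumberStored(address indexed sender, uint256 number);

    function store(uint256 number) external {
        // Written to the transaction's logs. No contract, including
        // this one, can read it back on-chain.
        emit NumberStored(msg.sender, number);
    }
}
```

Off-chain tools such as indexers and front ends are the ones that read these logs.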

Read, but not write

In the EVM, there are many places from which data can only be read. You can see examples of this in Solidity:

msg.sender; 
block.chainid;
blobhash(0);
gasleft();

And many more globally available variables.
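As a quick sketch, a hypothetical view function can gather a few of these read-only values in one call. Each one maps to its own opcode (CALLER, CHAINID, NUMBER, and GAS respectively), and none has a matching “write” opcode.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, for illustration only.
contract EnvSketch {
    function snapshot()
        external
        view
        returns (address sender, uint256 chainId, uint256 blockNumber, uint256 gasRemaining)
    {
        sender = msg.sender;        // transaction data: who called us
        chainId = block.chainid;    // chain data: which chain we are on
        blockNumber = block.number; // chain data: the current block
        gasRemaining = gasleft();   // gas data: gas left at this point
    }
}
```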

Summary

Hopefully, with this information you’ll understand the inner workings of the EVM better, allowing you to make more informed decisions!

And crucially, you now know why you see those common “stack too deep” and “must use calldata or memory” compiler errors in Solidity!

Please note that the EVM is constantly being improved, and this information was accurate as of 06/03/2024. If something seems out of date, please ping @cyfrinupdraft or @patrickalphac on Twitter.
