Introduction
You’ve probably seen this error in Solidity: Data location must be "memory" or "calldata".
Why do we get that?
What do "memory" and "calldata" mean in Solidity?
And finally, why does this image represent the EVM?
If you want to get a full understanding of what’s going on under the hood, be sure to check out the Cyfrin Updraft Assembly and Formal Verification course, which goes deeper than what we are going to cover here.
The evm.codes site does a great job of keeping up to date with EVM opcodes and what they do.
To really understand this article, we recommend that you first understand what bits and bytes are.
Let’s dive in.
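The EVM can read and write data from the following locations: the stack, memory, storage, transient storage, calldata, code, and return data.
The EVM can write but not read data to the following location: logs.
The EVM can read but not write data from the following locations: transaction, chain, and gas data that are exposed as global variables.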
The stack in the EVM world is a data structure where items can only be added or removed from the top. It has two main operations: push, which adds an item to the top, and pop, which removes the item at the top.
In this regard, you can think of the stack like a stack of pancakes.
Most of the time, in Solidity or Vyper, anytime you create a variable, under the hood, you're placing an object onto the stack:
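As a minimal sketch (the contract and function names are hypothetical, chosen only for illustration), a local variable declared inside a function looks like this:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

// Hypothetical contract used only to illustrate a stack variable.
contract StackExample {
    function addSeven(uint256 input) external pure returns (uint256) {
        // A value-type local variable: the compiler typically emits a PUSH
        // opcode that places 7 on the stack as a 32-byte word.
        uint256 myNumber = 7;
        return input + myNumber;
    }
}
```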
This places a temporary variable on the stack with a PUSHX opcode (for example PUSH1 for a one-byte value), where the number 7 is “pushed” onto the stack.
Every stack item is a 32-byte word, so the EVM holds 7 in its 32-byte form, padded with leading zeros.
Objects can only be “pushed” onto the stack if they are 32 bytes or smaller. 7 is represented in 32 bytes as:
0x0000000000000000000000000000000000000000000000000000000000000007
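The stack currently has a maximum limit of 1024 values, so in our pancake example, “1024 pancakes.” The infamous “stack too deep” error that many Solidity developers run into has a tighter cause: EVM opcodes can only reach the 16 topmost stack items (DUP1 to DUP16 and SWAP1 to SWAP16), so when a function has too many local variables and parameters, the compiler cannot reach all of them.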
The stack is temporary, and objects on it are destroyed after a transaction* is completed. This is why when you create a variable in Solidity or Vyper, it doesn’t persist after the transaction ends (*technically the call context). It’s because the stack is deleted.
The stack is the cheapest place (gas-wise) to store and retrieve data, and it’s the only place in the EVM where operations can be done “on” the data, for example, addition, subtraction, multiplication, left shifts, etc. However, it’s not always the best place to store data.
*Note: The stack is technically deleted when the call context ends, but you can read more about that in evm.codes. For now, just assume that when the transaction ends, the stack is destroyed. We will explain the call context in the transient storage section of this article.
Now, the next temporary data location is memory. Memory, like the stack, is deleted after the transaction ends. Sometimes the stack is not a suitable place for data, so we use memory instead.
Let's take a look at the following Solidity code:
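A minimal sketch of the kind of code in question, with hypothetical contract, function, and variable names, creates a small array in memory:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

// Hypothetical contract used only to illustrate a memory array.
contract MemoryExample {
    function sumThree() external pure returns (uint256 total) {
        // The array length and each element live in memory, not on the stack.
        uint256[] memory myArray = new uint256[](3);
        myArray[0] = 1;
        myArray[1] = 2;
        myArray[2] = 3;

        // Each read copies one element from memory back onto the stack.
        for (uint256 i = 0; i < myArray.length; i++) {
            total += myArray[i];
        }
    }
}
```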
An array, for example, wouldn’t fit onto the stack. For arrays, we need to store each of the elements and the array length. So, under the hood, we call the MSTORE opcode to store data in the memory data structure of the EVM. You can read from your memory later by calling the MLOAD opcode.
You’ll notice that to store anything in a memory array, we first need to put the objects onto the stack. This is one of the reasons why storing data in memory is a bit more expensive gas-wise than keeping it on the stack. There are additional reasons, including memory expansion gas costs.
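The round trip between the stack and memory can be sketched in inline assembly. This is only an illustration with hypothetical names; it assumes Solidity's convention that the free memory pointer is kept at position 0x40.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

// Hypothetical contract used only to illustrate MSTORE and MLOAD.
contract MemoryOpcodes {
    function storeAndLoad(uint256 value) external pure returns (uint256 loaded) {
        assembly {
            // Read Solidity's free memory pointer to find unused memory.
            let ptr := mload(0x40)
            // The value sits on the stack first; MSTORE copies it into memory.
            mstore(ptr, value)
            // MLOAD reads the 32-byte word back onto the stack.
            loaded := mload(ptr)
        }
    }
}
```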
Memory, like the stack, is deleted after the call context ends (just assume “call context” means the transaction if that’s confusing. It’s a slight lie, but for learning, it’s ok).
Keep all this in mind for when we talk about calldata, since that is where we will discuss why we see the error from the start of the article: Data location must be "memory" or "calldata".
Variables inside functions, like uint256 myNumber = 7, are always set up as a stack variable first, and depending on the compiler, they may also be stored in memory. Variables outside functions, aka “state variables,” are stored in storage.
Now, unlike memory and the stack, storage is permanent. When you store data as a state variable, it persists after the transaction ends. This is why, when you create a public variable in Solidity, you are able to “get” that value later by calling its getter function. Creating a variable inside a function, however, sets it up as a temporary variable (in either memory or just on the stack).
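The difference between a state variable and a local variable can be sketched as follows. The contract and function names are hypothetical; only the variable name myStorageVar comes from the article.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

// Hypothetical contract used only to contrast storage with temporary data.
contract StorageExample {
    // State variable: lives in storage and persists between transactions.
    // `public` makes the compiler generate a getter named myStorageVar().
    uint256 public myStorageVar;

    function setMyStorageVar(uint256 newValue) external {
        // Writing a state variable compiles to an SSTORE.
        myStorageVar = newValue;
    }

    function doubled() external view returns (uint256) {
        // Reading the state variable compiles to an SLOAD.
        // `temp` itself is a local variable and is gone when the call ends.
        uint256 temp = myStorageVar * 2;
        return temp;
    }
}
```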
Storing an object into storage uses the same opcode setup as memory, just instead of MSTORE or MLOAD we use SSTORE and SLOAD.
The above code with myStorageVar will likely compile to a string of opcodes that looks like this:
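The exact bytecode depends on the compiler version and optimizer settings, so the following is only a hand-written approximation. It uses inline assembly in a hypothetical contract, with the rough opcode sequence noted in comments:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

// Hypothetical contract; the opcode sequences in comments are approximate.
contract StorageOpcodes {
    // First state variable, so it occupies storage slot 0 in this contract.
    uint256 public myStorageVar;

    function storeSeven() external {
        assembly {
            // Roughly: PUSH1 0x07, PUSH1 0x00 (or PUSH0), SSTORE
            sstore(myStorageVar.slot, 7)
        }
    }

    function load() external view returns (uint256 value) {
        assembly {
            // Roughly: PUSH1 0x00 (or PUSH0), SLOAD
            value := sload(myStorageVar.slot)
        }
    }
}
```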
Storing data in storage is the most expensive way (gas-wise) to store data in the EVM. Because the data is stored permanently, every EVM node must keep it even after the transaction has ended. That “extra work” across all nodes is why writing to storage costs so much gas.
For the most part, storage is much simpler than memory, the stack, transient storage, and calldata. So let’s get into some of the more interesting places.
Now calldata is a little trickier to define since it’s a bit of an overloaded term. When we refer to calldata, we mean one of two things: calldata as an EVM concept, or the calldata keyword in Solidity.
According to evm.codes, the calldata (as an EVM concept) is:
> The calldata region is the data sent to a transaction as part of a smart contract transaction. For example, when creating a contract, calldata would be the constructor code of the new contract. Calldata is immutable, and can be read with instructions CALLDATALOAD, CALLDATASIZE, and CALLDATACOPY.
Whenever we call a function, we send data to a contract in the form of calldata. So when the EVM needs to read the data we sent to a contract, it reads from the calldata. For example, in foundry / cast, I can send a transaction by defining my calldata. Or if I’m sending a transaction from Metamask, I can see the calldata that is being sent by checking the hex tab.
A sample calldata example.
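To make the shape of calldata concrete, here is a small sketch with hypothetical contract and function names. It builds the calldata for an ERC20-style transfer(address,uint256) call and also returns the raw calldata of the current call:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

// Hypothetical contract used only to show what calldata looks like.
contract CalldataExample {
    // Builds calldata for transfer(address,uint256): the 4-byte function
    // selector (0xa9059cbb) followed by each argument padded to 32 bytes,
    // for 68 bytes in total.
    function buildTransferCalldata(address to, uint256 amount)
        external
        pure
        returns (bytes memory)
    {
        return abi.encodeWithSignature("transfer(address,uint256)", to, amount);
    }

    // msg.data exposes the raw calldata that was sent with the current call.
    function echoCalldata() external pure returns (bytes memory) {
        return msg.data;
    }
}
```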
This is essentially the same as the Solidity keyword calldata, but we can make the definition a bit simpler when referring to the Solidity calldata keyword. In Solidity, only function parameters can be considered calldata because only functions can be called with calldata.
Calldata cannot be changed once sent in a transaction. It must be stored in another data structure (like the stack, memory, storage, etc) to be manipulated.
Now that we’ve learned both about calldata and memory, let’s now return to the error we had when we started this article.
In our function doStuff, we need to tell the Solidity compiler how it should handle the string stuff parameter. A string is a special type in Solidity: strings are not-so-secretly byte arrays. And since they are arrays, they can be larger than 32 bytes, so they cannot fit on the stack. So we need to tell the Solidity compiler whether the incoming data will live in memory or calldata.
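If memory: the compiler copies the string out of calldata into memory, so the function can modify it.
If calldata: the function reads the string directly from calldata, which avoids the copy but means the string cannot be modified.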
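Both options can be sketched side by side. The contract and function names below are hypothetical variations on doStuff, used only to illustrate the two data locations:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

// Hypothetical contract contrasting the memory and calldata keywords.
contract DataLocationExample {
    // Declaring the parameter without a location, as in
    // `function doStuff(string stuff) public`, is rejected by the compiler
    // with the "Data location must be ..." error.

    // memory: the argument is copied into memory, so it can be modified.
    function doStuffMemory(string memory stuff) public pure returns (string memory) {
        bytes memory raw = bytes(stuff);
        if (raw.length > 0) {
            raw[0] = "X"; // allowed because memory is writable
        }
        return stuff;
    }

    // calldata: the argument is read in place and cannot be modified.
    function doStuffCalldata(string calldata stuff) external pure returns (uint256) {
        // bytes(stuff)[0] = "X"; // would not compile: calldata is read-only
        return bytes(stuff).length;
    }

    // Passing calldata to a memory parameter makes the compiler copy it.
    function copyToMemory(string calldata stuff) external pure returns (string memory) {
        return doStuffMemory(stuff);
    }
}
```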
Whenever we call a function from outside the blockchain (for example, calling transfer on an ERC20 contract and signing with your Metamask or other browser wallet), that data is always sent as calldata. However, when a contract calls another function, the parameter can be passed as either calldata or memory.
Solidity is smart enough to convert calldata -> memory by storing the calldata into memory, but it cannot move data in memory into calldata. The calldata is part of the original transaction, and we cannot edit the original transaction data.
The distinction is important because it involves many tradeoffs for gas, and tells the compiler where to look for data.
Calldata is deleted once the transaction or calling context has ended, and can be considered a temporary data location like the stack and memory.
As of EIP-1153, there is an additional location called transient storage. It acts like storage but is deleted after the transaction ends, making it another temporary data location. However, unlike the stack, memory, and calldata, which are deleted after the calling context ends, transient storage is deleted only once the whole transaction ends. Let’s learn what the “calling context” or “call context” is to understand this.
Whenever a contract is called during a transaction (through an external call such as CALL, STATICCALL, or DELEGATECALL, or through contract creation), a new “call context” is created. Internal function calls are compiled to jumps and run inside the caller’s existing call context. Each call context has its own stack, memory, calldata, and return data.
Essentially, these are isolated environments in which code can store and manipulate data. It’s also why two externally called functions can’t access each other’s local variables.
In the example below, it’s why these two functions can have the exact same variable name without ever overlapping. Whenever doStuff or doMoreStuff is called externally, each gets its own call context, with its own stack, memory, calldata, etc.
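A minimal sketch of two such functions follows. The function names doStuff and doMoreStuff come from the article; the contract and variable names are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

// Hypothetical contract: both functions declare a local named myVariable.
contract CallContextExample {
    function doStuff() external pure returns (uint256) {
        // Lives only in the context of this call.
        uint256 myVariable = 1;
        return myVariable;
    }

    function doMoreStuff() external pure returns (uint256) {
        // Same name, but a completely separate value in a separate call.
        uint256 myVariable = 2;
        return myVariable;
    }
}
```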
A call context is ended when a RETURN, STOP, INVALID, or REVERT opcode is reached, or when a transaction reverts.
Understanding this, we can now go back to transient storage. As of Solidity version 0.8.24, we can use the TSTORE and TLOAD opcodes in Yul (inline assembly).
The TSTORE and TLOAD opcodes work exactly like the SSTORE and SLOAD storage opcodes, but instead of storing the data permanently, the data is stored for the entire duration of the transaction and deleted once the transaction ends.
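As a sketch, the two opcodes can be used from inline assembly like this. The contract name, function names, and the choice of slot 0 are assumptions made for illustration, and the code needs to be compiled for an EVM version that includes EIP-1153 (Cancun or later).

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

// Hypothetical contract used only to illustrate TSTORE and TLOAD.
contract TransientExample {
    // Writes to transient slot 0 and reads it back within the same call.
    function writeThenRead(uint256 value) external returns (uint256 readBack) {
        assembly {
            tstore(0, value)
            readBack := tload(0)
        }
    }

    // Called in a later transaction, this reads 0 again, because transient
    // storage is cleared when the transaction that wrote it ends.
    function read() external view returns (uint256 value) {
        assembly {
            value := tload(0)
        }
    }
}
```

If writeThenRead and read are called from one contract within a single transaction, read still sees the written value, which is the difference from memory and the stack.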
At the bottom of this article, we will have a cheat sheet table to help illustrate the differences.
One of the final places we can store data is in the contract itself, aka the “code” location of the EVM. This is pretty straightforward, and it’s why Solidity variables labeled constant or immutable cannot be changed.
Immutable and constant variables are stored directly in the contract code, which can never be changed.* According to the Solidity docs:
> The contract creation code generated by the compiler will modify the contract’s runtime code before it is returned by replacing all references to immutables with the values assigned to them.
This is why these values cannot be changed: they are stored in the contract bytecode itself.
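A short sketch of both kinds of variable, using hypothetical contract and variable names:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

// Hypothetical contract used only to illustrate data stored in code.
contract CodeExample {
    // constant: the value is fixed at compile time and embedded in the bytecode.
    uint256 public constant MAX_SUPPLY = 1_000_000;

    // immutable: assigned once in the constructor, then written into the
    // runtime code that gets deployed.
    address public immutable i_owner;

    constructor() {
        i_owner = msg.sender;
    }

    // Assigning to MAX_SUPPLY or i_owner inside a regular function would not
    // compile, because neither one occupies a storage slot that can be
    // rewritten.
}
```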
*A contract can only be deleted with the SELFDESTRUCT opcode, after which it can later be replaced. This opcode, however, is surrounded by controversy and is slated for removal at some point.
*Only when the SELFDESTRUCT opcode is called
Depends on the contract size limit of the specific EVM implementation
One of the final places the EVM can read and write to is the return data location. According to evm.codes:
> The return data is the way a smart contract can return a value after a call. It can be set by contract calls through the RETURN and REVERT instructions and can be read by the calling contract with RETURNDATASIZE and RETURNDATACOPY.
Essentially, whenever you see the return keyword, that is going to create the RETURN opcode to store data into the return data location.
The return data can be read by a caller that reached this code through CALL, STATICCALL, CREATE, DELEGATECALL, or a few other opcodes. Return data is a bit odd: calling the RETURN opcode ends the current call context and then passes the resulting data to the parent calling context as return data. The data can then be accessed with RETURNDATASIZE and RETURNDATACOPY. There is only one return data buffer, and these opcodes always read the return data of the most recently ended call context. Return data does not persist and is easily overwritten when another sub-context calls the RETURN opcode.
This means that yes, in a call context, there can only be one piece of data in there. However, this data can be larger than 32 bytes, so you can fit whole arrays and other large variables into the return data.
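The hand-off from a returning contract to its caller can be sketched with two hypothetical contracts. The names are assumptions for illustration, and the caller uses inline assembly only to show that the return data buffer is readable after the call.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

// Hypothetical callee that returns a value larger than 32 bytes.
contract Callee {
    function getNumbers() external pure returns (uint256[] memory numbers) {
        numbers = new uint256[](3);
        numbers[0] = 1;
        numbers[1] = 2;
        numbers[2] = 3;
        // `return` ABI-encodes the array and ends this call context with RETURN.
        return numbers;
    }
}

// Hypothetical caller that inspects the return data of its last call.
contract Caller {
    function returnDataSizeOf(Callee callee) external view returns (uint256 size) {
        // The external call creates a sub-context; its return data becomes
        // readable in this context once the call finishes.
        callee.getNumbers();
        assembly {
            // Size in bytes of the most recent sub-context's return data.
            size := returndatasize()
        }
    }
}
```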
Logs are a write-only data location in the EVM: contracts can write to them but cannot read them back. In Solidity, this is done with the emit keyword.
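A minimal sketch of writing a log, with a hypothetical event and contract:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

// Hypothetical contract used only to illustrate writing a log.
contract LogExample {
    // One indexed parameter becomes a log topic; the rest goes in the log data.
    event NumberStored(address indexed sender, uint256 number);

    function storeNumber(uint256 number) external {
        // Compiles to a LOG2 opcode: topic 0 is the event signature hash and
        // topic 1 is the indexed sender. The contract cannot read this back.
        emit NumberStored(msg.sender, number);
    }
}
```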
There are also many places the EVM can read data from but not write to. You can see examples of this in Solidity:
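For instance, the following sketch (with a hypothetical contract and function name) reads a few of these values:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

// Hypothetical contract used only to read some global values.
contract GlobalsExample {
    function readGlobals()
        external
        view
        returns (address sender, uint256 timestamp, uint256 chainId, uint256 gasRemaining)
    {
        sender = msg.sender;          // transaction data: who made this call
        timestamp = block.timestamp;  // chain data: current block timestamp
        chainId = block.chainid;      // chain data: ID of the current chain
        gasRemaining = gasleft();     // gas data: gas left in this call
    }
}
```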
And many more globally available variables.
Hopefully, with this information you’ll understand the inner workings of the EVM better, to allow you to make more informed decisions!
And crucially, you now know why you see those common “stack too deep” and “must use calldata or memory” compiler errors in Solidity!
Please note that the EVM is constantly being improved, and this information was accurate as of 06/03/2024. If something seems out of date, please ping @cyfrinupdraft or @patrickalphac on Twitter.