Bytecode

What is bytecode?

Bytecode is a low-level, compact binary representation of a program, designed to be executed by a virtual machine (VM) rather than directly by hardware. It consists of opcodes (operation codes) and their operand data, and serves as an intermediate form between high-level source code and the machine that ultimately runs the program.

Developers write source code in high-level programming languages such as Solidity, Vyper, Java, or Python, and a compiler then translates that code into bytecode.

Bytecode in blockchain

In blockchain development, bytecode is closely linked to smart contracts, which are deployed on blockchains as bytecode.

In Ethereum, Solidity or Vyper source code is compiled into Ethereum bytecode and executed on-chain by the Ethereum Virtual Machine (EVM).

On Solana, programs are typically written in Rust and compiled into the Solana Bytecode Format (sBPF), a modified version of extended Berkeley Packet Filter (eBPF) bytecode, which is run by Solana's runtime environment.

Other examples include Polkadot, whose smart contracts have traditionally been compiled to WebAssembly (Wasm) bytecode (its newer PolkaVM instead uses a RISC-V-based instruction set), and Cosmos, where the CosmWasm virtual machine runs Wasm bytecode to support smart contracts across its ecosystem.

Bytecode ensures cross-platform compatibility, allowing smart contracts to execute identically across all nodes, regardless of the underlying hardware.

The lifecycle of bytecode: From source code to deployment

Generating bytecode involves compiling high-level source code into a compact, low-level format that virtual machines can interpret and execute.

This process can be broken down into the following steps:

Write source code:
Developers write smart contracts using high-level programming languages, such as Solidity or Vyper. These languages provide human-readable syntax for defining business logic, contract functions, and state variables.

Compile to bytecode:
Source code is processed by a compiler, such as the Solidity compiler (solc). This process involves:
- Translating source code into an intermediate representation (IR), an assembly-like form that allows for further optimization and transformation.
- Processing the IR to produce bytecode. This involves translating IR instructions into opcodes, which are specific operations (e.g., ADD, SSTORE) encoded as binary or hexadecimal values. Here, the compiler also performs optimizations such as removing redundant instructions, reordering operations, and managing resources like stack or memory.
- Emitting the final bytecode for a specific target virtual machine, such as the EVM or the Solana Virtual Machine (SVM).
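The compile step above can be sketched with a toy compiler. This is a minimal illustration, not how solc works: it uses Python's ast module as a stand-in for the IR stage, and the opcode values (PUSH, ADD, MUL) are hypothetical one-byte codes for an imaginary stack machine. Optimization is omitted entirely.

```python
import ast

# Hypothetical one-byte opcodes for a toy stack machine (not a real VM's encoding).
PUSH, ADD, MUL = 0x01, 0x02, 0x03


def compile_expr(source: str) -> bytes:
    """Compile an arithmetic expression such as "2 + 3 * 4" into toy bytecode."""
    tree = ast.parse(source, mode="eval")  # source text -> tree-shaped IR
    out = bytearray()

    def emit(node: ast.AST) -> None:
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            if not 0 <= node.value <= 255:
                raise ValueError("toy PUSH only encodes one-byte operands")
            out.extend([PUSH, node.value])  # opcode followed by its operand
        elif isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
            emit(node.left)   # operands first (postfix order for a stack machine)
            emit(node.right)
            out.append(ADD if isinstance(node.op, ast.Add) else MUL)
        else:
            raise ValueError("unsupported syntax in toy compiler")

    emit(tree.body)
    return bytes(out)


print(compile_expr("2 + 3 * 4").hex())  # prints 0102010301040302
```

Reading the output two hex digits at a time gives PUSH 2, PUSH 3, PUSH 4, MUL, ADD: the same opcode-plus-operand layout that real bytecode formats use, only much smaller.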

Deploy bytecode:
When deploying a smart contract, developers package the compiled bytecode into a blockchain transaction. This transaction carries the executable bytecode and, if needed, initialization data such as constructor arguments. Once the transaction is confirmed, the bytecode is stored at a unique contract address. On Ethereum, the code stored at that address cannot be changed afterward. Users interact with the contract by sending transactions to its address and invoking specific functions.

Example:

Consider a simple Solidity function:

// SPDX-License-Identifier: MIT
pragma solidity 0.8.24;
contract Example {
    function add(uint a, uint b) public pure returns (uint) {
        return a + b;
    }
}


After compilation, the contract is represented as the following creation bytecode, shown here in hexadecimal and wrapped across lines for readability:

608060405234801561000f575f80fd5b506101a58061001d5f395ff3fe6
08060405234801561000f575f80fd5b5060043610610029575f3560e01c
8063771602f71461002d575b5f80fd5b610047600480360381019061004
291906100a9565b61005d565b60405161005491906100f6565b60405180
910390f35b5f818361006a919061013c565b905092915050565b5f80fd5
b5f819050919050565b61008881610076565b8114610092575f80fd5b50
565b5f813590506100a38161007f565b92915050565b5f8060408385031
2156100bf576100be610072565b5b5f6100cc85828601610095565b9250
5060206100dd85828601610095565b9150509250929050565b6100f0816
10076565b82525050565b5f6020820190506101095f8301846100e7565b
92915050565b7f4e487b710000000000000000000000000000000000000
00000000000000000005f52601160045260245ffd5b5f61014682610076
565b915061015183610076565b925082820190508082111561016957610
16861010f565b5b9291505056fea26469706673582212207c625f8a2583
ea4909fc33bcc9e4218d739a554f4f07971c78b86bdf1fa443a064736f6
c63430008180033
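To make the hex above less opaque, here is a minimal disassembler sketch applied to the first 29 bytes of that bytecode. The opcode table is deliberately partial and covers only the opcodes that occur in this prefix. The names disassemble, OPCODES and PREFIX are illustrative, not part of any tool.

```python
# Mnemonics for the EVM opcodes that occur in this prefix (a partial table).
OPCODES = {
    0x15: "ISZERO", 0x34: "CALLVALUE", 0x39: "CODECOPY", 0x50: "POP",
    0x52: "MSTORE", 0x57: "JUMPI", 0x5B: "JUMPDEST", 0x5F: "PUSH0",
    0x80: "DUP1", 0xF3: "RETURN", 0xFD: "REVERT", 0xFE: "INVALID",
}

# First 29 bytes of the creation bytecode shown above.
PREFIX = bytes.fromhex(
    "608060405234801561000f575f80fd5b506101a58061001d5f395ff3fe"
)


def disassemble(code: bytes) -> list[str]:
    """Turn raw bytes into 'offset: MNEMONIC operand' lines."""
    lines, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:  # PUSH1..PUSH32 carry 1..32 immediate bytes
            n = op - 0x5F
            operand = code[pc + 1 : pc + 1 + n]
            lines.append(f"{pc:04x}: PUSH{n} 0x{operand.hex()}")
            pc += 1 + n
        else:
            name = OPCODES.get(op, "UNKNOWN")
            lines.append(f"{pc:04x}: {name}")
            pc += 1
    return lines


for line in disassemble(PREFIX):
    print(line)
```

The listing ends with a CODECOPY and RETURN sequence. These copy 0x01a5 bytes starting at offset 0x001d into memory and return them, which is how this initialization prefix hands back the runtime code that begins right after it.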

Execution of bytecode in blockchain

When a user interacts with a smart contract (e.g., by submitting a transaction), the virtual machine loads the contract's bytecode from the blockchain and executes its instructions one at a time, following any jumps they encode. Execution is deterministic: given the same input and state, it produces the same result on every node in the network.
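A minimal sketch of such an execution loop follows. It borrows the EVM's opcode values for STOP, ADD, MUL and PUSH1 and its 256-bit word size, but it is only an illustration: it has no gas, memory, storage, or jumps, and the function name run is an assumption of this example.

```python
# Opcode values as used by the EVM for these four instructions.
STOP, ADD, MUL, PUSH1 = 0x00, 0x01, 0x02, 0x60
WORD = 2**256  # arithmetic wraps around at 256 bits


def run(code: bytes) -> list[int]:
    """Execute bytecode sequentially on a stack and return the final stack."""
    stack: list[int] = []
    pc = 0  # program counter: index of the next byte to execute
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == STOP:
            break
        elif op == PUSH1:
            stack.append(code[pc])  # the next byte is the operand
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) % WORD)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append((a * b) % WORD)
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
    return stack


# PUSH1 2, PUSH1 3, ADD, PUSH1 4, MUL, STOP  ->  (2 + 3) * 4
program = bytes.fromhex("600260030160040200")
print(run(program))  # prints [20]
```

Because the loop depends only on the bytes it is given, running the same program again always yields the same stack, which is the property that lets every node agree on the result.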

Execution of bytecode is not free. On Ethereum, every instruction consumes gas, a unit that measures computational work, and each opcode has an associated gas cost reflecting its complexity. Users pay a fee to execute smart contracts, and that fee grows with the total gas the execution consumes.
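The arithmetic can be sketched as a sum of per-opcode costs multiplied by a gas price. The costs below match the EVM's static costs for these four opcodes, but the model is simplified: it ignores the fixed per-transaction base cost and opcodes whose cost depends on state or memory use. The gas price of 20 gwei is a made-up value for illustration.

```python
# Static gas costs for a few EVM opcodes (simplified; many opcodes are priced dynamically).
GAS_COST = {"PUSH1": 3, "ADD": 3, "MUL": 5, "STOP": 0}


def gas_used(instructions: list[str]) -> int:
    """Total gas consumed by executing each instruction once, in order."""
    return sum(GAS_COST[name] for name in instructions)


def fee_in_gwei(gas: int, gas_price_gwei: int) -> int:
    """Execution fee = gas consumed x price per unit of gas."""
    return gas * gas_price_gwei


program = ["PUSH1", "PUSH1", "ADD", "PUSH1", "MUL", "STOP"]
gas = gas_used(program)
print(gas)                   # prints 17
print(fee_in_gwei(gas, 20))  # prints 340 (hypothetical price of 20 gwei per gas)
```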

Bytecode in general programming

Outside of blockchain, bytecode has been widely used in general programming for decades. Common examples include:

Java Bytecode: Java programs are compiled into bytecode and run on the Java Virtual Machine (JVM). This allows Java applications to execute on any platform with a JVM.
Python Bytecode: In Python, the interpreter first compiles source code into bytecode, which is then executed by the Python Virtual Machine (PVM). CPython caches the bytecode of imported modules in .pyc files, so unchanged modules do not need to be recompiled on later runs.
.NET Intermediate Language (IL): Programs written in C#, VB.NET, or other .NET languages are compiled into Microsoft Intermediate Language (MSIL), now standardized as Common Intermediate Language (CIL), a type of bytecode executed by the Common Language Runtime (CLR).
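Python makes its own bytecode easy to inspect through the standard library's dis module. The short sketch below prints the raw bytecode bytes of a function and its decoded instructions. The exact bytes and instruction names differ between CPython versions, so no expected output is shown.

```python
import dis


def add(a, b):
    return a + b


# The compiled function carries its bytecode as a bytes object.
raw = add.__code__.co_code
print(raw.hex())

# dis decodes those bytes into named instructions with their arguments.
for ins in dis.get_instructions(add):
    print(ins.offset, ins.opname, ins.argrepr)
```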

Bytecode is used to ensure portability, platform independence, and efficient execution through virtual machines.
