A look at some basic concepts of the Ethereum Virtual Machine (EVM)

This article is written to serve devs wanting to learn about how Ethereum works on a deeper level, without having to pick up the Ethereum Yellowpaper.

What is this article about?

This article has been written to serve as an informative guide for blockchain & Solidity developers eager to broaden their understanding of the Ethereum Virtual Machine (EVM).
To ensure the article remains highly pertinent, I have intentionally left out information that may not be directly relevant from a developer's standpoint.

A noteworthy mention: In the creation of this article, I utilized various sources, but my primary reference was an insightful walkthrough of the Ethereum Yellow paper, courtesy of Ackee Blockchain Security.
This comprehensive resource greatly contributed to the content structure and depth of this guide, and it would be disingenuous of me not to acknowledge their effort.

Their full video is available on their YouTube channel.

Without any further ado, let us dive right in.

Accounts in the EVM

An account, in the context of Ethereum, is any address capable of sending transactions or transferring ETH on the network.

Each account is identified by a 20-byte address, usually written as a hexadecimal string. Since each byte is written as two hex characters, the string is 40 characters long. Prefixed with "0x", each of these strings represents a unique EVM address.

Note 📝: Hexadecimal strings are made up of the digits "0-9" and the characters "a-f". The characters represent the numbers 10-15, since the hex number system is base-16.
If you are not comfortable with converting decimal numbers to hexadecimal, check out this small thread I made.
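
As a small illustration of the address format and of hex conversion, here is a minimal Python sketch. The helper name `is_valid_address` and the example address are made up for this article; the check only looks at the shape of the string (prefix and length), not at whether the address is actually in use.

```python
def is_valid_address(addr: str) -> bool:
    """Check the shape described above: '0x' followed by 40 hex characters (20 bytes)."""
    if not addr.startswith("0x") or len(addr) != 42:
        return False
    try:
        # Two hex characters decode to one byte, so 40 characters give 20 bytes.
        return len(bytes.fromhex(addr[2:])) == 20
    except ValueError:
        return False


example = "0x" + "ab" * 20  # hypothetical address, for illustration only
print(is_valid_address(example))

# Converting between hexadecimal and decimal with built-ins:
print(int("ff", 16))  # hex string to decimal
print(hex(255))       # decimal to hex string
```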

There are broadly two types of accounts:

  1. Externally owned accounts (EOAs): These are accounts controlled by whoever holds the corresponding private key. They hold no code and have no storage.

  2. Contract accounts: These are accounts that have smart contracts deployed on them.
    They always contain code and may or may not have storage associated with them, depending on the code that was deployed.

Note 📝: A key difference between the two types of accounts is that only EOAs can initiate transactions.

Contracts can call other contracts, but only as part of a transaction that was initiated by a user through an EOA.

Before moving on, please know that the Ethereum yellow paper defines the world state as a mapping between addresses (20 bytes/40 characters long) and account states.
Basically, the world state holds information about every account.

Now, a single account's state consists of the following fields:

  1. nonce : Holds different meanings for the two types of accounts.
    For an EOA, the nonce denotes the number of transactions sent from that address on a particular network.
    If you used address \(A\) to send 10 transactions on Ethereum but never did anything on Polygon, for example, then \(A\)'s nonce will be 10 on Ethereum and 0 on Polygon.
    A smart contract can be used to create new contracts. For a contract account, the nonce denotes the number of contracts created by that particular smart contract.
    The default nonce of a contract account, unlike that of an EOA, is 1.

  2. balance : The number of Wei held by that particular address.
    \(1 \, \text{ETH} = 10^{18} \, \text{Wei}, \text{ and } 1 \, \text{ETH} = 10^{9} \, \text{Gwei}\)

  3. codeHash : Put simply, this value is the hash of the EVM code of this account.
    This field is immutable once a contract has been deployed.

Note 📝: If the codeHash field of an account is the Keccak-256 hash of an empty string, i.e. \(\sigma[a]_c = KEC(())\), then the account is an EOA.

  4. storageRoot : Also known as storageHash.
    Put simply, the storage root of an account is a single hash that commits to all the data points that make up the contract's storage. This hash changes every time the contract's storage is updated.
    For an EOA, which has no storage, this is the root hash of an empty trie.
    To learn about this complex topic in more detail, you will need to be familiar with Merkle Patricia Tries. You can start from Ethereum.org.

Note 📝: An account is said to be empty if it has no code, zero nonce, and zero balance.
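
To make the four fields concrete, here is a minimal Python sketch of an account state. The class and method names (`AccountState`, `has_code`, `is_empty`) are my own and not taken from any client. The two constants are the well-known Keccak-256 hash of empty input and the root hash of an empty trie, which is what an account without code or storage carries.

```python
from dataclasses import dataclass

WEI_PER_GWEI = 10**9
WEI_PER_ETH = 10**18

# Keccak-256 of empty input, and the root hash of an empty Merkle Patricia Trie.
EMPTY_CODE_HASH = bytes.fromhex(
    "c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470"
)
EMPTY_TRIE_ROOT = bytes.fromhex(
    "56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421"
)


@dataclass
class AccountState:
    """Illustrative container for the four account fields (names are hypothetical)."""
    nonce: int = 0
    balance: int = 0  # always stored in Wei
    storage_root: bytes = EMPTY_TRIE_ROOT
    code_hash: bytes = EMPTY_CODE_HASH

    def has_code(self) -> bool:
        return self.code_hash != EMPTY_CODE_HASH

    def is_empty(self) -> bool:
        # Empty = no code, zero nonce, zero balance.
        return not self.has_code() and self.nonce == 0 and self.balance == 0


fresh = AccountState()
funded = AccountState(balance=2 * WEI_PER_ETH)
print(fresh.is_empty(), funded.is_empty())
print(funded.balance // WEI_PER_GWEI, "Gwei")
```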

Transactions

A transaction (\(T\)) is a single cryptographically signed instruction constructed by an EOA.

Note 📝: Transactions can NOT be initiated by contract accounts. Smart contracts can call functions on other contracts, but that is not the same as initiating a transaction.

Transactions can be of two types: those that result in message calls, and those that result in the creation of a new contract.

A transaction contains the following fields:

  1. from : The address of the EOA that initiated the transaction.

  2. to : If you're simply transferring ETH, this will be the address of the recipient.
    If you are calling code in a smart contract, this field will be the contract's address.

  3. value : Amount of Wei transferred to the recipient.
    If the transaction creates a new smart contract, the value field will be zero by default. You can, however, send ETH to the smart contract in the same transaction while deploying it. In that case, the transaction will have a value.

  4. signature : The identifier of the from EOA. This unique proof is generated when the transaction is signed with the sender's private key.

  5. nonce : Indicates the number of transactions that have been sent by the initiating EOA. An account can include only one transaction per nonce in the blockchain.
    If two different transactions with the same nonce are sent to the network, only one of them will be included.

  6. gasPrice : Amount of Wei you are paying for a single unit of gas for the transaction. This amount will vary depending on network conditions.

  7. gasLimit : The maximum amount of gas that can be used to execute the transaction.

  8. maxPriorityFeePerGas : An extra tip for the validator, in Wei per unit of gas, which you can add to try to hasten your transaction's execution.

  9. maxFeePerGas : The maximum total fee per unit of gas you are willing to pay (base fee + maxPriorityFeePerGas).

  10. data : A bytes array of arbitrary size, specifying the input data of the message call.

Note 📝: Each transaction carries a unique cryptographic signature, as you already know.
However, a transaction's original message structure does NOT actually come populated with all of the fields above.
In fact, software like Etherscan decodes the signature to fill in some of those fields.
You can read more about this here.
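
To see how the fee-related fields interact, here is a hedged Python sketch. It assumes the EIP-1559 fee model, in which the protocol sets a per-block base fee and the sender caps the total with maxFeePerGas and the tip with maxPriorityFeePerGas. The function names are mine, and the numbers in the usage lines are arbitrary.

```python
def effective_gas_price(base_fee: int, max_fee_per_gas: int,
                        max_priority_fee_per_gas: int) -> int:
    """Price per unit of gas actually paid, in Wei (EIP-1559 model, simplified)."""
    if max_fee_per_gas < base_fee:
        raise ValueError("maxFeePerGas is below the current base fee")
    # The tip is whatever fits under the cap, up to the requested priority fee.
    tip = min(max_priority_fee_per_gas, max_fee_per_gas - base_fee)
    return base_fee + tip


def max_upfront_cost(gas_limit: int, max_fee_per_gas: int, value: int) -> int:
    """Worst-case amount of Wei the sender must be able to cover."""
    return gas_limit * max_fee_per_gas + value


# Arbitrary illustrative numbers (all in Wei):
print(effective_gas_price(base_fee=10, max_fee_per_gas=15, max_priority_fee_per_gas=2))
print(max_upfront_cost(gas_limit=21_000, max_fee_per_gas=15, value=100))
```

The second function is why a wallet may refuse a transaction even when the expected fee is affordable: the balance has to cover the worst case, not the typical one.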

Blocks

A block in the blockchain network represents a distinct batch of transactions that have reached consensus.

Every block possesses a unique hash within the blockchain, serving as its specific identifier. In addition, each block refers to all preceding blocks by including the hash of its immediate predecessor, or 'parent' block. This parent block, in turn, holds the hash of its own parent, and so on, all the way back to the first block. Consequently, every new block indirectly references the entire history of blocks before it.

If any block in that history is altered, its hash changes, and so does the hash of every block after it. This effectively leads to the genesis of a completely new blockchain.
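
The parent-hash linkage can be sketched in a few lines of Python. This is a toy model: it uses SHA-256 from the standard library as a stand-in for Keccak-256, and a "block" is just a parent hash plus a list of strings, which is far simpler than a real block header.

```python
import hashlib


def block_hash(parent_hash: bytes, transactions: list) -> bytes:
    """Toy block hash: commits to the parent hash and the block's transactions."""
    payload = parent_hash + "|".join(transactions).encode()
    return hashlib.sha256(payload).digest()  # stand-in for Keccak-256


def build_chain(batches: list) -> list:
    """Return the list of block hashes for a sequence of transaction batches."""
    hashes = []
    parent = bytes(32)  # all-zero parent for the first block
    for txs in batches:
        parent = block_hash(parent, txs)
        hashes.append(parent)
    return hashes


original = build_chain([["a->b:1"], ["b->c:2"], ["c->a:3"]])
tampered = build_chain([["a->b:9"], ["b->c:2"], ["c->a:3"]])

# Changing the first block changes every later hash as well.
print([o == t for o, t in zip(original, tampered)])
```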

Block production in PoS

  • Not all nodes are validators. Only the nodes that stake at least 32 ETH into the network can propose new blocks, thus becoming validators.

  • Post-merge, Ethereum produces a new block every 12 seconds.

  • Every 12 seconds a validator is chosen at random to propose a new block, using the RANDAO algorithm.

  • Other validators re-execute the transactions submitted by the block proposer, and if everything checks out, add the new block to their own database.

  • The chain thus moves on.

Transaction execution

Executing a transaction is one of the most common, yet also one of the most complex parts of the Ethereum protocol.

Any transaction \(T\) needs to pass the following preliminary validity tests before being included in the mempool:

  1. The transaction is encoded correctly. Read more here.

  2. The transaction has a valid signature.

  3. The transaction nonce is valid (see above for more context).

  4. The sender's account has no code deployed, i.e. the codeHash field of the account is the Keccak-256 hash of an empty string: \(\sigma[a]_c = KEC(())\).
    This check was introduced in EIP-3607, which makes for an interesting read.

  5. The transaction is initialized with a sufficient gas limit.

  6. The sender's account has sufficient balance.

Formally, consider a transaction function \(\Upsilon\) being applied to a transaction \(T\).
Let \(\sigma\) and \(\sigma^{'}\) represent the current and new state respectively. Then,

\(\sigma^{'} = \Upsilon(\sigma, T)\)
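
Read as code, \(\Upsilon\) is a pure function from an old state and a transaction to a new state. The Python sketch below is a toy version for a plain ETH transfer only. The dictionary layout and the `gas_used` and `gas_price` keys are assumptions made for illustration, and where the fee ends up (validator, burn) is deliberately left out.

```python
def apply_transfer(state: dict, tx: dict) -> dict:
    """Toy transaction function: returns a new state and leaves the input untouched."""
    new_state = {addr: dict(acct) for addr, acct in state.items()}
    sender = new_state[tx["from"]]
    recipient = new_state.setdefault(tx["to"], {"nonce": 0, "balance": 0})

    fee = tx["gas_used"] * tx["gas_price"]
    sender["balance"] -= tx["value"] + fee  # the sender pays the value plus the fee
    sender["nonce"] += 1                    # each executed transaction bumps the nonce
    recipient["balance"] += tx["value"]
    return new_state


sigma = {"A": {"nonce": 0, "balance": 100_000}}
tx = {"from": "A", "to": "B", "value": 1_000, "gas_used": 21_000, "gas_price": 1}

sigma_prime = apply_transfer(sigma, tx)
print(sigma_prime)
print(sigma)  # the old state is unchanged
```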

Contract creation

The creation of a new contract requires several intrinsic parameters to be defined:

Sender (s), transaction originator (o), available gas (g), and the gas price the sender is willing to pay (p), along with these values:

  • endowment (v) : The amount of ETH initially transferred to the new contract,

  • EVM code: The actual bytecode that is being deployed,

  • the depth of the current transaction/call stack,

  • and a salt value (an optional value used for creating unique contract addresses).

Note 📝: Please note that the sender and the transaction originator may or may not be the same address.
A smart contract can, in response to a transaction by an EOA, create new contracts. In that case the sender and the txn originator will be different addresses.

An actual smart contract can be created using the CREATE or the CREATE2 opcode.

In case of the former, the address of the new contract is derived from the Keccak-256 hash of the sender's address and its nonce.
If the contract creation is caused by the CREATE2 opcode, then the salt value is used in combination with the sender's address and initialization code to create the contract address.

In other words:

// CREATE
Contract address = last 20 bytes of keccak256(rlp([sender, nonce]))

// CREATE2
Contract address = last 20 bytes of keccak256(0xFF ++ sender ++ salt ++ keccak256(init_code))

Note 📝: 0xFF is just a constant used to prevent collisions with the CREATE opcode. You can read more about CREATE2 in EIP-1014.
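
For readers who want to see the two derivations run, here is a self-contained Python sketch. A few caveats: Python's standard library has no Keccak-256 (`hashlib.sha3_256` is the later NIST variant with different padding), so the sketch includes a compact, unoptimized Keccak-256 written for illustration only. More precisely than the summary above, CREATE hashes the RLP encoding of the pair (sender, nonce), CREATE2 hashes the initialization code before mixing it in, and both keep the last 20 bytes of the result. The helper names and the sender bytes are mine. In real code you would use a maintained library instead.

```python
MASK64 = (1 << 64) - 1


def _rol64(value: int, shift: int) -> int:
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(lanes):
    """Apply the 24-round Keccak-f[1600] permutation to a 5x5 grid of 64-bit lanes."""
    lfsr = 1
    for _ in range(24):
        # theta: mix each column's parity into its neighbours
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4] for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate lanes and move them around the grid
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi: the non-linear step, applied row by row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: xor in the round constant, generated by a small LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes) -> bytes:
    """Keccak-256 as used by Ethereum (padding byte 0x01, not SHA-3's 0x06)."""
    rate = 136  # bytes absorbed per block for a 256-bit digest
    padded = bytearray(data) + b"\x01"
    padded += bytes(-len(padded) % rate)
    padded[-1] ^= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for start in range(0, len(padded), rate):
        for i in range(rate // 8):
            word = int.from_bytes(padded[start + 8 * i:start + 8 * i + 8], "little")
            lanes[i % 5][i // 5] ^= word
        lanes = _keccak_f1600(lanes)
    return b"".join(lanes[i][0].to_bytes(8, "little") for i in range(4))


def _rlp_bytes(item: bytes) -> bytes:
    """RLP-encode a byte string shorter than 56 bytes (enough for this sketch)."""
    if len(item) == 1 and item[0] < 0x80:
        return item
    return bytes([0x80 + len(item)]) + item


def create_address(sender: bytes, nonce: int) -> bytes:
    """CREATE: last 20 bytes of keccak256(rlp([sender, nonce]))."""
    nonce_bytes = nonce.to_bytes((nonce.bit_length() + 7) // 8, "big")  # 0 -> b""
    payload = _rlp_bytes(sender) + _rlp_bytes(nonce_bytes)
    return keccak256(bytes([0xC0 + len(payload)]) + payload)[12:]


def create2_address(sender: bytes, salt: bytes, init_code: bytes) -> bytes:
    """CREATE2: last 20 bytes of keccak256(0xff ++ sender ++ salt ++ keccak256(init_code))."""
    return keccak256(b"\xff" + sender + salt + keccak256(init_code))[12:]


sender = bytes.fromhex("ab" * 20)  # hypothetical deployer address
print("0x" + create_address(sender, 0).hex())
print("0x" + create2_address(sender, bytes(32), b"\x00").hex())
```

Note how the CREATE address depends on the deployer's nonce, so it changes with every deployment, while the CREATE2 address depends only on values the deployer chooses, so it can be computed before the contract exists.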

Account initialization

When a contract account is first created, it has:

  • a nonce of 1,

  • a balance equal to the endowment,

  • and no code.

Its storage is empty, and the hash of its code is the hash of an empty string. The balance of the sender is reduced by the endowment + gas paid.

Code execution

Finally, the account is initialized via the execution of the actual bytecode by the EVM in accordance with the execution model (explained in detail below).

This execution can affect the account's storage, create further accounts, or make further calls. If there is not enough gas to execute the initialization code, an out-of-gas exception is thrown, and the contract creation is reverted.

If the contract creation is not reverted, the remaining gas is refunded to the sender, and the changes to the state are finalized.
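
The initialization, out-of-gas, and refund rules above can be condensed into a toy Python function. It rests on several assumptions: the function and parameter names are hypothetical, the gas needed by the initialization code is passed in as a fixed number instead of being measured by executing bytecode, and an out-of-gas failure is modelled as consuming the whole gas allowance while the endowment stays with the sender.

```python
def create_contract(sender_balance: int, endowment: int, gas_limit: int,
                    gas_price: int, init_gas_cost: int, runtime_code: bytes):
    """Toy contract creation. Returns (new_account_or_None, sender_balance_after)."""
    if init_gas_cost > gas_limit:
        # Out of gas: the creation is reverted, no account is created,
        # and (in this model) the whole gas allowance is paid for.
        return None, sender_balance - gas_limit * gas_price

    # Success: the account starts with nonce 1 and the endowment as its balance,
    # then receives the code returned by the initialization code.
    account = {"nonce": 1, "balance": endowment, "code": runtime_code}
    # Unused gas is refunded, so only the gas actually used is paid for.
    return account, sender_balance - endowment - init_gas_cost * gas_price


ok = create_contract(1_000_000, 500, 100_000, 2, 60_000, b"\x00")
failed = create_contract(1_000_000, 500, 50_000, 2, 60_000, b"\x00")
print(ok)
print(failed)
```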

A subtle note

Please note that if the initialization code execution ends without providing any code for the new contract (or if the execution ends with a SELFDESTRUCT instruction), then the new account still exists but doesn't do anything. This is referred to as a "zombie account". Any remaining balance in such an account is locked and can't be accessed.

Note 📝: Please note that the removal of the SELFDESTRUCT opcode has been proposed in EIP-4758, but hasn't been finalised yet.

Execution model

The execution model specifies how the state of the EVM will be altered given a series of bytecode instructions (basically smart contract code) and input data.

The amount of computation that can be expended on this set of instructions is determined by a parameter gas, which I am sure you have heard of before.

Basics

The EVM is based on a stack architecture. The 'word' size of the stack is 256 bits (32 bytes).
This size was chosen so that the stack works well with the Keccak-256 hashing algorithm and the elliptic-curve computations needed.

Note 📝: The Keccak-256 hashing algorithm can take any string, number, or other piece of data of arbitrary size, and returns a 256-bit hash of the data.
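
One practical consequence of the fixed 256-bit word size is that arithmetic on the stack wraps around modulo \(2^{256}\). Here is a minimal Python sketch. The helper names `evm_add` and `evm_sub` are mine, and Python integers are unbounded, so the wrap-around has to be applied by hand.

```python
WORD_BITS = 256
WORD_MOD = 1 << WORD_BITS  # 2**256


def evm_add(a: int, b: int) -> int:
    """Addition as seen by a 256-bit word: overflow wraps around."""
    return (a + b) % WORD_MOD


def evm_sub(a: int, b: int) -> int:
    """Subtraction as seen by a 256-bit word: underflow wraps around."""
    return (a - b) % WORD_MOD


max_word = WORD_MOD - 1
print(evm_add(max_word, 1))                   # wraps around to 0
print(evm_sub(0, 1) == max_word)              # underflow gives the largest word
print(len(max_word.to_bytes(32, "big")))      # a full word fits in exactly 32 bytes
```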

Fees Overview

Fees (denominated in gas) are paid under three main circumstances:

  1. The fees inherent to the execution of a transaction or contract creation,

  2. The fees required to fund nested message calls or contract creations, and

  3. The fees for memory usage. Each contract execution is allocated an empty array to use as memory for performing calculations. This array is expanded as the contract needs more space, and gas is charged for each expansion.
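
The memory fee in the third item follows a formula from the Yellow Paper: for a memory size of \(a\) 32-byte words, the total cost is \(3a + \lfloor a^2 / 512 \rfloor\) gas, so the cost grows linearly at first and quadratically for large sizes. The Python sketch below uses function names of my own choosing.

```python
G_MEMORY = 3  # gas per 32-byte word of memory, from the Yellow Paper fee schedule


def memory_cost(words: int) -> int:
    """Total gas cost of having `words` 32-byte words of active memory."""
    return G_MEMORY * words + words * words // 512


def expansion_cost(old_words: int, new_words: int) -> int:
    """Gas charged when memory grows; nothing is charged if it does not grow."""
    if new_words <= old_words:
        return 0
    return memory_cost(new_words) - memory_cost(old_words)


print(memory_cost(1))         # a single word
print(memory_cost(1024))      # 32 KiB: the quadratic term starts to matter
print(expansion_cost(1, 32))  # growing from 1 word to 32 words
```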

Storage is deliberately expensive on any blockchain, to incentivize minimal usage, and its pricing has some nuances.

In Ethereum, each smart contract has a dedicated storage space where it can keep data that persists between function calls and transactions. This storage is part of the global Ethereum state, which is maintained by all full nodes in the network. The larger the Ethereum state, the more computational resources are required to operate a full node, which can lead to centralization and reduced network health.

To incentivize developers to reduce their usage of storage, Ethereum charges a high gas fee for operations that increase the size of storage, i.e., when you store something in a storage slot that was previously empty. Conversely, when you clear a storage slot (set its value back to zero), you are reducing the size of the global Ethereum state. The protocol rewards this behavior by not only waiving the gas fee for the clearing operation but also providing a gas refund.

The gas refund can be used to offset the gas costs of other operations in the same transaction. However, it's important to note that the refund is not paid out as a separate Ether transfer; it just reduces the overall transaction cost.
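
How a refund lowers the final bill can be sketched as follows. One detail not covered above, added here as an assumption of the sketch: the protocol caps the refund at a fraction of the gas used (one fifth since EIP-3529, one half before it), so a refund can never make a transaction free. The function name and the numbers are illustrative.

```python
def gas_charged(gas_used: int, refund_counter: int, max_refund_quotient: int = 5) -> int:
    """Gas the sender finally pays for, after applying the capped refund."""
    # The refund may not exceed gas_used / quotient (5 after EIP-3529, 2 before).
    refund = min(refund_counter, gas_used // max_refund_quotient)
    return gas_used - refund


print(gas_charged(100_000, 4_800))    # the refund fits under the cap
print(gas_charged(10_000, 4_800))     # the refund is capped at 10_000 // 5
print(gas_charged(10_000, 4_800, 2))  # the older, more generous cap
```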

Execution environment

In addition to the system state and the gas required for computation, several more pieces of information are required by the EVM.

The execution agent must provide this series of parameters:

  • \(I_a\), the address of the account that owns the code being executed.

  • \(I_o\), the EOA that originated the transaction.

  • \(I_p\), the gas price set for this transaction.

  • \(I_d\), the byte array or transaction data that forms the input data.

  • \(I_v\), the transaction value, in Wei, passed to this account as part of the transaction.

  • \(I_b\), the byte array or the bytecode that is to be executed.

  • \(I_H\), the block header of the present block.

  • \(I_e\), the depth of the message call or contract-creation.

  • \(I_w\), permission to make modifications to the state.

The execution model defines the function \(\Xi\), which can compute the resultant state \(\sigma^{'}\), the remaining gas \(g^{'}\), the resultant accrued substate \(A^{'}\) and the resultant output, \(o\), given these definitions. For the present context, we will define it as:

\((\sigma^{'} , g^{'} , A^{'} , o) \equiv \Xi(\sigma, g, A, I)\)

Execution overview & Execution cycle

From the previous section, we know the following about \(\Xi\):

The \(\Xi\) function is a crucial part of the EVM execution model, which describes how the system state and machine state change over time as Ethereum operations are executed. It's a function that takes the system state \((\sigma)\), remaining gas \((g)\), substate \((A)\), and execution environment information \((I)\) and returns the new state of these variables after the execution.

Now let us define the machine state \((\mu)\).

The machine state, denoted as \((\mu)\), is defined as a tuple containing several variables related to the state of the EVM at a particular point in time. This includes the amount of gas remaining \((\mu_g)\), the program counter \((\mu_{pc})\), the memory contents \((\mu_m)\), the active number of words in memory \((\mu_i)\), and the stack contents \((\mu_s)\).

The gas remaining after each execution is updated by subtracting the gas cost of the current operation from the current remaining gas.

In practical terms, this part of the Ethereum Yellow Paper explains the step-by-step process the EVM goes through when executing operations, how it handles different halting states, and how it calculates and updates the amount of gas remaining after each operation.

The execution cycle of the EVM involves adding or removing stack items from the left-most, lower-indexed portion of the series, reducing the gas by the instruction's gas cost, and incrementing the program counter. However, for some instructions (JUMP, JUMPI), the program counter is updated by a function 'J' instead of simple increment. In general, memory, accrued substate, and system state do not change, but certain instructions can alter one or several of these values.

This section of the Yellow Paper provides a rigorous definition of how the EVM processes instructions, detailing how the machine state is updated, when the machine should halt (either due to exceptional or normal conditions), and how the execution cycle operates.
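
The execution cycle is easier to picture with a tiny interpreter. The Python sketch below supports only five opcodes (STOP, ADD, JUMP, JUMPDEST, PUSH1) with their usual gas costs, and keeps a machine state of gas, program counter, and stack. It is a teaching toy under stated assumptions: memory, substate, and the execution environment are left out, the top of the stack is the end of a Python list, and jump-destination checking is simplified (a real EVM also rejects destinations that sit inside PUSH data).

```python
from dataclasses import dataclass, field

STOP, ADD, JUMP, JUMPDEST, PUSH1 = 0x00, 0x01, 0x56, 0x5B, 0x60
GAS_COST = {STOP: 0, ADD: 3, JUMP: 8, JUMPDEST: 1, PUSH1: 3}


class OutOfGas(Exception):
    """Exceptional halt: the next instruction costs more gas than remains."""


@dataclass
class MachineState:
    gas: int
    pc: int = 0
    stack: list = field(default_factory=list)


def run(code: bytes, gas: int) -> MachineState:
    mu = MachineState(gas=gas)
    while mu.pc < len(code):
        op = code[mu.pc]
        cost = GAS_COST[op]          # KeyError for opcodes this toy does not support
        if cost > mu.gas:
            raise OutOfGas(f"need {cost} gas at pc={mu.pc}, have {mu.gas}")
        mu.gas -= cost               # gas is reduced by the instruction's cost
        if op == STOP:
            break                    # normal halt
        elif op == PUSH1:
            mu.stack.append(code[mu.pc + 1])
            mu.pc += 2               # skip the opcode and its one-byte operand
        elif op == ADD:
            a, b = mu.stack.pop(), mu.stack.pop()
            mu.stack.append((a + b) % 2**256)
            mu.pc += 1
        elif op == JUMP:
            dest = mu.stack.pop()
            if dest >= len(code) or code[dest] != JUMPDEST:
                raise ValueError("invalid jump destination")
            mu.pc = dest             # the program counter is set, not incremented
        elif op == JUMPDEST:
            mu.pc += 1
    return mu


# PUSH1 2, PUSH1 3, ADD, STOP
result = run(bytes([0x60, 0x02, 0x60, 0x03, 0x01, 0x00]), gas=100)
print(result.stack, result.gas)
```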

Conclusion

This article was an attempt to help a blockchain developer go through some of the deeper concepts of the EVM using the yellow paper, without having to dive into the math themselves.

This article has skipped over two main aspects of the paper:

  1. The complicated mathematical equations, even though I have attempted to show some important equations using LaTeX,

  2. The concepts of storage and the instruction set. I plan to cover the complex topics of storage management and opcodes in the next article. Stay tuned for that.

Resources used

To make this article I referred to multiple resources, but here are the three main ones:

  1. The Ethereum yellow paper.

  2. Walkthrough of the yellow paper by Ackee Blockchain Security.

  3. Ethereum.org

I'll see you in the next one.

Did you find this article valuable?

Support Priyank Gupta by becoming a sponsor. Any amount is appreciated!
