A project by Nethermind Research and Nethermind core developers, supported by a grant from the Ethereum Foundation.
Understanding the distribution of execution-layer (EL) and consensus-layer (CL) clients among Ethereum’s validators is vital to keeping the network resilient and diverse. Methods already exist to estimate the Beacon Chain’s consensus client distribution among validators, but no comparable method exists for execution clients. There is also no standard way for validators to report anonymously which EL and CL clients they run. This proposal aims to research and design a way to submit and extract this data while, as far as possible, preserving user anonymity and network performance.
To motivate the need for this research, we begin by briefly analyzing the current methods for measuring client diversity in Ethereum, along with their weaknesses. In general, we can cluster these methods into three main categories:
None of the approaches above reliably yields execution client usage data for Ethereum’s validator set. For consensus clients, heuristics applied to proposed blocks can sample the usage distribution, although misclassification makes the resulting estimates inaccurate. Figure 1 below raises concerns about Blockprint’s ability to reliably identify blocks proposed by a Lodestar or Prysm consensus client.
Figure 1: Statistics of Blockprint’s accuracy, obtained by running the classifier on a validation set for which the correct consensus client is known. Here, the true positive rate (TPR), true negative rate (TNR), and positive predictive value (PPV) are shown. These numbers show good accuracy for clients such as Teku but paint a pessimistic picture for Prysm and Lodestar—especially for the latter, where the classifier appears to be unable to identify Lodestar blocks on the validation set. Source: https://blockprint.sigp.io/. Consulted on 20/10/2023.
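To make the three statistics in Figure 1 concrete, the following minimal sketch computes TPR, TNR, and PPV from a one-vs-rest confusion matrix for a single client. The counts are hypothetical and chosen only for illustration; they are not Blockprint's data, and the names `ConfusionCounts` and `classifier_metrics` are our own.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConfusionCounts:
    """One-vs-rest counts for a single client label (hypothetical numbers)."""
    tp: int  # blocks from the client, labelled as the client
    fp: int  # blocks from other clients, labelled as the client
    tn: int  # blocks from other clients, labelled as other clients
    fn: int  # blocks from the client, labelled as other clients


def rate(numerator: int, denominator: int) -> Optional[float]:
    """Return the ratio, or None when the denominator is zero (undefined)."""
    return numerator / denominator if denominator else None


def classifier_metrics(c: ConfusionCounts) -> dict:
    return {
        "TPR": rate(c.tp, c.tp + c.fn),  # share of the client's blocks that are found
        "TNR": rate(c.tn, c.tn + c.fp),  # share of other blocks correctly excluded
        "PPV": rate(c.tp, c.tp + c.fp),  # share of positive labels that are correct
    }


# Illustrative case: the classifier never recognizes the client's blocks.
example = ConfusionCounts(tp=0, fp=5, tn=995, fn=100)
metrics = classifier_metrics(example)
print(metrics)
```

In this made-up case the TNR stays high even though no block from the client is ever identified, which is why TPR and PPV need to be read alongside it.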
Moreover, failing to distinguish clearly between the client distribution among validators and the distribution among nodes can lead to misleading assessments of the network’s true resilience to the failure of a specific client. Figures 2 and 3 below illustrate an example.
Figure 2: Execution-client diversity amongst synced nodes, according to Ethernodes. The figure above represents node diversity (not validator diversity) and is not representative of the effects of a client’s malfunction on consensus. Contrast with Figure 3 below. Source: ethernodes.org. Consulted on 20/10/2023.
Figure 3: Execution-client diversity according to clientdiversity.org. Note that this distribution paints a picture very different from Ethernodes. This distribution asserts that a supermajority of validators run Geth. If a bug in the Geth client led to an unintended state change, that change would be accepted and eventually finalized by the validators. Source: clientdiversity.org. Consulted on 20/10/2023.
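The reasoning behind the supermajority concern can be sketched with a simplified model. It assumes that finality requires attestations from at least two-thirds of the stake, that a client's share is measured by stake, and that all validators running a buggy client fail in the same way. The function name and the shares used below are hypothetical and are not taken from the figures.

```python
from fractions import Fraction

# Simplifying assumption: finality needs votes from at least 2/3 of the stake.
FINALITY_THRESHOLD = Fraction(2, 3)


def client_failure_risk(share: Fraction) -> str:
    """Classify the worst-case effect of a bug in a client with this stake share."""
    if share >= FINALITY_THRESHOLD:
        # The buggy client alone reaches the threshold and can finalize its view.
        return "can finalize a faulty state"
    if 1 - share < FINALITY_THRESHOLD:
        # The remaining clients cannot reach the threshold without it.
        return "can stall finality"
    return "cannot block or force finality alone"


# Hypothetical shares for illustration only.
for share in (Fraction(3, 4), Fraction(2, 5), Fraction(1, 4)):
    print(f"{float(share):.0%}: {client_failure_risk(share)}")
```

Under these assumptions, the same bug has very different consequences depending on whether the share is measured among validators or among nodes, which is why the two distributions should not be conflated.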
Having surveyed Ethereum’s client diversity assessment techniques and their limited effectiveness, we posit that Ethereum can benefit from research into, and implementation of, mechanisms that make this data available more reliably for both EL and CL. More specifically, we are interested in exploring the feasibility of validators reporting this data directly through the Ethereum protocol. Ideally, such reports would allow validators to provide information while staying anonymous, for example by using zero-knowledge (ZK) proofs or a suitable encryption scheme.
We capture all of these goals in the objectives below.