One of the important indicators of how much load the Ethereum blockchain can safely handle is how the uncle rate responds to the gas consumption of a transaction. In all blockchains of the Satoshian proof-of-work variety, every published block is at risk of becoming stale, i.e. not being part of the main chain, because another miner published a competing block before the recently published block reached them. This leads to a “race” between two blocks, and one of the two is inevitably left behind.
An important fact is that the more transactions a block contains (or the more gas a block consumes), the longer it takes to propagate through the network. On the Bitcoin network, a seminal study on this was Decker and Wattenhofer (2013), which found that the average propagation time of a block was about 2 seconds plus an additional 0.08 seconds per kilobyte in the block (i.e. a 1 MB block would take ~82 seconds). A more recent study by Bitcoin Unlimited showed that improvements in transaction propagation technology have reduced this to ~0.008 seconds per kilobyte. We can also see that if a block takes longer to propagate, it is more likely to become stale; with a block time of 600 seconds, a 1 second increase in propagation time should correspond to an additional 1/600 chance of being left behind.
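The arithmetic in this paragraph can be sketched in a few lines of Python. This is only an illustration of the quoted linear model; the function names are made up for the example, and 1 MB is taken as 1000 kB to match the ~82 second figure.

```python
def propagation_time_s(size_kb, base_s=2.0, per_kb_s=0.08):
    """Average propagation time under the linear model quoted above:
    a fixed base delay plus a per-kilobyte delay."""
    return base_s + per_kb_s * size_kb


def added_stale_chance(extra_delay_s, block_time_s=600.0):
    """Approximate extra chance of being left behind caused by
    extra_delay_s seconds of additional propagation time."""
    return extra_delay_s / block_time_s


# 1 MB block under the Decker and Wattenhofer (2013) coefficients
print(propagation_time_s(1000))

# Same block with the ~0.008 s/kB figure from the Bitcoin Unlimited study
print(propagation_time_s(1000, per_kb_s=0.008))

# One extra second of delay at a 600 second block time
print(added_stale_chance(1.0))
```

The approximation `extra_delay_s / block_time_s` is only reasonable while the delay is small compared with the block time.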
On Ethereum, we can do a similar analysis, except that thanks to Ethereum’s “uncle” mechanic we have very solid data to analyze. Stale blocks in Ethereum can be re-included into the chain as “uncles”, where they receive up to 75% of their original block reward. This mechanic was originally introduced to reduce centralization pressure by reducing the advantage that well-connected miners have over poorly connected miners, but it also has several side benefits, one of which is that stale blocks are tracked in a very easily searchable database: the blockchain itself. We can take a data dump of blocks 1 to 2283415 (before the September 2016 attacks) as the data source for the analysis.
Here is a script to generate some source data: http://github.com/ethereum/research/tree/master/uncle_regressions/block_datadump_generator.py
Here is the source data: http://github.com/ethereum/research/tree/master/uncle_regressions/block_datadump.csv
The columns represent, in order, the block number, the number of uncles in the block, the total uncle reward, the total gas consumed by the uncles, the number of transactions in the block, the gas consumed by the block, the length of the block in bytes, and the length of the block in bytes excluding zero bytes.
We can then use this script to analyze it: http://github.com/ethereum/research/tree/master/uncle_regressions/base_regression.py
The results are as follows. In general, the uncle rate is constant at about 0.06-0.08, and the average gas consumption per block is about 100,000-300,000. Since we have the gas consumption of both blocks and uncles, we perform a linear regression to estimate how much 1 unit of gas increases the probability that a given block is an uncle. The coefficients are as follows:
Block 0 to 200k: 3.81984698029e-08
Block 200k to 400k: 5.35265798406e-08
Block 400k to 600k: 2.3363832951e-08
Block 600k to 800k: 2.12452216e-08
Block 800k to 1000k: [value garbled in source]
Block 1000k to 1200k: 2.8640900022e-08
Block 1200k to 1400k: 3.244899383e-08
Block 1400k to 1600k: 3.1225820862e-08
Block 1600k to 1800k: 3.18276549008e-08
Block 1800k to 2000k: [value garbled in source; begins 2.41]
Block 2000k to 2200k: [value missing in source]
Block 2200k to 2285k: 1.86635688756e-08
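For readers who want a feel for this kind of regression, here is a minimal sketch in plain Python. It is not the linked base_regression.py script; it assumes one simple way of setting up the problem: every block contributes a point (gas used, 0), every uncle contributes a point (gas used, 1), and an ordinary least squares line is fitted through the points. Since the dump only records total uncle gas per block, the sketch splits that total evenly among the uncles. The rows at the bottom are made-up toy data in the column order described above.

```python
def ols_fit(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def observations(rows):
    """Turn dump rows into (gas, is_uncle) points.

    Assumed row layout (from the column description above):
    block number, uncle count, uncle reward, uncle gas,
    tx count, block gas, block bytes, block bytes excluding zero bytes.
    """
    xs, ys = [], []
    for row in rows:
        uncle_count, uncle_gas, block_gas = row[1], row[3], row[5]
        xs.append(block_gas)
        ys.append(0.0)                       # a main-chain block
        for _ in range(uncle_count):
            # Assumption: split total uncle gas evenly among the uncles.
            xs.append(uncle_gas / uncle_count)
            ys.append(1.0)                   # an uncle
    return xs, ys


# Made-up toy rows, NOT real chain data.
toy_rows = [
    (1, 0, 0.0, 0, 2, 100000, 700, 600),
    (2, 1, 0.0, 250000, 3, 150000, 900, 800),
    (3, 0, 0.0, 0, 1, 50000, 600, 500),
    (4, 2, 0.0, 600000, 5, 200000, 1200, 1000),
]
xs, ys = observations(toy_rows)
intercept, slope = ols_fit(xs, ys)
print("base uncle rate:", intercept, "added uncle rate per gas:", slope)
```

In this setup the slope plays the role of the per-gas coefficients listed above and the intercept plays the role of the uncle rate of a 0-gas block.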
Therefore, each 1 million gas worth of transactions included in a block now adds ~1.86% to the chance of that block becoming an uncle, although this was closer to 3-5% during Frontier. The “base” (i.e. the uncle rate of a 0-gas block) is consistently ~6.7%. We will leave this result as is for now and draw no further conclusions; there is a further complication, which I shall address later, at least in relation to the impact this finding has on gas limit policy.
Another issue touching uncle rates and transaction propagation is gas pricing. It is often argued in Bitcoin development discussions that block size limits are unnecessary because miners already have a natural incentive to limit their block sizes: every kilobyte they add increases the stale rate and thus threatens their block reward. Given the impedance of 8 seconds per megabyte found by the Bitcoin Unlimited study, and given that each second of impedance corresponds to a 1/600 chance of losing a 12.5 BTC block reward, this suggests an equilibrium transaction fee of 0.000167 BTC per kilobyte, assuming no block size limit.
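The equilibrium fee above follows directly from the three quoted numbers. A short Python sketch of that arithmetic (variable names are illustrative only):

```python
delay_per_kb_s = 0.008       # ~8 seconds per megabyte, Bitcoin Unlimited study
block_time_s = 600.0         # Bitcoin target block time
block_reward_btc = 12.5      # block reward at the time of writing

# Each extra kilobyte adds this much to the chance of the block going stale.
stale_chance_per_kb = delay_per_kb_s / block_time_s

# A miner breaks even on a kilobyte when the fee equals the expected
# loss of block reward caused by including it.
equilibrium_fee_btc_per_kb = stale_chance_per_kb * block_reward_btc

print(f"{equilibrium_fee_btc_per_kb:.6f} BTC per kB")
```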
In the Bitcoin environment, there are reasons to be skeptical about the long-term economics of such a no-limit incentive model: there will eventually be no block reward, and if the only thing miners have to lose by including too many transactions is the fees from their other transactions, then there is an economic argument that the equilibrium stale rate will be as high as 50%. However, there are changes that can be made to the protocol to limit this coefficient.
In the current Ethereum environment, block rewards are 5 ETH and will remain so until the algorithm is changed. Accepting 1 million gas means a 1.86% chance that the block will become an uncle. Fortunately, Ethereum’s uncle mechanism has a pleasant side effect here: the average uncle reward has recently been around 3.2 ETH, so 1 million gas means only a 1.86% chance of putting 1.8 ETH at risk, i.e. an expected loss of 0.033 ETH and not 0.093 ETH as would be the case without an uncle mechanism. Therefore, current gas prices of ~21 shannon are actually quite close to the “economically reasonable” gas price of 33 shannon (this is before the DoS attacks and the resulting optimizations; it is probably even lower now).
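The expected-loss calculation can be written out as a short Python sketch. The only fact added here is the unit conversion 1 shannon = 10^-9 ETH; the variable names are illustrative.

```python
uncle_chance_per_mgas = 0.0186   # added uncle probability per 1 million gas
block_reward_eth = 5.0
avg_uncle_reward_eth = 3.2       # recent average uncle reward quoted above

# With the uncle mechanism, only the difference between the full
# reward and the uncle reward is at risk.
at_risk_eth = block_reward_eth - avg_uncle_reward_eth            # 1.8 ETH
loss_with_uncles = uncle_chance_per_mgas * at_risk_eth           # per 1M gas
loss_without_uncles = uncle_chance_per_mgas * block_reward_eth   # per 1M gas

# Break-even gas price: expected loss per unit of gas, in shannon.
SHANNON_PER_ETH = 1e9
break_even_shannon = loss_with_uncles / 1_000_000 * SHANNON_PER_ETH

print(f"expected loss with uncles:    {loss_with_uncles:.4f} ETH per 1M gas")
print(f"expected loss without uncles: {loss_without_uncles:.4f} ETH per 1M gas")
print(f"break-even gas price:         {break_even_shannon:.1f} shannon")
```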
The easiest way to push the equilibrium gas price further down is to improve the uncle inclusion mechanic and try to get uncles included in blocks as quickly as possible (perhaps by separately propagating each block as a “potential uncle header”); at the limit, with every uncle included as soon as possible, the equilibrium gas price would drop to about 11 shannon.
Is data underpriced?
A second linear regression analysis can be performed with source code here: http://github.com/ethereum/research/tree/master/uncle_regressions/tx_and_bytes_regression.py
The purpose here is to see whether, after accounting for the gas coefficients calculated above, there is any correlation left over with the number of transactions or with the size of a block in bytes. Unfortunately, we do not have block size or transaction count figures for uncles, so we have to resort to a more indirect trick that looks at blocks and uncles in groups of 50. The gas coefficients this analysis finds are higher than in the previous analysis: about 0.04 uncle rate per million gas. One possible explanation is that if a single block has a high propagation time and this leads to an uncle, there is a 50% chance that the uncle is the high-propagation-time block, but there is also a 50% chance that the uncle is the other block it competes against. This theory fits well with the finding of a 0.04 per million “social uncle rate” and a ~0.02 per million “private uncle rate”; hence we take it as the most likely explanation.
The regression finds that, after accounting for this social uncle rate, one byte accounts for an additional uncle rate of ~0.000002. Bytes in a transaction take up 68 gas, of which 61 gas accounts for their contribution to bandwidth (the remaining 7 is for bloating the history database). If we want both the bandwidth coefficient and the computation coefficient in the gas table to reflect propagation time, this implies that to really optimize gas costs we would have to increase the gas cost per byte by 50 (i.e. to 138). This would also mean increasing the base gas cost of a transaction by 5500 (note: such a rebalancing would not mean that everything becomes more expensive; the gas limit would be raised by ~10%, leaving average-case transaction throughput unchanged). On the other hand, the risk of worst-case denial-of-service attacks is greater for execution than for data, so execution requires larger safety factors. Therefore, there is arguably not strong enough evidence to make any repricings here, at least for now.
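One way to read the per-byte increase of 50 gas is as the leftover per-byte uncle rate expressed in gas at the social uncle rate. The sketch below shows that conversion; it is an interpretation of the numbers quoted in the text, not a reproduction of the regression script.

```python
social_uncle_rate_per_gas = 0.04 / 1_000_000   # 0.04 per million gas
extra_uncle_rate_per_byte = 0.000002           # leftover per-byte effect

# How much gas would have to be charged per byte so that the gas
# coefficient alone explains the extra per-byte uncle rate?
extra_gas_per_byte = extra_uncle_rate_per_byte / social_uncle_rate_per_gas

print(f"suggested increase: ~{extra_gas_per_byte:.0f} gas per byte")
```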
A possible long-term protocol change would be the introduction of separate gas pricing mechanisms for in-EVM execution and transaction data. The argument here is that since transaction data can be computed separately from everything else, the two are much easier to separate, and therefore the optimal strategy might be to somehow allow the market to balance them; however, precise mechanisms for this have yet to be developed.
Gas Limit Policy
For an individual miner setting their gas price, the relevant statistic is the “private uncle rate” of 0.02 per million gas. From the point of view of the system as a whole, the “social uncle rate” of 0.04 per million gas is what matters. If we did not care about safety factors and were fine with an uncle rate of 0.5 uncles per block (which means a “51% attack” would only need 40% hash power to succeed, actually not as bad as it sounds), then at least this analysis suggests that the gas limit could theoretically be raised to ~11 million (20 tx/s at an average of 39,000 gas per tx, as is the case with current usage, or 37 tx/s worth of simple sends). With the latest optimizations, this could be pushed even higher. However, since we do care about safety factors and prefer a lower uncle rate to mitigate centralization risks, 5.5 million is probably an optimal level for the gas limit. In the medium term, a “dynamic gas limit” formula that targets a specific block processing time would be a better approach, as it could quickly and automatically adapt to attacks and risks.
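The figures in this paragraph can be roughly reproduced with the sketch below. Two inputs are assumptions not stated in the text: an average block time of about 14 seconds, and 21,000 gas for a simple send. The attack threshold uses a simple model in which, with u uncles per main-chain block, only 1/(1+u) of honest blocks extend the main chain.

```python
base_uncle_rate = 0.067            # uncle rate of a 0-gas block
social_rate_per_gas = 0.04 / 1e6   # social uncle rate per unit of gas
target_uncle_rate = 0.5            # uncles per block we are willing to accept

# Gas limit at which the linear model reaches the target uncle rate.
gas_limit = (target_uncle_rate - base_uncle_rate) / social_rate_per_gas


def tx_per_second(gas_limit, gas_per_tx, block_time_s=14.0):
    """Throughput at a given gas limit; block_time_s is an assumed average."""
    return gas_limit / gas_per_tx / block_time_s


def attack_threshold(uncles_per_block):
    """Hash power share an attacker needs to outpace the honest chain,
    assuming honest miners lose the uncle fraction of their blocks and
    the attacker loses none."""
    honest_efficiency = 1.0 / (1.0 + uncles_per_block)
    # Attacker share a wins when a > (1 - a) * honest_efficiency.
    return honest_efficiency / (1.0 + honest_efficiency)


print(f"gas limit:        ~{gas_limit / 1e6:.1f} million")
print(f"average usage:    ~{tx_per_second(11e6, 39_000):.0f} tx/s")
print(f"simple sends:     ~{tx_per_second(11e6, 21_000):.0f} tx/s")
print(f"attack threshold: {attack_threshold(0.5):.0%} of hash power")
```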
Note that the concern about centralization risks and the need for safety factors do not stack on top of each other. The reason is that during an active denial-of-service attack the blockchain needs to survive, not to be economically centralization-resistant in the long term; the argument is that if the attacker’s goal were to economically encourage centralization, the attacker could simply donate money to the largest pool to bribe other miners into joining it.
Going forward, we can expect virtual machine improvements to continue to lower uncle rates, although eventually network improvements will also be required. The scalability of a single chain is limited, with the primary bottleneck being disk reads and writes, so after a certain point (likely 10-40 million gas) sharding will be the only way to handle more transactions. If we just want to lower equilibrium gas prices, Casper will help significantly by making the “slope” of the uncle rate to gas consumption close to zero, at least to a certain point.