Bitcoin's Empty Blocks Analaysis. | Bitcointalksearch.org

kano

legendary

Activity: 4634

Merit: 1851

Linux since 1997 RedHat 4

Quote from: PrimeNumber7 on May 11, 2020, 01:06:03 AM

How frequently over the past 5 years have miners put their name into the coinbase tx as you describe? Is this how most block explorers determine which pool found each block? Or do they look at different/additional information?

At the moment I get about 99% identification from a text search or address scan of each block's coinbase transaction,
just like any of the large explorers do.

True unknowns are rare, as you'll see on any of the large explorers.

Usually when it drops to the low 90s, it means someone changed their sig text so I just look at an 'unknown' and see what's changed
- e.g. f2pool changed the text a bit, recently.

kano

legendary

Activity: 4634

Merit: 1851

Linux since 1997 RedHat 4

Quote from: pooya87 on May 03, 2020, 12:11:01 AM

the main reason for empty blocks has always been SPV mining, that is, miners didn't fully verify the previous block before starting on the next one, so they start with an empty block until they have verified the previous one and can update their mempool and add new transactions to their block. sometimes they get lucky, find the answer to that empty block and publish it.
since there have been a lot of improvements in the speed of transaction verification over the past couple of years, with all the optimizations done by the core team, and SPV mining is not that common anymore, we are seeing a much reduced number of empty blocks nowadays.
...

What has changed is that they simply update the work to the miners faster now than they used to do it.
Simple example:

Before
Blockchange ... send out empty work.
30s later ... send out new work with transactions.

Over time I guess they realised how stupid this is and changed it to:

Now
Blockchange ... send out empty work.
A few seconds later ... send out new work with transactions.

Magic! Now they generate 10% of the empty blocks they used to generate if they take 3s instead of 30s
If they take 0.3s, empty blocks drop to 1% ... etc.
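The scaling above can be checked with a quick back-of-the-envelope sketch. It assumes (my assumption, not something stated in the post) that blocks arrive as a Poisson process with a 600s average, so the chance that a block is found while empty work is still out is 1 - exp(-window/600). The function name is made up for illustration.

```python
import math

# Assumption: blocks arrive as a Poisson process averaging one per 600 seconds.
MEAN_BLOCK_INTERVAL_S = 600.0

def empty_work_fraction(window_s: float) -> float:
    """Chance that the next block is found while only empty work is out."""
    return 1.0 - math.exp(-window_s / MEAN_BLOCK_INTERVAL_S)

baseline = empty_work_fraction(30.0)
for window in (30.0, 3.0, 0.3):
    frac = empty_work_fraction(window)
    # Second figure is relative to the 30 second case described above.
    print(f"{window:5.1f}s window: {frac:.4%} of blocks, {frac / baseline:.1%} of the 30s case")
```

Under that assumption the 3s and 0.3s windows come out very close to the 10% and 1% figures quoted above, because for short windows the fraction is nearly proportional to the window length.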

You can see that most pools are still ignoring transactions on a block change.
If you can see the mempool after each block change ... you can check that, any time, after you see an empty block on the network.
... I can coz I have extra information in the logs, coded into all the bitcoinds that run on my pool, so I can always see the size of all work generated.

P.S. I'm one of the VERY few pools that has NEVER mined an empty block

I can only think of one other pool - and that effectively died recently when they lost yet another block due to negligence by the guy who ran the pool.

P.P.S. the time to validate a block and generate a full block is less than 200ms

PrimeNumber7

copper member

Activity: 1666

Merit: 1901

Amazon Prime Member #7

Quote from: pooya87 on May 11, 2020, 07:27:28 AM

Quote from: PrimeNumber7 on May 11, 2020, 01:06:03 AM

How frequently over the past 5 years have miners put their name into the coinbase tx as you describe? Is this how most block explorers determine which pool found each block? Or do they look at different/additional information?

from the genesis block (block zero) miners have been allowed to put anything they want in the coinbase transaction's input signature script, since a coinbase transaction doesn't really have any input (empty outpoint and max index). and Satoshi was the first miner who put something there (Chancellor on brink of second bailout for banks).
but it is optional, so only big mining pools put their name there, mostly as advertisement. interestingly enough, if you look closer each pool has multiple servers and each server puts a different name in there. there may be miners who don't, though.
i believe that is the only way block explorers or any other chain analyzers determine hashrate distribution. the only other way would be if the pools publicly published ALL their addresses so we could see which address is receiving which blocks' rewards, which is not possible since addresses can change constantly.

I have noticed over the years that some pools use the same address for block reward payout, while others don’t and some pools have paid out to their miners directly.

I guess my question is: if you use the verbosity 2 option for the getblock RPC call, for how many blocks could you determine the pool that mined them?

I would not be surprised to see major pools have different servers put different message variations into the input signature script of the coinbase transaction for debugging purposes.

pooya87

legendary

Activity: 3472

Merit: 10611

Quote from: PrimeNumber7 on May 11, 2020, 01:06:03 AM

How frequently over the past 5 years have miners put their name into the coinbase tx as you describe? Is this how most block explorers determine which pool found each block? Or do they look at different/additional information?

from the genesis block (block zero) miners have been allowed to put anything they want in the coinbase transaction's input signature script, since a coinbase transaction doesn't really have any input (empty outpoint and max index). and Satoshi was the first miner who put something there (Chancellor on brink of second bailout for banks).
but it is optional, so only big mining pools put their name there, mostly as advertisement. interestingly enough, if you look closer each pool has multiple servers and each server puts a different name in there. there may be miners who don't, though.
i believe that is the only way block explorers or any other chain analyzers determine hashrate distribution. the only other way would be if the pools publicly published ALL their addresses so we could see which address is receiving which blocks' rewards, which is not possible since addresses can change constantly.

PrimeNumber7

copper member

Activity: 1666

Merit: 1901

Amazon Prime Member #7

How frequently over the past 5 years have miners put their name into the coinbase tx as you describe? Is this how most block explorers determine which pool found each block? Or do they look at different/additional information?

pooya87

legendary

Activity: 3472

Merit: 10611

Quote from: PrimeNumber7 on May 10, 2020, 11:09:25 PM

This will not tell you which pool mined each block, you will need to get this from somewhere else.

if you use verbosity 2 then each transaction in the block will also be fully shown in JSON format, and the first one is obviously the coinbase tx, which contains the possible name of the pool or miner that found the block. all you have to do is find result.tx[0].vin[0].coinbase (a hex string), then decode those bytes as UTF8. for example, for block (currently the last block) 0000000000000000000fcf675c44734ddbe2af6dddd480f7d7aba324488f138d

Code:

��	4Ÿ^vip/www.okex.com/��mm�d8OR :Y:�ٜ�U��Juf��䶎;x

then search inside that string for known names, which can be found online with a little searching.
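A minimal sketch of that decode-and-search step, assuming you already have the coinbase hex string from the getblock result. The tag table is a placeholder: only the okex.com entry comes from the example above, the other entry and the sample bytes around the tag are made up for illustration.

```python
# Placeholder tag table; real explorers keep much longer, regularly updated lists.
KNOWN_TAGS = {
    "okex.com": "OKEx",            # taken from the decoded example above
    "examplepool": "ExamplePool",  # hypothetical entry, for illustration only
}

def identify_pool(coinbase_hex: str) -> str:
    """Decode the coinbase script bytes and look for a known pool tag."""
    raw = bytes.fromhex(coinbase_hex)
    # Most of the script is binary, so undecodable bytes are replaced, not fatal.
    text = raw.decode("utf-8", errors="replace").lower()
    for tag, pool in KNOWN_TAGS.items():
        if tag in text:
            return pool
    return "unknown"

# Made-up script bytes wrapped around the tag text seen in the example.
sample = (b"\x03\x1b\x9a\x09" + b"vip/www.okex.com/" + b"\xfa\xbe").hex()
print(identify_pool(sample))
```

An address scan, as kano mentions, would be a second lookup on the coinbase outputs; that part is not shown here.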

it could also be split into two calls (which reduces the size of the first response):
first call getblock with verbosity 1 as before and take the first transaction ID, then pass that to getrawtransaction with verbose set to true to get the coinbase tx as a single JSON transaction, and do the same UTF8 decode again.

test here if you don't have a node at hand:
https://chainquery.com/bitcoin-cli/getblock
https://chainquery.com/bitcoin-cli/getrawtransaction

PrimeNumber7

copper member

Activity: 1666

Merit: 1901

Amazon Prime Member #7

Quote from: mikeywith on May 02, 2020, 11:16:59 PM

Quote from: ranochigo on May 02, 2020, 09:56:32 PM

Great analysis. Could you factor in the timestamp/the timing received by blockchair into the graph? It would be a lot clearer to see the timings between the empty blocks and the block before it.

I could, but there are two problems. The first one is scraping the data from the blockchain: it took me forever to extract the data for empty blocks, which is nothing compared to the total number of blocks. But if someone has a better way of getting that data, put it in a table format like Excel and send it to me, and I can do it, and perhaps even more.

The second problem is that time-stamping isn't exactly accurate,

If you are running a full node, you can obtain timestamp information about each block with the "getblock" RPC command at verbosity 1. This will not tell you which pool mined each block; you will need to get that from somewhere else.

The published timestamp has the potential to be off by up to 2 hours, although in most cases, I believe it is closer to accurate. To have an authoritative timestamp, you will need to have a well-connected node that you specifically program to keep track of the time each block is received. If you only have the published timestamp, that will have to do.

I would be curious if there is a correlation between the percentage of blocks found, and the percentage of zero tx fee blocks/total pool blocks found, and if so, what this correlation is. Does the percentage of zero tx fee blocks/total pool blocks found increase as the percentage of blocks found increases? Are there any outliers in this particular slope?

I would not want to pre-judge the data, but I would propose that pools that find more blocks generally can invest in better equipment/network connectivity to process found blocks, to propagate found blocks, and to identify found blocks. I would also propose that some pools may have different groups of miners try to find blocks based on different sets of transactions.

mikeywith

legendary

Activity: 2464

Merit: 6688

be constructive or S.T.F.U

As requested by a few members, and after getting the data from LoyceV, here is what I came up with.

I excluded the years before 2016, so the data is only for 2016-2019. The reason is that in the early days we had a ton of empty blocks, simply because there were often no transactions to add to the block anyway; also, most of those blocks come from "unknown" miners.

In this chart, I show, for the top 5 pools, the total of the time differences between each empty block's timestamp and the timestamp of the block which preceded it.

But I realized this figure might be misleading, because finding more empty blocks will give the miner a lead in this chart, and since Antpool did find a ton of empty blocks, most likely due to the use of covert Asicboost, the figure above might not show any evidence of delaying transactions on purpose.

The next step was getting the average time difference for those empty blocks: I took the total time difference and divided it by the total number of empty blocks per pool per year.
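For anyone who wants to redo this outside Excel, the averaging step can be sketched as below. The row layout (year, pool, gap in seconds) and the sample values are made up for illustration; they are not the actual data behind the charts.

```python
from collections import defaultdict

def average_gap(rows):
    """rows: iterable of (year, pool, gap_seconds), one row per empty block.

    Returns {(year, pool): average gap to the preceding block, in seconds}.
    """
    totals = defaultdict(lambda: [0, 0])  # (year, pool) -> [sum of gaps, count]
    for year, pool, gap in rows:
        entry = totals[(year, pool)]
        entry[0] += gap
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

# Made-up sample rows, only to show the shape of the input.
rows = [
    (2016, "PoolA", 20), (2016, "PoolA", 40),
    (2016, "PoolB", 10),
    (2017, "PoolA", 30),
]
for (year, pool), avg in sorted(average_gap(rows).items()):
    print(year, pool, avg)
```

Dividing by the count is what removes the advantage a pool gets in the first chart simply from finding more empty blocks.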

The above figure is pretty fair since it takes into account that larger pools tend to find more blocks and therefore find more empty blocks.

Based on the points mentioned in the previous comments, I will not attempt to conclude anything from this study; everyone is free to interpret these figures the way they like.

mikeywith

legendary

Activity: 2464

Merit: 6688

be constructive or S.T.F.U

Quote from: LoyceV on May 06, 2020, 07:15:52 AM

is this what you were looking for?

By the look of it, that is more than I need. I have yet to download the data and see if it fits perfectly into my Excel sheets, but you did a great job.

Quote from: tranthidung on May 06, 2020, 06:20:01 AM

It has not yet included block heights. Could you include block heights, please.

It is there, he refers to it as ID > http://loyce.club/blockdata/id.txt

Quote from: BrewMaster on May 06, 2020, 11:00:14 AM

keeping in mind that a miner first sets the block time in the header and then starts mining it, the reason for differences of a few seconds between block times and real time could simply be that the miner took longer to update the time.

You are right indeed, but that difference in time is too small to be considered. Updating the time in the block header probably takes a few milliseconds, and it doesn't affect the miner's chances of solving the block. Also, keeping in mind that it is in miners' best interest for the time to be either exact or in the future (to make the best use of the next difficulty adjustment), it's safe to say that at least large mining pools make sure the time is in the future rather than the past.

But really, combining all these factors, small or large, makes the analysis a bit far from accurate. I will do it either way.

BrewMaster

legendary

Activity: 2128

Merit: 1293

There is trouble abrewing

Quote from: mikeywith on May 06, 2020, 05:59:35 AM

1- The clock is off (behind)
2- The clock of the miner who mined the previous block is off (ahead)
3- The miner is doing that on purpose, to either

a- hide the fact that they had enough time to include transactions but refused to do so (this is what people are suspecting), or

b- artificially increase the difficulty by tricking the protocol into acting as if blocks were mined faster than they actually were (no logical reason for any miner to do so)

keeping in mind that a miner first sets the block time in the header and then starts mining it, the reason for differences of a few seconds between block times and real time could simply be that the miner took longer to update the time.

for example the miner could set the time, then start going through the nonces, then after failing to find a good hash change their "extra nonce" and hash again, and so on. then, right before they get around to updating the time, they could find a good hash and release it. so the resulting timestamp is a couple of seconds in the past.

tranthidung

legendary

Activity: 2310

Merit: 4085

Farewell o_e_l_e_o

Quote from: LoyceV on May 06, 2020, 07:15:52 AM

see Bitcoin block data in CSV format.

You are actually a data-mining beast, LoyceV. Thanks for another data set, which is huge this time.

LoyceV

legendary

Activity: 3290

Merit: 16489

Thick-Skinned Gang Leader and Golden Feather 2021

Quote from: tranthidung on May 06, 2020, 06:20:01 AM

Quote from: LoyceV on May 06, 2020, 04:49:50 AM

I just checked (using this work in progress)

It has not yet included block heights. Could you include block heights, please.

I was still working on it, see Bitcoin block data in CSV format.

@mikeywith: is this what you were looking for?

tranthidung

legendary

Activity: 2310

Merit: 4085

Farewell o_e_l_e_o

Quote from: LoyceV on May 06, 2020, 04:49:50 AM

I just checked (using this work in progress)

It has not yet included block heights. Could you include block heights, please.

mikeywith

legendary

Activity: 2464

Merit: 6688

be constructive or S.T.F.U

Quote from: DaCryptoRaccoon on May 06, 2020, 05:08:55 AM

someone had worked out how to manipulate the time to ensure they could mine the next block empty before anyone else even started on the block.

Nope, you cannot start mining block N before everybody else knows about block N-1 (ignoring network delays for certain nodes); block timestamps mean nothing with regard to the chain order.

Quote

So are people now thinking that antpool and bitmain are doing something sketchy to ensure they can mine those empty blocks while the rest of us are trying to play on the un-level playing field

That is not what anybody else is thinking. You see, having negative times between blocks only means 1 of 3 things:

1- The clock is off (behind)
2- The clock of the miner who mined the previous block is off (ahead)
3- The miner is doing that on purpose, to either

a- hide the fact that they had enough time to include transactions but refused to do so (this is what people are suspecting), or

b- artificially increase the difficulty by tricking the protocol into acting as if blocks were mined faster than they actually were (no logical reason for any miner to do so)

DaCryptoRaccoon

hero member

Activity: 1241

Merit: 623

OGRaccoon

This is a very interesting topic. I brought up something about this a long time ago, back in 2018, when I spotted strange things happening in the block explorers.

https://bitcointalksearch.org/topic/m.45156234

My post was actually deleted at the time. (Screen shot 2)

I felt it was a valid question to ask, as it was not something you normally see happening on the chain. It looked to me like someone had worked out how to manipulate the time to ensure they could mine the next block empty before anyone else even started on the block.

So are people now thinking that antpool and bitmain are doing something sketchy to ensure they can mine those empty blocks while the rest of us are trying to play on the un-level playing field?

I was under the impression that most miners used the NTP time pools for their clock source. I know my old miners used NTP and even had the ability to run as an NTP server.

Thanks

Magic

LoyceV

legendary

Activity: 3290

Merit: 16489

Thick-Skinned Gang Leader and Golden Feather 2021

Quote from: pooya87 on May 06, 2020, 01:13:22 AM

which means it will never be negative

I just checked (using this work in progress), and using yesterday's data I found 14,145 blocks that had a time earlier than the previous block.
Examples:
628453 - 2020-05-01 19:02:08
628454 - 2020-05-01 19:02:54
628455 - 2020-05-01 19:02:47 (-7 seconds)
628456 - 2020-05-01 19:02:34 (-13 seconds)
628457 - 2020-05-01 19:13:10

I would have expected miners to synchronize their clocks much more.
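A small sketch of how such a scan could be done, using only the five example rows above as input. The (height, timestamp) tuple layout is an assumption for illustration and not necessarily how the CSV data is laid out.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# The five example blocks listed above, as (height, timestamp) pairs.
blocks = [
    (628453, "2020-05-01 19:02:08"),
    (628454, "2020-05-01 19:02:54"),
    (628455, "2020-05-01 19:02:47"),
    (628456, "2020-05-01 19:02:34"),
    (628457, "2020-05-01 19:13:10"),
]

def negative_gaps(blocks):
    """Return (height, delta_seconds) for blocks timestamped before their parent."""
    found = []
    for (_, prev_ts), (height, ts) in zip(blocks, blocks[1:]):
        delta = datetime.strptime(ts, FMT) - datetime.strptime(prev_ts, FMT)
        seconds = int(delta.total_seconds())
        if seconds < 0:
            found.append((height, seconds))
    return found

print(negative_gaps(blocks))  # [(628455, -7), (628456, -13)]
```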

mikeywith

legendary

Activity: 2464

Merit: 6688

be constructive or S.T.F.U

I am not good at C++ either, but I know it can't be 64-bit, whether signed or unsigned, because 64 bits is 8 bytes and thus it won't fit into the block header, so it must be 32. Quickly skimming the public class code:

Code:

uint32_t nTime{0};

So no negatives here, only 0 and above, all the way to 4294967295, which happens to be equivalent to 02/07/2106 @ 6:28am (UTC) according to https://www.unixtimestamp.com/index.php, so in 86 years some work and a fork will be needed to keep the blockchain going.
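The date is easy to double-check without the website. This is just a quick sketch with the standard library; the function name is made up.

```python
from datetime import datetime, timedelta, timezone

UINT32_MAX = 2**32 - 1  # largest value an unsigned 32-bit nTime can hold

def last_uint32_time() -> datetime:
    """UTC time corresponding to the largest unsigned 32-bit Unix timestamp."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(seconds=UINT32_MAX)

print(UINT32_MAX, last_uint32_time().isoformat())
# 4294967295 2106-02-07T06:28:15+00:00
```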

pooya87

legendary

Activity: 3472

Merit: 10611

i'm the worst at reading C++ but it seems like core is storing the block time (nTime) as an unsigned 32-bit integer[1], and during comparisons (e.g. [2]) the GetBlockTime() method[3] is called, which casts that into a 64-bit signed integer, which means it will never be negative (cast UInt32.Max = 0x00000000ffffffff => always positive)

[1] https://github.com/bitcoin/bitcoin/blob/78dae8caccd82cfbfd76557f1fb7d7557c7b5edb/src/primitives/block.h#L27
[2] https://github.com/bitcoin/bitcoin/blob/master/src/validation.cpp#L350
[3] https://github.com/bitcoin/bitcoin/blob/54f812d9d29893c690ae06b84aaeab128186aa36/src/chain.h#L247-L250

mikeywith

legendary

Activity: 2464

Merit: 6688

be constructive or S.T.F.U

Quote from: pooya87 on May 05, 2020, 10:48:32 PM

(t2-t1<0)?

Given the context, I think it is pretty clear that what I mean is the quoted part. However, you brought up an interesting point:
the timestamps in the header are the seconds elapsed since 1st Jan 1970. Can a Unix timestamp be negative? I don't know.

Let's assume you managed to squeeze in a negative value and kept it at 4 bytes. What would actually happen? Is there any piece of the code that specifically checks for that? The only issue I can see is that the negative value would end up being read as a timestamp way far in the future, which would then be invalidated by this rule.

Code:

MAX_FUTURE_BLOCK_TIME = 2 * 60 * 60
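To make that concrete, here is a rough sketch of what happens to a negative value forced into the 4-byte field. It assumes the field is read back as an unsigned little-endian integer and it simplifies the rule to a comparison against a plain local clock; the helper names and the "now" value are made up for illustration, this is not the actual validation code.

```python
import struct

MAX_FUTURE_BLOCK_TIME = 2 * 60 * 60  # seconds, as quoted above

def as_header_time(signed_value: int) -> int:
    """Reinterpret a signed 32-bit value as the unsigned 4-byte header field."""
    return struct.unpack("<I", struct.pack("<i", signed_value))[0]

def too_far_in_future(block_time: int, now: int) -> bool:
    """Simplified form of the rule: reject times more than 2 hours ahead of now."""
    return block_time > now + MAX_FUTURE_BLOCK_TIME

now = 1588700000  # illustrative clock value, roughly early May 2020
sneaky = as_header_time(-1)
print(sneaky, too_far_in_future(sneaky, now))
```

With these inputs a "negative" timestamp of -1 comes back as 4294967295, which lands in the year 2106 and fails the 2 hour check by a wide margin.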

pooya87

legendary

Activity: 3472

Merit: 10611

Quote from: mikeywith on May 05, 2020, 05:55:27 PM

so timestamps can be negative,

do you mean the difference between two timestamps (t2-t1<0)?
because if not, and if you are indeed talking about the 4 bytes after the merkle root hash in a header, then i would love to see how this works. is it another weird thing in the bitcoin protocol that we let happen?

Topic: Bitcoin's Empty Blocks Analaysis. (Read 870 times)