Topic: List of all Bitcoin addresses with a balance (Read 9297 times)

legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
January 03, 2025, 12:51:18 PM
Quote
We were talking about that in our private messages.
Thank you for pointing this out :)

Quote
Maybe the sort command should be prefixed with LC_ALL=C or LC_ALL=C.UTF-8, so the sort order is the same on all systems (it should work like that)
I'll wait to see if someone responds with a good reason to keep things the way they are. If not, I think I'll go for LC_ALL=C.

Quote
Because systems, servers and OSes differ, we should always specify the sort order explicitly for each sort command (LC_ALL...)
I agree. I just didn't know about the difference, and (before my dedicated server disappeared) never stumbled upon this problem.

Quote
If we change that now, we may break people's scripts, but we should settle on one sort order for good; that's how it should be from an engineering point of view
Let's say give it 2 weeks. But I guess most people don't read this topic until after I've broken their script by changing things :P

Quote
In the sorted file, the first screenful already shows that the order differs depending on the system or the given LC_ALL; it is visible to the naked eye that the addresses are sorted differently (mainly, lowercase and uppercase letters are ordered differently)
Here's the difference:
Code:
11111111111111111111HV1eYjP
11111111111111111111HeBAGj
11111111111111111111QekFQw
11111111111111111111UpYBrS
11111111111111111111g4hiWR
11111111111111111111jGyPM8
11111111111111111111o9FmEC
11111111111111111111ufYVpS
vs:
Code:
11111111111111111111g4hiWR
11111111111111111111HeBAGj
11111111111111111111HV1eYjP
11111111111111111111jGyPM8
11111111111111111111o9FmEC
11111111111111111111QekFQw
11111111111111111111ufYVpS
11111111111111111111UpYBrS
That is annoying to deal with!



This can of course easily be avoided by sorting the data on your local system before using it. For this project, it's quite easy. But for all Bitcoin addresses ever used, it can take hours to sort the data.
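
A minimal sketch of sorting locally in one fixed order, assuming a POSIX or GNU sort. It sorts the eight sample addresses from the comparison above in byte order (LC_ALL=C). The helper name sort_c is made up for this example and is not taken from the site's scripts.

```shell
#!/usr/bin/env bash
# Sketch: force byte-order (C locale) sorting so the result is the same on every system.
# sort_c is a hypothetical helper name.
sort_c() {
    LC_ALL=C sort "$@"
}

# The eight sample addresses from the comparison, in arbitrary order.
sample='11111111111111111111g4hiWR
11111111111111111111HeBAGj
11111111111111111111ufYVpS
11111111111111111111HV1eYjP
11111111111111111111jGyPM8
11111111111111111111UpYBrS
11111111111111111111o9FmEC
11111111111111111111QekFQw'

sorted=$(printf '%s\n' "$sample" | sort_c)
printf '%s\n' "$sorted"
```

In byte order, all uppercase letters sort before all lowercase letters, so this prints the addresses in the order of the first listing above. For the real file, something like "gunzip -c Bitcoin_addresses_LATEST.txt.gz | LC_ALL=C sort -S20%" should give the same order on any machine.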
full member
Activity: 297
Merit: 133
Quote
It was brought to my attention that my "sort" is "different" now, and I got these results testing:

Code:
cat Bitcoin_addresses_LATEST.txt.gz | gunzip | sha256sum
df0baad2301e9b897a02bd3fccb115968c82eb3956143e2f5b4c3ad7b2c227bf  -

So far so good.
Now, this file is sorted on my server from a cronjob. But when I sort it on my local computer, I get this:
Code:
cat Bitcoin_addresses_LATEST.txt.gz | gunzip | sort -S20% | sha256sum
27c2541369d0546ec7c7e70d09d807d8fc6d39435f8857e5ebbf8386584be2d2  -

Has anyone else noticed an incompatible sorting method? Should I change this to a different sorting? Or would that break scripts from people who are currently using it?

We were talking about that in our private messages.

My suggestions:
  • Use pv instead of cat so you can see progress; it won't affect the result
  • Maybe the sort command should be prefixed with LC_ALL=C or LC_ALL=C.UTF-8, so the sort order is the same on all systems (it should work like that)
  • Because systems, servers and OSes differ, we should always specify the sort order explicitly for each sort command (LC_ALL...)
  • If we change that now, we may break people's scripts, but we should settle on one sort order for good; that's how it should be from an engineering point of view
  • In the sorted file, the first screenful already shows that the order differs depending on the system or the given LC_ALL; it is visible to the naked eye that the addresses are sorted differently (mainly, lowercase and uppercase letters are ordered differently)
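
To tell which order a given file is in without looking at it by eye, something like the sketch below could be used. It relies on sort -c, which only checks the order and sorts nothing. The helper name is_c_sorted is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: check whether the lines on stdin are already in byte (C locale) order.
# is_c_sorted is a hypothetical helper name.
is_c_sorted() {
    # sort -c exits 0 when the input is in order and non-zero at the first
    # out-of-order line; its diagnostic message is discarded here.
    LC_ALL=C sort -c 2>/dev/null
}

# Usage (not run here, needs the downloaded file):
# if gunzip -c Bitcoin_addresses_LATEST.txt.gz | is_c_sorted; then
#     echo "sorted in byte order"
# else
#     echo "sorted in some other locale, or not sorted"
# fi
```

If the check fails, re-sorting locally with LC_ALL=C gives a predictable order regardless of how the server sorted the file.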
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
It was brought to my attention that my "sort" is "different" now, and I got these results testing:

Code:
cat Bitcoin_addresses_LATEST.txt.gz | gunzip | sha256sum
df0baad2301e9b897a02bd3fccb115968c82eb3956143e2f5b4c3ad7b2c227bf  -

So far so good.
Now, this file is sorted on my server from a cronjob. But when I sort it on my local computer, I get this:
Code:
cat Bitcoin_addresses_LATEST.txt.gz | gunzip | sort -S20% | sha256sum
27c2541369d0546ec7c7e70d09d807d8fc6d39435f8857e5ebbf8386584be2d2  -

Has anyone else noticed an incompatible sorting method? Should I change this to a different sorting? Or would that break scripts from people who are currently using it?
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
It's back!
My site addresses.loyce.club/ is back online. For now, there's only the most recent snapshot. During the next year, I'll keep more snapshots again.

Direct link to LATEST versions
blockchair_bitcoin_addresses_and_balance_LATEST.tsv.gz (currently 1.5GB)
Bitcoin_addresses_LATEST.txt.gz (currently 1.3GB)

Bandwidth
Starting December 2024, I have a new VPS. This server is allowed 16 TB bandwidth per month. Enjoy!



I've received a few PMs from people who missed my data. It's always good to see it fills a need.
Mek
jr. member
Activity: 75
Merit: 7
mtc.mekweb.eu - mega transistor clock
I still have the "latest" txt file on my disk, list of addresses without balances. Date modified says 29 Nov 2024. Would that help?
I'd appreciate having it. Anywhere I can download it?
In retrospect, I should have made backups of this too. I will next time; I now regret losing monthly snapshots from the past year.
Sure, link sent via PM :)
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
I still have the "latest" txt file on my disk, list of addresses without balances. Date modified says 29 Nov 2024. Would that help?
I'd appreciate having it. Anywhere I can download it?
In retrospect, I should have made backups of this too. I will next time; I now regret losing monthly snapshots from the past year.
Mek
jr. member
Activity: 75
Merit: 7
mtc.mekweb.eu - mega transistor clock
I still have the "latest" txt file on my disk, list of addresses without balances. Date modified says 29 Nov 2024. Would that help?
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
To get a good list of addresses with balances, I used a script in Python.
That's a long way to turn the list with balances into the list without balances (which I had on my site already). But it doesn't fix the problem that they're still unavailable.

I'm still adding data to my new server. When I'm done, I'll see if I can get this list from Bitcoin Core myself.
full member
Activity: 297
Merit: 133
To get a good list of addresses with balances, I used a script in Python.
If you need a fast solution, this file works well for checking other addresses against it.
Both files must be sorted (in the same sort order); then use "comm -12" to find the addresses that are in both files.

Code:
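#!/usr/bin/env bash
# Requires aria2 and pv (e.g. apt install aria2 pv)
echo Downloading...
# Download using up to 4 connections
aria2c -x4 http://addresses.loyce.club/blockchair_bitcoin_addresses_and_balance_LATEST.tsv.gz
echo Unpacking...
# pv shows progress while gunzip decompresses
pv blockchair_bitcoin_addresses_and_balance_LATEST.tsv.gz | gunzip > blockchair_bitcoin_addresses_and_balance_LATEST.tsv
echo Sorting...
# Keep only the first (address) column, then sort and remove duplicates
pv -B 1M -cN input blockchair_bitcoin_addresses_and_balance_LATEST.tsv | cut -f 1 | sort -u --parallel=16 | pv -cN output > addrs-with-bal.txt
echo Done!
# Ring the terminal bell (written to stderr) when finished
>&2 echo -ne "\a"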
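
The "comm -12" step mentioned above could look roughly like this. It is only a sketch: the helper name common_lines and the file name my-addrs.txt are made up for the example, and both inputs are re-sorted in byte order (LC_ALL=C) to make sure comm sees the order it expects.

```shell
#!/usr/bin/env bash
# Sketch: print the lines that occur in both of two files, using comm -12.
# common_lines and the demo file names are hypothetical.
common_lines() {
    # comm needs both inputs sorted in the locale it runs in,
    # so sort both in byte order and run comm in byte order too.
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || return 1
    LC_ALL=C sort -u "$1" > "$tmp_a"
    LC_ALL=C sort -u "$2" > "$tmp_b"
    LC_ALL=C comm -12 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}

# Tiny demo using a few sample addresses from this topic.
workdir=$(mktemp -d)
printf '%s\n' 11111111111111111111o9FmEC 11111111111111111111HeBAGj \
    11111111111111111111g4hiWR > "$workdir/addrs-with-bal.txt"
printf '%s\n' 11111111111111111111g4hiWR 1ExampleAddressThatIsNotInTheList \
    11111111111111111111HeBAGj > "$workdir/my-addrs.txt"

matches=$(common_lines "$workdir/my-addrs.txt" "$workdir/addrs-with-bal.txt")
printf '%s\n' "$matches"
rm -rf "$workdir"
```

For the full list, re-sorting the big file every time is wasteful. If addrs-with-bal.txt is already known to be in byte order, only the smaller list needs sorting before calling comm.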
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
Update:
I have a new server for this project. But... I didn't keep a backup of the actual data. There was just a lot of data, so I kinda skipped this one :( I have my script to download and process it, but the page design is hard to recover, and my year of monthly snapshots is gone.
To make matters worse, gz.blockchair.com, where I get the data, was last updated on November 30 and hasn't been updated since. This has happened before and usually they come back, but for now I have no data.
If it takes too long I'll have to figure out how to get this data from Bitcoin Core myself.
If anyone happens to have some old backups, please let me know :)
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
Still working for me.  https://loyce.club/
That's on AWS, still going strong. "This topic" is on my dedicated Xeon server:
The data
See addresses.loyce.club. I keep 18 snapshots of Blockchair's daily data.
Vod
legendary
Activity: 3668
Merit: 3010
Licking my boob since 1970
It looks like my server is gone :( I'm waiting for a response from my sponsor.

Still working for me.  https://loyce.club/
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
It looks like my server is gone :( I'm waiting for a response from my sponsor.
Mek
jr. member
Activity: 75
Merit: 7
mtc.mekweb.eu - mega transistor clock
I noticed my bandwidth consumption dropped from 20 to 14 TB/month ;) But now that I think about it, my memory may be incorrect. I think it used to be 10 TB/month, which means it increased. I'm not so sure anymore, I don't really keep track.
Possibly because there were many days without an update. No need to redownload old files.
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
Hahaha, is this post something that really needs a bump?
I noticed my bandwidth consumption dropped from 20 to 14 TB/month ;) But now that I think about it, my memory may be incorrect. I think it used to be 10 TB/month, which means it increased. I'm not so sure anymore, I don't really keep track.

Quote
After that, users always ask how to process, edit or manage those big files... most of them don't know any Linux environment or commands :-/
If you know an equivalent that works in Windows, please post it.

Quote
Anyway thanks for your contributions to this community.
No problem :)
hero member
Activity: 862
Merit: 662
Bump

Hahaha, is this post something that really needs a bump?

Whenever a user asks me for a list of addresses with a balance, I suggest they search Google for "LoyceV addresses with balance", and it always works.

After that, users always ask how to process, edit or manage those big files... most of them don't know any Linux environment or commands :-/

Anyway thanks for your contributions to this community.
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
It should be something like this (not tested):
I would wget the .gz file, pipe it through gunzip, and go from there. That way you only need to download a fraction of the file if you kill it once you reach amounts lower than the desired minimum.
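
A rough sketch of that idea. It assumes what is said elsewhere in this topic: the file is tab-separated with one header line, has the address in column 1 and the balance in satoshi in column 2, and is sorted by highest balance first. The helper name rich_addresses and the output file name are made up.

```shell
#!/usr/bin/env bash
# Sketch: print addresses until the balance drops below a minimum, then stop reading.
# Assumptions: tab-separated input, one header line, address in column 1,
# balance in satoshi in column 2, sorted by highest balance first.
# rich_addresses is a hypothetical helper name.
rich_addresses() {
    awk -F'\t' -v min="$1" 'NR > 1 { if ($2 + 0 < min + 0) exit; print $1 }'
}

# Usage (not run here, needs network access):
# wget -qO- http://addresses.loyce.club/blockchair_bitcoin_addresses_and_balance_LATEST.tsv.gz \
#     | gunzip | rich_addresses 100000000 > rich-list.txt
```

Once awk exits, gunzip and wget should get a broken pipe and stop, so only the top part of the file is downloaded. Here 100000000 satoshi is 1 BTC.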
newbie
Activity: 1
Merit: 0
It should be something like this (not tested):

Quote
awk -F'\t' 'NR>1 && $2>=100000000{print $1}' blockchair_bitcoin_addresses_and_balance_LATEST.tsv > blockchair_bitcoin_addresses_selected.txt

The probability of matching such a small number of addresses from the entire set by brute force is so small that you would have to be extremely lucky to find even an address with 1 satoshi in it. Asking for a reduced number of addresses tells me you have weak hardware. That is not what you need for this task. You may need all the computing power available on Earth for many years.
legendary
Activity: 2352
Merit: 6089
bitcoindata.science
Hey guys,
can we have a list of only addresses with more than 1 BTC - the so called Rich List.
I need one because the full list is more than 2GB now and that tends to slow some programs down. A lightweight version would be really nice, but I can't find any.
Much appreciated.

You can use Python and the pandas library. It can open very large files and sort and slice the data easily.
Ask ChatGPT for help, it is easy. But you won't make any money brute-forcing them.
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
can we have a list of only addresses with more than 1 BTC - the so called Rich List.
I need one because the full list is more than 2GB now and that tends to slow some programs down.
So you want to steal millions by brute-forcing Bitcoin addresses, but you can't figure out how to download blockchair_bitcoin_addresses_and_balance_LATEST.tsv.gz (which is sorted by highest balance first) and use only the top of this file?