
Topic: List of all Bitcoin addresses ever used - currently UNavailable on temp location - page 4. (Read 3697 times)

newbie
Activity: 18
Merit: 10
BTW, I think Mega (mega.co.nz) gives you something like 50GB of storage for free.
That's a terrible site, I used it once to download a large file, it forced me to install their program first. So I prefer a VPS.
I'll let you know when it's available.

Update: I got you http://107.191.98.18/addresses_sorted.txt.gz ! It's 19 GB. Please let me know when I can nuke the VPS again.

I've got the file, thank you so much for sharing.
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
Update: I got you http://107.191.98.18/addresses_sorted.txt.gz ! It's 19 GB. Please let me know when I can nuke the VPS again.
"Your download will take ~4 hours to complete"
I get this (in England):
Code:
-                     0%[                    ] 139.10M  33.9MB/s    eta 10m 15s

What bandwidth limitations do you have? I don't just want to make my problem your problem.

I remember I had another offer:
For living we host high-end enterprise, just in case you need some space or mirrors, you’re welcome if ever in need.
Is this offer still valid?
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
IMO, you should upload the file to a GCS/AWS/Azure/Oracle/etc storage bucket, set the permissions to "anyone can access" but set the object so that the "requestor pays" for downloads. This will result in you paying under a dollar per month in storage costs, but anyone who accesses your file will pay a few dollars to get your data in seconds. 

Or you can just ask me nicely and I'll host it on my site's public directory.
copper member
Activity: 1610
Merit: 1898
Amazon Prime Member #7
Update: I got you http://107.191.98.18/addresses_sorted.txt.gz ! It's 19 GB. Please let me know when I can nuke the VPS again.
"Your download will take ~4 hours to complete"

IMO, you should upload the file to a GCS/AWS/Azure/Oracle/etc storage bucket, set the permissions to "anyone can access" but set the object so that the "requestor pays" for downloads. This will result in you paying under a dollar per month in storage costs, but anyone who accesses your file will pay a few dollars to get your data in seconds.

Maintaining a multi-gigabyte file that the public can access for free, an unlimited number of times, is really not feasible.
legendary
Activity: 1568
Merit: 6660
bitcoincleanup.com / bitmixlist.org
BTW, I think Mega (mega.co.nz) gives you something like 50GB of storage for free.
That's a terrible site, I used it once to download a large file, it forced me to install their program first. So I prefer a VPS.

Most cloud storage sites can't handle uploads of files several GB in size without constantly breaking the connection, and they throttle download speeds even more, which makes them unsuitable for downloading those files as well.

Also it's not 50GB of free storage, it's much smaller than that. Most of that free storage is temporary and expires after a year.
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
BTW, I think Mega (mega.co.nz) gives you something like 50GB of storage for free.
That's a terrible site, I used it once to download a large file, it forced me to install their program first. So I prefer a VPS.
I'll let you know when it's available.

Update: I got you http://107.191.98.18/addresses_sorted.txt.gz ! It's 19 GB. Please let me know when I can nuke the VPS again.
Update: link expired.
newbie
Activity: 18
Merit: 10
If you can and it's not too hard for you, version 1 would be awesome!

1. All Bitcoin addresses ever used, in chronological order, without duplicates.
Sample: addresses_in_order_of_first_appearance.txt.gz: (Warning: 18 GB):
Code:

1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa
12c6DSiU4Rq3P4ZxziKxzrL5LmMBrzjrJX
1HLoD9E4SDFFPDiYfNYnkBLQ85Y51J3Zb1
.......
3GFfFQAFgXKiA1qqUK6rqBpEpG4vZDos6t
3Mbtv47gZ2eN6Fy7owpgHHwSLYHS42P56P
38JyF2RQknBUMETyRT2yGndDJFYSp6hJNg

Thanks.

BTW, I think Mega (mega.co.nz) gives you something like 50GB of storage for free.
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
Any news from the provider, when your server is coming back up?
Nope, they're awfully quiet :(

1. All Bitcoin addresses ever used, in chronological order, without duplicates.
Sample: addresses_in_order_of_first_appearance.txt.gz: (Warning: 18 GB):
Code:
1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa
12c6DSiU4Rq3P4ZxziKxzrL5LmMBrzjrJX
1HLoD9E4SDFFPDiYfNYnkBLQ85Y51J3Zb1
.......
3GFfFQAFgXKiA1qqUK6rqBpEpG4vZDos6t
3Mbtv47gZ2eN6Fy7owpgHHwSLYHS42P56P
38JyF2RQknBUMETyRT2yGndDJFYSp6hJNg

2. All Bitcoin addresses ever used, sorted by address, without duplicates.
Sample: addresses_sorted.txt.gz: (Warning: 16 GB):
Code:
1111111111111111111114oLvT2
111111111111111111112BEH2ro
111111111111111111112xT3273
.......
s-ffd80dee5966fb23c1a483b28f6bfcbc
s-fff5d0faa9628c188e97661f0e185fce
s-ffff291613d413b4ac128df96a462294
Which one would you prefer? The sorted version is much more practical for most uses, so unless you have a specific reason to want the addresses in chronological order, I'd say go for the sorted file.
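
As a rough illustration of the difference between the two files, here is a minimal sketch on five made-up lines. The addrA/addrB/addrC names and the temporary directory are placeholders, not real data. The first pipeline is the nl | sort -uk2 | sort -nk1 | cut -f2 idea that appears elsewhere in this thread (without the -S option); the second is a plain sort -u. It assumes bash and GNU coreutils, where sort -u keeps the first line of each run of equal keys.

```shell
#!/usr/bin/env bash
export LC_ALL=C   # byte-wise collation, so the order is the same everywhere

workdir=$(mktemp -d)
# Hypothetical input: addrC and addrA each appear twice.
printf '%s\n' addrC addrA addrC addrB addrA > "$workdir/input.txt"

# 1. Order of first appearance, no duplicates:
#    number the lines, keep one line per address, restore the original
#    order by line number, then drop the numbers again.
chronological=$(nl "$workdir/input.txt" | sort -uk2 | sort -nk1 | cut -f2)

# 2. Sorted by address, no duplicates.
sorted=$(sort -u "$workdir/input.txt")

printf 'chronological:\n%s\n\nsorted:\n%s\n' "$chronological" "$sorted"
rm -r "$workdir"
```

The chronological result keeps addrC first because that is where it first appeared; the sorted result starts with addrA.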
newbie
Activity: 18
Merit: 10
Hi LoyceV,

Any news from the provider, when your server is coming back up?
Do you have a temporary location I can download the files from?

Thanks
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
I can suggest sharing files via torrent. Technically you'd have to upload a lot less since a lot of data will be transferred between people downloading.
I haven't tried that yet for these reasons: I don't want to upload from my desktop, so I still need a VPS. I don't expect many simultaneous downloads, so most of the uploads will still come from me. Every update will make an existing torrent useless again (and I don't want to keep posting new magnet links).

Quote
Linux question

What is the benefit of the two-directional syntax -
Code:
cat <(gunzip -c addresses_sorted.txt.gz) > secondgunzip
instead of just
Code:
gunzip addresses_sorted.txt.gz
or
Code:
gunzip -c addresses_sorted.txt.gz > out.txt
?
I was isolating a part from the longer code:
Code:
This:
<(gunzip -c addresses_sorted.txt.gz)
Came from:
....t -f2) <(comm -13 <(gunzip -c addresses_sorted.txt.gz) <(s....
I didn't edit it so it's clear where it came from.

Quote
or another example:
Same reason :)
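
To make the comparison concrete, here is a minimal sketch (bash only, with a made-up three-line sample file and placeholder file names). It shows that wrapping a single command in cat <( ) gives the same bytes as a plain redirect, and that the <( ) form becomes useful once a command needs a stream where it expects a file name.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
# Hypothetical sample data, gzipped like the real files.
printf '%s\n' addrA addrB addrC | gzip > "$workdir/sample.txt.gz"

# Form 1: process substitution fed to cat.
cat <(gunzip -c "$workdir/sample.txt.gz") > "$workdir/via_procsub.txt"
# Form 2: plain redirect of the same command.
gunzip -c "$workdir/sample.txt.gz" > "$workdir/via_redirect.txt"

if cmp -s "$workdir/via_procsub.txt" "$workdir/via_redirect.txt"; then
  same_output=yes
else
  same_output=no
fi
echo "same output from both forms: $same_output"

# Where <( ) is actually needed: cmp wants two file operands, and one of
# them here is a decompressed stream that never touches the disk.
if cmp -s <(gunzip -c "$workdir/sample.txt.gz") "$workdir/via_redirect.txt"; then
  stream_matches_file=yes
else
  stream_matches_file=no
fi
echo "stream matches file: $stream_matches_file"
rm -r "$workdir"
```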



I am now pretty sure the inconsistent results were caused by using sort -S, --buffer-size=SIZE. I was trying to be smart by enforcing efficient memory usage, but I now believe this sometimes produced an error, which was then piped into the next command. If I omit the -S40% part, it works fine. This is actually good news, because it's much faster than my previous method.
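
One thing that may be worth knowing here (a general bash behaviour, not a confirmed diagnosis of the problem above): the exit status of a command inside <( ) is not reported to the command reading from it, so a stage that fails partway can go unnoticed. A minimal sketch, where the subshell that prints one line and exits with status 3 is a made-up stand-in for a failing stage:

```shell
#!/usr/bin/env bash
# A producer that writes partial output and then fails (hypothetical stand-in).
# Inside <( ), its exit status is lost: only the status of cat is visible.
cat <(echo partial; exit 3) > /dev/null
status_procsub=$?

# The same producer in an ordinary pipeline with pipefail enabled:
# the pipeline now reports the failure of the first stage.
set -o pipefail
status_pipe=0
(echo partial; exit 3) | cat > /dev/null || status_pipe=$?
set +o pipefail

echo "process substitution status: $status_procsub"
echo "pipeline with pipefail status: $status_pipe"
```

Writing intermediate files and checking each command's status, as in the split-up version of the command, is another way to see which stage misbehaves.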
newbie
Activity: 18
Merit: 10
I can suggest sharing files via torrent. Technically you'd have to upload a lot less since a lot of data will be transferred between people downloading.
You can share a magnet link here, seed for some time and then let people share it, just a suggestion.

Linux question

What is the benefit of the two-directional syntax -
Code:
cat <(gunzip -c addresses_sorted.txt.gz) > secondgunzip
instead of just
Code:
gunzip addresses_sorted.txt.gz
or
Code:
gunzip -c addresses_sorted.txt.gz > out.txt
?

or another example:

Code:
cat <(sort -u daily_updates/*.txt) > firstsort

instead of
 
Code:
cat daily_updates/*.txt |sort -u > firstsort

Thanks
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
all links are down.
Sorry, I forgot to post this here:
This server is currently offline. I don't know why (yet).
Still no response from my webhost :(
newbie
Activity: 18
Merit: 10
Hi there, first of all thanks for the nice data source.

Could you check the links? All links are down.
Thanks
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
What can be the cause of this? The first and third run gave the same results, the others are different.
~
I'm still running the same command on the same (old) system, but from a different HDD (with tmpfiles on a different drive too).
I'm now trying the same on a fresh RamNode cloud instance, and have the same problem:
Code:
loyce@160gb:~/alladdresses.loyce.club$ cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) <(cat <(cat <(cat daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2) <(comm -13 <(gunzip -c addresses_sorted.txt.gz) <(sort -u daily_updates/*.txt)) | nl -nln | sort -k2 -S80%) | uniq -df1 | sort -nk1 -S80% | cut -f2) | md5sum
f96f2952151451b88edcf01332ec907d  -
loyce@160gb:~/alladdresses.loyce.club$ cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) <(cat <(cat <(cat daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2) <(comm -13 <(gunzip -c addresses_sorted.txt.gz) <(sort -u daily_updates/*.txt)) | nl -nln | sort -k2 -S80%) | uniq -df1 | sort -nk1 -S80% | cut -f2) | md5sum
70d3d472590b3fb8356348e9fd189ddb  -
I no longer think my very old PC is the problem. As far as I know, these commands should produce the exact same output given the exact same input data. But the data changes somehow.

I realize Bitcointalk is probably not the best forum, but I'm not actively using a more specialized forum, so I post it here.

Update: I tried two more times:
Code:
loyce@160gb:~/alladdresses.loyce.club$ cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) <(cat <(cat <(cat daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2) <(comm -13 <(gunzip -c addresses_sorted.txt.gz) <(sort -u daily_updates/*.txt)) | nl -nln | sort -k2 -S80%) | uniq -df1 | sort -nk1 -S80% | cut -f2) | md5sum
f96f2952151451b88edcf01332ec907d  -
loyce@160gb:~/alladdresses.loyce.club$ cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) <(cat <(cat <(cat daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2) <(comm -13 <(gunzip -c addresses_sorted.txt.gz) <(sort -u daily_updates/*.txt)) | nl -nln | sort -k2 -S80%) | uniq -df1 | sort -nk1 -S80% | cut -f2) | md5sum
d4327e249819af8d025862bd4079d44d  -
I reproduced the same checksum only once. I need booze :P



This was how I started:
For comparison, here's the md5sum for the result from my old code (gunzipped):
Code:
md5sum newchronological.txt
4070c03f974da0ee05ea51084d0f04ac  newchronological.txt
And if I split up my above command string and write some temporary files to disk, I get the exact same (correct) result again:
Code:
cat daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2 > firstcat
cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) > firstgunzip
cat <(sort -u daily_updates/*.txt) > firstsort
cat <(gunzip -c addresses_sorted.txt.gz) > secondgunzip
cat <(comm -13 secondgunzip firstsort) > firstcomm
#cat <(cat firstcat firstcomm | nl -nln | sort -k2 -S80%) > secondsort
cat <(cat firstcat firstcomm | nl -nln | sort -k2) > secondsort
cat <(cat secondsort | uniq -df1 | sort -nk1 | cut -f2) > thirdsort
cat firstgunzip thirdsort | md5sum
4070c03f974da0ee05ea51084d0f04ac  -



It gets weirder: I now can't even reproduce the same weird problem again, so I can't know whether or not my changes fixed it.
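
For anyone following the command, the comm -13 step is the part that picks out addresses that are in the daily updates but not yet in the sorted list. A minimal sketch with made-up placeholder lines (addrA to addrE are not real addresses, and the file names are hypothetical). Both inputs to comm must already be sorted in the same collation order, which is why LC_ALL is fixed here.

```shell
#!/usr/bin/env bash
export LC_ALL=C   # same collation for sort and comm

workdir=$(mktemp -d)
# Hypothetical "already known" list, sorted.
printf '%s\n' addrA addrB addrD > "$workdir/known_sorted.txt"
# Hypothetical daily update, unsorted and with a duplicate.
printf '%s\n' addrE addrB addrC addrB > "$workdir/daily_update.txt"

# -1 hides lines only in the first input, -3 hides lines common to both,
# so what remains is the lines found only in the second input.
new_addresses=$(comm -13 "$workdir/known_sorted.txt" <(sort -u "$workdir/daily_update.txt"))

printf 'new addresses:\n%s\n' "$new_addresses"
rm -r "$workdir"
```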
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
Your URL alladdresses.loyce.club/?C=M;O=D is not working. Can you please check?
I still haven't found a new host for this, so it's still on its temporary location:
I've uploaded the latest version to a temporary location: blockdata.loyce.club/alladdresses/.
The latest update was 2.5 months ago (because of weird problems).

Thanks for the reminder though, I'll do some more testing to get updates working again.
newbie
Activity: 16
Merit: 8
Hi LoyceV,
Thanks for the nice resource of data!

Your URL alladdresses.loyce.club/?C=M;O=D is not working. Can you please check?

Background
To follow up on List of all Bitcoin addresses with a balance and this post, I made a list of all Bitcoin addresses that have ever been used.

The data
See alladdresses.loyce.club (new location)
I now have the resources (RAM, CPU power and disk space) and code to show unique addresses in their original order. Each address is only shown once. I have 2 large files:

legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
What can be the cause of this? The first and third run gave the same results, the others are different.
Code:
cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) <(cat <(cat <(cat daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2) <(comm -13 <(gunzip -c addresses_sorted.txt.gz) <(sort -u daily_updates/*.txt)) | nl -nln | sort -k2 -S80%) | uniq -df1 | sort -nk1 -S80% | cut -f2) | md5sum
7d2f923c7ce1d9534629b4502c37680d  -

cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) <(cat <(cat <(cat daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2) <(comm -13 <(gunzip -c addresses_sorted.txt.gz) <(sort -u daily_updates/*.txt)) | nl -nln | sort -k2 -S80%) | uniq -df1 | sort -nk1 -S80% | cut -f2) | md5sum
cede1315137bb4a2ab20c5438e4525ba  -

cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) <(cat <(cat <(cat daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2) <(comm -13 <(gunzip -c addresses_sorted.txt.gz) <(sort -u daily_updates/*.txt)) | nl -nln | sort -k2 -S80%) | uniq -df1 | sort -nk1 -S80% | cut -f2) | md5sum
112100c359f74c0e60b95afa92de990d  -

cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) <(cat <(cat <(cat daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2) <(comm -13 <(gunzip -c addresses_sorted.txt.gz) <(sort -u daily_updates/*.txt)) | nl -nln | sort -k2 -S80%) | uniq -df1 | sort -nk1 -S80% | cut -f2) | md5sum
1bc7138bb4a367c117002234a604d444  -

cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) <(cat <(cat <(cat daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2) <(comm -13 <(gunzip -c addresses_sorted.txt.gz) <(sort -u daily_updates/*.txt)) | nl -nln | sort -k2 -S80%) | uniq -df1 | sort -nk1 -S80% | cut -f2) | md5sum
cc91ef352ffa1e641a7c47dcc3d743f3  -
I'm still running the same command on the same (old) system, but from a different HDD (with tmpfiles on a different drive too).
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
Quoting myself:
Code:
Old code:
time cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2 | gzip > newchronological.txt.gz

Code:
New code:
time cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) <(cat <(cat <(cat daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2) <(comm -13 <(gunzip -c addresses_sorted.txt.gz) <(sort -u daily_updates/*.txt)) | nl -nln | sort -k2 -S80%) | uniq -df1 | sort -nk1 -S80% | cut -f2) > new.alladdresses_chronological.txt
But it's wrong
I tried again:
Code:
time cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2 > oldcode.txt
time cat <(gunzip -c addresses_in_order_of_first_appearance.txt.gz) <(cat <(cat <(cat daily_updates/*.txt | nl | sort -uk2 -S40% | sort -nk1 -S40% | cut -f2) <(comm -13 <(gunzip -c addresses_sorted.txt.gz) <(sort -u daily_updates/*.txt)) | nl -nln | sort -k2 -S80%) | uniq -df1 | sort -nk1 -S80% | cut -f2) > newcode.txt
The files were too large to diff, so I split them in parts (10 million lines each). There are a few differences:
Code:
< 3BS6oQKHDwrzz4RC69iAbSV13xpbGZvXLj
---
> 3BS6oQKHDwrzz4SC69iAbSV13xpbGZvXLj

< 17Q7LN9nCmS6HdjkDj3C4MdhduFobGp4hv
---
> 17Q7LN9nCmS6HdjkDk3C4MdhduFobGp4hv

< 1rVH156qu1djPVFGoKaZ29Kw8zEpmh283
9863597a9863597
> 1rVH156qu1djPVFGoKaZ29Kw8zEpmh283

< 3Kw9pkLTLExTd9LZW2qbbNUdZRpUW3JTac
---
> 3Kw9pkLTLExTd9LZW2qbbNUdZSpUW3JTac

< 1Q7NSpgjxDHTTPUkGskTNDioCYw6MQazBG
---
> 1Q7NSpgjyDHTTPUkGskTNDioCYw6MQazBG

< 331xujHAg6AGvKzPwUKZ9AJxukaemCXeRw
8496039c8496038
< 3PLoa4ccMdxyGY6mAEStSu45xwqdRftd1b

> 3PLoa4ccMdxyGY6mAEStSu55xwqdRftd1b
10000000a10000000
> 3B92y4bFFPZvjviNhtLWeBKoYXmHVwr3CD

< 3B92y4bFFPZvjviNhtLWeBKoYXmHVwr3CD
7735278a7735278
> 331xujHAg6AGvKzPwUKZ9AJxukaemCXeRw

Let's highlight this one:
Quote
< 1Q7NSpgjxDHTTPUkGskTNDioCYw6MQazBG
---
> 1Q7NSpgjyDHTTPUkGskTNDioCYw6MQazBG
The first one (with x) is correct.

I have no idea what causes this. I'm now checking if I can reproduce the exact same data change, or that it's caused by hardware failure.
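
The split-and-compare approach described above can be sketched roughly like this. It is a toy version with seven made-up lines and chunks of 3 lines instead of 10 million, and the file names are placeholders. Note that chunk-by-chunk comparison only lines up while both files have the same number of lines before the difference; an inserted or missing line shifts every later chunk.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
# Two hypothetical outputs that differ in exactly one line (e versus X).
printf '%s\n' a b c d e f g > "$workdir/oldcode.txt"
printf '%s\n' a b c d X f g > "$workdir/newcode.txt"

# Cut both files into chunks of 3 lines: *.part.aa, *.part.ab, *.part.ac
split -l 3 "$workdir/oldcode.txt" "$workdir/oldcode.part."
split -l 3 "$workdir/newcode.txt" "$workdir/newcode.part."

# Compare matching chunks and remember which ones differ.
differing=""
for old_part in "$workdir"/oldcode.part.*; do
  suffix=${old_part##*.}
  if ! cmp -s "$old_part" "$workdir/newcode.part.$suffix"; then
    differing="$differing $suffix"
  fi
done

echo "chunks that differ:$differing"
rm -r "$workdir"
```

Only the differing chunks then need to be passed to diff, which keeps its memory use small.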
copper member
Activity: 1610
Merit: 1898
Amazon Prime Member #7
Resulting in O(n + k log k + 2k). In this particular case one might even argue that n > k log k + 2k, therefore O(2n) = O(n). However, it's late here and I don't like to argue.

You only need enough memory to keep the new addresses in memory and enough disk space to keep both the new and old version on disk at the same time.

You are correct. I had not considered updating the list incrementally instead of from scratch. Accessing individual lines of a file on disk can degrade performance, so it is worth weighing whether paying for more RAM would be cost-effective given the additional time needed to update the list.
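
A minimal sketch of that idea for the chronological list, assuming an awk hash is acceptable for the new batch. This is a swapped-in technique for illustration, not the sort-based pipeline used in this thread, and the file names and addrA-style lines are placeholders. Only the new batch is held in memory; the old list is streamed once and the result is written to a separate file, so old and new versions sit on disk side by side.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
# Hypothetical existing list (n lines) and new batch (k lines, with repeats).
printf '%s\n' addrA addrB addrC > "$workdir/old_chronological.txt"
printf '%s\n' addrB addrD addrD addrE addrA > "$workdir/new_batch.txt"

awk '
  # First file (the new batch): remember each address once, in order.
  FILENAME == ARGV[1] {
    if (!($0 in pending)) { pending[$0] = 1; order[++count] = $0 }
    next
  }
  # Second file (the old list): pass it through unchanged and forget any
  # batch address that turns out to be known already.
  { print; delete pending[$0] }
  # Finally append the batch addresses never seen in the old list.
  END {
    for (i = 1; i <= count; i++)
      if (order[i] in pending) print order[i]
  }
' "$workdir/new_batch.txt" "$workdir/old_chronological.txt" > "$workdir/updated.txt"

updated=$(cat "$workdir/updated.txt")
printf '%s\n' "$updated"
rm -r "$workdir"
```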
newbie
Activity: 29
Merit: 50
Just to let you guys know, I updated my bitcoin-all-addresses list on 2021 Jan 19.
It is available in my GitHub repo https://github.com/mountaineerbr/bitcoin-all-addresses
All addresses are uniquely printed in the order they first appeared in Blockchair output dumps.

I was able to reproduce my methodology 6 months after the first lists.
The methodology is described in the README of the git repo,
and some of the code I used is here: https://github.com/mountaineerbr/bitcoin-all-addresses/blob/master/blockchair.btcoutputs.process.sh

If you export LANG=C and LC_ALL=C, sorting will be faster, and since we are dealing
only with base58 and SegWit addresses (plain ASCII), that should be OK.
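
A minimal sketch of what that looks like in practice, with made-up placeholder strings rather than real addresses. LC_ALL takes precedence over LANG, so setting LC_ALL=C for the sort command alone is enough. In the C locale lines are compared byte by byte, so digits come before uppercase letters, which come before lowercase letters.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
# Hypothetical address-like strings, one of them duplicated.
printf '%s\n' bc1qexample 3Mexample 1Aexample bc1Qexample 3Mexample > "$workdir/sample.txt"

# Byte-wise sort with duplicates removed; no locale tables are consulted.
c_sorted=$(LC_ALL=C sort -u "$workdir/sample.txt")

printf '%s\n' "$c_sorted"
rm -r "$workdir"
```

If the sorted file is later fed to comm or join, those commands should run under the same LC_ALL=C setting, since they expect input in the collation order of the current locale.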