Author

Topic: All used addresses (Read 738 times)

newbie
Activity: 29
Merit: 50
August 24, 2020, 06:44:49 AM
#29
I wrote up the methodology for my address list in more detail in my GitHub repo.
Here is how I did the sorting:

Code:
$ export TMPDIR='/large/tmp/dir'
$ export LC_ALL=C
$ nl concat.txt | sort -k2 -u | sort -n | cut -f2 > final.txt
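# nl numbers each line; sort -k2 -u keeps the first occurrence of each address;
# sort -n restores the original order; cut -f2 drops the line numbers again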

Note that using LC_ALL=C will greatly speed up sorting, because sort then compares raw bytes instead of applying locale collation rules!
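A quick way to see the difference for yourself (just a sketch; big.txt stands for any large text file you have around):
Code:
$ time sort big.txt > /dev/null            # with your default (usually UTF-8) locale
$ time LC_ALL=C sort big.txt > /dev/null   # plain byte comparison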
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
August 24, 2020, 06:44:38 AM
#28
Code:
cat -n input.txt | sort -uk2 | sort -nk1 | cut -f2- > output.txt
This looks genius in its simplicity! It worked on a small sample; I'm currently transferring and extracting 31 GB of data to do the full test. There's no way the double sort would fit on the 100 GB VPS, but this should work without using a lot of RAM.
I'll continue posting my test results in my own topic: List of all Bitcoin addresses ever used.
legendary
Activity: 1624
Merit: 2481
August 24, 2020, 06:13:58 AM
#27
I want to remove duplicate lines without changing the order, so only keeping the first occurrence.

If I am not mistaken, the following should work:
Code:
cat -n input.txt | sort -uk2 | sort -nk1 | cut -f2- > output.txt

None of these commands needs to hold the file in memory all at once.
But as mentioned previously, sort does need quite some disk space to create temporary files. So that might be a bottleneck, depending on your system specs.
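If the temporary files are the bottleneck, GNU sort accepts -T to put them on a bigger partition (and -S to cap the in-memory buffer); the path below is just a placeholder:
Code:
$ cat -n input.txt | sort -T /mnt/bigdisk/tmp -uk2 | sort -T /mnt/bigdisk/tmp -nk1 | cut -f2- > output.txt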
hero member
Activity: 1659
Merit: 687
LoyceV on the road. Or couch.
August 24, 2020, 05:43:44 AM
#26
I might have just misunderstood the problem; could you elaborate on the actual issue?
I want to remove duplicate lines without changing the order, so only keeping the first occurrence.

Say:
A
G
D
A
B
C
D

I want to keep:
A
G
D
B
C
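For reference, the cat -n | sort -uk2 | sort -nk1 | cut -f2- approach from the reply above gives exactly that on this sample (GNU coreutils; sort -u keeps the first line of each run of equal keys):
Code:
$ printf '%s\n' A G D A B C D > sample.txt
$ cat -n sample.txt | sort -uk2 | sort -nk1 | cut -f2-
A
G
D
B
C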
legendary
Activity: 1624
Merit: 2481
August 24, 2020, 05:10:57 AM
#25
The sort command can keep chronological order without using much RAM
How? I haven't found that option.

The sort command only takes roughly 50% of your available RAM.
If you are running out of memory using sort on a large file, most likely there isn't enough space on your hard drive.

What sort does, when the file is larger than your available RAM, is create temporary files on the hard drive, which are then merge-sorted at the end.
So the overall disk capacity needed is roughly three times the size of the file (if you keep the original) or two times (if you overwrite the original file).
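If that two- to three-fold disk overhead is the problem, GNU sort can also compress its temporary files, trading CPU time for disk space (a sketch, not a command from this thread):
Code:
$ sort --compress-program=gzip huge.txt > huge.sorted.txt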


I might have just misunderstood the problem; could you elaborate on the actual issue?
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
August 24, 2020, 04:59:41 AM
#24
The sort command can keep chronological order without using much RAM
How? I haven't found that option.
newbie
Activity: 29
Merit: 50
August 23, 2020, 05:06:59 PM
#23
I made List of all Bitcoin addresses ever used.

One feature of my lists is that I tried to keep the original order in which addresses first appeared in the Blockchair dumps.
It works with awk:
Code:
awk '!a[$0]++'
But this requires far too much memory. I can use it on the per-day data, but not on all data.
So for now, I gave up trying to keep addresses in chronological order. I'll keep the original data in case I find a different solution (or enough RAM) later.

Hey, that is very nice, bro.
You have got the means, you work hard, and you are very good at it.
I knew that awk one-liner you wrote, though I tried using Perl because I thought it might need less RAM.
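(For reference, the Perl equivalent would be something along these lines; it keeps the same hash of seen lines in memory, so it does not really save RAM:)
Code:
$ perl -ne 'print unless $seen{$_}++' input.txt > output.txt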
The sort command can keep chronological order without using much RAM, but it needs a large temp directory (/tmp will not work if it is a RAM-backed tmpfs, since its size is limited by the system's RAM).
OK, cheers!
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
August 03, 2020, 03:27:11 AM
#22
I made List of all Bitcoin addresses ever used.

One feature of my lists is that I tried to keep the original order in which addresses first appeared in the Blockchair dumps.
It works with awk:
Code:
awk '!a[$0]++'
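# prints a line only the first time it is seen; the array a holds every unique line, so memory grows with the number of unique addresses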
But this requires far too much memory. I can use it on the per-day data, but not on all data.
So for now, I gave up trying to keep addresses in chronological order. I'll keep the original data in case I find a different solution (or enough RAM) later.
newbie
Activity: 29
Merit: 50
July 20, 2020, 07:44:45 PM
#21
Please donate to me instead (LOL), I never received a donation, lol.
No, I am not selling data.
[It will take me some days (not many, though) to upload everything to Git as of today.]

And I think this is a much more powerful proof that we are NOT really interested in money, though some private companies may be.


Only one-liners? Yeah... go learn some more.
@loyce: you didn't post the resulting lists, so how can anybody be sure about them?
@windows/osx people: please remove Windows or macOS.
newbie
Activity: 29
Merit: 50
July 20, 2020, 02:03:27 PM
#20
Really pleased with GNU sort (that is called the genius sort)..

This ought to be about right.. I will try to upload the split unique-address list files to Git.

One feature of my lists is that I tried to keep the original order in which addresses first appeared in the Blockchair dumps..

You should not do it all at once if you have got the disk space.. I have produced some intermediate files for processing..


Dump files from Blockchair (4,210 files), from 2009-01-03 to 2020-07-18:

Total addresses: 1,483,853,800

Unique addresses: 692,773,144

https://github.com/mountaineerbr/bitcoin-all-addresses
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
July 20, 2020, 03:23:10 AM
#19
Answering this deleted post, I'm currently running this:
Code:
for file in *.tsv.gz; do gunzip -c "$file" | grep -v is_from_coinbase | cut -f 7 >> /tmp/addresses.txt; done
It takes a while, but it doesn't consume a lot of memory. When done, I'll sort | uniq the file and get the result. Now that I think about it, I could have piped the whole thing through sort on the same line right away.
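That combined version would look something like this (a sketch of what is described, not the exact command that was run):
Code:
$ for file in *.tsv.gz; do gunzip -c "$file" | grep -v is_from_coinbase | cut -f 7; done | sort -u > /tmp/addresses-sorted.txt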

When I have a bit more time, I'll create daily updates for all used Bitcoin addresses and make it available for download. It's going to be a big file though.



Blockchair's /bitcoin/outputs currently takes 106 GB, and grows by 2 GB per month. At 100 KB/s, it takes just under 2 weeks to download from Blockchair. For $32 per year, I can run a VPS in Germany with a 1 Gbit/s connection, enough disk space to keep up for a few years, and enough bandwidth to allow 9 downloads per month. If anyone can use this, let me know.



The list with all addresses is 49 GB in size. If you tried to load it into RAM, that's probably why you ran out of memory.
Total address count: 1,484,589,749
1... address count: 1,039,899,708
3... address count: 343,485,961
bc1q... address count: 55,006,904
...-... (with a "dash") address count: 46,197,161

Unique address count:
1... address count: 470,943,308
3... address count: 167,941,821
bc1q... address count: 39,137,878
...-... (with a "dash") weird address count: 15,157,808

And here it stops for now: after processing data for 5 hours, I made a mistake and accidentally overwrote my end-result. I'll restart later.
I'd like to see which address has received the most transactions.
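A crude way to get that from the file built above (it counts received outputs per address, which is close to, but not exactly, a transaction count):
Code:
$ sort /tmp/addresses.txt | uniq -c | sort -rn | head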
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
July 16, 2020, 01:08:37 PM
#18
Sorry i can't repay.
No problem. I'll run out of disk space on the VPS soon though, so I'll have to start deleting older files.
Judging by bandwidth consumption, someone else downloaded the data already. I'll also run out of monthly bandwidth if only a few people download all 84 GB, but since I have no other use for the VPS, that's okay too.

Quote
Here is a shell script I'm using for downloading these files (requires curl and bash)..
https://github.com/mountaineerbr/scripts/blob/master/blockchair.btcblockhain.outputs.sh
Mine is a lot shorter, usually I type everything on just one line. I still have to adjust it for processing the data, but I'll do that when it's complete.
newbie
Activity: 29
Merit: 50
July 16, 2020, 08:22:01 AM
#17
Hey LoyceV

Thank you very much for your effort in uploading those dumps..
Sorry I can't repay you.

I am downloading the files; it should not take too long now..

As for cheating Blockchair's download restrictions, I don't think that would be bad or evil.

Here is a shell script I'm using for downloading these files (requires GNU coreutils)..
https://github.com/mountaineerbr/scripts/blob/master/blockchair.btcoutputs.sh
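(For anyone who does not want to open the repo: the core of such a script is just a date loop around curl. The base URL below is a placeholder and the file-name pattern is assumed from the *.tsv.gz dumps discussed in this thread; see the linked script for the real details.)
Code:
#!/usr/bin/env bash
# placeholder sketch, not the script from the repo above
BASE='https://example.org/bitcoin/outputs'             # placeholder, not the real dump URL
d=2009-01-03
while [[ "$d" < "$(date +%F)" ]]; do
    f="blockchair_bitcoin_outputs_${d//-/}.tsv.gz"     # assumed file-name pattern
    [[ -e "$f" ]] || curl -fLO "$BASE/$f"              # skip files already downloaded
    d=$(date -d "$d + 1 day" +%F)                      # GNU date
done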

Take care everyone!

legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
July 16, 2020, 06:55:15 AM
#16
I tried downloading it and here it shows 100 KB/s. I can download them and upload them to Google Drive if you want :) I'll be away for a few weeks but I can do that job, just let me know.
Thanks, but no need: I've downloaded 4 months' worth since yesterday, so I should be done in 4 days.
legendary
Activity: 2240
Merit: 3150
₿uy / $ell ..oeleo ;(
July 16, 2020, 03:46:18 AM
#15

The easiest way is to download Blockchair Database Dumps, but at 10 KB/s it takes months.

I tried downloading it and here it shows 100 KB/s. I can download them and upload them to Google Drive if you want :) I'll be away for a few weeks but I can do that job, just let me know.

legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
July 15, 2020, 10:05:34 AM
#14
I really just wanted to grep those dumps, and sort and uniq those addresses..
That's my plan :P I'm curious to see how many different addresses have been used.
Once I have it, I'll provide daily updates.

Quote
if i could access from three different locations
Getting 3 different VPSses (VPSs?) wouldn't even be very expensive, but it feels like cheating Blockchair's download restrictions. I don't think it will take 200 days at 100 kB/s. I'm currently at February 2019.

This is all I have so far, feel free to download 74 GB: http://loyceipv6.tk:20319/blockdata/
newbie
Activity: 29
Merit: 50
July 15, 2020, 07:25:37 AM
#13
Hey LoyceV

Yeah, and there is this new feature in 0.20 which, it seems, can dump snapshots of the unspent transaction outputs with `dumptxoutset`..
When they update bitcoind in my distro, I will check that out.
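(The call in question is a single RPC in Core 0.20+, e.g. the line below; note that it writes a compact binary snapshot of the UTXO set, so the addresses still have to be decoded out of it:)
Code:
$ bitcoin-cli dumptxoutset ~/utxo-snapshot.dat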
Indeed, my downloads from Blockchair are between 48 KB/s and 98 KB/s; still, if I could access it from three different locations it should take less than 70 days to download everything.. from one location, 200 days..
I really just wanted to grep those dumps, and sort and uniq those addresses..
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
July 14, 2020, 11:43:00 PM
#12
I am downloading it too, with a bash script; in two days I have downloaded 10 GB..
I think those dumps amount to 1 TB, so it will take too long to download from only one location..
Blockchair increased the download speed from 10 to 100 kB/s. I'm currently downloading November 2018. I'm not sure yet if I'll share the raw data; I'd need another VPS because of the size.
It's probably much less than 1 TB though; the entire blockchain isn't that big.
newbie
Activity: 29
Merit: 50
July 14, 2020, 06:48:56 PM
#11
I'm already downloading it, and will publish the data for faster downloads once it's done. I don't know how many weeks/months it'll take though.

Hey LoyceV
Thanks for downloading and offering to share that.
I am downloading it too, with a bash script; in two days I have downloaded 10 GB..
I think those dumps amount to 1 TB, so it will take too long to download from only one location..
If you can share the files when you are ready, I will be very interested.
Cheers!

PS: yes, I think I added some zeroes when calculating the download size. This data set from Blockchair, specifically, seems to be close to 100 GB..
sr. member
Activity: 310
Merit: 727
---------> 1231006505
July 08, 2020, 02:58:55 AM
#10
Get all balances here (currently unlocked, 771 MB):
https://balances.crypto-nerdz.org/balances/balances-bitcoin-20200708-0000-MXSyuyTD.gz

Please note: it's not my site/data and the link will probably be locked again. So if you need it, I suggest you get it now.
newbie
Activity: 17
Merit: 25
July 07, 2020, 09:26:28 PM
#9
For 0.005 BTC (~$50) you can contact the person here and get a recent dump:

https://balances.crypto-nerdz.org/

Maybe you can get it cheaper if you ask. He seems to be a fair guy.

Regards
copper member
Activity: 193
Merit: 255
Click "+Merit" top-right corner
July 07, 2020, 09:20:55 PM
#8
legendary
Activity: 1624
Merit: 2481
July 03, 2020, 10:54:06 AM
#7
In this case you could proceed as mentioned above.

Use Core to sync the whole blockchain with the txindex flag.
Then build a database using this data.
member
Activity: 73
Merit: 19
July 03, 2020, 09:45:40 AM
#6
Thank you for your answers.

I just need a database of used addresses on mainnet.

Sorry, my English is poor :/

Actually, I need a list of addresses that have been used but are now empty.

The lists that are usually available only contain addresses with a balance.


legendary
Activity: 1624
Merit: 2481
July 03, 2020, 08:49:33 AM
#5
How to dump all used mainnet addresses without a balance or tx?
If there is no balance and also no tx
He's asking for used addresses, so they must have received a transaction at some point.

And at the same time he is asking for addresses without a balance/transaction, so they have to be unused.

Since the request itself is illogical, we need to wait for OP to clarify his request.

@OP
In case you are simply looking for used addresses: download Core and sync the whole blockchain with the txindex flag.
You will have the complete data.
Then you just need to extract what you are searching for.

This way you don't need to rely on a 3rd party which takes weeks to download from.
Depending on your processing power and bandwidth, syncing the whole blockchain might be much faster.
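(The flag in question is just a standard Core option, e.g.:)
Code:
$ bitcoind -txindex        # or put txindex=1 in bitcoin.conf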
legendary
Activity: 3668
Merit: 6382
Looking for campaign manager? Contact icopress!
July 03, 2020, 04:18:34 AM
#4
How to dump all used mainnet addresses without a balance or tx?
If there is no balance and also no tx
He's asking for used addresses, so they must have received a transaction at some point.

You are right. Still, he asked about "no tx" and I wanted to make sure he understands what he wants, and maybe "refine" his request.
I somehow think that he may have confused the unused addresses he can see in his wallet with "used, but with no transactions".

If he indeed wants addresses with "no tx", then they have never received any transaction at all. And in that case a dump of addresses will not help him much ;)

legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
July 03, 2020, 03:05:10 AM
#3
How to dump all used mainnet addresses without a balance or tx?
The easiest way is to download Blockchair Database Dumps, but at 10 KB/s it takes months. I'm already downloading it, and will publish the data for faster downloads once it's done. I don't know how many weeks/months it'll take though.

Update: It's done, see List of all Bitcoin addresses ever used!

Related data:
Bitcoin block data available in CSV format
List of all Bitcoin addresses with a balance



How to dump all used mainnet addresses without a balance or tx?
If there is no balance and also no tx
He's asking for used addresses, so they must have received a transaction at some point.
legendary
Activity: 3668
Merit: 6382
Looking for campaign manager? Contact icopress!
July 03, 2020, 03:01:28 AM
#2
How to dump all used mainnet addresses without a balance or tx?

If there is no balance and also no tx, the address is not used, and you can't really know whether somebody has it in his wallet or not.
So this makes you ask for pretty much all possible Bitcoin addresses, which is basically impossible to get; there are far too many.
member
Activity: 73
Merit: 19
July 02, 2020, 10:49:32 PM
#1
How to dump all used mainnet addresses without a balance or tx?
