
Topic: BitIndex - The time, money, and computing power wasting database project - page 2. (Read 3593 times)

newbie
Activity: 23
Merit: 0
just like Coindesk bpi??

Wow.

It has nothing to do with price. It's about indexing a database of bitcoin addresses and their private keys to make them searchable.
full member
Activity: 228
Merit: 100
just like Coindesk bpi??
newbie
Activity: 23
Merit: 0
*****UPDATE*****

The 6th node is fully assembled and started receiving address / key pairs about an hour ago. We have made a minor adjustment to the way the software layer threads the data to the drives, hoping to cut the time to fill a full 270 TB from 13.5 days to 11. That would allow us to add and fill just shy of 3 new nodes per month.
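
For anyone who wants to check the fill-time numbers, here is a rough sketch of the arithmetic in Python. The 270 TB capacity and the 11 day target come from this update, and the 20 TB per day rate comes from the project description in this thread. The 30 day month and the function names are assumptions made just for this example.

```python
# Rough check of the fill-time figures quoted in the thread.
NODE_CAPACITY_TB = 270         # 45 drives x 6 TB, as stated
CURRENT_RATE_TB_PER_DAY = 20   # stated upload rate
TARGET_FILL_DAYS = 11          # goal after the threading change
DAYS_PER_MONTH = 30            # assumption: a 30 day month


def fill_days(capacity_tb, rate_tb_per_day):
    """Days needed to fill one node at a steady write rate."""
    return capacity_tb / rate_tb_per_day


def required_rate(capacity_tb, days):
    """Write rate (TB per day) needed to fill a node in the given days."""
    return capacity_tb / days


def nodes_per_month(days_per_node, days_per_month=DAYS_PER_MONTH):
    """How many nodes can be filled back to back in one month."""
    return days_per_month / days_per_node


if __name__ == "__main__":
    print("days per node now:", fill_days(NODE_CAPACITY_TB, CURRENT_RATE_TB_PER_DAY))
    print("rate needed for target:", round(required_rate(NODE_CAPACITY_TB, TARGET_FILL_DAYS), 2))
    print("nodes per month at target:", round(nodes_per_month(TARGET_FILL_DAYS), 2))
```

At 20 TB per day a node takes 13.5 days, an 11 day fill needs roughly 24.5 TB per day, and 30 / 11 is about 2.7 nodes per month, which is where "just shy of 3" comes from.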

Sometime before the week is over we will be beta testing a new SQL compression algorithm on a 36 TB mini test node, in hopes of fitting 5% more address / key data on each node. This would amount to almost 210 trillion more addresses on the nodes we have built and running now.
newbie
Activity: 23
Merit: 0
When I logged in this morning I had a few messages asking how we are currently running the address generation. Rather than send back 6 replies, I figured I would just post it.

1: Download Vanitygen
2: Put it in its own folder on your desktop
3: Open Notepad
4: Create a batch file
5: Run the batch file
6: Done

To make the batch file, enter the following command on a single line in Notepad:

vanitygen64 -k -o output.txt 1

Then, when you go to save it, switch the file type from text file to "all files" and give it a name like "Run.bat". When you click the file after saving it, it will begin to generate a .txt file filled with addresses.

We wrote a Java program to parse the output file into a usable .csv file that we can import into the SQL EE database.
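
The Java parser itself is not posted here, so the following is only a minimal Python sketch of the same idea, not the actual program. It assumes the output file uses vanitygen's usual three-line layout ("Pattern:", "Address:", "Privkey:"). The function names, the CSV column names, and the placeholder strings in the demo are all made up for illustration and are not real addresses or keys.

```python
import csv
import io


def parse_vanitygen(lines):
    """Yield (address, private_key) tuples from vanitygen-style text.

    Assumes each match is written as an 'Address:' line followed by
    a 'Privkey:' line; any other line (e.g. 'Pattern:') is ignored.
    """
    address = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("Address:"):
            address = line.split(":", 1)[1].strip()
        elif line.startswith("Privkey:") and address is not None:
            yield address, line.split(":", 1)[1].strip()
            address = None  # wait for the next Address line


def write_csv(pairs, out):
    """Write pairs to an open text file as CSV; return the row count."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["address", "private_key"])  # hypothetical column names
    count = 0
    for pair in pairs:
        writer.writerow(pair)
        count += 1
    return count


if __name__ == "__main__":
    # Placeholder text only, not real vanitygen output.
    sample = io.StringIO(
        "Pattern: 1\n"
        "Address: 1PlaceholderAddrA\n"
        "Privkey: 5PlaceholderKeyA\n"
    )
    result = io.StringIO()
    write_csv(parse_vanitygen(sample), result)
    print(result.getvalue())
```

For a real file you would pass open("output.txt") in place of the sample and an output file opened with newline="" in place of the in-memory buffer. Because the parser is a generator, a 4 GB file is processed line by line instead of being loaded into memory.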


For those of you who messaged about donating some output files from vanitygen: please upload them to a file sharing site and message us a download link. Max file size is 4 GB per file. It must be the standard CPU output format, not the GPU format that outputs pairs.

For those of you who messaged about helping with software and development: the database compression is kind of our pet project. One thing we could use is a better address generator, which is on our to-do list but not a priority, as the main focus is first the compression in the DB. If you were talking about helping with the compression methods and schemas, please message us a Skype name and we will get in touch.

For the few of you who messaged us inquiring if you could pay to have access to the database: THE ANSWER IS NO. We are doing this as a project to have fun, kill time, and learn a little about data warehousing. We have not stolen, and would never steal, someone's bitcoins just because we could, even though the chances are astronomically small. If you are willing to pay for access, we are certain that your intentions are malicious, which we will never support. We will not share our location, access to anything on the same network as the SQL servers, our e-mails, or anything else that could lead to misuse of the database, theft, or attacks. Take your scamming, thieving, harebrained ideas elsewhere or go build your own database. The next time you message us with your insane ideas we will be sure to notify gmaxwell and others with enough reputation to destroy your forum trust ratings. WE WILL NOT SUPPORT NOR ALLOW MISUSE OF OUR PROJECT.
newbie
Activity: 23
Merit: 0
After a few months of internal discussion, we have decided to make it public, or I guess semi-public, by starting this free blog to update how much time, money, and computing power we have wasted, just because we can and because it gives us something to do and laugh about.

For those of you too busy to click and waste 10 minutes of your time on the other blog site, I will copy it below so you can waste 10 minutes here instead. (Trust me when I say it's a waste: unless you are looking to just kill time it's not worth the read, as you will gain nothing from it.)

The Bit Index project is a collective of a few people who met at a bitcoin conference and found out that we all live in the same state. We are building one of the world's largest, and growing, databases (or rather data warehouses) of Bitcoin address / private key pairs in existence, for ethical use only, with the aim of lost data recovery in the 5-10 year time frame.

The Bit Index currently uses the 45 Drives Storinator as its scalable backbone and infrastructure, with each database node housing 270 terabytes (45 Seagate STBD6000100 6 TB drives) of data, or roughly 265 trillion address / Base58 private key pairs. With our current database design, hardware configuration and multi-threaded software interface we are able to upload roughly 20 terabytes of data per day, so a node takes roughly 13 days to fill to max capacity.

Why?

To be completely up front, this started out as a crazy weekend project between a few friends that turned into an obsession in data warehousing. Some people collect stamps, coins, baseball cards, etc. We have become obsessed with collecting address / key pairs.

To what end?

There is no end in sight. Every day we are working on better methods of data compression to allow us to store more addresses per node, better software interfaces that get more work done per worker thread to cut import times on the current hardware configuration, and better queries to handle things like duplicate entries (if they ever occur) and data retrieval (searching the database for a specific address / Base58 key).

We are currently adding 1 or 2 new nodes per month as our time allows us to build them. Right now our focus is not so much on building new nodes as on getting the most use out of each node that we possibly can. Currently we could build nodes 10X faster than we could fill them, so we are working towards better utilization before we worry about increasing the amount of data stored.

F.A.Q.

Q: Isn't that expensive?
A: Yes, very. 3 of the 5 of us involved were very early adopters of bitcoin, so a job and income became irrelevant this past year. In the scheme of things, each node only costs us roughly $14,800 to build and roughly $380 a month to keep operational.

Q: What is the point of all of this?
A: For now, FUN. In the long run we aim to secure the largest searchable database of address / Base58 key pairs known to man. Our current goal is to assemble a method of database compression that allows us to increase our storage to between 500 trillion and 1 quadrillion addresses per node. In the end game, with proof of ownership, we hope to maybe someday help someone recover their lost address on the 0.000000000000000000000000000003242% chance that we might have it in the database.

Q: What do you mean proof of ownership?
A: That is still in discussion. Right now and within the next 5 years the chances of it happening are slim to none, or to be exact a 0.000000000000000000000000000003242% chance at our current rate of growth that we could have your address in our database. Once it becomes more realistic (over a 1% chance) we will think more about how to prove ownership.

Q: Doesn't that mean that you could steal someone's bitcoins?
A: Theoretically yes, but the odds are very slim. We do have a way of monitoring addresses in the database, and in the past 2 months we have only seen 1 positive balance come through. It stayed positive for less than 1 hour, which led us to believe it was part of a mixing or tumbling service sequence. No, we didn't steal it, and never would. We have enough bitcoin that we don't need to. This is for fun, not for malicious purposes.

Q: Your question here.
A: The number "hippopotamus" (yes we know hippopotamus is not a number)


Summary:

This is all for fun and just a time wasting project: a few hundred thousand dollars in cost over a few years, split by 5 guys who really won't miss it much, and at the same time it keeps us occupied for a few years. We have no illusions of "breaking bitcoin" or stealing anyone's anything. Most of this is an exercise in data compression technology and address generating software.

If you would like to contribute to the project with something useful or useless, let us know.

Ways you can contribute:

We are always looking for data to index. If you have a copy of vanitygen, you could run it for an hour and make us a .txt file we can parse and upload into the database.

If you are fluent in Java or SQL, we are always looking for an extra brain to bounce ideas off of in terms of software, speed, indexing, and whatnot.

That's about it for now.

------------------------------------------------------------------------------------------------------

Today's blog post with current progress to date (copied from the blog as well)

It's done. After 4 months of work we have finished filling our 5th node full of address / key data, and we are starting to build the 6th node this week.

As our first public announcement of this time wasting, money draining, good time having, laughable project, this is more of a current status post than an update. If you read a little of the about page you will understand more of what this is and what the stats below really mean.


Current Addresses: 1,325,000,000,000,000
Current Storage Space: 1,350 terabytes
Current # of Hard Drives: 225 Seagate STBD6000100 6 TB drives (and 10 250 GB SSDs for redundant OS drives)
Current Chassis: Custom built clone of the 45 Drives Storinator
% of all addresses stored: 0.00000000000000000000000000000009066% (rough estimate, our calculator does not have enough space to get the actual figure)
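
If your calculator runs out of digits too, Python's decimal module can handle this kind of figure. The sketch below assumes the percentage is taken against 2^160 possible addresses (one per 160-bit public key hash). That assumption is ours for this example, though it does reproduce the leading digits 9066 of the figure above. The stored pair count is the "Current Addresses" stat, and the function name is made up for illustration.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60  # plenty of digits for a number this small

# Assumption: the address space is every possible 160-bit hash.
ADDRESS_SPACE = 2 ** 160
STORED_PAIRS = 1_325_000_000_000_000  # from the stats in this post


def percent_of_space(stored, space=ADDRESS_SPACE):
    """Return the share of the address space covered, as a percentage."""
    return Decimal(stored) / Decimal(space) * 100


if __name__ == "__main__":
    pct = percent_of_space(STORED_PAIRS)
    print(f"{pct:.4E} percent of all addresses")
```

Under that assumption the result is about 9.066e-32 percent.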

This week's hard learned lesson: DON'T TOUCH THE RED WIRE !!!!!! OUCH !!!!!!!

------------------------------------------------------------------------------------------------------

For those who read the post and missed it, we just want to be clear: the purpose of this project is really nothing other than wasting time and money while having fun doing it. We have no delusions that any of what we are doing will ever be useful in any way, and your trolling is truly wasted on us. Well, maybe not: we always enjoy a good laugh, so if you have something original please feel free to exercise your brightest attempts at stupidity below.

Thanks for letting us waste your time :) We hope you got as much of a laugh out of reading it as we get out of doing it. We will keep you updated on our nonsense and wastefulness as time progresses, if you care to check back at a later date.