Pages:
Author

Topic: [ANNOUNCE] Abe 0.7: Open Source Block Explorer Knockoff - page 44. (Read 220986 times)

full member
Activity: 127
Merit: 100
If you intend to release the source and are concerned about bugs affecting its output (misleading users about their bitcoin holdings) then I suggest you use the Bitcoin testnet during beta.

If you want to test the site live on the Internet and are concerned about security bugs (exploits) then I suggest you protect it with a password and give it to only a few trustworthy testers.  Also, use security best practices where possible, such as a dedicated OS account or virtual host with no access to non-empty wallets.

If you want to get something out in public quickly but are afraid of exploits or misguided forks of your code, use common sense.  If the site is non-commercial (free to use and free of advertising) I won't care much about the license until its proprietary features make it the most popular site running Abe.

Thanks for your reply.

I believe I have taken all the steps needed to protect the setup; I do web programming as my day job, so it's been done before Smiley

But of course I cannot guarantee that I haven't missed a bug, that the code handles every character in someone's password, that an odd but valid bitcoin address isn't rejected, etc.

I plan to release the monitor part and announce the website at the same time, and then after 1-2 weeks of public testing, I will release the website code itself.

I prefer public domain, and AFAIK which license I publish my work under should not be dependent on the libs/code it uses/links, as long as the license is less restrictive; after all, the project is already a mix of 4 different licenses.

If you like I can send you a link in pm to the website and the monitor code.
hero member
Activity: 481
Merit: 529
Also I have made a website to host a service around my script for those that do not want to run it themselves, and I would like to keep that closed source for now, both for security until the worst bugs are located, and to prevent 100 identical pages popping up and going down.
If you intend to release the source and are concerned about bugs affecting its output (misleading users about their bitcoin holdings) then I suggest you use the Bitcoin testnet during beta.

If you want to test the site live on the Internet and are concerned about security bugs (exploits) then I suggest you protect it with a password and give it to only a few trustworthy testers.  Also, use security best practices where possible, such as a dedicated OS account or virtual host with no access to non-empty wallets.

If you want to get something out in public quickly but are afraid of exploits or misguided forks of your code, use common sense.  If the site is non-commercial (free to use and free of advertising) I won't care much about the license until its proprietary features make it the most popular site running Abe.
hero member
Activity: 481
Merit: 529
@MORA,

Congratulations, and thanks for asking about the license.  Short answer: I am not a lawyer, the software license does not cover the data, and nothing in the AGPL prevents you from putting your own code in the public domain if that code's dependencies are compatible with the GPLv3 or AGPL.

Now, it gets a little hairy if you offer a proprietary service based on Abe's tables, and it needs a running Abe to keep those tables up to date.  Maybe the law would consider that a "work based on" Abe even though the service only directly reads the tables.  If in doubt, describe your plan to me.  If I find it in keeping with the spirit of collaboration and the goals of Abe and Bitcoin, I will write a license exception giving it explicit permission to use Abe.
hero member
Activity: 784
Merit: 1009
firstbits:1MinerQ
The license for using and changing the software has no bearing on the data in the database. If that were the case, every website using Apache or MySQL would be infringing their licenses, but they're not. The license specifically deals with copying, distributing and changing the software. But if you need to be more confident you could read more on the GNU and FSF web sites.
full member
Activity: 127
Merit: 100
I am almost done with my little project, and started to look into the license issue brought up earlier in the thread.

I would like to release my script as public domain; however, since it makes use of a database populated by AGPL software, I'm not sure that's allowed.
-Also one of the php libs I use is public domain already.

Also I have made a website to host a service around my script for those that do not want to run it themselves, and I would like to keep that closed source for now, both for security until the worst bugs are located, and to prevent 100 identical pages popping up and going down.
The website only accesses Abe data in one place that I could remove (latest block height in main chain); other than that it's user administration only.

The monitor script looks in a database populated by the website (or phpMyAdmin, for that matter) and in the Abe database for new transactions, so it touches both parts.

I have not modified Abe, since --no-serve works quite well for my purpose.

In short: do you consider the data entered into the MySQL database protected by the AGPL, or can we write scripts that build on the data without having to choose the AGPL as the license?
hero member
Activity: 481
Merit: 529
But do you have a good way to find the number of confirmations? (limited to say 6 or 10).
From what I understand the process is to select the next_block_id from block_next, and if any is found that's 1 confirmation; then repeat with the result until there is no result or the required amount is found.
Yes, that will work.  For the common case where the transaction is on the main branch, you can just subtract its block height from the longest chain's height.  chain_candidate.in_longest will equal 1 when a block is on main.  For the top block, you can use:
Code:
SELECT b.block_height
FROM block b
JOIN chain c ON c.chain_last_block_id = b.block_id
WHERE c.chain_id = 1
or as in DataStore.get_block_number:
Code:
            SELECT MAX(block_height)
              FROM chain_candidate
             WHERE chain_id = 1
               AND in_longest = 1
This is probably the right thing even if the transaction is not on the main branch, since users won't care about confirmations on dead ends.
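The height-subtraction approach can be sketched like this (a minimal mock of the chain_candidate table in a throwaway SQLite database; in Abe the data lives in MySQL or PostgreSQL, but the query is the same):

```python
import sqlite3

# Tiny mock of the relevant Abe table, just enough for the query above.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE chain_candidate (chain_id INT, block_id INT,
                              block_height INT, in_longest INT);
INSERT INTO chain_candidate VALUES (1, 10, 100, 1), (1, 11, 101, 1),
                                   (1, 12, 102, 1);
""")

def confirmations(cur, chain_id, tx_block_height):
    """Confirmations = top height - tx height + 1 on the main branch."""
    cur.execute("""SELECT MAX(block_height) FROM chain_candidate
                   WHERE chain_id = ? AND in_longest = 1""", (chain_id,))
    (top,) = cur.fetchone()
    return top - tx_block_height + 1

print(confirmations(cur, 1, 100))  # → 3
```

The "+ 1" reflects the usual convention that a transaction in the top block already has one confirmation.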
full member
Activity: 127
Merit: 100
Yes.  You can also force a catch-up by running the program with --no-serve in another process from the command line while the server listens.  But this won't help with the database idle connection timeouts.

Thanks.
I made a small bash script to run it with --no-serve, then sleep and repeat (for now I don't need the web interface).
Then I can also insert my script after Abe completes, and in that way be sure there are no locking problems.

To replicate the notify system I need to find new transactions and monitor them.

SELECT tx_id, txout_value, pubkey_hash FROM txout_detail WHERE tx_id > xyz
where xyz is the last completed block, is a good start, it finds the rows needed.

Then to find which blocks they were in (could of course JOIN it in the first query):
select block_id from block_tx where tx_id = X
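The combined form might look like this (a sketch against a simplified SQLite mock; Abe's real txout_detail is a much wider view, and the xyz placeholder becomes a bound parameter):

```python
import sqlite3

# Simplified mock of the two Abe tables/views used above.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE txout_detail (tx_id INT, txout_value INT, pubkey_hash TEXT);
CREATE TABLE block_tx (block_id INT, tx_id INT);
INSERT INTO txout_detail VALUES (7, 5000000000, 'aa'), (8, 100, 'bb');
INSERT INTO block_tx VALUES (3, 7), (4, 8);
""")

# New outputs since the last processed tx, with the block each appeared in.
cur.execute("""
    SELECT t.tx_id, t.txout_value, t.pubkey_hash, bt.block_id
      FROM txout_detail t
      JOIN block_tx bt ON bt.tx_id = t.tx_id
     WHERE t.tx_id > ?""", (7,))
rows = cur.fetchall()
print(rows)  # → [(8, 100, 'bb', 4)]
```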

But do you have a good way to find the number of confirmations? (limited to say 6 or 10).
From what I understand the process is to select the next_block_id from block_next, and if any is found that's 1 confirmation; then repeat with the result until there is no result or the required amount is found.

(UPDATE)
One way I can think of is ...
Code:
SELECT b1.next_block_id
FROM block_next b1
JOIN block_next b2 ON b2.block_id = b1.next_block_id
JOIN block_next b3 ON b3.block_id = b2.next_block_id
JOIN block_next b4 ON b4.block_id = b3.next_block_id
JOIN block_next b5 ON b5.block_id = b4.next_block_id
JOIN block_next b6 ON b6.block_id = b5.next_block_id
WHERE b1.block_id = xyz
This will not tell how many confirmations it has, but it returns a row only if there is a successor for each join. A bit rough as a test, but it could do, if you think it's correct.
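The repeat-until-no-result walk described above can also be written as a loop that stops at a cap and reports the actual count (a sketch against a mock block_next table; note that a block on a fork can have more than one successor, which this ignores):

```python
import sqlite3

# Mock of Abe's block_next successor table: 1 -> 2 -> 3 -> 4.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE block_next (block_id INT, next_block_id INT);
INSERT INTO block_next VALUES (1, 2), (2, 3), (3, 4);
""")

def count_confirmations(cur, block_id, cap=6):
    """Walk block_next until no successor is found or the cap is reached.
    The containing block itself counts as the first confirmation."""
    confirmations = 1
    while confirmations < cap:
        cur.execute("SELECT next_block_id FROM block_next WHERE block_id = ?",
                    (block_id,))
        row = cur.fetchone()
        if row is None:
            break
        block_id = row[0]
        confirmations += 1
    return confirmations

print(count_confirmations(cur, 1))  # → 4
```

The height-subtraction method John describes is cheaper when the transaction is on the main branch; this loop only makes sense when you really want to follow the successor chain.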
hero member
Activity: 481
Merit: 529
Okay, so the catch-up only happens when a page is requested from the webserver ?
Yes.  You can also force a catch-up by running the program with --no-serve in another process from the command line while the server listens.  But this won't help with the database idle connection timeouts.
full member
Activity: 127
Merit: 100
Okay, so the catch-up only happens when a page is requested from the webserver ?

The workaround I posted is not perfect. After a night of no activity the web interface did not respond and the database was not updated, with no errors on the console. But today when I restarted it, I didn't get the block catch-up messages either, yet it did catch up just fine, so maybe my terminal is broken.
hero member
Activity: 481
Merit: 529
MORA,

Thanks for the comments and workaround.  Indeed, db idle timeouts are a problem.  I use two workarounds but haven't settled on a default approach.  I have a cron job request the homepage every minute to trigger the catch_up code.  And there is a "catch_up_thread" branch in git that automatically does this on a separate thread.  It may need merging with the master branch to get the latest features, and I have not tested it as thoroughly.  ThomasV implemented it for ecdsa.org and I think uses it in Electrum.

Reconnecting automatically is a good idea.  I think your patch will work in practice, although I see a slight chance of database corruption if it tries to reconnect in the middle of a transaction.  The chance is remote, since transaction durations won't normally approach the idle timeout, but the 12-hour init makes me extremely cautious about corruption. Smiley  My long-term plan is to test "begin transaction" for portability and explicitly start each transaction.  The start of a transaction would be the time to reconnect if needed.  For now, let me know if the workarounds prove inadequate.
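The transaction-boundary idea might look roughly like this (a sketch with hypothetical names, using SQLite as a stand-in for MySQL; Store, begin and _ensure_connected are illustrative, not Abe's API):

```python
import sqlite3

class Store:
    """Minimal sketch: reconnect only at transaction boundaries,
    never in the middle of one, so a dropped idle connection can
    never corrupt a half-written transaction."""
    def __init__(self, path=":memory:"):
        self.path = path
        # isolation_level=None gives autocommit, so BEGIN/COMMIT
        # below are explicit and under our control.
        self.conn = sqlite3.connect(path, isolation_level=None)

    def _ensure_connected(self):
        try:
            self.conn.execute("SELECT 1")   # cheap liveness probe
        except sqlite3.Error:
            self.conn = sqlite3.connect(self.path, isolation_level=None)

    def begin(self):
        self._ensure_connected()  # safe here: nothing is in flight yet
        self.conn.execute("BEGIN")

    def commit(self):
        self.conn.execute("COMMIT")

store = Store()
store.begin()
store.conn.execute("CREATE TABLE t (x INT)")
store.conn.execute("INSERT INTO t VALUES (1)")
store.commit()
print(store.conn.execute("SELECT x FROM t").fetchone())  # → (1,)
```

Because the probe and reconnect happen only in begin(), a connection can never be silently swapped out between the statements of an open transaction.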
hero member
Activity: 742
Merit: 500
I'm watching this.  I've checked out the source but haven't played with it yet.

EDIT: Oh yeah. Firstbits support will be awesome.  I'm looking forward to giving someone 11235813 as my address
full member
Activity: 127
Merit: 100
Hi,

I have gotten Abe to sync up with the bitcoin dir; it took about 12 hours on an i7-920 server with an SSD, but there's a lot else going on at that server.
CPU was not pegged, so probably disk I/O was the bottleneck.

Anyways...
After it syncs it listens for an HTTP connection, to show the interface and all.
Does it still sync new data at this point, or does it need to be restarted on a timer?
After a few minutes it loses the MySQL connection, probably due to a timeout, and it does not recover gracefully; instead it crashes.

Code:
hostname - - [17/Dec/2011 15:52:12] "GET /favicon.ico HTTP/1.1" 200 3774
Traceback (most recent call last):
  File "Abe/DataStore.py", line 1874, in catch_up
    store.catch_up_dir(dircfg)
  File "Abe/DataStore.py", line 1892, in catch_up_dir
    ds = open_blkfile()
  File "Abe/DataStore.py", line 1885, in open_blkfile
    store._refresh_dircfg(dircfg)
  File "Abe/DataStore.py", line 2058, in _refresh_dircfg
    WHERE dirname = ?""", (dircfg['dirname'],))
  File "Abe/DataStore.py", line 458, in selectrow
    store.sql(stmt, params)
  File "Abe/DataStore.py", line 372, in sql
    store.cursor.execute(cached, params)
  File "/usr/lib/pymodules/python2.6/MySQLdb/cursors.py", line 166, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib/pymodules/python2.6/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
OperationalError: (2006, 'MySQL server has gone away')
Warning: failed to catch up /fast/bitcoin/.bitcoin: (2006, 'MySQL server has gone away') {'blkfile_number': 1, 'dirname': '/fast/bitcoin/.bitcoin', 'chain_id': None, 'id': Decimal('1'), 'blkfile_offset': 828778091}
Traceback (most recent call last):
  File "/usr/lib/python2.6/wsgiref/handlers.py", line 93, in run
    self.result = application(self.environ, self.start_response)
  File "/fast/bitcoin/abe/Abe/abe.py", line 198, in __call__
    abe.store.rollback()
  File "Abe/DataStore.py", line 578, in rollback
    store.conn.rollback()
OperationalError: (2006, 'MySQL server has gone away')


I would like to build a set of scripts to provide the same as the now defunct bitcoinnotify'er, but in a way that everyone can run their own if they want to.
So my plan is to use Abe to parse the bitcoind files, and then make some PHP scripts that will use the MySQL database to check whether there are new transactions for any of the monitored addresses, and whether any monitored transaction has the number of confirmations needed for a notification to be sent (be it email, post, db change, etc.).




[EDIT]
Just checked my my.cnf; the timeout is set to 60 seconds, so if Abe sends a keep-alive less often than that, the connection will be terminated by the SQL server.
-Does anyone know what the keep-alive setting is in Abe?
-Also, Abe should handle a lost connection to MySQL more gracefully, maybe attempting y reconnects with y*10 seconds of delay before quitting, so a monitor script can spot the missing process.

I tried setting wait_timeout to 1 hour in MySQL; so far Abe has been idle for 1500 seconds, so I don't think there is any keep-alive built in.
I think it should be an option for the user to either use keep-alive or remake the connection when needed, since high-traffic sites may prefer a live connection, ready to use, while sites that just keep the blocks updated in the DB will only see action when a new block is ready (if Abe updates it while running).
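A keep-alive could be as small as a background timer issuing a cheap query inside the idle window (a sketch, not Abe code; sqlite3 stands in for MySQLdb here, and the 50-second default assumes a 60-second wait_timeout):

```python
import sqlite3
import threading
import time

def start_keepalive(conn, lock, interval=50.0):
    """Ping the connection on a timer so the server's idle timeout
    never fires.  The lock serializes access with the main thread."""
    def ping():
        with lock:
            conn.execute("SELECT 1")        # cheap no-op query
        t = threading.Timer(interval, ping)  # reschedule ourselves
        t.daemon = True                      # don't block interpreter exit
        t.start()
    ping()

lock = threading.Lock()
# sqlite3 stands in for MySQLdb; the pattern is the same.
conn = sqlite3.connect(":memory:", check_same_thread=False)
start_keepalive(conn, lock, interval=0.05)  # tiny interval for the demo

time.sleep(0.2)                 # let a few pings fire in the background
with lock:
    print(conn.execute("SELECT 1").fetchone())  # → (1,)
```

Any real implementation would need the same lock (or a dedicated keep-alive connection) so the ping never interleaves with a transaction on the main thread.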

I will try to look in the source, but Python is not really my strong suit, so jumping right into making a thread to send keep-alives or reconnect the SQL driver may be a bit rough Smiley

[EDIT2]
This small fix handles the error when accessing a webpage after the SQL connection has been killed for whatever reason.
It simply tries to reconnect and execute once more when an execute fails.
It's a workaround rather than an actual bug fix, since the real fix (IMHO) should be either to keep the connection alive, or to close it when done and open one when needed.

Code:
$ git diff
diff --git a/Abe/DataStore.py b/Abe/DataStore.py
index e256115..d013219 100644
--- a/Abe/DataStore.py
+++ b/Abe/DataStore.py
@@ -402,8 +402,12 @@ class DataStore(object):
         try:
             store.cursor.execute(cached, params)
         except Exception, e:
-            store.sqllog.info("EXCEPTION: %s", e)
-            raise
+            try:
+                store.reconnect()
+                store.cursor.execute(cached, params)
+            except Exception, e:
+                store.sqllog.info("EXCEPTION: %s", e)
+                raise

     def ddl(store, stmt):
         if stmt.lstrip().startswith("CREATE TABLE "):
hero member
Activity: 481
Merit: 529
John, is there any plan to add firstbits support? I mean looking up an address by its firstbits and generating the firstbits for a full address. I found that it's very hard to implement with the current data structures - the database does not store the address itself, only the pubkey hash. Although you implemented lookup of an address by prefix, it is case-sensitive, and I don't see how to make a case-insensitive lookup with just a pubkey_hash in the database.

I'd like to submit a patch for Abe which will extend the pubkey table with an "address" column and also the number of the block where the address first appeared. AFAIK the blocknum of first appearance is also hard to obtain with the current data structures, because it requires heavy joins on fast-growing db tables. I'm just asking whether there's a possibility of accepting such a patch upstream.

Storing such data in extra columns goes against third normal form, which is usually the wrong solution. However, we're not doing SQL homework but building a real application with millions of records, and such a patch would make firstbits resolution much easier and blazingly fast. Also, indexing addresses (not just pubkey hashes) can be very useful for other projects like the Casascius coin analyzer (https://bitcointalksearch.org/topic/casascius-bitcoin-analyzer-52537).
Yes, I have been thinking about supporting firstbits.  I would consider a patch that adds address and first block_height (or block_id) to pubkey.  If I were doing it myself, I would try a new table "firstbits" with address_version, pubkey_id, block_id, firstbits.  "address" could be an optional field for applications that want it.  Storing firstbits directly would give us simple two-way lookups.

Putting address or firstbits in pubkey would make me nervous about chain splits (where each side remains active) and firstbits adoption by alt chains.  However, ideally I'd like to support this design for apps that want denormalization for performance.  So any design involving a new table would add a view to make it look as if the fields were in pubkey, and a design that adds columns to pubkey should wrap it with a view that provides constant "00" address_version.

Abe doesn't yet have a way to turn on or off features such as firstbits.  I would like to let users turn features on or off at install time: firstbits, coin-days destroyed, namecoin stuff, etc.  I would store a flag in configvar for each feature (such as 'firstbits'='yes'/'no') and skip the processing associated with deselected features.

This is just my vision at the moment; I don't have any code beyond what you see.  Any patch that looks useful to somebody, I'd probably accept.  If it compromises too much in some area, I'd put it on its own branch until the compromises become options.

By the way, I don't know how firstbits.com would handle two addresses with the same unique prefix first appearing in the same block.  I would give the shorter prefix to the address in the first transaction within the block, with ties going to the first txout within a transaction.
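The ordering rule in the last paragraph can be sketched as a case-insensitive prefix assignment (illustrative only; this is one reading of the firstbits idea, not firstbits.com's actual algorithm, and the addresses are made up):

```python
def assign_firstbits(addresses):
    """Assign each address, taken in first-appearance order, the
    shortest prefix that no earlier address starts with, comparing
    case-insensitively."""
    taken = set()   # lowercase prefixes already claimed by earlier addresses
    result = {}
    for addr in addresses:
        low = addr.lower()
        for n in range(1, len(low) + 1):
            if low[:n] not in taken:
                result[addr] = low[:n]
                break
        else:
            result[addr] = None  # fully shadowed by an earlier address
        # Every prefix of this address is now unavailable to later ones.
        taken.update(low[:i] for i in range(1, len(low) + 1))
    return result

# Feeding addresses in block, then tx, then txout order implements the
# tie-break described above: first transaction wins, first txout wins.
fb = assign_firstbits(["1MinerQ", "1Miners", "1ABCdef"])
print(fb)  # → {'1MinerQ': '1', '1Miners': '1miners', '1ABCdef': '1a'}
```

Since "1Miners" shares the whole "1miner" prefix with the earlier "1MinerQ", its full lowercase form becomes its firstbits; "1ABCdef" only needs two characters.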
legendary
Activity: 1386
Merit: 1097
John, is there any plan to add firstbits support? I mean looking up an address by its firstbits and generating the firstbits for a full address. I found that it's very hard to implement with the current data structures - the database does not store the address itself, only the pubkey hash. Although you implemented lookup of an address by prefix, it is case-sensitive, and I don't see how to make a case-insensitive lookup with just a pubkey_hash in the database.

I'd like to submit a patch for Abe which will extend the pubkey table with an "address" column and also the number of the block where the address first appeared. AFAIK the blocknum of first appearance is also hard to obtain with the current data structures, because it requires heavy joins on fast-growing db tables. I'm just asking whether there's a possibility of accepting such a patch upstream.

Storing such data in extra columns goes against third normal form, which is usually the wrong solution. However, we're not doing SQL homework but building a real application with millions of records, and such a patch would make firstbits resolution much easier and blazingly fast. Also, indexing addresses (not just pubkey hashes) can be very useful for other projects like the Casascius coin analyzer (https://bitcointalksearch.org/topic/casascius-bitcoin-analyzer-52537).
hero member
Activity: 560
Merit: 501
How is system load?  Is the database server busy with CPU?  IO?  Is there free memory?
My instance runs on a dedicated quad-core Xeon with about 16GB of free memory (right now), disk I/O is extremely low, doesn't even begin to reach what this box is capable of.

I'm increasing (what I'm assuming is) the bytes it keeps before committing to db now, will report back.
hero member
Activity: 481
Merit: 529
It persists. Even across db vendors, meaning that postgres is - while faster than mysql - also having a hard time doing these inserts (mainly tx, I guess)
Please try with --commit-bytes=100000 and let me know.  This will prevent a commit after every tx insertion but may lead to errors when concurrent processes insert.  I recently changed this setting from being the default.

How is system load?  Is the database server busy with CPU?  IO?  Is there free memory?

When I have time, I will try some load testing.
donator
Activity: 2772
Merit: 1019
I ran into some weird and massive performance issues (litecoin data, older version of abe, mysql). Couldn't find out what exactly is wrong. mysqld just seems to be really slow. It's inserting the blocks almost as slowly as they are mined and mysqld cripples my system hogging i/o, I guess.

If it persists, I find an easy way to "profile" abe is to interrupt it (ctrl-C) and note the stack trace, then restart, let it get going, and repeat a few times.  If the interrupt usually happens in the same query or two, I know what to optimize.

It persists. Even across db vendors, meaning that postgres is - while faster than mysql - also having a hard time doing these inserts (mainly tx, I guess)

You don't happen to have a part with O

I think
hero member
Activity: 481
Merit: 529
I'm getting "Commands out of sync; you can't run this command now" MySQL errors.
Hmm, that's a new one to me.  Could you post a way to reliably produce the error in my own environment?  Or the next best thing would be to run with --log-sql and post a section of log including, say, 5 SQL commands leading up to the error.
Come to think of it, a plain old stack trace would be better than nothing if you have one around...

I've switched to Postgres, and it seems to be holding up fairly well so far, other than the FastCGI process dying every once in a while.
Any error message in the log or browser?
hero member
Activity: 481
Merit: 529
Hey John,

I ran into some weird and massive performance issues (litecoin data, older version of abe, mysql). Couldn't find out what exactly is wrong. mysqld just seems to be really slow. It's inserting the blocks almost as slowly as they are mined and mysqld cripples my system hogging i/o, I guess.

If it persists, I find an easy way to "profile" abe is to interrupt it (ctrl-C) and note the stack trace, then restart, let it get going, and repeat a few times.  If the interrupt usually happens in the same query or two, I know what to optimize.

NameError: global name 'stor' is not defined
Fixed, thanks.
donator
Activity: 2772
Merit: 1019
Hey John,

I ran into some weird and massive performance issues (litecoin data, older version of abe, mysql). Couldn't find out what exactly is wrong. mysqld just seems to be really slow. It's inserting the blocks almost as slowly as they are mined and mysqld cripples my system hogging i/o, I guess.

So I just upgraded to newest version from git (actually made a fork) and ran into a typo, I guess:

Quote from: update.py:341
        if count % 1000 == 0:
            store.commit()
            stor.log.info("Updated %d blocks", count)

NameError: global name 'stor' is not defined
