
Topic: Bitcoin Binary Data Protocol, for mining, monitorblocks, etc. - page 3. (Read 26077 times)

sr. member
Activity: 247
Merit: 252
I just want to say I also think UDP is a weird/bad idea here.

And even though I have not yet had time to dive in, the solution looks very nice.
sr. member
Activity: 868
Merit: 251
1) Neither the login request nor the solution is lost in my proposal.
2) I don't think the loss of a WORK message will seriously impact performance. If I understand the principle, a new WORK will be broadcast with every transaction the server receives, and that is often enough.
3) I mean only the "push" protocol, which does not use GETWORK.
4) UDP, as far as I know, is just as NAT-friendly as TCP. In both cases the NAT box just maps the source port.
5) UDP is not widely supported by libs only because it is so simple.
6) Some features of TCP are pure overhead when building a low-latency service. So it is sometimes better to reimplement some TCP features than to use the full version. For example, in this case we don't need an acknowledgment for every message.

And... the SYN flood...
legendary
Activity: 1596
Merit: 1100
Why not UDP?
Retransmissions imply you wind up reinventing TCP.
Is retransmission really required everywhere? I think it is useful only during login and result reporting and only on miner side. All other activity does not require TCP features.

A miner does not want to lose a WORK msg, GETWORK msg, nor have their solution lost.  Every single message -- LOGIN, GETWORK, WORK, ... -- must be retransmitted or retried by one side or the other.

But that's just one of many disadvantages of UDP.  TCP is also better supported by most programming language libs, and is more firewall- and NAT-friendly.

Having implemented many UDP servers of various sorts -- financial data feeds, gaming servers, and cloud computing coordinators -- you really do wind up reinventing TCP while attempting to simply have a robust UDP implementation.
sr. member
Activity: 868
Merit: 251
Why not UDP?
Retransmissions imply you wind up reinventing TCP.
Is retransmission really required everywhere? I think it is useful only during login and result reporting and only on miner side. All other activity does not require TCP features.
I think it can be like this:

Code:
Miner->Server: LOGIN username password
Server->Miner: OK keepalive_timeout
If miner does not receive an answer in reasonable time, it resends login request.
On login error server can simply ignore the request or send some REJECT messages.
On success the server records miner IP and port number and adds it to miner identification table together with current timestamp.
Then:
Code:
Server->Miner: WORK ....

Once during keepalive_timeout:
Code:
Miner->Server: ALIVE
or
Code:
Miner->Server: RESULT ...
Server->Miner: OK keepalive_timeout
Server must update timestamp in miner id table then.
Miner resends RESULT message if it does not receive an answer in reasonable time.

Server will remove miner id records from its table after 4*keepalive_timeout.
Note that server can adapt keepalive_timeout for certain miners using statistical data on the fly.

This will reduce traffic to minimum, IMHO.
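
To make the bookkeeping concrete, here is a minimal sketch of the miner identification table described above, with the 4*keepalive_timeout expiry rule. It is not taken from any existing server; the class name `MinerTable` and its methods are made up for illustration, and the `now` argument exists only so the timing can be checked without waiting.

```python
import time


class MinerTable:
    """Hypothetical miner id table: (ip, port) -> last-seen timestamp."""

    def __init__(self, keepalive_timeout: float):
        self.keepalive_timeout = keepalive_timeout
        self._last_seen = {}

    def touch(self, addr, now=None):
        # Called on a successful LOGIN, ALIVE or RESULT from this address.
        self._last_seen[addr] = time.monotonic() if now is None else now

    def expire(self, now=None):
        # Drop records not refreshed within 4 * keepalive_timeout.
        now = time.monotonic() if now is None else now
        limit = 4 * self.keepalive_timeout
        stale = [a for a, t in self._last_seen.items() if now - t > limit]
        for addr in stale:
            del self._last_seen[addr]
        return stale

    def active(self):
        # Addresses that would still receive WORK messages.
        return list(self._last_seen)
```

The server would call `touch()` for every message it accepts from a miner and run `expire()` periodically before broadcasting WORK.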
legendary
Activity: 1596
Merit: 1100
Why not UDP?

Retransmissions imply you wind up reinventing TCP.
sr. member
Activity: 868
Merit: 251
Why not UDP?
hero member
Activity: 489
Merit: 505
Sounds like a reasonable enhancement, especially since the miners will be notified of a new block and will be able to start on the new one right away.
legendary
Activity: 1596
Merit: 1100
0. The URLs
----------------------------------------------------------------------------------
URL: http://yyz.us/bitcoin/pushpool-0.4.tar.gz
Repo: https://github.com/jgarzik/pushpool


1. The Problem
----------------------------------------------------------------------------------
With the recent slashdotting and resultant influx of new users, the 'getwork' network protocol used in mining is showing some strain, particularly on the pools.  Miners request work once every 5-10 seconds using HTTP JSON-RPC, which has several glaring inefficiencies that lead to unnecessary server load:
  • HTTP/1.1 persistent connections are uncommon, possibly because bitcoind does not support them.  This results in a new TCP connection from every miner, every 5-10 seconds, to the same network host.
  • 'getwork' data is a mere 256 bytes, but HTTP headers and binary-to-hexadecimal encoding for JSON increase the payload to more than double that.
  • official bitcoin client's RPC server implementation is essentially a single-threaded loop, where requests from clients B, C, and D will be stalled and ignored until client A's request is finished -- or a 30 second timeout (see -rpctimeout).  This algorithm does not tolerate a high TCP request rate from multiple threads / computers.
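
To put a rough number on the encoding overhead, here is a small Python sketch. The JSON envelope and the single `data` field are illustrative assumptions, not the exact 'getwork' response layout; the point is only that hex encoding alone doubles the byte count before any JSON or HTTP framing is added.

```python
import json


def hex_size(raw: bytes) -> int:
    # Each input byte becomes two hexadecimal characters.
    return len(raw.hex())


def json_rpc_body_size(raw: bytes) -> int:
    # Hypothetical JSON-RPC style envelope around the hex string.
    body = json.dumps({"result": {"data": raw.hex()}, "error": None, "id": 1})
    return len(body.encode("ascii"))


work = bytes(256)  # stand-in for 256 bytes of work data
print(hex_size(work), json_rpc_body_size(work))
```

HTTP request and response headers then come on top of the JSON body.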

Several people, pool operators in particular, have a keen interest in solving these problems.  In addition, push mining (see below) has been discussed as a future alternative to the 'getwork' polling method currently employed.


2. Design goals for a solution
----------------------------------------------------------------------------------
I have written a demonstration pool server (aka a 'getwork' proxy server) that functions in a similar fashion to the recently-discussed poold.py:  large numbers of miners connect to the pool server, which proxies 'getwork' JSON-RPC requests to the official bitcoin client.  This demonstration server implements a new binary protocol that was designed to meet the following goals:

  • Persistent TCP connections, to eliminate TCP disconnect+reconnect behavior by miners
  • Network-efficient for the common use case:  one network packet for 'getwork', one network packet for the returned data.
  • Network-efficient for similar use cases (monitorblocks, monitortx) where clients connect, and then passively wait for real-time events to be delivered
  • Existing miner client workflows supported, to minimize network protocol change impact on miners
  • Support "push mining," where server delivers new work to miners unsolicited (i.e. without the miner first sending a 'getwork' message)

This is not intended to replace JSON-RPC API, but to supplement it for specific use cases.  Yes, that means bitcoind will listen to three network ports: P2P network, JSON-RPC, and binary RPC (though as now, only P2P is required for operation; the servers are always optional).


3. Let's start with a protocol example: today's getwork mining
----------------------------------------------------------------------------------
The specific details of the protocol itself are in ubbp.h and protocol.h of the above URL (pushpool-0.4.tar.gz).  Here is an example, to provide a suitable introduction:

* TCP connection is broken up into messages.  Each message has a 64-bit (8-byte) header, which includes an 8-bit opcode field and a 24-bit length field.
* Miner client connects to TCP server, and issues a LOGIN message, which is compressed JSON login data + sha256 hash of (data + shared secret).
* Server responds with an OP_LOGIN_RESP msg, compressed JSON, indicating options and capabilities
* Client issues an OP_GETWORK msg (8 bytes)
* Server responds with an OP_WORK msg (264 bytes)
* Client uses its CPU/GPU to work on proof-of-work solution...
* Client issues an OP_GETWORK msg (8 bytes)
* Server responds with an OP_WORK msg (264 bytes)
* ...

The above example intentionally matches existing 'getwork' JSON-RPC miner client workflow today.  Miner clients may even support stateless operation by pipelining the OP_LOGIN and OP_GETWORK requests together, and closing the TCP connection.  Stateless operation is not recommended, but it is supported, in order to support the widest range of existing mining clients.
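
The message framing in this section could be sketched in Python as follows. This is only an illustration under stated assumptions: ubbp.h and protocol.h are the authority. The opcode values, the byte order, the position of the opcode and length bits, the treatment of the other four header bytes as opaque filler, and the idea that the length counts the payload only are all guesses made for the example.

```python
import struct

# Assumed layout: 4 opaque bytes, then one 32-bit word holding
# the 8-bit opcode and the 24-bit payload length.
HEADER = struct.Struct("<4sI")
RESERVED = b"\x00\x00\x00\x00"  # placeholder for the undescribed 32 bits

OP_GETWORK = 2  # hypothetical opcode value
OP_WORK = 3     # hypothetical opcode value


def pack_msg(opcode: int, payload: bytes = b"") -> bytes:
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in 8 bits")
    if len(payload) > 0xFFFFFF:
        raise ValueError("payload length must fit in 24 bits")
    word = (opcode << 24) | len(payload)
    return HEADER.pack(RESERVED, word) + payload


def unpack_header(buf: bytes):
    # Returns (opcode, payload_length) from the first 8 bytes.
    _, word = HEADER.unpack_from(buf)
    return word >> 24, word & 0xFFFFFF
```

Under these assumptions an OP_GETWORK message is just the 8-byte header, and an OP_WORK message carrying 256 bytes of work is 264 bytes, matching the sizes in the example above.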


4.  Tomorrow's mining:  push mining
----------------------------------------------------------------------------------
When a block or tx arrives, it is preferable to begin working immediately on the new work.  From the server's perspective, this is a classic data-broadcast problem, where the server wants to broadcast N different pieces of work to N miners.  Hence, "push mining" where the server pushes new work pro-actively to the miner clients.

This new network protocol supports push mining, as demonstrated in this example:

* Client connects to server, issues a LOGIN message with the "send_me_work" flag set
* Server responds with OP_LOGIN_RESP msg
* Server sends an OP_WORK msg
* Server sends an OP_WORK msg
* Server sends an OP_WORK msg
* ...
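
A LOGIN payload with the "send_me_work" flag might be built and checked roughly like this. It is a sketch, not the pushpool wire format: zlib as the compressor, the JSON key names, hashing the compressed bytes rather than the raw JSON, and appending the 32-byte digest after the data are all assumptions made for illustration.

```python
import hashlib
import hmac
import json
import zlib


def build_login(user: str, secret: bytes, send_me_work: bool = True) -> bytes:
    # Compressed JSON login data (hypothetical key names).
    login = {"user": user, "send_me_work": send_me_work}
    data = zlib.compress(json.dumps(login).encode("utf-8"))
    # sha256 over (data + shared secret), appended to the data.
    digest = hashlib.sha256(data + secret).digest()
    return data + digest


def verify_login(payload: bytes, secret: bytes):
    # Returns the decoded login dict, or None if the hash does not match.
    data, digest = payload[:-32], payload[-32:]
    expected = hashlib.sha256(data + secret).digest()
    if not hmac.compare_digest(expected, digest):
        return None
    return json.loads(zlib.decompress(data))
```

A server that sees `send_me_work` set to true would then start sending OP_WORK messages without waiting for OP_GETWORK.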


5.  A similar use case:  monitorblocks
----------------------------------------------------------------------------------
Gavin Andresen has a patch in his github which provides a very useful feature:  when a new block is received (monitorblocks) or new wallet transaction (monitortx), bitcoind sends an HTTP POST to the specified URL.  Thus, monitorblocks provides real-time monitoring of the bitcoin network, and monitortx provides real-time monitoring of the local wallet.  This sort of featureset pushes data as events occur, rather than forcing a website operator to poll JSON-RPC for certain operations to complete.

Monitoring new blocks on the bitcoin network is a very easy data broadcasting problem that this binary network protocol may easily support:

* Client connects to server, issues a LOGIN message with the "send_me_blocks" flag set
* Server responds with OP_LOGIN_RESP msg
* Server sends an OP_BLOCK msg
* Server sends an OP_BLOCK msg
* Server sends an OP_BLOCK msg
* ...

monitortx is more complicated, because one may specify transaction-matching criteria.  But with this new protocol's support of JSON, flexibility is not a problem.
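
For a passive client that only waits for pushed messages, the main job is splitting the TCP byte stream back into messages. A minimal sketch of that step, under the same caveats as the rest of this post's examples: the header layout (4 opaque bytes plus a 32-bit word with opcode in the top 8 bits and payload length in the low 24), the byte order, and the `OP_BLOCK` value are assumptions, and `Framer` is a made-up name.

```python
import struct

HEADER = struct.Struct("<4sI")  # assumed 8-byte header layout
OP_BLOCK = 9                    # hypothetical opcode value


class Framer:
    """Accumulates received bytes and yields complete messages."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk: bytes):
        # Returns a list of (opcode, payload) for every complete message.
        self._buf += chunk
        msgs = []
        while len(self._buf) >= HEADER.size:
            _, word = HEADER.unpack_from(self._buf)
            opcode, length = word >> 24, word & 0xFFFFFF
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # wait for the rest of this message
            msgs.append((opcode, bytes(self._buf[HEADER.size:end])))
            del self._buf[:end]
        return msgs
```

A client would call `feed()` with whatever each socket read returns and handle each OP_BLOCK (or OP_WORK) message as it completes.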


6.  A plan to proceed - this is just a rough draft
----------------------------------------------------------------------------------
I'm thinking of the following steps to proceed, given the need to coalesce several potentially parallel push-mining efforts:
  • write a pool server / proxy server that supports the new protocol (done)
  • hack existing miner clients (cpuminer, oclminer seem easy targets) to support new protocol. volunteers?
  • iterate, test, comment.  iterate, test, comment.  lather, rinse, repeat
  • Once people are happy, implement in official bitcoind
  • in parallel with any of the above efforts, update official bitcoind's rpc.cpp with a smarter httpd implementation

Let the comments begin...  hopefully someone will volunteer to mod a GPU miner to support this?



Appendix 1:  FAQ
----------------------------------------------------------------------------------
Q. Why invent a new protocol?  Why not use Google protocol buffers or XDR?

A. protobuf and XDR both require an underlying packetizing format, such as UBBP that I've presented here.  That implies the choice would be UBBP+protobuf or UBBP+JSON.  Given the bitcoin community's embrace of JSON, the latter was chosen.  JSON is actually more flexible than protobufs, because more dynamic data structures may be described using JSON.


Q.  Why did you not address glaring problems in getwork?

A.  I focused purely on a network-efficient protocol.  getwork implementation choices are outside the scope of this work.


Q.  What is the state / quality of this code release?

A.  Uh, it compiles and runs... but no clients yet exist for it.  Without a miner client for testing, it's about as useful as spitting on a fish...


