Pages:
Author

Topic: To Fee or not to Fee (Read 2082 times)

legendary
Activity: 1512
Merit: 1036
April 12, 2013, 01:06:22 PM
#22
The analysis to do is on no-fee transactions that have a priority > 57.6M. Compare with the times of those < 57.6M which include exactly the minimum per int(KB+1) fee. These are the default sending modes of Bitcoin-qt with no optional fee.

My hypothesis is that Bitcoin's default behaviour creates two classes of transactions, where the "preferred/non-spammy" class actually gets a lower quality of service from miners.
donator
Activity: 2058
Merit: 1007
Poor impulse control.
April 12, 2013, 08:30:49 AM
#21
This might help. I've excluded all the txs that were included in the next block, since most of them were.



         n  meanTime medianTime feePerByte
 [1,] 40969 245.57797      221.0          0
 [2,] 10445 226.43025      195.0          1
 [3,]  1334 191.66117      194.0          2
 [4,]   261 118.05747       75.0          3
 [5,]   277 130.15162       96.0          4
 [6,]    73 113.31507       46.0          5
 [7,]    46 145.69565       48.5          6
 [8,]    49  49.73469       39.0          7
 [9,]    56  97.51786       25.0          8
[10,]    15  83.13333       39.0          9
[11,]    22  55.68182       39.0         10
[12,]    35  94.05714       45.0         11
[13,]     7  60.71429       38.0         12
[14,]    18 100.72222       23.0         13
[15,]     6  41.66667       33.0         14
[16,]     6  54.00000       24.0         15
[17,]     4  10.50000       10.0         16
[18,]     1 148.00000      148.0         17
[19,]     1  11.00000       11.0         21
[20,]     1   7.00000        7.0         22
[21,]     2 140.50000      140.5         24
[22,]     6  32.83333       45.0         34



Jan
legendary
Activity: 1043
Merit: 1002
April 12, 2013, 08:16:17 AM
#20

X-axis denotes how many blocks it took for a given transaction to confirm from the time where it was observed (0 means next block)


If I am reading this correctly, some transactions took 7000+ blocks to confirm?!
Well, ehrm... mix of axis... Smiley  7000+ satoshis/byte in fee
sr. member
Activity: 424
Merit: 250
April 12, 2013, 07:47:58 AM
#19

X-axis denotes how many blocks it took for a given transaction to confirm from the time where it was observed (0 means next block)


If I am reading this correctly, some transactions took 7000+ blocks to confirm?!
Jan
legendary
Activity: 1043
Merit: 1002
April 12, 2013, 07:12:16 AM
#18
Analyze away.

I think it's easy to spot the following cut-off points in your sample dataset:
- a fee over ~5300 s/byte will guarantee inclusion in the next block
- a fee over ~3000 s/byte will guarantee inclusion in the next two blocks
- a fee over ~500 s/byte will guarantee inclusion in the next three blocks (*)
- etc.
- even txns with 0 fee are picked up in less than ~35 blocks

(*) The only exception for this a single transaction with a ~2000 s/byte fee that took about 7 blocks; I suspect it didn't have all it's inputs confirmed and that's why it's an outlier. So the  correct block delay should be computed from the moment when all inputs are confirmed.

"Guaranteed" should of course be understood "in light of historical data". You can make no guarantee that a transaction will be picked up, but absent a major shift in the behavior of miners or transaction rate, historical data is our best chance to compute a "fair" fee that won't waste money for a given confirmation speed. If the client sets a fee that is too low for current conditions, the resulting delay will signal to other clients the new market conditions and they will increase the fee accordingly. A fee that is too high can also be detected by the clients, if they target say a 90-95% inclusion at a given block instead of 100% guarantee of inclusion.



Given the full set of transactions as you have provided us, the algorithm to compute the correct fee seems a simple binary search :
   start with a guesstimate fee F = maxfee/2
   set search step S = F
   set delay target T (ex. 3 blocks)
   set inclusion confidence L (ex. 90%)
   set precision of fee determination epsilon (ex. 1%)
   loop
      n = count(transactions with fee <= F AND confirmation delay <= T)
      N = count (transactions with fee > F AND confirmation delay > T)
      compare n/(n+N) with L
      F=F+/-S (sign depending on the compare)
      S=S/2
      break if S < epsilon*F
   continue loop
   return F

So I basically split the data into four quadrants depending on fee and delay, as compared to our delay target and current fee. Since they bring us no information, I discard transaction with low fee and high delay, as well as those with high fee and low delay. Then compare the data in the remaining quadrants with our target and continue to search for the optimal fee for achieving our goal. I don't have any functional code but this should work. At least that's how the theory goes Smiley

The trouble with this algorithm is that it doesn't scale too well because it needs to walk all transactions multiple times. At high rates it could be a killer. So instead of walking actual transactions, you could sample them in discrete bins just once, then run the algorithm on the bins. The bins would form a 2D table, and each bin will consist of a single integer that records how many transactions match the given fee interval and delay. For the fee axis I would go with an exponential pace, for example each bin represents a 10% increase in fee. Of course the precision of fee determinations drops but you only need to build the table once and after that it's just a few kB to download from a central location for example.

An even better algorithm could take into consideration current unconfirmed transaction rate and adjust the fee (linearly ? needs experimentation); this is especially important when the block limit is reached and fee ceilings transition from a fixed ceiling set by the miner ("don't include transactions with less than x BTC fee") to a dynamic rate set by the market ("include the transactions with the largest fees possible in a given block size").

Interesting. I'll give it a spin and see how it works out.
Jan
legendary
Activity: 1043
Merit: 1002
April 12, 2013, 07:10:05 AM
#17
Would you mind plotting the data on a log/log chart? Cheers.

same data with log/log:
cat raw_tx_data.csv | cut -d ',' -f 2,3 | sed -e 's/,/ /g' | gnuplot -p -e "set yrange [1:60]; set logscale; set terminal png size 1500,1000; set output 'fig.png'; set xrange[1:8000] ;plot '-'"

Note that blockcount=0 omitted (gnuplot does not like log(0))

donator
Activity: 2058
Merit: 1007
Poor impulse control.
April 09, 2013, 05:13:51 PM
#16
Would you mind plotting the data on a log/log chart? Cheers.
sr. member
Activity: 504
Merit: 250
April 09, 2013, 05:01:12 PM
#15
Analyze away.

I think it's easy to spot the following cut-off points in your sample dataset:
- a fee over ~5300 s/byte will guarantee inclusion in the next block
- a fee over ~3000 s/byte will guarantee inclusion in the next two blocks
- a fee over ~500 s/byte will guarantee inclusion in the next three blocks (*)
- etc.
- even txns with 0 fee are picked up in less than ~35 blocks

(*) The only exception for this a single transaction with a ~2000 s/byte fee that took about 7 blocks; I suspect it didn't have all it's inputs confirmed and that's why it's an outlier. So the  correct block delay should be computed from the moment when all inputs are confirmed.

"Guaranteed" should of course be understood "in light of historical data". You can make no guarantee that a transaction will be picked up, but absent a major shift in the behavior of miners or transaction rate, historical data is our best chance to compute a "fair" fee that won't waste money for a given confirmation speed. If the client sets a fee that is too low for current conditions, the resulting delay will signal to other clients the new market conditions and they will increase the fee accordingly. A fee that is too high can also be detected by the clients, if they target say a 90-95% inclusion at a given block instead of 100% guarantee of inclusion.



Given the full set of transactions as you have provided us, the algorithm to compute the correct fee seems a simple binary search :
   start with a guesstimate fee F = maxfee/2
   set search step S = F
   set delay target T (ex. 3 blocks)
   set inclusion confidence L (ex. 90%)
   set precision of fee determination epsilon (ex. 1%)
   loop
      n = count(transactions with fee <= F AND confirmation delay <= T)
      N = count (transactions with fee > F AND confirmation delay > T)
      compare n/(n+N) with L
      F=F+/-S (sign depending on the compare)
      S=S/2
      break if S < epsilon*F
   continue loop
   return F

So I basically split the data into four quadrants depending on fee and delay, as compared to our delay target and current fee. Since they bring us no information, I discard transaction with low fee and high delay, as well as those with high fee and low delay. Then compare the data in the remaining quadrants with our target and continue to search for the optimal fee for achieving our goal. I don't have any functional code but this should work. At least that's how the theory goes Smiley

The trouble with this algorithm is that it doesn't scale too well because it needs to walk all transactions multiple times. At high rates it could be a killer. So instead of walking actual transactions, you could sample them in discrete bins just once, then run the algorithm on the bins. The bins would form a 2D table, and each bin will consist of a single integer that records how many transactions match the given fee interval and delay. For the fee axis I would go with an exponential pace, for example each bin represents a 10% increase in fee. Of course the precision of fee determinations drops but you only need to build the table once and after that it's just a few kB to download from a central location for example.

An even better algorithm could take into consideration current unconfirmed transaction rate and adjust the fee (linearly ? needs experimentation); this is especially important when the block limit is reached and fee ceilings transition from a fixed ceiling set by the miner ("don't include transactions with less than x BTC fee") to a dynamic rate set by the market ("include the transactions with the largest fees possible in a given block size").
legendary
Activity: 1120
Merit: 1164
April 06, 2013, 07:13:37 AM
#14
I changed my timestamper to set nLockTime on every transaction it creates to the current block height. Every 10 minutes it checks if any timestamps are pending, and if so, creates a timestamp transaction; from the point of view of block creation the timestamps are created randomly. I also changed it to randomly set the fees between 2 and 5mBTC for each timestamp, which works out to a range of 5mBTC/KB to 12.5mBTC/KB.

Note that I have not changed the sequence numbers from MAXINT, so the nLockTime thing is informational only and won't affect propagation.

I seem to be getting a dozen timestamp requests per day at what appear to be random intervals, and I added some temporary code to create some additional dummy ones. Over time it could be an interesting data source, albeit one with an unusual transaction form.

54e243ffb097f5b6acd6ea4a55925fddba4078ca750b4aef4142b837ffdff2b8 is an example transaction; they all have 13MH4zmU4UT4Ct6BhoRFGjigC8gN9a9FNn as the first multisig address. (the second is the tip of the merkle tree of submitted digests to be timestamped)
Jan
legendary
Activity: 1043
Merit: 1002
April 06, 2013, 05:31:28 AM
#13
Any completely automated metric is potentially abused by miners to run up fees.

Challenge accepted, what about this: You watch the last few days of transactions and build a histogram, delay vs transaction fee. Delay is expressed as blocks mined from txn broadcast to inclusion (starts from 1, the txn in included in the next block) and the fee is expressed in satoshi/byte. Using such a histogram and the correct algorithm you can spot the minimal fee that has say a 95% chance of getting you into the next block, or the next n blocks. You can expose this to the user via a slider and he can chose to pay a high fee for 99% chance of catching the next block or a lower fee for 80% chance in the next 10 blocks.

The miners could send high fee transactions among themselves all they want, they will not affect the probability of other genuine transactions of getting in. If the fee market is competitive the histogram will reflect the true minimal fee that gets you in. The only avenue of manipulation is for miners to prioritize special low fee transactions and make it appear as if the entry fee is lower than it really is. But there's little incentive for miners to create that perception.

OK, so lets get some real life stats on the table. The following plot was made by observing blocks 229764 - 229939



Every + is a transaction. Total transactions 60608 in 175 blocks

X-axis denotes how many blocks it took for a given transaction to confirm from the time where it was observed (0 means next block)
Y-axis denotes the fee paid pr byte measured in satoshis

Raw data: raw_tx_data.csv (confirming-block-height, fee-pr-byte, confirmation-time-in-blocks, txid)

Plot made like this:
cat raw_tx_data.csv | cut -d ',' -f 2,3 | sed -e 's/,/ /g' | gnuplot -p -e "set yrange [-1:60]; set terminal png size 1500,1000; set output 'fig.png'; set xrange[-100:8000] ;plot '-'"

Analyze away.
full member
Activity: 227
Merit: 100
April 05, 2013, 09:43:09 PM
#12
I think there doesn't have to be One True Answer, and I'd like to see the different clients experiment with different ways of estimating fees.

I like your idea, Jan-- go for it!

I want Consumer Reports magazine to do an article in 15 years comparing bitcoin wallets and figuring out which one gives the fastest transactions for the lowest fees...


15 years!?
How 'bout 5!   Smiley
sr. member
Activity: 504
Merit: 250
April 05, 2013, 04:22:24 PM
#11
Any completely automated metric is potentially abused by miners to run up fees.

Challenge accepted, what about this: You watch the last few days of transactions and build a histogram, delay vs transaction fee. Delay is expressed as blocks mined from txn broadcast to inclusion (starts from 1, the txn in included in the next block) and the fee is expressed in satoshi/byte. Using such a histogram and the correct algorithm you can spot the minimal fee that has say a 95% chance of getting you into the next block, or the next n blocks. You can expose this to the user via a slider and he can chose to pay a high fee for 99% chance of catching the next block or a lower fee for 80% chance in the next 10 blocks.

The miners could send high fee transactions among themselves all they want, they will not affect the probability of other genuine transactions of getting in. If the fee market is competitive the histogram will reflect the true minimal fee that gets you in. The only avenue of manipulation is for miners to prioritize special low fee transactions and make it appear as if the entry fee is lower than it really is. But there's little incentive for miners to create that perception.
legendary
Activity: 1400
Merit: 1013
April 05, 2013, 03:51:48 PM
#10
I think there doesn't have to be One True Answer, and I'd like to see the different clients experiment with different ways of estimating fees.
It sounds like you're suggesting that price discovery is a superior method for answering the question than central planning.
legendary
Activity: 1652
Merit: 2316
Chief Scientist
April 05, 2013, 03:50:25 PM
#9
I think there doesn't have to be One True Answer, and I'd like to see the different clients experiment with different ways of estimating fees.

I like your idea, Jan-- go for it!

I want Consumer Reports magazine to do an article in 15 years comparing bitcoin wallets and figuring out which one gives the fastest transactions for the lowest fees...
Jan
legendary
Activity: 1043
Merit: 1002
April 05, 2013, 03:20:24 AM
#8
The only way I see a miner can game my proposal is by adding transactions sending to himself with large fees.
When doing that he would not broadcast the transactions, but silently include them in his block. (If he broadcasts them another miner would be likely to collect the fee if he finds the block first.)

Let me know if you think there is another way to game it.

There is simple way to detect and filter those transactions out. Simply disregard transactions that you did not see observe as unconfirmed before they were included in a block.

Using this mechanism means that your node has to be running for some time before it can give a good estimate for a fee.


newbie
Activity: 28
Merit: 0
Jan
legendary
Activity: 1043
Merit: 1002
April 04, 2013, 03:57:59 AM
#6
It's worth reviewing the many threads on this subject, that have come before Smiley

Any completely automated metric is potentially abused by miners to run up fees.

I don't intend to totally automate it. The user would be presented with something like

"Choose the transaction fee for your transaction:
Fast: Fee X*2 BTC (Send)
Normal: Fee X BTC (Send)
Economic: Fee X/2 BTC (Send)
(Cancel)
"

Miners gaming the fee calculation is a risk though, as they can include transactions sent to themselves with enormous fees. My thinking is to try to get around this by cutting away the top 25% and letting the user choose the economic option (half of the remaining average) or cancel.
If a miner tries to drive up the fee he will have to let more than 25% of the transactions be sent from himself to himself,  and have at a loss due to unrealized profits from fees in 'legit' transactions.

Since mining is not a monopoly and the average fee-pr-byte is calculated over say 144 blocks the miner who wants to drive up the fees will have a loss to the remaining miners (unrealized fees), and will have to share the gain with them.

If the miners collude they can drive the fees to whatever they want, but this is aslo the case with the current structure, by simply not including any transactions with say a 10% fee.

hero member
Activity: 700
Merit: 500
April 04, 2013, 12:27:14 AM
#5
So to me the question: How do I predict the cost of getting my transaction into the next few blocks?
Is really: How do I predict the cost of getting a byte into the next few blocks?

My half-baked thinking on this is to:
1. Look at the last 144 blocks (24 hours)
2. Filter away gratis transactions
3. Filter away the bottom 25% and top 25% transactions that include a fee (measured in fee-pr-byte)
4. Calculate f1 = my-transaction-size-in-bytes *
5. Calculate f2 = the fee using traditional fee mechanisms for my transaction (0.0005 pr 1000 bytes)
6. Final fee =  max(f1,f2)

(Obviously step 4 and 5 will have to iterate until a fix point is reached as the size of the transaction may be affected by the fee size.)
Step 6 is there to avoid situations where f1 < f2, and miners still use the classic fee rules

On top of this the user could configure his wallet to be:
Economic: f1 = f1 * 1/2 (get confirmed when there is room)
Normal: f1 unmodified (get confirmed at a 'normal' rate)
Priority: f1 = f1 * 2 (get confirmed quickly)

This way end users have some control over their fees/confirmation-times without having an extended math degree, plus we get a fee market going.

What do you think?

This sounds like a fairly reasonable setting for your client, whether you set is as the default or make it an option. As long as not every client bids for space in this same manner it shouldn't be tempting enough for miners to try to game it.

For your app, make the default fee 0.0005 but in the advanced options, users can set the fee to 0.0001 if they want to. Then just put a notice that fees other than the default are NOT guaranteed to be included in the next block.

I like the idea of being able to set a default fee too, but for the other reason. I like bidding a bit above market to make sure my transactions are confirmed fast.
legendary
Activity: 3416
Merit: 1912
The Concierge of Crypto
April 03, 2013, 11:12:27 PM
#4
For your app, make the default fee 0.0005 but in the advanced options, users can set the fee to 0.0001 if they want to. Then just put a notice that fees other than the default are NOT guaranteed to be included in the next block.

The transaction should "expire" when a certain time or blocks has passed such as 1 day or 144 blocks later.
legendary
Activity: 1596
Merit: 1100
April 03, 2013, 05:39:10 PM
#3
It's worth reviewing the many threads on this subject, that have come before Smiley

Any completely automated metric is potentially abused by miners to run up fees.

Pages:
Jump to: