
Topic: Pruning OP_RETURNs with illegal content (Read 4168 times)

legendary
Activity: 1890
Merit: 1086
Ian Knowles - CIYAM Lead Developer
July 03, 2015, 10:39:03 AM
#23
I built a simple testbed that encodes data in "sigs" (that use random K values) and although not the cheapest way to store data (the experiment was two bytes per sig and on an average laptop was taking roughly 15 seconds per byte pair to "mine" the sigs) it demonstrates that it is impossible to stop encoding of whatever data you want in the blockchain without even using OP_RETURN stuff.
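For a sense of why a byte pair costs that much work: forcing two chosen bytes to appear in an otherwise random-looking value takes about 2^16 = 65536 attempts on average. The sketch below is only an illustration of that brute-force search. It uses SHA-256 of a counter as a stand-in for a signature produced with a fresh K value, which is an assumption made for brevity and not the testbed described above; the function name `grind` is hypothetical.

```python
import hashlib
from itertools import count


def grind(target: bytes) -> tuple[int, bytes]:
    """Try successive nonces until the digest starts with `target`.

    SHA-256 of a counter stands in for "sign again with a new random K";
    a real signature search would be far slower per attempt.
    """
    for nonce in count():
        digest = hashlib.sha256(nonce.to_bytes(8, "big")).digest()
        if digest[: len(target)] == target:
            return nonce, digest


# Search for a digest whose first two bytes spell the chosen byte pair.
nonce, digest = grind(b"hi")
print(f"found after {nonce + 1} attempts: {digest.hex()}")
```

Each extra byte multiplies the expected number of attempts by 256, which is why this kind of encoding is slow but cannot be ruled out.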
legendary
Activity: 1232
Merit: 1094
One way to help with this is to make P2SH mandatory.

This means that the UTXO set could be stored as a set of hashes.  You don't even have to store the hash target.

UTXO pays to H = hash(output script)

You can store hash(salt | hash(output script)).  The salt isn't really required.

When someone tries to spend the output, they have to provide the output script as part of the P2SH process.

Archive nodes would have to store H though.  Attackers could encode their data in H.

This could be helped by requiring that blocks near the tip include H, but blocks more than 144 blocks deep can encode hash(H).

New blocks

UTXO: pay to H1 = hash(output script)

Old blocks

UTXO: pay H2 = hash(hash(output script))

This ensures that all the long term hashes (H2) have a known H1 that they are a hash of.

This makes it harder to encode info in the hashes.  You could still encode it in the last few bits though.
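A minimal sketch of the H1/H2 idea above, under some assumptions: SHA-256 stands in for whatever hash the scheme would really use (actual P2SH uses a different hash construction), the script bytes are a placeholder, and the helper names `h` and `can_spend` are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Stand-in hash function (assumption: SHA-256)."""
    return hashlib.sha256(data).digest()


# Placeholder bytes; a real output script would go here.
output_script = b"example output script"

h1 = h(output_script)  # what a new block would commit to
h2 = h(h1)             # what a block more than 144 deep could keep instead


def can_spend(stored: bytes, revealed_script: bytes, deep: bool) -> bool:
    """Check a revealed script against the stored commitment.

    `deep` selects whether the stored value is H1 (recent block)
    or H2 (old block, one extra hash).
    """
    candidate = h(revealed_script)
    if deep:
        candidate = h(candidate)
    return candidate == stored


print(can_spend(h1, output_script, deep=False))  # spender reveals the script
print(can_spend(h2, output_script, deep=True))
```

The spender still only has to reveal the output script; the node hashes it once or twice depending on how deep the output is.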
member
Activity: 66
Merit: 10
All due respect Peter, but being concerned about having illegal data in a system is not absurd

I'm not Peter, but yes it absolutely is absurd.  What is legal is constantly changing and has thousands of different definitions depending on where you happen to be.

The whole point of Bitcoin is technology that works across all borders and allows for consensus everywhere, irrespective of borders and man-made laws.  If you don't like some of the data that gets stored on your computer, you are free to delete it.  But yes, full nodes must keep all data passed to them.  That's how the system works.  It's *good* it works that way.

Yes tools that enable free speech also allow bad guys to say bad things, but having free speech guaranteed by technology is worth that price.  Tor is valuable even if it allows pedos to set up hidden services.  PGP is valuable even if it means the terrorists can plan things.  Bitcoin is valuable even if it means nodes get passed data they don't approve of.  The benefits of these technologies far outweigh these concerns, and their preservation is of the utmost importance, far outweighing the concern for the bad things bad people do with them.  Robbers and kidnappers use cars and phones to do the bad things they do -- but would you think it's appropriate that the government should monitor where you go and who you call to make sure that you don't engage in that activity?

There is no tool that only does good things for good people.  There are only tools.  We accept that some tools make life easier for bad people, because they also make life so much better for good people.  Having phones and cars... and Tor and PGP and Bitcoin, is better than a world without those things.  And talking about crippling them so that their bad uses are harder does nothing but cripple the entire system, and subverts the entire point of the technology.

There is no phone that can only be used by good people.  And there is no distributed consensus system that can take any kind of data except illegal data.  And to talk about such a thing is silly... but to *want* such a thing is absurd.

Yes, it is absurd.

I have to agree with that about PGP.
newbie
Activity: 5
Merit: 0
March 02, 2015, 03:14:32 AM
#20
In my opinion, information implies storage and presentation. Take child pornography as an example: the data stored on the hard disk or even in the blockchain is storage; the video decoded from that data is presentation. You CANNOT prevent somebody from publishing the data, because the data can be transformed into infinitely many formats. If it's said that the blockchain must solve this problem, then ALL information systems also need to solve it.
legendary
Activity: 1022
Merit: 1008
Delusional crypto obsessionist
February 28, 2015, 07:41:10 AM
#19
All due respect Peter, but being concerned about having illegal data in a system is not absurd

I'm not Peter, but yes it absolutely is absurd.  What is legal is constantly changing and has thousands of different definitions depending on where you happen to be.

The whole point of Bitcoin is technology that works across all borders and allows for consensus everywhere, irrespective of borders and man-made laws.  If you don't like some of the data that gets stored on your computer, you are free to delete it.  But yes, full nodes must keep all data passed to them.  That's how the system works.  It's *good* it works that way.

Yes tools that enable free speech also allow bad guys to say bad things, but having free speech guaranteed by technology is worth that price.  Tor is valuable even if it allows pedos to set up hidden services.  PGP is valuable even if it means the terrorists can plan things.  Bitcoin is valuable even if it means nodes get passed data they don't approve of.  The benefits of these technologies far outweigh these concerns, and their preservation is of the utmost importance, far outweighing the concern for the bad things bad people do with them.  Robbers and kidnappers use cars and phones to do the bad things they do -- but would you think it's appropriate that the government should monitor where you go and who you call to make sure that you don't engage in that activity?

There is no tool that only does good things for good people.  There are only tools.  We accept that some tools make life easier for bad people, because they also make life so much better for good people.  Having phones and cars... and Tor and PGP and Bitcoin, is better than a world without those things.  And talking about crippling them so that their bad uses are harder does nothing but cripple the entire system, and subverts the entire point of the technology.

There is no phone that can only be used by good people.  And there is no distributed consensus system that can take any kind of data except illegal data.  And to talk about such a thing is silly... but to *want* such a thing is absurd.

Yes, it is absurd.

There should be an upvote or karma button on this forum.
Or otherwise a bitcoin voting system or something.

I like your post. I would upvote.
legendary
Activity: 1022
Merit: 1008
Delusional crypto obsessionist
February 28, 2015, 07:37:57 AM
#18
Just researching another possible threat.
What if an individual who is angry at Bitcoin for whatever reason tries to insert some illegal content into blockchain?

The funny thing is, bitcoin doesn't give a f*ck about rules or law.
Bitcoin doesn't 'know' what is illegal or not.
So, it doesn't matter.

I would like to invite anybody to put as much 'illegal' zeros and ones into the blockchain as possible.
I'm pretty sure that bitcoin will live on like nothing really happened.

Isn't it great?

#unicorns
legendary
Activity: 1400
Merit: 1013
February 27, 2015, 10:21:59 PM
#17
It's not like you can expect to grab other people's random pubkeys and do things with them and have anything but doom come of it.
Sure, you wouldn't want to do that with random scripts in the blockchain.

On the other hand, colored coin techniques allow you to know if the recipient of certain utxos has indicated a readiness to participate in Diffie-Hellman payments due to the fact that they've handled the outputs in a manner consistent with the color rules.

If one of the coloring rules happens to be "outputs of this color must be stored in pay-to-pubkey or pay-to-multisig outputs", then you could have the colored coins represent shares, and use Diffie-Hellman to derive dividend payment addresses.
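To make the Diffie-Hellman step concrete, here is a rough sketch of two parties arriving at the same shared secret from each other's pubkeys on secp256k1. The private keys are toy values, and hashing the shared x coordinate into a "tweak" is only one assumed way to turn the secret into something address-like. It is not a description of any existing colored coin scheme, and the code is not constant-time or otherwise fit for real keys.

```python
import hashlib

# secp256k1 field prime and generator point.
P = 2**256 - 2**32 - 977
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def mul(k, point):
    """Scalar multiplication by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


# Toy private keys (hypothetical values, far too small for real use).
issuer_priv, holder_priv = 0xA11CE, 0xB0B
issuer_pub, holder_pub = mul(issuer_priv, G), mul(holder_priv, G)

# Each side combines its own private key with the other's pubkey.
shared_by_issuer = mul(issuer_priv, holder_pub)
shared_by_holder = mul(holder_priv, issuer_pub)

# Assumed derivation step: hash the shared x coordinate into a tweak.
tweak = hashlib.sha256(shared_by_issuer[0].to_bytes(32, "big")).hexdigest()
print(shared_by_issuer == shared_by_holder, tweak)
```

This only works if the pubkeys in the outputs are real keys, which is the point being made about pay-to-pubkey and pay-to-multisig outputs.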

Doesn't even require OP_RETURN for any step of the process.
staff
Activity: 4242
Merit: 8672
February 27, 2015, 09:18:20 PM
#16
Honestly I think new non-P2SH outputs should be made invalid at some point in the future not because of illegal activity but because if used for arbitrary data it bloats not just the blockchain but the far more critical UTXO set.
On the other hand, if those pubkeys are actually pubkeys and not arbitrary data, you can do Diffie-Hellman with them.
That's a useful property.
It's true but that data can be provided in a way that's optional-- aux data that may or may not come along with the blocks-- or even completely externally.  It's not like you can expect to grab other people's random pubkeys and do things with them and have anything but doom come of it. ("What do you mean you didn't know, I sent you a message!" "No you didn't." "Yes I did, it was right here." "That's not my house, that's the neighbor's flower bed." "Yea, so? it was accessible to you! plus it was encrypted with your key!" "and that's not my key!" "Sure it is, I took this random key you used in 1997 and added four to it, you could have decoded that"...)

Quote
* I think the problem is more universally described as can a blockchain be constructed such that non-transaction data is limited only to outputs that can be pruned without affecting blockchain validation.  I believe the answer is no.  It can be made harder to accomplish but it can't be made impossible.
The answer is yes. In two ways:  The easier but less useful way is to point out that Zero Knowledge proofs for general computation are known to be possible (and are verging on practical for non-trivial problems now).  In theory I could give you a blockchain tip, a utxo set for it, and the total work behind it, along with a ZK proof that the chain was completely valid, and that the utxo set agrees. This meets your criteria.

The other way depends exactly on what you mean by "blockchain validation". Basically, do you ever consider signatures prunable? e.g. when they're buried deep in the chain.  We do know how to make txout scriptPubKeys "provably a hash". If signatures are pruned and all txouts are hashes then there are basically no non-trivial sidechannels left.

legendary
Activity: 1400
Merit: 1013
February 27, 2015, 01:19:25 AM
#15
Honestly I think new non-P2SH outputs should be made invalid at some point in the future not because of illegal activity but because if used for arbitrary data it bloats not just the blockchain but the far more critical UTXO set.
On the other hand, if those pubkeys are actually pubkeys and not arbitrary data, you can do Diffie-Hellman with them.

That's a useful property.
staff
Activity: 4242
Merit: 8672
February 25, 2015, 06:09:33 PM
#14
It may be absurd that there is reason to be concerned about such things, but there is reason to be concerned none the less.  Sometimes the world is absurd.
hero member
Activity: 793
Merit: 1026
February 25, 2015, 04:14:52 PM
#13
All due respect Peter, but being concerned about having illegal data in a system is not absurd

I'm not Peter, but yes it absolutely is absurd.  What is legal is constantly changing and has thousands of different definitions depending on where you happen to be.

The whole point of Bitcoin is technology that works across all borders and allows for consensus everywhere, irrespective of borders and man-made laws.  If you don't like some of the data that gets stored on your computer, you are free to delete it.  But yes, full nodes must keep all data passed to them.  That's how the system works.  It's *good* it works that way.

Yes tools that enable free speech also allow bad guys to say bad things, but having free speech guaranteed by technology is worth that price.  Tor is valuable even if it allows pedos to set up hidden services.  PGP is valuable even if it means the terrorists can plan things.  Bitcoin is valuable even if it means nodes get passed data they don't approve of.  The benefits of these technologies far outweigh these concerns, and their preservation is of the utmost importance, far outweighing the concern for the bad things bad people do with them.  Robbers and kidnappers use cars and phones to do the bad things they do -- but would you think it's appropriate that the government should monitor where you go and who you call to make sure that you don't engage in that activity?

There is no tool that only does good things for good people.  There are only tools.  We accept that some tools make life easier for bad people, because they also make life so much better for good people.  Having phones and cars... and Tor and PGP and Bitcoin, is better than a world without those things.  And talking about crippling them so that their bad uses are harder does nothing but cripple the entire system, and subverts the entire point of the technology.

There is no phone that can only be used by good people.  And there is no distributed consensus system that can take any kind of data except illegal data.  And to talk about such a thing is silly... but to *want* such a thing is absurd.

Yes, it is absurd.
member
Activity: 98
Merit: 10
GlideSEC - www.glidesec.com
February 19, 2015, 09:03:36 AM
#12
You can much more easily publish data via the blockchain w/o OP_RETURN, and furthermore, you can easily put that data in to the UTXO set which all nodes *must* have if they are to maintain consensus.

Mike Hearn suggested we adopt blacklists to solve this problem back when someone put the child porn sections of the hidden wiki into the UTXO set; no-one's come up with a better solution since. You can make publishing that data more expensive by a small linear factor - about 10x to 100x - but that's the best you can do.

The best solution to this problem is legal and political: the idea that you have to prevent every last trace of "illegal data" from getting into a public ledger is absurd.

well said Peter.
staff
Activity: 4242
Merit: 8672
February 18, 2015, 09:11:34 PM
#11
There is no reason for you, personally, to keep around any old transactions for things buried in the blockchain. Pruning removes all transactions and signatures already and a full verifying node can happily be run this way.
legendary
Activity: 4130
Merit: 1307
February 18, 2015, 04:35:53 PM
#10
The question is: illegal content WHERE?

Illegal where the individual user is located.  I don't think there is a technical solution because Bitcoin's scripting language is so open ended*.  If it was impossible to store arbitrary data outside of OP_RETURN outputs and OP_RETURN outputs could be pruned (intra-transaction pruning) then it wouldn't be universally pruned.  Local users would prune what they feel is objectionable or illegal on an individual basis.  All that is academic though because it is so easy to encode arbitrary data in the blockchain in a manner which simply can not be pruned by full
...

I don't think there is a technical solution either.  I don't think there is a political one either given the immense number of jurisdictions.  It is like trying to remove pee from a pool instantly.

If the "illegal where" is "where the individual user is located" there is really no solution, since some data cannot be pruned, short of some type of blacklist (a bad idea as above) where the most restrictive laws anywhere are enforced.  That was kind of the point in asking the question. :-)
donator
Activity: 1218
Merit: 1079
Gerald Davis
February 18, 2015, 02:27:21 PM
#9
The question is: illegal content WHERE?

Illegal where the individual user is located.  I don't think there is a technical solution because Bitcoin's scripting language is so open ended*.  If it was impossible to store arbitrary data outside of OP_RETURN outputs and OP_RETURN outputs could be pruned (intra-transaction pruning) then it wouldn't be universally pruned.  Local users would prune what they feel is objectionable or illegal on an individual basis.  All that is academic though because it is so easy to encode arbitrary data in the blockchain in a manner which simply can not be pruned by full nodes.

The simplest method would be in a native (non-P2SH) multisig output.  Any PkScript would work but up to 3 of 3 native multisig with an output just above the dust threshold is considered standard, can't be pruned, and can be used multiple times in one transaction.

PkScript:  1 <pubkey> <pubkey> <pubkey> 3 OP_CHECKMULTISIG

Instead of three valid pubkeys one could encode up to 195 bytes:

PkScript:  1 <65 bytes> <65 bytes> <65 bytes> 3 OP_CHECKMULTISIG

Now this can provably never be spent, so nodes could remove it from the UTXO set, but that is easily solved by encoding 128 bytes using two fake pubkeys w/ proper prefix and one valid key.

PkScript:  1 <04|64 bytes> <04|64 bytes> <valid pubkey> 3 OP_CHECKMULTISIG
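A "proper prefix" alone does not make 65 bytes a real key. As a rough illustration of what a stricter check could look like, the sketch below tests whether an uncompressed push is an actual point on the secp256k1 curve (y^2 = x^3 + 7 mod p). This is an assumption about what a node could do, not a claim about what any node does, and the function name is hypothetical.

```python
# secp256k1 field prime.
P = 2**256 - 2**32 - 977

# Uncompressed encoding of the secp256k1 generator, used as a known-good key.
GENERATOR = bytes.fromhex(
    "04"
    "79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798"
    "483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8"
)


def is_valid_uncompressed_pubkey(push: bytes) -> bool:
    """Return True only if `push` is 04|x|y with (x, y) on the curve."""
    if len(push) != 65 or push[0] != 0x04:
        return False
    x = int.from_bytes(push[1:33], "big")
    y = int.from_bytes(push[33:], "big")
    if x >= P or y >= P:
        return False
    return (y * y - (x * x * x + 7)) % P == 0


print(is_valid_uncompressed_pubkey(GENERATOR))           # a real curve point
print(is_valid_uncompressed_pubkey(b"\x04" + bytes(64)))  # prefix only, not a point
```

Even with such a check, data could still be ground into keys that do pass it, so this raises the cost of encoding without making it impossible.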


* I think the problem is more universally described as can a blockchain be constructed such that non-transaction data is limited only to outputs that can be pruned without affecting blockchain validation.  I believe the answer is no.  It can be made harder to accomplish but it can't be made impossible.



donator
Activity: 1218
Merit: 1079
Gerald Davis
February 18, 2015, 02:23:56 PM
#8
Nah, see, the whole point of OP_RETURN is that unless you're bootstrapping new clients you don't actually need to store it. If pruning is implemented, you're not pruning the transaction - you're pruning the output. The coins that were used as inputs are still used up in your local state.

Bitcoin doesn't support intra-transaction pruning.  It would be possible using a merkle tree of inputs and outputs but currently you can't prune just an output and still validate the transaction.  Now you are right we don't use the blockchain to validate new blocks and txns.  We use the blockchain to build the UTXO and use the UTXO to validate new txns and blocks.  Technically you could delete the entire historical blockchain once you parse it without any reduction in security but that isn't what most people mean when they say prune the blockchain.
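The "build the UTXO from the blockchain, then validate against the UTXO" point can be sketched with a toy model. Everything here is simplified and the type names are hypothetical: there are no signatures, amounts are not checked, and an output is treated as unspendable data simply because its script starts with the OP_RETURN opcode (0x6a).

```python
from dataclasses import dataclass

OP_RETURN = 0x6A


@dataclass(frozen=True)
class TxOut:
    value: int
    script: bytes


@dataclass(frozen=True)
class Tx:
    txid: str
    inputs: tuple   # outpoints being spent: (txid, output index)
    outputs: tuple  # TxOut entries created by this transaction


def apply_tx(utxo: dict, tx: Tx) -> None:
    """Spend the inputs and add the spendable outputs to the UTXO set."""
    if len(set(tx.inputs)) != len(tx.inputs):
        raise ValueError("input listed twice")
    for outpoint in tx.inputs:
        if outpoint not in utxo:
            raise ValueError(f"missing or already spent: {outpoint}")
    for outpoint in tx.inputs:
        del utxo[outpoint]  # the coins are used up in local state
    for index, out in enumerate(tx.outputs):
        if out.script[:1] == bytes([OP_RETURN]):
            continue  # provably unspendable, never enters the UTXO set
        utxo[(tx.txid, index)] = out


utxo = {}
funding = Tx("a", (), (TxOut(50, b"\x51"),))
spend = Tx("b", (("a", 0),), (TxOut(49, b"\x51"), TxOut(0, bytes([OP_RETURN]) + b"data")))
apply_tx(utxo, funding)
apply_tx(utxo, spend)
print(sorted(utxo))  # only the spendable output of "b" remains
```

In this model the OP_RETURN payload is never needed to validate later spends, but computing the txid of "b" in a real system would still need every output, which is the intra-transaction pruning problem described above.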

An individual could easily delete an OP_RETURN output but with the current chain validation someone, somewhere must record it.  You can't validate a txn without all the outputs.  You can't validate a block without validating all the transactions.  You can't validate the blockchain without validating all the blocks.

Still I think the worry about OP_RETURN misses the point.  Bitcoin makes it pretty easy to encode arbitrary data in transactions without using OP_RETURN.  Even if OP_RETURN outputs were easily pruned to comply with local laws what about all the other transactions.  Right now you can use native multisig to encode up to 192 bytes per output at the cost of just one satoshi higher than the dust threshold.  Honestly I think new non-P2SH outputs should be made invalid at some point in the future not because of illegal activity but because if used for arbitrary data it bloats not just the blockchain but the far more critical UTXO set.
legendary
Activity: 4130
Merit: 1307
February 18, 2015, 01:04:38 PM
#7
...
...the risks of illegal content become more important....

The question is: illegal content WHERE?

e.g.  It is illegal to publish stolen national security information in the US (e.g. James Rosen as "abetting a leaker" or Snowden).  Is it illegal in the blockchain?

It is illegal to publish an image of Muhammad in some places, legal in others.  Is it illegal in the blockchain?

Marriages may be made as children and consummated at 9 or 10 in some places, but that is illegal other places.  Would that be considered child porn if images were published even if it is legal somewhere to do with your wife?

The real question is, who decides what is illegal and what is not?  Where is this illegality?  A town of 50 might prohibit something.  Or a town of 50,000, a city of 5 million? A country of 20 million?   What is the cutoff?  Who decides?

The blockchain is just data.  I could look at a block or sequence of blocks and publish an algorithm to decode those blocks into pretty much anything offensive to someone, somewhere, who has enough power in some jurisdiction to make it illegal.   E.g. take bytes 1 through 50000, and apply this code.  You could even write a routine that would say, "I want this as the result, given this input, create an overlay algorithm to create it".  It would not be difficult.
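One very simple form such an "overlay" routine could take is a byte-wise XOR: given any input bytes and any desired result of the same length or shorter, the overlay is just their XOR. This is an illustrative sketch with hypothetical function names and placeholder data, not a claim about how anyone has actually done it.

```python
def make_overlay(source: bytes, wanted: bytes) -> bytes:
    """Compute the overlay that turns the start of `source` into `wanted`."""
    if len(source) < len(wanted):
        raise ValueError("source is shorter than the wanted result")
    return bytes(s ^ w for s, w in zip(source, wanted))


def decode(source: bytes, overlay: bytes) -> bytes:
    """Apply a published overlay to the source bytes."""
    return bytes(s ^ o for s, o in zip(source, overlay))


# Placeholder standing in for "bytes 1 through 50000" of some block.
block_bytes = bytes(range(256)) * 4
wanted = b"any message at all"

overlay = make_overlay(block_bytes, wanted)
print(decode(block_bytes, overlay))
```

The overlay carries all of the information here; the block contributes nothing, which is why "this data decodes to X" says little about the data itself.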

Blacklisting is a bad idea to start with, but blacklisting to serve some unknown political masters from every potential jurisdiction in the world opens up a huge can of worms.  This would create some non-bitcoin alt-coin, as I can't imagine it would fly as "bitcoin".

:-)

legendary
Activity: 960
Merit: 1028
Spurn wild goose chases. Seek that which endures.
February 18, 2015, 10:51:13 AM
#6
Even if it were implemented, it would create a mess if some nodes pruned a transaction while others did not. Also, the inputs of fully pruned transactions could be double spent.
Nah, see, the whole point of OP_RETURN is that unless you're bootstrapping new clients you don't actually need to store it. If pruning is implemented, you're not pruning the transaction - you're pruning the output. The coins that were used as inputs are still used up in your local state.

Of course, that's easiest when you're taking the approach of using the blockchain for bootstrapping only, and just using the UTXO set for day to day transactions. If you want to store the blockchain too, you need some extra mechanism if you want to support redacting OP_RETURN data blocks. You could theoretically do it with ZKPs, though - basically attaching a proof that "for transaction X, I blotted out their data, but I totally know an input that still makes that transaction hash to its txid".

All that said, though, I feel like we should cross this bridge when we come to it. Sound technical solutions exist, at least once you decide that you want to delete a particular OP_RETURN output. But there's no point in implementing them if this is still just a theoretical attack.

The OP should consider that his computer contains a sufficient amount of '1's and '0's which could all be simply rearranged somewhat to form 1000s of child porn images - ON HIS COMPUTER.  But those '1's and '0's on his computer, just because they are presently out of order, nevertheless do constitute child porn.  He is clearly a pig.  That is right, your computer is loaded with child porn!!!  The cops are going to find you and arrest you for this and you will go to jail for life.
That's a little disingenuous. OP_RETURN data blocks are opaque binary data, and every image in a commonly used image format begins with a preamble that in practice makes the interpretation of the blob pretty unambiguous.
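The "preamble" point is easy to show: common image formats start with fixed signature bytes, so a blob can be classified without any agreed-upon decoding scheme. A small sketch covering three formats (the function name `sniff_image_format` is hypothetical):

```python
# Leading signature bytes of a few common image formats.
SIGNATURES = [
    ("png", (b"\x89PNG\r\n\x1a\n",)),
    ("jpeg", (b"\xff\xd8\xff",)),
    ("gif", (b"GIF87a", b"GIF89a")),
]


def sniff_image_format(blob: bytes):
    """Return the format name if `blob` starts with a known signature, else None."""
    for name, prefixes in SIGNATURES:
        if blob.startswith(prefixes):
            return name
    return None


print(sniff_image_format(b"\x89PNG\r\n\x1a\n" + b"rest of file"))
print(sniff_image_format(b"just some opaque bytes"))
```

A payload that has been XORed, encrypted or split across outputs would not match any of these, so this only identifies data stored in the clear.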
legendary
Activity: 1386
Merit: 1000
KawBet.com - Anonymous Bitcoin Casino & Sportsbook
February 17, 2015, 04:38:21 PM
#5
You can much more easily publish data via the blockchain w/o OP_RETURN, and furthermore, you can easily put that data in to the UTXO set which all nodes *must* have if they are to maintain consensus.

Mike Hearn suggested we adopt blacklists to solve this problem back when someone put the child porn sections of the hidden wiki into the UTXO set; no-one's come up with a better solution since. You can make publishing that data more expensive by a small linear factor - about 10x to 100x - but that's the best you can do.

The best solution to this problem is legal and political: the idea that you have to prevent every last trace of "illegal data" from getting into a public ledger is absurd.
Peter - your bandwidth is precious.  Please don't waste time responding to moronic posts.

The OP should consider that his computer contains a sufficient amount of '1's and '0's which could all be simply rearranged somewhat to form 1000s of child porn images - ON HIS COMPUTER.  But those '1's and '0's on his computer, just because they are presently out of order, nevertheless do constitute child porn.  He is clearly a pig.  That is right, your computer is loaded with child porn!!!  The cops are going to find you and arrest you for this and you will go to jail for life.
hero member
Activity: 546
Merit: 500
February 12, 2015, 03:14:08 PM
#4
You can much more easily publish data via the blockchain w/o OP_RETURN, and furthermore, you can easily put that data in to the UTXO set which all nodes *must* have if they are to maintain consensus.

Mike Hearn suggested we adopt blacklists to solve this problem back when someone put the child porn sections of the hidden wiki into the UTXO set; no-one's come up with a better solution since. You can make publishing that data more expensive by a small linear factor - about 10x to 100x - but that's the best you can do.

The best solution to this problem is legal and political: the idea that you have to prevent every last trace of "illegal data" from getting into a public ledger is absurd.

I absolutely agree that the problem is a legal/political one. Attempting to use technical solutions for legal/political problems is not always a recipe for success. 