Stupid question - Why don't we just compress the blocks?

sAt0sHiFanClub

hero member

Activity: 546

Merit: 500

Warning: Confirmed Gavinista

Quote from: AliceWonderMiscreations on January 16, 2016, 09:27:56 AM

And video compression is typically lossy. Lossy would never work for the blockchain.

Lossy would better describe MtGox and Cryptsy, amirite? Grin

shorena

copper member

Activity: 1498

Merit: 1528

No I dont escrow anymore.

Quote from: AlexGR on January 16, 2016, 10:35:08 AM

Quote from: shorena on January 16, 2016, 10:33:37 AM

Quote from: AlexGR on January 16, 2016, 10:20:14 AM

Quote from: shorena on January 16, 2016, 10:07:28 AM

You don't get it, do you? If it were possible, I could also store your entire being in a single bit. How can a single bit represent you? Are you 0 or 1?

It wouldn't work that way. Say you have compression savings of 1%. When you compress down to 100 bits you then go to 99 (-1%). Then what? You would need 98.01 bits storage, so there is your limit. 98.01 = 99 = you can't go further.

You failed to get the point, I will assume you are represented by a 0.

I thought you were trying to prove the impossibility of multiple-iteration compression by creating a single-bit paradox Tongue

The single bit is not the important part. The important part is that you can't represent 2ⁿ+1 different things with n bits. Well, you can, but you will have at least one collision and thus lose information. You can change the way you store the information however you like. All compression does is remove redundancy, and our usual encoding schemes are pretty redundant. E.g. English texts contain a lot of e's, so a good encoding would use a short code[1] for e and a longer one for a symbol that is used less often, like q. Encrypted data is different, because every symbol is equally likely[2], so there are no general shortcuts for it. All you will get is that you need more data to store your additional encoding information, be that in the data chunk itself or in a predistributed table.

[1] short and long in terms of number of bits used.
[2] If this were not a property of encrypted data, it would be vulnerable to frequency analysis.
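
For anyone who wants to see the redundancy point in numbers, here is a minimal sketch in Python using the standard zlib module. The repeated sentence and the os.urandom bytes are stand-ins chosen for illustration (the random bytes play the role of encrypted data), and packed_size is just a helper name made up for this example.

```python
import os
import zlib


def packed_size(data: bytes) -> int:
    """Return the zlib-compressed size after checking the round trip is lossless."""
    packed = zlib.compress(data, 9)
    assert zlib.decompress(packed) == data
    return len(packed)


# Redundant input: the same sentence repeated many times.
text = b"the quick brown fox jumps over the lazy dog " * 200
# Stand-in for encrypted data: uniformly random bytes of the same length.
noise = os.urandom(len(text))

print("text:  ", len(text), "->", packed_size(text))
print("random:", len(noise), "->", packed_size(noise))
```

On a typical run the repeated text shrinks to a small fraction of its size, while the random bytes come out slightly larger than they went in because of the container overhead.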

AlexGR

legendary

Activity: 1708

Merit: 1049

Quote from: shorena on January 16, 2016, 10:33:37 AM

Quote from: AlexGR on January 16, 2016, 10:20:14 AM

Quote from: shorena on January 16, 2016, 10:07:28 AM

You don't get it, do you? If it were possible, I could also store your entire being in a single bit. How can a single bit represent you? Are you 0 or 1?

It wouldn't work that way. Say you have compression savings of 1%. When you compress down to 100 bits you then go to 99 (-1%). Then what? You would need 98.01 bits storage, so there is your limit. 98.01 = 99 = you can't go further.

You failed to get the point, I will assume you are represented by a 0.

I thought you were trying to prove the impossibility of multiple-iteration compression by creating a single-bit paradox Tongue

shorena

copper member

Activity: 1498

Merit: 1528

No I dont escrow anymore.

Quote from: AlexGR on January 16, 2016, 10:20:14 AM

Quote from: shorena on January 16, 2016, 10:07:28 AM

You don't get it, do you? If it were possible, I could also store your entire being in a single bit. How can a single bit represent you? Are you 0 or 1?

It wouldn't work that way. Say you have compression savings of 1%. When you compress down to 100 bits you then go to 99 (-1%). Then what? You would need 98.01 bits storage, so there is your limit. 98.01 = 99 = you can't go further.

You failed to get the point, I will assume you are represented by a 0.

CIYAM

legendary

Activity: 1890

Merit: 1086

Ian Knowles - CIYAM Lead Developer

BTW - people were trying this "ultimate compression" scam back in the 1990s (on Usenet), so your idea is really not much newer than the Nigerian Prince one.

I suggest you try harder next time. Wink

AlexGR

legendary

Activity: 1708

Merit: 1049

Quote from: shorena on January 16, 2016, 10:07:28 AM

You don't get it, do you? If it were possible, I could also store your entire being in a single bit. How can a single bit represent you? Are you 0 or 1?

It wouldn't work that way. Say you have compression savings of 1%. When you compress down to 100 bits you then go to 99 (-1%). Then what? You would need 98.01 bits storage, so there is your limit. 98.01 = 99 = you can't go further.

CIYAM

legendary

Activity: 1890

Merit: 1086

Ian Knowles - CIYAM Lead Developer

Quote from: AlexGR on January 16, 2016, 10:15:21 AM

You don't need to be 100% static in your approach like I proposed upthread. You could alternate compression technique in each compression iteration.

You are spouting nonsense - why?

My guess is that you are trying to scam people because there is no logical reason to spout such nonsense otherwise.

I would warn others to take anything this forum member posts with a grain of salt.

AlexGR

legendary

Activity: 1708

Merit: 1049

Quote from: CIYAM on January 16, 2016, 10:08:12 AM

Quote from: AlexGR on January 16, 2016, 10:05:55 AM

Saying you'll get 99.9% sounds much different than saying I can get 0.1%, doesn't it? Yet if you can get 0.1% consistently, in every data set (obviously that includes compressed ones), then over several hundreds / thousand iterations you'll end up with a fraction of the original size.

You cannot get 0.1% consistently - if you are given a random set of values with no repeat then how on earth did you compress it by any percentage at all?

You are either deluded or wanting to scam but in either case your statements are not even close to rational.

You don't need to be 100% static in your approach like I proposed upthread. You could alternate compression technique in each compression iteration.

For example you could be using a classic compression in odd iterations and a pre-shared table in even iterations. This way, each step would be easier to compress as a pre-shared table would be like starting compression from scratch / the classic one would start with a fresh data set. Look, I'm not developing 99% compression algorithms but I'm fairly confident it can be done. And it will be done (if it hasn't been already).

CIYAM

legendary

Activity: 1890

Merit: 1086

Ian Knowles - CIYAM Lead Developer

Quote from: AlexGR on January 16, 2016, 10:05:55 AM

Saying you'll get 99.9% sounds much different than saying I can get 0.1%, doesn't it? Yet if you can get 0.1% consistently, in every data set (obviously that includes compressed ones), then over several hundreds / thousand iterations you'll end up with a fraction of the original size.

You cannot get 0.1% consistently - if you are given a random set of values with no repeated ones then how on earth did you compress it by any percentage at all?

You are either deluded or wanting to scam but in either case your statements are not even close to rational.

shorena

copper member

Activity: 1498

Merit: 1528

No I dont escrow anymore.

Quote from: CIYAM on January 16, 2016, 09:59:54 AM

Hmm... I think this silly topic has now started to go too far even for entertainment purposes.

Aren't you excited? 3D-bits could be a revolution. Endless compression, all the data on a single bit, imagine the things we could do.

Footnote: There might be some hidden information in this message.

It's sarcasm.

Quote from: AlexGR on January 16, 2016, 10:05:55 AM

Quote from: CIYAM on January 16, 2016, 09:52:29 AM

Quote from: AlexGR on January 16, 2016, 09:50:22 AM

If you ask me "is 99.9% compression feasible" in every data set, I have 100% confidence that it is. I just don't know the method.

Then unfortunately the only thing to say about that is that you probably shouldn't repeat that (and oops - I just quoted you so now you can't erase it - damn that stupid internet never forgets thing).

There is a very extensive list about things that we "shouldn't" have achieved, and experts were sure of it, yet we did. As I said, if you find something that can shave off even 1 or 0.1% per iteration, it's just a matter of how many times you fold the data from that point onward.

Saying you'll get 99.9% sounds much different than saying I can get 0.1%, doesn't it? Yet if you can get 0.1% consistently, in every data set (obviously that includes compressed ones), then over several hundreds / thousand iterations you'll end up with a fraction of the original size.

You don't get it, do you? If it were possible, I could also store your entire being in a single bit. How can a single bit represent you? Are you 0 or 1?

AlexGR

legendary

Activity: 1708

Merit: 1049

Quote from: CIYAM on January 16, 2016, 09:52:29 AM

Quote from: AlexGR on January 16, 2016, 09:50:22 AM

If you ask me "is 99.9% compression feasible" in every data set, I have 100% confidence that it is. I just don't know the method.

Then unfortunately the only thing to say about that is that you probably shouldn't repeat that (and oops - I just quoted you so now you can't erase it - damn that stupid internet never forgets thing).

There is a very extensive list about things that we "shouldn't" have achieved, and experts were sure of it, yet we did. As I said, if you find something that can shave off even 1 or 0.1% per iteration, it's just a matter of how many times you fold the data from that point onward.

Saying you'll get 99.9% sounds much different than saying I can get 0.1%, doesn't it? Yet if you can get 0.1% consistently, in every data set (obviously that includes compressed ones), then over several hundreds / thousand iterations you'll end up with a fraction of the original size.

CIYAM

legendary

Activity: 1890

Merit: 1086

Ian Knowles - CIYAM Lead Developer

Hmm... I think this silly topic has now started to go too far even for entertainment purposes.

AlexGR

legendary

Activity: 1708

Merit: 1049

Quote from: erre on January 16, 2016, 09:48:01 AM

I'm not very tech-savvy, but I think I can understand the problem: how can you make a summary of a book made up of random words? The problem is not easy, but I was thinking about this: what if we substitute the most frequently repeated series of numbers with a symbol? I.e. you could replace 00000 (if it happens many times in the block) with @, 037753198 (if it happens more than N times) with ₩, and so on...

Could it work?

It definitely would. However the symbol would end up being represented by binary data. So if you have, say, one byte of storage, that's 255 symbols and a no-symbol, which are then represented by 8 binary digits. (0 or 1 from 00000000 to 11111111)

If we had a hardware chip that had a table of, say, 100.000 "symbols" that didn't operate in binary, it would work. But it would have to "cooperate" with the rest of the system that would then need "translators".

I've read something similar here:

http://www.endlesscompression.com/

"In the Dutch book "De broncode"(*) Jan Sloot talks about an other way of thinking, something what work at hardware level, what's an other way of coding he also named it "seven" translated from Dutch it can also mean "to sieve" or "to filter". He didn't use zeros and one's any more because that was two dimensional he explains that there are three dimensions."

CIYAM

legendary

Activity: 1890

Merit: 1086

Ian Knowles - CIYAM Lead Developer

Quote from: AlexGR on January 16, 2016, 09:50:22 AM

If you ask me "is 99.9% compression feasible" in every data set, I have 100% confidence that it is. I just don't know the method.

Then unfortunately the only thing to say about that is that you probably shouldn't repeat that (and oops - I just quoted you so now you can't erase it - damn that stupid internet never forgets thing).

And while you're at it I guess you would also believe 100% in this: https://en.wikipedia.org/wiki/Russell's_teapot

shorena

copper member

Activity: 1498

Merit: 1528

No I dont escrow anymore.

Quote from: erre on January 16, 2016, 09:48:01 AM

I'm not very tech-savvy, but I think I can understand the problem: how can you make a summary of a book made up of random words? The problem is not easy, but I was thinking about this: what if we substitute the most frequently repeated series of numbers with a symbol? I.e. you could replace 00000 (if it happens many times in the block) with @, 037753198 (if it happens more than N times) with ₩, and so on...

Could it work?

No, because there is no "most repeating series" in random numbers. They are all equally likely. If you create a Huffman code[1] for random data, all code words have the same length, because every symbol has the same probability of appearing. Compression has fundamental limits if you do not want to lose data.

Think about it like this. You have 4 things you want to represent; what is the smallest amount of data you can use for that? The answer is 2 bits.

00 - thing #1
01 - thing #2
10 - thing #3
11 - thing #4

Now if someone comes along and claims they can compress 6 things into 2 bits, they either lose data (2/6) or are full of shit[2].

[1] https://en.wikipedia.org/wiki/Huffman_coding
[2] https://bitcointalksearch.org/topic/m.13573352
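
To make the Huffman remark concrete, here is a rough sketch in Python that builds a Huffman tree with the standard heapq module and reports only the code length per symbol. The function name huffman_lengths and the two sample strings are invented for this illustration; they are not taken from any Bitcoin code.

```python
import heapq
from collections import Counter


def huffman_lengths(data: str) -> dict:
    """Return {symbol: code length in bits} for a Huffman code built from data."""
    counts = Counter(data)
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): 1}
    # Heap entries: (weight, tie_breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


uniform = huffman_lengths("abcd" * 100)
skewed = huffman_lengths("a" * 70 + "b" * 15 + "c" * 10 + "d" * 5)
print(uniform)
print(skewed)
```

With four equally likely symbols every code word comes out 2 bits long, so nothing is gained over a plain fixed-width encoding. Only the skewed input gets a shorter code for its most common symbol, paid for with longer codes for the rare ones.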

fuathan

hero member

Activity: 1092

Merit: 520

Aleph.im

Quote from: SamusNi on January 16, 2016, 08:59:15 AM

Quote from: CIYAM on January 16, 2016, 08:49:58 AM

Quote from: SamusNi on January 16, 2016, 08:47:23 AM

instead of all of this discussions to increase the block size, why don't we just compress the blocks, leaving the size as it is?

Blocks consist of transactions that for the most part are effectively random numbers (such as hashes, public keys and signatures) so they simply won't compress much at all (as you can't in any sensibly usable way compress random information).

The efforts that are going on behind the scenes will make a much bigger difference than any tiny percent you could compress the content of a block.

Are you sure about that? Wouldn't something like gzip applied to the blocks reduce their size by like 99%?

Which efforts are going on behind the scenes exactly?

If you want to compress anything digital, you need a method to pack the bytes and then unpack them later with the same method (or formula).

There is no such method for the numbers that make up a block. If someone found one, they could easily produce fake bitcoins.
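
The gzip question quoted above is easy to try out. The following is only a sketch: it does not parse real blocks. It assumes a run of SHA-256 digests is a fair stand-in for the hashes, public keys and signatures that CIYAM says make up most of a block.

```python
import gzip
import hashlib

# Hypothetical stand-in for block contents: 1000 SHA-256 digests back to back.
blob = b"".join(hashlib.sha256(str(i).encode()).digest() for i in range(1000))

packed = gzip.compress(blob)
# Lossless round trip: decompressing must return the exact original bytes.
assert gzip.decompress(packed) == blob

print(len(blob), "->", len(packed))
```

On data like this gzip has nothing to work with, so the output ends up a little larger than the input once the gzip header and trailer are added. That is nowhere near a 99% reduction.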

AlexGR

legendary

Activity: 1708

Merit: 1049

Quote from: CIYAM on January 16, 2016, 09:38:06 AM

Quote from: AlexGR on January 16, 2016, 09:33:05 AM

This is not about video per se. It's about an algorithm that a neural network discovered, which could compress a lot of data with very high percentage ratio.

The fact that they publish nothing about this supposed algorithm suggests that it is in fact a hoax rather than some revolutionary new thing.

It's strange that people will just accept "we can't publish stuff because of X" when in fact they could publish the specific algorithm used (for the supposed video mentioned) without giving away how that algorithm was created (as supposedly the algorithm was simply one of an infinite number that this amazing AI could create).

Some ideas can be so radical that even hinting at the direction of the proposed solution could ignite "lamps" over other people's heads that would try to reproduce the solution.

If you ask me "is 99.9% compression feasible" in every data set, I have 100% confidence that it is. I just don't know the method.

Theoretically even if you find an algorithm that reduces size by even 1-2% in every possible data set, then you only have to fold the data multiple times and bring them to near zero over a large number of iterations. It would have a cpu tradeoff though.
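
The "fold the data multiple times" idea can be tested directly. Below is a small Python sketch that feeds zlib's output back into zlib for a few rounds and records the size after each one. The helper name fold, the sample text and the number of rounds are arbitrary choices for this illustration.

```python
import zlib


def fold(data: bytes, rounds: int) -> list:
    """Compress the output of the previous round again and record each size."""
    sizes = [len(data)]
    for _ in range(rounds):
        data = zlib.compress(data, 9)
        sizes.append(len(data))
    return sizes


text = b"the quick brown fox jumps over the lazy dog " * 200
sizes = fold(text, 6)
print(sizes)
```

What usually shows up is one large drop in the first round and then sizes that stall or creep upward, since the output of a good compressor has little redundancy left for the next round to remove.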

AliceWonderMiscreations

full member

Activity: 182

Merit: 107

A couple of guys I used to work with developed something they called the Internet Compression Algorithm.

It could compress the entire Internet into a single bit.

I bet for enough bitcoin they would share it... Grin

erre

legendary

Activity: 1680

Merit: 1205

I'm not very tech-savvy, but I think I can understand the problem: how can you make a summary of a book made up of random words? The problem is not easy, but I was thinking about this: what if we substitute the most frequently repeated series of numbers with a symbol? I.e. you could replace 00000 (if it happens many times in the block) with @, 037753198 (if it happens more than N times) with ₩, and so on...

Could it work?

fairglu

legendary

Activity: 1100

Merit: 1032

Quote from: AlexGR on January 16, 2016, 09:33:05 AM

Quote from: AliceWonderMiscreations on January 16, 2016, 09:27:56 AM

Quote from: AlexGR on January 16, 2016, 09:19:59 AM

Quote from: Lauda on January 16, 2016, 09:13:05 AM

This is a nice example of why people with an IT degree need to decide on the technicalities (not trying to be offensive). Compressing random data usually results in 0% saved space, or the compressed file actually ends up bigger than the original.

We just need a revolutionary new compression method. Something like that:

http://www.theserverside.com/feature/Has-a-New-York-startup-achieved-a-99-compression-rate

From presentations I've seen, this is not only for video. It was thought that it would be marketed best for video because video takes up most internet bandwidth nowadays.

Perhaps people with neural networks will start competing on finding increasingly more efficient compression, compared to what we have now.

Video has a lot of predictable redundant data. Random data is, well, random.

And video compression is typically lossy. Lossy would never work for the blockchain.

This is not about video per se. It's about an algorithm that a neural network discovered, which could compress a lot of data with very high percentage ratio. Video is just one of the deployment markets because it takes up >50% of internet bandwidth, so, naturally, they went after it. But, from what I saw in one of the presentations, it's more like data agnostic.

Plus, in that example, the original file is an MP4, which has already reduced redundant data (and has loss of quality).

MP4 still has a lot of redundant data; it only looks for local changes over a few frames.

That said, that article triggered a few of my "snake oil salesman" sensors.
