Topic: [PRE-ANN][ZEN][Pre-sale] Zennet: Decentralized Supercomputer - Official Thread - page 18. (Read 57089 times)

hero member
Activity: 897
Merit: 1000
http://idni.org
Quote
My last reply to this still stands, Gauss was "wrong" about our model since he wasn't considering intentionally introduced fault.  Gauss-Markov (which I love, being an ML guy) ends up being double wrong, because the Markov side of things assumes the errors are uncorrelated.  An attacker introducing error may certainly introduce correlated error, and may even have reason to explicitly do so!  Wink
man, stop mixing up miscalculation with procfs spoofing
sr. member
Activity: 434
Merit: 250
3. the system translates procfs to zencoins in terms of the benchmarks

number 3 may be tampered with, and here you have all the other means i mentioned.
the control on the price and logic is all the user's. the system just helps them translate procfs to zencoin.

Precisely the contradiction, restated yet again.  "Translate procfs to zencoin" is the exact same problem as "price procfs" so this statement is precisely restated as "The control on the price is the user's. The system just helps them by doing pricing."

We collectively decide what the coin is worth.  A user decides how many coins one "proc unit" is worth.  The system decides how many "proc units" one execution is worth.

The system has priced the resource, not the user.  The user priced the token in which that resource pricing is denominated, and the market priced the token in which that token is denominated, but the resource itself was valuated systematically by some code, not by the user or the broader market.

(Or do you only consider things priced if they are priced denominated in fiat dollars?   Huh)

yeah, i forgot to mention that Gauss also promised they're uncorrelated and with equal variances. so it's not marginal only. see gauss-markov theorem

My last reply to this still stands, Gauss was "wrong" about our model since he wasn't considering intentionally introduced fault.  Gauss-Markov (which I love, being an ML guy) ends up being double wrong, because the Markov side of things assumes the errors are uncorrelated.  An attacker introducing error may certainly introduce correlated error, and may even have reason to explicitly do so!  Wink

Quote
it is a well-guided mitigation, with parameters promised to be normally distributed while being uncorrelated.

I'm becoming really quite convinced that where you've "gone all wrong" is in this repeated assumption of semi-honest participation.

Much of what you're saying, but particularly this, simply doesn't hold in the explicit presence of an attacker.

Quote
2. many problems are very easy to verify, so the investment is far from doubling itself. examples of such problems are matrix inversion, eigenvalue problems, svd, all NP-Complete problems, numerically solving (mainly time-dependent) PDEs, root finding and many more.

Sure, but again I suspect that most useful jobs will not be "pure" and so will not fall into any of these.

Quote
3. also in real life, you don't have real promises, just probabilities

The one exception to this, of course, being formal proof.  We can actually offer real promises, and this is even central to the "seemingly magic" novelty of bitcoin and altcoins.  Bitcoin really does promise that no one successfully double spends, except with negligible probability, as long as hashing is not excessively centralized and the receiver waits an appropriate number of confirms.  (Both seemingly reasonable assumptions.)
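For concreteness, that promise can be quantified.  The sketch below reproduces the double-spend success probability calculation from section 11 of the Bitcoin whitepaper, in Python; q is the attacker's share of hashing power, z the number of confirmations, and the example numbers are purely illustrative.

Code:
import math

def attacker_success_probability(q, z):
    # Probability that an attacker with hash share q ever catches up
    # from z blocks behind (Bitcoin whitepaper, section 11).
    p = 1.0 - q
    lam = z * (q / p)
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam ** k / math.factorial(k)
        total -= poisson * (1 - (q / p) ** (z - k))
    return total

print(attacker_success_probability(0.10, 6))   # ~0.0002: negligible after 6 confirms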

Why you're so readily eschewing approaches that can offer real promises, and even go so far as to deny they exist "in real life" despite Bitcoin itself being a great counterexample, is confusing to me.

Quote
4. we both know that most of the users won't tamper with the code,

I assume most users will be rational and will do whatever maximizes their own profit.

I actually go a bit further to assume that users will actually behave irrationally (at their own expense) if necessary to maximize their own profit.  (There's some fun modal logic!)

(I further have always suspected this is the root cause behind the failure of many corporations.)

Quote
so the risk probability, which we lower even more, isn't so high in the first place. they have nothing to earn from miscomputation.

This carries a similar flavor to your HPC giants being motivated to turn in your bugs for your 1BTC bounty.

Miners have everything to earn from violating correctness.

If I can sell you on some resource contract, deliver some alternate and cheaper resource, and have you walk away none the wiser, I profit.

If I can sell you on some resource contract, convince you to make a deposit, and then abscond with almost no cost, I profit.

If I can sell you on some resource contract, actually perform the work and get paid, then turn around and sell your data and software to the highest bidder, I profit.

People *will* do these things at any given opportunity, and people will use any *other* vulnerability of the system to carry out these practices.  Even if you do everything possible to mitigate these concerns, it will all come down to people bulk mining identities to burn.

How exactly is this not just like CPUShare again, and why exactly shouldn't we expect it to fail for the exact same reasons, again?
hero member
Activity: 897
Merit: 1000
http://idni.org
Quote
Sure, my point is that the benchmarks are presented as "one size fits all" and they aren't.  It's ultimately the same problem as the benchmark having nothing to do with the job, in a way.  How are you going to maintain and distribute useful benchmarks for every new computing kit and model that comes out?  (At least that problem is not as bad as trying to maintain a benchmark for every job algorithm used.  There can only be finitely many technologies.  Wink)

it's not about "one size fits all". it's about describing how much has to be invested in a job. like saying "each task of mine is about as much work as 1000 FFTs of random numbers" or a linear combination of many such tasks.
again, such task decomposition is not mandatory; only the ongoing procfs msmts are.
another crucial point is that we don't have to have accurate benchmarks at all. just many of them, as different as possible.

Quote
Being continuously re-coded and added to in order to support change does not really fit my definition of flexible.  That is kind of like saying a brick is flexible because you can stack more bricks on top of it.

it is *able* to be customized and extended with code, hence flexible.
it has to be done only for totally new kinds of hardware.
hero member
Activity: 897
Merit: 1000
http://idni.org
Quote
Given that once one owns an identity one can get at least one issuance of some payment for no work, why would anyone bother running actual computations and building up a positive reputation instead of mining IDs to burn by not providing service?

The ID POW can only have any effect if the cost to generate an ID is significantly higher than the gain from burning it, the cost the publisher pays for utilizing the attacker's non-working worker.  In other words, it has to be enforced somewhere/somehow that the cost of an ID is more than the "first round" cost of a job.  I don't see how/where this can be enforced, but I'm not precluding it just yet.  If you would enforce this somehow, it would solve at least the "trust model in a vacuum" part of the problem by relating the value of the trust entity to the value of its resource.

very easy: each client picks any amount of ID POW to require from its parties.
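A minimal sketch of what "picking an amount of ID POW" could look like, assuming a hashcash-style scheme over the counterparty's identity string.  The actual Zennet ID POW format isn't specified here, so the identity and difficulty below are purely illustrative.

Code:
import hashlib
from itertools import count

def id_pow_ok(identity, nonce, difficulty_bits):
    # Client-side check: sha256(identity:nonce) must start with
    # difficulty_bits zero bits; each client picks its own threshold.
    digest = hashlib.sha256(f"{identity}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") >> (256 - difficulty_bits) == 0

def mint_id_pow(identity, difficulty_bits):
    # Counterparty side: grind nonces until the client's requirement is met.
    for nonce in count():
        if id_pow_ok(identity, nonce, difficulty_bits):
            return nonce

nonce = mint_id_pow("provider-pubkey-123", 16)   # hypothetical identity, modest difficulty
assert id_pow_ok("provider-pubkey-123", nonce, 16)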
hero member
Activity: 897
Merit: 1000
http://idni.org
I misquoted a comment, so I deleted it and am writing it again:

Quote
Quote
it is possible to take the algorithm which the publisher wants to run, and decompose its procfs measurements into a linear combination of canonical benchmarks.
Is it?  (In a way aside from my proposed functional extension mechanism, that also doesn't give the publisher some free ride?)
If so, this might be a useful approach.  If anything, this should be explored further and considered.

now if i manage to explain this to you, i think you'll understand much more of the broader picture.

that's trivial! let me convince you that's trivial:

i have a program which i want to decompose into a linear combination of benchmarks w.r.t. procfs measurements (abbrev. msmts).

i.e., i want to describe the vector x of procfs msmts of a given program as a linear combination of procfs msmts vectors of another n programs.

assume the dimension of the vectors is k.

all i need to do is have n >= k and have the matrix whose n rows are our vectors be of rank k.
i will get this almost surely if i just pick programs "different enough", and even if they're just "a bit different", i may increase n.


of course, only small tasks are considered.
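A minimal numpy sketch of that argument, with made-up measurement vectors: B holds one k-dimensional procfs vector per benchmark, the rank condition is checked explicitly, and the coefficients come out of an ordinary least-squares solve.

Code:
import numpy as np

k, n = 4, 6                        # k procfs dimensions, n >= k benchmark programs
rng = np.random.default_rng(0)

B = rng.random((n, k))             # row i = procfs measurements of benchmark i (made-up)
x = rng.random(k)                  # procfs measurements of the program to decompose

assert np.linalg.matrix_rank(B) == k          # the "different enough" condition

# coefficients c such that x ~= sum_i c[i] * B[i], found by least squares
c, *_ = np.linalg.lstsq(B.T, x, rcond=None)
print(c, np.linalg.norm(B.T @ c - x))         # residual ~0 once rank k is reached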
hero member
Activity: 897
Merit: 1000
http://idni.org
Is this like the Amazon service where you can rent processing power, but on the blockchain instead?

it is a free market alternative to AWS aimed at massive computations. it is also built on blockchain technology.
member
Activity: 64
Merit: 10
Is this like the Amazon service where you can rent processing power, but on the blockchain instead?
hero member
Activity: 897
Merit: 1000
http://idni.org
Quote
Again this comment doesn't match up.  The attacker can just as well lie every few seconds.
yeah but then he'll get blocked, and if ID POW is in use by the publisher, it'll be difficult for him to cheat again.

Quote
Sure, an attacker can only slope the vector normal, not outright violate the normality constraint.  This says nothing of the violation of the linear assumption, though?
an attacker can do *anything* to the measurements. but the mitigation makes the expectation of the risk negligible, at least in wide enough cases.
the linear estimator gives me normally distributed errors so i can easily identify outliers. this property exists thanks to the linear estimator, and does not depend on the UVs' linearity assumption (which, partially answering a question you raised, is more accurately a "low-enough-dimension" assumption). how low is low enough? the order of magnitude of the number of different benchmarks i can afford to run.
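A sketch of that outlier check on synthetic data: fit the linear estimator by least squares, look at the residuals, and flag anything a few standard deviations out.  The 3-sigma cutoff and all of the numbers are illustrative, not part of Zennet's spec.

Code:
import numpy as np

rng = np.random.default_rng(1)
n_workers, k = 50, 4

A = rng.random((n_workers, k))              # benchmark-derived features per worker (synthetic)
beta = np.array([2.0, 0.5, 1.0, 3.0])       # "true" pricing weights (synthetic)
reported = A @ beta + rng.normal(0, 0.05, n_workers)
reported[7] += 2.0                          # one worker spoofing its numbers upward

beta_hat, *_ = np.linalg.lstsq(A, reported, rcond=None)
residuals = reported - A @ beta_hat
outliers = np.flatnonzero(np.abs(residuals) > 3 * residuals.std())
print(outliers)                             # should flag worker 7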

Quote
Cool, so now you are starting to accept and admit that you actually do not offer a solution.  I consider this progress!  Now the challenge will just be in getting you to try for some solutions instead of just making a fancy new CPUShare. XD

and again: i'm not looking for the kind of solution you're looking for. i'm looking for a working solution which, from an economic (and performance!!) point of view, is preferable to existing alternatives. it doesn't necessarily have to be cryptographically proven work etc.
sr. member
Activity: 434
Merit: 250
Quote
Quote
true. procfs doesn't have this linearity. that's exactly what my algo comes to fix.
but *any* n-dim vector can be written as a linear combination of n linearly independent vectors.
Sure, but that doesn't magically make the measure linearly constrained.  You're still assuming that the measurements are actually done at all, which is a mistake.  We can't assume the semi-honest model for this step.  We have to assume the attacker just makes up whatever numbers he wants, here.

accumulated procfs readings are taken every few seconds, and the micropayment transactions are updated accordingly.
see my list in prev comments about miscalculations

Again this comment doesn't match up.  The attacker can just as well lie every few seconds.

Quote
Quote
Quote
those latter n vectors are the unknowns (their number might be different than n, yet supported by the algo).
note that I never measure the UVs, nor give any approximation of them. I only estimate their product with a vector.
Er, of course you are measuring the UVs.  The performance of {xk} constitutes the measurement of all UVs, together.  Just because you measure them through indirect observation doesn't mean you aren't measuring them.  In any case I'm not sure what this has to do with anything?  (Or were you just stating this in case I had missed the implication?)

note that measuring a vector vs measuring its inner product with another vector is estimating n numbers vs estimating 1 number, while the variance goes down by 1/n (since it's all normal: my estimator's errors are normal, uncorrelated, and with zero mean, *even if the system is nonlinear and not-normal*)

Sure, an attacker can only slope the vector normal, not outright violate the normality constraint.  This says nothing of the violation of the linear assumption, though?

Quote
as for mitigation, that's the very heart of zennet's solution. we come from a recognition that we can't promise 100% and we don't need to promise 100%, because all you have in real life is probabilities.

Cool, so now you are starting to accept and admit that you actually do not offer a solution.  I consider this progress!  Now the challenge will just be in getting you to try for some solutions instead of just making a fancy new CPUShare. XD
sr. member
Activity: 434
Merit: 250
before i answer in detail,
since you mentioned the coin,
let me just make sure that you noticed that the computational work has nothing to do with mining, coin generation, transaction approval etc., and it all happens only between two parties with no implications at all to any 3rd party, not to mention the whole network.

I fully understand this.  It is not relevant.
hero member
Activity: 897
Merit: 1000
http://idni.org
Quote
Quote
true. procfs doesn't have this linearity. that's exactly what my algo comes to fix.
but *any* n-dim vector can be written as a linear combination of n linearly independent vectors.
Sure, but that doesn't magically make the measure linearly constrained.  You're still assuming that the measurements are actually done at all, which is a mistake.  We can't assume the semi-honest model for this step.  We have to assume the attacker just makes up whatever numbers he wants, here.

accumulated procfs readings are taken every few seconds, and the micropayment transactions are updated accordingly.
see my list in prev comments about miscalculations
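Schematically, that read-and-pay cycle could look like the sketch below; the counter function only simulates monotonically accumulating procfs readings, and the per-unit prices are assumed to come out of the benchmark-based negotiation (all numbers illustrative).

Code:
import numpy as np

def simulate_procfs_counters(t):
    # Stand-in for reading the VM's accumulated procfs counters at time t
    # (e.g. CPU jiffies, bytes read, bytes written); they only grow.
    return np.array([100.0 * t, 4096.0 * t, 512.0 * t])

price_per_unit = np.array([1e-6, 1e-9, 1e-9])   # zencoin per counter unit (illustrative)

balance = 0.0
prev = simulate_procfs_counters(0)
for t in range(5, 30, 5):                       # one reading "every few seconds"
    cur = simulate_procfs_counters(t)
    balance += float((cur - prev) @ price_per_unit)
    prev = cur
    # here the micropayment transaction would be updated to the new balance
    print(t, round(balance, 6))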

Quote
Quote
those latter n vectors are the unknowns (their number might be different than n, yet supported by the algo).
note that I never measure the UVs, nor give any approximation of them. I only estimate their product with a vector.
Er, of course you are measuring the UVs.  The performance of {xk} constitutes the measurement of all UVs, together.  Just because you measure them through indirect observation doesn't mean you aren't measuring them.  In any case I'm not sure what this has to do with anything?  (Or were you just stating this in case I had missed the implication?)

note that measuring a vector vs measuring its inner product with another vector is estimating n numbers vs estimating 1 number, while the variance goes down by 1/n (since it's all normal: my estimator's errors are normal, uncorrelated, and with zero mean, *even if the system is nonlinear and not-normal*)

as for mitigation, that's the very heart of zennet's solution. we come from a recognition that we can't promise 100% and we don't need to promise 100%, because all you have in real life is probabilities.
hero member
Activity: 897
Merit: 1000
http://idni.org
before i answer in detail,
since you mentioned the coin,
let me just make sure that you noticed that the computational work has nothing to do with mining, coin generation, transaction approval etc., and it all happens only between two parties with no implications at all to any 3rd party, not to mention the whole network.
sr. member
Activity: 434
Merit: 250
got it. if you call it multi-objective optimization, you just didn't understand the math. sorry.

How is it not?  You have multiple candidates each with unique feature-sets and are trying to find a best fit subset of candidates along all feature dimensions.  Sounds like a textbook example to me!

Quote
go over "least squares" again.

...

Quote
It's not about the metric, it's about the information I have.
When I say procfs I mean any measurements that the OS gives you (we could get into the kernel level but that's not needed given my algo).

Of course it is about the metric.  This metric is the only thing we have, so far.  Wink

The most relevant piece of information available, the job algorithm itself, gets ignored.  As long as this remains the case there is no meaningful association between the benchmark and the application.

Quote
true. procfs doesn't have this linearity. that's exactly what my algo comes to fix.
but *any* n-dim vector can be written as a linear combination of n linearly independent vectors.

Sure, but that doesn't magically make the measure linearly constrained.  You're still assuming that the measurements are actually done at all, which is a mistake.  We can't assume the semi-honest model for this step.  We have to assume the attacker just makes up whatever numbers he wants, here.

Quote
those latter n vectors are the unknowns (their number might be different than n, yet supported by the algo).
note that I never measure the UVs, nor give any approximation of them. I only estimate their product with a vector.

Er, of course you are measuring the UVs.  The performance of {xk} constitutes the measurement of all UVs, together.  Just because you measure them through indirect observation doesn't mean you aren't measuring them.  In any case I'm not sure what this has to do with anything?  (Or were you just stating this in case I had missed the implication?)

Quote
sure, wonderful works out there. we have different goals, different assumptions, different results,

Your goals don't seem to align with market demands, your assumptions can't be safely assumed, and where those works get great results I'm skeptical of what your results really are, let alone their merit.

This is the core of the problem.  If your coin were based on these established works that meet lofty goals under sound assumptions and show practical results, I'd have no trouble seeing the value proposition.  Since your coin is based on "known to fail" goals (which we'll just generally call the CPUShare model, for simplicity) and makes what appear to be some really bad assumptions, I don't expect comparable results.  Why should I?

Quote
yet of course i can learn a lot from the cs literature, and i do learn some.
you claim my model is invalid but all you've shown till now is misunderstanding. once you get the picture you'll see the model is valid. i have a lot of patience and will explain more and more again and again.

Please do.

Quote
sure. see my previous comment about normal distribution and mitigating such cases.

Again, mitigation is not a solution.  You've claimed solutions, but still aren't presenting them.  (In this case I'm not even really sure that the assumption around the mitigation actually holds.  Gauss only assumes natural error, and not induced error.  Can we assume any particular distribution when an attacker can introduce skew at will?  Wink)

Quote
you're not even in the right direction. as above, of course procfs aren't linear. and of course they're still n-dim vectors no matter what.

And of course this doesn't magically repair our violated linearity constraint in any way, does it?  Maybe this is what I'm missing....?

Of course it seems that I will be somewhat "doomed" to be missing this, as filling that hole requires establishing a reduction from arbitrary rank N circuits to rank 1 circuits.  I'm pretty sure this is impossible, in general.

Quote
that follows from the definition of the UVs you're looking for.
even though I wrote a detailed doc, as with any good math, you'll still have to use some imagination.
just think of what you're really looking for, and see that procfs is projected linearly from them. even the most nonlinear function is linear in some higher dimensional space. and there is no reason to assume that the dim of the UVs is infinite... and we'd get along even if it were infinite.

Being careful not to tread too far into the metaphysical I'll approach this delicately.  You are closest to seeing the root of this particular problem with this statement, I feel.

If I say being linear in a higher dimension does not resolve the violation of the linearity constraint in the lower dimension, would you agree?  If so, you must in turn admit that the linearity constraint is still violated.

It is fine to "patch over the hole" in this way if we can assume correctness of the higher dimensional representation.... unfortunately the attacker gets to construct both, and therein lies the rub.  He gets to break the linearity, and hide from us that he has even done so.

(This is precisely the premise for the attack I mentioned based on the "main assumption.")

If attackers can violate the linearity constraint at all, even if they can make it look like they haven't... well, they get to violate the constraint. :-)  I say again, linearity can not be enforced, only assumed.  (Well, without some crypto primitive "magic" involved, anyway.)

Quote
why different circuit? the very same one.

No, the benchmarks are canonical and are not the circuit representative of my job's algorithm.

Quote
the provider tells the publisher what their pricing is, based on benchmarks. yes, it can be tampered with, but it can be recognized soon, the loss is negligible, reputation goes down (the publisher won't work with this address again, and with the ID POW they can block users who generate many new addresses), and there are all the other mechanisms mentioned in the docs.
in the same way, the publisher can tell how much they are willing to pay, in terms of these benchmarks. therefore both sides can negotiate over some known, objective variables.

Except they're subjective, not objective.  They're potentially even a total fantasy.  Worse yet, they have no real relation, in either case, to the transaction being negotiated.  

Quote
again, we're not measuring the UVs, but taking the best estimator of their inner product with another vector. this reduces the error (error in the final price calculation, not in the UVs, which are not interesting in themselves) by an order of magnitude.

Again this is only true assuming a semi-honest model, which we can't, for reasons already enumerated. (Are you understanding, yet, that the critical failure of the model is not in any one of these problems, but in the combination?)

Quote
Local history & ID POW as above

We're running in circles, now.  ID POW does nothing to filter spammers, only discourages them by effectively charging a small fee.  The postal service requires stamps and I still get plenty of junk mail.

Given that once one owns an identity one can get at least one issuance of some payment for no work, why would anyone bother running actual computations and building up a positive reputation instead of mining IDs to burn by not providing service?

The ID POW can only have any effect if the cost to generate an ID is significantly higher than the gain from burning it, the cost the publisher pays for utilizing the attacker's non-working worker.  In other words, it has to be enforced somewhere/somehow that the cost of an ID is more than the "first round" cost of a job.  I don't see how/where this can be enforced, but I'm not precluding it just yet.  If you would enforce this somehow, it would solve at least the "trust model in a vacuum" part of the problem by relating the value of the trust entity to the value of its resource.

Quote
it is possible to take the algorithm which the publisher wants to run, and decompose its procfs measurements into a linear combination of canonical benchmarks.

Is it?  (In a way aside from my proposed functional extension mechanism, that also doesn't give the publisher some free ride?)
If so, this might be a useful approach.  If anything, this should be explored further and considered.


Quote
that's exactly what i'm saying. we don't care about bursts on such massive distributed apps.

We do, because we can't price the distribution as a whole; we can only price the individual instances.  The concern is not that one node does 1000/sec and another node does 1/sec, it is that the 1000/sec node might suddenly start doing 1/sec - and might even do so through neither malicious intent nor any fault!

Quote
as above, that's an open problem, which i'm happy i don't need to solve in zennet. that's why zennet is not adequate for really-real-time apps.
but it'll fold your protein amazingly fast.

Or it'll tell you it is folding your protein, take your money, spit some garbage data at you (that probably even looks like a correct result, but is not) and then run off.

Or it'll tell you it can fold your protein really fast, take your money, actually fold it really slowly, and you'll have to walk away and start over instead of giving it more money to continue being slow.

We can't know which it will do until we hand it our money and find out!  No refunds.  Caveat emptor!

Quote
they are not called canonical from the historical point of view, but from the fact that all participants at a given time know which benchmark you're talking about.

Sure, my point is that the benchmarks are presented as "one size fits all" and they aren't.  It's ultimately the same problem as the benchmark having nothing to do with the job, in a way.  How are you going to maintain and distribute useful benchmarks for every new computing kit and model that comes out?  (At least that problem is not as bad as trying to maintain a benchmark for every job algorithm used.  There can only be finitely many technologies.  Wink)

Further, how are you going to benchmark some nonlinear, progressive circuit "at all?"  How can you benchmark something that only does unsupervised classification learning?  How can you meaningfully benchmark a system with "online LTO" where the very act of running the benchmark will result in the worker re-configuring into a different performance profile, partly or totally invalidating your benchmark results? (Likely further separating the bench-marked performance profile from that of the eventual job, too!)

Quote
it doesn't have to be phoronix (but it's a great system for creating new benchmarks). all the programs have to do is make the HW work hard. if you have new HW, we (or the community) will have to write a program that utilizes it in order to be able to trade its resources over zennet, and participants will have to adopt it as a canonical benchmark.

flexibility.

the last thing the system is, is "rigid".

Being continuously re-coded and added to in order to support change does not really fit my definition of flexible.  That is kind of like saying a brick is flexible because you can stack more bricks on top of it.
hero member
Activity: 897
Merit: 1000
http://idni.org
1. When you said that the free market idea contradicts the pricing idea, recall that it's all about Zencoins, whose value can fluctuate, in addition to the fact that the user sets the price for canonical benchmarks, and this (together with procfs measurements during the benchmark) deterministically sets the price.

This is another case where the reply doesn't seem to match up to the critical statement.  My problem was not with any association between the price of a resource and the price of the coin.  The contradiction is in that you say the users set all of the pricing, but also say that the algorithm handles pricing determination for the users!  Either the users control the pricing or the system controls the pricing, but you seem to somehow want it both ways.

let's put things in order:
1. the market decides how many $ 1 zencoin is worth
2. users set the price for benchmarks
3. the system translates procfs to zencoins in terms of the benchmarks
4. users may set not only a price but any arbitrary behavior, in the form of JS code, given Zennet's environment variables.

number 3 may be tampered with, and here you have all the other means i mentioned.
the control on the price and logic is all the user's. the system just helps them translate procfs to zencoin.
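To make number 3 concrete, here is one plausible reading of it in code, with made-up numbers: the user's prices from step 2 are per canonical benchmark, the measured procfs vector is decomposed over the benchmarks' procfs vectors, and the coefficients times the user's prices give the zencoin amount.

Code:
import numpy as np

# rows = procfs vectors of three canonical benchmarks (made-up numbers)
B = np.array([[1.0, 0.2, 0.0],
              [0.1, 1.5, 0.3],
              [0.0, 0.4, 2.0]])
user_price = np.array([0.05, 0.02, 0.08])   # zencoin per run of each benchmark (step 2)

x = np.array([2.0, 1.1, 0.7])               # measured procfs of the actual work

coeffs, *_ = np.linalg.lstsq(B.T, x, rcond=None)   # x as a combination of benchmark rows
print(float(coeffs @ user_price))                  # step 3: zencoins owed for this reading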


Quote
2. Since we're solving a least squares problem, Gauss promises us that the errors are normally distributed with zero mean. Hence, the publisher can easily (and of course automatically) identify outliers.

Sure, but it can't infer anything else as to why it is marginal.  I refer back to my notion of "why is there any reason to think a publisher will ever converge (or potentially be able to converge) on an optimal set of workers?"

yeah, i forgot to mention that Gauss also promised they're uncorrelated and with equal variances. so it's not marginal only. see gauss-markov theorem


Quote
3. More than the above local statistics, the publisher knows how to compare a provider to many others.

Again, I see this is a problem in and of itself, not a useful solution.  At best it is a misguided mitigation.

it is a well-guided mitigation, with parameters promised to be normally distributed while being uncorrelated.


Quote
The comparison is the number of finished small jobs w.r.t. paid coins.

It goes a little beyond this.  The jobs must not only be finished, but correct.

Quote
This can happen in seconds, and renting a PC for seconds should by no means cost more than a handful of satoshis.

The problem is it is not "a PC for seconds" in anything but toy cases.  If I want one node, I really need at least two.  If I want 100 nodes I really need at least 200.  If I want 10000 I really need at least 20000.  This quickly becomes costly.

Unlike the problem of weeding out sub-optimal (but correct) workers, I can't just "drop the under-performing half" either.  I need to run each task to completion on each replica in order to compare results.

Really, I should expect to need much more than just a doubling of effort, as well.  My ability to infer correctness of the result increases asymptotically. I can be twice as confident with four workers per instance, and twice as confident as that with eight.

I can never fully infer correctness, regardless of how much I spend.


right, but correctness is a whole different story from almost everything i talked about.
as for correctness:
1. as said, risk can be decreased exponentially with linearly growing investments
2. many problems are very easy to verify, so the investment is far from doubling itself. examples of such problems are matrix inversion, eigenvalue problems, svd, all NP-Complete problems, numerically solving (mainly time-dependent) PDEs, root finding and many more.
3. also in real life, you don't have real promises, just probabilities
4. we both know that most of the users won't tamper with the code, so the risk probability, which we lower even more, isn't so high in the first place. they have nothing to earn from miscomputation.
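To make point 2 above concrete, here are two spot-checks that are far cheaper than the computations they verify (random data, illustrative tolerances).

Code:
import numpy as np

rng = np.random.default_rng(2)
A = rng.random((500, 500))

# Matrix inversion: instead of recomputing the inverse, spot-check the
# claimed result with a random vector (O(n^2) vs the O(n^3) inversion).
A_inv = np.linalg.inv(A)                 # the result the provider claims
r = rng.random(500)
assert np.allclose(A @ (A_inv @ r), r, atol=1e-6)

# Eigenvalue problem: checking a claimed eigenpair is one matrix-vector product.
S = A + A.T                              # symmetric, so eigenpairs are real
w, V = np.linalg.eigh(S)
assert np.allclose(S @ V[:, 0], w[0] * V[:, 0], atol=1e-8)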
sr. member
Activity: 434
Merit: 250
1. When you said that the free market idea contradicts the pricing idea, recall that it's all about Zencoins, whose value can fluctuate, in addition to the fact that the user sets the price for canonical benchmarks, and this (together with procfs measurements during the benchmark) deterministically sets the price.

This is another case where the reply doesn't seem to match up to the critical statement.  My problem was not with any association between the price of a resource and the price of the coin.  The contradiction is in that you say the users set all of the pricing, but also say that the algorithm handles pricing determination for the users!  Either the users control the pricing or the system controls the pricing, but you seem to somehow want it both ways.

Quote
2. Since we're solving a least squares problem, Gauss promises us that the errors are normally distributed with zero mean. Hence, the publisher can easily (and of course automatically) identify outliers.

Sure, but it can't infer anything else as to why it is marginal.  I refer back to my notion of "why is there any reason to think a publisher will ever converge (or potentially be able to converge) on an optimal set of workers?"

Quote
3. More than the above local statistics, the publisher knows how to compare a provider to many others.

Again, I see this is a problem in and of itself, not a useful solution.  At best it is a misguided mitigation.

Quote
The comparison is the number of finished small jobs w.r.t. paid coins.

It goes a little beyond this.  The jobs must not only be finished, but correct.

Quote
This can happen in seconds, and renting a PC for seconds should by no means cost more than a handful of satoshis.

The problem is it is not "a PC for seconds" in anything but toy cases.  If I want one node, I really need at least two.  If I want 100 nodes I really need at least 200.  If I want 10000 I really need at least 20000.  This quickly becomes costly.

Unlike the problem of weeding out sub-optimal (but correct) workers, I can't just "drop the under-performing half" either.  I need to run each task to completion on each replica in order to compare results.

Really, I should expect to need much more than just a doubling of effort, as well.  My ability to infer correctness of the result increases asymptotically. I can be twice as confident with four workers per instance, and twice as confident as that with eight.

I can never fully infer correctness, regardless of how much I spend.
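For a rough sense of that trade-off: if each replica independently returns a wrong-but-matching result with probability q, then with r replicas that must all agree, the chance of accepting a wrong result is about q^r; the error probability falls exponentially while cost grows linearly, yet it never reaches zero, and the independence assumption is exactly what an attacker holding several identities gets to violate.  A back-of-the-envelope sketch with an assumed q:

Code:
# chance that r independently-cheating replicas all return the same wrong
# result, for an assumed per-replica cheat probability q (illustrative)
q = 0.05
for r in (1, 2, 4, 8):
    print(r, q ** r)   # 0.05, 0.0025, 6.25e-06, ~3.9e-11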
hero member
Activity: 897
Merit: 1000
http://idni.org
The solution is sufficient without POW, since the risk is low and can be controlled to be much lower. I agree that there are a lot of advantages to real computation POW, some with innovative mathematical or cryptographic ideas, some hardware implemented, but their cost in additional computation or in sending VM snapshots etc. is high.

?!

This used to be true, but isn't anymore.  Recent advances have brought the traditionally associated computational cost down significantly.

will be glad to discuss it with you and maybe implement it on zennet

Quote
Nevertheless, I don't rule out implementing such an algo one day. Moreover, nothing in Zennet's current design blocks such an implementation by the publisher, or adding it later in a backward-compatible way.

The problem is the lack of authentication, not over the jobs themselves, but over the benchmark.  Combine this with a lack of association between the benchmark metrics and the job.

I understand that authentication and/or privacy preserving mechanism can be layered on top of the job.  This would just further distance the job's performance profile from what the benchmark metrics would imply.

answered in the post that comes after this one
hero member
Activity: 897
Merit: 1000
http://idni.org
Tell me what you think of the pricing algo.

First, I love a good multi-objective optimization as much as the next guy, so I don't really have any problem with the overall goal in the abstract.  I do agree that this approach to pricing optimization will "fairly" price computational resources.

got it. if you call it multi-objective optimization, you just didn't understand the math. sorry.
go over "least squares" again.


My problem is that it prices entirely the wrong resources, in several ways.

Quote
"An intuitive approach is the measure accumulated procfs variables"
...
"Big troubles, but, procfs is all we have."

Why?  This is what has me most baffled.  Why is this your only metric considered?  You say yourself that this approach is intuitive, and I agree, but is it sound?  Why should one believe it to be?


It's not about the metric, it's about the information I have.
When I say procfs I mean any measurements that the OS gives you (we could get into the kernel level but that's not needed given my algo).


Quote
"Assume that there exists n such physical unknown principal uncorrelated variables"
...
"We now make the main assumption: we claim that the procfs variables are the result of a linear projection acting on the UVs."

Beautifully put.

One of my three hypothetical attacks on your hypothetical network is based directly on this (bad) assumption.

You seem to have a grasp on the notion that the overall problem is one of measurable linear increments, but fail to realize that "projected procfs" does not represent this linearity, and that linearity can only ever be assumed, not enforced.  Here we get to one of the technical details of the matter, your UVs can only be correctly measured (and as such your formula can only hold as rational) as a metric over a rank one construction!!!!!!!  In other words, unless your UV is derived starting from some "gate count" you can't safely assume that any potential attack is constrained, relatively speaking, to linearity.  Woops.


true. procfs doesn't have this linearity. that's exactly what my algo comes to fix.
but *any* n-dim vector can be written as a linear combination of n linearly independent vectors.
those latter n vectors are the unknowns (their number might be different than n, yet supported by the algo).
note that I never measure the UVs, nor give any approximation of them. I only estimate their product with a vector.


(For more detail on this you should read up on recent works following from Yao's garbled circuits, and the work of folks like Gentry et al, Huang, etc.  In particular, I suggest reading the work specifically being done on objective cost models (done the "right" way, heh, no offense) by Kerschbaum et al.  I also suggest spending some hands-on time with the JustGarble toolkit, or similar, and proving to yourself what sort of things do/don't violate linearity assumptions under different transforms.  (It probably isn't what you'd expect!  Bleeding edge work is showing that a garbled circuit can even be side-effecting and perform interactive I/O without necessarily violating linearity or increasing rank order of the netlist!  Fascinating stuff going on, there.))

sure, wonderful works out there. we have different goals, different assumptions, different results, yet of course i can learn a lot from the cs literature, and i do learn some.
you claim my model is invalid but all you've shown till now is misunderstanding. once you get the picture you'll see the model is valid. i have a lot of patience and will explain more and more again and again.


It is as if your attacker can arbitrarily change the cost of any given operation at will.  (Measurement over the whole netlist becomes meaningless/worthless.)  It is like this because it is precisely the case.

sure. see my previous comment about normal distribution and mitigating such cases.


Your model only holds if your linear projection is actually a linear projection.  Your measurement of procfs is not, so the rest does not follow.

If procfs offered some rank one representation of system state (like some SSA form of a closure over the last instruction executed) this would be a different story.  I'm not sure this would even be possible without some sort of magical self-introspecting CPU, and suspect that it would certainly preclude any microcode optimizations in such a CPU.

This should be "plain and simple" enough, but I'll go on.

you're not even in the right direction. as above, of course procfs aren't linear. and of course they're still n-dim vectors no matter what.


Quote
"The rationale is that we do seek accumulated"...

Love all of this.  Apply this over a properly represented process and you're almost on to something.  Authenticate the structure and it's gold!

that follows from the definition of the UVs you're looking for.
even though I wrote a detailed doc, as with any good math, you'll still have to use some imagination.
just think of what you're really looking for, and see that procfs is projected linearly from them. even the most nonlinear function is linear in some higher dimensional space. and there is no reason to assume that the dim of the UVs is infinite... and we'd get along even if it were infinite.
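That "linear in some higher dimensional space" remark is the usual feature-lifting trick.  A toy illustration, nothing Zennet-specific: y = 1 + 2x - 3x^2 is nonlinear in x but exactly linear in the lifted features (1, x, x^2).

Code:
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(-2, 2, 200)
y = 1.0 + 2.0 * x - 3.0 * x**2                      # nonlinear in x

Phi = np.column_stack([np.ones_like(x), x, x**2])   # lift to a 3-dim feature space
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(np.round(w, 6))                               # recovers [1, 2, -3]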


Quote
"Take N programs" ...

Here we see the rigidity that I alluded to, and our second big problem.  We are pricing the xn circuits in order to decide on a cost for an entirely different circuit.

why different circuit? the very same one. the provider tells the publisher what their pricing is, based on benchmarks. yes, it can be tampered with, but it can be recognized soon, the loss is negligible, reputation goes down (the publisher won't work with this address again, and with the ID POW they can block users who generate many new addresses), and there are all the other mechanisms mentioned in the docs.
in the same way, the publisher can tell how much they are willing to pay, in terms of these benchmarks. therefore both sides can negotiate over some known, objective variables.


In other words, what we're measuring our UVs over is not necessarily meaningfully representative (in any way) of what we are actually attempting to price.

again, we're not measuring the UVs, but taking the best estimator of their inner product with another vector. this reduces the error (error in the final price calculation, not in the UVs, which are not interesting in themselves) by an order of magnitude.


Of course the argument would be made that the only way we can price the algorithm we are "actually shopping for" with this sort of approach involves the worker running a part of it at no cost, leading to a situation where publishers could "leach" computation by only requesting quotes, which is obviously no good.

Local history & ID POW as above


This tells us that either our UVs have to be taken in situ over the algorithm being shopped, without running it (again "gate count" metrics) or that we need a mechanism to assert functional extensional projections from the algorithm.  What I mean by this alternative is that the system could require the publisher to offer up, in their RFQ, the algorithm they intend to run, the input they intend to run it on, an extraction of (bounded?) portions of the algorithm that they intend to use for price shopping, and a generator function for a second set of input.  Doing it this way would be some awesome work, but would also be a bit of the-moon-on-a-stick probably.

it is possible to take the algorithm which the publisher wants to run, and decompose its procfs measurements into a linear combination of canonical benchmarks. that's another formula but it's very similar to the main algo. but such measurements will have some distribution of values and not a fixed one per job (even something standard like 1000 FFTs takes a different time if you have more zeros in your data). that's one of the strengths of the system: it's fully customizable. you'll be able to insert arbitrary JS code into the client which contains your decisions regarding who to work with. of course, non-techs won't need to mess with this, but the system and the protocol are open for full customizability since there is no middleman between the two parties, while the product and the network will definitely develop with time.


Quote
"In addition, if Singular Value Decomposition is used"...

O HAI!  Now I has FOUR ideas!

If we consider that an attacker can "change gate cost at will" he can also "game" your decomposition, right?  Woops.

answered above


Quote
"Note that this pricing model is adequate for massively distributed applications only: say that a pending IO"...

As far as I can tell, paravirt boundary I/O cost and bursting semantics are not exercise-able as UVs.

that's exactly what i'm saying. we don't care about bursts on such massive distributed apps.


As such this basically enforces that regardless of how you measure the circuit, your associated cost function will be incorrect, pragmatically.  Even parallel, these still break the linearity over any one measure, and add skew to the net pricing.

This is actually a huuuuuuuuge open problem in the HPC community.  Most of your "big player"s would kill for a way to model cost with parameters over IO quantification and bursting semantics.  Such a model would be worth a fortune, since it is the only place where they get slippage in their profit margin.  (Academically, there's probably a lot of application for related open problems with fair schedulers, etc.)

Solve this and you've got more than gold, you've got the future of computing.  I don't expect you to be able to solve this.

as above, that's an open problem, which i'm happy i don't need to solve in zennet. that's why zennet is not adequate for really-real-time apps.
but it'll fold your protein amazingly fast.


Quote
"This root assumption is therefore vital for the acceptability of the solution."

Yup.  Bad assumption number two.  Effed on the way in and the way back out.  It's got fail coming out both ends, so to speak.   Grin

As I said, I don't expect a solution applied here, as I understand that it is an unavoidable assumption without constraining the IO and bursting semantics of the worker.

I think it would be best to just go ahead and constrain the IO and bursting semantics of the worker, myself.

as above


Quote
"Those xn are called, on the Zennet's framework, the Canonical Benchmarks."

The canonical benchmarks define the rigidity of the system.  How are you going to execute your phoronix suite on my recombinant FPGA chip?  How is your canonical benchmark going to tell you anything about the performance of my nascent memristor array?  (I'm not even kidding, here.)

How do those benchmarks apply when this classical hardware of today gets swept under by the new computing models of tomorrow?  (It is already happening!)  It seems like the canonical benchmarks will need to be a moving target, and thus not very canonical.  Wink

As I see it, any actual solution to these problems will begin with either an authenticated circuit construction, or some measure of cost of alpha/beta reductions and pi/sigma type construction.  In other words you're going to have to start from something well behaved, like Turing's machine or Church-Rosser and Barendregt's cube, not from something "wild" like procfs under paravirt.

they are not called canonical from the historical point of view, but from the fact that all participants at a given time know which benchmark you're talking about. it doesn't have to be phoronix (but it's a great system for creating new benchmarks). all the programs have to do is make the HW work hard. if you have new HW, we (or the community) will have to write a program that utilizes it in order to be able to trade its resources over zennet, and participants will have to adopt it as a canonical benchmark.

flexibility.

the last thing the system is, is "rigid".
sr. member
Activity: 434
Merit: 250
The solution is sufficient without POW, since the risk is low and can be controlled to be much lower. I agree that there are a lot of advantages to real computation POW, some with innovative mathematical or cryptographic ideas, some hardware implemented, but their cost in additional computation or in sending VM snapshots etc. is high.

?!

This used to be true, but isn't anymore.  Recent advances have brought the traditionally associated computational cost down significantly.

Quote
Nevertheless, I don't rule out implementing such an algo one day. Moreover, nothing in Zennet's current design blocks such an implementation by the publisher, or adding it later in a backward-compatible way.

The problem is the lack of authentication, not over the jobs themselves, but over the benchmark.  Combine this with a lack of association between the benchmark metrics and the job.

I understand that authentication and/or privacy preserving mechanism can be layered on top of the job.  This would just further distance the job's performance profile from what the benchmark metrics would imply.

Quote
Note that the RFC is significantly older than the "About Zennet" doc.

Both contain significant amounts of what appears to be superseded information.  A cleanup of these materials is badly needed.

Quote
I didn't understand (you obviously noted I'm not a native English speaker).

I mean that you summed up your bad assumptions so nicely that I pretty much didn't need to formulate the statements myself.

Your English is very good.

sr. member
Activity: 434
Merit: 250
Tell me what you think of the pricing algo.

First, I love a good multi-objective optimization as much as the next guy, so I don't really have any problem with the overall goal in the abstract.  I do agree that this approach to pricing optimization will "fairly" price computational resources.

My problem is that it prices entirely the wrong resources, in several ways.

Quote
"An intuitive approach is the measure accumulated procfs variables"
...
"Big troubles, but, procfs is all we have."

Why?  This is what has me most baffled.  Why is this your only metric considered?  You say yourself that this approach is intuitive, and I agree, but is it sound?  Why should one believe it to be?

Quote
"Assume that there exists n such physical unknown principal uncorrelated variables"
...
"We now make the main assumption: we claim that the procfs variables are the result of a linear projection acting on the UVs."

Beautifully put.

One of my three hypothetical attacks on your hypothetical network is based directly on this (bad) assumption.

You seem to have a grasp on the notion that the overall problem is one of measurable linear increments, but fail to realize that "projected procfs" does not represent this linearity, and that linearity can only ever be assumed, not enforced.  Here we get to one of the technical details of the matter, your UVs can only be correctly measured (and as such your formula can only hold as rational) as a metric over a rank one construction!!!!!!!  In other words, unless your UV is derived starting from some "gate count" you can't safely assume that any potential attack is constrained, relatively speaking, to linearity.  Woops.

(For more detail on this you should read up on recent works following from Yao's garbled circuits, and the work of folks like Gentry et al, Huang, etc.  In particular, I suggest reading the work specifically being done on objective cost models (done the "right" way, heh, no offense) by Kerschbaum et al.  I also suggest spending some hands-on time with the JustGarble toolkit, or similar, and proving to yourself what sort of things do/don't violate linearity assumptions under different transforms.  (It probably isn't what you'd expect!  Bleeding edge work is showing that a garbled circuit can even be side-effecting and perform interactive I/O without necessarily violating linearity or increasing rank order of the netlist!  Fascinating stuff going on, there.))

It is as if your attacker can arbitrarily change the cost of any given operation at will.  (Measurement over the whole netlist becomes meaningless/worthless.)  It is like this because it is precisely the case.

Your model only holds if your linear projection is actually a linear projection.  Your measurement of procfs is not, so the rest does not follow.

If procfs offered some rank one representation of system state (like some SSA form of a closure over the last instruction executed) this would be a different story.  I'm not sure this would even be possible without some sort of magical self-introspecting CPU, and suspect that it would certainly preclude any microcode optimizations in such a CPU.

This should be "plain and simple" enough, but I'll go on.


Quote
"The rationale is that we do seek accumulated"...

Love all of this.  Apply this over a properly represented process and you're almost on to something.  Authenticate the structure and it's gold!

Quote
"Take N programs" ...

Here we see the rigidity that I alluded to, and our second big problem.  We are pricing the xn circuits in order to decide on a cost for an entirely different circuit.

In other words, what we're measuring our UVs over is not necessarily meaningfully representative (in any way) of what we are actually attempting to price.

Of course the argument would be made that the only way we can price the algorithm we are "actually shopping for" with this sort of approach involves the worker running a part of it at no cost, leading to a situation where publishers could "leach" computation by only requesting quotes, which is obviously no good.

This tells us that either our UVs have to be taken in situ over the algorithm being shopped, without running it (again "gate count" metrics) or that we need a mechanism to assert functional extensional projections from the algorithm.  What I mean by this alternative is that the system could require the publisher to offer up, in their RFQ, the algorithm they intend to run, the input they intend to run it on, an extraction of (bounded?) portions of the algorithm that they intend to use for price shopping, and a generator function for a second set of input.  Doing it this way would be some awesome work, but would also be a bit of the-moon-on-a-stick probably.
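A sketch of what such an RFQ could carry, just to pin the idea down; the field names and types are hypothetical, not anything Zennet or its docs define.

Code:
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class RFQ:
    # Hypothetical request-for-quote along the lines described above.
    job_algorithm: bytes                   # the code the publisher intends to run
    job_input: bytes                       # the input it intends to run it on
    shopping_extracts: List[bytes]         # bounded portions used only for price shopping
    input_generator: Callable[[], bytes]   # generator for a second, comparable input set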

Quote
"In addition, if Singular Value Decomposition is used"...

O HAI!  Now I has FOUR ideas!

If we consider that an attacker can "change gate cost at will" he can also "game" your decomposition, right?  Woops.

Quote
"Note that this pricing model is adequate for massively distributed applications only: say that a pending IO"...

As far as I can tell, paravirt boundary I/O cost and bursting semantics are not exercise-able as UVs.

As such this basically enforces that regardless of how you measure the circuit, your associated cost function will be incorrect, pragmatically.  Even parallel, these still break the linearity over any one measure, and add skew to the net pricing.

This is actually a huuuuuuuuge open problem in the HPC community.  Most of your "big player"s would kill for a way to model cost with parameters over IO quantification and bursting semantics.  Such a model would be worth a fortune, since it is the only place where they get slippage in their profit margin.  (Academically, there's probably a lot of application for related open problems with fair schedulers, etc.)

Solve this and you've got more than gold, you've got the future of computing.  I don't expect you to be able to solve this.

Quote
"This root assumption is therefore vital for the acceptability of the solution."

Yup.  Bad assumption number two.  Effed on the way in and the way back out.  It's got fail coming out both ends, so to speak.   Grin

As I said, I don't expect a solution applied here, as I understand that it is an unavoidable assumption without constraining the IO and bursting semantics of the worker.

I think it would be best to just go ahead and constrain the IO and bursting semantics of the worker, myself.

Quote
"Those xn are called, on the Zennet's framework, the Canonical Benchmarks."

The canonical benchmarks define the rigidity of the system.  How are you going to execute your phoronix suite on my recombinant FPGA chip?  How is your canonical benchmark going to tell you anything about the performance of my nascent memristor array?  (I'm not even kidding, here.)

How do those benchmarks apply when this classical hardware of today gets swept under by the new computing models of tomorrow?  (It is already happening!)  It seems like the canonical benchmarks will need to be a moving target, and thus not very canonical.  Wink

As I see it, any actual solution to these problems will begin with either an authenticated circuit construction, or some measure of cost of alpha/beta reductions and pi/sigma type construction.  In other words you're going to have to start from something well behaved, like Turing's machine or Church-Rosser and Barendregt's cube, not from something "wild" like procfs under paravirt.

hero member
Activity: 897
Merit: 1000
http://idni.org
Few more points:

1. When you said that the free market idea contradicts the pricing idea, recall that it's all about Zencoins, whose value can fluctuate, in addition to the fact that the user sets the price for canonical benchmarks, and this (together with procfs measurements during the benchmark) deterministically sets the price.

2. Since we're solving a least squares problem, Gauss promises us that the errors are normally distributed with zero mean. Hence, the publisher can easily (and of course automatically) identify outliers.

3. More than the above local statistics, the publisher knows how to compare a provider to many others. The comparison is the number of finished small jobs w.r.t. paid coins. This can happen in seconds, and renting a PC for seconds should by no means cost more than a handful of satoshis.
