Topic: Opt-in telemetry (Read 268 times)

legendary
Activity: 3472
Merit: 1722
September 10, 2019, 06:05:08 PM
#10
It would be objectively useful to have real statistics on which devices sync nodes most efficiently. What is debatable is whether that is worth the privacy concerns of collecting metadata, even if it were opt-in and the metadata were anonymous.

You can always run your own experiments with different combinations of hardware components. If you're on a limited budget, you can still compare IBD (initial block download) times on different virtual machines, with different numbers of cores and amounts of memory assigned, and with different CPU clock speeds (either by under- or overclocking the CPU, or by limiting the CPU usage of the process).
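
If you go the manual route, one way to time a run is to read the timestamps off the node's own log afterwards. Below is a minimal sketch, assuming debug.log lines start with an ISO-8601 UTC timestamp followed by "UpdateTip: ... height=N"; that line shape is an assumption and may differ between versions. The names parse_tip and sync_summary are made up for this example. A node that was stopped and restarted mid-sync will have the downtime counted too.

```python
import re
from datetime import datetime, timezone

# Assumed line shape (may vary between versions):
# 2019-09-10T18:05:08Z UpdateTip: new best=<hash> height=594321 ...
TIP_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.\d+)?Z UpdateTip: .*?\bheight=(\d+)"
)


def parse_tip(line):
    """Return (timestamp, height) for an UpdateTip line, else None."""
    m = TIP_RE.match(line)
    if m is None:
        return None
    ts = datetime.strptime(m.group(1), "%Y-%m-%dT%H:%M:%S")
    return ts.replace(tzinfo=timezone.utc), int(m.group(2))


def sync_summary(lines):
    """Wall-clock time and blocks connected between first and last tip update."""
    tips = [t for t in map(parse_tip, lines) if t is not None]
    if len(tips) < 2:
        return None  # not enough data to measure anything
    (start, first_height), (end, last_height) = tips[0], tips[-1]
    seconds = (end - start).total_seconds()
    blocks = last_height - first_height
    rate = blocks / seconds if seconds > 0 else None
    return {"seconds": seconds, "blocks": blocks, "blocks_per_second": rate}
```

Feed it the lines of one log per VM configuration and compare the results by hand.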
legendary
Activity: 1610
Merit: 1183
September 10, 2019, 05:04:18 PM
#9
What would be some good ideas to develop a database and rank how much time is needed to sync given a certain setup? Someone needs to do this.
Nobody literally needs to do that. What nonsense are you blabbing about?

It would be objectively useful to have real statistics on which devices sync nodes most efficiently. What is debatable is whether that is worth the privacy concerns of collecting metadata, even if it were opt-in and the metadata were anonymous.

It is definitely a bad idea to embed this into Bitcoin Core, but it could work as separate software. However, nobody would run that software, so it's kind of pointless. The only way is to manually benchmark different setups and take notes. If someone wanted to know the fastest setup available for syncing a node, that data would be useful to them.

This is what I thought. What would be some good ideas to develop a database and rank how much time is needed to sync given a certain setup? Someone needs to do this. Unless it's done manually and reported case by case, I can't think of any accurate way to gather data without some sort of telemetry involved.

You could do this, but you can't ask for it to be built into the Bitcoin Core client itself. You have to build it on top of it as a separate thing that people could run voluntarily and then, if they wished, report their results to a centralized database. Start by writing the open source benchmark that measures the things you want listed in that database, add things like benchmarks of different versions of Core or of different full node implementations, create the database (website?), then release it to the public.
something like what these sites do: https://hdd.userbenchmark.com/

I had something like this in the works but realized it would be useless: the tool would need to interact with Bitcoin Core in some capacity to pull data from it, and people would have to download it separately. Most wouldn't bother, compared with clicking an option within the same software. There's no point; it will have to be an old-fashioned manual testing ranking.
I wanted to develop something like https://gpu.userbenchmark.com/ but for nodes.

Never underestimate the privacy risks of metadata and anonymized data. With enough related data to match against the metadata or anonymized data, de-anonymization is fully possible. Examples:
Researchers reverse Netflix anonymization
Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization

There's a reason metadata remover tools exist.

Well, you are comparing closed source telemetry to open source, where it's clear what is being collected and how. I still understand the concerns raised.
legendary
Activity: 3472
Merit: 10611
September 10, 2019, 12:05:41 AM
#8
This is what I thought. What would be some good ideas to develop a database and rank how much time is needed to sync given a certain setup? Someone needs to do this. Unless it's done manually and reported case by case, I can't think of any accurate way to gather data without some sort of telemetry involved.

You could do this, but you can't ask for it to be built into the Bitcoin Core client itself. You have to build it on top of it as a separate thing that people could run voluntarily and then, if they wished, report their results to a centralized database. Start by writing the open source benchmark that measures the things you want listed in that database, add things like benchmarks of different versions of Core or of different full node implementations, create the database (website?), then release it to the public.
something like what these sites do: https://hdd.userbenchmark.com/
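
To make the "separate thing people run voluntarily" a bit more concrete, here is a rough sketch of what a single report could look like. Everything in it is hypothetical: the SyncReport fields, the allowed storage labels and the to_json helper are just one guess at what such a database might want. It deliberately carries only coarse hardware facts and no hostname, IP address or timestamp, and it only builds the JSON text so the user can read it before deciding whether to submit it anywhere.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical set of storage labels; a real tool would pick its own.
ALLOWED_STORAGE = {"hdd", "ssd", "nvme"}


@dataclass(frozen=True)
class SyncReport:
    core_version: str   # e.g. "0.18.1", typed in by the user
    cpu_cores: int
    ram_gib: int
    storage: str        # one of ALLOWED_STORAGE
    dbcache_mib: int    # value the node was started with
    sync_hours: float   # measured by the user


def to_json(report):
    """Serialize a report to reviewable JSON; nothing is sent anywhere."""
    if report.storage not in ALLOWED_STORAGE:
        raise ValueError(f"unknown storage type: {report.storage!r}")
    payload = asdict(report)
    # Round the timing so the report is harder to match to one exact run.
    payload["sync_hours"] = round(report.sync_hours, 1)
    return json.dumps(payload, sort_keys=True)
```

Keeping the field list short and fixed is a design choice: the fewer free-form fields there are, the less metadata there is to correlate later.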
legendary
Activity: 3472
Merit: 1722
September 09, 2019, 11:24:37 PM
#7
Not a chance. You might have more luck with altcoins: contact their dev teams. I'm sure if you pay them enough, something could be arranged.
legendary
Activity: 2674
Merit: 2965
Terminated.
September 09, 2019, 10:21:54 AM
#6
What would be some good ideas to develop a database and rank how much time is needed to sync given a certain setup? Someone needs to do this.
Nobody literally needs to do that. What nonsense are you blabbing about?
legendary
Activity: 3122
Merit: 2178
September 09, 2019, 08:24:39 AM
#5
Given the nature of Bitcoin, you want things to stay anonymous, so this information must be guaranteed not to contain the wrong metadata.

If we've learned one thing from the corporate and governmental data collection binge of the last few years, then it is that it doesn't matter whether it's "only" metadata. When someone wants to stay anonymous, any amount of metadata is a liability.


You could raise the same point for bitcoin.org maintenance.

People do raise the same point for bitcoin.org ;)

Let's not introduce a new problem just because a similar one already exists.
legendary
Activity: 1610
Merit: 1183
September 09, 2019, 07:30:42 AM
#4
This will never happen. Any sort of telemetry will get shot down immediately and never be merged. Pretty much every contributor will Concept NACK it. Such a PR probably wouldn't even last an hour before it was closed.



This is what I thought. What would be some good ideas to develop a database and rank how much time is needed to sync given a certain setup? Someone needs to do this. Unless it's done manually and reported case by case, I can't think of any accurate way to gather data without some sort of telemetry involved.

Besides the fact that there are privacy concerns, there is also the issue of who is going to host the database that receives the data? What happens if that person stops working on Core (or dies)? How will those servers be paid for?


You could raise the same point for bitcoin.org maintenance.
legendary
Activity: 2674
Merit: 2965
Terminated.
September 09, 2019, 02:01:57 AM
#3
No, just no. Please close down your thread.
staff
Activity: 3458
Merit: 6793
Just writing some code
September 08, 2019, 08:03:18 PM
#2
This will never happen. Any sort of telemetry will get shot down immediately and never be merged. Pretty much every contributor will Concept NACK it. Such a PR probably wouldn't even last an hour before it was closed.

Besides the fact that there are privacy concerns, there is also the issue of who is going to host the database that receives the data? What happens if that person stops working on Core (or dies)? How will those servers be paid for?
legendary
Activity: 1610
Merit: 1183
September 08, 2019, 07:56:26 PM
#1
I'm developing a database that will hold relevant info on the time spent syncing the blockchain from scratch, given variables of the device used and the speed of the internet connection.

Since there is a lot of variability due to which nodes you are connected to at a given time and to differences between blocks (early blocks go through faster than crowded ones, especially in times of spam), I was wondering if Bitcoin Core could have a built-in telemetry option, optional and disabled by default of course, to send all those useful variables so I can build said database. Right now I'm just using standard deviations and rough estimates; with a proper data collector that grows over time I could put the data to great use.
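
For reference, the kind of rough estimate I mean looks something like the sketch below. It is only an illustration: the rank_setups name and the (setup label, hours) input shape are assumptions made up for this example, not an existing tool. It groups reported sync times by setup, computes the mean and the sample standard deviation for each, and sorts by mean.

```python
from collections import defaultdict
from statistics import mean, stdev


def rank_setups(reports):
    """Rank setups by mean sync time.

    reports: iterable of (setup_label, sync_hours) pairs.
    Returns a list of (setup_label, mean_hours, stdev_hours, sample_count),
    fastest first. stdev_hours is None when a setup has a single sample.
    """
    by_setup = defaultdict(list)
    for setup, hours in reports:
        by_setup[setup].append(hours)

    rows = []
    for setup, hours in by_setup.items():
        # Sample standard deviation needs at least two data points.
        spread = stdev(hours) if len(hours) > 1 else None
        rows.append((setup, mean(hours), spread, len(hours)))
    return sorted(rows, key=lambda row: row[1])
```

With only a handful of samples per setup the spread is large, which is why a growing data set would help.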

Given the nature of Bitcoin, you want things to stay anonymous, so this information must be guaranteed not to contain the wrong metadata. My question is simple: is this possible within this context, or will the whole concept never get ACKed? I understand why it wouldn't, but I also see the upside of having real data on the most effective setup for running a node.

PS: Please move thread to software dev subforum.