Topic: Decentralised Web Data Mining. Thoughts? (Read 673 times)

newbie
Activity: 4
Merit: 0
August 04, 2017, 06:13:51 PM
#5
Hi Patatas, I am one of the Daratus team members.

Happy to hear your questions and to discuss them while we validate the idea.

Quote
Traffic and processing power? Why? I could just write a very efficient Python bot to do the same. I didn't understand mining web data; can't Hadoop do the same with better precision?

Let's say that web data mining is data extraction from a data source, that is, any accessible website or web service. Daratus either aggregates the data before sending it to the user or sends it on directly in raw form. That is the simplest scenario, and it could be implemented over a one-to-one server connection. But if many clients demand data from many websites, it is impossible to serve them from one or a few hosts, especially if the data is time-sensitive. Some data also depends on geographical location or is accessible only from specific places. The need for computing power will be driven by market demand, and at the moment it is hard to say whether we will need 10k or 500k PCs.
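To make the "many clients, many websites, specific locations" point a bit more concrete, here is a minimal sketch of how extraction jobs might be spread over miner nodes. It is only an illustration under assumptions: the function name `assign_jobs`, the `(url, region)` job format and the round-robin policy are hypothetical and are not taken from the Daratus design.

```python
from collections import defaultdict
from itertools import cycle


def assign_jobs(jobs, nodes):
    """Spread extraction jobs over nodes, respecting a region constraint.

    jobs:  list of (url, region) tuples; region None means "any node will do".
    nodes: list of (node_id, region) tuples.
    Returns a dict mapping node_id -> list of urls assigned to that node.
    (Hypothetical helper, for illustration only.)
    """
    if not nodes:
        raise ValueError("no nodes available")

    # Group node ids by the region they can fetch from.
    by_region = defaultdict(list)
    for node_id, region in nodes:
        by_region[region].append(node_id)

    # One round-robin iterator per region, plus one over all nodes.
    pickers = {region: cycle(ids) for region, ids in by_region.items()}
    any_picker = cycle([node_id for node_id, _ in nodes])

    plan = defaultdict(list)
    for url, region in jobs:
        if region is None:
            node_id = next(any_picker)
        elif region in pickers:
            node_id = next(pickers[region])
        else:
            raise ValueError(f"no node available in region {region!r}")
        plan[node_id].append(url)
    return dict(plan)


nodes = [("a", "DE"), ("b", "DE"), ("c", "US")]
jobs = [("u1", "DE"), ("u2", "DE"), ("u3", "DE"), ("u4", "US"), ("u5", None)]
plan = assign_jobs(jobs, nodes)
```

With a single host every job would land on the same machine and the same IP address; with many nodes the load is shared and location-bound pages can be fetched from a node in the right place.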

There are also more sophisticated scenarios with a huge demand for computing power. Let's say we need to monitor the reputation of some company X. In this case we need not only data extraction but also evaluation, which is very computationally expensive. Picture recognition is another area where power is needed. We have plenty of complicated scenarios that need massive amounts of computing power, but those are still some way off. We don't want to take on too big a scope, so we are starting from something quite complex but clear, well defined and within our capabilities to deliver. Eventually Web-Data-Mining might become Big-Data-Mining and later even AI-Cloud. As for AI, the network could be capable of running AI models quite effectively; the creation of AI models has not been evaluated yet.

Quote
I still don't understand why you need so much computing power. Would the data be text, or a collection of files, images, videos, etc.? If you intend to share data storage on miners' computers, it makes sense.

Network nodes (data miners) will run upgradable software, which will let us enable new features as the network evolves. At the moment, storage sharing as a service is not included in the current scope.

Quote
Interesting. I'd follow your project on GitHub or something.

The project sources will eventually appear on GitHub. We plan to launch the Daratus MVP network as soon as it is available.
You are always welcome to join our Slack: http://daratus.com/slack/


legendary
Activity: 1750
Merit: 1115
August 04, 2017, 02:46:25 PM
#4
Quote
In short, it is a peer-to-peer network of miners who share their traffic and processing power to mine web data.

Traffic and processing power? Why? I could just write a very efficient Python bot to do the same. I didn't understand mining web data; can't Hadoop do the same with better precision?

Quote
Web data mining enables businesses to automatically and rapidly extract large-scale data from any accessible website and save it as structured data. We are creating an open-source decentralised data mining service with nearly unlimited resources: nearly unlimited computing power and network access points.

I still don't understand why you need so much computing power. Would the data be text, or a collection of files, images, videos, etc.? If you intend to share data storage on miners' computers, it makes sense.

Quote
The project is at a very early stage and is run by 4 enthusiasts (2 of them hold PhDs in computer science). The Daratus network launch will rely strongly on the community, so your thoughts and suggestions would be highly appreciated.

Interesting. I'd follow your project on GitHub or something.
newbie
Activity: 4
Merit: 0
August 04, 2017, 12:09:24 PM
#3
Quote
Nice project. I know the type of people who will be very happy to use such a project and its advantages. For some companies it could even be a much-needed resource. A lot of companies rely on many different kinds of data to make their decisions, and I know that it currently costs a lot of money and resources to extract data at an industry level.

Many thanks for your thoughts. You mentioned resources twice in your comment. We see guaranteeing resources as the biggest challenge. Do you see any other challenges in implementing the project?
copper member
Activity: 2940
Merit: 4101
August 04, 2017, 10:56:04 AM
#2
Nice project. I know the type of people who will be very happy to use such a project and its advantages. For some companies it could even be a much-needed resource. A lot of companies rely on many different kinds of data to make their decisions, and I know that it currently costs a lot of money and resources to extract data at an industry level.
newbie
Activity: 4
Merit: 0
August 04, 2017, 07:55:16 AM
#1
Hey community,

We are a team of 4 people working in the web extraction and data analysis businesses. Our experience led us to a project called Daratus: Decentralised Web Data Mining.

In short, it is a peer-to-peer network of miners who share their traffic and processing power to mine web data. Web data mining enables businesses to automatically and rapidly extract large-scale data from any accessible website and save it as structured data. We are creating an open-source decentralised data mining service with nearly unlimited resources: nearly unlimited computing power and network access points.
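For readers wondering what "extract data and save it as structured data" means in practice, here is a minimal sketch using only the Python standard library. It assumes the page has already been downloaded, and the names `LinkExtractor` and `extract_links` are hypothetical, chosen for this illustration; it is not the Daratus extraction code.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect every <a href=...> on a page as a {"url", "title"} record."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._href = None   # href of the link currently being read, if any
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.records.append({"url": self._href, "title": title})
            self._href = None


def extract_links(html):
    """Turn raw HTML into a list of structured link records."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.records


page = '<ul><li><a href="/p/1">First item</a></li><li><a href="/p/2"> Second </a></li></ul>'
records = extract_links(page)
```

Each miner node would run something along these lines on the pages assigned to it and return the records, rather than the raw HTML, to the network.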

The project is at a very early stage and is run by 4 enthusiasts (2 of them hold PhDs in computer science). The Daratus network launch will rely strongly on the community, so your thoughts and suggestions would be highly appreciated.

Feel free to ask any questions.

Medium article: https://medium.com/daratus/decentralised-web-data-mining-are-you-in-263a855e6dcf
Slack: http://daratus.com/slack/
Website: http://daratus.com/
