We are happy to hear your questions and to discuss them while we validate the idea.
Let's say that web data mining is data extraction from a data source, meaning any accessible website or web service. The data is either aggregated on Daratus before it is sent to the user, or sent directly in raw form. This is the simplest scenario, and it could be implemented over a one-to-one server connection. But if many clients demand data from many websites, it is impossible to serve them from one or a few hosts, especially if the data is time-sensitive. Some data is also sensitive to geographical location or accessible only from specific places. The need for computing power will be driven by market demand, and at the moment it is hard to say whether we will need 10k or 500k PCs.
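To make the many-clients, many-sites case a bit more concrete, here is a minimal sketch of how extraction tasks might be spread over miner nodes while respecting a geographical constraint. It is only an illustration under our own assumptions, not Daratus code: the names `Node`, `Task`, `assign_tasks` and the region labels are hypothetical, and simple round-robin stands in for whatever scheduling the real network ends up using.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import cycle
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Node:
    """A hypothetical data-miner node and the region it runs in."""
    node_id: str
    region: str


@dataclass(frozen=True)
class Task:
    """A hypothetical extraction task; required_region is None if any node will do."""
    url: str
    required_region: Optional[str] = None


def assign_tasks(tasks: List[Task], nodes: List[Node]) -> Dict[str, List[str]]:
    """Spread tasks over nodes round-robin, honoring region constraints."""
    if not nodes:
        raise ValueError("at least one node is required")

    # Group nodes by region so region-bound tasks only go to matching nodes.
    by_region: Dict[str, List[Node]] = defaultdict(list)
    for node in nodes:
        by_region[node.region].append(node)

    any_node = cycle(nodes)
    regional = {region: cycle(members) for region, members in by_region.items()}

    assignment: Dict[str, List[str]] = defaultdict(list)
    for task in tasks:
        if task.required_region is None:
            node = next(any_node)
        elif task.required_region in regional:
            node = next(regional[task.required_region])
        else:
            raise LookupError(f"no node available in region {task.required_region!r}")
        assignment[node.node_id].append(task.url)
    return dict(assignment)


if __name__ == "__main__":
    nodes = [Node("n1", "EU"), Node("n2", "US"), Node("n3", "EU")]
    tasks = [
        Task("https://example.com/a"),
        Task("https://example.com/b", required_region="US"),
        Task("https://example.com/c", required_region="EU"),
        Task("https://example.com/d"),
    ]
    for node_id, urls in assign_tasks(tasks, nodes).items():
        print(node_id, urls)
```

With one host, every task lands in the same queue; with many nodes the queues stay short, and tasks that must originate from a specific place can be routed only to nodes located there.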
There are also more sophisticated scenarios where there will be a huge demand for computing power. Let's say we need to monitor the reputation of some company X; in this case we need not only data extraction but also evaluation, which is very computationally expensive. Picture recognition is another area where power is needed. We have plenty of complicated scenarios that need massive amounts of computing power, but those are still a little way off. We don't want to take on the risk of too big a scope, so we are starting with something quite complex but clear, well defined, and within our capabilities to deliver. Eventually Web-Data-Mining might become Big-Data-Mining and later even AI-Cloud. As for AI, the network could be capable of running AI models quite effectively; creating AI models has not been evaluated yet.
Network nodes (data miners) will have upgradable software, which will help enable new features as the network evolves. Storage sharing as a service is not included in the current scope.
The project sources will eventually appear on GitHub. We plan to launch the Daratus MVP network as soon as it is available.
You are always welcome to join our Slack: http://daratus.com/slack/