Topic: Gathering and curation of datasets (Read 96 times)

jr. member
Activity: 224
Merit: 2
ICO Community Management & Engagement happymod.io
March 29, 2018, 06:16:08 AM
#2
Agreed. Lots of machine learning projects require big datasets, but those generally aren't available in either sufficient quantity or sufficient quality.
newbie
Activity: 13
Merit: 0
December 28, 2017, 11:00:54 AM
#1
Gathering and quality control of structured data are the main hassles for machine learning engineers. Each gathering method generally produces a single set of structured data (this can be mitigated by smart gathering methods, classification, and some software features).

Examples of dataset gathering methods:
●   Bots: Automated scripts crawling the web or databases to “mine” data;
●   Third parties: External third parties provide inputs, which are aggregated together to form a new set of structured data;
●   Bulk buys: Datasets bought from external data processors or taken from open-source datasets.
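
To make the third-party case a bit more concrete, here is a rough sketch of how inputs from several contributors could be aggregated into one structured dataset with a basic quality check. Everything in it is an assumption for illustration: the function name aggregate_labels, the thresholds min_votes and min_agreement, and the sample submissions are all made up, and majority voting is just one simple way to do this, not a method anyone in this thread has proposed.

```python
from collections import Counter


def aggregate_labels(submissions, min_votes=2, min_agreement=0.6):
    """Merge (item_id, label) pairs from many contributors by majority vote.

    Hypothetical quality control: an item is kept only if it received at
    least `min_votes` labels and the winning label holds at least
    `min_agreement` of those votes.
    """
    # Count how often each label was submitted for each item.
    votes = {}
    for item_id, label in submissions:
        votes.setdefault(item_id, Counter())[label] += 1

    dataset = {}
    for item_id, counter in votes.items():
        total = sum(counter.values())
        label, count = counter.most_common(1)[0]
        if total >= min_votes and count / total >= min_agreement:
            dataset[item_id] = label
    return dataset


# Made-up submissions from hypothetical third parties.
submissions = [
    ("img1", "cat"), ("img1", "cat"), ("img1", "dog"),  # 2 of 3 agree
    ("img2", "dog"),                                    # too few votes
    ("img3", "cat"), ("img3", "dog"),                   # no clear majority
]
result = aggregate_labels(submissions)
print(result)  # {'img1': 'cat'}
```

Raising the thresholds trades quantity for quality: fewer items survive, but the ones that do have stronger agreement behind them.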