Topic: Harvesting/Scraping (Read 2396 times)

full member
Activity: 170
Merit: 100
August 04, 2011, 10:54:46 AM
#7
If you're doing scraping for some purpose other than spamming or stealing somebody else's data, I have no problem with it.  In my experience, however, the majority of cases where somebody scrapes web sites for hire involve email addresses to be used for spamming.  That doesn't mean that there aren't legitimate uses for scraping, just that I don't often see them in use.
During the past few years, I have sold a couple of data scraping solutions, none of which fit into your less-than-legitimate categories. So, if you are seeing mostly illegitimate uses of scraping, maybe you want to look for another peer group to work with?

More importantly: rather than insulting people enriching the bitcoin economy by providing services, why not offer services yourself? Just an idea... :-)
legendary
Activity: 1386
Merit: 1000
August 04, 2011, 10:39:16 AM
#6
You should market your services as "Data import", "Data cleaning", "Data processing" and provide a portfolio of some interesting results.

How well do you know http://kettle.pentaho.com/?
Can you integrate your parsers with it?
sr. member
Activity: 322
Merit: 250
July 07, 2011, 07:00:27 PM
#5
Yeah. It's sad that the negatives get more publicity. :-/
full member
Activity: 126
Merit: 100
July 07, 2011, 06:30:37 PM
#4
If you're doing scraping for some purpose other than spamming or stealing somebody else's data, I have no problem with it.  In my experience, however, the majority of cases where somebody scrapes web sites for hire involve email addresses to be used for spamming.  That doesn't mean that there aren't legitimate uses for scraping, just that I don't often see them in use.
sr. member
Activity: 322
Merit: 250
July 07, 2011, 12:39:09 PM
#3
Scraping, like hacking, has a bad rap. 

It's just a tool.  Sure, most people, when they hear/think of "scraping" a website, think of "spam" and "scraping emails."

Scraping is a tool.  Too bad it has been used for spamming in the public eye. But scraping, in and of itself, is not bad, or negative, or time-wasting.

Big companies do it all the time, and they're not scraping email addresses. They are scraping other kinds of information. Big companies scrape each other.

As a marketer, I've scraped a few of my merchants' sites (when they don't offer a datafeed).  I create my own datafeed.

Scraping is not evil.  It's what people do with what's scraped that can be positive or negative.

But I digress.
full member
Activity: 126
Merit: 100
July 05, 2011, 06:11:27 PM
#2
Not to be a spoilsport, but if the "data" that you are offering to scrape is email addresses, please be aware that Spamhaus, Spamcop, SURBL, URIBL, and most of the other widely-used anti-spam blocklists consider sending bulk email to scraped addresses to be spam.  Spamhaus lists the web sites that sell lists of email addresses in their main blocklist for spam support. 

If you can write code to scrape web sites, you can also write code to do other things that are more productive and less likely to piss off people who don't like spam.  Just an idea. :-)
hero member
Activity: 576
Merit: 514
July 03, 2011, 09:34:57 AM
#1
I have a bit of spare time, so if someone needs data scraped from a website, drop me a PM.

Data can be provided as CSV or SQL.

Price depends on the complexity and amount of data.
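
For anyone wondering what "CSV or SQL" delivery could look like, here is a rough sketch using only the Python standard library. It is not the actual tooling behind this offer; the `products` table, the `name`/`price` columns, and the helper names `to_csv`, `to_sql` and `sql_literal` are all made up for illustration.

```python
import csv
import io


def to_csv(rows, columns):
    """Render a list of dicts as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def sql_literal(value):
    """Turn a Python value into a SQL literal (None, numbers, or text)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    # Escape single quotes by doubling them, per standard SQL.
    return "'" + str(value).replace("'", "''") + "'"


def to_sql(rows, table, columns):
    """Render a list of dicts as one INSERT statement per row."""
    col_list = ", ".join(columns)
    lines = []
    for row in rows:
        values = ", ".join(sql_literal(row.get(c)) for c in columns)
        lines.append(f"INSERT INTO {table} ({col_list}) VALUES ({values});")
    return "\n".join(lines)


# Hypothetical scraped records, e.g. from a merchant's product pages.
rows = [
    {"name": "Widget", "price": 9.99},
    {"name": "O'Brien's Gadget", "price": 12},
]

csv_text = to_csv(rows, ["name", "price"])
sql_text = to_sql(rows, "products", ["name", "price"])
print(csv_text)
print(sql_text)
```

The SQL helper builds plain text for a dump file; when loading straight into a database, parameterized queries from the database driver are the safer choice.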