Topic: Scraping policy? (Read 176 times)

hero member
Activity: 2128
Merit: 532
FREE passive income eBook @ tinyurl.com/PIA10
April 16, 2020, 06:51:19 AM
#5
Quote
Your ISP will thank you. :)

It should be the other way round; he paid for the service, man :D
Vod
legendary
Activity: 3668
Merit: 3010
Licking my boob since 1970
April 16, 2020, 05:14:05 AM
#4
Quote
Hi all, I made a small script that polls bitcointalk for new posts (every 10 seconds) and sends me an alert on Discord when a new post matches a keyword. I was wondering if this was an acceptable use of the site? I couldn't find any rules or guidelines related to automation/bots/scraping/etc.

Thx! Kev

You'll find ten seconds is too fast. Most of the time you'll have 0 new posts, except for a couple of hours a day when there might be 11 posts in 10 seconds. Instead, set it to a minute or so. Then, if all ten records on the first page are new, parse the second page, and so on, until you get to page 10. The forum is unlikely to get more than one hundred posts in a minute.

Your ISP will thank you. :)
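
A minimal sketch of the paging idea above, assuming post IDs increase over time and each page lists ten posts, newest first. `fetch_page` is a hypothetical helper (not part of any forum API) that returns the post IDs on a given page, and the one-second `delay` between page fetches is an assumption added to stay near the forum's request limit.

```python
import time
from typing import Callable, List

PAGE_SIZE = 10   # records per page, as described above
MAX_PAGES = 10   # stop at page 10, i.e. at most 100 posts per poll


def collect_new_posts(
    fetch_page: Callable[[int], List[int]],
    last_seen_id: int,
    delay: float = 1.0,
) -> List[int]:
    """Return the IDs newer than last_seen_id, newest first.

    fetch_page(n) is a hypothetical helper returning the post IDs on
    page n (1-based), newest first.
    """
    new_ids: List[int] = []
    for page in range(1, MAX_PAGES + 1):
        if page > 1 and delay > 0:
            time.sleep(delay)  # pause between consecutive page requests
        ids = fetch_page(page)
        fresh = [i for i in ids if i > last_seen_id]
        new_ids.extend(fresh)
        # Go deeper only when the whole page was new; otherwise we caught up.
        if len(fresh) < PAGE_SIZE:
            break
    return new_ids
```

The caller would run this roughly once a minute and remember the largest ID returned as the next `last_seen_id`.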
legendary
Activity: 2758
Merit: 6830
April 15, 2020, 10:21:52 PM
#3
There is nothing wrong with doing this. There are at least 2 big projects that scrape pretty much every aspect of the forum and make the data public (trust, posts, merit, users, etc.). And they are both maintained by trusted members.

http://loyce.club/
http://bpip.org/

Just respect the 1 request per second limit and the rules and you will be fine.

Quote
The rules are the same as for humans. But keep in mind:
- No one is allowed to access the site more often than once per second on average. (Somewhat higher burst accesses are OK.)
- Every post must be on-topic. Any bot response to a topic is almost certainly off-topic. Changetip's behavior of responding to user commands publicly would not be allowed, for example.
- If someone complains about an unsolicited PM you send them, then you're probably going to be banned.

Quote
Those IPs are not blocked currently. But your other abusive IPs were blocked. Just your quotefast requests (which are only part of what your crawler does) were occurring at an average frequency of 7.6 requests per second in the most recent access logging period. Your requests constituted 3.4% of all forum requests in this period. This is entirely unacceptable and of course resulted in those IPs being banned.

The maximum allowed bot request frequency is 1 request per second. Those IPs are now accessing pages at an average of 2.5 requests per second combined. If you continue exceeding the allowed request limit, we will continue banning your IPs.
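
One way to stay within "once per second on average" while still allowing small bursts is a token bucket. The sketch below is only an illustration: the burst size of 3 is an assumption (the quoted rule does not give a number), and `TokenBucket` and `wait_time` are names made up for this example.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `burst`."""

    def __init__(self, rate: float = 1.0, burst: int = 3, clock=time.monotonic):
        self.rate = rate
        self.capacity = float(burst)  # assumed burst allowance
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()

    def wait_time(self) -> float:
        """Reserve one request and return how long to sleep before sending it."""
        now = self.clock()
        # Refill tokens for the time elapsed since the last call.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        self.tokens -= 1.0
        if self.tokens >= 0:
            return 0.0
        # Bucket is in debt: wait until the reserved token has been refilled.
        return -self.tokens / self.rate
```

A scraper would call `time.sleep(bucket.wait_time())` before every request, sharing one bucket across all of its IPs, since the quote above counts the combined rate.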
hero member
Activity: 952
Merit: 662
April 15, 2020, 10:21:43 PM
#2
I think if it doesn't exploit a bug or threaten the forum in any way, it's allowed.

Quote
28. Exploiting bugs or flaws (even if the result is harmless) in the forum's software is not allowed
newbie
Activity: 3
Merit: 1
April 15, 2020, 10:12:42 PM
#1
Hi all, I made a small script that polls bitcointalk for new posts (every 10 seconds) and sends me an alert on Discord when a new post matches a keyword. I was wondering if this was an acceptable use of the site? I couldn't find any rules or guidelines related to automation/bots/scraping/etc.

Thx! Kev
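
For readers wondering what such a script involves, here is a rough sketch of the matching and alerting half. It is not Kev's actual code. It assumes a Discord webhook, which accepts a JSON body with a `content` field of up to 2000 characters; the function names and the `User-Agent` string are made up for illustration.

```python
import json
import urllib.request
from typing import Iterable, Optional


def matches_keyword(text: str, keywords: Iterable[str]) -> Optional[str]:
    """Return the first keyword found in text (case-insensitive), else None."""
    lowered = text.lower()
    for kw in keywords:
        if kw.lower() in lowered:
            return kw
    return None


def build_alert(post_url: str, text: str, keyword: str) -> bytes:
    """Build the JSON body for a webhook message, capped at 2000 characters."""
    message = f"Keyword '{keyword}' matched: {post_url}\n{text}"
    return json.dumps({"content": message[:2000]}).encode("utf-8")


def send_alert(webhook_url: str, body: bytes) -> None:
    """POST the alert to a webhook URL supplied by the caller."""
    request = urllib.request.Request(
        webhook_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "User-Agent": "keyword-alert-sketch/0.1",  # hypothetical identifier
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10):
        pass  # the response body is not needed
```

The polling loop would call `matches_keyword` on each new post and, on a hit, pass the result of `build_alert` to `send_alert`.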