Topic: [BPIP] Bitcointalk Public Information Project [Back in Action] - page 22. (Read 19683 times)

legendary
Activity: 3248
Merit: 3098
I'm not sure whether you changed anything here, but it seems the merit count is again not working well.
For example, user @suchmoon has earned "Merit Score   5497", but the merit information table says "Has received 4495". There is a difference of the 2 latest earned merits. I'm not counting merit assigned at the beginning.
legendary
Activity: 3654
Merit: 8909
https://bpip.org
Yes please, it would make things much easier, especially now that I need to get additional info from other sources where users are referenced by username.

I have added a column for user name, added column headers, and set it up to run around midnight UTC (give or take a few minutes depending on other tasks in the same batch).

https://bpip.org/all_trust.csv

I didn't realize until now that it's relatively easy to get DT Trust scores for all users, and I'm not sure if BPIP uses it already: by checking the sent feedback for all DT-members, you know exactly who has received DT-feedback. It's probably less than 1000 pages to scrape for an update, but it's more work to set it up. I'm currently scraping 13651 profiles per week for an update, so this can be improved.

IIRC we parse from the receiving end. Your idea seems faster.
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
Damn I forgot that to view profiles you need to be logged in
Although I've had some troubles with it, it works quite easily by feeding your scraper your browser cookie. My scraper hasn't been bothered by Cloudflare in a long time.

That's the biggest advantage of bpip imho. So for getting DT ratings I'm gonna use bpip, and to confirm 0-feedback users I'm gonna use LoyceV's trust list. Your services complement each other Smiley
I didn't realize until now that it's relatively easy to get DT Trust scores for all users, and I'm not sure if BPIP uses it already: by checking the sent feedback for all DT-members, you know exactly who has received DT-feedback. It's probably less than 1000 pages to scrape for an update, but it's more work to set it up. I'm currently scraping 13651 profiles per week for an update, so this can be improved.
sr. member
Activity: 840
Merit: 375


I can add a username to the file if it helps you. It would be UTF8.

Yes please, it would make things much easier, especially now that I need to get additional info from other sources where users are referenced by username.

That makes sense, I just checked some users that couldn't be found and all of them have 0 feedback. Which means that if a user can't be found in this csv, either the user has Black trust OR it's an incorrect username/bitcointalk profile url (I'm thinking of a scenario where a bounty manager has a list of users and a bounty hunter provides an incorrect url or username), so I've gotta differentiate the two cases. To do that I'll do additional checking using theymos' trust dump or Loyce's viewer.

Thanks!
plus more-frequent-than-weekly updates. At least as far as the DT ratings are concerned. LoyceV obviously maintains non-DT trust lists, which we don't.
That's the biggest advantage of bpip imho. So for getting DT ratings I'm gonna use bpip, and to confirm 0-feedback users I'm gonna use LoyceV's trust list. Your services complement each other Smiley
legendary
Activity: 3654
Merit: 8909
https://bpip.org
That makes sense, I just checked some users that couldn't be found and all of them have 0 feedback. Which means that if a user can't be found in this csv, either the user has Black trust OR it's an incorrect username/bitcointalk profile url (I'm thinking of a scenario where a bounty manager has a list of users and a bounty hunter provides an incorrect url or username), so I've gotta differentiate the two cases. To do that I'll do additional checking using theymos' trust dump or Loyce's viewer.

Thanks!

I can add a username to the file if it helps you. It would be UTF8.

I don't want to discourage you from double-checking with other sources, or having a backup if BPIP has an issue, but I think we have the same data that's in the trust dump or LoyceV's list, plus more-frequent-than-weekly updates. At least as far as the DT ratings are concerned. LoyceV obviously maintains non-DT trust lists, which we don't.
sr. member
Activity: 840
Merit: 375
Can you give me some examples of users that you can't find? Do those users have DT ratings?

The list includes only users who have at least one (negative, positive, or neutral) rating from a DT member, and/or have at least one DT-supported flag, and/or are DT1/2 members themselves.

If a user is not in the list then it should mean they have none of the above (i.e. trust color "black") but if you could get me some examples I can find out what's going on.

That makes sense, I just checked some users that couldn't be found and all of them have 0 feedback. Which means that if a user can't be found in this csv, either the user has Black trust OR it's an incorrect username/bitcointalk profile url (I'm thinking of a scenario where a bounty manager has a list of users and a bounty hunter provides an incorrect url or username), so I've gotta differentiate the two cases. To do that I'll do additional checking using theymos' trust dump or Loyce's viewer.

Thanks!
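
The two cases described above could be told apart roughly like this in Python. This is only a sketch: `classify`, `csv_colors` and `known_users` are hypothetical names. `csv_colors` stands for the BPIP CSV reduced to a username-to-color mapping, and `known_users` stands for whatever secondary source confirms that an account exists (the trust dump or Loyce's viewer). All sample data is made up.

```python
def classify(username, csv_colors, known_users):
    """Return the trust color from the CSV, "black" for a confirmed user
    that is absent from the CSV, or "unknown" if the name can't be confirmed."""
    if username in csv_colors:
        return csv_colors[username]
    if username in known_users:
        # The account exists but is not listed in the CSV, which per the
        # explanation above should mean trust color "black".
        return "black"
    # Possibly a typo or a wrong profile url in the submitted list.
    return "unknown"


# Made-up data, for illustration only.
csv_colors = {"alice": "green", "bob": "red"}
known_users = {"alice", "bob", "carol"}

for name in ("alice", "carol", "carrol"):
    print(name, classify(name, csv_colors, known_users))
```

Keeping "unknown" separate from "black" means a bounty manager could flag bad entries for manual review instead of silently treating them as clean accounts.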
legendary
Activity: 3654
Merit: 8909
https://bpip.org
because if it's the case, from my tests 27k is definitely not enough, like half the users I tested are not found in it Cry

Can you give me some examples of users that you can't find? Do those users have DT ratings?

The list includes only users who have at least one (negative, positive, or neutral) rating from a DT member, and/or have at least one DT-supported flag, and/or are DT1/2 members themselves.

If a user is not in the list then it should mean they have none of the above (i.e. trust color "black") but if you could get me some examples I can find out what's going on.
sr. member
Activity: 840
Merit: 375
Your bot will need a login, possibly deal with CloudFlare
Damn I forgot that to view profiles you need to be logged in


Since you mentioned Excel I created it in a CSV format, which you should be able to import into a spreadsheet. If you prefer a different format, like JSON or XML - let me know. The columns are:

Code:
user_id, positive_score, negative_score, neutral_score, trust_color, dt_status, dt1_strength, dt2_strength, flag_count

If this is good I can set it up to be updated once a day.

That format works for me! It would be great if you also added a "username" column so I can make it cross-compatible whether the user included bitcointalk profile urls or usernames in the excel file.



Now, there are only ~27k users, is that what you meant by

It's technically possible that some trust ratings (colors) aren't updated quickly if e.g. a rarely active DT member posts a rating for another rarely active user, so keep that in mind.
because if it's the case, from my tests 27k is definitely not enough, like half the users I tested are not found in it Cry

Hmm, maybe I should first use this CSV to check the user's trust, and if he's not in it, search for him in LoyceV's Custom trust list


I assume you included me because I created it.  thx!   But I'm not part of the official team anymore.  Smiley
Yes, that's why I included you. I wasn't aware you left the team, sorry!
Vod
legendary
Activity: 3668
Merit: 3010
Licking my boob since 1970
Hey BPIP team @ibminer @Vod @suchmoon

Hey Aveatrex

https://bpip.org/about.aspx

I assume you included me because I created it.  thx!   But I'm not part of the official team anymore.  Smiley
legendary
Activity: 3654
Merit: 8909
https://bpip.org
Now that you say it, I wonder if I shouldn't make the bot do the same, i.e. scrape bitcointalk profiles directly instead of using bpip as a middleman? That for sure wouldn't solve the false positives detected by 2 out of 68 antiviruses (using VirusTotal) since it will still be making automated requests  Cry but it's a solution to be considered

You could do that but it's quite a bit of hassle. Your bot will need a login, possibly deal with CloudFlare, and of course scrape multiple pages.

Here's a file of all trust ratings that we have on BPIP - let me know if it works for you: https://bpip.org/all_trust.csv

Since you mentioned Excel I created it in a CSV format, which you should be able to import into a spreadsheet. If you prefer a different format, like JSON or XML - let me know. The columns are:

Code:
user_id, positive_score, negative_score, neutral_score, trust_color, dt_status, dt1_strength, dt2_strength, flag_count

If this is good I can set it up to be updated once a day.
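
A minimal sketch of reading a file with the columns listed above, in Python. It assumes there is no header row and that the fields are comma-separated in the order given. The two sample rows and their values (including the `trust_color` and `dt_status` strings) are made up for illustration and are not taken from the real file.

```python
import csv
import io

# Column order as listed in the post; assumed, not verified against the live file.
COLUMNS = [
    "user_id", "positive_score", "negative_score", "neutral_score",
    "trust_color", "dt_status", "dt1_strength", "dt2_strength", "flag_count",
]


def load_trust(text):
    """Parse header-less CSV text into a dict keyed by user_id (kept as a string)."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=COLUMNS, skipinitialspace=True)
    return {row["user_id"]: row for row in reader}


# Made-up sample rows, for illustration only.
SAMPLE = "101,12,0,1,green,DT2,0,3,0\n202,0,4,0,red,,0,0,1\n"

trust = load_trust(SAMPLE)
print(trust["202"]["trust_color"])
```

For a downloaded copy, the same function could be fed the contents of a file opened with `open(path, newline="", encoding="utf-8")`. If the file does carry a header row, dropping the `fieldnames` argument lets `csv.DictReader` take the names from the first line instead.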
sr. member
Activity: 840
Merit: 375
No, the part just under that, if it's Red, Black or Green.
I also have Trust score images, but it depends on how many you need.

Well, if I'm going to release the software publicly, I can't predict how many users a given user will include in his excel file, so that won't work.

Technically, I could scrape this page instead of bpip, which would be much easier, but the downside is that it only updates every week, which is a bit too long for my taste.

We continuously scrape every user profile, prioritizing those who are most active.
Now that you say it, I wonder if I shouldn't make the bot do the same, i.e. scrape bitcointalk profiles directly instead of using bpip as a middleman? That for sure wouldn't solve the false positives detected by 2 out of 68 antiviruses (using VirusTotal) since it will still be making automated requests  Cry but it's a solution to be considered
legendary
Activity: 3654
Merit: 8909
https://bpip.org
Edit: I'm curious, if theymos' trust dump updates every week, how does bpip update the trust of users on demand?

We continuously scrape every user profile, prioritizing those who are most active. It's technically possible that some trust ratings (colors) aren't updated quickly if e.g. a rarely active DT member posts a rating for another rarely active user, so keep that in mind.

I'll see how I can make this easier for you.
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
No, the part just under that, if it's Red, Black or Green.
I also have Trust score images, but it depends on how many you need.

Quote
Edit: I'm curious, if theymos' trust dump updates every week, how does bpip update the trust of users on demand?
Theymos only dumps Trust lists (only for users who have at least one post), not Trust (feedback) scores. BPIP has to scrape each profile for each update.
sr. member
Activity: 840
Merit: 375

What do you mean by trust - is it this part
No, the part just under that, if it's Red, Black or Green.

You can also use the tool created by LoyceV to see trust scores of all users in a single page.
Custom Trust list viewer, (Topic on bitcointalk)

Thank you for your suggestion I'll take a look at it

Edit: I'm curious, if theymos' trust dump updates every week, how does bpip update the trust of users on demand?
legendary
Activity: 2380
Merit: 5213
...Is it possible to have a compressed file of the trust of each user updated like each 24 hours, or is that too much to ask for?
I don't think you need to scrape BPIP.
You can use the following file provided by theymos to extract all trusts and distrusts. It is updated once per week.
trust.txt.xz

You can also use the tool created by LoyceV to see trust scores of all users in a single page.
Custom Trust list viewer, (Topic on bitcointalk)
legendary
Activity: 3654
Merit: 8909
https://bpip.org
I'm scraping the trust of a set list of users from an excel file that the user selects, and saving the results to the same excel file. The problem with this is that it's slow, is heavy on the server and triggers false positives from 1-2 antivirus programs that think the software is performing a DDoS attack or is part of some botnet. Is it possible to have a compressed file of the trust of each user updated like every 24 hours, or is that too much to ask for?

What do you mean by trust - is it this part?

[image]
Edited 2020-11-30 to fix a broken image
sr. member
Activity: 840
Merit: 375
Hey BPIP team @ibminer @Vod @suchmoon,

I'm currently developing a mini-software for the community. Is scraping your website via automated requests allowed? I prefer to ask beforehand as I don't know if you can afford that with your hosting, especially since right now the website's response seems to be slow

Let me know if you need additional information on the intended usage

Let me know what you want to scrape (via PM if you don't want to make it public) and we might have a better solution for it.

I'm scraping the trust of a set list of users from an excel file that the user selects, and saving the results to the same excel file. The problem with this is that it's slow, is heavy on the server and triggers false positives from 1-2 antivirus programs that think the software is performing a DDoS attack or is part of some botnet. Is it possible to have a compressed file of the trust of each user updated like every 24 hours, or is that too much to ask for?
legendary
Activity: 3654
Merit: 8909
https://bpip.org
Hey BPIP team @ibminer @Vod @suchmoon,

I'm currently developing a mini-software for the community. Is scraping your website via automated requests allowed? I prefer to ask beforehand as I don't know if you can afford that with your hosting, especially since right now the website's response seems to be slow

Let me know if you need additional information on the intended usage

Let me know what you want to scrape (via PM if you don't want to make it public) and we might have a better solution for it.
sr. member
Activity: 840
Merit: 375
Hey BPIP team @ibminer @Vod @suchmoon,

I'm currently developing a mini-software for the community. Is scraping your website via automated requests allowed? I prefer to ask beforehand as I don't know if you can afford that with your hosting, especially since right now the website's response seems to be slow


Let me know if you need additional information on the intended usage
legendary
Activity: 2184
Merit: 3134
₿uy / $ell
Just change:
Code:
https://bpip.org/report.aspx?r=mostmerited
to:
Code:
https://bpip.org/report.aspx?r=earnedmerit
and it will be fine. I think there was an update, now I also get a reward for the "Most earned merit" Smiley Well, I lost 2 positions due to inactivity over the last few months but.. still in the top 25 Tongue

Most merited is not the same as earned merit - the link is contained in the first of the five shields in each user's page information.

Hopefully it's just a glitch during the fortnightly update. (have gone back from 39th to 37th with my last fortnight's posts, so about the same as before. Wink )




Are your house building works done yet?

Then mostmerited = mostmerit
Code:
https://bpip.org/report.aspx?r=mostmerit

Quote
Are your house building works done yet?
Renovating. 2 floors done, now is the basement this winter. Build some small things in the garden as well, almost finished with shed for wood. I'm doing all alone so it takes time.