Topic: Ninjastic.space - BitcoinTalk Post/Address archive + API - page 27. (Read 15134 times)

legendary
Activity: 2940
Merit: 3030
Awesome nice work and useful tool buddy  Cool

Maybe you can add something in the search post section .
It would be helpful if there were an Ignore field for usernames, so that when you search for some content, posts from that user don't show up in the results.

Don't know if that is possible and easy to code into the webpage and bot.

 
legendary
Activity: 2758
Merit: 6830
Maybe you can include userIDs in the data dump? Or make a separate data dump for "userID,addresses".
I'll make some changes in the way I store the addresses which will probably allow me to get this data.

When I search for a simple text ("test"), I quickly get 200 results. However, when I search for a rare text (such as "1NXYoJ5xU91Jp83XfVMHwwTUyZFK64BoAD") without Author or Topic ID, the Search button keeps spinning for what seems like indefinitely. I assume it's still searching, but this might cause a problem with server load at some point.
You're right. The database should have a timeout for long queries but it wasn't working. It is now. Tongue

You also made me realize that the query was taking too long in some cases when there were few results. For example, pretty much any address search would take forever. Now you can (mostly) search them and it is way faster.

This is cool Smiley Maybe make it 10 minutes though, that's the time limit after which the forum will show a post as edited.
I'll think about it. I'm also thinking about making more checks, like another one after 1 or 2 hours.



I don't have enough free time, and sometimes I do research/post from the phone, while in bed, so having a bbcode format of the results from the search to copy from will be a very handy tool at least for me.
Will do! Thanks!



I had already mentioned your old website in the German section. If it is desired I can translate the big new update and advertise it in our section.
Your effort and invested time should help as many users as possible Smiley
Thank you! I really appreciate that. Wink



~
You gave me some ideas. I won't do it exactly like this since I would need to make some big changes, but I may have found a possible solution.


Edit: Just saw the bot/website received a big donation. Thanks to whoever did it! That pretty much pays for the bot for a whole year. Grin
copper member
Activity: 1610
Merit: 1899
Amazon Prime Member #7

Since it's all in a DB, it would be possible to associate a user with all of the addresses they've posted, no? I see the opposite being available and I can't help but think of also searching by user.
Not exactly. Depends on the database schema. I'm already working on this, but I still didn't find a good solution that is fast and effective.
Your 'posts' table should be as follows:
postID (P) - int
time
UID - int
posttext - str
addressposted - bool (this is optional, but may make searching for addresses easier)
(any other information that does not appear in your DB elsewhere)

You will have an 'addresses' table as follows:
addressID (P) - int (this is an arbitrary number)
address - str
(any other information that does not already appear elsewhere in your DB that you want to keep track of, such as address type, or coin)

postedaddresses table:
ID (P) - int (arbitrary number)
postID - int
addressID - int
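
A minimal sketch of the three tables above, using Python's built-in sqlite3 purely for illustration (the actual database behind the site is not stated in this thread). The index names, the `addresses_for_user` helper and the sample rows are assumptions; only the table and column names come from the post.

```python
import sqlite3

# In-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (
    postID        INTEGER PRIMARY KEY,
    time          TEXT,
    UID           INTEGER,
    posttext      TEXT,
    addressposted INTEGER DEFAULT 0   -- optional boolean flag
);
CREATE TABLE addresses (
    addressID INTEGER PRIMARY KEY,    -- arbitrary number
    address   TEXT UNIQUE
);
CREATE TABLE postedaddresses (
    ID        INTEGER PRIMARY KEY,    -- arbitrary number
    postID    INTEGER REFERENCES posts(postID),
    addressID INTEGER REFERENCES addresses(addressID)
);
-- Hypothetical indexes so lookups work in both directions.
CREATE INDEX idx_posts_uid   ON posts(UID);
CREATE INDEX idx_posted_post ON postedaddresses(postID);
CREATE INDEX idx_posted_addr ON postedaddresses(addressID);
""")

# Made-up sample rows: user 10 posts two addresses, user 20 posts one.
conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?, ?)", [
    (1, "2020-09-06", 10, "first post", 1),
    (2, "2020-09-07", 10, "second post", 1),
    (3, "2020-09-07", 20, "another user", 1),
])
conn.executemany("INSERT INTO addresses VALUES (?, ?)", [
    (1, "1NXYoJ5xU91Jp83XfVMHwwTUyZFK64BoAD"),
    (2, "placeholder-address-B"),
])
conn.executemany("INSERT INTO postedaddresses VALUES (?, ?, ?)", [
    (1, 1, 1), (2, 2, 2), (3, 2, 1), (4, 3, 2),
])

def addresses_for_user(conn, uid):
    """Return every distinct address posted by the given user ID."""
    rows = conn.execute(
        """SELECT DISTINCT a.address
           FROM posts p
           JOIN postedaddresses pa ON pa.postID = p.postID
           JOIN addresses a ON a.addressID = pa.addressID
           WHERE p.UID = ?
           ORDER BY a.address""",
        (uid,),
    ).fetchall()
    return [row[0] for row in rows]

user_addresses = addresses_for_user(conn, 10)
```

The join table is what makes "user to addresses" and "address to posts" both a plain indexed lookup, instead of a text search over every post.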

legendary
Activity: 3654
Merit: 8909
https://bpip.org
Let me know if you'd like some help with the unknown titles. I can give you a dump of post IDs and titles that could significantly reduce the number of posts you'd need to re-scrape.
That would be great.

https://bpip.org/titles_20200907.zip

Format is CSV because it's relatively compact for this volume of data. The zip file is ~800MB, uncompressed size ~3GB. It'd be possible to make it smaller by grouping the titles together ("Title",) since most replies typically have the same title but it would probably be a pain to import. Let me know if you prefer a different format.

Sample (note the double quote escapes):

Code:
"PostId","Title"
28,Welcome to the new Bitcoin forum!
29,Repost: Bitcoin Maturation
30,Repost: Request: Make this anonymous?
31,Re: Repost: Bitcoin Maturation
32,Re: Repost: Request: Make this anonymous?
33,Repost: How anonymous are bitcoins?
34,Re: Repost: How anonymous are bitcoins?
36,Repost: Linux/UNIX compile
37,Re: Repost: Linux/UNIX compile
38,[OLD THREAD] Bitcoin version 0.2 development status
40,A few suggestions
41,Re: A few suggestions
42,Re: A few suggestions
43,Questions about Bitcoin
44,Re: A few suggestions
45,Re: A few suggestions
46,Re: Questions about Bitcoin
47,Re: A few suggestions
48,Re: Questions about Bitcoin
49,Re: Questions about Bitcoin
50,Re: A few suggestions
51,Re: A few suggestions
52,Re: A few suggestions
53,Break on the supply's increase
54,Re: A few suggestions
55,Re: A few suggestions
56,Re: Break on the supply's increase
57,Re: A few suggestions
58,Re: A few suggestions
59,Re: A few suggestions
60,Re: A few suggestions
61,Re: A few suggestions
62,Re: A few suggestions
63,Re: A few suggestions
64,Re: A few suggestions
65,"New Exchange Service: ""BTC 2 PSC"""
66,Re: A few suggestions
67,Re: A few suggestions
68,Re: A few suggestions
69,Re: A few suggestions
70,Re: A few suggestions
71,Re: A few suggestions
72,Re: A few suggestions
73,Bitcoin 0.2 released!
74,Re: Bitcoin 0.2 released!
75,FreeBSD build patch
76,Re: A few suggestions
77,Re: A few suggestions
78,Re: A few suggestions
79,Re: A few suggestions
81,Re: A few suggestions
82,"Re: New Exchange Service: ""BTC 2 PSC"""
83,Is my second Transaction working correctly? +Transfer Question
84,Re: Is my second Transaction working correctly? +Transfer Question
85,Re: Is my second Transaction working correctly? +Transfer Question
86,Re: Is my second Transaction working correctly? +Transfer Question
87,"64bit support"
88,Re: Bitcoin 0.2 released!
90,Web UI ideas
91,Re: Web UI ideas
92,Re: Web UI ideas
93,Re: Web UI ideas
94,Re: Web UI ideas
95,"Re: New Exchange Service: ""BTC 2 PSC"""
96,"Re: New Exchange Service: ""BTC 2 PSC"""
97,Re: 64bit support
100,New exchange (Bitcoin Market)
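
For anyone importing the dump, here is a small sketch of reading this format with Python's standard csv module, which handles the doubled double quotes by default. The `read_titles` helper is a hypothetical name, and the sample is just a few rows copied from above.

```python
import csv
import io

# A few rows copied from the sample above, including the escaped quotes.
SAMPLE = '''"PostId","Title"
28,Welcome to the new Bitcoin forum!
65,"New Exchange Service: ""BTC 2 PSC"""
87,"64bit support"
'''

def read_titles(fileobj):
    """Map post ID -> title from a CSV file object with a header row."""
    reader = csv.DictReader(fileobj)
    return {int(row["PostId"]): row["Title"] for row in reader}

titles = read_titles(io.StringIO(SAMPLE))
```

For the real ~3GB file you would open it with `open(path, newline="", encoding="utf-8")` (the encoding is an assumption) and probably insert rows in batches rather than building one big dict.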
staff
Activity: 2310
Merit: 2632
I had already mentioned your old website in the German section. If it is desired I can translate the big new update and advertise it in our section.
Your effort and invested time should help as many users as possible Smiley
legendary
Activity: 2184
Merit: 3134
₿uy / $ell
This is an amazing tool, I love it.
Sorry, No more merit to give... and theymos is silent on my Merit Source Application Sad

I'll add a small suggestion.

I don't have enough free time, and sometimes I do research/post from the phone, while in bed, so having a bbcode format of the results from the search to copy from will be a very handy tool at least for me.
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
Since it's all in a DB, it would be possible to associate a user with all of the addresses they've posted, no? I see the opposite being available and I can't help but think of also searching by user.
Not exactly. Depends on the database schema. I'm already working on this, but I still didn't find a good solution that is fast and effective.
Maybe you can include userIDs in the data dump? Or make a separate data dump for "userID,addresses".

When I search for a simple text ("test"), I quickly get 200 results. However, when I search for a rare text (such as "1NXYoJ5xU91Jp83XfVMHwwTUyZFK64BoAD") without Author or Topic ID, the Search button keeps spinning for what seems like indefinitely. I assume it's still searching, but this might cause a problem with server load at some point.

- Post edit history.

If you go to a post's page, you can see the original unedited post and its edited version (edited up to 5 minutes later).
This is cool Smiley Maybe make it 10 minutes though, that's the time limit after which the forum will show a post as edited.
legendary
Activity: 1484
Merit: 1355

So your statement that searching for addresses is no different from searching for ordinary words is incorrect.

Not really. As TryNinja replied, it is probably possible to get some bounty profiles but the results will not be complete or completely accurate.
Anyone who has ever searched for alt accounts of ban evaders and bounty cheaters knows that it requires a bit of creativity as well.  Wink
legendary
Activity: 2758
Merit: 6830
New Update!

- Post edit history.

If you go to a post's page, you can see the original unedited post and its edited version (edited up to 5 minutes later). Also, you will be notified by the bot if someone edits the post and mentions you/quotes you later (again, up to 5 min). Smiley

Example: https://ninjastic.space/post/55146574



It is possible to get the bounty profiles (at least some of them). I also save the post's board, so I can filter them to the "Bounties" board. I will see if I can take a look at this when I have more time.
legendary
Activity: 1484
Merit: 1355
Of course, it would be great if such a function worked not only on wallet addresses, but also on social media addresses (twitter, facebook, telegram, etc.)

I don't think it's that simple. There is no uniform format for social media addresses. For example, someone can post only their profile name as opposed to the full url to the profile page. Searching for such data is no different than searching for any other words.
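
To illustrate the difference: wallet addresses follow a fixed shape, so they can be pulled out of a post with a pattern, while a bare profile name cannot. Below is a rough sketch for legacy Bitcoin addresses only (starting with 1 or 3, Base58 characters). It does not verify the checksum and ignores other address types, and it is not necessarily how Ninjastic.space extracts addresses.

```python
import re

# Base58 excludes 0, O, I and l. This only matches the general shape of
# legacy addresses; it does NOT validate the checksum.
LEGACY_ADDRESS = re.compile(r"\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b")

def find_addresses(text):
    """Return all substrings that look like legacy Bitcoin addresses."""
    return LEGACY_ADDRESS.findall(text)

found = find_addresses(
    "Send it to 1NXYoJ5xU91Jp83XfVMHwwTUyZFK64BoAD please. Twitter: someuser"
)
```

The address is picked up, but "someuser" looks like any other word, which is the point made above.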
legendary
Activity: 2758
Merit: 6830
Maybe don't rush with that. Ignoring quotes could be a feature, at least an optional one. Most of the time I would probably want something the user posted themselves, not when they quoted something.
I already plan to do that. The index I had created was not only removing the quotes but also parts of the original post, so changes are required anyways.

However searching for partial words would be great. E.g. "bitcoin" should find "bitcoins". Or perhaps it should be an option too, for those cases where you don't want "ninja" to find "tryninja".
You do get results for "bitcoins" if you search "bitcoin". Most plural words are equal to their singular forms when you do a search. Ninja and TryNinja are too far apart and something like this would probably result in a lot of false positives (like in this case).

Let me know if you'd like some help with the unknown titles. I can give you a dump of post IDs and titles that could significantly reduce the number of posts you'd need to re-scrape.
That would be great.

Since it's all in a DB, it would be possible to associate a user with all of the addresses they've posted, no? I see the opposite being available and I can't help but think of also searching by user.
Not exactly. Depends on the database schema. I'm already working on this, but I still didn't find a good solution that is fast and effective.
legendary
Activity: 2394
Merit: 1412
Since it's all in a DB, it would be possible to associate a user with all of the addresses they've posted, no? I see the opposite being available and I can't help but think of also searching by user.
legendary
Activity: 3654
Merit: 8909
https://bpip.org
Goddammit, you are right. I messed up big with my algo. I hate regex. Angry

I will fix it and the results should increase considerably for every other search. It does look inside quotes, but the regex is eating a few parts of the post it shouldn't when there are multiple quotes.

Maybe don't rush with that. Ignoring quotes could be a feature, at least an optional one. Most of the time I would probably want something the user posted themselves, not when they quoted something.

However searching for partial words would be great. E.g. "bitcoin" should find "bitcoins". Or perhaps it should be an option too, for those cases where you don't want "ninja" to find "tryninja".



Let me know if you'd like some help with the unknown titles. I can give you a dump of post IDs and titles that could significantly reduce the number of posts you'd need to re-scrape.
legendary
Activity: 2758
Merit: 6830
I can't even imagine how difficult it is.

It's just a matter of convenience, but is it possible to bold the text in posts that matches the content/keyword? For example, if I use the content/keyword "blockchain", all text containing "blockchain" would be bolded automatically.
I think it is. I'll look at it later today.
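
A rough sketch of the idea, wrapping each case-insensitive match in BBCode bold tags. The `bold_matches` name and the choice of BBCode output are assumptions; the website itself would more likely wrap matches in an HTML element.

```python
import re

def bold_matches(text: str, keyword: str) -> str:
    """Wrap every case-insensitive occurrence of keyword in [b]...[/b]."""
    if not keyword:
        return text
    # re.escape keeps characters like "." or "+" in the keyword literal.
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    # Reuse the matched text so the original capitalization is preserved.
    return pattern.sub(lambda m: f"[b]{m.group(0)}[/b]", text)

highlighted = bold_matches("Blockchain and blockchains", "blockchain")
```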

Amazing! I see that you are making use of Reactjs, way to go Smiley .  I'm interested in your RESTful API, can you elaborate on how it works?
I have a few endpoints which will return the data you need. For example:

GET: /posts/55141939

Code:
{
  "id": "1c25054c-b1b8-41eb-8c66-697c8b697179",
  "post_id": 55142446,
  "topic_id": 5273824,
  "title": "Re: Ninjastic.space - BitcoinTalk Post/Address archive + API",
  "author": "Aveatrex",
  "author_uid": 950474,
  "content": "Amazing! I see that you are making use of Reactjs, way to go \"Smiley\" .  I'm interested in your RESTful API, can you elaborate on how it works? ",
  "date": "2020-09-06T12:45:48.000Z",
  "boards": [
    "Other",
    "Meta"
  ],
  "archive": false
}
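
For anyone consuming this from a script, a small sketch of parsing the response above into a typed object. The `Post` class and `parse_post` helper are hypothetical names; the field names and the sample body are copied from the response shown above. No request is made here since the API's base URL is not given in this post.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List

# Response body copied from the example above.
SAMPLE_RESPONSE = r'''
{
  "id": "1c25054c-b1b8-41eb-8c66-697c8b697179",
  "post_id": 55142446,
  "topic_id": 5273824,
  "title": "Re: Ninjastic.space - BitcoinTalk Post/Address archive + API",
  "author": "Aveatrex",
  "author_uid": 950474,
  "content": "Amazing! I see that you are making use of Reactjs, way to go \"Smiley\" .  I'm interested in your RESTful API, can you elaborate on how it works? ",
  "date": "2020-09-06T12:45:48.000Z",
  "boards": ["Other", "Meta"],
  "archive": false
}
'''

@dataclass
class Post:
    id: str
    post_id: int
    topic_id: int
    title: str
    author: str
    author_uid: int
    content: str
    date: datetime
    boards: List[str]
    archive: bool

def parse_post(raw: str) -> Post:
    """Turn one JSON post object into a Post with a timezone-aware date."""
    data = json.loads(raw)
    # The trailing "Z" means UTC; strptime treats it as a literal here.
    data["date"] = datetime.strptime(
        data["date"], "%Y-%m-%dT%H:%M:%S.%fZ"
    ).replace(tzinfo=timezone.utc)
    return Post(**data)

post = parse_post(SAMPLE_RESPONSE)
```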

For me (other users may have a different experience), the background and the text color don't match. If you are happy with that, it's okay, nothing needs to change.
It will be fixed. Thanks!

Nitpick: for the front-page charts perhaps it makes sense to exclude today otherwise the charts have a bit of a misleading dip at the end there.
Makes sense. I'll remove today's data.

Looks great. Very fast. Not sure how accurate it is though. I can't believe I said "cunt" only 7 times. Does it search only outside of quotes? Only full words?
Goddammit, you are right. I messed up big with my algo. I hate regex. Angry

I will fix it and the results should increase considerably for every other search. It does look inside quotes, but the regex is eating a few parts of the post it shouldn't when there are multiple quotes.
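
One common way a regex ends up "eating" text between multiple quotes is a greedy match, sketched below. This is only a guess at the kind of bug described, not the actual code, and it assumes posts are stored with BBCode-style quote tags, which may not be the case.

```python
import re

POST = (
    "[quote]first quoted text[/quote]my own reply"
    "[quote]second quoted text[/quote]more of my reply"
)

# Greedy: ".*" runs from the first [quote] to the LAST [/quote],
# swallowing the author's own text in between.
GREEDY = re.compile(r"\[quote\].*\[/quote\]", re.DOTALL)

# Lazy: ".*?" stops at the nearest [/quote], so each quote is
# removed on its own.
LAZY = re.compile(r"\[quote\].*?\[/quote\]", re.DOTALL)

def strip_quotes(text: str, pattern: re.Pattern) -> str:
    """Remove every span matched by pattern from text."""
    return pattern.sub("", text)

greedy_result = strip_quotes(POST, GREEDY)
lazy_result = strip_quotes(POST, LAZY)
```

Here the greedy version keeps only "more of my reply", while the lazy one keeps both parts of the author's own text. Nested quotes would still need a proper parser rather than a regex.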
legendary
Activity: 3654
Merit: 8909
https://bpip.org
Nitpick: for the front-page charts perhaps it makes sense to exclude today otherwise the charts have a bit of a misleading dip at the end there.

Looks great. Very fast. Not sure how accurate it is though. I can't believe I said "cunt" only 7 times. Does it search only outside of quotes? Only full words?


legendary
Activity: 2198
Merit: 1592
hmph..
Everything is good, and the post search feature is very helpful. To set it apart from simply showing the latest posts from Bitcointalk, maybe you can add post filters there. Not an important thing, just an additional feature that may help in the future.

Also, the link color should change from blue to a softer color. I tried this style:

Code:
a {
    color: darkorange;
    text-decoration: none;
    background-color: transparent;
    outline: none;
    cursor: pointer;
    -webkit-transition: color .3s;
    transition: color .3s;
    -webkit-text-decoration-skip: objects;
}

For me (other users may have a different experience), the background and the text color don't match. If you are happy with that, it's okay, nothing needs to change.

example result:
       Test -       Test         
sr. member
Activity: 840
Merit: 375
Amazing! I see that you are making use of Reactjs, way to go Smiley .  I'm interested in your RESTful API, can you elaborate on how it works?
legendary
Activity: 1484
Merit: 1355
I can't even imagine how difficult it is.

It's just a matter of convenience, but is it possible to bold the text in posts that matches the content/keyword? For example, if I use the content/keyword "blockchain", all text containing "blockchain" would be bolded automatically.

You mean something like what the Bitcointalk search function does? Yeah, that would be nice. I'm sure TryNinja can make the search term highlighted in the results.
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
Some stats won't be available, since I can't get them (~topics created
Just a thought: you can probably get this if you check which user made the first post in each topic.
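
A quick sketch of that idea: treat the lowest post ID in each topic as the first post and count its author as the topic creator. The field names follow the API response shown in this thread; the `topics_created` helper and the sample rows are made up. It only works for topics whose first post is actually in the archive.

```python
from collections import Counter

def topics_created(posts):
    """Count topics per author, using the lowest post_id in each topic."""
    first_post = {}
    for post in posts:
        current = first_post.get(post["topic_id"])
        if current is None or post["post_id"] < current["post_id"]:
            first_post[post["topic_id"]] = post
    return Counter(post["author"] for post in first_post.values())

# Made-up sample rows, deliberately out of order.
sample_posts = [
    {"post_id": 12, "topic_id": 1, "author": "bob"},
    {"post_id": 10, "topic_id": 1, "author": "alice"},
    {"post_id": 20, "topic_id": 2, "author": "bob"},
    {"post_id": 21, "topic_id": 2, "author": "alice"},
    {"post_id": 30, "topic_id": 3, "author": "alice"},
]

created = topics_created(sample_posts)
```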

Quote
The "Unknown Title" shows up for the posts Loyce provided to me. His data didn't include any title, so I would need to scrape everything from zero to get them. New posts and posts which got notified to a bot user have their title.
Most topics have been deleted, so if you scrape those titles, you can skip a few million topics to speed things up.
legendary
Activity: 2758
Merit: 6830
I really like the post search feature. It's more efficient than using Google search and has some unique search options.
Thanks! It was hard to make it usable with 47m+ posts, so I'm glad it's working. I had to learn a lot about how databases and indexes work. It's not perfect, though.

Is it possible to get the "general statistics for each user [e.g. like this]"?
- I'm after the last two stats (Most Popular Boards By Posts and Activity).
Yes! Or kind of. I'm already working on something like this.

Some stats won't be available, since I can't get them (votes cast, polls created, time spent online, topics created, most popular boards per activity, etc...). But you will be able to see stuff like in which boards you posted the most (and when you did it).

I noticed it shows "(Unknown Title)" for threads written in languages other than English [not a big deal though].
The "Unknown Title" shows up for the posts Loyce provided to me. His data didn't include any title, so I would need to scrape everything from zero to get them. New posts and posts which got notified to a bot user have their title.