I am trying to write a Python script that converts any Bitcointalk post into equivalent HTML, so that posts can be embedded naturally in web pages, for example. To do this I have to write a BBCode parser that outputs an HTML tag for each BBCode tag and also handles the tags specific to Bitcointalk. Since this is tantamount to writing a state machine for an entire language, which is a daunting task, I have decided to base my work on an existing program rather than write it from scratch. As far as I know, a fully working program suitable for BTT does not exist.
https://github.com/chaomodus/ppcode is someone's bare-bones implementation of a BBCode state machine. It needs a lot of work: it has to handle url= and img= tags, recognize "quote", become case-insensitive, and handle all the other smileys, not just the smiley face. But I think it will be worth it in the long run if I manage to build this. I forked it at https://github.com/ZenulAbidin/ppcode if you want to track its progress.
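To make the tag-to-tag mapping concrete, here is a minimal sketch of the idea. It is not ppcode's code and not a state machine: it uses plain regex substitution, covers only a handful of tags, does not check that tags are properly nested, and does not validate URL schemes. The function name `bbcode_to_html` and the `SIMPLE_TAGS` table are made up for this illustration.

```python
import html
import re

# Hypothetical tag table for this sketch: BBCode name -> HTML element.
SIMPLE_TAGS = {"b": "strong", "i": "em", "u": "u", "s": "del", "quote": "blockquote"}


def bbcode_to_html(text: str) -> str:
    """Convert a small subset of BBCode to HTML (illustrative only)."""
    # Escape raw HTML first so the post text cannot inject markup.
    out = html.escape(text, quote=True)

    # Simple paired tags, matched case-insensitively.
    for bb, tag in SIMPLE_TAGS.items():
        out = re.sub(rf"\[{bb}\]", f"<{tag}>", out, flags=re.IGNORECASE)
        out = re.sub(rf"\[/{bb}\]", f"</{tag}>", out, flags=re.IGNORECASE)

    # [url=target]label[/url]
    out = re.sub(
        r"\[url=([^\]]+)\](.*?)\[/url\]",
        r'<a href="\1">\2</a>',
        out,
        flags=re.IGNORECASE | re.DOTALL,
    )
    # [img]source[/img]
    out = re.sub(
        r"\[img\](.*?)\[/img\]",
        r'<img src="\1">',
        out,
        flags=re.IGNORECASE | re.DOTALL,
    )
    # Preserve line breaks from the original post.
    return out.replace("\n", "<br>\n")


if __name__ == "__main__":
    print(bbcode_to_html("[B]Hello[/b] from [url=https://bitcointalk.org]BTT[/url]"))
```

A real converter would still need quote attribution (author and link), smileys, and the other Bitcointalk-specific tags, which is where a proper state machine starts to pay off.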
Have you tried using TryNinja's API? If you read its documentation [1], you can see that it already scrapes and parses all of Bitcointalk's posts and the associated data, and turns the contents into HTML. The documentation also shows how to use the API from Python scripts. If you want to make a program or application that renders the parsed content as embeddable HTML, you could just iterate over the posts in the API's JSON response and read the key named 'content'. Here's an example:
{
  "result": "success",
  "message": null,
  "data": [
    {
      "post_id": 55763102,
      "topic_id": 5295719,
      "author": "Maus0728",
      "author_uid": 1289002,
      "title": "Re: 2 new Metamask phishing site thru Google Ads",
      "content": "I don't know if everyone practices installing \"uBlock Origin\" as one of their browser add-ons. Though I am fully aware that this is not an ad-blocker, however, based on my experience it can effectively help solve these kinds of phishing attempts in a form of ads that is not carefully filtered by Google. Been using their services for quite some time and fortunately, I never encountered such scam attempts up to this date. [1] https://ublockorigin.com/",
      "date": "2020-12-06T05:14:44.000Z",
      "board_id": 39,
      "board_name": "Beginners & Help",
      "archive": false,
      "created_at": "2020-12-06T05:14:47.342Z",
      "updated_at": "2020-12-06T05:14:50.066Z"
    }
  ]
}
from: https://api.ninjastic.space/posts/55763102
As you can see, the content is already returned as a snippet. If you want to make a web app that embeds posts, it is worth checking his API first, since it makes the job easier.
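A rough sketch of that iteration in Python, using only the standard library. It assumes the response always has the shape of the sample above ("result", "message", and a "data" list whose items carry "content"); the helper names `extract_contents` and `fetch_post_contents` are made up for this example, and the real API may impose rate limits or other requirements that the documentation [1] would cover.

```python
import json
from urllib.request import urlopen

# Endpoint pattern taken from the example URL above.
API_URL = "https://api.ninjastic.space/posts/{post_id}"


def extract_contents(payload: dict) -> list[str]:
    """Return the 'content' field of every post in an API response."""
    if payload.get("result") != "success":
        raise ValueError(f"API error: {payload.get('message')}")
    # 'data' is assumed to be a list of post objects, as in the sample.
    return [post["content"] for post in (payload.get("data") or [])]


def fetch_post_contents(post_id: int, timeout: float = 10.0) -> list[str]:
    """Download one post and return its content strings (needs network access)."""
    with urlopen(API_URL.format(post_id=post_id), timeout=timeout) as resp:
        payload = json.load(resp)
    return extract_contents(payload)


if __name__ == "__main__":
    for content in fetch_post_contents(55763102):
        print(content)
```

Keeping the parsing in `extract_contents` separate from the download means the extraction logic can be tried against a saved response without hitting the API.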
[1] - https://docs.ninjastic.space/