Topic: Poll on solving the imgur issue - page 2. (Read 2167 times)

legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
June 10, 2023, 01:16:45 AM
Tell me something (you thought you already said it, but remember), how much does that 800k occupy?
It's 143 GB. It could have some cleaning, like the 30k images that only say "removed", but I hardlinked all duplicates to the same file already, so it doesn't take much diskspace.
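For anyone wanting to do the same hardlink trick on their own copy, here is a minimal sketch. It assumes byte-identical files count as duplicates and that everything sits on one filesystem (hard links can't cross filesystems). The function names are made up for illustration; this is not the script actually used for the archive.

```python
import hashlib
import os
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def hardlink_duplicates(root):
    """Replace byte-identical files under root with hard links to one copy.

    Returns the number of files that were replaced by a link.
    """
    first_seen = {}  # digest -> first path seen with that content
    replaced = 0
    # Materialize the file list first so temp files created below are not visited.
    for path in sorted(p for p in Path(root).rglob("*") if p.is_file()):
        digest = file_digest(path)
        original = first_seen.setdefault(digest, path)
        if original is path:
            continue  # first file with this content, keep it as-is
        if os.path.samefile(original, path):
            continue  # already linked to the same inode
        tmp = path.with_name(path.name + ".tmp-link")
        os.link(original, tmp)  # new directory entry pointing at the original's data
        os.replace(tmp, path)   # swap it in place of the duplicate
        replaced += 1
    return replaced
```

After this, every duplicate (such as the many identical "removed" placeholders) shares one inode, so the content is stored only once.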
legendary
Activity: 1862
Merit: 5154
**In BTC since 2013**
June 10, 2023, 01:10:27 AM
A subdomain of talkimg could be on another webhost (that keeps the DNS responsibility in one place), say the ".m" that imgur uses most of the time. A €5.99/month or double this $58.88/year budget webhost will (probably) be enough, and since the amount of storage is fixed I expect hosting costs to go further down in the future.
The only reason I'm not doing this myself is because I don't want to deal with copyright takedown requests.

And that's exactly what I'm thinking of doing. I just need to reorganize and check which is the best option for everything to work correctly.  Wink

Tell me something (you thought you already said it, but remember), how much does that 800k occupy?
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
June 10, 2023, 12:56:56 AM
Actually, thinking about this more carefully, why even do this through TalkImg at all? Maybe it makes more sense to set up some dedicated hosting to serve those images without entangling them with TalkImg? I'm not sure who wants that challenge, but working on the problem in isolation will probably lower (and mostly fix) the costs, which might make it easier to find financial support.
A subdomain of talkimg could be on another webhost (that keeps the DNS responsibility in one place), say the ".m" that imgur uses most of the time. A €5.99/month or double this $58.88/year budget webhost will (probably) be enough, and since the amount of storage is fixed I expect hosting costs to go further down in the future.
The only reason I'm not doing this myself is because I don't want to deal with copyright takedown requests.
hero member
Activity: 510
Merit: 4005
June 09, 2023, 09:51:51 PM
I know! Nor is it because of space or bandwidth. It's because of the number of files to be loaded.
But I have no problems with space or bandwidth. The limit is on the number of files I can have on the server.
As I said, I have neither space nor bandwidth concerns at this time.
If bandwidth and space are not a problem for you, and you're only trying to work around an artificial limit on the number of files that your hosting allows, then there are some ways around that...

You could store the images (just the Imgur ones) as BLOBs in a database table, or pack them into a tar-like file format. Both of those approaches will require serving those images from behind a script (either to do a database lookup, or to seek to a specific offset and then read from the amalgamated file), but having them on their own kind of URL (distinct from other TalkImg images) actually makes a lot of sense. That way, my earlier idea of modifying the image proxy to use a "substitutions table" simplifies to just mechanically translating the links on the fly (e.g. https://i.imgur.com/7wqVXzD.png -> https://talkimg.com/imgur.php?id=7wqVXzD, or something like that).
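A rough sketch of both halves of that idea: the mechanical link translation, and a BLOB table that a script could do its lookup against. The regex, the table layout and the function names are assumptions for illustration. The target URL pattern is just the "imgur.php?id=" example from above, not an existing TalkImg endpoint.

```python
import re
import sqlite3
from typing import Optional, Tuple

# Matches direct Imgur image links such as https://i.imgur.com/7wqVXzD.png
IMGUR_LINK = re.compile(
    r"https?://i\.imgur\.com/(?P<id>[A-Za-z0-9]+)\.(?:png|jpe?g|gif)",
    re.IGNORECASE,
)


def translate_imgur_links(text, base="https://talkimg.com/imgur.php?id="):
    """Rewrite every direct Imgur image link in text to the archive URL."""
    return IMGUR_LINK.sub(lambda m: base + m.group("id"), text)


def open_store(path=":memory:"):
    """Open (or create) a SQLite database holding one row per archived image."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS imgur ("
        "id TEXT PRIMARY KEY, mime TEXT NOT NULL, data BLOB NOT NULL)"
    )
    return db


def put_image(db, image_id, mime, data):
    """Store the raw image bytes under their Imgur ID."""
    db.execute(
        "INSERT OR REPLACE INTO imgur (id, mime, data) VALUES (?, ?, ?)",
        (image_id, mime, data),
    )
    db.commit()


def get_image(db, image_id) -> Optional[Tuple[str, bytes]]:
    """Return (mime, data) for an ID, or None if the image is not archived."""
    return db.execute(
        "SELECT mime, data FROM imgur WHERE id = ?", (image_id,)
    ).fetchone()
```

With this layout the whole archive is a single file on disk, so a host's limit on the number of files no longer applies. The price is that every image request goes through the lookup script.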

Actually, thinking about this more carefully, why even do this through TalkImg at all? Maybe it makes more sense to set up some dedicated hosting to serve those images without entangling them with TalkImg? I'm not sure who wants that challenge, but working on the problem in isolation will probably lower (and mostly fix) the costs, which might make it easier to find financial support.
copper member
Activity: 1666
Merit: 1901
Amazon Prime Member #7
June 09, 2023, 04:33:23 PM
If the image proxy keeps served images cached for a period of time, bandwidth probably won’t be an issue. But if the image proxy queries the storage bucket every time someone accesses the image, egress charges will quickly add up.
Caching is up to the browser, so it's a per-user thing:
The image proxy isn't a caching proxy. Proxied images are never saved to disk, and only small chunks of images are stored in memory at any one time. The proxy does work with client-side caching (it passes on appropriate cache-related headers, etc.), so you may cache images.
I suspect that is the underlying root cause of the imgur problems. If there is an image in a particular thread, the image proxy server may be asking for imgur to serve the same image multiple times over a short time if there is a conversation among several people. Replying to a post with an image may result in the same person causing the image proxy server to call the image three times, and this would not count the times the person reads a thread, but does not reply.

Imgur uses AWS (or at least did in the past), so they are paying the quoted prices for serving images to the forum image proxy. Serving only the image means imgur doesn't get any ad revenue from the forum.

Someone is going to have to pay to get images from wherever they are stored to forum users. It doesn't look like imgur is wanting to pay for this anymore.
hero member
Activity: 700
Merit: 577
June 09, 2023, 02:43:53 PM
Option number two is the best one for me: replacing the embedded image with a link. Another thing: the images I uploaded before the imgur drama are still broken. Going back and uploading those pictures again is not easy, and on top of that I have lost all those images, so there is no way for me to redo them. I tried the TryNinja image recovery method but it did not work for me, so I gave up on it for now.
legendary
Activity: 1789
Merit: 2535
Goonies never say die.
June 09, 2023, 01:28:39 PM
Caching is up to the browser, so it's a per-user thing:
The image proxy isn't a caching proxy. Proxied images are never saved to disk, and only small chunks of images are stored in memory at any one time. The proxy does work with client-side caching (it passes on appropriate cache-related headers, etc.), so you may cache images.

Also in that post:
Increasing fault-tolerance is a long-term goal, but (re)creating a truly decentralized and uncensorable forum is outside of bitcointalk.org's scope.

Now I don't really believe theymos was specifically talking about the image proxy or images here, but..
Should the preservation of posted images be part of the fault-tolerance goals of the forum?  Does the removal of an image present a fault in the context of a thread/post?

As an example, let's look at the reference thread of a flag theymos created.

Without the images, as it stands now, does the flag still hold water?? or.. at least as much as it does with images?  (I'm sure there are better examples but I'm just being dramatic Tongue)


 Grin
legendary
Activity: 2758
Merit: 6830
June 09, 2023, 12:56:12 PM
Contabo (I think TryNinja still uses this) gets you 32 TB bandwidth for €17.49 per month. And unlike AWS, that includes the rest of the server too. But it probably won't reach the same very high uptime AWS has.
I do, and they stopped scheduling random maintenance on my server (always accompanied by some downtime), so I have no complaints.
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
June 09, 2023, 12:51:03 PM
If the image proxy keeps served images cached for a period of time, bandwidth probably won’t be an issue. But if the image proxy queries the storage bucket every time someone accesses the image, egress charges will quickly add up.
Caching is up to the browser, so it's a per-user thing:
The image proxy isn't a caching proxy. Proxied images are never saved to disk, and only small chunks of images are stored in memory at any one time. The proxy does work with client-side caching (it passes on appropriate cache-related headers, etc.), so you may cache images.
legendary
Activity: 1862
Merit: 5154
**In BTC since 2013**
June 09, 2023, 12:43:10 PM


Growth has been fairly constant, at about 300 images a day.
And when someone uses the script to migrate the images, the growth increases that day.
As I mentioned, if the current pace is maintained, it will be very pleasant.
copper member
Activity: 1666
Merit: 1901
Amazon Prime Member #7
June 09, 2023, 11:58:49 AM
AWS S3 is really the most cost-effective option.
I did a quick price check:
Quote
First 50 TB / Month: $0.023 per GB
AWS is very reliable, but quite expensive on bandwidth. 10 TB doesn't sound unrealistic, and would cost $230 per month. Contabo (I think TryNinja still uses this) gets you 32 TB bandwidth for €17.49 per month. And unlike AWS, that includes the rest of the server too. But it probably won't reach the same very high uptime AWS has.


You are right, I completely ignored that they make money from bandwidth. Btw, how big can the bandwidth demand be in our case? I mean, this image hosting is dedicated to this forum and will rarely be used outside of bitalk, so not every uploaded image will use up bandwidth.
Btw what about Hetzner? It's in Europe (Germany or Finland) and is pretty cheap. I can't swear to it, but I'm fairly sure it's one of the cheapest options out there. It's not as cheap as Contabo but performs way better. What probably makes AWS more attractive for joker_josue is the pay-as-you-go payment model.

No, it is not very economical, because bandwidth is very expensive.
In this type of service, the focus is not so much on disk space, but on bandwidth.
If it's not a secret, I'll ask this question again: How is the demand on your service? Is it increasing? Or stabilizing? Or decreasing since imgur links aren't broken anymore?

If the image proxy keeps served images cached for a period of time, bandwidth probably won’t be an issue. But if the image proxy queries the storage bucket every time someone accesses the image, egress charges will quickly add up. For example, an attacker/troll could potentially add the same image to a single post 1000 times, and subsequently access the post 1000 times — this would result in the same image being accessed a million times.
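To put numbers on that worst case, here is a small simulation of origin requests with and without a cache in front of the storage. The class and the URL are hypothetical. Per the quoted description, the forum's image proxy does not cache, so the cached variant only illustrates what a caching layer would change.

```python
class CachingFetcher:
    """Wrap an origin fetch function with a simple in-memory cache."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._cache = {}
        self.origin_requests = 0  # how often the origin was actually hit

    def get(self, url):
        if url not in self._cache:
            self.origin_requests += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]


def simulate(views, embeds_per_view, get):
    """Load the same image embeds_per_view times on each of views page loads."""
    for _ in range(views):
        for _ in range(embeds_per_view):
            get("https://example.invalid/image.png")
```

Without a cache, the number of origin requests is simply views times embeds (1000 x 1000 in the example above). With the cache, the origin is hit once and every later request is served from memory.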
hero member
Activity: 882
Merit: 792
Watch Bitcoin Documentary - https://t.ly/v0Nim
June 09, 2023, 11:33:56 AM
AWS S3 is really the most cost-effective option.
I did a quick price check:
Quote
First 50 TB / Month: $0.023 per GB
AWS is very reliable, but quite expensive on bandwidth. 10 TB doesn't sound unrealistic, and would cost $230 per month. Contabo (I think TryNinja still uses this) gets you 32 TB bandwidth for €17.49 per month. And unlike AWS, that includes the rest of the server too. But it probably won't reach the same very high uptime AWS has.


You are right, I completely ignored that they make money from bandwidth. Btw, how big can the bandwidth demand be in our case? I mean, this image hosting is dedicated to this forum and will rarely be used outside of bitalk, so not every uploaded image will use up bandwidth.
Btw what about Hetzner? It's in Europe (Germany or Finland) and is pretty cheap. I can't swear to it, but I'm fairly sure it's one of the cheapest options out there. It's not as cheap as Contabo but performs way better. What probably makes AWS more attractive for joker_josue is the pay-as-you-go payment model.

No, it is not very economical, because bandwidth is very expensive.
In this type of service, the focus is not so much on disk space, but on bandwidth.
If it's not a secret, I'll ask this question again: How is the demand on your service? Is it increasing? Or stabilizing? Or decreasing since imgur links aren't broken anymore?
legendary
Activity: 1862
Merit: 5154
**In BTC since 2013**
June 09, 2023, 04:35:18 AM
That's why I always prefer dedicated server but in your case, AWS S3 is really the most cost-effective option.
~~
How is the demand on your service? Is it increasing? Or stabilizing? Or decreasing since imgur links aren't broken anymore?

No, it is not very economical, because bandwidth is very expensive.
In this type of service, the focus is not so much on disk space, but on bandwidth.



The "58 million" is the number of inodes on a 1 TB disk. Most files aren't that small, so the number of images you can fit in 1 TB will be significantly lower. So in most cases, the number of files shouldn't be a problem before you run out of diskspace.

I think I didn't explain it well. I can have 500 files with 1TB on the server, no problem. As I said, I have neither space nor bandwidth concerns at this time.

Either way, thanks for your suggestion, which I'll explore.
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
June 09, 2023, 04:00:25 AM
You have gathered 800k from a hosting, let's imagine that there are more than 800k together with the other services. Which indicates 1600k images listed from the forum. Let's round it up to 2000k images.
I count about 2.5 million image links in my data collection. Some of those are dead already, so your estimate is close enough.

AWS S3 is really the most cost-effective option.
I did a quick price check:
Quote
First 50 TB / Month: $0.023 per GB
AWS is very reliable, but quite expensive on bandwidth. 10 TB doesn't sound unrealistic, and would cost $230 per month. Contabo (I think TryNinja still uses this) gets you 32 TB bandwidth for €17.49 per month. And unlike AWS, that includes the rest of the server too. But it probably won't reach the same very high uptime AWS has.
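The arithmetic behind that $230, as a small sketch. The per-GB price is the one quoted above; 1 TB = 1000 GB and the 10 TB monthly volume are assumptions, not measurements.

```python
def monthly_egress_cost(terabytes, usd_per_gb=0.023, gb_per_tb=1000):
    """Pay-per-GB bandwidth cost in USD for one month, rounded to cents."""
    return round(terabytes * gb_per_tb * usd_per_gb, 2)


def flat_rate_per_tb(monthly_price, included_tb):
    """Price per included TB for a flat-rate plan, in the plan's own currency."""
    return round(monthly_price / included_tb, 2)


aws_usd = monthly_egress_cost(10)               # 230.0 USD for 10 TB
contabo_eur_per_tb = flat_rate_per_tb(17.49, 32)  # 0.55 EUR per included TB
```

The two results are in different currencies, so they are only comparable by order of magnitude, but the gap is large either way.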

hero member
Activity: 882
Merit: 792
Watch Bitcoin Documentary - https://t.ly/v0Nim
June 09, 2023, 02:13:11 AM
You know that hosting services talk about unlimited, but there is always some kind of limit. I preferred to have unlimited disk space and bandwidth.
I remember the boom of "unlimited" hosting ads from companies like Hostgator, iPage and Bluehost. I wanted to make a file sharing website, similar to wetransfer, and I kept receiving warnings from these hosting companies that I shouldn't upload or store large files (500 MB and higher) on my website. When I asked them why it was called unlimited, they told me it was marketed as unlimited because most businesses only upload some kilobytes, and those aren't limited. They just called it unlimited while it was actually almost as limited as free web hosting. That's why I always prefer a dedicated server, but in your case, AWS S3 is really the most cost-effective option.

Btw as far as I know, digitalocean doesn't have inodes limitation.

I see. That's tiny Wink

It's not that tiny, for the service in question. Especially at this early stage.
You have gathered 800k from a hosting, let's imagine that there are more than 800k together with the other services. Which indicates 1600k images listed from the forum. Let's round it up to 2000k images.
This is a large number, but it was gathered over 13 years, which averages out to 150k images per year. Let's round it up to 200k images per year.
That means I can maintain these conditions for the next 2-3 years. It will be time when I'm planning to do a new upgrade.

But, as I said, I'm evaluating several possibilities, we'll see what I get.  Wink
How is the demand on your service? Is it increasing? Or stabilizing? Or decreasing since imgur links aren't broken anymore?
legendary
Activity: 2212
Merit: 7064
June 08, 2023, 02:02:56 PM
It's not that tiny, for the service in question. Especially at this early stage.
You have gathered 800k from a hosting, let's imagine that there are more than 800k together with the other services. Which indicates 1600k images listed from the forum. Let's round it up to 2000k images.
It's totally unrealistic to think TalkImg can all of a sudden replace all Imgur images after just a few weeks of existence  Cheesy
In theory everything sounds great, just replace the links and everything will work perfectly, but nobody is financially helping this project and thinking about server costs.
Maybe it's better to replace all old Imgur images with Imgbb or Postimages for now, and later we can think about slowly moving them to TalkImg.
legendary
Activity: 1862
Merit: 5154
**In BTC since 2013**
June 08, 2023, 01:29:28 PM
I see. That's tiny Wink

It's not that tiny, for the service in question. Especially at this early stage.
You have gathered 800k from a hosting, let's imagine that there are more than 800k together with the other services. Which indicates 1600k images listed from the forum. Let's round it up to 2000k images.
This is a large number, but it was gathered over 13 years, which averages out to 150k images per year. Let's round it up to 200k images per year.
That means I can maintain these conditions for the next 2-3 years. It will be time when I'm planning to do a new upgrade.
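The same estimate as a quick calculation. The inputs are the rounded figures from this post plus the roughly 600k file limit mentioned earlier. The number of files already stored is left as a parameter because it isn't stated here.

```python
def images_per_year(total_images, years):
    """Average yearly growth from a total gathered over a number of years."""
    return total_images / years


def years_until_full(file_limit, files_per_year, files_already_stored=0):
    """How long the remaining file quota lasts at a constant yearly rate."""
    return (file_limit - files_already_stored) / files_per_year


average = round(images_per_year(2_000_000, 13))  # about 154k, "150k" above
runway = years_until_full(600_000, 200_000)      # 3.0 years at 200k per year
```

Any files already on the server shorten that runway, which is consistent with the "2-3 years" given above.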

But, as I said, I'm evaluating several possibilities, we'll see what I get.  Wink



I used to use shared hosting, until my last one disappeared. Since then, I've become much more comfortable with VPS servers, which in general give better specs for money, but don't come with a pre-installed selection of scripts.

I'm using one of the biggest hosting services on the market. Without any intermediary company, that is I can choose: shared server; vps; or dedicated server.
Therefore, this possibility is more remote.

But even a VPS has file limits. The service you use goes up to 58,000,000!?  Huh


legendary
Activity: 1789
Merit: 2535
Goonies never say die.
June 08, 2023, 01:14:38 PM
It sounds like we're piping a bunch of traffic through 1 IP, which might create problems from any site or image host out there, but maybe depending on the level of traffic we get up to on the forum?

Without exceptions, I'd imagine firewalls or other security devices out there potentially interpreting this activity as a weak DoS attempt from 1 IP, or just odd/malicious traffic of some sort which is taking enough resources to get blocked.

Maybe even just a curious IT guy browsing logs and seeing a shit load of unusual traffic from 1 IP and blocking it? Tongue  I've done this plenty in my world, but I obviously don't run any image hosts, so I'm not sure how they (or their providers) would interpret or handle this any differently.
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
June 08, 2023, 12:54:10 PM
Right now I can have around 600k files, a very comfortable number for the next months/years.
I see. That's tiny Wink
For comparison: a "normal" filesystem on a 1 TB disk has 58 million inodes (which means it could handle that many small files).
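To check this on your own server, a minimal sketch for Unix-like systems: os.statvfs reports the filesystem's inode totals. The estimate function assumes one inode per 16 KiB, a common ext4 default; other filesystems and custom mkfs options will differ.

```python
import os


def inode_usage(path="/"):
    """Return total, free and used inode counts for the filesystem at path."""
    st = os.statvfs(path)
    return {
        "total": st.f_files,
        "free": st.f_ffree,
        "used": st.f_files - st.f_ffree,
    }


def estimated_inodes(disk_bytes, bytes_per_inode=16384):
    """Rough inode count a freshly formatted filesystem would get (assumed ratio)."""
    return disk_bytes // bytes_per_inode


print(inode_usage("/"))
print(estimated_inodes(10**12))  # about 61 million for a 10^12-byte disk
```

That estimate is in the same ballpark as the 58 million above. A usable partition is usually a bit smaller than the nominal disk size, which would account for the difference. Some filesystems allocate inodes dynamically and may report 0 here.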

Quote
You know that hosting services talk about unlimited, but there is always some kind of limit. I preferred to have unlimited disk space and bandwidth.
I learned a long time ago to stay away from "unlimited". I prefer to know what I get.

Quote
So entering 800k files now would require a new server upgrade. And I didn't want to do that now. But, I'm checking several options that can make everything viable, without having to do a new upgrade. Therefore, what I ask now is patience to be able to support this type of action. I am studying what I can do to accommodate these images.
I used to use shared hosting, until my last one disappeared. Since then, I've become much more comfortable with VPS servers, which in general give better specs for money, but don't come with a pre-installed selection of scripts.
legendary
Activity: 1862
Merit: 5154
**In BTC since 2013**
June 08, 2023, 12:36:25 PM
If you have the space, why would the number of pictures be a problem? Just disable directory view in your webserver and it should just be able to handle it.

But I have no problems with space or bandwidth. The limit is on the number of files I can have on the server. Right now I can have around 600k files, a very comfortable number for the next months/years. You know that hosting services talk about unlimited, but there is always some kind of limit. I preferred to have unlimited disk space and bandwidth.

So entering 800k files now would require a new server upgrade. And I didn't want to do that now. But, I'm checking several options that can make everything viable, without having to do a new upgrade. Therefore, what I ask now is patience to be able to support this type of action. I am studying what I can do to accommodate these images.