Comment on How to secure Jellyfin hosted over the internet?
gagootron@feddit.org 4 days ago
I use good ol’ obscurity. My reverse proxy requires that the correct subdomain is used to access any service that I host, and my domain has a wildcard entry. So if you access asdf.example.com you get an error, the same as when accessing my IP directly, but going to jellyfin.example.com works. And since I don’t post my valid URLs anywhere, no web scraper can find them. This filters out 99% of bots, and the rest are handled using Authelia and CrowdSec.
And since I don’t post my valid URLs anywhere, no web scraper can find them
You would, ah… be surprised. My website isn’t published anywhere and I currently have 4 active decisions and over 300 alerts from CrowdSec.
Of course I get a bunch of scanners hitting ports 80 and 443. But if they don’t use the correct domain, they all end up on an Nginx server hosting a static error page. Not much they can do there.
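For anyone following along, the host-based filtering described above can be sketched roughly like this. It is a minimal Python illustration of the idea, not the poster's actual Nginx configuration; `example.com`, the `jellyfin` subdomain, and the backend address are placeholders.

```python
# Hypothetical mapping of allowed hostnames to backends. The names and the
# address are placeholders, not anyone's real setup.
KNOWN_HOSTS = {
    "jellyfin.example.com": "http://127.0.0.1:8096",
}

def route(host_header):
    """Decide what to do with a request based only on its Host header."""
    # Normalise: drop an optional port, a trailing dot, and letter case.
    host = (host_header or "").split(":")[0].rstrip(".").lower()
    backend = KNOWN_HOSTS.get(host)
    if backend is None:
        # Bare IP, unknown subdomain, or missing header: static error page.
        return ("error", None)
    return ("proxy", backend)
```

Anything that does not present an exact known hostname falls through to the error branch, which is why scanners that only know the IP address never reach a real service.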
This is how I found out Google harvests the URLs I visit through Chrome.
Got google bots trying to crawl deep links into a domain that I hadn’t published anywhere.
This is true, and is why I annoyingly have to keep a robots.txt on my unpublished domains. Google does honor them for the most part, for now.
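A quick way to sanity-check a deny-all robots.txt is Python's standard `urllib.robotparser`. This is only a sketch: the file contents and the `hidden.example.com` URL are assumptions for illustration, and robots.txt is advisory, so it only affects crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of a robots.txt that asks every crawler to stay out.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With the contents above, `is_allowed("Googlebot", "https://hidden.example.com/some/deep/link")` comes back `False` for a well-behaved crawler.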
That’s not how web scrapers work lol. There’s no such thing as obscurity, except for humans.
It seems to me that it works. I don’t get any web scrapers hitting anything but my main domain. I can’t find any of my subdomains on Google.
Please tell me how you believe that it works. Maybe I overlooked something…
My understanding is that scrapers check every domain and subdomain. You’re making it harder but not impossible. Everything gets scraped.
It would be better if you also did IP whitelisting, rate limiting to prevent bots, bot detection via Cloudflare or something similar, etc.
If you’re using jellyfin in the URL, that’s an easily guessable name. If you use random words unrelated to what’s being hosted, e.g. salmon.example.com, the chances of a hit are lower. Also, ideally your server should reply with a 200 for every (*) subdomain so scrapers can’t tell valid from invalid domains. Also also, ideally it sends some random data on each of those so they don’t all look exactly the same. But that’s approaching paranoid levels of security.
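The "reply 200 with random data" idea could look something like the following. This is a hedged sketch only: `decoy_response` is a made-up helper, and a real setup would live in the reverse proxy rather than in standalone Python.

```python
import secrets

def decoy_response():
    """Build a (status, body) pair for a hostname that matches no real service."""
    # Random filler of varying length, so no two decoy pages are identical
    # and body size alone does not give the decoy away.
    filler = secrets.token_hex(16 + secrets.randbelow(32))
    body = f"<html><body><p>{filler}</p></body></html>"
    return 200, body
```

Every unknown subdomain then answers with a 200 and a slightly different page, so a scanner cannot separate real names from fake ones by status code or by comparing bodies.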
andreluis034@bookwormstory.social 3 days ago
Are you using HTTPS? It’s highly likely that your domains/certificates are being logged in Certificate Transparency logs. Unless you’re using wildcard certificates, it’s very easy to enumerate your subdomains.
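To make the difference concrete, here is a small sketch of what an observer can read out of logged certificates. The input lists stand in for the subject alternative names of certificates found in a CT log search; the data and the `names_visible_in_ct` helper are hypothetical.

```python
def names_visible_in_ct(cert_san_lists, apex):
    """Collect the names under apex that appear in logged certificates."""
    apex = apex.lower()
    suffix = "." + apex
    seen = set()
    for sans in cert_san_lists:
        for name in sans:
            name = name.strip().lower()
            if name == apex or name.endswith(suffix):
                seen.add(name)
    return sorted(seen)

# Hypothetical log contents: one certificate per host vs. a single wildcard.
per_host_certs = [["jellyfin.example.com"], ["example.com", "www.example.com"]]
wildcard_cert = [["*.example.com", "example.com"]]
```

With per-host certificates, `jellyfin.example.com` shows up in the result by name. With the wildcard certificate, only `*.example.com` and `example.com` are visible, so the actual subdomain stays out of the log.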