Jerry
@Jerry@feddit.online
Admin/owner of this KBIN instance, https://feddit.online
I'm a liberal who also follows conservatives who use facts to form opinions & are good people. I want my beliefs challenged. I hate when people agendize everything. I play around a lot with Linux. My day job is IT for a bank.
Mastodon: https://hear-me.social
Calckey: https://bostonsocial.online
Peertube: https://my-sunshine.video
he/him/his
#StarTrek #ScienceFiction #Linux searchable
- Comment on Small or medium-sized Mastodon instances? 1 day ago:
https://hear-me.social is one possibility. It has the added benefit of a 12,000-character posting size as well, especially nice if the 500-character limit at .social was frustrating.
- Comment on Why are people using the "þ" character? 2 days ago:
I asked Gemini Pro whether the thorn character messes an LLM up. If anyone is interested, here’s the answer. The short answer is that it makes the text harder for humans to read, but LLMs are not bothered by it. So, those who are using the thorn character for “th” believing that it breaks AI are not achieving anything and are just being annoying.
Gemini Pro’s response: (Note that as I write this in PieFed, the thorn character is being converted to “th” and so you will have to convert it back in your brain.)
The belief that using the character “th” (Thorn) “messes with” LLM training has gained some traction in privacy-focused and “anti-AI” circles. Here is a breakdown of the technical reality behind that claim.
- The Theory: Disrupting Tokenization
Large Language Models don’t read words; they read tokens (numerical representations of character clusters).
Common Tokens: In standard English, the word “the” is one of the most common tokens. It is highly optimized, and the model “understands” it perfectly.
The “Thorn” Hack: If you replace “the” with “<thorn>e,” you force the model’s tokenizer to handle a rare character. Because <thorn> is uncommon in English training data, the tokenizer might break “<thorn>e” into two separate, rare tokens (the byte for <thorn> and the letter “e”). Proponents believe this makes the text “invisible” or “garbage” to the model, preventing it from learning from the content or correctly categorizing it.
- Does it actually work? (Mostly No)
While the idea is clever, it is generally ineffective against modern, high-scale LLMs for several reasons:
Multilingual Training: Most modern models are trained on massive multilingual datasets. Icelandic uses <thorn> as a standard letter. Therefore, the models already have <thorn> in their vocabulary and have seen it millions of times.
Byte-Level Tokenization: Modern tokenizers (using Byte Pair Encoding or BPE) are designed to handle any Unicode character by falling back to its raw bytes. They don’t “break” when they see an unfamiliar symbol; they just process it as a less-common sequence.
Contextual Mapping: LLMs are excellent at pattern recognition. If a model sees a sentence like “<thorn>e cat sat on <thorn>e mat,” it can easily infer from the context that “<thorn>e” serves the exact same linguistic function as “the.”
- The “Glitch Token” Phenomenon
There is a grain of truth to the “glitching” idea. *In the past,* certain rare characters or specific strings (often called “glitch tokens”) were included in a model’s vocabulary but appeared so rarely in the training data that their “embedding” (the model’s understanding of them) was never properly trained.
When a model encounters a true glitch token, it can sometimes “hallucinate” or produce nonsensical output. However, <thorn> is a standard UTF-8 character and is too common globally to act as a true glitch token for most general-purpose models.
Summary
Using <thorn> is more of a digital protest or a linguistic “speed bump” than a functional way to break an AI. It might make your text harder for a simple keyword-search bot to find, but for a modern LLM, it’s just a slightly “noisy” version of English that we can still read and understand quite easily.
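To make the byte-level point concrete, here is a minimal sketch of what a byte-fallback tokenizer has to work with. It is not a real tokenizer and says nothing about how any particular model splits text; `utf8_bytes` is a hypothetical helper, and the thorn is written as the escape `\u00fe` so it survives the “th” conversion mentioned above.

```python
# Illustration only: the raw UTF-8 bytes behind "the" and its thorn spelling.
# This is NOT a real tokenizer; it only shows that thorn is ordinary,
# well-defined UTF-8 that a byte-level fallback can always represent.

THORN = "\u00fe"  # lowercase thorn, written as an escape on purpose


def utf8_bytes(text: str) -> list[int]:
    """Return the UTF-8 byte values of text (hypothetical helper)."""
    return list(text.encode("utf-8"))


plain = utf8_bytes("the")
thorned = utf8_bytes(THORN + "e")

print(plain)    # [116, 104, 101]
print(thorned)  # [195, 190, 101]

# The byte sequence round-trips cleanly, so nothing is "broken" by it.
assert bytes(thorned).decode("utf-8") == THORN + "e"
```

The thorn spelling is just a different, less common byte sequence of the same length, which fits the “slightly noisy English” description rather than anything that would crash a model.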
- Comment on [deleted] 1 week ago:
I’d ask for a refund.
- Comment on Is Pixelfed sawing off the branch that the Fediverse is sitting on? 4 weeks ago:
I read the post and understood the problem. Then I read Dan’s reply. And he’s fixing it. In fact, he’s fixing it in the exact way the blog post says it should be fixed, “… but those filters should be manually triggered and always removable.”
Here’s Dan’s reply:
“… I hear the community: you want text-only posts. We are going to build this as an opt-in feature. If you want microblogging, turn it on. If you prefer the classic media-only experience, nothing changes.”
Why was this blog post even written then? Now it seems to be personal.
- Comment on Looking for a PeerTube instance that actually accepts new users 5 weeks ago:
my-sunshine.video is one with no restrictions and has open registration. Don’t know about others.
- Comment on My Mastodon instance refuses to federate with a specific instance, and nobody seems to know why. 1 month ago:
Yes, it claims to proxy the traffic, but then you have to connect to the fedibuzz relay server.
It sounded from the post that there was an attempt to directly connect to a Mastodon server through the relay connection, which won’t work.
- Comment on My Mastodon instance refuses to federate with a specific instance, and nobody seems to know why. 1 month ago:
A relay, as in the Mastodon relay feature? As in Administration->Relays? If so, does mastodonapp.uk have an active relay? It can’t be used to connect to just any Mastodon server. It has to connect to a special relay server.
Not sure whether this is helpful.
- Comment on Those who are hosting on bare metal: What is stopping you from using Containers or VM's? What are you self hosting? 3 months ago:
Depends on the application for me. For Mastodon, I want to allow 12K character posts, more than 4 poll question choices, and custom themes. Can’t do it with Docker containers. For Peertube and Mobilizon, I use Docker containers.
- Comment on Proton Mail Suspended Journalist Accounts at Request of Cybersecurity Agency 3 months ago:
Yes, well stated. This is why I usually skip reading people's comments. The vast majority see everything through their own agendas and just echo words they hear.
- Comment on Mississippi Age Verification Law 4 months ago:
Doesn't work that way. States agree to enforce each other's civil orders.
- Comment on Mississippi Age Verification Law 4 months ago:
On feddit.online I block both the UK and France in addition to Mississippi. However, I believe in a future upgrade, PieFed can be configured to block people from specific countries from accessing NSFW and NSFL communities (feddit.online doesn't allow NSFL communities). When that upgrade happens, I will open it again to the UK and France but keep it closed for Mississippi.
- Comment on Mississippi Age Verification Law 4 months ago:
Why is this post NSFW???
- Comment on Mississippi Age Verification Law 4 months ago:
I'm exhausted with all this. And it's not my fight. The fight belongs to the people of Mississippi. They elected their "leaders."
Until I know for sure that I am not on the hook to pay a $10K penalty for each person on my servers, I've blocked all Mississippi IP addresses from logging in and registering on my Mastodon, Piefed, and Friendica servers.
Wyoming will probably be next.
- Comment on Can no longer access my old instance (lemmings.world) because I'm from the UK. I made several communities there. Is there any way I can mod them again or do I move them to this instance? 5 months ago:
They can notify the hosting company that the server is violating UK law, the registrars, and payment services. This is the fear for sites not hosted in the UK. There are inter-country agreements to support civil actions.
- Comment on Can no longer access my old instance (lemmings.world) because I'm from the UK. I made several communities there. Is there any way I can mod them again or do I move them to this instance? 5 months ago:
Just mentioning that Mozilla VPN uses Mullvad, and with their Firefox extension you can exclude individual websites from VPN protection or set preferred server locations for specific sites. So you can stay on a UK server for UK banking sites but switch to a different country server for a social site.
Only works on Windows for now. But maybe useful given this situation.
- Comment on Just received an email from feddit.online saying they've geoblocked UK IPs due to the Online Safety Act 5 months ago:
Yes, the U.S. and the U.K. have cooperation agreements for civil actions.
- Comment on Just received an email from feddit.online saying they've geoblocked UK IPs due to the Online Safety Act 5 months ago:
A public enforcement action by Ofcom could make it difficult because payment processors can refuse to work with the site owner, domain registrars could be pressured to suspend the domain, and hosting providers might refuse to provide services.
Who needs this drama?
- Comment on Just received an email from feddit.online saying they've geoblocked UK IPs due to the Online Safety Act 5 months ago:
Piefed.social isn't as affected because they restrict the NSFW communities. Feddit.online doesn't have the restriction, so it's more exposed.
The fear is a complaint being made to Digital Ocean that a server they host is violating UK law. It would be much easier for DO to remove the server than to take any other action.
- Comment on Just received an email from feddit.online saying they've geoblocked UK IPs due to the Online Safety Act 5 months ago:
The Mozilla VPN with their Firefox extension (not yet on Linux), for example, lets you change the VPN server's country based on the domain you connect to and even bypass the VPN for certain domains. So, I believe it can be configured to select a U.S. VPN server, for example, when visiting a U.S. social site, but stay on the native connection when accessing BBC services. It uses Mullvad as the provider, actually, which is high quality. They can't be the only one.
The Internet always seems to find ways to bypass blocks.
- Comment on PieFed.World is now open 5 months ago:
Likely, then, that lemmy.world has the same restriction.
- Comment on PieFed.World is now open 5 months ago:
I think it depends on the Piefed instance; it's up to the Admin.
It should work on, for example, feddit.online. If not, something is broken.
- Comment on PieFed.World is now open 5 months ago:
My understanding is that your one PR broke logins, and it took maybe 5 days for someone else to fix the code? Maybe your changes that were rejected just weren't so good after all, or weren't trusted?
- Comment on PieFed.World is now open 5 months ago:
This is all sour grapes.
I've read your interaction with him, and, frankly, if I were moderating a community where you incessantly carried on over insignificant details, continuing to question things after you got your answer (sea lioning), insisting on focusing on nothing, and never ever stopping, I'd block you too, and I've only blocked 2 people in my entire life as a mod.
Now you're in here trying to malign him, for revenge, for shutting you down so he could focus on important work instead of debating you over never-ending trivial topics.
He is the opposite of the image you are trying to give him.
- Comment on PieFed.World is now open 5 months ago:
This sounds like a Piefed.world announcement that a Lemmy developer would have written.
- Comment on Friendica's marketing is terrible. 6 months ago:
Just curious. Have you had trouble loading my-place.social?
- Comment on Friendica's marketing is terrible. 6 months ago:
They never explained well how to use Friendica, so it's all guessing. But, I believe, to get a Facebook-like experience, you mark people as "Friends" who you want in a Facebook-like environment. This maps to "Friends" on Facebook. Then you click on the "Friends" circle, and you only see posts and conversations from your friends.
You can also set up groups that federate to other instances, and you can control access to the groups. I've never used it, so I don't completely understand how to do this.
But, I think these are the 2 closest Friendica features for Facebook emulation.
https://wiki.friendi.ca/docs/groups-and-privacy#groups_and_privacy
- Comment on Friendica's marketing is terrible. 6 months ago:
The stalls are because the database queries are suboptimal. There is one that occasionally runs that, on my instance (I have 337 active users), can sometimes run for 15 minutes and will lock tables. Everything stalls and backs up.
This query was discussed, and I believe in the next release (but unsure) it will be replaced. Instead of using a ton of "not in" clauses, it does a left join now. In testing, someone mentioned it went from multi minutes to multi seconds to run. But there are a lot more such queries.
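For anyone curious what that kind of rewrite looks like, here is a minimal sketch of the general pattern, not the actual Friendica query. The `post` and `blocked` tables are hypothetical, and SQLite (via Python's built-in `sqlite3`) stands in for MariaDB/MySQL, so it only shows that the two forms return the same rows and says nothing about speed.

```python
import sqlite3

# Hypothetical schema for illustration; not Friendica's real tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE post (id INTEGER PRIMARY KEY, author_id INTEGER NOT NULL);
CREATE TABLE blocked (author_id INTEGER NOT NULL);
INSERT INTO post (id, author_id) VALUES (1, 10), (2, 20), (3, 30);
INSERT INTO blocked (author_id) VALUES (20);
""")

# Form 1: exclude rows with a NOT IN subquery.
not_in = conn.execute("""
SELECT id FROM post
WHERE author_id NOT IN (SELECT author_id FROM blocked)
ORDER BY id
""").fetchall()

# Form 2: the same exclusion as a LEFT JOIN that keeps only unmatched rows.
left_join = conn.execute("""
SELECT p.id FROM post AS p
LEFT JOIN blocked AS b ON b.author_id = p.author_id
WHERE b.author_id IS NULL
ORDER BY p.id
""").fetchall()

print(not_in)     # [(1,), (3,)]
print(left_join)  # [(1,), (3,)]
conn.close()
```

One caveat with this pattern: the two forms are only equivalent when the excluded column cannot be NULL, which is why the sketch declares `author_id` as `NOT NULL`.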
I think what Friendica needs desperately is a MariaDB/MySql expert to clean up the queries.
Because Friendica supports groups, you can connect to Lemmy communities. This is what kills Friendica. It just cannot handle the hundreds of thousands of daily connections that come in just from lemmy.world alone. Basically, it then becomes a Lemmy/Piefed/MBIN instance plus a Mastodon instance. The database grows by leaps and bounds, queues back up, and it stalls. CPU pegs without relief.
On mine, I finally had to block the Lemmy User Agent at the Cloudflare firewall. I calculated I would have had to spend another $500/month to allow the server to handle the Lemmy traffic comfortably, excluding the continuing cost for DB space. So far, I haven't blocked Piefed and MBIN, but this could change.
Friendica groups were designed for small private groups or specialized groups. Not public forums. I don't think they ever anticipated someone connecting to Lemmy.world communities and that such groups would become so active. I've told people on my instance that if they want to connect to these groups, they should do so using Piefed/Lemmy/Mbin, not Friendica.
- Comment on Friendica's marketing is terrible. 6 months ago:
I haven't looked at the code, but it's standard PHP so probably not terribly hard for PHP developers to get involved.
The developers still work on it, are active in the support groups, and answer questions, but they just can't put in the time they used to. I'm sure they'd be happy if someone wanted to get involved. But, I don't think it would need to be taken over, just helped.
- Comment on Friendica's marketing is terrible. 6 months ago:
It's an acquired taste. Now that I've been using it for months, I prefer it. I like that I have my Bluesky and Tumblr posts completely integrated into my timeline. I can reply to Bluesky posts. When I post something, it automatically gets posted to Bluesky, and Tumblr, if I like. I never go onto Bluesky any longer. Likes and replies from Bluesky are right in with my Mastodon posts and RSS feed posts too. Yes, RSS integrates in as well (obviously you can't reply to them though).
On Mastodon, if someone posts something interesting and I want to see replies and discussions, I can't, unless I remember to go back to the post and look. In Friendica I can click that I want to follow the thread, and it will notify me of updates; when I click the notification, it takes me right to the new comment. I love this! If I interact with a post (like it), same thing. It will track it for me. And it does a better job of pulling in replies and responses from all over.
Yeah. Not all good.
I run one of the Friendica servers and it's a problem child. The database grows rapidly and struggles. The database queries urgently need work. Some are super slow. It stalls a lot. The UI is confusing. The developers are not all that active any longer, but still active. The UI is, well, dated.
More info about it here: https://news.elenarossini.com/the-future-of-social-is-here-a-show-and-tell-part-3-friendica/