brucethemoose
@brucethemoose@lemmy.world
- Comment on How are you doing fellow kids 1 day ago:
Deported
- Comment on How are you doing fellow kids 1 day ago:
We got some scary tools now though. We’ve never been able to fuck up the planet at this scale.
- Comment on How are you doing fellow kids 1 day ago:
Look at the metrics. 4.3M views, boosted hate replies. Blue check.
Twitter and propaganda like it is why this is happening.
- Comment on Pop it in your calendars 1 day ago:
Coffee Stain’s another good example on the bigger end.
It does seem like there’s a danger zone beyond a certain size threshold. It makes me worry for Warhorse (the KCD2 dev), which plans to expand beyond 250.
- Comment on Grok AI to be available in Tesla vehicles next week, Elon Musk says 1 day ago:
Funny thing is, Teslas already have something more sophisticated. They could pipe FSD’s diagnostics to some kind of HUD as an ‘overlay’ for the driver, run on the car’s own hardware. You’d think Tesla execs would at least know about that, since it’s literally their business and predates the LLM craze.
…But no.
- Comment on Grok AI to be available in Tesla vehicles next week, Elon Musk says 2 days ago:
This is so stupid.
To me, “AI” in a car would be like highlighting pedestrians in a HUD, or alerting you if an unknown person messes with the car, or maybe adjusting mood lighting based on context. Or safety features.
…Not a chatbot.
I’m more “pro” (locally hostable) AI than like 99% of Lemmy, but I find the corporate obsession with instruct textbots bizarre. It would be like every food corp living and breathing succulents. Cacti are neat, but they don’t need to be strapped to every chip bag, every takeout, every pack of forks.
- Comment on The Steam controller was ahead of its time 2 days ago:
Not everyone’s a big kb/mouse fan. My sister refuses to use one on the HTPC.
Hence I think couch usage was its not-insignificant niche. Portable keyboards are really awkward and clunky on laps, and if it’s a trackpad anyway, the Steam Controller is way better.
Personally I think it was a smart business decision, because of this:
It doesn’t have 2 joysticks so I just buy an Xbox one instead.
No one’s going to buy a Steam-branded Xbox-style controller, but making it different gives people a reason to. And I think what killed it is that it wasn’t plug-and-play enough, e.g. it didn’t work out of the box with many games.
- Comment on The Steam controller was ahead of its time 2 days ago:
With respect, this doesn’t make any sense. If you want a joystick controller, just buy an Xbox controller that everything’s compatible with anyway?
The trackpads shine when one needs to emulate a mouse/kb, a nightmare with joysticks.
- Comment on The Steam controller was ahead of its time 2 days ago:
My sister still has a working one that she treats like a religious artifact, as it’s the best way to play mouse/KB games from the sofa.
I see why they discontinued them, though. They need custom configs for most games, and I think most people don’t like that much tweaking.
- Comment on Grok praises Hitler, gives credit to Musk for removing “woke filters” 3 days ago:
A lot, but less than you’d think! Basically an RTX 3090/Threadripper system with a lot of RAM (192GB?)
With this framework, specifically: github.com/ikawrakow/ik_llama.cpp?tab=readme-ov-f…
The “dense” part of the model can stay on the GPU while the experts can be offloaded to the CPU, and the whole thing can be quantized to ~3 bits, instead of 8 bits like the full model.
That’s just for personal use, though. The intended way to run it is on a couple of H100 boxes, and to serve it to many, many, many users at once. LLMs run more efficiently when they serve in parallel. Eg generating tokens for 4 users isn’t much slower than generating them for 2.
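For the curious, the GPU/CPU split described above looks roughly like this as an ik_llama.cpp server launch. This is a sketch only: the model filename, quant, context size, and tensor-override regex are illustrative, not exact.

```shell
# -ngl 99 offloads everything to the GPU by default; the -ot override then
# pins the MoE expert tensors to CPU/system RAM, so only the dense weights
# (attention and shared layers) have to fit in the 3090's 24GB of VRAM.
./llama-server \
  -m DeepSeek-R1-IQ3_XXS.gguf \
  -ngl 99 \
  -ot "ffn_.*_exps.*=CPU" \
  -c 8192
```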
- Comment on Grok praises Hitler, gives credit to Musk for removing “woke filters” 3 days ago:
DeepSeek, now that is a filtered LLM.
The web version has a strict filter that cuts it off. Not sure about API access, but raw Deepseek 671B is actually pretty open. Especially with the right prompting.
There are also finetunes that specifically remove China-specific refusals:
huggingface.co/microsoft/MAI-DS-R1
huggingface.co/perplexity-ai/r1-1776
Note that Microsoft actually added safety training to “improve its risk profile”.
Grok losing the guardrails means it will be distilled internet speech deprived of decency and empathy.
Instruct LLMs aren’t trained on raw data.
It wouldn’t be talking like this if it was just trained on randomized, augmented conversations, or even mostly Twitter data. They cherry-picked ‘anti-woke’ data to do this real quick, and the result effectively drove the model crazy. It has all the signatures of a bad finetune: specific, overused phrases.
- Comment on Grok praises Hitler, gives credit to Musk for removing “woke filters” 3 days ago:
Nitpick: it was never ‘filtered’.
LLMs can be trained to refuse excessively (which is kinda stupid and is objectively proven to make them dumber), but the correct term is ‘biased’. If it were filtered, it would literally give empty responses for anything deemed harmful, or at least noticeably take some time to retry.
They trained it to praise Hitler, intentionally. They didn’t remove any guardrails.
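To make the filter-vs-bias distinction concrete, here’s a toy sketch (purely illustrative, not any vendor’s actual moderation code). A filter is a separate layer that inspects the model’s output and blanks it, which produces the telltale empty replies; bias is baked into the weights, so no post-hoc check ever runs.

```python
BLOCKLIST = {"harmful_topic"}  # hypothetical moderation list

def output_filter(model_response: str) -> str:
    """A post-hoc output filter: returns an empty response if flagged."""
    if any(term in model_response.lower() for term in BLOCKLIST):
        return ""  # the telltale empty reply a real filter produces
    return model_response

print(output_filter("Here is some harmful_topic content"))  # -> ""
print(output_filter("A normal answer"))                     # -> "A normal answer"
```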
- Comment on Tesla loses $68 billion in value after Elon Musk says he is launching a political party 4 days ago:
It coincides with a bunch of other stuff, like hints at Tesla regulation, tariffs, and the regular wild swings that come with being Tesla.
CNBC is basically the Fox News of the finance world: big, sensationalist, and catering to day-trader hype (not longer-term buy-and-hold, you know, what the stock market is supposed to be).
- Comment on Microsoft has never been good at running game studios, which is a problem when it owns them all 5 days ago:
Also a crime. Not just a great game in their niche, but a long history of them.
- Comment on Microsoft has never been good at running game studios, which is a problem when it owns them all 5 days ago:
Never underestimate Phil Spencer.
- Comment on ICEBlock climbs to the top of the App Store charts after officials slam it 1 week ago:
…iOS forces users onto Apple services, including getting apps through Apple…
Can’t speak to the rest of the claims, but Android practically does too. If users have to sideload an app, you’ve lost 99% of them, if not more.
It makes me think they’re not talking about the stock systems OEMs ship.
Relevant XKCD: xkcd.com/2501/
- Comment on Mullvad's ads are good 1 week ago:
Nah, I meant the opposite. Journalistic integrity was learned through long, hard history.
Now that traditional journalism is dying, it’s like the streamer generation has to learn it from scratch, heh.
- Comment on Mullvad's ads are good 1 week ago:
It’s kinda like influencers (and their younger viewers) are relearning the history of journalism from scratch, heh.
- Comment on Mullvad's ads are good 1 week ago:
Suppressing sponsors is a perverse incentive too; all the more reason not to disclose who’s paying the creator.
- Comment on [deleted] 1 week ago:
One thing about Anthropic/OpenAI models is they go off the rails with lots of conversation turns or long contexts. Like when they need to remember a lot of vending machine conversation I guess.
A more objective look: arxiv.org/abs/2505.06120v1
Gemini is much better. TBH the only models I’ve seen that are half decent at this are:
- “Alternate attention” models like Gemini, Jamba Large or Falcon H1, depending on the iteration. Some recent versions of Gemini kinda lose this, then get it back.
- Models finetuned specifically for this, like roleplay models or the Samantha model trained on therapy-style chat.
But most models are overtuned for oneshots like “fix this table” or “write me a function”, and don’t invest much in long-context performance because it’s not very flashy.
- Comment on Recommendations for External GPU Docks for Home Lab Use - Lemmy 1 week ago:
What @mierdabird@lemmy.dbzer0.com said, but the adapters aren’t cheap. You’re going to end up spending more than the 1060 is worth.
A used desktop to slap it in, that you turn on as needed, might make sense? Doubly so if you can find one with an RTX 3060, which would open up 32B models with TabbyAPI instead of ollama.
- Comment on Men are opening up about mental health to AI instead of humans 1 week ago:
ChatGPT (last time I tried it) is extremely sycophantic though. Its high default sampling also leads to totally unexpected/random turns.
Google Gemini is now too. They log and use your dark thoughts.
I find that less sycophantic LLMs are way more helpful. Hence I bounce between Nemotron 49B and a few 24B-32B finetunes (or task vectors for Gemma) and find them way more helpful.
…I guess what I’m saying is people should turn towards more specialized free tools, not something generic like ChatGPT.
- Comment on Men are opening up about mental health to AI instead of humans 1 week ago:
TBH this is a huge factor.
I don’t use ChatGPT much less use it like it’s a person, but I’m socially isolated at the moment. So I bounce dark internal thoughts off of locally run LLMs.
It’s kinda like looking into a mirror. As long as I know I’m talking to a tool, it’s helpful, sometimes insightful. It’s private. And I sure as shit can’t afford to pay a therapist out the wazoo for that.
That was one of my previous problems with therapy: payment tied to someone toxic, at preset times (not when I need it). Many sessions feel like they end when I’m barely scratching the surface. Yes, therapy is great in general, but still.
- Comment on I've just created c/Ollama! 2 weeks ago:
You can still use the IGP, which might be faster in some cases.
- Comment on I've just created c/Ollama! 2 weeks ago:
Oh actually that’s a good card for LLM serving!
Use the llama.cpp server from source, it has better support for Pascal cards than anything else:
github.com/ggml-org/llama.cpp/…/multimodal.md
Gemma 3 is a hair too big (like 17-18GB), so I’d start with InternVL 14B Q5K XL: huggingface.co/…/InternVL3-14B-Instruct-GGUF
Or Mistral Small 3.2 24B IQ4_XS for more ‘text’ intelligence than vision: huggingface.co/…/Mistral-Small-3.2-24B-Instruct-2…
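If it helps, the build-and-serve steps look roughly like this. The repo is the one linked above; the model filename is a placeholder for whichever GGUF you actually download, so treat this as a sketch rather than copy-paste instructions.

```shell
# Build llama.cpp from source with CUDA (which still supports Pascal cards),
# then serve a GGUF quant that fits in the card's VRAM.
git clone https://github.com/ggml-org/llama.cpp
cmake -S llama.cpp -B build -DGGML_CUDA=ON
cmake --build build --config Release -j
# -ngl 99 pushes all layers onto the GPU; llama-server exposes an
# OpenAI-compatible API on the given port.
./build/bin/llama-server -m InternVL3-14B-Instruct-Q5_K_XL.gguf -ngl 99 --port 8080
```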
- Comment on I've just created c/Ollama! 2 weeks ago:
1650
You mean the GPU? Yeah, it’s good. I was strictly talking about purchasing a laptop for LLM usage, as most are less than ideal for the money.
- Comment on I've just created c/Ollama! 2 weeks ago:
Yeah, just paying for LLM APIs is dirt cheap, and they (supposedly) don’t scrape data. Again, I’d recommend OpenRouter and Cerebras! And you get your pick of models to try from them.
Even a Framework 16 is not great for LLMs TBH. The Framework Desktop is (as it uses a special AMD chip), but it’s very expensive. Honestly the whole hardware market is so screwed up, hence most ‘local LLM enthusiasts’ buy a used RTX 3090 and stick it in a desktop or server, heh.
- Comment on I've just created c/Ollama! 2 weeks ago:
I was a bit mistaken, these are the models you should consider:
huggingface.co/mlx-community/Qwen3-4B-4bit-DWQ
huggingface.co/AnteriorAI/…/main
huggingface.co/unsloth/Jan-nano-GGUF (specifically the UD-Q4 or UD-Q5 file)
These are state-of-the-art, as far as I know.
- Comment on I've just created c/Ollama! 2 weeks ago:
8GB?
You might be able to run Qwen3 4B: huggingface.co/mlx-community/…/main
But honestly you don’t have enough RAM to spare, and even a small model might bog things down. I’d run Open Web UI or LM Studio with a free LLM API, like Gemini Flash, or pay a few bucks for something off openrouter. Or maybe Cerebras API.
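As a rough sketch of what “using an LLM API” means under the hood: all the options mentioned (OpenRouter, Cerebras, Gemini’s OpenAI-compatible endpoint) accept the same chat-completions request shape, which is why a UI like Open Web UI or LM Studio can point at any of them. The model name below is just an example.

```python
import json

def chat_request(model: str, user_message: str) -> dict:
    """Build the JSON body for a standard /v1/chat/completions request.
    The same shape works against any OpenAI-compatible provider."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }

body = chat_request("google/gemini-flash-1.5", "Summarize this note for me.")
print(json.dumps(body, indent=2))
```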
- Comment on I've just created c/Ollama! 2 weeks ago:
Actually, to go ahead and answer: the “easiest” path would be LM Studio (which supports MLX quants natively and isn’t time-intensive to install) and a DWQ quantization (a newer, higher-quality variant of MLX quants).
Probably one of these models, depending on how much RAM you have:
huggingface.co/…/Magistral-Small-2506-4bit-DWQ
huggingface.co/…/Qwen3-30B-A3B-4bit-DWQ-0508
huggingface.co/…/GLM-4-32B-0414-4bit-DWQ
With a bit more time invested, you could try setting up Open Web UI as an alternative interface (which has its own built-in web search, like Gemini): openwebui.com
And then use LM Studio (or some other MLX backend, or even free online API models) as the ‘engine’.