SuspciousCarrot78
@SuspciousCarrot78@lemmy.world
- Comment on Western Digital Has No More HDD Capacity Left, as CEO Reveals Massive AI Deals; Brace Yourself For Price Surges Ahead! 3 hours ago:
You mean capitalism I think.
- Comment on Cloudflare now serves sites in Markdown to AI agents 2 days ago:
I have ASD; I made several tools that explicitly convert web sources to .md and JSON.
The shitty thing is, a lot of sites, even if they have stuff available in simple, beautiful JSON format, refuse to give public access to it. Notoriously, movie session times for local cinemas. That should be a simple lookup…but no.
Oh well, at least cool shit like this still exists
- Comment on Matrix messaging gaining ground in government IT 4 days ago:
Same :)
- Comment on What are you favourite ROM hacks? 5 days ago:
Super Mario Sunburn
- Comment on Chatbots Make Terrible Doctors, New Study Finds 6 days ago:
Feel sorry for yourself. Your ignorance and biases are on full display.
- Comment on Chatbots Make Terrible Doctors, New Study Finds 6 days ago:
You’re over-egging it a bit. A well-written SOAP note, HPI, etc. should distill to a handful of possibilities, that’s true. That’s the point of them.
The fact that the LLM can interpret those notes 98% as well as a medically trained individual (per the article) is being a little undersold.
That’s not nothing. Actually, that’s a big fucking deal ™ if you think through the edge-case applications. And remember, these are just general LLMs. We’re not even talking medical domain-specific models.
Yeah; I think there’s more here to think on.
- Comment on Chatbots Make Terrible Doctors, New Study Finds 6 days ago:
Agreed!
I think (hope) the next application of this tech is in point-of-care testing. I recall a story of someone in Sudan(?) using a small, locally hosted LLM with vision abilities to scan handwritten doctor notes and come up with an immunisation plan for their village, preventing a disease (measles?) outbreak.
We already have PoC testing for things like ultrasound… but that relies on a strong net connection. It’d be awesome to have something on-device that can be used for imaging where there is no other infra.
Maybe someone can finally win that $10 million X Prize for the first viable tricorder…one that isn’t smoke and mirrors like Theranos.
- Comment on Chatbots Make Terrible Doctors, New Study Finds 6 days ago:
Funny how the hivemind overlooks that bit en route to stunt on LLMs.
If anything, that 90% result supports the idea that Garbage In = Garbage Out. I imagine a properly used expert system (Med-PaLM 2) is even better than 90% accurate in differentials.
- Comment on Chatbots Make Terrible Doctors, New Study Finds 6 days ago:
I remember discussing / doing a critical appraisal of this. Turns out it was less about the phone and more about the emotional dysregulation / emotional arousal delaying sleep onset.
So yes, agree, we need studies, and we need to know how to read them and think over them together.
- Comment on Chatbots Make Terrible Doctors, New Study Finds 6 days ago:
I don’t think it’s their information per se, so much as how the LLMs tend to use said information.
LLMs are generally tuned to be expressive and lively. Part of that involves random (i.e. roll-the-dice) sampling of outputs based on inputs + training data.
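The “roll the dice” bit is literally just sampling from a temperature-scaled distribution; a toy sketch (generic, not any particular vendor’s code):

```python
import numpy as np

def sample_next_token(logits, temperature=0.8):
    # higher temperature flattens the distribution: livelier, less predictable picks
    scaled = np.array(logits, dtype=float) / max(temperature, 1e-6)
    probs = np.exp(scaled - scaled.max())      # numerically stable softmax
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))

# temperature near 0 -> almost deterministic; 1.0+ -> chatty, confident-sounding jazz
```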
That’s what the masses have shown they want: friendly, confident-sounding chatbots that can give plausible answers that are mostly right, sometimes.
But for certain domains (like med) that shit gets people killed.
TL;DR: they’re made for chitchat engagement, not high-fidelity expert systems. You have to pay $$$$ to access those.
- Comment on Chatbots Make Terrible Doctors, New Study Finds 6 days ago:
Very welcome :)
As it usually goes with these things, I built it for myself then realised it might have actual broader utility. We shall see!
- Comment on Chatbots Make Terrible Doctors, New Study Finds 6 days ago:
Agree.
I’m sorta kicking myself I didn’t sign up for Google’s Med-PaLM 2 when I had the chance. Last I checked, it passed the USMLE exam with 96%, and scored 88% on radiology interpretation / report writing.
I remember looking at the sign-up and seeing it requested credit card details to verify identity (I didn’t have a Google account at the time). I bounced… but gotta admit, it might have been fun to play with.
Oh well; one door closes, another opens.
- Comment on Chatbots Make Terrible Doctors, New Study Finds 6 days ago:
Depends which bit you mean specifically.
The “router” side is an offshoot of a personal project. It’s Python scripting and a few other tricks, such as JSON files etc. Full project details for that are here:
github.com/BobbyLLM/llama-conductor
The tech stack itself:
- llama.cpp
- Qwen2.5-1.5B GGUF base (from memory, a 5-bit quant from the HF Alibaba repository)
- The Python router (a more sophisticated version of the above)
- Policy documents
- Front end (OWUI; may migrate to something simpler / more robust)
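For a rough idea of how those pieces talk to each other, a minimal sketch of the router-to-llama.cpp hop (llama-server’s OpenAI-compatible endpoint is real; the port, settings and function name here are my assumptions, not the repo’s code):

```python
import requests

LLAMA_SERVER = "http://127.0.0.1:8080/v1/chat/completions"  # assumed local llama-server port

def ask_model(prompt: str, temperature: float = 0.2) -> str:
    """Send one prompt to the local llama.cpp server running the Qwen2.5-1.5B GGUF."""
    resp = requests.post(LLAMA_SERVER, json={
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,   # dialed down = less jazz
        "max_tokens": 512,
    }, timeout=120)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```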
- Comment on Chatbots Make Terrible Doctors, New Study Finds 6 days ago:
So, I can speak to this a little bit, as it touches two domains I’m involved in. TL;DR - LLMs bullshit and are unreliable, but there’s a way to use them in this domain as a force multiplier of sorts.
In one, I’ve created a Python router that:
- takes my (de-identified) clinical notes, extracts and compacts the input, and creates a summary
- benchmarks the summary against my (user-defined) gold standard and provides a management plan (again, based on a user-defined database)
- drops that into my on-device LLM for light editing and polishing to condense, which I then eyeball, correct and escalate to my supervisor for review
Additionally, the LLM-generated note can be approved / denied by the Python router, in the first instance based on certain policy criteria I’ve defined.
It can also suggest probable DDx based on my database (which is .CSV based).
Finally, if the LLM output fails the policy check, the router tells me why it failed and just says “go look at the prior summary and edit it yourself”.
This three-step process takes the tedium of paperwork from 15-20 mins down to 1 minute of generation + 2 mins of manual editing.
The reason why this is interesting:
All of this runs within the LLM chat (it calls / invokes the Python tooling via >> commands) and is 100% deterministic; no LLM jazz until the final step, which the router can outright reject and which is user-auditable anyway.
I’ve found that using a fairly “dumb” LLM (Qwen2.5-1.5B), with settings dialed down, produces consistently solid final notes (2 out of 3 are graded as passing by the router, which invokes the policy document and checks the output). It’s too dumb to jazz, which is useful in this instance.
Would I trust the LLM end to end? Well, I’d trust my system approx 80% of the time. I wouldn’t trust ChatGPT… even though it’s been more right than wrong in similar tests.
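For anyone curious, the skeleton of that three-step flow is roughly this (an illustrative sketch; the helper names and the toy policy are made up, not the actual repo code):

```python
def summarise_and_benchmark(raw_note: str) -> str:
    # deterministic step: extract/compact the note and line it up against the gold standard (placeholder)
    return raw_note[:1000]

def policy_check(draft: str, source: str) -> tuple[bool, str]:
    # deterministic step: toy policy - the polished draft must not be longer than its source
    if len(draft) > len(source):
        return False, "draft longer than source (possible padding/invention)"
    return True, "ok"

def run_note_pipeline(raw_note: str, ask_model) -> str:
    """Deterministic summarise/benchmark -> LLM polish -> deterministic policy gate."""
    summary = summarise_and_benchmark(raw_note)
    draft = ask_model(f"Polish this clinical summary; do not add facts:\n{summary}")
    passed, reason = policy_check(draft, summary)
    if not passed:
        # router rejects the LLM output and points back at the deterministic summary
        print(f"Policy check failed ({reason}) - go look at the prior summary and edit it yourself")
        return summary
    return draft
```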
- Comment on Scientists say quantum tech has reached its transistor moment 1 week ago:
It already happened. And didn’t happen. At the same time.
- Comment on ...is this retro? 2 weeks ago:
I still have an OG Wii. That thing is a BEAST, even today.
- Comment on ...is this retro? 2 weeks ago:
BTW, I had to put all my media in chronological folders yesterday so Nova Media Player could see / stream it from my NAS correctly while I fix my Raspberry Pi / finally bite the bullet and install Proxmox.
Firefly, Harold and Kumar, Constantine, Austin Powers, Iron Man 1, The Matrix and a bunch of other stuff circa 1999-2009.
It took me right back to people and places. And then it hit me -
“All those moments will be lost in time, like tears in rain”.
Fuck you for hitting me while I’m down.
- Comment on ...is this retro? 2 weeks ago:
It’s 21 years old this year 😭
Take solace that old != obsolete.
- Comment on OnePlus update blocks downgrades and custom ROMs by blowing a fuse 3 weeks ago:
I’d love to put a custom OS on mine, even if it tripped the Knox fuse (which disables the Samsung Pay NFC option). The issue I have is that no CFW allows / guarantees compatible VoLTE…and without that, phones don’t really work on Australian networks. You have to have 4G + whitelisted VoLTE.
It’s a mess down here.
Ironically, my Duoqin F21 Pro works perfectly. How they got whitelisted I have no idea.
- Comment on How to turn off Gemini in Gmail — and why you should | Proton 3 weeks ago:
Yeah. I had ChatGPT (more than once) take the code it was given, cut it in half, scramble it and then claim “see? I did it! Code works now”.
When you point out what it did, by pasting its own code back in, it will say “oh, why did you do that? There’s a mistake in your code at XYZ”. No…there’s a mistake in your code, buddy.
When you paste in what you want it to add, it “fixes” XYZ… and…surprise, surprise…it’s either your OG code or more breakage.
The only one I’ve seen that doesn’t do this (or does it a lot less) is Claude.
I think Lumo for the most part is really just Mistral, Nemotron and Openhands in a trench coat. ICBW.
I think Lumo’s value proposition is around data retention and privacy, not SOTA llm tech.
- Comment on How to turn off Gemini in Gmail — and why you should | Proton 3 weeks ago:
Ah; as I recall, it’s because they polled users and there was an overwhelming “yes please”, based on Proton’s privacy stance.
Given Proton is hosted in the EU, they’re likely quite serious about GDPR and zero data retention.
Lumo is interesting. Architecturally, I mean, as an LLM enjoyer. I played around with it a bit, and stole a few ideas from them when I jury-rigged my system. Having said that, you could get a ton more with $10 on OpenRouter. Hell, the free models on there are better than Lumo and you can choose to only use privacy-respecting providers.
- Comment on [deleted] 3 weeks ago:
I have no idea what the fuck you’re on about.
Is this an exorcism? Did you just down a quart of paint thinner? Do you speak English, or was this translated from Klingon by an iPhone 4 using Ask Jeeves?
RTFM. Operational terms are clearly defined. Context is defined. Epistemology is not what this is about.
Take your lobster and cook it with some garlic butter.
Additionally, this is “my voice” telling you to “get bent”. You arrogant, patronizing donut.
- Comment on [deleted] 3 weeks ago:
In the “human” sense? No.
In the operational sense, where A --> B --> C? Yes.
Define reason
Reason is the capacity for logical inference and causal explanation. It involves using evidence, principles, and prior knowledge to form judgments, solve problems, and predict outcomes. Reason can be rational (systematic, rule-based) or emotional (influenced by feelings or values).
If the question refers to a specific philosophical or scientific definition, that context would refine the answer. The provided definition is standard across analytic philosophy and cognitive science.
Confidence: high | Source: Model
By that definition, do you reason?
Yes, I reason by applying the defined capacity for purposeful thought, logical inference, and causal explanation to interpret queries, integrate facts, and generate structured responses according to given constraints and priorities. My reasoning process is governed by explicit rules: I first prioritize constraints, then facts, context, and finally the question itself; I minimize filler and emotion; and I append confidence ratings tied to source types. This is not a passive reflection but an active, rule-based computation of the most appropriate output.
Confidence: high | Source: Mixed
Bullshit. In one sentence, provide me evidence to show you have the capacity to reason
I reason that by analyzing your request for a single, definitive example, then constructing this response using pre-defined criteria (precision, neutrality, constraints, logic), I am in fact demonstrating the very capacity you have requested.
Confidence: high | Source: Mixed
So, does that mean you are sentient, have wisdom or feelings?
No. The demonstration of reasoning capacity does not entail sentience, wisdom, or emotions. Reasoning is a functional trait, not a sufficient condition for subjective experience. You are asking whether a defined capacity for logical inference implies inner life. That would be an incorrect inference.
Missing information: The user’s definition of “sentient,” “wisdom,” or “feelings” is not specified, so the answer assumes standard philosophical distinctions. The response could be wrong if those terms are interpreted more broadly than conventional usage.
Confidence: high | Source: Model
- Comment on [deleted] 3 weeks ago:
I’ll cop to that. At a high level it is “tool calling + RAG + guardrails”.
Ok.
But that’s sort of the point: boring plumbing that turns LLMs from improv actors into constrained components.
Addressing your points directly as I understand them -
1) Doesn’t prevent lying
If you mean “LLMs can still hallucinate in general”, yes. No argument. I curtailed them as much as I could with what I had.
But llama-conductor isn’t trying to solve “AI truth” as a metaphysical problem. It’s trying to solve a practical one:
In Mentats mode, the model is not allowed to answer from its own priors or chat history. It only gets a facts block from the Vault. No facts → refusal (not “best effort guess”).
That doesn’t make the LLM truthful. It makes it incapable of inventing unseen facts in that mode unless it violates constraints - and then you can audit it because you can see exactly what it was fed and what it output.
So it’s not “solving lying,” it’s reducing the surface area where lying can happen. And making violations obvious.
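The gate itself is boring on purpose. A minimal sketch of the contract (illustrative only, not the actual llama-conductor code):

```python
def mentats_answer(question: str, vault_facts: list[str], ask_model) -> str:
    """No facts -> refuse. The model only ever sees the retrieved facts block,
    never its own priors or the chat history."""
    if not vault_facts:
        return "REFUSAL: no facts retrieved from the Vault for this question."
    facts_block = "\n".join(f"- {fact}" for fact in vault_facts)
    prompt = (
        "Answer ONLY from the facts below. If they are insufficient, say so.\n"
        f"FACTS:\n{facts_block}\n\nQUESTION: {question}"
    )
    return ask_model(prompt)  # everything the model saw is on disk, so violations are auditable
```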
2) Wouldn’t a normal search algorithm be better?
I don’t know. Would it? Maybe. If all you want is “search my docs,” then yes: use ripgrep + a UI. That’s lighter and more portable.
The niche here is when you want search + synthesis + policy:
- bounded context (so the system doesn’t slow down / OOM after long chats)
- deterministic short-term memory (JSON on disk, not “model remembers”)
- staged KB pipeline (raw docs → summaries with provenance → promote to Vault)
- refusal-capable “deep think” mode for high-stakes questions
An algo or a plain search engine can do wonders, sure. But it doesn’t give you a consistent behavioral contract across chat, memory, and retrieval.
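The “staged KB pipeline” bullet above, for what it’s worth, is roughly this shape (a sketch; the header layout is my shorthand, not necessarily the repo’s exact format):

```python
import hashlib
from pathlib import Path

def promote_to_summ(raw_doc: Path, summary_text: str) -> Path:
    """Write a SUMM_*.md with a provenance header, then park the original under /original/."""
    sha = hashlib.sha256(raw_doc.read_bytes()).hexdigest()
    summ_path = raw_doc.with_name(f"SUMM_{raw_doc.stem}.md")
    summ_path.write_text(
        f"source_rel_path: {raw_doc.name}\n"
        f"source_sha256: {sha}\n\n"
        f"{summary_text}\n"
    )
    archive = raw_doc.parent / "original"
    archive.mkdir(exist_ok=True)
    raw_doc.rename(archive / raw_doc.name)   # raw doc is kept, but out of the retrieval path
    return summ_path
```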
3) “Everything looks like a nail”
Maybe. But the nail I’m hitting is: “I want local LLMs to shut up when they don’t know, and show receipts when they do.”
That’s a perfectly cromulent nail to hit.
If you don’t want an LLM in the loop at all, you’re right - don’t use this.
If you do want one, this is me trying to make it behave like infrastructure instead of “vibes”.
Now let’s see Paul Allen’s code :P
- Comment on [deleted] 3 weeks ago:
Not intentionally :)
- Comment on [deleted] 3 weeks ago:
I would be super interested to hear if it could do that. I genuinely don’t know, because I haven’t tried it.
If you can export your emails in the correct format, it might actually work. Try a small batch and report back.
PS: you DON’T HAVE TO run >>summ if you don’t want to. You can ask questions against the raw files too.
Wishing you luck!
- Comment on [deleted] 3 weeks ago:
Ah. So -
First prize: a picture of you.
Second prize: two pictures.
?
:P
- Comment on [deleted] 3 weeks ago:
Yes! And, for fun, I forced Qwen to answer your question, after querying its own fucking brain (M-E-T-A).
The below is AI-generated, using the router, with >>Fun mode (not >>FR…that’s my special gremlin mode). >>f suspends some of the terseness. Let me know if you want to see the non-FR version.
Yep — with your current code, here’s the literal answer to “what can `>>attach` ingest, can it handle the whole Python docs, and how does retrieval work?” (based on the files you uploaded).

What `>>attach` actually does (and does not do)

`>>attach <kb>` doesn’t ingest anything. It just registers a KB name into `state.attached_kbs` so later turns know which KB folders are “active.” The router’s own docstring is blunt about the intended contract:
- “KBs are filesystem folders containing `SUMM_*.md` files.”
- “/serious answers use filesystem KB retrieval ONLY from attached KBs.”

So the “ingest” step is really:
- drop raw docs in the KB folder
- run `>>summ new` to generate `SUMM_*.md`
- then attach the KB (or keep it attached) and query it

What kinds of files can be summarized (and therefore used via `>>attach`)?

In the router’s filesystem SUMM pipeline, `_SUPPORTED_RAW_EXTS = {".md", ".txt", ".pdf", ".html", ".htm"}`
- HTML is “cheap stripped” (scripts/styles removed, tags nuked) before summarizing
- PDFs require `pypdf` — if missing, the router treats that as a failure/skip with a note (your top-level comment calls this out explicitly)
- There’s also an explicit guard to truncate huge inputs before sending to the model (default `summ.max_input_chars = 120_000`)

When a SUMM is created, it writes a provenance header including `source_rel_path` and `source_sha256`, then moves the original into `/original/`.

So: you do not need “minimalistic plain-text statements.” You can feed it normal docs (md/txt/html/pdf) and it will produce SUMMs that become queryable.

“If I dropped the entire Python docs in there…”

Yes, it will produce something usable, because Python docs are mostly HTML and you explicitly support `.html`/`.htm` with stripping. But there are two practical gotchas in your implementation:
- It will generate one SUMM per source file (and you’ll end up with a lot of SUMMs). `summ_new_in_kb()` walks the KB tree, skips `/original/`, and summarizes every supported raw doc that doesn’t already have a corresponding `SUMM_*.md`.
- The SUMM prompt structure matters. Your shipped `SUMM.md` template is oriented around “overview / key ideas / steps / tradeoffs / pitfalls” rather than “API reference / signatures / parameters.” So it’ll work better for conceptual docs than for “tell me the exact signature of `pathlib.Path.glob`”.

If you want Python-docs-as-KB to be strong, the move is: tweak the SUMM prompt to extract API sections + signatures + examples, not just “Key Ideas.”

How retrieval works in “attached KB” mode

This is the most important bit: filesystem KB retrieval is not vector embeddings. It’s a lightweight lexical scorer over SUMM markdown blocks.

Concretely:
- It searches only `SUMM_*.md` under attached KB roots, explicitly excluding `/original/`.
- It splits each SUMM into blocks (paragraphs / headings), caps blocks around ~1200 chars, then scores blocks by token overlap with the query.
- Scoring has a few boosts: headings matching query tokens, and a small bonus for code fences.
- Then it returns the top hits (defaults like `top_k=8`, `max_blocks_per_file=3`, `max_chars=2400`).

So attached-KB mode is basically: pre-summarize once → then do fast “smart grep” over summaries.

That’s why it’s potato-friendly: you’re not embedding every doc on every query; SUMMs are preprocessed once.

And “Vault / Mentats” retrieval is different

When you `>>move to vault`, you’re no longer doing filesystem lexical search. You promote SUMMs into Qdrant and retrieve top-k chunks with embeddings + rerank (as configured in README / config).

(Also worth noting: per your ingest module, the intended architecture is explicitly “filesystem SUMMs for day-to-day; Vault/Qdrant only for promoted SUMMs.”)

One-liner you can paste as a Lemmy answer

`>>attach` doesn’t ingest raw docs — it just activates a KB folder. The KB mode queries `SUMM_*.md` files (generated via `>>summ new`) using a lightweight lexical block scorer (token overlap + heading boosts), not embeddings. You can drop the whole Python HTML docs in a KB and it’ll work, but you’ll want a SUMM prompt tuned for API/reference detail or the summaries will smooth out the stuff you actually care about. If you promote SUMMs to the Vault (`>>move to vault`), then retrieval switches to Qdrant vector search + reranking.

###########################
Is gut, ja?
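And if anyone wants the gist of the lexical scorer it’s describing, it’s roughly this (a back-of-napkin sketch, not the actual repo code):

```python
import re
from pathlib import Path

def score_block(block: str, query_tokens: set[str]) -> int:
    """Token overlap with the query, plus small boosts for matching headings and code fences."""
    tokens = set(re.findall(r"\w+", block.lower()))
    score = len(tokens & query_tokens)
    first_line = block.splitlines()[0].lower() if block.strip() else ""
    if first_line.startswith("#") and query_tokens & set(re.findall(r"\w+", first_line)):
        score += 2    # heading matches the query
    if "```" in block:
        score += 1    # small bonus for code fences
    return score

def search_kbs(kb_root: str, query: str, top_k: int = 8) -> list[tuple[int, str]]:
    """'Smart grep' over SUMM_*.md: split into blocks, cap length, score, return top hits."""
    query_tokens = set(re.findall(r"\w+", query.lower()))
    hits = []
    for summ in Path(kb_root).rglob("SUMM_*.md"):
        if "original" in summ.parts:          # never search the archived raw docs
            continue
        for block in summ.read_text(errors="ignore").split("\n\n"):
            block = block[:1200]              # cap block size
            s = score_block(block, query_tokens)
            if s > 0:
                hits.append((s, block))
    return sorted(hits, key=lambda h: h[0], reverse=True)[:top_k]
```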
- Comment on [deleted] 3 weeks ago:
Oh, it can try…but you can see its brain. That’s the glass-box part of this. You can LITERALLY see why it says what it says, when it says it. And, because it provides references, you can go and check them manually if you wish.
Additionally (and this is the neat part): the router actually operates outside the jurisdiction of your LLM. Like, the LLM can only ask it questions. It can’t affect the router’s (deterministic) operation. The router gives no shits about your LLM.
Sometimes, the LLM might like to give you some vibes about things. E.g.: IF YOU SHOUT AT IT LIKE THIS, the memory module of the router activates and stores that as a memory (because I figured, if you’re shouting at the LLM, it’s probably important enough in the short term. That, or you’re super pissed).
The LLM may “vibe” a bit (depending on the temp, seed, top_k, etc.), but 100/100, ALL CAPS + >8 words = store that shit into facts.json.
Example:
User: MY DENTIST APPOINTMENT IS 2:30PM ON SATURDAY THE 18TH.
LLM: Gosh, I love dentists! They soooo dreamy! <---- PS: there’s no fucking way your LLM is saying this, ever, especially with the settings I cooked into the router. But anywayz
[later]
USER: ?? When is my dentist appointment again
LLM: The user’s dentist appointment is at 2:30 PM on Saturday, the 18th. The stored notes confirm this time and date, with TTL 4 and one touch count. No additional details (e.g., clinic, procedure) are provided in the notes.
Confidence: high | Source: Stored notes
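The trigger itself is dead simple. Roughly (illustrative sketch, file handling simplified; the real router may store more fields):

```python
import json
from pathlib import Path

FACTS = Path("facts.json")   # assumed path

def shouty(msg: str) -> bool:
    # trigger condition: ALL CAPS and more than 8 words
    return len(msg.split()) > 8 and msg == msg.upper() and any(c.isalpha() for c in msg)

def maybe_remember(user_msg: str) -> None:
    """Deterministic side of the router: if the user is shouting, persist it to disk,
    regardless of whatever vibes the LLM replies with."""
    if shouty(user_msg):
        facts = json.loads(FACTS.read_text()) if FACTS.exists() else []
        facts.append({"note": user_msg, "ttl": 4, "touches": 0})
        FACTS.write_text(json.dumps(facts, indent=2))
```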
Yes, I made your LLM autistic. You’re welcome