JustTesting
@JustTesting@lemmy.hogru.ch
- Comment on Get ready to see ads on your… Samsung refrigerator 22 hours ago:
I remember reading that hotel TVs are an option. They also have an ad platform, but one intended for the hotel owner to send ads from, not some third party. Not exactly dumb, but also not as bad as regular TVs.
And of course a projector or PC monitor connected to some cheap small-form-factor PC is always an option, with Kodi or similar on it. I haven’t owned a TV in like 10 years, just a small Linux PC with a projector, plus a TV tuner card in the past (nowadays my ISP offers all the public channels over IPTV).
- Comment on 1 week ago:
For byte pair encoding (how those tokens get created), I think bpemb.h-its.org does a good job of giving an overview. After that, I’d say self-attention (“Attention Is All You Need”, 2017) is the seminal work all of this is based on, and the most crucial to understand. jtlicardo.com/blog/self-attention-mechanism does a good job of explaining it. And jalammar.github.io/illustrated-transformer/ is probably the best explanation of the transformer architecture (what LLMs are) out there. Transformers are made up of a lot of self-attention.
It does help if you know how matrix multiplications work, and how the backpropagation algorithm is used to train these things. I don’t know a good easy explanation off the top of my head, but xnought.github.io/backprop-explainer/ looks quite good.
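To give a feel for it, here’s a minimal numpy sketch of a single self-attention step (toy dimensions, random weights; a sketch of the mechanism, not any particular model):

```python
import numpy as np

# toy self-attention: 4 tokens, embedding dim 8 (real models use hundreds+)
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))          # one embedding vector per token

# learned projection matrices (random here; training would adjust them)
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v  # queries, keys, values

scores = Q @ K.T / np.sqrt(8)        # how much each token attends to each other token
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # row-wise softmax
out = weights @ V                    # each output is a weighted mix of all value vectors
print(out.shape)                     # (4, 8): same shape in, same shape out
```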
And that’s kinda it. You just make the transformers bigger, with more weights, bolt a lot of engineering on around them (like being able to run code, and making them run more efficiently), exploit thousands of poorly paid workers to fine-tune them better with human feedback, and repeat that every 6-12 months forever so they can stay up to date.
- Comment on 1 week ago:
Well, each token has a vector. So ‘co’ might be [0.8, 0.3, 0.7], just instead of 3 numbers it’s more like 100-1000 long. And each token has a different such vector. Initially, those are just randomly generated, but the training algorithm is allowed to slowly modify them during training, pulling them this way and that, whichever way yields better results. So while for us ‘th’ and ‘the’ are obviously related, for a model no such relation is given. It just sees random vectors, and the training slowly reorganizes them to have some structure. So who’s to say whether, for the model, ‘d’, ‘da’ and ‘co’ end up in the same general area (similar vectors), while ‘de’ could be off in the opposite direction. Here’s an example of what this actually looks like. Tokens can be quite long, depending on how common they are; here it’s disease-related tokens ending up close together, since similar things tend to cluster at this step. You might have a place where it’s just common town-name suffixes clustered close to each other.
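As a toy sketch of that setup (hypothetical mini-vocabulary, made-up sizes; real embedding tables have tens of thousands of tokens and hundreds of dimensions):

```python
import numpy as np

vocab = {"th": 0, "the": 1, "d": 2, "da": 3, "de": 4, "co": 5}  # toy token vocabulary
rng = np.random.default_rng(0)
E = rng.normal(size=(len(vocab), 4))  # embedding table, randomly initialized

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# before training: the relatedness of 'th' and 'the' is pure chance
print(cosine(E[vocab["th"]], E[vocab["the"]]))  # could be anywhere in [-1, 1]
# training would nudge the rows of E around until useful structure emerges
```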
And all of this is just what gets fed into the LLM, essentially a preprocessing step. So imagine someone gave you a picture like the above, but instead of each dot having a label, it just had a unique color. And then they give you lists of differently colored dots and ask you what color the next dot should be. You have to figure out the rules yourself, coming up with more and more intricate rules that are correct most of the time. That’s kinda what an LLM does. To it, ‘da’ and ‘de’ could be identical dots in the same location, or completely different.
Plus, of course, that’s before the fact that the LLM doesn’t actually know what a letter or a word or counting is. But it does know that 5.6.1.5.4.3 is most likely followed by 7.7.2.9.7 (simplified representation), which, when translated back, maps to ‘there are 3 r’s in strawberry’. It’s actually quite amazing that they can get it halfway right given how they work, just based on ‘learning’ how text structure works.
But so in this example, US-state-y tokens are probably close together, ‘d’ is somewhere else, the relation between ‘d’ and the different state-y tokens is not at all clear, plus the other tokens making up the full state names could be who knows where. And then there’s whatever the model does on top of that with the data.
For a human it’s easy: just split by letters and count. For an LLM, it’s trying to correlate lots of different and somewhat unrelated things to their ‘d-ness’, so to speak.
- Comment on 1 week ago:
Huh, that actually does sound like a good use case for LLMs: making it easier to weed out cheaters.
- Comment on 1 week ago:
They don’t look at it letter by letter but in tokens, which are generated automatically based on how often character sequences occur. So while ‘z’ could be its own token, ‘ne’ or even ‘the’ could be treated as a single token. Of course, ‘e’ would still be a separate token when it occurs in isolation. You can even have ‘le’ and ‘let’ as separate tokens, afaik. And each token is just a vector of numbers, like 300 or 1000 numbers that represent that token in a vector space. So ‘de’ and ‘e’ could be completely different, dissimilar vectors.
So ‘delaware’ could look to an LLM more like de-la-w-are or similar.
Of course, you could train it to figure out letter counts from those tokens with a lot of training data, but that could lower performance on other tasks, and counting letters just isn’t that important, I guess, compared to other stuff.
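If you want to see real splits, OpenAI’s tiktoken library exposes their BPE vocabularies. A small sketch (I’m not claiming what the exact split of ‘delaware’ is; it depends on the vocabulary):

```python
# pip install tiktoken -- OpenAI's open-source BPE tokenizer library
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # vocabulary used by GPT-4-era models
ids = enc.encode("delaware")
pieces = [enc.decode_single_token_bytes(i).decode("utf-8") for i in ids]
print(ids, pieces)  # whatever splits BPE learned from frequency statistics
```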
- Comment on '3d-printing a screw' is a way to describe how AI integration is stupid most of the time 2 weeks ago:
One other use case where they’re helpful is ‘translation’. Like, I have a docker-compose file and want a Helm chart / Kubernetes YAML files for the same thing. It can get you like 80% there and save you a lot of YAML typing.
Won’t work well if it’s more than like 5 services, or if you wanted to translate a whole code base from one language to another. But converting one kind of file to another with a different language or technology can work OK. Anything to write less YAML…
- Comment on Wealth inequality seems like the only outcome in a system where capital gains are taxed less than labor 4 weeks ago:
Wouldn’t that just lead to splitting off cheap companies with pro-bono CEOs that get paid more by the parent company through side channels? I don’t think there’s any fixing it with these kinds of laws; they’ll just find loopholes to circumvent them.
Maybe if companies were forced to be democratic, so figurehead CEOs could be ousted by the underpaid workers, but at that point it’s not capitalism, it’s socialism. And that’s how it usually goes imo: the workable solution to capitalism turns out to be not-capitalism.
- Comment on Sam Altman admits OpenAI ‘totally screwed up’ its GPT-5 launch and says the company will spend trillions of dollars on data centers 4 weeks ago:
I’m not really sure I follow.
Just to be clear, I’m not justifying anything, and I’m not involved in those projects. But the examples I know concern LLMs customized/fine-tuned for specific clients’ projects (so not used by others). Those clients asked to have confidence scores; people on our side said it’s possible, but that it wouldn’t actually say anything about real confidence/certainty, since the models don’t have any confidence metric beyond “how likely is the next token given these previous tokens”; and the clients went “that’s fine, we want it anyway”.
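For what it’s worth, the one metric that does exist, the next-token probability, is just a softmax over the model’s output scores. A toy sketch with made-up numbers:

```python
import numpy as np

# the only built-in notion of "confidence": a distribution over next tokens
logits = np.array([2.0, 1.0, 0.5, -1.0])       # toy output scores for 4 candidate tokens
probs = np.exp(logits) / np.exp(logits).sum()  # softmax
print(probs)  # says how likely each *token* is, nothing about factual certainty
```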
And if you ask me, LLMs shouldn’t be used for any of the stuff they’re used for there. It just cracks me up when the solution to “the lying machine is lying to me” is to ask the lying machine how much it’s lying. And when you tell them “it’ll lie about that too”, they go “yeah, ok, that’s fine”.
And making shit up is the whole functionality of LLMs; there’s nothing there other than that. It’s just that sometimes they make shit up pretty well.
- Comment on Sam Altman admits OpenAI ‘totally screwed up’ its GPT-5 launch and says the company will spend trillions of dollars on data centers 4 weeks ago:
It’s always funny to me when people add ‘confidence scores’ to LLMs, because it always amounts to just adding ‘say how confident you are, low, medium or high, in your response’ to the prompt, and then you have made-up confidences for made-up replies. And you can tell clients that it’s just made up and not actual confidence, but they’ll insist that they need it anyway…
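A sketch of what the ‘feature’ boils down to (`call_llm` is a hypothetical stand-in, not any real API):

```python
def call_llm(prompt: str) -> str:
    # hypothetical stand-in for a real LLM API call
    return "Paris. Confidence: high"

def answer_with_confidence(question: str) -> str:
    # the entire "confidence score" feature: one extra line in the prompt
    prompt = question + "\nState how confident you are in your answer: low, medium, or high."
    return call_llm(prompt)  # the reported confidence is itself just generated text

print(answer_with_confidence("What is the capital of France?"))
```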
- Comment on The triumph of AI marks the end of the information age. 5 weeks ago:
The scariest part for me is not them manipulating it with a system prompt like ‘Elon is always right and you love Hitler’.
But one technique (this is a bit simplified) is to have it generate a lot of left-wing and right-wing answers to the same prompt, average out the resulting difference in its internal state vectors, then scale that vector down and add it to the state on each request. That way you can have it reply, e.g., 5% more right-wing on every response than it otherwise would. Which would be very subtle manipulation. And you can do that for many things, not just left/right wing: honesty/dishonesty, toxicity, morality, fact editing, etc.
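A toy numpy sketch of the idea (made-up dimensions and random stand-in data; a real version would do this to a transformer layer’s hidden states, e.g. via forward hooks):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8  # toy hidden-state size; real models use thousands

# hidden states collected from many "left" and "right" answers (random stand-ins here)
h_left = rng.normal(size=(100, dim))
h_right = rng.normal(size=(100, dim)) + np.eye(dim)[2]  # pretend axis 2 encodes the difference

steer = h_right.mean(axis=0) - h_left.mean(axis=0)  # average activation difference

alpha = 0.05  # scale it down: nudge, don't shove
def steered(h):
    # applied to the hidden state on every request
    return h + alpha * steer
```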
I think this was one of the first papers on it, but it’s an active research area. It does have some nice examples if you scroll through.
And since it’s not a prompt, it can’t even leak, so you’d be hard-pressed to know it’s happening.
And if this turns into the main way people interact with the internet, that’s super scary stuff. Almost like having a knob that could turn the whole internet, e.g., 5% more pro-Russia. All the Cambridge Analytica and Grok-Hitler stuff seems crude by comparison.
- Comment on The triumph of AI marks the end of the information age. 5 weeks ago:
The internet was a bubble (dotcom) that burst, and it still stayed around; both things can be true.
- Comment on OpenAI Seeks Additional Capital From Investors as Part of Its $40 Billion Round 1 month ago:
But in this case they don’t really have a moat; any invention is copied or surpassed by the competition within weeks or a few months, and there’s no monopoly in sight. And they’re all running negative revenue following the same scheme, so there’s a high chance that if some start failing, it will scare investors, which in turn makes the negative-revenue thing harder to pull off for the ones still in business.
- Comment on New wealth of top 1% surges by over $33.9 trillion since 2015 – enough to end poverty 22 times over, as Oxfam warns global development “abysmally off track” ahead of crunch talks 2 months ago:
One thing I always wonder is whether it actually could end poverty 22 times over.
I mean, rich people hoarding money increases money’s scarcity for everyone else, theoretically increasing its value. And if it were suddenly distributed fairly, it’d lose value and there would be a higher cutoff for what counts as poverty. On the other hand, a lot of their money is funny money, like being tied up in stocks and not actually worth as much in currency as is claimed (if they sold the stock, the price would drop and they’d get less).
So I’m actually curious whether anyone ever did an analysis of what would happen if, e.g., the wealth of the top 0.1% were evenly spread across the population.
Of course that’s super complex, and it’s hard to say what the social effects would be. But the simplistic ‘everyone would get x dollars, the poverty limit is y, x > y, so no more poverty’, while useful to show the scale, always sounded too naive to me.
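For scale, here’s the naive version of that calculation (assuming a world population of roughly 8 billion; both numbers are rough):

```python
wealth_surge = 33.9e12        # $33.9 trillion, from the headline
world_population = 8e9        # rough estimate (assumption)
per_person = wealth_surge / world_population
print(f"${per_person:,.0f} per person")  # ~$4,238, a one-off payment, not an income
```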
- Comment on Are there other options than Prusa/BambuLab? 3 months ago:
I’m super happy with my Formbot Marathon IDEX; works perfectly fine with TPU (though I did have to adjust one screw guide in the extruder so it doesn’t eat the filament). It’s not very well known, since they don’t hand them out to influencers etc. The Discord is pretty active, with lots of helpful people.
Made with all standard components and regular Klipper firmware, so I know I can replace parts if anything ever breaks.
And IDEX in mirror/copy mode for printing multiple parts at twice the speed is great when you need it.
- Comment on Is there a federated Strava alternative? 4 months ago:
After the recent acquisition and expected firing of a lot of staff, it might not be the best alternative.
- Comment on Are AI Models Advanced Enough To Translate Literature? The Debate Is Roiling Publishing: Major publishers are experimenting with automated translations, hundreds of which have already been produced. 5 months ago:
Actually, as to your edit, it sounds like you’re fine-tuning the model on your data, not training it from scratch. So the LLM has seen English and Chinese before, during the initial training. Also, they represent words as vectors, and what usually happens is that similar words’ vectors are close together. So substituting e.g. Dad for Papa looks almost the same to an LLM. Same across languages. But that’s not understanding; that’s behavior that way simpler models also have.
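A toy illustration of the ‘close vectors’ point (made-up numbers, not from any real model):

```python
import numpy as np

# hypothetical learned word vectors; synonyms, even across languages, end up nearby
vec = {
    "dad":  np.array([0.90, 0.10, 0.80]),
    "papa": np.array([0.88, 0.12, 0.79]),  # nearly the same direction as "dad"
    "car":  np.array([-0.30, 0.90, 0.10]), # unrelated word, different direction
}

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(vec["dad"], vec["papa"]))  # ~1.0: swapping them barely changes the input
print(cosine(vec["dad"], vec["car"]))   # much lower
```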
- Comment on Amazon’s killing a feature that let you download and backup Kindle books 6 months ago:
Kobo.com DRM is also very easy to bypass and convert to EPUB using knock.
- Comment on Dell kills the XPS brand 8 months ago:
The newest generation of XPS is shit anyway; good riddance.
I was really happy with my 2019-ish XPS, but the 2024 one is hot garbage. Not just that it arrived with the keyboard not working and Dell taking 3 months to replace it. There’s a total of 2 USB-C ports on it. That’s all the connectors, yes. No, no headphone jack either. And one of those two is taken up by charging, so I’m left with one port if I don’t use a docking station.
The whole function bar is touch now; you need to hit it 3 times for it to react. Who needs Esc anyway? Unless you’re typing in the number row, in which case the function row will sometimes pick up random key presses.
A Copilot key no one asked for. The power button is just an unlabelled piece of plastic that looks like filler, not a button. The keyboard sucks in general: too little space between the keys, so you’re bound to mistype.
Linux support is OK, though the webcam doesn’t work in Firefox, hibernate doesn’t work, and every few weeks it’ll just freeze. But otherwise acceptable.
Definitely my last Dell; I really hate it.