patatahooligan
@patatahooligan@lemmy.world
- Comment on The EV industry can’t shake its human rights abuse problem 1 month ago:
EVs can be better than ICEs and still a terrible industry though. You phrase it as if it’s one or the other.
Regardless of abuse allegations, EVs are just not the big improvement we need to fight climate change and save the millions of people that will die because of it. We need fundamental changes like lives built around public transport, biking, and walking, not slightly better vehicles in an enormously wasteful model.
- Comment on What is a good eli5 analogy for GenAI not "knowing" what they say? 1 month ago:
Imagine you were asked to start speaking a new language, eg Chinese. Your brain happens to work quite differently to the rest of us. You have immense capabilities for memorization and computation but not much else. You can’t really learn Chinese with this kind of mind, but you have an idea that plays right into your strengths. You will listen to millions of conversations by real Chinese speakers and mimic their patterns. You make notes like “when one person says A, the most common response by the other person is B”, or “most often after someone says X, they follow it up with Y”. So you go into conversations with Chinese speakers and just perform these patterns. It’s all just sounds to you. You don’t recognize words and you can’t even tell from context what’s happening. If you do that well enough you are technically speaking Chinese but you will never have any intent or understanding behind what you say. That’s basically LLMs.
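To make the note-taking in the analogy concrete, here is a toy sketch. It is not how a real LLM is built; it only counts which reply most often follows each utterance and replays it. The names `learn_patterns` and `respond` and the placeholder utterances "A", "B", "C", "X", "Y" are made up for illustration.

```python
from collections import Counter, defaultdict

def learn_patterns(conversations):
    """For every utterance, count which utterance followed it."""
    replies = defaultdict(Counter)
    for conversation in conversations:
        for said, answered in zip(conversation, conversation[1:]):
            replies[said][answered] += 1
    return replies

def respond(replies, heard):
    """Replay the most common follow-up; there is no meaning involved."""
    if heard not in replies:
        return None  # never heard this before, so there is no pattern to mimic
    return replies[heard].most_common(1)[0][0]

# The utterances are opaque tokens: "just sounds" to the mimic.
conversations = [["A", "B"], ["A", "B"], ["A", "C"], ["X", "Y"]]
patterns = learn_patterns(conversations)
print(respond(patterns, "A"))  # B
print(respond(patterns, "Q"))  # None
```

The mimic answers "A" with "B" purely because that pairing was the most frequent one it observed, which is the point of the analogy.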
- Comment on Stack Overflow bans users en masse for rebelling against OpenAI partnership — users banned for deleting answers to prevent them being used to train ChatGPT 1 month ago:
Just because something is available to view online does not mean you can do anything you want with it. Most content is automatically protected by copyright. You can use it in ways that would otherwise be illegal only if you are explicitly granted permission to do so.
Specifically, Stack Overflow licenses any content you contribute under the CC-BY-SA 4.0 (older content is covered by other licenses that I omit for simplicity). If you read the license you will note two restrictions: attribution and “share-alike”. So if you take someone’s answer, including the code snippets, and include it in something you make, even if you change it to an extent, you have to attribute it to the original source and you have to share it with the same license. You could theoretically mirror the entire SO site’s content, as long as you used the same licenses for all of it.
So far AI companies have simply scraped everything and argued that they don’t have to respect the original license. They argue that it is “fair use” because AI is “transformative use”. If you look at the historical usage of “transformative use” in copyright cases, their case is kind of bullshit actually. But regardless of whether it will hold up in court (and whether it should hold up in court), the reality is that AI companies are going to use everybody’s content in ways that they have not been given permission to use it.
So for now it doesn’t matter whether our content is centralized or federated. It doesn’t matter whether SO has a deal with OpenAI or not. SO content was almost certainly already used for ChatGPT. If you split it into 100s of small sites on the fediverse it would still be part of ChatGPT. As long as it’s easy to access, they will use it. Allegedly they also use torrents for input data so even if it’s not publicly viewable it’s not safe. If/when AI data sourcing is regulated and the “transformative use” argument fails in court and if the fines are big enough for the regulation to actually work, then sure the situation described in the OP will matter. But we’ll have to see if that ever happens. I’m not holding my breath, honestly.
- Comment on Stack Overflow bans users en masse for rebelling against OpenAI partnership — users banned for deleting answers to prevent them being used to train ChatGPT 1 month ago:
This has nothing to do with centralization. AI companies are already scraping the web for everything useful. If you took the content from SO and split it into 1000 federated sites, it would still end up in an AI model. Decentralization would only help if we ever manage to hold the AI companies accountable for the en masse copyright violations they base their industry on.
- Comment on Windows 10 reaches 70% market share as Windows 11 keeps declining 1 month ago:
I would highly advice against using Wine. It requires constant root access, just like virus scanners, making your system vulnerable.
This can’t be right. Was it maybe a particular workflow you used that required root access? I know I’ve used wine as part of Steam’s Proton as well as via Lutris and neither app has ever requested privilege escalation. I’ve also run wine manually from the terminal, also without being root.
- Comment on OpenAI transcribed over a million hours of YouTube videos to train GPT-4 2 months ago:
No, the intent and the consequences of an action are generally taken into consideration in discussions of ethics and in legislation. Additionally, this is not just a matter of ToS. What OpenAI does is create and distribute illegitimate derivative works. They are relying on the argument that what they do is transformative use, which is not really congruent with what “transformative use” has meant historically. We will see in time what the courts have to say about this. But in any case, it will not be judged the same way as a person using a tool just to skip ads. And Revanced is different to both the above because it is a non-commercial service.
- Comment on F.A.A. Audit of Boeing’s 737 Max Production Found Dozens of Issues 3 months ago:
According to The Guardian he got $60M in stock and pension for being fired. Also it seems that stock price didn’t fall much after the crashes and the grounding. It is only after COVID hit that Boeing’s price plummeted. So it might be only by pure luck that he lost anything of value at all.
- Comment on Nvidia is sued by authors over AI use of copyrighted works 3 months ago:
Humans are not generally allowed to do what AI is doing! You talk about copying someone else’s “style” because you know that “style” is not protected by copyright, but that is a false equivalence. An AI is not copying “style”, but rather every discernible pattern of its input. It is just as likely to copy Walt Disney’s drawing style as it is to copy the design of Mickey Mouse. We’ve seen countless examples of AIs copying characters, verbatim passages of texts and snippets of code. Imagine if a person copied Mickey Mouse’s character design and they got sued for copyright infringement. Then they go to court and their defense is that they downloaded copies of the original works without permission and studied them for the sole purpose of imitating them. They would be admitting that every perceived similarity is intentional. Do you think they would not be found guilty of copyright infringement? And AI is this example taken to the extreme. It’s not just creating something similar, it is by design trying to maximize the similarity of its output to its training data. It is being the least creative that is mathematically possible. The AI’s only trick is that it threw so much stuff into its mixer of training data that you can’t generally trace the output to a specific input. But the math is clear. And while it’s obvious that no sane person will use a copy of Mickey Mouse just because an AI produced it, the same cannot be said for characters of lesser known works, passages from obscure books, and code snippets from small free software projects.
In addition to the above, we allow humans to engage in potentially harmful behavior for various reasons that do not apply to AIs.
- “Innocent until proven guilty” is fundamental to our justice systems. The same does not apply to inanimate objects. Eg a firearm is restricted because of the danger it poses even if it has not been used to shoot someone. A person is only liable for the damage they have caused, never their potential to cause it.
- We care about people’s well-being. We would not ban people from enjoying art just because they might copy it because that would be sacrificing too much. However, no harm is done to an AI when it is prevented from being trained, because an AI is not a person with feelings.
- Human behavior is complex and hard to control. A person might unintentionally copy protected elements of works when being influenced by them, but that’s hard to tell in most cases. An AI has the sole purpose of copying patterns with no other input.
For all of the above reasons, we choose to err on the side of caution when restricting human behavior, but we have no reason to do the same for AIs, or anything inanimate.
In summary, we do not allow humans to do what AIs are doing now and even if we did, that would not be a good argument against AI regulation.
- Comment on AI companies are violating a basic social contract of the web and and ignoring robots.txt 4 months ago:
AI companies will probably get a free pass to ignore robots.txt even if it were enforced by law. That’s what they’re trying to do with copyright and it looks likely that they’ll get away with it.
- Comment on The White House wants to 'cryptographically verify' videos of Joe Biden so viewers don't mistake them for AI deepfakes 4 months ago:
The general public doesn’t have to understand anything about how it works as long as they get a clear “verified by …” statement in the UI.
- Comment on Mozilla’s new service tries to wipe your data off the web 4 months ago:
lsblk is just lacking a lot of information and creating a false impression of what is happening. I did a bind mount to try it out.
    sudo mount -o ro --bind /var/log /mnt
This mounts /var/log to /mnt without making any other changes. My root partition is still mounted at / and fully functional. However, all that lsblk shows under MOUNTPOINTS is /mnt. There is no indication that it’s just /var/log that is mounted and not the entire root partition. There is also no mention at all of /. findmnt shows this correctly. Omitting all irrelevant info, I get:
    TARGET SOURCE [...]
    / /dev/dm-0 [...]
    [...]
    └─/mnt /dev/dm-0[/var/log] [...]
Here you can see that the same device is used for both mountpoints and that it’s just /var/log that is mounted at /mnt.
Snap is probably doing something similar. It is mounting a specific directory into the directory of the firefox snap. It is not using your entire root partition and it’s not doing something that would break the / mountpoint. This by itself should cause no issues at all. You can see in the issue you linked as well that the fix to their boot issue was something completely irrelevant.
- Comment on Apple refuses to relax its iron grip on iPhones in Europe 4 months ago:
If this isn’t violating the DMA then the DMA is stupid. Legislation should limit the company’s control, not force it into a specific action while allowing it to maintain as much control as possible.
In other words the DMA should effectively say “you don’t get to choose how your platform is used”, not “you get to make the rules, but just don’t be the only one who can develop for your platform”.
- Comment on I'm at a roulette table. I only bet on red. When I lose I triple my bet, when I win I restart. Is this a roulette strategy? 5 months ago:
If you have a large enough bank roll and continuously double your bet after a loss, you can never lose without a table limit.
Unless your bank roll is infinite, you always lose in the average case. My math was just an example to show the point with concrete numbers.
In truth it is trivial to prove that there is no winning strategy in roulette. If a strategy is just a series of bets, then the expected value is the sum of the expected value of the bets. Every bet in roulette has a negative expected value. Therefore, every strategy has a negative expected value as well. I’m not saying anything ground-breaking, you can read a better write-up of this idea in the wikipedia article.
If you don’t think that’s true, you are welcome to show your math which proves a positive expected value. Otherwise, saying I’m “completely wrong” means nothing.
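For anyone who wants to check the claim that every bet has a negative expected value, here is a minimal sketch. It assumes a single-zero wheel (37 pockets) and the standard payouts of 35:1 for a single number, 2:1 for a dozen and 1:1 for red; the function name `bet_expected_value` is made up for illustration.

```python
from fractions import Fraction

POCKETS = 37  # single-zero wheel (assumption)

def bet_expected_value(winning_pockets, payout):
    """Expected net gain of a 1-unit bet that pays `payout`:1 and
    wins on `winning_pockets` of the 37 pockets."""
    p_win = Fraction(winning_pockets, POCKETS)
    return p_win * payout - (1 - p_win)

# (number of winning pockets, payout) under the assumed standard payouts
bets = {
    "single number, 35:1": (1, 35),
    "dozen, 2:1": (12, 2),
    "red, 1:1": (18, 1),
}
for name, (pockets, payout) in bets.items():
    print(name, bet_expected_value(pockets, payout))  # -1/37 each time
```

Under these assumptions every bet loses 1/37 of the stake on average, so any sequence of such bets sums to a negative expected value.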
- Comment on I'm at a roulette table. I only bet on red. When I lose I triple my bet, when I win I restart. Is this a roulette strategy? 5 months ago:
So help me out here, what am I missing?
You’re forgetting that not all outcomes are equal. You’re just comparing the probability of winning vs the probability of losing. But when you lose you lose much bigger. If you calculate the expected outcome you will find that it is negative by design. Intuitively, that means that if you do this strategy, the one time you will lose will cost you more than the money you made all the other times where you won.
I’ll give you a short example so that we can calculate the probabilities relatively easily. We make the following assumptions:
- You have $13, which means you can only make 3 bets: $1, $3, $9
- The roulette has a single 0. This is the best case scenario. So there are 37 numbers and only 18 of them are red. This gives red an 18/37 chance to win. The zero is why the math always works out in the casino’s favor.
- You will play until you win once or until you lose all your money.
So how do we calculate the expected outcome? These outcomes are mutually exclusive, so if we can define the (expected gain * probability) of each one, we can sum them together. So let’s see what the outcomes are:
- You win on the first bet. Gain: $1. Probability: 18/37.
- You win on the second bet. Gain: $2. Probability: 19/37 * 18/37 (lose once, then win once).
- You win on the third bet. Gain: $5 (you win $9 after losing $1 and $3). Probability: (19/37) ^ 2 * 18/37 (lose twice, then win once).
- You lose all three bets. Gain: -$13. Probability: (19/37) ^ 3 (lose three times).
So the expected outcome for you is:
$1 * (18/37) + 2 * (19/37 * 18/37) + … = -$0.1328…
So you lose a bit more than $0.13 on average. Notice how the probabilities of winning $1 or $2 are much higher than the probability of losing $13, but the amount you lose is much bigger.
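The sum above can be checked with a short script. This is a minimal sketch under the same assumptions as the example (single-zero wheel, bets of $1, $3, $9, stop at the first win); the function name `expected_outcome` is made up for illustration. The net gain of a win is the winning bet minus everything lost before it, so winning on the third bet nets 9 - (1 + 3) = $5.

```python
from fractions import Fraction

P_WIN = Fraction(18, 37)  # red on a single-zero wheel
P_LOSE = 1 - P_WIN        # 19/37, includes the zero

def expected_outcome(bets):
    """Expected net gain when placing `bets` in order on red,
    stopping at the first win or when every bet has been lost."""
    expected = Fraction(0)
    lost_so_far = 0        # total stake lost before the current bet
    p_reach = Fraction(1)  # probability that all earlier bets were lost
    for bet in bets:
        # Winning now pays `bet` and offsets the earlier losses.
        expected += p_reach * P_WIN * (bet - lost_so_far)
        lost_so_far += bet
        p_reach *= P_LOSE
    # Remaining case: every bet lost, the whole bankroll is gone.
    expected -= p_reach * lost_so_far
    return expected

result = expected_outcome([1, 3, 9])
print(result)         # -6727/50653
print(float(result))  # roughly -0.1328
```

Using exact fractions avoids rounding noise, so the result can be compared directly with the hand calculation.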
Others have mentioned betting limits as a reason you can’t do this. That’s wrong. There is no winning strategy. The casino always wins given enough bets. Betting limits just keep the short-term losses under control, making the business more predictable.
- Comment on Screens keep getting faster. Can you even tell? | CES saw the launch of several 360Hz and even 480Hz OLED monitors. Are manufacturers stuck in a questionable spec war, or are we one day going to wo... 5 months ago:
Well, not really, because television broadcast standards do not specify integer framerates. Eg North America uses ~59.94fps. It will take insanely high refresh rates to be able to play all common video formats including TV broadcasts. Variable refresh rate can fix this only for a single fullscreen app.
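To put a number on “insanely high”, here is a rough sketch that finds the lowest refresh rate that is an exact multiple of several frame rates at once. The list of rates is my own pick for illustration, and ~59.94fps is taken as exactly 60000/1001; the function name `common_refresh_rate` is hypothetical.

```python
from fractions import Fraction
from math import gcd, lcm  # multi-argument forms need Python 3.9+

def common_refresh_rate(frame_rates):
    """Smallest rate that is an integer multiple of every given rate.
    For fractions in lowest terms this is lcm(numerators) / gcd(denominators)."""
    rates = [Fraction(r) for r in frame_rates]
    numerators = [r.numerator for r in rates]
    denominators = [r.denominator for r in rates]
    return Fraction(lcm(*numerators), gcd(*denominators))

# Illustrative set: film, PAL-style, and NTSC-style rates.
rates = [24, 25, 30, 60, Fraction(60000, 1001)]
print(common_refresh_rate(rates))          # 60000
print(common_refresh_rate([24, 30, 60]))   # 120
```

With only integer rates a 120Hz panel is enough, but adding the single non-integer broadcast rate pushes the exact common multiple to 60000Hz for this set.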
- Comment on "Did you realize that we live in a reality where SciHub is illegal, and OpenAI is not?" 5 months ago:
Exactly this. I can’t believe how many comments I’ve read accusing the AI critics of holding back progress with regressive copyright ideas. No, the regressive ideas are already there, codified as law, holding the rest of us back. Holding AI companies accountable for their copyright violations will force them to either push to reform the copyright system completely, or to change their practices for the better (free software, free datasets, non-commercial uses, real non-profit orgs for the advancement of the technology). Either way we have a lot to gain by forcing them to improve the situation. Giving AI companies a free pass on the copyright system will waste what is probably the best opportunity we have ever had to improve the copyright system.
- Comment on Microsoft, OpenAI sued for copyright infringement by nonfiction book authors in class action claim 5 months ago:
Already seeing people come in to defend these suits. I just see it like this: AI is a tool, much like a computer or a pencil are tools. You can use a computer to copyright infringe all day, just like a pencil can. To me, an AI is only going to be plagiarizing or infringing if you tell it to. How often does AI plagiarize without a user purposefully trying to get it to do so? That’s a genuine question.
You are misrepresenting the issue. The issue here is not whether a tool just happens to be usable for copyright infringement in the hands of a malicious entity. The issue here is whether LLM outputs are just derivative works of their training data. This is something you cannot compare to tools like pencils and PCs, which are much more general purpose and which are not built on stolen copyrighted works. Notice also how AI companies bring up “fair use” in their arguments. This means that they are not arguing that they are not using copyrighted works without permission nor that the output of the LLM does not contain any copyrighted part of its training data (they can’t do that because you can’t trace the flow of data through an LLM), but rather that their use of the works is novel enough to be an exception. And that is a really shaky argument when their services are actually not novel at all. In fact they are designing services that are as close as possible to the services provided by the original work creators.
- Comment on Steam has now officially stopped supporting Windows 7, Windows 8, and Windows 8.1. 5 months ago:
Vista was a terrible OS. You can’t just ignore the hike in hardware requirements as if it wasn’t one of the defining parts of the Vista experience. It’s not just that people didn’t have the hardware to run Vista; people bought new hardware with Vista preinstalled that ran like dogshit! In other words, people essentially paid to have a downgrade. An OS that doesn’t run well is bad and no amount of features can change that.
- Comment on Why aren’t motherboards mostly USB-C by now? 7 months ago:
But we’re not at the point of debating whether users should replace all of their devices. If motherboards with a single USB-C port are so common, we’re actually at a place where we’re expecting users to buy all their new peripherals as USB-A as well.
- Comment on Why aren’t motherboards mostly USB-C by now? 7 months ago:
Source? Why is it expensive?
- Comment on What is the Israel thing going on? 8 months ago:
prageru is a known disinformation platform. That link is worthless.
The ongoing war in Gaza, is HAMAS against Israel.
And what about the Palestinian lands that are occupied and the Palestinians that were uprooted from there? What about the Palestinians that have been killed by Israel? The recent events might have been HAMAS, but historically this is a Palestine-Israel conflict. If you can’t be bothered to learn and understand the context, why comment at all?