mm_maybe
@mm_maybe@sh.itjust.works
- Comment on Terrified friends burn to death trapped in Tesla as doors won't open after crash 1 week ago:
You want an e-Golf, which was a beautifully stupid, half-hearted implementation of an EV by Volkswagen. Because they really didn’t want to do it, they spent almost nothing on redesign, and in the process created a ridiculously fun vehicle to drive, with sporty handling and high torque at low speed, but with nothing else changed from the classic Golf design. Door handles, freaking dials on the dashboard, manual climate and audio controls. Sadly, it isn’t being made anymore. We’ve outgrown ours and it’s time for me to let someone else enjoy the experience (especially with the Biden used EV sales incentives going away soon), but my daughter loves it so much that I’m dreading the tantrum that I know will come when I sell it.
- Comment on Feds Say You Don’t Have a Right to Check Out Retro Video Games Like Library Books 3 weeks ago:
Well, maybe we need a movement to make physical copies of these games and the consoles needed to play them available in actual public libraries, then? That doesn’t seem to be affected by this ruling and there’s lots of precedent for it in current practice, which includes lending of things like musical instruments and DVD players. There’s a business near me that does something similar, but they restrict access by age to high schoolers and older, and you have to play the games there; you can’t rent them out.
- Comment on X's controversial changes to blocking and AI training saw half a million users leave for rival Bluesky in just a single day 4 weeks ago:
We don’t. It probably is. Mastodon is the way, but they need to fix a few things themselves.
- Comment on every damn morning 1 month ago:
Me: I’ve cut my coffee intake down to one cup a day! Look how disciplined and restrained I am!
Also me: drinks 1.5 cans of Celsius per day
- Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates 2 months ago:
Y’all should really stop expecting people to buy into the analogy between human learning and machine learning, i.e. “humans do it, so it’s okay if a computer does it too”. First of all, there are vast differences between how humans learn and how machines “learn”, and second, it doesn’t matter anyway, because there is lots of legal/moral precedent for not assigning the same rights to machines that are normally assigned to humans (for example, no intellectual property right has been granted to any synthetic media yet that I’m aware of).
That said, I agree that “the model contains a copy of the training data” is not a very good critique–a much stronger one would be to simply note all of the works with a Creative Commons “No Derivatives” license in the training data, since it is hard to argue that the model checkpoint isn’t derived from the training data.
- Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates 2 months ago:
Yeah, I’ve struggled with that myself, since my first AI detection model was technically trained on potentially non-free data scraped from Reddit image links. The more recent fine-tune of that used only Wikimedia and SDXL outputs, but because it was seeded with the earlier base model, I ultimately decided to apply a non-commercial CC license to the checkpoint. But here’s an important distinction: that model, like many of the use cases you mention, is non-generative; you can’t coerce it into reproducing any of the original training material–it’s just a classification tool. I personally rate those models as much fairer uses of copyrighted material, though perhaps no better in terms of harm from a data dignity or bias propagation standpoint.
- Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates 2 months ago:
“Model sizes are larger than their training sets”
Excuse me, what? You think Huggingface is hosting hundreds of checkpoints, each of which is several times the size of its training data, which is on the order of terabytes or petabytes in disk space? I don’t know if I agree with the compression argument myself, but that’s for other reasons; your retort is objectively false.
- Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates 2 months ago:
I’m getting really tired of saying this over and over on the Internet and getting either ignored or pounced on by pompous AI bros and boomers, but this “there isn’t enough free data” claim has never been tested. The experiments that have come close (look up the early Phi and StarCoder papers, or the CommonCanvas text-to-image model) suggested that the claim is false, by showing that a) models trained on small, well-curated datasets can match and outperform models trained on lazily curated large web scrapes, and b) models trained solely on permissively licensed data can perform on par with at least the earlier versions of models trained more lazily (e.g. StarCoder 1.5 performing on par with Code-Davinci). But yes, a social network or other organization that has access to a bunch of data that they own, or have licensed, could almost certainly fine-tune a base LLM trained solely on permissively licensed data to get a tremendously useful tool. It would probably be safer and more helpful than ChatGPT for that organization’s specific business, at vastly lower risk of copyright claims or toxic generated content, for that matter.
- Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates 2 months ago:
The problem with your argument is that it is 100% possible to get ChatGPT to produce verbatim extracts of copyrighted works. This has been suppressed by OpenAI in a rather brute force kind of way, by prohibiting the prompts that have been found so far to do this (e.g. the infamous “poetry poetry poetry…” ad infinitum hack), but the possibility is still there, no matter how much they try to plaster over it. In fact there are some people, much smarter than me, who see technical similarities between compression technology and the process of training an LLM, calling it a “blurry JPEG of the Internet”… the point being, you wouldn’t allow distribution of a copyrighted book just because you compressed it in a ZIP file first.
- Comment on Banning TikTok Won’t Keep Your Data Safe | Pompous billionaires, authoritarian regimes, and opaque oligarchs are hoarding our data. Only an alternative online ecosystem will stop them. 2 months ago:
Last I heard he’s folding it into Pixelfed, kind of like Reels is part of Instagram. I’m psyched to try it whenever it becomes publicly available, either way. Hoping it’s more like Vine than TikTok.
- Comment on YouTube creator sues Nvidia and OpenAI for ‘unjust enrichment’ for using their videos for AI training 2 months ago:
Capitalism is precisely the problem, because if the end product were never sold nor used in any commercial capacity, the case for “fair use” would be almost impossible to challenge. They’re betting on judges siding with them in extending a very specific interpretation of fair use that has been successfully applied to digital copying of content for archival and distribution as in e.g. Google Books or the Internet Archive, which is also not air-tight, just precedent.
Even fair uses of media may not respect the dignity of the creators of works used to create “media synthesizers”. In other words, even if a computer science grad student does a bunch of scraping for their machine learning dissertation, unless they ask and get permission from the creators, their research isn’t upholding the principle of data dignity, which current law doesn’t address at all, but is obviously the real issue upsetting people about “Generative AI”.
- Comment on Most consumers hate the idea of AI-generated customer service 4 months ago:
Tangential, but I absolutely loved working in technical support. The satisfaction of actually helping someone with a problem affecting their real life totally outweighed the abuse from individuals who were letting the work part of their life drag the whole rest of it down (which was just kind of sad to watch). I’ve gotten paid much more for other roles since then, but it’s one of the few roles in which I was thanked for what I did by the person I was working for, and that makes a huge difference.