FatCrab
@FatCrab@lemmy.one
- Comment on Baidu CEO warns AI is just an inevitable bubble — 99% of AI companies are at risk of failing when the bubble bursts 4 weeks ago:
AI in health and medtech has been around and in the field for ages. However, two persistent challenges make rollout slow, and they’re not going anywhere because of the stakes at hand.
The first is just straight regulatory. Regulators don’t have a very good or very consistent working framework to apply to these technologies, but that’s in part due to how vast the field is in terms of application. The second is somewhat related to the first but really is also very market driven, and that is the issue of explainability of outputs. Regulators generally want it of course, but also customers (i.e., doctors) don’t just want predictions/detections, but want and need to understand why a model “thinks” what it does. Doing that in a way that does not itself require significant training in the data and computer science underlying the particular model and architecture is often pretty damned hard.
I think it’s an enormous oversimplification to say modern AI is just “fancy signal processing” unless all inference, including that done by humans, is also just signal processing. Modern AI applies rules it is given, explicitly or by virtue of complex pattern identification, to inputs to produce outputs according to those “given” rules. Now, what no current AI can really do is synthesize new rules uncoupled from the act of pattern matching. Effectively, a priori reasoning is still out of scope for the most part, but the reality is that that simply is not necessary for an enormous portion of the value proposition of “AI” to be realized.
- Comment on Server dealer keeps hitting at Elon Musk for $61 million bill — Wiwynn sues X for unpaid IT infrastructure products 4 weeks ago:
Summary judgement is not a thing separate from a lawsuit. It’s literally a standard filing made in nearly every lawsuit (even if just as a hail mary). You referenced “beyond a reasonable doubt” earlier. This is also not the standard used in (US) civil cases–it’s typically a standard consisting of the preponderance of the evidence.
I’m also not sure what you mean by “court approved documentation.” Different jurisdictions approach contract law differently, but courts don’t “approve” most contracts–parties allege there was a binding and contractual agreement, present their evidence to the court, and a mix of judge and jury determines whether, under the jurisdiction’s laws, an enforceable agreement occurred and how it can be enforced (i.e., are the obligations severable, what damages, etc.).
- Comment on All Of Apple’s Foldable iPhone Prototypes Have Visible Creases, Which May Explain The Company’s Apprehension Towards A Launch 1 month ago:
My z flip is hands down my favorite phone I’ve ever owned and I didn’t get it expecting to like it much. I just needed a new phone and with Samsung’s recycling program, my old near-tablet sized phone made the switch cost barely 100 bucks.
There are a lot of small advantages it provides that quickly add up to it being an overall superior experience. Now if only Bixby wasn’t the worst fucking thing ever.
- Comment on OpenAI Execs Mass Quit as Company Removes Control From Non-Profit Board and Hands It to Sam Altman 1 month ago:
Their non-profit status had nothing to do with the legality of their training data acquisition methods. Some of it was still legal and some of it was still illegal (torrenting a bunch of books off a piracy site).
- Comment on Google Search To Show If An Image Is AI Generated, Edited Or Taken With Camera. 2 months ago:
My point is just that they’re effectively describing a discriminator. Like, yeah, it entails a lot more tough problems to be tackled than that sentence makes it seem, but it’s a known and very active area of ML. Sure, there may be other metadata and contextual features to discriminate upon, but eventually those heuristics will inevitably be closed up and we’ll just end up with a giant distributed, quasi-federated GAN. Which, setting aside the externalities that I’m skeptical anyone in a position of power to address is equally in an informed position of understanding, is kind of neat in a vacuum.
- Comment on Google Search To Show If An Image Is AI Generated, Edited Or Taken With Camera. 2 months ago:
Yes, it’s called a GAN and has been a fundamental technique in ML for years.
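For anyone unfamiliar with the term, the adversarial setup can be sketched roughly like this. It's a minimal illustration of the usual GAN loss terms only, not anything Google has described; the function names and the example scores are made up for illustration.

```python
import math

def discriminator_loss(real_scores, fake_scores):
    """Binary cross-entropy loss for the discriminator.

    Each score is the discriminator's estimated probability (strictly
    between 0 and 1) that a sample is a real image.
    """
    real_term = -sum(math.log(p) for p in real_scores) / len(real_scores)
    fake_term = -sum(math.log(1.0 - p) for p in fake_scores) / len(fake_scores)
    return real_term + fake_term

def generator_loss(fake_scores):
    """Non-saturating generator loss: low when fakes are scored as real."""
    return -sum(math.log(p) for p in fake_scores) / len(fake_scores)

# Hypothetical scores: a discriminator that separates real from fake well...
confident = discriminator_loss([0.9, 0.95], [0.1, 0.05])
# ...and one that has been fooled into rating the fakes as likely real.
fooled = discriminator_loss([0.9, 0.95], [0.8, 0.9])
```

The two losses pull against each other: whatever makes the fakes harder to flag raises the discriminator's loss and lowers the generator's, which is the arms race being described.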
- Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates 2 months ago:
Like I’ve said, you are arguing this into nuanced aspects of copyright law that are absolutely not basic, but I do not agree at all with your assessment of the initial reproduction of the image in a computer’s memory. First, to be clear, what you are arguing is that images on a website are licensed to the host to be reproduced for non-commercial purposes only and that such downstream access may only be non-commercial (defined very broadly–there is absolutely a strong argument here that commercial activity in this situation means direct commercial use of the reproduction; for example, you wouldn’t say that a user who gets paid to look at images is commercially using the accessed images) or it violates the license. Now, even ignoring my parentheses, there are contract law and copyright law issues with this. Again, using thumbs and, honestly, I’m not trying to write a legal brief as a result of a random reply on lemmy, but the crux is that it is questionable whether you can enforce licensing terms that are presented to a licensee AFTER you enable, if not force, them to perform the act of copying your work. Effectively, you allowed them to make a copy of the work, and then you are trying to say “actually, you can only do x, y, and z with that particular copy”–and this is also where exhaustion rears its head when you add on your position that once a trained model switches from non-commercial deployment to commercial deployment it can suddenly retroactively recharacterize the initial use as unlicensed infringement. Logistically, it just doesn’t make sense either (for example, what happens when a further downstream user commercializes the model? Does that percolate back to recharacterize the original use? What about downstream from that? How deep into a toolchain history do you need to go to break time traveling egregious breach of exhaustion?) so I have a hard time accepting it.
Now, in response to your query wrt my edit, my point was that infringement happens when you do the further downstream reproduction of the image. When you print a unicorn on a t-shirt, it’s that printing that is the infringement. The commercial aspect has absolutely no bearing on whether an infringement occurs. It is relevant to damages and the fair use affirmative defense. The sole query of whether infringement has occurred is whether a copy has been made and thus violated the copyright.
And all this is just about whether there is even a copying at the training of the models stage. This doesn’t get into a fairly challenging fair use analysis (going by SCotUS’ reasoning on copyrightability of API in Oracle v Google, I actually think the fair use defense is very strong, but I also don’t think there is an infringement happening to even necessitate such an analysis so ymmv–also, that decision was terrible and literally every time the SCotUS has touched IP issues, it has made the law wildly worse and more expensive and time-consuming to deal with). It also doesn’t get into whether outputs that are very similar to works infringe in the way music does (even though there is no actual copying–I think it highly likely it is an infringement). It also doesn’t get into how outputs might infringe even though there are no IP rights in the outputs of a generative architecture (this probably is more a weird academic issue but I like it nonetheless). Oh, and likeness rights haven’t made their way into the discussion (and the incredible weirdness of a class action that includes right of publicity among its claims).
We can, and probably will, disagree on how IP law works here. That’s cool. I’m not trying to litigate it on lemmy. My point in my replies at this point is just to show that it is not “basic copyright law bruh”. The copyright law, and all the IP law really, around generative AI techniques is fairly complicated and nuanced. It’s totally reasonable to hold the position that our current IP laws do not really address this the way most seem to want it to. In fact, most other IP attorneys I’ve talked to with an understanding of the technical processes at hand seem to agree. And, again, I don’t think that further assetizing intangibles into a “right to extract machine learning from” is a viable path forward in the mid and long run, nor one that benefits anyone but highly monied corporate actors either.
- Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates 2 months ago:
No, this is mostly incorrect, sorry. The commercial aspect of the reproduction is not relevant to whether it is an infringement–it is simply a factor in damages and Fair Use defense (an affirmative defense that presupposes infringement).
What you are getting at when it applies to this particular type of AI is effectively whether it would be a fair use, presupposing there is copying amounting to copyright infringement. And what I am saying is that, ignoring certain stupid behavior like torrenting a shit ton of text to keep a local store of training data, there is no copying happening as a matter of necessity. There may be copying as a matter of stupidity, but it isn’t necessary to the way the technology works.
Now, I know, you’re raging and swearing right now because you think that downloading the data into cache constitutes an unlawful copying–but it presumably does not if it is accessed like any other content on the internet. Because intent is not a part of what makes that a lawful or unlawful copying and once a lawful distribution is made, principles of exhaustion begin to kick in and we start getting into really nuanced areas of IP law that I don’t feel like delving into with my thumbs, but ultimately the point is that it isn’t “basic copyright law.” But if intent is determinative of whether there is copying in the first place, how does that jibe with an actor not making copies for themselves but rather accessing retained data in a third party’s cache after they grab the data for noncommercial purposes? Also, how does that make sense if the model is being trained for purely research purposes? And then perhaps that model is leveraged commercially after development? Your analysis, assuming it’s correct arguendo, leaves far too many outstanding substantive issues to be the ruling approach.
- Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates 2 months ago:
Yes, inadvertent copying is still copying, but it would be copying in the output and is not evidence of copying happening in the creation of the model. That was why I used the music example, because it is rather probative of where there could be grounds for copyright infringement related to these model architectures. This may not seem an important distinction, but it has significant consequences on who is ultimately liable and how.
- Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates 2 months ago:
I get that that’s how it feels given how it’s being reported, but the reality is that due to the way this sort of ML works, what internet archive does and what an arbitrary GPT does are completely different, with the former being an explicit and straightforward copy relying on Fair Use defense and the latter being the industrialized version of intensive note taking into a notebook full of such notes while reading a book. That the outputs of such models are totally devoid of IP protections actually makes a pretty big difference imo in their usefulness to the entities we’re most concerned about, but that certainly doesn’t address the economic dilemma of putting an entire sector of labor at risk in narrow areas.
- Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates 2 months ago:
You are misunderstanding what I’m getting at and unfortunately no this isn’t just straightforwardly copyright law whatsoever. The training content does not need to be copied. It isn’t saved in a database somewhere (as part of the training…downloading pirated texts is a whole other issue completely removed from the inherent processes of training a model), relationships are extracted from the material, however it is presented. So the copyright extends to the right of displaying the material in the first place. If your initial display/access to the training content is non-infringing, the mere extraction of relationships between components is not itself making a copy nor is it making a derivative work in any way we haven’t historically considered it. Effectively, it’s the difference between looking at material and making intensive notes of how different parts of the material relate to each other and looking at a material and reproducing as much of it as possible for your own records.
- Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates 2 months ago:
I have no personal interest in the matter, tbh. But I want people to actually understand what they’re advocating for and what the downstream effects would inevitably be. Model training is not inherently infringing activity under current IP law. It just isn’t. Neither the law, legislative or judicial, nor the actual engineering and operations of these current models support at all a finding of infringement. Effectively, this means that new legislation needs to be made to handle the issue. Most are effectively advocating for an entirely new IP right in the form of a “right to learn from” which further assetizes ideas and intangibles such that we get further shuffled into endstage capitalism, which most advocates are also presumably against.
- Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates 2 months ago:
Training data IS a massive industry already. You don’t see it because you probably don’t work in a field directly dealing with it. I work in medtech and millions and millions of dollars are spent acquiring training data every year. Should some new unique IP right be found on using otherwise legally rendered data to train AI, it is almost certainly going to be contracted away to hosting platforms via totally sound ToS and then further monetized such that only large and well-funded corporate entities can utilize it.
- Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates 2 months ago:
No, it isn’t storing that information in that sequence. What is happening is that it is overly encoding those particular sequential relationships along some arbitrary but tightly mapped semantic concepts represented by dimensions in a massive vector space. It is storing copies of the information in the way that inadvertent copying of music might be based on “memorized” music listened to by the infringing artist in the past.
- Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates 2 months ago:
ML techniques have been very useful in compression, yes, but it’s sort of nuts to say that a data structure that encodes only relationships between values (sometimes overly so for certain regions of its latent space/embedding space/semantics space/whatever you want to call it right now), rather than the value sequences themselves, is storing contiguous copyright-protected works as particularized creative works in a particularly identifiable manner.
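To make "relationships between values" a bit more concrete, here is a toy sketch. The three-dimensional vectors are invented for illustration (real models learn thousands of dimensions from data) and `cosine_similarity` is just a helper written for this example, so treat it as a picture of the idea rather than how any particular model works.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical, hand-written embeddings: no text is stored here, only
# positions along made-up semantic dimensions.
embeddings = {
    "guitar": [0.9, 0.1, 0.0],
    "piano": [0.8, 0.2, 0.1],
    "lawsuit": [0.0, 0.1, 0.9],
}

music_pair = cosine_similarity(embeddings["guitar"], embeddings["piano"])
mixed_pair = cosine_similarity(embeddings["guitar"], embeddings["lawsuit"])
```

Here `music_pair` comes out much larger than `mixed_pair`. What the structure holds is how close the concepts sit to one another, not the sequence of words in any source.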
- Comment on The Irony of 'You Wouldn't Download a Car' Making a Comeback in AI Debates 2 months ago:
On the other hand, it’s hard to have a serious discussion with people who insist that building an LLM or diffusion model amounts to copying pieces of material into an obfuscated database. And then having to deal with the typical reply, after an explanation is attempted, of “that isn’t the point!” without any elaboration strongly implies to me that some people just want to be pissy and don’t want to hear how they may have been manipulated into taking a pro-corporate, hyper-capitalist position on something.
- Comment on Kids Are Watching Brain Melting AI-Generated Videos on YouTube Without Parents Realizing 8 months ago:
No, but parenting can be pretty complex and there is a large degree of variability child to child. The idea that you either are a psychotic helicopter parent (because there is really no other interpretation to demanding a parent be around their toddler 24/7) or simply should not have children is a gross oversimplification and also, more importantly, fucking prima facie dumb as shit.
- Comment on Kids Are Watching Brain Melting AI-Generated Videos on YouTube Without Parents Realizing 8 months ago:
Cool. You clearly don’t know what you’re talking about then!
- Comment on OpenAI introduces Sora, its text-to-video AI model 8 months ago:
Keep in mind that this isn’t creating 3d Billy volumes at all. While immensely impressive, the thing being created by this architecture is a series of 2d frames.
- Comment on US patent office confirms AI can’t hold patents 9 months ago:
You have different fees related to bringing the patent to issuance that depend on the quality of the application (many patents just never issue) and that can rack up considerably. Then you have maintenance fees every few years after issuance that increase exponentially. In the US.
- Comment on US patent office confirms AI can’t hold patents 9 months ago:
Filing and prosecuting a patent application is already very expensive. Moreover, different entities are charged different rates, ranging from solo inventor (75% discount), to small entity (50%), and large/standard entity (0%, of course). Might be a little off on those discounts, been a minute since I’ve had to look directly at it.
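As rough arithmetic, the tiering works out like this. The percentages are the ones recalled in the comment above (and flagged there as possibly off), and the standard fee of 1000 is a placeholder, not a real USPTO figure; `discounted_fee` is a helper made up for this sketch.

```python
# Discounts as recalled in the comment; they may not match the current
# official fee schedule.
DISCOUNTS = {"solo": 0.75, "small": 0.50, "large": 0.0}

def discounted_fee(standard_fee, entity):
    """Fee owed by an entity type after its discount off the standard fee."""
    return standard_fee * (1.0 - DISCOUNTS[entity])

# Hypothetical standard fee of 1000, used only to show the proportions.
fees = {entity: discounted_fee(1000, entity) for entity in DISCOUNTS}
```

With those assumed numbers, the same filing costs a quarter as much for the most heavily discounted tier as for a large entity.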
- Comment on Pfizer says it will price Covid treatment Paxlovid at nearly $1,400 for a five-day course, which researchers estimate only costs Pfizer $13 to produce. That's a 10,000%+ markup. Shameful. 1 year ago:
No, they would just keep everything trade secret and we’d have no idea how to replicate the medicine.
- Comment on Pfizer says it will price Covid treatment Paxlovid at nearly $1,400 for a five-day course, which researchers estimate only costs Pfizer $13 to produce. That's a 10,000%+ markup. Shameful. 1 year ago:
This is very incorrect except at a very high level. Patents cover systems and methods and devices that are more than mere physical phenomena. Patent owners are granted an exclusive monopoly over the implementation of what the patent issued on (i.e., its eventual claims) that runs up to 20 years from the time of filing. They are an intellectual property right premised in property theory.
Trademarks cover designators of origin. Fundamentally, they are to reduce consumer confusion and are ultimately nothing more than a presumption once granted in favor of the owner in unfair competition disputes. They are also an intellectual property but are premised in totally different theories of law and can apply to literally anything that can be strongly associated with a company, more or less.
Copyright is an intellectual property, yes, but is limited to creative expression fixed in a tangible medium. This is a very short sentence but has some pretty serious depth to it. Copyright is ultimately a very specific type of right to, and this may shock you, copying a thing (fixed in a tangible medium…you do not have copyright on ideas).
That all said, pharma patents and, really, industry as a whole is super fucked and needs serious reimagining in the current era. But some form of IP absolutely is necessary to incentivize and enable drug creation if it is to persist in our free market capitalist economic structure.
- Comment on Meta deletes Al Jazeera presenter’s profile after show criticising Israel 1 year ago:
This might be very idiosyncratic to how you engage with people or with whom. I’ve lived in the deep Midwest and in an east coast major city. My name is EXTREMELY jewish. I have literally never had to explain my position on Israel or zionism when introducing myself. If Israel comes up in conversation in one way or another? Sure, people have asked what my opinion is, as a Jewish person, on Israel or such and such events, but that’s pretty reasonable and I don’t think ever frontloaded with anything.