Comment on ChatGPT's new AI store is struggling to keep a lid on all the AI girlfriends

CorrodedCranium@leminal.space ⁨1⁩ ⁨year⁩ ago

I think you can self-host an AI chatbot these days

Killer_Tree@sh.itjust.works ⁨1⁩ ⁨year⁩ ago
You can, and it’s easier than you might think! Check out a platform like Oobabooga and find a nice 4-bit quantized LLM of a flavor you prefer. Check out TheBloke on Hugging Face, who has quantized a ton of great LLMs.

- meliaesc@lemmy.world ⁨1⁩ ⁨year⁩ ago
  What the fuck did you just say?
  
- Haha@lemmy.world ⁨1⁩ ⁨year⁩ ago
  What’s an LLM? Is it a new form of pyramid scheme?
  
- Lemminary@lemmy.world ⁨1⁩ ⁨year⁩ ago
  What is “quantized”?
  
  - barsoap@lemm.ee ⁨1⁩ ⁨year⁩ ago
    en.wikipedia.org/…/Quantization_(signal_processin…
    
    Roughly speaking: The AI equivalent of reducing bitrate. Works quite well if you’re only running them in inference mode and don’t want to train them as the networks are quite noise-resistant (rounding all weights is, in essence, introducing noise).
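
To make the "rounding is noise" point concrete, here is a minimal sketch in plain Python. It assumes the simplest possible scheme (one shared scale, symmetric 4-bit rounding), and the function names and the random "weights" are made up for illustration. Real LLM quantizers are more elaborate, so treat this as the bare idea only.

```python
import random

def quantize(weights, bits=4):
    """Round floats to signed integer codes that share one scale factor."""
    qmax = 2 ** (bits - 1) - 1                 # 7 for 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [round(w / scale) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Map integer codes back to approximate float weights."""
    return [c * scale for c in codes]

random.seed(0)
# Hypothetical stand-in for one layer's weights.
weights = [random.gauss(0.0, 0.02) for _ in range(1000)]

codes, scale = quantize(weights, bits=4)
restored = dequantize(codes, scale)

# The difference between original and restored weights is the "noise".
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(f"distinct codes used: {len(set(codes))}")
print(f"max rounding error: {max_error:.6f} (at most about scale/2 = {scale / 2:.6f})")
```

Each weight ends up as one of at most 15 integer codes, and the error per weight never exceeds about half a quantization step.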
    
    wikibot@lemmy.world [bot] ⁨1⁩ ⁨year⁩ ago
    Here’s the summary for the wikipedia article you mentioned in your comment:
    
    Quantization, in mathematics and digital signal processing, is the process of mapping input values from a large set (often a continuous set) to output values in a (countable) smaller set, often with a finite number of elements. Rounding and truncation are typical examples of quantization processes. Quantization is involved to some degree in nearly all digital signal processing, as the process of representing a signal in digital form ordinarily involves rounding. Quantization also forms the core of essentially all lossy compression algorithms. The difference between an input value and its quantized value (such as round-off error) is referred to as quantization error.
    
    Killer_Tree@sh.itjust.works ⁨1⁩ ⁨year⁩ ago
    Exactly! If you only want to use a Large Language Model (LLM) to run your own local chatbot, then using a quantized version will dramatically reduce memory use and usually improve speed. It also allows consumer hardware to run larger models which would otherwise be prohibitively resource intensive.
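
As a rough back-of-the-envelope illustration of why this matters on consumer hardware, the sketch below estimates the memory needed just to hold a model's weights at different bit widths. The 7-billion-parameter figure and the function name are assumptions picked for the example, and the estimate ignores runtime overhead and the small extra cost of storing quantization scales.

```python
def weight_memory_gib(n_params, bits_per_weight):
    """Approximate memory for the weights alone, in GiB."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 2**30

n_params = 7_000_000_000  # hypothetical 7B-parameter model

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: {weight_memory_gib(n_params, bits):.1f} GiB")
```

Going from 16-bit to 4-bit weights cuts the footprint to a quarter, which is often the difference between fitting on a consumer GPU and not fitting at all.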
    