Comment on ChatGPT's new browser has potential, if you're willing to pay

brucethemoose@lemmy.world 3 months ago

Net might get to where you need AI

I hate to say it, but we’re basically there, and AI doesn’t help a ton. If the net is trash, there’s not a lot it can do.

Hopefully by then they will have figured out a way to make it free.

Self hosted is 100% taking off. Getting a local agent to sift through the net’s sludge will be about as easy as tweaking Firefox before long.

MagicShel@lemmy.zip 3 months ago
Local is also slower and… less robust in capability. But it’s getting there. I run local AI and I’m really impressed with gains in both. It’s just still a big gap.

We’re headed in a good direction here, but I’m afraid local may be gated by ability to afford expensive hardware.

- brucethemoose@lemmy.world 3 months ago
  Not anymore. I can run GLM 4.6 on a Ryzen/single RTX 3090 at 7 tokens/s, and it runs rings around most API models. I can run 14-49Bs in more utilitarian cases that do just fine.
  
  But again, it’s all ‘special interest tinkerer’ tier. You can’t do ollama run, you have to mess with exotic libraries and setups to squeeze out that kind of performance.
  
  - MagicShel@lemmy.zip 3 months ago
    I’ll look into it. OAI’s 30B model is the most I can run on my MacBook and it’s decent. I don’t think I can even run that on my desktop with a 3060 GPU. I have access to GLM 4.6 through a service but that’s the ~350B parameter model and I’m pretty sure that’s not what you’re running at home.
    
    It’s pretty reasonable in capability. I want to play around with setting up RAG pipelines for specific domain knowledge, but I’m just getting started.
    
    brucethemoose@lemmy.world 3 months ago
    
    I have access to GLM 4.6 through a service but that’s the ~350B parameter model and I’m pretty sure that’s not what you’re running at home.
    
    It is. I’m running this model, with hybrid CPU+GPU inference, specifically: huggingface.co/…/GLM-4.6-128GB-RAM-IK-GGUF
    
    You can likely run GLM Air on your 3060 if you have 48GB+ RAM. Heck. I’ll make a quant just for you, if you want.
    