Comment on Self-Hosted AI is pretty darn cool
superglue@lemmy.dbzer0.com 3 months ago
What kinds of specs do you need to run it well? I’ve got a laptop with a 3070.
coffee_with_cream@sh.itjust.works 3 months ago
You probably want 48GB of VRAM or more to run the good stuff. I recommend renting GPU time from AWS or other vendors instead of using your own hardware; runpod.io is pretty good.
NotMyOldRedditName@lemmy.world 3 months ago
Kinda defeats the purpose of keeping it private and local.
I wouldn't trust any claims a 3rd-party service makes about being private.
theterrasque@infosec.pub 3 months ago
Llama 3 8B can be run in 6GB of VRAM, and it's fairly competent. Gemma has a 9B I think, which would also be worth looking into.
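For reference, a minimal sketch of what running a small model like that locally could look like with llama-cpp-python, assuming a 4-bit GGUF build of Llama 3 8B (the file path below is just an example); a Q4 quant of the 8B model is roughly 5GB, which is how it fits in ~6GB of VRAM:

```python
# Sketch: run a quantized Llama 3 8B locally with llama-cpp-python.
# The model_path is illustrative; point it at whatever GGUF file you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="./Meta-Llama-3-8B-Instruct.Q4_K_M.gguf",  # example path
    n_gpu_layers=-1,  # offload all layers to the GPU
    n_ctx=4096,       # context window
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize why local LLMs are useful."}]
)
print(out["choices"][0]["message"]["content"])
```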
31337@sh.itjust.works 3 months ago
IDK, looks like 48GB cloud pricing would be ~$0.35/hr => $255/month. Used 3090s go for $700. Two 3090s would give you 48GB of VRAM and cost $1400 (I'm assuming you can do "model parallel" with Llama; never tried running an LLM, but it should be possible and work well). So the break-even point would be <6 months. Hmm, but if serverless works well, that could be pretty cheap. Would probably take a few minutes to process and load a ~48GB model on every cold start though?
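The "model parallel" part is already handled for you in the common toolchains. A rough sketch, assuming Hugging Face transformers with accelerate installed (the model ID is a placeholder), of sharding one model across two 24GB cards:

```python
# Sketch: split one model across two GPUs with device_map="auto".
# accelerate spreads the layers across all visible GPUs, so two 3090s
# behave like ~48GB of VRAM for a single model. Model ID is an example;
# a 70B model would still need quantization to fit in 2x24GB.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # placeholder; swap for what you actually run

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # layers get placed on cuda:0 and cuda:1 automatically
)

inputs = tokenizer("Hello, world", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```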
fhein@lemmy.world 3 months ago
Assuming they already own a PC, if someone buys two 3090s for it they'll probably also have to upgrade their PSU, so that might be worth including in the budget. But it's definitely a relatively low-cost way to get more VRAM; there are people who run three or four RTX 3090s too.