Comment on 1U mini PC for AI?
nagaram@startrek.website 4 days agoWith an RTX 3060 12GB, I have been perfectly happy with the quality and speed of the responses. It's much slower than my 5060 Ti, which I think is the sweet spot for text-based LLM tasks. A larger context window, provided by more VRAM or a web-based AI, is cool and useful, but I haven't found the need for that yet in my use case.
As you may have guessed, I can't fit a 3060 in this rack. That's in a different server that houses my NAS. I have tried AI on my 2018 Epyc server CPU and it's just not usable. Even with 109gb of RAM, not usable. Even clustered, I wouldn't try running anything on these machines. They are for Docker containers and Minecraft servers. Jeff Geerling probably has a video on trying to run an AI on a bunch of Raspberry Pis. I just saw his video using Ryzen AI Strix boards and that was ass compared to my 3060.
But to my use case: I am just asking AI to generate simple scripts based on manuals I feed it, or some sort of writing task. I either get it to take my notes on a topic and make an outline that makes sense, which I then fill in, or I feed it finished writing and ask for grammatical or tone fixes. That's fucking it, and it boggles my mind that anyone is doing anything more intensive than that. I am not training anything, and 12GB of VRAM is plenty if I wanna feed it like 10-100 pages of context. Would it be better with a 4090? Probably, but for my uses I haven't noticed a difference in quality between my local LLM and the web-based stuff.
ZeDoTelhado@lemmy.world 4 days ago
So it's not on this rack. OK, because for a second I was thinking you were somehow able to run AI tasks on some sort of small cluster.
I have a 9070 XT in my system these days. I've only dabbled with this so far, and I haven't been that successful. Maybe I'll read more into it to understand it better.
nagaram@startrek.website 4 days ago
Ollama + Gemma/Deepseek is a great start. I have only run AI on my AMD 6600 XT, and that wasn't great. Everything I know suggests AMD is fine for gaming-related AI tasks these days, but not really for LLM or gen AI tasks.
An RTX 3060 12GB is the easiest and best self-hosted option in my opinion. New for under $300, and used for even less. However, I was running a GeForce 1660 Ti for a while, and that's under $100.
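For anyone wanting to try the Ollama route mentioned above, a minimal sketch of the setup. This assumes Ollama is already installed and its service is running; the model tags here are examples, so check the Ollama model library for current names and pick a size that fits your VRAM:

```shell
# Pull a model small enough for a 12GB card (tag is an example; verify in the Ollama library)
ollama pull gemma3

# Interactive chat in the terminal
ollama run gemma3

# Or a one-shot prompt, e.g. the kind of task described above
ollama run gemma3 "Fix the grammar and tone of the following text: ..."
```

Ollama also exposes a local HTTP API (by default on port 11434) if you'd rather script against it than use the CLI.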