Of the models my employer and I run locally, Qwen 3.6 and gemma4 are the only ones usable for agentic programming sessions. Running them locally is still less stable and slower than third-party services, even on much better hardware (as at my employer). If your privacy policy allows it, though, the best option is a provider hosting DeepSeek flash/pro. Their price is going to be hard to beat.

adhdsergio@lemmy.world 6 days ago
How many concurrent users and what hardware, if I may ask?

- eager_eagle@lemmy.world 6 days ago
  It’s an H100, I think. No idea how many users.
  
  In my personal setup I use quantized versions on a 3080, which is not great, so I still lean a lot on APIs.
  
onlinepersona@programming.dev 6 days ago
I thought those didn’t support tool calling. Has that changed?

- eager_eagle@lemmy.world 6 days ago
  they do
  