Comment on [deleted]
Naz@sh.itjust.works 5 days ago
Use an executable like LM Studio, and then an off-the-shelf pre-trained model from Huggingface.
As a rule of thumb, multiply your VRAM by 0.8 to get the maximum model size you can load.
Experiment until you find one you like.
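The VRAM × 0.8 rule above can be sketched in a few lines. This is a rough heuristic, not an exact formula; the 20% headroom is assumed to cover the KV cache, activations, and driver overhead:

```python
def max_model_gb(vram_gb: float, headroom: float = 0.8) -> float:
    """Rule of thumb: keep the model file to ~80% of VRAM,
    leaving the rest for KV cache, activations, and overhead."""
    return vram_gb * headroom

# Example: a 12 GB card comfortably fits a model file of roughly 9-10 GB.
print(f"~{max_model_gb(12):.1f} GB max model size on a 12 GB GPU")
```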
fishynoob@infosec.pub 5 days ago
Thank you. I was going to try to host Ollama and Open WebUI. I think the problem is finding a source for pretrained/fine-tuned models that provide such… interaction. Does Huggingface have such pre-trained models? Any suggestions?
Naz@sh.itjust.works 5 days ago
I don’t know what GPU you’ve got, but off the top of my head, Lexi V2 is the best “small model” with emotions that I’ve seen.
It tends to skew male and can be a little dark at times, but it’s more complex than you’d expect for the size (8B feels like 48-70B).
Lexi V2 Original
Lexi V2 GGUF Version
Use Q8_0 if you’ve got the VRAM, Q5_KL for speed, and IQ2 or IQ3 if you’ve got a potato.
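The quant choice above is mostly a file-size tradeoff: a GGUF file weighs roughly params × bits-per-weight / 8. The bits-per-weight figures below are my approximations (quant schemes store per-block scales, so the effective bits exceed the nominal bit width), and the quant names are illustrative, not an exhaustive list of what any particular repo ships:

```python
# Approximate effective bits per weight for common GGUF quant levels.
# These are rough assumed values, not exact per-file numbers.
QUANT_BPW = {"Q8_0": 8.5, "Q5_K": 5.7, "IQ3_XS": 3.3, "IQ2_XS": 2.4}

def gguf_size_gb(params_billion: float, quant: str) -> float:
    """Estimate GGUF file size in GB: params (billions) x bits/weight / 8."""
    return params_billion * QUANT_BPW[quant] / 8

# An 8B model like Lexi V2 at each quant level:
for quant in QUANT_BPW:
    print(f"8B at {quant}: ~{gguf_size_gb(8, quant):.1f} GB")
```

So an 8B model at Q8_0 lands around 8-9 GB (needs a 12 GB card by the 0.8 rule), while the 2- and 3-bit quants fit in a few GB, which is why they're the fallback for weak hardware.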
fishynoob@infosec.pub 5 days ago
I was going to buy the Arc B580s when they came back down in price, but with the tariffs I don’t think I’ll ever see them at MSRP. Even the used market is very expensive. I’ll probably hold off on buying GPUs for a few more months, until I can afford the higher prices or something changes.
Naz@sh.itjust.works 5 days ago
If you’re running CPU-only, you need to look at very small models or the 2-bit quants; everything will be extremely slow otherwise.