Comment on I've just created c/Ollama!
tal@lemmy.today 2 days ago
While I don’t think that llama.cpp is specifically a risk, I think that running generative AI software in a container is probably a good idea. It’s a rapidly-moving field with a lot of people contributing a lot of code that very quickly gets run on a lot of systems by a lot of people. There’s been malware that’s shown up in extensions for (for example) ComfyUI. And the software really doesn’t need to poke around at outside data.
Also, because the software has to touch the GPU, it needs a certain amount of outside access. Containerizing that takes some extra effort.
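As a sketch of what that extra effort looks like (assuming an NVIDIA GPU with the NVIDIA Container Toolkit installed on the host; other GPU vendors use different flags, and the image tag below is llama.cpp’s published CUDA server image, which may change):

```shell
# Run a containerized llama.cpp server with GPU access.
# --gpus all exposes the host GPUs to the container; the read-only
# model volume and the single published port are the only other
# host access the container gets.
docker run --rm \
  --gpus all \
  -v "$PWD/models:/models:ro" \
  -p 8080:8080 \
  ghcr.io/ggml-org/llama.cpp:server-cuda \
  -m /models/model.gguf --host 0.0.0.0 --port 8080
```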
old.reddit.com/…/psa_please_secure_your_comfyui_i…
ComfyUI users have been hit time and time again with malware from custom nodes or their dependencies. If you’re just using the vanilla nodes, or nodes you’ve developed yourself or personally vet on every update, then you’re fine. But you’re probably using custom nodes. They’re the great thing about ComfyUI, but also its great security weakness.
Half a year ago the LLMVISION node was found to contain an info stealer. Just this month the ultralytics library, used in custom nodes like the Impact nodes, was compromised, and a cryptominer was shipped to thousands of users.
Granted, the developers have been doing their best to try to help all involved by spreading awareness of the malware and by setting up an automated scanner to inform users if they’ve been affected, but what’s better than knowing how to get rid of the malware is not getting the malware at all.
Ollama means sticking it in a Docker container, and that is, I think, a positive thing.
If there were a close analog, like some software package that could take a given LLM model and run it in podman or Docker or something, I think that’d be great. But I think that putting the software in a container is probably a good move relative to running it uncontainerized.
brucethemoose@lemmy.world 2 days ago
I don’t understand.
Ollama is not actually docker, right? It’s running the same llama.cpp engine, it’s just embedded inside the wrapper app, not containerized.
And basically every LLM project ships a docker container. I know for a fact llama.cpp, TabbyAPI, Aphrodite, vllm and sglang do.
You are 100% right about security though, in fact there’s a huge concern with compromised Python packages. This one almost got me: pytorch.org/blog/compromised-nightly-dependency/
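One mitigation for that class of attack (a general sketch, not specific to any project mentioned above) is pip’s hash-checking mode, which refuses to install a package whose artifact digest differs from the one you pinned:

```shell
# requirements.txt pins exact versions plus expected artifact hashes,
# e.g. (the digest below is a placeholder, not a real value):
#
#   torch==2.3.1 \
#       --hash=sha256:<expected wheel digest>
#
# With --require-hashes, pip aborts the install if any downloaded
# artifact's hash doesn't match the pinned value:
pip install --require-hashes -r requirements.txt
```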
tal@lemmy.today 2 days ago
I’m sorry, you are correct. The syntax and interface mirror Docker’s, and one can run ollama in Docker, so I’d thought that it was a thin wrapper around Docker, but I just went to check, and you are right — it’s not running in Docker by default. Sorry, folks! Guess now I’ve got one more thing to look into getting inside a container myself.
hasnep@lemmy.ml 1 day ago
Try ramalama, it’s designed to run models inside OCI containers
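For anyone curious, a rough sketch of RamaLama’s documented CLI (model names are examples and may vary; it uses podman or docker under the hood, so the inference engine runs inside an OCI container rather than directly on the host):

```shell
# Pull a model from the Ollama registry, then run it; RamaLama
# starts the matching inference engine in a container automatically.
ramalama pull ollama://tinyllama
ramalama run ollama://tinyllama

# List models that have been pulled locally.
ramalama list
```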