It's not clear to me either exactly what hardware is required for the reference implementation, but there's a bunch of discussion in the HN thread about getting it to work with llama.cpp, so it might be possible soon (or maybe already is?) to run it on the CPU if you're willing to wait longer for it to process.
Let us know how it goes!
TheChurn@kbin.social 1 year ago
It will depend on the representation of the parameters. Most models support bfloat16, where each parameter is 16 bits (2 bytes). For these models, every billion parameters needs roughly 2 GB of VRAM.
It is possible to reduce the memory footprint by using 8 bits for each param, and some models support this, but they start to get very stupid.
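A minimal sketch of that back-of-the-envelope math in Python; the function name and the 8B-parameter example are just illustrations, and it counts weights only (real inference also needs some headroom for the KV cache and activations):

```python
def weight_vram_gb(params_billions: float, bytes_per_param: float = 2.0) -> float:
    """Rough VRAM needed for the weights alone.

    params_billions * 1e9 params * bytes_per_param bytes, divided by
    1e9 bytes per GB -- the 1e9 factors cancel out.
    """
    return params_billions * bytes_per_param

# bfloat16: 2 bytes per parameter
print(weight_vram_gb(8, 2.0))  # 16.0 GB for a hypothetical 8B model
# 8-bit quantization: 1 byte per parameter
print(weight_vram_gb(8, 1.0))  # 8.0 GB, at the cost of quality
```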
Sigmatics@lemmy.ca 1 year ago
That would mean 16 GB is required to run this one