I suspect RAM may become increasingly useful with the shift from pure chat LLMs to connected agents, MCP, and caching results and data for scaling things like public Internet search.
When I think of database server software, a lot of the performance gains come from keeping hot data in RAM. With LLM systems expanding, along with their backing data, connectedness, and need for optimisation, a shift toward caching and keeping data in RAM seems to suggest itself. These systems are already resource-hungry and operate on a lot of data, so it seems plausible the cache would not be small.
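As a minimal sketch of the in-memory caching idea (the class and names here are hypothetical, not any real agent framework), an LRU cache over search queries is the classic pattern database servers and web caches use:

```python
from collections import OrderedDict

class SearchCache:
    """Tiny in-memory LRU cache, e.g. for repeated web-search queries."""
    def __init__(self, max_entries=1024):
        self.max_entries = max_entries
        self._store = OrderedDict()  # insertion order doubles as recency order

    def get(self, query):
        if query in self._store:
            self._store.move_to_end(query)  # mark as most recently used
            return self._store[query]
        return None  # cache miss: caller would hit the live search backend

    def put(self, query, results):
        self._store[query] = results
        self._store.move_to_end(query)
        if len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict least recently used

cache = SearchCache(max_entries=2)
cache.put("llm ram", ["result A"])
cache.put("hbm vs ddr5", ["result B"])
cache.get("llm ram")                   # touch: now most recently used
cache.put("mcp agents", ["result C"])  # evicts "hbm vs ddr5"
```

The point of the speculation above is that caches like this would live in ordinary DRAM and grow with the volume of agent traffic, which is where the extra RAM demand would come from.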
humanspiral@lemmy.ca 5 months ago
It doesn’t really, though CPU inference is possible (if slow) at 256+ GB. The problem is that manufacturers are making HBM (AI) RAM instead of DDR4/5.