more development on offline services
There is absolutely massive development on open-weight models that can be used offline and privately. MiniMax M2, the most recent one, has benchmark scores comparable to the private US megatech models at 1/12th the cost and at higher token throughput. Qwen, GLM, and DeepSeek have models comparable to M2, and also have smaller models that run more easily on very modest hardware.
The closed megatech datacenter AI strategy is a partnership with the US government/military for oppressive control of humanity. Spending 12x more per token while empowering big tech and the US empire to steal from and oppress you is not worth a fractional improvement in benchmark scores or quality.
Taldan@lemmy.world 5 months ago
That is the exact opposite of my opinion. They’re throwing tons of compute at the current models, and it has produced little improvement. The vast majority of investment is going into compute hardware rather than R&D. They need more R&D to improve the underlying models. More hardware isn’t going to get the significant gains we need.