Same. I think I’ve read that a single GPT-4 instance runs on a 128-GPU cluster, and ChatGPT can still take something like 30 s to finish a long response. An H100 GPU has a TDP of 700 W. Hard to believe that this uses only 10x more energy than a search that takes milliseconds.
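
For a rough sanity check, here is a back-of-envelope sketch of that arithmetic. It uses only the figures recalled above (128 GPUs, 700 W each, 30 s), none of which are verified. It assumes every GPU draws full TDP for the whole response, and the `concurrent_requests` parameter is a hypothetical knob for how many requests share the cluster at once, so treat the result as an upper bound rather than a measurement.

```python
# Back-of-envelope upper bound on energy per response.
# All inputs are the rough, unverified figures from the comment above.
GPUS = 128             # GPUs per model instance (recalled, not confirmed)
TDP_WATTS = 700        # H100 TDP; actual draw under load may be lower
RESPONSE_SECONDS = 30  # duration of a long response


def energy_wh(gpus, watts_per_gpu, seconds, concurrent_requests=1):
    """Energy per request in watt-hours, assuming full TDP on every GPU.

    concurrent_requests is a hypothetical batching factor: if the cluster
    serves several requests at once, the energy is split between them.
    """
    joules = gpus * watts_per_gpu * seconds
    return joules / 3600 / concurrent_requests


upper_bound_wh = energy_wh(GPUS, TDP_WATTS, RESPONSE_SECONDS)
batched_wh = energy_wh(GPUS, TDP_WATTS, RESPONSE_SECONDS, concurrent_requests=100)

print(f"One request using the whole cluster: {upper_bound_wh:.1f} Wh")
print(f"Shared across 100 concurrent requests: {batched_wh:.2f} Wh")
```

With the whole cluster dedicated to one request this comes to roughly 747 Wh. The per-request figure depends almost entirely on how many requests are batched together, which is not something the numbers above tell us.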