+config NR_CPUS_RANGE_END
+ int
+ default 8192 if SMP && CPUMASK_OFFSTACK
+ default 512 if SMP && !CPUMASK_OFFSTACK
+ default 1 if !SMP
+
It looks like it's setting an end range of 8192, but only with the off-stack flag (CPUMASK_OFFSTACK) set. And it seems that…
+ This is purely to save memory: each supported CPU adds about 8KB
+ to the kernel image.
Which looks like they're trying to save memory, possibly to avoid TLB stalls on the CPU bitmaps. If the chip maker is indicating that off-stack (slab) allocation is fine for higher CPU counts at the moment (the patch looks to be coming from Christoph Lameter, who works at Ampere), I think it's best to assume they've tested it on their end. Or at least I would think so. If they felt that keeping more on the stack was a fine option, I would think that's exactly what they would have pitched to the LKML. Since they're saying off-stack is needed past 512, I'm guessing there's a reason, and the one I can think of is TLB stalls.