True! Models not trained on a specific language are generally bad at that language.
However, there are some exceptions, like a Japanese fine-tune of Qwen 32B that dramatically improves its Japanese, but the training has to be pretty extensive.
And even that aside… the effect is still there. The point is to illustrate that LLMs are sort of “language independent” internally, like you said.