Or a tool for the translator to save time?
brucethemoose@lemmy.world 1 day ago
I use local instances of Aya 32B (and sometimes Deepseek, Qwen, LG Exaone, Japanese finetunes, others depending on the language) to translate stuff, and it is quite different from Google Translate or any machine translation you find online. They get the “meaning” of text instead of transcribing it robotically like Google, and are actually pretty loose with interpretation.
It has soul… sometimes too much. That’s the problem: it’s great for personal use, where it can occasionally be wrong or flowery, but not good enough for publishing and selling, as the reader isn’t necessarily cognisant of errors.
JustTesting@lemmy.hogru.ch 1 day ago
Actually, as to your edit, it sounds like you’re fine-tuning the model on your data, not training it from scratch. So the LLM has already seen English and Chinese during its initial training. Also, they represent words as vectors, and what usually happens is that similar words’ vectors end up close together. So substituting e.g. Dad for Papa looks almost the same to an LLM. Same across languages. But that’s not understanding, that’s behavior that way simpler models also have.
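The “similar words’ vectors are close together” point can be sketched with cosine similarity. The embeddings below are made-up 4-dimensional toy vectors purely for illustration; real models learn vectors with hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|), 1.0 means same direction
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings -- not from any real model.
embeddings = {
    "dad":  [0.82, 0.10, 0.55, 0.05],
    "papa": [0.80, 0.12, 0.53, 0.07],
    "car":  [0.05, 0.90, 0.10, 0.60],
}

# "dad" and "papa" point in nearly the same direction; "car" does not.
print(cosine_similarity(embeddings["dad"], embeddings["papa"]))  # ~0.999
print(cosine_similarity(embeddings["dad"], embeddings["car"]))   # ~0.20
```

With vectors like these, swapping “Dad” for “Papa” barely changes the input the model actually sees, which is the behavior described above.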
brucethemoose@lemmy.world 1 day ago
True! Models not trained on a specific language are generally bad at that language.
However, there are some exceptions, like a Japanese tune of Qwen 32B which dramatically enhances its Japanese, but the training has to be pretty extensive.
And even that aside… the effect is still there. The point is to illustrate that LLMs are sort of “language independent” internally, like you said.
br3d@lemmy.world 1 day ago
These language models don’t get the meaning of anything. They predict the next cluster of letters based on the clusters of letters that have come before. Sorry, but if it feels to you like they’ve captured the meaning of something, you’re being bamboozled.
brucethemoose@lemmy.world 1 day ago
It’s a metaphor.
They’re translating the input tokens to intent in the model’s middle layers, which is a bit more precise.