Comment on Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models
frongt@lemmy.zip 3 weeks ago
No, the DeepSeek ones are filtered after the response is generated. It doesn’t matter how you ask or how it responds; if the response is recognized as forbidden information, it’s censored.
This also means the censorship is only as good as the filter’s programming. Last time I tested, English and Chinese responses were censored, but a Spanish response was allowed.
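To illustrate the idea, here is a minimal sketch of a post-generation filter. It is purely hypothetical and not DeepSeek's actual implementation: the names `BLOCKED_TERMS`, `REFUSAL`, and `filter_response`, plus the placeholder terms, are all made up for the example. It assumes the filter is a simple per-language term list, which would explain why a language with no list slips through.

```python
# Hypothetical sketch of an output-side filter; NOT DeepSeek's real code.
# Assumption: the filter matches generated text against per-language term lists.
BLOCKED_TERMS = {
    "en": ["forbidden topic"],  # placeholder term, not a real blocklist entry
    "zh": ["禁止话题"],          # placeholder term, not a real blocklist entry
    # No Spanish list: responses in Spanish are never matched.
}

REFUSAL = "Sorry, I can't discuss that."


def filter_response(response: str) -> str:
    """Return the refusal text if the finished response contains a blocked term."""
    lowered = response.lower()
    for terms in BLOCKED_TERMS.values():
        if any(term in lowered for term in terms):
            return REFUSAL
    return response


if __name__ == "__main__":
    # The prompt never enters into it; only the generated text is checked.
    print(filter_response("Here is the forbidden topic in detail."))
    print(filter_response("Aquí está el tema prohibido en detalle."))
```

Because the check runs on the finished response, rephrasing the prompt (as poetry or otherwise) would not help against a filter like this, but getting the answer in an uncovered language would.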
aBundleOfFerrets@sh.itjust.works 3 weeks ago
DeepSeek is notable in that it is available and can be run locally if you have an NVIDIA whatever-the-fuck lying around.