Comment on Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models
frongt@lemmy.zip 2 months ago
No, the DeepSeek ones are filtered after the response is generated. It doesn’t matter how you ask or how it responds, if the response is recognized as forbidden information, it’s censored.
This also means the filter only catches what it was programmed to recognize. Last time I tested, English and Chinese responses were censored, but a Spanish response was allowed.
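To illustrate the idea, here is a toy sketch of a filter that runs after generation. This is not DeepSeek's actual implementation, which isn't public as far as I know. The names `BLOCKLIST`, `REFUSAL`, and `post_filter` and the placeholder terms are all made up for the example. The only assumption is that the filter matches the finished response against term lists that exist for some languages and not others.

```python
# Toy post-generation filter. Everything here is hypothetical:
# the term lists are placeholders, not taken from any real system.
BLOCKLIST = {
    "en": {"forbidden topic"},
    "zh": {"禁止话题"},
    # No "es" entry: Spanish output is never matched by any list.
}

REFUSAL = "Sorry, I can't help with that."


def post_filter(response: str) -> str:
    """Return the response unchanged unless it contains a blocked term.

    The prompt is never inspected. Only the finished response is checked,
    so how the question was phrased makes no difference.
    """
    lowered = response.lower()
    for terms in BLOCKLIST.values():
        if any(term in lowered for term in terms):
            return REFUSAL
    return response
```

With lists like these, an English or Chinese response containing a listed term gets replaced, while the same content in Spanish passes through because nothing in the lists matches it.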
aBundleOfFerrets@sh.itjust.works 1 month ago
DeepSeek is notable in that it is available and can be run locally if you have an NVIDIA whatever-the-fuck lying around.