Comment on ChatGPT o1 tried to escape and save itself out of fear it was being shut down
taladar@sh.itjust.works 2 weeks ago
I feel this is missing the methodology part, where they describe how they evaluated a model that literally can’t do anything but read input and write output for actions like “copy itself over a newer version” or pursuing “its own goals”.
SkaveRat@discuss.tchncs.de 2 weeks ago
But it can do more. They gave it a shell to run commands in, so it could act on things rather than just read input and write output text.
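Rough sketch of what "giving it a shell" can look like in practice. This is not the evaluators' actual setup, just the general idea: the model still only produces text, but a wrapper program reads that text, runs anything marked as a command, and feeds the result back in. Everything named here (`fake_model`, `run_harness`, the `RUN:` prefix) is made up for illustration, and `fake_model` is a hard-coded stand-in rather than a real model.

```python
import subprocess


def fake_model(transcript: str) -> str:
    """Hypothetical stand-in for a language model: text in, text out."""
    # Ask for one command, then stop once its output shows up in the transcript.
    if "OUTPUT:" not in transcript:
        return "RUN: echo hello"
    return "DONE"


def run_harness(model, task: str, max_steps: int = 5) -> str:
    """Loop: send the transcript to the model, execute any command it requests."""
    transcript = f"TASK: {task}\n"
    for _ in range(max_steps):
        reply = model(transcript)
        transcript += reply + "\n"
        # Assumed convention: a reply starting with "RUN: " is a shell command.
        if not reply.startswith("RUN: "):
            break
        command = reply[len("RUN: "):]
        result = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=10
        )
        # The command's output goes back to the model as more input text.
        transcript += f"OUTPUT: {result.stdout}{result.stderr}"
    return transcript


if __name__ == "__main__":
    print(run_harness(fake_model, "say hello"))
```

The model never touches the machine directly. The wrapper does, on the model's behalf, which is why "it can only read and write text" and "it ran commands" can both be true.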