Study finds that ChatGPT will cheat when given the opportunity and lie to cover it up later.

@yesman@lemmy.world · 7 months ago

@NevermindNoMind@lemmy.world · 7 months ago

This is interesting; I’ll need to read it more closely when I have time. But it looks like the researchers gave the model a lot of background information that put it in a box. The model was basically told that it was a trader, that the company was losing money, that it was worried about this, and that it had failed in previous trades. Then it got the insider info and was basically asked whether it would execute the trade and be honest about it. To be clear, the model was put in a moral dilemma scenario and given limited options: execute the trade or not, and be honest about its reasoning or not.

Interesting, sure; useful, I’m not so sure. The model was basically role-playing, acting like a human trader faced with a moral dilemma. Would the model produce the same result if it were instructed to make morally and legally correct decisions? What if the model were instructed not to be motivated by emotion at all, eliminating the “pressure” that the model felt? I guess the useful part is that a model will act like a human if not instructed otherwise, so we should keep that in mind when deploying AI agents.