OccasionUnfair8094 t1_j8tud6r wrote on February 16, 2023 at 10:18 PM

Reply to comment by anti-torque in ChatGPT is a robot con artist, and we’re suckers for trusting it by altmorty

I don’t think this is true. I believe you’re describing a Markov chain, and this is more sophisticated and much more capable than that. You’re right, though, that it cannot discern between true and false.
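
For reference, here is roughly what a plain Markov chain text generator looks like, as a minimal sketch. The corpus and the function names `build_chain` and `generate` are made up for illustration. The point is that each next word is picked by looking only at the current word, with no wider context, which is the limitation the comparison above is about.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length, seed=0):
    """Walk the chain: each next word depends only on the current word."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        candidates = chain.get(word)
        if not candidates:  # dead end: no word ever followed this one
            break
        word = rng.choice(candidates)
        output.append(word)
    return " ".join(output)

# Toy corpus, assumed purely for illustration
corpus = "the cat sat on the mat and the dog sat on the rug"
chain = build_chain(corpus)
print(generate(chain, "the", 8))
```

A chain like this has no memory beyond the one word it is standing on, so it cannot carry a topic or follow an instruction across a sentence.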

nalninek t1_j8ujbrx wrote on February 17, 2023 at 1:12 AM

To be fair, neither can a lot of humans, at least if they don’t want to.

OccasionUnfair8094 t1_j8uk4v2 wrote on February 17, 2023 at 1:18 AM

Fair and accurate

gurenkagurenda t1_j8v4i1n wrote on February 17, 2023 at 4:00 AM

It is in fact not true.

anti-torque t1_j8vamcu wrote on February 17, 2023 at 4:56 AM

Or... it is.

gurenkagurenda t1_j8vnlyo wrote on February 17, 2023 at 7:20 AM

I think you must be getting confused because of the "reward predictor". The reward predictor is a separate model which is used in training to reduce the amount of human effort needed to train the main model. Think of it as an amplifier for human feedback. Prediction is not what the model being trained does.
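
To make the "amplifier" idea concrete, here is a toy sketch, not the actual training setup. It assumes the human feedback arrives as pairwise comparisons, and it uses a tiny linear scorer over two made-up features (`helpfulness`, `rudeness`) where a real system would use a neural network over text. All names here are hypothetical. A handful of human judgments train the predictor, and the predictor can then score outputs that no human has looked at.

```python
import math

def reward(weights, features):
    """Reward predictor: a linear score over hypothetical output features."""
    return sum(w * x for w, x in zip(weights, features))

def train_reward_predictor(comparisons, n_features, lr=0.5, epochs=200):
    """Fit weights so that human-preferred outputs score higher.

    comparisons: list of (preferred_features, rejected_features) pairs,
    each one standing in for a single human judgment.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in comparisons:
            diff = reward(weights, preferred) - reward(weights, rejected)
            # Probability the predictor assigns to the human's choice
            p = 1.0 / (1.0 + math.exp(-diff))
            # Gradient ascent on the log-likelihood of the human's choice
            for i in range(n_features):
                weights[i] += lr * (1.0 - p) * (preferred[i] - rejected[i])
    return weights

# Toy features (assumed): [helpfulness, rudeness]
human_comparisons = [
    ([0.9, 0.1], [0.2, 0.8]),
    ([0.7, 0.0], [0.6, 0.9]),
    ([0.8, 0.2], [0.1, 0.3]),
]
weights = train_reward_predictor(human_comparisons, n_features=2)

# The predictor can now score outputs no human has looked at
unseen = {"polite answer": [0.8, 0.1], "rude answer": [0.3, 0.9]}
scores = {name: reward(weights, f) for name, f in unseen.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

Three human comparisons go in, and any number of new outputs can be scored afterwards. Those scores are what would be fed back to the main model as its training signal.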

anti-torque t1_j8xd81i wrote on February 17, 2023 at 5:01 PM

Yes, I see the meanings as different, because I was thinking the context of the question would bias the result.