How Claude learns to maximize reward

last time: how Claude learns to maximize a reward. this one: how it plans when something's trying to beat it. lecture 9 of CS221 covers game playing with minimax: assume your opponent plays perfectly, then pick the move that's least bad no matter what they do. it's the engine behind Deep Blue,