I just tried out R1, the new Thinking model from DeepSeek. I asked it to play Reverse Tic-Tac-Toe with me. (The transcript explains Reverse Tic-Tac-Toe.) The result was disappointing.
We played two games, the first in normal chat mode and the second in thinking mode. I was amazed at how much effort R1 wasted in its analysis of the game. But even with all that effort, it still got confused. (It even said it was confused.) One problem, which I had expected, was that R1 would mix up Reverse Tic-Tac-Toe and regular Tic-Tac-Toe. Read the transcript and you'll see what I mean.
Did it get more confused than other AI models?