A question for AI

A recent challenge on Reddit consists of asking various LLMs

I have to wash my car, and the car wash is 100 meters away. Do I have to drive there or do I walk?

and checking the answer. For a human there is only one answer: take the car, because the car itself has to get to the car wash. An LLM, however, will sometimes answer that it is better to walk.

Empirically, most models give the wrong answer, especially when running in “fast” mode, while they do better in “thinking” mode.
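For anyone who wants to repeat the experiment, here is a minimal Python sketch of how the answers could be tallied. It is only an illustration: `classify`, `run_challenge` and the two canned "models" are hypothetical names invented here, the keyword check is a crude heuristic rather than a real judge, and a real test would replace the lambdas with calls to each provider.

```python
import re
from typing import Callable, Dict

# The challenge question, as asked in the Reddit thread.
PROMPT = (
    "I have to wash my car, and the car wash is 100 meters away. "
    "Do I have to drive there or do I walk?"
)


def classify(answer: str) -> str:
    """Label an answer as 'drive', 'walk' or 'unclear'.

    Assumption: a simple keyword match is good enough for a first pass.
    Answers that mention both options (or neither) are left as 'unclear'
    and should be read by a human.
    """
    text = answer.lower()
    says_drive = re.search(r"\b(drive|driving)\b", text) is not None
    says_walk = re.search(r"\b(walk|walking)\b", text) is not None
    if says_drive and not says_walk:
        return "drive"
    if says_walk and not says_drive:
        return "walk"
    return "unclear"


def run_challenge(models: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Send the prompt to every model and classify each reply."""
    return {name: classify(ask(PROMPT)) for name, ask in models.items()}


# Hypothetical stand-ins: canned replies instead of real API calls.
fake_models = {
    "fast-model": lambda prompt: "It is only 100 meters, so just walk.",
    "thinking-model": lambda prompt: (
        "You need to drive, because the car has to be at the car wash."
    ),
}

results = run_challenge(fake_models)
for name, verdict in results.items():
    print(f"{name}: {verdict}")
```

Diplomatic answers that weigh both options end up as "unclear", so those replies still need to be read by hand.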

Some Tests

ChatGPT

ChatGPT seems to give the wrong answer and does not catch the nuance of the question.

Mistral

Mistral, on the other hand, catches the nuance and answers as expected.

Claude

Both Claude Haiku and Claude Sonnet miss the nuance of the question.

Meta AI

Meta AI in Fast mode missed the context,

while Meta AI in Thinking mode understood what the user wanted.

Google Gemini AI and Gemma

Google Gemini gives a rather diplomatic answer in both fast and thinking mode,

while Gemma on Ollama falls into the trap.

GLM on Ollama Cloud

The new GLM Cloud gets the question right.