Learning to reason with LLMs