Every Thanksgiving, I make a recipe similar to this Martha Stewart recipe. It’s not complicated, but on Thanksgiving day, every dish is a dance with Father Time.
Everything needs to work backwards from the magic minute (which could be the kids starving or the next game starting…).
Last year, I decided to let AI do the hard work. But which AI? I threw the same prompt at ChatGPT, Google Gemini, and Claude to see who would be my new best friend (or at least assistant).
Spoiler alert: Only one got it right. And if you’ve read my other blog posts, you already know which one it was.
Here’s Martha’s recipe (abbreviated):
Martha Stewart’s Perfect Roast Turkey:
Simple enough, right? But when you’re working with a 13-pound bird that needs basting every 30 minutes, cloth removal, time based on weight, and you want to eat at 5pm, suddenly you’re doing mental math while getting yelled at for not putting the potatoes in.
I gave each AI the exact same prompt:
“Summarize the steps and times for this recipe. I want to eat at 5pm and it’s a 13 pound turkey.”
[Full recipe details provided]
I wanted:
Let’s see how they did.
Gemini gave me a detailed timeline with specific times. Very confident. Also very wrong.
The Full Timeline Gemini Provided:
Total Roasting Time: Approximately 3 hours and 30 minutes, plus 2 hours
resting time, and additional time for prep and gravy.
Prep & Cooking Timeline:
2:00 PM - Rinse and dry the turkey. Let it rest at room temperature for
2 hours. Prep your cheesecloth with butter and wine. Preheat
oven to 450°F.
4:00 PM - Stuff and truss the turkey. Place turkey in the oven, legs
first, with the cheesecloth. Cook at 450°F for 30 minutes.
4:30 PM - Reduce heat to 350°F. Baste turkey.
Every 30 minutes until 7:00 PM - Baste the turkey.
7:00 PM - Remove cheesecloth. Turn the turkey so the breast faces the
back of the oven. Baste.
7:30 PM - Baste the turkey.
8:00 PM - Check the turkey's temperature (thickest part of the thigh
should be 180°F). If needed, cook for another 20-30 minutes.
8:00 - 8:30 PM - Remove turkey from oven and let it rest for 30 minutes.
Make the gravy.
8:30 - 9:00 PM - Carve and serve!
Important Notes:
This timeline is an estimate. Oven temperatures and turkey size can
affect cooking time. Always use a meat thermometer to ensure doneness.
Consider making the stuffing and gravy ahead of time to streamline
the process.
Enjoy your delicious turkey dinner!
What Gemini Got Right:
What Gemini Got VERY Wrong:
Score: 4/10 - Sure, it’s technically correct cooking information. But it failed the fundamental requirement: working backwards from my 5pm deadline. This is like asking “how do I get to the airport by 3pm?” and getting directions that would get you there at 6pm.
The timeline looks professional and detailed, which makes the failure even more dangerous - I might have trusted it without checking!
ChatGPT’s response was similar to Gemini’s - it provided a timeline, but completely missed the “eat at 5pm” requirement. Unfortunately, I can’t find the exact ChatGPT response from last year, but the key issue was the same: the timeline was wrong by hours.
What ChatGPT Got Wrong:
Score: 5/10 - Had the right cooking information, but completely failed at the actual task: figuring out when each step needed to be done.
Last, and most accurate, was Claude.
The Full Timeline Claude Provided:
Target: Dinner at 5:00 PM
Turkey: 13 pounds (pre-brined)
Total required time:
- 2 hours for turkey to come to room temperature
- 4 hours cooking time (possibly 4.5 if needed)
- 30 minutes resting time
- 20 minutes for gravy preparation
- Plus initial prep time (approximately 30 minutes)
THANKSGIVING DAY TIMELINE:
10:00 AM - Start bringing turkey to room temperature
12:00 PM - Turkey reaches room temperature, begin prep
12:30 PM - Heat oven to 450°F, prepare cheesecloth and butter mixture,
stuff and prepare turkey
1:00 PM - Put turkey in oven at 450°F
1:30 PM - Reduce heat to 350°F (first baste)
2:00 PM - Second basting
2:30 PM - Third basting
3:00 PM - Fourth basting
3:30 PM - Remove cheesecloth, turn pan, baste
4:00 PM - Another basting
4:30 PM - Check temperature, remove from oven if done
4:30-5:00 PM - Rest turkey and make gravy
5:00 PM - Serve dinner
Note: Since your turkey is 13 pounds, it may cook slightly faster
than this recipe suggests. Start checking the temperature around
4:00 PM. If it reaches 180°F in the thigh before 4:30, you can
remove it earlier and let it rest longer.
What Claude Got Right:
What Claude Got Wrong:
Score: 9/10 - This is what I actually needed. A working timeline that accounts for reality, including those annoying-but-necessary basting interruptions and even the gravy prep time.
“But Jamie,” you might say, “it’s just turkey timing. Who cares?”
Here’s why this seemingly trivial test matters:
This wasn’t “solve for X” math. It was “work backwards from a hard deadline while accounting for multiple dependencies and recurring interruptions.” That kind of problem shows up everywhere in personal and professional life, from software releases to home projects.
If an AI can’t figure out when to start cooking a turkey (with basting reminders!) to eat at 5pm, should I trust it to plan a deployment schedule with health checks and rollback windows?
The difference between Gemini/ChatGPT and Claude wasn’t about cooking knowledge - they all knew how to roast a turkey. The difference was understanding what I was actually asking for.
I didn’t ask “how long does it take to cook a 13-pound turkey?” I asked “what’s my timeline to eat at 5pm with a 13-pound turkey?”
Gemini gave me a great answer to the wrong question. Claude answered my actual question.
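For the curious, the “actual question” is simple to write down. Here’s a minimal Python sketch of working backwards from the 5pm deadline. The step durations are my reading of Claude’s timeline above (not numbers from Martha’s recipe), and the function name `schedule_backwards` and the date are made up for illustration.

```python
from datetime import datetime, timedelta

def schedule_backwards(deadline, steps):
    """Return (start_time, name) pairs so the last step ends at the deadline.

    steps is an ordered list of (name, minutes) tuples.
    """
    schedule = []
    end = deadline
    # Walk the steps in reverse: each step must end when the next one starts.
    for name, minutes in reversed(steps):
        start = end - timedelta(minutes=minutes)
        schedule.append((start, name))
        end = start
    schedule.reverse()  # back to chronological order
    return schedule

# Assumed durations, read off Claude's timeline in this post.
steps = [
    ("Bring turkey to room temperature", 120),
    ("Prep, stuff, and heat oven", 60),
    ("Roast (baste every 30 minutes)", 210),
    ("Rest turkey and make gravy", 30),
]

deadline = datetime(2024, 11, 28, 17, 0)  # illustrative date, 5:00 PM
plan = schedule_backwards(deadline, steps)

for start, name in plan:
    print(start.strftime("%I:%M %p"), "-", name)
```

With those assumed durations, the first step lands at 10:00 AM, which is the start time Claude gave me. The point isn’t the code, it’s the direction: start at the deadline and subtract, instead of starting at “now” and adding.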
Gemini’s response looked professional. Detailed timeline, specific times, helpful notes. In other scenarios I could’ve trusted this kind of answer and ended up way off track.
Confident wrong answers are worse than uncertain right answers. In code reviews, in architecture decisions, in production incidents - I’d rather work with someone who says “I think it’s X but let me verify” than someone who confidently gives me the wrong solution.
This isn’t a “Claude is better than ChatGPT” post (though… this isn’t the only example I’ve had…). It’s about understanding that different AI models handle temporal reasoning and constraint satisfaction differently.
For simple questions and creative writing, they’re all pretty good. For multi-step planning with hard deadlines and dependencies? Test them. Verify their work. Check the math.
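“Check the math” can be as small as comparing the last time in the AI’s timeline against your deadline. Here’s a rough sketch, assuming times written like “8:30 PM”; the helper names `parse_clock` and `minutes_late` are hypothetical, and the serve times are taken from the Gemini and Claude timelines quoted earlier.

```python
from datetime import datetime

def parse_clock(text):
    """Parse a clock time such as '8:30 PM' (same-day times only)."""
    return datetime.strptime(text, "%I:%M %p")

def minutes_late(serve_time, deadline):
    """Minutes past the deadline; zero or negative means on time."""
    delta = parse_clock(serve_time) - parse_clock(deadline)
    return int(delta.total_seconds() // 60)

# Serve times from the timelines quoted in this post.
gemini_late = minutes_late("8:30 PM", "5:00 PM")
claude_late = minutes_late("5:00 PM", "5:00 PM")

print("Gemini:", gemini_late, "minutes late")
print("Claude:", claude_late, "minutes late")
```

It’s a one-line sanity check, but it would have flagged Gemini’s plan as three and a half hours late before anything went in the oven.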
Here’s a fun experiment: Take a recipe (or project plan, or travel itinerary) with time-dependent steps and a hard deadline. Give it to multiple AI models with the same prompt. See who gets the timeline right.
You might be surprised at the differences.
This experiment was originally done for Thanksgiving 2024. I re-ran the same prompt today and got much better results (although ChatGPT and Gemini still added unnecessary time to the beginning). These models are constantly evolving and improving.
My main points are to try out different AI tools, verify the results, and let AI help you with things in your life you don’t necessarily need or want to spend time on.
Time is free, but it’s priceless. You can’t own it, but you can use it. You can’t keep it, but you can spend it. Once you’ve lost it you can never get it back. - Harvey Mackay
What’s your experience comparing different AI models on real-world tasks? Have you found specific use cases where one clearly outperforms the others? Would love to hear about it!