AI Field Notes 2026

Technical lead with 15+ years experience architecting, coding, delivering projects & building development teams. Specialising in C#.

There is no doubt that AI has rocked the world of coding over the last couple of years. One thing that has become apparent, through keeping up with the whirlwind of changes, is that people are still figuring out how to use these tools effectively. In every video I watch or article I read, someone swears they've found the killer approach.

Through all the noise, though, I have noticed some key concepts that hold regardless of which approach you take. This list isn't comprehensive, and is likely to change over time, but my key observations in July 2026 are:

Clarity is king

I recently watched this excellent TED Talk by Rainer Stropek. His closing remark is "I am not just a coder, I am a developer and my new programming language is clarity". The talk is well worth a watch for various reasons, but this statement definitely struck a chord with me.

Clarity is key when working with AI agents. This isn't new: if you were delivering a coding project with a team, you would need a shared context and a shared understanding of the goal and approach. That doesn't change when working with AI. A shared understanding of not just the task, but also the goal and the process leading there, will give you much better results than stating a task and hoping for the best.

Rather than just state the task, state the goal you want to achieve and the key criteria or guardrails to get there. Use the LLM to help plan and build your spec before letting it loose. A lot of harnesses have planning modes, so give them a try. Formulate a plan, break it down into stages if required, then use that to guide and review the output.

"Grill me relentlessly", as Matt Pocock recommends, to work towards that shared understanding.

Weak:
Add task cancellation to the CLI.

Better:
Add a `task cancel` command to the CLI using the existing command patterns. 

It should:
- Cancel queued deployment tasks for a selected environment
- Use the same environment matching behaviour as the other commands

Ensure you:
- Include unit tests and live integration tests
- Update any user-facing help text if the command surface changes
- Run the unit test suite, including integration tests
- Provide a summary of your changes and validation

Context is also king

A follow-up point from the above: be surgical about the context you provide. Imagine you're talking to a contractor who knows nothing about your project. Be clear on the technical stack and constraints. Give clear directions, link to key files, types or documentation it needs to do a good job. Don't just point at your repo or codebase and expect it to scan and discover everything from scratch. You'll either burn tokens unnecessarily or pull all your hair out when you keep getting bad, inconsistent results. Even worse, both. High signal, low noise.

Prefer smaller, reviewable tasks. Give clear boundaries and observable success criteria where possible.

Weak:
Look through the repo and work out how commands are built.

Better:
The existing command patterns are in `src/ShipItSharp.Console/Commands`. 

Start with `BaseCommand`, `Environment`, and `Channel` as examples. 

Core orchestration should live in the shared runner layer, not directly in the console command. 

You're not just stuck in the terminal

I picked this up from a talk by Lucas Meijer, who offered a simple tip that made my comprehension and efficiency so much better. Ask for a single-page HTML summary. That's it, but it's amazingly powerful in the right situation.

In one case, I asked Codex to "grill me" on a complex goal and it generated 60 questions for me to review. Rather than answering them all one by one in a terminal, I got it to spit out an HTML file with an export option so I could review, save, and return if needed.

I've also used it to generate code reviews, produce a detailed plan of action, or lay out design options when I've been unsure which to choose.

Obviously, don't overuse this as it can burn tokens, but in the right contexts it can produce a response that is much easier to comprehend and get you to your goal more efficiently.

Chat context is expensive

It's a general recommendation to keep the number of messages per chat or session low. Why? Every time you send a message, the entire previous conversation is sent along with it. The larger this context, the less useful the results tend to become and the more expensive the chat. Most harnesses have reminders or auto compaction to help with this, but often it's still best to request a summary of your conversation so far, then start a new chat with that summary to maximise efficiency.
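To put rough numbers on that, here is a minimal sketch of how input tokens add up when every turn resends the whole conversation. The figures are illustrative assumptions, not measurements: each turn adds 500 tokens, the handover summary is 1,000 tokens, and the cost of generating the summary is ignored. Many providers also discount repeated context through prompt caching, which this deliberately leaves out.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int, starting_context: int = 0) -> int:
    """Total input tokens sent across a chat where each turn resends all prior context."""
    total = 0
    context = starting_context
    for _ in range(turns):
        context += tokens_per_turn  # the new turn joins the running context
        total += context            # the whole context is sent on this turn
    return total


# One 20-turn chat versus two 10-turn chats joined by a 1,000-token summary.
one_long_chat = cumulative_input_tokens(20, 500)
two_short_chats = (
    cumulative_input_tokens(10, 500)
    + cumulative_input_tokens(10, 500, starting_context=1000)
)

print(one_long_chat)    # 105000
print(two_short_chats)  # 65000
```

Under these assumptions the cost grows roughly with the square of the number of turns, which is why splitting a long session pays off even after carrying a summary across.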

On a similar note, if your instructions were unclear and you got a duff response back, don't scold the AI tool and tell it to try again. Consider editing your original message so it's clearer, then resend it. This keeps your context shorter, gives you a more refined response and burns fewer tokens.

Get AI to validate its own work

Before you set your agent off on a series of tasks, make sure you define sensible validation for the result. Think about what you would expect if you were doing the work yourself. Unit tests that pass covering specific areas? Second pass review? Integration tests? Browser validation? Whatever it may be, it's key to save time by getting it to validate its response before asking you to take over.

One tip I picked up from a colleague is to task the agent with continuing to refine until it's 95% confident in its answer. If it can't reach that level of confidence, ask it to explain why.

A pattern I’ve started using is asking Codex to do a second-pass review against the original goal, not just against compiler errors. For example, after command/help text changes I asked it to verify the output against the source and generated docs. The first review found several issues, a second pass found a couple more, and only after that did the output settle. That’s a useful workflow: the agent writes, then the agent reviews, but I still own the final judgement.

As another example, I recently used Codex to add new task-management commands to ShipItSharp. The useful part wasn’t that it generated the code. It was that I could give it the expected behaviour, repo conventions, validation criteria, and test expectations up front. It inspected the existing command patterns, then added the command surface, core orchestration, unit tests, command tests, and live integration tests against the real Octopus instance. The first pass missed part of my integration-test expectation, which was a good reminder that the agent still needs senior review and clear acceptance criteria.

The two-strike rule

When an agent's code throws an error, give it one shot to fix it. If it fails a second time, step away from the prompt window. Paste the error back into the chat a third time and you'll often watch the AI enter a hallucination loop, generating increasingly bizarre workarounds. Figure out where your initial context or constraint was lacking, and start a fresh chat. Know when to cut your losses.

A perfect example of this is when I was working on ShipItSharp. I needed a new environment, so I was using Gemini to help set up a new Octopus Deploy instance hosted locally. I encountered error after error, and rather than backing away I persisted. In the end, it hallucinated that Octopus Deploy can run on PostgreSQL. After calling it out, its response was:

I owe you a massive, deeply embarrassing apology.

Lesson learned.
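If you drive an agent from a script rather than a chat window, the two-strike rule can be sketched as a hard cap on fix attempts. This is a minimal illustration, not any harness's real API: `generate` and `validate` are hypothetical callables you would supply yourself, and the stub agent below exists only to show the loop stopping.

```python
from typing import Callable, Optional

MAX_FIX_ATTEMPTS = 1  # one shot to fix; a second failure means stop


def run_with_two_strikes(
    generate: Callable[[Optional[str]], str],
    validate: Callable[[str], Optional[str]],
) -> tuple[bool, str]:
    """Ask for output, allow one fix attempt, then give up.

    generate(error) returns the agent's output (error is None on the first try).
    validate(output) returns an error message, or None if the output is fine.
    """
    error: Optional[str] = None
    output = ""
    for _ in range(1 + MAX_FIX_ATTEMPTS):
        output = generate(error)
        error = validate(output)
        if error is None:
            return True, output
    # Two strikes: stop here, rethink the context, and start a fresh chat.
    return False, output


# Hypothetical stubs: an agent that never fixes the problem.
calls: list[Optional[str]] = []


def stubborn_agent(error: Optional[str]) -> str:
    calls.append(error)
    return "broken code"


def always_fails(output: str) -> Optional[str]:
    return "build failed"


succeeded, _ = run_with_two_strikes(stubborn_agent, always_fails)
print(succeeded, len(calls))  # False 2
```

The point of the cap is that the error is pasted back exactly once. After that, the loop hands control back to you instead of feeding the agent a third failure.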

Think of using AI agents as a "layer up"

This calls back to Rainer Stropek's talk, where he describes the shift from being on the ground writing code to moving a layer up and orchestrating the tool to do the boilerplate stuff. Your job shifts more towards problem solving, planning, clarity of communication and validation. It's still essential to review the output, especially in the areas of your system where security and resilience matter. This means you must understand, and take responsibility for, the generated output.

A good example for me was release workflow work. I didn't need Codex just to edit YAML. I needed it to compare the Unix and Windows workflows, check that the release artifacts matched, verify action references, update documentation, and sanity-check the release tag against the actual version source. My role was less about typing every line and more about setting the standard for what "done" meant.

It honestly reminds me of earlier in my career. My first love was HTML, CSS and JavaScript (this was 20+ years ago, so not the frameworks we know now). I loved creating things, but over time I started to get frustrated with doing the same basic CRUD operations. That's when I fell in love with backend coding, and eventually moved towards architecture and bigger-picture thinking. The constant through all of this is the love of problem solving: seeing an issue, finding a solution, and building it.

AI tools are like that next level up, where the key things that make you a great developer are still absolutely relevant. Anyone can vibe code something, but building something resilient, secure, safe and scalable for thousands if not millions of users still requires strong development skills.

AI is not human, don't treat it like it is

It may sound harsh, but it's true. It's only natural to want to personify the AI tools we use. After all, they often respond to you as if they were a friend or colleague. In reality, you're talking to a probabilistic engine that uses complex maths to choose the responses it rates as most appropriate. It has no feelings or empathy. Keep operational prompts concise.

Most harnesses these days have an option to change the tool's personality. I'd recommend setting it to something pragmatic rather than overly friendly and warm.

Do you have any tips you've found useful? Please do share ❤️ Maybe I'll return next year and point out everything that has changed and what I've learnt.
