June 22, 2026. The most useful AI release this week was not a smarter chatbot. It was a quieter feature that changes who can build automation. OpenAI shipped Record and Replay for Codex, and the idea is simple: you do a task once while the agent watches, and it writes a reusable skill that can do that task again on demand. No prompt engineering, no script, no developer in the loop. Anyone who can perform a workflow can now teach an agent to repeat it. The full feature documentation is on the OpenAI developers site, and it was covered by TechTimes.
What shipped
Record and Replay turns a demonstrated workflow into what OpenAI calls a skill. Here is how it works, straight from the primary docs.
- You open the Codex app, start a recording, and perform a workflow on your Mac. The examples OpenAI gives are everyday tasks: file an expense, book a parking space, create a correctly configured issue, publish a video, or download a recurring report.
- When you stop, Codex inspects what it saw and drafts a skill that documents four things: when to use the workflow, what inputs it needs, the steps to follow, and how to verify the result. You can then refine that skill in plain language.
- To run it later, you start a new thread and give the agent only the values that change this time, such as the file to upload, the issue to create, or the date range for the report.
- Skills can run with Computer Use, browser actions, and connected plugins, and a team can share them. If you want to distribute a stable package, you bundle skills into a plugin.
- The limits are real and worth knowing: Record and Replay is macOS only, it requires Computer Use to be enabled, and it excludes the European Union at launch. Administrators can switch it off centrally, because the same setting that controls Computer Use controls this feature.
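The four parts of a skill described above (when to use it, its inputs, its steps, and its verification) can be pictured as a small data structure. What follows is a minimal Python sketch for illustration only. The `Skill` class, its field names, the `render` method, and the example report workflow are all hypothetical; this is not OpenAI's skill format, which the docs summarized here do not spell out.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    """Hypothetical model of a recorded skill (NOT OpenAI's actual format)."""
    name: str
    when_to_use: str      # when the workflow applies
    inputs: list[str]     # names of the values that change on each run
    steps: list[str]      # step templates containing {placeholders}
    verification: str     # how to check the result, also a template

    def render(self, **values: str) -> list[str]:
        """Fill the templates with this run's values and return the plan."""
        missing = [name for name in self.inputs if name not in values]
        if missing:
            raise ValueError(f"missing inputs: {', '.join(missing)}")
        plan = [step.format(**values) for step in self.steps]
        plan.append("Verify: " + self.verification.format(**values))
        return plan


# Assumed example: a recurring report where only the date range varies.
report = Skill(
    name="download-weekly-report",
    when_to_use="The weekly sales report needs to be downloaded.",
    inputs=["start_date", "end_date"],
    steps=[
        "Open the reporting dashboard.",
        "Set the date range to {start_date} through {end_date}.",
        "Export the report as CSV.",
    ],
    verification="the CSV covers {start_date} through {end_date}.",
)

plan = report.render(start_date="2026-06-15", end_date="2026-06-21")
for line in plan:
    print(line)
```

The point of the sketch is the separation: the steps and the verification are captured once, and each later run supplies only the values that vary.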
Why this is a bigger deal than it looks
For years, automating an internal task meant writing a script or wiring an integration, which meant paying a developer. That math only worked for high-volume workflows. The long tail of small, screen-based, slightly fiddly tasks that every business runs was never worth the engineering time: the weekly report that lives in three tabs, the invoice that gets renamed and filed a certain way, the ticket that has to be tagged just so. Demonstration-based automation collapses that cost. The work of describing the task is replaced by simply doing it once. And because the agent records the real steps plus a verification step, the result tends to be more reliable than a one-shot prompt that hopes the model guesses your process correctly.
What it means for operators
For the small and mid-sized businesses and agencies we work with, the opportunity is to capture tribal knowledge as reusable, shareable assets. The person who knows the exact way your team files a claim or builds a client report can now encode that knowledge once, and the whole team inherits it. That captured library of skills is the durable thing you own. The model underneath it will keep changing, as this week's reshuffle of AI talent makes clear, but your skills travel with you.
Be clear-eyed, though. Industry surveys through 2026 keep finding the same thing: a large majority of agent pilots never reach production, and only a minority of companies report strong returns on AI spend. The teams that win are not the ones that automate the most; they are the ones that automate narrow, well-understood tasks and put a verification step on every one. Record and Replay is well suited to that discipline because the verify step is baked into how a skill is created.
How to use this now
- Pick repetitive, rules-based, screen-bound tasks first: recurring reports, data entry, ticket triage, invoice filing, simple research pulls.
- Record the clean happy path. Keep it short, tell the agent your goal up front, and call out which inputs vary between runs.
- Keep secrets out of the recording. Use realistic but non-sensitive inputs, and never demonstrate a task with a live password or key on screen.
- Lean on the agent's own verification step, and keep a human approval step on anything irreversible: sending money, emailing a client, deleting records.
- Pilot on one team, refine the skill to encode the hidden preferences (naming conventions, field defaults, decision points), then share it.
- Pair desktop demonstration with API-based automation where clean APIs exist. The most resilient setups combine both, which is exactly the kind of AI automation and n8n workflow work we build.
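The guardrails in the list above (a verification step on every run, plus human approval before anything irreversible) can be sketched as a small wrapper. This is an illustrative Python sketch under stated assumptions, not part of Codex: `run_action`, the `IRREVERSIBLE` set, and the example callbacks are hypothetical names invented here.

```python
from typing import Callable

# Assumed labels for actions that must never run without a human sign-off.
IRREVERSIBLE = {"send_money", "email_client", "delete_records"}


def run_action(
    action: str,
    execute: Callable[[], str],
    verify: Callable[[str], bool],
    approve: Callable[[str], bool],
) -> str:
    """Run one action behind an approval gate, then verify its result."""
    # Irreversible actions are skipped unless the approver says yes.
    if action in IRREVERSIBLE and not approve(action):
        return "skipped: approval denied"
    result = execute()
    # Every action, reversible or not, has its result checked.
    if not verify(result):
        return "failed: verification did not pass"
    return "ok: " + result


def file_invoice() -> str:
    # Stand-in for the real work a skill would perform.
    return "invoice filed as 2026-06_acme.pdf"


def looks_filed(result: str) -> bool:
    return result.startswith("invoice filed")


def deny(_action: str) -> bool:
    # Stand-in for a human reviewer who declines the request.
    return False


filed = run_action("file_invoice", file_invoice, looks_filed, deny)
deleted = run_action("delete_records", lambda: "deleted", looks_filed, deny)
print(filed)
print(deleted)
```

In this sketch the reversible filing task runs and is verified without asking anyone, while the deletion never executes because the approver declined. In practice the `approve` callback would be whatever sign-off mechanism your team already uses.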
This is also why computer-using agents matter so much right now. A demonstrated skill is only as good as the agent's ability to act on your screen and in your tools safely, which is the heart of our computer use agent setup and AI engineering services. If you would rather not wait for a macOS-only beta, an AI automation agency can build the same record-once, reuse-forever pattern on tools you already run, with the guardrails and testing that keep a pilot from becoming the next failed statistic.
Frequently Asked Questions
Record and Replay is a Codex feature that lets you demonstrate a workflow on your Mac and turns that demonstration into a reusable skill. After you stop recording, Codex writes a skill that documents when to use the task, the inputs it needs, the steps, and how to verify the result, then runs it on demand with the values that change each time.
A prompt asks the model to guess your process. A script requires a developer and only pays off for high-volume tasks. Demonstration captures the real steps you actually take, plus a verification step, so it is usually more reliable for the long tail of small, screen-based workflows and needs no code.
Start with repetitive, rules-based, screen-bound tasks where success is easy to check: recurring reports, data entry, ticket triage, invoice filing, and simple research pulls. Avoid anything irreversible or judgment-heavy until you have tested the skill and added a human approval step.
Demonstration-based automation can be safe, with guardrails. Keep secrets out of the recording, scope the agent to only the tools the task needs, rely on the built-in verification step, and require human approval before any irreversible action such as sending money, emailing clients, or deleting data.
Record and Replay is macOS only, requires Computer Use to be enabled, and excludes the European Union at launch. It is new, so treat early skills as drafts, test them, and have a fallback. Administrators can also disable it centrally through the same control that governs Computer Use.
No, the pattern is not tied to one vendor. The record-once, reuse-forever pattern is becoming an industry standard, and Anthropic publishes Agent Skills as an open, portable format too. We build model-agnostic automations on the tools you already run, so you are not locked to one vendor or one operating system.