Google's Open Knowledge Format: The Quiet Shift That Will Separate Smart Marketers from Everyone Else

I'll be honest with you. When I first heard "Open Knowledge Format" my immediate reaction was mild eye-roll fatigue. Another Google initiative. Another spec to read, another thing to add to the "keep an eye on" list. We already survived schema.org, we're still debating llms.txt, and now this?

But then I actually read the spec. And then I read Andrej Karpathy's LLM Wiki gist — which hit 5,000+ GitHub stars almost overnight — and something clicked. This isn't just another Google format. This is the early signal of something that could fundamentally restructure how businesses make themselves legible to AI systems. And for digital marketers, that's either a big opportunity or a big problem, depending on how fast you move.

Let me break it down the way I'd explain it to a client over coffee.

What Google's Open Knowledge Format Actually Is (And What It Isn't)

Google Cloud published OKF v0.1 on June 12, 2026. The announcement was relatively quiet — buried under the usual Google Cloud news cycle — but the implications aren't.

At its simplest, OKF is a directory of Markdown files with YAML frontmatter. That's it. Each file represents one concept — a metric, a dataset, a playbook, a product definition, a runbook — and the only required field in the entire spec is type. Everything else (title, description, tags, resource, timestamp) is optional. The full v0.1 specification fits on a single page.

Here's what an OKF concept file looks like:

In Markdown (if you still don't use it, start!):

---
type: metric
title: Customer Acquisition Cost
description: Total paid media spend divided by new customers acquired in the same period
tags: [performance, paid-media, finance]
---

# Customer Acquisition Cost (CAC)

CAC is calculated monthly by dividing total paid media spend by the number of net new customers
acquired within that month. It excludes organic and referral traffic.

Related: [Lifetime Value](./ltv.md), [ROAS](./roas.md)

That's a knowledge concept. Bundle dozens of these together in a folder, add an index file, point your AI agent at it — and now your agent has a single canonical source of truth about what your business knows, not a fragmented mess of Notion pages, Slack threads, and decade-old Google Docs that nobody updates.

The design philosophy, as stated in the spec itself, is that knowledge should be "readable by humans without tooling, parseable by agents without bespoke SDKs, and diffable in version control." Three requirements, all of which most company knowledge assets currently fail at catastrophically.
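To make the "parseable without bespoke SDKs" point concrete, here is a minimal sketch of reading a concept file with nothing but the Python standard library. The function name parse_concept is my own, and the parser is deliberately simplified: it assumes flat key: value pairs and inline [a, b] lists, not the full YAML grammar. The only rule it enforces is the one the spec requires, a non-empty type field.

```python
def parse_concept(text):
    """Split a concept file into (frontmatter dict, Markdown body).

    Assumption for this sketch: frontmatter holds only flat `key: value`
    pairs and inline `[a, b]` lists. A real bundle may need a full YAML parser.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---' frontmatter delimiter")

    # Find the line that closes the frontmatter block.
    end = None
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            end = i
            break
    if end is None:
        raise ValueError("missing closing '---' frontmatter delimiter")

    meta = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"cannot parse frontmatter line: {line!r}")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as [performance, paid-media]
            meta[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = value

    # `type` is the only field the article describes as required.
    if not meta.get("type"):
        raise ValueError("frontmatter must include a non-empty 'type' field")

    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


SAMPLE = """---
type: metric
title: Customer Acquisition Cost
tags: [performance, paid-media, finance]
---

# Customer Acquisition Cost (CAC)

CAC is calculated monthly.
"""

meta, body = parse_concept(SAMPLE)
print(meta["type"], meta["tags"])
```

If a file has no type, the function raises a ValueError instead of guessing, which is the behavior you want from anything that feeds an agent.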

What it isn't: a search ranking signal. I want to be clear about this because I've already seen the SEO Twitter discourse getting confused. OKF is not a replacement for schema.org structured data. Google's own documentation says as much. For external AI crawlers and search visibility, llms.txt remains the routing layer. For SEO, structured data and quality content still apply. OKF is an internal layer — it's what your own agents read so they operate from shared, accurate knowledge rather than re-deriving facts from scratch on every query.

The Karpathy Connection: Why This Isn't Coming From Nowhere

To understand why OKF matters, you need to understand what Andrej Karpathy was pointing at with his LLM Wiki pattern a couple of months back.

Karpathy's insight was essentially this: most of us are using AI wrong. We treat it like a search engine — ask a question, get an answer, repeat. Every query starts from zero. The LLM re-reads the same documents, re-derives the same logic, and produces slightly different answers depending on the day. There's no accumulation. No memory. No compounding.

His proposed fix was structurally simple: let the LLM compile your knowledge into a structured wiki first, then answer questions from that compiled artifact. The analogy he uses is exactly right — source code doesn't run directly; it gets compiled once into an optimized binary and then runs efficiently on demand. Your knowledge should work the same way.

OKF is, in many ways, Google's answer to formalizing that pattern. Where Karpathy's LLM Wiki is a personal workflow running against tools like Claude Code in a local folder, OKF is the organizational-scale, vendor-neutral, open spec version of the same idea. Both are rooted in the same observation: the future of AI isn't chatbots answering questions from the open internet. It's agents operating on curated, structured, internally-maintained knowledge.

For marketers, this is a significant mental shift. We've spent years optimizing for how the internet finds us. The next challenge is optimizing for how AI agents understand us — and agents have a different set of requirements than crawlers.

What I've Seen With Clients: The Knowledge Chaos Problem Is Real

Here's something I've noticed consistently running the team at Codedesign across our client portfolio in Europe and the US.

When we onboard a new client — say, a mid-market SaaS company or an e-commerce brand doing €20M+ — one of the first things we do is try to understand how they define their key metrics. And almost universally, what we find is that there is no single answer. The CFO has a different definition of CAC than the CMO. The sales ops team is calculating LTV on a different cohort window than the finance team. The agency brief says "conversion rate" but nobody has agreed on which step in the funnel that refers to.

This is organizational knowledge chaos. It's been manageable in a human-driven world because smart people ask clarifying questions and eventually converge. In an AI-agent-driven world, it becomes a critical failure mode. If you deploy an AI agent against your customer data and your metrics aren't defined in a single canonical source — your agent will make things up, or average across conflicting definitions, or quietly apply the wrong formula and nobody will notice until the numbers are embarrassingly wrong in a board presentation.
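A rough way to surface this before an agent does is to collect each team's definition of a term and flag every term with more than one distinct wording. The sketch below does exactly that. The teams, terms, and definitions are hypothetical illustration data, and find_conflicts is a name I made up for the example.

```python
from collections import defaultdict

# Hypothetical definitions gathered from different teams (illustrative only).
collected = [
    ("CAC", "finance", "Total sales and marketing spend / new customers"),
    ("CAC", "marketing", "Paid media spend / new customers"),
    ("MQL", "marketing", "Lead that requested a demo"),
    ("MQL", "sales ops", "Lead that  requested a demo"),
]


def find_conflicts(rows):
    """Return {term: {definition: [teams]}} for terms with 2+ distinct definitions."""
    by_term = defaultdict(lambda: defaultdict(list))
    for term, team, definition in rows:
        # Normalize case and whitespace so cosmetic differences don't count.
        normalized = " ".join(definition.lower().split())
        by_term[term][normalized].append(team)
    return {term: dict(defs) for term, defs in by_term.items() if len(defs) > 1}


conflicts = find_conflicts(collected)
for term, variants in conflicts.items():
    print(f"{term}: {len(variants)} competing definitions")
    for definition, teams in variants.items():
        print(f"  {', '.join(teams)}: {definition}")
```

String comparison only catches differences in wording, not two teams using the same words for different calculations, so treat the output as a list of conversations to have rather than a verdict.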

One of our clients in the B2B software space — I won't name them, but they sell supply chain tooling to enterprise buyers in the Nordics — ran into exactly this when they started using AI to generate weekly performance reports. The agent was pulling data from their GA4 integration, cross-referencing CRM data, and summarizing attribution. Looked great. The problem was that the agent's definition of "qualified lead" didn't match their sales team's definition. For three weeks, leadership was looking at numbers that were systematically overstating their pipeline quality by about 40%.

The fix wasn't a better AI model. It was better knowledge structure. We helped them build what is essentially an OKF bundle before we even knew OKF was a thing — a Markdown-based definitions repository that every agent and every analyst now reads first. The reporting issues went away immediately.

OKF is the formalization of that fix.

The Agentic Web and What It Means for Your Marketing Presence

Let me zoom out for a moment, because there's a bigger shift happening that OKF is just one signal of.

We are moving from a web designed for humans to a web that increasingly needs to serve agents. Right now, most company websites are architecturally built for human comprehension — visual hierarchy, persuasive copywriting, conversion-optimized flows. That's still important. But AI sales-research agents are increasingly operating inside CRMs and procurement tools, fetching vendor sites and summarizing them for decision-makers. If your website's structure makes it hard for an agent to extract the relevant information — pricing tier comparisons, integration compatibility, customer case studies — then the agent gives a muddier or more negative summary than your competitor who has structured things better.

llms.txt is the first layer of this: a public routing file that tells AI crawlers what your most important pages are. OKF is the second layer: an internal knowledge bundle that your own agents run on. The third layer — still emerging — is exactly what's suggested in the video that prompted me to write this piece: selling proprietary knowledge bundles directly to AI systems.

Think about that seriously. If your company has accumulated genuine expertise — in your vertical, about your processes, about your customers — that expertise has historically been locked inside your team's heads or buried in documents nobody reads. OKF opens the architectural possibility of packaging that knowledge as a structured, machine-readable asset and licensing it. It's not science fiction; it's the logical endpoint of what "knowledge economy" actually means in a world where agents are the primary consumers of structured information.

We've been writing about the shift toward Generative Engine Optimization (GEO) at Voice of Experts — the idea that the game is no longer just ranking in traditional search, but being cited, referenced, and trusted by AI systems. OKF adds another dimension to that conversation: it's not just about being findable by AI, it's about being legible to AI in a way that accurately represents your expertise.

How to Actually Build Your First OKF Bundle

This is the part most articles skip, so let me be practical.

Start small. You don't need a hundred concept files. You need maybe ten to twenty to get value from this immediately.

Step 1: Audit your knowledge chaos. List the five to ten terms in your business that different stakeholders define differently. CAC, LTV, conversion rate, MQL, active user — whatever applies to your context. These are your first concept files.
Step 2: Write one Markdown file per concept. Keep it simple. The required field is just type. Add a title, description, and a plain-language explanation that any new team member (or AI agent) could understand. Link related concepts to each other with standard Markdown links.
Step 3: Add an index.md. OKF supports optional index files for progressive disclosure — essentially a README that tells an agent where to start when navigating the bundle. Write it as you'd write an onboarding document: "Start here. The most important concepts are X, Y, Z."
Step 4: Version it in Git. This is non-negotiable. The ability to diff knowledge over time — to see when a definition changed and why — is one of OKF's core value propositions. Put it in a repo.
Step 5: Point your agents at it. Whatever AI tooling you're using — whether it's Claude, Gemini, internal RAG pipelines, or a custom agent setup — configure it to read the OKF bundle first before operating on any data. This is your agent's grounding document.
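Steps 2, 3, and 5 can be sketched in a few dozen lines of Python. This is an illustration under stated assumptions, not a reference implementation: the function names (read_frontmatter, load_bundle, build_index, grounding_context), the index layout, and the idea of concatenating everything into one block of grounding text are my own choices, and the frontmatter reader only understands flat key: value pairs.

```python
import tempfile
from pathlib import Path


def read_frontmatter(text):
    """Tiny frontmatter reader: flat `key: value` pairs only (a simplification)."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:]).strip()
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # no closing delimiter: treat the file as plain Markdown


def load_bundle(root):
    """Return {relative_path: (meta, body)} for every concept file that has a `type`."""
    concepts = {}
    for path in sorted(Path(root).rglob("*.md")):
        if path.name == "index.md":
            continue
        meta, body = read_frontmatter(path.read_text(encoding="utf-8"))
        if meta.get("type"):
            concepts[path.relative_to(root).as_posix()] = (meta, body)
    return concepts


def build_index(concepts):
    """Generate index.md text (layout is my own choice, not mandated by the spec)."""
    lines = ["# Start here", ""]
    for rel, (meta, _) in concepts.items():
        entry = f"- [{meta.get('title', rel)}](./{rel}) ({meta['type']})"
        if meta.get("description"):
            entry += f": {meta['description']}"
        lines.append(entry)
    return "\n".join(lines) + "\n"


def grounding_context(index_text, concepts):
    """Join the index and every concept body into one text an agent reads first."""
    parts = [index_text]
    for rel, (_, body) in concepts.items():
        parts.append(f"<!-- {rel} -->\n{body}")
    return "\n\n".join(parts)


# Demo against a throwaway directory with two hypothetical concept files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "cac.md").write_text(
        "---\ntype: metric\ntitle: Customer Acquisition Cost\n"
        "description: Paid media spend divided by new customers\n---\n\n"
        "# Customer Acquisition Cost (CAC)\n\nRelated: [Lifetime Value](./ltv.md)\n",
        encoding="utf-8",
    )
    (root / "ltv.md").write_text(
        "---\ntype: metric\ntitle: Lifetime Value\n---\n\n# Lifetime Value (LTV)\n",
        encoding="utf-8",
    )
    concepts = load_bundle(root)
    index_text = build_index(concepts)
    (root / "index.md").write_text(index_text, encoding="utf-8")
    context = grounding_context(index_text, concepts)

print(index_text)
```

How you hand the resulting text to your agent depends on your tooling (a system prompt, a retrieval index, a project folder), so the sketch stops at producing the string.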

Tools that work well for maintaining OKF bundles include Obsidian (for human-friendly editing with graph visualization), Git (for versioning and diffing), and NotebookLM or Gemini for using AI to help you write and expand the concept files themselves. There's something beautifully recursive about using an LLM to build the knowledge structure that other LLMs will use.
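To show what "diffable" buys you, here is a small sketch using Python's standard difflib module to compare two versions of a concept file. Both versions of the definition are hypothetical, as is the file name qualified-lead.md. In a real repository, git diff gives you the same kind of output for free.

```python
import difflib

# Two hypothetical versions of the same concept file.
old = """---
type: metric
title: Qualified Lead
---

A lead is qualified once it submits a demo request form.
""".splitlines(keepends=True)

new = """---
type: metric
title: Qualified Lead
---

A lead is qualified once sales has confirmed budget and timeline on a call.
""".splitlines(keepends=True)

# unified_diff yields the familiar -/+ lines used in code review.
diff = list(
    difflib.unified_diff(
        old, new, fromfile="a/qualified-lead.md", tofile="b/qualified-lead.md"
    )
)
print("".join(diff))
```

The changed sentence shows up as one removed line and one added line, so anyone reviewing the commit can see exactly when a definition moved and ask why.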

The Honest Marketing Verdict

OKF v0.1 is an early spec. Google calls it "a starting point, not a finished standard," and that's accurate. It won't change your Google rankings. It won't immediately show up in any marketing dashboard metric.

But the organizations that move early on knowledge structuring — that invest now in making their expertise machine-readable, agent-legible, and internally consistent — will have a compounding advantage as the agentic web matures. The gap between "organizations whose knowledge is structured" and "organizations whose knowledge is scattered across docs nobody reads" will widen quickly.

This is a pattern I've seen play out repeatedly in digital marketing. The companies that adopted structured data early got schema markup advantages before competitors noticed. The companies that took mobile-first seriously in 2013 were not surprised when Google announced mobile-first indexing in 2016. The companies that built first-party data infrastructure before iOS 14 didn't panic when App Tracking Transparency and third-party cookie restrictions started eroding their tracking data.

OKF is one of those early signals. It's Markdown files today. It's competitive infrastructure tomorrow.

At Codedesign, we're already integrating knowledge structuring into our client onboarding process and our internal agent workflows. It's not a massive lift — but it requires someone to actually care enough to do it systematically. That's the real barrier: not technical complexity, but organizational discipline.

What's Your Take?

I'm curious where you are on this. Is your organization already thinking about how you structure knowledge for AI agents? Have you experimented with llms.txt, knowledge wikis, or anything like OKF? Are you skeptical this matters at all right now?

Drop a comment below, or reach out to us at Codedesign — I'd love to hear how different companies are navigating the shift from the human web to the agentic web. And if you want to go deeper on GEO, AI-agent readiness, and what it means for your marketing strategy, the team at Voice of Experts has been covering this space closely.

The knowledge economy just got a new file format. The question is whether your business is ready to use it.

Author Bio

Bruno Gavino is the Founder & CEO of Codedesign, an international digital marketing agency helping growth-stage companies build data-driven, AI-ready marketing systems.

This article was written by Bruno with the help of Gemini, Claude, and Codedesign Copywriting AI Agents.

Bruno's digital footprint
About me: https://about.me/bruno.gavino or https://www.linkedin.com/in/brunogavino/
More Deep Dives & Writing: https://substack.com/@brunogavino
Agency & Insights: https://codedesign.org/
Bruno's Podcast: https://www.voiceofexperts.com/
Track your LLM Visibility: https://llmsearchconsole.com/
Blog content about Agentic AI: https://articles.llmsearchconsole.com/

Wednesday, 17 June 2026