I rebuilt this blog recently. Not a redesign. The old posts are still here, same URLs, same markup. What changed is that every page now carries a full structured data graph and every element is annotated with microformats. Most of the new posts were co-written with an AI that has a checklist of things it is not allowed to sound like, plus a set of voice rules. The AI drafts, I edit. The structured data handles the other side: helping machines understand the content.
Structured data that pulls its weight
Every page has a JSON-LD block in the <head>. Not a plugin. A 120-line Hugo partial that builds a @graph depending on what kind of page it is.
For a blog post, the graph has four Schema.org nodes: a WebSite, a Person (me), a WebPage for the URL, and a BlogPosting with headline, dates, word count, tags as keywords, and a cover image. Breadcrumbs get a BreadcrumbList with position numbers walked from the page’s ancestors. Tag pages get CollectionPage with an ItemList. The home page gets just WebSite and Person.
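To make the shape concrete, here is a rough sketch of that post graph, written in Python rather than as the Hugo partial the site actually uses. The example.com URLs, the function names, and the fragment names #website, #webpage, #post and #breadcrumbs are assumptions for illustration. So is the choice to link the WebPage to its BreadcrumbList.

```python
import json

BASE = "https://example.com"  # hypothetical site URL, not the real blog


def breadcrumb_list(ancestors, page_url):
    """Build a BreadcrumbList whose positions count up from the site root."""
    items = [
        {"@type": "ListItem", "position": i, "name": name, "item": url}
        for i, (name, url) in enumerate(ancestors, start=1)
    ]
    return {
        "@type": "BreadcrumbList",
        "@id": f"{page_url}#breadcrumbs",
        "itemListElement": items,
    }


def post_graph(page_url, headline, published, modified, word_count, tags,
               image, ancestors):
    """Assemble the nodes for one blog post. Nodes point at each other by @id."""
    person = {"@type": "Person", "@id": f"{BASE}/#author", "url": f"{BASE}/"}
    website = {
        "@type": "WebSite",
        "@id": f"{BASE}/#website",
        "url": f"{BASE}/",
        "publisher": {"@id": person["@id"]},
    }
    webpage = {
        "@type": "WebPage",
        "@id": f"{page_url}#webpage",
        "url": page_url,
        "isPartOf": {"@id": website["@id"]},
        "breadcrumb": {"@id": f"{page_url}#breadcrumbs"},
    }
    posting = {
        "@type": "BlogPosting",
        "@id": f"{page_url}#post",
        "headline": headline,
        "datePublished": published,
        "dateModified": modified,
        "wordCount": word_count,
        "keywords": tags,
        "image": image,
        "author": {"@id": person["@id"]},
        "mainEntityOfPage": {"@id": webpage["@id"]},
    }
    return {
        "@context": "https://schema.org",
        "@graph": [website, person, webpage, posting,
                   breadcrumb_list(ancestors, page_url)],
    }


if __name__ == "__main__":
    graph = post_graph(
        f"{BASE}/posts/hello/", "Hello", "2024-01-01", "2024-01-02", 500,
        ["hugo"], f"{BASE}/cover.png",
        [("Home", f"{BASE}/"), ("Posts", f"{BASE}/posts/")],
    )
    print(json.dumps(graph, indent=2))
```

Every `{"@id": ...}` pointer here has a matching node in the same graph. That is the property a refactor can quietly break.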
The interesting part isn’t the markup. It’s the validation script that runs in CI on every push:
    # refs: every @id pointed at on this page; ids_defined: every @id a node declares
    dangling = refs - ids_defined
    if dangling:
        errors.append(f"{rel}: dangling @id refs: {sorted(dangling)}")

    # Post pages must carry a BlogPosting node
    if "/posts/" in str(rel) and rel.name == "index.html":
        if "BlogPosting" not in types_seen:
            errors.append(f"{rel}: post missing BlogPosting node")
This exists because when an AI edits a template partial, it can silently break the @id chain. A node references #author but the Person node got dropped during a refactor. A browser renders the page exactly as before, so without a gate you’d never notice until Google starts showing bare URLs instead of article cards.
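A self-contained sketch of how such a check might collect its inputs, assuming the JSON-LD lives in script tags of type application/ld+json. The class and function names are hypothetical, and so is the rule that a dict holding only an @id counts as a reference. The real script may do all of this differently.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._in_jsonld = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buf))
            self._in_jsonld = False


def walk(node, ids_defined, refs, types_seen):
    """Recursively sort @id values into definitions and references."""
    if isinstance(node, list):
        for item in node:
            walk(item, ids_defined, refs, types_seen)
    elif isinstance(node, dict):
        node_id = node.get("@id")
        if node_id is not None:
            # Assumption: a dict with only "@id" is a pointer to another node;
            # a dict with more keys is a definition.
            if set(node) == {"@id"}:
                refs.add(node_id)
            else:
                ids_defined.add(node_id)
        node_type = node.get("@type")
        if isinstance(node_type, str):
            types_seen.add(node_type)
        elif isinstance(node_type, list):
            types_seen.update(node_type)
        for value in node.values():
            walk(value, ids_defined, refs, types_seen)


def check_page(html, rel, is_post):
    """Return a list of error strings for one rendered page."""
    parser = JsonLdExtractor()
    parser.feed(html)
    ids_defined, refs, types_seen = set(), set(), set()
    errors = []
    for block in parser.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError as exc:
            errors.append(f"{rel}: invalid JSON-LD: {exc}")
            continue
        walk(data, ids_defined, refs, types_seen)
    dangling = refs - ids_defined
    if dangling:
        errors.append(f"{rel}: dangling @id refs: {sorted(dangling)}")
    if is_post and "BlogPosting" not in types_seen:
        errors.append(f"{rel}: post missing BlogPosting node")
    return errors
```

In CI, a wrapper would run check_page over every built HTML file and exit non-zero if any errors come back.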
The same pages carry IndieWeb microformats. Every post is an h-entry with dt-published, p-category, p-name, u-url. The home page has a hidden h-card with my name and rel="me" links to GitHub, GitLab, Codeberg, and Mastodon. As long as those profiles link back, an IndieAuth parser can verify that github.com/lvmbdv and blog.lvmbdv.dev belong to the same person.
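The microformats could get the same kind of gate. A minimal sketch, assuming each post wraps its content in one element with class h-entry and that the markup closes its tags properly. The checker class and function name are hypothetical and not part of the site's build.

```python
from html.parser import HTMLParser

# The four properties the post says every h-entry carries.
REQUIRED = {"dt-published", "p-category", "p-name", "u-url"}

# Void elements never get an end tag, so they must not change nesting depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class HEntryChecker(HTMLParser):
    """Record which required property classes appear inside each h-entry."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # nesting depth inside the current h-entry, 0 = outside
        self.found = []   # one set of property classes per h-entry seen

    def handle_starttag(self, tag, attrs):
        classes = set((dict(attrs).get("class") or "").split())
        if self.depth == 0 and "h-entry" in classes and tag not in VOID:
            self.found.append(set())
            self.depth = 1
        elif self.depth > 0 and tag not in VOID:
            self.depth += 1
        if self.depth > 0:
            self.found[-1] |= classes & REQUIRED

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID:
            self.depth -= 1


def missing_properties(html):
    """Return, for each h-entry in the page, the sorted list of missing properties."""
    checker = HEntryChecker()
    checker.feed(html)
    return [sorted(REQUIRED - props) for props in checker.found]
```

An empty inner list means that h-entry has all four properties. The depth counting is naive: HTML that leans on optional end tags, like an unclosed p or li, would throw it off.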
The webmention endpoint is wired but commented out. I haven’t set up the receiver yet. The markup is there, waiting.
None of this is visible if you’re reading normally. That’s the point. Structured data is infrastructure. It sits in the <head> and in hidden spans. The only time you think about it is when something breaks, and the validation script makes sure you find out before you deploy.
Giving an AI a voice
Raw LLM output has a smell. You know it when you read it. The Wikipedia “Signs of AI writing” guide documents 29 patterns that show up reliably: significance inflation, copula avoidance, elegant variation, em-dash abuse, rule-of-three padding, signposting. I turned them into a prompt.
Strip all that and you get clean text. But clean isn’t the same as having a voice. Sterile prose is still obviously not human: every sentence the same length, no opinions, no first-person, no humor. It reads like a press release.
The prompt has a “personality and soul” section. Vary your rhythm. Have opinions. Acknowledge complexity. Use “I” when it fits. Let some mess in. Be specific about feelings instead of reaching for generic adjectives.
Beyond the patterns, the prompt has rules about how I write. No signposting: don’t announce what you’re about to do, just do it. Start with the thing itself, not context. Short sentences are fine. Fragments are fine. If a sentence sounds like Wikipedia, rewrite it. Some rules are hard (no em-dashes, ever). Others are softer and get caught in editing.
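The hard rules are the kind a script can enforce before a human reads anything. A small sketch, not the actual prompt or tooling: the signposting phrases are made-up examples, and sentence length is only a rough proxy for rhythm.

```python
import re

EM_DASH = "\u2014"

# Hypothetical signposting phrases; the real prompt's wording is not shown here.
SIGNPOSTS = ("in this post", "as we will see", "without further ado")


def lint(text):
    """Return (line number, problem) pairs for hard-rule violations."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if EM_DASH in line:
            problems.append((lineno, "em-dash"))
        lowered = line.lower()
        for phrase in SIGNPOSTS:
            if phrase in lowered:
                problems.append((lineno, f"signposting: {phrase!r}"))
    return problems


def sentence_lengths(text):
    """Word count per sentence. A flat list of similar numbers suggests dead rhythm."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]
```

A check like this only catches the mechanical part. The softer rules still need someone reading the draft.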
The workflow is: agent drafts, I read and flag what sounds wrong, it rewrites. I flag again. Usually two or three passes. Sometimes the agent writes a sentence that’s technically clean but rhythmically dead, and the fix is to break it into two or add a fragment. It doesn’t always understand why the change works, but it can apply the pattern once I name it.
Voice calibration helps. I gave the agent samples of my writing so it knows I open with short declarative sentences, that I use “I” without ceremony, that my paragraphs don’t build to thesis statements. It’s not perfect. The first draft is never publishable. But the distance between draft and done shrinks.
What this produced
The SSD autopsy post started as an agent draft. I gave it dmesg output, photos, backstory. It wrote a first pass. I cut the throat-clearing, sharpened the jokes, added the paragraph about why I didn’t just read about the failure but wanted to see it. The agent reflowed the prose and checked the structured data graph.
The Albion black market post was similar. I had the idea and the mechanics. The agent drafted the explanation, I tightened the voice, the agent cleaned up the formatting and validated the build.
The split that emerged: I do the thinking. The agent does the rest. It handles structure, research, the first draft, formatting, validation. My job is having the idea, steering the draft toward my voice, and knowing when a sentence lands wrong. The agent never gets the voice right on its own. But it’s getting closer, and the distance between what it produces and what I publish shrinks every time. I just hope my brain doesn’t shrink along with it.
