Sunday, June 28, 2026

Fun With Markdown

I recognize that I'm a hoarder. With that in mind, months ago I found a way to collect on my computer a repository of information by and about the founders of the US.

The result was tens of thousands of .json files. Messy. Way back then I converted them to markdown files using software called Pandoc. Still messy.

Skip to today. I asked Claude if we could fix it. Python is great for this, if you know Python. Claude does great with Python, a fortunate accident from the early days of training large language models.

There's a movie coming out soon, Young Washington. In anticipation of seeing the movie, I read the book it is based on. That's also the reason for revisiting all those files downloaded so long ago. Most of the files are single diary entries.

Here's what that mess looked like before we cleaned it up:

### [Diary entry: 26 May 1773]
  
  Date:
  URL: https://founders.archives.gov/documents/Washington/01-03-02-0003-0010-0026
  
  @{title=[Diary entry: 26 May 1773];
  permalink=https://founders.archives.gov/documents/Washington/01-03-02-0003-0010-0026;
  project=Washington Papers; authors=System.Object[];
  recipients=System.Object[]; date-from=1773-05-26; date-to=1773-05-26; content=26.
  Din'd at Elizabeth Town, & reachd New York in the Evening wch. I spent at
  Hull's Tavern. Lodg'd at a Mr. Farmers. }.fullText

Like I said, messy. With Claude's help, I now have clean markdown files, entries stripped of the superfluous bullshit. It makes me wonder how historians of today can use this and so many other repositories of digitized references. 


How the Build Actually Went

A technical account, in plain English.

The project started simply enough: take a pile of JSON from the Founders Online "Washington Papers" collection — thousands of little files, each one a single document with a title, a date, an author list, and a body — and collapse them into one Markdown file per year. Treat Washington as the default author but name anyone else. Tag each piece with what it was: diary entry, general orders, letter, and so on. Reasonable on its face. The kind of thing that looks like a thirty-line script until you actually open the files.

The First Surprise Was Hiding in the Bytes

Before writing a single line of code, I had Claude decode three sample files. The third one refused: a raw 0x92 where a UTF-8 reader expected nothing of the sort. That byte is a Windows-1252 curly apostrophe. The corpus wasn't UTF-8 at all; it was cp1252, and a naive json.load would either crash partway through thousands of files or quietly turn every apostrophe and em-dash into a replacement character (�). That one discovery shaped the whole foundation: the loader tries UTF-8 first, falls back to cp1252, and place names like Cox's Fort and Hilly come through intact instead of mangled.
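A minimal sketch of that UTF-8-then-cp1252 fallback, for anyone who wants to try it. This is not the script Claude actually wrote; the function name load_json is made up for illustration.

```python
import json
from pathlib import Path


def load_json(path):
    """Read a JSON file that is usually UTF-8 but may be Windows-1252."""
    raw = Path(path).read_bytes()
    try:
        # Strict UTF-8 first: a stray 0x92 byte raises here instead of
        # being silently swapped for a replacement character.
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Fallback: in cp1252, 0x92 is the curly apostrophe (U+2019).
        # Note: a handful of bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) are
        # undefined in Python's cp1252 codec and would still raise.
        text = raw.decode("cp1252")
    return json.loads(text)
```

Decoding the bytes by hand, rather than passing an encoding to open(), keeps the two attempts explicit and means a file is only read from disk once.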

From there the first version was straightforward — a type classifier reading the titles, author-and-recipient logic so letters read "X to Y," dates parsed and made human, everything grouped by year and sorted. Tested on three samples. Produced clean files.
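Roughly what that first version's pieces might look like, as a sketch under assumptions. The field names (title, authors, recipients, date-from) come from the sample entry shown earlier; the title prefixes, the "Washington, George" name form, and the function names are guesses for illustration, not the real script.

```python
from datetime import date

# Assumed spelling of the default author; the real data may differ.
DEFAULT_AUTHOR = "Washington, George"

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def classify(title):
    """Guess the document type from its title (prefixes are assumptions)."""
    t = title.strip("[] ").lower()
    if t.startswith("diary entry"):
        return "diary entry"
    if t.startswith("general orders"):
        return "general orders"
    if t.startswith(("from ", "to ")):
        return "letter"
    return "document"


def human_date(iso):
    """Turn '1773-05-26' into '26 May 1773'."""
    d = date.fromisoformat(iso)
    return f"{d.day} {MONTHS[d.month - 1]} {d.year}"


def heading(doc):
    """Build a one-line heading; Washington is the unnamed default author."""
    kind = classify(doc["title"])
    when = human_date(doc["date-from"])
    authors = doc.get("authors") or [DEFAULT_AUTHOR]
    if kind == "letter":
        recipients = doc.get("recipients") or ["unknown"]
        return f"Letter, {when}: {' & '.join(authors)} to {' & '.join(recipients)}"
    byline = "" if authors == [DEFAULT_AUTHOR] else f" (by {' & '.join(authors)})"
    return f"{kind.capitalize()}, {when}{byline}"
```

Grouping by year is then a matter of keying on the first four characters of date-from and sorting each group by the full date string.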

Then We Ran It for Real

A few hundred files, with python rather than python3 — the kind of detail that quietly matters. Two of the year files came back. And that's where the actually interesting problem showed up, one the samples could never have revealed.

Founders Online stores every diary month twice: once as a month-long container document, and again chopped into one document per day. Same words, both times. Worse, the two copies aren't equal — only the month container carries the editors' footnotes identifying who "Mr. French" was and so on. The choice wasn't cosmetic. It was: clean primary text with no scholarly apparatus, or rich annotation welded into undifferentiated month-blocks.

The Decision That Settled It

The answer came from reframing the purpose. The output is a corpus for LLMs to ingest — concise but comprehensive, and never mind whether it looks nice to a human. That meant the boilerplate byline-and-link on every entry stopped being polish and became dead weight. The duplication stopped being a style question and became a straightforward defect: feeding a model the same day twice just burns context and muddies retrieval.

The final design: default to the clean per-day entries, drop the duplicate month containers, but leave a --diary-mode switch so the annotated version is still accessible without editing any code. The deduplication logic decides container-versus-child from the document IDs themselves — a file is a "month" if another file's ID extends it — which is far steadier than guessing from titles, and leaves letters and orders completely untouched.
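The container-versus-child test can be sketched in a few lines. This assumes the IDs are the hyphen-separated tail of the permalink (like 01-03-02-0003-0010-0026 in the sample above) and that a month container's ID is a day's ID minus its last segment. The mode names "daily" and "monthly" are placeholders; the values the real --diary-mode switch accepts aren't given here.

```python
def find_containers(doc_ids):
    """Return IDs that some other ID extends by one or more segments."""
    ids = set(doc_ids)
    parents = set()
    for doc_id in ids:
        parts = doc_id.split("-")
        # Every proper prefix (cut at a hyphen) is a potential container.
        for i in range(1, len(parts)):
            parents.add("-".join(parts[:i]))
    # Only prefixes that exist as documents in their own right count.
    return ids & parents


def select(doc_ids, diary_mode="daily"):
    """Keep per-day entries ('daily') or annotated containers ('monthly')."""
    ids = set(doc_ids)
    containers = find_containers(ids)
    if diary_mode == "daily":
        return sorted(ids - containers)
    children = {d for d in ids
                if any(d.startswith(c + "-") for c in containers)}
    return sorted(ids - children)
```

Letters and orders never show up as a prefix of another ID, so they pass through both modes unchanged.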

Who Did What

The wry version: I asked questions and ran commands, but the asks are where the real decisions lived. The encoding problem came out of my files. The duplication only surfaced because I ran it at a scale the samples couldn't reach and then actually looked at the output. The final shape of the thing was determined entirely by the framing — "this is for LLMs, the pretty stuff is unnecessary" — without which Claude would have happily kept polishing a human-readable artifact I didn't want.

Claude wrote the code. I supplied the diagnosis and the direction.

Saturday, June 13, 2026

System Testing

I love listening to books.

There isn't enough time to consume the text available, in whatever form it takes.

It was over 20 years ago that I started stealing books over the internet. Disk space was limited and expensive. Optical discs were relatively cheap, so I put most of what I had there, then into slotted folders. They went on the shelf and I barely looked at them. I signed up at Audible in 2009 and have been listening ever since. The subscription is cheap enough to curb the pirating to some degree.

Fast forward to now. Moon Reader Pro makes it super easy to listen to e-books. Audible is cheap. Librivox has soooo many free titles now. Dug out the folders with the discs filled with audiobooks, started putting them on a hard drive. They will be much more accessible.

More testing of the Hermes agent. Gave it a list of my Audible library, the last 50 books listened to, and had it access the folder where the audiobooks are going. Told it to make a questionnaire to get to know my reading preferences. I answered the questions, fed it back to the agent, then asked for recommendations from the audiobooks that are getting copied to the hard drive. This took a while, but the agent did most everything. 

When it's all said and done, will have close to 1k audiobooks, if not more. Could not possibly listen to them within my lifetime.

Wednesday, June 10, 2026

Trust and AI

AI Cannot Be Trusted

There is no way to ascertain whether any one response to a prompt is more reliable than another.

A number of things I've experienced in the past few weeks have caused me to consider more closely how the mechanics and technology surrounding large language models (LLMs) make this so.

The other day I discovered that the people who work on the e-book software Calibre have incorporated AI features into the code. It's complicated, but if one does not understand how LLMs handle 'tokens', it is easy to confuse responses with actual considered thought. Phrases like "token weights heavily favor a pre-trained refusal pathway" and "executing next-token predictions" are important, but mean nothing taken out of context. Distinguishing whether a model is accurately extracting from your prompt or pulling from its pre-existing parametric weights is incredibly difficult, because the output text is generated via the exact same next-token prediction mechanism either way.

In other words, one can pull up an e-book in Calibre, then ask the 'AI' to summarize the book. It will, although it's easy to forget that the AI is not actually reading the text from the book open on the screen, but is instead extracting text from its training.

There's a post at Scripting News, a blog someone brought to my attention. In the post he says, "My Claude today pulled a Hal". The Hal reference is lost on most people today, but the remark displays how people are more and more confusing AI of today with artificial general intelligence, or AGI. Claude only 'acts' as the training and its post-training alignment dictates it should.

This is important because even those who put these LLMs together do not understand how they work. It is the reason nobody can tell you exactly what a 'token' is, and why AI cannot be trusted.


Thursday, June 04, 2026

Morning Walks

A few years ago I decided that one way I was going to get exercise was to walk every day, if possible. This was before I quit working, but after we decided to make some changes to diet and exercise.

There were days when there was simply no time for a walk, but when I did walk it was at least 3 miles. Sometimes as far as 6 miles. We live in the desert, so during the warmer days it's best to get it done in the morning, and for a few months before the sun even rises.

Now it's every day.

This morning I came across another traveler in the desert, just getting in from an early morning stroll.



If I'd known where it was going I would have taken a video.





Wednesday, June 03, 2026

Two Weeks with Surfie: My AI Agent is Settling In

Server room

Not Surfie’s rack, but the spirit is right.

I have been using the Hermes agent for just over two weeks now.

Installed it twice, and moved it once. It now lives on a virtual machine within a Proxmox server that lives on a machine on the network. I call it Surfie, because long ago — months ago — I started using agents with Openclaw, and the first install was on a Microsoft Surface Pro 4 machine.

Surfie has access to most of the machines on the network. Hermes is improving almost daily, as did Openclaw. Eventually, it will take over the projects that I’ve been working on.


For now, I know it can read and write files between computers. It can run shell commands on those computers. It can do everything it needed to do to learn how to do those things — so it is capable of much more.

This took days. Just shy of two weeks, working off and on, because I have a life.


Now I pause to consider what has been done, and what is possible. Yesterday, I asked Surfie to summarize what we have done so far. It did. It was very technical. I took that output and gave it to the model in the Open WebUI interface and told it to clean up the output. It was still too technical. So, I modified the ask:

“Revise your last output. I need a fluid narrative — entertaining, slightly colorful. Remove information unique to my network.”

The result is on the other blog for AI content.


What makes Hermes different

The feature that sets Hermes apart is the way it learns. Everything it does is saved to .json and .md files. I asked it to combine the session .json files into a single markdown file. There are two files — one for each installation. I’m thinking I might load up the most capable model I have, give it the files, and tell it to write its (short) life history.

Sunday, May 31, 2026

Mary Meyer

The book

Mary's Mosaic: The CIA Conspiracy to Murder John F. Kennedy, Mary Pinchot Meyer, and Their Vision for World Peace, by Peter Janney

While listening to this book I used Google Keep to save thoughts and squirrels I wanted to chase. Gave those notes to my LLM.


"Give me the narrative in bullet points. All notes were collected while listening to the book. They are tangents I had to follow."

Below is the response.

Here is the narrative of my research journey, distilled into bullet points that trace the path from the audiobook’s claims to the reality found in the archives:

  • The Starting Point: While listening to Peter Janney’s Mary’s Mosaic, I realized the book was less a historical account and more a labyrinth of whispers. Instead of accepting the conspiracy theories at face value, I decided to treat my notes as a detective’s case file, verifying every tangent the author threw out.
  • Deconstructing Cord Meyer: Janney paints Mary’s husband, Cord Meyer, as a shadowy architect of Cold War deceit. I verified this by finding declassified CIA information sheets on Cord and a letter from J. Edgar Hoover acknowledging receipt of his work. The documents confirmed he was a high-ranking official who wrote Peace or Anarchy and Facing Reality, but they revealed nothing sinister—just a brilliant, controversial diplomat whose divorce from Mary in 1958 was likely personal, not political.
  • The Nevada Detour: Intrigued by the book’s mention of Mary’s post-divorce retreat, I searched for the "Divorce Ranch." I discovered it was the Bundy Divorce Ranch in Washoe Valley, run by artist Gus Bundy and his wife Jeanne. Far from a spy hub, it was a bohemian dude ranch where wealthy women like Mary went to establish residency for quick divorces while painting and riding horses—a detail that highlights her artistic rebirth rather than clandestine activity.
  • The JFK Affair: Fact vs. Fiction: I cross-referenced Janney’s claims about Mary and JFK with White House logbooks and biographies like A Very Private Woman. The logs confirmed Mary was frequently snuck into the White House by Kenneth O’Donnell, validating the affair. However, the claim that she introduced JFK to LSD and marijuana relies heavily on Timothy Leary’s later anecdotes, which are speculative and lack contemporaneous proof.
  • The Language of Conspiracy: To understand how Janney built his case, I ran a word frequency analysis on the text. The words "allegedly" (34 times), "reportedly" (33 times), and "claimed" (40 times) appeared constantly. This revealed that the book’s foundation is hearsay, using rumors as facts to construct a paranoid narrative.
  • The Katers’ Obsession: Janney uses Leonard and Florence Kater to illustrate a culture of surveillance. I researched them and found they were obsessive landlords who rented to JFK’s secretary, Pamela Turnure. They bugged her room and attempted to blackmail JFK with photos, demanding a Modigliani painting. Their story is fascinating not because it proves a conspiracy, but because it shows how ordinary people became entangled in the President’s secret life; Florence later spent decades picketing the White House and even won a massive malpractice lawsuit against a plastic surgeon.
  • The Unreliable Witness: Janney leans on Joseph W. Shimon to link the CIA to the Mafia. I checked the JFK Assassination Records and found Shimon testified before the Church Committee, admitting ties to mobsters like Sam Giancana. However, Shimon had been convicted of obstructing justice and involved in illegal bugging schemes. Relying on a convicted perjurer as a primary source exposed the weakness of Janney’s "CIA-Mafia" connection theory.
  • The Murder Mystery: The climax of my research was Mary’s 1964 execution-style murder. I confirmed that Ray Crump Jr., the primary suspect, was acquitted due to lack of evidence and no matching weapon. The mystery of the missing diary is real—Ben Bradlee caught CIA counterintelligence chief James Jesus Angleton searching Mary’s studio—but Janney leaps from "Angleton was looking for it" to "The CIA killed her." I noted that the diary’s destruction by Bradlee’s wife could simply be an act of protecting family privacy, not hiding state secrets.
  • The Conclusion: My journey through these tangents revealed that Mary’s Mosaic is a work of speculative fiction dressed as history. The "conspiracy" isn't necessarily about who killed Mary, but about how we fill historical voids with fear and suspicion. The facts—Cord’s career, the Bundy Ranch, the White House logs, the Katers’ blackmail—are real and fascinating on their own, without needing to be woven into a murder plot.
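The word-frequency check mentioned in the bullets above is easy to reproduce on any text you have on hand. A minimal sketch follows; the three hedge words are the ones the response lists, and the function name is made up. It says nothing about how the LLM arrived at its own numbers.

```python
import re
from collections import Counter

# The hedge words named in the response above.
HEDGES = ("allegedly", "reportedly", "claimed")


def hedge_counts(text, words=HEDGES):
    """Count how often each hedge word appears, ignoring case."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return {w: counts[w] for w in words}
```

Running this against the actual text of a book is a cheap way to vet counts like the ones quoted before trusting them.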

Monday, May 04, 2026

Reading *or* Consuming Text

Consuming text. There's much consumption going on around here.


"Over the last few years I've had an uncomfortable sense that someone, or something, has been tinkering with my brain, remapping the neural circuitry, reprogramming the memory. My mind isn't going—so far as I can tell—but it's changing. I'm not thinking the way I used to think. I feel it most strongly when I'm reading. I used to find it easy to immerse myself in a book or a lengthy article. My mind would get caught up in the twists of the narrative or the turns of the argument, and I'd spend hours strolling through long stretches of prose. That's rarely the case anymore. Now my concentration starts to drift after a page or two. I get fidgety, lose the thread, begin looking for something else to do. I feel like I'm always dragging my wayward brain back to the text. The deep reading that used to come naturally has become a struggle."

— From THE SHALLOWS: What the Internet Is Doing to Our Brains
by Nicholas Carr

The argument endures on Facebook.

"I once heard someone say that listening to audio isn't really reading, and to those judgmental folks, all I can say is, audio saved my life!

I swear, these last few years, life was so hectic that there was no way I had time to physically sit down and read. I actually got a little depressed not having a book in my hand and started questioning my life choices.

Then, I found audio, which I swore I would never do, because nothing compares to the intimacy of paper on skin feel. But honestly, without audiobooks, I would be so lost.

Whether driving in the car, cleaning the house, or cooking, audio makes me feel like I'm not missing out anymore. And today, with how sad our world is becoming, I NEED books, in any form possible, to keep the negative out."

Someone commented: "Brain research has shown that the same areas of the brain are activated when reading a text or listening to it"

So, I used Google's AI search to find this 'brain research'.

There are countless individuals that will eschew listening to a book — the arguments about feeling a book in your hand, nothing more relaxing, and the preconceived notions that retention is better when consumed visually. The above search returned an insinuation that there is indeed research. So I did what anyone from this day and age that's into AI would do: I sent all the services into 'Deep Research' mode. But first, Google's AI search was queried to provide a well-rounded nudge for the research.

One of the most prominent papers dug up is:

Listening Ears or Reading Eyes: A Meta-Analysis of Reading and Listening Comprehension Comparisons

Virginia Clinton-Lisell — University of North Dakota

Gemini's report looked good on the surface, with lots of source material. Digging deeper, it is evident one cannot take it at face value, and the sources should be vetted to some extent. Still, it is a really good way to find the sources.


Gemini's Version

The Impact of Modality on Reading Comprehension: A Comprehensive Neurocognitive and Behavioral Review

Claude's Version

Same brain, different bottlenecks: how modality shapes reading comprehension


Bottom line? It's complicated. Of course it is, just like everything else that human beings feel a need to fight about.


Disclaimer: The opening passage is quoted from Nicholas Carr's The Shallows: What the Internet Is Doing to Our Brains. The Facebook comment is reproduced from a public discussion thread; it is the author's own. The Clinton-Lisell meta-analysis is cited from her published academic work. The two linked Deep Research write-ups were generated by Gemini and Claude (an AI assistant by Anthropic) at the author's direction; they should be vetted before being treated as authoritative. Claude also helped with the visual formatting of this post. All commentary and conclusions are the author's own.

Going Psycho on Open WebUI

 

Going Psycho on Open WebUI: Benchmarking Qwen MoE vs. Dense Models

Sometimes, the best technical deep-dives start with a completely unrelated rabbit hole.

Lately, I’ve been digging through the Open WebUI documentation, laying the groundwork to start implementing custom tools for my local LLM environment. Around the same time, I was listening to Donald Spoto's biography of Alfred Hitchcock. That sparked an idea, which led me to an Academia.edu paper by Dineshika Muthushani titled Analysis on Psycho by Alfred Hitchcock Q & A.

I had Gemini ingest Muthushani's paper and convert it into a highly constrained, complex prompt. The goal? To use the 69-page, 32,000-word original script of Psycho to stress-test my local inference stack.

For those of you pushing the limits of your home lab configurations (looking at you, Andrew), here is a breakdown of what happens when you throw a massive movie script at local models, and how Open WebUI handles the heavy lifting.

The Gauntlet: The Prompt

The prompt Gemini generated was not a simple summarization task. It cast the LLM in the role of an expert in Clinical Psychology and Film Studies, demanding a six-part evaluation of Norman Bates.

The models were required to diagnose his disorder, cross-reference his behavior against modern clinical standards, evaluate the movie's psychiatric intervention, and apply two specific General Psychology modules (like Human Development or Memory) to the narrative. Furthermore, it strictly enforced a citation rule: the model could only use [1] inline citations when referencing the provided script, and had to suppress all raw XML tags in the output.

The Contenders & Inference Realities

I ran this test across two distinct architectures hosted locally via Podman and llama-server:

  • Qwen3.5-122B-A10B-UD-Q4_K_XL: A massive Mixture of Experts (MoE) model running at 4-bit quantization.

  • Qwen3.6-27B-Q8_0: A smaller, dense architecture running at a pristine 8-bit quantization.

The physical realities of memory bandwidth became immediately apparent. The 122B MoE model absolutely flew, generating text at over 22 tokens per second (t/s). Because it only activates roughly 10 billion parameters per token, it was highly efficient.

The dense 27B model, by contrast, had to push all 27 billion parameters through the memory bus for every single token. It settled into a rock-solid, but noticeably slower, ~7.5 t/s. The real story, though, wasn't the generation speed; it was how the frontend handled the 32,000-word file.
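A back-of-the-envelope way to see the memory-bandwidth point: multiply the weights read per token by the tokens per second. This is a rough sketch with loud assumptions. Q8_0 stores about 8.5 bits per weight in llama.cpp's block format; the 4.5 bits per weight used for the Q4_K_XL file is a guess; and the arithmetic ignores KV-cache reads, shared layers, and everything else that isn't a weight.

```python
def bytes_per_token(active_params, bits_per_weight):
    """Approximate weight bytes read from memory to generate one token."""
    return active_params * bits_per_weight / 8


def implied_bandwidth_gbs(active_params, bits_per_weight, tokens_per_s):
    """Memory bandwidth (GB/s) implied by a measured generation speed."""
    return bytes_per_token(active_params, bits_per_weight) * tokens_per_s / 1e9


# Dense 27B at Q8_0 (~8.5 bits/weight), measured at ~7.5 t/s.
dense = implied_bandwidth_gbs(27e9, 8.5, 7.5)

# MoE with ~10B active parameters, ASSUMED ~4.5 bits/weight, ~22 t/s.
moe = implied_bandwidth_gbs(10e9, 4.5, 22)

print(f"dense: {dense:.0f} GB/s, MoE: {moe:.0f} GB/s")
```

Under these assumptions the MoE reads roughly a fifth of the bytes per token that the dense model does, which is the direction of the speed gap even if the measured ratio is smaller.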

Evaluating Open WebUI Parameters

When dealing with a 45,000-token text file, the connection between your UI and your backend is everything. I tested three distinct document parsing settings in Open WebUI to see how they impacted the models:

1. Standard RAG (Chunking and Retrieval) By default, Open WebUI sliced the Psycho script into 1,000-token chunks. Using hybrid search, it evaluated the prompt and sent only the top 7 chunks (about 2,600 tokens) to the model. It was fast, but the models lost the broader narrative context of the film, operating only on the severed pieces of dialogue the vector database deemed relevant.

2. Full Context Mode This setting forces the RAG pipeline to inject the entire document context. This is where you hit the Time-to-First-Token (TTFT) wall. Dumping 45,000 tokens into the model caused a significant processing stall—taking over two minutes for the system to build the Key-Value (KV) cache in VRAM. However, once that cache was built, subsequent queries against the script were virtually instantaneous, processing at a mere 190 milliseconds.

3. Bypass Embedding and Retrieval The nuclear option. This disables the vector database entirely and dumps the raw text directly into the prompt. The model had to ingest over 54,000 tokens (the script + system prompts) from scratch. This resulted in a grueling 6-minute TTFT delay as the transformer's attention mechanism scaled quadratically to map the text. But the result? The dense 27B model had flawless, uninterrupted access to the entire document, resulting in a phenomenal, deeply nuanced clinical evaluation.
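To make the first option concrete, here is a toy version of chunk-and-retrieve. It is only a stand-in: whitespace-split words play the role of tokens, and plain word overlap plays the role of the hybrid search. Open WebUI's real pipeline uses embeddings and its own splitter, none of which is reproduced here. The chunk size of 1,000 and top 7 come from the description above; the function names are made up.

```python
import re


def chunk(tokens, size=1000):
    """Slice a token list into consecutive fixed-size chunks."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def top_k(chunks, query, k=7):
    """Rank chunks by how many query words they contain; keep the best k."""
    q = set(re.findall(r"\w+", query.lower()))
    ranked = sorted(
        chunks,
        key=lambda c: len(q & {t.lower() for t in c}),
        reverse=True,
    )
    return ranked[:k]
```

Whatever the scoring method, the shape of the trade-off is the same: the model sees only the k winning slices, so anything the ranking misses is simply absent from its context.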

The Takeaway

If you are strictly evaluating localized text, a dense model forced to read the entire file via "Bypass Embedding" provides incredible, consistent adherence to complex constraints. But if you value your time and require high-speed generation for massive context windows, a highly quantized MoE is still the undisputed king of the home lab.
