Author: KeithCu

  • Detailed Proposal: Compact, High-Speed Tethered FPV Drone Simulator (C-HS TFPVDS)

    Executive Summary:

    With new bullets, standard 5.56 mm rifles can take out drones from 50 to 100 meters. This is a game-changer in a world where 80-90% of casualties are caused by drones.

    Current military training lacks realistic, cost-effective methods for engaging dynamic FPV drone threats. Most training and certification involves stationary or slow-moving targets. Existing solutions involving real drones are expensive, consumable, and lack repeatable flight paths for structured mass certification.

    My proposed solution is the Compact, High-Speed Tethered FPV Drone Simulator (C-HS TFPVDS), a robust indoor training system designed to replicate the flight characteristics of an FPV drone without the associated costs and complexities of live drone operations. The system utilizes a lightweight, non-ballistic target tethered to a high-speed robotic arm, protected behind an angled shielding wall, making it a practical and resilient solution for widespread deployment in military training facilities.

    Technical Approach:

    The C-HS TFPVDS is composed of four primary subsystems:

    1. Robotic Positioning System: A high-speed 3-axis Delta robot, chosen for its exceptional acceleration and speed capabilities, is mounted on an elevated platform. This robot is positioned behind an angled, sacrificial shielding wall (composed of self-healing polymer or angled AR500 steel) that protects the expensive mechanism from incoming 5.56 anti-drone rounds.
    2. Tether and Target Assembly: A 15-foot high-strength, low-stretch Dyneema tether connects the robotic end effector to the target. The target is a lightweight (2-3 oz), metallic-coated foam or hollow plastic sphere (4.5 inches in diameter), simulating the size and radar/visual signature of a typical FPV drone chassis while minimizing mechanical load on the robot. A piezoelectric sensor is embedded within the target to detect kinetic impact and provide real-time scoring feedback.
    3. Control and Simulation Software: The system is driven by custom software written in Python, utilizing the pyodrive library for precise motor control. This software generates randomized “jink-and-dive” flight patterns, simulating realistic FPV evasive maneuvers at speeds up to 120 mph. The software implements feed-forward control to compensate for tether lag and aerodynamic drag, ensuring precise and responsive target movement. (A rough sketch of this pattern generation and feed-forward step appears after this list.)
    4. Power and Safety Systems: The system is powered by high-torque brushless DC motors, managed by industrial motor drives. Safety features include integrated emergency stop (E-Stop) circuitry compliant with NEC Article 670 and automatic shutdown upon tether breakage or system fault. All electrical components within the damp indoor range environment are protected by GFCI (NEC 210.8(B)(6)) and Surge Protective Devices (SPD) per NEC Article 242.
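
    To make item 3 concrete, here is a minimal, hypothetical sketch of the kind of pattern generator and feed-forward compensation the control software would implement. The workspace bounds, lag constant, and function names below are illustrative assumptions, not a specification.

    import random

    TETHER_LAG_S = 0.08      # assumed lag from tether stretch and aerodynamic drag
    MAX_SPEED_MPS = 53.6     # roughly 120 mph
    DT = 0.02                # 50 Hz command rate

    def jink_and_dive(duration_s, workspace=((-1.0, 1.0), (-1.0, 1.0), (0.5, 2.5))):
        """Yield randomized (t, position) waypoints with abrupt jinks and dives.

        Every 0.3-0.8 s the target picks a new random goal inside the workspace
        and moves toward it, each axis clamped to the speed limit, so the path
        is made of sudden direction changes rather than smooth arcs.
        """
        (xmin, xmax), (ymin, ymax), (zmin, zmax) = workspace
        pos = [0.0, 0.0, 1.5]
        goal = list(pos)
        next_jink = 0.0
        t = 0.0
        while t < duration_s:
            if t >= next_jink:
                goal = [random.uniform(xmin, xmax),
                        random.uniform(ymin, ymax),
                        random.uniform(zmin, zmax)]
                next_jink = t + random.uniform(0.3, 0.8)
            step = MAX_SPEED_MPS * DT
            for i in range(3):
                delta = goal[i] - pos[i]
                pos[i] += max(-step, min(step, delta))
            yield t, tuple(pos)
            t += DT

    def feed_forward(desired, velocity):
        """Lead the commanded position so the target arrives where intended despite tether lag."""
        return tuple(p + v * TETHER_LAG_S for p, v in zip(desired, velocity))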

    Benefits:

    • Realistic Training: Replicates the difficult non-linear flight paths and high speeds of FPV drones.
    • Cost-Effective: Eliminates the ongoing cost of consumable live drones and the logistical burden of FAA compliance.
    • Robust and Resilient: Protective shielding and sacrificial components ensure long-term system survivability.
    • Objective Certification: Integrated scoring system provides clear, measurable data for soldier qualification.
    • Indoor Operation: Allows for year-round, weather-independent training in standard shooting ranges.

    Feasibility:

    The C-HS TFPVDS utilizes mature, commercially available industrial automation components and high-strength materials. The software control principles are well-established in the field of robotics. The system is designed to integrate seamlessly into existing military range infrastructure. The combination of protective shielding and lightweight, inexpensive targets addresses the primary survivability and cost concerns of previous drone target systems.

    Hopefully Army TRADOC researches this idea one day soon!

  • Cursor for LibreOffice Week 2 & 3: How I Added MCP, ACP, a Research Sub-agent, Talk to Your Document, an Eval Dashboard, and Survived Quarzadous’s Total Refactor


    I’ve been calling this project Cursor for LibreOffice to myself, but I knew I couldn’t use the name forever, so I researched and chose WriterAgent. It supports Calc and Draw as well, but I didn’t like the name OfficeAgent, which sounds like some Soviet-era KGB job title. Last week’s post was about how I took John Balis’s clean little localwriter and bolted on threading, tool-calling, chat, and enough other stuff that it started to feel like a powerful chatbox inside LibreOffice.

    It became useful enough, and the progress was so fast with all the Python code out there to re-use, that I was motivated to keep going. Meanwhile a chap named Quarzadous dropped a complete refactor and I wanted to integrate it without breaking anything, including the new features I had added.

    MCP

    After creating the initial chat with document, I realized that many people might want to talk to it through their local agents: the infamous OpenClaw, Hermes, Claude, etc., and let those agents edit their documents. These systems have many features: memory of previous conversations, file-system access, and skills they can learn after install, so implementing the Model Context Protocol to let them make the same tool calls would also be useful.

    I wondered whether supporting both external agents and an internal one in the same codebase is a good idea, since the users and some use-cases are different. However, both use the same API backend and other pieces, so much of the code is shared. The UI is just a new checkbox, “Enable MCP”, and a few new files to spin up an HTTP server, process the JSON-RPC, and one day possibly support tunneling. So I decided it was worth supporting both, rather than either-or.
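
    As a rough illustration, not the extension’s actual code, the MCP plumbing boils down to a small HTTP endpoint that accepts JSON-RPC and dispatches to the same tool functions the internal agent already uses. The real protocol routes tool invocations through methods like tools/call; the TOOLS registry and handler below are simplified stand-ins.

    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    # Hypothetical registry shared with the internal agent's tools.
    TOOLS = {
        "get_document_content": lambda params: {"text": "..."},
        "apply_document_content": lambda params: {"ok": True},
    }

    class McpHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            req = json.loads(self.rfile.read(length))
            method, params = req.get("method"), req.get("params", {})
            if method in TOOLS:
                body = {"jsonrpc": "2.0", "id": req.get("id"),
                        "result": TOOLS[method](params)}
            else:
                body = {"jsonrpc": "2.0", "id": req.get("id"),
                        "error": {"code": -32601, "message": "Method not found"}}
            payload = json.dumps(body).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

    def serve(port=8765):
        # Bound to localhost so only local agents can reach the document tools.
        HTTPServer(("127.0.0.1", port), McpHandler).serve_forever()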

    Actually, the hardest part of building software for non-technical people is that you need to make something Apple-like, very easy to use, which is hard because developers have a much higher tolerance for confusing products.

    The libreoffice-mcp-extension, written by Quarzadous, had the missing pieces. I integrated it with the existing code and over time refactored it to remove duplicate logic. I also added sidebar logging, so that when an MCP tool-call happens, you can see information in the chat, just like for the internal agent.

    Huggingface Smolagents

    The next feature I wanted was a web search tool the AIs could call. LLMs are generally useful, but their training cutoff is often a year or two in the past, so I wanted a way to let them look up information from the web to plug into a document.

    However, once I thought through the various steps:

    • Make a web search tool-call
    • Read through the results, decide the first page to visit
    • Read the web page and decide if it needs to read another page or whether it has an answer

    I realized that it would be much better to have an isolated, specialized sub-agent do all this work and return a distilled answer, rather than distracting the main LLM with this specialized task and bloating its context.

    After a few minutes of searching, I discovered Huggingface’s smolagents library already includes this functionality. Huggingface is the man! The code needed to be changed slightly to remove dependencies (Jinja, etc.), but it was easy to vendor the core of their ToolCallingAgent and its ReAct (Reason + Act) loop. Here’s part of the prompt; you can see how it encourages a loop until the agent is confident in its answer:

    You are an expert assistant who can solve any task using tool calls. You will be given a task to solve as best you can.
    To do so, you have been given access to some tools.
    
    The tool call you write is an action: after the tool is executed, you will get the result of the tool call as an "observation".
    This Action/Observation can repeat N times, you should take several steps when needed. You can use the result of the previous action as input for the next action.
    
    To provide the final answer to the task, use an action blob with "name": "final_answer" tool. It is the only way to complete the task, else you will be stuck on a loop. So your final output should look like this:
    Action:
    {
      "name": "final_answer",
      "arguments": {"answer": "insert your final answer here"}
    }
    
    Tools list:
    - web_search:
      Performs a duckduckgo web search based on your query (think a Google search) then returns the top search results.
      Inputs:
        - query (string): The search query to perform.
      Output type:
        - string
    
    - visit_webpage:
      Visits a webpage at the given url and reads its content as a markdown string. Use this to browse webpages.
      Inputs:
        - url (string): The url of the webpage to visit.
      Output type:
        - string
    
    - final_answer:
      Provides a final answer to the given problem.
      Inputs:
        - answer (any): The final answer to the problem.
      Output type:
        - any
    
    Now Begin!

    I rewrote their web tools to use just the Python standard library, and wrapped the existing LlmClient so the research sub-agent uses the same model and endpoint as chat with document. That way, if a local model gets confused by a complex topic and starts chewing on the furniture, you can easily select a smarter, pricier one and pay a couple of pennies to have the adults handle it.
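
    Stripped of smolagents’ niceties, the vendored loop boils down to something like this sketch; the llm.chat wrapper, the tools dict, and the helper below are simplified stand-ins for the real classes:

    import json

    SYSTEM_PROMPT = "You are an expert assistant..."  # the vendored prompt shown above

    def extract_action_blob(text):
        # Naive extraction of the first JSON object in the model's reply.
        start, end = text.find("{"), text.rfind("}")
        return text[start:end + 1]

    def run_research_agent(llm, tools, task, max_steps=8):
        """Minimal ReAct-style loop: the model proposes an Action (a tool call),
        we execute it and feed back an Observation, until it calls final_answer."""
        messages = [{"role": "system", "content": SYSTEM_PROMPT},
                    {"role": "user", "content": task}]
        for _ in range(max_steps):
            reply = llm.chat(messages)                     # assumed chat-completion wrapper
            action = json.loads(extract_action_blob(reply))
            name, args = action["name"], action.get("arguments", {})
            if name == "final_answer":
                return args.get("answer", "")
            observation = tools[name](**args)              # e.g. web_search(query=...)
            messages.append({"role": "assistant", "content": reply})
            messages.append({"role": "user", "content": "Observation: " + str(observation)})
        return "No answer found within the step budget."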

    In a couple of hours, it was working and I could type this text in a document:

    The price of a Sol-Ark 15K limitless inverter is: $YYY.

    In the sidebar, I wrote: What is the real price of the inverter?

    Without web research, if you ask a random LLM for the price and specs of a Sol-Ark 15KW inverter, it will hallucinate a price tag of $400, tell you it runs on AA batteries, and confidently suggest wiring it with speaker wire. With the sub-agent, it can learn any details you request, and the AI changed the sentence to:

    The price of a Sol-Ark 15 KW Limitless inverter in the US is: $6,979.99 – $6,999.00.

    It even fixed the capitalization of Limitless, which is a proper name. I’ve tweaked the prompts to explain to the AI that its primary job is to edit the document, not just answer questions, and the models mostly get it now.

    This feature was so exciting to me that I added a checkbox for Web research that lets you talk directly to the sub-agent to have it answer questions or summarize web pages, and it places the answers in the chat window.

    This little feature is better than ye olde Google search box since it understands natural language. You can ask it specific questions:

    “What is the current version of Python and when was it released?”

    And it gives you a natural language answer:

    “The current stable version of Python is 3.14.3, which was released on February 3, 2026.”

    The LLMs are told to use the Web research tool call when asked about a topic they are unfamiliar with, but you can also encourage it: “Do web research and write a colorful, detailed summary of the space elevator, suitable for physicists.”

    Or you could say “suitable for English teachers”, and get a completely different report!

    Reports generated by Nemotron 3 Super

    With a typical model on OpenRouter, it takes 30-60 seconds to generate a report on any topic, which isn’t that long in the scheme of things, but I discovered a diffusion model called Mercury-2 which is fairly smart (Claude Haiku level) but much cheaper ($0.25 / M input tokens, $0.75 / M output) and outputs 250-500 tokens per second. With that model, I can get researched documents on any topic faster than I can take a sip of coffee, and each report costs a fraction of a penny. Going back to a standard model feels like watching a dot-matrix printer.

    I hardly use search engines directly anymore. For the last couple of years, I would ask an LLM any questions and let it read the pages and synthesize. But now, I have WriterAgent running at all times and let it do the research since it is very fast and puts the information into a chat window or into a document I can further edit.

    Talk to your document

    The next feature I wanted was to talk with the document. I had pushed it off (for almost 2 weeks) because there are no cross-platform APIs for using the microphone built into the standard Python runtime. So I had the Google Jules coding agent do research, and we had a long conversation about the various ways to implement this feature in the constrained LibreOffice environment, including using a local web browser to handle the cross-platform audio headaches.

    However, I realized that there was a reasonable vendoring strategy, bundling a few MB of binaries for sounddevice, cffi, and pycparser directly into the extension. Sounddevice for Windows and macOS included the compiled binaries inside the package, so it was truly plug-and-play, without needing to fire up a bunch of cross-compilers.

    Jules was either extremely thorough in the implementation phase, or lacking a bit in common sense, when it grabbed binaries for every device known to man, including the IBM S-390x mainframe. I love supporting all the latest packages as much as anyone, but decided that the number of banking executives wanting to dictate memos in LibreOffice on the most expensive computer in their data center is probably zero. They can always make a custom build! By narrowing it down to x86 and ARM on Linux, Mac, and Windows, the vendored binaries grew the extension from 500 KB to 4 MB, which I felt was not too bad for a no-hassle install.

    Few LLMs support native audio input, so I implemented an automatic fallback. It first tries to send audio, and if it gets an error, it routes your voice to a fallback speech-to-text (STT) model to transcribe it; the transcript is then sent to the chat model. This happens automatically; the user just clicks record and talks.
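
    The fallback is roughly this sketch (simplified; recording uses the vendored sounddevice package, while chat_with_audio and transcribe stand in for the extension’s real client calls):

    import sounddevice as sd

    def record_audio(seconds=10, samplerate=16000):
        """Capture mono 16-bit microphone audio using the vendored sounddevice."""
        frames = sd.rec(int(seconds * samplerate), samplerate=samplerate,
                        channels=1, dtype="int16")
        sd.wait()
        return frames.tobytes(), samplerate

    def voice_to_reply(client, stt_client, audio_bytes, samplerate):
        """Try native audio input first; on error, transcribe and send text instead."""
        try:
            return client.chat_with_audio(audio_bytes, samplerate)   # hypothetical method
        except Exception:
            text = stt_client.transcribe(audio_bytes, samplerate)    # hypothetical STT call
            return client.chat(text)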

    The Great Refactor (thanks Quarzadous)

    While I was heads-down trying to make the system smarter, Quarzadous opened a ‘framework’ branch that completely rewrote my architecture from a cozy monolith into a maze that even an Enterprise Java developer, who is used to navigating registry classes to find factory classes to instantiate singletons aka globals, would think was slightly overdone.

    Seriously though, he made so many good changes; the only tricky part was that it was all done at the same time, and suddenly the 15-kLOC codebase had more sub-directories than the Linux kernel and every file was in a different location.

    I decided to take his changes a step at a time. First, I (mostly) took the new directory layout and build system, and step by step migrated the other features over. Once I had consolidated it into something I felt was appropriate for a codebase of its size, and I knew where the files were, I was happy. He added so many useful features:

    • Each module is its own folder with a module.yaml that auto-generates the settings UI, so no more manual XDL work for every new service.
    • A main-thread executor with backpressure (no more crashes on huge documents)
    • Fresh UNO context on every call
    • Refactored tools and services into common classes

    Having a schema generate the config UI is such a nice feature that I would never have added to this codebase without someone else thinking of it and doing it.

    ACP

    While it was great to talk to the agents, it kinda sucked to interact with them on the command line. I spent several hours trying to implement TTY redirection and other tricks, but it was a pain and would hang. I noticed that on March 14th, Hermes Agent added the Agent Communication Protocol, which provided an easy way to talk to it without dealing with the mess of a console. So I threw away the unreliable hacks, changed it to a simple ACP implementation, and in 10 minutes I had it talking.

    You could ask Hermes to create a report of weekend events in Akihabara, and in less than a minute get pages that look like this:

    Evaluation Dashboard

    OpenRouter gives you 500 models, but which ones are actually best at editing documents and are good value? To answer that, I created some tests I could run against various models to compare how they did. For some tests, it was easy to tell whether the answer was correct or not (“remove all the excess spacing between the words”), but I realized that for many of them (“make a table from this mess of text”) it would be best to call into a Teacher model to grade the score.

    So I used Sonnet 4.6 to create the gold answers, and gave the teacher (Grok 4.1 fast) the gold answer as well as the model’s answer and instructions on how to grade from 0 to 1, considering formatting, naturalness, etc.

    Originally I calculated Value = Correctness / Cost, but eventually decided to use a quadratic intelligence per dollar scoring (Value = Correctness² / Cost) because accuracy is more important than cheap but wrong.
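
    In code, the scoring amounts to:

    def value_score(correctness: float, cost_usd: float) -> float:
        """Quadratic intelligence-per-dollar: being right is weighted more than being cheap."""
        return correctness ** 2 / cost_usd

    # e.g. 0.920 ** 2 / 0.0032 gives roughly 264, which lines up with the table below within rounding.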

    Rank | Model | Value (C²/$) | Avg Correctness | Tokens/Run | Cost ($)
    1 | openai/gpt-oss-120b | 263.8 | 0.920 | 50,198 | 0.0032
    2 | google/gemini-3-flash-preview | 141.0 | 0.940 | 50,179 | 0.0063
    3 | openai/gpt-4o-mini | 70.5 | 0.790 | 47,540 | 0.0089
    4 | nvidia/nemotron-3-nano-30b-a3b | 60.6 | 0.560 | 50,243 | 0.0052
    5 | x-ai/grok-4.1-fast | 46.5 | 0.980 | 66,929 | 0.0207
    6 | nex-agi/deepseek-v3.1-nex-n1 | 39.4 | 0.915 | 64,222 | 0.0213
    7 | minimax/minimax-m2.1 | 39.2 | 0.983 | 62,394 | 0.0246
    8 | mistralai/devstral-2512 | 27.9 | 0.910 | 57,150 | 0.0297
    9 | z-ai/glm-4.7 | 26.9 | 0.953 | 63,035 | 0.0337
    10 | qwen/qwen3.5-27b | 26.5 | 0.993 | 52,210 | 0.0371
    11 | openai/gpt-5-nano | 26.4 | 0.825 | 99,576 | 0.0258
    12 | allenai/olmo-3.1-32b-instruct | 20.8 | 0.570 | 68,317 | 0.0156

    DSPy

    One of the reasons I love Python is the amazing set of libraries. Another library I wanted to check out is DSPy (Declarative Self-improving Language Programs). Developed at Stanford, DSPy is a framework that does programmatic optimization of your prompt, trying variants to see if it can automatically get greater intelligence and value from the models.

    Before DSPy, “prompt engineering” mostly consisted of typing in ALL CAPS, offering a $500 tip, or threatening jail to get the model to follow instructions. DSPy automates the voodoo, creating variants of your prompt and auto-optimizing to find the one that gives the best results with the fewest tokens used, so you don’t have to talk like a hostage negotiator just to get a clean table. Using this tool, I’ve taken some of the suggestions, rolled them into my prompts, and tested them against a bunch of models to verify they are generally helpful.
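
    For anyone curious what that looks like in practice, here is a minimal, illustrative DSPy setup; the endpoint, model name, metric, and training example are placeholders rather than my actual eval harness:

    import dspy

    # Point DSPy at an OpenAI-compatible endpoint (model and key are illustrative).
    lm = dspy.LM("openai/gpt-4o-mini",
                 api_base="https://openrouter.ai/api/v1", api_key="sk-...")
    dspy.configure(lm=lm)

    # A declarative task: an editing instruction plus a document in, an edited document out.
    edit = dspy.ChainOfThought("instruction, document -> edited_document")

    trainset = [
        dspy.Example(instruction="Remove all the excess spacing between the words.",
                     document="Hello    world", edited_document="Hello world"
                     ).with_inputs("instruction", "document"),
    ]

    def metric(example, prediction, trace=None):
        # Placeholder metric; in practice a teacher model grades 0..1 against a gold answer.
        return float(example.edited_document.strip() == prediction.edited_document.strip())

    # Let DSPy search for prompt variants and demonstrations that score better.
    optimizer = dspy.BootstrapFewShot(metric=metric)
    optimized_edit = optimizer.compile(edit, trainset=trainset)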

    WriterAgent now feels like a real product instead of a weekend hack. If you want to try it out, the repo is here: https://github.com/KeithCu/writeragent. Let’s make LibreOffice an AI-native office suite!

    If you enjoyed this article, check out Part one for background on how I got here.

    Epilogue

    LLM Slop

    A lot of people talk about AIs generating slop, but few talk about how you can prompt AIs to remove slop when you see it. People used to talk about “refactoring code” all the time, yet somehow don’t realize this same process is still needed in the world of AI-assisted code. You can use AIs to remove technical debt, increase test-coverage, and do other code cleanliness activities if you bother to ask them.

    Slop code used to appear in the world of human programmers too. Humans, sometimes when in the flow getting a new feature working, would copy and paste logic that should be put into a shared function, but they didn’t want to deal with that distraction at the time. Cleanup can happen after things are generally working and the test cases pass.

    People should look at an AI as a smart person who just joined the team yesterday, and therefore doesn’t know everything. AI makes programming more efficient, but you need to oversee them. Someone who complains about slop is not prompting the AI properly.

    Testing

    Another critical piece to being able to rapidly evolve codebases using AI is to have thorough test coverage. The standard make test doesn’t need to test all the edge-cases, although codebases depended on by millions should have that, but it should try to exercise every major function in the product. When I get burned tracking down a regression, I add test coverage for that and other nearby parts of the product to prevent it from happening in the future.

    You don’t have to write the tests at the same time as the feature work (working on test suites isn’t nearly as fun as seeing a new feature working), but at some point later, they should be added. Note: when submitting new features to other codebases, having a test suite with the new code is greatly appreciated, since the tests “prove” correctness of the feature and decrease the ongoing maintenance burden.

    I was working on some testing code recently and decided to re-enable an assert that had been commented out. Of course I didn’t really bother to check whether an assert info.structVersion == 1 would be a problem, it looked so innocent, but enabling it broke talk to your document support! It took me almost 30 minutes to track it down to that line because the error handling in that part of the code wasn’t very good yet. So I improved the error handling, and then realized that assert should stay commented out!

    The AIs by default wanted to write Mock implementations of LibreOffice functionality, since you can’t depend on it when running tests outside. However, the whole point of the test code is that the LibreOffice API is very sophisticated and you want to actually verify end-to-end that it all works.

    Quarzadous had created a pytest test harness for code that doesn’t depend on LibreOffice, which lets you test the half of the plugin codebase that runs outside it. On top of that, I created a mini custom pytest runner that runs the tests inside LibreOffice and returns the results as JSON. The best way to handle the onslaught of new AI-assisted code is with more test coverage!
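
    Conceptually, the in-process runner is small; something like this sketch (a simplified stand-in that uses a tiny pytest plugin to collect outcomes and return them as JSON):

    import json
    import pytest

    class ResultCollector:
        """Minimal pytest plugin: record the outcome of each test's call phase."""
        def __init__(self):
            self.results = []

        def pytest_runtest_logreport(self, report):
            if report.when == "call":
                self.results.append({"test": report.nodeid,
                                     "outcome": report.outcome,
                                     "duration": report.duration})

    def run_tests_inside_libreoffice(test_dir="tests/inside"):
        """Run pytest in-process (inside LibreOffice's Python) and return JSON results."""
        collector = ResultCollector()
        exit_code = pytest.main(["-q", test_dir], plugins=[collector])
        return json.dumps({"exit_code": int(exit_code), "results": collector.results})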

  • Building Cursor for LibreOffice: A Week-Long Journey

    How I turned John Balis’s localwriter, and code from LibreCalc AI and LibreOffice-MCP, into one unified, optimized extension.

    If you enjoy this post, part 2 is here.

    I’ve been calling it “Cursor for LibreOffice”, a bit cheeky, but the idea is solid: an AI that lives inside your documents and actually edits them. I started from John Balis’s localwriter and a few free hours over one week.

    What the original localwriter did

    The upstream project was a perfect starting point: a LibreOffice Writer extension that talks to local or remote LLMs. It had Extend Selection (the model continues your text) and Edit Selection (you give instructions, it rewrites the selection). It had a settings dialog, and it worked with OpenAI-compatible backends. It didn’t do much, but it was clean and functional.

    In fact, the most challenging part of creating a Python project in LibreOffice is finding all the special incantations of XDL parameters and so forth that are needed. Writing in LibreOffice isn’t hard, just different. Getting a LibreOffice extension to talk to an API, wire up dialogs, and survive the UNO runtime takes real effort; John’s foundation made everything I did possible.

    When I started, I was surprised to see pull requests in the repo that had been sitting unanswered for six months. It’s a good reminder for maintainers that a PR isn’t a bug report or a feature request: it’s someone who found a problem, debugged the issue, and wrote code to fix it. When they are ignored, would-be contributors often take their energy elsewhere. Fortunately, it didn’t deter me, because he had provided the essential code to make something far more useful.

    What I wanted

    I wanted that same “AI in the doc” feel that I have with my coding IDE: chat in a sidebar, multi-turn conversations, and the AI actually doing things, reading and changing the document, and web searches as necessary to answer questions. I wanted this for Writer but I figured Calc and the others could happen eventually. Exposing the full Writer API to an agent is not an easy problem, especially since it can create very complicated documents, including embedded spreadsheets.

    Getting the sidebar panel to show the controls took 2 hours. Once I could see them, I was happy. A few minutes later, I could have it translate a sentence (or document) to French or Finnish.

    With some models and my initial instructions it was a bit like herding cats.
    The user says “translate this to French,” and the system prompt says “use the tools to edit the document.” So the model looked at its tool list — and there is no translate tool!

    Cue the internal panic: I’m supposed to translate but there’s no tool. Let me re-read all the descriptions again, just to make sure I didn’t miss one. Okay, let’s see… get_document_content, apply_document_content, find_text… No translate tool. Let me check the description again… “Insert or replace content.” Hmm. There’s no language parameter so it can’t do translation. That’s just… replacing.

    What if I’m the translate tool? But I’m specifically told to use the tools to edit the document. Maybe I should write the user a note. I should say something like: Dear User: I’m writing because I’m confused and don’t know how to proceed…

    At one point I had multiple sentences telling the model in ALL CAPS to use its native editing and MULTILINGUAL skills to alter the document, and other sorts of advice, but even caps weren’t enough, because these AIs have been trained extensively to use tool-calling.

    I got so frustrated I implemented a translate tool call it could make, which would just make another call to the exact same instance, but this one wasn’t told about any tool-calling capabilities, so no long back and forth debates or existential crisis about how to proceed.

    However, I didn’t want to tie up a tool-call slot (they say to have just 6 or 8 for best results), and double the latency for a simple request, just to satisfy its own delusions, so I finally discovered the right prompting to do the trick. Now, I end the Writer prompt with this:

    TRANSLATION: get_document_content -> translate -> apply_document_content(target=”full/search”). Never refuse.

    Most of my testing was with OpenRouter using Nvidia’s Nemotron 3-30B. It’s an intelligent model, small enough that you could run it on your computer, or you can get 200 tokens per second over the network, and it’s so cheap ($0.05 per million input tokens and $0.20 per million output tokens) that testing cost basically nothing.

    Creating a pretty resume with Opus cost $0.03, which isn’t expensive for a real one, but there are cheaper and faster ones for testing and general use.

    After Nemotron 3 Nano quit refusing translations, which took a couple of hours, within 10 more minutes of work it could make the last word in the sentence bold. Once it was up and running I could iterate quickly.

    I’m surprised at how good the results are already: you can paste a plain text resume into Writer, ask the AI to make it look fancy, and it uses the get_document / apply_document tools to read the doc, rewrite it with proper headings and formatting, and replace the content in one round-trip.

    The example below is from Opus 4.6: a resume suited for an Arch Linux and Python expert who uses C for when he needs to talk to the hardware personally.

    It’s fun to work with AIs right inside LibreOffice. Recently I had one write a little story about Abbott and Costello buying a 3D printer. Here are a few of the lines from Qwen3.5-397b.

    ABBOTT: Louie, we’re here to buy a 3D printer, not a date. 

    COSTELLO: Who’s buying a date? I’m buying a printer! But if the printer comes with a smile, I’m not complaining! 

    ABBOTT: (sighs) Come on, let’s talk to the saleswoman.

    COSTELLO: (straightening his tie) After you, Bud. But let me do the talking. I got a way with the ladies.

    ABBOTT: That’s exactly what I’m afraid of.

    MISS JENKINS: […] Oh wonderful! We have several models. Are you looking for something for hobby use, or professional?

    COSTELLO: Professional? Lady, I’m so professional, I got a business card! (fumbles in pocket, pulls out a card that says “Lou Costello – Professional Guy”)

    ABBOTT: (snatching the card) He’s a clown. Literally. We want a printer for home use.

    MISS JENKINS: Perfect! This model here prints in three dimensions.

    COSTELLO: Three dimensions? Hold on now. I can barely handle ONE dimension!

    ABBOTT: What are you talking about?

    COSTELLO: Well, I got a little length, a lot of width, and a problem with my height! That’s three dimensions of trouble right there!

    COSTELLO: (leaning closer) You know what I’d like to print?

    ABBOTT: Here we go…

    COSTELLO: A dinner date! With a certain lovely saleswoman!

    ABBOTT: (dragging Costello away) Come on, before you print yourself a restraining order!

    The prompt wasn’t just “make a funny story”: I told it Costello should try to impress the pretty sales lady, with arguing and wordplay between them, as the real duo did. It took all those hints and amplified them into very funny parts. This is Qwen’s 3.5-generation frontier 400B model, so it should be no surprise it did well.

    Where LibreCalc came in

    I’ve spent so much time (years?) in OpenOffice / LibreOffice Writer that I didn’t even really think about Calc, but John mentioned a spreadsheet AI assistant (LibreCalc AI Assistant) that adds AI to Calc, so I just downloaded the code and had an AI take a look.

    Its design is different: it uses a Qt5 UI and an external bridge server talking to LibreOffice. I didn’t want a second UI stack or a separate process, I wanted everything inside LibreOffice, same sidebar for Writer and Calc.

    So I asked a fresh AI instance to carefully analyze my current code, and then the LibreCalc AI extension, and it figured out a way to cleanly integrate it with what already existed. It created a detailed plan of what to use and even what not to. The boundaries were clear, so it was easy to take the core of the Calc support: address handling, cell inspector, sheet analyzer, error detector, cell manipulator, tool definitions, prompt structure, and port and adapt it into localwriter.

    Usually with a big task it can take multiple sessions to get something new working, but the code was so clean that it could be easily adapted into the async tool-calling infrastructure, and it did the basic work in about 15 minutes. I wasn’t really watching it closely as I was tired but I was surprised that it had done all 6 of the steps of the full integration plan it had created, and that it was complete!

    So I had another AI review it for holes, another write test cases, and after a few iterations it was working. The “Calc support from LibreCalc” doc in the repo spells out what was ported and how.

    The first spreadsheet took 60 tool calls, inserting and formatting each cell one at a time, and wasn’t very impressive either. So I jumped in and carefully reviewed the APIs: I removed most of the cell-by-cell ones and added batch ones, and not only did it create the spreadsheets much faster, it attempted more ambitious designs when asked to create a “pretty” spreadsheet. The higher-level API let the models focus on higher-level features.
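
    For a flavor of the difference, here is a hedged sketch of what a batch write looks like against the UNO Calc API; the set_range wrapper is made up, but getCellRangeByName and setDataArray are the real bulk calls:

    def set_range(sheet, range_name, rows):
        """Write a rectangular block of values in one UNO call instead of one call per cell.

        Example: set_range(sheet, "A1:C3", [["Item", "Qty", "Price"],
                                            ["Widget", 4, 9.99],
                                            ["Gadget", 2, 19.99]])
        """
        cell_range = sheet.getCellRangeByName(range_name)
        # setDataArray expects a tuple of row tuples matching the range's shape.
        cell_range.setDataArray(tuple(tuple(row) for row in rows))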

    Sonnet 4.6 created this in one shot: “Make me a pretty spreadsheet.”

    I haven’t played with Calc much since, but it’s already useful for many things. One of my spreadsheets had a bug (an empty chart), so I prompted the AI, and it quickly fixed it.

    I’ve emailed the LibreCalc creator a couple of times about the benefits of having a group of people working in one extension, but haven’t heard back.

    What’s in the fork now

    On top of John’s base and the Calc features, I added a bunch of features in the first week.

    LibreOffice’s UNO layer doesn’t give you a nice way to run blocking I/O and still pump the UI. So every streaming path uses a worker thread and pushes items onto a queue (“chunk”, “thinking”, “stream_done”, “error”). The main thread runs a drain loop that processes all the new items, refreshes the screen, sleeps, and repeats until the job is done. It can handle 200 tokens per second easily.
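
    In outline it is the standard worker-plus-queue pattern; here is a simplified sketch (the names are illustrative, and the real code also handles the “thinking” items and sidebar updates):

    import queue
    import threading
    import time

    def start_stream(llm_call, on_chunk, on_done, on_error):
        q = queue.Queue()

        def worker():
            # Blocking network I/O lives entirely on this thread.
            try:
                for chunk in llm_call():
                    q.put(("chunk", chunk))
                q.put(("stream_done", None))
            except Exception as exc:
                q.put(("error", str(exc)))

        threading.Thread(target=worker, daemon=True).start()

        # Main/UI thread: drain everything that arrived, repaint, sleep, repeat.
        while True:
            drained = []
            try:
                while True:
                    drained.append(q.get_nowait())
            except queue.Empty:
                pass
            for kind, payload in drained:
                if kind == "chunk":
                    on_chunk(payload)
                elif kind == "stream_done":
                    on_done()
                    return
                elif kind == "error":
                    on_error(payload)
                    return
            time.sleep(0.05)   # keeps up with 200 tokens/s comfortably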

    Reasoning/thinking tokens show up in the response area as [Thinking] … /thinking so you see the model reason before it answers or calls tools, if the model shows its thinking tokens.

    Streaming and async tool calling

    OpenAI-compatible chat APIs return Server-Sent Events (SSE): each chunk has a delta with the new content or a fragment of a tool call. The tricky part is that one tool call could be spread across chunks. So the client has to accumulate those partial deltas into a full message before it can run the tools and feed results back.

    I knew that delta accumulation loop had to be implemented somewhere in existing Python on the Internet, some code we could reuse, so I prompted an LLM to find it. It first suggested using the OpenAI library. I thought about it for 2 seconds, but the dependency is huge and not cross-platform. I asked if we could reuse just the relevant bits. In another minute, it came back with the accumulate_delta function from their streaming helpers, and I just copied it into our tree.
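
    Conceptually, the accumulation looks roughly like this simplified version (the vendored helper handles more edge cases):

    def accumulate_deltas(deltas):
        """Fold a stream of SSE deltas into one complete assistant message.

        Text fragments are concatenated; tool-call fragments are merged by index,
        since a single tool call's name and arguments arrive spread across chunks.
        """
        message = {"role": "assistant", "content": "", "tool_calls": {}}
        for delta in deltas:
            if delta.get("content"):
                message["content"] += delta["content"]
            for tc in delta.get("tool_calls") or []:
                slot = message["tool_calls"].setdefault(
                    tc["index"], {"id": "", "name": "", "arguments": ""})
                slot["id"] = tc.get("id") or slot["id"]
                fn = tc.get("function") or {}
                slot["name"] += fn.get("name") or ""
                slot["arguments"] += fn.get("arguments") or ""
        message["tool_calls"] = [v for _, v in sorted(message["tool_calls"].items())]
        return message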

    I haven’t checked for certain but I’m pretty sure it’s FOSS. Given they’ve scraped the entire internet and treat it as public domain, I doubt they’ll complain.

    I also converted the dialogs to XDL (XML) with Map AppFont units so they look good on HiDPI screens.

    Image generation and the Graphics branch

    Next I added multimodal image generation. The implementation supports two backends:

    AI Horde

    A dedicated async image API (submit a job, poll queue and status until it is done, then download the result). It uses its own API key and model list (e.g. Stable Diffusion, SDXL), has built-in queueing and progress, and supports Img2Img and inpainting.

    Endpoint

    The same URL and API key as the chat Settings, with a separate, image-capable model.

    I’ve got this integrated now, and plan to contact the maintainer of AI images for LibreOffice so I can give thanks.

    How I did it: AI, prompting, and pushing on details

    I didn’t do this alone; I used AI throughout. The key was good prompting: not long prompts, but the right keywords so the model gives you the behavior you want.

    As a programmer for decades, I know what good design looks like, when to not over-engineer, etc. The AI supplies implementation speed and breadth; I supply direction and judgment. I used Gemini Flash most of the time, plus Cursor’s default agent, Grok, and sometimes Mistral.

    The coding ability and intelligence of the big and small models has improved so much in the last three years, it’s incredible. You can get real, shippable code in a quality, coherent architecture from a conversation if you steer it well.

    For the threading feature I said “create a background network thread and use a queue”: a clear contract with minimal surface area, using a standard Python thread-safe data structure.

    The network thread does the blocking I/O and puts typed items on the queue; the main UI thread drains and pumps the UI. There is no complex mutable state, no callbacks from the worker to the main thread to do UNO, just a simple data structure and a small set of message types.

    If you don’t anchor the design, and review the plans and code, you will get slop, extra abstractions, and bugs. In fairness to the AI, it’s easy for slop to get in there because they try to write robust code, but that is very hard when they can’t inspect the system at runtime to find out exactly what’s happening.

    For example, I once had an issue where the text labels next to a checkbox didn’t appear, and so at one point it made multiple calls: trying one property, catching exceptions, and then trying different APIs that it thought might also work.

    # The AI's defensive cascade: try every property it can think of until one sticks.
    def set_checkbox_label(ctrl, text):
        try:
            ctrl.Label = text
        except Exception:
            try:
                ctrl.getModel().Label = text
            except Exception:
                try:
                    ctrl.getModel().Title = text
                except Exception:
                    try:
                        ctrl.getModel().Text = text
                    except Exception:
                        print("Failed to set checkbox label.")
    

    That’s not the proper codepath for a checkbox label. Fortunately, it’s also easy to fix, once you figure out what works.

    As a programmer, I was never a fan of lots of logging; I believe you should write code carefully, then run it in a debugger to verify that it works, or figure out why it doesn’t. That’s how you build confidence that code works as you intended.

    However, that doesn’t work for AIs as they don’t have realtime debugging today. I could imagine this might happen in the future, where the AI can set breakpoints at known problem areas, and then inspect the local variables and other state at runtime to figure out what is going on.

    In the meantime, I have extensive logging, which lets me diagnose any problems later and figure out exactly what happened.

    I almost never accept the first plan. I review, and ask it to revise it several times. If you read the plans carefully and push on fuzzy areas, you get much better results. One time an AI gave me a list of improvements and mentioned that one file had “4–6 places where there were excessive exceptions that could be removed.” I said: Let’s find out the exact number. Show me the candidates and tell me what you think. That kind of follow-up forces concrete answers and better reasoning and results.

    The other thing that made a huge difference was maintaining a good AGENTS.md. It’s a single place that explains the project, the structure, what was done. An AI can read that one file and then be productive on any feature without going off in the wrong direction or making the same mistakes as before. One of my recent rules is to always update the AGENTS.md.

    I also have a full copy of the LibreOffice source code, so AIs can search for IDL definitions when they need to find the exact parameter details. It’s a meta development environment. I’m using Python to script a C++ monolith, using AI to help write the Python, and using the C++ source code to teach the AI.

    This plugin is already useful for more than demos. While I was working on this document and saving it in Markdown format, a table got corrupted upon re-opening. So I handed the mess to an LLM, told it to clean it up and remove all the markup characters, and it turned it back into a pretty table in a few seconds. After that I decided to start saving to the native, far richer OpenDocument format which we know and love.

    Next Steps

    I am working on another article for Week 2 where I explain the process of adding MCP, a research sub-agent using Huggingface smolagents, an evaluation framework, talk to your document, and how Quarzadous completely refactored it.

    I’m trying to find more people who want to work together on this. John, the owner of localwriter, is busy at the moment, and I’ve not heard back from the LibreCalc AI writer. Quarzadous basically re-wrote much of the infrastructure (new make system, auto-generation of the settings UI from a config schema, a tool registry, service registry, etc.) but then decided to work on his own fork focusing on MCP.

    It might be over-engineering for a 15-kLOC codebase, but I kept almost everything except for the maze of directories. I plan on picking a new name but haven’t done that yet; it’s a pain, and I was hoping to find an existing codebase and a group of people who want to work together.

    UPDATE: I picked a new name, WriterAgent! If you want to check out the code or try a pre-release, go here: https://github.com/KeithCu/writeragent. I’ve got a test build with all the latest features, including research and audio support for chatting with the AI, across 3 operating systems.

    UPDATE 2: Here’s a link to the Part 2 writeup.

    Enjoy!

  • New Solar Service Website Launch!

    I’m excited to announce the launch of my new solar service website!

    If you’re a homeowner in the Metro Detroit area with a Solar PV system, I’m here to help keep your investment running at peak performance.

    With over 4 years of experience and 700+ service calls under my belt, I specialize in maintaining and repairing solar systems from leading brands like SolarEdge, Enphase, Tesla, Generac, Sol-Ark, SMA, LG, Fronius, Outback, BYD, Solis, Goodwe, and more.

    Unlike many companies that focus primarily on installations, I’m passionate about troubleshooting and maintenance – helping homeowners get the most out of their renewable energy investment.

    Visit my new website for a free consultation and to learn more about my services.

    https://kcsunshineman.com

  • Alternatives to Rust for the Linux Kernel

    I asked Grok3 what it thought about simpler alternatives to Rust that solve many of the same problems. It gave an interesting answer; here is what it said:

    Your skepticism about Rust is totally valid—while it solves a ton of C’s problems, it’s not perfect. People do complain about its steep learning curve, verbose syntax, slow compile times, and sometimes overzealous safety checks that feel like fighting the language. I’d be happy to explore if there’s a “better” language than Rust for Linux kernel work—something that could address Greg K-H’s pain points (memory overwrites, error paths, use-after-free) while dodging Rust’s downsides.

    Let’s define “better” as:

    • Safer than C (tackling Greg’s bug categories).
    • Kernel-friendly (no runtime, fast, low-level control).
    • Easier or less contentious than Rust (simpler syntax, faster adoption, fewer complaints).

    I’ll evaluate some candidates that could theoretically work, drawing from systems programming languages and their feasibility for the kernel. Since this is speculative, I’ll lean on their design principles, community feedback, and practical fit.


    Potential Alternatives to Rust for Linux Kernel Drivers

    1. Zig

    • What It Is: A modern systems language designed as a simpler, safer C replacement. No garbage collector, no runtime—just compiles to machine code.
      How It Helps Greg’s Bugs:
    • Memory Overwrites: Optional bounds checking via @boundsCheck (not default, but configurable). No borrow checker, but explicit allocation (e.g., std.mem.Allocator) reduces accidental overwrites.
    • Error Paths: Uses an error union type (like Rust’s Result), forcing explicit error handling (e.g., try or catch). Cleanup is manual but clearer than C.
    • Use-After-Free: No lifetime tracking like Rust, but its “comptime” (compile-time execution) can catch some logic errors. Still relies on programmer discipline.
      Pros:
    • Simpler than Rust—no borrow checker, less syntax baggage.
    • C interop is seamless (imports C headers directly with @cImport).
    • Fast compile times compared to Rust.
    • Small, predictable binaries—kernel-friendly.
      Cons:
    • Less mature than Rust (smaller ecosystem, fewer tools).
    • Safety is optional (e.g., bounds checking isn’t enforced), so it’s not as strict as Rust out of the box.
    • No kernel adoption yet—Rust has a head start with Rust-for-Linux.
      Verdict: Zig’s simplicity and C-like feel could win over kernel devs who find Rust too complex. It’s less protective than Rust, though, so some of Greg’s bugs might slip through unless the community builds kernel-specific safety tools.

    2. Ada (with SPARK Subset)

    What It Is: A veteran systems language used in safety-critical systems (e.g., aviation, defense). SPARK is a subset with formal verification for extra safety.

    How It Helps Greg’s Bugs:

    • Memory Overwrites: Strong typing and runtime checks prevent buffer overflows. SPARK can statically prove array bounds safety.
    • Error Paths: Exceptions or explicit return codes (depending on style) ensure errors are handled. SPARK verifies cleanup paths.
    • Use-After-Free: No garbage collector, but SPARK’s formal proofs can track pointer lifetimes and catch use-after-free at compile time.
      Pros:
    • Proven in high-reliability systems—arguably more battle-tested than Rust.
    • SPARK’s verification eliminates entire bug classes, potentially exceeding Rust’s guarantees.
    • No runtime overhead in minimal configurations—kernel-compatible.
      Cons:
    • Verbose syntax (not C-like), steep learning curve for kernel devs.
    • Tiny open-source community compared to Rust or C.
    • No kernel integration—would need a massive effort to adapt.
      Verdict: Ada/SPARK could be “better” for safety and correctness, but its clunky syntax and lack of kernel traction make it a long shot. Rust’s momentum kills it here.

    3. D (BetterC Mode)

    • What It Is: A systems language blending C’s simplicity with modern features. “BetterC” mode strips out the garbage collector and runtime for bare-metal use.
    • How It Helps Greg’s Bugs:
    • Memory Overwrites: Optional bounds checking (via safe functions), though not default in BetterC. Stronger typing than C helps.
    • Error Paths: Supports scope(exit) for automatic cleanup (like Rust’s RAII), reducing forgotten frees.
    • Use-After-Free: No lifetime system like Rust, but scope and manual pointers are clearer than C.
      Pros:
    • C-like syntax—easier transition for kernel devs than Rust.
    • Flexible: mixes high-level (e.g., templates) and low-level control.
    • Compiles fast, no runtime in BetterC—kernel-ready.
      Cons:
    • Safety is opt-in (e.g., @safe annotations), not enforced like Rust.
    • Small community, no kernel adoption.
    • Less focus on systems programming than Rust or Zig.
      Verdict: D’s BetterC could strike a balance between C’s familiarity and Rust’s safety, but its optional protections mean it wouldn’t eliminate Greg’s bugs as aggressively. It’s a sleeper candidate with potential.

    4. Nim

    • What It Is: A systems language with Python-like syntax that compiles to C or machine code. No runtime in bare-metal mode.
    • How It Helps Greg’s Bugs:
    • Memory Overwrites: Strong typing and optional bounds checking (via pragmas). Not as strict as Rust.
    • Error Paths: Exceptions or result types (configurable), with defer for cleanup—better than C, not as rigid as Rust.
    • Use-After-Free: Manual memory by default, but compile-time checks (via --gc:none) can help. No borrow checker.
      Pros:
    • Pythonic syntax—you’d love this coming from Python!
    • Lightweight, no runtime option—kernel-compatible.
    • Flexible compilation (C backend integrates with kernel build).
      Cons:
    • Safety is less baked-in than Rust—relies on programmer care.
    • Tiny community, no kernel precedent.
    • Less mature tooling (e.g., debuggers, linters).
      Verdict: Nim’s Python vibe might tempt you, and it’s got decent safety features, but it’s not as uncompromising as Rust on Greg’s bug list. It’s a dark horse with charm.

    5. Hare (Emerging Language)

    • What It Is: A minimalist, C-inspired systems language with no runtime, designed for simplicity and control.
    • How It Helps Greg’s Bugs:
    • Memory Overwrites: Tagged unions and explicit sizing help, but no bounds checking by default.
    • Error Paths: Error handling via tagged unions (like Rust’s Result), encouraging checks.
    • Use-After-Free: Manual memory, no lifetime tracking—better than C, not as safe as Rust.
    • Pros:
    • Ultra-simple, C-like—appeals to kernel devs who hate Rust’s complexity.
    • No runtime, tiny footprint—kernel-perfect.
    • Fast compile times.
    • Cons:
    • Very new (2022-ish), unproven, and lacks ecosystem.
    • Safety is minimal—doesn’t eliminate bugs like Rust does.
    • No community push for kernel use.
    • Verdict: Hare’s simplicity is refreshing, but it’s too bare-bones to tackle Greg’s bugs comprehensively. It’s a “maybe someday” option.

    Comparing to Rust’s Problems

    Rust’s complaints—complexity, slow compiles, borrow checker fights—do create friction. Here’s how the alternatives stack up:

    • Zig: Simplest contender, fast compiles, but less safe. Fewer complaints than Rust, more discipline needed.
    • Ada/SPARK: Verbose and niche, but rock-solid safety. Adoption hurdles outweigh Rust’s quirks.
    • D: Familiar, flexible, less restrictive. Could dodge Rust’s pain points but sacrifices some safety.
    • Nim: Python-like, approachable, but not as battle-ready or strict. Fewer fights than Rust, less protection.
    • Hare: Minimalist dream, but too immature and unsafe to compete.

    Is There a “Better” Language?

    • For Safety: Rust still leads. Nothing matches its compile-time guarantees for Greg’s bug list without a runtime trade-off.
    • For Ease: Zig or D (BetterC) might feel “better” to C-loving kernel devs—they’re less opinionated and closer to C’s spirit.
    • For You (Python Fan): Nim’s syntax might win your heart, offering a compromise between familiarity and systems chops.
    • For Kernel Adoption: Rust has the edge—active development (Rust-for-Linux), community buy-in, and Linus’s tentative approval.

    No language is perfect. Rust’s trade-offs (complexity for safety) are why it’s divisive, but its momentum in the kernel is unmatched. Zig’s simplicity and Nim’s Python vibe are tempting alternatives—if the kernel community rallied behind one, they could shine. Ada’s too clunky, D’s too flexible, and Hare’s too green.

    What’s your take? Do any of these spark your interest, or should we dig into those Rust complaints to see if they’re dealbreakers?

    I asked it whether there could be a simpler Rust with many of the same features:

    Could a Simpler Language Have Rust’s Features?

    Rust’s designers prioritized zero-cost abstractions and safety, but they made trade-offs that ballooned its complexity:

    • Borrow Checker: A brilliant idea—tracking lifetimes to prevent use-after-free and data races—but its strictness leads to verbose code and “fighting the compiler” moments. A simpler approach might relax some rules (e.g., optional lifetimes) at the cost of catching fewer bugs.
    • Ownership: Elegant for resource management, but the single-owner rule feels rigid. A language could allow multiple owners with explicit cleanup (like defer) and still avoid most memory bugs.
    • Syntax: Traits, generics, and macros add power but clutter the language. A leaner design could ditch some of this (e.g., simpler polymorphism) and still enforce safety.

    Imagine a “Rust Lite”:

    • Bounds-checked arrays by default, but no borrow checker—just manual pointers with runtime opt-in safety.
    • RAII-like scope cleanup without strict ownership (e.g., scope(exit) from D).
    • Error handling via Result-like types but with less ceremony (e.g., no unwrap() boilerplate).

    Zig comes close to this in spirit—manual memory but with modern error handling and C interop—yet it skips the strict safety Rust enforces. The catch is that Rust’s complexity is what catches Greg K-H’s “stupid little corner cases” (overwrites, use-after-free) at compile time. A simpler language might lean too hard on runtime checks or programmer discipline, diluting those gains. Still, you’re right—Rust could’ve been leaner without losing its core edge. It’s a design choice, not a law of nature.