<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://thapar25.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://thapar25.github.io/" rel="alternate" type="text/html" /><updated>2026-05-06T11:15:25+00:00</updated><id>https://thapar25.github.io/feed.xml</id><title type="html">thapar.logs</title><subtitle>Hello, world! This is Pulkit Thapar&apos;s vault of notes, wins and lessons from building in tech. AI, hobby projects, and open-source. Mostly the journey, sometimes the opinion, rarely the feelings.</subtitle><entry><title type="html">I Smell The 3D Printer Coming</title><link href="https://thapar25.github.io/2026/05/02/blender-mcp.html" rel="alternate" type="text/html" title="I Smell The 3D Printer Coming" /><published>2026-05-02T00:00:00+00:00</published><updated>2026-05-02T00:00:00+00:00</updated><id>https://thapar25.github.io/2026/05/02/blender-mcp</id><content type="html" xml:base="https://thapar25.github.io/2026/05/02/blender-mcp.html"><![CDATA[<p>There is a pattern I have noticed with myself. The moment I save enough money to feel comfortable, something finds me. Something I want but cannot fully justify.</p>

<p>Right now, that something is a 3D printer.</p>

<p>A few months ago, <a href="https://microsoft.github.io/TRELLIS.2/">TRELLIS 2</a> made it worse. Microsoft dropped a 4 billion parameter model that takes a single image and spits out a fully textured 3D asset in seconds.</p>

<video autoplay="" loop="" muted="" playsinline="" aria-label="Trellis 2 via Microsoft" style="width:70%;">
<source src="https://microsoft.github.io/TRELLIS.2/assets/reconstruction/retro_fridge_tv.mp4" type="video/mp4" />
</video>

<p><br /></p>

<p>I watched the demo, raised an eyebrow, and quietly moved the 3D printer a little higher on my wishlist.</p>

<p>Then this week I did something dumber and more interesting.</p>

<p>I gave Claude a photo of my desk and asked it to build a 3D model in <a href="https://www.blender.org/lab/mcp-server/">Blender via MCP</a>.</p>

<p><img src="/assets/images/claude-blender-mcp-chat.jpg" alt="Claude Chat Screenshot" /></p>

<p><img src="/assets/images/blender-mcp-output.gif" alt="Animation rendered via Blender MCP + Claude" /></p>

<p>Not perfect. But it rendered.</p>

<p><em>One thing worth mentioning for anyone trying this: if Claude and Blender are on separate machines, MCP’s default <code class="language-plaintext highlighter-rouge">stdio</code> won’t cut it. Switch to <code class="language-plaintext highlighter-rouge">http</code> and expose it over your local network.</em></p>

<p>Here is the thing about the 3D printer sitting in my wishlist for years: the barrier was never the money. It was the time. Learning 3D modeling always felt like a full commitment I could not justify. TRELLIS showed me the ceiling of what AI can do here. Blender MCP gave me something more valuable: a starting point I could actually own.</p>

<p>That distinction matters. One does the work for you. The other teaches you how the work gets done.</p>

<p>AI is not here to kill the skill. It is here to hand you the door. You still have to walk through it.</p>

<p>Anyway. Now I know what no-code CEOs felt shipping their first website.</p>

<blockquote>
  <p>The 3D printer is getting closer ;)</p>
</blockquote>]]></content><author><name></name></author><category term="blender" /><category term="3d-printing" /><category term="mcp" /><category term="AI" /><summary type="html"><![CDATA[A rough render, a real unlock, and one step closer to buying a 3D printer.]]></summary></entry><entry><title type="html">Distribution is King: Why Integrations Will Define the AI Moat</title><link href="https://thapar25.github.io/2026/05/01/distribution-is-king.html" rel="alternate" type="text/html" title="Distribution is King: Why Integrations Will Define the AI Moat" /><published>2026-05-01T00:00:00+00:00</published><updated>2026-05-01T00:00:00+00:00</updated><id>https://thapar25.github.io/2026/05/01/distribution-is-king</id><content type="html" xml:base="https://thapar25.github.io/2026/05/01/distribution-is-king.html"><![CDATA[<p>There is a specific kind of satisfaction that comes from watching an idea you had in a hallway conversation turn into a funded product. It has happened enough times now that I have started treating it less as coincidence and more as signal.</p>

<p>Two years ago, I was talking with <a href="https://www.linkedin.com/in/thejpal-ramannagari/">Thejpal Ramannagri</a> about memory in AI systems. The argument was simple: long-term memory is not something a user should be managing. The agent should own it, evolve it, and improve it without prompting. Today, <a href="https://hermes-agent.nousresearch.com/">Hermes Agent</a> is gaining traction almost entirely because of this self-improving memory layer. The idea was not novel because it was clever. It was obvious if you were thinking about what agents actually needed to be useful.</p>

<p>Around the same time, <a href="https://www.linkedin.com/in/yashmakkar/">Yash Makkar</a> wanted to build a tool that turned codebases into visual graphs for developer onboarding. While discussing Hybrid GraphRAG with Neo4j, we landed on something important: codebase data is <code class="language-plaintext highlighter-rouge">semi-structured</code>. An LLM processing an entire codebase into a knowledge base is expensive and often unnecessary. Agents navigating files through search, grep, and glob patterns are cheaper and more accurate for discovery tasks. Claude Code ships exactly this behaviour. Not a coincidence. Just physics.</p>

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">👋 Early versions of Claude Code used RAG + a local vector db, but we found pretty quickly that agentic search generally works better. It is also simpler and doesn’t have the same issues around security, privacy, staleness, and reliability.</p>&mdash; Boris Cherny (@bcherny) <a href="https://twitter.com/bcherny/status/2017824286489383315?ref_src=twsrc%5Etfw">February 1, 2026</a></blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

<p>These two patterns share a common thread: the real unlock was not a better model. It was a better understanding of the problem’s shape.</p>

<p>Which brings me to what I think is the next big one.</p>

<h2 id="the-benchmark-treadmill">The benchmark treadmill</h2>

<p>Right now, the AI discourse is stuck in a loop. A new model drops. A new benchmark is cited. Social media fills up with capability demonstrations. Repeat.</p>

<p><img src="/assets/images/tin-foil-cat-meme.png" style="width: 50%" alt="tinfoil cat meme template" /></p>

<p>Nobody is talking about cost per value. OpenClaw, for instance, is doing genuinely interesting work with integrations, but the conversation around it stays focused on what it can do rather than what it costs to do it in production. Nobody is publishing what their agent actually costs to run versus what it is returning. Yes, AI is advancing at a remarkable pace. But for the majority of businesses that are not top-dollar firms with dedicated AI budgets, the unglamorous questions of cost, reliability, and fit are still the ones that determine adoption. The capability gap is shrinking. The integration gap is not.</p>

<h2 id="the-actual-moat-is-distribution">The actual moat is distribution</h2>

<p>Every major SaaS product right now is bolting AI onto its existing ecosystem. Most are gating it behind a higher tier or a separate membership. This is rational short-term pricing behaviour. It is a long-term adoption trap.</p>

<p>Users do not want another tool. They want their existing tools to get smarter. Look at what Anthropic is doing with Claude: rather than building a monolithic suite, they are taking integrations one at a time. Excel, PowerPoint, and now reportedly Blender. Each integration is a new reason for a different user segment to stay. <a href="https://n8n.io/">n8n</a> tells the same story from a different angle. It became popular not because it was the most powerful automation tool, but because it became a bridge. Any tool with an API could talk to any other tool. The product did not replace your stack. It connected it. That is why it spread.</p>

<p><img src="https://n8niostorageaccount.blob.core.windows.net/n8nio-strapi-blobs-prod/assets/Screenshot_2022_08_05_at_15_05_13_7bb75d8cf5.png" alt="n8n-as-a-bridge: source n8n.io" /></p>

<p>The products that win will be the ones that integrate horizontally across the workflows users already live in, not the ones selling a new suite they have to migrate to. An AI that lives inside Slack, reads your Notion, checks your calendar, and acts on your behalf without requiring you to open a new tab is not just more convenient. It is structurally stickier. The integration is the lock-in. Not the model. Not the interface.</p>

<p>This is why distribution wins. A good-enough model with excellent integrations beats an excellent model with no integrations, every time.</p>

<h2 id="harness-engineering-is-the-name-for-what-this-requires">Harness engineering is the name for what this requires</h2>

<p><img src="https://pbs.twimg.com/media/HFOWvmAaIAAzGHg?format=jpg&amp;name=900x900" alt="Harness Engineering via X.com" /></p>

<p><em>Sources:</em></p>
<ul>
  <li><a href="https://x.com/akshay_pachaar/article/2041146899319971922">@akshay_pachaar <em>via X</em></a></li>
  <li><a href="https://www.langchain.com/blog/the-anatomy-of-an-agent-harness">Vivek Trivedy <em>via Langchain</em></a></li>
</ul>

<p>The industry has recently started using the term “harness engineering” to describe the discipline of building the systems around an AI model: the constraints, the feedback loops, the integrations, the context pipelines. The model is the horse. The harness is everything that makes it go where you need it to go.</p>

<p>This framing correctly relocates the engineering challenge. Think of it this way: the model is the CPU, the harness is the operating system, and the agent is the application. Nobody buys a computer for the CPU alone. The OS is what makes it useful, and the applications are what make it irreplaceable.</p>

<p>Most agent failures in production are not model failures. They are harness failures: broken state management, missing context, tools that do not connect to the right systems. The bottleneck was never the intelligence. It was the infrastructure around it.</p>

<p>Models are becoming interchangeable faster than anyone predicted. The harness is not. And the most defensible part of any harness will be its integrations, how deeply it is woven into the workflows and data sources that users already depend on.</p>

<p>The teams that figure this out first will not just have better products. They will have products that are genuinely hard to replace.</p>

<p>That is the moat. Not the benchmark. Not the context window. The connections.</p>]]></content><author><name></name></author><category term="AI" /><category term="harness-engineering" /><category term="two-cents" /><summary type="html"><![CDATA[Models are becoming commodities. The real competitive advantage in AI is harness engineering, and it starts with distribution.]]></summary></entry><entry><title type="html">Luna v2: The Orchestrator Takes Charge</title><link href="https://thapar25.github.io/2026/04/27/luna-v2.html" rel="alternate" type="text/html" title="Luna v2: The Orchestrator Takes Charge" /><published>2026-04-27T00:00:00+00:00</published><updated>2026-04-27T00:00:00+00:00</updated><id>https://thapar25.github.io/2026/04/27/luna-v2</id><content type="html" xml:base="https://thapar25.github.io/2026/04/27/luna-v2.html"><![CDATA[<p><img src="/assets/images/luna-pokemon-evolution.gif" alt="Pokemon-style Evolution: Luna.AI" /></p>

<p>The <a class="wiki-link" href="/2026/04/19/luna-backend.html">last post about Luna</a> ended with a known problem: one sentence, two agents, one dropped task. The router picked a lane and stayed in it. Luna v2 fixes that.</p>

<p>It is live. Here is what changed.</p>

<hr />

<h2 id="the-new-architecture">The New Architecture</h2>

<p>The single <code class="language-plaintext highlighter-rouge">match</code> statement is gone. In its place: an orchestrator and a set of worker agents.</p>

<p>The orchestrator receives every message, understands intent, and decides who handles it. Workers get invoked in one of two modes:</p>

<ul>
  <li><strong>Ask</strong> - the worker executes and reports back to the orchestrator, which synthesises a final response.</li>
  <li><strong>Delegate</strong> - the worker responds directly to the user. No round-trip.</li>
</ul>

<p>The distinction matters. A task that spans Notion and Google Calendar goes through Ask mode; both agents run, the orchestrator assembles the result. A quick Q&amp;A gets delegated immediately. The right tool for the right depth of task.</p>
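<p>The two modes can be sketched in plain Python. This is not Luna’s actual LangGraph code: the worker names, the <code class="language-plaintext highlighter-rouge">Step</code> type, and the rule “one step means Delegate, several mean Ask” are assumptions made for illustration, and the planning that an LLM would do is replaced by a hand-written list of steps.</p>

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    worker: str  # which worker agent to invoke
    task: str    # the slice of the request that worker should handle


# Hypothetical workers: each takes a task string and returns a result string.
WORKERS: Dict[str, Callable[[str], str]] = {
    "notion": lambda task: f"Notion: created task '{task}'",
    "calendar": lambda task: f"Calendar: blocked time for '{task}'",
    "general": lambda task: f"Answer to '{task}'",
}


def orchestrate(steps: List[Step]) -> str:
    """Delegate single-worker requests; use Ask mode when several workers are needed."""
    if len(steps) == 1:
        # Delegate: the worker's reply goes straight to the user, no round-trip.
        step = steps[0]
        return WORKERS[step.worker](step.task)
    # Ask: every worker reports back, then the orchestrator synthesises one reply.
    reports = [WORKERS[step.worker](step.task) for step in steps]
    return "Done. " + " | ".join(reports)


print(orchestrate([Step("general", "capital of Norway")]))
print(orchestrate([Step("notion", "review roadmap"), Step("calendar", "review roadmap")]))
```

<p>The second call is the case v1 could not handle: two workers run, and the user still gets a single answer.</p>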

<p>This is what LangGraph was always capable of. The v1 graph just did not use it.</p>

<p>A note on latency. Luna is an ambient agent. Fire a request and move on. The priority has always been task completion at minimum cost, not response speed. With that context, the numbers are fine: basic responses around 1.5 seconds, delegated tasks like a calendar query around 4 seconds. There is a well-known trilemma in AI-systems design: performance, cost, and speed. Pick two. This stack is optimized for the first two. Speed is acceptable collateral.</p>

<p><img src="/assets/svgs/mermaid-diagram-2026-04-28T23-07-58.svg" alt="Luna v2 Graph" /></p>

<h2 id="feedback-is-now-a-tool">Feedback Is Now a Tool</h2>

<p>There was a quiet failure mode in v1 that kept showing up in the logs.</p>

<p>Feedback (thumbs up, thumbs down, corrections) was handled by regex matching. Specific phrases triggered specific logic. It was brittle in exactly the way you would expect from hard-coded patterns: the moment ASR transcribed something slightly off, the match failed and the feedback went nowhere.</p>

<p>The fix was to stop treating feedback as a parsing problem and start treating it as an agent capability. The orchestrator now has a dedicated feedback tool it can call when it detects a correction or rating signal, regardless of exact phrasing. Speech-to-text imperfections become the model’s problem to interpret, not the code’s problem to match.</p>
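<p>A small sketch of the difference, under stated assumptions: the pattern below is a stand-in for the v1 matcher, not the real one, and <code class="language-plaintext highlighter-rouge">record_feedback</code> is a hypothetical tool signature. The model call that would decide when to invoke the tool is left out; the last line only shows what such a call might look like.</p>

```python
import re

# v1-style matcher: only exact phrases count as feedback.
FEEDBACK_PATTERN = re.compile(r"^(thumbs up|thumbs down)$", re.IGNORECASE)


def regex_feedback(utterance):
    """Return 'up' or 'down' when the phrase matches exactly, else None."""
    match = FEEDBACK_PATTERN.match(utterance.strip())
    if match is None:
        return None
    return "up" if match.group(1).lower() == "thumbs up" else "down"


# v2-style tool: the model decides when to call it and fills in the arguments.
FEEDBACK_LOG = []


def record_feedback(rating, comment=""):
    """Hypothetical feedback tool; rating must be 'up' or 'down'."""
    if rating not in ("up", "down"):
        raise ValueError("rating must be 'up' or 'down'")
    entry = {"rating": rating, "comment": comment}
    FEEDBACK_LOG.append(entry)
    return entry


print(regex_feedback("thumbs up"))   # exact phrase: matched
print(regex_feedback("thums up"))    # ASR slip: no match, feedback is lost
# What a tool call from the model might look like for the same garbled input:
print(record_feedback("up", comment="thums up"))
```

<p>The tool only validates and stores. Interpreting the garbled transcript is left to the model, which is the whole point of the change.</p>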

<p>Fewer silent failures. More reliable feedback loop.</p>

<h2 id="wispr-flow-and-the-invisible-interface">Wispr Flow and the Invisible Interface</h2>

<p><a href="https://wisprflow.ai">Wispr Flow</a> has been taking off lately, and it is worth calling out in this context.</p>

<p>I have been using it heavily across work, writing, and talking to Luna. Going back to typing feels like a downgrade at this point. Speaking is faster, more natural, and closer to how you actually think.</p>

<p>The premise of Luna has always been voice as the primary interface. Wispr Flow gaining this much traction right now is a signal that the broader market is arriving at the same conclusion.</p>

<p>The roadmap writes itself from here. If voice-first tools are breaking out, local ASR is going to get significantly better. Google and Apple are not going to sit this one out. On-device speech recognition will get smarter, faster, and more context-aware. Luna is already built for a world where that is true.</p>

<h2 id="slack-as-a-second-channel">Slack as a Second Channel</h2>

<p>Luna now lives in Slack.</p>

<p>Two reasons this happened:</p>

<ul>
  <li>
    <p>First, I started using Slack seriously and immediately saw the overlap. Luna already knows my Notion workspace, my calendar, my tasks. Having that available in the context where I am already thinking about work is not a nice-to-have, it is the right place for it. Tag Luna in a channel, ask a question, get something done. Same brain, new door.</p>
  </li>
  <li>
    <p>Second, observability. LangSmith handles full traces. But I do not always want to open LangSmith. Sometimes I just want to see what the agent is doing in the background - which tools it called, what it decided, where it went. I am considering writing selective log messages directly to a Slack channel. I control what gets logged. It is a lightweight window into agent behaviour without the overhead of pulling up a full trace.</p>
  </li>
</ul>
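<p>A rough sketch of what that selective logging could look like, assuming a Slack incoming webhook (which accepts a JSON body with a <code class="language-plaintext highlighter-rouge">text</code> field). The event kinds and the example tool name are made up for illustration, and the webhook URL is something you would supply yourself.</p>

```python
import json
import urllib.request

# Only these event kinds get mirrored to Slack; everything else stays in full traces.
LOGGED_EVENTS = {"tool_call", "routing_decision"}


def build_payload(kind, detail):
    """Return a Slack message payload for events worth logging, else None."""
    if kind not in LOGGED_EVENTS:
        return None
    return {"text": f"[{kind}] {detail}"}


def post_to_slack(webhook_url, payload):
    """Send the payload to a Slack incoming webhook (performs a network call)."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


print(build_payload("tool_call", "calendar.list_events"))
print(build_payload("token_usage", "812 tokens"))
```

<p>Keeping the filter separate from the network call means the “what gets logged” decision can be changed or tested without touching Slack at all.</p>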

<p>Luna inside Slack channels is still being tested. Notion queries and calendar lookups are working. More to follow.</p>

<p><img src="/assets/images/luna-x-slack.png" alt="Luna on Slack" /></p>

<hr />

<h2 id="one-brain-three-voices">One Brain, Three Voices</h2>

<p>WhatsApp is the next channel on the list. But adding interfaces is no longer just a routing problem; it is a formatting problem.</p>

<p>Think about it. The iOS shortcut speaks its response out loud. A bullet point in a TTS response sounds like someone reading a list at you. An asterisk sounds like nothing at all. That interface needs plain prose: conversational, natural, the kind of language you would use speaking to someone, not presenting to them.</p>

<p>WhatsApp is different. It is a screen. Line breaks, spacing, maybe a little structure, all of that renders and helps. Slack goes further: work context, larger screen real estate, richer formatting is appropriate and expected.</p>

<p>Same response. Three different contracts for how it gets delivered.</p>

<p>Luna will eventually need to be aware of which channel it is speaking into and shape its output accordingly. Not just what to say, but how to say it. That is the next layer of work before new interfaces get added casually.</p>
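<p>One way this could look, as a sketch rather than a design: a single renderer that takes the same content and a channel name. The channel labels and the formatting choices for each are assumptions on my part; Luna does not do this yet.</p>

```python
def render(points, channel):
    """Shape the same list of points for a given channel (names are illustrative)."""
    if channel == "voice":
        # TTS: plain prose, no symbols that would be read aloud or skipped.
        if len(points) == 1:
            return points[0] + "."
        return ", ".join(points[:-1]) + " and " + points[-1] + "."
    if channel == "whatsapp":
        # Small screen: line breaks and light structure.
        return "\n".join("- " + point for point in points)
    if channel == "slack":
        # Work context: richer formatting is expected.
        return "*Summary*\n" + "\n".join("• " + point for point in points)
    raise ValueError(f"unknown channel: {channel}")


agenda = ["Standup at 10", "Gym at 6"]
for channel in ("voice", "whatsapp", "slack"):
    print(render(agenda, channel))
    print("---")
```

<p>The agents keep producing structured content. Only the last step knows which room it is speaking into.</p>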

<p>Same brain. But different voices for different rooms.</p>

<h2 id="what-is-next">What Is Next</h2>

<p>WhatsApp is in progress. The n8n automation layer is next in line. And before any of that ships cleanly, the channel-aware response formatting above has to be solved. You do not want Luna reading markdown bullet points into your AirPods.</p>

<p>The architecture is ready. The voice has to catch up.</p>]]></content><author><name></name></author><category term="Luna" /><category term="AI" /><category term="homelab" /><category term="personal-assistant" /><category term="LangGraph" /><category term="Slack" /><summary type="html"><![CDATA[Luna v2 is live. The single-destination router is gone. Here is what replaced it, why feedback is now a tool, and how Slack became a second home for the brain.]]></summary></entry><entry><title type="html">Everyone in SF Knows GitHub Stars Are Fake. Nobody Cares.</title><link href="https://thapar25.github.io/2026/04/22/github-stars.html" rel="alternate" type="text/html" title="Everyone in SF Knows GitHub Stars Are Fake. Nobody Cares." /><published>2026-04-22T00:00:00+00:00</published><updated>2026-04-22T00:00:00+00:00</updated><id>https://thapar25.github.io/2026/04/22/github-stars</id><content type="html" xml:base="https://thapar25.github.io/2026/04/22/github-stars.html"><![CDATA[<h2 id="are-star-histories-oss-milestones">Are Star Histories OSS Milestones?</h2>

<p><img src="https://api.star-history.com/chart?repos=raga-ai-hub/RagaAI-Catalyst%2Cnousresearch/hermes-agent&amp;type=date&amp;legend=top-left" alt="Star History Chart" /></p>

<p>There’s a repo on GitHub right now with 14,000 stars, a gorgeous README, and code that hasn’t had a meaningful commit in eight months. You’ve probably starred something like it. So have I.</p>

<p>We all know what a GitHub star actually means in 2026: someone thought a project looked cool for thirty seconds. Maybe they were procrastinating. Maybe it showed up on Hacker News. Maybe someone in a San Francisco office spent $200 to make it look like traction before a seed round. Nobody says that last part out loud.</p>

<h2 id="the-game-everyones-playing">The game everyone’s playing</h2>

<p>Here’s how the SF open-source playbook works right now, and it’s barely a secret.</p>

<p>You build a tool. You write a beautiful README with a snappy GIF. You post it to every subreddit, every Discord, every corner of the internet simultaneously. You call it a “day one launch.” Then an AI influencer with 200k followers on X quote-tweets it with “this is going to be HUGE 🚀” without having run a single line of it. Three more influencers repost that. Stars pour in. If you’re playing the game seriously, you might top it off with a few hundred purchased ones, because VC firms have <a href="https://www.redpoint.com/content-hub/written/so-how-many-stars-is-enough/">literal scrapers watching star velocity</a>, and <a href="https://runacap.com/ross-index/">Runa Capital publishes a quarterly ranking</a> of open-source startups sorted almost entirely by 90-day star growth. Sixty-eight percent of those ranked companies subsequently raised funding.</p>

<p>You can buy 1,000 stars for around $64. <a href="https://awesomeagents.ai/news/github-fake-stars-investigation/">Providers are easy to find</a>: prices run from $0.10 to $2.00 per star, delivery in hours, no login required. Against a $2M seed round, that math writes itself.</p>

<p>The part that makes it funny, in a bleak way: <a href="https://arxiv.org/abs/2412.13459">a peer-reviewed paper presented at ICSE 2026</a> by researchers at CMU, NC State, and Socket scanned 20 terabytes of GitHub data and found roughly <strong>six million suspected fake stars</strong> across 18,617 repositories. AI and LLM repos are now the single largest non-malicious category of recipients. GitHub’s response has mostly been to quietly delete the flagged repos after someone else does the detective work.</p>

<h2 id="the-actual-product-is-the-readme">The actual product is the README</h2>

<p>Stars don’t just get bought. They get <em>gamed</em> organically too, in a way that’s arguably worse.</p>

<p><a href="https://blog.bytebytego.com/p/top-ai-github-repositories-in-2026">OpenClaw</a> went from 9,000 to 60,000 stars in a few days in January 2026, then blew past 210,000. Legitimately impressive. But it also set the benchmark everyone else is now trying to fake. The week it peaked, a dozen “autonomous agent frameworks” launched with nearly identical READMEs, riding the same wave of AI influencer reposts. Most were a <code class="language-plaintext highlighter-rouge">for</code> loop and some string formatting. A few bought stars to close the gap.</p>

<p>Here’s the part the paper confirms that most people don’t know: it doesn’t even work. The CMU study found that fake stars produce a small bump in organic attention for at most two months, then become a net negative. Real users can smell something off. The lockstep patterns, hundreds of accounts starring the same repo in the same 30-day window with no other activity, register as a trust penalty once the algorithm catches up.</p>

<p>Stars measure whether something looked impressive during a 3-minute scroll. They have <a href="https://www.ndss-symposium.org/wp-content/uploads/madweb2024-4-paper.pdf">almost no correlation</a> with whether the code works, whether it’s maintained, or whether anyone actually runs it in production.</p>

<p>The engineers who’ve been building quietly for ten years know this. The maintainer of <a href="https://github.com/SocketCluster/socketcluster">SocketCluster</a> put it plainly after watching his 6,000 legitimately-earned stars become meaningless:</p>

<blockquote>
  <p>“It sucks having put in the effort and seeing it get lost in a sea of scams and seeing people doubting my project’s own authenticity.”</p>
</blockquote>

<h2 id="nobodys-stopping">Nobody’s stopping</h2>

<p>The honest answer is that the game continues because everyone’s playing it and nobody wants to be the one who stops first.</p>

<p>VCs keep using stars as a lazy proxy for developer love because it’s a number that fits in a spreadsheet. Founders keep optimising for stars because that’s what gets VC meetings. AI influencers keep boosting every shiny new repo because engagement is engagement and nobody checks back in six months when the repo is abandoned. Developers keep using stars as a trust signal when picking dependencies because who has time to read the actual source.</p>

<p>And the end of that chain is darker than most people realise. The CMU paper found one repo with 111 stars, 109 of them fake, presenting as a Solana trading bot. Hidden inside: a <code class="language-plaintext highlighter-rouge">spawn()</code> call quietly executing a remote obfuscated script that drained wallets. That’s where <a href="https://research.checkpoint.com/2024/stargazers-ghost-network/">the Stargazers Ghost Network</a> went: 3,000 coordinated bot accounts selling fake stars as a distribution channel for malware, because a starred repo looks trustworthy enough to clone.</p>

<p>The people proposing fixes, like <a href="https://gerus-lab.hashnode.dev/your-open-source-repo-has-10k-github-stars-half-of-them-are-fake">fork-to-star ratios</a>, contributor counts, <a href="https://about.scarf.sh/">download telemetry</a>, and <a href="https://scorecard.dev/">OpenSSF scorecards</a>, are correct and will largely be ignored. The fix requires effort. The game only requires a credit card.</p>
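<p>To show how little effort the simplest of those checks takes, here is a sketch of a fork-to-star ratio. The repo names and numbers are made up, and the <code class="language-plaintext highlighter-rouge">min_ratio</code> cutoff is an arbitrary assumption, not a threshold from any of the sources above.</p>

```python
def fork_to_star_ratio(stars, forks):
    """Forks per star; 0.0 when the repo has no stars."""
    return forks / stars if stars else 0.0


def looks_inflated(stars, forks, min_ratio=0.02):
    """Flag repos whose forks lag far behind their stars. min_ratio is an arbitrary cutoff."""
    return stars > 0 and min_ratio > fork_to_star_ratio(stars, forks)


# Made-up numbers purely for illustration.
repos = {"quiet-workhorse": (6000, 540), "overnight-sensation": (14000, 90)}
for name, (stars, forks) in repos.items():
    print(name, round(fork_to_star_ratio(stars, forks), 3), looks_inflated(stars, forks))
```

<p>A low ratio is a prompt to look closer, not proof of anything. Plenty of honest projects are starred far more often than they are forked.</p>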

<p>The GitHub star is not dead. It just means something different now. It means someone wanted you to think a project was popular. Whether it actually is, that part’s still on you to figure out.</p>]]></content><author><name></name></author><category term="AI" /><category term="GitHub" /><category term="open-source" /><category term="oss" /><category term="cybersecurity" /><summary type="html"><![CDATA[The perception of GitHub stars as a reliable indicator of project quality has significantly diminished due to large-scale manipulation and their use as a "vanity metric".]]></summary></entry><entry><title type="html">The Code Was Never The Point</title><link href="https://thapar25.github.io/2026/04/21/code-is-not-the-point.html" rel="alternate" type="text/html" title="The Code Was Never The Point" /><published>2026-04-21T00:00:00+00:00</published><updated>2026-04-21T00:00:00+00:00</updated><id>https://thapar25.github.io/2026/04/21/code-is-not-the-point</id><content type="html" xml:base="https://thapar25.github.io/2026/04/21/code-is-not-the-point.html"><![CDATA[<p><img src="https://preview.redd.it/just-a-meme-still-maybe-worth-discussion-v0-5susw7fbhzbe1.jpeg?auto=webp&amp;s=daf694ca90eb70e592bd92ae5d60c007d7bd740b" alt="AI-assisted coding bell-curve meme" /></p>

<p>Code was a medium. Not the product. Not the craft. The medium.</p>

<p>The machine does not admire your naming conventions. It does not appreciate the abstraction layer. It executes. And yet, somewhere along the way, we convinced ourselves that the writing of code was the point, rather than a means to one.</p>

<p>Jensen Huang said it plainly: <em>“The purpose of a software engineer is to solve known problems and find new ones. Coding is one of the tasks.”</em></p>

<p>Some engineers bristled. They heard the wrong thing. He was not diminishing code. He was correctly categorizing it.</p>

<h2 id="the-tax-you-paid-to-ship">The tax you paid to ship</h2>

<p>The gap between <em>what I want done</em> and <em>code that does it</em> was always the job. The outcome was the job. Code was the tax you paid to get there.</p>

<p>Agentic tools have not automated programming. They have made the tax cheaper. You describe the outcome; the agent drafts the filing. You review it.</p>

<p>That is a more honest relationship with the medium. And honesty, it turns out, is uncomfortable for an industry that built its professional identity around paying that tax with elegance.</p>

<h2 id="clean-code-was-written-for-humans">Clean code was written for humans</h2>

<p>Every quality metric we use (readability, DRY principles, abstraction clarity) was designed for one human to hand code to another.</p>

<p>But agents are now both writing and reading code. And an agent does not need your comments. It does not benefit from your nested abstractions. It works better with less surface area to misread.</p>

<blockquote>
  <p><em>The conventions we treat as gospel were built for a consumer that is no longer the only one in the room.</em></p>
</blockquote>

<p>This does not mean clean code is dead. It means the definition is overdue for an update. A codebase optimized for agent-assisted workflows might look leaner, flatter, and more annotated in markdown than in inline comments. Documentation moves out of the code and into context files that actively shape how the agent writes and extends the repo.</p>

<p>That is a different kind of craft. Not a lesser one.</p>

<h2 id="the-repository-is-becoming-a-hybrid-artifact">The repository is becoming a hybrid artifact</h2>

<p>The codebases of the next five years will carry as much prose as logic. Not documentation trailing six months behind the code, but living context that instructs the agents maintaining it.</p>

<p><code class="language-plaintext highlighter-rouge">CLAUDE.md</code>. <code class="language-plaintext highlighter-rouge">AGENTS.md</code>. Architecture decision records written not for your future colleague, but for the model that will touch the code before they do.</p>

<p>The source code is for the machine. The markdown is for the agent. The agent does the translation.</p>

<p>Which raises a question worth sitting with: if the most valuable part of a modern repository is its surrounding context, what does that mean for how we evaluate engineering work? We have metrics for code coverage, complexity, and performance. We have almost nothing for the quality of the prose that now shapes how that code gets built.</p>

<hr />

<h2 id="we-built-the-vault-before-we-knew-what-we-were-preserving">We built the vault before we knew what we were preserving</h2>

<p>In 2020, GitHub ran the <a href="https://archiveprogram.github.com/arctic-vault/">Arctic Code Vault</a> campaign and buried a snapshot of public repositories in a coal mine in Svalbard, Norway. A time capsule meant to last a thousand years. At the time it felt like a stunt.</p>

<p>It looks different now. A codebase stripped of its context files is code without an operating manual. The logic is there. The intent is not.</p>

<p>Future developers, human or otherwise, will not just need the source. They will need the surrounding layer of decisions, constraints, and instructions that gave it shape.</p>

<hr />

<p>Code was always a means to an end. The agent era did not change that. It just made it impossible to pretend otherwise.</p>

<p>The question now is not whether your code is clean. It is whether your thinking is.</p>]]></content><author><name></name></author><category term="AI" /><category term="software-engineering" /><category term="agents" /><category term="two-cents" /><summary type="html"><![CDATA[Code was always a medium, not the message. Agents writing and reading code forces a reckoning with what we actually valued about it and why the Arctic Code Vault just got a lot more interesting.]]></summary></entry><entry><title type="html">The Underdog Stack: How Luna Runs on Free Inference</title><link href="https://thapar25.github.io/2026/04/19/luna-backend.html" rel="alternate" type="text/html" title="The Underdog Stack: How Luna Runs on Free Inference" /><published>2026-04-19T00:00:00+00:00</published><updated>2026-04-19T00:00:00+00:00</updated><id>https://thapar25.github.io/2026/04/19/luna-backend</id><content type="html" xml:base="https://thapar25.github.io/2026/04/19/luna-backend.html"><![CDATA[<p><img src="/assets/images/the-donna.png" alt="The Donna Device from Suits" /></p>

<p>If you’ve watched Suits, you already get it. <a href="https://chat.openai.com/?q=what+is+the+Donna+device+from+Suits">(in case you’re uncultured)</a></p>

<p>I wanted that. Something omnipresent, anticipatory, works across every context. I called it <strong>Luna</strong> instead of The Donna because the vision was never just one interface. A shortcut, a WhatsApp message, a Slack command, an n8n automation. Same brain, different doors.</p>

<p>The <a class="wiki-link" href="/2026/04/12/luna-ios-shortcut.html">last post</a> covered the front door: an iOS shortcut on my lock screen, one tap, voice memo fired into a backend. This post is about what is behind that door, what it actually runs on, and why I am rebuilding it before adding anything new.</p>

<hr />

<h2 id="the-architecture-right-now">The Architecture (Right Now)</h2>

<p><img src="/assets/svgs/mermaid-diagram-2026-04-19T21-47-58.svg" alt="Luna AI Graph as Mermaid Chart" /></p>

<p>Luna is a <a href="https://langchain-ai.github.io/langgraph/">LangGraph</a> multi-agent system. Every message comes in, gets loaded with chat history from Redis, and hits a router. The router classifies intent and fires to one of four agents:</p>

<ul>
  <li><strong>Notion agent</strong>: reads and writes to my workspace. Tasks, projects, fitness logs, notes.</li>
  <li><strong>Calendar agent (Donna)</strong>: manages Google Calendar. Scheduling, busy slots, edits, removals.</li>
  <li><strong>Fitness tracker (Rocky)</strong>: dedicated logging agent for workouts and activity.</li>
  <li><strong>General agent</strong>: everything else. Q&amp;A, quick lookups, conversations.</li>
</ul>

<p>Each agent runs its tools, returns a response, and the result gets pushed back to Redis with the updated conversation history. Clean, stateless agents. Stateful conversations.</p>
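<p>The message lifecycle above can be sketched in a few lines. This is a minimal illustration, not Luna’s actual graph: a plain dict stands in for Redis, a keyword check stands in for the LLM router, and every function name here is hypothetical.</p>

```python
# In-memory stand-in for Redis: session id mapped to a list of (role, text) turns.
HISTORY = {}


# Stateless agents: each one receives the message and the history, returns a reply.
def notion_agent(text, history):
    return "notion handled: " + text


def calendar_agent(text, history):
    return "calendar handled: " + text


def fitness_agent(text, history):
    return "fitness handled: " + text


def general_agent(text, history):
    return "general handled: " + text


AGENTS = {
    "notion": notion_agent,
    "calendar": calendar_agent,
    "fitness": fitness_agent,
    "general": general_agent,
}


def classify(text):
    """Toy keyword router; the real router is an LLM call (assumption: keywords are made up)."""
    lowered = text.lower()
    if "task" in lowered or "note" in lowered:
        return "notion"
    if "schedule" in lowered or "meeting" in lowered:
        return "calendar"
    if "workout" in lowered or "pull-ups" in lowered:
        return "fitness"
    return "general"


def handle_message(session_id, text):
    history = HISTORY.get(session_id, [])  # 1. load chat history
    intent = classify(text)                # 2. route to exactly one agent
    reply = AGENTS[intent](text, history)  # 3. run the agent
    # 4. push the updated conversation back to the store
    HISTORY[session_id] = history + [("user", text), ("assistant", reply)]
    return intent, reply
```

<p>The agents keep no state of their own. Everything that persists between messages lives in the history store, which is what lets any agent pick up a conversation mid-stream.</p>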

<p><img src="/assets/images/langsmith-dashboard-tools.png" style="border-radius: 10px;" alt="Langsmith Dashboard for Tools" /></p>

<p>Observability and feedback loops are handled via <a href="https://smith.langchain.com/">LangSmith</a> (more on this in a later post).</p>

<p>That is the current version. It works. And it has one problem.</p>

<h2 id="where-it-breaks">Where It Breaks</h2>

<p>Say you tell Luna: <em>“Add a task to check out Wan2.2 and block two hours for it on Friday, after work.”</em></p>

<p>One sentence. Two agents. Right now the router picks one and sends it there. The Notion agent creates the task. The calendar block never happens. Or vice versa.</p>

<p>The routing is a single <code class="language-plaintext highlighter-rouge">match</code> statement. One task, one destination. No mechanism for agents to talk to each other, no fan-out for overlapping intent, no synthesis layer for dual-delegation.</p>

<p>Usage data made this obvious. It kept showing up in the logs. Luna v2 fixes this: the router identifies multi-agent tasks, dispatches to multiple nodes in parallel, and synthesizes a single response. LangGraph supports it. The current graph just does not use it yet.</p>

<h2 id="the-underdog-stack">The Underdog Stack</h2>

<p>Luna runs on a repurposed college laptop. Ubuntu, homelab, Cloudflare Tunnel. Infrastructure cost: electricity. API cost: close to zero.</p>

<p><strong><a href="https://groq.com">Groq</a></strong> runs three jobs. <a href="https://console.groq.com/docs/model/whisper-large-v3">Whisper</a> handles transcription. The router uses <a href="https://console.groq.com/docs/model/openai/gpt-oss-20b">gpt-oss-20b</a> with structured output and strict mode for fast, reliable intent classification. The lightweight agents (general Q&amp;A, calendar, fitness) run on <a href="https://console.groq.com/docs/model/openai/gpt-oss-120b">gpt-oss-120b</a>. Fast enough to feel instant. Free tier covers everything at personal usage volumes.</p>
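<p>To make “structured output with strict mode” concrete, here is a sketch of what the router contract could look like. I am assuming the OpenAI-style <code class="language-plaintext highlighter-rouge">json_schema</code> response format. The schema name, the field name, and the <code class="language-plaintext highlighter-rouge">parse_route</code> helper are all hypothetical, so check the provider docs before copying the request shape.</p>

```python
import json

INTENTS = ["notion", "calendar", "fitness", "general"]

# Assumed request fragment in the OpenAI-style structured-output shape.
# The enum pins the model to one of the four known destinations.
RESPONSE_FORMAT = {
    "type": "json_schema",
    "json_schema": {
        "name": "route",  # hypothetical schema name
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {"intent": {"type": "string", "enum": INTENTS}},
            "required": ["intent"],
            "additionalProperties": False,
        },
    },
}


def parse_route(raw):
    """Parse the model's JSON reply and reject anything outside the contract."""
    data = json.loads(raw)
    if set(data) != {"intent"} or data["intent"] not in INTENTS:
        raise ValueError("unexpected router output: " + raw)
    return data["intent"]
```

<p>A small model is enough for this job because the output space is tiny: one field, four allowed values. The local check is a cheap second line of defence in case a reply slips past the schema.</p>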

<p><strong><a href="https://openrouter.ai/z-ai/glm-4.5-air:free">GLM-4.5 Air by Z-AI</a></strong> handles the heavy Notion agent. Ten tools, full datasource context, real read-write operations against my workspace. Purpose-built for agentic applications, MoE architecture, 131K context window, and a thinking/non-thinking toggle depending on whether the flow needs reasoning or just speed. Reasoning is explicitly turned off in the Notion agent config. For tool-calling flows you want decisiveness, not deliberation. A Chinese lab’s open-source model is running the most critical part of this system for free. That is worth saying out loud.</p>

<p>Before GLM-4.5 Air, this slot was <strong><a href="https://openrouter.ai/stepfun/step-3.5-flash:free">Step 3.5 Flash by StepFun</a></strong> (really slept on, in my opinion). 196 billion total parameters with only 11 billion active per token via sparse MoE. Multi-Token Prediction generating 4 tokens per forward pass, hitting 100 to 300 tokens per second in typical usage. It reasoned like a large model and moved like a small one. The free tier on OpenRouter dried up last month. GLM stepped in and has not missed a beat.</p>

<p>Both of these models come from labs that do not get the coverage they deserve. If you are building anything agentic and have not looked at either of them, you should.</p>

<p>Four OpenRouter API keys rotate automatically via a <code class="language-plaintext highlighter-rouge">KeyRotator</code>. When one hits a rate limit, the next one picks up. Rate limits per key become irrelevant at personal usage volumes.</p>
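<p>The rotation idea fits in a small class. This is a sketch under assumptions, not the actual <code class="language-plaintext highlighter-rouge">KeyRotator</code>: the <code class="language-plaintext highlighter-rouge">RateLimited</code> exception and the <code class="language-plaintext highlighter-rouge">call</code> method are names I invented for the example, and real code would detect the limit from the provider’s HTTP response.</p>

```python
class RateLimited(Exception):
    """Raised by the caller-supplied function when a key hits its limit (hypothetical)."""


class KeyRotator:
    def __init__(self, keys):
        if not keys:
            raise ValueError("at least one key is required")
        self._keys = list(keys)
        self._index = 0

    @property
    def current(self):
        return self._keys[self._index]

    def rotate(self):
        # Wrap around so the first key gets another chance later.
        self._index = (self._index + 1) % len(self._keys)
        return self.current

    def call(self, fn):
        """Try fn(key) with each key at most once, starting from the current one."""
        for _ in range(len(self._keys)):
            try:
                return fn(self.current)
            except RateLimited:
                self.rotate()
        raise RateLimited("all keys are rate limited")
```

<p>The rotator stays on whichever key last worked, so a healthy key keeps serving requests until it is the one that gets limited.</p>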

<p><strong><a href="https://deepmind.google/models/gemini/">Gemini</a></strong> sits at the bottom via <code class="language-plaintext highlighter-rouge">ModelFallbackMiddleware</code>. GLM fails, Gemini catches it. Gemini fails, Groq catches it. Three layers of free inference before a single rupee gets spent.</p>
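<p>Stripped of the framework, the fallback chain is just an ordered list and a loop. The sketch below is not <code class="language-plaintext highlighter-rouge">ModelFallbackMiddleware</code> itself. It is a plain-Python stand-in with fake provider functions, showing the order described above.</p>

```python
def call_with_fallback(providers, prompt):
    """Try each (name, fn) pair in order and return the first success."""
    errors = []
    for name, fn in providers:
        try:
            return name, fn(prompt)
        except Exception as exc:  # broad on purpose: any failure moves to the next layer
            errors.append(name + ": " + str(exc))
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Fake providers for the example; real ones would be model clients.
def glm(prompt):
    raise TimeoutError("glm timed out")


def gemini(prompt):
    return "gemini says hi"


def groq(prompt):
    return "groq says hi"


CHAIN = [("glm", glm), ("gemini", gemini), ("groq", groq)]
```

<p>Calling <code class="language-plaintext highlighter-rouge">call_with_fallback(CHAIN, "hello")</code> skips the failing GLM stub and returns the Gemini reply. Groq is only reached if Gemini fails too.</p>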

<p>This is not cheapness. It is a deliberate tiered inference strategy. Fast model for routing, capable free model for heavy tool use, two fallback layers below it.</p>

<p>We call that the finest form of <a href="https://en.wikipedia.org/wiki/Jugaad"><em><strong>Jugaad</strong></em></a>.</p>

<h2 id="what-is-next">What Is Next</h2>

<p>Luna v2 is the immediate priority: rewire tools, multi-agent dispatch, inter-agent communication, proper handling of tasks that span multiple domains.</p>

<p>After that: a WhatsApp interface already <a href="https://www.linkedin.com/posts/pulkit-thapar_ambientagents-selfhost-homelab-activity-7404026765439950849-ydaJ">in progress</a>, then Slack, then the n8n automation layer expanding significantly. Same brain. More doors.</p>

<p>The shortcut was always the starting point. The architecture has to be built for omnipresence before any new interface gets added.</p>

<p>The brain comes first. The ears come later.</p>]]></content><author><name></name></author><category term="Luna" /><category term="AI" /><category term="homelab" /><category term="personal-assistant" /><category term="LangGraph" /><summary type="html"><![CDATA[Luna is being rebuilt. Here is what is running under the hood right now, why a Chinese MoE model nobody talks about is doing the heavy lifting, and why the next version fixes a problem the current one cannot.]]></summary></entry><entry><title type="html">The Friction Is the Feature (You’re Fighting)</title><link href="https://thapar25.github.io/2026/04/12/luna-ios-shortcut.html" rel="alternate" type="text/html" title="The Friction Is the Feature (You’re Fighting)" /><published>2026-04-12T00:00:00+00:00</published><updated>2026-04-12T00:00:00+00:00</updated><id>https://thapar25.github.io/2026/04/12/luna-ios-shortcut</id><content type="html" xml:base="https://thapar25.github.io/2026/04/12/luna-ios-shortcut.html"><![CDATA[<div class="tenor-gif-embed" data-postid="18002448761037760173" data-share-method="host" data-aspect-ratio="1.33333" data-width="100%"><a href="https://tenor.com/view/louis-litt-litt-litt-up-suits-gif-18002448761037760173">Louis Litt Litt Up GIF</a>from <a href="https://tenor.com/search/louis+litt-gifs">Louis Litt GIFs</a></div>
<script type="text/javascript" async="" src="https://tenor.com/embed.js"></script>

<p><br /><br /></p>

<p><a href="https://suits.fandom.com/wiki/Louis_Litt">Louis Litt</a> carried a Dictaphone everywhere. Barked memos into it mid-stride, never broke his pace, never got distracted. It always seemed a little ridiculous.</p>

<p>I get it now.</p>

<hr />

<p>Every habit tracker fails the same way.</p>

<p>Not because it’s badly built. Because it requires you to open it. You finish a workout, tell yourself you’ll log it later, and later never comes. The app becomes a graveyard of good intentions. The data gap is not a discipline problem. It’s a friction problem.</p>

<p>This is what I was trying to solve with Luna.</p>

<h2 id="what-luna-actually-does-the-short-version">What Luna actually does (the short version)</h2>
<p>Luna is my personal AI assistant: a <a href="https://langchain-ai.github.io/langgraph/">LangGraph</a> multi-agent system and <a href="https://n8n.io/">n8n</a> automation flows, all running on my homelab, wired into my Notion workspace and Google Calendar (for now). One name, one interface, for everything.</p>

<p>This post is about the front door: an <a href="https://support.apple.com/en-in/guide/shortcuts/welcome/ios">iPhone Shortcut</a>.</p>

<h2 id="the-constraint-that-made-the-decision-easy">The constraint that made the decision easy</h2>

<p>Luna runs on my homelab, exposed to the outside world via a <a href="https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/">Cloudflare Tunnel</a>. That means REST API: clean, simple, battle-tested. What it doesn’t give me is a persistent duplex connection. WebSocket is off the table for now. So a native app with any kind of real-time feel would’ve been more work for a worse result. Shortcuts talking to a REST endpoint is the honest architecture for the setup I have.</p>

<p>I’ll build a proper app when it makes sense. Right now, it doesn’t.</p>

<h2 id="two-shortcuts-one-idea">Two shortcuts, one idea</h2>

<p>There are two shortcuts doing the work.</p>

<p>The first lives on my lock screen and in the notification center. One tap. It records my voice, sends the audio to <a href="https://console.groq.com/docs/speech-text">Groq’s Whisper API</a> for transcription, and fires the text to Luna’s backend. I never unlock the phone. I never see a feed. The job is done before the algorithm gets a chance.</p>

<p>The second is called <strong>“Ask Luna”</strong>, completely hands-free. Say <em>“Hey Siri, Ask Luna”</em> and it runs. This one uses on-device ASR instead of Whisper, so the accuracy isn’t as sharp, but it works well enough when I’m driving or my hands are full.</p>

<p>A Dictaphone. With a backend.</p>

<h2 id="the-actual-problem-it-solves">The actual problem it solves</h2>

<p>Think about every app that asks you to manually enter information: fitness logs, food diaries, task managers, journals. The input step is where they all die. By the time you’ve unlocked your phone, navigated to the right screen, and tapped into a text field, you’re already two notifications deep into something else.</p>

<p>Voice-to-lock-screen cuts all of that out. I finish a set, tap once, say “12 pull-ups, ate clean today,” and move on. That goes into my Notion Fitness Tracker. No app open, no feed in my peripheral vision, no “I’ll do it later.”</p>

<p>Same for tasks: <em>“Add a note to read <a href="https://paulgraham.com/hackpaint.html">Hackers and Painters</a> by Paul Graham”</em> lands in my Hobby Projects board. Scheduling requests go to Donna, my calendar agent. Luna routes everything.</p>

<h2 id="luna-isnt-a-chatbot">Luna isn’t a chatbot</h2>
<p>This is worth saying clearly: Luna isn’t designed for conversation. The priority is to get things done, not generate replies.</p>

<p>When I log a workout, I don’t <em>need</em> an acknowledgement. When I create a task, I don’t <em>need</em> it read back to me. The shortcut fires, the agent acts, and my phone reads the response back to me via TTS when there’s something worth saying. Most interactions are still fire-and-forget: Luna confirms a task was logged, tells me my next meeting, answers a question. No screen required. The async model isn’t a limitation. It’s the point.</p>

<h2 id="why-not-a-wearable">Why Not a Wearable</h2>

<p>A quick detour because why reinvent the wheel.</p>

<ul>
  <li>
    <p><strong><a href="https://neosapien.ai/">Neo Sapien</a></strong> is an Indian startup doing interesting work here. Reached out to the founder directly. Hardware was not coming my way, and the device is closed source. Even if I ordered one, there is no path to wiring it to your own backend. Hard pass.</p>
  </li>
  <li>
    <p><strong><a href="https://www.omi.me/">OMI</a></strong> is open source, hackable, and developer-friendly. The problem is battery. Transcription and active task processing drain it fast, which means you are charging a wearable you are supposed to be wearing. That defeats the point.</p>
  </li>
  <li>
    <p><strong><a href="https://repebble.com/index">Pebble Ring</a></strong> is the closest fit. Push-to-talk, not always-listening, completely unobtrusive. The only catch: non-rechargeable. Disposable after roughly a year and a half to two years. A strange trade-off for a device this promising. Top contender. Not yet.</p>
  </li>
</ul>

<p>The shortcut wins by elimination for now. Not the dream interface. The honest one.</p>

<h2 id="whats-next">What’s next</h2>

<p><a class="wiki-link" href="/2026/04/19/luna-backend.html">The next post</a> covers the backend: how I’m running a full multi-agent system at near-zero cost by distributing across LLM providers. <a href="https://groq.com">Groq</a> for routing, <a href="https://openrouter.ai">OpenRouter</a> with round-robin key rotation for agents, <a href="https://deepmind.google/models/gemini/">Gemini</a> as fallback. Plus caching, tool calls, and the feedback loop I’m using to improve Luna over time.</p>

<p>The shortcut is intentionally dumb. The backend is where it gets interesting.</p>]]></content><author><name></name></author><category term="Luna" /><category term="AI" /><category term="homelab" /><category term="personal-assistant" /><category term="ios-shortcuts" /><summary type="html"><![CDATA[Building a habit tracker is easy. Actually using it isn't. Here's how I removed the friction entirely, and why a two-tap lock screen button beat a native app.]]></summary></entry><entry><title type="html">Upcycled, Overengineered, and Held Together by Prayer</title><link href="https://thapar25.github.io/2026/04/11/immich.html" rel="alternate" type="text/html" title="Upcycled, Overengineered, and Held Together by Prayer" /><published>2026-04-11T00:00:00+00:00</published><updated>2026-04-11T00:00:00+00:00</updated><id>https://thapar25.github.io/2026/04/11/immich</id><content type="html" xml:base="https://thapar25.github.io/2026/04/11/immich.html"><![CDATA[<p><img src="/assets/images/setup-hdd.jpeg" alt="Homelab desk and server setup" /></p>

<p>It started with an old college laptop. Ubuntu went on it, and suddenly I had a server. The NAS idea followed shortly after, as it always does.</p>

<p>I had a 4TB portable drive but didn’t want it permanently attached. Then one afternoon I walked past some discarded junk and spotted an old CCTV DVR. Cracked it open, found a healthy 1TB HDD inside. Formatted it, set up LUKS encryption, and called it my primary storage. Free of charge.</p>

<p><img src="/assets/images/hdd.png" alt="The HDD pulled from the discarded CCTV DVR" /></p>

<p>With storage sorted, I set up Immich, paired with Tailscale and Cloudflare Tunnels, gated behind Google OAuth. Google Photos, evicted. Migrated ~320GB going back to 2015: old phones, Snapchat memories (had to), scanned family prints, and VHS tapes I recorded using OBS Studio.</p>

<p><img src="/assets/images/scanner-open.jpeg" alt="Scanner open for digitising family prints" /></p>

<p>My parents’ wedding. My grandparents visiting family in the US, from an era when home video was a rare luxury. Black-and-white pictures of ancestors I never met. Things that don’t exist twice.</p>

<p><img src="/assets/images/vhs-setup.jpeg" alt="VHS tapes ready for digitisation" /></p>

<p>The 4TB drive handled off-server backups via Syncthing. Last week, it died. Syncthing had already done its job, everything safe. Now I’m shopping for an NVMe in this economy. Prayers welcome.</p>

<p>Next: once backup is properly sorted (3-2-1 rule), I’m onboarding family so they can contribute their own photos and media. A shared family archive, self-hosted, no middleman.</p>

<p>Open source did the heavy lifting. The dumpster did the rest.</p>

<p>Oh, and the server does a lot more than store memories, but that’s a story for another day.</p>

<hr />

<p><strong>Tools that made this possible at near-zero cost:</strong></p>

<ul>
  <li><a href="https://immich.app/">Immich</a></li>
  <li><a href="https://www.cloudflare.com/">Cloudflare</a></li>
  <li><a href="https://tailscale.com/">Tailscale</a></li>
  <li><a href="https://syncthing.net/">Syncthing</a></li>
</ul>]]></content><author><name></name></author><category term="homelab" /><category term="NAS" /><category term="Immich" /><summary type="html"><![CDATA[How I built a self-hosted family photo archive from a discarded CCTV DVR, open source software, and questionable life choices.]]></summary></entry><entry><title type="html">Obsidian’s CEO and I Had the Same Idea</title><link href="https://thapar25.github.io/2026/04/06/hello-world.html" rel="alternate" type="text/html" title="Obsidian’s CEO and I Had the Same Idea" /><published>2026-04-06T00:44:09+00:00</published><updated>2026-04-06T00:44:09+00:00</updated><id>https://thapar25.github.io/2026/04/06/hello-world</id><content type="html" xml:base="https://thapar25.github.io/2026/04/06/hello-world.html"><![CDATA[<p><img src="/assets/images/thapar-logs-setup.jpeg" alt="thapar.logs : Workflow Setup" /></p>

<p>I was walking Murphy, my dog, when it hit me.</p>

<p>I’m a developer. I don’t want to <em>engineer</em> my blog. Markdown is already my first language. Turns out it’s an LLM’s too. And that’s when I thought, <em>why not also write this blog in markdown</em>?</p>

<p>I got home and googled it. Apparently <a href="https://jekyllrb.com/">Jekyll</a>, GitHub Pages, and half the internet had already figured this out. Andrej Karpathy even has a name for it: <a href="https://x.com/karpathy/status/2039805659525644595">LLM Wiki</a>. A person who could use anything chose a folder of <code class="language-plaintext highlighter-rouge">.md</code> files. That’s not laziness. That’s a signal.</p>

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">LLM Knowledge Bases<br /><br />Something I&#39;m finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating…</p>&mdash; Andrej Karpathy (@karpathy) <a href="https://twitter.com/karpathy/status/2039805659525644595?ref_src=twsrc%5Etfw">April 2, 2026</a></blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

<p>Because here’s the thing: markdown wasn’t designed to be the language of AI. It just turned out to be the language that humans and machines both happen to speak. The first format in history that was built for thinking, and accidentally became perfect for reasoning.</p>

<p>Turns out I’m not a genius. I’m just predictable.</p>

<p>The thing that actually sealed it for me was a Reddit thread. Someone asking how to publish Obsidian notes for free. The top answer was exactly right.</p>

<iframe src="https://www.redditmedia.com/r/ObsidianMD/comments/16e5jek/comment/jzv38ja/?ref_source=embed&amp;ref=share&amp;embed=true" sandbox="allow-scripts allow-same-origin allow-popups" style="border: none;" height="368" width="100%" scrolling="no">
</iframe>

<p>It was posted by Obsidian’s CEO.</p>

<p>He showed up in a subreddit to personally help a user avoid paying for his own product. I don’t know, that just did something for me.</p>

<p>This blog is the result of that dog walk. I build things in the open. Might as well write about them the same way. Written in Obsidian, rendered by Jekyll, hosted on GitHub Pages, tracked by Google Analytics. Completely free.</p>

<p>Oh, and one more thing. I didn’t really write this post. I talked it into existence and let a machine find the shape. Maybe that’s the whole point.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[]]></summary></entry></feed>