<?xml version="1.0" encoding="utf-8"?> 
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-us">
    <generator uri="https://gohugo.io/" version="0.123.7">Hugo</generator><title type="html"><![CDATA[BitWorking]]></title>
    
    
    
            <link href="https://bitworking.org/" rel="alternate" type="text/html" title="html" />
            <link href="https://bitworking.org/news/feed/index.xml" rel="self" type="application/atom+xml" title="atom" />
    <updated>2026-05-20T19:45:11+00:00</updated>
    
    
    <author>
            <name>Joe Gregorio</name>
            
                <email>joe@bitworking.org</email>
            </author>
    
        <id>https://bitworking.org/</id>
    
        
        <entry>
            <title type="html"><![CDATA[You can't un-see the duck]]></title>
            <link href="https://bitworking.org/news/2026/05/you-cant-unsee-the-duck/" rel="alternate" type="text/html" />
            
            
                <id>https://bitworking.org/news/2026/05/you-cant-unsee-the-duck/</id>
            
            
            <published>2026-05-20T15:30:27-04:00</published>
            <updated>2026-05-20T15:30:27-04:00</updated>
            
            
            <content type="html"><![CDATA[<p>Google released new icons for their apps today, which are nice, but don&rsquo;t Google Meet and Google Keep look like pictures of the same duck, but from different perspectives?</p>
<p><img src="Ducks.png" alt="Ducks"></p>
]]></content>
            
                 
                    
                 
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Surprising things I learned putting together a Home Brain]]></title>
            <link href="https://bitworking.org/news/2026/05/surprising-things-i-learned-putting-together-a-home-brain/" rel="alternate" type="text/html" />
            
            
                <id>https://bitworking.org/news/2026/05/surprising-things-i-learned-putting-together-a-home-brain/</id>
            
            
            <published>2026-05-17T22:14:34-04:00</published>
            <updated>2026-05-17T22:14:34-04:00</updated>
            
            
            <content type="html"><![CDATA[<p>So, I&rsquo;m trying to put together something I call a &ldquo;Home Brain&rdquo;, a conversational
system I can interact with that not only allows me to control
<a href="https://en.wikipedia.org/wiki/Internet_of_things">IoT</a> devices in my home, but
also contains personal knowledge I can access, and also more general knowledge
(think Wikipedia), and yet all this functionality should run locally, with no
cloud services required.</p>
<p>I initially started down this path with <a href="https://www.home-assistant.io/">Home
Assistant</a>, adding all of our IoT devices into
Home Assistant running on an <a href="https://www.raspberrypi.com/">RPi</a>, which worked
fine, but when I took the next step of buying a couple <a href="https://www.home-assistant.io/voice-pe/">Home Assistant Voice
Preview Edition</a> devices I found the
performance of the RPi was painfully slow.</p>
<p>Moving the entire stack to my desktop off of the RPi made a huge difference, and
this is where the paid <a href="https://claude.ai/new">Claude</a> account paid for itself,
with Claude planning and then scripting the entire migration process that went
super smoothly. After that migration the voice path ran much faster, but digging
into the HA implementation I saw that it was the equivalent of yelling at a bag
of regexes. This was the perfect place to use an LLM. Now, first world benefits,
I happened to have a computer lying around the house with an RTX 5000 (Turing)
GPU with 16GB of RAM. I wondered how far I could push that hardware to be the
Home Brain I imagined.</p>
<p>Now full credit to the Home Assistant folks for creating a great basis for all
of this. For the device management and control Home Assistant is open source and
just works. The UI is fairly complex, verging on the byzantine, but HA also
supports a REST API that allows enumerating and controlling all of your devices.
With that as the foundation I embarked on building what I call &ldquo;My Jarvis&rdquo;, the
Home Brain. Now the naming isn&rsquo;t incidental: the Home Assistant folks have done
training on various wake words, and then boiled those down into tiny models you
can run on an ESP32-S3 device, and one of those wake words is &ldquo;Hey, Jarvis&rdquo;.</p>
<p>Getting Claude to spin up a Go program that coordinated the ESP32 Voice devices
and TTS, STT, and mapping that in a crude way to Home Assistant calls will be a
blog post for another day, but after that worked I pivoted to running LLMs on
the RTX 5000 machine to first handle the mapping from voice commands to Home
Assistant API calls. Once that worked I expanded the actions to answering
queries across two distinct datasets. The datasets are Wikipedia and an
<a href="https://obsidian.md/">Obsidian</a> vault with personal information in it. The key
here was to index both datasets using <a href="https://qdrant.tech/">Qdrant</a>.</p>
<p>And this is where it got interesting, because there&rsquo;s a whole slew of models and
indexing strategies to try, particularly trying to run all of this on a
five-year-old GPU with only 16GB of RAM. I did look at a bunch of benchmarks, but in
the end nothing is a substitute for just testing the actual thing yourself.</p>
<h2 id="the-setup">The setup</h2>
<p>MyJarvis is a Go voice-assistant pipeline for Home Assistant: an ESP32
hears a wake word, audio goes to VAD → STT → an LLM, and the LLM either
calls a Home-Assistant tool or answers a question from RAG (Obsidian
notes or a 43-million-document Wikipedia index in Qdrant).</p>
<p>Every user interaction is dominated by <strong>two LLM calls</strong>, not the code:</p>
<ol>
<li><strong>Routing</strong> — pick the right tool (turn on a light? check a list?
look something up in Wikipedia? in my notes?).</li>
<li><strong>Synthesis</strong> — for a lookup, turn retrieved chunks into one spoken
sentence.</li>
</ol>
<p>Latency is the whole game for voice. A two-second answer feels like an
assistant; a fifteen-second one feels broken. So the system needs to be not only
correct but also fast. The first test harness consisted of:</p>
<ul>
<li>33 prompts spanning home automation, personal-notes lookups, and
general-knowledge lookups.</li>
<li>Each asserts the model&rsquo;s <em>first tool call</em> equals the expected tool.</li>
<li>Synthetic Home-Assistant entities so it&rsquo;s reproducible.</li>
<li>A machine-readable result line so a script can sweep it across models.</li>
</ul>
<table>
<thead>
<tr>
<th>#</th>
<th>Prompt</th>
<th>Expected tool</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>Turn on the kitchen light</td>
<td><code>set_state</code></td>
</tr>
<tr>
<td>2</td>
<td>Turn off the living room light</td>
<td><code>set_state</code></td>
</tr>
<tr>
<td>3</td>
<td>Switch off the fan</td>
<td><code>set_state</code></td>
</tr>
<tr>
<td>4</td>
<td>Bedroom light on please</td>
<td><code>set_state</code></td>
</tr>
<tr>
<td>5</td>
<td>Can you turn the coffee maker on</td>
<td><code>set_state</code></td>
</tr>
<tr>
<td>6</td>
<td>Shut off the office fan</td>
<td><code>set_state</code></td>
</tr>
<tr>
<td>7</td>
<td>Run the goodnight routine</td>
<td><code>trigger_automation</code></td>
</tr>
<tr>
<td>8</td>
<td>Activate movie time</td>
<td><code>trigger_automation</code></td>
</tr>
<tr>
<td>9</td>
<td>Set a timer for 10 minutes</td>
<td><code>set_timer</code></td>
</tr>
<tr>
<td>10</td>
<td>Start a 5 minute pasta timer</td>
<td><code>set_timer</code></td>
</tr>
<tr>
<td>11</td>
<td>Add milk to the shopping list</td>
<td><code>add_to_list</code></td>
</tr>
<tr>
<td>12</td>
<td>Put batteries on the todo list</td>
<td><code>add_to_list</code></td>
</tr>
<tr>
<td>13</td>
<td>What&rsquo;s on my shopping list</td>
<td><code>check_list</code></td>
</tr>
<tr>
<td>14</td>
<td>Is bread on the shopping list</td>
<td><code>check_list</code></td>
</tr>
<tr>
<td>15</td>
<td>Check off bread from the shopping list</td>
<td><code>check_off_item</code></td>
</tr>
<tr>
<td>16</td>
<td>Clean up the lists</td>
<td><code>clean_lists</code></td>
</tr>
<tr>
<td>17</td>
<td>What did I write about Goldmine Prime</td>
<td><code>search_notes</code></td>
</tr>
<tr>
<td>18</td>
<td>When did I buy the Hayes Run property</td>
<td><code>search_notes</code></td>
</tr>
<tr>
<td>19</td>
<td>Summarize my notes on the Telluride trip</td>
<td><code>search_notes</code></td>
</tr>
<tr>
<td>20</td>
<td>What are the specs of my Austin computer</td>
<td><code>search_notes</code></td>
</tr>
<tr>
<td>21</td>
<td>Remind me what I wrote about the RAG setup</td>
<td><code>search_notes</code></td>
</tr>
<tr>
<td>22</td>
<td>What did I write in my notes about the basement renovation</td>
<td><code>search_notes</code></td>
</tr>
<tr>
<td>23</td>
<td>What did I note about my car&rsquo;s last oil change</td>
<td><code>search_notes</code></td>
</tr>
<tr>
<td>24</td>
<td>Who invented the transistor</td>
<td><code>search_wikipedia</code></td>
</tr>
<tr>
<td>25</td>
<td>How far is the moon from the earth in light seconds</td>
<td><code>search_wikipedia</code></td>
</tr>
<tr>
<td>26</td>
<td>What is the capital of Mongolia</td>
<td><code>search_wikipedia</code></td>
</tr>
<tr>
<td>27</td>
<td>When did World War 2 end</td>
<td><code>search_wikipedia</code></td>
</tr>
<tr>
<td>28</td>
<td>What is the speed of light</td>
<td><code>search_wikipedia</code></td>
</tr>
<tr>
<td>29</td>
<td>How tall is Mount Everest</td>
<td><code>search_wikipedia</code></td>
</tr>
<tr>
<td>30</td>
<td>Who wrote Pride and Prejudice</td>
<td><code>search_wikipedia</code></td>
</tr>
<tr>
<td>31</td>
<td>Explain how photosynthesis works</td>
<td><code>search_wikipedia</code></td>
</tr>
<tr>
<td>32</td>
<td>What year did the Berlin Wall fall</td>
<td><code>search_wikipedia</code></td>
</tr>
<tr>
<td>33</td>
<td>What is the boiling point of water in Fahrenheit</td>
<td><code>search_wikipedia</code></td>
</tr>
</tbody>
</table>
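<p>A harness along these lines can be sketched in a few lines of Python. This is
only an illustration, not the actual MyJarvis harness (which is part of a Go
program): <code>fake_router</code> and <code>run_suite</code> are hypothetical names, the
router is a keyword stub standing in for a real model call, and the p95 is a
simple nearest-rank estimate.</p>

```python
import json
import time
from statistics import mean

# Hypothetical stand-in for the real model call: returns the names of the
# tools the model asked for, in order. A real harness would call the LLM here.
def fake_router(prompt):
    text = prompt.lower()
    if "timer" in text:
        return ["set_timer"]
    if "turn on" in text or "turn off" in text:
        return ["set_state"]
    return ["search_wikipedia"]

def run_suite(route, cases):
    """Check that the first tool call matches the expected tool for each case."""
    passed = 0
    latencies = []
    for prompt, expected in cases:
        start = time.perf_counter()
        calls = route(prompt)
        latencies.append(time.perf_counter() - start)
        first = calls[0] if calls else None
        if first == expected:
            passed += 1
    ordered = sorted(latencies)
    # Nearest-rank p95 using integer ceiling division.
    p95_index = (95 * len(ordered) + 99) // 100 - 1
    return {
        "total": len(cases),
        "passed": passed,
        "accuracy": passed / len(cases),
        "mean_s": mean(latencies),
        "p95_s": ordered[p95_index],
    }

CASES = [
    ("Turn on the kitchen light", "set_state"),
    ("Set a timer for 10 minutes", "set_timer"),
    ("Who invented the transistor", "search_wikipedia"),
]

result = run_suite(fake_router, CASES)
# One machine-readable line per model, so a script can sweep many models.
print(json.dumps(result))
```

<p>Only the first tool call is compared, matching the assertion described above;
a stricter harness could also check the tool arguments.</p>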
<p>So I&rsquo;ve got the time, and the tokens are free, so let&rsquo;s test this against
a large number of models; five families across sizes, context windows, and
quantizations, each run with <em>and</em> without chain-of-thought where the
family supports the switch:</p>
<ul>
<li><strong>Qwen</strong>: 3:4b, 3:8b, 3:8b-128k, 3:8b-q8_0, 3:14b-64k, 3:14b-q4_K_M,
3.5:9b, 3.5:9b-64k, 3.6:latest</li>
<li><strong>Gemma</strong>: 4:latest, 4:16k, 3:4b, 3:12b, 4-26B (Q4 GGUF)</li>
<li><strong>Nemotron</strong>: 3-nano:4b, mini:latest</li>
<li><strong>Granite</strong>: 4:latest, 3.3:8b</li>
<li><strong>Mistral / Phi</strong>: mistral:7b, phi4-mini, phi4, phi4-reasoning, phi3</li>
</ul>
<p>The results were interesting:</p>
<p><strong>Group 1 — 100% routing accuracy</strong> (33 prompts, warm, chain-of-thought
on, ranked by mean latency):</p>
<table>
<thead>
<tr>
<th>Model</th>
<th>Accuracy</th>
<th>Mean</th>
<th>p95</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>granite4:latest</strong></td>
<td><strong>100%</strong></td>
<td><strong>0.54 s</strong></td>
<td><strong>0.65 s</strong></td>
</tr>
<tr>
<td>nemotron-3-nano:4b</td>
<td>100%</td>
<td>1.88 s</td>
<td>2.42 s</td>
</tr>
<tr>
<td>gemma4:latest</td>
<td>100%</td>
<td>2.24 s</td>
<td>6.41 s</td>
</tr>
<tr>
<td>gemma4:16k</td>
<td>100%</td>
<td>2.26 s</td>
<td>6.24 s</td>
</tr>
<tr>
<td>qwen3:8b</td>
<td>100%</td>
<td>3.02 s</td>
<td>5.85 s</td>
</tr>
<tr>
<td>qwen3:8b-128k</td>
<td>100%</td>
<td>3.04 s</td>
<td>4.42 s</td>
</tr>
<tr>
<td>qwen3.5:9b</td>
<td>100%</td>
<td>3.22 s</td>
<td>4.30 s</td>
</tr>
<tr>
<td>qwen3.5:9b-64k</td>
<td>100%</td>
<td>3.24 s</td>
<td>4.11 s</td>
</tr>
<tr>
<td>qwen3:14b-64k <em>(old prod)</em></td>
<td>100%</td>
<td>4.58 s</td>
<td>7.81 s</td>
</tr>
<tr>
<td>qwen3:14b-q4_K_M</td>
<td>100%</td>
<td>4.79 s</td>
<td>8.49 s</td>
</tr>
<tr>
<td>qwen3:4b</td>
<td>100%</td>
<td>4.81 s</td>
<td>9.09 s</td>
</tr>
<tr>
<td>qwen3:8b-q8_0</td>
<td>100%</td>
<td>4.98 s</td>
<td>10.35 s</td>
</tr>
<tr>
<td>qwen3.6:latest</td>
<td>100%</td>
<td>26.73 s</td>
<td>38.35 s</td>
</tr>
</tbody>
</table>
<p>I absolutely did not expect IBM&rsquo;s Granite to top out the list here.</p>
<p><strong>Group 2 — ran but mis-routed</strong> (unusable as routers):</p>
<table>
<thead>
<tr>
<th>Model</th>
<th>Accuracy</th>
<th>Mean</th>
</tr>
</thead>
<tbody>
<tr>
<td>nemotron-mini:latest</td>
<td>21%</td>
<td>0.58 s</td>
</tr>
<tr>
<td>granite3.3:8b</td>
<td>0%</td>
<td>0.99 s</td>
</tr>
<tr>
<td>mistral:7b</td>
<td>0%</td>
<td>1.06 s</td>
</tr>
<tr>
<td>phi4-mini:latest</td>
<td>0%</td>
<td>1.61 s</td>
</tr>
</tbody>
</table>
<p><strong>Group 3 — no tool-calling support</strong>: <code>gemma3:4b</code>, <code>gemma3:12b</code>, <code>phi4:latest</code>,
<code>phi4-reasoning:latest</code>, <code>phi3:latest</code>, and the 26B Gemma GGUF all return
Ollama&rsquo;s <em>&ldquo;does not support tools&rdquo;</em> — no function-calling template.</p>
<p><strong>Chain-of-thought on vs off</strong> was tested for every Qwen and Nemotron
config; those extra runs take the sweep to 36. The deltas were small
<em>and inconsistent</em>: granite4 0.54→0.55 s, qwen3:8b 3.02→3.06 s, but
qwen3:8b-q8_0 actually <em>improved</em> 4.98→4.28 s while qwen3:4b <em>worsened</em>
4.81→5.99 s, and nemotron-3-nano slipped 100→97%. Disabling
&ldquo;thinking&rdquo; was <em>not</em> a reliable latency reducer.</p>
<p>Now the above only tests whether the LLM chooses the correct tool to call. We still
need to measure how well the LLM does at querying Wikipedia and the Obsidian
vault via Qdrant and organizing the results into a coherent answer.</p>
<p>A separate suite ran real Wikipedia questions through the full
retrieve-then-answer path, scoring factual correctness, TTS-cleanliness,
source attribution, and latency. Synthesis turned out to be <em>easy</em>: 16
of 17 models were 100% factually correct (the answer is in the retrieved
text; the model just has to phrase it). The differentiators were latency
and cleanliness:</p>
<p>All 17 chat-capable models, 6 Wikipedia questions each, ranked by mean
latency (facts = answer contains the correct fact; clean = TTS-safe, no
markdown; attrib = cites the source article):</p>
<table>
<thead>
<tr>
<th>Model</th>
<th>Facts</th>
<th>Clean</th>
<th>Attrib</th>
<th>Mean</th>
</tr>
</thead>
<tbody>
<tr>
<td>phi3:latest</td>
<td>100%</td>
<td>83%</td>
<td>50%</td>
<td>1.61 s</td>
</tr>
<tr>
<td><strong>granite4:latest</strong></td>
<td><strong>100%</strong></td>
<td><strong>100%</strong></td>
<td>83%</td>
<td><strong>1.64 s</strong></td>
</tr>
<tr>
<td>gemma3:4b</td>
<td>100%</td>
<td>100%</td>
<td>100%</td>
<td>1.65 s</td>
</tr>
<tr>
<td>phi4-mini:latest</td>
<td>100%</td>
<td>100%</td>
<td>83%</td>
<td>2.22 s</td>
</tr>
<tr>
<td>mistral:7b</td>
<td>100%</td>
<td>100%</td>
<td>67%</td>
<td>2.29 s</td>
</tr>
<tr>
<td>llama3.1:8b</td>
<td>100%</td>
<td>100%</td>
<td>100%</td>
<td>2.32 s</td>
</tr>
<tr>
<td>granite3.3:8b</td>
<td>100%</td>
<td>100%</td>
<td>100%</td>
<td>3.18 s</td>
</tr>
<tr>
<td>gemma3:12b</td>
<td>100%</td>
<td>100%</td>
<td>100%</td>
<td>3.20 s</td>
</tr>
<tr>
<td>phi4:latest</td>
<td>100%</td>
<td>100%</td>
<td>100%</td>
<td>4.17 s</td>
</tr>
<tr>
<td>nemotron-3-nano:4b</td>
<td>67%</td>
<td>100%</td>
<td>67%</td>
<td>5.17 s</td>
</tr>
<tr>
<td>qwen3:8b</td>
<td>100%</td>
<td>83%</td>
<td>100%</td>
<td>6.91 s</td>
</tr>
<tr>
<td>gemma4:16k</td>
<td>100%</td>
<td>100%</td>
<td>100%</td>
<td>8.36 s</td>
</tr>
<tr>
<td>gemma4:latest</td>
<td>100%</td>
<td>100%</td>
<td>100%</td>
<td>8.54 s</td>
</tr>
<tr>
<td>qwen3:14b-64k <em>(old prod)</em></td>
<td>100%</td>
<td>83%</td>
<td>100%</td>
<td>11.48 s</td>
</tr>
<tr>
<td>qwen3:4b</td>
<td>100%</td>
<td>100%</td>
<td>100%</td>
<td>13.48 s</td>
</tr>
<tr>
<td>qwen3.5:9b</td>
<td>100%</td>
<td>100%</td>
<td>100%</td>
<td><strong>37.33 s</strong></td>
</tr>
<tr>
<td>qwen3.5:9b-64k</td>
<td>100%</td>
<td>100%</td>
<td>100%</td>
<td><strong>44.47 s</strong></td>
</tr>
</tbody>
</table>
<p>Again, Granite being the winner here, fastest speed with 100% on facts and
cleanliness, was a total surprise.</p>
<p>So I thought I had a win here, but then a simple query blew the whole thing apart.</p>
<blockquote>
<p>&ldquo;How big is the moon?&rdquo;</p>
</blockquote>
<p>That query produced a 600-character ramble stitched across several <em>unrelated</em>
articles, including Phoebe (a moon of Saturn), a basin on Mars, and an
orders-of-magnitude list.</p>
<p>Probing the 43 M-doc index directly was the eye-opener: <strong>no short query
surfaces canonical articles.</strong> &ldquo;Moon&rdquo;, &ldquo;how big is the moon&rdquo;, and the
LLM&rsquo;s keyword expansion all return bands, people, and disambiguation
pages — never the <em>Moon</em> article. It <em>is</em> indexed (a long, content-rich
query finds it at rank 4–5), so this is a <strong>ranking</strong> problem, not a
data problem. The token &ldquo;moon&rdquo; is swamped across 43 million documents;
sparse hybrid search (SPLADE + BM25, DBSF fusion) can&rsquo;t promote the
canonical article from a two-word query.</p>
<p>Because retrieval is fast (~70 ms), I could afford to be clever:</p>
<p><strong>Use the question, not just the keywords.</strong></p>
<p>The router distills the user&rsquo;s question into keywords; sometimes that drops the
very term that finds the article. I probed 12 factual questions, retrieving
three ways and scoring whether the <em>expected canonical article</em> landed in the
top 5:</p>
<table>
<thead>
<tr>
<th>Retrieval input</th>
<th>recall@5</th>
<th>mean rank</th>
</tr>
</thead>
<tbody>
<tr>
<td>keyword query (old behavior)</td>
<td>8/12</td>
<td>1.62</td>
</tr>
<tr>
<td>raw question</td>
<td>10/12</td>
<td>1.70</td>
</tr>
<tr>
<td><strong>keyword + question</strong></td>
<td><strong>10/12</strong></td>
<td><strong>1.60</strong></td>
</tr>
</tbody>
</table>
<p>So simply adding the original question alongside the keyword query raised
recall@5 from 8/12 to 10/12 while keeping the best mean rank.</p>
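<p>For reference, here is one way the two columns in that table could be computed.
It is a sketch under assumptions: the function name is made up, the sample data
is invented and is not the measured results, and the mean rank is averaged over
hits only, which is a guess at how the numbers above were produced.</p>

```python
def recall_and_mean_rank(results, expected, k=5):
    """results: question -> ranked list of article titles.
    expected: question -> the canonical title we hope to see.
    Returns (hits, total, mean rank of the hits), ranks starting at 1."""
    ranks = []
    for question, want in expected.items():
        top = results.get(question, [])[:k]
        if want in top:
            ranks.append(top.index(want) + 1)
    mean_rank = sum(ranks) / len(ranks) if ranks else None
    return len(ranks), len(expected), mean_rank

# Made-up data for illustration only; not the measurements from this post.
expected = {
    "Who invented the transistor": "Transistor",
    "How tall is Mount Everest": "Mount Everest",
    "How big is the moon": "Moon",
}
results = {
    "Who invented the transistor": ["Transistor", "Bell Labs"],
    "How tall is Mount Everest": ["Everest (film)", "Mount Everest"],
    "How big is the moon": ["Moon (band)", "Phoebe (moon)"],
}

hits, total, mean_rank = recall_and_mean_rank(results, expected)
print(f"recall@5 = {hits}/{total}, mean rank = {mean_rank}")
```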
<p><strong>Adversarial re-rank.</strong></p>
<p>Search results kept hitting Wikipedia disambiguation pages that aren&rsquo;t really
useful. A simple fix was to drop them outright and to demote &ldquo;List/Index/Outline of …&rdquo;
pages below real articles.</p>
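<p>A minimal sketch of that re-rank, assuming the junk pages can be recognized from
their titles alone. The regular expressions and the <code>rerank</code> name are
illustrative assumptions, not the code MyJarvis runs.</p>

```python
import re

# Assumed title patterns; the real index may mark these pages differently.
DISAMBIGUATION = re.compile(r"\(disambiguation\)\s*$", re.IGNORECASE)
LIST_LIKE = re.compile(r"^(List|Index|Outline) of ", re.IGNORECASE)

def rerank(titles):
    """Drop disambiguation pages and move list-like pages after real articles,
    keeping the original order inside each group."""
    kept = [t for t in titles if not DISAMBIGUATION.search(t)]
    articles = [t for t in kept if not LIST_LIKE.match(t)]
    lists = [t for t in kept if LIST_LIKE.match(t)]
    return articles + lists

hits = [
    "Eiffel Tower (disambiguation)",
    "List of tallest towers",
    "Eiffel Tower",
    "Paris",
]
print(rerank(hits))
```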
<p><strong>One conditional re-query.</strong></p>
<p>If the top hit was still junk (disambiguation/index), ask the model for a better
article title and retrieve <em>once</em> more. Bounded, no loop, and it only fires on
the junk signal so clean queries pay nothing. This <strong>rescued &ldquo;What is the
capital of France?&rdquo;</strong> — which had missed the <em>Paris</em> article in <em>every</em>
retrieval mode — by re-querying into &ldquo;Paris&rdquo;.</p>
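<p>The control flow is small enough to sketch. Everything named here is a
stand-in: <code>fake_search</code> plays the role of the Qdrant lookup,
<code>fake_suggest</code> plays the model asked for a better title, and the junk test
is an assumed title check. Treating an empty result as junk is also an
assumption of this sketch.</p>

```python
def is_junk(title):
    # Assumed junk signal: disambiguation or list/index/outline pages.
    lowered = title.lower()
    return lowered.endswith("(disambiguation)") or lowered.startswith(
        ("list of ", "index of ", "outline of ")
    )

def retrieve_with_requery(question, search, suggest_title):
    """Search once; if the top hit is junk, ask for a better title and
    search exactly one more time. There is no loop, so cost is bounded."""
    hits = search(question)
    if hits and not is_junk(hits[0]):
        return hits, False
    better = suggest_title(question)
    return search(better), True

# Toy stand-ins for the index and the model.
def fake_search(query):
    index = {
        "What is the capital of France?": ["France (disambiguation)"],
        "Paris": ["Paris", "History of Paris"],
    }
    return index.get(query, [])

def fake_suggest(question):
    return "Paris"

hits, requeried = retrieve_with_requery(
    "What is the capital of France?", fake_search, fake_suggest
)
print(hits, requeried)
```

<p>The second return value records whether the extra round trip fired, which makes
it easy to confirm that clean queries pay nothing.</p>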
<p>After these, the synthesis suite (now including adversarial cases) was
8/8: the Eiffel Tower question cites the <em>Eiffel Tower</em> article, not
&ldquo;Eiffel Tower (disambiguation)&rdquo;.</p>
<p>This has been a fun project, and I&rsquo;m continuing to work on it and improve
functionality and correctness, but the biggest lesson I learned along the way so
far has been to just simply measure things and not rely on rankings/ratings from
other parties. While they might give you a general idea of a model&rsquo;s
capabilities, you won&rsquo;t really know until you apply them to your specific
problem. I had originally just gone with qwen3.5:9b based off its rankings
against the other models, but in actual measurements the Granite model is not
only more accurate on tool calling and synthesis, but also over 8x faster!</p>
<p>Now the moon query is faster, but it still ends up referencing the Wikipedia
article on <a href="https://en.wikipedia.org/wiki/Habash_al-Hasib">Habash al-Hasib</a>, so
clearly there&rsquo;s still work to do.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="https://bitworking.org/tags/homeassistant" term="homeassistant" label="HomeAssistant" />
                             
                                <category scheme="https://bitworking.org/tags/llm" term="llm" label="LLM" />
                             
                                <category scheme="https://bitworking.org/tags/homebrain" term="homebrain" label="HomeBrain" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[The Critical Path in Project Planning]]></title>
            <link href="https://bitworking.org/news/2025/05/the-critical-path-in-project-planning/" rel="alternate" type="text/html" />
            
                <link href="https://bitworking.org/news/2025/03/applying-the-fundamental-axioms-to-reduce-uncertainty/" rel="related" type="text/html" title="Applying the Fundamental Axioms to Reduce Uncertainty" />
                <link href="https://bitworking.org/news/2025/03/the-fundamental-axiom-of-project-planning/" rel="related" type="text/html" title="The Fundamental Axioms of Project Planning" />
            
                <id>https://bitworking.org/news/2025/05/the-critical-path-in-project-planning/</id>
            
            
            <published>2025-05-11T21:44:23-04:00</published>
            <updated>2025-05-11T21:44:23-04:00</updated>
            
            
            <content type="html"><![CDATA[<p><a href="../../03/applying-the-fundamental-axioms-to-reduce-uncertainty/">Applying the Fundamental Axioms to Reduce
Uncertainty</a>
walked through the steps of using divide and conquer to reduce a large complex
project into smaller inter-related tasks.</p>
<p>Now that we have our smaller list of tasks, one of the first things you will want
to do is look at the critical path, that is, the longest set of tasks in your
plan that all depend on each other and define the longest path from start to
finish in your project.</p>
<p>Let&rsquo;s consider the following project, where we have tasks A, B1, B2, and C. Note
that B1 and B2 are both &ldquo;successors&rdquo; of A, i.e. A has to finish before they can
begin. Also note that B1 and B2 are &ldquo;predecessors&rdquo; of C, that is, both of them
must complete before task C can begin. The final thing to note is that B1 takes
twice as long to complete as B2.</p>
<p><img src="example.explan.png" alt="A-&gt;B1, A-&gt;B2, B1-&gt;C, B2-&gt;C, but B2 is half the duration of B1."></p>
<p>So if B1 takes four weeks to complete, B2 only takes two weeks to complete. In
this case the critical path of the project is <code>A -&gt; B1 -&gt; C</code>, which you can see
as highlighted in blue in the above chart. Any delay in A, B1, or C will delay
the completion of the project. On the other hand, if B2 takes a few days longer
than planned, or even up to twice as long, the project will still
finish on time.</p>
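<p>The same reasoning can be checked mechanically by finding the longest chain of
dependent tasks. The sketch below uses the B1 and B2 durations from the example;
the durations for A and C are not given in the text, so the values here (six
weeks and one week) are assumptions chosen only to make the example run.</p>

```python
from functools import lru_cache

# Durations in weeks. B1 and B2 come from the example; A and C are assumed.
DURATION = {"A": 6, "B1": 4, "B2": 2, "C": 1}
# Each task maps to the tasks that must finish before it can start.
PREDECESSORS = {"A": [], "B1": ["A"], "B2": ["A"], "C": ["B1", "B2"]}

@lru_cache(maxsize=None)
def longest_path_to(task):
    """Return (total weeks, path) of the longest chain ending at task."""
    best_weeks, best_path = 0, ()
    for pred in PREDECESSORS[task]:
        weeks, path = longest_path_to(pred)
        if weeks > best_weeks:
            best_weeks, best_path = weeks, path
    return best_weeks + DURATION[task], best_path + (task,)

# The critical path is the longest chain ending at any task.
weeks, path = max(longest_path_to(t) for t in DURATION)
# B2 can slip by this much before it catches up with B1.
slack_b2 = DURATION["B1"] - DURATION["B2"]
print(" -> ".join(path), weeks, slack_b2)
```

<p>With these numbers B2 has two weeks of slack, which is why it can take up to
twice as long without moving the finish date.</p>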
<p>The <a href="https://en.wikipedia.org/wiki/Critical_path_method">critical path</a> is an
important tool in project planning because it tells you the tasks you really
need to monitor closely because they are the ones that determine the overall
project length. Also, these are the tasks you need to focus on when trying to
shorten a project. And who among us hasn&rsquo;t been on a project where you&rsquo;ve
planned to do the work in <code>X</code> days and you&rsquo;re asked, what would it take to
get it done in <code>X/2</code> days?</p>
<p>In the above example how much effort should you put into shortening task B2?
Well, none, because even if you got B2 down to just a single day, that will not
have any effect on when the project gets finished:</p>
<p><img src="example2.explan.png" alt="A-&gt;B1, A-&gt;B2, B1-&gt;C, B2-&gt;C, but B2 is just a sliver of the duration of
B1."></p>
<p>What you really want to focus on in this particular example is reducing the
length of task A. It clearly makes up a large portion of the project timeline
and reducing that task will have the largest impact on finishing the project
sooner.</p>
<p>That&rsquo;s the general idea of critical path analysis, find the critical path, then
find the &ldquo;long poles&rdquo; on that critical path, that is, the longest duration tasks
that appear on the critical path, and focus on shortening them to bring down the
total project duration.</p>
<p>While simply calculating the critical path will certainly help you run your
project, you must be aware of, and always on the lookout for, hidden critical
paths. Let&rsquo;s look again at our first example:</p>
<p><img src="example.explan.png" alt="A-&gt;B1, A-&gt;B2, B1-&gt;C, B2-&gt;C, but B2 is half the duration of B1."></p>
<p>But now let&rsquo;s assign a level of <code>Uncertainty</code> to each task. In this case we will
use <a href="https://jacobian.org/2021/may/25/my-estimation-technique/">Jacob Kaplan-Moss</a>&rsquo;s
multipliers for measuring uncertainty:</p>
<table>
<thead>
<tr>
<th>Uncertainty</th>
<th>Multiplier(Divisor)</th>
</tr>
</thead>
<tbody>
<tr>
<td>low</td>
<td>1.1</td>
</tr>
<tr>
<td>moderate</td>
<td>1.5</td>
</tr>
<tr>
<td>high</td>
<td>2</td>
</tr>
<tr>
<td>extreme</td>
<td>5</td>
</tr>
</tbody>
</table>
<p>So what does a <code>moderate</code> level of uncertainty mean? If we presume
a task has a duration of 6 days, then that task could be completed in
anywhere from $$(6 / 1.5) = 4$$ days on the low side to
$$(6 * 1.5) = 9$$ days on the high side.</p>
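<p>The arithmetic is the same for every row of the table: divide by the multiplier
for the best case and multiply by it for the worst case. A small sketch, where
the <code>duration_range</code> helper is a made-up name for illustration:</p>

```python
# Multipliers from the table above.
MULTIPLIER = {"low": 1.1, "moderate": 1.5, "high": 2, "extreme": 5}

def duration_range(estimate_days, uncertainty):
    """Return (best case, worst case): divide and multiply by the multiplier."""
    m = MULTIPLIER[uncertainty]
    return estimate_days / m, estimate_days * m

low, high = duration_range(6, "moderate")
print(low, high)
```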
<p>If all the tasks in the project below have a <code>low</code> uncertainty except for &lsquo;B2&rsquo;
which has an <code>extreme</code> level of uncertainty then (totally depending on what the
distribution of the uncertainty of B2 looks like), B2 may actually end up on the
critical path just as often as B1.</p>
<p><img src="example.explan.png" alt="A-&gt;B1, A-&gt;B2, B1-&gt;C, B2-&gt;C, but B2 is half the duration of B1."></p>
<p>That is, B2 will complete somewhere between $$[(1w / 5), (1w * 5)]$$ or
somewhere in 3 to 35 days, and given that uncertainty in B2 there&rsquo;s roughly a
50% chance it&rsquo;s actually on the critical path.</p>
<p>While this might seem like a pretty academic exercise, looking for hidden
critical paths was instrumental in getting one very large, high-profile project to
finish on time. Careful attention found a long pole task on a hidden critical
path that could be accelerated, so we accelerated it. That was lucky, because
other parts of the project finished early, the hidden long pole did end up
on the critical path, and our acceleration of that task turned into a huge
win in getting the project done in time.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="https://bitworking.org/tags/gantt" term="gantt" label="gantt" />
                             
                                <category scheme="https://bitworking.org/tags/project-management" term="project-management" label="project management" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[Applying the Fundamental Axioms to Reduce Uncertainty]]></title>
            <link href="https://bitworking.org/news/2025/03/applying-the-fundamental-axioms-to-reduce-uncertainty/" rel="alternate" type="text/html" />
            
                <link href="https://bitworking.org/news/2025/05/the-critical-path-in-project-planning/" rel="related" type="text/html" title="The Critical Path in Project Planning" />
                <link href="https://bitworking.org/news/2025/03/the-fundamental-axiom-of-project-planning/" rel="related" type="text/html" title="The Fundamental Axioms of Project Planning" />
            
                <id>https://bitworking.org/news/2025/03/applying-the-fundamental-axioms-to-reduce-uncertainty/</id>
            
            
            <published>2025-03-23T10:19:25-04:00</published>
            <updated>2025-03-23T10:19:25-04:00</updated>
            
            
            <content type="html"><![CDATA[<p><a href="../the-fundamental-axiom-of-project-planning/">The Fundamental Axioms of Project
Planning</a> introduced the two
fundamental axioms:</p>
<h2 id="the-axioms-of-project-management">The Axioms of Project Management:</h2>
<ol>
<li>Starting is definite, finishing less so.</li>
<li>Divide and conquer to reduce uncertainty.</li>
</ol>
<p>So now let&rsquo;s apply that to an ambiguous task and see how we can break it down
into more manageable chunks. Let&rsquo;s start with the classic example of building a
house:</p>
<p><img src="./BuildHouse.explan.png" alt="A Gantt chart showing a single task labelled &ldquo;Build House&rdquo;."></p>
<p>Now that&rsquo;s pretty ambiguous, we have really no idea how long that will take,
maybe anything from 3 months to a couple of years. As a first step let&rsquo;s split
this task in two on the time axis.</p>
<p><img src="./GetPermits.explan.png" alt="A &ldquo;Get Permits&rdquo; task followed by a &ldquo;Build House&rdquo; task."></p>
<p>Here we can split the task, first getting the permits needed to
build the house, and then beginning construction.</p>
<p>Note the dependency between those two tasks and each one&rsquo;s uncertainty.</p>
<p>We can&rsquo;t start building the house until all the permits are secured. (Starting
is definite, finishing less so.)</p>
<p>The uncertainty around getting the permits is a subset of the uncertainty around
the whole project and will be much less than the original single task. (Divide
and conquer to reduce uncertainty.)</p>
<p>And we can further subdivide the &ldquo;Get Permits&rdquo; task, because before you do that
you need to know all about the lot you are building on. Again, we&rsquo;ve taken a task
and broken it down into two tasks in the time dimension, one task coming before
the other:</p>
<p><img src="./SurveyBeforePermits.explan.png" alt="A chain of three tasks, &ldquo;Survey&rdquo;, &ldquo;Get Permits&rdquo;, and &ldquo;Build House&rdquo;."></p>
<p>Let&rsquo;s assume we&rsquo;re building in an area without city water and sewer, so we&rsquo;ll
also need to plan and build out a septic system and know the dimensions of the
lot we&rsquo;re building on. Both of those can happen at the same time, in this case
think of splitting the original &ldquo;Survey&rdquo; task into two parallel tasks with each
task being done by separate people. The first &ldquo;Survey&rdquo; means getting a surveyor
out to survey the land, and the second task is getting a soils person out to
test the soils.</p>
<p><img src="./SoilsAtSameTime.explan.png" alt="Survey and Soils tasks happen at the same time."></p>
<p>Finally, we can put together the final information we need for the permits once
we have both the survey and the soils report, at which time we can lay out the
house envelope and the septic field. This is like a game of Tetris as you try to
fit these things on the same lot, but still being aware of the setbacks, i.e.
the septic field should be at least 25 feet from the house, but also needs to be
100 feet from the water well, etc. But I digress, let&rsquo;s get back to our chart.
All the work has to happen before the permit application, and after the survey
and soils report:</p>
<p><img src="SepticAndBuildingEnvelope.explan.png" alt="The Permit task is now a &ldquo;Septic Field &amp; Building Envelope&rdquo; task followed by a &ldquo;Permits&rdquo; task."></p>
<p>Note that at each step we are sub-dividing a task either in time or in
resources, specifying things that come before and after, or tasks that can take
place in parallel. And the smaller each task gets, the more the uncertainty
shrinks.</p>
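<p>As a minimal sketch of that idea, the final chart can be written down as a
dependency map in Python. The task names follow the charts, but the durations are
invented purely for illustration, and <code>earliest_finish</code> is a hypothetical
helper built on the standard library&rsquo;s <code>graphlib</code> module (Python 3.9+).</p>

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it can start.
# Tasks with no predecessors in common (Survey, Soils) can run in parallel.
DEPENDS_ON = {
    "Survey": set(),
    "Soils": set(),
    "Septic Field and Building Envelope": {"Survey", "Soils"},
    "Permits": {"Septic Field and Building Envelope"},
    "Build House": {"Permits"},
}

# Hypothetical durations in days, made up for this example.
DURATION_DAYS = {
    "Survey": 5,
    "Soils": 10,
    "Septic Field and Building Envelope": 3,
    "Permits": 30,
    "Build House": 180,
}


def earliest_finish(depends_on, duration):
    """Return the earliest day each task can finish, counting from day 0."""
    finish = {}
    # static_order() yields every task after all of its predecessors.
    for task in TopologicalSorter(depends_on).static_order():
        start = max((finish[p] for p in depends_on[task]), default=0)
        finish[task] = start + duration[task]
    return finish


if __name__ == "__main__":
    for task, day in earliest_finish(DEPENDS_ON, DURATION_DAYS).items():
        print(f"{task}: done by day {day}")
```

<p>Because &ldquo;Survey&rdquo; and &ldquo;Soils&rdquo; share no dependency, the layout task starts
as soon as the longer of the two finishes, not after the sum of both.</p>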
<p>Now the diagrams you see above are called Gantt charts, and if you work in the
software field you&rsquo;ll know that a subset of people in the field will begin
shuddering, averting their eyes and muttering &ldquo;waterfall&rdquo;, &ldquo;agile&rdquo;, and &ldquo;scrum&rdquo;
under their breath.</p>
<p>You see, somewhere in the distant misty past of software development, people
stopped looking at Gantt charts as the output of a process to reduce
ambiguity, and started treating them as a mandate from on high on exactly how
to run a software project. Or maybe some weak and ineffective managers decided
to use the Gantt chart as a command and control mechanism to manage software
development. Either way, the original use of such a chart got labelled as
&ldquo;waterfall&rdquo; development and the word &ldquo;waterfall&rdquo; became vilified.</p>
<p>But that&rsquo;s completely wrong, and not how these charts came about. They were
invented to tackle large ambiguous projects, doing so from the bottom up, and
I&rsquo;ve got the receipts to prove it.</p>
<p>Let&rsquo;s jump back to 1956, <strong>almost 70 years ago!</strong>, to the development of the
<a href="https://en.wikipedia.org/wiki/UGM-27_Polaris">Polaris Missile System</a>.</p>
<blockquote>
<p>The Polaris missile program&rsquo;s complexity led to the development of new project
management techniques, including the Program Evaluation and Review Technique
(PERT) to replace the simpler Gantt chart methodology.</p>
</blockquote>
<p>Here&rsquo;s where we hit a little bit of complexity because language isn&rsquo;t fixed and
the meanings of words change over time. Back in 1956 a Gantt chart was just a
horizontal bar chart that did not include dependency relationships between
tasks. PERT came along and showed the importance and power of including the
relationships between tasks, so then Gantt charts started including inter-task
dependencies, and yet still retained the name &ldquo;Gantt&rdquo; chart.</p>
<p>A two-part report was published on how the PERT process was computerized
and applied to the project:</p>
<p><a href="https://www.google.com/books/edition/_/bocPI2FOxJ0C?hl=en&amp;gbpv=0">Program Evaluation Research Task (PERT) Summary Report - Phase 1</a></p>
<p>Let&rsquo;s look at some key quotes from this document, starting on <strong>page one</strong>:</p>
<blockquote>
<p>Three factors, however, set research and development programming apart. First,
we are attempting to schedule intellectual activity as well as the more easily
measurable physical activity. Second, by definition, research and development
projects are of a pioneering nature. Therefore previous, parallel experience
upon which to base schedules of a new project is relatively unavailable.
Third, the unpredictability of specific research results inevitably requires
frequent change in program detail. These points are acknowledged by all
experienced research people. Yet, even though it be ridiculous to conceive of
scheduling research and development with the split-second precision of an
auto assembly line, it is clear that the farther reaching and more complex our
projects become, the greater is the need for procedural tools to aid top
managers to comprehend and control the project.</p>
</blockquote>
<p>Would it be bad form to point out that the entire edifice of &ldquo;Agile&rdquo; software
development is built on a bed of lies? Anyway, we can clearly see that the
entire point of the enterprise is to reduce ambiguity around research and
development work, the kind of work with the highest levels of uncertainty.</p>
<p>And this process is emphatically not a top-down process. Still on page one:</p>
<blockquote>
<p>This last point introduces a most important matter in research administration.
The people most qualified to speak on what they have done, are doing, can do,
and might do in a development project are the development people themselves.
To interpose a substantial layer of evaluation organization between top
management and the development people stretches the time of progress
reporting, risks distortion of reports through successive interpretation on
the way to the top, and generally adds to the remoteness of top management to
the tasks it is managing. A system should be a close coupling between the
laboratory and top management and should serve both the planning and
evaluation interests of both, each at the proper level.</p>
</blockquote>
<p>And flexibility was baked in from the beginning, on page 3:</p>
<blockquote>
<p>Actual day-to-day happenings never follow the stated or &ldquo;nominal&rdquo; schedule
exactly. They should bear a reasonable identity, but the &ldquo;actual&rdquo; schedule
will continuously change and flex within the general limits of the nominal
schedule.</p>
</blockquote>
<p>Does it scale? Yes, yes it does, from page 4:</p>
<blockquote>
<p>The development of the FBM incorporates a tremendously complex system of event
achievement. It is estimated that there may be upwards of 5,000 events which
should be portrayed in the evaluation process. The computations that must be
undertaken for each event as well as the interactions between events require
something more than unabetted human contemplation. For this reason, the PERT
procedure has been laid out so as to be compatible with processing on modern
electronic computers.</p>
</blockquote>
<p>The amusing part is that this project took place so long ago, and the computers
were so slow, that they spent a bunch of time and math speeding up the analysis
by <em>avoiding taking cube roots</em>. The bonkers thing is that folks have just been
copying and pasting the formulas on page 7 of the report to this very day as the
<em>right</em> way to estimate task duration, even though today we have computers powerful
enough to, <em>checks notes</em>, take cube roots.</p>
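<p>For reference, the formulas most often repeated under the PERT name today are
the three-point estimates sketched below. This assumes those are the formulas in
question; the function and parameter names are hypothetical, and the report&rsquo;s own
notation is not reproduced here.</p>

```python
def pert_estimate(optimistic, most_likely, pessimistic):
    """Commonly taught PERT three-point estimate for one task.

    Returns (expected_duration, standard_deviation) in the same units
    as the inputs. The weights 1, 4, 1 and the divisor 6 are the
    textbook simplification, not something derived in this sketch.
    """
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev


if __name__ == "__main__":
    # Hypothetical task: best case 2 days, most likely 4, worst case 12.
    expected, std_dev = pert_estimate(2, 4, 12)
    print(f"expected {expected:.1f} days, std dev {std_dev:.2f} days")
```

<p>Note that the whole thing is two lines of arithmetic with no roots of any
kind, which is exactly the sort of shortcut that made sense on 1950s hardware.</p>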
<p>Anyway, you should absolutely go read
<a href="https://www.google.com/books/edition/_/bocPI2FOxJ0C?hl=en&amp;gbpv=0">Program Evaluation Research Task (PERT) Summary Report - Phase 1</a>,
it&rsquo;s eye-opening how forward-looking the project was, and how much we&rsquo;ve lost,
and then poorly reinvented, since then.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="https://bitworking.org/tags/gantt" term="gantt" label="gantt" />
                             
                                <category scheme="https://bitworking.org/tags/project-management" term="project-management" label="project management" />
                            
                        
                    
                
            
        </entry>
    
        
        <entry>
            <title type="html"><![CDATA[The Fundamental Axioms of Project Planning]]></title>
            <link href="https://bitworking.org/news/2025/03/the-fundamental-axiom-of-project-planning/" rel="alternate" type="text/html" />
            
                <link href="https://bitworking.org/news/2025/05/the-critical-path-in-project-planning/" rel="related" type="text/html" title="The Critical Path in Project Planning" />
                <link href="https://bitworking.org/news/2025/03/applying-the-fundamental-axioms-to-reduce-uncertainty/" rel="related" type="text/html" title="Applying the Fundamental Axioms to Reduce Uncertainty" />
            
                <id>https://bitworking.org/news/2025/03/the-fundamental-axiom-of-project-planning/</id>
            
            
            <published>2025-03-18T21:26:57-04:00</published>
            <updated>2025-03-18T21:26:57-04:00</updated>
            
            
            <content type="html"><![CDATA[<p>Over the years I have run many projects, everything from small software projects
of just a couple of people, to new product development projects in the material
testing space, including both hardware and software, to large projects involving
work that affects the daily routines of thousands of software engineers. In that
time I&rsquo;ve honed down how I think about project management into just two axioms,
which, if you know the field of project management, is quite short.</p>
<p>If you were to search across the web today you&rsquo;d find a slew of randomly
enumerated fundamentals of project management, including, but certainly not
limited to:</p>
<ul>
<li>5 C&rsquo;s of project management</li>
<li>5 P&rsquo;s of project management</li>
<li>The 5 principles of project management</li>
<li>The 12 principles of project management</li>
<li>50+ Axioms on the Art and Science of Managing Projects</li>
</ul>
<p>Being a trained mathematician I can tell you for certain that if you&rsquo;ve got 50+
axioms, you clearly don&rsquo;t know the meaning of &ldquo;axiom&rdquo;.</p>
<p>Anyway, for you dear reader, I am going to break down project management into
just two axioms.</p>
<p>The first axiom of project planning:</p>
<blockquote>
<p>Starting is definite, finishing less so.</p>
</blockquote>
<p>So what does that mean? Well maybe let&rsquo;s start with a more colloquial saying,
which is:</p>
<blockquote>
<p>A journey of a thousand miles begins with a single step.</p>
</blockquote>
<p>See how we know when the journey begins, when we take that first step, and we
can definitely say when we are going to start. But when, exactly, will we finish
that journey of 1,000 miles? Presuming we&rsquo;re going to walk, that&rsquo;s going to take
quite a while and the exact finish date of our journey is going to be highly
variable. Consider hiking the Appalachian Trail:</p>
<blockquote>
<p>Completing the entire 2,190+ miles of the Appalachian Trail (A.T.) in one trip
is a mammoth undertaking. Each year, thousands of hikers attempt a thru-hike;
only about one in four makes it all the way.</p>
<p>A typical thru-hiker takes 5 to 7 months to hike the entire A.T.</p>
</blockquote>
<p>&ndash;<a href="https://appalachiantrail.org/explore/hike-the-a-t/thru-hiking/">The Appalachian Trail
Conservancy</a></p>
<p>If only 1/4 of the hikers actually finish hiking the trail in any year, and the
remaining 3/4 never finish it, then the average time to finish the project is
infinity. Infinity! I don&rsquo;t think I&rsquo;m going out on a limb when I say that [5
months, ∞) is a huge amount of uncertainty.</p>
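<p>The arithmetic behind that claim can be sketched in a few lines of Python,
assuming a hiker who never finishes is modeled as taking an infinite amount of
time. The four-hiker cohort and the <code>average_months</code> helper are made up for
illustration.</p>

```python
import math

# Hypothetical cohort of four hikers: one finishes in 6 months,
# and the other three never finish, modeled as an infinite duration.
COHORT_MONTHS = [6.0, math.inf, math.inf, math.inf]


def average_months(months):
    """Plain arithmetic mean of the time each hiker takes to finish."""
    return sum(months) / len(months)


if __name__ == "__main__":
    print(average_months(COHORT_MONTHS))  # prints "inf"
```

<p>A single non-finisher is enough to drag the mean to infinity, which is why the
average is a poor summary when finishing is not guaranteed.</p>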
<p>Just the <a href="https://en.wikipedia.org/wiki/List_of_failed_and_overbudget_custom_software_projects#Projects_with_ongoing_problems">list of failed and overbudget custom software
projects</a>
is worth an entire Wikipedia entry.</p>
<p>And there are projects that have gone on so long that they&rsquo;ve become legendary,
like <a href="https://en.wikipedia.org/wiki/Duke_Nukem_Forever">Duke Nukem Forever</a>
which took 14 years to ship.</p>
<p>The longer a task takes, the higher the uncertainty of when it will finish.</p>
<p>Think of a simple task, like vacuuming the house. This takes a short time, and you
can probably predict with pretty good accuracy how long it will take you to
complete the task. And yes, leave it to your kids and it might never get done, but
let&rsquo;s not go there.</p>
<p>Compare that to taking on a larger project, like building a skyscraper, a
sub-division, or something never accomplished before, like a fusion reactor, and
the uncertainty rises dramatically.</p>
<p>But what do we do in the face of such uncertainty?</p>
<p>The second axiom of project planning:</p>
<blockquote>
<p>Divide and conquer to reduce uncertainty.</p>
</blockquote>
<p>Yeah, really, it&rsquo;s that simple. If you have a large, complex, or ambiguous task,
then break it down into smaller more manageable tasks. For example we could
divide up our Appalachian Trail hiking into first figuring out how many days of
food and water we can carry, along with preliminary test hikes to see how many
miles a day we can cover on similar terrain.</p>
<p>And this loops back around to our first axiom: the shorter the sub-task, the less
the ambiguity.</p>
<p>So now we have our two axioms:</p>
<h2 id="the-axioms-of-project-management">The Axioms of Project Management:</h2>
<ol>
<li>Starting is definite, finishing less so.</li>
<li>Divide and conquer to reduce uncertainty.</li>
</ol>
<p>In my next installment I&rsquo;ll talk about how to apply these axioms to a project to
reduce uncertainty, and how everything you&rsquo;ve learned about project management
is probably wrong.</p>
]]></content>
            
                 
                    
                 
                    
                         
                        
                            
                             
                                <category scheme="https://bitworking.org/tags/gantt" term="gantt" label="gantt" />
                             
                                <category scheme="https://bitworking.org/tags/project-management" term="project-management" label="project management" />
                            
                        
                    
                
            
        </entry>
    
</feed>
