Author: Ben

  • Decarbonizing Shipping with Wind

    The shipping industry is known for being fairly dirty environmentally, largely because the most common fuel used in shipping — bunker fuel — contributes to carbon emissions, significant air pollution, and water pollution (from spills and from the common practice of dumping the byproducts of the sulfur scrubbing used to curtail air pollution).

    While much of the effort to green shipping has focused on alternative fuels like hydrogen, ammonia, and methanol as replacements for bunker fuel, I recently saw an article on the use of automated, highly durable sail technology to let ships leverage wind as a means to reduce fuel consumption.

    I don’t have any inside information on the cost / speed tradeoffs for the technology, nor whether there’s a credible path to scaling it to handle the massive container ships that dominate global shipping, but it’s a fascinating technology vector, and a direct result of the shipping industry’s growing realization that it needs to green itself.


  • Google’s Quantum Error Correction Breakthrough

    One of the most exciting areas of technology development, though one that doesn’t get much mainstream media coverage, is the race to build a working quantum computer that achieves “below threshold” quantum computing — the ability to do calculations utilizing quantum mechanics accurately.

    One of the key limitations to achieving this has been the sensitivity of quantum computing systems — in particular the qubits that capture the superposition of multiple states that allows quantum computers to exploit quantum mechanics for computation — to the world around them. Imagine if your computer’s accuracy changed every time someone walked into the room: even if it were capable of amazing things, it would not be especially practical. As a result, much research to date has focused on novel ways of creating physical systems that can protect these quantum states.

    Google has (in a paper in Nature) unveiled its new Willow quantum computing chip, which demonstrates a quantum error correction method that spreads the quantum state information of a single “logical” qubit across multiple entangled “physical” qubits to create a more robust system. Beyond proving that their quantum error correction method works, what is most remarkable to me is that they’re able to extrapolate a scaling law for their error correction — a way of estimating how much better their system gets at avoiding loss of quantum state as they increase the number of physical qubits per logical qubit — which could suggest a “scale up” path towards building functional, practical quantum computers.

    I will confess that quantum mechanics was never my strong suit (beyond needing it for a class on statistical mechanics eons ago in college), and my understanding of the core physics underlying what they’ve done in the paper is limited, but this is an incredibly exciting feat on our way towards practical quantum computing systems!


  • Cynefin

    I had never heard of this framework for thinking about how to address problems before. Shout-out to my friend Chris Yiu, whose new Substack on improving productivity, Secret Weapon, taught me about it. It’s surprisingly insightful about when to treat something as a process problem vs. an expertise problem vs. an experimentation problem vs. a direction problem.


    Problems come in many forms
    Chris Yiu | Secret Weapon

  • The Hits Business — Games Edition

    The best return on investment in entertainment, in terms of hours of deep engagement per dollar, is games. When done right, they blend stunning visuals and sound, earworm-like musical scores, compelling story and acting, and a sense of progression that is second to none.

    Case in point: I bought the complete edition of the award-winning The Witcher 3: Wild Hunt for $10 during a Steam sale in 2021. According to Steam, I’ve logged over 200 hours (I had to double-check that number!) playing the game, between two playthroughs and the amazing expansions Hearts of Stone and Blood and Wine — a whopping 20 hours/dollar spent. Even paying full freight (as of this writing, the complete edition including both expansions costs $50), that would still be a remarkable 4 hours/dollar. Compare that with the price of admission to a movie or theater or concert.

    The Witcher 3 has now surpassed 50 million sales — comfortably earning over $1 billion in revenue which is an amazing feat for any media property.

    But as amazing and as lucrative as these games can be, they cannot escape the cruelly hit-driven nature of their industry, where a small number of games generate the majority of financial returns. This has resulted in studios chasing ever more expensive games built on familiar intellectual property (e.g. Star Wars), which, to many game players, has cut the soul out of the games and has led to financial instability at even popular game studios.

    This article from IGN summarizes the state of the industry well — with so-called AAA games now costing $200 million to create, not to mention hundreds of millions more to market, more and more studios have had to wind down, as few games can generate enough revenue to cover the cost of development and marketing.

    The article predicts — and I hope it’s right — that the games industry will learn some lessons that many studios in Hollywood / the film industry have been forced to learn: embrace more small-budget games to experiment with new forms and IP. Blockbusters will have their place, but going all-in on blockbusters is a recipe for hollowing out the industry and cutting off the creativity it needs.

    Or, as the author so nicely puts it: “Maybe studios can remember that we used to play video games because they were fun – not because of their bigger-than-last-year maps carpeted by denser, higher-resolution grass that you walk across to finish another piece of side content that pushes you one digit closer to 100% completion.”


  • Why is it so Hard to Build a Diagnostic Business?

    Everywhere you look, the message seems clear: early detection (of cancer & disease) saves lives. Yet behind the headlines, companies developing these screening tools face a different reality. Many tests struggle to gain approval, adoption, or even financial viability. The problem isn’t that the science is bad — it’s that the math is brutal.

    This piece unpacks the economic and clinical trade-offs at the heart of the early testing / disease screening business. Why do promising technologies struggle to meet cost-effectiveness thresholds, despite clear scientific advances? And what lessons can diagnostic innovators take from these challenges to improve their odds of success? By the end, you’ll have a clearer view of the challenges and opportunities in bringing new diagnostic tools to market—and why focusing on the right metrics can make all the difference.

    The brutal math of diagnostics

    Image Credit: Wikimedia

    Technologists often prioritize metrics like sensitivity (also called recall) — the ability of a diagnostic test to correctly identify individuals with a condition (i.e., if the sensitivity of a test is 90%, then 90% of patients with the disease will register as positives and the remaining 10% will be false negatives) — because it’s often the key scientific challenge and aligns nicely with the idea of getting more patients earlier treatment.

    But when it comes to adoption and efficiency, specificity — the ability of a diagnostic test to correctly identify healthy individuals (i.e., if the specificity of a test is 90%, then 90% of healthy patients will register as negatives and the remaining 10% will be false positives) — is usually the more important and more overlooked criterion.

    The reason specificity is so important is that it has a profound impact on a test’s Positive Predictive Value (PPV) — the likelihood that a positive test result means a patient actually has the disease (i.e., if the positive predictive value of a test is 90%, then a patient who registers as positive has a 90% chance of having the disease and a 10% chance of actually being healthy — of being a false positive).

    What is counter-intuitive, even to many medical and scientific experts, is that because (by definition) most patients are healthy, many high-accuracy tests have disappointingly low PPVs, as most positive results are actually false positives.

    Let me present an example (see table below for summary of the math) that will hopefully explain:

    • There are an estimated 1.2 million people in the US with HIV — that is roughly 0.36% (the prevalence) of the US population
    • Let’s say we have an HIV test with 99% sensitivity and 99% specificity — a 99% (very) accurate test!
    • If we tested 10,000 Americans at random, you would expect roughly 36 of them (0.36% x 10,000) to be HIV positive. That means, roughly 9,964 are HIV negative
      • 99% sensitivity means 99% of the 36 HIV positive patients will test positive (99% x 36 = ~36)
      • 99% specificity means 99% of the 9,964 HIV negative patients will test negative (99% x 9,964 = ~9,864) while 1% (1% x 9,964 = ~100) would be false positives
    • This means that even though the test is 99% accurate, it only has a positive predictive value of ~26% (36 true positives out of 136 total positive results)
    Math behind the hypothetical HIV test example (Google Sheet link)

    Below (if you’re on a browser) is an embedded calculator which will run this math for any values of disease prevalence and sensitivity / specificity (and here is a link to a Google Sheet that will do the same), but you’ll generally find that low disease prevalence results in low positive predictive values, even for very accurate diagnostics.
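
    For those who prefer code to spreadsheets, here is a minimal Python sketch of the same calculation (the function name and example values are just illustrative; the embedded calculator and Google Sheet do the same thing):

    def positive_predictive_value(prevalence, sensitivity, specificity):
        # PPV = true positives / all positives, for a population screened at random
        true_positives = prevalence * sensitivity
        false_positives = (1 - prevalence) * (1 - specificity)
        return true_positives / (true_positives + false_positives)

    # The hypothetical HIV example above: 0.36% prevalence, 99% sensitivity & specificity
    print(positive_predictive_value(0.0036, 0.99, 0.99))  # ~0.26, i.e. ~26% PPV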

    Typically, introducing a new diagnostic means balancing true positives against the burden of false positives. After all, for patients, false positives will result in anxiety, invasive tests, and, sometimes, unnecessary treatments. For healthcare systems, they can be a significant economic burden as the cost of follow-up testing and overtreatment add up, complicating their willingness to embrace new tests.

    Below (if you’re on a browser) is an embedded calculator which will run the basic diagnostic economics math: for different values of the cost of testing and of follow-up testing, it calculates the total screening cost per patient helped (and here is a link to a Google Sheet that will do the same).
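
    As with the PPV math, a rough Python sketch of that economics calculation is below (the cost figures here are hypothetical placeholders, not values from the sheet):

    def screening_cost_per_true_positive(n, prevalence, sensitivity, specificity,
                                         test_cost, followup_cost):
        # Total cost of testing everyone plus following up every positive result,
        # divided by the number of patients correctly identified
        true_positives = n * prevalence * sensitivity
        false_positives = n * (1 - prevalence) * (1 - specificity)
        total_cost = n * test_cost + (true_positives + false_positives) * followup_cost
        return total_cost / true_positives

    # Hypothetical: 0.36% prevalence, a 99%/99% test, $50 test, $1,000 follow-up
    print(round(screening_cost_per_true_positive(10_000, 0.0036, 0.99, 0.99, 50, 1_000)))
    # ~17,800 dollars per true positive found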

    Finally, while diagnostics businesses face many of the same development hurdles as drug developers — the need to develop cutting-edge technology, to carry out large clinical studies to prove efficacy, and to manage a complex regulatory and reimbursement landscape — unlike drug developers, diagnostic businesses face significant pricing constraints. Successful treatments can command high prices for treating a disease. But successful diagnostic tests, no matter how sophisticated, cannot, because they ultimately don’t treat diseases, they merely identify them.

    Case Study: Exact Sciences and Cologuard

    Let’s take Cologuard (from Exact Sciences) as an example. Cologuard is a combination genomic and immunochemistry test for colon cancer carried out on patient stool samples. Its two primary alternatives are:

    1. a much less sensitive fecal immunochemistry test (FIT) — which uses antibodies to detect blood in the stool as a potential, imprecise sign of colon cancer
    2. colonoscopies — a procedure where a skilled physician uses an endoscope to enter and look for signs of cancer in a patient’s colon. It’s considered the “gold standard” as it functions both as diagnostic and treatment (a physician can remove or biopsy any lesion or polyp they find). But, because it’s invasive and uncomfortable for the patient, this test is typically only done every 4-10 years

    Cologuard is (as of this writing) Exact Sciences’ primary product line, responsible for a large portion of Exact Sciences’ $2.5 billion in 2023 revenue. It can detect earlier stage colon cancer as well as pre-cancerous growths that could lead to cancer. Impressively, Exact Sciences also commands a gross margin greater than 70%, a level mainly achieved by pharmaceutical and software companies with low per-unit costs of production. This has resulted in Exact Sciences, as of this writing, having a market cap over $11 billion.

    Yet for all its success, Exact Sciences is also a cautionary tale, illustrating the difficulties of building a diagnostics company.

    • The company was founded in 1995, yet didn’t see meaningful revenue from selling diagnostics until 2014 (nearly 20 years later, after it received FDA approval for Cologuard)
    • The company has never had a profitable year (including the last 10 years it’s been in-market), losing over $200 million in 2023, and it remained unprofitable through the first three quarters of 2024.
    • Between 1997 (the first year we have good data from their SEC filings as summarized in this Google Sheet) and 2014 when it first achieved meaningful diagnostic revenue, Exact Sciences lost a cumulative $420 million, driven by $230 million in R&D spending, $88 million in Sales & Marketing spending, and $33 million in CAPEX. It funded those losses by issuing over $624 million in stock (diluting investors and employees)
    • From 2015-2023, it has needed to raise an additional $3.5 billion in stock and convertible debt (net of paybacks) to cover its continued losses (over $3 billion from 2015-2023)
    • Prior to 2014, Exact Sciences attempted to commercialize colon cancer screening technologies through partnerships with LabCorp (ColoSure and PreGenPlus). These were not very successful and led to concerns from the FDA and insurance companies. This forced Exact Sciences to invest heavily in clinical studies to win over the payers and the FDA, including a pivotal ~10,000 patient study to support Cologuard which recruited patients from over 90 sites and took over 1.5 years.
    • It took Exact Sciences 3 years after FDA approval of Cologuard for its annual diagnostic revenues to exceed what it spends on sales & marketing. It continues to spend aggressively there ($727M in 2023).

    While it’s difficult to know precisely what the company’s management / investors would do differently if they could do it all over again, the brutal math of diagnostics certainly played a key role.

    From a clinical perspective, Cologuard faces the same low positive predictive value problem all diagnostic screening tests face. From the data in their study on ~10,000 patients, it’s clear that, despite having a much higher sensitivity for cancer (92.3% vs 73.8%) and higher AUROC (94% vs 89%) than the existing FIT test, the PPV of Cologuard is only 3.7% (lower than the FIT test: 6.9%).

    Even using a broader disease definition that includes the pre-cancerous advanced lesions Exact Sciences touted as a strength, the gap in PPV does not narrow (Cologuard: 23.6% vs FIT: 32.6%).

    Clinical comparison of FIT vs Cologuard
    (Google Sheet link)

    The economic comparison with the FIT test fares even worse due to the higher cost of Cologuard as well as its higher rate of false positives. Under the Centers for Medicare & Medicaid Services’ 2024Q4 laboratory fee schedule, a FIT test costs $16 (CPT code: 82274), but Cologuard costs $509 (CPT code: 81528), over 30x higher! If each positive Cologuard and FIT test results in a follow-up colonoscopy (which costs $800-1,000 according to this 2015 analysis), the screening cost per cancer patient found is 5.2-7.1x higher for Cologuard than for the FIT test.

    Cost comparison of FIT vs Cologuard
    (Google Sheet link)

    This quick math has been confirmed in several studies.

    From ACS Clinical Congress 2022 Presentation

    While Medicare and the US Preventive Services Task Force concluded that the cost of Cologuard and the increase in false positives / colonoscopy complications were worth the improved early detection of colon cancer, they stayed largely silent on comparing cost-efficacy with the FIT test. It’s this unfavorable comparison that has probably forced Exact Sciences to invest so heavily in sales and marketing to drive sales. That Cologuard has been so successful is a testament both to the value of being the only FDA-approved test on the market and to Exact Sciences’ efforts in making Cologuard so well-known (how many other diagnostics do you know of that have an SNL skit dedicated to them?).

    Not content to rest on the laurels of Cologuard, Exact Sciences recently published a ~20,000 patient study on their next generation colon cancer screening test: Cologuard Plus. While the study suggests Exact Sciences has improved the test across the board, the company’s marketing around Cologuard Plus having both >90% sensitivity and specificity is misleading, because the figures for sensitivity and specificity are for different conditions: sensitivity for colorectal cancer but specificity for colorectal cancer OR advanced precancerous lesion (see the table below).

    Sensitivity and Specificity by Condition for Cologuard Plus Study
    (Google Sheet link)

    Disentangling these numbers shows that while Cologuard Plus has narrowed its PPV disadvantage (now worse by 1% on colorectal cancer and even on cancer or lesion) and its cost-efficacy disadvantage (now “only” 4.4-5.8x more expensive) vs the FIT test (see tables below), it still hasn’t closed the gap.

    Clinical: Cologuard+ vs FIT (Google Sheet link)
    Economic: Cologuard+ vs FIT (Google Sheet link)

    Time will tell if this improved test performance translates to continued sales performance for Exact Sciences, but it is telling that despite the significant time and resources that went into developing Cologuard Plus, the data suggests it’s still likely more cost effective for health systems to adopt FIT over Cologuard Plus as a means of preventing advanced colon cancer.

    Lessons for diagnostics companies

    The underlying math of the diagnostics business and Exact Sciences’ long path to dramatic sales hold several key lessons for diagnostic entrepreneurs:

    1. Focus on specificity — Diagnostic technologists pay too much attention to sensitivity and too little to specificity. Positive predictive value and the cost-benefit for a health system largely swing on specificity.
    2. Aim for higher value tests — Because the development and required validation for a diagnostic can be as high as that of a drug or medical device, it is important to pursue opportunities where the diagnostic can command a high price. These are usually markets where the alternatives are very expensive because they require new technology (e.g. advanced genetic tests) or a great deal of specialized labor (e.g. colonoscopy) or where the diagnostic directly decides on a costly course of treatment (e.g. a companion diagnostic for an oncology drug).
    3. Go after unmet needs — If a test is able to fill a mostly unmet need — for example, if the alternatives are extremely inaccurate or poorly adopted — then adoption will be determined by awareness (because there aren’t credible alternatives) and pricing will be determined by sensitivity (because this drives the delivery of better care). This also simplifies the sales process.
    4. Win beyond the test — Because performance can only ever approach 100%, each incremental point of sensitivity and specificity is exponentially harder to achieve and delivers less medical or financial value. As a result, it can be advantageous to focus on factors beyond the test, such as regulatory approval / guidelines adoption, patient convenience, time to result, and impact on follow-up tests and procedures. Cologuard gained a great deal from being “the first FDA-approved colon cancer screening test”. Non-invasive prenatal testing, despite low positive predictive values and limited disease coverage, gained adoption in part by helping to triage follow-up amniocentesis (a procedure with a low but still frightening rate of miscarriage, ~0.5%). Rapid antigen tests for COVID have similarly been adopted despite their lower sensitivity and specificity than PCR tests, thanks to their speed, low cost, and ability to be carried out at home.

    Diagnostics developers must carefully navigate the intersection of scientific innovation and financial reality, grappling with the fact that even the most impressive technology may be insufficient for market success without accounting for clinical and economic factors.

    Ultimately, the path forward for diagnostic innovators lies in prioritizing specificity, targeting high-value and unmet needs, and crafting solutions that deliver value beyond the test itself. While Exact Sciences’ journey underscores the difficulty of these challenges, it also illustrates that with persistence, thoughtful investment, and strategic differentiation, it is possible to carve out a meaningful and impactful space in the market.

  • The Challenge of Capacity

    The rise of Asia as a force to be reckoned with in large scale manufacturing of critical components like batteries, solar panels, pharmaceuticals, chemicals, and semiconductors has left US and European governments seeking to catch up with a bit of a dilemma.

    These activities largely moved to Asia because financially-motivated management teams in the West (correctly) recognized that:

    • they were low return in a conventional financial sense (require tremendous investment and maintenance)
    • most of these had a heavy labor component (and higher wages in the US/Europe meant US/European firms were at a cost disadvantage)
    • these activities tend to benefit from economies of scale and regional industrial ecosystems, so it makes sense for an industry to have fewer and larger suppliers
    • much of the value was concentrated in design and customer relationship, activities the Western companies would retain

    What the companies failed to take into account was the speed at which Asian companies like WuXi, TSMC, Samsung, LG, CATL, Trina, Tongwei, and many others would consolidate (usually with government support), ultimately “graduating” into dominant positions with real market leverage and with the profitability to invest into the higher value activities that were previously the sole domain of Western industry.

    Now, scrambling to reposition themselves closer to the forefront in some of these critical industries, these governments have tried to kickstart domestic efforts, only to face the economic realities that led to the outsourcing to begin with.

    Northvolt, a major European effort to produce advanced batteries in Europe, is one example of this. Despite raising tremendous private capital and securing European government support, the company filed for bankruptcy a few days ago.

    While much hand-wringing is happening in climate-tech circles, I take a different view: this should really not come as a surprise. Battery manufacturing (like semiconductor, solar, pharmaceutical, etc) requires huge amounts of capital and painstaking trial-and-error to perfect operations, just to produce products that are steadily dropping in price over the long-term. It’s fundamentally a difficult and not-very-rewarding endeavor. And it’s for that reason that the West “gave up” on these years ago.

    But if US and European industrial policy is to be taken seriously, the respective governments need to internalize that reality and commit for the long haul. The idea that what these Asian companies are doing is “easily replicated” is simply not true, and the question is not if but when the next recipient of government support will fall into dire straits.


  • A Visual Timeline of Human Migration

    Beautiful map laying out when humans settled different parts of the world (from 2013, National Geographic’s Out of Eden project)


    A Walk Through Time
    Jeff Blossom | National Geographic

  • Updating my AI News Reader

    A few months ago, I shared that I had built an AI-powered personalized news reader which I use (and still do) on a near-daily basis. Since that post, I’ve made a couple of major improvements (which I have just reflected in my public Github).

    Switching to JAX

    I previously chose Keras 3 for my deep learning algorithm architecture because of its ease of use as well as the advertised ability to shift between AI/ML backends (at least between Tensorflow, JAX, and PyTorch). With Keras creator Francois Chollet noting significant speed-ups just from switching backends to JAX, I decided to give the JAX backend a shot.

    Thankfully, Keras 3 lived up to its multi-backend promise and made switching to JAX remarkably easy. For my code, I simply had to make three sets of tweaks.

    First, I had to change the definition of my container images. Instead of starting from Tensorflow’s official Docker images, I instead installed JAX and Keras on Modal’s default Debian image and set the appropriate environment variables to configure Keras to use JAX as a backend:

    jax_image = (
        modal.Image.debian_slim(python_version='3.11')
        .pip_install('jax[cuda12]==0.4.35', extra_options="-U")
        .pip_install('keras==3.6')
        .pip_install('keras-hub==0.17')
        .env({"KERAS_BACKEND":"jax"}) # sets Keras backend to JAX
        .env({"XLA_PYTHON_CLIENT_MEM_FRACTION":"1.0"}) # lets JAX use 100% of GPU memory
    )

    Second, because tf.data pipelines convert everything to Tensorflow tensors, I had to switch my preprocessing pipelines from using Keras’s ops library (which, because I was using JAX as a backend, expected JAX tensors) to Tensorflow native operations:

    ds = ds.map(
        lambda i, j, k, l: 
        (
            preprocessor(i), 
            j, 
            2*k-1, 
            loglength_norm_layer(tf.math.log(tf.cast(l, dtype=tf.float32)+1))
        ), 
        num_parallel_calls=tf.data.AUTOTUNE
    )

    Lastly, I had a few lines of code which assumed Tensorflow tensors (where getting the underlying value required a .numpy() call). As I was now using JAX as a backend, I had to remove the .numpy() calls for the code to work.
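
    As a hypothetical illustration of that change (not the actual lines from my code), np.asarray() works for both TensorFlow tensors and JAX arrays, whereas .numpy() only exists on the former:

    import numpy as np
    from keras import ops  # Keras 3; backend chosen via the KERAS_BACKEND environment variable

    preds = ops.softmax(ops.convert_to_tensor([2.0, 1.0, 0.5]))  # a backend-native tensor
    # Before (TensorFlow backend): values = preds.numpy()  -- fails on JAX arrays
    values = np.asarray(preds)  # works regardless of backend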

    Everything else — the rest of the tf.data preprocessing pipeline, the code to train the model, the code to serve it, the previously saved model weights and the code to save & load them — remained the same! Considering that the training time per epoch and the time the model took to evaluate (a measure of inference time) both seemed to improve by 20-40%, this simple switch to JAX seemed well worth it!

    Model Architecture Improvements

    There were two major improvements I made in the model architecture over the past few months.

    First, having run my news reader for the better part of a year, I have now accumulated enough data that my strategy of simultaneously training on two related tasks (predicting the human rating and predicting the length of an article) no longer requires separate inputs. This reduced the memory requirements and simplified the data pipeline for training (see architecture diagram below).

    Secondly, I was able to successfully train a version of my algorithm which can use dot products natively. This not only allowed me to remove several layers from my previous model architecture (see architecture diagram below), but, because the Supabase Postgres database I’m using supports pgvector, it means I can even compute ratings for articles through a SQL query:

    UPDATE articleuser
    SET 
        ai_rating = 0.5 + 0.5 * (1 - (a.embedding <=> u.embedding)),
        rating_timestamp = NOW(),
        updated_at = NOW()
    FROM 
        articles a, 
        users u
    WHERE 
        articleuser.article_id = a.id
        AND articleuser.user_id = u.id
        AND articleuser.ai_rating IS NULL;
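    For clarity, that rating update is doing roughly the following math (a Python sketch; pgvector’s <=> operator returns cosine distance):

    import numpy as np

    def ai_rating(article_embedding, user_embedding):
        # <=> in pgvector is cosine distance, so 1 - distance is cosine similarity;
        # 0.5 + 0.5 * similarity maps it from [-1, 1] onto a [0, 1] rating
        a = np.asarray(article_embedding)
        u = np.asarray(user_embedding)
        cosine_similarity = a @ u / (np.linalg.norm(a) * np.linalg.norm(u))
        return 0.5 + 0.5 * cosine_similarity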

    The result is much greater simplicity in architecture as well as greater operational flexibility as I can now update ratings from the database directly as well as from serving a deep neural network from my serverless backend.

    Model architecture (output from Keras plot_model function)

    Making Sources a First-Class Citizen

    As I used the news reader, I realized early on that it would be valuable to be able to see sorted content from just one source (e.g. a particular blog or news site). To add this, I created and populated a new sources table in the database to track sources independently (see database design diagram below) and linked it to the articles table.

    Newsreader database design diagram (produced by a Supabase tool)

    I then modified my scrapers to insert the identifier for each source alongside each new article, and made sure my fetch calls all JOIN‘d and pulled the relevant source information.

    With the data infrastructure in place, I added the ability to add a source parameter to the core fetch URLs to enable single (or multiple) source feeds. I then added a quick element at the top of the feed interface (see below) to let a user know when the feed they’re seeing is limited to a given source. I also made all the source links in the feed clickable so that they could take the user to the corresponding single source feed.

    <div class="feed-container">
      <div class="controls-container">
        <div class="controls">
          ${source_names && source_names.length > 0 && html`
            <div class="source-info">
              Showing articles from: ${source_names.join(', ')}
            </div>
            <div>
              <a href="/">Return to Main Feed</a>
            </div>
          `}
        </div>
      </div>
    </div>
    The interface when on a single source feed

    Performance Speed-Up

    One recurring issue I noticed in my use of the news reader pertained to slow load times. While some of this can be attributed to the “cold start” issue that serverless applications face, much of this was due to how the news reader was fetching pertinent articles from the database. It was deciding at the moment of the fetch request what was most relevant to send over by calculating all the pertinent scores and rank ordering. As the article database got larger, this computation became more complicated.

    To address this, I decided to move to a “pre-calculated” ranking system. That way, the system would know what to fetch in advance of a fetch request (and hence return much faster). Couple that with a database index (which effectively “pre-sorts” the results to make retrieval even faster), and I saw visually noticeable improvements in load times.

    But with any pre-calculated score scheme, the most important question is how and when re-calculation should happen. Recalculate too often or too broadly and you incur unnecessary computing costs; too infrequently and you risk the scores becoming stale.

    The compromise I reached derives from the three ways articles are ranked in my system (a rough sketch of the blended score follows the list):

    1. The AI’s rating of an article plays the most important role (60%)
    2. How recently the article was published (20%)
    3. How similar an article is to the 10 articles the user most recently read (20%)
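
    As a rough sketch (assuming a simple weighted sum; the actual normalization of recency and similarity is a bit more involved), the blended score looks something like this:

    def article_score(ai_rating, recency, similarity,
                      w_ai=0.6, w_recency=0.2, w_similarity=0.2):
        # Blend the three ranking signals (each scaled to 0-1) into one pre-computed score
        return w_ai * ai_rating + w_recency * recency + w_similarity * similarity

    print(article_score(ai_rating=0.9, recency=0.5, similarity=0.7))  # 0.78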

    These factors lent themselves to very different natural update cadences:

    • Newly scraped articles have their AI ratings and calculated scores computed at the time they enter the database
    • AI ratings for the most recent and the previously highest-scoring articles are re-computed after model training updates
    • On a daily basis, each article’s score is recomputed (focusing on the change in article recency)
    • The article similarity for unread articles is re-evaluated after a user reads 10 articles

    This required modifying the reader’s existing scraper and post-training processes to update the appropriate scores after scraping runs and model updates. It also meant tracking article reads on the users table (and modifying the /read endpoint to update these scores at the right intervals). Finally, it also meant adding a recurring cleanUp function set to run every 24 hours to perform this update as well as others.
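
    For reference, here is roughly what scheduling a recurring job like that looks like on Modal (a sketch only; the app name is hypothetical and the real cleanUp function does more than this stub):

    import modal

    app = modal.App("newsreader")  # hypothetical app name

    @app.function(schedule=modal.Period(hours=24))  # run every 24 hours
    def clean_up():
        # recompute article scores for recency, refresh similarity for unread articles, etc.
        ...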

    Next Steps

    With some of these performance and architecture improvements in place, my priorities are now focused on finding ways to systematically improve the underlying algorithms as well as increase the platform’s usability as a true news tool. To that end, some of the top priorities for next steps in my mind include:

    • Testing new backbone models — The core ranking algorithm relies on RoBERTa, a model released 5 years ago, before large language models were common parlance. Keras Hub makes it incredibly easy to incorporate newer models like Meta’s Llama 2 & 3, OpenAI’s GPT-2, Microsoft’s Phi-3, and Google’s Gemma and fine-tune them.
    • Solving the “all good articles” problem — Because the point of the news reader is to surface content it considers good, users will not readily see lower quality content, nor will they see content the algorithm struggles to rank (i.e. new content very different from what the user has seen before). This makes it difficult to get the full range of data needed to help preserve the algorithm’s usefulness.
    • Creating topic and author feeds — Given that many people think in terms of topics and authors of interest, expanding what I’ve already done with sources to topic and author feeds sounds like a high-value next step.

    I also endeavor to make more regular updates to the public Github repository (instead of aggregating many updates I had already made into two large ones). This will make the updates more manageable and hopefully help anyone out there who’s interested in building a similar product.

  • Not your grandma’s geothermal energy

    The pursuit of carbon-free energy has largely leaned on intermittent sources of energy, like wind and solar, and on sources that require a great deal of initial investment, like hydroelectric (which requires elevated bodies of water and dams) and nuclear (which requires building a reactor).

    The theoretical beauty of geothermal power is that, if you dig deep enough, virtually everywhere on planet earth is hot enough to melt rock (thanks to the nuclear reactions that heat up the inside of the earth). But, until recently, geothermal has been limited to regions of Earth where well-formed geologic formations can deliver predictable steam without excessive engineering.

    But, ironically, it is the fracking boom, which has helped the oil & gas industry access new sources of carbon-producing energy, that may help us tap geothermal power in more places. Because fracking and oil & gas exploration have driven a revolution in our ability to precisely drill deep underground and push & pull fluids, they also give us the ability to tap more geothermal power than ever before. This has led to the rise of enhanced geothermal: injecting water deep underground to heat it and using the resulting steam to generate electricity. Studies suggest the resource is particularly rich and accessible in the Southwest of the United States (see map below) and could be an extra tool in our portfolio to green energy consumption.

    (Source: Figure 5 from NREL study on enhanced geothermal from Jan 2023)

    While there is a great deal of uncertainty around how much this will cost and just what it will take (not to mention the seismic risks that have plagued some fracking efforts), the hunger for more data center capacity and the desire to power this with clean electricity has helped startups like Fervo Energy and Sage Geosystems fund projects to explore.


  • Who needs humans? Lab of AIs designs valid COVID-binding proteins

    A recent preprint from Stanford has demonstrated something remarkable: AI agents working together as a team solving a complex scientific challenge.

    While much of the AI discourse focuses on how individual large language models (LLMs) compare to humans, much of human work today is a team effort, and the right question is less “can this LLM do better than a single human on a task” and more “what is the best team-up of AI and human to achieve a goal?” What is fascinating about this paper is that it looks at it from the perspective of “what can a team of AI agents achieve?”

    The researchers tackled an ambitious goal: designing improved COVID-binding proteins for potential diagnostic or therapeutic use. Rather than relying on a single AI model to handle everything, the researchers tasked an AI “Principal Investigator” with assembling a virtual research team of AI agents! After some internal deliberation, the AI Principal Investigator selected an AI immunologist, an AI machine learning specialist, and an AI computational biologist. The researchers made sure to add an additional role, one of a “scientific critic” to help ground and challenge the virtual lab team’s thinking.

    The team composition and phases of work planned and carried out by the AI principal investigator
    (Source: Figure 2 from Swanson et al.)

    What makes this approach fascinating is how it mirrors high functioning human organizational structures. The AI team conducted meetings with defined agendas and speaking orders, with a “devil’s advocate” to ensure the ideas were grounded and rigorous.

    Example of a virtual lab meeting between the AI agents; note the roles of the Principal Investigator (to set agenda) and Scientific Critic (to challenge the team to ground their work)
    (Source: Figure 6 from Swanson et al.)

    One tactic that the researchers said helped boost creativity (and one that is harder to replicate with humans) is running parallel discussions, whereby the AI agents have the same conversation over and over again. In these discussions, the human researchers set the “temperature” of the LLM higher (inviting more variation in output). The AI principal investigator then took the output of all of these conversations and synthesized it into a final answer (this time with the LLM temperature set lower, to reduce the variability and “imaginativeness” of the answer).

    The use of parallel meetings to get “creativity” and a diverse set of options
    (Source: Supplemental Figure 1 from Swanson et al.)

    The results? The AI team successfully designed nanobodies (small antibody-like proteins — this was a choice the team made to pursue nanobodies over more traditional antibodies) that showed improved binding to recent SARS-CoV-2 variants compared to existing versions. While humans provided some guidance, particularly around defining coding tasks, the AI agents handled the bulk of the scientific discussion and iteration.

    Experimental validation of some of the designed nanobodies; the relevant comparison is the filled-in circles vs the open circles. The higher ELISA assay intensity for the filled-in circles shows that the designed nanobodies bind better than their unmutated original counterparts
    (Source: Figure 5C from Swanson et al.)

    This work hints at a future where AI teams become powerful tools for human researchers and organizations. Instead of asking “Will AI replace humans?”, we should be asking “How can humans best orchestrate teams of specialized AI agents to solve complex problems?”

    The implications extend far beyond scientific research. As businesses grapple with implementing AI, this study suggests that success might lie not in deploying a single, all-powerful AI system, but in thoughtfully combining specialized AI agents with human oversight. It’s a reminder that in both human and artificial intelligence, teamwork often trumps individual brilliance.

    I personally am also interested in how different team compositions and working practices might lead to better or worse outcomes — for both AI teams and human teams. Should we have one scientific critic, or should there be specialist critics for each task? How important was the speaking order? What if the group came up with their own agendas? What if there were two principal investigators with different strengths?

    The next frontier in AI might not be building bigger models, but building better teams.

  • Updating OpenMediaVault

    (Note: this is part of my ongoing series on cheaply selfhosting)

    I’ve been using OpenMediaVault 6 on a cheap mini-PC as a home server for over a year. Earlier this year, OpenMediaVault 7 was announced which upgrades the underlying Linux to Debian 12 Bookworm and made a number of other security, compatibility, and user interface improvements.

    Not wanting to start over fresh, I decided to take advantage of OpenMediaVault’s built-in command line tool to handle the upgrade. If, like me, you are looking for a quick and clean way of upgrading from OpenMediaVault 6 to OpenMediaVault 7, look no further:

    1. SSH into your system or connect directly with a keyboard and monitor. While I would normally recommend WeTTY (accessible via [Services > WeTTY] from the web console interface) for any command line activity on your server, WeTTY relies on the server being up, and updating the operating system requires shutting the server down, so you’ll need to either plug in a keyboard and monitor or use SSH.
    2. Run sudo omv-upgrade in the command-line. This will start a long process of downloading and installing necessary files to complete the operating system update. From time to time you’ll be asked to accept / approve a set of changes via keyboard. If you’re on the online administrative panel, you’ll be booted off as the server shuts down.
    3. Restart the server. Once everything is complete, you’ll need to restart the server to make sure everything “takes”. This can be done by running reboot in the command line or by manually turning off and on the server.

    Assuming everything went smoothly, after the server completes its reboot (which will take a little bit of extra time after an operating system upgrade), upon logging into the administrative console as you had done before, you’ll be greeted by the new OMV 7 login screen. Congratulations!

  • A Digital Twin of the Whole World in the Cloud

    As a kid, I remember playing Microsoft Flight Simulator 5.0 — while I can’t say I really understood all the nuances of the several hundred page manual (which explained how ailerons and rudders and elevators worked), I remember being blown away with the idea that I could fly anywhere on the planet and see something reasonably representative there.

    Flash forward a few decades and Microsoft Flight Simulator 2024 can safely be said to be one of the most detailed “digital twins” of the whole planet ever built. In addition to detailed photographic mapping of many locations (I would imagine a combination of aerial surveillance and satellite imagery) and an accurate real world inventory of every helipad (including offshore oil rigs!) and glider airport, they also simulate flocks of animals, plane wear and tear, how snow vs mud vs grass behave when you land on it, wake turbulence, and more! And, just as impressive, it’s being streamed from the cloud to your PC/console when you play!

    Who said the metaverse is dead?


  • The Startup Battlefield: Lessons from History’s Greatest Military Leaders

    It is hard to find good analogies for running a startup that founders can learn from. Some of the typical comparisons — playing competitive sports & games, working on large projects, running large organizations — all fall short of capturing the feeling that the odds are stacked against you that founders have to grapple with.

    But the annals of military history offer a surprisingly good analogy to the startup grind. Consider the campaigns of some of history’s greatest military leaders — like Alexander the Great and Julius Caesar — who successfully waged offensive campaigns against numerically superior opponents in hostile territory. These campaigns have many of the same hallmarks as startups:

    1. Bad odds: Just as these commanders faced superior enemy forces in hostile territory, startups compete against incumbents with vastly more resources in markets that favor them.
    2. Undefined rules: Unlike games with clear rules and a limited set of moves, military commanders and startup operators have broad flexibility of action and must be prepared for all types of competitive responses.
    3. Great uncertainty: Not knowing how the enemy will act is very similar to not knowing how a market will respond to a new offering.

    As a casual military history enthusiast and a startup operator & investor, I’ve found striking parallels in how history’s most successful commanders overcame seemingly insurmountable odds with how the best startup founders operate, and think that’s more than a simple coincidence.

    In this post, I’ll explore the strategies and campaigns of 9 military commanders (see below) who won battle after battle against numerically superior opponents across a wide range of battlefields. By examining their approach to leadership and strategy, I found 5 valuable lessons that startup founders can hopefully apply to their own ventures.

    Leader | Represented | Notable Victories | Legacy
    Alexander the Great | Macedon (336-323 BCE) | Tyre, Issus, Gaugamela, Persian Gate, Hydaspes | Conquered the Persian Empire before the age of 32; spread Hellenistic culture across Eurasia and is widely viewed in the West as antiquity’s greatest conqueror
    Hannibal Barca | Carthage (221-202 BCE) | Ticinus, Trebia, Trasimene, Cannae | Brought Rome the closest to defeat it would come until its fall in the 5th century CE; operated freely within Italy for over a decade
    Han Xin (韓信) | Han Dynasty (漢朝) (206-202 BCE) | Jingxing (井陘), Wei River (濰水), Anyi (安邑) | Despite being a commoner, his victories led to the creation of the Han Dynasty (漢朝) and he is remembered as one of “the Three Heroes of the Han Dynasty” (漢初三傑)
    Gaius Julius Caesar | Rome (59-45 BCE) | Alesia, Pharsalus | Established Rome’s dominance in Gaul (France); became the undisputed leader of Rome, effectively ending the Roman Republic, and his name has since become synonymous with “emperor” in the West
    Subutai | Mongol Empire (1211-1248) | Khunan, Kalka River, Sanfengshan (三峰山), Mohi | Despite being a commoner, became one of the most successful military commanders of the Mongol Empire; won battles in more theaters (China, Central Asia, and Eastern Europe) than any other commander
    Timur | Timurid Empire (1370-1405) | Kondurcha River, Terek River, Delhi, Ankara | Created a Central Asian empire with dominion over Turkey, Persia, Northern India, Eastern Europe, and Central Asia; his successors would eventually create the Mughal Empire in India, which continued until the 1850s
    John Churchill, Duke of Marlborough | Britain (1670-1712) | Blenheim, Ramillies | Considered one of the greatest British commanders in history; paved the way for Britain to overtake France as the pre-eminent military and economic power in Europe
    Frederick the Great | Prussia (1740-1779) | Hohenfriedberg, Rossbach, Leuthen | Established Prussia as the pre-eminent Central European power after defeating nearly every major European power in battle; a cultural icon for the creation of Germany
    Napoleon Bonaparte | France (1785-1815) | Rivoli, Tarvis, Ulm, Austerlitz, Jena-Auerstedt, Friedland, Dresden | Established a French empire with dominion over most of continental Europe; the Napoleonic Code now serves as the basis for legal systems around the world, and the name Napoleon is synonymous with military genius and ambition

    Before I dive in, three important call-outs to remember:

    1. Running a startup is not actually warfare — there are limitations to this analogy. Startups are not (and should not be) life-or-death. Startup employees are not bound by military discipline (or the threat of imprisonment if they are derelict). The concept of battlefield deception, which is at the heart of many of the tactics of the greatest commanders, also doesn’t translate well. Treating your employees / co-founders as one would a soldier or condoning violent and overly aggressive tactics would be both an ethical failure and a misread of this analogy.
    2. Drawing lessons from these historical campaigns does not mean condoning the underlying sociopolitical causes of these conflicts, nor the terrible human and economic toll these battles led to. Frankly, many of these commanders were absolutist dictators with questionable motivations and sadistic streaks. This post’s focus is purely on getting applicable insights on strategy and leadership from leaders who were able to win despite difficult odds.
    3. This is not intended to be an exhaustive list of every great military commander in history. Rather, it represents the intersection of offensive military prowess and my familiarity with the historical context. That I did not mention a particular commander has no bearing on their actual greatness.

    With those in mind, let’s explore how the wisdom of historical military leaders can inform the modern startup journey. In the post, I’ll unpack five key principles (see below) drawn from the campaigns of history’s most successful military commanders, and show how they apply to the challenges ambitious founders face today.

    1. Get in the trenches with your team
    2. Achieve and maintain tactical superiority
    3. Move fast and stay on offense
    4. Unconventional teams win
    5. Pick bold, decisive battles

    Principle 1: Get in the trenches with your team

    One common thread unites the greatest military commanders: their willingness to share in the hardships of their soldiers. This exercise of leadership by example, of getting “in the trenches” with one’s team, is as crucial in the startup world as it was on historical battlefields.

    Every commander on our list was renowned for marching and fighting alongside their troops. This wasn’t mere pageantry; it was a fundamental aspect of their leadership style that yielded tangible benefits:

    1. Inspiration: Seeing their leader work shoulder-to-shoulder with them motivated soldiers to push beyond their regular limits.
    2. Trust: By sharing in their soldiers’ hardships, commanders demonstrated that they valued their troops and understood their needs.
    3. Insight: Direct involvement gave leaders firsthand knowledge of conditions on the ground, informing better strategic decisions.

    Perhaps no figure exemplified this better than Alexander the Great. Famous for being one of the first soldiers to jump into battle, Alexander was seriously wounded multiple times. This shared experience created a deep bond with his soldiers, culminating in his legendary speech at Opis, where he quelled a mutiny of soldiers tired after years of campaigning by reminding them of their shared experiences:

    Alexander the Great from Alexandria, Egypt (3rd Century BCE); Image Credit: Wikimedia

    The wealth of the Lydians, the treasures of the Persians, and the riches of the Indians are yours; and so is the External Sea. You are viceroys, you are generals, you are captains. What then have I reserved to myself after all these labors, except this purple robe and this diadem? I have appropriated nothing myself, nor can any one point out my treasures, except these possessions of yours or the things which I am guarding on your behalf. Individually, however, I have no motive to guard them, since I feed on the same fare as you do, and I take only the same amount of sleep.

    Nay, I do not think that my fare is as good as that of those among you who live luxuriously; and I know that I often sit up at night to watch for you, that you may be able to sleep.

    But some one may say, that while you endured toil and fatigue, I have acquired these things as your leader without myself sharing the toil and fatigue. But who is there of you who knows that he has endured greater toil for me than I have for him? Come now, whoever of you has wounds, let him strip and show them, and I will show mine in turn; for there is no part of my body, in front at any rate, remaining free from wounds; nor is there any kind of weapon used either for close combat or for hurling at the enemy, the traces of which I do not bear on my person.

    For I have been wounded with the sword in close fight, I have been shot with arrows, and I have been struck with missiles projected from engines of war; and though oftentimes I have been hit with stones and bolts of wood for the sake of your lives, your glory, and your wealth, I am still leading you as conquerors over all the land and sea, all rivers, mountains, and plains. I have celebrated your weddings with my own, and the children of many of you will be akin to my children.

    Alexander the Great (as told by Arrian)

    This was not unique to Alexander. Julius Caesar famously slept in chariots and marched alongside his soldiers. Napoleon was called “le petit caporal” by his troops after he was found sighting the artillery himself, a task that put him within range of enemy fire and was usually delegated to junior officers.

    Frederick the Great also famously mingled with his soldiers while on tour, taking kindly to the nickname from his men, “Old Fritz”. Frederick understood the importance of this as he once wrote to his nephew:

    “You cannot, under any pretext whatever, dispense with your presence at the head of your troops, because two thirds of your soldiers could not be inspired by any other influence except your presence.”

    Frederick the Great
    “Old Fritz” after the Battle of Hochkirch
    Image credit: WikiMedia Commons

    For Startups

    For founders, the lesson is clear: show up when & where your team is and roll up your sleeves so they can see you work beside them. It’s not just that startups tend to need “all hands on deck”; being in the trenches also provides valuable “on the ground” context and helps create the morale needed to succeed.

    Elon Musk, for example, famously spent time on the Tesla factory floor — even sleeping on it — while the company worked through issues with its Model 3 production, noting in an interview:

    “I am personally on that line, in that machine, trying to solve problems personally where I can,” Musk said at the time. “We are working seven days a week to do it. And I have personally been here on zone 2 module line at 2:00 a.m. on a Sunday morning, helping diagnose robot calibration issues. So I’m doing everything I can.”

    Principle 2: Achieve and maintain tactical superiority

    To win battles against superior numbers requires a commander to have a strong tactical edge over their opponents. This can be in the form of a technological advantage (e.g. a weapons technology) or an organizational one (e.g. superior training or formations), but these successful commanders always made sure their soldiers could “punch above their weight”.

    Alexander the Great, for example, leveraged the Macedonian Phalanx, a modification of the “classical Greek phalanx” used by the Greek city states of the era, that his father Philip II helped create.

    Image Credit: RedTony via WikiMedia Commons

    The formation relied on “blocks” of heavy infantry equipped with six-meter (!!) long spears called sarissa which could rearrange themselves (to accommodate different formation widths and depths) and “pin” enemy formations down while the heavy cavalry would flank or exploit gaps in the enemy lines. This formation made Alexander’s army highly effective against every military force — Greeks, Persians, and Indians — it encountered.

    Macedonian Phalanx with sarissa; Image Credit: Wikimedia Commons

    A few centuries later, the brilliant Chinese commander Han Xin (韓信) leaned heavily on the value of military engineering. Han Xin (韓信)’s soldiers would rapidly repair & construct roads to facilitate his army’s movement or, at times, to deceive his enemies about which path he planned to take. His greatest military engineering accomplishment was at the Battle of Wei River (濰水) in 204 BCE. Han Xin (韓信) attacked the larger forces of the State of Qi (齊) and State of Chu (楚) and immediately retreated across the river, luring them to cross. What his rivals had not realized in their pursuit was that the water level of the Wei River was oddly low. Han Xin (韓信) had, prior to the attack, instructed his soldiers to construct a dam upstream to lower the water level. Once a sizable fraction of the enemy’s forces were mid-stream, Han Xin (韓信) ordered the dam released. The rush of water drowned a sizable portion of the enemy’s forces and divided the Chu (楚) / Qi (齊) forces letting Han Xin (韓信)’s smaller army defeat and scatter them.

    A century and a half later, Roman statesman and military commander Gaius Julius Caesar also famously championed military engineering in his wars with the Germanic tribes in Gaul. He became the first Roman commander to cross the Rhine (twice!) by building bridges, making the point to the Germanic tribes that he could invade them whenever he wanted. At the Battle of Alesia in 52 BCE, after trading battles with the skilled Gallic commander Vercingetorix, who had united the tribes in opposition to Rome, Caesar besieged Vercingetorix’s fortified settlement of Alesia while simultaneously holding off Gallic reinforcements. Caesar did this by building 25 miles of fortifications surrounding Alesia in a month, all while outnumbered and under constant harassment from both sides by the Gallic forces! Caesar’s success forced Vercingetorix to surrender, bringing an end to organized resistance to Roman rule in Gaul for centuries.

    Vercingetorix Throws Down his Arms at the Feet of Julius Caesar by Lionel Royer; Image Credit: Wikimedia

    The Mongol commander Subutai similarly made great use of Mongol innovations to overcome defenders from across Eurasia. The lightweight Mongol composite bow gave Mongol horse archers a devastating combination of long range (supposedly 150-200 meters!) and speed (because they were light enough to be fired while on horseback). The Mongol horses themselves were another “biotechnological” advantage in that they required less water and food which let the Mongols wage longer campaigns without worrying about logistics.

    Mongol horse archers, Image credit: Wikimedia Commons

    In the 18th century, Frederick the Great transformed warfare on the European continent with a series of innovations. First, he drilled his soldiers, stressing things like firing speed. It is said that lines of Prussian infantry could fire over twice as fast as the other European armies they faced, making them exceedingly lethal in combat.

    Frederick’s Leibgarde Batallion in action; Image credit: Military Heritage

    Frederick was also famous for a battle formation: the oblique order. Instead of attacking an opponent head on, the oblique order involves confronting the enemy line at an angle with soldiers massed towards one end of the formation. If one’s soldiers are well-trained and disciplined, then even with a smaller force in aggregate, the massed wing can overwhelm the opponent in one area and then flank or surround the rest. Frederick famously boasted that the oblique order could allow a skilled force to defeat an opposing one three times its size.

    Finally, Frederick is credited with popularizing horse artillery, the use of horse-drawn light artillery guns, in European warfare. With horse artillery units, Frederick was able to increase the adaptability of his forces and their ability to break through even numerically superior massed infantry by concentrating artillery fire where it was needed.

    Horse-drawn artillery unit; Image credit: Wikimedia Commons

    A few decades later, Napoleon Bonaparte became the undisputed master of much of continental Europe by mastering army-level logistics and organization. While a brilliant tactician and artillery commander, what set Napoleon’s military apart was its embrace of the “corps system”, which subdivided his forces into smaller, self-contained corps capable of independent operations. This allowed Napoleon to pursue grander goals, knowing that he could focus his attention on the most important fronts of battle while the other corps independently pinned an enemy down or pursued a different objective in parallel.

    Napoleon triumphantly entering Berlin by Charles Meynier; Image Credit: Wikimedia Commons

    Additionally, Napoleon invested heavily in overhauling military logistics, using a combination of forward supply depots and teaching his forces to forage for food and supplies in enemy territory (and, just as importantly, to estimate how much foraging could provide, which determined what supplies the army actually needed to carry). This investment led to the invention of modern canning technology, first used to support the marches of the French Grande Armée. The result was that Napoleon could field larger armies over longer campaigns, all while keeping his soldiers relatively well-fed.

    For Startups

    Founders need to make sure they have a strong tactical advantage that fits their market(s). As evidenced above, it does not need to be something as grand as an unassailable advantage, but it needs to be a reliable winner and something you continuously invest in if you plan on competing with well-resourced incumbents in challenging markets.

    The successful payments company Stripe started out by making sure they would always win on developer ease of use, even going so far as to charge more than their competition during their Beta to make sure that their developer customers were valuing them for their ease of use. Stripe’s advantage here, and continuous investment in maintaining that advantage, ultimately let it win any customer that needed a developer payment integration, even against massive financial institutions. This advantage laid the groundwork for Stripe’s meteoric growth and expansion into adjacent categories from its humble beginnings.

    Principle 3: Move fast and stay on offense

    In both military campaigns and startups, speed and a focus on offense play an outsized role in victory, because the ability to move quickly creates opportunities and increases resiliency to mistakes.

    Few understood this principle as well as the Mongol commander Subutai who frequently took advantage of the greater speed and discipline of the Mongol cavalry to create opportunities to win.

    In the Battle of the Kalka River (1223), Subutai took what initially appeared to be a Mongol defeat — when the Kievan Rus and their Cuman allies successfully entrapped the Mongol forces in the area — and turned it into a victory. The Mongols began a 9-day feigned retreat (many historians believe this was a real retreat that Subutai turned into a feigned one once he realized the situation), constantly tempting the enemy into overextending themselves in pursuit by staying just out of reach.

    After 9 days, Subutai’s forces took advantage of their greater speed to lay a trap. Once the Mongols crossed the river they reformed their lines to lie in ambush. As soon as the Rus forces crossed the Kalka River, they found themselves surrounded and confronted with a cavalry charge they were completely unprepared for. After all, they had been pursuing what they thought was a fleeing enemy! Their backs against the river, the Rus forces (including several major princes) were annihilated.

    Battle of Kalka River; Image Credit: Wikimedia Commons

    Subutai leveraged the Mongol speed advantage in a number of his campaigns, coordinating fast-moving Mongol divisions across multiple objectives. In the destruction of the Central Asian Khwarazmian Empire, the Mongols, under the command of Subutai and Mongol ruler Genghis Khan, overwhelmed the defenders with coordinated maneuvers. While much of the Mongol force attacked from the East, where the Khwarazmian forces massed, Subutai used the legendary Mongol speed to go around the Khwarazmian lines altogether, ending up at Bukhara, 100 miles to the West of the Khwarazmian defensive position! In a matter of months, the empire was destroyed and its rulers chased out, never to return.

    Map of the Mongol force movements in the Mongol invasion of Khwarazmian Empire; Image Credit: Paul K. Davis, Masters of the Battlefield

    A few hundred years later, the Englishman John Churchill, the Duke of Marlborough, also proved the value of speed in 1704 when he boldly marched an army of 21,000 Dutch and English troops 250 miles across Europe in just five weeks to place themselves between the French and Bavarian forces and their target of Vienna. Had Vienna been attacked, it would have forced England’s ally the Holy Roman Empire out of the conflict, giving France victory in the War of the Spanish Succession. The march was made all the more challenging because Marlborough had to find a way to feed and equip his army along the way without unnecessarily burdening the neutral and friendly territories they were marching through.

    Marlborough’s “march to the Danube”; Image Credit: Rebel Redcoat

    Marlborough’s maneuver threw the Bavarian and French forces off-balance. What originally was supposed to be an “easy” French victory culminated in a crushing defeat for the French at Blenheim which turned the momentum of the war. This victory solidified Marlborough’s reputation and even resulted in the British government agreeing to build a lavish palace (called Blenheim Palace in honor of the battle) as a reward to Marlborough.

    Marlborough proved the importance of speed again at the Battle of Oudenarde. In 1708, French forces captured Ghent and Bruges (in modern day Belgium), threatening the alliance’s ability to maintain contact with Britain. Recognizing this, Marlborough force-marched his army to the city of Oudenarde, marching 30 miles in about as many hours. The French, confident from their recent victories and suffering from an internal leadership squabble, misjudged the situation, allowing Marlborough’s forces to build five pontoon bridges to move his 80,000 soldiers across the nearby river.

    When the French commander received news that the allies were already at Oudenarde building bridges, he said, “If they are there, then the devil must have carried them. Such marching is impossible!”

    Marlborough’s forces, not yet at full strength, engaged the French, buying sufficient time for the rest of his army to cross and form up. Once in formation, they counterattacked and collapsed one wing of the French line, saving the Allied position in the Netherlands and resulting in a bad defeat for the French forces.

    The Battle of Oudenarde, showing the position of the bridges the Allied forces needed to cross to get into position; Image Credit: WikiMedia Commons

    For Startups

    The pivotal role speed played in the victories of Subutai and the Duke of Marlborough applies in the startup domain as well. The ability to make fast decisions and to quickly shift focus as the market context changes creates opportunities that slower-moving incumbents (and military commanders!) cannot seize. Speed also grants resiliency against mistakes and weak positions, in much the same way that it let the Mongols and the Anglo-Prussian-Dutch alliance overcome their initial missteps at Kalka River and Oudenarde. Founders would be wise to embrace speed of action in all they do.

    Facebook and its (now in)famous “move fast, break things” motto is one classic example of how a company can internalize speed as a culture. It leveraged that culture to ship products and features that have kept it a leader in social and AI even in the face of constant competition and threats from well-funded companies like Google, Snapchat, and Bytedance.

    Principle 4: Unconventional teams win

    Another unifying hallmark of the great commanders is that they made unconventional choices with regards to their army composition. Relative to their peers, these commanders tended to build armies that were more diverse in class and nationality. While this required exceptional communication and inspiration skills, it gave the commanders significant advantages:

    1. Ability to recruit in challenging conditions: For many of the commanders, the unconventional team structure was a necessity to build up the forces they needed given logistical / resource constraints while operating in enemy territory.
    2. Operational flexibility from new tactics: Bringing on personnel from different backgrounds let commanders incorporate additional tactics and strategies, creating a more effective and flexible fighting force.

    The Carthaginian general Hannibal Barca, for example, famously fielded a multi-national army consisting of Carthaginians, Libyans, Iberians, Numidians, Balearic soldiers, Gauls, and Italians. This allowed Hannibal to raise an army in hostile territory — after all, waging war in the heart of Italy against Rome made it difficult to get reinforcements from Carthage.

    Illustration of troop types employed in the Second Punic War by Carthage/Hannibal Barca; Image Credit: Travis’s Ancient History

    But it also gave Hannibal’s army flexibility in tactics. Balearic slingers out-ranged the best bows the Romans used at the time. Numidian light cavalry provided Hannibal with fast reconnaissance and a quick way to flank and outmaneuver Roman forces. Gallic and Iberian soldiers provided shock infantry and cavalry. Each of these groups added its own distinctive capabilities to Hannibal’s army and its great victories over Rome.

    The Central Asian conqueror Timur similarly fielded a diverse army which included Mongols, Turks, Persians, Indians, Arabs, and others. This allowed Timur to field larger armies for his campaigns by recruiting from the countries he forced into submission. Like with Hannibal, it also gave Timur’s army access to a diverse set of tactics: war elephants (from India), infantry and siege technology from the Persians, gunpowder from the Ottomans, and more. This combination of operational flexibility and ability to field large armies let Timur build an empire which defeated every major power in Central Asia and the Middle East.

    The Defeat by Timur of the Sultan of Dehli (from the Imperial Library of Emperor Akbar);
    Image credit: Wikimedia

    It should not be a surprise that some of the great commanders were drawn towards assembling unconventional teams, as several of them were ultimately “commoners”. Subutai (the son of a blacksmith whom Genghis Khan took an interest in), Timur (a common thief), and Han Xin (韓信, who famously had to beg for food in his childhood) all came from relatively humble origins. Napoleon, famous for declaring the military “la carrière est ouverte aux talents” (“the career open to the talents”) and creating the first modern order of merit, the Légion d’honneur (open to all, regardless of social class), was similarly motivated by the difficulties he faced in securing promotion early in his career due to his not being from the French nobility.

    But, by embracing more of a meritocracy, Napoleon was ultimately able to field some of the largest European armies in existence as he waged war successfully against every other major European power (at once).

    First Légion d’Honneur Investiture by Jean-Baptiste Debret;
    Image Credit: Wikimedia

    For Startups

    Hiring is one of the key tasks for startup founders. While hiring the people that larger, better-resourced companies also want can be helpful for a startup, it’s important to remember that transformative victories require unconventional approaches. Leaning on unconventional hires may help you avoid a salary bidding war with those deeper-pocketed competitors. Choosing unconventional hires may also add different skills and perspectives to the team.

    In pursuing this strategy, it’s also vital to excel at communication & organization as well as fostering a shared sense of purpose. All teams require strong leadership to be effective but this is especially true with an unconventional team composition facing uphill odds.

    The enterprise API company Zapier is one example of taking an unconventional approach to team construction by having been 100% remote from inception (pre-COVID even). This let the company assemble a team without being confined by location and eliminate the need to spend on unnecessary facilities. They’ve had to invest in norms around documentation and communication to make this work, and, while it’d be too far of a leap to argue all startups should go 100% remote, for Zapier’s market and team culture, it’s worked.

    Principle 5: Pick bold, decisive battles

    When in a challenging environment with limited resources, it’s important to prioritize decisive moves — actions which can result in a huge payoff — over safer, less impactful ones, even if they carry more risk. This is as true for startups, which have limited runway and need to make a big splash in order to fundraise, as it is for military commanders, who need more than just battlefield wins but strategic victories.

    Few understood this as well as the Carthaginian general Hannibal Barca who, in waging the Second Punic War against Rome, crossed the Alps from Spain with his army in 218 BCE (at the age of 29!). Memorialized in many works of art (see below for one by Francisco Goya), this was a dangerous move (one that resulted in the loss of many men and almost his entire troop of war elephants) and was widely considered to be impossible.

    The Victorious Hannibal Seeing Italy from the Alps for the First Time by Francisco Goya in Museo del Prado; Image Credit: Wikimedia

    While history (rightly) remembers Hannibal’s boldness, it’s important to remember that Hannibal’s move was highly calculated. He realized that the Gauls in Northern Italy, who had recently been subjugated by the Romans, were likely to welcome a Roman rival. Through his spies, he also knew that Rome was planning an invasion of Carthage in North Africa. He knew he had little chance to bypass the Roman navy or Roman defensive placements if he invaded in another way.

    And Hannibal’s bet paid off! Caught entirely by surprise, the Romans cancelled their planned invasion of Africa, and Hannibal lined up many Gallic allies to his cause. Within two years of his entry into Italy, Hannibal trounced the Roman armies sent to battle him at the River Ticinus, at the River Trebia, and at Lake Trasimene. Shocked by their losses, the Romans elected two consuls with the mandate to battle Hannibal and stop him once and for all.

    Knowing this, Hannibal seized a supply depot at the town of Cannae, presenting a tempting target to the Roman consuls to prove themselves. They (foolishly) took the bait. Despite fielding over 80,000 soldiers against Hannibal’s 50,000, Hannibal successfully executed a legendary double-envelopment maneuver (see below) and slaughtered almost the entire Roman force that met him in battle.

    Hannibal’s double envelopment of Roman forces at Cannae;
    Image Credit: Wikimedia

    To put this into perspective, in the 2 years after Hannibal crossed the Alps, Hannibal’s army killed 20% of all male Romans over the age of 17 (including at least 80 Roman Senators and one previous consul). Cannae is today considered one of the greatest examples of military tactical brilliance, and, as historian Will Durant wrote, “a supreme example of generalship, never bettered in history”.

    Cannae was a great example of Hannibal’s ability to pick a decisive battle with favorable odds. Hannibal knew that his only chance was to encourage the city-states of Italy to side with him. He knew the Romans had just elected consuls itching for a fight. He chose the field of battle by seizing a vital supply depot at Cannae. Considering the Carthaginians had started and pulled back from several skirmishes with the Romans in the days leading up to the battle, it’s clear Hannibal also chose when to fight, knowing full well the Romans outnumbered him. After Cannae, many Italian city-states and the kingdom of Macedon sided with Carthage. That Carthage ultimately lost the Second Punic War is a testament more to Rome’s indomitable spirit and the sheer odds Hannibal faced than any indication of Hannibal’s skills.

    In the Far East, about a decade later, the brilliant Chinese military commander Han Xin (韓信) was laying the groundwork for the creation of the Han Dynasty (漢朝) in a China-wide civil war known as the Chu-Han Contention between the State of Chu (楚) and the State of Han (漢), led by Liu Bang (劉邦, who would become the founding emperor Gaozu 高祖 of the Han Dynasty 漢朝).

    Under the leadership of Han Xin (韓信), the State of Han (漢) won many victories over their neighbors. Overconfident from those victories, his king Liu Bang (劉邦) led a Han (漢) coalition to a catastrophic defeat when he briefly captured but then lost the Chu (楚) capital of Pengcheng (彭城) in 205 BCE. Chu forces (楚) were even able to capture the king’s father and wife as hostages, and several Han (漢) coalition states switched their loyalty to the Chu (楚).

    Map of the 18 states that existed at the start of the Chu-Han Contention, the two sides being the Han (in light purple on the Southwest) and the Chu (in green on the East); Image Credit: Wikimedia

    To fix his king’s blunder, Han Xin (韓信) tasked the main Han (漢) army with setting up fortified positions in the Central Plain, drawing Chu (楚) forces there. Han Xin (韓信) would himself take a smaller force of less experienced soldiers to attack rival states in the North to rebuild the Han (漢) military position.

    After successfully subjugating the State of Wei (魏), Han Xin (韓信)’s forces moved to attack the State of Zhao (趙, also called Dai 代) through the Jingxing Pass (井陘關) in late 205 BCE. The Zhao (趙) forces, which outnumbered Han Xin (韓信)’s, encamped on the plain just outside the pass to meet them.

    Sensing an opportunity to deal a decisive blow to the overconfident Zhao (趙), Han Xin (韓信) ordered a cavalry unit to sneak into the mountains behind the Zhao (趙) camp and to remain hidden until battle started. He then ordered half of his remaining army to position themselves in full view of the Zhao (趙) forces with their backs to the Tao River (洮水), something Sun Tzu’s Art of War (孫子兵法) explicitly advises against (due to the inability to retreat). This “error” likely reinforced the Zhao (趙) commander’s overconfidence, as he made no move to pre-emptively flank or deny the Han (漢) forces their encampment.

    Han Xin (韓信) then deployed his full army which lured the Zhao (趙) forces out of their camp to counterattack. Because the Tao River (洮水) cut off all avenues of escape, the outnumbered Han (漢) forces had no choice but to dig in and fight for their lives, just barely holding the Zhao (趙) forces at bay. By luring the enemy out for what appeared to be “an easy victory”, Han Xin (韓信) created an opportunity for his hidden cavalry unit to capture the enemy Zhao (趙) camp, replacing their banners with those of the Han (漢). The Zhao (趙) army saw this when they regrouped, which resulted in widespread panic as the Zhao (趙) army concluded they must be surrounded by a superior force. The opposition’s morale in shambles, Han Xin (韓信) ordered a counter-attack and the Zhao (趙) army crumbled, resulting in the deaths of the Zhao (趙) commander and king!

    Han Xin (韓信) bet his entire outnumbered command on a deception tactic based on little more than an understanding of his army’s and the enemy’s psychology. He won a decisive victory which helped reverse the tide of the war. The State of Zhao (趙) fell, and the State of Jiujiang (九江) and the State of Yan (燕) switched allegiances to the Han (漢). This battle even inspired a Chinese expression “fighting a battle with one’s back facing a river” (背水一戰) to describe fighting for survival in a “last stand”.

    Caesar crosses the Rubicon by Bartolomeo Pinelli; Image Credit: Wikimedia

    Roughly a century later, on the other side of the world, the Roman statesman and military commander Julius Caesar made a career of turning bold, decisive bets into personal glory. After Caesar conquered Gaul, Caesar’s political rivals led by Gnaeus Pompeius Magnus (Pompey the Great), a famed military commander, demanded Caesar return to Rome and give up his command. Caesar refused and crossed the Rubicon, a river marking the boundary of Italy, in January 49 BCE starting a Roman Civil War and coining at least two famous expressions (including alea iacta est – “the die is cast”) for “the point of no return”.

    This bold move came as a complete shock to the Roman elite. Pompey and his supporters fled Rome. Taking advantage of this, Caesar captured Italy without much bloodshed. Caesar then pursued Pompey to Macedon, seeking a decisive land battle which Pompey, wisely, given his broad network of allies and command of the Roman navy, refused to give him. Instead, Caesar tried and failed to besiege Pompey at Dyrrhachium, a failure which forced Caesar to retreat into Greece.

    Pompey’s supporters, however, lacked Pompey’s patience (and judgement). Overconfident from their naval strength, numerical advantage, and Caesar’s failure at Dyrrhachium, they pressured Pompey into a battle with Caesar who was elated at the opportunity. In the summer of 48 BCE, the two sides met at the Battle of Pharsalus.

    The initial battle formations at the Battle of Pharsalus; Image Credit: Wikimedia

    Always cautious, Pompey took up a position on a mountain and oriented his forces such that his larger cavalry wing would have the ability to overpower Caesar’s cavalry and then flank Caesar’s forces, while his numerically superior infantry would be arranged deeper to smash through or at least hold back Caesar’s lines.

    Caesar made a bold tactical choice when he saw Pompey’s formation. He thinned his (already outnumbered) lines to create a 4th reserve line of veterans which he positioned behind his cavalry at an angle (see battle formation above).

    Caesar initiated the battle and attacked with two of his infantry lines. As Caesar expected, Pompey ordered a cavalry charge which soon forced back Caesar’s outnumbered cavalry. But Pompey’s cavalry then encountered Caesar’s 4th reserve line, which had been instructed to use their javelins to stab at the faces of Pompey’s cavalrymen like bayonets. Pompey’s cavalry, while larger in size, was made up of relatively inexperienced soldiers, and the shock of the attack caused them to panic. This let Caesar’s cavalry regroup and, with the 4th reserve line, swing around Pompey’s army, completing an expert flanking maneuver. Pompey’s army, now surrounded, collapsed once Caesar sent his final reserve line into battle.

    Caesar’s boldness and speed of action let him take advantage of a lapse in Pompey’s judgement. Seeing a rare opportunity to win a decisive battle, Caesar was even willing to risk a disadvantage in infantry, cavalry, and position (Pompey’s army had the high ground and had forced Caesar to march to him). But this strategic and tactical gamble (thinning his lines to counter Pompey’s cavalry charge) paid off as Pharsalus shattered the myth of Pompey’s inevitability. Afterwards, Pompey’s remaining allies fled or defected to Caesar, and Pompey himself fled to Egypt where he was assassinated (by a government wishing to win favor with Caesar). And, all of this — from Gaul to crossing the Rubicon to the Civil War — paved the way for Caesar to become the undisputed master of Rome.

    For Startups

    Founders need to take bold, oftentimes uncomfortable bets that have large payoffs. While a large company can take its time winning a war of attrition, startups need to score decisive wins quickly in order to attract talent, win deals, and shift markets towards them. Only taking the “safe and rational” path is a failure to recognize the opportunity cost when operating with limited resources.

    In other words, founders need to find their own Alps / Rubicons to cross.

    In the startup world, few moves are as bold (while also uncomfortable and risky) as big pivots. But, there are examples of incredible successes like Slack that were able to make this work. In Slack’s case, the game they originally developed ended up a flop, but CEO & founder Stewart Butterfield felt the messaging product they had built to support the game development had potential. Leaning on that insight, over the skepticism of much of his team and some high profile investors, Butterfield made a bet-the-company move similar to Han Xin (韓信) digging in with no retreat which created a seminal product in the enterprise software space.

    Summary

    I hope I’ve been able to show that history’s greatest military commanders can offer valuable lessons on leadership and strategy for startup founders.

    The five principles derived from studying some of the commanders’ campaigns – the importance of getting in the trenches, achieving tactical superiority, moving fast, building unconventional teams, and picking bold, decisive battles – played a key role in the commanders’ success and generalize well to startup execution.

    After all, what is a more successful founder than one who can recruit teams despite resource constraints (unconventional teams), inspire them (by getting in the trenches alongside them), and move with speed & urgency (move fast) to take a competitive edge (achieve tactical superiority) and apply it where there is the greatest chance of a huge impact on the market (pick bold, decisive battles)?

  • Making a Movie to Make Better Video Encoding

    Until I read this Verge article, I had assumed that video codecs were a boring affair. In my mind, every few years, the industry would get together and come up with a new standard that promised better compression and better quality for the prevailing formats and screen types and, after some patent licensing back and forth, everyone would standardize around yet another MPEG standard. Rinse and repeat.

    The article was an eye-opening look at how video streamers like Netflix are pushing the envelope on video codecs. Since one of a video streamer’s core costs is video bandwidth, it makes sense that they would embrace new compression approaches (like different kinds of compression for different content) to reduce those costs. As Netflix embraces more live streaming content, it seems they’ll need to create new methods to accommodate it.

    But what jumped out to me the most was that, in order to better test and develop the next generation of codecs, they produced a real 12-minute noir film called Meridian (you can access it on Netflix; below is someone who uploaded it to YouTube) which presents scenes that have historically been more difficult to encode with conventional video codecs (extreme lights and shadows, cigar smoke and water, rapidly changing light balance, etc.).

    Absolutely wild.


  • Games versus Points

    The Dartmouth College Class of 2024, for their graduation, got a very special commencement address from tennis legend Roger Federer.

    There is a wealth of good advice in it, but the most interesting point that jumped out to me is that while Federer won a whopping 80% of the matches he played in his career, he only won 54% of the points. It underscores the importance of letting go of small failures (“When you lose every second point, on average, you learn not to dwell on every shot”) but also of keeping your eye on the right metric (games, not points).


  • Biopharma scrambling to handle Biosecure Act

    Strong regional industrial ecosystems like Silicon Valley (tech), Boston (life science), and Taiwan (semiconductors) are fascinating. Their creation is rare and requires local talent, easy access to supply chains and distribution, academic & government support, business success, and a good amount of luck.

    But, once set in place, they can be remarkably difficult to unseat. Take the semiconductor industry as an example. Its geopolitical importance has directed billions of dollars towards re-creating a domestic US industry. But it faces an uphill climb. After all, it’s not only a question of recreating the semiconductor manufacturing factories that have gone overseas, but also:

    • the advanced and low-cost packaging technologies and vendors that are largely based in Asia
    • the engineering and technician talent that is no longer really in the US
    • the ecosystem of contractors and service firms that know exactly how to maintain the facilities and equipment
    • the supply chain for advanced chemicals and specialized parts that make the process technology work
    • the board manufacturers and ODMs/EMSs who do much of the actual work post-chip production that are also concentrated in Asia

    A similar thing has happened in the life sciences CDMO (contract development and manufacturing organization) space. In much the same way that Western companies largely outsourced semiconductor manufacturing to Asia, Western biopharma companies outsourced much of their core drug R&D and manufacturing to Chinese companies like WuXi AppTec and WuXi Biologics. This has resulted in a concentration of talent and an ecosystem of suppliers there that would be difficult to supplant.

    Enter the BIOSECURE Act, a bill being discussed in the House with a strong possibility of becoming law. It prohibits the US government from working with companies that obtain technology from Chinese biotechnology companies of concern (including WuXi AppTec and WuXi Biologics, among others). This is causing the biopharma industry significant anxiety, as it is forced to find (and potentially fund) an alternative CDMO ecosystem that does not currently exist at WuXi’s level of scale and quality.


  • Freedom and Prosperity Under Xi Jinping

    Fascinating chart from Bloomberg showing level of economic freedom and prosperity under different Chinese rulers and how Xi Jinping is the first Chinese Communist Party ruler in history to have presided over sharp declines in both freedom and prosperity.

    Given China’s rising influence in economic and geopolitical affairs, how its leaders (and in particular, Xi) and its people react to this will have significant impacts on the world.



    ‘Are You Better Off?’ Asking Reagan’s Question in Xi’s China
    Rebecca Choong Wilkins and Tom Orlik | Bloomberg

  • My Two-Year Journey to Home Electrification

    Summary

    • Electrifying our (Bay Area) home was a complex and drawn-out process, taking almost two years.
    • Installing solar panels and storage was particularly challenging, involving numerous hurdles and unexpected setbacks.
    • We worked with a large solar installer (Sunrun) and, while the individuals we worked with were highly competent, handoffs within Sunrun and with other entities (like local utility PG&E and the local municipality) caused significant delays.
    • While installing the heat pumps, smart electric panel, and EV charger was more straightforward, these projects also featured greater complexity than we expected.
    • The project resulted in significant quality-of-life improvements around home automation and comfort. However, bad pricing dynamics between electricity and natural gas meant direct cost savings from electrifying gas loads are, at best, small. While solar is an economic slam-dunk (especially given the rising PG&E rates our home sees), the batteries, setting aside the value of backup power, have less obvious economic value.
    • Our experience underscored the need for the industry to adopt a more holistic approach to electrification and for policymakers to make the process more accessible for all homeowners to achieve the state’s ambitious goals.

    Why

    The decision to electrify our home was an easy one. From my years of investing in & following climate technologies, I knew that the core technologies were reliable and relatively inexpensive. As parents of young children, my wife and I were also determined to contribute positively to the environment. We also knew there was abundant financial support from local governments and utilities to help make this all work.

    Yet, as we soon discovered, what we expected to be a straightforward path turned into a nearly two-year process!

    Even for a highly motivated household which had budgeted significant sums for it all, it was still shocking how long (and how much money) it took. It made me skeptical that households across California would be able to do the same to meet California’s climate goals without additional policy changes and financial support.

    The Plan

    Two years ago, we set out a plan:

    1. Smart electrical panel —  From my prior experience, I knew that many home electrification projects required a main electrical panel upgrade. These were typically costly and left you at the mercy of the utility to actually carry them out (I would find out how true this was later!). Our home had an older main panel rated for 125 A, and we suspected we would need a main panel upgrade to add all the electrical loads we were considering.

      To try to get around this, we decided to get a smart electrical panel which could:
      • use software smarts to deal with the times where peak electrical load got high enough to need the entire capacity of the electrical line
      • give us the ability to intelligently manage backups and track solar production

      In doing our research, Span seemed like the clear winner. They were the most prominent company in the space and had the slickest looking device and app (many of their team had come from Tesla). They also had an EV charger product we were interested in, the Span Drive.
    2. Heat pumps — To electrify is to ditch natural gas. As the bulk of our gas consumption was heating air and water, this involved replacing our gas furnace and gas water heater with heat pumps. In addition to significant energy savings — heat pumps are famous for their >200% efficiency (as they move heat rather than “create” it like gas furnaces do; a rough comparison of the two follows after this list) — heat pumps would also let us add air conditioning (just run the heat pump in reverse!) and improve our air quality (from not combusting natural gas indoors). We found a highly rated Bay Area HVAC installer who specializes in these types of energy efficiency projects (called Building Efficiency) and trusted that they would pick the right heat pumps for us.
    3. Solar and Batteries — No electrification plan is complete without solar. Our goal was to generate as much clean electricity as possible to power our new electric loads. We also wanted energy storage for backup power during outages (something that, while rare, we seemed to run into every year) and to take advantage of time-of-use rates (by storing solar energy when the price of electricity is low and then using it when the price is high).

      We looked at a number of solar installers and ultimately chose Sunrun. A friend of ours worked there at the time and spoke highly of a prepaid lease they offered that was vastly cheaper all-in than every alternative. It offered minimum energy production guarantees, came with a solid warranty, and the “peace of mind” that the installation would be done with one of the largest and most reputable companies in the solar industry.
    4. EV Charger — Finally, with our plan to buy an electric vehicle, installing a home charger at the end of the electrification project was a simple decision. This would allow us to conveniently charge the car at home, and, with solar & storage, hopefully let us “fuel up” more cost effectively. Here, we decided to go with the Span Drive. Its winning feature was the ability to provide Level 2 charging speeds without a panel upgrade (it does this by ramping charging speeds up or down depending on how much electricity the rest of the house needed). While pricey, the direct integration into our Span smart panel (and its app) and the ability to hit high charging rates without a panel upgrade felt like the smart path forward.
    5. What We Left Out — There were two appliances we decided to defer “fully going green” on.

      The first was our gas stove (with electric oven). While induction stoves have significant advantages, because our current stove is still relatively new, works well, uses relatively little gas, and an upgrade would have required additional electrical work (installing a 240 V outlet), we decided to keep our current stove and consider a replacement at its end of life.

      The second was our electric resistive dryer. While heat pump dryers would certainly save us a great deal of electricity, the existing heat pump dryers on the market have much smaller capacities than traditional resistive dryers, which would have meant additional loads of drying for our family of four. As our current dryer was also only a few years old, and already running on electricity, we decided we would consider a heat pump dryer only at its end of life.
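
      As promised in the heat pump item above, here is a minimal back-of-the-envelope sketch, in Python, of why “moving” heat beats making it. The efficiency figures (a roughly 90%-efficient gas furnace and a heat pump coefficient of performance of about 3) are illustrative assumptions, not specs from our actual equipment.

      # Rough comparison of a gas furnace vs. a heat pump delivering the same heat.
      # All figures are illustrative assumptions, not measurements from our install.
      FURNACE_EFFICIENCY = 0.90   # a modern gas furnace turns ~90% of fuel energy into heat
      HEAT_PUMP_COP = 3.0         # ~3 units of heat delivered per unit of electricity consumed
      KWH_PER_THERM = 29.3        # approximate energy content of one therm of natural gas

      def furnace_input_therms(heat_needed_kwh: float) -> float:
          """Therms of gas burned to deliver the requested heat."""
          return heat_needed_kwh / FURNACE_EFFICIENCY / KWH_PER_THERM

      def heat_pump_input_kwh(heat_needed_kwh: float) -> float:
          """Electricity consumed to deliver the requested heat."""
          return heat_needed_kwh / HEAT_PUMP_COP

      heat_needed = 100.0  # kWh of heat on a hypothetical cold day
      print(f"Furnace: {furnace_input_therms(heat_needed):.2f} therms of gas")
      print(f"Heat pump: {heat_pump_input_kwh(heat_needed):.1f} kWh of electricity")

      For the same 100 kWh of delivered heat, the furnace burns about 3.8 therms (roughly 111 kWh worth of gas) while the heat pump draws about 33 kWh of electricity, which is what “>200% efficiency” means in practice.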

    With what we thought was a well-considered plan, we set out and lined up contractors.

    But as Mike Tyson put it, “Everyone has a plan ’till they get punched in the face.”

    The Actual Timeline

    Smart Panel

    The smart panel installation was one of the more straightforward parts of our electrification journey. Span connected us with a local electrician who quickly assessed our site, provided an estimate, and completed the installation in a single day. However, getting the permits to pass inspection was a different story.

    We failed the first inspection due to a disagreement over the code between the electrician and the city inspector. This issue nearly turned into a billing dispute with the electrician, who wanted us to cover the extra work needed to meet the code (an unexpected cost). Fortunately, after a few adjustments and a second inspection, we passed.

    The ability to control and monitor electric flows with the smart panel is incredibly cool. For the first few days, I checked the charts in the apps every few minutes tracking our energy use while running different appliances. It was eye-opening to see just how much power small, common household items like a microwave or an electric kettle could draw!

    However, the true value of a smart panel is only achieved when it’s integrated with batteries or significant electric loads that necessitate managing peak demand. Without these, the monitoring and control benefits are more novelties and might not justify the cost.
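
    To make the “managing peak demand” idea concrete, here is a toy sketch of the kind of logic a smart panel can apply to stay under a 125 A service: throttle deferrable loads (like EV charging) when fixed loads spike. To be clear, this is not Span’s actual algorithm, and the numbers are hypothetical; it just illustrates the concept.

    # Toy sketch of smart-panel peak-load management: keep total draw under the
    # service limit by throttling deferrable loads. Not Span's actual algorithm;
    # all numbers are hypothetical.
    SERVICE_LIMIT_AMPS = 125.0    # our older main panel's rating
    SAFETY_MARGIN_AMPS = 10.0     # assumed headroom to always leave unused

    def flexible_budget(fixed_load_amps: float) -> float:
        """Amps left for deferrable loads after fixed loads and a safety margin."""
        return max(0.0, SERVICE_LIMIT_AMPS - SAFETY_MARGIN_AMPS - fixed_load_amps)

    fixed_load = 62.0             # hypothetical snapshot: oven + dryer + misc.
    ev_charger_request = 40.0     # Level 2 charging asking for 40 A

    budget = flexible_budget(fixed_load)
    ev_charge_rate = min(ev_charger_request, budget)
    print(f"EV charging throttled to {ev_charge_rate:.0f} A (budget: {budget:.0f} A)")

    With fixed loads at 62 A the charger gets its full 40 A; if fixed loads climbed to, say, 90 A, the same logic would throttle charging down to 25 A instead of tripping the main breaker.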

    Note: if you, like us, use Pihole to block tracking ads, you’ll need to disable it for the Span app. The app uses some sort of tracker that Pihole flags by default. It’s an inconvenience, but worth mentioning for anyone considering this path.

    Heating

    Building Efficiency performed an initial assessment of our heating and cooling needs. We had naively assumed they’d be able to do a simple drop-in replacement for our aging gas furnace and water heater. While the water heater was a straightforward replacement (with a larger tank), the furnace posed more challenges.

    Initially, they proposed multiple mini-splits to provide zoned control, as they felt the crawlspace area where the gas furnace resided was too small for a properly sized heat pump. Not liking the aesthetics of mini-splits, we requested a proposal involving two central heat pump systems instead.

    Additionally, during the assessment, they found some of our old vents, in particular the ones sending air to our kids’ rooms, were poorly insulated and too small (which explains why their rooms always seemed under-heated in the winter). To fix this, they had to cut a new hole through our garage concrete floor (!!) to run a larger, better-insulated vent from our crawlspace. They also added insulation to the walls of our kids’ rooms to improve our home’s ability to maintain a comfortable temperature (but which required additional furniture movement, drywall work, and a re-paint).

    Building Efficiency spec’d an Ecobee thermostat to control the two central heat pumps. As we already had a Nest Learning Thermostat (with Nest temperature sensors covering rooms far from the thermostat), we wanted to keep our temperature control in the Nest app. At the time, we had gotten a free thermostat from Nest after signing with Sunrun. We realized later that what Sunrun had gifted us was the cheaper (and less attractive) Nest Thermostat, which doesn’t support Nest temperature sensors (why?), so we had to buy our own Nest Learning Thermostat to complete the setup.

    Despite some of these unforeseen complexities, the whole process went relatively smoothly. There were a few months of planning and scheduling, but the actual installation was completed in about a week. It was a very noisy (cutting a hole through concrete is not quiet!) and chaotic week, but, the process was quick, and the city inspection was painless.

    Solar & Storage

    The installation of solar panels and battery storage was a lengthy ordeal. Sunrun proposed a system with LONGI solar panels, two Tesla Powerwalls, a SolarEdge inverter, and a Tesla gateway. Despite the simplicity of the plan, we encountered several complications right away.

    First, a main panel upgrade was required. Although we had installed the Span smart panel to avoid this, Sunrun insisted on the upgrade and offered to cover the cost. Our utility PG&E took over a year (!!) to approve our request, which started a domino of delays.

    After PG&E’s approval, Sunrun discovered that local ordinances required a concrete pad to be poured and a safety fence erected around the panel, requiring a subcontractor and yet more coordination.

    After the concrete pad was in place and the panel installed, we faced another wait for PG&E to connect the new setup. Ironically, during this wait, I received a request from Sunrun to pour another concrete pad. This was, thankfully, a false alarm and occurred because the concrete pad / safety fence work had not been logged in Sunrun’s tracking system!

    The solar and storage installation itself took only a few days, but during commissioning, a technician found that half the panels weren’t connected properly, necessitating yet another visit before Sunrun could request an inspection from the city.

    Sadly, we failed our first city inspection. Sunrun’s team had missed a local ordinance that required the Powerwalls to have a minimum distance between them and the sealing off of vents within a certain distance from each Powerwall. This necessitated yet another visit from Sunrun’s crew, and another city inspection (which we thankfully passed).

    The final step was obtaining Permission to Operate (PTO) from PG&E. The application for this was delayed due to a clerical error. About four weeks after submission, we finally received approval.

    Seeing the flow of solar electricity in my Span app (below) almost brought a tear to my eye. Finally!

    EV Charger

    When my wife bought a Nissan Ariya in early 2023, it came with a year of free charging with EVgo. We hoped this would allow us enough time to install solar before needing our own EV charger. However, the solar installation took longer than expected (by over a year!), so we had to expedite the installation of a home charger.

    Span connected us with the same electrician who installed our smart panel. Within two weeks of our free charging plan expiring, the Span Drive was installed. The process was straightforward, with only two notable complications we had to deal with:

    1. The 20 ft cable on the Span Drive sounds longer than it is in practice. We adjusted our preferred installation location to ensure it comfortably reached the Ariya’s charging port.
    2. The Span software initially didn’t recognize the Span Drive after installation. This required escalated support from Span to reset the software, and the poor electrician, who had expected commissioning to be a few-minute affair, ended up sticking around my home for several hours.

    Result

    So, “was it worth it?” Yes! There are significant environmental benefits (our carbon footprint is meaningfully lower). But there were also quality of life improvements and financial gains from these investments in what are just fundamentally better appliances.

    Quality of Life

    Our programmable, internet-connected water heater allows us to adjust settings for vacations, saving energy and money effortlessly. It also lets us program temperature cycles to avoid peak energy pricing, heating water before peak rates hit.

    With the new heat pumps, our home now has air conditioning, which is becoming increasingly necessary in the Bay Area’s warmer summers. Improved vents and insulation have also made our home (and, in particular, our kids’ rooms) more comfortable. We’ve also found that the heat from the heat pumps is more even and less drying compared to the old gas furnace, which created noticeable hot spots.

    Backup power during outages is another significant benefit. Though we haven’t had to use it since we received permission to operate, we had an accidental trial run early on when a Sunrun technician left our batteries enabled for a few days in the winter. During two subsequent outages in the ensuing months, our system maintained power to our essential appliances, ensuring our kids didn’t even notice the disruptions!

    The EV charger has also been a welcome change. While free public charging was initially helpful, reliably finding working and available fast chargers could be time-consuming and stressful. Now, charging at home is convenient and cost-effective, reducing stress and uncertainty.

    Financial

    There are two financial aspects to consider: the cost savings from replacing gas-powered appliances with electric ones and the savings from solar and storage.

    On the first, the answer is not promising.

    The chart below comes from our PG&E bill for Jan 2023. It shows our energy usage year-over-year. After installing the heat pumps in late October 2022, our natural gas consumption dropped by over 98% (from 5.86 therms/day to 0.10), while our electricity usage more than tripled (from 15.90 kWh/day to 50.20 kWh/day). Applying the conversion of 1 natural gas therm = ~29 kWh of energy shows that our total energy consumption decreased by over 70%, a testament to the much higher efficiency of heat pumps.

    Our PG&E bill from Feb 2023 (for Jan 2023)
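
    For anyone who wants to check the arithmetic, here is the same calculation in a few lines of Python, using the per-day figures quoted above and the approximate 29 kWh-per-therm conversion.

    # Reproducing the back-of-the-envelope energy math from the bill above.
    KWH_PER_THERM = 29  # approximate energy content of one therm of natural gas

    def total_kwh_per_day(gas_therms: float, electricity_kwh: float) -> float:
        """Combine daily gas and electricity usage into one energy figure."""
        return gas_therms * KWH_PER_THERM + electricity_kwh

    before = total_kwh_per_day(5.86, 15.90)  # year-ago period, before the heat pumps
    after = total_kwh_per_day(0.10, 50.20)   # Jan 2023, after the heat pumps

    print(f"Before: {before:.0f} kWh/day, after: {after:.0f} kWh/day")
    print(f"Reduction: {1 - after / before:.0%}")  # roughly a 71% drop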

    Surprisingly, however, our energy bills remained almost unchanged despite this! The graph below shows our PG&E bills over the 12 months ending in Jan 2023. Despite a 70% reduction in energy consumption, the bill stayed roughly the same. This is due to the significantly lower cost of gas in California compared to the equivalent amount of energy from electricity. It highlights a major policy failing in California: high electricity costs (relative to gas) will deter households from switching to greener options.

    Our PG&E bill from Feb 2023 (for Jan 2023)

    Solar, however, is a clear financial winner. With our prepaid lease, we’d locked in a per-kWh cost below 2022 rates (computed simply by dividing the total prepaid lease amount by the expected energy production over the lifetime of the lease), and these savings have only increased as PG&E’s rates have risen (see chart below).

    PG&E Rates 2022 vs 2024 (Source: PG&E; Google Sheet)
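
    To make the lease math concrete, here is a small sketch. Every number below is a made-up placeholder rather than our actual contract terms or PG&E’s actual rates; the point is just the structure of the comparison.

    # Illustrative prepaid-lease math. All numbers are hypothetical placeholders,
    # not our actual contract terms or PG&E's actual rates.
    prepaid_lease_cost = 30_000.0        # total paid up front
    guaranteed_kwh_per_year = 9_000.0    # minimum production guarantee
    lease_years = 25

    effective_solar_rate = prepaid_lease_cost / (guaranteed_kwh_per_year * lease_years)
    print(f"Effective solar cost: ${effective_solar_rate:.3f}/kWh")

    for year, utility_rate in [("2022", 0.30), ("2024", 0.40)]:
        print(f"Savings vs {year} utility rate: ${utility_rate - effective_solar_rate:.3f}/kWh")

    As the utility rate climbs, the per-kWh gap (and therefore the savings) only widens, which is exactly the dynamic the rate chart above shows.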

    Batteries, on the other hand, are much less clear-cut financially due to their high initial cost and only modest savings from time-shifting electricity use. However, the peace of mind from having backup power during outages is valuable (not to mention the fact that, without a battery, solar panels can’t be used to power your home during an outage), and, with climate change likely to increase both peak/off-peak rate disparities and the frequency of outages, we believe this investment will pay off in the long run.

    Taking Advantage of Time of Use Rates

    Time of Use (TOU) rates, like PG&E’s electric vehicle time of use rates, offer a smart way to reduce electricity costs for homes with solar panels, energy storage, and smart automation. This approach has fundamentally changed how we manage home energy use. Instead of merely conserving energy by using efficient appliances or turning off devices when not needed, we now view our home as a giant configurable battery: we “save” energy when it’s cheap and use it when it’s expensive (a simplified scheduling sketch follows the list below).

    • Backup Reserve: We’ve set our Tesla Powerwall to maintain a 25% reserve. This ensures we always have a good supply of backup power for essential appliances (roughly 20 hours for our highest priority circuits by the Span app’s latest estimates) during outages
    • Summer Strategy: During summer, our Powerwall operates in “Self Power” mode, meaning solar energy powers our home first, then charges the battery, and lastly any excess goes to the grid. This maximizes the use of our “free” solar energy. We also schedule our heat pumps to run during midday when solar production peaks and TOU rates are lower. This way, we “store” cheaper energy in the form of pre-chilled or pre-heated air and water which helps maintain the right temperatures for us later (when the energy is more expensive).
    • Winter Strategy: In winter, we will switch the Powerwall to “Time-Based Control.” This setting preferentially charges the battery when electricity is cheap and discharges it when prices are high, maximizing the financial value of our solar energy during the months where solar production is likely to be limited.
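
    Here is the simplified scheduling sketch mentioned above. The rate windows, prices, and decision rules are all illustrative assumptions; they are not PG&E’s actual tariff or the Powerwall’s real control logic, but they capture how we think about shifting flexible loads into cheap hours.

    from dataclasses import dataclass

    # Simplified sketch of the "home as a configurable battery" idea under a
    # time-of-use plan. Rates, windows, and rules are illustrative assumptions.
    @dataclass
    class TouWindow:
        name: str
        start_hour: int       # inclusive
        end_hour: int         # exclusive
        price_per_kwh: float

    SCHEDULE = [              # hypothetical rate schedule
        TouWindow("off-peak", 0, 15, 0.30),
        TouWindow("peak", 15, 21, 0.55),
        TouWindow("off-peak", 21, 24, 0.30),
    ]

    def window_for(hour: int) -> TouWindow:
        return next(w for w in SCHEDULE if w.start_hour <= hour < w.end_hour)

    def plan_hour(hour: int, solar_kw: float, battery_soc: float) -> str:
        """Decide what to do with flexible loads (heat pumps, battery) this hour."""
        if window_for(hour).name == "peak":
            # Expensive hours: run off the battery (down to the 25% reserve), defer heat pump runs
            return "discharge battery, defer heat pump runs" if battery_soc > 0.25 else "minimize loads"
        # Cheap hours: pre-heat/cool and charge the battery, ideally from solar
        return "run heat pumps, charge battery" + (" from solar" if solar_kw > 0 else " from grid")

    for hour, solar in [(11, 3.0), (18, 0.0)]:
        print(f"{hour}:00 -> {plan_hour(hour, solar_kw=solar, battery_soc=0.60)}")

    In practice the Powerwall’s “Self Power” and “Time-Based Control” modes make these decisions for us; the sketch just shows the shape of the tradeoff we are trying to automate.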

    This year will be our first full cycle with all systems in place, and we expect to make adjustments as rates and energy usage evolve. For those considering home electrification, hopefully these strategies hint at what’s possible to improve the economic value of your setup.

    Takeaways

    • Two years is too long: The average household might not have started this journey if they knew the extent of time and effort involved. This doesn’t even consider the amount of carbon emissions from running appliances off grid energy due to the delays. Streamlining the process is essential to make electrification more accessible and appealing.
    • Align gas and electricity prices with climate goals: The current pricing dynamics make it financially challenging for households to switch from gas appliances to greener options like heat pumps. To achieve California’s ambitious climate goals, it’s crucial to align the relative costs of electricity and gas with the state’s electrification push.
    • Streamline permitting: Electrification projects are slowed by complex, inconsistent permitting requirements across different jurisdictions. Simplifying and unifying these processes will reduce time and costs for homeowners and their contractors.
    • Accelerate utility approvals: The two-year timeframe was largely due to delays from our local utility, PG&E. As utilities lack incentives to expedite these processes, regulators should build in ways to encourage utilities to move faster on home electrification-related approvals and activities, especially as many homes will likely need main panel upgrades to properly electrify.
    • Improve financing accessibility: High upfront costs make it difficult for households to adopt electrification, even when there are significant long-term savings. Expanding financing options (like Sunrun’s leases) can encourage more households to invest in these technologies. Policy changes should be implemented so that even smaller installers have the ability to offer attractive financing options to their clients.
    • Break down electrification silos: Coordination between HVAC specialists, solar installers, electricians, and smart home companies is sorely missing today. As a knowledgeable early adopter, I managed to integrate these systems on my own, but this shouldn’t be the expectation if we want broad adoption of electrification. The industry (in concert with policymakers) should make it easier for different vendors to coordinate and for the systems to interoperate more easily in order to help homeowners take full advantage of the technology.

    This long journey highlighted to me, in a very visceral way, both the rewards and practical challenges of home electrification. While the environmental, financial, and quality-of-life benefits are clear, it’s also clear that we have a ways to go on the policy and practical hurdles before electrification becomes an easy choice for many more households. I only hope policymakers and technologists are paying attention. Our world can’t wait much longer.

  • How the Jones Act makes energy more expensive and less green

    The Merchant Marine Act of 1920 (aka “The Jones Act”) is a law which requires ships operating between US ports to be owned by, made in, and crewed by US citizens.

    While many “Made in the USA” laws are on the books and attract the anger of economists and policy wonks, the Jones Act is particularly egregious because its costs and effects are so large. The Jones Act imposes dramatic costs on states like Hawaii and Alaska and territories like Puerto Rico, which rely heavily on ships for basic commerce. The distortion is so large that it was actually cheaper for Hawaii and New England to import oil from other countries (as Hawaii did from Russia until the Ukraine war) than to have oil shipped from the Gulf of Mexico (where American oil is abundant).

    In the case of offshore wind, the Jones Act has pushed the companies willing to experiment with the promising technology to ship the required parts and equipment from overseas, because there are no Jones Act-compliant vessels capable of moving the massive equipment involved.

    This piece from Canary Media captures some of these dynamics, as well as the “launch” of the Charybdis, the still-under-construction $625 million Jones Act-compliant ship that Dominion Energy will use to support its offshore wind facility.


  • Backup Your Home Server with Duplicati

    (Note: this is part of my ongoing series on cheap self-hosting)

    Through some readily available Docker containers and OpenMediaVault, I have a cheap mini-PC that handles a growing number of jobs around the house (covered in my earlier posts in this series).

    But, over time, as the server has picked up more uses, it’s also become a vulnerability. If any of the drives on my machine ever fail, I’ll lose data that is personally (and sometimes economically) significant.

    I needed a home server backup plan.

    Duplicati

    Duplicati is open source software that helps you efficiently and securely back up specific partitions and folders to almost any destination. That destination could be another home server or a cloud storage provider (like Amazon S3 or Backblaze B2, or even a consumer service like Dropbox, Google Drive, or OneDrive). While there are many other tools that support backups, I went with Duplicati because I wanted:

    • Support for consumer storage services as a target: I'm already a customer of Google Drive (through Google One) and Microsoft 365 (which comes with a generous OneDrive allotment), and I only intend to back up some of the files I'm currently storing (mainly the network storage I use to hold important files)
    • A web-based control interface, so I can manage backups from any computer (and not just whichever machine has the software installed)
    • An active user forum so I could find how-to guides and potentially get help
    • Available as a Docker container on linuxserver.io: linuxserver.io is well-known for hosting and maintaining high quality and up-to-date Docker container images
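
    If you want a quick sanity check that your server can reach linuxserver.io's registry before wiring anything up, you can (optionally) pull the image ahead of time from WeTTy or SSH; it's the same image name used in the compose file in the next section:

    docker pull lscr.io/linuxserver/duplicati:latest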

    Installation

    Update 2024 Dec 18: One reason Duplicati is a great solution is that it is actively being developed. Occasionally, however, that introduces breaking changes. Since version 2.0.9.105, Duplicati requires a password, which meant updating the Docker compose setup below to include an Encryption Key and a Password; an earlier update also required the Nginx proxy to pass additional headers to handle the Websocket connection the web interface now uses to keep itself dynamic. I've updated the text below to reflect these changes.

    To install Duplicati on OpenMediaVault:

    • If you haven't already, make sure you have OMV Extras and Docker Compose installed (refer to the section Docker and OMV-Extras in my previous post; you'll want to follow all 10 steps, as I refer to different parts of that process throughout this post) and have a static local IP address assigned to your server.
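
      Before going further, it's worth confirming that Docker and the Compose plugin are actually available from the command line (via WeTTy or SSH). Both of the following should print version information rather than an error:

    docker --version
    docker compose version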
    • Login to your OpenMediaVault web admin panel, and then go to [Services > Compose > Files] in the sidebar. Press the button in the main interface to add a new Docker compose file.

      Under Name, put down Duplicati and, under File, adapt the following (making sure the indentation stays consistent)
    ---
    services:
      duplicati:
        image: lscr.io/linuxserver/duplicati:latest
        container_name: duplicati
        ports:
          - <unused port number>:8200
        environment:
          - TZ=America/Los_Angeles
          - PUID=<UID of Docker User>
          - PGID=<GID of Docker User>
          - DUPLICATI__WEBSERVICE_PASSWORD=<Password to access interface>
          - SETTINGS_ENCRYPTION_KEY=<random set of at least 8 characters/numbers>
        volumes:
          - <absolute paths to folders to backup>:<names to use in Duplicati interface>
          - <absolute path to shared config folder>/Duplicati:/config
        restart: unless-stopped
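
      If you're unsure what to use for the PUID and PGID placeholders, or want a quick way to generate a random encryption key, the following commands (run from WeTTy or SSH) can help. The username appuser is just an example; substitute whichever user you set up for Docker:

    id -u appuser         # prints the UID to use for PUID
    id -g appuser         # prints the GID to use for PGID
    openssl rand -hex 16  # prints a random string suitable for SETTINGS_ENCRYPTION_KEY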
    • Under ports:, make sure to add an unused port number (I went with 8200).

      Replace <absolute path to shared config folder> with the absolute path to the config folder where you want Docker-installed applications to store their configuration information (accessible by going to [Storage > Shared Folders] in the administrative panel).

      You'll notice there are extra lines under volumes: for <absolute paths to folders to backup>. These should correspond to the folders you're interested in backing up, and you should map them to names you'll recognize in the Duplicati interface. For example, I mapped my <absolute path to shared config folder> to /containerconfigs, since one of the things I want to make sure I back up is my Docker container configurations.

      Once you’re done, hit Save and you should be returned to your list of Docker compose files for the next step. Notice that the new Duplicati entry you created has a Down status, showing the container has yet to be initialized.
    • To start your Duplicati container, click on the new Duplicati entry and press the (up) button. This will create the container, download any files needed, and run it.

      To confirm it worked, go to your-servers-static-ip-address:8200 from a browser on the same network as your server (replacing 8200 if you picked a different port in the configuration file above) and you should see the Duplicati web interface, which should look something like below
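
      If the page doesn't load, a couple of standard Docker commands (run from WeTTy or SSH) can confirm whether the container actually started and surface any errors from its logs:

    docker ps --filter name=duplicati   # should list the running duplicati container
    docker logs duplicati               # shows the container's startup output and any errors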
    • You can skip this step if you didn't set up Pihole and a local DNS / Nginx proxy, or if you don't care about having a user-readable domain name for Duplicati. But, assuming you do and you followed my instructions, open up WeTTy (either by going to wetty.home in your browser or via [Services > WeTTY] in the OpenMediaVault administrative panel and pressing the Open UI button) and log in as the root user. Run:
    cd /etc/nginx/conf.d
    ls
    nano <your file name>.conf
    • This opens the text editor nano with the file you just listed. Use your cursor to go to the very bottom of the file and add the following lines (the indentation is just for readability, but make sure each directive ends with a semicolon)
    server {
        listen             80;
        server_name        <duplicati.home or the domain you'd like to use>;
        location / {
            proxy_pass             http://<your-server-static-ip>:<duplicati port no.>;
            proxy_http_version     1.1;
            proxy_set_header       Upgrade $http_upgrade;
            proxy_set_header       Connection "upgrade";
            proxy_set_header       Host $host;
            proxy_set_header       X-Real-IP $remote_addr;
            proxy_set_header       X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header       X-Forwarded-Proto $scheme;
        }
    }
    • And then hit Ctrl+X to exit, Y to save, and Enter to overwrite the existing file. Then in the command line run the following to restart Nginx with your new configuration loaded.
    systemctl restart nginx
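
      If the restart fails (or you'd just like to check your edits before restarting), nginx can validate the configuration and point out the offending file and line:

    nginx -t   # tests the nginx configuration and reports any syntax errors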
    • Now, if your server sees a request for duplicati.home (or whichever domain you picked), it will direct it to Duplicati. With the additional proxy_http_version and proxy_set_header directives, it will also properly forward the Websocket requests the web interface uses.
    • Login to your Pihole administrative console (you can just go to pi.hole in a browser) and click on [Local DNS > DNS Records] from the sidebar. Under the section called Add a new domain/IP combination, fill out under Domain: the domain you just added above (i.e. duplicati.home) and next to IP Address: you should add your server’s static IP address. Press the Add button and it will show up below.
    • To make sure it all works, enter the domain you just added (duplicati.home if you went with my default) in a browser and you should see the Duplicati interface!
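
      If the domain doesn't resolve, it helps to work out whether the problem is DNS or the proxy. Assuming the machine you're testing from uses Pi-hole as its DNS server (and substituting your own domain and IP), these checks can narrow it down:

    nslookup duplicati.home        # should return your server's static IP address
    curl -I http://duplicati.home  # should return an HTTP response from the Duplicati web interface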

    Configuring your Backups

    Duplicati conceives of each “backup” as a “source” (the folders and files to back up), a “destination” (the place the files should be backed up to), a schedule (how often the backup runs), and a set of options that configure how the backup works.

    After logging in (with the password you specified in the Docker compose file), click on the +Add Backup button in the menu on the left-hand side to configure a “backup”. I'll show you the screens I went through to back up my Docker container configurations:

    1. Add a name (I called it DockerConfigs) and enter a Passphrase (you can use the Generate link to create a strong password) which you’d use to restore from backup. Then hit Next
    2. Enter a destination. Here, you can select another computer or folder connected to your network. You can also select an online storage service.

      I’m using Microsoft OneDrive — for a different service, a quick Google search or a search of the Duplicati how-to forum can give you more specific instructions, but the basic steps of generating an AuthID link appear to be similar across many services.

      I selected Microsoft OneDrive v2 and picked a path in my OneDrive for the backup to go to (Backup/dockerconfigs). I then clicked on the AuthID link and went through an authentication process to formally grant Duplicati access to OneDrive. Depending on the service, you may need to manually copy a long string of letters and numbers and colons into the text field. After all of that, to prove it all worked, press Test connection!

      Then hit Next
    3. Select the source. Use the folder browsing widget on the interface to select the folder you wish to backup.

      If you recall, in my configuration step I mapped the <absolute path to shared config folder> to /containerconfigs, which is why I selected it here as a one-click way to back up all my Docker container configurations. If your current volume mappings don't make this easy, feel free to shut down and delete the container and start over with a compose file that maps the folders in a more convenient way.

      Then hit Next
    4. Pick a schedule. Do you want to backup every day? Once a week? Twice a week? Since my docker container configurations don’t change that frequently, I decided to schedule weekly backups on Saturday early morning (so it wouldn’t interfere with something else I might be doing).

      Pick your option and then hit Next
    5. Select your backup options. Unless you have a strong reason to, I would not change the remote volume size from the default (50 MB). The backup retention, however, is something you may want to think about. Duplicati gives you the option to hold on to every backup (something I would not do unless you have a massive amount of storage relative to the amount of data you want to backup), to hold on to backups younger than a certain age, to hold on to a specific number of backups, or customized permutations of the above.

      The option you should choose depends on your circumstances, but to share what I did: for some of my most important files, I'm using Duplicati's smart backup retention option (which gives me one backup from the last week, one for each of the last 4 weeks, and one for each of the last 12 months); for some of my less important files (for example, my Docker container configurations), I'm holding on to just the last 2 weeks' worth of backups (see the sketch after this list for roughly how these choices map to Duplicati's underlying options).

      Then hit Save and you’re set!
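
    For reference, the choices in that retention dropdown correspond to advanced options you can also set by hand. The sketch below reflects my rough understanding of the equivalents; keep-time and retention-policy are real Duplicati options, but double-check the exact values against the Duplicati documentation before relying on them:

    # Smart backup retention (what I use for my most important files), roughly:
    #   retention-policy = 1W:1D,4W:1W,12M:1M
    # Keep only the last 2 weeks of backups (what I use for container configs), roughly:
    #   keep-time = 2W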

    I hope this helps you on your self-hosted backup journey.

    If you’re interested in how to setup a home server on OpenMediaVault or how to self-host different services, check out all my posts on the subject!