Beautiful map laying out when humans settled different parts of the world (from 2013, National Geographic’s Out of Eden project)
-
A Visual Timeline of Human Migration
-
Updating my AI News Reader
A few months ago, I shared that I had built an AI-powered personalized news reader which I still use on a near-daily basis. Since that post, I’ve made a couple of major improvements (which I have just reflected in my public GitHub).
Switching to JAX
I previously chose Keras 3 for my deep learning algorithm architecture because of its ease of use as well as the advertised ability to shift between AI/ML backends (at least between Tensorflow, JAX, and PyTorch). With Keras creator Francois Chollet noting significant speed-ups just from switching backends to JAX, I decided to give the JAX backend a shot.
Thankfully, Keras 3 lived up to its multi-backend promise and made switching to JAX remarkably easy. For my code, I simply had to make three sets of tweaks.
First, I had to change the definition of my container images. Instead of starting from Tensorflow’s official Docker images, I installed JAX and Keras on Modal’s default Debian image and set the appropriate environment variables to configure Keras to use JAX as its backend:
jax_image = (
    modal.Image.debian_slim(python_version='3.11')
    .pip_install('jax[cuda12]==0.4.35', extra_options="-U")
    .pip_install('keras==3.6')
    .pip_install('keras-hub==0.17')
    .env({"KERAS_BACKEND": "jax"})  # sets Keras backend to JAX
    .env({"XLA_PYTHON_CLIENT_MEM_FRACTION": "1.0"})
)
Second, because tf.data pipelines convert everything to Tensorflow tensors, I had to switch my preprocessing pipelines from using Keras’s ops library (which, because I was using JAX as a backend, expected JAX tensors) to Tensorflow-native operations:
ds = ds.map(
    lambda i, j, k, l: (
        preprocessor(i),
        j,
        2*k - 1,
        loglength_norm_layer(tf.math.log(tf.cast(l, dtype=tf.float32) + 1))
    ),
    num_parallel_calls=tf.data.AUTOTUNE
)
Lastly, I had a few lines of code which assumed Tensorflow tensors (where getting the underlying value required a .numpy() call). As I was now using JAX as a backend, I had to remove the .numpy() calls for the code to work.
Everything else — the rest of the tf.data preprocessing pipeline, the code to train the model, the code to serve it, the previously saved model weights and the code to save & load them — remained the same! Considering that the training time per epoch and the time the model took to evaluate (a measure of inference time) both seemed to improve by 20-40%, this simple switch to JAX seemed well worth it!
Model Architecture Improvements
There were two major improvements I made in the model architecture over the past few months.
First, having run my news reader for the better part of a year, I have now accumulated enough data that my strategy of simultaneously training on two related tasks (predicting the human rating and predicting the length of an article) no longer requires separate inputs. This reduced the memory requirement as well as simplified the data pipeline for training (see architecture diagram below).
Secondly, I successfully trained a version of my algorithm which uses dot products natively. This not only allowed me to remove several layers from my previous model architecture (see architecture diagram below), but, because the Supabase Postgres database I’m using supports pgvector, it means I can even compute ratings for articles through a SQL query:
UPDATE articleuser
SET
    ai_rating = 0.5 + 0.5 * (1 - (a.embedding <=> u.embedding)),
    rating_timestamp = NOW(),
    updated_at = NOW()
FROM articles a, users u
WHERE articleuser.article_id = a.id
    AND articleuser.user_id = u.id
    AND articleuser.ai_rating IS NULL;
The result is much greater simplicity in architecture as well as greater operational flexibility, as I can now update ratings directly from the database as well as by serving a deep neural network from my serverless backend.
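For intuition, here is a minimal Python sketch of the same calculation, assuming the model produces an article embedding and a user embedding (the variable names are mine, not the repo’s). pgvector’s <=> operator returns cosine distance, so 1 - distance is just the cosine similarity:

import numpy as np

def ai_rating(article_embedding: np.ndarray, user_embedding: np.ndarray) -> float:
    # Normalize both vectors so the dot product equals cosine similarity
    a = article_embedding / np.linalg.norm(article_embedding)
    u = user_embedding / np.linalg.norm(user_embedding)
    cosine_similarity = float(np.dot(a, u))  # in [-1, 1]
    # Same mapping as the SQL above: rescale to a [0, 1] rating
    return 0.5 + 0.5 * cosine_similarity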
Model architecture (output from the Keras plot_model function)
Making Sources a First-Class Citizen
As I used the news reader, I realized early on that the ability to see sorted content from just one source (i.e., a particular blog or news site) would be valuable. To add this, I created and populated a new sources table within the database to track these independently (see database design diagram below), linked to the articles table.
Newsreader database design diagram (produced by a Supabase tool)
I then modified my scrapers to insert the identifier for each source alongside each new article, and made sure my fetch calls all JOIN’d and pulled the relevant source information.
With the data infrastructure in place, I added the ability to pass a source parameter to the core fetch URLs to enable single (or multiple) source feeds. I then added a quick element at the top of the feed interface (see below) to let a user know when the feed they’re viewing is limited to a given source. I also made all the source links in the feed clickable so that they take the user to the corresponding single-source feed.
<div class="feed-container">
  <div class="controls-container">
    <div class="controls">
      ${source_names && source_names.length > 0 && html`
        <div class="source-info">
          Showing articles from: ${source_names.join(', ')}
        </div>
        <div>
          <a href="/">Return to Main Feed</a>
        </div>
      `}
    </div>
  </div>
</div>
The interface when on a single-source feed
Performance Speed-Up
One recurring issue I noticed in my use of the news reader pertained to slow load times. While some of this can be attributed to the “cold start” issue that serverless applications face, much of this was due to how the news reader was fetching pertinent articles from the database. It was deciding at the moment of the fetch request what was most relevant to send over by calculating all the pertinent scores and rank ordering. As the article database got larger, this computation became more complicated.
To address this, I decided to move to a “pre-calculated” ranking system. That way, the system would know what to fetch in advance of a fetch request (and hence return much faster). Couple that with a database index (which effectively “pre-sorts” the results to make retrieval even faster), and I saw visually noticeable improvements in load times.
But with any pre-calculated score scheme, the most important question is how and when re-calculation should happen. Too often and too broadly and you incur unnecessary computing costs. Too infrequently and you risk the scores becoming stale.
The compromise I reached derives from the three ways articles are ranked in my system (combined into a single weighted score, sketched below):
- The AI’s rating of an article plays the most important role (60%)
- How recently the article was published is tied with… (20%)
- How similar an article is to the 10 articles a user most recently read (20%)
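As a rough sketch of how those weights might combine into the single pre-calculated score described above (the function and argument names are illustrative, not the actual fields in my database):

def article_score(ai_rating: float, recency: float, similarity: float) -> float:
    # All inputs assumed normalized to [0, 1]; weights follow the 60/20/20 split above
    return 0.6 * ai_rating + 0.2 * recency + 0.2 * similarity

# Example: a well-rated but older article
print(article_score(ai_rating=0.9, recency=0.5, similarity=0.4))  # 0.72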
These factors lent themselves to very different natural update cadences:
- Newly scraped articles have their AI ratings and calculated scores computed at the time they enter the database
- AI ratings for the most recent and the previously highest-scoring articles are re-computed after model training updates
- Each article’s score is recomputed daily (to capture the change in article recency)
- The article similarity for unread articles is re-evaluated after a user reads 10 articles
This required modifying the reader’s existing scraper and post-training processes to update the appropriate scores after scraping runs and model updates. It also meant tracking article reads on the users table (and modifying the /read endpoint to update these scores at the right intervals). Finally, it also meant adding a recurring cleanUp function, set to run every 24 hours, to perform this update as well as others.
Next Steps
With some of these performance and architecture improvements in place, my priorities are now focused on finding ways to systematically improve the underlying algorithms as well as increase the platform’s usability as a true news tool. To that end, some of the top priorities for next steps in my mind include:
- Testing new backbone models — The core ranking algorithm relies on RoBERTa, a model released five years ago, before large language models were common parlance. Keras Hub makes it incredibly easy to incorporate newer models like Meta’s Llama 2 & 3, OpenAI’s GPT-2, Microsoft’s Phi-3, and Google’s Gemma and fine-tune them (see the sketch after this list).
- Solving the “all good articles” problem — Because the point of the news reader is to surface content it considers good, users will not readily see lower quality content, nor will they see content the algorithm struggles to rank (i.e. new content very different from what the user has seen before). This makes it difficult to get the full range of data needed to help preserve the algorithm’s usefulness.
- Creating topic and author feeds — Given that many people think in terms of topics and authors of interest, expanding what I’ve already done with sources to topic and author feeds seems like a high-value next step.
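As a rough illustration of the first item, here is a hedged sketch of what swapping in a newer backbone via Keras Hub could look like; the GPT-2 preset and the simple pooled regression head are my own illustrative choices, not the news reader’s actual configuration:

import keras
import keras_hub

# Load a pretrained backbone from Keras Hub (preset name per the Keras Hub docs)
backbone = keras_hub.models.GPT2Backbone.from_preset("gpt2_base_en")

# Attach a small head that predicts a 0-1 article rating from pooled token features
# (tokenization/preprocessing and the padding mask are omitted here for brevity)
features = backbone.output                                # [batch, seq_len, hidden_dim]
pooled = keras.layers.GlobalAveragePooling1D()(features)  # simple mean pooling over tokens
rating = keras.layers.Dense(1, activation="sigmoid", name="rating")(pooled)

model = keras.Model(backbone.input, rating)
model.compile(optimizer="adam", loss="binary_crossentropy")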
I also endeavor to make more regular updates to the public GitHub repository (instead of aggregating many updates into two large ones, as I have done so far). This will make the updates more manageable and hopefully help anyone out there who’s interested in building a similar product.
-
Not your grandma’s geothermal energy
The pursuit of carbon-free energy has largely leaned on intermittent sources of energy — like wind and solar; and sources that require a great deal of initial investment — like hydroelectric (which requires elevated bodies of water and dams) and nuclear (which requires you to set up a reactor).
The theoretical beauty of geothermal power is that, if you dig deep enough, virtually everywhere on planet earth is hot enough to melt rock (thanks to the nuclear reactions that heat up the inside of the earth). But, until recently, geothermal has been limited to regions of Earth where well-formed geologic formations can deliver predictable steam without excessive engineering.
But, ironically, it is the fracking boom, which has helped the oil & gas industries get access to new sources of carbon-producing energy, that may also help us tap geothermal power in more places. As fracking and oil & gas exploration have led to a revolution in our ability to precisely drill deep underground and push & pull fluids, they also give us the ability to tap more geothermal power than ever before. This has led to the rise of enhanced geothermal: the process by which we inject water deep underground to be heated and then leverage the steam produced to generate electricity. Studies suggest the resource is particularly rich and accessible in the Southwest of the United States (see map below) and could be an extra tool in our portfolio to green our energy consumption.
(Source: Figure 5 from NREL study on enhanced geothermal from Jan 2023)
While there is a great deal of uncertainty around how much this will cost and just what it will take (not to mention the seismic risks that have plagued some fracking efforts), the hunger for more data center capacity and the desire to power it with clean electricity have helped startups like Fervo Energy and Sage Geosystems fund projects to explore.
On 17 October, Fervo Energy, a start-up based in Houston, Texas, got a major boost as the US government gave the green light to the expansion of a geothermal plant Fervo is building in Beaver County, Utah. The project could eventually generate as much as 2,000 megawatts — a capacity comparable with that of two large nuclear reactors. Although getting to that point could take a while, the plant already has 400 MW of capacity in the pipeline, and will be ready to provide around-the-clock power to Google’s energy-hungry data centres, and other customers, by 2028. In August, another start-up, Sage Geosystems, announced a partnership with Facebook’s parent company Meta to deliver up to 150 MW of geothermal power to Meta’s data centres by 2027.
Geothermal power is vying to be a major player in the world’s clean-energy future
David Castelvecchi | Nature News -
Who needs humans? Lab of AIs designs valid COVID-binding proteins
A recent preprint from Stanford has demonstrated something remarkable: AI agents working together as a team solving a complex scientific challenge.
While much of the AI discourse focuses on how individual large language models (LLMs) compare to humans, much of human work today is a team effort, and the right question is less “can this LLM do better than a single human on a task” and more “what is the best team-up of AI and human to achieve a goal?” What is fascinating about this paper is that it looks at it from the perspective of “what can a team of AI agents achieve?”
The researchers tackled an ambitious goal: designing improved COVID-binding proteins for potential diagnostic or therapeutic use. Rather than relying on a single AI model to handle everything, the researchers tasked an AI “Principal Investigator” with assembling a virtual research team of AI agents! After some internal deliberation, the AI Principal Investigator selected an AI immunologist, an AI machine learning specialist, and an AI computational biologist. The researchers made sure to add an additional role, one of a “scientific critic” to help ground and challenge the virtual lab team’s thinking.
The team composition and phases of work planned and carried out by the AI principal investigator
(Source: Figure 2 from Swanson et al.)
What makes this approach fascinating is how it mirrors high-functioning human organizational structures. The AI team conducted meetings with defined agendas and speaking orders, with a “devil’s advocate” to ensure the ideas were grounded and rigorous.
Example of a virtual lab meeting between the AI agents; note the roles of the Principal Investigator (to set agenda) and Scientific Critic (to challenge the team to ground their work)
(Source: Figure 6 from Swanson et al.)
One tactic the researchers said helped boost creativity, and that is harder to replicate with humans, is running parallel discussions, whereby the AI agents had the same conversation over and over again. In these discussions, the human researchers set the “temperature” of the LLM higher (inviting more variation in output). The AI Principal Investigator then took the output of all of these conversations and synthesized them into a final answer (this time with the LLM temperature set lower, to reduce the variability and “imaginativeness” of the answer).
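A minimal sketch of that pattern, assuming you have some llm_generate(prompt, temperature) wrapper around a model of your choice (the function, run count, and temperature values below are illustrative, not the paper’s actual settings):

from typing import Callable

def run_parallel_meetings(
    llm_generate: Callable[..., str],  # any wrapper that calls an LLM and returns text
    agenda: str,
    n_meetings: int = 5,
) -> str:
    # Run the same meeting several times at high temperature to get varied discussions
    transcripts = [
        llm_generate(f"Hold a virtual lab meeting on this agenda: {agenda}", temperature=1.0)
        for _ in range(n_meetings)
    ]
    # Have the "principal investigator" synthesize a single answer at low temperature
    synthesis_prompt = (
        "Here are several meeting transcripts on the same agenda:\n\n"
        + "\n\n---\n\n".join(transcripts)
        + "\n\nMerge their best ideas into one final recommendation."
    )
    return llm_generate(synthesis_prompt, temperature=0.2)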
The use of parallel meetings to get “creativity” and a diverse set of options
(Source: Supplemental Figure 1 from Swanson et al.)
The results? The AI team successfully designed nanobodies (small antibody-like proteins; the team chose to pursue nanobodies over more traditional antibodies) that showed improved binding to recent SARS-CoV-2 variants compared to existing versions. While humans provided some guidance, particularly around defining coding tasks, the AI agents handled the bulk of the scientific discussion and iteration.
Experimental validation of some of the designed nanobodies; the relevant comparison is the filled-in circles vs the open circles. The higher ELISA assay intensity for the filled-in circles shows that the designed nanobodies bind better than their un-mutated original counterparts
(Source: Figure 5C from Swanson et al.)
This work hints at a future where AI teams become powerful tools for human researchers and organizations. Instead of asking “Will AI replace humans?”, we should be asking “How can humans best orchestrate teams of specialized AI agents to solve complex problems?”
The implications extend far beyond scientific research. As businesses grapple with implementing AI, this study suggests that success might lie not in deploying a single, all-powerful AI system, but in thoughtfully combining specialized AI agents with human oversight. It’s a reminder that in both human and artificial intelligence, teamwork often trumps individual brilliance.
I personally am also interested in how different team compositions and working practices might lead to better or worse outcomes — for both AI teams and human teams. Should we have one scientific critic, or should there be specialist critics for each task? How important was the speaking order? What if the group came up with their own agendas? What if there were two principal investigators with different strengths?
The next frontier in AI might not be building bigger models, but building better teams.
The Virtual Lab: AI Agents Design New SARS-CoV-2 Nanobodies with Experimental Validation
Kyle Swanson et al. | bioRxiv PrePrint -
Updating OpenMediaVault
(Note: this is part of my ongoing series on cheaply selfhosting)
I’ve been using OpenMediaVault 6 on a cheap mini-PC as a home server for over a year. Earlier this year, OpenMediaVault 7 was announced, which upgrades the underlying Linux to Debian 12 Bookworm and makes a number of other security, compatibility, and user interface improvements.
Not wanting to start over fresh, I decided to take advantage of OpenMediaVault’s built-in command line tool to handle the upgrade. If you, like me, are looking for a quick and clean way of upgrading from OpenMediaVault 6 to OpenMediaVault 7, look no further:
- SSH into your system / connect directly with a keyboard and monitor. While normally I would recommend WeTTY (accessible via [Services > WeTTY] from the web console interface) to handle any command line activity on your server, because WeTTY relies on the server to be running and updating the operating system necessitates shutting the server down, you’ll need to either plug in a keyboard and monitor or use SSH.
- Run sudo omv-upgrade in the command line. This will start a long process of downloading and installing necessary files to complete the operating system update. From time to time you’ll be asked to accept / approve a set of changes via keyboard. If you’re on the online administrative panel, you’ll be booted off as the server shuts down.
- Restart the server. Once everything is complete, you’ll need to restart the server to make sure everything “takes”. This can be done by running reboot in the command line or by manually turning the server off and on.
Assuming everything went smoothly, after the server completes its reboot (which will take a little bit of extra time after an operating system upgrade), upon logging into the administrative console as you had done before, you’ll be greeted by the new OMV 7 login screen. Congratulations!
-
A Digital Twin of the Whole World in the Cloud
As a kid, I remember playing Microsoft Flight Simulator 5.0 — while I can’t say I really understood all the nuances of the several hundred page manual (which explained how ailerons and rudders and elevators worked), I remember being blown away with the idea that I could fly anywhere on the planet and see something reasonably representative there.
Flash forward a few decades and Microsoft Flight Simulator 2024 can safely be said to be one of the most detailed “digital twins” of the whole planet ever built. In addition to detailed photographic mapping of many locations (I would imagine a combination of aerial surveillance and satellite imagery) and an accurate real world inventory of every helipad (including offshore oil rigs!) and glider airport, they also simulate flocks of animals, plane wear and tear, how snow vs mud vs grass behave when you land on it, wake turbulence, and more! And, just as impressive, it’s being streamed from the cloud to your PC/console when you play!
Who said the metaverse is dead?
People are dressed in clothes and styles matching their countries of origin. They speak in the language of their home countries. Flying from the US to Finland on a commercial plane? Walk through the cabin: you’ll hear both English and Finnish being spoken by the passengers.
Neumann, who has a supervising producer credit on 2013’s Zoo Tycoon and a degree in biology, has a soft-spot for animals and wants to make sure they’re also being more realistically simulated in MSFS 2024. “I really didn’t like the implementation of the animal flights in 2020,” he admitted. “It really bothered me, it was like, ‘Hey, find the elephants!’ and there’s a stick in the UI and there’s three sad-looking elephants.
“There’s an open source database that has all wild species, extinct and living, and it has distribution maps with density over time,” Neumann continued. Asobo is drawing from that database to make sure animals are exactly where they’re supposed to be, and that they have the correct population densities. In different locations throughout the year, “you will find different stuff, but also they’re migrating,” so where you spot a herd of wildebeests or caribou one day might not be the same place you find them the next.
Microsoft Flight Simulator 2024: The First Preview
Seth G. Macy | IGN -
Making a Movie to Make Better Video Encoding
Until I read this Verge article, I had assumed that video codecs were a boring affair. In my mind, every few years, the industry would get together and come up with a new standard that promised better compression and better quality for the prevailing formats and screen types and, after some patent licensing back and forth, the industry would standardize around yet another MPEG standard that everyone uses. Rinse and repeat.
The article was an eye-opening look at how video streamers like Netflix are pushing the envelope on video codecs. Since one of a video streamer’s core costs is the cost of video bandwidth, it makes sense that they would embrace new compression approaches (like different kinds of compression for different content, etc.) to reduce those costs. As Netflix embraces more live streaming content, it seems they’ll need to create new methods to accommodate it.
But what jumped out to me the most was that, in order to better test and develop the next generation of codecs, they produced a real 12-minute noir film called Meridian (you can access it on Netflix; below is someone who uploaded it to YouTube) which presents scenes that have historically been more difficult to encode with conventional video codecs (extreme lights and shadows, cigar smoke and water, rapidly changing light balance, etc.).
Absolutely wild.
While contributing to the development of new video codecs, Aaron and her team stumbled across another pitfall: video engineers across the industry have been relying on a relatively small corpus of freely available video clips to train and test their codecs and algorithms, and most of those clips didn’t look at all like your typical Netflix show. “The content that they were using that was open was not really tailored to the type of content we were streaming,” recalled Aaron. “So, we created content specifically for testing in the industry.”
In 2016, Netflix released a 12-minute 4K HDR short film called Meridian that was supposed to remedy this. Meridian looks like a film noir crime story, complete with shots in a dusty office with a fan in the background, a cloudy beach scene with glistening water, and a dark dream sequence that’s full of contrasts. Each of these shots has been crafted for video encoding challenges, and the entire film has been released under a Creative Commons license. The film has since been used by the Fraunhofer Institute and others to evaluate codecs, and its release has been hailed by the Creative Commons foundation as a prime example of “a spirit of cooperation that creates better technical standards.”
Inside Netflix’s Bet on Advanced Video Encoding
Janko Roettgers | The Verge -
Games versus Points
The Dartmouth College Class of 2024, for their graduation, got a very special commencement address from tennis legend Roger Federer.
There is a wealth of good advice in it, but the most interesting point that jumped out to me is that while Federer won a whopping 80% of the matches he played in his career, he only won 54% of the points. It underscores the importance of letting go of small failures (“When you lose every second point, on average, you learn not to dwell on every shot”) but also of keeping your eye on the right metric (games, not points).
In tennis, perfection is impossible… In the 1,526 singles matches I played in my career, I won almost 80% of those matches… Now, I have a question for all of you… what percentage of the POINTS do you think I won in those matches?
Only 54%.
In other words, even top-ranked tennis players win barely more than half of the points they play.
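To see why a thin per-point edge compounds like that, here is a quick simulation sketch (it treats points as independent and ignores tiebreaks; the 54% figure comes from the speech):

import random

def win_game(p_point: float) -> bool:
    """Simulate one game (with deuce rules) where the player wins each point with probability p_point."""
    won = lost = 0
    while True:
        if random.random() < p_point:
            won += 1
        else:
            lost += 1
        if won >= 4 and won - lost >= 2:
            return True
        if lost >= 4 and lost - won >= 2:
            return False

random.seed(0)
trials = 100_000
games = sum(win_game(0.54) for _ in range(trials))
print(f"Winning 54% of points -> winning ~{games / trials:.0%} of games")  # roughly 60%

And the compounding continues from there: games roll up into sets and sets into matches, which is how a 54% point rate can coexist with an 80% match rate.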
-
Biopharma scrambling to handle Biosecure Act
Strong regional industrial ecosystems like Silicon Valley (tech), Boston (life science), and Taiwan (semiconductors) are fascinating. Their creation is rare and requires local talent, easy access to supply chains and distribution, academic & government support, business success, and a good amount of luck.
But, once set in place, they can be remarkably difficult to unseat. Take the semiconductor industry as an example. Its geopolitical importance has directed billions of dollars towards re-creating a domestic US industry. But it faces an uphill climb. After all, it’s not only a question of recreating the semiconductor manufacturing factories that have gone overseas, but also:
- the advanced and low-cost packaging technologies and vendors that are largely based in Asia
- the engineering and technician talent that is no longer really in the US
- the ecosystem of contractors and service firms that know exactly how to maintain the facilities and equipment
- the supply chain for advanced chemicals and specialized parts that make the process technology work
- the board manufacturers and ODMs/EMSs who do much of the actual work post-chip production that are also concentrated in Asia
A similar thing has happened in the life sciences CDMO (contract development and manufacturing organization) space. In much the same way that Western companies largely outsourced semiconductor manufacturing to Asia, Western biopharma companies outsourced much of their core drug R&D and manufacturing to Chinese companies like WuXi AppTec and WuXi Biologics. This has resulted in a concentration of talent and an ecosystem of suppliers there that would be difficult to supplant.
Enter the BIOSECURE Act, a bill being discussed in the House with a strong possibility of becoming law. It prohibits the US government from working with companies that obtain technology from Chinese biotechnology companies of concern (including WuXi AppTec and WuXi Biologics, among others). This is causing the biopharma industry significant anxiety, as companies are forced to find (and potentially fund) an alternative CDMO ecosystem that does not currently exist at anything close to WuXi’s scale and quality.
According to [Harvey Berger, CEO of Kojin Therapeutics], China’s CDMO industry has evolved to a point that no other country comes close to. “Tens of thousands of people work in the CDMO industry in China, which is more than the rest of the world combined,” he says.
Meanwhile, Sound’s Kil says he has worked with five CDMOs over the past 15 years and is sure he wouldn’t return to three of them. The two that he finds acceptable are WuXi and a European firm.
“When we asked the European CDMO about their capacity to make commercial-stage quantities, they told us they would have to outsource it to India,” Kil says. WuXi, on the other hand, is able to deliver large quantities very quickly. “It would be terrible for anyone to restrict our ability to work with WuXi AppTec.”
Industry braces for Biosecure Act impact
Aayushi Pratap | C&EN -
Freedom and Prosperity Under Xi Jinping
Fascinating chart from Bloomberg showing level of economic freedom and prosperity under different Chinese rulers and how Xi Jinping is the first Chinese Communist Party ruler in history to have presided over sharp declines in both freedom and prosperity.
Given China’s rising influence in economic and geopolitical affairs, how its leaders (and, in particular, Xi) and its people react to this will have significant impacts on the world.
‘Are You Better Off?’ Asking Reagan’s Question in Xi’s China
Rebecca Choong Wilkins and Tom Orlik | Bloomberg -
My Two-Year Journey to Home Electrification
Summary
- Electrifying our (Bay Area) home was a complex and drawn-out process, taking almost two years.
- Installing solar panels and storage was particularly challenging, involving numerous hurdles and unexpected setbacks.
- We worked with a large solar installer (Sunrun) and, while the individuals we worked with were highly competent, handoffs within Sunrun and with other entities (like local utility PG&E and the local municipality) caused significant delays.
- While installing the heat pumps, smart electric panel, and EV charger was more straightforward, these projects also featured greater complexity than we expected.
- The project resulted in significant quality-of-life improvements around home automation and comfort. However, bad pricing dynamics between electricity and natural gas mean direct cost savings from electrifying gas loads are, at best, small. While solar is an economic slam-dunk (especially given the rising PG&E rates our home sees), the batteries, setting aside the value of backup power, have less obvious economic value.
- Our experience underscored the need for the industry to adopt a more holistic approach to electrification and for policymakers to make the process more accessible for all homeowners to achieve the state’s ambitious goals.
Why
The decision to electrify our home was an easy one. From my years of investing in & following climate technologies, I knew that the core technologies were reliable and relatively inexpensive. As parents of young children, my wife and I were also determined to contribute positively to the environment. We also knew there was abundant financial support from local governments and utilities to help make this all work.
Yet, as we soon discovered, what we expected to be a straightforward path turned into a nearly two-year process!
Even for a highly motivated household which had budgeted significant sums for it all, it was still shocking how long (and how much money) it took. It made me skeptical that households across California would be able to do the same to meet California’s climate goals without additional policy changes and financial support.
The Plan
Two years ago, we set out a plan:
- Smart electrical panel — From my prior experience, I knew that many home electrification projects required a main electrical panel upgrade. These were typically costly and left you at the mercy of the utility to actually carry them out (I would find out how true this was later!). Our home had an older main panel rated for 125 A and we suspected we would normally need a main panel upgrade to add on all the electrical loads we were considering.
To try to get around this, we decided to get a smart electrical panel which could:
- use software smarts to deal with the times where peak electrical load got high enough to need the entire capacity of the electrical line
- give us the ability to intelligently manage backups and track solar production
In doing our research, Span seemed like the clear winner. They were the most prominent company in the space and had the slickest looking device and app (many of their team had come from Tesla). They also had an EV charger product we were interested in, the Span Drive.
- Heat pumps — To electrify is to ditch natural gas. As the bulk of our gas consumption was heating air and water, this involved replacing our gas furnace and gas water heater with heat pumps. In addition to significant energy savings — heat pumps are famous for their >200% efficiency (as they move heat rather than “create” it like gas furnaces do) — heat pumps would also let us add air conditioning (just run the heat pump in reverse!) and improve our air quality (from not combusting natural gas indoors). We found a highly rated Bay Area HVAC installer who specializes in these types of energy efficiency projects (called Building Efficiency) and trusted that they would pick the right heat pumps for us.
- Solar and Batteries — No electrification plan is complete without solar. Our goal was to generate as much clean electricity as possible to power our new electric loads. We also wanted energy storage for backup power during outages (something that, while rare, we seemed to run into every year) and to take advantage of time-of-use rates (by storing solar energy when the price of electricity is low and then using it when the price is high).
We looked at a number of solar installers and ultimately chose Sunrun. A friend of ours worked there at the time and spoke highly of a prepaid lease they offered that was vastly cheaper all-in than every alternative. It offered minimum energy production guarantees, came with a solid warranty, and the “peace of mind” that the installation would be done by one of the largest and most reputable companies in the solar industry.
- EV Charger — Finally, with our plan to buy an electric vehicle, installing a home charger at the end of the electrification project was a simple decision. This would allow us to conveniently charge the car at home and, with solar & storage, hopefully let us “fuel up” more cost effectively. Here, we decided to go with the Span Drive. Its winning feature was the ability to provide Level 2 charging speeds without a panel upgrade (it does this by ramping charging speeds up or down depending on how much electricity the rest of the house needs). While pricey, the direct integration into our Span smart panel (and its app) and the ability to hit high charging rates without a panel upgrade felt like the smart path forward.
- What We Left Out — There were two appliances we decided to defer “fully going green” on.
The first was our gas stove (with electric oven). While induction stoves have significant advantages, because our current stove is still relatively new, works well, uses relatively little gas, and an upgrade would have required additional electrical work (installing a 240 V outlet), we decided to keep our current stove and consider a replacement at its end of life.
The second was our electric resistive dryer. While heat pump-based dryers would certainly save us a great deal of electricity, the existing heat pump dryers on the market have much smaller capacities than traditional resistive dryers, which may have necessitated our family of four doing additional loads of drying. As our current dryer was also only a few years old, and already running on electricity, we decided we would wait to consider a heat pump dryer until its end of life.
With what we thought was a well-considered plan, we set out and lined up contractors.
But as Mike Tyson put it, “Everyone has a plan ’till they get punched in the face.”
The Actual Timeline
Smart Panel
The smart panel installation was one of the more straightforward parts of our electrification journey. Span connected us with a local electrician who quickly assessed our site, provided an estimate, and completed the installation in a single day. However, getting the permits to pass inspection was a different story.
We failed the first inspection due to a disagreement over the code between the electrician and the city inspector. This issue nearly turned into a billing dispute with the electrician, who wanted us to cover the extra work needed to meet the code (an unexpected cost). Fortunately, after a few adjustments and a second inspection, we passed.
The ability to control and monitor electric flows with the smart panel is incredibly cool. For the first few days, I checked the charts in the apps every few minutes tracking our energy use while running different appliances. It was eye-opening to see just how much power small, common household items like a microwave or an electric kettle could draw!
However, the true value of a smart panel is only achieved when it’s integrated with batteries or significant electric loads that necessitate managing peak demand. Without these, the monitoring and control benefits are more novelties and might not justify the cost.
Note: if you, like us, use Pihole to block tracking ads, you’ll need to disable it for the Span app. The app uses some sort of tracker that Pihole flags by default. It’s an inconvenience, but worth mentioning for anyone considering this path.
Heating
Building Efficiency performed an initial assessment of our heating and cooling needs. We had naively assumed they’d be able to do a simple drop-in replacement for our aging gas furnace and water heater. While the water heater was a straightforward replacement (with a larger tank), the furnace posed more challenges.
Initially, they proposed multiple mini-splits to provide zoned control, as they felt the crawlspace area where the gas furnace resided was too small for a properly sized heat pump. Not liking the aesthetics of mini-splits, we requested a proposal involving two central heat pump systems instead.
Additionally, during the assessment, they found some of our old vents, in particular the ones sending air to our kids’ rooms, were poorly insulated and too small (which explains why their rooms always seemed under-heated in the winter). To fix this, they had to cut a new hole through our garage concrete floor (!!) to run a larger, better-insulated vent from our crawlspace. They also added insulation to the walls of our kids’ rooms to improve our home’s ability to maintain a comfortable temperature (but which required additional furniture movement, drywall work, and a re-paint).
Building Efficiency spec’d an Ecobee thermostat to control the two central heat pumps. As we already had a Nest Learning Thermostat (with Nest temperature sensors covering rooms far from the thermostat), we wanted to keep our temperature control in the Nest app. At the time, we had gotten a free thermostat from Nest after signing with Sunrun. We realized later that what Sunrun gifted us was the cheaper (and less attractive) Nest Thermostat, which doesn’t support Nest temperature sensors (why?), so we had to buy our own Nest Learning Thermostat to complete the setup.
Despite some of these unforeseen complexities, the whole process went relatively smoothly. There were a few months of planning and scheduling, but the actual installation was completed in about a week. It was a very noisy (cutting a hole through concrete is not quiet!) and chaotic week, but, the process was quick, and the city inspection was painless.
Solar & Storage
The installation of solar panels and battery storage was a lengthy ordeal. Sunrun proposed a system with LONGI solar panels, two Tesla Powerwalls, a SolarEdge inverter, and a Tesla gateway. Despite the simplicity of the plan, we encountered several complications right away.
First, a main panel upgrade was required. Although we had installed the Span smart panel to avoid this, Sunrun insisted on the upgrade and offered to cover the cost. Our utility PG&E took over a year (!!) to approve our request, which set off a domino effect of delays.
After PG&E’s approval, Sunrun discovered that local ordinances required a concrete pad to be poured and a safety fence to be erected around the panel, requiring a subcontractor and yet more coordination.
After the concrete pad was in place and the panel installed, we faced another wait for PG&E to connect the new setup. Ironically, during this wait, I received a request from Sunrun to pour another concrete pad. This was, thankfully, a false alarm and occurred because the concrete pad / safety fence work had not been logged in Sunrun’s tracking system!
The solar and storage installation itself took only a few days, but during commissioning, a technician found that half the panels weren’t connected properly, necessitating yet another visit before Sunrun could request an inspection from the city.
Sadly, we failed our first city inspection. Sunrun’s team had missed a local ordinance that required the Powerwalls to have a minimum distance between them and the sealing off of vents within a certain distance from each Powerwall. This necessitated yet another visit from Sunrun’s crew, and another city inspection (which we thankfully passed).
The final step was obtaining Permission to Operate (PTO) from PG&E. The application for this was delayed due to a clerical error. About four weeks after submission, we finally received approval.
Seeing the flow of solar electricity in my Span app (below) almost brought a tear to my eye. Finally!
EV Charger
When my wife bought a Nissan Ariya in early 2023, it came with a year of free charging with EVgo. We hoped this would allow us enough time to install solar before needing our own EV charger. However, the solar installation took longer than expected (by over a year!), so we had to expedite the installation of a home charger.
Span connected us with the same electrician who installed our smart panel. Within two weeks of our free charging plan expiring, the Span Drive was installed. The process was straightforward, with only two notable complications we had to deal with:
- The 20 ft cable on the Span Drive sounds longer than it is in practice. We adjusted our preferred installation location to ensure it comfortably reached the Ariya’s charging port.
- The Span software initially didn’t recognize the Span Drive after installation. This required escalated support from Span to reset the software, forcing the poor electrician, who had expected the commissioning step to be a few-minute affair, to stick around my home for several hours.
Result
So, “was it worth it?” Yes! There are significant environmental benefits (our carbon footprint is meaningfully lower). But there were also quality-of-life improvements and financial gains from these investments in what are just fundamentally better appliances.
Quality of Life
Our programmable, internet-connected water heater allows us to adjust settings for vacations, saving energy and money effortlessly. It also lets us program temperature cycles to avoid peak energy pricing, heating water before peak rates hit.
With the new heat pumps, our home now has air conditioning, which is becoming increasingly necessary in the Bay Area’s warmer summers. Improved vents and insulation have also made our home (and, in particular, our kids’ rooms) more comfortable. We’ve also found that the heat from the heat pumps is more even and less drying compared to the old gas furnace, which created noticeable hot spots.
Backup power during outages is another significant benefit. Though we haven’t had to use it since we received permission to operate, we had an accidental trial run early on when a Sunrun technician let our batteries be charged for a few days in the winter. During two subsequent outages in the ensuing months, our system maintained power to our essential appliances, ensuring our kids didn’t even notice the disruptions!
The EV charger has also been a welcome change. While free public charging was initially helpful, reliably finding working and available fast chargers could be time-consuming and stressful. Now, charging at home is convenient and cost-effective, reducing stress and uncertainty.
Financial
There are two financial aspects to consider: the cost savings from replacing gas-powered appliances with electric ones and the savings from solar and storage.
On the first, the answer is not promising.
The chart below comes from our PG&E bill for Jan 2023. It shows our energy usage year-over-year. After installing the heat pumps in late October 2022, our natural gas consumption dropped by over 98% (from 5.86 therms/day to 0.10), while our electricity usage more than tripled (from 15.90 kWh/day to 50.20 kWh/day). Applying the conversion of 1 natural gas therm = ~29 kWh of energy shows that our total energy consumption decreased by over 70%, a testament to the much higher efficiency of heat pumps.
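Here is the quick arithmetic behind that claim, using the figures above and the ~29 kWh-per-therm conversion (a back-of-the-envelope sketch, not an exact accounting):

THERM_TO_KWH = 29  # approximate energy content of one therm of natural gas

before = 5.86 * THERM_TO_KWH + 15.90   # ~185.8 kWh/day equivalent (gas + electricity)
after  = 0.10 * THERM_TO_KWH + 50.20   # ~53.1 kWh/day equivalent
print(f"Total energy reduction: {1 - after / before:.0%}")  # ~71%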
Our PG&E bill from Feb 2023 (for Jan 2023)
Surprisingly, however, our energy bills remained almost unchanged despite this! The graph below shows our PG&E bills over the 12 months ending in Jan 2023. Despite a 70% reduction in energy consumption, the bill stayed roughly the same. This is due to the significantly lower cost of gas in California compared to the equivalent amount of energy from electricity. It highlights a major policy failing in California: high electricity costs (relative to gas) will deter households from switching to greener options.
Our PG&E bill from Feb 2023 (for Jan 2023)
Solar, however, is a clear financial winner. With our prepaid lease, we’d locked in savings compared to 2022 rates (just by dividing the total prepaid lease amount by the expected energy production over the lifetime of the lease), and these savings have only increased as PG&E’s rates have risen (see chart below).
PG&E Rates 2022 vs 2024 (Source: PG&E; Google Sheet)
Batteries, on the other hand, are much less clear-cut financially due to their high initial cost and only modest savings from time-shifting electricity use. However, the peace of mind from having backup power during outages is valuable (not to mention the fact that, without a battery, solar panels can’t be used to power your home during an outage), and, with climate change likely to increase both peak/off-peak rate disparities and the frequency of outages, we believe this investment will pay off in the long run.
Taking Advantage of Time of Use Rates
Time of Use (TOU) rates, like PG&E’s electric vehicle time of use rates, offer a smart way to reduce electricity costs for homes with solar panels, energy storage, and smart automation. This approach has fundamentally changed how we manage home energy use. Instead of merely conserving energy by using efficient appliances or turning off devices when not needed, we now view our home as a giant configurable battery. We “save” energy when it’s cheap and use it when it’s expensive.
- Backup Reserve: We’ve set our Tesla Powerwall to maintain a 25% reserve. This ensures we always have a good supply of backup power for essential appliances (roughly 20 hours for our highest-priority circuits, by the Span app’s latest estimates) during outages.
- Summer Strategy: During summer, our Powerwall operates in “Self Power” mode, meaning solar energy powers our home first, then charges the battery, and lastly any excess goes to the grid. This maximizes the use of our “free” solar energy. We also schedule our heat pumps to run during midday when solar production peaks and TOU rates are lower. This way, we “store” cheaper energy in the form of pre-chilled or pre-heated air and water which helps maintain the right temperatures for us later (when the energy is more expensive).
- Winter Strategy: In winter, we will switch the Powerwall to “Time-Based Control.” This setting preferentially charges the battery when electricity is cheap and discharges it when prices are high, maximizing the financial value of our solar energy during the months where solar production is likely to be limited.
This year will be our first full cycle with all systems in place, and we expect to make adjustments as rates and energy usage evolve. For those considering home electrification, hopefully these strategies hint at what is possible to improve the economic value of your setup.
Takeaways
- Two years is too long: The average household might not have started this journey if they knew the extent of time and effort involved. This doesn’t even consider the amount of carbon emissions from running appliances off grid energy due to the delays. Streamlining the process is essential to make electrification more accessible and appealing.
- Align gas and electricity prices with climate goals: The current pricing dynamics make it financially challenging for households to switch from gas appliances to greener options like heat pumps. To achieve California’s ambitious climate goals, it’s crucial to bring the relative prices of electricity and gas in line with the state’s electrification push.
- Streamline permitting: Electrification projects are slowed by complex, inconsistent permitting requirements across different jurisdictions. Simplifying and unifying these processes will reduce time and costs for homeowners and their contractors.
- Accelerate utility approvals: The two-year timeframe was largely due to delays from our local utility, PG&E. As utilities lack incentives to expedite these processes, regulators should build in ways to encourage utilities to move faster on home electrification-related approvals and activities, especially as many homes will likely need main panel upgrades to properly electrify.
- Improve financing accessibility: High upfront costs make it difficult for households to adopt electrification, even when there are significant long-term savings. Expanding financing options (like Sunrun’s leases) can encourage more households to invest in these technologies. Policy changes should be implemented so that even smaller installers have the ability to offer attractive financing options to their clients.
- Break down electrification silos: Coordination between HVAC specialists, solar installers, electricians, and smart home companies is sorely missing today. As a knowledgeable early adopter, I managed to integrate these systems on my own, but this shouldn’t be the expectation if we want broad adoption of electrification. The industry (in concert with policymakers) should make it easier for different vendors to coordinate and for the systems to interoperate more easily in order to help homeowners take full advantage of the technology.
This long journey highlighted to me, in a very visceral way, both the rewards and practical challenges of home electrification. While the environmental, financial, and quality-of-life benefits are clear, it’s also clear that we have a ways to go on the policy and practical hurdles before electrification becomes an easy choice for many more households. I only hope policymakers and technologists are paying attention. Our world can’t wait much longer.
-
How the Jones Act makes energy more expensive and less green
The Merchant Marine Act of 1920 (aka “The Jones Act”) is a law which requires ships operating between US ports to be owned by, made in, and crewed by US citizens.
While many “Made in the USA” laws are on the books and attract the anger of economists and policy wonks, the Jones Act is particularly egregious because its costs and effects are so large. The Jones Act imposes dramatic costs on states like Hawaii and Alaska and territories like Puerto Rico, which rely heavily on ships for basic commerce: so much so that it was actually cheaper for Hawaii and New England to import oil from other countries (as Hawaii did from Russia until the Ukraine war) than to have oil shipped from the Gulf of Mexico (where American oil is abundant).
In the case of offshore wind, the Jones Act has pushed the companies willing to experiment with this promising technology to ship the required parts and equipment from overseas, because there are no Jones Act-compliant ships capable of moving the massive equipment involved.
This piece from Canary Media captures some of these dynamics and the “launch” of the still-in-construction, $625 million Jones Act-compliant ship, the Charybdis, which Dominion Energy will use to support its offshore wind facility.
To satisfy that mandate, Dominion commissioned the first-ever Jones Act–compliant vessel for offshore wind installation, which hit the water in Brownsville, Texas, last week. The hull welding on the 472-foot vessel is complete, as are its four enormous legs, which will hoist it out of the water during turbine installation. This $625 million leviathan, named Charybdis after the fearsome sea-monster foe of Odysseus, still needs some finishing touches before it sets sail to Virginia, which is expected to happen later this year.
Charybdis’ completion will be a win for what’s left of the American shipbuilding industry. The Jones Act, after all, was intended to bolster American shipbuilders and merchant seamen in the isolationist spell following World War I. But a century later, it creates a series of confounding and counterintuitive challenges for America’s energy industry, which frequently redound poorly for most Americans.
…
Elsewhere in the energy industry, the expense and difficulty associated with finding scarce Jones Act–compliant ships push certain American communities to rely more on foreign energy suppliers. Up until 2022, Hawaii turned to Russia for one-third of the oil that powered its cars and power plants. The Jones Act made it too hard or costly to import abundant American oil to the U.S. state, leaving Hawaii scrambling for other sources when Russia invaded Ukraine.
Over in New England, constraints on fossil-gas pipelines sometimes force the region to import gas via LNG terminals. The U.S. has plenty of fossil gas to tap in the Gulf of Mexico, but a lack of U.S. ships pushes Massachusetts and its neighbors to buy gas from other countries instead.
US offshore wind needs American-made ships. The first is nearly ready
Julian Spector | Canary Media -
Backup Your Home Server with Duplicati
(Note: this is part of my ongoing series on cheaply selfhosting)
Through some readily available Docker containers and OpenMediaVault, I have a cheap mini-PC which serves as:
- an ad blocker for all the devices in my household
- a media streamer (so I can play movies and read downloaded ebooks/documents anywhere that has internet access)
- my personal RSS/newsreader
- a handler for every PDF-related operation you can think of
- network storage for my family
But, over time, as the server has picked up more uses, it’s also become a vulnerability. If any of the drives on my machine ever fail, I’ll lose data that is personally (and sometimes economically) significant.
I needed a home server backup plan.
Duplicati
Duplicati is open source software that helps you efficiently and securely back up specific partitions and folders to any destination. This could be another home server or a cloud service provider (like Amazon S3 or Backblaze B2, or even a consumer service like Dropbox, Google Drive, or OneDrive). While there are many other tools that support backup, I went with Duplicati because I wanted:
- Support for consumer storage services as a target: I am a customer of Google Drive (through Google One) and Microsoft 365 (which comes with a generous OneDrive allotment) and only intend to back up some of the files I’m currently storing (mainly some of the network storage I’m using to hold important files)
- A web-based control interface so I could access this from any computer (and not just whichever machine had the software I wanted)
- An active user forum so I could find how-to guides and potentially get help
- Available as a Docker container on linuxserver.io: linuxserver.io is well-known for hosting and maintaining high quality and up-to-date Docker container images
Installation
Update 2024 Dec 18: One reason Duplicati is a great solution is that it is actively being developed. However, occasionally this can introduce breaking changes. Since version 2.0.9.105, Duplicati requires a password. This has required an update to the Docker compose setup below to include an encryption key and a password, and an earlier update required the Nginx proxy to pass additional headers to handle the WebSocket-based connection the web interface now uses to stay dynamic. I’ve changed the text below to reflect these changes.
To install Duplicati on OpenMediaVault:
- If you haven’t already, make sure you have OMV Extras and Docker Compose installed (refer to the section Docker and OMV-Extras in my previous post; you’ll want to follow all 10 steps, as I refer to different parts of that process throughout this post) and have a static local IP address assigned to your server.
- Login to your OpenMediaVault web admin panel, and then go to [Services > Compose > Files] in the sidebar. Press the button in the main interface to add a new Docker compose file. Under Name put down Duplicati and under File, adapt the following (making sure the number of spaces stays consistent):
---
services:
  duplicati:
    image: lscr.io/linuxserver/duplicati:latest
    container_name: duplicati
    ports:
      - <unused port number>:8200
    environment:
      - TZ=America/Los_Angeles
      - PUID=<UID of Docker User>
      - PGID=<GID of Docker User>
      - DUPLICATI__WEBSERVICE_PASSWORD=<Password to access interface>
      - SETTINGS_ENCRYPTION_KEY=<random set of at least 8 characters/numbers>
    volumes:
      - <absolute paths to folders to backup>:<names to use in Duplicati interface>
      - <absolute path to shared config folder>/Duplicati:/config
    restart: unless-stopped
Code language: YAML (yaml)
- Under ports:, make sure to add an unused port number (I went with 8200). Replace <absolute path to shared config folder> with the absolute path to the config folder where you want Docker-installed applications to store their configuration information (accessible by going to [Storage > Shared Folders] in the administrative panel). You'll notice there are extra lines under volumes: for <absolute paths to folders to backup>. These should correspond to the folders you are interested in backing up, and you should map them to names you'll recognize in the Duplicati interface. For example, I mapped my <absolute path to shared config folder> to /containerconfigs, as one of the things I want to make sure I back up is my container configurations.
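To make the placeholders concrete, here is a filled-in example of what the file could look like. The port, UID/GID, paths, and secrets below are purely illustrative values for a hypothetical setup, so substitute your own:
---
services:
  duplicati:
    image: lscr.io/linuxserver/duplicati:latest
    container_name: duplicati
    ports:
      - 8200:8200                 # host port 8200 -> Duplicati's port 8200
    environment:
      - TZ=America/Los_Angeles
      - PUID=1000                 # example UID of your Docker user
      - PGID=1000                 # example GID of your Docker user
      - DUPLICATI__WEBSERVICE_PASSWORD=example-strong-password
      - SETTINGS_ENCRYPTION_KEY=example1234key
    volumes:
      - /srv/dev-disk-by-label-data/appdata:/containerconfigs   # example folder to back up
      - /srv/dev-disk-by-label-data/appdata/Duplicati:/config
    restart: unless-stopped
Code language: YAML (yaml)
The part after the colon on each volumes line (like /containerconfigs) is the name you will later see in Duplicati's folder browser.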
Once you're done, hit Save and you should be returned to your list of Docker compose files for the next step. Notice that the new Duplicati entry you created has a Down status, showing the container has yet to be initialized.
- To start your Duplicati container, click on the new Duplicati entry and press the (up) button. This will create the container, download any files needed, and run it. To confirm it worked, go to your-servers-static-ip-address:8200 from a browser that's on the same network as your server (replacing 8200 if you picked a different port in the configuration file above) and you should see the Duplicati web interface.
- You can skip this step if you didn't set up Pihole and local DNS / Nginx proxy, or if you don't care about having a user-readable domain name for Duplicati. But, assuming you do and you followed my instructions, open up WeTTy (which you can do by going to wetty.home in your browser if you followed my instructions, or by going to [Services > WeTTY] from the OpenMediaVault administrative panel and pressing the Open UI button in the main panel) and login as the root user. Run:
cd /etc/nginx/conf.d        # directory holding your Nginx proxy configurations
ls                          # note the name of the .conf file you created previously
nano <your file name>.conf
Code language: Shell Session (shell)
- This opens up the text editor nano with the file you just listed. Use your cursor to go to the very bottom of the file and add the following lines (making sure to use tabs and to end each directive with a semicolon):
server {
    listen 80;
    server_name <duplicati.home or the domain you'd like to use>;

    location / {
        proxy_pass http://<your-server-static-ip>:<duplicati port no.>;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
Code language: HTML, XML (xml)
- And then hit Ctrl+X to exit, Y to save, and Enter to overwrite the existing file. Then, in the command line, run the following to restart Nginx with your new configuration loaded:
systemctl restart nginx
- Now, if your server sees a request for duplicati.home (or whichever domain you picked), it will direct it to Duplicati. With the additional proxy_http_version and proxy_set_header directives, it will also properly forward the Websocket requests the web interface uses.
- Login to your Pihole administrative console (you can just go to pi.hole in a browser) and click on [Local DNS > DNS Records] from the sidebar. Under the section called Add a new domain/IP combination, fill out under Domain: the domain you just added above (i.e. duplicati.home) and next to IP Address: add your server's static IP address. Press the Add button and it will show up below.
- To make sure it all works, enter the domain you just added (duplicati.home if you went with my default) in a browser and you should see the Duplicati interface!
Configuring your Backups
Duplicati conceives of each "backup" as a "source" (the folder of files to back up), a "destination" (the place the files should be backed up to), a schedule (how often the backup runs), and some options to configure how the backup works.
After logging in (with the password you specified in the Docker compose file), to configure a "backup", click on the +Add Backup button in the menu on the lefthand side. I'll show you the screens I went through to back up my Docker container configurations:
- Add a name (I called it DockerConfigs) and enter a Passphrase (you can use the Generate link to create a strong password) which you'd use to restore from backup. Then hit Next.
- Enter a destination. Here, you can select another computer or folder connected to your network. You can also select an online storage service.
I’m using Microsoft OneDrive — for a different service, a quick Google search or a search of the Duplicati how-to forum can give you more specific instructions, but the basic steps of generating an AuthID link appear to be similar across many services.
I selected Microsoft OneDrive v2 and picked a path in my OneDrive for the backup to go to (Backup/dockerconfigs). I then clicked on the AuthID link and went through an authentication process to formally grant Duplicati access to OneDrive. Depending on the service, you may need to manually copy a long string of letters, numbers, and colons into the text field. After all of that, to prove it all worked, press Test connection! Then hit Next.
- Select the source. Use the folder browsing widget on the interface to select the folder you wish to back up. If you recall, in my configuration step I mapped the <absolute path to shared config folder> to /containerconfigs, which is why I selected this as a one-click way to back up all my Docker container configurations. If necessary, feel free to shut down and delete your current container and start over with a configuration that points at and maps the folders in a better way. Then hit Next.
- Pick a schedule. Do you want to back up every day? Once a week? Twice a week? Since my Docker container configurations don't change that frequently, I decided to schedule weekly backups early Saturday morning (so they wouldn't interfere with anything else I might be doing). Pick your option and then hit Next.
- Select your backup options. Unless you have a strong reason to, I would not change the remote volume size from the default (50 MB). The backup retention, however, is something you may want to think about. Duplicati gives you the option to hold on to every backup (something I would not do unless you have a massive amount of storage relative to the amount of data you want to back up), to hold on to backups younger than a certain age, to hold on to a specific number of backups, or customized permutations of the above. The option you should choose depends on your circumstances, but to share what I did: for some of my most important files, I'm using Duplicati's smart backup retention option (which gives me one backup from the last week, one for each of the last 4 weeks, and one for each of the last 12 months). For some of my less important files (for example, my Docker container configurations), I'm holding on to just the last 2 weeks' worth of backups. Then hit Save and you're set!
I hope this helps you on your self-hosted backup journey.
If you're interested in how to set up a home server on OpenMediaVault or how to self-host different services, check out all my posts on the subject!
-
The California home insurance conundrum
As a California homeowner, I've watched with dismay as homeowner insurance provider after homeowner insurance provider has fled the state in the face of wildfire risk.
It was quite the shock when I discovered recently (HT: Axios Markets newsletter) that, according to NerdWallet, California actually has some of the cheapest homeowners insurance rates in the country!
It raises the Econ 101 question: is it really that the cost of wildfires is too high? Or is it that the price insurance companies can charge (something heavily regulated by state insurance commissions) is kept too low and not allowed to vary enough based on actual fire risk?
-
Why Intel has to make its foundry business work
Historically, Intel has (1) designed and (2) manufactured the chips it sells (primarily into computer and server systems). It prided itself on having the most advanced (1) designs and (2) manufacturing technology, keeping both close to its chest.
In the late 90s/00s, semiconductor companies increasingly embraced the “fabless model”, whereby they would only do the (1) design while outsourcing the manufacturing to foundries like TSMC. This made it much easier and less expensive to build up a burgeoning chip business and is the secret to the success of semiconductor giants like NVIDIA and Qualcomm.
Companies like Intel scoffed at this, arguing that the combination of (1) design and (2) manufacturing gave their products an advantage, one that they used to achieve a dominant position in the computing chip segment. And, it’s an argument which underpins why they have never made a significant effort in becoming a contract manufacturer — after all, if part of your technological magic is the (2) manufacturing, why give it to anyone else?
The success of TSMC has raised a lot of questions about Intel's advantage in manufacturing and, given recent announcements by Intel and the US's CHIPS Act, prompted a renewed focus on actually becoming a contract manufacturer to the world's leading chip designers.
While much of the attention has been paid to the manufacturing prowess rivalry and the geopolitical reasons behind this, I think the real reason Intel has to make the foundry business work is simple: their biggest customers are all becoming chip designers.
While plenty of laptops, desktops, and servers are still sold in the traditional fashion, the reality is that more and more of the server market is dominated by a handful of hyperscale data center operators like Amazon, Google, Meta/Facebook, and Microsoft, companies that have historically been able to obtain the best prices from Intel because of their volume. But in recent years, in the chase for better performance, cost, and power consumption, they have begun designing their own chips adapted to their own systems (as this latest announcement of Google's own ARM-based server chips shows).
Are these chips as good as Intel's across every dimension? Almost certainly not. It's hard to match the decades of design prowess and market insight a company like Intel has. But they don't have to be. They only have to be better for the specific use cases Google / Microsoft / Amazon / etc. need them for.
And, in that regard, that leaves Intel with really only one option: it has to make the foundry business work, or it risks losing not just the revenue from (1) designing a data center chip, but from the (2) manufacturing as well.
Axion processors combine Google’s silicon expertise with Arm’s highest performing CPU cores to deliver instances with up to 30% better performance than the fastest general-purpose Arm-based instances available in the cloud today, up to 50% better performance and up to 60% better energy-efficiency than comparable current-generation x86-based instances1. That’s why we’ve already started deploying Google services like BigTable, Spanner, BigQuery, Blobstore, Pub/Sub, Google Earth Engine, and the YouTube Ads platform on current generation Arm-based servers and plan to deploy and scale these services and more on Axion soon.
Introducing Google Axion Processors, our new Arm-based CPUs
Amin Vahdat | Google Blog -
Starlink in the wrong hands
On one level, this shouldn’t be a surprise. Globally always available satellite constellation = everyone and anyone will try to access this. This was, like many technologies, always going to have positive impacts — i.e. people accessing the internet where they otherwise couldn’t due to lack of telecommunications infrastructure or repression — and negative — i.e. terrorists and criminal groups evading communications blackouts.
The question is whether or not SpaceX had the foresight to realize this was a likely outcome and to institute security processes and checks to reduce the likelihood of the negative.
That remains to be seen…
In Yemen, which is in the throes of a decade-long civil war, a government official conceded that Starlink is in widespread use. Many people are prepared to defy competing warring factions, including Houthi rebels, to secure terminals for business and personal communications, and evade the slow, often censored internet service that’s currently available.
Or take Sudan, where a year-long civil war has led to accusations of genocide, crimes against humanity and millions of people fleeing their homes. With the regular internet down for months, soldiers of the paramilitary Rapid Support Forces are among those using the system for their logistics, according to Western diplomats.
Elon Musk’s Starlink Terminals Are Falling Into the Wrong Hands
Bruce Einhorn, Loni Prinsloo, Marissa Newman, Simon Marks | Bloomberg -
Why don’t we (still) have rapid viral diagnostics?
One of the most disappointing outcomes in the US from the COVID pandemic was the rise of the antivaxxer / public health skeptic and the dramatic politicization of public health measures.
But not everything disappointing has stemmed from that. Our lack of cheap rapid tests for diseases like the flu and RSV is a sad reminder that our regulatory system failed to learn from the COVID crisis the value of cheap, rapid in-home testing, or to adapt to the new reality that many Americans now know how to do such testing.
Dr. Michael Mina, the chief science officer for the at-home testing company eMed, said the FDA tends to have strict requirements for over-the-counter tests. The agency often asks manufacturers to conduct studies that demonstrate that people can administer at-home tests properly — a process that may cost millions of dollars and delay the test’s authorization by months or years, Mina said.
“It’s taken a very long time in the past to get new self-tests authorized, like HIV tests or even pregnancy tests,” he said. “They’ve taken years and years and years and years. We have a pretty conservative regulatory approach.”
Rapid tests for Covid, RSV and the flu are available in Europe. Why not in the U.S.?
Aria Bendix | NBC News -
Huggingface: security vulnerability?
Anyone who's done any AI work is familiar with Huggingface. They are a repository of trained AI models and a maintainer of AI libraries and services that have helped push forward AI research. It is now considered standard practice for research teams with something to boast about to publish their models to Huggingface for all to embrace. This culture of open sharing has helped the field make its impressive strides in recent years and helped make Huggingface a "center" of that community.
However, this ease of use and availability of almost every publicly accessible model under the sun comes with a price. Because many AI models require additional assets as well as the execution of code to properly initialize, Huggingface’s own tooling could become a vulnerability. Aware of this, Huggingface has instituted their own security scanning procedures on models they host.
But security researchers at JFrog have found that, even with such measures, a number of models hosted on Huggingface exploit gaps in its scanning and allow for remote code execution. One example model they identified baked a "phone home" functionality into a Pytorch model, which would initiate a secure connection between the server running the AI model and another (potentially malicious) computer (seemingly based in Korea).
The JFrog researchers were also able to demonstrate that they could upload models which would allow them to execute other arbitrary Python code which would not be flagged by Huggingface’s security scans.
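For readers wondering how merely loading a model file can run code at all, here is a minimal, self-contained sketch of the general mechanism: Python's pickle __reduce__ hook, which PyTorch's default serialization has historically relied on. This is my own toy example, not JFrog's actual payload:
import os
import pickle

# Illustration only: pickle lets an object define __reduce__, which tells the
# unpickler to call an arbitrary function with arbitrary arguments at load time.
class NotReallyAModel:
    def __reduce__(self):
        # A real attack would open a reverse shell or exfiltrate data here.
        return (os.system, ("echo 'this ran just by loading the file'",))

with open("model.pkl", "wb") as f:
    pickle.dump(NotReallyAModel(), f)

# The "victim" only loads the file -- and the command above executes anyway.
with open("model.pkl", "rb") as f:
    pickle.load(f)
Code language: Python (python)
This is why scanners pay so much attention to pickle-based formats, and why safer serialization formats have been gaining ground.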
While I think it's a long way from suggesting that Huggingface is some kind of security cesspool, the research reminds us that so long as a connected system is both popular and versatile, there will always be some security risk, and it's important to keep that in mind.
As with other open-source repositories, we’ve been regularly monitoring and scanning AI models uploaded by users, and have discovered a model whose loading leads to code execution, after loading a pickle file. The model’s payload grants the attacker a shell on the compromised machine, enabling them to gain full control over victims’ machines through what is commonly referred to as a “backdoor”. This silent infiltration could potentially grant access to critical internal systems and pave the way for large-scale data breaches or even corporate espionage, impacting not just individual users but potentially entire organizations across the globe, all while leaving victims utterly unaware of their compromised state.
Data Scientists Targeted by Malicious Hugging Face ML Models with Silent Backdoor
David Cohen | JFrog blog -
Nope, the Dunning-Kruger Effect is just bad statistics
The Dunning-Kruger effect encapsulates something many of us feel familiar with: that the least intelligent oftentimes assume they know more than they actually do. Wrap that sentiment in an academic paper written by two professors at an Ivy League institution, throw in some charts and statistics, and you've got an easily citable piece of trivia to make yourself feel smarter than the person you just caught commenting on something they know nothing about.
Well, according to this fascinating blog post (HT: Eric), we have it all wrong. The way Dunning and Kruger constructed their statistical test was designed to always produce a positive relationship between skill and perceived ability.
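You can see the statistical artifact for yourself in a few lines. The following is my own minimal simulation (it assumes numpy and is not code from the post), in which skill and perceived ability are completely unrelated random numbers:
import numpy as np

rng = np.random.default_rng(42)
n = 10_000
x = rng.uniform(0, 100, n)  # "actual skill": pure random noise
y = rng.uniform(0, 100, n)  # "perceived ability": independent random noise

print(round(np.corrcoef(x, y)[0, 1], 2))      # ~0.0: no real relationship
print(round(np.corrcoef(x, y - x)[0, 1], 2))  # ~-0.71: the "effect" appears anyway
Code language: Python (python)
Even though x and y are independent, the "skill gap" y - x correlates with x at roughly -0.7, simply because the gap has -x baked into it.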
The whole thing is worth a read, but the key demonstration is this: using completely randomly generated numbers (where there is no relationship between perceived ability and skill), you will always find a relationship between the "skill gap" (perceived ability minus skill) and skill. To put it plainly, the charts effectively plot y - x against x, with y being perceived ability and x being actual measured ability. What you should be looking for is a relationship between perceived ability and measured ability directly (y against x), and when you do that with real data, you find that the evidence for such a claim generally isn't there! In other words:
The Dunning-Kruger effect also emerges from data in which it shouldn’t. For instance, if you carefully craft random data so that it does not contain a Dunning-Kruger effect, you will still find the effect. The reason turns out to be embarrassingly simple: the Dunning-Kruger effect has nothing to do with human psychology. It is a statistical artifact — a stunning example of autocorrelation.
The Dunning-Kruger Effect is Autocorrelation
Blair Fix | Economics from the Top Down