Another month, another paper (although this one is almost two weeks overdue – sorry!)
In my life in venture capital, I’ve started looking more seriously at new bioinformatics technologies, so I decided to dig into a topic that is right up that alley. This month’s paper from Nature Biotechnology covers the use of next-generation DNA sequencing technologies to look into something which had previously been extremely difficult to study.
As the vast majority of human DNA is the same from person to person, one would expect that the single positions in our genetic code which tend to vary the most from person to person, commonly known as Single Nucleotide Polymorphisms, or SNPs, would be the biggest driver of the variation we see in the human race (at least the variations that we can attribute to genes). This paper from researchers at the Beijing Genomics Institute (now the world’s largest sequencing facility – yes, it’s in China) adds another dimension to this – it’s not just SNPs that make us different from one another: humans also appear to have a wide range of variations on an individual level in the “structure” of our DNA, what are called Structural Variations, or SVs.
Whereas SNPs represent changes at the individual DNA code level (for instance, turning a C into a T), SVs are examples where DNA is moved (e.g., between chromosomes), repeated, inverted (i.e., large stretches of DNA reversed in sequence), or subject to deletions/insertions (i.e., where a stretch of DNA is removed from or inserted into the original code). Yes, at the end of the day, these are changes to the underlying genetic code, but because of the nature of these changes, they are more difficult to detect with “old school” sequencing technologies, which rely on starting at one position in the DNA and “reading” a stretch of DNA from that point onward. Take the example of a stretch of DNA that is moved – unless one of your “reads” happens to cross an edge of the moved stretch (or the spot it was cut out of), you’d never know, as the DNA would read normally everywhere else, including in the middle of the moved fragment.
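To make the difference concrete, here is a toy sketch in Python. Everything in it is made up for illustration (the 28-letter "genome", the function names, the positions and the read length of 6), and the inversion is simplified to a plain reversal rather than a true reverse complement. It shows each type of change as a string operation, then checks which short reads from a genome with a moved stretch fail to match the reference.

```python
# Toy illustration only: a 28-base "genome", not real data.
reference = "ACGTTGCAAGGCTTACCGGATATCGGCA"

def snp(seq, pos, base):
    """Change a single letter (a SNP-style change)."""
    return seq[:pos] + base + seq[pos + 1:]

def delete(seq, start, end):
    """Remove the stretch seq[start:end]."""
    return seq[:start] + seq[end:]

def insert(seq, pos, fragment):
    """Insert a fragment before position pos."""
    return seq[:pos] + fragment + seq[pos:]

def invert(seq, start, end):
    """Reverse a stretch in place (simplified: no complementing)."""
    return seq[:start] + seq[start:end][::-1] + seq[end:]

def move(seq, start, end, dest):
    """Cut seq[start:end] and paste it at index dest of what remains."""
    fragment = seq[start:end]
    return insert(delete(seq, start, end), dest, fragment)

def unmatched_reads(sample, ref, k):
    """Start positions of length-k reads from sample that never occur in ref."""
    return [i for i in range(len(sample) - k + 1) if sample[i:i + k] not in ref]

# Hypothetical parameters: move a 10-base stretch, then take 6-base reads.
START, END, DEST, K = 4, 14, 12, 6
sample = move(reference, START, END, DEST)

# The only places where the sample differs from the reference locally:
# where the stretch was cut out, and both edges of where it was pasted.
junctions = [START, DEST, DEST + (END - START)]

suspicious = unmatched_reads(sample, reference, K)
print("read starts that fail to match the reference:", suspicious)
print("junction positions in the sample:", junctions)
```

Every read that fails to match crosses one of the three junctions; reads taken from the middle of the moved stretch look perfectly normal, which is the detection problem described above.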
What the researchers figured out is that new sequencing technologies let you tackle the problem of detecting SVs in a very different way. Instead of approaching each SV separately (trying to structure your reading strategy to catch these modifications), why not use the fact that so-called “next generation sequencing” is far faster and cheaper to read an individual’s entire genome and then look at the overall structure that way?
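As a rough sketch of the "read everything, then compare" idea: once you have a whole assembled sequence, finding insertions and deletions becomes a matter of lining it up against a reference and listing the differences. The snippet below uses Python's standard `difflib.SequenceMatcher` as a stand-in aligner on a toy sequence. This is my own illustration and not the method from the paper; the function name `call_variants` and the sequences are made up, and real genome comparison needs purpose-built alignment tools.

```python
from difflib import SequenceMatcher

def call_variants(reference, assembled):
    """List the differences between two sequences.

    Returns tuples of (kind, reference_position, reference_text, assembled_text),
    where kind is 'insert', 'delete' or 'replace'.
    Illustrative only: difflib is a generic text differ, not a genome aligner.
    """
    matcher = SequenceMatcher(None, reference, assembled, autojunk=False)
    variants = []
    for kind, r1, r2, a1, a2 in matcher.get_opcodes():
        if kind != "equal":
            variants.append((kind, r1, reference[r1:r2], assembled[a1:a2]))
    return variants

# Toy data: the same made-up 28-base reference, plus two altered copies.
reference = "ACGTTGCAAGGCTTACCGGATATCGGCA"
with_insertion = reference[:10] + "TTTTTT" + reference[10:]
with_deletion = reference[:4] + reference[14:]

for name, assembled in [("insertion", with_insertion), ("deletion", with_deletion)]:
    print(name, call_variants(reference, assembled))
```

Note that the reported position of a deletion or insertion can shift by a base or so when the neighbouring letters are identical, since several placements describe the same change equally well.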
And that’s exactly what they did (see figures 1b and 1c above). They applied their sequencing technologies to the genomes of an African individual (1c) and an Asian individual (1b) and compared them to some of the genomes we have on file. The circles above map out the chromosomes for each of the individuals on the outer-most ring. On the inside, the lines show spots where DNA was moved or copied from place to place. The blue histogram shows where all the insertions are located, and the red histogram does the same thing with deletions. All in all: there looks to be a ton of structural variation between individuals. The two individuals each had roughly 80,000-90,000 insertions, 50,000-60,000 deletions, 20-30 inversions, and 500-800 copy/moves.
The key question that the authors don’t answer is what sort of effect these structural variations have on us biologically. That is mainly because the paper was about explaining how they did this approach and how they know it is a valid one (details which I heavily glossed over here, partly because I’m no expert). The authors did a little hand-waving to show, with the limited data that they have, that humans seem to have more rare structural variations than rare SNPs – in other words, that you and I are more likely to have different SVs than different SNPs. It is a weak but intriguing argument that structural variations drive a lot of the gene-related variation between people, but it remains to be validated.
Suffice it to say, this was an interesting technique with a very cool “million dollar figure”, and I’m looking forward to seeing further research in this field as well as the new uses that researchers and doctors dig up for the new DNA sequencing technology that is coming our way.
(All figures from paper)
Paper: Li et al., “Structural Variation in Two Human Genomes Mapped at Single-Nucleotide Resolution by Whole Genome Assembly.” Nature Biotechnology 29 (Jul 2011) — doi:10.1038/nbt.1904