Tuesday, March 3, 2015

Hot under the (epidemiological) collar

Blogs like this are venues for expressing views on the current scene, in our case, related to genetics, evolution and a few other things we throw in.  If you express a view, unless it's just plain vanilla, you will irritate some readers.  In a sense, if you don't then there's no point in writing the blogpost.  In this case, we heavily criticized the recent NYTimes article reporting that the government has now backed off its claim that dietary cholesterol is a heart disease risk factor.  We try to be responsible, but that doesn't mean we have to expect agreement or to mince words!

We argued that the herky-jerky yes/no results from huge long-term megastudies are so common as to show that the studies are rather useless.  We think they should be phased out, the results to date archived for anyone who wants to mine them, and the funds put to something that actually generates more trustworthy and stable results (if risks are stable enough to be estimated in these ways).

Well, this generated a very heated message from an old friend, a prominent genetic epidemiologist, who said that listening to us would amount to throwing the baby out with the bathwater.  He was upset because, he said, it was not the data but the analysis of these big epidemiological (environmental or genetic) studies that was at fault.  The studies are based essentially on correlation or regression models that assume everyone starts out as an equal blank slate, and that each individual's risk is basically the additive total of his or her various risk-factor exposures.  Your sex gives you a 'dose' of risk, which age adds to, then smoking history, diet, and so on.  Once all your risk-factor measures are toted up, your net risk can be estimated.  It is this general approach that, regardless of sophisticated details in the statistical method, is not good at finding what we really should be looking for.
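
For readers who like to see the 'toting up' spelled out, here is a minimal sketch of an additive risk score of the kind described above.  The coefficient values and factor names are hypothetical, chosen only for illustration and not taken from any study; the logistic link is one common choice, not necessarily what a given megastudy used.

```python
import math

# Hypothetical log-odds contributions; illustrative values, not estimates from any study.
BASELINE_LOG_ODDS = -4.0
COEFFICIENTS = {
    "male": 0.5,                   # the 'dose' of risk from sex
    "age_decades": 0.4,            # added per decade of age
    "pack_years": 0.03,            # added per pack-year of smoking
    "high_cholesterol_diet": 0.2,  # added if the diet is high in cholesterol
}

def additive_risk(person):
    """Sum each exposure times its coefficient, then map the log-odds to a probability."""
    log_odds = BASELINE_LOG_ODDS + sum(
        COEFFICIENTS[name] * person.get(name, 0.0) for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-log_odds))

person = {"male": 1, "age_decades": 6, "pack_years": 20, "high_cholesterol_diet": 1}
print(f"estimated risk: {additive_risk(person):.3f}")
```

Note what the sketch cannot express: every factor shifts the log-odds by the same amount for everyone, whatever else is present.  That is exactly the assumption our friend objects to.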

Our friend's idea is that, for example, dietary cholesterol may on average not be harmful, but there are likely some subsets of the population for which it is a risk.  Standard models may be convenient to apply, and everyone knows how to do that....but they miss the boat.  The key problem is basically that risk factors interact and searching for complex interaction is rarely done because it is very demanding in terms of sample size, sample structure, and analytic tools.  But there is no reason, for example, to think that males and females respond to a given risk factor equally per exposure dose.  So interactions ('epistasis' in relation to genome elements)  are given very light treatment and basically wished away.
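
A small worked example may help show how an average can hide a susceptible subset.  All the numbers here are made up for illustration (a 10% susceptible subgroup, a 5% baseline risk); nothing is estimated from real data.

```python
# Hypothetical numbers chosen only to illustrate the argument; not from any study.
SUBGROUP_FRACTION = 0.10                # share of the population that is susceptible
BASE_RISK = 0.05                        # risk without exposure, the same for everyone
RISK_IF_SUSCEPTIBLE_AND_EXPOSED = 0.15  # exposure triples risk in the susceptible subgroup
RISK_IF_RESISTANT_AND_EXPOSED = 0.05    # exposure does nothing in everyone else

def population_risk(exposed):
    """Average risk over both subgroups for a given exposure status."""
    if not exposed:
        return BASE_RISK
    return (SUBGROUP_FRACTION * RISK_IF_SUSCEPTIBLE_AND_EXPOSED
            + (1 - SUBGROUP_FRACTION) * RISK_IF_RESISTANT_AND_EXPOSED)

average_relative_risk = population_risk(True) / population_risk(False)
subgroup_relative_risk = RISK_IF_SUSCEPTIBLE_AND_EXPOSED / BASE_RISK

print(f"relative risk, whole population: {average_relative_risk:.2f}")
print(f"relative risk, susceptible subgroup: {subgroup_relative_risk:.2f}")
```

Under these invented numbers the population-wide relative risk comes out at about 1.2, the sort of weak signal that flips between studies, while the susceptible tenth faces a relative risk of about 3.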

My irate friend basically argued that what is needed is not an end to the data but to the methods. 
One recent paper I was referred to applies a method for finding high-risk subsets, and has references to earlier descriptions of the methodology: Frikke-Schmidt R et al., "Subgroups at high risk for ischaemic heart disease: identification and validation in 67 000 individuals from the general population", Int J Epidemiol. 2015 Feb;44(1):117-28 (unfortunately it is not freely available).

How effectively this will find really different subgroups is an open question.  We know, for example, that males and females are not at equal risk and respond differently to other factors, as mentioned above.  We know genomic components interact.  We know that as you get older you get closer to various risks, such as heart disease or cancer, and that the same exposure has different impacts with age, and so on.

The idea both in public health and in medicine (and in evolutionary inference) of identifying causation as effectively as possible, and that includes identifying high risk individuals as early as possible, is of course absolutely the right thing.  There are many instances of genetic risk factors like some variants in the gene responsible for cystic fibrosis, or in the BRCA1 gene related to breast cancer, where the Who Cares? principle applies: the single factor's effect is so predictably strong that one intervenes regardless of the details of how much risk is associated or what other factors might slightly modify the outcome.  We know that the nominal risk factor (e.g., a mutation) doesn't always lead to the same degree of severity, but the variation isn't enough to cause doubt: Who Cares about the details?

But whether in general this sort of method of searching for statistical associations can identify risk early enough, where preventive measures might be more helpful, is unclear.  We know about age and sex and smoking and so on, and maybe we don't really gain much from adjusting the exact values.  Or maybe we would.  Likewise for the complex interactions among hundreds of contributing genomic factors.  But there the number of factors and the assumption of independence and so on need to be recognized as being as daunting as they are, relative to statistical risk analysis.

In my view, this is still walking in dreamland.  The major factors will be identified, perhaps with more precision, but we face a huge, open-ended kind of 'multibody' problem.  It's just not possible to analyze all the possible combinations of factors and their interactions to get combination-specific risk estimates.  First, risks are contingent, one factor's effect depending on what else is present, as discussed above.  Second, not all combinations will show up in the data, even in huge samples, so estimating risks if interactions must be accounted for will simply come up short, or perhaps better-put, with unknown or even unknowable precision.
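
The arithmetic behind 'not all combinations will show up' is easy to sketch.  This assumes, for simplicity, that each risk factor is a yes/no variable and that the sample is spread perfectly evenly over the combinations, which is the best case; the sample size is a hypothetical megastudy.

```python
def cells(n_binary_factors):
    """Number of distinct combinations of yes/no risk factors."""
    return 2 ** n_binary_factors

def mean_people_per_cell(sample_size, n_binary_factors):
    """Average people per combination if the sample were spread perfectly evenly."""
    return sample_size / cells(n_binary_factors)

SAMPLE_SIZE = 1_000_000  # hypothetical megastudy size

for k in (5, 10, 20, 30):
    print(f"{k} factors: {cells(k)} combinations, "
          f"{mean_people_per_cell(SAMPLE_SIZE, k):.4f} people per combination")
```

By 20 yes/no factors there are already more combinations than people in a million-person sample, and real factors are neither binary nor evenly distributed, which makes matters worse.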

Third, we know very well even from just the recent few decades, that incidence of outcomes changes hugely, yet the genomes basically don't, and while we may estimate the effects of the particular combinations of environments our sampled individuals were exposed to, we simply cannot, even in principle, know what environmental factors current individuals for whom we are being promised precise predictions, will be exposed to.  Yet heritabilities, with all their problems, clearly show that genomes contribute typically far less than half of all risk.  Environmentally and genomically, specific factors or variants come and go, and no two people are identical or even close to it.

A major issue is not just that there is no way, even in principle, to know what risks are associated with most risk factors, much less with interactions among them, but that we have no way of knowing the degree of precision of predictions.   This is why, among other things, even if increasing our understanding is a very noble pursuit, promising 'precision' in prediction based on genomes or, really, almost any other risk factors, is irresponsible.

Monday, March 2, 2015

When even well-posed questions are hard to answer

On Friday, in acknowledgement of Rare Disease Day, our daughter Ellen blogged about living with a rare disease.  She wrote eloquently about her wish to understand why she has this disease, including, if it's a single gene disorder, knowing the causal variant.  She wrote about the advantages of this when navigating a medical system that isn't always sensitive to rare diseases, but in which genetics has become the gold standard.  We fully support her wish to understand why she has this disease, and have tried to help as much as we can. We would do the DNA work ourselves if we could.

Even so, she mentioned that her parents, Ken and I, are skeptics about a lot of genetic research.  Yes, that's true, but another word for that is 'realist'.  We are alive at a time in history when more is known about genes and genomes than ever before, and for decades we've been hearing promises of what this new knowledge will mean for medicine, and the promises roll on.  Once we all have our genomes on a disk, we'll be able to predict and treat whatever it is our DNA foretells.

Lazuli Bunting; rare birds in Central Pennsylvania; Wikipedia, Leander Sylvester Keyser

Except, except, Ellen's genome is on a disk.  Or at least her exome, the protein-coding parts of her genome.  Her disease, hypokalemic periodic paralysis, is one of several forms of periodic paralysis, which have been found to be associated with three different ion channel genes.  At one time a researcher in Germany was offering free genotyping to anyone diagnosed with the disease.  Ellen sent  blood samples, but was told that she doesn't have any of the known causal variants in these genes. She was also involved in a large whole exome study of unexplained Mendelian disease, but all they were able to tell her was that she doesn't have any potentially causal de novo mutations, mutations that neither Ken nor I have.  She is the only family member with HKPP, and as such, the initial question in a search for the cause is whether she has a variant that we don't have, that might be responsible.

And, to her frustration, that is all she knows.  It has been suggested that she go the clinical genetics route, having her DNA tested for known causes of HKPP, but that seems unlikely to be helpful, given that she knows what disease she has, just doesn't know why, and clinical labs don't look for new causal genes or variants, but instead a battery of those that are known.

Ellen has classic symptoms and classic triggers, and her disease is pretty well controlled at the moment, so identifying the cause, as she wrote in her post, might not change her treatment, but it would ease her mind about future dealings with the medical system.  As importantly, it might help future patients avoid the lengthy, destructive diagnostic odyssey she herself experienced, which itself would be a very satisfying outcome.

Big Data advocates will say that the problem is that not enough people with HKPP have been sequenced, and once we've got a million genomes or more, that will facilitate identifying Ellen's and others' causal variants.  But only 1 in 200,000 people have HKPP, so one million is unlikely to help.  And, though the data are rather sparse, some estimates based on those data suggest that a fairly large minority, a third or so, won't have one of the known causal genetic variants.  As with most diseases, the phenotypes vary greatly, and again as with most diseases, this is likely to be because every genome is unique, and genetic background matters, along with exposure to other triggering factors.
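
To put rough numbers on that, here is a back-of-the-envelope sketch using only the figures quoted above (a prevalence of 1 in 200,000, and 'a third or so' of cases lacking a known variant).  It assumes the sequenced cohort is a random sample of the population, which real cohorts are not.

```python
PREVALENCE = 1 / 200_000                # figure quoted in this post
COHORT_SIZE = 1_000_000                 # the 'million genomes' scenario
FRACTION_WITHOUT_KNOWN_VARIANT = 1 / 3  # 'a third or so', as quoted in this post

def expected_cases(cohort_size, prevalence):
    """Expected number of affected people in a randomly sampled cohort."""
    return cohort_size * prevalence

cases = expected_cases(COHORT_SIZE, PREVALENCE)
unexplained = cases * FRACTION_WITHOUT_KNOWN_VARIANT

print(f"expected HKPP cases in the cohort: {cases:.1f}")
print(f"expected cases without a known variant: {unexplained:.1f}")
```

That is about five cases in a million genomes, of whom fewer than two would be expected to lack a known variant: far too few to find a new causal gene by association.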

Perhaps there's an as-yet unidentified gene that would explain many of the unidentified cases, or there are many unique pathways to the disease, or both, but given the rarity and the heterogeneity of the periodic paralyses, it would take a huge amount of luck for even a large database to answer Ellen's question.  We should perhaps call it dumb luck, because the investigators vacuum up generic data without specific regard to, say, the physiology of this particular disorder (and the same for countless other disorders).  Of course, collecting data on every possible physiological or environmental factor, mostly with weak individual effects, isn't possible and that is a dilemma for modern public health science.

In addition, it's known from affected families that penetrance of alleles related to the periodic paralyses is not 100% -- some people with a 'causal' variant never experience an attack, making associating genotype with phenotype even harder.  Again, genetic background may affect this but, as with many genetic disorders with variable penetrance, it's not at all clear.  Incomplete penetrance is a fact, but also a fudge factor, because it leaves the impression the trait really is 'genetic'; in fact, we often don't know how many people have such mutations but no symptoms at all, because they aren't screened (but some studies looking for such asymptomatic cases have easily found them, and they can be as common as the 'causal' mutations in affected patients).

Further, it's possible that there are non-ion channel related causes of these channelopathies.  That is, something upstream is going wrong.  In that case, it's unclear where to even begin to look for genetic causation.  Thus, hypothetically in this instance, ion channels respond to the ionic concentrations inside the cell and in its environs.  Factors that affect the ion concentrations themselves could lead to effects similar to ion channel defects per se.  Thus, again just surmising, there are known environmental stimuli for attacks but these may affect the ion concentrations themselves, not the channel protein function.  And, of course, both could be at work, which would be rather expected given the many precedents for disease complexity.

And, it's possible that Ellen's disease is polygenic, or not genetic at all, though given that many cases of periodic paralysis, including in families, seem to have a single genetic cause, this seems unlikely.

Genetics asks two basic questions: What causes disease X?  And, who will get it?  The promises of the past few decades are that answers to both these questions are just around the corner for most diseases.  The NIH Office of Rare Diseases Research reports that there are 7000 known rare diseases (diseases that each affect fewer than 200,000 people in the United States).  The cause of many of these diseases has been identified, and by some criteria over 6000 specific genes have been associated with some usually rare single-gene disorder.  In many cases, it's possible to predict who will get the disease, and that is where genetic counseling is so useful.  It is, in our view, also where our limited research resources should be directed.  

But, if you read MT at all regularly, you know what we think about the promise of predicting common, complex diseases with genes.  Current science is very far from answering the two simple questions, what causes common, complex disease X?, and who will get it?  And, you know that we think that's because these questions can't be answered in any way approximating the promise of, say, precision medicine.  

But single-gene disorders are a different kind of problem.  What causes Ellen's HKPP? That seems to be a well-posed question, and should be answerable.  But to date, it hasn't been.  Labs are reporting 25-30% success with identifying the cause of rare genetic diseases (some somewhat higher success rates), so she is not at all unique.   We commented last week on the problem of identifying specific at-risk subgroups more effectively than blanket epidemiological studies currently can.

Are we skeptics?  Or are we realists?  When even the 'easy' cases, like Ellen's, the low-hanging fruit, are hard, what does this mean about the promises for genomics?  

Friday, February 27, 2015

The story of a rare disease

By Ellen Weiss

Despite being the product of  two of the authors of this blog – two people skeptical about just how many of the fruits of genetic testing that we've been promised will ever actually materialize  – I have been involved in several genetic studies over the years, hoping to identify the cause of my rare disease.

February 28 is Rare Disease Day (well, Feb 29 technically; the last day of February which is, every four years, a rare day itself!), the day on which those who have, or who advocate for those who have, a rare disease publicly discuss what it is like to live with an unusual illness, raise awareness about our particular set of challenges, and talk about solutions for them.

I have hypokalemic periodic paralysis, which is a neuromuscular disease; a channelopathy that manifests itself as episodes of low blood potassium in response to known triggers (such as sodium, carbohydrates, heat, and illness) that force potassium from the blood into muscle cells, where it remains trapped due to faulty ion channels.  These hypokalemic episodes cause muscle weakness (ranging from mild to total muscular paralysis), heart arrhythmias, difficulty breathing or swallowing and nausea.  The symptoms may last only briefly or muscle weakness may last for weeks, or months, or, in some cases, become permanent.

I first became ill, as is typical of HKPP, at puberty.  It was around Christmas of my seventh grade year, and I remember thinking to myself that it would be the last Christmas that I would ever see.  That thought, and the physical feelings that induced it, were unbelievably terrifying for a child.  I had no idea what was happening; only that it was hard to breathe, hard to eat, hard to walk far, and that my heart skipped and flopped all throughout the day.  All I knew was that it felt like something terrible was wrong.

Throughout my high school years I continued to suffer. I had numerous episodes of heart arrhythmia that lasted for many hours, that I now know should've been treated in the emergency department, and that made me feel as if I was going to die soon; it is unsettling for the usually steady, reliable metronome of the heart to suddenly beat chaotically. But I was bound within the privacy teenagers are known for and reluctant to talk about what was happening in my body, so my parents struggled to make sense of my new phobic avoidance of exercise and other activities.

HKPP is a genetic disease and causal variants have been found in three different ion channel genes.  Although my DNA has been tested, the cause of my particular variant of the disease has not yet been found.  I want my mutation to be identified.  Knowing it would likely not improve my treatment or daily life in any applicable way.  I'm not sure it would even quell any real curiosity on my part, since, despite having the parents I have, it probably wouldn't mean all that much to this non-scientist.  

But I want to know, because genetics has become the gold standard of diagnostics.  Whether it should be or not, a genetic diagnosis is considered to be the hard-wired, undeniable truth.  I want that proof in my hand to give to physicians for the rest of my life.  And of course, I would also like to contribute to the body of knowledge about HKPP in the hopes that future generations of us will not have to struggle with the unknown for so many years.

For many people, having a rare disease means having lived through years of confusion, terrible illness, misdiagnoses, and the pressure to try to convince skeptical or detached physicians to engage in investigating their suffering.

I was sick for all of my adolescent and young adult years; so sick that I neared the edge of what was bearable.  The years of undiagnosed, untreated chaos in my body created irrevocable changes in how I viewed myself and my life.  It changed my psychology, induced serious anxiety and phobias, and was the backdrop to every single detail of every day of my life.  And yet, it wasn't until I was 24 years old that I got my first clinical clues of what was wrong.  An emergency room visit for arrhythmia revealed very low blood potassium.  Still, for 4 more years I remained undiagnosed, and there was horrible suffering during which my loved ones had to take care of me like a near-infant, accompanying me to the hospital, watching me vomit, struggle to eat or walk to the bathroom, and waking up at 3am to take care of me.  For 4 more years I begged my primary physician and countless ER doctors during desperate visits to investigate what was going wrong, asked them to believe that anxiety was a symptom not a cause, and scoured medical information myself, until I was diagnosed.  It wasn't until I was 28 that I found a doctor who listened to me when I told him what I thought I had, made sense of my symptoms, recognized the beast within me, and began to treat me.

My existence, while still stained to a degree every day by my illness, has improved so immeasurably since being treated properly that the idea of returning to the uncontrolled, nearly unbearable sickness I once lived with frightens me very much.  I fear having to convince physicians of what I know of my body again.

What I went through isn't all that uncommon among the millions of us with a rare disease.  Lengthy periods of misdiagnoses, lack of diagnoses, begging well-meaning but stumped, disbelieving, or truly apathetic physicians to listen to us are common themes.  These lost years lay waste to plans, make decisions for us about parenthood, careers, and even whether we can brush our own teeth.  They induce mistrust, anxiety, exhaustion.

Each rare disease is, of course, by definition rare.  But having a rare disease isn't. Something like 10% of us have one.  It shouldn't be a frightening, frustrating, lengthy ordeal to find a physician willing to consider that what a patient is suffering from may be outside of the ordinary, since that isn't all that unlikely.  Mathematically, it only makes sense for doctors to keep their eye out for the unusual.

I hope that one day the messages we spread on Rare Disease Day will have swept through our public consciousness enough that they will penetrate the medical establishment.  Until then, I will continue to crave the irrefutable proof of my disorder.  I will continue to worry about someday lying in a hospital bed, weak and verging on intolerably sick, trying to convince a doctor that I know what my body needs, a fear I am certain many of my fellow medically-extraordinary peers share.

And that is why I, this child of skeptics, seek answers, hope and proof through genetics.

Thursday, February 26, 2015

Digesting yeast's message

A new paper in Nature by Levy et al. reports on the genomic consequences of large-scale selection experiments in yeast.  Yeast reproduce asexually and clones can be labeled with DNA 'barcode' tags and followed in terms of their relative frequency in a colony over time.  This study was able to deal with very large numbers of yeast cells and because they used barcodes the investigators could practicably follow individual clones without needing to do large-scale genome sequencing.  Prior to this, this sort of experiment was prohibitively costly and laborious.  So the authors add to findings in selection experiments using bacteria or flies and so on, where mostly aggregate responses could be identified.

In this case, nutrient stress was imposed, and as beneficial mutations occurred and gave their descendant cells (identified by their barcode) an advantage, the dynamics of adaptation could be followed.  The authors showed, in essence, that at the beginning the fitness of the overall colony increased as some clones, bearing advantageous mutations, rose rapidly in relative frequency.  Then, the overall colony fitness stabilized and subsequent advantageous mutations were largely kept at low frequency (most eventually went extinct).  But overall, the authors found thousands of colonies with different advantageous variants; most fitness effects were of only a small (or, for the majority, very small) percent.  Once a set of large numbers of 'fit' variants had become established, new ones had a difficult time making any difference, and hence staying around very long.
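
Why would a late-arising advantageous mutation struggle when an early one swept?  A toy model gives the intuition.  This is a minimal deterministic sketch with made-up fitness values and starting frequencies; it ignores drift and ongoing mutation, and it is not the analysis Levy et al. performed.

```python
def next_generation(freqs, fitness):
    """One round of selection: each clone grows in proportion to fitness / mean fitness."""
    mean_w = sum(f * w for f, w in zip(freqs, fitness))
    return [f * w / mean_w for f, w in zip(freqs, fitness)]

def run(freqs, fitness, generations):
    """Apply selection repeatedly and return the final clone frequencies."""
    for _ in range(generations):
        freqs = next_generation(freqs, fitness)
    return freqs

# Clone 0: ancestor.  Clone 1: early beneficial mutant.
# Clone 2: a later mutant with the same 5% advantage over the ancestor (hypothetical values).
fitness = [1.00, 1.05, 1.05]

# Phase 1: the early mutant starts at one in a million and competes with the ancestor.
early = run([1 - 1e-6, 1e-6, 0.0], fitness, 400)

# Phase 2: the late mutant enters at one in a million after the early clone has taken over.
late_start = [early[0] * (1 - 1e-6), early[1] * (1 - 1e-6), 1e-6]
late = run(late_start, fitness, 400)

print(f"early mutant after phase 1: {early[1]:.4f}")
print(f"late mutant after phase 2:  {late[2]:.2e}")
```

The early mutant rises to dominate because it is fitter than the population average.  By the time the late mutant appears, the average has caught up, so the same advantage over the ancestor buys it almost nothing and it stays near its starting frequency.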

This study will be of value to those interested in evolutionary dynamics, though I think the interpretation may be rather more limited than it should be, for reasons I'll suggest below.  But I would like to comment on the implications beyond this study itself.

Who cares about yeast (except bakers, brewers, and a few labs)?  You should!
This is interesting (or not) you might say, depending on whether you're running a yeast lab, or in the microbrew or bakery business. But there are important lessons for other areas of science, especially genomics and the promises being made these days.  Of course, the lesson isn't a pleasant one (which, you might correctly assume, is why we're writing about it!).

This study has important implications for basic evolutionary theory perhaps, but also for much that is going on these days in human biomedical (and also evolutionary) genetics, where causal connections between genomic genotypes and phenotypes are the interest.  In evolution, selection only works on what is inherited, mainly genotypes, but if causation is too complex, the individual genotype components have little net causal effect and as a result are hardly 'seen' by selection, and evolve largely by chance.  That's important because it's very different from Darwin's notions and the widespread idea that evolution is causally rather simple or even deterministic at the gene level.

Put another way, genomic causation evolved via the evolutionary process.  If natural selection didn't or couldn't refine causation to a few strong-effect genes, that is, to make it highly deterministic at the individual gene level, then biomedical prediction from genome sequences won't work very effectively.  This is especially true for traits, disease or otherwise, that are heavily affected by the environment (as most are) or for late-onset traits that were hardly present in the past or arose post-reproductively and hence didn't affect reproductive fitness and are not really 'specified' by genes.

There was considerable genomic variation between the authors' two replicate yeast experiments.  As one might say, meta-analysis would have some troubles here.  Likewise, from cell lineage to cell lineage, different sets of mutations were responsible for the fitness of the lineage in this controlled, fixed environment. This means that even in this very simplified set-up, genomic causation was very complex.  No 'precise' yeastomic prognostication!

In real biological history, even for yeast and much more so for sexually reproducing species in variable environments, selection has never been unitary or fixed, and genomes are much more complex. Until very recently, human populations have been very much smaller than the 10^8 of the yeast experiments, and recent population expansion will make the number of low-frequency variants much greater and, with recombination, make individuals vastly more genomically unique.

The bottom line here is that our traits should be much less predictable from genotypes than traits in yeast. We have not reached, nor did our ancestors ever reach, the kind of fitness equilibrium reached in the yeast study under controlled selection, and fixed environments.

Somatic mutation
The authors also compare the large numbers of cells whose evolution they were able to follow with their barcode-tagging method, to the evolution of genetic variation in cancer and microbial infections, where there are even larger numbers of cells in an affected person and, importantly, clones expanding because of advantageous mutations. From the yeast results, these clonal advantages may not generally be due to one or two specific mutations (with perhaps, hopefully, exceptions when chemotherapy or antibiotics exert far stronger selection than was imposed in the yeast experiment). But the general complexity of such clonal expansions presents major challenges, because they may end up with descendant branches distributed throughout the body where even in principle the responsible variation can't be directly assessed.

But the implications go far beyond cancer.  As we've recently posted, cancer is a clear but perhaps only a single manifestation of a more general phenotypic relevance of the accumulation of somatic mutations, that occur in body cells during life and can in aggregate have systemic or organismal-level implications.  The older we get the more likely we are to generate such clones, all over the body, and it seems likely that they can become manifest not just as individually ill-behaving cells, but as disease for the whole person.

But it's not just late onset implications that the yeast work may forebode.  There are already huge numbers of cells in the early embryo and fetus whose even huger descendant clades of cells during life grow many, many fold by adulthood.  There is no reason not to expect that each of us will carry clades that include differently-than-normal functioning cells in our tissues.  Let age, environmental exposure, and further mutations add to this and disease or age-related degeneration can result.  Yet none of this can be detected in the usual individual's 'genome' as currently viewed.  This is a potentially important fact that, for practical reasons or what one might call reasons of convenience, is ignored in the wealth of mega-sequencing projects being lobbied for based on genome sequencing (precision prediction being the most egregious claim).

So a bit of brewer's yeast may be telling us a lot--including a lot that we don't want to hear. Inconvenient facts can be dismissed.  Oh, well, that's just yeast!  They evolve differently!  That was just a lab experiment!  Brewers and bakers won't even care!

So let's just ignore it, as if it only applies to those rarefied yeast biologists.  Eat, drink, and be merry!

Wednesday, February 25, 2015

Survival of the safest: Darwinian conservatism, not derring-do

A mantra for many in life science is 'survival of the fittest'.  This phrase, one Darwin liked and used many times after he saw its use by Herbert Spencer, reflects Darwin's view of life as a relentlessly competitive phenomenon.  To Darwin, life was an unending struggle for survival (and reproduction) among individuals in every species all the time.  Natural selection, a relentless force like Newtonian gravity, always identified the 'fittest', weeding out the others.

Darwin's objective was to show how new characters could arise, that were suited--'fitted'--to their environment, without the intervention of God via special creation events.  Because organisms, all and always, were struggling against each other for limited resources, they 'tried' (via their inherited genomic drivers) to be better, different, more exploitive of environmental opportunities than their fellows.  Dare to be different!

But in perhaps fundamental ways, Darwin had it very wrong, perhaps inverted from what is really going on.  We know this from the analysis of genomes, the presumed source of all evolutionary evidence, since everything that's inherited goes back, at least indirectly and usually directly, to information carried in DNA.

When DNA sequences are compared within or between species, there are segments that are seen to have very little variation among the sequences, and segments with much more variation.  Now we have learned how to identify truly functional parts like coding exons, transcription start sites, introns, promoter and some regulatory regions, functional RNAs (like tRNA, rRNA and so on), telomeres, and so on.  And we have also identified many parts (the majority, actually) that have far less obvious or strong function, if indeed any function at all.  So what do we see?

The clear, consistent pattern is that the more strongly functional, the more highly conserved (there are a few exceptions, like sensory system genes in the olfactory and immune systems, but even their variation proves the rule).  The less functional or non-functional regions vary much more, both within and between species.
Herd of dairy goats, Polymeadows Farm; photo A Buchanan

This has been seen so consistently, that for many purposes (like the ENCODE project to characterize all DNA elements) sequence conservation is the very definition of biological function.  The reason is that evolution conserves function but doesn't care about bits that have no function.  One can quibble about the details, but the main gist of the message seems unequivocally correct.  But this now near-dogmatic principle has some little-digested implications.
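
For the curious, 'conservation' can be made concrete with a very simple per-position score.  The sketch below uses a made-up alignment of four short sequences, in which the first six positions stand in for a 'functional' segment; it is a crude identity measure for illustration only, not the method ENCODE or anyone else actually uses.

```python
from collections import Counter

def column_conservation(aligned_seqs):
    """Fraction of sequences sharing the most common base at each aligned position."""
    n = len(aligned_seqs)
    return [Counter(col).most_common(1)[0][1] / n for col in zip(*aligned_seqs)]

# Made-up aligned sequences (hypothetical): positions 0-5 identical, positions 6-11 variable.
alignment = [
    "ATGGCCTTAGCA",
    "ATGGCCTAAGTA",
    "ATGGCCGTCGCT",
    "ATGGCCTTTACA",
]

scores = column_conservation(alignment)
functional_mean = sum(scores[:6]) / 6
other_mean = sum(scores[6:]) / 6

print(f"mean conservation, 'functional' segment: {functional_mean:.2f}")
print(f"mean conservation, remaining segment:    {other_mean:.2f}")
```

The point is only that conservation is something one measures from variation among sequences; inferring function from it is the further step that the near-dogma takes for granted.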

Darwinian evolution
The problem Darwin wanted to solve was to explain the differences among species, and the way they were suited to their ways of life, in terms of historical processes rather than Divine creation.  He had a deep sense of geological change and biogeography from his trip on the Beagle, that showed the evidence of local relationships that suggested common ancestry.  And then he had an idea of a law of Nature, an ineluctable force-like process of adaptive change, the way gravity is a force, that would gradually form the kinds of differences that characterized species.

'Natural selection' was the name he gave to that force.  And because it was a force, like the way gravity is a force, it could detect the tiniest differences among competing organisms and favor them to produce the next generation.

The idea is that species always over-reproduce relative to their resources (an idea that was already 'in the air' in Britain at the time), and struggle to obtain what become limited resources.  Because of inherited variation, the individuals with the best genotype (to use our term for it) reproduced, while their poor lesser peers fell to what Tennyson called 'Nature, red in tooth and claw'.

Darwin's idea was that selection always favored the innovator: relentless striving to be different from the herd, to get the scarce food or mate supply.  As the late thinker Leigh Van Valen suggested, the Darwinian struggle was like the Red Queen in Through the Looking-Glass--always running as fast as she could, but never getting ahead because the competition was always trying to out-do her with its own adaptations.

This is so entrenched in the biological and evolutionary literature that it may be surprising to realize how different it is from what we see in the actual data--the genetic data, our most precise indicator--about how evolution works.

Or is this Darwinian?
What we actually see in the genetic data is not the kind of chaotic variation that an intense, force-like, relentless struggle to be better than your peers would lead us to expect.  Instead, what we see is what can only be called herd behavior at the genome level.  Our ancestors did, and our contemporaries do, their very best to stay with the herd.  The high conservation of functional DNA sequence suggests that mutational variation is mainly harmful to fitness, that what selection really favors is conformism. Don't be very different, or you'll get pruned away from your species' posterity!

Herd of flamingos, the Camargue, France; A Buchanan
Instead of 'survival of the fittest', what is mainly going on, by far, is 'survival of the safest'.  Stay with the mean.  Why is that?  A standard answer, and likely an accurate one, is that we today are the product of a long past in which the traits we bear were able to survive.  We're very complex organisms, so what evolution hath joined, let no one put asunder except at their peril!

Survival of the safest might suggest that there is no innovation, which is clearly not correct, since different species have different adaptations--fish swim, cats eat meat, bats fly, we write blog posts. So clearly differences do arise and have been favored regularly in the past.  However, that seems inconsistent with the high level of genomic conservatism that is so predictably identified.

One way, perhaps the major way, that these apparent contradictions are reconciled is this:  As Darwin stressed repeatedly, evolutionary change is very, creepingly slow.  That means that either there are occasional short bursts of rapid, major change, brought about by largely catastrophic changes in circumstances, or, only a very minor 'ooze' of the distribution of traits occurs in some favored direction from one generation to the next.  This is imperceptibly slow at any given time, because being near the mean is still the safest place to be.  Just be a tiny bit different.

Survival of the safest is Darwinian in that it is a form of natural selection.  Indeed, it is a lot more Darwinian than Darwin was himself.  Selection is more probabilistic and less force-like than he thought (he lived still in Newton's shadow), but it is always at work as he said it was.  It's just that it's mainly at work removing rather than favoring what is different.  Every geneticist knows this, but it is far from thoroughly integrated into the common view of evolution even by professionals.

If a trait were being strongly driven by selection in some new direction all the time, as in the more exclusive connotation of survival of the fittest, we might expect that only a few variants with strong effect in the favored direction would be contributing to its newly adaptive instances.  Mapping the variation in the trait would perhaps yield a rather simple genetic causal picture as a result.

But if a trait is being roughly maintained, by survival of the safest, pruning away serious deviants, then any genotypes that are consistent with being somewhere near the average can stay around, with individual variants coming and going by chance (genetic drift).  Variants conferring trait values too far from the mean are pruned by selection.  But most variants can hang around.  Mapping would reveal very large numbers of contributing variants across the genome.  And there would not be precise predictability from genotype to phenotype.

This is what we see in biology.
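The contrast can be made concrete with a toy simulation.  This is only a minimal sketch under assumptions of our own choosing, not a model of any real population: haploid genomes, 50 loci with hypothetical effects of +1 or -1, an optimum trait value of 0, and a hard cutoff (the `window` parameter) that prunes anyone too far from that optimum.  All names and numbers here are illustrative.

```python
import random

def simulate(n_individuals=300, n_loci=50, generations=20, window=2,
             mutation_rate=0.001, seed=1):
    """Toy model of stabilizing selection on an additive trait (illustrative only)."""
    rng = random.Random(seed)
    # Hypothetical effect sizes: alternating +1 / -1, so the optimum trait value is 0.
    effects = [1 if i % 2 == 0 else -1 for i in range(n_loci)]
    # Haploid genomes; each locus starts at an allele frequency of about 0.5.
    pop = [[rng.randint(0, 1) for _ in range(n_loci)] for _ in range(n_individuals)]

    def trait(genome):
        return sum(e * a for e, a in zip(effects, genome))

    for _ in range(generations):
        # Survival of the safest: prune anyone too far from the optimum.
        survivors = [g for g in pop if abs(trait(g)) <= window]
        if not survivors:  # extremely unlikely; keeps the sketch from crashing
            survivors = pop
        # Each offspring takes every locus from one of two random surviving
        # parents, with a small chance that mutation flips the allele.
        pop = []
        for _ in range(n_individuals):
            mom, dad = rng.choice(survivors), rng.choice(survivors)
            child = [rng.choice((m, d)) for m, d in zip(mom, dad)]
            child = [1 - a if rng.random() < mutation_rate else a for a in child]
            pop.append(child)

    # Summarize the final generation after one more round of pruning.
    survivors = [g for g in pop if abs(trait(g)) <= window] or pop
    mean_trait = sum(trait(g) for g in survivors) / len(survivors)
    segregating = sum(
        1 for i in range(n_loci)
        if 0 < sum(g[i] for g in survivors) < len(survivors)
    )
    return mean_trait, segregating

mean_trait, segregating = simulate()
print(f"mean trait among survivors: {mean_trait:.2f}")
print(f"loci still polymorphic: {segregating} of 50")
```

In a run like this the trait stays pinned near the optimum while many loci remain variable, so many different genotypes yield much the same phenotype.  Changing `window` or `mutation_rate` changes the details, and the hard cutoff is a deliberate oversimplification of how selection works.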

Survival of the safest in daily life, too
'Survival of the safest' thus seems to be a better metaphor for adaptive biological evolution.  But if you think about it, you'll see much the same in most aspects of our society. We may say heady things like 'dare to be different!', but those who dare to be very different are quickly punished by being ignored or directly slapped down.  This is true through history. It's the general fact in religion, government, social behavior.  And it's true in science, too.  Business as usual is safe, real innovation is a threat to the established.  We see the press of society to claim to be different, but not really to be different.  Innovation mostly means incremental change trumpeted with exaggerated verbiage.  We may even think we want major, rapid change, but emotionally we shy from it, and feel too nervous about what it might mean for our own current state.

Organizations and sociocultural and political systems are very slow to change, or they change in herd-like fashion.  This is true even in realms, like the business world, where one often hears a rather self-satisfied pronouncement that allowing free-market Darwinian competition is the way to get innovation. Innovation is often claimed, but much less often really major.  It's true in science, too: everyone is playing grantsmanship, to seem different to draw attention or funds, in this case, but you dare to be very different at your peril, lest reviewers suspect, distrust, fail to grasp, or envy your idea.  Some rapid change may occur, but it's not so common relative to the inertia of survival of the safest.

That is what we see in society.

Tuesday, February 24, 2015

Causation revisited again

A paper* published recently in The Medical Journal of the Islamic Republic of Iran ("X-ray radiation and the risk of multiple sclerosis: Do the site and dose of exposure matter?" Motamed et al.) explores the possibility that X-rays are a risk factor for multiple sclerosis (MS).  (Do we routinely read this journal?  No.  Ken sent me a pdf of the paper, and when I asked him where he'd gotten it, he said he thought I'd sent it to him.  Which I had not.  On looking back at the email, he finds that it contained no actual message, just the pdf, and not even an identifiable sender.  Creepy spam? I guess we'll find out.  But until our computers are taken over by bots, despite its iffy provenience, the paper does bring up some interesting questions.)

From the paper abstract:
Methods: This case-control study was conducted on 150 individuals including 65 MS patients and 85 age- and sex-matched healthy controls enrolled using non-probability convenient sampling. Any history of previous Xray radiation consisted of job-related X-ray exposure, radiotherapy, radiographic evaluations including chest Xray, lumbosacral X-ray, skull X-ray, paranasal sinuses (PNS) X-ray, gastrointestinal (GI) series, foot X-ray and brain CT scanning were recorded and compared between two groups. Statistical analysis was performed using independent t test, Chi square and receiver operating characteristics (ROC) curve methods through SPSS software. 
Results: History of both diagnostic [OR=3.06 (95% CI: 1.32-7.06)] and therapeutic [OR=7.54 (95% CI: 1.59-35.76)] X-ray radiations were significantly higher among MS group. Mean number of skull X-rays [0.4 (SD=0.6) vs. 0.1 (SD=0.3), p=0.004] and brain CT scanning [0.9 (SD=0.8) vs. 0.5 (SD=0.7), p=0.005] was higher in MS group as well as mean of the cumulative X-ray radiation dosage [1.84 (SD=1.70) mSv vs. 1.11 (SD=1.54) mSv; p=0.008].
So, it was a very small study, and the confidence intervals are correspondingly wide, but the odds ratios were large and statistically significant, particularly for therapeutic X-ray, for which dosage is likely to be higher than for diagnostic X-ray.
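To get a feel for how imprecise those estimates are, one can back out the standard error of the log odds ratio from the reported 95% limits.  This is a minimal sketch, assuming the intervals were computed symmetrically on the log scale (a common convention, but not something the abstract states); the function name is ours.

```python
import math

def log_or_standard_error(lower, upper, z=1.96):
    """Standard error of log(OR) implied by a 95% CI, assuming a log-scale Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

# Odds ratios and 95% limits as reported in the abstract quoted above.
reported = {
    "diagnostic": (3.06, 1.32, 7.06),
    "therapeutic": (7.54, 1.59, 35.76),
}

for label, (odds_ratio, lower, upper) in reported.items():
    se = log_or_standard_error(lower, upper)
    # If the log-scale assumption holds, the geometric midpoint of the limits
    # should land close to the reported point estimate.
    midpoint = math.sqrt(lower * upper)
    print(f"{label}: OR={odds_ratio}, implied SE(log OR)={se:.2f}, "
          f"upper/lower={upper / lower:.1f}, geometric midpoint={midpoint:.2f}")
```

For the therapeutic exposure the upper limit is more than twenty times the lower one, which is the kind of imprecision one expects from a study of 150 people.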

Chest x-ray; Wikipedia

And this isn't the only study, in fact, that has found an association between X-ray and MS.  Axelson et al. find a similar link in Sweden, described here, also in a very small study.  But this just about exhausts the reports of such a link.  The problem is that the risk of MS is small (the National Multiple Sclerosis Society estimates that there are 400,000 people in the US with MS, or about 1/1000) relative to the number of people getting X-rays, therapeutic or diagnostic.  This means that even if X-ray is causal, and the odds ratios (relative risk) in these small studies are fairly large, the actual (absolute) risk is minuscule.
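The arithmetic behind that last point is easy to check.  The sketch below is a rough illustration only: it treats the 1/1000 prevalence figure as if it were the baseline risk among unexposed people, and takes the reported odds ratios at face value as causal.  Both are simplifying assumptions of ours, and the function name is hypothetical.

```python
def risk_from_odds_ratio(baseline_risk, odds_ratio):
    """Convert a baseline risk and an odds ratio into the risk among the exposed."""
    baseline_odds = baseline_risk / (1 - baseline_risk)
    exposed_odds = odds_ratio * baseline_odds
    return exposed_odds / (1 + exposed_odds)

baseline = 1 / 1000  # assumed baseline risk, borrowed from the prevalence figure above

for label, odds_ratio in (("diagnostic", 3.06), ("therapeutic", 7.54)):
    exposed = risk_from_odds_ratio(baseline, odds_ratio)
    print(f"{label}: risk rises from {baseline:.2%} to {exposed:.2%} "
          f"(absolute increase {exposed - baseline:.2%})")
```

Under these assumptions, even the larger odds ratio leaves the absolute risk below 1%, so more than 99 of every 100 exposed people would still never develop MS.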

The cause of MS is unknown.  Many hypotheses have been considered -- it may be immune-related, or viral, or genetic, or perhaps environmental, and some believe lack of vitamin D is a prime candidate.  But as with other complex diseases, with the kind of varying and complex phenotype that is seen with MS, it's possible that there are numerous causes, and/or numerous triggers, rather than a single one.  So, if X-ray really is causal, perhaps it causes some tissue irritation that stimulates immune response, triggering some over-response that contributes to MS risk.  Thus, it's conceivable that X-ray is in fact contributory in some cases.  But one can imagine many such explanations.

But this again raises a larger question, one we've been blogging about off and on, well, forever, but recently, including last week, and again yesterday with respect to the new dietary recommendations, that no longer include cautions against eating foods high in cholesterol.  Why is it so hard to determine the cause of so many diseases?  Why don't we yet know the cause of MS, or heart disease, or obesity, or many other common diseases?  Essentially, it comes down to the fact that our methods for determining causation just aren't good enough when every case is different.

In the near future, we'll write about this issue in the context of how epidemiology is done these days.

*Here's the link, if you, too, want to chance it.  http://mjiri.iums.ac.ir/browse.php?a_code=A-10-1-758&slc_lang=en&sid=1&sw=sclerosis

Monday, February 23, 2015

When the methodology fails

An op-ed piece by Nina Teicholz in Friday's NYTimes lays it on the line, chastising the government for its regular bulletins on dietary advice that, for 50 or so years, have altered what we eat, what we fear to eat, and what the risks are.  Now, new studies tell us that what was bad is good and what was good is bad, and that the prior half-century of studies were wrong.  We've eliminated fats and cholesterol, and replaced them with carbohydrates, but, as Teicholz writes,
...recent science has increasingly shown that a high-carb diet rich in sugar and refined grains increases the risk of obesity, diabetes and heart disease — much more so than a diet high in fat and cholesterol.
But why should we believe these new studies?  Teicholz basically takes the underlying methodology to task.  Yet she has written a book recommending that we eat more fats ("The Big Fat Surprise: Why Butter, Meat and Cheese Belong in a Healthy Diet"), and those recommendations are based on the very same faulty methodology as the recommendations with which she, and the current USDA advisory committee, find fault.

Embrace the fat! (Wikipedia)

The same, almost exactly the same, critiques are earned by many of the 'big data' genomics studies (and other long-term go-not-very-far megaprojects).  It is the statistical correlation methodology.  When many factors are studied at once (perhaps properly since many factors, genetic and environmental, are responsible for health or other traits), we can't expect simple answers.  We can't expect correlation to imply causation.  We can't expect replication.  We can't predict the risk factors that the people who receive advice based on such studies will face in the future.

The real conclusion is to shut down the nutrition megaprojects at Harvard (singled out by the op-ed) and the other genetics and public health departments that have been running them for decades, and do something different.  The megaprojects have become part of the entrenched System, with little or no real accountability.

Pulling the plug would be a major acknowledgment of failure by the feds for what they funded, the program officers for defending weak portfolios and their budgets, the universities defending their overhead and prestige projects and, of course, the investigators who are either simply unable to recognize what they're doing, or too dishonest and self-protecting to come clean about it.  And then they and their students could go on to do something actually productive.

Of course such a multi-million dollar threat will be resisted, and that's why the usual answer to the kinds of conflicting, confusing reports that so often come out of these megaprojects is to increase their size, length and, geez, what a surprise!, their cost.  To keep funding the same investigators and their protégés.  This is only to be expected, and many people's jobs are covered by the relevant grants, a genuine concern.  However, research projects are not supposed to be part of a welfare system, but to solve real problems.  And the same people's skills could be put to better use, addressing real problems in ways that might be more effective and accountable.

And we used to laugh at the Soviets' entrenched, never successful, Five Year Plans!

It is a public misappropriation that is taking place.  Yes, there are health problems we wish to avoid, and government and universities are set up to identify them and recommend changes.  But, for most of today's common chronic diseases, lifestyle changes would largely do the trick.

But then, that would just let people live longer to get diseases that might be worse, even if at older ages.  And meanwhile we aren't putting on a full court press for things that really are genetic, or really do have identifiable life-style causes.

Much of this research is being done at taxpayer expense.  We should let the people keep their money, or we should spend it more effectively.  We won't be able to do the latter until we admit, formally and fully, that we have a problem.  Given vested and entrenched interests, getting that to happen is a very hard trick to pull off.