Wednesday, December 31, 2014

A change of pace and a little humor, from Boccaccio

We spend the year writing (ranting?) about the passing scene in science, the good and the bad, the wheel-spinning and the hopeful. But we thought we'd end the year with a tale of good cheer, and of judgments no more ignoble than what we see in science, though having nothing to do with science--but a lot more fun. This is from Giovanni Boccaccio's Decameron (1350), which consists of 100 tales told by a group of ten lusty young noble men and women, who have retreated from Florence to the countryside to escape the plague in 1348, and who entertain themselves by each telling a story on each day of their ten-day retreat.  This one is tale number two on their ninth day:

"I have to tell of a young nun, who by a happy retort, and the favour of Fortune, delivered herself from imminent peril. And as you know that there are not a few most foolish folk, who, notwithstanding their folly, take upon themselves the governance and correction of others; so you may learn from my story that Fortune at times justly puts them to shame; which befell the abbess, who was the superior of the nun of whom I am about to speak.

You are to know, then, that in a convent in Lombardy of very great repute for strict and holy living there was, among other ladies that there wore the veil, a young woman of noble family, and extraordinary beauty. Now Isabetta--for such was her name--having speech one day of one of her kinsmen at the grate, became enamoured of a fine young gallant that was with him; who, seeing her to be very fair, and reading her passion in her eyes, was kindled with a like flame for her: which mutual and unsolaced love they bore a great while not without great suffering to both. But at length, both being intent thereon, the gallant discovered a way by which he might with all secrecy visit his nun; and she approving, he paid her not one visit only, but many, to their no small mutual solace. But, while thus they continued their intercourse, it so befell that one night one of the sisters observed him take his leave of Isabetta and depart, albeit neither he nor she was ware that they had thus been discovered.

The sister imparted what she had seen to several others. At first they were minded to denounce her to the abbess, one Madonna Usimbalda, who was reputed by the nuns, and indeed by all that knew her, to be a good and holy woman; but on second thoughts they deemed it expedient, that there might be no room for denial, to cause the abbess to take her and the gallant in the act. So they held their peace, and arranged between them to keep her in watch and close espial, that they might catch her unawares. Of which practice Isabetta recking, witting nought, it so befell that one night, when she had her lover to see her, the sisters that were on the watch were soon ware of it, and at what they deemed the nick of time parted into two companies of which one mounted guard at the threshold of Isabetta's cell, while the other hasted to the abbess's chamber, and knocking at the door, roused her, and as soon as they heard her voice, said:--"Up, Madam, without delay: we have discovered that Isabetta has a young man with her in her cell."

Now that night the abbess had with her a priest whom she used not seldom to have conveyed to her in a chest; and the report of the sisters making her apprehensive lest for excess of zeal and hurry they should force the door open, she rose in a trice; and huddling on her clothes as best she might in the dark, instead of the veil that they wear, which they call the psalter, she caught up the priest's breeches, and having clapped them on her head, hied her forth, and locked the door behind her, saying:--"Where is this woman accursed of God?" And so, guided by the sisters, all so agog to catch Isabetta a sinning that they perceived not what manner of headgear the abbess wore, she made her way to the cell, and with their aid broke open the door; and entering they found the two lovers abed in one another's arms; who, as it were, thunderstruck to be thus surprised, lay there, witting not what to do.

The Abbess "began giving her the severest reprimand that ever woman got."  Decameron, Day 9, Tale 2.  Drawn by KW

The sisters took the young nun forthwith, and by command of the abbess brought her to the chapter-house. The gallant, left behind in the cell, put on his clothes and waited to see how the affair would end, being minded to make as many nuns as he might come at pay dearly for any despite that might be done his mistress, and to bring her off with him. The abbess, seated in the chapter-house with all her nuns about her, and all eyes bent upon the culprit, began giving her the severest reprimand that ever woman got, for that by her disgraceful and abominable conduct, should it get wind, she had sullied the fair fame of the convent; whereto she added menaces most dire.

Shamefast and timorous, the culprit essayed no defence, and her silence begat pity of her in the rest; but, while the abbess waxed more and more voluble, it chanced that the girl raised her head and espied the abbess's headgear, and the points that hung down on this side and that. The significance whereof being by no means lost upon her, she quite plucked up heart, and:--"Madam," quoth she, "so help you God, tie up your coif, and then you may say what you will to me." Whereto the abbess, not understanding her, replied:--"What coif, lewd woman? So thou hast the effrontery to jest! Think'st thou that what thou hast done is a matter meet for jests?" Whereupon:--"Madam," quoth the girl again, "I pray you, tie up your coif, and then you may say to me whatever you please." Which occasioned not a few of the nuns to look up at the abbess's head, and the abbess herself to raise her hands thereto, and so she and they at one and the same time apprehended Isabetta's meaning.

Wherefore the abbess, finding herself detected by all in the same sin, and that no disguise was possible, changed her tone, and held quite another sort of language than before, the upshot of which was that 'twas impossible to withstand the assaults of the flesh, and that, accordingly, observing due secrecy as theretofore, all might give themselves a good time, as they had opportunity. So, having dismissed Isabetta to rejoin her lover in her cell, she herself returned to lie with her priest. And many a time thereafter, in spite of the envious, Isabetta had her gallant to see her, the others, that lacked lovers, doing in secret the best they might to push their fortunes."

Now that's making merry! Of course, in science we really should play by the rules, but perhaps even then we can have as much joy as did these nuns and their partners. So enjoy this New Year's Eve, and may your next year be as joyful wherever you spend your time, as it was for them!

Monday, December 29, 2014

The Big " 'Scuse me!" on Mars

People rarely like to talk openly about flatulence--it's a topic that, one might say, just doesn't smell right.  But it's a normal digestive function, a release from what otherwise might do a person or animal in.  So when it makes the headlines, we as responsible bloggers can't just ignore it.

What we refer to, of course, is the Big Fart that NASA is now reporting to have discovered on Mars: the identification of methane bursts, as described by Webster et al. in the Dec 16 issue of Science, as well as all over the popular media, including here at the NYTimes.  NASA says methane can only be present for two reasons.  Kenneth Chang in the NYT writes "It could have been created by a geological process known as serpentinization, which requires both heat and liquid water. Or it could be a product of life in the form of ancient microbes known as methanogens, which release methane as a waste product." We can joke about some buried covey of hibernating Green Giants (or Green Sheep) dormant underground, cowering in shelter from the harsh surface conditions, but expelling from time to time.  This would overstate what is being suggested, sure, but it points to some important issues.  So, with our usual level of skepticism, we can try to outline what we see as some of them.

Public domain image of Mars, 1980.

The hint or even suggestion of course is that a burst of methane may signal not just that life used to be on Mars, but that life is still there, hidden under the surface (is it a hypothesis of last resort, since there isn't any on the surface?).  An important point is that methane doesn't last very long before degrading.

What is Life?
There are, however, some tiny details that are worth discussing.  Why do we assume that 'life' means organisms of complex nature using Earth-like metabolism and presumably based on DNA and RNA or its equivalent?  That is, why should  life on Mars be the same as life on Earth?  Are 'microbes' like ours inevitable if there is life?

There are many arguments over what life 'is'.  If one thinks of it in terms of self-reinforcing proliferation via chemical capture of solar energy by partially isolated subunits, like cells or other structures, then it is said that only carbon or silicon could be the basis of it.  That's different from, say, mere crystal growth.  Chemists have also shown the fundamental physical reasons why lipid layers form and how the ATP-based processing of energy works and is fundamental to life here on Earth.  Some believe that this is the only way such things might happen.  There are also diverse, and disputed, ideas about where the first life on Earth, the primordial 'soup', was--perhaps in undersea hot geothermal vents and so on. Again, we can't pass judgment on that.  Indeed, some seem to have argued that RNA and DNA will inevitably be the basis, and the means, of the evolution of life, no matter where it occurs.

But this is all derived from what we know about life on Earth.  Why does it have to be the same on Mars?  And, if methane is an indication of life, to complicate things further, there is also the issue that, since there appear to be no multi-celled organisms on Mars, methane there would have to be the result of bacteria digesting other complex material produced by life.  That is, the microbes have to be eating the result of some other DNA-based life.  If not some other, higher forms of life, are the methanogens digesting their dead ancestors, or is there, or was there, the detritus of other complex life to eat?

Assuming that's what's going on, one can then construct scenarios by which natural selection would lead to cellular organisms on Mars (such as found around 3.5 billion years ago on earth), even if there aren't any subterranean gaseous little green Martian sheep.  If such microbes were truly there, it would be a genuinely interesting finding and something NASA could not be criticized for reporting, nor the NYTimes or Science criticized for highlighting. But these are large assumptions. 

And indeed, as we noted above, there are apparently other ways that methane can be made or found on Mars--NASA itself points this out--and a large part of the December story about methane has to do with controversial previous reports that were apparently artifacts, which authors of this new report are saying they hereby resolve. So, the major purpose of this new flurry of news!! about methane on Mars is to establish the fact that there really is methane on Mars.   

But NASA and news outlets aren't ignoring the possible implications about life, so let's suspend all skepticism and ask whether this scenario could have any other implications.

What if it's true?
The idea would be that the same sort of evolution that has occurred on Earth also occurred on Mars.  It would have originated some half-billion or so years after Mars basically formed, and  DNA/RNA-based microbial life would have evolved, with the same sort of biochemistry as life on Earth, involving the release of methane.  Or was it underground Martian men, or methane-releasing cows....or even green?  But why would they be what the word 'microbe' tends to suggest--earth-like biological species?  Is it simply because, without any better imagination, one reasons that if there's methane there must be life--and that means life as we know it?

Mars’ origin was probably at a time not that different from the Earth’s, given the estimated age of the Sun (5 billion years).  But the Martian surface has been essentially dead, too cold and with atmospheric pressure too low to allow liquid water at least for the last 3.8 billion years (data from Wikipedia).  That means that the methane belchers either evolved way long ago and died out, leaving methane pockets to leak out still to this day (though methane degrades quickly, according to the news reports, as in within a few hundred years), or they somehow evolved the ability to live beneath the unlivable surface, and to move around (so as to evolve), for millions of years.

Further, as we noted above it amounts to assuming that Earth-like life is what was evolving there, with Earth-like metabolism.  But this evolution was either over and done with long ago—and hence occurred very much faster there than here, or has to be imagined as going on subterraneanly for that long—needing of course some source of energy other than sunlight (and hence something other than photosynthesis or perhaps even ATP-based storage), and probably needing liquid water.  And if still flatulent, how did the methane persist?

Methane on earth is produced by micro-organisms using the complex multi-part protein methyl-coenzyme M reductase (MCR), but of course this also involves chromosomes, DNA coding, DNA regulation, and RNA directly and even indirectly (e.g., those many genes involved in energy capture, nutrient capture, replication, protein processing, and much besides). It's fair to ask how likely, or even plausible, it could be that the same sort of reaction system would have evolved twice, or what the basis is for giving priority explanatory credence to saying that it is far more likely to be due to something else. 

But since, to us anyway, the odds seem very heavily stacked against independent origins, this would then raise the possibility that life originated elsewhere than here on Earth, either on Mars or somewhere else, and was seeded here or there, or both, from space. Many (including Francis Crick) have tried to make that case, and we know that molecules that life uses exist in space and have rained down on Earth (and presumably also on Mars). But there is no evidence that this rain is of other than very primitive molecules and nothing at all organized like real life. That is, no space probe or meteor or its like (or landing on a comet) has found DNA, RNA, complex proteins, or microbes.

Based on a great consistency among methods of analysis, all present life on Earth seems to have descended from a common beginning.  Of course that could have been seeded, even from Mars via meteors or whatever, but only one such seed ‘took’, it happened well over 3.5 billion years ago, and when and where it dropped, the Earth had to have been geochemically and environmentally ready for it, which means that whatever led to its evolution on Mars or elsewhere, conditions were similar enough here compared to its source, for the drop-ins to survive and evolve.  Or, the transport could have been from Earth to Mars, except that by the time bacteria evolved here Mars seems to have already become uninhabitable, even for microbes. 

If ready-made microbes rained down here, their forms would all have had to have evolved somewhere else with essentially similar conditions as here, and the microbes would have had to withstand the rigors (including mutagenic cosmic radiation) of light-years of deep space travel to get here.  And must then have found essentially immediate ways to live and replicate, wherever and whenever it was they landed, before dying due to conditions inimical to their needs.  And if from some other planet, then why does all life seem by various criteria to have originated here at 3.5 billion years?  Yes, after the fact we can construct all sorts of contorted explanations, but it's a post hoc stretch.  Quite a stretch.

Or, is it plausible to assume methane as a sign of life if we have to say as well that life with an entirely unrelated path, not protein or DNA dependent, say, evolved at the same time (and underground) on Mars, ending up as methane producers?  Why such exquisite parallelism?

While of course, we cannot rule any of this out (and chemists and geologists are now earnestly considering other methanogenic mechanisms that would explain the Martian finding), there is a massive burden of implausibility that reporters should be requiring NASA et al. to overcome before repeating the kinds of excited tales we’re seeing blazoned across the headlines.  The reporters may not have thought of the above sorts of issues, but the scientists have, or should have.  And it's reporters' duty to know of these sorts of things or else not write about them!  

The implication that very similar RNA/DNA/protein based life evolved in remarkably similar or even eerily coincidental ways, on Mars, or anywhere else with transport potential to Earth is a leap of credence of dramatic proportions, unless of course it’s just wishful thinking and the usual publicity stunt aimed at funders.

If the carefully caveated suggestions of life on Mars were to turn out to be true, then everyone will agree that it will be remarkable in unprecedented ways, and as fascinating as the evolution of life itself, here or anywhere.  The best discovery since ice cream, for sure.  But from what is being reported, there are far too many reasons to doubt the claims that a little runabout, on a trivial part of Mars’ surface, detected a bathroom odor from subterranean beasts.  Because the  proclamations just don't smell right.

Thursday, December 25, 2014

Christmas, again!

Here are some thoughts for the day, by William Wordsworth in 1820, written fondly to his brother, who had moved from the evocative country, where the poet lived, to London.  If Christmas is a day you celebrate, we wish you a warm and peaceful holiday.  If today is just another day to you, except that the stores are closed, we wish you a good day off.  Best wishes to all.

Christmas Minstrelsy
The minstrels played their Christmas tune
To-night beneath my cottage eaves;
While smitten by a lofty moon,
The encircling laurels thick with leaves,
Gave back a rich and dazzling sheen,
That overpowered their natural green.

Through hill and valley every breeze
Had sunk to rest with folded wings:
Keen was the air, but could not freeze
Nor check the music of the strings;
So stout and hardy were the band
That scraped the chords with strenuous hand.

And who but listened?--till was paid
Respect to every inmate's claim,
The greeting given, the music played
In honor of each household name,
Duly pronounced with lusty call,
And a merry Christmas wished to all.

The Lake District in Winter.  Source: uk

O Brother! I revere the choice
That took thee from thy native hills;
And it is given thee to rejoice:
Though public care full often tills
(Heaven only witness of the toil)
A barren and ungrateful soil.

Yet would that thou, with me and mine,
Hadst heard this never-failing rite;
And seen on other faces shine
A true revival of the light
Which nature, and these rustic powers,
In simple childhood, spread through ours!

For pleasure hath not ceased to wait
On these expected annual rounds,
Whether the rich man's sumptuous gate
Call forth the unelaborate sounds,
Or they are offered at the door
That guards the lowliest of the poor.

How touching, when at midnight sweep
Snow-muffled winds, and all is dark,
To hear--and sink again in sleep!
Or at an earlier call, to mark,
By blazing fire, the still suspense
Of self-complacent innocence;

The mutual nod--the grave disguise
Of hearts with gladness brimming o'er,
And some unhidden tears that rise
For names once heard, and heard no more;
Tears brightened by the serenade
For infant in the cradle laid!

Ah! not for emerald fields alone,
With ambient streams more pure and bright
Than fabled Cytherea's zone
Glittering before the Thunderer's sight,
Is to my heart of hearts endeared,
The ground where we were born and reared!

Hail, ancient manners! sure defence,
Where they survive, of wholesome laws:
Remnants of love whose modest sense
Thus into narrow room withdraws;
Hail, usages of pristine mould,
And ye that guard them, Mountains old!

Bear with me, Brother! quench the thought
That slights this passion or condemns;
If thee fond fancy ever brought
From the proud margin of the Thames,
And Lambeth's venerable towers,
To humble streams and greener bowers.

Yes, they can make, who fail to find
Short leisure even in busiest days,
Moments to cast a look behind,
And profit by those kindly rays
That through the clouds do sometimes steal,
And all the far-off past reveal.

Hence, while the imperial city's din
Beats frequent on thy satiate ear,
A pleased attention I may win
To agitations less severe,
That neither overwhelm nor cloy,
But fill the hollow vale with joy!

Wednesday, December 24, 2014

A carol for the lonely skeptic

Here's a seasonal carol, just to warm the hearts of those lonely skeptics, shivering in wintry isolation:

Rudolph the Red-haired Post-Doc

You know Next-Gen, and Affy for mapping exomic,
You know HapMap, Illumina, and 1000-Genomics,
But do you recall
That most famous doubter-of-all?

Rudolph the red-haired post-doc
Had a notion truly new!
And if 'twere ever followed
You’d have seen how fast it grew.

All of the genome mappers
Used to laugh and say he rants
They never let poor Rudolph
Join in getting any grants!

Then one frenzied deadline eve,
'Santa' came to say,
Rudolph with ideas bright,
Won't you take my funds tonight?

Then all the mappers followed,
As they shouted out this plea:
Rudolph, Oh, thinking post-doc
Please collaborate with me!

Tuesday, December 23, 2014

A season's pleading

I'm dreaming of a funded Christmas
Just like my mentors used to know
Where the pipets bustle
and post-docs hustle
To make everybody’s CV grow.

I'm striving for a grant this Xmas
With all the versions that I write
May my pleas be cogent and keen
So perchance my Christmas might be green!

Beautiful white Christmas.

Oh, the funding prospects are frightful
But I hope my grant’s delightful
And without a job I can go to:

Make it through! Make it through! Make it through!

It doesn't show signs of stopping
Though all our careers are dropping
The lights are turned way down, too:
Make it through! Make it through! Make it through!

The fire of hope is dying
And my opportunities flying
So with odds below point 0-two:
Make it through! Make it through! Make it through!

Monday, December 22, 2014

Seasonality of cooperative behavior in a large population of juvenile primates

My Grandpa used to read this on Christmas Eve, and most years we keep up the tradition.  It's an important first-person ethnographic account of the adaptive cooperative behavior displayed seasonally by many juveniles of our species. Enjoy...

Jest 'Fore Christmas
by Eugene Field (1850-1895)

FATHER calls me William, sister calls me Will,
Mother calls me Willie but the fellers call me Bill!
Mighty glad I ain't a girl---ruther be a boy,
Without them sashes curls an' things that's worn by Fauntleroy!
Love to chawnk green apples an' go swimmin' in the lake--
Hate to take the castor-ile they give for belly-ache!
'Most all the time, the whole year round, there ain't no flies on me,
But jest'fore Christmas I'm as good as I kin be!

Got a yeller dog named Sport, sick him on the cat.
First thing she knows she doesn't know where she is at!
Got a clipper sled, an' when us kids goes out to slide,
'Long comes the grocery cart, an' we all hook a ride!
But sometimes when the grocery man is worrited an' cross,
He reaches at us with his whip, an' larrups up his hoss,
An' then I laff an' holler, "Oh, ye never teched me!"
But jest'fore Christmas I'm as good as I kin be!

Gran'ma says she hopes that when I git to be a man,
I'll be a missionarer like her oldest brother, Dan,
As was et up by the cannibals that live in Ceylon's Isle,
Where every prospeck pleases, an' only man is vile!
But gran'ma she has never been to see a Wild West show,
Nor read the life of Daniel Boone, or else I guess she'd know
That Buff'lo Bill an' cowboys is good enough for me!
Excep' jest 'fore Christmas, when I'm as good as I kin be!

And then old Sport he hangs around, so solemn-like an' still,
His eyes they seem a-sayin': "What's the matter, little Bill?"
The old cat sneaks down off her perch an' wonders what's become
Of them two enemies of hern that used to make things hum!
But I am so perlite an' tend so earnestly to biz,
That mother says to father: "How improved our Willie is!"
But father, havin' been a boy hisself, suspicions me
When, jest 'fore Christmas, I'm as good as I kin be!

For Christmas, with its lots an' lots of candies, cakes an' toys,
Was made, they say, for proper kids an' not for naughty boys;
So wash yer face an' bresh yer hair, an' mind yer p's and q's,
And don't bust out yer pantaloons, and don't wear out yer shoes;
Say "Yessum" to the ladies, and "Yessur" to the men,
An' when they's company, don't pass yer plate for pie again;
But, thinkin' of the things yer'd like to see upon that tree,
Jest 'fore Christmas be as good as yer kin be!

Season's greetings 
to you and yours from me and mine...
As good as kin be.
(Photo by Juliet Dunsworth)

Friday, December 19, 2014

Survivorship bias and genetics

I was a mathematics major as an undergraduate.  However, not then or since have I been anything that one could call a mathematician.  At least, I hope I learned something about trying to think logically about life even if I never do equations.  But this interest led me to read a new book I was told of, called How Not to be Wrong: the Power of Mathematical Thinking, by Jordan Ellenberg (2014, NY, Penguin Press).

This is a popular rather than technical book, but it shows in interesting and serious ways how mathematical thinking can lead to improved understanding of the real world.  I think it has relevance to an important area in current evolutionary and biomedical or agricultural genetics.  So I thought I'd write a post about it.

Survivorship bias
Ellenberg begins his book with an illustration of how abstract logical thinking can solve important real-world problems in subtle ways.  In WWII a mathematics research group was asked by the Army to help decide where to put armor plating on fighter aircraft.  The planes were returning to base with scattered bullet holes from enemy fire and the idea was to put some protective plating where it would do the most good without adding cumbersome mileage-eating weight.  The mathematician suggested putting the plating where the bullet holes weren't.  This seemed strange until he explained that the bullet holes that were observed hadn't done much damage: bullets hitting elsewhere had brought the plane down, so those hits were never observed because the plane never returned to base.  The engine compartment was the case in point: a shot to the engine was fatal to the aircraft, but to the wings and body, much less so.
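To make the logic concrete, here is a toy simulation of the armor story. It is only a sketch: the three sections, the per-hit loss probabilities, and the function name `simulate_returning_holes` are made up for illustration and are not taken from Ellenberg's book or from any wartime data. Every section is hit equally often, but the holes we get to count are only those on planes that came home.

```python
import random

def simulate_returning_holes(n_planes=10_000, hits_per_plane=3, seed=0):
    """Tally hits per section on all planes, and on returning planes only."""
    # Hypothetical chance that a single hit to each section downs the plane.
    loss_prob = {"engine": 0.8, "fuselage": 0.1, "wings": 0.05}
    sections = list(loss_prob)
    rng = random.Random(seed)
    all_hits = dict.fromkeys(sections, 0)   # what actually happened
    seen_hits = dict.fromkeys(sections, 0)  # what the ground crew can count
    for _ in range(n_planes):
        # Enemy fire lands on each section with equal probability.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        # The plane returns only if it survives every one of its hits.
        survived = all(rng.random() >= loss_prob[s] for s in hits)
        for s in hits:
            all_hits[s] += 1
            if survived:
                seen_hits[s] += 1
    return all_hits, seen_hits

all_hits, seen_hits = simulate_returning_holes()
for s in all_hits:
    print(f"{s:9s} hits taken: {all_hits[s]:6d}   holes on returning planes: {seen_hits[s]:6d}")
```

Under these assumed numbers the engine takes about as many hits as the wings, yet it shows far fewer holes on the planes that make it back, which is why the sparsely holed area is the one to armor.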

This is a case of survivorship bias.  It can apply widely, and evolution and genetic causation provide instances where it seems likely to be a useful principle.  As geneticists we ask what the genes are whose variation causes variation in adaptive or biomedically interesting outcomes.  This is what genome mapping in its various forms is intended to identify.

Ironically, it seems, when we do experiments involving development or testing of genetic mechanisms by, say, knocking out a gene, or when we observe the major gene-usage switches that occur when some part of an embryo's body is forming, we can identify specific genes that seem to be very important.

Several pieces of evidence can suggest they are important.  One is the finding that the same gene is used in similar roles in very distantly related species (often, even, between humans and flies or even more distant species). Its usage has been conserved.  Secondly, there is usually far less variation within or between species in such genes than in what we believe to be non-functional or marginally functional parts of genomes. This seems to suggest that variation hasn't been tolerated by natural selection.  Thirdly, many congenital diseases in plants and animals including humans have proven to be due to the effects of variants, often newly arisen mutations, in a specific gene.  Most cases of diseases like Cystic Fibrosis, Phenylketonuria, Muscular Dystrophy, or Tay Sachs Disease are of this sort.  Some congenital traits like, say, eye or skin color, are also due to inheriting specific variants in at least relatively few genes.

Such findings at least indirectly fueled the fervor for mapping every trait one can define, with grand promises of discovering the genes 'for' the trait.  Conscientious investigators justified expensive mapping efforts by showing that their trait of interest had substantial heritability, for example, because trait-values were to a substantial extent correlated among close relatives in predicted patterns.  However, for most traits like diabetes, cancer, heart disease, or behavioral characteristics, such findings are few and far between.

Despite a welter of PR spin to the contrary, instead of dramatic findings of the expected (and promised) sort, what was found was that the traits were affected by variation in tens, hundreds, or even thousands of different parts of the genome. Even taking all these together, they typically only accounted for a fraction--usually a small fraction--of the estimated heritability.

What is this 'missing heritability'?

Evolutionary survivorship bias
A central theoretical idea is that a fundamental genomic criterion for showing biological function is sequence conservation.  Most evolution is purifying: what has been put together over billions of years is risky to change.  So most mutations in clearly functional areas of DNA are either neutral or deleterious.  As a result, more variation accumulates in non- or weakly functional DNA than in important genes.

What that means is that the variation we see misses what existed heretofore and hence is not a representative sample of all the variation that arises.  The idea can be that most variation in genomes is of major importance.  As a result, the tendency is to assume that non-conserved areas of the genome are non-functional.  This may be true, but it may be that our belief that conservation equals function is a corollary of a belief in strong Darwinian natural selection in molding traits.  In fact, most genomic variation is not of the highly conserved sort, but our analysis and explanation of functional genomics is biased by our predilection for ignoring less-conserved variation.

This can be seen as a kind of survivorship bias in that we assume that variation in non-conserved genome areas just doesn't survive for very long--isn't conserved because it has no function.  That's a kind of circular reasoning and has been, for example, highly contentious in the interpretation of the ENCODE project's objective to identify all causal elements in the genome, and in questions about whether selectively neutral variation exists at all. The same conceptual bias leads to reconstructions of evolutionary adaptive history that center on the conserved genes as if they were the genes that were involved.  Finally, important genes that were involved in a trait's evolution to its current state may no longer be involved, and hence not be considered because their role did not survive to be identified today.

Biomedical survivorship bias
The same sort of bias in ascertaining the spectrum of causal variation exists on the shorter life-time scale of biomedical genetics.   There is a big discrepancy between the clearly key role of genes identified in experimental and developmental genetics, together with the deeply conserved nature of those genes, and the general lack of 'hits' in those genes when genomewide mapping is done on traits those genes affect.

How can a gene be central to the development of the basis of a trait, and yet not be found in mapping to identify variation that causes failures of the trait?  Indeed, the basic finding of GWAS and most other mapping approaches is that the tens or hundreds or thousands of genome 'hits' have individually trivial effects.

The answer may lie in survivorship bias.  Like the lethality of bullets to the engine of a fighter, most variation in the main genes, those whose sequence is more highly conserved, is lethal to the embryo or manifest in pathology so clear that it never is the subject of case-control or other sorts of Big Data mapping.  In other words, genome mapping may be systematically, even inevitably, constrained to find small effects!  That's exactly the opposite of what's been promised, and the reason is that the promises were, psychologically or strategically, based on extrapolation of the findings of strong, single-gene effects causing severe pediatric disease--a legacy of Mendel's carefully chosen two-state traits.
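The sampling argument above can be illustrated with a toy simulation.  This is only a sketch under stated assumptions, not a model of real genomes: the exponential distribution of effect sizes, the cutoff `lethal_threshold`, and the function name `simulate_mapping` are all invented here for illustration.  Variants whose effect exceeds the cutoff are treated as lethal or overtly pathological, so they never reach a mapping sample.

```python
import random


def simulate_mapping(n_variants=10000, lethal_threshold=2.0, seed=0):
    """Draw variant effect sizes, then drop those too severe to ever be sampled."""
    rng = random.Random(seed)
    # Assumption: effect sizes of newly arising variants are exponential (mean 1.0).
    all_effects = [rng.expovariate(1.0) for _ in range(n_variants)]
    # Assumption: anything at or above the threshold is lethal or so clearly
    # pathological that it never enters a case-control study.
    observed = [e for e in all_effects if e < lethal_threshold]
    return all_effects, observed


def mean(values):
    return sum(values) / len(values)


all_effects, observed = simulate_mapping()
print(f"mean effect, all variants that arise: {mean(all_effects):.2f}")
print(f"mean effect, variants left to map:    {mean(observed):.2f}")
print(f"largest effect left to map:           {max(observed):.2f}")
```

Whatever the true distribution of effects, the mapped sample in this sketch can only contain effects below the cutoff, so its mean is smaller than the mean of everything that arose.  The numbers themselves mean nothing; the point is only that the filter, not the biology, sets the ceiling on what mapping can find.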

To the extent this is a correct understanding, then genomewide mapping as it's now being done is, from an evolutionary genomic perspective, necessarily rainbow-chasing.  Indeed, a possibility is that most adaptive evolution is itself also due to the effects of minor variants, not major ones.  Once the constraining interaction of the major genetic factors is in place, mostly what can nudge organisms in this direction or that, whether adaptively or in relation to complex, non-congenital disease, is based on assembled effects of individually very minor variants.  In turn, that could be why slow gradualism was so obviously the way evolution worked to Darwin, and why it generally still seems that way today.

Survivorship bias is a kind of misunderstanding of statistics and sampling that careful reasoning can illuminate.  It is so easy to collect biased samples, and so hard to do otherwise, and consequently so easy to make convenient but erroneous inferences.  Science is a complex business and it's an unending challenge to do it right--even to know when we are doing it right!

Thursday, December 18, 2014

(Other) lessons of the Broad Street pump: understanding causation isn't so easy

The iconic John Snow, often referred to as the "father of epidemiology," is commonly credited with discovering the cause of cholera after his careful, empirical examination of the 1854 outbreak of the devastating disease in the Soho neighborhood of London.  But I think it's only with hindsight that we can say this, and I think it's not quite right.

Snow was nothing if not a detail man.  A physician, he was very much an empiricist, experimenting and observing to test his ideas about health and disease like no one else of his time.  He had developed his waterborne theory of cholera some time before the 1854 epidemic, writing about it in detail in 1849.  The 1854 outbreak, very near his home, was an ideal circumstance for him to try to confirm his theory.

Modified from Snow's map in The Ghost Map; Johnson, 2006

Soon after the outbreak began, Snow began interviewing anyone with a family or household member who had died of the disease to determine the source of their drinking water.  Every case had drunk water from the Broad Street pump.  And, he confirmed that the worst symptoms were intestinal, not respiratory, which meant to him that the cause was something people had ingested rather than inhaled.  He found that there had been no cases among the 70 workers in the Broad Street brewery, because they were all given free beer, and never drank water at all.  From the information he collected, he drew his famous map of the neighborhood which showed that cases clustered around the Broad Street pump.  He concluded the pump was the source of the contaminated water that was making people ill.

He then enlisted the aid of a previously skeptical ally, and eventually convinced an even more skeptical local council to remove the handle from the pump -- to the disgust of many local residents who thought this was a cockamamie idea.  Not long after the removal of the handle, the epidemic was over.  But even Snow recognized that the epidemic had already begun to abate by the time the handle was removed.  That piece of the story is often lost, however; perhaps from the vantage point of 160 years on, when we know that Snow was right, the removal makes a nice tidy ending.

But did Snow identify the cause of cholera?  No, not in the way we would accept today.  We would say he had strong circumstantial evidence, but we'd require the causal organism.  There were multiple competing theories for the cause at the time. An excellent history of the epidemic, The Ghost Map: the Story of London's Most Terrifying Epidemic, and How it Changed Science, Cities and the Modern World by Steven Johnson (2006), tells the story in detail. Johnson writes that an editorial in the Times of London in 1849 considered the possible causes of cholera:
• “A … theory that supposes the poison to be an emanation from the earth”
• An “electric theory” based on atmospheric conditions
• The ozonic theory -- a deficiency of ozone in the air
• “Putrescent yeast, emanations of sewers, graveyards, etc.”
• Cholera was spread by microscopic animalcules or fungi, though this theory “failed to include all the observed phenomena.”
Source: The Ghost Map, Steven Johnson, 2006, Riverhead Books
Note that the idea that cholera was spread by "microscopic animalcules or fungi" was deemed empirically deficient by the editors of the Times, and it certainly was, as no organism associated with the disease had yet been identified.  In 1854 Snow himself looked at water from the Broad Street pump under his microscope, and had seen nothing of note.

And, Snow wasn't the only one with empirical, observed evidence for the cause of cholera.  Indeed, each of the alternatives put forth by the Times was entirely plausible, given the current state of knowledge.  Miasmatists were empiricists too: epidemics were localized in poor areas, where air smelled bad, water was filthy and smelled bad, there were more cases in cities, fewer cases in hills, no living organism had been found to suggest they were wrong.  What both Snow and the miasmatists had was circumstantial evidence, correlations, and belief in their preferred theory.  And, at the time, no definitive way to choose between them.

My point here is not to doubt Snow's theory, of course, but to suggest that although we now know that he was right, that was much less obvious at the time.  Indeed, it wasn't really until the organism that causes cholera, Vibrio cholerae, was discovered by Robert Koch in 1883 that Snow's story could be considered conclusive.  (Actually, the organism was first seen in 1854 by Italian anatomist Filippo Pacini, but this was not well-known at the time.  If it had been, would Snow have had an easier time convincing people that he was right?  I think the germ theory of disease had to get going in earnest before that could have happened, so I think probably not.)

What killed the miasma theory?  One blow was the rise of the germ theory, and the discovery of organisms that caused disease, one after another.  (Though, is the miasma theory in fact dead?  Still today there is some thought that dirty air causes asthma!)

But determining the cause of infectious diseases has its own problems.  It wasn't, and isn't, as simple as seeing live organisms under a microscope.  Robert Koch was a German physician and microbiologist who discovered a number of causal microbes.  He won the Nobel Prize in Physiology or Medicine in 1905 for his work on tuberculosis.  He proposed a set of postulates, first published in 1890, that were meant to be useful in confirming microbial causes of infectious disease.

                                                           The Koch Postulates
1. The microorganism must be found in abundance in all organisms suffering from the disease, but should not be found in healthy organisms.

2. The microorganism must be isolated from a diseased organism and grown in pure culture.

3. The cultured microorganism should cause disease when introduced into a healthy organism.

4. The microorganism must be re-isolated from the inoculated, diseased experimental host and identified as being identical to the original specific causative agent.

Unfortunately, and Koch knew this too, many microbes don't meet these criteria.  There can be asymptomatic carriers of cholera and other diseases; many microbes can't be grown in culture, and so on.  So, when a microbe behaves properly, following the postulates, all is good but when it doesn't, as with, say, HIV, controversy can ensue (see Duesberg).

Another blow to the miasma theory was the birth of a statistical basis for establishing causation.  The American philosopher, logician, and mathematician C.S. Peirce formulated the idea of randomized experiments in the late 1800’s, after which they began to be used in psychology and education.

Randomized experiments were popularized in other fields by R.A. Fisher in his 1925 book, Statistical Methods for Research Workers. This book also introduced additional elements of experimental design, and these were adopted by epidemiology.

Physician and epidemiologist Austin Bradford Hill published Principles of Medical Statistics in 1937, for use in epidemiology.  And the development of population genetics, which Ken has been writing about this week, together with the Modern Evolutionary Synthesis (which showed that Mendelian genetics is consistent with gradual evolution) and discoveries in genetics, laid the foundation for approaches to looking for the genetic basis of traits and diseases.

Recognizing that attributing cause to disease needed a more formal approach, Bradford Hill suggested a set of criteria that he thought were at least useful to consider.  The "Hill Criteria," which he published in 1965, are still in use today.
Strength: The larger the association, the more likely that it is causal
Consistency: Findings should be consistent between observers in different places.
Specificity: The more specific an association between a factor and an effect is, the bigger the probability of a causal relationship
Temporality: The effect has to occur after the cause
Biological gradient: Greater exposure should generally lead to greater incidence of the effect.
Plausibility: Must make sense
Coherence: Coherence between epidemiological and laboratory findings increases the likelihood of an effect
Experiment: “Occasionally it is possible to appeal to experimental evidence”
Analogy: The effect of similar factors may be considered.
         AB Hill, “The Environment and Disease: Association or Causation?,”
                          Proceedings of the Royal Society of Medicine, 58 (1965), 295-300.
Again, even the author knew that only one of these was actually a requirement for causation, as he discussed in the paper proposing the criteria; the cause has to precede the effect.  The others are either vague or just 'would be nice', or in many ways are highly or even purely subjective.  So, when they work, great and we attribute our conclusions to their application, but when they don't, it's not clear whether a possible factor isn't a cause, or just that the criteria aren't adequate for determining it, or our sample inadequate, or some other perhaps unknowable problem.

A set of "molecular Koch postulates" was devised in the 1980s, to determine the role of a gene in the virulence of a microbe, but they, too, have their failings for similar reasons.

And, statistical criteria have become the standard for determining causation, but we know that p-values are arbitrary (see Jim Wood's MT post, "Let's abandon significance tests", on this), that statistics are only as good as the studies that generate them, and studies are prone to biases and missing data and the like, and results can be difficult to replicate even if studies are state-of-the-art.  David Colquhoun has written a lot on this, including here and here.

Why go on about this?
We write frequently here on MT about how important it is to think about how we know what we know.  If we don't, we can get very close to religious territory, where knowledge is based on belief, not observation.  Indeed, even in science, to some of us, every trait is genetically determined, or we've got our favorite cause of obesity, or autism, or diabetes.  The ease with which we might choose to understand cause and effect without questioning how we know reflects two things -- one, belief is alive and well as a way to determine cause, and two, we often don't have demonstrably better ways to do it.

So, we don't know if sugar is the cause of the obesity epidemic or fat, or just overeating; we don't know whether breast feeding or bottle is the cause of the asthma epidemic; whether genes or environmental risk factors are the most important cause of type 2 diabetes, or which ones, and so on.  A lot of work in genetics is still based on the assumption that traits are simple, even though we know the kinds of traits that are likely to have simple explanations (the low-hanging fruit) and we know that they are rare.  We know the kinds of traits that are complex, and that aren't going to have easy explanations of the kind often suggested, and yet 'gene for' thinking is still prevalent in the popular press, and even among scientists.

Ludwik Fleck, a Polish physician and biologist, in 1935 published a book, Genesis and Development of a Scientific Fact, that is now properly recognized as the precursor to Thomas Kuhn's Structure of Scientific Revolutions.  Fleck wrote about "thought collectives" in science, his idea that facts in science are driven by context.  We follow the herd, until in fact the thought collective becomes a thought constraint.

Fleck writes of the development of the Wassermann test for syphilis, meant to determine who had the disease, but instead the thought collective at the time led the test result to define the disease.  It's an excellent short little book and well worth reading, but Ken wrote an even shorter column on Fleck, also worth reading if you're interested in Fleck and the sociology that is an important part of the way science actually operates.

A modern equivalent would be the common de facto practice of defining a genetic disease by genotype -- if a patient has one of the known genetic variants associated with the disease in other patients, he or she has the disease, but if not, he or she doesn't have the disease.  Even though we know that there can be many pathways to a given phenotype (our post last week on phenogenetic drift describes one reason for this).  Such definition, if everyone is aware of its nature, can guide therapy in useful ways -- that is, some genotype-defined subset of a broader disease category may respond to a particular kind of drug. But the changeable landscape of definition based on assumed causal process is an important part of the elusiveness of many conditions, like autism and many others. Too often the assumption that the outcome is 'genetic' defines, steers, or determines the concept of the trait itself. That can distract, and we think regularly does distract, from more realistic approaches to what is currently the very elusive nature of many traits, normal and otherwise, in animals and plants.

Understanding causation is a fundamental issue in science, but the difficulties are often overlooked in the rush to publish.  To the detriment of the science.

Wednesday, December 17, 2014

Are we still doing 'beanbag' eu(genetics)? Part III. Culpably ignored nuances?

Part I of this series was about the particulate view of genes and their role in evolution and the determination of traits that are here because they were screened by evolution.  Many view all traits as being in this category, and genetic determinism of those traits to be very strong and specific.  But the data are less clear by far than the commitment to that idea.

Ernst Mayr criticized the one-gene-at-a-time focus of much of population genetics as 'beanbag' genetics.  Mayr said that this was wrong for reasons we mentioned in Part I.  As we discussed there, JBS Haldane, one of the grand ol' men who developed population genetics, wrote in defense of the field, in response to Mayr's criticism.

Haldane was a highly educated, thoughtful, perceptive British biologist whose life was nuanced in many ways that make telling a clear-cut story difficult.  He was brilliant and exceedingly skilled.  But he was also a product of his times, as are we all.  In the early 20th century he became a Marxist, as did many other British aristocrats, accepting all that implies about what determines the structure of human society.  Marxism was materialist but it was about improvability of individuals--an egalitarian view that claimed that position in a class-based society was due to class, not inherent inferiority of the lower classes, and thus that social inequity could--indeed would--be erased by the processes of history.  At the time, the Soviet Union seemed a Great Hope to many in heavily unfair imperial Britain.  That essential malleability was one reason that the Russian plant geneticist Lysenko rejected Mendelian/Darwinian models of genetics in favor of a more Lamarckian mode of inheritance by which plants could be conditioned to have desired properties, and those would then be inherited. That proved in many ways to be a disaster for the Soviet Union.

Nonetheless, Haldane, who was a leading popularizer of science in his day, published a collection of reprinted essays in 1932 entitled The Inequality of Man.  Ironic for a Marxist, but he was not simplistic.  He dealt with, and accepted, the idea of eugenics in those essays, and that was largely what the title referred to.  He acknowledged the major role of environment in making people what they turned out to be. But he stressed that genetics was part of human makeup, too. Rather than a more balanced treatment, at points he lapsed into the aristocratic view about intelligence and in that sense, inherent societal worth.  The upper classes were what they are because of their abilities, and were under-reproducing compared to the lower classes.  He even wrote of society not having the guts to kill its lesser citizens: despite warning about too much stress on inherency, in one article he wrote:
"The danger to democracy to-day lies not in the recognition of a plain biological fact [of inherent inequality] but in a lack of will in certain countries to kill persons who obstruct the declared wishes of the majority of the people."  Further, "The only clear task of eugenics is to prevent the inevitably inefficient one per cent of the population from being born, and to encourage the breeding of persons of exceptional ability where that ability is known to be hereditary."  There should not be a democracy except of a better minority.
There is a mix of views in Haldane's chapters, ranging from the autocratic extreme to something more humane and nuanced.  He discusses social class, race, and intelligence as related to achievement, and even within Europe he makes distinctions about intelligence between (guess who!) northern and southern Europeans.  But he also promotes improved opportunities and acknowledges that we don't know the nature or extent of hereditary control of traits like intelligence.  In these popularized articles on many sociocultural issues, he is a softened genetic determinist. Perhaps this could be a Marxist 'from each according to his abilities, to each according to his needs' view; but that was always paternalistic when pronounced from on high.  Haldane, like many scientists who are given a public forum, strays far and wide beyond what he knows best, and nearly a hundred years on we can see his only too human opinions.  Life is complicated!

In any case, though the rhetoric is generally changed, we see roughly the same spectrum of views today, but that is in many ways implicitly a bean-bag model of inheritance.  In his day, the idea of identifying the genes that cause the traits of interest was technically not possible.  Now, in the excitement of 'omic technologies, the beanbagger approach is more explicit, noting this or that genetic variant that causes some socially relevant behavioral trait.  This viewpoint is widespread, despite some occasional caveats about complexity and even if there are many labs working on more integrative approaches to that complexity.

The difficulties
These are not simple issues.  People are different in physical, metabolic, and behavioral ways and clearly genetic variation is involved.  Depending on one's social politics, that can be a central or an uncomfortable fact.  But let us assume, for the moment and for argument's sake, that all the genetic determinism that has been proposed were perfectly true.  Then what?

The idea in the writings of various authors, from the past and today, is essentially about what 'we' should do to mold society this way or that.  But who are that 'we'?  They're the professors, politicians, and so on, who in positions of influence make the judgments about what 'we' as a society 'need' to do. 'We' want more intelligence and less addiction and crime (as defined by 'us', of course; usually 'we' aren't talking about white-collar crime).  'We' decide what would be 'good' for society and what should be discouraged.  And there is always the temptation to attribute inherent causation to these differences.

So, for example, we decide what to do with (or to?) those of higher and lesser inborn intelligence. This is rather indisputably arrogant and presumptuous, isn't it?  Or, perhaps, one can ask whether it is any different from what has gone on heretofore.

If the minority of the privileged have the power to decide on societal action, it is rather moot whether the criteria used to justify that action are presumed genotypic ones or just the arbitrary wielding of power.  Does it matter whether Divine Right or 'good genes' is credited with the power of the elite, and the subservience of the rest?  The powers-that-be define the value judgments.

Genotypes may have more, or less, determinative roles than is widely being claimed these days. Eugenics was a particular kind of social control, that had regularly dreadful, indeed lethal, consequences for many people for various reasons. But whether that was any worse than religious or other political dominance is an open question.

Does it matter if it's an ISIS member who chops your head off because of your religion, or a Nazi who gasses you because of your ethnicity, or a physician who decides what genotypes need to be screened prenatally and eliminated, or who gets educational resources?

We have our own personal view, which is that the data generally do not support the making of such decisions based on genotypes and their presumed predictive value--and decisions related to those genetic variants that really do have such value should only be made privately, rather than by public policy.  But the public pays for the treatment of genetic disease, so at what point is coercion within the scope of such an idea?

It is not clear whether these issues really ever get 'solved', or whether rational, measured discussion is even possible.  But it does seem clear that questions about how genes control, or don't control, the traits in organisms are worth understanding, rather than action being taken on vague assumptions about inherent causality before the questions are even answered.