Friday, January 23, 2015

What is 'inappropriate' use of baby aspirin? The risk of estimating risk

Something like a third of the American population* takes a baby aspirin every day to prevent cardiovascular disease (CVD).  But a new study ("Frequency and Practice-Level Variation in Inappropriate Aspirin Use for the Primary Prevention of Cardiovascular Disease : Insights From the National Cardiovascular Disease Registry’s Practice Innovation and Clinical Excellence Registry", Hira et al., J Amer Coll Cardiol) suggests that more than 1 in 10 of these people are taking it 'inappropriately.'

Aspirin slows blood clotting, and blood coagulation plays a role in vascular disease, so the thinking is that some heart attacks and strokes can be prevented with regular use of aspirin, and indeed there is empirical support for this.  As with many drug therapies, it was the side effects of aspirin use for something else, in this case rheumatoid arthritis (RA), that first suggested it could play a role in CVD prevention -- a 1978 study reported that aspirin use lowered the risk of myocardial infarction, angina pectoris, sudden death, and cerebral infarction in RA patients (study cited in an editorial by Freek Verheugt accompanying the Hira paper), a result that kick-started its use for CVD prevention.

The new Hira et al. study included about 68,000 patients in 119 different practices taking aspirin for prevention of a first heart attack or stroke, not recurrence.  The authors looked at clinical records in a network of cardiology practices to assess the proportion of patients in each practice that was taking aspirin, and whether they met the 10-year risk criteria for 'appropriate use' as determined by the Framingham risk calculator.  The calculator uses an algorithm based on age, sex, total cholesterol, HDL cholesterol, smoking status, blood pressure and whether the patient is taking medication to control blood pressure.

Appropriate use, according to Hira et al., is a 10-year risk of greater than 6%.  According to the calculator itself, 6% risk means that 6 of 100 people with whichever set of factors yields this risk will have a heart attack within the next 10 years.  The reason this even has to be thought about is that taking aspirin carries some risk of its own: it interferes with clotting and can cause major bleeding.  So what's at issue here is getting the best balance of benefit to cost, preventing CVD while avoiding major bleeds.  If the benefit is a long shot because an aspirin user isn't likely to have CVD anyway, the potential cost can outweigh the pluses.

As Verheugt explains:
"Major coronary events (coronary heart disease mortality and nonfatal MI) are reduced by 18% with aspirin but at the cost of an increase of 54% in major extracranial bleeding. For every 2 major coronary events shown to be prevented by prophylactic aspirin, they occur at the cost of 1 major extracranial bleed. Primary prevention with aspirin is widely applied, however. This regimen is used not only because of its cardioprotection but also because there is increasing evidence of chemoprotection of aspirin against cancer."
Hira et al. found that 11.6% of the population of patients visiting a cardiology practice were taking aspirin inappropriately, having a risk less than 6% as calculated by the Framingham calculator.  That is, their risk of bleeding outweighs the potential preventive effect of aspirin.
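To make the threshold arithmetic concrete, here is a minimal sketch that applies the 18% relative reduction quoted by Verheugt to a few baseline 10-year risks. It assumes, purely for illustration, that the relative reduction is the same at every baseline risk. The function names are hypothetical, and bleeding risk is left out because the quote gives only a relative increase, not a baseline bleeding rate.

```python
def absolute_risk_reduction(baseline_risk, relative_reduction=0.18):
    """Events prevented per person over 10 years, assuming a constant relative reduction."""
    return baseline_risk * relative_reduction


def number_needed_to_treat(baseline_risk, relative_reduction=0.18):
    """How many people must take aspirin for 10 years to prevent one event."""
    return 1.0 / absolute_risk_reduction(baseline_risk, relative_reduction)


if __name__ == "__main__":
    # Illustrative baseline risks on either side of the 6% threshold.
    for risk in (0.03, 0.06, 0.10, 0.20):
        arr = absolute_risk_reduction(risk)
        nnt = number_needed_to_treat(risk)
        print(f"baseline {risk:.0%}: {arr * 100:.2f} events prevented per 100 people, NNT ~ {nnt:.0f}")
```

Under these assumptions, at the 6% threshold roughly 1 event is prevented for every 100 people taking aspirin for a decade, and the lower the baseline risk, the more people are exposed to bleeding risk for each event prevented.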

But, about this 6% risk.  Does it sound high to you?  Would you change your behavior based on a 6% risk, or would you figure the risk is low enough that you can continue to eat those cheese steaks?  Or maybe you'd just start popping aspirin, figuring that made it really safe to continue to eat those cheese steaks?

And why the 6% threshold?  So precise.  Indeed, a 2011 study suggested different risk thresholds for different age categories, increasing with age.  And, different calculators (such as this one from the University of Edinburgh) return different risk estimates, varying by several percentage points given the same data, so much for precision.

Risk is, of course, estimated from population data, based on the many studies that have found an association between cholesterol, blood pressure, smoking status, and heart attack, particularly in older men.  A distribution of risk factors and outcomes would thus show that for a given set of cholesterol and blood pressure values, on average x% will have a heart attack or stroke.  These are group averages, and using them to make predictions for individuals cannot be done with precision that we know to be true.  Indeed, one of the strongest risk factors known to epidemiology, smoking, causes lung cancer in 'only' 10% of smokers, and it's impossible to predict who. But that's why these CVD risk calculators never estimate 100% risk.  The highest risk I could force them to estimate was "greater than 30%".

Hard to know what that actually means for any individual.  At least, I have a hard time knowing what to make of these figures.  If 6 of 100 people in the threshold risk category will have an MI in the next 10 years, this means that 94 will not.  So, another way to think about this is that the risk for 94 people is in fact 0, while risk for the unlucky 6 is 100%.  For everyone over the 6% threshold, the cost -- possible major bleed -- is assumed to be outweighed by the benefit -- prevention of MI --  even when that's in fact only true for 6 out of 100 people in this particular risk category.  But, since it's impossible to predict which 6 are at 100% risk, the whole group is treated as though it's at 100% risk, and put on preventive baby aspirin, and perhaps statins as well, and counseled on lifestyle changes and so on, all of which can greatly affect the outcome, and alter our understanding of risk factors -- or the effectiveness of preventive aspirin.  And what if it's true that a drink a day lowers heart failure risk?  How do we factor that in?
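The point about group averages can be sketched numerically. The two hypothetical groups below (the names and numbers are mine, chosen to match the 6-in-100 example) have very different individual risks but the same expected number of heart attacks, which is why outcome counts alone cannot tell the two pictures apart.

```python
def expected_events(individual_risks):
    """Expected number of events in a group is the sum of the individual probabilities."""
    return sum(individual_risks)


# Hypothetical group A: every one of 100 people carries the same 6% risk.
uniform_group = [0.06] * 100

# Hypothetical group B: 6 people are certain to have an event, 94 are certain not to.
all_or_nothing_group = [1.0] * 6 + [0.0] * 94

if __name__ == "__main__":
    print(round(expected_events(uniform_group), 6))
    print(round(expected_events(all_or_nothing_group), 6))
```

Both groups are expected to produce 6 events per 100 people, so a calculator built from population frequencies returns the same 6% for a member of either group.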

Further, a lot of more or less well-established risk factors for CVD are not included in the calculation. After decades of cardiovascular disease research, it seems to be well-established that obesity is a risk factor, as well as diabetes, and certainly family history.  Why aren't these pieces of information included?  Tens if not hundreds of genes have been identified to have at least a weak effect on risk (and even this number only accounts for a fraction of the genetic risk as estimated from heritability studies), and these aren't included in the calculation either.  And, we all know people who seemed totally fit, who had a heart attack on the running trail, or the bike trail, so at least some people are in fact at risk even with none of the accepted risk factors.

So, 11.6% of baby aspirin takers shouldn't be taking aspirin.  But, when risk estimation is as imprecise as it is, and as hard to understand, this seems like a number that we should be taking with a grain of salt, if not a baby aspirin.  Well, except that salt is a risk factor for hypertension which is a risk factor for heart disease....or is it?

*Or something like that.  It turns out that the Hira paper cited a 2007 paper, which cited a 2006 paper, which cited the Behavioral Risk Factor Surveillance System 2003 estimate of 36% of the American population taking a baby aspirin a day.  But this is a 12 year old figure, and I couldn't find anything more recent.

Thursday, January 22, 2015

Your money at, waste: the million genomes project

Bulletin from the Boondoggle Department

In desperate need of a huge new mega-project to lock up even more NIH funds before the Republicans (or other research projects that are actually focused on a real problem) take them away, or before individual investigators who actually have some scientific ideas to test can claim them, Francis Collins has, we read, apparently persuaded someone who's not paying attention to fund the genome sequencing of a million people!  Well, why not?  First we had the (one) human genome project.  Then after a couple of iterations, the 1000 genomes project, then the hundred thousand genomes 'project'.  So, what next?  Can't just go up by dribs and drabs, can we?  This is America, after all!  So let's open the bank for a cool million. Dr Collins has, apparently, never met a genome he didn't like or want to peer into.  It's not lascivious exactly, but the emotion that is felt must be somewhat similar.

We now know enough to know just what we're (not) getting from all of this sequencing, but what we are getting (or at least some people are getting) is a lot of funds sequestered for a few in-groups or, more dispassionately perhaps, for a belief system, the belief that constitutive genome sequence is the way to conquer every disease known to mankind.  Why, this is better than what you get by going to communion every week, because it'll make you immortal so you don't have to worry that perhaps there isn't any heaven to go to after all.

Anyway, why not, the genomes are there, their bearers will agree and they've got the blood to give for the cause.  Big cheers from the huge labs, equipment manufacturers and those eyeing the Europeans and Chinese to make sure we don't fall behind anyone (and knowing they're eyeing us for the very same reason).  And this is also good for the million-author papers that are sure to come.  And that's good for the journals, because they can fill many pages with author lists, rather than substance.

Of course, we're just being snide (though, being retired, not jealous!).  But whether in fact this is good science or just ideology and momentum at work is debatable but won't be debated in our jealous me-too or me-first environment.

Is there any slowing down the largely pointless clamor for more......?

We've written enough over the past few years not to have to repeat it here, and we are by no means the only ones to have seen through the curtain and identified who the Wiz really is.  If this latest stunt doesn't look like a masterful, professionally skilled boondoggle to you, then you're seeing something very different from what we see.  One of us needs to get his glasses cleaned.  But for us it's moot, of course, since we don't control any of the funds.

Wednesday, January 21, 2015

Dragonfly the hunter

For vertebrates and invertebrates alike, hunting is a complex behavior.  Even if it seems to involve just a simple flick of the tongue, the hunter must first note the presence of its prey, and then successfully capture it, even when the prey makes unpredictable moves. Vertebrates hunt by predicting and planning, relying on what philosophers of mind call 'internal models' that allow them to anticipate the movement of their prey and respond accordingly, but whether invertebrates do the same has not been known.  The typical human-centered reflex is to dismiss insects as mere genetic robots, mechanically linking sensory input to automatic, hard-wired action.

But that may be far too egocentric, because a new paper in the January 15 issue of Nature ("Internal models direct dragonfly interception steering," Mischiati et al.) describes the hunting behavior of dragonflies, and suggests that dragonflies have internal models as well.
"Prediction and planning, essential to the high-performance control of behaviour, require internal models. Decades of work in humans and non-human primates have provided evidence for three types of internal models that are fundamental to sensorimotor control: physical models to predict properties of the world; inverse models to generate the motor commands needed to attain desired sensory states; and forward models to predict the sensory consequences of self-movement."
Dragonflies generally don't hunt indoors, so Mischiati et al. decked out a laboratory to look like familiar hunting grounds, brought some dragonfly fodder indoors, and videotaped and otherwise assessed the behavior of the dragonflies in pursuit of their next meals to determine what they were looking at, and to assess their body movements as they pursued their prey.  These measurements suggested to them that the heads of the dragonflies were moving in sync with their prey, meaning that they were anticipating rather than reacting to the flight of their prey.

Anisoptera (Dragonfly), Pachydiplax longipennis (Blue Dasher), female, photographed in the Town of Skaneateles, Onondaga County, New York. Creative Commons

And this in turn suggests that, like vertebrates, dragonflies have internal models that facilitate their hunting. Rather than dashing after insects after they've already moved, dragonflies are able to predict their movements, and successfully capture their prey 90-95% of the time.  Compared with, say, echolocating bats, this is a remarkable success rate -- e.g., estimates of the success rate of Eptesicus nilssonii, a Eurasian bat, range from 36% for moths to 100% for the slow-moving dung beetle (Rydell, 1992). And it's an even more remarkable success rate compared with Pennsylvania deer hunters -- for every 3 or 4 hunting licenses sold, 1 deer was killed in 2012-13, which means that if, like dragonflies or bats, people had to rely on venison for their survival, they'd be in deep trouble.

But, apparently humans, bats and dragonflies are using essentially the same kind of internal model to hunt, a model that allows them to anticipate the future and take action accordingly.  More specifically, the model is a 'forward model', and it has been thought to be the foundation for cognition in vertebrates, but is at least the basis of motor control (as described here and here).  You can dismissively call it just 'computing' or you can acknowledge it as 'intelligent', but it is clearly more than simple hard-wired reflex: it involves judgment.

This is interesting and relevant, because if all that's required is the ability to predict and plan accordingly, why is there so much variation in the success rate of the hunt, even within a given species?  Clearly other factors and abilities are required -- other aspects of the nervous system, for example, or speed relative to prey, and population density of predator and prey.  Indeed, insects would be expected to vary in their 'intelligence' the way people do, in a way that means that most are able to succeed.

It seems that the study of insect behavior is building a more and more complex model of how insects do what they do.  The view of the insect brain is broadening into one that allows for much more complexity than robotic hard-wired behavior, or motor responses to sensory input.  A few months ago, we blogged about bee intelligence, writing about a PNAS paper that described how bees find their way home, credibly by using a cognitive map.

The author of a recent paper in Trends in Neurosciences ("Cognition with few neurons: higher-order learning in insects," Martin Giurfa, 2013) speculated about unexpected insect cognitive abilities, welcoming an approach to understanding plastic insect behavior that allows for the possibility of complex, sophisticated learning rather than mere associative learning.  But Giurfa cautions that there are many reasons why we don't yet understand insect behavior, including our tendency to anthropomorphize, using words for insect behavior derived from what we know about human abilities that, when applied to insects, imply more complexity than warranted, or to interpret experimental results as though they represent all that insects can do, rather than all that they were asked to do in the study.

On the other hand, many of the genes insects use for their sensory and neural functions are evolutionarily related to the genes mammals, including humans, use. So we likely share many similar genetically based mechanisms.

From the outside of this field looking in, it seems as though it's early days in understanding invertebrate brains.  And it seems to me that this is largely because observational studies are difficult to do on insects, must be interpreted because insects can't talk, and our interpretations are necessarily built on our assumptions about insect behavior, which in turn seem to follow trends in what people are currently thinking about cognition.  Until recently, researchers have assumed that insects, with far fewer neurons than we have, are pretty dumb.  The dragonfly hunter's success rate alone should be humbling enough to challenge this assumption.

In this sense, it's wrong to think simply that size matters.  Maybe it's organization that matters more.

Monday, January 19, 2015

We can see the beast....but it's been us!

The unfathomable horrors of what the 'Islamists' are doing these days can hardly be exaggerated.  It is completely legitimate, from the usual mainstream perspective at least, to denigrate the perpetrators in the clearest possible way, as simply absolute evil.  But a deeper understanding raises sobering questions.

It's 'us' pointing at 'them' at the moment, and some aspects of what's going on reflect religious beliefs: Islam vs Christianity, Judaism, or the secular western 'faith'.  If we could really believe that we were fundamentally better than they are we could feel justified in denigrating their wholly misguided beliefs, and try to persuade them to come over to our True beliefs about morally, or even theologically acceptable behavior.

Unfortunately, the truth is not so simple.  Nor is it about what 'God' wants.  The scientific atheists (Marxists) slaughtered their dissenters or sent them to freeze in labor camps by the multiple millions. It was the nominally Christian (and even Socialist) Nazis who gassed their targets by the millions. And guess who's bombing schools in Palestine these days?

Can we in the US feel superior?  Well, we have the highest per capita jailed population, and what about slavery and structural racism?  Well, what about the Asians?  Let's see, the rape of Nanking, Mao's Cultural Revolution, the rapacious Huns.....

Charlie Hebdo is just a current example that draws sympathy, enrages, and makes one wonder about humans.  Haven't we learned?  I'd turn it around and ask: has anything even really changed?

Christians have made each other victims, of course.  Read John Foxe's Book of Martyrs from England in the 1500's (or read about the more well-known Inquisition).  But humans are equal opportunity slaughterers. Think of the crusades and back-and-forth Islamic-Christian marauding episodes.  Or the Church's early systematic 'caretaking' of the Native Americans almost from the day Columbus first got his sneakers wet in the New World, not to mention its finding justification for slavery (an idea going back to those wonderful classic Greeks, and of course previously in history).  Well, you know the story.

Depiction of Spanish atrocities committed in the conquest of Cuba in Bartolomé de Las Casas's "Brevisima relación de la destrucción de las Indias", 1552.   The rendering was by the Flemish Protestant artist Theodor de Bry. Public Domain.

But this post was triggered not just by the smoking headlines of the day, but because I was reading about that often idealized gentle, meditative Marcus Aurelius, the Roman Emperor in the second century AD.  In one instance, some--guess who?--Christians had been captured by the Romans and were being tortured: if they didn't renounce their faith, they were beheaded (sound familiar?) or fed to the animals in a colosseum.  And this was unrelated to the routine slavery of the time. Hmmm...I'd have to think about whether anyone could conceive of a reason that, say, lynching was better than beheading.

It is disheartening, even in our rightful outrage at the daily news from the black-flag front, to see that contemporary horrors are not just awful, they're not even new!  And, indeed, part of our own Western heritage.

Is there any science here?  If not, why not?
We try to run an interesting, variable blog, mainly about science and also its role in society.  So the horrors on the Daily Blat are not as irrelevant as they might seem:  If we give so much credence, and resources, to science, supposedly to make life better, less stressful, healthier and longer, why haven't we moved off the dime in so many of these fundamental areas that one could call simple decency--areas that don't even need much scientific investment to document?

Physics, chemistry and math are the queens of science.  Biology may be catching up, but that would seem today mainly to be to the extent we are applying molecular reductionism (everything in terms of DNA, etc). That may be physics worship or it may be good; time will tell, but of course applied biology can claim many major successes. The reductionism of these fields gives them a kind of objective, or formalistic, rigor.  Controlled samples or studies, with powerful or even precise instrumentation are possible to measure and evaluate data, and to form testable credible theory about the material world.

But a lot of important things in life seem so indirect, relative to molecules, that one would think there could also be, at least in principle,  comparably effective social and behavioral sciences that did more than lust after expensive, flashy reductionist equipment (DNA sequencing, fMRI imaging, super-computing, etc.) and the like.  Imaging and other technologies certainly have made much of the physical sciences possible by enabling us to 'see' things our organic powers, our eyes, nose, ears, etc.,  could not detect.  But the social sciences?  How effective or relevant is that lust to the problems being addressed?

The cycling and recycling of social science problems seems striking.  We have plentiful explanations for things behavioral and cultural, and many of them sound so plausible.  We have formal theories structured as if they were like physics and chemistry: Marxism and related purportedly materialist theories of economics, cultural evolution, and behavior, and 'theories' of education, which are legion yet the actual result has been sliding for decades.  We have libraries-full of less quantitatively or testably rigorous, more word-waving 'theories' by psychologists, anthropologists, sociologists, economists and the like.  But the flow of history and, one might say, its repeated disasters, shows, to me, that we as yet don't in fact have anything very rigorous, despite a legacy going back to Plato and the Greek philosophers.

We spend a lot of money on the behavioral and social sciences with 'success' ranging from very good for very focal types of traits, to none at all when it comes to what are the major sociocultural phenomena like war, equity, and many others.  We have journal after journal, shelves full of books of social 'theory', including some (going back at least to Herbert Spencer) that purport to tie physical theory to biology to society, and Marx and Darwin are often invoked, along with ideas like the second law of thermodynamics and so on.  Marx wanted a social theory as rigorous as physics, and materialist, too, but in which there would be an inevitable, equitable end to the process.  Spencer had an end in mind, too, but one with a stable inequality of elites and the rest.  Not exactly compatible!

And this doesn't include social theories derived from this or that world religion.  Likewise, of course, we go through psychological and economic theories as fast as our cats go through kibbles, and we've got rather little to show for it that could seriously claim respect as science in the sense of real understanding of the phenomena.  When everyone needs a therapist, and therapists are life-long commitments, something's missing.

Karl Marx and Herbert Spencer, condemned to face each other for eternity at Highgate Cemetery in London (photos: A Buchanan)

Either that, or these higher-levels of organized traits simply don't follow 'laws' the way the physical phenomena do.  But that seems implausible since we're made of physical stuff, and such a view would take us back to the age-old mind-matter duality, endless debate about free will, consciousness, soul, and all the rest back through the ages.  And while this itemization is limited to western culture, there isn't anything more clearly 'true' in the modern East, nor in the cultures elsewhere or before ours.

Those with vested interests in their fMRI machines, super-computer modeling, or therapy practices will likely howl 'Foul!' It's hard not to believe that in the past there was a far smaller percentage of people with various behavioral problems needing chemical suppression or endless 'therapy' than there is today.  But if there was, and things are indeed changing for the worse, this further makes the point.  Why aren't mental health problems declining, after so much research?

You can defend the social sciences if you want, but in my personal view their System is, like the biomedical one, a large vested interest that keeps students off the street for a few years, provides comfy lives for professors, fodder for the news media and lots of jobs in the therapy and self-help industries (including think-tanks for economics and politics).....but has not turned daily life, even in the more privileged societies, into Nirvana.

One can say that those interests just like things to stay the way they are, or argue that while their particular perspective can't predict every specific any more than a physicist can predict every molecule's position, generic, say, Darwinian competition-is-everything views are simply true. Such assertions--axioms, really--are then just accepted and treated as if they're 'explanations'. If you take such a view, then we actually do understand everything!  But even if these axioms--Darwinian competition, e.g.--were true, they have become such platitudes that they haven't proven themselves in any serious sense, because if they had we would not have multiple competing views on the same subjects.  Despite debates on the margins, there is, after all, only one real chemistry, or physics, even if there are unsolved aspects of those fields.

The more serious point is this:  we have institutionalized research in the 'soft' as well as 'hard' sciences.  But a cold look at much of what we spend funding on, year after year without demanding actual major results, would suggest that we should be addressing the lack of real results as perhaps the more real or at least more societally important problem these fields should be addressing--and with the threat of less or no future funding if something profoundly better doesn't result.  In a sense, engineering works in the physical sciences because we can build bridges without knowing all the factors involved in precise detail.  But social engineering doesn't work that way.

After all, if we are going to spend lots of money on minorities (like professors, for example), we would be better to take an engineering approach to problems like 'orphan' (rare) diseases, which are focused and in a sense molecular, and where actual results could be hoped for.  The point would be to shift funds from wasteful, stodgy areas that aren't going very far.  Even if working on topics like orphan diseases is costly, there are no paths to the required knowledge other than research with documentable results.  Shifting funding in that direction would temporarily upset various interests, but would instead provide employment dollars to areas and people who could make a real difference, and hence would not undermine the economy overall.

At the same time, what would it take for there to be a better kind of social science, the product of which would make a difference to human society, so we no longer had to read about murders and beheadings?

Thursday, January 15, 2015

When the cat brings home a mouse

To our daughter's distress, she needs to find a new home for her beloved cats, so overnight we've gone from no cats to three cats, while we try to find them someplace new.  I haven't lived with cats since I was a kid really, because I was always allergic.  When I visited my daughter, I'd get hives if Max, her old black cat, sadly now gone, rubbed against my legs, and I always at least sneezed even when untouched by felines.  But now with three cats in the house, I'm allergy-free and Ken, never allergic to cats before, is starting to sneeze -- loudly.

Old Max


Oliver upside-down

But the mystery of the immune system is just one of the mysteries we're confronting -- or that's confronting us -- this week.  Here's another.  The other day my daughter brought over a large bag of dry cat food.  I put it in a closet, but the cats could smell it, and it drove them nuts, so I moved it into the garage.  A few days later I noticed that the cats were all making it clear that they really, really wanted to go into the garage, but we were discouraging that given the dangers of spending time in a location with vehicles that come and go unpredictably. I just assumed they could smell the kibbles, or were bored and wanted to explore new horizons.

But two nights ago I went out to the garage myself to get pellets for our pellet stove, and Mu managed to squeeze out ahead of me.  He made a mad dash for the kibbles.  Oliver was desperate to follow, but I squeezed out past him and quickly closed the door.  At which point, Mu came prancing back, squeaking.  Oh wait, he wasn't squeaking, it was the mouse he was carrying in his mouth that was squeaking!  He was now just as eager to get back in the house as he'd been to get out.  After a few minutes he realized that wasn't going to happen, so he dropped the now defunct mouse, and I let him back in.

Mu, the Hunter
So, that 'tear' in the kibbles bag that I'd noticed a few days before?  Clearly made by a gnawing mouse (mice?).  And the cats obviously had known about this long before I did.  But how did Mu know exactly where to make a beeline to catch the mouse?  He'd never seen where I put the bag, nor the mouse nibbling at it!  And I have to assume the other cats would have been equally able hunters had they been given the chance.

Amazing.  A whole undercurrent of sensory awareness and activity going on right at our feet, and we hadn't clued in on any of it.  I'd made unwarranted assumptions about holes in the bag, but the cats knew better.  Yes, I could have looked more closely at the kibble that had spilled out of the bag and noticed the mouse droppings.  But I didn't, because, well, because it didn't occur to me.

Though, now that I'm clued in, I believe we've got another mouse...

Mu and Ollie at the door to the garage yesterday afternoon

I might even have been able to detect the mouse without seeing any of the evidence, just like the cats, if I'd tuned in more attentively, but I'm pretty sure it would have required better hearing.  In any case, other bits of evidence more suited to my perceptive powers were available, but I didn't notice.  I take this as yet another cautionary tale about how we know what we know, and I will claim it applies as well to politics, economics, psychology, forensics, religion, science, and more.  We build our case on preconceived notions, beliefs, assumptions, what we think is true, rarely re-evaluating those beliefs -- unless we're forced to, when, say, Helicobacter pylori is found to cause stomach ulcers, or our college roommate challenges our belief in God, or economic austerity does more harm than good.

As Holly often says, scientists shouldn't fall in love with their hypothesis.  Hypotheses are made to be tested; stretched, pounded, dropped on the floor and kicked, and afterwards, and continually, examined from every possible angle, not defended to the death.  But we often get too attached, and don't notice when the cat brings home a mouse.

An illustrative blog post in The Guardian by Alberto Nardelli and George Arnett last October tells a similar tale (h/t Amos Zeeberg on Twitter).  "Today’s key fact: you are probably wrong about almost everything."  Based on a survey by Ipsos Mori, Nardelli and Arnett report disconnects between what people around the world believe is true about the demographics of their country, and what's actually true.

So, people in the US overestimate the percentage of Muslims in the country, thinking it's 15% when it's actually 1%.  Japanese think the percentage of Muslims is 4% when it's actually 0.4%, and the French think it's 31% while it's actually 8%.

In the US, we think immigrants make up 32% of the population, but in fact they are 13%.  And so on.  We think we know, but very often we're wrong.  We're uninformed, ill-informed, or underinformed, even while we think we're perfectly well informed.

Source: The Guardian

The Guardian piece oozes political overtones, sure.  But I think it is still a good example of how we go about our days, thinking we're making informed decisions, based on facts, but it's not always so.  A minority of Americans accept evolution, despite the evidence; you made up your mind about whether Adnan is guilty or innocent if you listened to Serial, even though you weren't a witness to the murder, and the evidence is largely circumstantial.  And so on.  And this all has consequences.

In a sense, even if we are right about what we think, or its consequences, based on what we know, it's hard to know if we are missing relevant points because we simply don't have the data, or haven't thought to evaluate it correctly, as I hadn't in regard to Mu and the mouse.  We have little choice but to act on what we know, but we do have a choice about how much confidence, or hubris, we attribute to what we know, to consider that what we know may not be all there is to know.

This is sobering when it comes to science, because the evidence for a novel or alternative interpretation might be there to be seen in our data, but our brains aren't making the connections, because we're not primed to or because we're unaware of aspects of the data.  We think we know what we're seeing, and it's hard to draw different conclusions.

Fortunately, occasionally an Einstein or a Darwin or some other grand synthesizer comes along and looks at the evidence in a different way, and pushes us forward.  Until then, it's science as usual; incremental gains based on accepted wisdom.  Indeed, even when such a great synthesizer provides us with dramatically better explanations of things, there is a tendency to assume that now, finally, we know what's up, and to place too much stock in the new theory... repeating the same cycle again.

Tuesday, January 13, 2015

The Genome Institute and its role

The NIH-based National Human Genome Research Institute (NHGRI) has for a long time been funding the Big Data kinds of science that are growing like mushrooms on the funding landscape.  Even if overall funding is constrained, and even if this also applies to the NHGRI (I don't happen to know), the sequestration of funds in too-big-to-stop projects is clear.  Even Francis Collins and some NIH efforts to reinvigorate individual-investigator R01 awards don't really seem to have stopped the grab for Big Data funds.

That's quite natural.  If your career, status, or lab depends on how much money you bring into your institution, or how many papers you publish, or how many post-docs you have in your stable, or your salary and space depend on that, you will have to respond in ways that generate those score-counting coups.  You'll naturally exaggerate the importance of your findings, run quickly to the public news media, and do whatever other manipulations you can to further your career.  If you have a big lab and the prestige and local or even broader influence that goes with that, you won't give that up easily so that others, your juniors or even competitors can have smaller projects instead.  In our culture, who could blame you?

But some bloggers, Tweeters, and Commenters have been asking if there is a solution to this kind of fund sequestration, largely reserved (even if informally) for the big usually private universities.  The arguments have ranged from asking if the NHGRI should be shut down (e.g., here) to just groping for suggestions.  Since many of these questions have been addressed to me, I thought I would chime in briefly.

First, a bit of history or perspective, as informally seen over the years from my own perspective (that is, not documented or intended to be precise, but a broad view as I saw things):
The NHGRI was located administratively where it was for reasons I don’t know.  Several federal institutes were supporting scientific research.  NIH was about health, and health 'sells', and understandably a lot of funding is committed to health research.  It was natural to think that genome sequences and sciences would have major health implications, if the theory that genes are the fundamental causal elements of life was in fact true.  Initially James Watson, co-discoverer of DNA's structure, and perhaps others advocated the effort.  He was succeeded by Francis Collins, who is a physician and clever politician.
However, there was competition for the genome ‘territory’, at least with the Atomic Energy Commission.  I don’t know if NSF was ever in the ‘race’ to fund genomic research, but one driving force at the time was the fear of mutations that atomic radiation (therapeutic, from wars, diagnostic tests, and weapons fallout) generated.  There was also a race with the private sector, notably Celera as a commercial competitor that would privatize the genome sequence.  Dr Collins prominently, successfully, and fortunately defended the idea of open and free public access.  The effort was seen as important for many reasons, including commercial ones, and there were international claimants in Japan, the UK, and perhaps elsewhere, that wanted to be in on the act.  So the politics were rife as well as the science, understandably.
It is possible that only with the health-related promises was enough funding going to be available, although nuclear fears about mutations and the Cold War probably contributed, along with the usual less savory self-interest, to the AEC's involvement.
Once a basic human genome sequence was available, there was no slowing the train.  Technology, including public and private innovation, promised much quicker sequencing in the future, which was quickly to become available even to ordinary labs (like mine, at the time!).  And once the Genome Institute (and other places such as the Sanger Centre in Britain and centers in Japan, China, and elsewhere) were established, they weren't going to close down!  So other sequences entered the picture--microbes, other species, and so on.
It became a fad and an internecine competition within NIH.  I know from personal experiences at the time that program managers felt the need to do 'genomics' so they would be in on the act and keep their budgets.  They had to contribute funds, in some way I don't recall, to the NHGRI's projects or in other ways keep their portfolios by having genomics as part of this.  -Omics sprang up like weeds, and new fields such as nutrigenomics, cancer genomics, microbiomics and many more began to pull in funding, and institutes (and the investigators across the country) hopped aboard.  Imitation, especially when funds and current fashion are involved, is not at all a surprise, and efficiency or relative payoff in results took the inevitable back seat: promises rather than deliveries naturally triumphed.
In many ways this has led to the current era of exhaustively enumerative Big Data: a return to 17th-century induction.  This has to do not just with competition for resources, but with a changed belief system also spurred by computing power: Just sample everything and pattern will emerge!
Over the decades the biomedical (and to some lesser extent biological) university establishment grew on the back of the external funding which was so generous for so long.  But it has led to a dependency.  Along with exponential growth in the number of competitors, hierarchies of elite research groups developed--another natural human tendency.  We all know the career limitations that are resulting from this.  And competition has meant that deans and chairs expect investigators always to be funded, in part because there aren't internal funds to keep labs running in the absence of grants. It's been a vicious self-reinforcing circle over the past 50 years.
As hierarchies built, private donors were convinced (conned?) into believing that their largesse would lead to the elimination of target diseases ('target' often meaning those in the rich donors' families). Big Data today is the grandchild of the major projects, like the Manhattan Project in WWII, that showed that some kinds of science could be done on a large scale.  Many, many projects during past decades showed something else: Fund a big project, and you can't pull the plug on it!  It becomes too entrenched politically.  
The precedents were not lost on investigators!  Plead for bigger, longer studies, with very large investments, and you have a safe bet for decades, perhaps your whole career. Once started, cost-benefit analysis has a hard time paring back, much less stopping such projects. There are many examples, and I won't single any of them out.  But after some early splash, by and large they have got to diminishing returns but not got to any real sense of termination: too big to kill.
This is to some extent the same story with the NHGRI.  The NIH has got too enamored of Big Data to keep the NHGRI as limited or focused as perhaps it should have been (or should be).  In a sense it became an openly anti-focused-research sugar daddy (Dr Collins said, perhaps officially, that NHGRI didn’t fund ‘hypothesis-based research’) based on pure inductionism and reductionism, so it did not have to have well-posed questions.  It basically bragged about not being focused.
This could be a change in the nature of science, driven by technology, that is obsolescing the nature of science that was set in motion in the Enlightenment era, by the likes of Galileo, Newton, Bacon, Descartes and others.  We'll see.  But the socioeconomic, political sides of things are part of the process, and that may not be a good thing.
Will focused, hypothesis-based research make a comeback?  Not if Big Data yields great results, but decades of it, no matter how fancy, have not shown the major payoff that has been promised.  Indeed, historians of science often write that the rationale, that if you collect enough data its patterns (that is, a theory) will emerge, has rarely been realized.  Selective retrospective examples don't carry the weight often given them.

There is also our cultural love affair with science.  We know very clearly that many things we might do at very low cost would yield health benefits far exceeding even the rosy promises of the genomic lobby.  Most are lifestyle changes.  For example, even geneticists would (privately, at least) acknowledge that if every 'diabetes' gene variant were fixed, only a small fraction of diabetes cases would be eliminated. The recent claim that much of cancer is due just to bad mutational luck has raised lots of objections--in large part because Big Data researchers' business would be curtailed. Everyone knows these things.

What would it take to kill the Big Data era, given the huge array of commercial, technological, and professional commitments we have built, if it doesn't actually pay off on its promises?  Is focused science a nostalgic illusion? No matter what, we have a major vested interest on a huge scale in the NHGRI and other similar institutes elsewhere, and grantees in medical schools are a privileged, very well-heeled lot, regardless of whether their research is yielding what it promises.

Or, put another way, where are the areas in which Big Data of the genomic sort might actually pay, and where is this just funding-related institutional and cultural momentum?  How would we decide?

So what to do?  It won't happen, but in my view the NHGRI does not, and never did, belong properly in NIH.  It should have been in NSF, where basic science is done.  Only when clearly relevant to disease should genomics be funded for that purpose (and by NIH, not NSF).  It should be focused on soluble problems in that context.
NIH funds the greedy maw of medical schools.  The faculty don't work for the university, but for NIH.  Their idea of 'teaching' often means giving 5-10 lectures a year that mainly consist of self-promoting reports about their labs, perhaps the talks they've just given at some meeting somewhere.  Salaries are much higher than at non-medical universities--but in my view grants simply should not pay faculty salaries.  Universities should.  If research is part of your job's requirements, it's their job to pay you.  Grants should cover research staff, supplies and so on.
Much of this could happen (in principle) if the NHGRI were transferred to NSF and had to fund on an NSF-level budget policy.  Smaller amounts, to more people, on focused basic research.  The same total budget would go a lot farther, and if it were restricted to non-medical school investigators there would be the additional payoff that most of them actually teach, so that they disseminate the knowledge to large numbers of students who can then go out into the private sector and apply what they've learned.  That's an old-fashioned, perhaps nostalgic(?) view of what being a 'professor' should mean.
Major pare-backs of grant size and duration could be quite salubrious for science, making it more focused and in that sense accountable.  The employment problem for scientists could also be ameliorated.  Of course, in a transition phase, universities would have to learn how to actually pay their employees.
Of course, it won't happen, even if it would work, because it's so against the current power structure of science.  And although Dr Collins has threatened to fund more small R01 grants, it isn't clear how or whether that will really happen.  That's because there doesn't seem to be any real will to change among enough people with the leverage to make it happen, and the newcomers who would benefit are, like all such grass-roots elements, not unified enough.
These are just some thoughts, or assertions, or day-dreams about the evolution of science in the developed world over the last 50 years or so.  Clearly there is widespread discontent, and clearly there is large funding going on with proportionately few results.  Major results in biomedical areas can't be expected overnight.  But we might expect research to have more accountability.

Thursday, January 8, 2015

Genomewide mapping and a correlation fallacy

When there isn't an adequate formal theory for determining cause and effect, we often must rely on searches for statistical associations between variables that we, for whatever reason, think might cause an outcome and the occurrence of the outcome itself.  One criterion is that the putative cause must arise before its effect, that is, the outcome of interest.  That time-order is sometimes not clear in the kinds of data we collect, but we would normally say we're lucky in genetics because a person is 'exposed' to his or her genotype from the moment of conception.  Everything that might cause an outcome, say a disease, comes after that.  So gene mapping searches for correlations between inherited genomic variation and variation in the outcome.  But the story is not as crystal clear as is typically presented.

In genomewide mapping studies, like case-control GWAS (or QTL mapping for quantitative traits), we divide the data into categories, based on say two variants (SNP alleles), A and B, at some genome position X. Then, if the outcome--say some disease under investigation--is more common among A-carriers than among B-carriers at some chosen statistical significance level, it is common to infer or even to assert that the A-allele is a causal factor for the disease (or, less often put this way, that B is causally protective).  The usual story is that the difference is far from categorical, that is, the A-bearing group is simply at a higher probabilistic risk of manifesting the trait.
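To make the comparison concrete, here is a minimal sketch of the test applied at a single site, using only the Python standard library.  The counts are invented for illustration, the function name is my own, and a real GWAS pipeline does a good deal more (quality control, covariates, structure correction); this shows only the bare carrier-by-status comparison.

```python
import math

def case_control_summary(cases_a, cases_b, controls_a, controls_b):
    """Odds ratio and 1-df Pearson chi-square for a 2x2 carrier-by-status table."""
    odds_ratio = (cases_a * controls_b) / (cases_b * controls_a)
    n = cases_a + cases_b + controls_a + controls_b
    n_cases = cases_a + cases_b
    n_controls = controls_a + controls_b
    n_a = cases_a + controls_a
    n_b = cases_b + controls_b
    chi2 = n * (cases_a * controls_b - cases_b * controls_a) ** 2 / (
        n_cases * n_controls * n_a * n_b
    )
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value

# Hypothetical study: 1000 cases and 1000 controls typed at site X.
odds_ratio, chi2, p_value = case_control_summary(
    cases_a=330, cases_b=670, controls_a=300, controls_b=700
)
print(f"OR = {odds_ratio:.2f}, chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

In this made-up table two-thirds of the cases do not carry A at all, and the modest excess among carriers does not even reach the conventional 0.05 level with 2000 people.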

However, the usually unstated inference is that the presence of SNP A has some direct, even if only probabilistic, effect in causing the outcome.  The gene may, for example, be involved in some signaling pathway related to the disease, so that variation in A affects the way the pathway, as a whole, protects or fails to protect the person.

Strategies to protect against statistical artifact
We know that in most cases many, or even typically most, people with the disease do not carry the A allele, because the relative risks associated with the A allele are usually quite modest.  So the correlation might be false, because as we know and too often overlook, correlation does not in itself imply causation.  One way to test for potential confusion is to compare AA, AB, and BB genotypes to see if the 'dose' (number of copies) of A is correlated with some aspect of the disease.  Usually there isn't enough data to resolve such differences with much convincing statistical rigor.
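One standard way to ask the 'dose' question is a trend test across the three genotypes (often called the Cochran-Armitage test).  The sketch below is one common form of it, not necessarily what any particular study used; the genotype counts are hypothetical, and the scores 0, 1, 2 assume each extra copy of A counts equally.

```python
import math

def trend_test(cases, controls, scores=(0, 1, 2)):
    """Trend test across genotype groups ordered BB, AB, AA.

    cases and controls are counts per genotype; scores are the assumed 'doses'.
    Returns the z statistic and a two-sided normal p-value.
    """
    totals = [c + k for c, k in zip(cases, controls)]
    n = sum(totals)
    r = sum(cases)
    # Score-weighted excess of cases over what equal risk in every group predicts
    u = sum(t * (c - m * r / n) for t, c, m in zip(scores, cases, totals))
    s1 = sum(t * m for t, m in zip(scores, totals))
    s2 = sum(t * t * m for t, m in zip(scores, totals))
    variance = r * (n - r) / n**3 * (n * s2 - s1**2)
    z = u / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts for BB, AB, AA (0, 1, 2 copies of A)
z, p_value = trend_test(cases=(450, 440, 110), controls=(490, 420, 90))
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

Note how few AA individuals there are even in a sample of 2000, which is the practical reason such dose comparisons are rarely convincing.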

Another approach to protect against false associations is to extend the study and see if the same association is retained.  But this is often very costly because of sample size demands or, if the study is in a small population, perhaps impossible.  Likewise, people exit and enter study groups, changing the mix of variation, which risks obscuring the signal.  One way to try to show systematic effects is to do a meta-analysis, by pooling studies.  If the overall correlation is still there, even if individual studies have come to different risk estimates, one may have more confidence.  This is, to my understanding, usually not done by regressing risk on allele frequency across studies, which seems like something that should be done, but there is heterogeneity in method, genotyping, accuracy, and size among studies so this is likely to be problematic.
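For readers who haven't seen how pooling works, here is a bare-bones sketch of a fixed-effect (inverse-variance) meta-analysis of odds ratios.  The three study tables are invented, the function name is my own, and the fixed-effect assumption is exactly the one that the heterogeneity among real studies calls into question.

```python
import math

def pooled_odds_ratio(tables):
    """Fixed-effect pooled odds ratio with an approximate 95% interval.

    Each table is (cases_a, cases_b, controls_a, controls_b).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for a, b, c, d in tables:
        log_or = math.log((a * d) / (b * c))
        variance = 1 / a + 1 / b + 1 / c + 1 / d  # Woolf's approximation
        weight = 1 / variance
        weighted_sum += weight * log_or
        weight_total += weight
    pooled = weighted_sum / weight_total
    se = math.sqrt(1 / weight_total)
    return (
        math.exp(pooled),
        math.exp(pooled - 1.96 * se),
        math.exp(pooled + 1.96 * se),
    )

# One hypothetical study alone, then three identical hypothetical studies pooled
single = pooled_odds_ratio([(330, 670, 300, 700)])
pooled = pooled_odds_ratio([(330, 670, 300, 700)] * 3)
print("single study:", tuple(round(x, 2) for x in single))
print("three pooled:", tuple(round(x, 2) for x in pooled))
```

The point estimate does not move; only the interval narrows.  Pooling buys statistical confidence that the correlation is there, but says nothing more about why it is there.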

An issue that seems often, if not usually, to have been overlooked
An upshot of this typical kind of finding of very weak effect sizes is that the disorder is the result of any of a variety of genomic backgrounds (genomewide genotypes) as well as lifestyle exposures that aren't being measured.  The background differences may complement the A's effects in varying ways so that the net effect is real, but on average weak.  That's why non-A carriers have almost the same level of risk (again, that is, the net effect size of A is small).

But the problem arises when the excess risk in the A-carriers is assumed to be due to that allele.  In fact, it is possible, and indeed very likely, that even in many affected A-carriers the disease has arisen because of risky variables in networks other than the one involving the 'A/B' gene.  That is why nearly as high a proportion of non-A-carriers are affected.  Because of independent assortment and recombination among genes and chromosomes, the same distribution of backgrounds will be found in the A-bearing cases (though no two individuals will have the same genome-type).  In those A-individuals their outcome may be due entirely to variants other than the A allele itself, for example, because of variants in genes in other networks.  That is, some, many, or even most A-bearing cases may not, in fact, be affected because of the A allele.
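A little arithmetic shows how large the 'not because of A' share can be.  The sketch below uses what epidemiologists call the attributable fraction among the exposed, (RR - 1)/RR, with hypothetical risks, and it makes the simplifying assumption that carriers would have had the non-carriers' risk had they not carried A.

```python
def excess_fraction_among_carrier_cases(risk_carriers, risk_noncarriers):
    """Share of affected carriers who are 'excess' cases: (RR - 1) / RR."""
    return (risk_carriers - risk_noncarriers) / risk_carriers

# Hypothetical risks: 5% in A-carriers, 4.2% in non-carriers (relative risk about 1.19)
risk_a, risk_b = 0.05, 0.042
excess = excess_fraction_among_carrier_cases(risk_a, risk_b)

carriers = 10_000
affected = carriers * risk_a            # affected A-carriers
expected_anyway = carriers * risk_b     # expected at the non-carrier rate
print(f"excess fraction among carrier cases: {excess:.2f}")
print(f"of {affected:.0f} affected carriers, about {expected_anyway:.0f} "
      f"would be expected even without A")
```

With these numbers only about one affected carrier in six is an 'excess' case; the rest are what the background rate alone would produce.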

This seems to me very likely to be a common phenomenon, given what we know about the complex genotypic variation at the thousands of potentially relevant sites typed in genomewide analysis like GWAS in any sample. One well-known issue that GWAS methods can and often do correct for, is that some factors related to population structure (origins and marriage patterns among the sampled individuals) can induce false correlations.  But even after that correction, given the true underlying causal complexity, it is likely that for some SNP sites it is only chance distributions of different complex genotypes between the A- and non-A SNP genotypes that suffice to generate the weak statistical effects, when so many sites in the genome are tested.
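The 'so many sites' point can be illustrated with a toy simulation.  It assumes that no site has any effect at all and that sites are independent (they aren't, because of linkage disequilibrium), so that each test's p-value is just a uniform random number; the site count and thresholds are arbitrary choices of mine.

```python
import random

def count_chance_hits(n_sites, alpha, seed=0):
    """Count sites passing threshold alpha when no site has any real effect."""
    rng = random.Random(seed)
    # Under the null hypothesis a p-value is uniform on [0, 1)
    return sum(rng.random() < alpha for _ in range(n_sites))

n_sites = 100_000
nominal_hits = count_chance_hits(n_sites, alpha=0.05)
strict_hits = count_chance_hits(n_sites, alpha=0.05 / n_sites)  # Bonferroni-style
print(f"{nominal_hits} of {n_sites} null sites pass p < 0.05")
print(f"{strict_hits} pass the genome-wide corrected threshold")
```

A stringent threshold removes most of these pure-chance hits, but it does not address chance imbalances of background genotypes that yield small real differences between the A and non-A groups in a particular sample.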

Suppose the A allele's estimated effects raise the risk of the disease to 5% in the A-carriers, and let's assume for the moment the convenient fiction that there is no error in this estimate.  It may be the case that 5% of the A-carriers have a very strong A-allele effect and are doomed to the disease, whereas the other 95% of A-carriers are risk-free.  Or it could be that every A-carrier's risk is elevated by some fraction so the average is 5%.  Given that almost as many cases are usually seen in non-A carriers, such uniformity seems unlikely to be true.  Almost certainly, at least in principle, the A-carriers have a distribution of risk, for whatever background genomic or environmental or stochastic reasons, whose average is 5%.  These alternative interpretations are very difficult to test, and when does anyone actually bother?
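To see why these interpretations are so hard to tell apart, consider three hypothetical populations of 10,000 A-carriers.  The individual risks below are invented; the only thing they share is the 5% average, which is all that a case count can estimate.

```python
from statistics import mean

n_carriers = 10_000

# Scenario 1: 5% of carriers are doomed, the other 95% are risk-free
all_or_nothing = [1.0] * 500 + [0.0] * 9_500

# Scenario 2: every carrier has exactly the same 5% risk
uniform = [0.05] * n_carriers

# Scenario 3: a graded distribution of risk set by background and environment
graded = [0.20] * 1_000 + [0.05] * 3_000 + [0.025] * 6_000

scenarios = {
    "all-or-nothing": all_or_nothing,
    "uniform": uniform,
    "graded": graded,
}
for name, risks in scenarios.items():
    expected_cases = sum(risks)
    print(f"{name:15s} mean risk = {mean(risks):.3f}, "
          f"expected cases = {expected_cases:.0f}")
```

All three give the same expected number of affected carriers, so a single cross-sectional count cannot distinguish them.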

The problem relates to intervention strategies
For many years, we have known from all sorts of mapping studies that most identified sites have very weak effects.  We know that in many cases environmental factors are vastly more important (because, for example, the disease prevalence has changed dramatically in the last few decades).  But the justification (rationale, or excuse) for continuing the huge Big Data approach is that it will at least identify 'druggable target' genes so the culpable pathway can be intervened in.  Hoorah! for Big Pharma--even if the gene itself isn't that important, a network has many intervention points.

However, to the extent that the potential correlation fallacy discussed here is at play, targeting genetically based therapy at the A allele may fail not because the targeting doesn't work but because most A-carriers are affected mainly or exclusively for other reasons.  If the inferential fallacy is not addressed, think of how long it would take, and how costly it would be, to end up doing better.

The correlation fallacy discussed here doesn't even assume that the A-allele is not a true risk factor, which as we noted above may often be the case if the results in Big Data studies are largely statistical artifacts.  The issue is that the A effect is unlikely to be very strong (otherwise it would be easier to see, or would show up in family studies, as some rare alleles in fact do), and that most individuals in both the A and non-A categories are affected for completely unrelated reasons.  Again, what we know about recombination, somatic mutation, independent assortment, and the complexity of gene and gene-environmental interactions suggests that this simply must often be true.  The correlation fallacy may pervasively lurk behind widely proclaimed discoveries from genome mapping.