False Positives in Animal Models–not Just Biased Statistics

Why Haven’t They Done That Yet?

So it appears another paper evaluating bias in reporting animal model data has come out in PLoS Biology. This is a great meta-analysis because now they are taking us animal model researchers to task the same way that clinicians and pharmaceutical companies have been admonished.

I think this paper, however, does not go far enough. In my opinion, the greatest issue with research into animal models is not the biased reporting of significance values, but the overselling of data as being more directly comparable to human populations than they are in reality.

I am referring to the paper “Evaluation of excess significance bias in animal studies of neurological disorders” by Tsilidis and colleagues with John Ioannidis that appeared today in PLoS Biology.

What Should be Studied and Why?

So this is a bit of a rant on a topic that I have been studying, so by no means am I a dispassionate observer of the field. When I came in to study the mouse model for the fragile X premutation, what we knew came from a single study that used two very mouse-centric tasks and tried to draw a comparison to the human literature. That paper suggested that the mouse was actually not a very good model.

My research, however, was designed after speaking with the doctors, psychiatrists, psychologists, and cognitive scientists who had been evaluating the human carriers of this premutation. These conversations led me to believe wholeheartedly that the approach we had been using in the mouse model world was backwards: it makes no sense to run a series of standard tests in mice and then use some sort of Procrustean analysis to draw parallels with human disease. For example, in mice we commonly use the water maze for learning and memory and the rotarod for motor behavior. That may be fine in some cases, but if a population shows subtle cognitive decline and progressive ataxia/intention tremors, then how exactly does one draw that parallel? It seems to me that the only way this is valid is if we take humans and throw them in a cold swimming pool with a hidden platform that they have to swim to, and then put them on a rotating drum until they fall off.

What was needed was an approach that strives to make the tasks we use for testing mice as similar as possible to those used in humans. If that is possible at the level of overall task design, great. If not, then the tasks need to at least focus on and test the same cognitive domains tested in humans. In the case of my own research, it involved developing spatiotemporal cognition tasks and motor tasks that required fine-scale visuomotor function. These tasks, which were designed to test specific hypotheses from human research, actually gave animal data that phenocopied results in human premutation carriers.

A Proposed Approach

So what does this have to do with the study I am supposed to be talking about? That part will now be made clear. First, as scientists studying animal models of disease, it is our responsibility to report all data collected in a study. Scuttling “nonsignificant” results is unacceptable (full stop). In fact, it is often much more illuminating to see what a model is capable of, not just where the model fails to perform on a given task. Secondly, as animal researchers we need to actually interact with clinicians and psychiatrists. If we are going to pretend to model their work in animals, the least we can do is actually model their work, not just pay lip service to translational neuroscience while doing what we would have done anyway.

So there was no way that Tsilidis and colleagues could have evaluated 4000 different studies for how well they actually tested the hypotheses they claimed to test based on behavioral outcomes. From experience, though, my guess is that there would be an even higher bias toward easy tasks that guarantee an effect over more difficult tasks that would inform research into the disease, significant p value or not.

This post has mostly harangued about behavior, as that is my expertise, but the same argument can be made for pathological studies. Often we in the animal world use, at best, a cresyl violet stain of a few sections, and make grand assumptions about the ramifications for treatment in a human population based on our limited, and highly biased, sampling. The movement toward stereology is nice, but it does not solve the problem because the researcher still tends to choose either a favorite region or else an area they already know is “fixed” by the treatment. Only with full virtual histology, with every histological slide digitized and stored in an independent repository, will we actually be able to answer any questions for real.

The TL;DR of this post is that as animal researchers it is good to be called out for laziness. It is time to up our game. Talk to clinicians. Make valid disease models. Do not oversell results. And finally, for Pete’s sake make the data OPEN, and that includes behavioral videos and raw, unphotoshopped histology.


2 thoughts on “False Positives in Animal Models–not Just Biased Statistics”

  1. Loved the post and I agree with you. My pet peeve is interesting studies that "make grand assumptions about the ramifications for treatment in a human population." I understand why authors do it, and why editors use it to sell magazines, but it is not helpful. To be clear: I am 175% in favor of using model organisms to do important biomedical research!! These studies should focus on fundamental mechanisms that are true in the system being examined, e.g. mice. We have created a fuzzy space, IMO, between fundamental and clinical science where we think it’s OK to take observations in mice and advertise them as being one step away from the clinic. Um…no. If researchers want to write about being one step away from the clinic, they need to get a little closer. We have an effort here at Wash U to bring lab scientists together with clinical researchers to organize, at the grass-roots level, scientifically grounded clinical trials. Very excited by this…follow @sr_amend if you are interested.


    1. Thanks for the comment. I agree entirely that we often put too much stock in the goodness of fit between a mouse model and the human condition. Oftentimes, IMO, it is because we are trying to "solve" or "cure" some condition rather than trying to understand it first. What is needed for scientists to start solving problems or curing disease is for us to take a step back and try to learn everything there is to know about the disorder first, and we can use an animal model as a first step toward gaining that understanding. If we take the time to understand why a mutation affects behavior, or results in cancer, then, and only then, can we move forward and try to cure the disease. When we take the results of a preclinical mouse study and try to move forward to the clinic before we understand the findings in the mouse, we end up in a situation where we are really good at curing cancer in mice, but we are no closer to curing cancer in humans.

