The Ticking Time Bomb Hidden in Orthopedic Research
What if an entire generation of orthopedic data collected for countless studies is so horribly biased as to render it problematic at best and useless at worst? That could be the problem we find ourselves in as lower-quality orthopedic study after study is blown out of the proverbial water by higher-quality studies contradicting the prior, glowingly positive result. In fact, in the last decade, the research base that serves as the foundation for an entire medical specialty has been crumbling like a skyscraper built on sand. I think I know one reason. Let me explain.
Orthopedic Outcome Measures Are Decidedly Different
An outcome measure is a metric that helps physicians determine if a treatment is working. This usually takes the form of a questionnaire given to a patient or some objective measure. In most fields outside of orthopedics, the patient is asked about his or her perception of the outcome, and then a blood test or other objective test supplies additional data. As an example, in a cardiac study, there’s a questionnaire given to the patient about his or her overall health (usually the SF-12 or SF-36 questionnaire) and an objective measure that doesn’t depend on the input of the treating physician. This could be something like the distance the patient can walk or the blood levels of certain enzymes.
In orthopedic care there’s a big issue. While pain and physical function are easy enough to measure, objective tests that don’t rely on the input of the physician are pretty rare. You would think that you could use MRI, and in some studies that might work, but most of the studies performed to date show that MRI findings correlate poorly with pain: many patients with normal or slightly abnormal MRIs have significant pain, and many others with highly abnormal MRIs have little pain. In some studies, things like range of motion may work, but in many treatments the range of motion isn’t expected to change. In addition, range of motion is often measured by the treating doctor, which, as we’ll see below, has its issues. In fact, there will be few truly objective measures that work until someone comes up with a magic brain scan that can measure pain.
Why Is Physician Input a Problem?
Physicians desperately want to see their patients improve. As a physician who treats patients, a good day is always when everyone I see is improving or already better. The problem is that research demonstrates that we physicians tend to overestimate improvement. Just look at a recent study of meniscus tear surgery for an example. The research demonstrated that the surgeons performing the operation thought that most patients were quite satisfied with their result, and the patients went the other direction, with most saying the procedure didn’t work well.
This is really where the outcome measures in orthopedics depart from the rest of the medical field. Orthopedic outcome measures usually rely heavily on physician input for the final score determining whether the patient improved. I found this out the hard way.
Many years back, when we were putting together the structure of our registry to track patient outcomes, we ran into a problem. Most validated orthopedic functional questionnaires required the physician to give his or her assessment of the outcome. In addition, many required things like the physician’s “eyeball” of the range of motion. This was an issue as most of our patients were from out of state or another country, so bringing them back to Colorado was almost impossible. Thankfully we finally found a set of questionnaires that were research-based and relied only on what patients told us about their result. This also felt ethically better as this is how outcomes are measured in other fields of medicine. In fact, the very idea that the physician could somehow sway the outcome measure one way or the other is a serious “no no” in the rest of medicine.
What Has the Orthopedic-Surgeon-Input Problem Created?
The last decade of orthopedic research has exposed a gaping crack in the field’s foundation. To use a structural engineering example, the inspector has condemned the orthopedic bridge due to serious flaws in its construction. As an example, we had many case series (the results of consecutively treated patients) that seemed to show beautiful results from meniscus surgery with high rates of patient satisfaction. However, when higher-level studies were performed with metrics that focused on what the patient reported as changes in pain and function, it turned out that the procedure was no better than a placebo surgery.
I think this is just the tip of the iceberg. We will likely see similar results for hip labrum surgery in the not too distant future. How many other orthopedic procedures will crater because of this physician-input problem?
A Recent Example
A new study was just published on about a thousand knee and hip arthritis patients treated with adipose stromal vascular fraction (SVF). Regrettably this technique isn’t legal in the United States as, for some unknown reason, our FDA has classified it as a drug requiring a decade of clinical trials. However, many doctors are flouting the rule, so it is available here.
The study, which was mostly performed in the Czech Republic, reported a remarkable rate of improvement in knee and hip arthritis patients. It showed that 91% of patients were classified as having greater than 50% improvement at one year after the injection of SVF. Wow! I have seen knee and hip stem cell data presented for years at more conferences than I can count. Same-day and cultured bone marrow stem cell procedures, stem cell drugs in clinical trials, SVF from fat, cultured fat stem cells—none of these researchers (including us) have reported anything approaching these results. In addition, knee replacement outcomes are also nowhere near these outcomes, and that involves amputating the entire painful joint!
So there are really two possibilities. One is that there is something truly magical about the method of processing the adipose SVF used in the study. The other is that there’s something different about the way the outcomes are being measured.
Let’s explore the first possibility. The physicians used a kit to process the adipose tissue. One of the interesting things that jumped out was that they used two different versions of the system: one that used enzyme-based digestion to get to SVF and the other that used no enzymes. Remarkably, there wasn’t a difference in how the two procedures worked on arthritis! Let’s explore that further.
First, you need to know that to get to adipose SVF, the process involves liberating the cells in fat from the collagen matrix in which they’re normally encased and trapped. This is done with an enzyme, usually collagenase. This is one of the reasons the US FDA has classified this procedure as the manufacture of an unapproved drug (a position I don’t personally agree with). There are other procedures that can be used to try to isolate cells from fat without an enzyme, but they usually produce one-tenth to one-fifth the number of total cells, and far fewer of these are actually stem cells. Hence, the physicians in effect reported that the results of the procedure had little to do with the number of stem cells that were available to treat the joint. This isn’t consistent with what has been reported for fresh bone marrow isolation techniques, where the dose of stem cells seems to make a difference (see study 1 and study 2).
So while the possibility remains that the processing of the cells is superior to other methods, it seems unlikely. There are many review articles that have gone over many different ways to process SVF both with and without enzymes. Many practitioners use systems that have been researched to maximize stem cell numbers in the SVF. In addition, even if the researchers reported the same total number of cells in both samples, the non-enzyme sample likely didn’t contain nearly as many freely available stem cells, which would mean that the result was dose independent.
Now let’s explore the other possibility. The doctors who wrote the paper list five metrics that they used to determine outcome. Regrettably, three of the five depended on what the physician thought or observed about the outcome. As seen in the meniscus study above, when we ask treating physicians to quantify how well patients did with a procedure, they usually guess way high. This indicates nothing more than that all doctors want to believe that what they do matters, and human psychology being what it is, many of us want the best for our patients.
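To make that arithmetic concrete, here is a minimal sketch in Python. Everything in it is hypothetical: how the paper actually combined its five metrics isn't described here, so the equal-weight average, the 20-point optimism bias, and the patient values are assumptions chosen only to illustrate the mechanism.

```python
# Hypothetical illustration only: the scoring rule, the bias size, and the
# patient values are assumptions, not data from the study discussed above.

RESPONDER_THRESHOLD = 50.0  # "greater than 50% improvement"


def composite_improvement(patient_reported, physician_rated):
    """Equal-weight average of all metric scores (each a % improvement)."""
    scores = list(patient_reported) + list(physician_rated)
    return sum(scores) / len(scores)


def is_responder(improvement):
    """True if the composite clears the responder threshold."""
    return improvement > RESPONDER_THRESHOLD


def add_optimism(scores, bias):
    """Shift physician-rated scores upward by `bias` points, capped at 100."""
    return [min(100.0, s + bias) for s in scores]


# One imaginary patient whose real improvement is 40% on every metric.
patient_reported = [40.0, 40.0]          # two patient-only metrics
physician_unbiased = [40.0, 40.0, 40.0]  # three physician-dependent metrics
physician_optimistic = add_optimism(physician_unbiased, bias=20.0)

patient_only = composite_improvement(patient_reported, [])
unbiased = composite_improvement(patient_reported, physician_unbiased)
optimistic = composite_improvement(patient_reported, physician_optimistic)

print("patient-only:", patient_only, is_responder(patient_only))
print("unbiased physician input:", unbiased, is_responder(unbiased))
print("optimistic physician input:", optimistic, is_responder(optimistic))
```

In this made-up case the patient's own answers never change, yet the optimistic physician scores alone lift the composite from 40 to 52 and move the patient across the 50% line. Spread a shift like that across a thousand patients and a headline responder rate could move substantially without any difference in how patients actually feel.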
In summary, had the new fat stem cell study used only metrics that were 100% dependent on what the patients thought of their results, my educated guess is that their results would be very similar to what everyone else has reported. Why is the aberrant result likely created by optimistic doctors a problem for stem cell therapy as a whole? Let me explain by way of an innocuous blue dye and low-back pain.
The Methylene Blue Debacle
A few years back, a study out of China reported amazing results in patients with low-back pain due to a painful disc. The study involved using a common surgical dye called methylene blue to inject the discs. It seemed so contrary to what we thought we knew that it required an editorial in a major journal from an older and highly respected researcher to try and explain the results.
The study was too good to ignore for many clinicians seeing these patients. A large spike occurred in national methylene blue sales as many doctors (including those in our practice) began injecting discs with blue dye. We all held our breath to see what would happen, wondering if we were all missing a valuable tool in helping these patients. Regrettably, we and many of our colleagues observed no changes in the patients other than they now had blue discs! Methylene blue was soon relegated to the dustbin of medical history.
What did we learn? The data in this study may have been, as the editorial suggested, related to cultural differences in how Chinese patients communicate outcome to their physicians. More importantly, when the data looks remarkably better than anything else out there, it’s right to question it; in fact it’s part of the scientific process.
Orthopedic research has a big, ticking time bomb at its core, and the bedrock of its foundation has a gaping crack. The questionnaires and metrics used to determine outcome are broken in that physician input makes up a big part of the score. This is unique in medicine, and it’s likely why we see one high-quality study after another contradicting years of poorer-quality studies that seemed to show that surgery works. As with any addiction, the first part of the solution is recognizing that you have a problem. In that regard, it’s time to wean off the physician-input sauce and go cold turkey. Our kids’ kids will thank us.
Update 8/30/17: The Michalek study above was withdrawn from the journal.