Since moving to Nashville, I’ve tried to get into country music, relying largely on suggestions from the music threads of Lawyers, Guns, & Money.
Drive-by Truckers have a song, “Puttin’ People on the Moon,” with the lyrics:
Mary Alice got cancer just like everybody here
Seems everyone I know is gettin’ cancer every year
I have some connection to Appalachia, and everyone I know of who has gotten cancer under the age of 40 lives somewhere in Appalachia (and Appalachian folk are a distinct minority of the people I know). So what does the data say?
The CDC keeps track of cancer rates, with the most recent data being from 2011. And sure enough, Kentucky is right there at the top for death rates, followed by West Virginia. As for actual cancer rates, D.C. is at the top, oddly. Kentucky is second, Pennsylvania third, and West Virginia thirteenth. So, don’t get cancer in West Virginia, because you’re more likely to die.
There is, of course, a huge and obvious confounding factor: smoking, which the CDC also tracks. And sure enough, Kentucky has the highest smoking rate, followed by West Virginia.
I’ve included a sortable document at the end of the post, with the data cleaned up slightly.
Let’s look at some graphs!
First, smoking and cancer rates:
There doesn’t seem to be much correlation. With some spreadsheet magic, we find the correlation is .248, which is quite weak (who cares about p-values). Not what I expected.
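For anyone who would rather skip the spreadsheet, here is a minimal sketch of the same kind of calculation in Python. It computes a Pearson correlation coefficient, which is my assumption about what the spreadsheet's correlation function does. The function name `pearson_r` and the two short lists are made up for illustration; the numbers are placeholders, not the CDC figures.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 points")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a list is constant")

    return cov / sqrt(var_x * var_y)


# Placeholder values only (NOT real CDC data): one entry per state.
smoking_pct = [10.0, 15.0, 20.0, 25.0, 30.0]
cancer_rate = [440.0, 470.0, 450.0, 480.0, 460.0]

r = pearson_r(smoking_pct, cancer_rate)
print(round(r, 3))
```

Swapping in the real per-state columns from the table at the end of the post (leaving out any state with a missing value) should give the figures quoted here, assuming the spreadsheet is doing the same thing.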
I would guess cancer rates correlate better with population age, but we’d have to look at the data. Another time perhaps.
Now, let’s look at smoking and death rates.
The correlation is quite clear. It’s .891 in fact, which is very strong. Once again, p-values can take a hike.
States with higher rates of smoking have much higher death rates from cancer. Does West Virginia have a younger population than the rest of the country? If so, that could explain its low overall cancer rate and higher death rate.
And there are a hundred other possibilities. Socioeconomic status is a big one. Pollution is another, and was in fact my initial hypothesis. The data I used is for a single year, and the rates might fluctuate significantly from year to year. But the effect of smoking seems like the obvious and overwhelming answer.
So the moral of the story is… don’t smoke. Not very exciting.
This table has data for all states and the District of Columbia, with one exception. The CDC does not have cancer rate data for Nevada in 2011. They do have cancer death rates for the state, however. Strange.