Give the man what he wants

Caplan is back with more on my book with Kurzban. In this post, I’m going to try to give the man what he wants — regression models predicting issues with both key demographics and with people’s liberal-conservative labels. (FYI, some of this gets really technical; I’ll mark the worst paragraphs with the phrase “Technical points,” and those of you who don’t run data should feel free to skip those paragraphs.)

Taking stock of some things we’re agreeing and disagreeing about

Agreement: Political views are not 95% self-interest. Disagreement: Caplan thinks that self-interest occasionally plays a role but overall poorly predicts public opinion. I think it’s often (but not always) a substantial (but not exclusive) predictor.

Agreement: Self-placement on liberal-conservative labels correlates substantially with a range of issue opinions.

Technical points: We do appear to disagree on whether a simple multiple regression can solve disputes over the extent to which lib-con labels are causes or effects of issue opinions. Again, I’d point to pages 227 to 235 of the appendixes to my book for a straightforward demonstration that multiple regressions can do more harm than good in understanding things when an investigator has made mistaken causal assumptions about big correlates.

Disagreement: Whether issue opinions are one-dimensional, or, to use Caplan’s phrase, “boil down roughly to one big opinion, plus random noise.” I think that the high correlations among some issues and the near-zero correlations among others are precisely the evidence that there’s more than one statistical clump.

Disagreement: Whether entering liberal-conservative labels into a multiple regression would importantly undermine the demographic predictors of issue opinions. This point is the subject of this post. I’m going to see whether the major demographic patterns we found are importantly affected by entering lib-con labels in the models. Even if they aren’t importantly affected, however, this probably won’t help resolve my overall differences with Caplan, because he will surely stick to the next disagreement.

Disagreement: Whether or not the existing demographic predictors of public opinion count as “self-interest” (or some closely related concept).

Fine, here are some regressions with liberal-conservative labels as predictors

So, here, I’ll limit the investigation to something simple: Do the kinds of demographic correlates we talk about in the book largely go away when I control for overall liberal-conservative self-placement? I’ll leave to the side, for now, the further dispute over whether these demographic items do or don’t signal self-interest.

Technical points: For the book, we used categorical predictors and re-coded all issue variables as 100-point percentile scales. This made all the analyses comparable: whenever a predictor had a coefficient of “5” in a regression model, it meant “having this feature in the sample predicted a 5-percentile increase in liberalness on the issue in question, given everything else in the model.”
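For readers who do run data, here is a minimal sketch of one way to put raw survey answers on a 0-to-100 percentile scale. The exact recoding procedure isn't spelled out in this post, so the mid-rank rule for ties and the function name `to_percentile_scale` are assumptions for illustration only.

```python
from collections import Counter


def to_percentile_scale(responses):
    """Map each response to its mid-rank percentile (0-100) within the sample.

    Assumptions (not taken from the book): tied answers share the midpoint
    of the percentile band they occupy, and higher raw values mean more
    liberal answers.
    """
    n = len(responses)
    counts = Counter(responses)
    below = 0  # number of respondents with a strictly lower answer
    percentile_of = {}
    for value in sorted(counts):
        percentile_of[value] = 100.0 * (below + counts[value] / 2.0) / n
        below += counts[value]
    return [percentile_of[r] for r in responses]


# Hypothetical answers on a 3-point item (1 = most conservative, 3 = most liberal)
scores = to_percentile_scale([1, 1, 2, 3])
```

With every issue on the same 0-to-100 scale, a coefficient of 5 reads as five percentile points regardless of how many response options the original question had.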


One of our empirical points in the book is about how people who combine what we call Freewheeler lifestyle features with higher education tend to be pro-choice while people who combine what we call Ring-Bearer lifestyle features with less education tend to be pro-life (my prior post explains some of this).

And, sure enough, in a regression predicting abortion views in the General Social Survey, here are some results. Model 1 contains the coefficients without the lib-con “control” (i.e., the GSS’s POLVIEWS variable) and Model 2 contains the coefficients with lib-con in the model.

| Predictor | Model 1 | Model 2 |
|---|---|---|
| No sex partners since age 18 | -5.7 | -5.5 |
| 1 sex partner since age 18 | -5.6 | -4.3 |
| 5 or more sex partners since age 18 | 8.6 | 7.3 |
| Not been to a bar in the past year | -6.8 | -6.1 |
| 3 or more children | -5.1 | -3.9 |
| High-school diploma or more | 9.2 | 9.9 |
| Graduate degree | 11.4 | 9.2 |

Technical points: My regressions often include overlapping predictors. E.g., if you want to know the typical responses for someone with a graduate degree, you’d include the coefficient for “High-school diploma or more” added to the coefficient for “Graduate degree.” And, e.g., in the income model later, if you want to know about someone in the top 10% of family income, you’d add the coefficients for both “Family income in the top 20%” and “Family income in the top 10%.”
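As a small worked example of adding overlapping coefficients, the sketch below uses only the Model 1 abortion coefficients shown in this post. The dictionary and the helper name `combined_shift` are hypothetical, and the result should be read as a shift relative to the model's reference group with the other predictors held fixed.

```python
# Model 1 coefficients copied from the abortion results in this post.
# Only the education predictors shown there are included; the full model
# may contain predictors that are not listed in the post.
ABORTION_MODEL_1 = {
    "High-school diploma or more": 9.2,
    "Graduate degree": 11.4,
}


def combined_shift(coefficients, features):
    """Sum the coefficients of every overlapping category a person falls into."""
    return sum(coefficients[name] for name in features)


# Someone with a graduate degree also has a high-school diploma or more,
# so both coefficients apply to that person.
grad_degree_shift = combined_shift(
    ABORTION_MODEL_1,
    ["High-school diploma or more", "Graduate degree"],
)
```

On these numbers the combined shift for a graduate-degree holder comes to about 20.6 percentile points in Model 1.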

So, in this case, people who have had five or more sex partners, go to bars, have fewer than three children (all Freewheeler features), and have lots of education are really likely to be liberal on abortion. People who have had one (or no) partner, don’t go to bars, have lots of kids (all Ring-Bearer features), and have less education are really likely to be conservative on abortion. And the results change very little by adding people’s overall lib-con self-placement into the model.

Technical points: The combined demographic features by themselves have an r-squared of 10.6 percent. Lib-con labels by themselves have an r-squared of 9.5 percent. Putting it all in the same model gives an r-squared of 17.5 percent. What this means is that the unique variance of the demographic features is 8.0 (i.e., 17.5 minus 9.5); the unique variance of lib-con labels is 6.9 (i.e., 17.5 minus 10.6); and then there’s an extra 2.6 of overlapping variance that can be accounted for either by these demographics or by lib-con labels.
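The arithmetic in that note can be sketched as a tiny function. The name `partition_variance` is made up for illustration; the inputs are the three r-squared values (in percent) quoted above.

```python
def partition_variance(r2_a, r2_b, r2_both):
    """Split explained variance into unique and shared parts.

    r2_a    : r-squared of predictor set A on its own
    r2_b    : r-squared of predictor set B on its own
    r2_both : r-squared with both sets in the same model
    """
    unique_a = r2_both - r2_b      # what A adds once B is already in the model
    unique_b = r2_both - r2_a      # what B adds once A is already in the model
    shared = r2_a + r2_b - r2_both # overlap either set could account for
    return unique_a, unique_b, shared


# Abortion: demographics alone 10.6, lib-con alone 9.5, both together 17.5
unique_demo, unique_libcon, shared = partition_variance(10.6, 9.5, 17.5)
```

Up to floating-point rounding, this reproduces the 8.0, 6.9, and 2.6 reported for abortion. The three pieces always sum to the combined r-squared, which is a handy check when re-running the other issues.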

The bottom line is that both demographics and ideology are big predictors of abortion attitudes, and neither dominates the other in a regression model — indeed, there’s relatively little overlap between these predictors.

School prayer and immigration

Another empirical point in our book is that discrimination issues are often predicted by a combination of the group-based category under consideration with education and test performance.

And, sure enough, in a regression predicting views on school prayer, here are some results (again, Model 1 is without the lib-con predictor and Model 2 is with).

| Predictor | Model 1 | Model 2 |
|---|---|---|
| Not Christian | 16.5 | 14.7 |
| Bachelor’s degree or more | 8.1 | 8.1 |
| Test performance in top 20% | 10.0 | 9.8 |

In short, for school prayer, we often find liberals among non-Christians and people who test well and have more education; the conservatives are especially likely to come from Christians with less brains. And it doesn’t change much by adding the lib-con label variable to the model.

Technical points: In line with the calculations from the prior technical note, the unique variance accounted for by demographics here is 12.0; the unique variance accounted for by lib-con is 1.5; the shared variance accounted for by either is 2.4.

The bottom line with school prayer is that demographics are a much bigger deal than lib-con labels.

Here are further results, this time from a regression predicting views on immigration:

| Predictor | Model 1 | Model 2 |
|---|---|---|
| Born foreign | 9.2 | 9.4 |
| Parents born foreign | 6.7 | 6.2 |
| White | -7.7 | -6.7 |
| Latino | 10.4 | 10.1 |
| Bachelor’s degree or more | 10.0 | 9.8 |
| Test performance in top 20% | 7.5 | 6.9 |

This time, the liberals tend to come from Latinos, immigrants, and those who test well and have more education; the conservatives are especially likely to come from native-born whites with less brains. And, again, the coefficients don’t move much when “controlling” for lib-con labels.

Technical points: The unique variance accounted for by demographics here is 11.7; the unique variance accounted for by lib-con is 1.5; the shared variance accounted for by either is 1.2.

The bottom line with immigration, as with school prayer, is that the demographics are a much bigger deal than lib-con labels.

Income redistribution

Another empirical point in our book is that income and race predict people’s views on whether government should redistribute income (here, the GSS’s EQWLTH variable). Here are regression models (again, without and with the lib-con variable):

| Predictor | Model 1 | Model 2 |
|---|---|---|
| Family income in bottom 20% | 3.6 | 3.6 |
| Family income in bottom 40% | 3.6 | 2.8 |
| Family income in top 20% | -3.6 | -3.3 |
| Family income in top 10% | -6.9 | -7.0 |
| Personal income in top 20% | -3.9 | -4.1 |
| White | -10.5 | -8.6 |

Race is a big deal, but so is the cumulative effect of these income categories: in Model 1, people in the bottom 20% of family income and people in the top 10% of family income are 17.7 percentile points away from each other (adding together 3.6 + 3.6 + 3.6 + 6.9). And, again, adding lib-con labels to the model doesn’t change things much.
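To spell out where the 17.7 comes from, here is a short sketch using the Model 1 family-income coefficients from this post. It assumes, in line with the overlapping-predictor coding described earlier, that someone in the bottom 20% also counts as bottom 40% and someone in the top 10% also counts as top 20%. The variable names are illustrative.

```python
# Model 1 family-income coefficients from the redistribution results
EQWLTH_MODEL_1 = {
    "Family income in bottom 20%": 3.6,
    "Family income in bottom 40%": 3.6,
    "Family income in top 20%": -3.6,
    "Family income in top 10%": -6.9,
}

# A bottom-20% household picks up both "bottom" coefficients.
bottom_20_shift = (
    EQWLTH_MODEL_1["Family income in bottom 20%"]
    + EQWLTH_MODEL_1["Family income in bottom 40%"]
)

# A top-10% household picks up both "top" coefficients.
top_10_shift = (
    EQWLTH_MODEL_1["Family income in top 20%"]
    + EQWLTH_MODEL_1["Family income in top 10%"]
)

# Distance between the two groups, in percentile points
income_gap = bottom_20_shift - top_10_shift
```

The bottom group sits about 7.2 points above the middle-income reference group and the top group about 10.5 points below it, which gives the 17.7-point gap.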

Technical points: The unique variance accounted for by demographics here is 5.0; the unique variance accounted for by lib-con is 7.5; the shared variance accounted for by either is 1.3.

Unlike the prior models, here lib-con accounts for more variance as a stand-alone (8.8) than the demographics do as stand-alones (6.3). But if Caplan wants to argue that 6.3 is really small, he needs to simultaneously tell us why the stand-alone variance predicted by lib-con labels for school prayer (3.9) or immigration (2.7) is somehow not small.
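The stand-alone figures quoted here are just each predictor set's unique variance plus the shared variance. The sketch below recomputes them for all four issues from the numbers in the technical notes; the dictionary layout and the helper `stand_alone` are illustrative, not anything from the book.

```python
# (unique to demographics, unique to lib-con, shared), in percent of variance,
# as reported in the technical notes of this post
PARTITIONS = {
    "abortion": (8.0, 6.9, 2.6),
    "school prayer": (12.0, 1.5, 2.4),
    "immigration": (11.7, 1.5, 1.2),
    "income redistribution": (5.0, 7.5, 1.3),
}


def stand_alone(unique_demo, unique_libcon, shared):
    """Return (demographics alone, lib-con alone), rounded to one decimal."""
    return round(unique_demo + shared, 1), round(unique_libcon + shared, 1)


STAND_ALONE = {issue: stand_alone(*parts) for issue, parts in PARTITIONS.items()}

for issue, (demo_alone, libcon_alone) in STAND_ALONE.items():
    print(f"{issue}: demographics {demo_alone}, lib-con {libcon_alone}")
```

Laid out this way, the comparison in the paragraph above is easy to see: demographics alone run from 6.3 to 14.4 across the four issues, while lib-con alone runs from 2.7 to 9.5.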

My point all along has been that these demographic relationships are substantial by the usual standards, even if things like lib-con labels are also substantial correlates. Now I can report that it doesn’t matter much if we add lib-con labels to the demographic models — the demographics still get a substantial share of the variance (sometimes a bit less than lib-con, sometimes a bit more, sometimes a ton more).

And the point remains that, despite the lib-con correlations, general public opinion is really, really not “one-dimensional.” For example, while lib-con by itself predicts 9.5 percent of the variance in abortion and 8.8 percent of the variance in income redistribution, abortion by itself predicts only 0.2 percent of the variance in income redistribution — which is about as close to zero as these things get.

But, as I said earlier, none of this will settle my disputes with Caplan. I think he might have really thought that we weren’t putting lib-con in our models because we were hiding something. We weren’t hiding anything. I’ve explained at length — both in the book (e.g., pp. 15-21) and in replies to Caplan — why we didn’t “control” for the usual suspects like lib-con labels. The analyses in this post have added something else: even if we had put lib-con in all our models, it really wouldn’t have mattered much.

Having seen what will surely be disappointing results from these analyses, Caplan will probably retreat to “but these demographics don’t say anything about self-interest.” He might also try a different range of “controls,” even though his comments so far have focused almost exclusively on liberal-conservative labels. Which is fine. I’m becoming resigned to the fact that I’m unlikely to convince Caplan of much.


Caplan appears to still be worried that I’m hiding the ball, so here is a new set of regressions. In these, I’ve replaced the categorical predictors with more typical continuous measures for education and income. I’ve also combined lifestyle features into a single Freewheeler scale. If Caplan wants me to admit that the various components here (sexual history, drinking at bars, number of children, etc.) are small, I hereby admit it; the explanatory power is in combining these kinds of features into an overall lifestyle profile.

This time I’m reporting standardized coefficients for all predictors in the model. (All issue variables are coded such that higher values indicate more liberal positions.)

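| Issue | Predictor | Standardized coefficient |
|---|---|---|
| Abortion | Freewheeler | .199 |
| Abortion | Education | .170 |
| Abortion | Lib-Con | -.258 |
| School prayer | Christian | -.242 |
| School prayer | Education | .197 |
| School prayer | Lib-Con | -.119 |
| Immigration | Immigrant family | .189 |
| Immigration | Latino | .097 |
| Immigration | White | -.092 |
| Immigration | Education | .156 |
| Immigration | Lib-Con | -.125 |
| Income redistribution | Family income | -.171 |
| Income redistribution | White | -.122 |
| Income redistribution | Lib-Con | -.286 |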
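For anyone comparing these standardized coefficients with the percentile-point coefficients earlier in the post, here is a minimal sketch of the textbook conversion: a raw coefficient multiplied by the predictor's standard deviation and divided by the outcome's standard deviation. This post doesn't say how the standardized values were produced, so treat this as the generic formula, with a hypothetical function name and made-up toy data.

```python
from statistics import stdev


def standardized_coefficient(raw_coefficient, predictor_values, outcome_values):
    """Rescale a raw regression coefficient into standard-deviation units."""
    return raw_coefficient * stdev(predictor_values) / stdev(outcome_values)


# Hypothetical toy data, not GSS values: each extra year of schooling goes
# with exactly 5 more percentile points, so the raw slope is 5.0.
years_of_school = [10, 12, 14, 16]
issue_percentile = [30.0, 40.0, 50.0, 60.0]

beta = standardized_coefficient(5.0, years_of_school, issue_percentile)
```

Because the toy data are perfectly linear, `beta` comes out at 1.0 up to rounding. Real survey coefficients like the ones above are far smaller because most of the variation in opinions is left unexplained by any single predictor.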