Tuesday, February 6, 2018

On the Uses (and Abuses) of Economath: The Malthusian Models

Many American undergraduates in Economics who are interested in doing a Ph.D. are surprised to learn that the first year of an Econ Ph.D. feels much more like a Ph.D. in solving mathematical models by hand than like learning economics. Typically, there is very little reading or writing involved, but loads and loads of fast algebra are required. Why is it like this?

The first reason is that mathematical models are useful! Take the Malthusian Model. All you need is four simple assumptions: (1) that the birth rate is increasing in income, (2) that the death rate is decreasing in income, (3) that income per person is negatively related to population, and (4) that the rate of technological growth is slow relative to population growth. With these, you can explain a lot of world history, and you arrive at the surprising conclusion that income in a Malthusian economy is determined solely by the birth and death rate schedules, and is uncorrelated with technology. Using this model, you can explain, for example, why incomes before 1800 were roughly stagnant for centuries despite improving technology (technological advance just resulted in more people; see the graph of income proxied by skeletal heights below). It also explains why the Neo-Europes -- the US, Australasia, and the Southern Cone countries -- are rich: they were depopulated by disease, and then Europeans moved in with lots of land per person. It is a very simple, and yet powerful, model. And it makes (correct) predictions that many historians (e.g., Kenneth Pomeranz), scientists (e.g., Jared Diamond), and John Bates Clark-caliber economists (see below) get wrong.
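The logic of the four assumptions can be checked with a few lines of simulation. What follows is only a minimal sketch: the linear birth and death schedules, the production function y = A * N^(-theta), and every parameter value are assumptions chosen for illustration, not taken from any paper.

```python
# Illustrative sketch of a textbook Malthusian economy.
# All functional forms and parameter values are assumptions for illustration.

def simulate(A, periods=5000, N0=1.0, theta=0.5):
    """Iterate population N under fixed technology A; return (N, income per person)."""
    b0, b1 = 0.02, 0.01   # (1) birth rate b0 + b1*y rises with income y
    d0, d1 = 0.04, 0.01   # (2) death rate d0 - d1*y falls with income y
    N = N0
    for _ in range(periods):
        y = A * N ** (-theta)                    # (3) income per person falls with population
        growth = (b0 + b1 * y) - (d0 - d1 * y)   # net population growth rate
        N *= 1.0 + growth                        # (4) technology A is held fixed while N adjusts
    return N, A * N ** (-theta)

# Compare long-run outcomes across three hypothetical technology levels.
results = {A: simulate(A) for A in (1.0, 1.5, 2.0)}
for A, (N, y) in results.items():
    print(f"A={A:.1f}  population={N:.3f}  income per person={y:.3f}")
```

Under these assumed schedules, income settles where births equal deaths, y* = (d0 - b0)/(b1 + d1), whatever the value of A. Better technology shows up only as a larger population.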

A second beneficial reason is signalling. This reason is not to be discounted given the paramount importance of signalling in all walks of life (still not sufficiently appreciated by all labor economists). Smart people do math. Even smarter people do even more complicated-looking math. I gratuitously put a version of the Melitz model in my job market paper, and when I interviewed, someone remarked that I was "really teched up!" Simple models are not something that serious grown-ups partake in. Other social science disciplines have their own versions of peacock feathers. In philosophy, people write in increasingly abstruse terms, using obscure language and jargon, going through enormous effort to use words requiring as many people as possible to consult dictionaries. Unfortunately, the Malthusian model above, while effective in terms of predictive power, is far too simple to play a beneficial signalling role, and as a result would likely have trouble getting published if introduced today.

A third reason to use math is that it is easy to use math to trick people. Often, if you make your assumptions in plain English, they will sound ridiculous. But if you couch them in terms of equations, integrals, and matrices, they will appear more sophisticated, and the unrealism of the assumptions may not be obvious, even to people with Ph.D.'s from places like Harvard and Stanford, or to editors at top theory journals such as Econometrica. A particularly informative example is the Malthusian model proposed by Acemoglu, Johnson, and Robinson in the 2001 version of their "Reversal of Fortune" paper (model starts on the bottom of page 9). Note that Daron Acemoglu is widely regarded as one of the most brilliant economic theorists of his generation, is a tenured professor of Economics at MIT, was recently the editor of Econometrica (the top theory journal in all of economics), and was also awarded a John Bates Clark medal (the 2nd most prestigious medal in the profession) in large part for his work on this paper (and a closely related paper). Also keep in mind this paper was eventually published in the QJE, the top journal in the field. Very few living economists have a better CV than Daron Acemoglu. Thus, if we want to learn about how economath is used, we'll do best to start by learning from the master himself.

What's interesting about the Acemoglu et al. Malthusian model is that they take the same basic assumptions, assign a particular functional form to how population growth is influenced by income, and arrive at the conclusion that population density (which is proportional to technology) will be proportional to income! They use the model:

p(t+1) = rho*p(t) + lambda*(y-ybar) + epsilon(t),

where p(t+1) is population density at time t+1, p(t) is population density at time t, rho is a parameter (perhaps just less than 1), lambda is a parameter, y is income, ybar is the level of Malthusian subsistence income, and epsilon is an error term. If you impose a steady state (p* and y*) and solve for p*, you get:

p* = lambda*(y* - ybar)/(1 - rho)

I.e., you get that population density is increasing in income, and thus that income per person should have been increasing throughout history. Thus, these guys from MIT were able to use mathematics and overturn one of the central predictions of the Malthusian model. It is no wonder, then, that Acemoglu was then awarded a Clark medal for this work.
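For readers who want the steady-state step spelled out, here is a short worked derivation. It assumes the error term is set to its mean of zero in the steady state, which the model does not state explicitly in the excerpt above.

```latex
% Steady state: p(t+1) = p(t) = p^*, y = y^*, and \epsilon set to zero (assumption).
\begin{align*}
p^* &= \rho\, p^* + \lambda\,(y^* - \bar{y}) \\
(1-\rho)\, p^* &= \lambda\,(y^* - \bar{y}) \\
p^* &= \frac{\lambda}{1-\rho}\,(y^* - \bar{y}) \\
\frac{\partial p^*}{\partial y^*} &= \frac{\lambda}{1-\rho} > 0
  \quad \text{whenever } \lambda > 0 \text{ and } \rho < 1 .
\end{align*}
```

The positive slope depends on rho being strictly less than 1. At rho = 1 the second line collapses to y* = ybar and p* is no longer pinned down by this equation.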

Except. This version doesn't necessarily fit the skeletal evidence above, although that evidence may be incomplete and imperfect (selection issues?). What exactly was the source of the difference between the classical Malthusian model and the "MIT" Malthusian model? The crucial assumption, unstated in words but there in Greek letters for anyone to see, was that income affects the level of population, but not the growth rate in population. Stated differently, this assumption means that a handful of individuals could and would out-reproduce the whole of China and India combined if they had the same level of income. (With rho less than one, say, .98, the first term will imply a contraction of millions of people in China/India. With income over subsistence, we then need to parameterize lambda to be large enough so that overall population can grow in China. But once we do this, we'll have the implication that even a very small population would have much larger absolute growth than China given the same income.) Obviously, this is quite a ridiculous assumption when stated in plain language. A population can grow by, at most, a few percent per year. 100 people can't have 3 million offspring. What this model does successfully is reveal how cloaking an unrealistic assumption in terms of mathematics can make said assumption very hard to detect, even by tenured economics professors at places like MIT. Math in this case is used as little more than a literary device designed to fool the feebleminded. Fortunately, someone caught the flaw, and this model didn't make the published version in the QJE. Unfortunately, the published version still included the view that population density is a reasonable proxy for income in a Malthusian economy, which of course it is not. And the insight that Malthusian forces led to high incomes in the Neo-Europes was also lost.
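To put rough numbers on the parenthetical argument, here is a minimal sketch. The population sizes and the 1% growth target are hypothetical figures picked for illustration, and p is treated loosely as a population level rather than a density, as in the argument above.

```python
# Arithmetic sketch of the level specification p(t+1) = rho*p(t) + lambda*(y - ybar).
# Population sizes and the growth target are hypothetical, for illustration only.

RHO = 0.98  # persistence parameter, "say, .98" as in the text


def next_population(p, income_term, rho=RHO):
    """One step of the level equation, ignoring the error term epsilon."""
    return rho * p + income_term


large = 100_000_000      # hypothetical large population
small = 100              # hypothetical handful of people with the same income
target_growth = 0.01     # hypothetical: the large population should grow 1% per period

# Size of lambda*(y - ybar) needed so that rho*p + term = (1 + g)*p for the large group.
income_term = (1 + target_growth - RHO) * large

large_next = next_population(large, income_term)
small_next = next_population(small, income_term)

print(f"large: {large:,} -> {large_next:,.0f}")
print(f"small: {small:,} -> {small_next:,.0f}")
```

Because the income term enters additively, it is the same absolute number of people for both groups. Any value large enough to offset the 2% contraction in the big population swamps the small one.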

Given that this paper then formed part of the basis of Acemoglu's Clark medal, I think we can safely conclude that people are very susceptible to bullshit when written in equations. More evidence will come later in the comments section, as, conditioned on getting hits, I suspect several people will be taken in by the AJR model, and will defend it vigorously. 

This episode shows some truth to Bryan Caplan's view that "The main intellectual benefit of studying economath ... is that it allows you to detect the abuse of economath."

Given the importance of signaling in all walks of life, and given the power of math, not just to illuminate and to signal, but also to trick, confuse, and bewilder, it thus makes perfect sense that roughly 99% of the core training in an economics Ph.D. is in fact in math rather than economics.

Update: Sure enough, as I predicted above, we have a defender of the AJR model in the comments. He argues the AJR model shows why math clarifies, even while his posts unwittingly convey the opposite.

Above, I took issue with the steady state relationship in the model and the fallacious assumption which yields it. The commenter points out, correctly, that outside of the steady state, the AJR model actually implies that there are two conflicting forces. But, so what? My argument was about the steady state. If one fixes the wrong assumption, steady-state income in the Malthusian model will be equal to subsistence income, and thus the main argument for correlation between population density and income outside of the steady state will also be shut down.
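The steady-state claim can be checked numerically. The sketch below pairs the population equation from the post with the production function y = A * p^(-theta) quoted in the comment thread, then iterates to the steady state for several technology levels. The parameter values are hypothetical, and the rho = 1 case stands in for "fixing" the assumption, as suggested in one of the replies below; it is one possible fix, not the only one.

```python
# Sketch: steady states of p(t+1) = rho*p(t) + lam*(y - ybar) with y = A * p**(-theta).
# Parameter values are hypothetical; the error term epsilon is set to zero.

def ajr_steady_state(A, rho, lam=1.0, ybar=1.0, theta=0.5, p0=1.0, periods=5000):
    """Iterate the two-equation system and return (density, income) at the end."""
    p = p0
    for _ in range(periods):
        y = A * p ** (-theta)           # income falls with density (diminishing returns)
        p = rho * p + lam * (y - ybar)  # level specification for population density
    return p, A * p ** (-theta)


for rho in (0.98, 1.0):
    print(f"rho = {rho}")
    for A in (1.0, 2.0, 4.0):
        p_star, y_star = ajr_steady_state(A, rho)
        print(f"  A={A:.0f}  density={p_star:.3f}  income={y_star:.3f}")
```

With rho below 1, both steady-state density and steady-state income rise with A under these assumed parameters. With rho = 1, income returns to ybar for every A and only density responds to technology.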

Second, the commenter unfairly smears Acemoglu & Co., writing that the real problem is not with their model, but that they didn't interpret their model correctly: "goes ahead in the empirical work to largely, in contrast to what their model says, take population density as a proxy for income!".

Thus I'd like to defend Acemoglu against this unfair smear. In preindustrial societies, there were vast differences in population densities between hunter-gatherer groups, and agricultural societies, even though there were not vast income differences between the two. In fact, quite surprisingly, hunter-gatherer societies often look to have been richer despite working less (read Greg Clark), and despite far more primitive technology. Thus it is quite reasonable to assume, as AJR did, that it is likely that differences in technology would swamp the differences in other population shocks (A more important than epsilon). The Black Death might have doubled or tripled incomes, but settled agrarian societies might have population densities 1000 times as large as primitive hunter-gatherer tribes. This isn't an airtight argument, but, given their model, I believe AJR's empirical extension is reasonable, particularly given that they provide a caveat. The problem is that their model is not reasonable.

The commenter goes on to argue that I've gotten AJR's conclusion backward: "You claim that the point they are making is "population density will be a decent proxy for income in a Malthusian model." The point they are making is explicitly the exact opposite: that "caution is required in interpreting population density as a proxy for income per capita." 

Huh? The first two lines of the abstract of the AJR paper read: "Among countries colonized by European powers during the past 500 years, those that were relatively rich in 1500 are now relatively poor. We document this reversal using data on urbanization patterns and population density, which, we argue, proxy for economic prosperity."

Seems clear here they are arguing for using it as a proxy.


  1. In my experience, physicists are much less fussy about their math than economists. The physicists could afford to be a bit sloppy because experiments tended to catch their errors, whereas economists have so many fudge factors at hand that they better get the models formally correct. The world isn't going to provide a pony. To be sure, my experience is out-of-date. There are far more big data economists around these days who tend to use simpler models that make sense when described in English. So maybe things are different now, though you can fetishize data sets as well as systems of differential equations.

  2. I heard Richard Freeman (Harvard economist) say that if he can’t explain something in words, he feels he hasn’t understood it. Seems a good rule. Freeman also says he loves math.

  3. Is there anything stated in math that cannot be clearly restated in English?

  4. The model in the working paper version of the paper is there for a simple reason: to consider the implications of a *simultaneous* relationship between population and income. It is *not* true in this model that "population density (which is proportional to technology) will be proportional to income!" That claim is analogous to taking the demand equation out of a supply and demand model and claiming the model implies that we should see lower quantities traded in times and places where prices are higher.

    The equation in the post above is meant to capture the Malthusian notion that higher income causes higher population growth. The other equation in the model---which is important if we want to solve the model--- captures causality running in the other direction. That equation doesn't just say "income which is proportional to technology," it says

    y(t) = A p^{-\theta},

    which, for \theta>0, captures a negative causal effect of population density on income, a mechanism operating through diminishing returns to land, in this story.

    The observed relationship between income and population density can't be read off either of these equations alone. A more stylized version of their argument might run:

    population = f ( income, shocks )

    income = g( population, technology, shocks ),

    which means in part that either technology shocks or population shocks play out over time as changes in both income and population.

    It is true that the way they expressed the first equation, their model makes absurd predictions about population growth rates. But it doesn't really matter, as they're just using the model to make the point above, not directly taking it to data, and the model could be easily rewritten in such a way as to address this problem without changing the point.

    Finally, note that the "economath" did exactly as it's supposed to do in this case: forced us to specify precisely the assumptions we need to make, and highlights where those assumptions may lead us astray. If these same arguments were expressed verbally, all of this would be much more opaque.

    I don't know, but I would guess that the model was removed from the publication version not because of this "mistake," but rather because it takes two pages of space elaborating on the idea that the relationship between population density and income is not as simple as Malthus predicted, but then goes ahead in the empirical work to largely, in contrast to what their model says, take population density as a proxy for income! Which is an actual problem with the analysis, but one dutifully highlighted in the paper.


  5. Hey Chris -- thanks for your post. I encourage you to take a second look at the model above! If you impose a steady state, population is indeed increasing in income. And that is, in fact, a (wrong!) prediction about the data. Economath has struck again!

    All the best,

    1. On twitter, Chris replies: "That's not right and not what they say. Again, the model says income and population are sim. determined, but you are taking the implications of equation (2) in isolation. Eq (1) says higher pop -> lower income. In data, sign of corr. depends on what is varying across countries."

      First, Chris is quite correct that I did not present the full model above, only the offending bit of logic. However, you can plug in for p* in the equation above using the production function, set y-bar to zero for simplicity, and then, I hope, you'd find truth to AJR's claim that "As long as ρ < 1, there will be a steady-state level of income per capita, y∗, and steady state population density, p∗, both strictly increasing in productivity A." It does seem to me that Chris is speaking away from the steady-state, in which his argument sounds quite correct. However, in the steady state, I actually buy AJR's argument.

      Let's suppose Chris is correct here that AJR are wrong, and that an editor of Econometrica and tenured professors at MIT themselves did not even understand the implications of a very simple model. Does this make the use of math look better? Or, instead, if Chris, a tenured economics professor, is wrong and AJR are right, doesn't this anecdote also provide a cautionary tale about how tricky even a simple mathematical model can be?

    2. On twitter, Chris replies again: "Yes, if only A varies, that's so. And if only epsilon varies, there will be a negative association! The point, as they say, is that there's an identification problem. You have yet again just pulled one equation out of a sim. system and drawn erroneous conclusions."

      My response: OK, but in the blog post, it's clear I'm talking about the steady state. I don't see how pointing out that, outside of the steady state, things are different refutes the steady state positive relationship between p* and y*. It's this positive SS relationship which I take issue with. You seem to be saying that this positive steady state relationship between p* and y* is valid. However, I'm saying it's counterfactual, and comes from a specific, invalid assumption in AJR.

    3. Chris replies again: "Nope, you misunderstood the key piece of logic, which is that even assuming there's a positive causal effect of income on pop. density, we could observe any association between those outcomes. It isn't those three MIT profs (and me) who are confused by this model...."

      Hate to sound like a broken record here Chris, but, once again, we're talking about the steady state. In the steady state, there is, in fact, necessarily a positive relationship between p* and y* in their model (just not in real life). Yes, I realize that out of the SS a different relationship could materialize. However, I actually would like to defend AJR here -- differences in technology were enormous between hunter-gatherer groups and agricultural groups in the Americas, so that, even away from the SS, differences in A were likely to be dominant. Thus, given their model, applying it to data, while not perfect, is not necessarily without reason. The problem is about the model, and particularly with the SS relationship.

    4. Chris replies again on twitter: "Again, Doug, you're entirely missing the point of the model, which is that there's an "identification problem," as the authors put it---they're not limiting attention to SS. Their point is precisely the opposite of the claim you appear to think they're making, see last par. p10."

      Response: Dear Lord Chris. For the last time, I'm taking issue with the positive steady state relationship between population and income, and the fallacious assumption which yields it.

      The point I think they are making with the model is that there is good reason to think that population and income will be positively correlated in a Malthusian economy. I'm saying, this comes from a bad assumption. Correct that assumption, and this positive SS relationship doesn't hold, and thus, there is no reason to think income and population will be correlated in a Malthusian world.

    5. Chris -- if you really want to show me up, the way you can do it is by showing with algebra that the positive steady state relationship between income and population doesn't actually hold in this model. This would prove I'm wrong once and for all.

    6. This comment has been removed by the author.

    7. Here's another tweet from Chris: "Again, the basic point Acemoglu et al are making, which you're misconstruing, is robust to that assumption. This is not "bullshit" math being used to "trick people," particularly the "feebleminded." These are remarkable, uncharitable, and wrong, claims. Dear Lord, indeed."

      My response: Chris -- have another look at the model. If you set rho = 1, you get back to the classic Malthusian model, where there is no correlation between income and population density.

    8. Hi Doug,

      You claim that the point they are making is "population density will be a decent proxy for income in a Malthusian model." The point they are making is explicitly the exact opposite: that "caution is required in interpreting population density as a proxy for income per capita." This, as their little, pedagogical model shows, is at least in part because of simultaneous causality between income and population. They are not simply focusing on the equilibrium steady-state relationship; you are taking the quote about variance in technology and the induced positive association between income and population density out of context.

      Further, as I keep saying, the point they're making doesn't hinge on the goofy assumption you highlight. For example, consider what happens if we rewrite equation (2) as an AR(1) process in the log rather than the level of population density. We still get a positive association between income and density across steady-states if we assume that all that's driving variation across countries is variation in technology, but this slightly different specification does not have the problem you highlight.

      The point, as I say above, is that it doesn't really matter. They're not taking this model directly to data, they're just using it to make a simple point about identification in this context, and it's not the point you claim they're making. It would also, as I noted, be a point which would be extremely difficult to make if they couldn't use mathematics to express the argument.


    9. Hey Chris,

      If their main point is that population density will not be a decent proxy for income in a Malthusian world, why do the first two lines of the abstract read "Among countries colonized by European powers during the past 500 years, those that were relatively rich in 1500 are now relatively poor. We document this reversal using data on urbanization patterns and population density, which, we argue, proxy for economic prosperity."

      Seems pretty clear.

      Sure, they acknowledge a caveat that, in their Malthusian model, there are two opposing forces. But without the unrealistic assumption highlighted in my post, there won't be two opposing forces. Population density won't proxy prosperity period. This is why hunter-gatherers are often recorded as having higher incomes than settled agrarian societies.


  6. So... economics grad students do lots of math so they can learn how to lie better?

    I thought so, and that's why I'm transferring to a useful social science... Geography.

    1. Well, if I had to assign importance, I would of course have to say that signalling is dominant, as it is in most spheres, and second acknowledge the usefulness of math, and put subterfuge third. Note that, in fact, most economists actually are pure-hearted. They are generally innocent of any knowledge that they do so much math simply to signal, and they aren't lying on purpose. What also happens is an elaborate self-deception, which takes place with purely empirical researchers as well.

  7. "Math in this case is used as little more than a literary device designed to fool the feebleminded."

    That's a pretty striking assumption of motive on the part of Acemoglu et al. What's your reason for thinking that they were out to fool people?

    1. I actually do think that the intentions of these researchers were pure. I suspect they actually do believe that this algebra allowed them to overturn a basic result of the classic Malthusian model. But, math was still a tool in this case for an elaborate self-deception.

  8. Economath could be useful, but one has to be very wary of the model limitations, which are often dismissed.

    Finally, Economath needs to be supported by Econostat. Your model needs to support the observed data and be estimated via econometric analysis. So many models, including many DSGE models, fail these tests and should only be taken as incomplete theory and an interesting exercise.

  9. Grad student in economics here.

    I have never understood why mathematics is so important in economics. If you do not take additional math classes from the math department during your undergrad years, you are screwed. You will not be able to get into a decent PhD program. Suppose that you have entered a PhD program and you are a possible candidate for becoming a brilliant social scientist, with great intuition and grasp, but are unable to demonstrate some proofs in your prelims: you will get kicked out of the program. But I think there is no penalty for any academic who is manipulating or overselling mathematical models.

    I need to quote George Box: "All models are wrong, some are useful". In contemporary economics, all models are wrong and nothing is useful.

    By the way, I think we need a blog post with your precious advice for prospective students (for PhD applicants and JMCs)