
Did Millions Die in the Great Leap Forward: A Quick Note on non-Contemporaneous Data

This is a follow-up to a previous post titled Did Millions Die in the Great Leap Forward: A Quick Note on the Underlying Statistics.  In that post, I pointed out that the only systematic data available from the time (the censuses of 1953 and 1964) could neither support nor refute the hypothesis that millions upon millions died during the Great Leap Forward.  The claim that 15 or 30 or even 45 million people died – true or false – simply is not testable against the margins of error inherent in the 1953 and 1964 census figures.

In a comment, long-time commenter jxie referenced some of the so-called “newer” research involving non-contemporaneous data that I want to quickly address in this post.  One thing I failed to address in my prior post is that since the mid-1980’s – with the release of data such as the Cancer Epidemiology Survey in 1976, the fertility survey of 1982 giving fertility rates dating back to 1940, and the re-release in 1982 of the 1953 and 1964 censuses with the population figures broken down by age and gender groups (“cohorts”) – many researchers have claimed that they are able to prove how many millions actually died during the Great Leap Forward.  Various reputable scholars 1 have estimated the death count to be anywhere from 20 million to more than 45 million.  I want to address such studies, focusing in particular on Banister’s 1987 study that jxie cited.

Banister’s 30 Million Dead Hypothesis

Judith Banister is one of the most respected and prominent demographers in the West on China.  In what has become a classic book published in 1987, Banister estimated that some 30 million died during the Great Leap Forward (p. 118, Banister).

For Banister, the first critical new piece of data is China’s release of a detailed Cancer Epidemiology Survey in 1976.  The 1976 cancer survey gave detailed data on death rates for various causes of death from 1973-75 and allowed researchers to calculate a more accurate annual death rate for China from 1973-75 (.0073).  By comparing that rate with standard official data the government published for 1973-1975, Banister concluded that the standard government data systematically under-reported deaths during all three years. For this period, Banister noted that the death rates for people of working age appeared to be relatively accurate; it is the death rates for those outside the working age that appeared to be dramatically under-reported – by 15% or more depending on the year (she thought this observation made sense, as work units would report deaths for those in the labor force but not for those outside it). Based on this, Banister hypothesized that death rates from other periods – including those one or two decades earlier, from 1958-1961, say – would be similarly under-reported and could be similarly corrected.

The second critical piece of new data came in the early 1980’s with the government’s re-release of the 1953 and 1964 censuses and the release in 1982 of a comprehensive fertility survey that traced China’s annual fertility rates from 1940 – 1981. The re-released 1953 and 1964 censuses revealed to the public for the first time how the 1953 and 1964 population counts were broken down by age and gender cohorts.  The fertility study surveyed some 1 million women sampled across China and asked them when they had children, how many, and at what age, providing an estimate of China’s annual fertility rate for women by age group from 1940 to 1981 (p. 229, Banister).

The fertility rates obtained from the fertility survey are as follows:

Total and age specific fertility rates in China from 1940-1970

From the fertility survey and the 1953 and 1964 census data, Banister then “calculated” an annual birth rate and annual death rate by gender and age cohort for China for each of the years between 1953 and 1964. The original aggregated birth rates (aggregated across cohorts for each year) published by the government and those Banister derived from the fertility rates are summarized below:

Comparison of published (official) birth rates and birth rates derived from fertility survey

I placed quotes around the word calculate above to emphasize that these figures were actually not derived (in a deterministic fashion) from the fertility and census data per se. The fertility and census data actually did not dictate any one set of birth and death rates.  Many such birth and death rates are allowed.  To point to one set to represent reality, Banister had to first presume a death rate distribution by age and gender (selectively using certain government data, tossing others), and then back-calculate a birth rate from the death rates, and do so iteratively until she found a set of  birth and death rates that in her eyes most plausibly bridged the 1953 and 1964 census.

Banister herself recognized that such a methodology was necessarily an “arbitrary estimation process” (p. 115, Banister).  However, she argues her results are grounded.  Ultimately, she argued that government data – or at least the trends revealed by such data – cannot be “far wrong.” (p.120, Banister).  A little “tweaking” – however ad hoc – was all that was needed to make the data trustworthy and pristine again.

There are three serious problems with Banister’s methodology.  First, Banister never quite justified why the systematic under-reporting of death rates during the mid 1970’s can be applied to the late 1950’s. The 1950’s (or even 1960’s for that matter) and the 1970’s are as different as night and day as far as scientific population sampling in China is concerned.

Banister herself noted in her book:

In the late 1960s and most prior years, the permanent population registration and reporting system may have been so incomplete and uneven that national or provincial statistical personnel had to estimate all or part of their totals. In particular, in the 1950s the permanent population registration and reporting system was only beginning to be set up, and at first it did not cover the entire population. All the national population totals for the 1950s except the census total, were probably based on incomplete local reports supplemented by estimates.

In my prior post, I referenced this quote from Wild Swans and Mao’s Agrarian Strategy, Australia-China Review, by Wim F. Wertheim, Emeritus Professor, the Univ. of Amsterdam, regarding the 1953 census:

Often it is argued that at the censuses of the 1960s “between 17 and 29 millions of Chinese” appeared to be missing, in comparison with the official census figures from the 1950s. But these calculations are lacking any semblance of reliability.

At my first visit to China, in August 1957, I had asked to get the opportunity to meet two outstanding Chinese social scientists: Fei Xiao-tung, the sociologist, and Chen Ta, the demographer. I could not meet either of them, because they were both seriously criticized at that time as rightists’; but I was allowed a visit by Pang Zenian, a Marxist philosopher who knew about the problems of both scholars. Chen Ta was criticised because he had attacked the pretended 1953 census. In the past he had organised censuses, and he could not believe that suddenly, within a rather short period, the total population of China had risen from 450 to 600 million (by the way: with inclusion of 17 million from Taiwan), as had been officially claimed by the Chinese authorities after the 1953 ‘census’. He would have like to organise a scientifically well-founded census himself, instead of an assessment largely based on regional random samples as had happened in 1953. According to him, the method followed in that year was unscientific. For that matter, a Chinese expert of demography, Dr. Ping-ti Ho,  Professor of History at the University of Chicago, in a book titled Studies on the Population of China, 1368-1953, Harvard East Asian Studies No.4, 1959, also mentioned numerous ‘flaws’ in the 1953 census: “All in all, therefore, the nationwide enumeration of 1953 was not a census in the technical definition of the term”; the separate provincial figures show indeed an unbelievable increase of some 30% in the period 1947-1953, a period of heavy revolutionary struggle (PP.93/94)!

My conclusion is that the claim that in the 1960s a number between 17 and 29 million people was ‘missing’ is worthless if there was never any certainty about the 600 millions of Chinese. Most probably these ‘missing people’ did not starve in the calamity years 1960-61, but in fact have never existed.

In the 1950’s, China was still in revolutionary and nation-building mode.  By the 1970’s, a nation state with all the infrastructure of a modern state had been established.  In the 1970’s, the methodology the Chinese government employed was much better and more complete than that of two decades before, so I am not sure how one can extrapolate from one period to the other.

A second problem is Banister’s reliance on the fertility survey as hard, independent data.  One problem with the survey is that as one goes further back in time, the margin of error necessarily increases. The farther back in time one goes, the older the respondents have become by the time of the survey, the more under-sampled they are in the survey, and the more error there is in the fertility rates for those earlier years.

This becomes a real problem for the Great Leap Forward (or earlier) years.  According to the fertility survey, for example, the average age of a person having a child in 1960 (right in the middle of the Great Leap Forward) was 29.9 (p. 230, Banister).  Based on Banister’s own estimate (derived from the 1973-75 cancer study mentioned earlier), such a person would have a remaining life expectancy of a mere 31.8 years (p. 91, Banister).  In 1981, the year the survey was conducted, such a person would be 51 years of age, 20 years above their life expectancy.

The fertility data thus becomes inherently under-sampled as one goes back in time.  For the GLF years, the survey involves sampling a group of people who are on average 20 years above their life expectancy, resulting in large, unspecified margins of error!

To be fair, Banister did note of the fertility study: 2

For recent years, the age-specific fertility rates are based on the responses of large numbers of women covering all the age groups.  For earlier years, fertility for some age groups had to be estimated or extrapolated from other data. [In fact, even though the survey is purported to go back to 1940, t]he fertility of … those … intervening years is estimated rather than reported.

However, she did not carry to conclusion what this observation means for her analysis.

The government data itself does not specify the margin of error in the data – which should have given a huge warning light.  Yet Banister would use these estimates as hard data upon which to extract statistical conclusions!

A third problem is the selective use, tossing out, and adjustment of government data.  Throughout the study, Banister would rely on any of several fragmentary data sources to argue how much the government data needs to be adjusted this way or that way for each of the intercensal years.  Banister would then conclude that her methodology must be correct on the ground that she found a “plausible mortality schedule” or “plausible” “survival ratios by cohort” for each of the intervening years between the relevant year and 1953 (p.114, p. 231, Banister) that plausibly bridged the 1953 and 1964 census figures.

In any case, Banister’s death rates – aggregated across all cohorts for each year – are summarized below:

Reconstructed death rates by Banister

For reference, the government’s originally published birth and death rates are provided in Table 1 below. 3

year   population (millions)   birth rate   death rate   growth rate*
1954   602.664                 0.03797      0.01318       0.02479
1955   614.650                 0.03260      0.01228       0.02032
1956   628.283                 0.03190      0.01140       0.02050
1957   646.533                 0.03403      0.01080       0.02323
1958   659.943                 0.02922      0.01198       0.01724
1959   672.069                 0.02478      0.01459       0.01019
1960   662.070                 0.02086      0.02543      -0.00457
1961   658.591                 0.01813      0.01433       0.00380
1962   672.955                 0.03722      0.01008       0.02714
1963   691.720                 0.04360      0.01010       0.03350
1964   704.991                 0.03934      0.01156       0.02778

*growth rate is calculated via birth rate – death rate.

Table 1: Government population statistics of China, 1953 – 1964

A chart of the originally published birth and death rates from the government and the corresponding adjusted rates provided by Banister is shown below, with the solid lines representing the originally published data and the dashed lines Banister’s adjusted values.

Birth and death rates during the intercensal years between 1953 and 1964

Graph 1: a plot of the published birth and published death rates and corresponding adjusted birth rates and adjusted death rates from 1953-1964

A plot of the published growth rates (published birth rate – published death rate) and adjusted growth rates (adjusted birth rates – adjusted death rates) from 1953-1964 is shown below:

growth rate comparison

Graph 2: a plot of the published growth rates and corresponding adjusted growth rates from 1953-1964

From the adjusted death rates, Banister calculates the number of dead attributed to the Great Leap Forward to be 30 million (p. 118, Banister). This calculation is summarized below: 4

Table 2: My reconstruction of Banister’s calculations estimating that 30 million died as a result of the Great Leap Forward

The 1964 population estimate in Table 2 is formally calculated from the 1953 census by the following:

estimated_population[1964] = census_population[1953]*(1+b[1954]-d[1954])*(1+b[1955]-d[1955])*…*(1+b[1963]-d[1963])       equation 1

Thus the 1954 figure of 613.25 is calculated as 601.938*(1+.042-.0258).  The 1955 figure of 625.95 is calculated as 613.25*(1+.043-.0242). The process is carried out via equation 1 in an iterative fashion for each year until the estimate for the 1964 population is obtained.

Since I do not have Banister’s figures (those figures would have been broken down by age groups for each year between 1953 and 1964 given the nature of the data from the 1982 fertility survey), I obtained an aggregate figure for each year here by simply multiplying Banister’s adjusted annual growth rate (adjusted birth rate – adjusted death rate) by the population iteratively from the year before, starting with the 1953 census population.
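The iteration just described can be sketched in a few lines of Python. This is only an illustration of the mechanics, under stated assumptions: the rates fed in are the officially published 1954-1963 rates from Table 1 (Banister's adjusted rates are not listed in this post), and the names `project` and `OFFICIAL_RATES` are made up for the example.

```python
from math import prod

CENSUS_1953 = 601.938  # millions; the 1953 census figure used in the text

# (birth rate, death rate) pairs as officially published in Table 1, 1954-1963.
# Assumption: these stand in for Banister's adjusted rates, which are not
# reproduced in this post.
OFFICIAL_RATES = [
    (0.03797, 0.01318), (0.03260, 0.01228), (0.03190, 0.01140),
    (0.03403, 0.01080), (0.02922, 0.01198), (0.02478, 0.01459),
    (0.02086, 0.02543), (0.01813, 0.01433), (0.03722, 0.01008),
    (0.04360, 0.01010),
]


def project(base_population, rates):
    """Return the population after each successive year of (birth, death) rates."""
    path = []
    population = base_population
    for birth, death in rates:
        population *= 1 + birth - death  # one factor of equation 1
        path.append(population)
    return path


path = project(CENSUS_1953, OFFICIAL_RATES)

# Equation 1 written as a single product gives the same final figure.
closed_form = CENSUS_1953 * prod(1 + b - d for b, d in OFFICIAL_RATES)

print(f"population after the last year of rates: {path[-1]:.2f} million")
```

Swapping in a different list of rate pairs is all it takes to produce a different bridge between the two censuses, which is why the choice of rates matters so much in what follows.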

From an aggregate level, it appears that Banister did not use the 1957 figure as the baseline rate (as the government did with its estimate), but used an average of the rate for the years immediately before and after the Great Leap Forward instead.  To get a GLF death of 30 million, I have to use a baseline rate around .0165, average of the adjusted death rates for 1957, 1958, 1962, and 1963.  For a baseline rate of .0181 (1957 figure), I get a death figure of 25 million.  It’s possible that Banister did use the 1957 figure but used slightly different population estimates to get the 30 million.  Whatever the case, an important thing I noted is just how sensitive a 30 million figure is to slight variations in the various estimated parameters.
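To make the sensitivity to the baseline concrete, here is a rough sketch of the excess-death arithmetic. It is not Banister's calculation. It assumes the officially published populations and death rates from Table 1 rather than her adjusted figures, treats 1958-1961 as the Great Leap Forward years, and uses the made-up helper name `excess_deaths`.

```python
# Population (millions) and death rate for each year, taken from Table 1.
# Assumptions: 1958-1961 are treated as the Great Leap Forward years, and the
# Table 1 population is used directly as the population exposed to each rate.
GLF_YEARS = {
    1958: (659.943, 0.01198),
    1959: (672.069, 0.01459),
    1960: (662.070, 0.02543),
    1961: (658.591, 0.01433),
}


def excess_deaths(years, baseline_rate):
    """Sum of population * (death rate - baseline rate), in millions."""
    return sum(pop * (rate - baseline_rate) for pop, rate in years.values())


baseline_1957 = 0.0108  # official 1957 death rate from Table 1
shift = 0.001           # a small change in the assumed baseline

delta = (excess_deaths(GLF_YEARS, baseline_1957)
         - excess_deaths(GLF_YEARS, baseline_1957 + shift))
print(f"a {shift} change in the baseline moves the total by {delta:.2f} million")
```

With these inputs, moving the baseline by just .001 moves the four-year total by about 2.65 million, since the shift is multiplied by the whole population in every year counted.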

Refuting the Null Hypothesis

My core problem with Banister’s approach is the problem I cited in my prior post.  As I observed there, whenever one deals with complex data and claims a hypothesis to be supported by the data, a necessary first step is to ensure that the hypothesis is actually testable against the data.  That means showing that the underlying data can actually refute the null hypothesis.  Otherwise, the (non-null, alternative) hypothesis cannot be pronounced with any degree of statistical confidence.  For the purpose here, the hypothesis I am interested in is Banister’s claim that 30 million died during the Great Leap Forward.  The null hypothesis is that no abnormal number of people died during the Great Leap Forward.

One way to define the null hypothesis is to simply modify the death rates Banister produced for China between 1953 and 1964, replacing the death rates she came up with for the Great Leap Forward with a baseline rate.  If such a rate can still bridge the 1953 and 1964 census data, then the underlying data is not capable of refuting the null hypothesis, and Banister’s claim of 30 million deaths is not testable against such data.  The implication of the null hypothesis is shown in Table 3.

Table 3: Calculations Used to Test Banister’s Null Hypothesis

As can be seen, the 1964 (estimated) population thus calculated goes from 719.88 million (-.44% compared to the 1964 census figure) for Banister’s hypothesis of 30 million dead to 752.53 million (+4.07%) for my null hypothesis.  Being .44% off is no better than being 4.07% off if both fall within the inherent margin of error. Both would be considered valid as far as the underlying statistics are concerned.

The critical question to ask is: what’s the inherent margin of error?

Since we are comparing the 1964 census figure to an estimated 1964 population figure, the margin of error is a combination of the margin of error of the 1964 census itself and the inherent margin of error in estimating the 1964 population. And as discussed in my prior post, based on equation 1, the dominant source of error in the 1964 estimate comes from the errors in the 1953 census.  That result comes from conducting an error propagation of the main measurements in the data, taking derivatives against all these measurements, and expanding the first order terms to get a first order approximation.

One way to see intuitively that the main error in estimating the 1964 population via equation 1 arises from the 1953 population figure rather than from the intercensal birth and death rates is by noting that a 1% change in the 1953 population would result in a proportional 1% change in the 1964 estimated population. However, a 1% change in any one of the annual birth and death rates here (in the range of .01 – .05; see Graph 1) would result in only a .01% – .05% change in the 1964 estimated population ((1 + .01*1.01) vs. (1 + .01) to (1 + .05*1.01) vs. (1 + .05) ).
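The arithmetic in parentheses can be checked directly. The sketch below is only a restatement of that comparison; the function name `relative_change` is made up for the example.

```python
def relative_change(rate, perturbation=0.01):
    """Relative change in one factor (1 + rate) when the rate is scaled by (1 + perturbation)."""
    return (1 + rate * (1 + perturbation)) / (1 + rate) - 1


for rate in (0.01, 0.05):
    print(f"rate {rate}: a 1% change in the rate changes the estimate by {relative_change(rate):.4%}")

# By contrast, the 1953 population enters equation 1 as a plain multiplier,
# so a 1% change in it is a full 1% change in the 1964 estimate.
```

The two printed values bracket the .01% – .05% range quoted above, roughly two orders of magnitude below the effect of the same 1% change in the base population.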

So the inherent margin of error here ultimately boils down to the margin of error for the 1953 and 1964 census figures.  Unfortunately, the government never released any figures on the margin of error.  The problem is that the methodology of the time was such that no robust measures could be given (either contemporaneously then or retroactively today).

Given fundamental questions surrounding the reliability of the 1953 and 1964 censuses, assigning say a 5% error to the 1953 census and a 2.5% error to the 1964 census would actually be quite generous.  Yet even with such modest figures, the population estimates calculated under the null hypothesis would already come clearly within the inherent margin of error.  From Table 3, given a 1953 census figure of 601.94 million, the null hypothesis predicts a 1964 population of 752.53 million.  A 5% margin of error in the 1953 census figure would result in the estimated 1964 population ranging between 714.9 and 790.16 million.  A 2.5% margin of error in the 1964 census would mean that the 1964 census is a figure between 704.99 and 741.15 million.  Since the two ranges overlap, the null hypothesis is not refutable, and Banister’s thesis of 30 million dead is not testable – not supportable – by the underlying data.
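The overlap test above is simple enough to spell out. In the sketch below, the two 1964 estimates come from the text, while the 1964 census figure of 723.07 million is an assumption on my part: it is the midpoint of the 704.99 – 741.15 range quoted above. The helper names `interval` and `overlaps` are made up for the example.

```python
def interval(value, margin):
    """Range implied by a symmetric relative margin of error."""
    return (value * (1 - margin), value * (1 + margin))


def overlaps(a, b):
    """True if two (low, high) ranges share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


NULL_ESTIMATE_1964 = 752.53      # millions, null hypothesis (Table 3)
BANISTER_ESTIMATE_1964 = 719.88  # millions, Banister's hypothesis
CENSUS_1964 = 723.07             # assumption: midpoint of the quoted census range

census_range = interval(CENSUS_1964, 0.025)  # 2.5% error on the 1964 census
null_range = interval(NULL_ESTIMATE_1964, 0.05)          # 5% error carried from 1953
banister_range = interval(BANISTER_ESTIMATE_1964, 0.05)  # same 5% error

print("null hypothesis overlaps census:", overlaps(null_range, census_range))
print("Banister estimate overlaps census:", overlaps(banister_range, census_range))
```

Both checks come out true under these margins, which is the whole point: the data cannot tell the two hypotheses apart.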

Null Hypothesis – Redux

An astute reader might counter that I have not used all the data available to refute the null hypothesis.  The available data is not just the aggregate 1953 and 1964 census population, but the distribution of population by cohort for the 1953 as well as the 1964 censuses, as well as the cohort fertility rates obtained from the fertility survey for each year during the intercensal years between 1953 and 1964.

This is a good observation.  The null hypothesis I describe above is for data aggregated across cohorts.  As such, it is a simplified version.  Still, it is sobering to see that if the null hypothesis is to be refuted, it cannot be done on the aggregate annual data. The full-blown version would involve characterizing not just the aggregate death rates for each of the years between 1953 and 1964, but the death rates of every cohort for each of the intercensal years.  The bridging of the 1953 and 1964 censuses would involve bridging not just two numbers from 1953 and 1964 with a series of birth and death rates for each of the intercensal years, but two numbers from 1953 and 1964 for each population cohort with a series of birth and death rate distributions (by cohort) for each of the intercensal years.

On a theoretical level, relying on such cohort data instead of aggregate data is problematic to the extent that the margin of error on the cohort level is usually even worse than that across the aggregate.  The methodology has to be much more sophisticated and refined to get data at such granular levels right. If the null hypothesis cannot be refuted on the aggregate level, I doubt it can be refuted on the cohort level. Any extra information contained in such data would probably be swamped by the extra margins of error encountered in extracting data with methodologies that failed to properly obtain information at the aggregate level.

On a practical level, to fully test such a hypothesis, I would need not only to build a computer model, but also to make a large number of assumptions (as Banister did also) to populate the model with actual working numbers. Such a calculation would obviously be more complicated.  And even if I did undertake such a task, whatever result I conclude may not be of much worth to counter Banister’s conclusion since Banister did not share most of her numbers and I would also need to make many assumptions to come up with hard numbers that a computer can crunch.

For both these reasons, I do not think it productive here to try to show in a quantitative fashion how the null hypothesis is not refutable. I will do so qualitatively instead. Below my goal is to show how many unproven assumptions underlie Banister’s thesis, how much freedom Banister had in making assumptions and making up numbers, and ultimately how much of the “consistency” and “plausibility” Banister rants about can be easily manufactured.

Garbage In Garbage Out

Throughout the book, Banister seemed genuinely impressed by how well the government’s newly released data, the government’s originally published data and her hypothesis appeared to complement each other. Banister never attempted to analyze how testable her hypothesis is against the underlying data.  Instead, her confidence in her theory all seemed to arise from how “plausible” and “believable,” “complete,” “apparently complete” and “consistent” the data became when she applied her interpretation of government data, her many assumptions, and her selection of what fragmentary data to incorporate and what to reject.

Unfortunately, while consistency of data may sometimes be used to affirm – i.e. to confirm, to double check – the quality of an otherwise well-established data set, consistency per se can never be used to establish the quality of an otherwise questionable data set.  If one depends only on that, one will be judging theories based on what appears most “plausible” and “believable” by one’s own “worldview,” not by what is actually statistically supported. Such would not be empirically driven research.  Such would not be science.

Consider Banister’s approach of using as the basis of her study the government’s birth and death rates, measurements that all reputable demographers – including herself – had previously considered unreliable.  This is critical.  Without good hard statistical foundations, Banister’s entire study is a house of cards.

The problem with the government’s original data for the 1950’s and 1960’s was not just that there was under-reporting of births and deaths (that might be corrected for systematically), but that the methodology was such that the data produced was deemed not reliable, not reproducible.  The data was so imprecise that any conclusion one could glean from it (with fancy methodology, etc.) would be drowned out by the margins of error inherent in any calculation.

John Aird – a fellow well-regarded demographer – had concluded that “the official vital rates [birth and death rates] of the crisis years [of the Great Leap Forward] must be estimates, but their basis is not known.” 5 Banister herself had noted that while China did try to start vital registration in 1954, its implementation was very spotty and uneven throughout the country. Thus she had argued, “If the system of death registration was used as a basis for any of the estimated death rates for 1955 through 1957, the rates were derived from only those localities that had set up the system, which would tend to be more advanced or more urbanized locations.” “In all years prior to 1973-75 the PRC’s data on crude death rates, infant mortality rates, expectation of life at birth, and causes of death were nonexistent, useless….” Banister concluded.

The notion that a set of data can be fixed by an “adjustment factor” is thus a hypothesis in and of itself. Yet Banister used it without question.  And the problem runs deeper.  The adjustment Banister ultimately published for the intercensal period put her values anywhere from 37% to 84% above the government’s published values.  That’s a huge range! The systemic error Banister hypothesized was actually not so systemic after all … yet Banister never batted an eye. If I am able to pick and choose adjustment factors at will, I too can manufacture adjustment factors to fit whatever conclusions best fit my worldview.

It gets worse still. Even after allowing for the notion that the data is adjustable (even if not systemically), there appears to be no objective way to set the adjustment factor. In deriving each factor, Banister had to select, reject, and weigh various types of conflicting fragmentary data based on various ad hoc, subjective criteria. The process by which Banister obtained each “adjustment factor” is more art than science.

For at least these reasons, many of the conclusions Banister derived are at best speculative estimates, at worst garbage in, garbage out.

The Mirage of Consistency and Plausibility

Independent vs. Extrapolated Data

Banister’s much-bandied consistency and plausibility may also just be a mirage.  Much of the strength of Banister’s conclusion is based on the presumption that the fertility survey provided a good, reliable independent source of data that formed an indispensable tool in triangulating the truth.  When combined with her presumed mortality rates, the fertility rates bridged the 1953 and 1964 census figures on a cohort by cohort basis.

But as noted above, the fertility data was provided without any margins of error. Since a large part of the government’s supposedly independent fertility data was necessarily derived and extrapolated, rather than independently measured, such data is a far cry from providing a useful third eye in triangulating the truth.

If the women who bore children during the Great Leap Forward are severely under-sampled in the survey, how good is the data?  How much margin of error exists in the data?  Given that the government does not provide any indication of the margin of error, how much of the data is “estimated rather than reported” – i.e. derived and extrapolated from data that is already known rather than constituting new, independently measured data?  To the extent the data is “estimated rather than reported,” much of the consistency and plausibility Banister observes in her numbers becomes but a mirage – more an artifact of Banister’s assumptions than an independent verification of truth.

A Greatly Under-Constrained Problem

The biggest problem with Banister’s claim about “consistency” and “plausibility” of results may be my contention that they are so easily manufactured.  To see this, note that instead of the aggregate equation 1 for bridging the census of 1953 and 1964, we now have the cohort relations of equations 2A and 2B below.

estimated_population[n][i] = census_population[1953][i]*(1-d[1954][i])*(1-d[1955][i])*…*(1-d[n-1][i])       equation 2A

where n is the year of the estimated population, i is an index for the cohort (here a cohort already alive in 1953, i.e. not born between 1953 and n), d is the annual cohort death rate for cohort i, and the number in the first bracket (for all variables) indicates the year of that variable (e.g., estimated population in year n for cohort i, 1954 death rate for cohort i).

For (younger) cohorts born after 1953, the fertility rates come into play as follows:

estimated_population[M][i][N] = sum(k){f[N][k]*estimated_population[N][k]}*(1-d[N][i])*(1-d[N+1][i])*…*(1-d[M-2][i])*(1-d[M-1][i])       equation 2B

where M is the year of the estimated population, i is an index of the cohort, N indicates the year in which the cohort is born (with N > 1953, < 1964), and sum(k){} means to sum all the terms within {} over index k (with f[N][k] being the fertility rate for year N and cohort k, and estimated_population[N][k] being the estimated population of year N for cohort k, defined iteratively via equations 2A and 2B).

While the equations are much more complicated, note the general structure is the same as equation 1.  Whereas equation 1 bridges the population of 1953 to that of 1964, equation 2A does the same for cohorts that exist in both 1953 and 1964 (for this reason, equation 2A when n=1964 looks almost identical to equation 1, except for the addition of index i).  Each cohort in 1964 that exists in 1953 is obtained by multiplying the 1953 cohort size by a cohort survival rate for each of the intercensal years.  For cohorts born during the intercensal years, one requires 2B.  The fertility rate enters the calculation via equation 2B.  The size of the cohort born in each intercensal year is calculated by multiplying the fertility rate of each parent cohort for that year by the population size of that cohort and summing the results.  Each cohort in 1964 born during the intercensal years is calculated as the size of the cohort at birth, multiplied by a cohort survival rate for each of the intercensal years after the birth of the cohort.
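The structure of equations 2A and 2B can be sketched as two small functions. All of the inputs below are hypothetical toy numbers chosen only to exercise the code; none of them are Banister's or the government's figures, and the names `survive` and `births` are made up for the example.

```python
def survive(cohort_size, death_rates):
    """Equation 2A: multiply a cohort by (1 - d) for each yearly death rate d."""
    for d in death_rates:
        cohort_size *= 1 - d
    return cohort_size


def births(fertility, population):
    """Leading sum in equation 2B: sum over parent cohorts k of f[k] * population[k]."""
    return sum(fertility[k] * population[k] for k in fertility)


# Hypothetical toy inputs (not real data).
census_1953 = {"cohort A": 50.0, "cohort B": 40.0}      # millions, by cohort
population_1954 = {"cohort A": 49.0, "cohort B": 39.0}  # millions, year of birth N = 1954
fertility_1954 = {"cohort A": 0.10, "cohort B": 0.05}   # births per member per year
death_rates = [0.02] * 10                               # one rate per year, 1954-1963

# Equation 2A: a cohort already alive in 1953, carried forward to 1964.
cohort_a_1964 = survive(census_1953["cohort A"], death_rates)

# Equation 2B: the cohort born in 1954, then carried forward to 1964.
newborns_1954 = births(fertility_1954, population_1954)
born_1954_in_1964 = survive(newborns_1954, death_rates)

print(f"cohort A in 1964: {cohort_a_1964:.2f} million")
print(f"cohort born in 1954, as of 1964: {born_1954_in_1964:.2f} million")
```

Every entry of `death_rates` and `fertility_1954` is a separate dial, and a real calculation would have one such list per cohort.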

For equations 2A and 2B, note the amount of freedom Banister has in setting the death rates.  She has some 11 x N parameters to set (11 intercensal years, N cohorts) but only needs to produce N consistencies (to bridge each cohort’s population between 1953 and 1964) to pronounce her results “plausible,” “believable,” and “consistent.”  That leaves her 10 x N degrees of freedom to make it work.  The problem is under-constrained by a large degree. With so many variables to tweak, consistency is all but guaranteed.
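A toy example shows how easily very different death-rate schedules produce the same bridge for a single cohort. The rates below are hypothetical, picked only for illustration, and the function name `survival` is made up for the example.

```python
from math import prod


def survival(death_rates):
    """Fraction of a cohort left after a sequence of yearly death rates."""
    return prod(1 - d for d in death_rates)


YEARS = 11  # intercensal years

# Hypothetical schedule 1: a flat death rate, no crisis at all.
flat = [0.015] * YEARS
target = survival(flat)

# Hypothetical schedule 2: one crisis year at 0.045, with the other ten years
# lowered just enough that the overall survival fraction is unchanged.
spike = 0.045
other = 1 - (target / (1 - spike)) ** (1 / (YEARS - 1))
crisis = [other] * 5 + [spike] + [other] * 5

print(f"flat schedule survival:   {survival(flat):.6f}")
print(f"crisis schedule survival: {survival(crisis):.6f}")
print(f"non-crisis rate needed in schedule 2: {other:.4f}")
```

Both schedules carry the 1953 cohort to exactly the same 1964 size, so the cohort bridge alone cannot say which one happened.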

One might argue that Banister actually has less freedom than that.  Since Banister chose to work with government death rates, she is constrained by the government’s annual aggregate death rates for each intercensal year (Table 8.3).  The assertion can be countered by noting that the government data does not act like a real constraint.  On an annual aggregate basis, Banister’s adjusted death rates are anywhere from 37% to 84% higher than the government rates.  That’s a huge range. If there is a constraint, it is a very soft, informal constraint. Even if these are real constraints, that would add at most 10 constraints, leaving the problem under-constrained by at least 10 x (N – 1) degrees of freedom.

One might argue that Banister’s criterion of a “plausible mortality schedule” or “plausible” “survival ratios by cohort” also provides constraints of sorts. Such rates could be determined in part by the shapes of the cohort distributions of 1953 and 1964. Unfortunately, it is difficult to see what sort of constraint this really amounts to, as Banister does not define her “plausibility” criterion in any rigorous sense. The distribution of population cohorts (e.g. the shape of “Figure 6” below) may dictate a general proportion of death rates among cohorts. Even assuming these change every year of the intercensal period, that means 11 more constraints (for a total of at most 21 + N) vs. 11 x N parameters to tweak. The problem is still under-constrained by at least 9 x (N-1) – 1.
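The bookkeeping in the last three paragraphs can be tallied as follows. This is a sketch of the counting argument only; the function name is made up, and the value N = 17 is a hypothetical cohort count picked for illustration, not a figure taken from Banister.

```python
def surplus_degrees_of_freedom(n_cohorts, years=11,
                               aggregate_constraints=10,
                               shape_constraints=11):
    """Free death-rate parameters left over after each set of constraints."""
    parameters = years * n_cohorts                  # 11 x N death rates
    after_bridge = parameters - n_cohorts           # N cohort bridges
    after_aggregate = after_bridge - aggregate_constraints   # government annual rates
    after_shape = after_aggregate - shape_constraints        # yearly "plausible" shape
    return after_bridge, after_aggregate, after_shape


n = 17  # hypothetical number of cohorts
print(surplus_degrees_of_freedom(n))
```

For N = 17 this leaves 170, then 160, then 149 free parameters. The last count equals 10 x N - 21, which is no smaller than the 9 x (N-1) - 1 quoted above whenever N is 11 or more.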

However unconstrained the problem, two additional observations serve to ensure that a plausible solution can always be found.

First, note how the estimated population calculated for 1964 via equations 2A and 2B is not very sensitive to changes in the intercensal birth and death rates. As noted earlier in regard to equation 1, a 1% change in the death and birth rate shows up as a .01% change in the final population estimate for 1964. Something similar also holds for equations 2A and 2B. The ratio is not always 1/100. But with the typical parameters shown in Tables 8.2 and 8.3, the 1964 estimate is some two orders of magnitude more sensitive to the 1953 population than to any one fertility or death rate. This insight has one major consequence. Once one has found a solution that bridges the censuses by cohort, one has tremendous freedom in “tweaking” the death rates to be whatever one wants – for “plausibility” or other reasons – without breaking the bridge. The N constraints may be just soft constraints – or not much of a constraint at all, depending on the actual death rates used.
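A quick numerical check of this sensitivity claim, under stated assumptions: the sketch takes the aggregate bridge to have the form of a starting population multiplied by (1 + birth rate - death rate) for each year, which is an assumed reading of equation 1, and it uses round, hypothetical rates (3.5% births, 1.2% deaths) and a hypothetical starting population rather than the values in Tables 8.2 and 8.3.

```python
def bridge(p_start, birth_rates, death_rates):
    """Compound a starting population through yearly net growth factors."""
    p = p_start
    for b, d in zip(birth_rates, death_rates):
        p *= 1 + b - d
    return p


def relative_change(new, old):
    return (new - old) / old


years = 11
births = [0.035] * years        # hypothetical annual birth rate
deaths = [0.012] * years        # hypothetical annual death rate
p_1953 = 580e6                  # hypothetical round starting population

base = bridge(p_1953, births, deaths)

# Raise a single year's death rate by 1% of its value (0.012 -> 0.01212).
bumped_deaths = list(deaths)
bumped_deaths[5] *= 1.01
rate_effect = relative_change(bridge(p_1953, births, bumped_deaths), base)

# Raise the starting population by 1%.
pop_effect = relative_change(bridge(p_1953 * 1.01, births, deaths), base)

print(f"one death rate +1%:   {rate_effect:.6%}")
print(f"start population +1%: {pop_effect:.6%}")
```

With these inputs the single-rate change moves the final estimate by roughly one hundredth of a percent, while the same relative change in the starting population moves it by a full percent.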

Second, note how the degrees of freedom can be more than doubled if the fertility rates turn out to be but rough estimates (as discussed above). In such a case, each of the 11 x N fertility rates – being rough estimates with indefinite margins of error – can be adjusted whichever way makes the data even more “plausible” and “consistent.” In addition, given that Banister adjusts the government-published birth rates by anywhere between 1% and 35% from year to year between 1953 and 1964, it would appear that the so-called birth rates – or the related fertility rates – really do not offer much constraint at all.

In conclusion, the bridging of the 1953 and 1964 censuses represents a greatly under-constrained problem. There are many ways to bridge the census populations of 1953 and 1964. Which paths are plausible depends on how one cherry-picks facts – which depends on one’s worldview (one’s unproven set of assumptions and biases). Further, once the census populations are bridged, there also appear to be many tweaks one can make so that whatever path is picked appears as “plausible” as one would like. If one is concerned only with bridging the censuses in a plausible manner, that can always be manufactured.

Triangulating Truth – the Demographic Consequence of the Great Leap Forward

As can be seen, I do not find studies like Banister’s convincing. Some of her techniques, methodologies, and even assumptions and presumptions may indeed be interesting from an academic perspective (research on methodologies, etc.). However, without addressing the margin of error issues squarely, and with so much reliance on “plausibility” and “believability” as a metric, Banister’s conclusions are at best an exercise in self-fulfilling hypothesizing. It is truly unfortunate that what are speculations at the mathematical and statistical level have now come to be widely presumed to be facts by a wide swath of people discussing the Great Leap Forward.

Beyond Banister-type studies, one type of non-contemporaneous data I have indeed found to be of interest is data relating to the demographic consequences of the Great Leap Forward. Here is a graph of China’s census cohorts in 1964, 1982, and 2000.

China's demographics from census of 1964, 82, and 90

Graph copied from Population Reference Bureau publication.

The 1964 census shows a so-called “bite” being taken out of the demographics in the 15-19 and 20-24 age groups, but nothing in the earlier group. This would suggest that children between 10-20 were disproportionately impacted – that a large group of children did die. The fact that there is a “bite” – but not a reverse pyramid with a point near the bottom – would also suggest that there was no rapid drop in birth rate, as the government published.

But the 1964 data is not considered very good. The 1982 census, however, did not show as pronounced a “bite” for the originally “bitten” groups, with only a slight “bite” in the 35-39 group (which would include half of each of the original 15-19 and 20-24 groups; perhaps averaging those groups mellowed things out?). It did, however, show a big “bite” for the 20-24 age group – which would be consistent with the more likely event of a great dip in birth rate during the Great Leap Forward but no mass starvation and death. The 2000 census bears the 1982 data out: it does not show any persisting effects of the “bite” originally seen in the 1964 census but shows the continued persistence of the “bite” corresponding to a suppression of births during the Great Leap Forward, as first seen in the 1982 census.
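The age-group arithmetic behind this comparison can be checked with a short sketch. The helper names are made up for illustration, and the calculation ignores the month in which each census was taken, so the birth years it gives are only approximate to within a year.

```python
def shift_age_group(lo, hi, from_year, to_year):
    """Ages that a group counted in from_year has reached by to_year."""
    delta = to_year - from_year
    return lo + delta, hi + delta


def overlap_years(group_a, group_b):
    """Number of single-year ages shared by two inclusive age ranges."""
    lo = max(group_a[0], group_b[0])
    hi = min(group_a[1], group_b[1])
    return max(0, hi - lo + 1)


def birth_years(lo, hi, census_year):
    """Approximate birth years of people aged lo..hi at a census."""
    return census_year - hi, census_year - lo


teens_1964 = shift_age_group(15, 19, 1964, 1982)
twenties_1964 = shift_age_group(20, 24, 1964, 1982)
group_35_39 = (35, 39)

print(teens_1964, overlap_years(teens_1964, group_35_39))
print(twenties_1964, overlap_years(twenties_1964, group_35_39))
print(birth_years(20, 24, 1982))
```

On this arithmetic the 1982 group aged 35-39 draws three single-year ages from the 1964 group aged 15-19 and two from the 1964 group aged 20-24, and the 1982 group aged 20-24 was born around 1958 to 1962.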

Obviously many theories can be drawn from this set of inconsistent data – depending on which part one focuses on. If one insists on the 1964 data as most accurate, one would have to conclude that children between 10-20 suffered (died) on an unprecedented scale during the Great Leap Forward, but that the GLF had relatively little impact on children younger than 10 and adults older than 20. Or one might focus on the 1982 and 2000 data and conclude that perhaps the GLF greatly distressed the populace but did not cause mass starvation and death, only a huge drop in birth rates – as might be expected of a population in distress.

Population data similar to that shown in “Figure 6” could also be derived from studies such as the Cancer Epidemiology Survey of 1976 – where “bites” in death counts from diseases at particular age groups can be attributed to “bites” in the demographic distribution for that age group. All such data, however, carry inherent margins of error that are left unquantified, as well as inconsistencies from year to year. In general, this research raises more questions than it settles.

Questionable 1964 Census Figures and Concluding Remarks

In conclusion, I would like to emphasize again the inconsistency between the 1964 vs. 1982 and 2000 histograms discussed above. According to the former (1964 figure), there is no noticeable decrease in the size of the infant cohort right after the Great Leap Forward (thus realistically no noticeable decrease in birth rates during the Great Leap Forward); according to the latter (1982 and 2000 figures), there is a large decrease in the infant cohort right after the Great Leap Forward (there must have been a noticeable decrease in birth rates or a noticeable rise in infant mortality rates that was not reflected in the 1964 figures). All of the so-called statistical studies on the deaths of the Great Leap Forward, including the analysis of Banister, necessarily include the 1964 census as a key (indispensable) data set. However, if the 1964 data is inherently unreliable (experts already agree that, methodologically, it is unreliable; and the inconsistencies suggest that the 1964 figure is truly unreliable), then regardless of how many “plausible” assumptions are made, results from such studies are really for naught.

Now I hope one sees how tentative and speculative Banister’s and others’ numbers are. It’s not just the assumptions made. It’s not just the unspecified margins of error that threaten to take away any meaning from any hard numbers derived. It’s that when the goal posts of the game (both the 1953 and 1964 census figures) are a moving target, whatever score one keeps means little. The whole carefully-built house is erected on a pile of sand.

While a detailed analysis of non-contemporaneous census data may in theory allow one to indirectly reconstruct an estimate of the death count for the Great Leap Forward, it simply has not been done – and may be impossible to do. The error propagation across such a long time frame would ensure that today’s data (even if accurate for today) is not of much use for estimating the deaths that occurred some three or four decades back. The poor quality of contemporaneous historical data ensures that no meaningful death figures can be derived. The lack of good contemporaneous historical data, the fact that any non-contemporaneous data today must incorporate the accumulated effects of many factors over the years, and the fact that margins of error must be propagated mean that any study that claims to pinpoint death figures is highly suspect.

However many died during the Great Leap Forward, it is incontrovertible that the Great Leap Forward was a disaster, with lasting consequences on China’s people, society, and development.  That’s not disputed. The problem I have lies in people’s taking the assertion that millions upon millions died as established fact.  I hope I have clearly dispelled that myth – at least as far as testable hypotheses go given the nature of the data that is available.

A final problem I see in discussions surrounding the Great Leap Forward in the West is the reflexive need to assign blame to Chinese leaders – even to the extent of assigning moral and ethical labels. The actual situation is much more complex. And while statistics can’t answer many of the hard questions raised by the Great Leap Forward, you can go to Ray’s recent post and readers’ comments here to gain some perspective.


  1. For a discussion of how it is a mistake to defer the study of politically charged subjects to “scholars,” see Joseph Ball’s article titled “Did Mao Really Kill Millions in the Great Leap Forward?”, which I have linked in the previous post.
  2. p.229-231
  3. See http://www.stats.gov.cn/english/
  4. Since Banister did not provide all the details of her calculations, these are the best I could do to reproduce her calculations following her discussions.
  5. J. Aird ‘Population Studies and Population Policies in China.’ In Population and Development Review, Volume 8, No.2, 1982.
  1. February 15th, 2013 at 02:18 | #1

    Allen, I know this topic has been on your chest for some time now. Glad to see your analysis. I think this article will be of valuable reference to those interested in the GLF.

    At a personal level, when I asked my parents about the GLF just a few days ago, they told me they witnessed no deaths where they lived in Fujian Province. They did say it was difficult for some, especially in the cities. In the rural areas of Fujian, people consumed what they were able to grow, and there was enough.

    On my wife’s side of the family, they witnessed 1 man stumbling in the streets (I believe at that time they were in Nanning).

  2. February 15th, 2013 at 05:21 | #2

    Luckily Fujian wasn’t as heavily affected by the GLF as many of the other provinces.

  3. February 15th, 2013 at 08:52 | #3


    Btw, how’s the trip to Shenzhen?

  4. February 15th, 2013 at 09:16 | #4

    Allen, are you familiar with the famine of 1942? I am not, as even less information is available. There is a good movie by Feng Xiaogang out recently called “Back to 1942”; you might want to check it out. The biggest issue I have with Banister’s estimates is actually with those from the war years, mainly 1940-1945. In my view those figures are pure guesswork. Nobody, be it the two central Nationalist governments, the Communist-controlled areas, the various regional Nationalist governments or the Japanese-controlled areas, did any comprehensive demographic study. To even get a reliable set of figures from any of those entities is next to impossible. So how can one extrapolate a national figure using estimates?

    In my view and limited studies, people living in war zones and under adverse starvation conditions will have fewer children than they normally would. The fertility figure would increase greatly after WWII ended in 1945 and again around 1953-1958, which were the biggest boom years since the end of the Qing dynasty. However, the fertility rate is almost flat in the figures provided. A special characteristic of Chinese fertility rates in recent years (which affects even overseas Chinese communities) is the Dragon year baby boom, with around 20% more births, and fewer during the year of the hog. Chinese schools in Malaysia and Singapore have to increase class enrolment for children born in the year of the dragon.

    I believe millions died during the GLF years. However, from my study those years were not worse than the famine periods from the 1900s onward, especially during the Sino-Japanese war years of 1937-1945. The Japanese occupation forces and some Nationalist governments routinely levied excessive grain taxes on the farmers when there were serious shortfalls. And how many people are familiar with the Japanese “Burn to Ash Strategy” (燼滅作戦 Jinmetsu Sakusen)? If we compare the numbers from that war (15-25 million spread over eight long years), which are also an estimate, how can any serious researcher arrive at the 30 million figure unless he or she already has a figure in mind before embarking on the research? The more recent figures of 40-70 million deaths are all arrived at with no serious study at all.

  5. February 15th, 2013 at 09:32 | #5

    @YinYang and melektaus
    Fujian and other coastal regions were lucky in the sense that people could take to the sea to fish too. The GLF affected the traditionally poorer regions badly but had less effect on the richer regions.

    Although progress has been made, inequality in income and standard of living is still great. Simply compare the income of different regions. Guizhou has made progress in eliminating extreme poverty, but Guangdong now has an economy the size of South Korea’s. The income inequality of China is not a matter of more people becoming poorer; in fact the opposite is true. The perceived inequality arises because people in certain regions became richer quicker than anybody else. http://en.wikipedia.org/wiki/List_of_Chinese_administrative_divisions_by_GDP_per_capita

  6. February 16th, 2013 at 00:45 | #6


    Loved it. Better than HK. HK is too materialistic and too many rich douches from other countries there.

  7. conan
    February 19th, 2013 at 09:00 | #7

    People don’t mention this very much, but the U.S regime and its proxies did not allow food to be imported into China during the glf. The west, and their proxies like to complain about the glf, however, they do not take responsibility for murdering the Chinese who starved to death by placing sanctions on them. The U.S and their proxies have been using this trick for a long time. Some of their victims include Iran, and North Korea. Yes, you hear about North Koreans starving, but you don’t hear about the sanctions that cause it. People may not have been pigging out on KFC, and mcdonalds back then, but they had a lot of things that Chinese people do not have today, which is honesty, trust, a sense of safety, comradorie, unity, spirity, and a whole lot of things that died when capitalism came back. It’s truly sadening to see Chinese people waste a months salary on stupid imported “luxury” items. Chinese people may not be smoking opium anymore, but they might as well be when they blow their money on iphones. You look at Chinese people from the 1960s and they looked alive, and empowered. There was something so pure, and natural about that period. I wouldn’t have traded it for all the WMD’s in america. When you look around Zhejiang these days, you see people playing on their iphones like a bunch of weaklings. Look at how some of the women are dressed in Guang Dong, Hong Kong, and Taiwan, and it’s quite shameful to see what has happened to the Chinese race. China may not have had much in 1960, but at least women werent running around looking like sluts. The Indians may be poorer, and economically vulnerable, but it seems like at least they have their dignity. Today, youve got all kinds of scavengers running around China. There’s Africans, Arabs, jews, you name it. Koreans are opening up salons here, and getting your hair done there costs as much as an average Chinese months salary. The worst part is, people are lining up to get robbed. 
Ipads in apple stores are sold out. I don’t know what kind of a trade surplus people are talking about, but it seems like most Chinese people are throwing their money away on foreign crap. You name is, Chinese lose it. Cars- theres more foreign cars than Chinese cars on any road. Phones- foreign phones dominate the market. HTC, Samsung, Nokia, Iphone rules. Ktouch, ZTE, Coolpad, Meizu, sorry, better luck tomorow. I walk into a bathroom, and what do i see? toto, and american standard toilets? wtf does China need to import toilets for? As if it really makes a difference if you sit on a domestic or import toilet? I can understand if the import is cheaper, but in most cases, imports will cost twice as much as the domestic one, and people are still buying it. It really looks like the world is on life support, and China is keeping them all alive. I hate to say it, but China is getting a raw deal. China is buying all the expensive crap that no one else wants to buy, while local manufacturers are being put out of business. You look at the Beijing airport, and who designed it? You have british architects designing the Beijing airport, and the roof on it collapsed by the way. But why on earth do you need Brits to design the Beijing airport, especially when they charge twice as much as Chinese architects. Something really bad is hapening in China, and its hapening at the expense of the Chinese people. The worst part of it is, why the heck are Chinese buying stuff, and hiring people from hostile regimes? The brits are behind all kinds of terorism in China, and then China goes around and hires their architects. Something just doesnt sound right about that. Chinese people have always been frugal people, like the jews, and what has happened in the past 30yrs has turned China upside down. If you paid $50 to get your hair dyed back in 1960, people would have said, you;re a god dam idiot. Today, that is seen as “cool”, although who actually cares if you dye your hair anyway? 
@melektaus, if you want to see the worst of the worst, go to lan gwai fong in hong kong. There you will see the worst that Chinese civilization has to offer. Westerners always like to expand outward, and they do. They dominate most markets, and when you try to do your own thing, they attack you. Take skype for example, they pretty much dominate every country. No one dares challenge them. Theres a voip called uu. how many of you have ever heard of it? Google is another one. You look around this world and only 3 countries have their own search engines. China is one of them, and what happens when China makes their own search engine? The west cries “censorship”! even though they have so much western propaganda on baidu already, but no, they want more, to the point where all you do is host content that criticises yourself. Western websites censor stuff out all the time, using a variety of different tactics, ie blocking Chinese ip addresses, but no one ever dares to say that the west censors information out.

  8. conan
    February 19th, 2013 at 09:09 | #8

    tons of people starved to death during the kmt period, but no one ever blames chang kai shek for it. i wonder why? even the communists dont blame him. but if you read some of edgar snow’s books, then you can see that lots of people starved to death, and the kmt was responsible for it, no doubt. thats why the people ended up revolting against the kmt, because the conditions were so horrible. however, if you look at Chinese history from 1950-1980, there was no large scale revolutions or protests. this is in stark contrast to the kinds of revolutions and protests going on during the kmt period, which ultimately led to the toppling of the kmt regime. in order for Chinese people to revolt, things have to get really bad, and the kmt provided those conditions, both through concentration of wealth, and allowing the japanese to do whatever they want (ie kill everyone). on the one hand, you have chang kai shek actually “killing his own people”, then on the other hand, you had the u.s regime killing Chinese through sanctions, yet few people ever mention chang kai shek;s crimes against China, and the u.s regime’s genocide on the world through sanctions, instead, what you hear is Mao killing millions. thats the power of propaganda. it looks like humans havent really advanced that far since they believed the world was flat, because today, they are still believing all kinds of bullshet that are not true.

  9. February 20th, 2013 at 00:04 | #9


  10. February 23rd, 2013 at 09:50 | #10

    Allen, judging by how your pieces on this topic have been misread and misconstrued, you may want to put out a disclaimer:

    Warning: Grade-10 mathematical knowledge, basic logical reasoning, and critical thinking skills are highly recommended. For those with IQ less than 120, read at your own risk.

  11. wwww1234
    August 25th, 2013 at 06:03 | #11

    new research on the famine
