Three Applications of Regression Discontinuity on Urban and Labor Economics

by Zhichao Wei
B.A., Shanghai Jiao Tong University, China, 2002
M.A., Peking (Beijing) University, China, 2006
M.A., Brown University, 2007

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Economics at Brown University

PROVIDENCE, RHODE ISLAND
May 2012

© Copyright 2012 by Zhichao Wei

This dissertation by Zhichao Wei is accepted in its present form by the Department of Economics as satisfying the dissertation requirement for the degree of Doctor of Philosophy.

Date __________ ___________________________ Vernon Henderson, Advisor

Recommended to the Graduate Council

Date __________ ___________________________ Nathaniel Baum-Snow, Reader
Date __________ ___________________________ Louis Putterman, Reader

Approved by the Graduate Council

Date __________ ___________________________ Peter M. Weber, Dean of the Graduate School

Vitae

The author was born on May 9th, 1980 in China. He received his Bachelor of Arts in Economics from Shanghai Jiao Tong University in 2002. He received his Master's degree in Economics from China Center of Economic Research, Peking (Beijing) University in 2006. He entered Brown University in 2006 to study Economics. He received his M.A. in 2007 and finished his doctorate in May 2012.

Acknowledgement

I would never have been able to finish my dissertation without the guidance of my committee members, help from friends, and support from my family. I would like to express my deepest gratitude to my advisor, Professor Vernon Henderson, for his excellent guidance, care, and patience. I would also like to thank Professor Louis Putterman and Professor Nathaniel Baum-Snow for guiding my research for the past several years. I thank Kenneth Chay, Pedro Dal-Bo, Sriniketh Nagavarapu and Nancy Qian for sharing their numerous insights with me.
I also thank my colleagues Mongoljin Batsaikhan, Tianran Dai, Tai-sen He, Kenju Kamei, Daeho Kim, Tomislav Ladika, Weiye Li, Manabu Nose, Ying Pan, Adam Storeygard, Isabel Tecu, Zhi Wang, Joshua Wilde, and Zhaoguo Zhan for making learning fun. I am particularly grateful to my wife, Ang Sun, for her intellectual input, continuing support, and endless patience. She was always there cheering me up and stood by me through good times and bad. Finally, I would like to thank my parents, my brother, and my host parents, Paul and Elaine Meyrial. They always supported and encouraged me with their best wishes.

Table of Contents

Vitae
Acknowledgement

Chapter 1 Blessing or Curse: A Study of China’s Place-based Pro-agriculture Poverty Alleviation Program
  1 Introduction
  2 Background
    2.1 Time Line of the Program
    2.2 Measures of the program and achievement
    2.3 Program Assignment
    2.4 China’s structural change and grain policy
  3 Model
    3.1 Parameterization of production functions
    3.2 Comparative Statics
  4 Data
  5 Empirical Strategy
    5.1 RD Validity
    5.2 Specification
  6 Results
  7 Conclusion

Chapter 2 Rates of Return to University Education: The Regression Discontinuity Design (Co-authored with Elliott Fan, Xin Meng, Guochang Zhao)
  1 Introduction
  2 Background
  3 Methodology
  4 Data
  5 Fuzzy RD Results – LATE
    5.1 Validity of the RD Design
    5.2 Estimation Results
    5.3 Robustness Tests
  6 More Discussion on LATE
  7 Conclusions

Chapter 3 Once A Loser, Always A Loser? Evidence from the Football League in England
  1 Introduction
  2 Background
  3 Data
  4 Empirical Strategy
    4.1 The Pre-test
    4.2 Treatment Effects
    4.3 Heterogeneity Across Players
  5 Results and Interpretation
  6 Concluding Remarks

Abstract of "Three Applications of Regression Discontinuity on Urban and Labor Economics" by Zhichao Wei, Ph.D., Brown University, May 2012

This dissertation uses the regression discontinuity method to study topics in urban and labor economics. The first chapter evaluates the effects of China's second-wave, place-based poverty-alleviation program. The results using a regression discontinuity design show that the supported sector grew faster in the counties receiving aid than in the counties not receiving aid. However, this aid also drove local labor out of other production, especially non-farm activities, that played an increasingly important role in rural earnings. The paper suggests that the traditional place-based policy in China caused significant distortions and that a more efficient way of helping the lagging regions is needed. The second chapter estimates the rate of return to a university degree using a regression discontinuity design made possible by a special feature of the university admission system in China. The National College Entrance Examination has clear cutoffs for university entry. Our results show that the rates of return to 4-year university education relative to 3-year college education are 40 and 60 per cent for the compliers in the male and female samples, respectively. The third chapter contributes to the literature on how a short-term adverse shock to a firm can affect an employee’s long-term well-being by exploring the effect of team relegation on players in the English Premier League. In each football season, the three lowest-ranked teams are relegated to a lower-level league the following season.
The fourth-lowest-ranked team is arguably similar to the team ranked third from last. Using a longitudinal team-player dataset and a differences-in-differences approach, I find that team relegation has a negative effect on players in the short run but a positive effect in the long run. I also find that players from the winning teams have fewer appearances than players from the losing teams in the first few years after the relegation season, which indicates a channel of human capital accumulation.

Chapter One
Blessing or Curse: A Study of China’s Place-based Pro-agriculture Poverty Alleviation Program

1 Introduction

Inequality in regional development is one of the most pressing socioeconomic problems that governments face in both developed and developing countries. Equity considerations have led to policies and programs for disadvantaged regions and/or the populations in those regions. Some well-known examples are America’s Appalachian Regional Development Program, Brazil’s SUDENE program in its earlier years, and, more recently, the European Union’s Structural Funds and Cohesion Funds, and Mexico’s Oportunidades.

Even now, the proper approach to helping lagging regions remains hotly debated. Should help go to distressed places or to distressed people? The standard argument among economists, as Glaeser and Gottlieb (2008) clarify, is that people-based policies of supporting job training and facilitating household mobility are far superior to potentially wasteful place-based policies. However, place-based programs are still widely implemented, and existing ones continue to expand (Greenbaum and Landers, 2009). Well-known examples include America’s Empowerment Zone Program and the European Union’s Regional Policy Program.
Although there are many reasons for this expansion, such as political concerns, as Greenbaum and Landers (2009) argue, a key factor is that solid evaluation of the programs is rare (Greenstone and Looney, 2010). People may reach very different conclusions simply because they use different estimation strategies in their analyses.¹ Simple OLS regression, especially the fixed-effect model, is widely used, but the estimate is very likely to be biased since the lagging regions are, in general, different from other regions. Matching could be an option, but it relies on the strong assumption of conditional independence, which is difficult to satisfy. Valid IV estimates are still very rare in the literature.² Governments typically spend large and ever-growing amounts on these programs, and not having a rigorous estimate to justify these expenditures is a serious problem.³ In addition, although there is an extensive theoretical and empirical literature on the effects of these programs, very few papers have tried to pin down the mechanism of the programs; most have focused, instead, on an overall evaluation.⁴ In general, the programs target some specific industries or firms. It is interesting to know the programs’ effects on those industries or firms, and it is equally interesting to know their effects on other, unsupported industries or firms. It is even more important to know how these effects occur and through what channels. However, such discussion is very rare. Partly because of that, the discussion of why the programs are effective or ineffective lacks solid evidence. Thus, policy makers can easily disregard the little focused research that exists since it fails to provide clear recommendations for program improvements.

China is no exception to this. Regional inequality has continued to increase since the mid-1980s and remains a worry for China’s government (Fan et al., 2010). Between 1986 and 2010, the government launched three waves of large-scale, place-based poverty-alleviation programs to help poor regions. These programs focus mainly on helping rural households with agricultural production,⁵ with the objective of increasing their income.⁶

¹ For example, there is disagreement over the effect of the Appalachian Regional Development Program (ARDP). Isserman and Rephann (1995) constructed a "twin" county to serve as the counterfactual and found a positive effect of the program. However, Glaeser and Gottlieb (2008) adopted a more standard multivariate regression model in their analysis and found little evidence of ARDP’s effectiveness. Ziliak (2010) used a differences-in-differences-in-differences approach and again found a positive effect. Another example is that Greenbaum and Landers (2009) found that multiple evaluations of California’s Empowerment Zone Program came to varying conclusions.

² Criscuolo et al. (2009) exploited multiple changes in the area-specific eligibility criteria and constructed an IV to estimate the effect of the Regional Selective Assistance program in the U.K.

³ The Appalachian Regional Commission disbursed $13 billion in the thirty years after 1965. Brazil’s government provided $10 billion in subsidized loans for the SUDENE program between 1989 and 2002. The latest Structural Funds and Cohesion Funds of the European Union for the programming period 2007-2013 have already reached 347 billion euros.

⁴ Bondonio and Greenbaum (2007) is one of the few exceptions that discuss the program effect beyond the overall evaluation. They found that the state Empowerment Zone Program had positive effects on new and existing establishments, but had negative effects on firms that close or leave the area. They argue that this is the reason they get a null overall mean impact in their estimation.
⁵ At first, the program funds were used mainly for household agricultural production. In the late 1980s, the emphasis switched to supporting TVEs and other county enterprises. In the mid-1990s, agricultural production became the focus again. Wang (2004) and Rozelle et al. (1998) have more details on that.

⁶ Before 1978, China followed the Soviet industrialization strategy, which involved taxing agriculture to fund industrial investment. After the 1978 reform, the elimination of the exploitative taxation of agriculture helped agriculture develop very quickly between 1978 and 1984. In the 1980s, agriculture played a far more important role than other sources in reducing poverty and improving income in rural China (Ravallion and Chen, 2007). That is the reason why China’s poverty-alleviation program focused on agricultural production.

More importantly, food security is always a top concern of China’s government, and it is harder for the poor regions to be self-sufficient. Therefore, grain production is always the main focus of government programs. In particular, in the mid-1990s, grain production was regarded as the top priority due to concerns about the high price of grain.⁷ China’s government took many measures to maintain grain self-sufficiency.

Few observers deny that China made remarkable progress in its war on poverty after the reform. We know that the lagging regions in China have grown very fast and that many people have escaped poverty. However, the role of the poverty-alleviation programs in this process is unclear. The questions remain: Have these programs been effective? Why or why not? Good answers to these questions can help guide policy makers in their future decisions regarding poverty alleviation.

So far, a few papers have evaluated the programs, and they suffer from the limitations listed above. Rozelle et al. (1998, 2003) used fixed-effect regression and found that the program had a modest positive effect on agricultural production in Shaanxi Province from 1986 to 1991 and in Sichuan Province from 1985 to 1995, respectively. Fan et al. (2002) used provincial-level panel data from 1970 to 1995 and found that poverty investments (measured as poverty loans) matter somewhat for growth and poverty alleviation. Park et al. (2002, 2010) used matching in their analysis and found the programs’ effect decreasing over time, even though the poverty-alleviation funds kept growing.⁸ Li and Meng (2010) used a regression-discontinuity approach and found a positive effect between 1994 and 2004. However, all of these papers focused only on the estimation of the overall effect of the program on income or production. They said little about the program’s effect on other characteristics. Thus, we are still unclear about why the programs were effective or ineffective.

⁷ The grain price increased dramatically in late 1993, and the high price continued until 1997.

⁸ They found that in the included counties or communities, per capita income increased by 2.28 percent per year between 1985 and 1992 and by 0.91 percent between 1992 and 1995, but by almost zero between 2001 and 2004, even though program funds kept increasing.

This paper aims to fill the research gap with a more rigorous and comprehensive analysis based on a general-equilibrium framework. It evaluates China’s second-wave poverty-alleviation program from 1994 to 2000,⁹ which focused on agricultural production, especially grain production. In 1994, the poverty-alleviation program was expanded, and newly targeted counties were included based on their per capita rural income. If a county’s per capita rural income was below 400 yuan, then it was, in principle, included. We make use of the discontinuity of program assignment at the eligibility threshold to construct an instrumental variable for the actual treatment status, as Li and Meng (2010) do in their paper.
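The identification idea in the last two sentences can be illustrated with a minimal sketch. It is not the specification used in this chapter (that appears in Section 5.2); it only shows how an eligibility dummy at the 400 threshold can serve as an instrument for actual treatment. The function name `fuzzy_rd_iv`, the bandwidth of 100, the linear controls in the running variable, and all simulated numbers are assumptions made for illustration.

```python
import numpy as np

def fuzzy_rd_iv(income, treated, outcome, cutoff=400.0, bandwidth=100.0):
    """Just-identified IV estimate around an eligibility cutoff (illustrative sketch).

    The instrument is the eligibility dummy 1[income < cutoff]. Controls are an
    intercept and a linear term in the running variable whose slope may differ
    on each side of the cutoff. With one instrument and one treatment, the
    2SLS estimate equals the reduced-form jump divided by the first-stage jump.
    """
    x = np.asarray(income, dtype=float) - cutoff
    keep = np.abs(x) <= bandwidth            # use only counties near the threshold
    x = x[keep]
    d = np.asarray(treated, dtype=float)[keep]
    y = np.asarray(outcome, dtype=float)[keep]
    z = (x < 0).astype(float)                # eligibility instrument
    X = np.column_stack([z, np.ones_like(x), x, x * z])
    first_stage = np.linalg.lstsq(X, d, rcond=None)[0]   # jump in treatment probability
    reduced_form = np.linalg.lstsq(X, y, rcond=None)[0]  # jump in the outcome
    return reduced_form[0] / first_stage[0]

# Simulated counties (hypothetical data, not the dissertation's sample).
rng = np.random.default_rng(0)
n = 20_000
income = rng.uniform(200.0, 600.0, n)
eligible = income < 400.0
# Imperfect compliance: eligible counties are treated with probability 0.8, others 0.1.
treated = rng.random(n) < np.where(eligible, 0.8, 0.1)
# The true treatment effect is set to 0.5 by construction.
outcome = 1.0 + 0.002 * (income - 400.0) + 0.5 * treated + rng.normal(0.0, 0.2, n)

estimate = fuzzy_rd_iv(income, treated, outcome)
print(round(estimate, 2))
```

Because compliance is imperfect in the simulation, the raw jump in the outcome at the cutoff understates the treatment effect; dividing by the jump in treatment probability rescales it, which is the role the eligibility instrument plays in the text.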
Utilizing 11-year, county-level panel data, the IV estimation results show that the poverty-alleviation program did have a positive impact on grain production and agricultural income. However, the program drew rural labor out of township and village enterprises (TVEs) and thus had a crowding-out effect on rural industrial production. More importantly, the treated counties benefited from the support to agriculture and the unusually high prices of agricultural goods from 1994 to 1996. However, they lagged behind the untreated counties in industrial production, which proved to be more important in the future. As the prices of agricultural products dropped after 1997, the benefits of more agricultural production got smaller, and the gaps in non-farm activities might have gotten larger, possibly making the supported counties worse off in the long run.

⁹ The poverty-alleviation program had three waves: 1986-1993, 1994-2000, and 2001-2010. I will talk more about that in Section 2.

This paper also has implications for the regional-development literature. First, it is one of the few papers that provide rigorous evidence by dealing with the endogeneity problem. The results show that the place-based regional policy did not work well in China from 1994 to 2000. More importantly, rather than only estimating the overall effect on the income variable, as most of the other papers do, this paper tries to pin down the mechanism by analyzing the general equilibrium effect of the poverty-alleviation program. It reveals that the pro-grain program caused distortion in favor of grain production and that it harmed other production, especially the non-farm activity that would play a more important role in rural income later on. In addition, the paper shows that factor (labor and capital) mobility is the channel of the distortion ignored in the literature.
Furthermore, it reveals that the place-based strategy lacks flexibility, since the support remained biased toward agriculture even when the grain price fell after 1997.

This paper sheds light on the prospects for helping lagging regions in China, which experienced a major transition from an agricultural to an industrial economy. Although pro-agriculture or pro-grain policy worked well to fight poverty in the 1980s (Ravallion and Chen, 2007), it might not have been a good option in the mid-1990s or beyond. In a transitional economy like China, it is hard to always choose the right industry to support. A better option is a people-based strategy, which has more flexibility. People can respond to changes more quickly and adapt to new environments faster than the government. Furthermore, when a large-scale program is designed, the general equilibrium effect and factor mobility should be taken into consideration.

The paper is organized as follows: Section 2 describes the policy background. Based on that, in Section 3, I construct a simple model to clarify the mechanism and make some predictions. Then, I discuss the data I use in the paper in Section 4, and in Section 5, I present the empirical strategy and the results. Finally, I offer concluding remarks.

2 Background

2.1 Time line of the program

After the mid-1980s, inequality grew as the economy developed in China. Fan et al. (2010) show that the Gini and Theil indices of regional inequality kept increasing after the mid-1980s (Figure 1). China’s government was concerned about this and wanted to help the poor. So, in 1986, it launched the first-wave large-scale poverty alleviation program targeting more than 200 counties, called the National Designated Poverty (NDP hereafter) Counties.¹⁰ The government put a lot of resources into those NDP counties for improving rural infrastructure and agricultural production conditions, hoping that these improvements would help the poor in the long run.
In 1993, the central government decided to expand the program under what was called the 8-7 plan to include 592 counties, which accounted for 28 percent of the total counties in China. These 592 counties continued to receive support from 1994 to 2000. In 2001, the government made some changes to the program,¹¹ and the program went on until 2010. This paper will focus on the second-wave program, which took place from 1994 to 2000, since many counties were added to the program during that period. Figure 2 shows the geographical distribution of the NDP counties in this wave. Most of them are located in the central or western regions of China, both of which are regarded as relatively poor.

2.2 Measures of the program and achievement

The poverty alleviation program was a preferential policy focused on developing agricultural production and improving living and production conditions. Former President Jiang Zemin gave a keynote speech at the Poverty Reduction Conference in 1996 stating the emphasis of the poverty alleviation program: grain production was the top priority, and other crops and animal husbandry should be developed only if grain self-sufficiency was not a problem. Beyond that, the rural processing industry could develop to some extent.¹²

¹⁰ Every year after 1986 and until 1990, some new counties were added to the list. In 1990, the total number was 331.

¹¹ There are two important changes. First, the government removed some counties and added others to the list based on some criteria. Second, the government targeted villages or communities more than counties. For the details and the evaluation, please see Park and Wang (2010).

¹² Specifically, as Wang et al. (2004) point out, the goals of the 8-7 plan were to: (1) assist poor households with land improvement, increased cash crop, tree crop and livestock production, and improved access to off-farm employment opportunities; (2) provide most townships with road access and electricity, and improve access to drinking water for most poor villages; and (3) accomplish universal primary education and basic preventive and curative health care.

Specifically, the support package contained three different instruments: subsidized loans, food-for-work (FFW), and Ministry of Finance development grants. The interest rate on the subsidized loans, which was set by the Central Bank of China, was well below the market rate, as Table 1 shows. In addition, the CPI was quite high in the early 1990s, so the real interest rate was, in fact, negative before 1997. The farmers mainly used the loans to buy or rent agricultural machines,¹³ and to buy fertilizers, various agricultural seeds, or young livestock. The second most important program was the food-for-work (FFW) projects.¹⁴ These projects, which included building roads, constructing terraced fields, improving soil, building small-scale hydraulic engineering works (e.g., drains and ditches), and improving the drinking-water and irrigation systems, were supposed to improve the infrastructure and agricultural production conditions. The third instrument, Ministry of Finance development grants, was intended to boost rural development in general.¹⁵

The amount of poverty alleviation funds kept increasing in nominal values and went up dramatically after 1997 (Table 2). In addition, Table 2 shows that the subsidized loan program accounted for over one half of the total poverty funds, while the other two programs each accounted for 20-30 percent. According to Wang et al. (2004), in 1996, for example, the program provided 11.6 billion yuan (or $1.4 billion), an amount equal to five percent of central government expenditures and more than five percent of the rural household income in poor counties. Table 3 shows that poverty alleviation funds were equal to 30 percent of the total fiscal revenue for NDP counties, on average.

The government had strict control over the allocation of these funds. For FFW and development grants, the process was as follows. First, the village officials collected information from the farmers, wrote proposals and submitted them to the township government. The township officials chose the proposals to send to the next level, the county government. The corresponding professional departments then examined the proposals and forwarded their selections to the county officials for approval. If approved, the proposals were then forwarded to the prefectural- and provincial-level governments. The provincial governments made the final decision and then transmitted the funds downward to the county officials and through the successive levels.

¹³ The agricultural machinery includes various sizes of tractors, tractor-towed farm machinery, loaders, harrows, harvesters, milling machines, sprayers, etc.

¹⁴ FFW is widely regarded as a good method to help the poor. It not only builds up material foundations for regional economic growth, but also provides short-term job opportunities and income for the poor. For the details of FFW in China, please refer to Zhu and Jiang (1994).

¹⁵ In reality, its focus was the improvement of the rural infrastructure, the extension of new agricultural technology or better seeds, the training of farmers, and the improvement of education and health conditions.

The subsidized loan program was a little different. It was mainly managed by the Agricultural Bank of China (ABC hereafter), one of the large state-owned banks in China.
County-level ABCs worked closely with county-level governments and managed the final allocation and utilization of the subsidized loans. Specifically, to qualify for a subsidized loan, a project first needed to be approved by the county-level government and then confirmed by the county-level ABC, which could reject the project based on its risk.¹⁶

The proposals had to survive several rounds of screening, and the government used this procedure to guarantee that the funds would be used to improve agricultural production and living conditions, as the government expected. The central government thought that agricultural production, especially grain production, played a central role in increasing rural farmers’ income, so it tended to approve proposals that promised to improve agricultural production, and grain production in particular. Especially after 1996, in order to prevent the misuse of money, the central government put more effort into monitoring how funds were actually used.

The government pronounced the program a great success, especially in agriculture and infrastructure.¹⁷ In its official summary report issued in 2001, the government described the 8-7 plan’s major achievements: 601 billion mu (100 billion acres) of basic farmland were formed; 535 billion people’s and 484 billion livestock’s drinking-water problems were resolved.

¹⁶ Wang et al. (2004) describe the programs in more detail.

¹⁷ There is no detailed information about the allocation of the poverty alleviation funds before 1997. Wang et al. (2004) find that between 1998 and 2001 in 519 counties, 46 percent of poverty funds were allocated to agriculture, 20 percent to infrastructure, 14 percent to industry, six percent to transportation and three percent to education and health care.
2.3 Program Assignment

The choice of counties to participate in the program was based on a set of characteristics, especially in the first-wave program, with the rural net income per capita¹⁸ being the most important one.¹⁹ In 1993, the central government decided to expand the coverage to include 592 counties. The announced standard was that the new NDP counties’ rural net income per capita had to have been below 400 in 1992. For political reasons, counties included in the first-wave program were difficult to remove from the program, so the cutoff applied to them was a net per capita income of 700.²⁰ The expansion was approved in 1993, and the 1992 rural net income per capita had been released before the meeting. So, although the county officials had an incentive to manipulate the index data, they were not able to.

2.4 China’s structural change and grain policy

To understand the effect of the program, it helps to know the background of China’s development in this period. China’s development after the reform can be divided into several stages: 1978-1984, 1985-1992, 1993-2002, and 2003-present. In the first stage, the contribution of agriculture was very large (Lin, 1992; Fan, 1991; and others). China’s central government initiated further reform in 1992, and China began to experience major structural change in the early 1990s. Panel A of Figure 3 describes the changes in the various components of rural net income per capita. We can see that wage income went up dramatically after 1994 and kept increasing much faster than other income sources. It even surpassed plantation farming as the major source of earnings in 2003. As shown in panel B of Figure 3, the share of wage income increased from around 20 percent in the early 1990s to 40 percent in 2009, while the share of plantation decreased from around 50 percent to 30 percent over the same period.²¹ These figures indicate that China experienced a major shift towards non-farm activities after the early 1990s. Rural as well as urban residents were affected by and benefited from the transformation. Based on the above figures, it is natural to ask whether the tradition of supporting agricultural production in the poor areas should be reconsidered.

¹⁸ The formula for calculating the rural net income per capita is: rural net income per capita = (Total income - Operation fee - Tax - Fee - Depreciation - Survey subsidy - Gifts to relatives and friends) / number of household residents. Each year, the rural household survey team of NBS randomly chose 100 households in each county and calculated the rural net income per capita based on the formula.

¹⁹ The NDP counties’ rural net income per capita should have been below 150 yuan in 1985. The cutoff was raised to 200 for revolutionary counties and counties with large minority populations and to 300 for the minority counties in Inner Mongolia, Qinghai and Xinjiang. As Park et al. (2002) point out, the standard was not strictly enforced.

²⁰ It turned out that only 30 counties were removed from the list.

China has a long history of concern about grain production. Grain insufficiency was always the cause of farmers’ uprisings in feudal China. In the 20th century, the great famine of 1959-1961 haunted Chinese leaders. Food security is a perennial public concern in China, so grain production was always one of the top priorities of China’s government. In 1993, China initiated further reform in the circulation system of agricultural products.²² However, as Figure 4 shows, in late 1993 and early 1994, the grain price unexpectedly went up dramatically.²³ At the same time, the inflation rate was also very high, and the government was concerned because of the big problems caused by the inflation of 1988.
They thought that the inflation might be due to a shortage in supply, so they raised the procurement price by a great amount, hoping to stimulate grain production. However, the price increase continued,²⁴ and Lester Brown’s 1995 book, "Who will feed China?", only intensified the worry of China’s leaders. The provinces were required to attain self-sufficiency, and the government promised to buy any amount of grain at the procurement price, which was very close to the market price, especially before 1997. However, the high price of grain did not last long, and the grain price dropped sharply after 1997.²⁵ Because of that, income from agriculture increased significantly from 1993 to 1997 and leveled off after that.²⁶

²¹ As Cai and Wang (2009) pointed out, the contribution of wage income to rural income is underestimated because some rural migrants are not counted as rural residents if they stay longer than half a year in the urban areas.

²² For more details, please see the Appendix.

²³ There are a few reasons for that. The most important reason is related to inflation expectations. The inflation rate was very high (14.7% for 1993, 24.1% for 1994, 17.1% for 1995). The farmers expected the price increase and preferred to hold on to their stock, so less grain was supplied to the market. At the same time, the buyers expected the price increase and tried to buy more grain, so grain demand in the market rose. Thus the price increased. In addition, the grain self-sufficiency requirement was loosened somewhat in that period, so the coastal regions gave up grain production and bought grain from inland regions, which caused the grain price in inland regions to go up. The inland regions tried to block the market trade but failed. Furthermore, the international grain price increased at the same time.

3 The Model

This section develops a simple general-equilibrium framework for understanding the effects of the poverty alleviation program.
A key feature of the model is that the effects of agricultural development are mediated by labor flows, which might not be expected. Suppose that a rural household can choose to work in either the agricultural sector or the industrial sector. The agricultural sector was supported by the government. When the program kicked in, the food-for-work projects and the public infrastructure construction improved the production conditions of the agricultural sector. In addition, it was easier for the agricultural sector to get subsidized loans. I want to use this model to show how that will affect the flows of factors (capital and labor) and, thus, the production in each sector and overall income. Specifically, I will show that labor and capital will flow to the agricultural sector, and, thus, the output in this sector increases. In addition, labor flows out of the industrial sector.

²⁴ It intensified farmers’ expectation of a price increase, which drove the price higher. Regarding the reasons for the price increase, please see Johnson and Song, 1999; Lu, 1999; and Lu and Peng, 2002.

²⁵ There are two major reasons for the price drop. First, the inflation rate went down a lot in 1996-1997, so the farmers no longer held on to their stock. Second, the government bought a lot of grain at high prices in 1994-1996 and accumulated a huge loss, which was not sustainable, so the government changed the policy and furthered the reform.

²⁶ For detailed information on the grain price in this period in China, please refer to Huang et al. (2006).

Below are the key assumptions of the model:

Two sectors: The household is self-employed and engages in agricultural and industrial production. In the agricultural sector, the rural household in China does not hire laborers for its agricultural production. In the industrial sector, most laborers worked in TVEs before the mid-1990s. Most TVEs were collectively owned, so it can be assumed that their workers were self-employed.
Some of the farmers were wage earners, and their numbers kept increasing in the late 1990s when privatization began. For simplicity, I assume that rural people were self-employed.^27

The capital price is fixed: In the 1990s, the capital price was determined mainly by the central government. From another point of view, the "poor counties" can be regarded as a "small economy" whose scale has no impact on the capital price at the national level. In addition, since the capital markets were separate for the agricultural and industrial sectors in China, I assume that capital is not interchangeable across sectors.

Closed labor market: The household could freely allocate its labor between the two sectors. The labor market was closed locally. This assumption, together with the first one, simplifies the analysis in the sense that the "treatment effect" is constrained within the "poor counties" by cutting off the flow of labor and capital across the treated and untreated counties. Therefore, before continuing, it is worth clarifying that "general equilibrium" refers to the interaction between sectors within the "poor counties."

27 I could assume that they are wage earners instead, but the change of assumption would not affect the conclusion.

3.1 Parameterization of production functions

Assume that the production function is Cobb-Douglas in both sectors. To ensure an interior solution of the optimization problem, assume that production has decreasing returns to scale. Specifically, assume that the production function of the agricultural sector is

$$F = f(K_1, L_1) = A_1 K_1^{\alpha} L_1^{\beta} \tag{1}$$

where $A_1$ is the TFP in the agricultural sector. Decreasing returns to scale requires that $\alpha + \beta < 1$. Similarly, for the industrial sector, I assume that the production function is

$$G = g(K_2, L_2) = A_2 K_2^{\gamma} L_2^{\delta} \tag{2}$$

where $A_2$ is the TFP in the industrial sector and is assumed to be exogenous at this point, and $\gamma + \delta < 1$. Note that in this model, $K_1$ and $K_2$ are not interchangeable.
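Assumption 2 implies that the capital prices $r_1$ and $r_2$ are exogenous. Assumption 3 implies that labor can flow across sectors freely within the county's border. Therefore, in equilibrium, we must have

$$w_1 = w_2$$
$$L_1 + L_2 = L$$

where $w_1$ and $w_2$ are the implicit wages in the two sectors, respectively. Assume that the price of the final product of the agricultural sector is 1 and that of the industrial sector is $p$. The maximization problem can be set up as follows:

$$\max \; I = A_1 K_1^{\alpha} L_1^{\beta} + p A_2 K_2^{\gamma} L_2^{\delta} - r_1 K_1 - r_2 K_2 \quad \text{s.t.} \quad L_1 + L_2 = L$$

Based on the background above, we know that the program gave rural people in the treated counties subsidized loans. We can characterize that as a decrease in $r_1$. In addition, the government also helped rural people improve agricultural production conditions. We can characterize this as an increase in $A_1$. So, we want to see how the decrease in $r_1$ and the increase in $A_1$ affect other variables, such as capital use and labor allocation in the two sectors. More importantly, we want to know the effect on income.

3.2 Comparative Statics

Set up the Lagrangian as

$$\mathcal{L} = A_1 K_1^{\alpha} L_1^{\beta} + p A_2 K_2^{\gamma} L_2^{\delta} - r_1 K_1 - r_2 K_2 + \lambda (L - L_1 - L_2)$$

The FOCs are

$$\frac{\partial \mathcal{L}}{\partial K_1} = \alpha A_1 K_1^{\alpha - 1} L_1^{\beta} - r_1 = 0 \tag{3}$$
$$\frac{\partial \mathcal{L}}{\partial K_2} = \gamma p A_2 K_2^{\gamma - 1} L_2^{\delta} - r_2 = 0 \tag{4}$$
$$\frac{\partial \mathcal{L}}{\partial L_1} = \beta A_1 K_1^{\alpha} L_1^{\beta - 1} - \lambda = 0 \tag{5}$$
$$\frac{\partial \mathcal{L}}{\partial L_2} = \delta p A_2 K_2^{\gamma} L_2^{\delta - 1} - \lambda = 0 \tag{6}$$

Solving them, I get

$$r_1 = B (L - L_1)^{a} L_1^{-b}, \qquad a = \frac{(1-\alpha)(1-\gamma-\delta)}{\alpha(1-\gamma)} > 0, \qquad b = \frac{1-\alpha-\beta}{\alpha} > 0,$$

where

$$B = \alpha \beta^{\frac{1-\alpha}{\alpha}} A_1^{\frac{1}{\alpha}} \left[ \delta \gamma^{\frac{\gamma}{1-\gamma}} (p A_2)^{\frac{1}{1-\gamma}} r_2^{-\frac{\gamma}{1-\gamma}} \right]^{-\frac{1-\alpha}{\alpha}} > 0.$$

Then

$$\frac{\partial r_1}{\partial L_1} = B \left( -a (L - L_1)^{a-1} L_1^{-b} - b (L - L_1)^{a} L_1^{-b-1} \right) < 0 \tag{7}$$

so we know that $\frac{\partial L_1}{\partial r_1} < 0$ as well. And we know immediately that $\frac{\partial L_2}{\partial r_1} > 0$, $\frac{\partial K_1}{\partial r_1} < 0$, $\frac{\partial K_2}{\partial r_1} > 0$, $\frac{\partial F}{\partial r_1} < 0$, $\frac{\partial G}{\partial r_1} > 0$ and $\frac{\partial I}{\partial r_1} < 0$.

Similarly, I can get $\frac{\partial L_1}{\partial A_1} > 0$, $\frac{\partial L_2}{\partial A_1} < 0$, $\frac{\partial K_1}{\partial A_1} > 0$, $\frac{\partial F}{\partial A_1} > 0$, $\frac{\partial G}{\partial A_1} < 0$ and $\frac{\partial I}{\partial A_1} > 0$.

At the same time,

$$w = \lambda = \delta p A_2 K_2^{\gamma} L_2^{\delta - 1} = \delta \gamma^{\frac{\gamma}{1-\gamma}} p^{\frac{1}{1-\gamma}} A_2^{\frac{1}{1-\gamma}} r_2^{-\frac{\gamma}{1-\gamma}} L_2^{\frac{\gamma+\delta-1}{1-\gamma}}.$$

We know that

$$\frac{\partial w}{\partial L_2} = \frac{\gamma+\delta-1}{1-\gamma} \, \delta \gamma^{\frac{\gamma}{1-\gamma}} p^{\frac{1}{1-\gamma}} A_2^{\frac{1}{1-\gamma}} r_2^{-\frac{\gamma}{1-\gamma}} L_2^{\frac{2\gamma+\delta-2}{1-\gamma}} < 0 \tag{8}$$

Then we know that $\frac{\partial w}{\partial r_1} = \frac{\partial w}{\partial L_2} \frac{\partial L_2}{\partial r_1} < 0$ and $\frac{\partial w}{\partial A_1} = \frac{\partial w}{\partial L_2} \frac{\partial L_2}{\partial A_1} > 0$.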
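The comparative statics above can be checked numerically. The following is only a minimal sketch under stated assumptions: the exponent values (0.3 on capital and 0.5 on labor in both sectors), the baseline prices and TFP levels, and the helper names `capital_demand`, `implicit_wage` and `solve` are all hypothetical and do not come from the paper. The sketch finds the labor split that equalizes the implicit wages by bisection and then compares a baseline with a lower $r_1$ and with a higher $A_1$.

```python
# Illustrative parameter values only; none of them come from the paper.
ALPHA, BETA = 0.3, 0.5    # agricultural exponents (capital, labor), sum < 1
GAMMA, DELTA = 0.3, 0.5   # industrial exponents (capital, labor), sum < 1


def capital_demand(share, value_tfp, labor, labor_exp, rental):
    """Capital at which its marginal value product equals the rental rate."""
    return (share * value_tfp * labor ** labor_exp / rental) ** (1.0 / (1.0 - share))


def implicit_wage(share, value_tfp, labor, labor_exp, rental):
    """Marginal value product of labor once capital is chosen optimally."""
    k = capital_demand(share, value_tfp, labor, labor_exp, rental)
    return labor_exp * value_tfp * k ** share * labor ** (labor_exp - 1.0)


def solve(r1, a1, r2=0.10, a2=1.0, p=1.0, total_labor=1.0):
    """Find L1 such that w1 = w2 by bisection, then recover K, output, income."""
    lo, hi = 1e-9, total_labor - 1e-9
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        gap = (implicit_wage(ALPHA, a1, mid, BETA, r1)
               - implicit_wage(GAMMA, p * a2, total_labor - mid, DELTA, r2))
        if gap > 0:   # agriculture still pays more, so shift labor into it
            lo = mid
        else:
            hi = mid
    l1 = 0.5 * (lo + hi)
    l2 = total_labor - l1
    k1 = capital_demand(ALPHA, a1, l1, BETA, r1)
    k2 = capital_demand(GAMMA, p * a2, l2, DELTA, r2)
    f = a1 * k1 ** ALPHA * l1 ** BETA
    g = a2 * k2 ** GAMMA * l2 ** DELTA
    return {"L1": l1, "L2": l2, "K1": k1, "K2": k2, "F": f, "G": g,
            "I": f + p * g - r1 * k1 - r2 * k2}


base = solve(r1=0.10, a1=1.0)      # no program
subsidy = solve(r1=0.08, a1=1.0)   # subsidized loans: lower r1
tfp = solve(r1=0.10, a1=1.1)       # better production conditions: higher A1

for name, alt in (("lower r1", subsidy), ("higher A1", tfp)):
    changes = {key: ("up" if alt[key] > base[key] else "down") for key in base}
    print(name, changes)
```

Under these hypothetical values, both experiments move labor and capital toward agriculture and away from industry, which is the pattern the derivatives describe. The sketch is not a proof; it only illustrates the signs at one parameter point.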
Based on the above deduction, I get the following propositions:

Proposition 1 The subsidy, which lowers the capital price in the supported agricultural sector, will increase the labor allocation, capital input and production in the supported agricultural sector. At the same time, it will decrease the labor allocation, capital input and production in the unsupported industrial sector. Overall, it will increase the rural household's income.

Proposition 2 In addition, the increase of $A_1$ (the total factor productivity) in the supported agricultural sector will increase the labor allocation, capital input and production in the supported agricultural sector. At the same time, it will decrease the labor allocation, capital input and production in the unsupported sector. Overall, it will increase the rural household's income.

The economic intuition behind the model is straightforward. Take the decrease of $r_1$ as an example. If $r_1$ decreases, the capital cost decreases, and people will invest more capital in the agricultural sector. Since there is more capital in the agricultural sector, $MPL_1$ increases. Thus people will devote more labor to the agricultural sector. Since both capital and labor increase in the agricultural sector, agricultural output and income will increase. At the same time, since the labor supply is fixed, people will devote less labor to the industrial sector, which means that $MPK_2$ decreases. Then, people will invest less capital in the industrial sector. Since both capital and labor decrease in the industrial sector, industrial output and income will decrease. Overall, the increase in agricultural income is partly offset by the decrease in industrial income, but the net effect on rural income per capita is positive. The analysis for the increase of $A_1$ is similar and leads to the same conclusion.

Although the above model is a static setup, it is, in reality, a dynamic process.
The rural households wanted to maximize their income when the program kicked in, and they did get more income from their agricultural production. The high prices of agricultural products kept them engaged in agriculture, especially in grain production. However, they did not expect that non-farm income would take off after 1994, so they engaged less in the rural non-farm sector, such as rural industry. The greater increase in agricultural income in the supported counties was offset by a smaller increase in the non-farm sectors. It became less profitable to grow crops when prices went down significantly after 1997. The rural households that had engaged mainly in agriculture were locked into agricultural production, and it was harder for them to switch to non-farm activities. So, we can assume that they were at least a little worse off in income after 1997. The rapid transformation after 1994 was highly unexpected, so I did not integrate it into the model.

4 Data

In this paper, I use mainly county-level panel data from the Ministry of Agriculture (MOA) for the period 1990-2000. These data contain information on agricultural variables, including: the total population; the rural net income per capita; the total agricultural output in value terms; the total sown area for various agricultural products; the total output in tons for various agricultural products such as grain, cotton and oil crops; the total use of agricultural inputs such as total agricultural mechanical power,^28 fertilizer and membrane; the total number of livestock; the total number of TVE employees; and the rural industrial output.^29 The MOA data are based on reports made by village, township, and county officials. They are perhaps the only estimates available for all counties in the 1990s.^30 Beyond that, I also use the data the government used for choosing the NDP counties in 1993.
The data are from the National Bureau of Statistics (NBS hereafter) and contain variables such as rural net income per capita; the population; the size of the labor force; the total output of grain; local fiscal revenue and expenditure; the amount of savings in banks; the terrain condition of the county; whether or not the county is a minority county; whether or not the county is a border county; and whether or not the county is a revolutionary county. According to Park and Wang (2001), the rural income per capita statistics from NBS are generated through a reporting system supervised by the Division of Regional Economy under NBS.^31 Since the price levels changed dramatically in the 1990s and prices differed over time, I also use a set of provincial-level spatial price deflators constructed by Brandt and Holz (2006) to calculate the real values of the variables of interest in 1992 prices.

5 Empirical Strategy

Consider an equation characterizing the causal relationship between being counted as an NDP county, described by the dummy variable $NDP_i$, and outcome $Y_{it}$:

28 It is hard to standardize the use of agricultural machinery. In general, we use the total agricultural mechanical power as the indicator for it.
29 Some variables are not available for some years. For example, rural industrial output is available only until 1997.
30 Park and Wang (2001) discussed the shortcomings of the self-reported data in more detail. For example, they claimed that their interviews showed the reporting was subject to revisions by upper-level governments. There are still concerns about how the official statistics are determined, and an independent source of verification is needed.
31 If the county is a national rural household survey sample county (35% of all counties), the county-level NBS may or may not use the household data for the county average. For other counties, they might use the provincial household survey data or the MOA data.
For more details, please refer to Park and Wang (2001).

$$\begin{aligned}
\ln Y_{it} = {} & \alpha + c_i + Year_t + \sum_{\tau=1991}^{2000} NDP_i \times 1(Year = \tau)\,\beta_\tau + \sum_{\tau=1991}^{2000} Income400_{i,92} \times 1(Year = \tau)\,\theta_{1\tau} \\
& + \sum_{\tau=1991}^{2000} Income400_{i,92}^{2} \times 1(Year = \tau)\,\theta_{2\tau} + \sum_{\tau=1991}^{2000} Income400_{i,92}^{3} \times 1(Year = \tau)\,\theta_{3\tau} \\
& + \sum_{\tau=1991}^{2000} X_{i,90} \times 1(Year = \tau)\,\delta_\tau + \sum_{j=2}^{31} Z_j \times Year\,\phi_j + \varepsilon_{it}
\end{aligned} \tag{9}$$

where $NDP_i$ is a dummy indicating whether or not the county is an NDP county; $Income400_{i,92}$ is the standardized income, that is, the rural net income per capita in 1992 minus 400; $X_{i,90}$ is a vector of region-specific variables that includes total population and acreage in 1990; $\alpha$ is a constant; $c_i$ is a county fixed effect; $Year_t$ is a year fixed effect; and $Z_j$ is a provincial dummy that is interacted with year to control for provincial time trends. The dependent variable is the log of the outcome variable of interest. Because the data form a panel, I estimate a panel regression. I choose 1990 as the base year because, with the program running from 1994 to 2000, this allows us to detect whether there were pre-trends before the program.

Direct application of OLS to equation (9) may lead to biased estimates of $\beta_\tau$ for the usual reasons: there are unobservables that might cause the error term to be correlated with NDP status. The selection of NDP counties was not random. In general, because the NDP counties tended to be backward, poorer counties, we might expect divergence. However, the convergence story also seems plausible: the poor might have grown faster than the rich and caught up in the end. To tackle the problem, in addition to controlling for the initial log of income, I truncate the sample to include only those counties with income around the cutoff.

Even after that, there are still concerns. As I mentioned above, although the selection was based mainly on some objective characteristics, especially the rural net income per capita, there are other factors that might have affected the choice of counties.
Those selecting the counties had worked on poverty alleviation programs for years, so they had a good sense of poor counties' real situations and might have known which counties would benefit more from the program. Those unobserved factors could contaminate the estimates. In addition, there is some anecdotal evidence showing that lobbying may have played a role in the decision process.^32 If that is true, it can complicate the estimates. It is possible that the counties that were better at lobbying were more likely to have more resources or to make better use of their resources. In that case, the OLS estimate will be biased upward. However, it is also possible that officials in the counties that successfully lobbied for NDP status were more corrupt. They may have been more interested in getting funds from the upper-level governments than in developing the local economy. In that case, the OLS estimate will be biased downward. In China, whether or not to lobby depended largely on the top government officials, who may have had differing perceptions of the value of being an NDP county. Some may have thought that the title would bring the county more support, while others might have thought that the NDP title would scare potential investors away. The top leaders' preferences could also have affected the local economy, thus biasing the estimates.

I construct an IV using the discontinuity of the program assignment (a county with rural net income per capita below 400 was in principle included in the program) to handle the selection issue. As described above, the program assignment was determined mainly by the rural net income per capita in 1992. Figure 5 shows the relationship between the rural net income per capita and the program assignment for the new NDP counties.^33 We can see that the lower the rural net income per capita in the county, the more likely the county was to be included in the program. In particular, there was a sharp jump at around 400.
The probability of being an NDP county was around 80 percent if the county's rural net income per capita was just below 400. The probability dropped dramatically, to around 30 percent, if the county's rural net income per capita was just above 400. From Figure 5, we know that the rural net income per capita was a major determinant of being an NDP county, but other characteristics could also have played a role. Also from Figure 5, we know that the probability of program assignment did not jump from one to zero; this is a "fuzzy" regression discontinuity (Lee and Lemieux, 2010), so the causal effect of being an NDP county is:

$$\beta_{FRD} = \frac{\lim_{x \uparrow 400} E(Y \mid income = x) - \lim_{x \downarrow 400} E(Y \mid income = x)}{\lim_{x \uparrow 400} E(NDP \mid income = x) - \lim_{x \downarrow 400} E(NDP \mid income = x)} \tag{10}$$

The estimation in equation (10) is equivalent to an IV regression, where the reduced-form regression^34 is:

$$\begin{aligned}
\ln Y_{it} = {} & \alpha + c_i + Year_t + \sum_{\tau=1991}^{2000} Below400_i \times 1(Year = \tau)\,\beta_\tau + \sum_{\tau=1991}^{2000} Income400_{i,92} \times 1(Year = \tau)\,\theta_{1\tau} \\
& + \sum_{\tau=1991}^{2000} Income400_{i,92}^{2} \times 1(Year = \tau)\,\theta_{2\tau} + \sum_{\tau=1991}^{2000} Income400_{i,92}^{3} \times 1(Year = \tau)\,\theta_{3\tau} \\
& + \sum_{\tau=1991}^{2000} X_{i,90} \times 1(Year = \tau)\,\delta_\tau + \sum_{j=2}^{31} Z_j \times Year\,\phi_j + \varepsilon_{it}
\end{aligned} \tag{11}$$

where $Below400_i$ is a dummy indicating whether the county's rural net income per capita was below 400 in 1992. $\beta_\tau$ is our main parameter of interest; it measures the local average treatment effect at the point in the income distribution where the cutoff falls. This is an estimate based on compliers: the counties that would not have been selected as NDP counties and given generous support from 1994 to 2000 if their rural net income per capita had been above 400 in 1992, but would have been chosen as NDP counties and enjoyed the benefits if their income had been below 400 in 1992.

32 "Caijing," an influential magazine in China, reported on the lobbying for NDP county status in a 2008 case study.
33 Since the 400 cutoff applied only to the new NDP counties, I removed the old NDP counties (the first-wave NDP counties) from the program assignment analysis.
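The ratio in equation (10) can be illustrated with a very small example. The sketch below is an assumption-laden simplification, not the estimator used in this paper: it replaces the polynomial panel specification with plain means inside a bandwidth on each side of the cutoff, and it runs on made-up, noise-free toy data. The function name `fuzzy_rd_wald`, the toy incomes and outcomes, and the effect size of 0.2 are all hypothetical; only the cutoff of 400, the bandwidth of 100, and the assignment shares of roughly 80 and 30 percent echo the text.

```python
def fuzzy_rd_wald(rows, cutoff=400.0, bandwidth=100.0):
    """Wald estimate: jump in the mean outcome over jump in the treated share.

    rows is an iterable of (income_1992, ndp_status, outcome) tuples.
    Means are taken within `bandwidth` of the cutoff on each side.
    """
    below = [(d, y) for x, d, y in rows if cutoff - bandwidth <= x < cutoff]
    above = [(d, y) for x, d, y in rows if cutoff <= x <= cutoff + bandwidth]

    def mean(values):
        return sum(values) / len(values)

    # Denominator of the ratio: discontinuity in the probability of treatment.
    first_stage = mean([d for d, _ in below]) - mean([d for d, _ in above])
    # Numerator of the ratio: discontinuity in the outcome (reduced form).
    reduced_form = mean([y for _, y in below]) - mean([y for _, y in above])
    return reduced_form / first_stage, first_stage, reduced_form


# Hypothetical toy data: 8 of 10 counties below the cutoff are treated,
# 3 of 10 above it, and treatment shifts the (log) outcome by TRUE_EFFECT.
TRUE_EFFECT = 0.2
toy = []
for i in range(10):
    treated_below = 1 if i < 8 else 0
    treated_above = 1 if i < 3 else 0
    toy.append((350.0 + 5 * i, treated_below, 5.0 + TRUE_EFFECT * treated_below))
    toy.append((400.0 + 5 * i, treated_above, 5.0 + TRUE_EFFECT * treated_above))

estimate, first_stage, reduced_form = fuzzy_rd_wald(toy)
print(estimate, first_stage, reduced_form)
```

Because the toy outcome has no noise and no slope in income, dividing the reduced-form jump by the first-stage jump recovers the assumed effect up to floating-point error. With real data, the income polynomial, the controls, and the bandwidth choice all matter, which is why the regression form is used instead.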
5.1 RD validity

First, we might worry about sorting, which might cause the continuity assumption to be violated. As I mentioned above, the administrator used the 1992 data to make the decision, so sorting is technically not possible. Figure 6, which shows the histogram based on the rural net income per capita in 1992, does not show any evidence of sorting around the cutoff. Second, I carry out validity tests of the smoothness assumption using observables, eight of which are depicted graphically in Figures 7 and 8. Rural population, rural labor, fiscal income, fiscal expenditure, total agricultural mechanical power and fertilizer use vary smoothly at the boundary, with differences that are neither large enough to be important nor statistically significant. Third, Table 4 shows the comparison of various characteristics between the two groups around the 400 cutoff. I divided the counties into two groups according to their rural net income level in 1992: one group is composed of counties with income between 350 and 400, and the other group is composed of counties with income between 400 and 450. Table 4 shows that they are statistically the same. Furthermore, Figure 2 shows that the NDP counties are geographically clustered, which is true for both old and new NDP counties. However, when I trimmed the sample to counties with income between 350 and 450, Figure 9 shows that the counties are geographically dispersed.

34 The first-stage regression is hard to write out since there are multiple endogenous variables in the regression function. A joint test is needed in this case, and the F-value is shown in the regression results.

5.2 Specification

Misspecification of the functional form typically generates a bias in the treatment effect; the estimation of RD designs has generally been viewed as a nonparametric estimation problem since Hahn et al. (2001). However, the RD setting poses a particular problem because we need to estimate regressions at the cutoff point.
Nonparametric regression generally does not work very well for these boundary problems. Partly for this reason, and partly for simplicity, applied papers more often use parametric regression. In order to correct for the bias of using simple linear terms, researchers include polynomial functions of the forcing variable in the regression model. Lee and Lemieux (2010) suggest reporting a number of specifications to see to what extent the results are sensitive to the order of the polynomial. In practice, I follow their advice and try different orders of polynomial terms. The results are not sensitive to the order, so I show only the results with the linear terms.

The selection of bandwidth, which involves a tradeoff between validity and efficiency, is always a concern for the regression discontinuity approach. Since the sample size is not very big, I choose a bandwidth of 100 on each side to avoid the small-sample problem. As a robustness check, I also narrow the bandwidth to 50; the coefficients fluctuate slightly, although the variances are larger due to the smaller sample.

6 Results

According to the model's prediction, we expect that after the program, rural farmers in the NDP counties used more capital in and devoted more labor to agricultural production than did those in the non-NDP counties. As a result, agricultural output and agricultural income would have increased more in the NDP counties. At the same time, rural farmers used less capital and devoted less labor to industrial production than did their counterparts. Thus industrial output and income should have increased less in the NDP counties. And rural farmers in the treated counties would be expected to have had a greater increase in rural net income. Ideally, I would use the data to test the hypothesis for all of these variables. Unfortunately, I have only a portion of them. But the results shown below are consistent with the model's predictions, which supports my hypothesis.
Since the regressions using the full sample might be problematic, as I explained above, I will discuss the results based on the regressions using the trimmed samples.

Before the discussion, let me describe the dependent variables I will use for the regressions. First, I will use total agricultural mechanical power and fertilizer use as the proxies for agricultural inputs. In terms of agricultural output, I will use mainly the total grain output (in tons) for the discussion. In addition, I will also look at the total number of livestock, which could be an agricultural input or an output. In either case, I expect to see positive effects. Furthermore, I will look at the effect on agricultural income. For the industrial side, fewer data are available. I do have data on TVE employees. According to Naughton (2007), most rural laborers who engaged in industrial production before the late 1990s worked in TVEs. So, I will use TVE employees as the proxy for the labor force in the rural industrial sector. Unfortunately, I do not have proxy variables for capital use in the rural industrial sector or for industrial income, but I can look at the effect on rural industrial output. In the end, I will discuss the overall effect on the rural net income per capita.

I begin with the analysis of the impact of the poverty alleviation program on agricultural capital input. Table 5 and Table 6 show that the treated counties did have larger increases in total agricultural mechanical power and fertilizer use, respectively. The effects kept increasing after 1994 and peaked at around 1997.

As I mentioned in the background section, the government emphasized agricultural production, especially grain production. It is interesting to see whether that emphasis actually helped. Table 7 presents the impact of the program on grain production, and the results confirm that the poverty alleviation program did increase grain production.
The coefficients are significantly positive and stable after 1994.

Another focus of the poverty alleviation program was to increase access to drinking water. One of the main achievements claimed by the government is that the program gave 53.5 million persons and 48.4 million animals in the NDP counties access to drinking water. In poor counties, many farmers used working animals for agricultural production.^35 In addition, the program encouraged animal husbandry, so we can expect that farmers raised more livestock^36 when conditions improved and that they also demanded more livestock because of the needs of grain production. Table 8 confirms our expectation, showing that more livestock were raised in NDP counties than in non-NDP counties during the program.

However, as I discussed in the model section, although the program might have increased production in the supported sector, it might also have had crowding-out effects. Since the farmers had time constraints, they would be expected to put less effort into production in the unsupported sectors. According to the model, households also had an incentive to reduce their labor input in the industrial sector. Table 9 shows that TVE employment increased less in the supported counties than in the unsupported counties. Table 10 further confirms our supposition that rural industrial output increased less in the supported counties than in the unsupported counties.^37 Since we do not have a variable for capital input in rural industry, we cannot test the program effect on it.

35 According to China's Statistical Yearbook 1996 by NBS, the total number of large animals in 1995 was 158.716 million, and 88.12 million of them were used as working animals.
36 Livestock include cows, horses, donkeys, camels and mules.

Since the program had positive effects on grain production and livestock but negative effects on rural industrial production, the effect on rural income is ambiguous.
It depends largely on the prices or profitability of the products. Figure 6 shows the change in grain prices over time in China. The price increased dramatically in 1994, reached its peak in 1995 and remained high until 1996. From 1997, it began to fall sharply and stayed low until 2003. We can expect that rural income from agriculture increased during the period 1994-1996 and remained stable after 1997. Table 11 presents the results when we use agricultural income as the dependent variable. It confirms that the supported counties benefited from the increase in grain production and the increase in the prices of agricultural products during the period 1994-1997. However, after the prices of agricultural products fell, the treated counties did not experience an increase in agricultural income, although their grain production still increased more than that of the unsupported counties.

37 Because there was a reform of TVE ownership in the late 1990s, it is hard to compile statistics for TVEs, so the data are unavailable for the period after 1997.

The overall program effect on the rural net income per capita is shown in Table 12. If we use the whole sample, the NDP counties appear to have grown more slowly than the non-NDP counties: the coefficient of NDP is negative and significant in all years. However, the results using the trimmed samples are very different. The NDP coefficients are insignificant in all years. This comparison shows that the endogeneity problem is serious. Simple OLS or matching methods will be biased. In addition, the estimates are almost all negative, except for the period 1995-1997. The IV estimates are negative before 1994, turn positive in 1995-1997 and turn negative again after 1998. This implies that the program might have had a slightly positive effect in 1994-1997, when grain prices were very high. It is consistent with our story that the larger increase in agricultural income might have been offset by a smaller increase in other sources, such as wage income.
The main objective of the program was to increase the income of rural households by helping them develop their agricultural production. The program was successful, especially in grain production; however, the development of grain production crowded out other production through labor mobility, which policy makers might not have expected. As I show in panel A of Figure 3, wage income doubled from 1994 to 2000, while income from agriculture increased by less than 25 percent in this period. Although the rural people in the supported counties enjoyed the benefits of the high prices of agricultural products in 1994-1997, they might have been worse off when prices dropped, since they were less likely to engage in non-farm activities, which were more promising after 1997.

From the results, we can see that the supported counties did benefit from the program by producing more grain and, thus, having more agricultural income. However, this benefit seems to have been almost entirely offset by slower growth in other income sources, which played a more important role in the long run.

Through the analysis of the results, we know that people respond to regional policies through labor flows and capital flows. While it is always interesting to know the program's effect on the target industry, the effect on other industries should not be neglected. Results based on an overall evaluation can be deceiving, since the effect might be positive in one sector and negative in another, and the effects can cancel each other out. Policy makers need to take this response into consideration.

Place-based policies are intended to support some industries in lagging regions, but it is hard to know which industries should be supported. In China's case, agricultural support was one of the major reasons for the agricultural success of the early 1980s. However, things often change, and it might no longer have been efficient to support agriculture in the 1990s.
Place-based policies are, in general, launched by the government, so they are very inflexible. It is hard for the government to adjust to a new environment. In the case of China, the government's place-based program caused a big distortion. The government did not respond to the distortion when the grain price dropped because the policy was essentially "locked in" until 2000. In this respect, a people-based program is better, since individuals respond to change much more easily than governments do.

7 Conclusion

This paper made use of the regression-discontinuity approach to estimate the effect of China's second-wave place-based poverty-alleviation program. I found that the program successfully increased agricultural production, especially grain production, in line with the government's plan and expectations. However, the support for grain production crowded out other agricultural production and non-farm activities by driving labor out of the unsupported sectors. Although the rural people in the supported counties enjoyed more agricultural income at the beginning of the program due to the unusually high prices of agricultural products, the increase in income was offset by a smaller increase in other production, especially in non-farm activities. The dramatic drop in the prices of agricultural products made people worse off, since they were less likely to engage in the non-farm activities that played an increasingly important role in rural earnings.

The policy implications of this paper are far-reaching. First, it reveals that people will vote with their feet, so policy makers need to take the general-equilibrium effect into consideration. More importantly, it provides rigorous evidence that place-based poverty-alleviation programs did not work well and even caused distortion, leading to worse results.
When non-farm activity becomes more important, rather than clinging to the traditional method of helping the poor with agricultural production, China's government might need to try other, more efficient ways, such as people-based policies, which have been widely regarded as successful in Mexico and have been followed by many other countries.

Partly due to data availability, this paper could not fully address the following issues. Capital mobility is also important for understanding the transformation, but I did not have the necessary data to examine it. In addition, although the effect on rural income was not that encouraging, the people in the supported counties might have gained in welfare, since their living conditions were improved. Last but not least, the contribution of agriculture to improving conditions for the poor varies enormously, not only at different stages of development for a given country, but also across and within countries, because of local contexts. The government needs to take that into account.

References

[1] Barro, R., 1997. Determinants of Economic Growth: A Cross-Country Empirical Study. MIT Press, Cambridge.
[2] Bondonio, D. and R.T. Greenbaum (2007), "Do local tax incentives affect economic growth? What mean impacts miss in the analysis of enterprise zone policies," Regional Science and Urban Economics, 37, 121-136.
[3] Besley, T., Kanbur, R., 1993. "The principles of targeting, including the poor." In: Proceedings of a Symposium Organized by the World Bank and the International Food Policy Research Institute. The World Bank, Washington, D.C., pp. 67-90.
[4] Brandt, Loren & Holz, Carsten A, 2006. "Spatial Price Differences in China: Estimates and Implications." Economic Development and Cultural Change, vol. 55(1), pages 43-86, October.
[5] Cai, Fang, Dewen Wang, (2009) Migration and Poverty Alleviation in China, in Murphy, Rachel (eds) Labour Migration and Social Development in Contemporary China, London and New York: Routledge, Taylor & Francis Group, pp. 17-46.
[6] Carvalho, Alexandre, Somik Lall, Christopher Timmins, 2005, "Regional Subsidies and Industrial Prospects of Lagging Regions," World Bank Policy Research Working Paper No. 3843
[7] Caselli, F., Esquivel, G., Lefort, F., 1996. "Reopening the convergence debate: a new look at cross-country growth empirics." Journal of Economic Growth 1 (3), 363-389.
[8] Chen, S., Ravallion, M., 1996. "Data in transition: assessing rural living standards in southern China." China Economic Review 7 (1), 23-56.
[9] Chen, S., Ravallion, M., 2007. "China's (uneven) Progress Against Poverty." Journal of Development Economics 82 (1), 1-42.
[10] Chenery, Hollis, Moises Syrquin, 1976. Patterns of Development, 1950-1970, Oxford University Press.
[11] Cornia, G.A., Stewart, F., 1995. "Two errors of targeting." In: Van de Walle, D., Nead, K. (Eds.), Public Spending and the Poor: Theory and Evidence. Johns Hopkins University Press, Baltimore.
[12] Du, Yang & Park, Albert & Wang, Sangui, 2005. "Migration and rural poverty in China," Journal of Comparative Economics, Elsevier, vol. 33(4), pages 688-709, December
[13] Ellis, F. and N. Harris. 2004. "New Thinking About Urban and Rural Development." Keynote Paper for DFID Sustainable Development Retreat. London.
[14] Fan, S., P. Hazell, and S. Thorat. 2000. "Government Spending, Agricultural Growth, and Poverty in Rural India." American Journal of Agricultural Economics 82 (4): 1038-1051.
[15] Fan, S., C. Chan-Kang, and A. Mukherjee. 2005. "Rural and Urban dynamics and poverty: Evidence from China and India." Draft Paper, IFPRI, Washington, D.C.
[16] Fan, Shenggen, Ashok Gulati & Sukhadeo Thorat, 2008.
"Investment, subsidies, and pro-poor growth in rural India," Agricultural Economics, International Association of Agricultural Economists, vol. 39(2), pages 163-170 [17] Fan, Shenggen & Kanbur, Ravi & Zhang, Xiaobo, 2010. "China’s Regional Dispari- ties: Experience and Policy," Working Papers 57041, Cornell University, Department 29 of Applied Economics and Management. [18] Ferreira, P. C. (2004) “Regional Policy in Brazil: A Review.”Mimeo [19] Foster, Andrew D & Rosenzweig, Mark R, 2004. "Agricultural Productivity Growth, Rural Economic Diversity, and Economic Reforms: India, 1970-2000," Economic Development and Cultural Change, University of Chicago Press, vol. 52(3), pages 509-42, April. [20] Foster, Andrew D. & Rosenzweig, Mark R., 2008. "Economic Development and the Decline of Agricultural Employment," Handbook of Development Economics, Else- vier. [21] Gallup, J., Sachs, L.J., 1999. Geography and Economic Growth, Annual World Bank Conference on Development Economics 1998. World Bank, Washington, D.C. [22] Glaeser, E.L. and Joshua D. Gottlieb. 2008. The Economics of Place-Making Policies. NBER Working Paper No. W14373 [23] Gollin, D., S. Parente, and R. Rogerson. 2002, "The Role of Agriculture in Develop- ment" American Economic Review 92: 160-164 [24] Greenbaum, Robert T. and Jim Landers, 2009, “Why are State Policy Makers Still Proponents of Enterprise Zones? What Explains Their Action in the Face of a Preponderance of the Research?” International Regional Science Review, 32(4): 466-479 [25] Grosh, M., 1995. “Toward quantifying the trade-o¤: administrative costs and inci- dence in targeted programs.”In: Van de Walle, D., Nead, K. (Eds.), Public Spending and the Poor: Theory and Evidence. Johns Hopkins University Press, Baltimore. [26] Hahn, J., Todd, P., van der Klaauw, W., 2001. “Identi…cation and Estimation of Treatment E¤ects with a Regression-Discontinuity Design.”Econometrica 69(1), 201- 209. 
[27] Huang, Jikun, Scott Rozelle and Honglin Huang, 2006. "Fostering or Stripping Rural China: Modernizing Agriculture and Rural to Urban Capital Flows." The Developing Economies XLIV-1, 1-26.
[28] Imbens, Guido and Thomas Lemieux, 2008. "Regression discontinuity designs: A guide to practice." Journal of Econometrics 142(2), 615-635.
[29] Jalan, J., Ravallion, M., 1998. "Are there dynamic gains from a poor-area development program?" Journal of Public Economics 67, 65-85.
[30] Jiang, Y., Gao, H. (Eds.), 1997. China Central Government Finance for Poverty Alleviation (zhongyang caizheng fupin). China Financial Economics Press, Beijing.
[31] Johnston, B.F. and J.W. Mellor, 1961. "The Role of Agriculture in Economic Development." American Economic Review 51(4), 566-593.
[32] Johnson, Gale, Guoqing Song, 1999. Inflation and the Real Price of Grain, Food Security and Economic Reform. McMillan Press.
[33] Kahn, Matthew, 2010. "Cities, Economic Development and the Role of Place Based Policies." In: Appalachia and the Legacy of the War on Poverty.
[34] Lee, D., Thomas Lemieux, 2010. "Regression Discontinuity Designs in Economics." Journal of Economic Literature 48(2), 281-355.
[35] Li, H., 1997. Introduction to the state compulsory education program in poor areas. People's Daily (renmin ribao), March 27.
Office of the Leading Group for Economic Development in Poor Areas, 1989. Outlines of Economic Development in China's Poor Areas. Agricultural Publishing House, Beijing.
[36] Li, Hongbin, Lingsheng Meng, 2010. "Evaluating China's Poverty Alleviation Program: A Regression Discontinuity Approach." Working Paper.
[37] Lu, Feng, Kaixiang Peng, 2002. "The Interaction of the Grain Market and the Macro Economy." Peking University Working Paper.
[38] Lu, Feng, 2008. "The Supply and Demand of Grain and the Price of Grain in China." Management World (in Chinese).
[39] Maxwell, S., I. Urey, and C. Ashley, 2001. "Emerging Issues in Rural Development: An Issues Paper." Development Policy Review 19(4), 395-426.
[40] McCrary, J., 2007. "Manipulation of the Running Variable in the Regression Discontinuity Design: A Density Test." Journal of Econometrics 142(2), 698-714.
[41] Michael Greenstone and Adam Looney, October 2010. "An Economic Strategy to Renew American Communities." Hamilton Project, Brookings Institution.
[42] Naughton, Barry, 2007. The Chinese Economy: Transitions and Growth. MIT Press, Cambridge, Massachusetts.
[43] Olfert, Rose, Julio Berdegué, Javier Escobal, Benjamin Jara, Félix Modrego. "Places for place-based policies." Rimisp Latin American Center for Rural Development working paper.
[44] Park, Albert, Sangui Wang, Guobao Wu, 2002. "Regional Poverty Targeting in China." Journal of Public Economics 86, 123-153.
[45] Park, Albert, Sangui Wang, 2010. "Community-based development and poverty alleviation: An evaluation of China's poor village investment program." Journal of Public Economics 94, 790-799.
[46] Piazza, A., Liang, E., 1997. "The state of poverty in China: its causes and remedies." Paper presented at a conference on Unintended Social Consequences of Economic Reform in China, Fairbanks Center, Harvard University, May.
[47] Puga, Diego, 2002. "European regional policies in light of recent location theories." Journal of Economic Geography 2(4), 373-406.
[48] Ravallion, M., 1993. "Poverty alleviation through regional targeting: a case study for Indonesia." In: Hoff, K., Braverman, A., Stiglitz, J. (Eds.), The Economics of Rural Organization: Theory, Practice, and Policy. Oxford University Press for the World Bank, Oxford; New York; Toronto and Melbourne, pp. 453-467.
[49] Ravallion, M., Lipton, M., 1995. "Poverty and policy." In: Behrman, J., Srinivasan, T. (Eds.), Handbook of Development Economics, Volume III. North Holland, Amsterdam.
[50] Ravallion, M., Datt, G., 1996. "India's checkered history in the fight against poverty: Are there lessons for the future?" Economic and Political Weekly, 2479-2485.
[51] Ravallion, Martin, Datt, Gaurav, 1996. "How Important to India's Poor Is the Sectoral Composition of Economic Growth?" World Bank Economic Review 10(1), 1-25.
[52] Ravallion, M., Jalan, J., 1996. "Growth divergence due to spatial externalities." Economics Letters 53(2), 227-232.
[53] Ravallion, M., Jalan, J., 1999. "China's lagging poor areas." American Economic Review, Papers and Proceedings 89(2), 301-305.
[54] Ravallion, M., Wodon, Q., 1999. "Poor areas, or only poor people?" Journal of Regional Science 39(4), 689-711.
[55] Ravi Kanbur and Anthony J. Venables, 2005. "Spatial Inequality and Development: Overview of UNU-WIDER Project." September. Revised version published in David Held and Ayse Kaya (Eds.), Global Inequality. Polity Press, 2007.
[56] Riskin, C., 1994. "Chinese rural poverty: marginalized or dispersed?" American Economic Review, Papers and Proceedings 84(2), 281-284.
[57] Rodríguez-Pose, Andrés, Ugo Fratesi, 2003. "Between development and social policies: the impact of European Structural Funds in Objective 1 Regions." European Economy Group Working Papers 28, European Economy Group.
[58] Rozelle, S., Park, A., Bezinger, V., Ren, C., 1998. "Targeted Poverty Investments and Economic Growth in China." World Development.
[59] Rozelle, Scott, Linxiu Zhang, Jikun Huang, 2003. "China's War on Poverty: Assessing Targeting and the Growth Impacts of Poverty Programs." Journal of Chinese Economics and Business Studies, 301-317.
[60] Squire, L., 1993. "Fighting poverty." American Economic Review 83(2), 377-382.
[61] Syrquin, M., 1988. "Patterns of structural change." In: Chenery, H., Srinivasan, T.N. (Eds.), Handbook of Development Economics, vol. 1. Elsevier Science.
[62] Thorbecke, E. and H.S. Jung, 1996. "A Multiplier Decomposition Method to Analyze Poverty Alleviation." Journal of Development Economics 48, 225-252.
[63] Timmer, C.P., 1988. "The Agricultural Transformation." In: Chenery, H., Srinivasan, T.N. (Eds.), Handbook of Development Economics, Vol. 1. North Holland, Amsterdam.
[64] Timmer, P., 2002. "Agriculture and Economic Development." In: Gardner, B., Rausser, G. (Eds.), Handbook of Agricultural Economics, Vol. 2. Elsevier Science B.V., Amsterdam, pp. 1487-1546.
[65] Tong, Z., Rozelle, S., Stone, B., Dehua, J., Jiyuan, C., Zhikang, X., 1994. "China's experience with market reform for commercialization of agriculture in poor areas." In: von Braun, J., Kennedy, E. (Eds.), Agricultural Commercialization, Economic Development, and Nutrition. Johns Hopkins Press, Baltimore and London, pp. 119-140.
[66] Vogel, S.J., 1994. "Structural Changes in Agriculture: Production Linkages and Agricultural Demand-Led Industrialization." Oxford Economic Papers 46(1), 136-156.
[67] World Bank, 2001. "China: Overcoming Rural Poverty." World Bank, Washington, D.C.
[68] World Bank, 2005a. "Agricultural Growth for the Poor: An Agenda for Development." Directions in Development Paper, Washington, D.C.
[69] World Bank, 2005b. "Agriculture, Rural Development and Pro-poor Growth: Country Experiences in the Post-Reform Era." Washington, D.C.
[70] World Bank, 2008. Accelerating Growth and Development in the Lagging Regions of India. The World Bank, Washington, D.C.
[71] World Bank, 2009. World Development Report: Reshaping Economic Geography. The World Bank, Washington, D.C.
[72] World Bank, 2009. "From Poor Areas to Poor People: China's Evolving Poverty Reduction Agenda." World Bank, Washington, D.C.
[73] Van de Walle, D., 1998. "Assessing the welfare impacts of public spending." World Development 26(3), 365-379.
Van de Walle, D., Nead, K. (Eds.), 1995. Public Spending and the Poor: Theory and Evidence. Johns Hopkins University Press, Baltimore.
World Bank, 1992. China: Strategies for Reducing Poverty in the 1990s. World Bank Country Study.
[74] Xie, Y., 1994. "Description of the capacity for mobilizing social forces to alleviate poverty (shehui fuping gongneng de yige telie)." Tribune of Economic Development (kaifa luntan) 6.
[75] Zhu and Jiang, 1996. Public Works and Poverty Alleviation in Rural China. Nova Science Publishers, New York.
[76] Ziliak, James P., forthcoming. "The Appalachian Regional Development Act and Economic Change." In: Ziliak, J.P. (Ed.), Appalachian Legacy: Economic Opportunity after the War on Poverty. Brookings Institution Press, 2011.

Figure 1: Regional Inequality in Per Capita Consumption
Source: Fan et al. (2010). The regional inequality measures are the Gini Coefficient and the Theil Index (with c=1), calculated by the authors based on population-weighted real per capita consumption at the provincial level in rural and urban areas. The data are from Comprehensive Statistical Data and Materials on 50 Years of New China (China National Bureau of Statistics, 2000) and various issues of China Statistical Yearbook (China National Bureau of Statistics, various issues).

Figure 2: The Geographical Distribution of NDP Counties
Note: The chocolate-colored counties, which I call the Old NDP Counties, are included in both the first-wave and the second-wave programs. The light-orange counties, which I call the New NDP Counties, are newly included in the second-wave program. The yellow counties indicate other counties.

Figure 3, Panel A: Composition of Rural Net Income Per Capita in Values
[Chart, 1990-2009; vertical axis in yuan. Series: Wage Income; Household Business-Plantation; Household Business-Other Agricultural Production; Household Business-Other than Agriculture; Transfer and Property Income.]
Note: 1) All values are expressed in 1992 prices, with the 1992 price level normalized to 1. 2) There are three sources of rural net income: (1) Household's own business.
The rural household can engage in agricultural production, which includes plantation, forestry, husbandry, fishery and side industries (the last four make up "other agricultural production"). Beyond that, the rural household can also engage in off-farm activities such as craft, manufacturing or other service industries (all of these are part of "other than agriculture"). (2) Wage income. The farmers work in firms and earn wages. (3) Transfer and property income, which includes income transfers, bank interest and stock dividends. The two blue lines indicate year 1994 and year 2000. Source: China Statistical Yearbook 2011.

Figure 3, Panel B: Composition of Rural Income Per Capita in Percentage
[Chart, 1990-2009; vertical axis is the share in percent. Series: Wage Income; Household Business-Plantation; Household Business-Other Agricultural Production; Household Business-Other than Agriculture; Transfer and Property Income.]
Note: There are three sources of rural net income: (1) Household's own business. The rural household can engage in agricultural production, which includes plantation, forestry, husbandry, fishery and side industries (the last four make up "other agricultural production"). Beyond that, the rural household can also engage in off-farm activities such as craft, manufacturing or other service industries (all of these are part of "other than agriculture"). (2) Wage income. The farmers work in firms and earn wages. (3) Transfer and property income, which includes income transfers, bank interest and stock dividends. The two blue lines indicate year 1994 and year 2000. Source: China Statistical Yearbook 2011.
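
The constant-price values in Panel A and the percentage shares in Panel B can be illustrated with a short calculation. The following is a minimal sketch, not the procedure used to build the figure: the function names, the price index value, and the income figures are hypothetical and serve only to show deflating nominal components to 1992 prices (1992 = 1) and computing each source's share of total net income.

```python
from typing import Dict


def to_constant_prices(nominal: Dict[str, float], price_index: float) -> Dict[str, float]:
    """Deflate nominal income components by a price index equal to 1.0 in the base year (1992)."""
    if price_index <= 0:
        raise ValueError("price_index must be positive")
    return {source: value / price_index for source, value in nominal.items()}


def income_shares(components: Dict[str, float]) -> Dict[str, float]:
    """Return each component as a percentage of total net income."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total income must be positive")
    return {source: 100.0 * value / total for source, value in components.items()}


if __name__ == "__main__":
    # Hypothetical nominal income per capita (yuan) in some year after 1992.
    nominal_income = {
        "wage": 300.0,
        "plantation": 450.0,
        "other_agricultural": 150.0,
        "non_agricultural": 60.0,
        "transfer_and_property": 40.0,
    }
    # Hypothetical price level relative to 1992 (1992 = 1).
    real_income = to_constant_prices(nominal_income, price_index=1.25)
    for source, share in income_shares(real_income).items():
        print(f"{source}: {real_income[source]:.1f} yuan at 1992 prices, {share:.1f}% of total")
```

Because every component is divided by the same index, deflating changes the levels shown in Panel A but leaves the shares in Panel B unchanged.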

Figure 4: Grain Price in the Rural Free Market in China (Yuan/KG)
[Line chart, 1980-2006. Series: Nominal Price (Yuan/KG); Real Price (1978 = 1).]
Source: Lu (2007). This figure shows only the grain price in the rural free market. For more detailed information on the quota price and the negotiated price, please refer to Huang et al. (2006). The two solid lines indicate year 1994 and year 2000.

Figure 5: Program Assignment of the Second-wave Poverty Alleviation Program
[Plot. Horizontal axis: Rural Net Income Per Capita in 1992 (200 to 800). Vertical axis: Program Assignment (0 to 1).]
Note: The X axis indicates the rural net income per capita in 1992, and the Y axis indicates the program assignment. The blue solid line plots nonparametric predictions of the program assignment from an unweighted uniform kernel smoother with a bandwidth of 0.05 for the counties with rural net income per capita between 200 and 800 in 1992. The red line marks a rural net income per capita of 400.

Figure 6: Histogram of Rural Net Income Per Capita in 1992
[Histogram. Horizontal axis: rural_income (0 to 2000). Vertical axis: Frequency.]
Note: The red line marks a rural net income per capita of 400.

Figure 7: RD Validity: Changes of Variables around the Cutoff (I)
Note: The blue line marks a rural net income per capita of 400. The blue dots are the scatter plots. The red solid line plots nonparametric predictions of the various variables from an unweighted uniform kernel smoother with a bandwidth of 0.1 for the counties with rural net income per capita between 200 and 1000 in 1992.

Figure 8: RD Validity: Changes of Variables around the Cutoff (II)
Note: The blue line marks a rural net income per capita of 400. The blue dots are the scatter plots.
The red solid line plots nonparametric predictions of the various variables from an unweighted uniform kernel smoother with a bandwidth of 0.1 for the counties with rural net income per capita between 200 and 1000 in 1992.

Figure 9: RD Validity: Geographic Distribution of the Trimmed Samples
Note: All the counties included in the first-wave poverty alleviation program are excluded from the comparison here. 350-400 is the group of counties with rural net income per capita between 350 and 400 in 1992; 400-450 is the group of counties with rural net income per capita between 400 and 450 in 1992.

Table 1. Comparison of Interest Rates on Commercial and Subsidized Loans

        Nominal interest rate (%)       Real interest rate (%)          Inflation rate
Year    Subsidized      Commercial      Subsidized      Commercial      (CPI, %)
1991    2.88            8.64            -0.5            5.07            3.4
1992    2.88            8.64            -3.31           2.11            6.4
1993    2.88            9.36            -10.3           -4.66           14.7
1994    2.88            10.98           -17.1           -10.57          24.1
1995    2.88            12.06           -12.14          -4.3            17.1
1996    2.88            10.98           -5              2.47            8.3
1997    2.88            8.64            0.08            5.68            2.8
1998    2.88            7.92            3.71            8.79            -0.8
1999    2.88            5.85            4.34            7.35            -1.4
2000    2.88            5.85            2.47            5.43            0.4

Note: The nominal interest rates for subsidized loans and commercial loans are benchmark interest rates set by China's central bank. The real interest rates are adjusted for inflation; the reported values are consistent with (1 + nominal rate)/(1 + inflation rate) - 1. All of the interest rates are one-year interest rates. Source: People's Bank of China

Table 2.
China's Central Government Poverty Alleviation Funds 1986-2000 (Billion Yuan at current price)

        Subsidized   Share    Food for   Share    MOF development   Share    Total
Year    loans        (%)      work       (%)      funds             (%)
1986    2.3          54.76    0.9        21.43    1                 23.81    4.2
1987    2.3          54.76    0.9        21.43    1                 23.81    4.2
1988    3.1          75.61    0          0        1                 24.39    4.1
1989    3.1          73.81    0.1        2.38     1                 23.81    4.2
1990    3.1          64.58    0.6        12.5     1.1               22.92    4.8
1991    3.6          46.15    1.8        23.08    2.4               30.77    7.8
1992    4.1          49.4     1.6        19.28    2.6               31.33    8.3
1993    3.5          37.63    3          32.26    2.8               30.11    9.3
1994    4.6          40       4          34.78    2.9               25.22    11.5
1995    4.6          40.35    4          35.09    2.8               24.56    11.4
1996    5.5          49.55    4          36.04    1.6               14.41    11.1
1997    8.5          47.49    4          22.35    5.4               30.17    17.9
1998    10           49.5     5          24.75    5.2               25.74    20.2
1999    15           58.37    6          23.35    4.7               18.29    25.7
2000    15           56.6     6          22.64    5.5               20.75    26.5
Total   88.3         55.45    41.9       27.44    41                17.11    171.2

Source: Park et al. (2002) and Dongmei Liu (2003)

Table 3. Comparison Between Fiscal Revenue and Poverty Alleviation Funds

Columns, all averages over NDP counties: (1) average local revenue; (2) average total revenue; (3) average poverty alleviation funds; (4) average poverty alleviation funds (FFW and MOF grant).

Year    (1)     (2)      (3)     (4)     (4)/(2) (%)   (3)/(2) (%)
1994    1746    4531     1639    878     19            36
1995    2369    5460     1655    895     16            30
1996    3166    7003     1824    895     13            26
1997    3686    8416     2584    1149    14            31
1998    4056    8989     3091    1402    16            34
1999    4156    10213    4358    1824    18            43
2000    4435    11766    4476    1959    17            38

Note: (1) NDP counties here are the NDP counties included in the second-wave poverty alleviation program; (2) local revenue includes locally collected taxes and fees; (3) total revenue includes not only local revenue but also transfers from upper-level governments. However, it does include extra-budget income, such as land-sale income. (4) column 4 includes only the food-for-work funds and the MOF development grant.

Table 4.
Comparison of Characteristics Between Counties in Different Income Intervals

                                     350-400           400-450           Diff      P-value
Population                           35.58 (3.56)      36.39 (2.69)      -0.81     0.86
Rural population                     32.32 (3.31)      33.61 (2.53)      -1.28     0.76
Rural labor                          15.71 (2.05)      15.79 (1.30)      -0.08     0.97
Rural labor in agriculture           12.5 (1.64)       13.28 (1.07)      -0.78     0.68
Bank deposit                         12679 (1252)      15098 (1876)      -2419     0.37
Old revolutionary counties           0.06 (0.03)       0.09 (0.03)       0.03      0.40
Minority counties                    0.37 (0.06)       0.30 (0.04)       0.07      0.32
Fiscal revenue                       2682 (274)        2335 (137)        347       0.21
Balanced fiscal income               4493 (368)        3392 (185)        500       0.18
Balanced fiscal expenditure          4541 (332)        4267 (175)        274       0.43
Agricultural mechanical power (kw)   58689 (10282)     78155 (8567)      -9466     0.49
Fertilizer use (tons)                8085 (1322)       9358 (1105)       -1273     0.47
Agricultural output                  22470 (3310)      24568 (2227)      -2098     0.59
Industrial output                    19559 (3777)      18639 (3584)      920       0.84
Grain output (tons)                  106840 (11913)    127128 (10749)    -20288    0.22
Government employees                 7539 (532)        7979 (493)        -440      0.56
Fiscal transfer                      1313 (126)        1181 (74)         133       0.33
Observations                         65                105

Note: All the counties that are included in the first-wave poverty alleviation program are excluded from the comparison here. 350-400 is the group of counties with rural net income per capita between 350 and 400 in 1992; 400-450 is the group of counties with rural net income per capita between 400 and 450 in 1992.

Table 5.
The Effects of Poverty Alleviation Program on Total Agricultural Mechanical Power: Coefficients of the Interactions Between Dummies Indicating Year and the Program Assignment
Dependent Variable: Log of the Total Agricultural Mechanical Power

                Full Sample                             300-500
                OLS                 IV                  OLS                 IV
Year            (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)
1991            -0.00     -0.01     0.02      0.02      0.00      0.01      0.09      0.08
                (0.02)    (0.02)    (0.04)    (0.04)    (0.03)    (0.03)    (0.12)    (0.11)
1992            0.01      0.00      0.02      0.02      0.00      0.01      0.07      0.07
                (0.02)    (0.02)    (0.05)    (0.05)    (0.03)    (0.03)    (0.09)    (0.12)
1993            0.02      0.02      0.06      0.06      0.02      0.02      0.11      0.12
                (0.02)    (0.02)    (0.05)    (0.05)    (0.04)    (0.03)    (0.12)    (0.12)
1994            0.02      0.01      0.12***   0.10***   0.02      0.03      0.17      0.17
                (0.02)    (0.02)    (0.05)    (0.05)    (0.04)    (0.04)    (0.12)    (0.12)
1995            -0.00     -0.00     0.06      0.06      0.01      0.01      0.24**    0.24**
                (0.02)    (0.02)    (0.05)    (0.05)    (0.05)    (0.04)    (0.12)    (0.12)
1996            -0.01     -0.02     0.10**    0.10**    0.05      0.06      0.27**    0.26**
                (0.02)    (0.02)    (0.05)    (0.05)    (0.05)    (0.04)    (0.12)    (0.12)
1997            0.01      0.01      0.14**    0.16**    0.04      0.05      0.37***   0.39***
                (0.02)    (0.02)    (0.06)    (0.06)    (0.05)    (0.05)    (0.12)    (0.12)
1998            0.03      0.02      0.14**    0.16**    0.00      0.00      0.26*     0.26*
                (0.02)    (0.02)    (0.06)    (0.06)    (0.05)    (0.05)    (0.12)    (0.12)
1999            0.03*     0.03*     0.14**    0.18***   -0.00     0.00      0.29**    0.30**
                (0.02)    (0.02)    (0.06)    (0.06)    (0.06)    (0.05)    (0.12)    (0.12)
2000            0.03*     0.04*     0.16**    0.20***   -0.01     -0.01     0.25**    0.28**
                (0.02)    (0.02)    (0.06)    (0.06)    (0.06)    (0.06)    (0.12)    (0.12)
Main controls   Yes       Yes       Yes       Yes       Yes       Yes       Yes       Yes
Other controls  No        Yes       No        Yes       No        Yes       No        Yes
Observations    16362     16362     16362     16362     3421      3421      3421      3421
F-Value: 39 40

Note: The regression function is equation (9), and the dependent variable is the log of the total agricultural mechanical power in the years from 1991 to 2000. 300-500 means that the sample used in the regression contains the counties with rural net income per capita between 300 and 500 in 1992 but excludes the counties that were included in the first-wave poverty alleviation program. Standard errors are clustered at the county level.
* indicates significance at the 10% level, ** indicates significance at the 5% level, *** indicates significance at the 1% level.

Table 6. The Effects of Poverty Alleviation Program on Fertilizer Use: Coefficients of the Interactions Between Dummies Indicating Year and the Program Assignment
Dependent Variable: Log of the Fertilizer Use

                Full Sample                             300-500
                OLS                 IV                  OLS                 IV
Year            (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)
1991            -0.01     -0.01     0.04      0.04      -0.01     -0.02     -0.01     -0.02
                (0.02)    (0.02)    (0.08)    (0.08)    (0.05)    (0.05)    (0.14)    (0.14)
1992            0.03      0.01      0.02      0.02      0.01      0.01      -0.02     -0.02
                (0.02)    (0.02)    (0.08)    (0.08)    (0.05)    (0.05)    (0.14)    (0.14)
1993            0.02      0.00      0.00      -0.02     0.02      0.03      0.01      0.02
                (0.03)    (0.03)    (0.08)    (0.08)    (0.05)    (0.05)    (0.14)    (0.14)
1994            0.03      0.02      0.04      0.02      0.06      0.05      0.11      0.14
                (0.03)    (0.03)    (0.08)    (0.08)    (0.05)    (0.05)    (0.14)    (0.15)
1995            0.02      0.01      -0.06     -0.04     0.06      0.05      0.12      0.13
                (0.03)    (0.03)    (0.08)    (0.08)    (0.05)    (0.05)    (0.14)    (0.14)
1996            0.01      -0.00     0.10      0.12      0.03      0.04      0.16      0.19
                (0.03)    (0.03)    (0.08)    (0.08)    (0.05)    (0.05)    (0.14)    (0.14)
1997            0.03      0.02      0.14*     0.14*     0.05      0.05      0.22*     0.23*
                (0.03)    (0.03)    (0.08)    (0.08)    (0.05)    (0.05)    (0.14)    (0.14)
1998            0.01      -0.01     0.12      0.12      0.04      0.03      0.23*     0.25*
                (0.03)    (0.03)    (0.08)    (0.08)    (0.05)    (0.05)    (0.14)    (0.14)
1999            0.02      0.01      0.14*     0.12      0.10**    0.09**    0.18      0.19
                (0.03)    (0.03)    (0.08)    (0.08)    (0.05)    (0.05)    (0.14)    (0.14)
2000            0.02      0.02      0.04      0.06      0.07      0.08      0.12      0.11
                (0.03)    (0.03)    (0.08)    (0.08)    (0.05)    (0.05)    (0.14)    (0.14)
Main controls   Yes       Yes       Yes       Yes       Yes       Yes       Yes       Yes
Other controls  No        Yes       No        Yes       No        Yes       No        Yes
Observations    16362     16362     16362     16362     3421      3421      3421      3421
F-Value: 39 40

Note: The regression function is equation (9), and the dependent variable is the log of the total fertilizer use in the years from 1991 to 2000. 300-500 means that the sample used in the regression contains the counties with rural net income per capita between 300 and 500 in 1992 but excludes the counties that were included in the first-wave poverty alleviation program.
Standard errors are clustered at the county level. * indicates significance at the 10% level, ** indicates significance at the 5% level, *** indicates significance at the 1% level.

Table 7. The Effects of Poverty Alleviation Program on Total Grain Output: Coefficients of the Interactions Between Dummies Indicating Year and the Program Assignment
Dependent Variable: Log of the Total Grain

                Full Sample                             300-500
                OLS                 IV                  OLS                 IV
Year            (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)
1991            -0.02     -0.02     0.04      0.02      -0.01     0.02      0.09      0.08
                (0.02)    (0.02)    (0.06)    (0.06)    (0.04)    (0.04)    (0.11)    (0.10)
1992            -0.01     -0.01     -0.06     -0.06     0.02      0.02      -0.02     -0.01
                (0.02)    (0.02)    (0.04)    (0.04)    (0.04)    (0.04)    (0.11)    (0.10)
1993            -0.03     -0.03     0.00      0.02      -0.07     -0.06     0.06      0.06
                (0.02)    (0.02)    (0.06)    (0.06)    (0.04)    (0.04)    (0.11)    (0.10)
1994            -0.01     -0.01     0.02      0.02      0.05      0.05      0.17      0.18*
                (0.02)    (0.02)    (0.06)    (0.06)    (0.04)    (0.04)    (0.11)    (0.10)
1995            0.06**    0.04*     0.10*     0.14**    0.10***   0.08**    0.20*     0.22**
                (0.02)    (0.02)    (0.06)    (0.06)    (0.04)    (0.04)    (0.11)    (0.10)
1996            0.09***   0.07***   0.14**    0.16***   0.14***   0.13***   0.26**    0.29**
                (0.02)    (0.02)    (0.06)    (0.06)    (0.04)    (0.04)    (0.11)    (0.10)
1997            0.08***   0.07***   0.14**    0.18***   0.12***   0.11***   0.30***   0.32***
                (0.02)    (0.02)    (0.06)    (0.06)    (0.04)    (0.04)    (0.11)    (0.10)
1998            0.08***   0.07***   0.12**    0.16***   0.14***   0.13***   0.20*     0.22**
                (0.02)    (0.02)    (0.06)    (0.06)    (0.04)    (0.04)    (0.11)    (0.10)
1999            0.08**    0.06***   0.14**    0.16***   0.12***   0.11***   0.21*     0.20**
                (0.02)    (0.02)    (0.06)    (0.04)    (0.04)    (0.04)    (0.11)    (0.11)
2000            0.04      0.03      0.10*     0.10*     0.03      0.03      0.15      0.17
                (0.03)    (0.03)    (0.06)    (0.06)    (0.04)    (0.04)    (0.12)    (0.11)
Main controls   Yes       Yes       Yes       Yes       Yes       Yes       Yes       Yes
Other controls  No        Yes       No        Yes       No        Yes       No        Yes
Observations    16362     16362     16362     16362     3421      3421      3421      3421
F-Value: 40 39

Note: The regression function is equation (9), and the dependent variable is the log of the total grain output in tons in the years from 1991 to 2000.
300-500 means that the sample used in the regression contains the counties with rural net income per capita between 300 and 500 in 1992 but excludes the counties that were included in the first-wave poverty alleviation program. Standard errors are clustered at the county level. * indicates significance at the 10% level, ** indicates significance at the 5% level, *** indicates significance at the 1% level.

Table 8. The Effects of Poverty Alleviation Program on Number of Livestock: Coefficients of the Interactions Between Dummies Indicating Year and the Program Assignment
Dependent Variable: Log of the Number of Livestock

                Full Sample                             300-500
                OLS                 IV                  OLS                 IV
Year            (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)
1991            0.00      0.00      0.02      0.02      -0.01     -0.01     0.01      0.01
                (0.01)    (0.01)    (0.06)    (0.06)    (0.02)    (0.02)    (0.06)    (0.08)
1992            0.00      0.00      0.09      0.08      0.00      0.00      0.04      0.04
                (0.01)    (0.01)    (0.06)    (0.06)    (0.03)    (0.03)    (0.08)    (0.08)
1993            0.01      0.01      0.06      0.04      0.01      0.01      0.10      0.11
                (0.02)    (0.02)    (0.06)    (0.06)    (0.03)    (0.03)    (0.08)    (0.08)
1994            -0.02     -0.02     0.10*     0.11**    0.05*     0.04*     0.22***   0.23**
                (0.02)    (0.02)    (0.06)    (0.06)    (0.03)    (0.03)    (0.07)    (0.08)
1995            -0.01     -0.01     0.02      0.03      0.06**    0.06**    0.30***   0.31**
                (0.03)    (0.03)    (0.07)    (0.07)    (0.03)    (0.03)    (0.08)    (0.08)
1996            -0.01     -0.01     0.04      0.04      0.08**    0.07**    0.25***   0.26***
                (0.03)    (0.03)    (0.07)    (0.07)    (0.03)    (0.03)    (0.09)    (0.09)
1997            0.03      0.03      0.10      0.13*     0.07**    0.07**    0.27***   0.28***
                (0.03)    (0.03)    (0.07)    (0.07)    (0.03)    (0.03)    (0.09)    (0.10)
1998            -0.00     -0.00     0.06      0.08      0.04      0.04      0.24***   0.24***
                (0.03)    (0.03)    (0.07)    (0.07)    (0.03)    (0.03)    (0.09)    (0.09)
1999            0.02      0.02      0.08      0.10      0.03      0.03      0.25***   0.25***
                (0.03)    (0.03)    (0.07)    (0.07)    (0.04)    (0.04)    (0.09)    (0.09)
2000            0.01      0.01      -0.00     0.02      0.04      0.04      0.23***   0.23***
                (0.03)    (0.03)    (0.07)    (0.07)    (0.03)    (0.03)    (0.09)    (0.09)
Main controls   Yes       Yes       Yes       Yes       Yes       Yes       Yes       Yes
Other controls  No        Yes       No        Yes       No        Yes       No        Yes
Observations    16362     16362     16362     16362     3421      3421      3421      3421
F-Value: 39 40

Note: The regression function is equation (9), and the dependent variable is the log of the total number of livestock in the years from 1991 to 2000. * indicates significance at the 10% level, ** indicates significance at the 5% level, *** indicates significance at the 1% level.

Table 9. The Effects of Poverty Alleviation Program on Labor Forces in TVEs: Coefficients of the Interactions Between Dummies Indicating Year and the Program Assignment
Dependent Variable: Log of the Labor Forces in TVEs

                Full Sample                             300-500
                OLS                 IV                  OLS                 IV
Year            (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)
1991            -0.00     0.03      -0.02     -0.02     0.00      0.01      -0.09     -0.05
                (0.02)    (0.02)    (0.10)    (0.10)    (0.05)    (0.05)    (0.15)    (0.14)
1992            -0.04     -0.02     -0.02     -0.02     -0.01     -0.02     0.05      0.02
                (0.03)    (0.02)    (0.10)    (0.10)    (0.05)    (0.05)    (0.15)    (0.14)
1993            -0.07**   -0.02     0.00      -0.00     -0.03     -0.02     0.03      0.03
                (0.03)    (0.02)    (0.10)    (0.10)    (0.06)    (0.06)    (0.16)    (0.14)
1994            -0.10***  -0.08**   -0.04     -0.04     -0.07     -0.06     -0.13     -0.15
                (0.03)    (0.03)    (0.10)    (0.10)    (0.05)    (0.05)    (0.16)    (0.14)
1995            -0.07**   -0.05*    -0.08     -0.07     -0.10*    -0.09     -0.18     -0.24
                (0.03)    (0.03)    (0.10)    (0.10)    (0.06)    (0.05)    (0.15)    (0.14)
1996            -0.09***  -0.08*    -0.19*    -0.18*    -0.11**   -0.10**   -0.29*    -0.28*
                (0.03)    (0.03)    (0.10)    (0.10)    (0.06)    (0.05)    (0.17)    (0.15)
1997            -0.15***  -0.09**   -0.21**   -0.22**   -0.11**   -0.10**   -0.30*    -0.30**
                (0.03)    (0.03)    (0.10)    (0.10)    (0.06)    (0.05)    (0.16)    (0.16)
Main controls   Yes       Yes       Yes       Yes       Yes       Yes       Yes       Yes
Other controls  No        Yes       No        Yes       No        Yes       No        Yes
Observations    16362     16362     16362     16362     3421      3421      3421      3421
F-Value: 39 40

Note: The regression function is equation (9), and the dependent variable is the log of the total labor force in TVEs in the years from 1991 to 2000. 300-500 means that the sample used in the regression contains the counties with rural net income per capita between 300 and 500 in 1992 but excludes the counties that were included in the first-wave poverty alleviation program. Standard errors are clustered at the county level. * indicates significance at the 10% level, ** indicates significance at the 5% level, *** indicates significance at the 1% level.

Table 10.
The Effects of Poverty Alleviation Program on Total Rural Industrial Output: Coefficients of the Interactions Between Dummies Indicating Year and the Program Assignment
Dependent Variable: Log of Total Rural Industrial Output

                Full Sample                             300-500
                OLS                 Reduced form        OLS                 IV
Year            (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)
1991            -0.12***  -0.13***  -0.14     -0.18     -0.06     -0.07     -0.04     -0.01*
                (0.04)    (0.04)    (0.12)    (0.12)    (0.06)    (0.06)    (0.22)    (0.20)
1992            -0.06     -0.06     -0.04     -0.04     -0.01     -0.02     0.14      0.10
                (0.04)    (0.04)    (0.12)    (0.12)    (0.06)    (0.06)    (0.22)    (0.20)
1993            -0.10**   -0.10***  -0.10     -0.10     -0.03     -0.04     0.04      0.06
                (0.04)    (0.04)    (0.12)    (0.12)    (0.06)    (0.06)    (0.22)    (0.20)
1994            -0.19***  -0.20***  -0.26**   -0.24**   -0.08     -0.10*    -0.18     -0.19
                (0.04)    (0.04)    (0.12)    (0.10)    (0.06)    (0.06)    (0.22)    (0.20)
1995            -0.22***  -0.23***  -0.32**   -0.35***  -0.10     -0.14**   -0.24     -0.21
                (0.04)    (0.04)    (0.12)    (0.12)    (0.06)    (0.06)    (0.22)    (0.20)
1996            -0.22***  -0.25***  -0.37***  -0.38***  -0.12*    -0.17***  -0.26     -0.28
                (0.04)    (0.05)    (0.12)    (0.12)    (0.06)    (0.06)    (0.22)    (0.20)
1997            -0.25***  -0.28***  -0.41***  -0.40***  -0.14**   -0.21***  -0.29     -0.23
                (0.04)    (0.04)    (0.12)    (0.12)    (0.06)    (0.06)    (0.22)    (0.20)
Main controls   Yes       Yes       Yes       Yes       Yes       Yes       Yes       Yes
Other controls  No        Yes       No        Yes       No        Yes       No        Yes
Sample Size     16362     16362     16362     16362     3421      3421      3421      3421
F-Value: 39 40

Note: The regression function is equation (9), and the dependent variable is the log of the total rural industrial output in the years from 1991 to 2000. 300-500 means that the sample used in the regression contains the counties with rural net income per capita between 300 and 500 in 1992 but excludes the counties that were included in the first-wave poverty alleviation program. Standard errors are clustered at the county level. * indicates significance at the 10% level, ** indicates significance at the 5% level, *** indicates significance at the 1% level.

Table 11.
The Effects of Poverty Alleviation Program on Agricultural Income: Coefficients of the Interactions Between Dummies Indicating Year and the Program Assignment
Dependent Variable: Log of the Agricultural Income

                Full Sample                             300-500
                OLS                 IV                  OLS                 IV
Year            (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)
1991            0.00      -0.01     0.04      0.04      -0.01     -0.01     -0.00     -0.01
                (0.03)    (0.03)    (0.08)    (0.08)    (0.04)    (0.04)    (0.12)    (0.11)
1992            -0.02     -0.02     -0.04     -0.04     0.02      0.02      0.03      0.01
                (0.03)    (0.02)    (0.08)    (0.08)    (0.04)    (0.04)    (0.12)    (0.11)
1993            -0.04     -0.03     -0.08     -0.10     0.05      0.04      0.08      0.08
                (0.03)    (0.03)    (0.08)    (0.08)    (0.04)    (0.04)    (0.12)    (0.11)
1994            -0.03     -0.02     -0.10     -0.12     0.05      0.05      0.06      0.08
                (0.03)    (0.03)    (0.08)    (0.08)    (0.04)    (0.04)    (0.12)    (0.11)
1995            -0.02     -0.01     -0.04     -0.04     0.09**    0.08**    0.18      0.19*
                (0.03)    (0.03)    (0.08)    (0.08)    (0.04)    (0.04)    (0.12)    (0.11)
1996            -0.02     -0.03     -0.06     -0.06     0.07*     0.07*     0.19      0.23**
                (0.03)    (0.03)    (0.08)    (0.08)    (0.04)    (0.04)    (0.13)    (0.12)
1997            -0.00     -0.02     -0.10     -0.14*    0.06      0.05      0.18      0.16
                (0.03)    (0.03)    (0.08)    (0.08)    (0.04)    (0.04)    (0.13)    (0.11)
1998            -0.04     -0.05     -0.12     -0.12     0.08**    0.07*     0.15      0.14
                (0.03)    (0.04)    (0.08)    (0.08)    (0.04)    (0.04)    (0.13)    (0.12)
1999            -0.06**   -0.08**   -0.06     -0.12     0.03      0.05      0.11      0.15
                (0.03)    (0.04)    (0.08)    (0.08)    (0.04)    (0.04)    (0.13)    (0.12)
2000            -0.07**   -0.09**   -0.06     -0.10     0.04      0.03      0.10      0.11
                (0.03)    (0.04)    (0.08)    (0.08)    (0.04)    (0.04)    (0.13)    (0.12)
Main controls   Yes       Yes       Yes       Yes       Yes       Yes       Yes       Yes
Other controls  No        Yes       No        Yes       No        Yes       No        Yes
Observations    16362     16362     16362     16362     3421      3421      3421      3421
F-Value: 39 40

Note: The regression function is equation (9), and the dependent variable is the log of agricultural income in the years from 1991 to 2000. 300-500 means that the sample used in the regression contains the counties with rural net income per capita between 300 and 500 in 1992 but excludes the counties that were included in the first-wave poverty alleviation program. Standard errors are clustered at the county level.
* indicates significance at 10% level, ** indicates significance at 5% level, *** indicates significance at 1% level.

Table 12. The Effects of Poverty Alleviation Program on Rural Net Income Per Capita: Coefficients of the Interactions Between Dummies Indicating Year and the Program Assignment

Dependent Variable: Log of the Rural Net Income Per Capita

               Full Sample                                 300-500
               OLS                   Reduced form          OLS                   IV
Year           (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
1991           -0.02      -0.00      0.00       -0.00      -0.01      -0.00      -0.01      -0.02
               (0.02)     (0.02)     (0.04)     (0.04)     (0.04)     (0.04)     (0.13)     (0.10)
1992           0.00       -0.03      -0.01      -0.00      0.01       0.01       0.01       0.01
               (0.02)     (0.02)     (0.04)     (0.04)     (0.04)     (0.04)     (0.13)     (0.10)
1993           -0.01      -0.03      0.03       0.04       -0.01      -0.01      0.04       0.05
               (0.02)     (0.02)     (0.04)     (0.04)     (0.04)     (0.04)     (0.13)     (0.10)
1994           -0.02      -0.00      -0.05      -0.06      0.02       0.01       0.08       0.09
               (0.02)     (0.02)     (0.04)     (0.04)     (0.04)     (0.04)     (0.13)     (0.10)
1995           -0.01      -0.03      -0.08**    -0.08**    0.04       0.03       0.12       0.14
               (0.02)     (0.02)     (0.04)     (0.04)     (0.04)     (0.04)     (0.13)     (0.10)
1996           -0.03      -0.03      -0.08**    -0.08**    0.03       0.02       0.10       0.12
               (0.02)     (0.02)     (0.04)     (0.04)     (0.04)     (0.04)     (0.13)     (0.10)
1997           -0.04**    -0.05**    -0.10**    -0.09**    0.02       0.02       0.03       0.05
               (0.02)     (0.02)     (0.04)     (0.04)     (0.04)     (0.04)     (0.13)     (0.10)
1998           -0.06**    -0.04**    -0.11**    -0.11**    0.01       0.02       0.07       0.09
               (0.02)     (0.02)     (0.04)     (0.04)     (0.04)     (0.04)     (0.13)     (0.11)
1999           -0.07***   -0.08***   -0.12***   -0.12***   0.02       0.01       -0.03      -0.02
               (0.02)     (0.02)     (0.04)     (0.04)     (0.04)     (0.04)     (0.13)     (0.11)
2000           -0.03*     -0.04**    -0.15***   -0.14***   -0.01      -0.01      -0.08      -0.07
               (0.02)     (0.02)     (0.04)     (0.04)     (0.04)     (0.04)     (0.13)     (0.11)
Main controls  Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes
Other controls No         Yes        No         Yes        No         Yes        No         Yes
F-value        39 40
Observations   16362      16362      16362      16362      3421       3421       3421       3421

Note: The regression function is equation (9) while the dependent variable is the log of rural net income per capita from 1991 to 2000.
300-500 means that the sample used in the regression contains the counties with rural net income per capita between 300 and 500 in 1992 but excludes the counties which were included in the first-wave poverty alleviation program. Standard errors are clustered at county level. * indicates significance at 10% level, ** indicates significance at 5% level, *** indicates significance at 1% level.

Chapter Two
Rates of Return to University Education: The Regression Discontinuity Design

1 Introduction

It is common knowledge that people who are more educated, on average, earn more than the less educated. A key question, however, is to what extent higher levels of education cause higher earnings. Perhaps higher earnings arise because the more-educated have higher ability levels or other unobserved advantages. Many studies have attempted to account for differences in various unobserved endowments by using within-twin comparisons (see, for example, Ashenfelter and Krueger, 1994; Behrman, Rosenzweig, and Taubman, 1994; Miller, Mulvey, and Martin, 1995; Isacsson, 1999; and Bingley, Christensen, and Jensen, 2009). Other studies have used natural experiments (see, for example, Angrist and Krueger, 1991; Card, 1995; Harmon and Walker, 1995; Acemoglu and Angrist, 2001; Lochner and Moretti, 2004; and Oreopoulos, 2006). Critics, however, have been skeptical about the returns to education estimated using these techniques (critics of within-twin variations include Bound and Solon, 1999; Neumark, 1999; Leigh and Ryan, 2008; and Lee and Lemieux, 2009; criticisms of natural experiments include Bound, Jaeger and Baker, 1995 and Oreopoulos, 2006). The best way to estimate the causal effect of education on earnings is to use a randomized trial. However, education is a long-run investment and measuring the resulting labor market outcomes demands a long window of observation, not to mention that such experiments on humans are not feasible.
This may be one of the reasons why we have not seen any studies on returns to education using a randomized trial. Recently, Lee and Lemieux (2009) have labeled the Regression Discontinuity (RD) design as "a closer cousin to randomized experiments". To date, however, Oreopoulos (2006) is the only study known to us which uses the RD design to estimate the causal effect of one additional year of high school education on labor market earnings.

The main contribution of this paper is to extend the application of the RD design to assessing the returns to higher education. The Chinese College Admission System (CAS) has an essentially unique feature: it uses test scores from a centralized examination, the National College Entrance Examination (NCEE), as the benchmark to select students. Utilizing this feature, we are able to find well-defined cut-offs for university admission. This, together with a rich survey data set with information on individual NCEE scores, provides a rare opportunity for us to apply a fuzzy RD design with IV to estimate the local average treatment effect (LATE) of university education. To the best of our knowledge, our study is the first to use the RD design to estimate the causal effect of university education on earnings.

While our LATE estimate can be interpreted as the average treatment effect for the subpopulation whose treatment status is induced by the instrument, i.e., compliers, this might not provide information on the average treatment effect (ATE) for the population as a whole, unless the estimate can be extrapolated to all other subpopulations. Following a method proposed by Imbens and Wooldridge (2007), we calculate and compare the differences in average earnings between eligible compliers and eligible always-takers, and between ineligible compliers and ineligible never-takers. The magnitudes of these differences can assist in assessing whether the LATE estimate is similar to the treatment effects for non-compliers.
The LATE carries strong policy implications in the Chinese context. In China, where the economy has grown at an unprecedented speed for the past twenty or so years, the annual enrollment at universities skyrocketed from just over 284,000 in 1979 to 5.99 million in 2008, a twenty-fold increase.[1] In particular, China expanded university enrollment by 47 per cent in one year in 1999 and since then the enrollment figure has increased almost three-fold. As a result, the proportion of the urban labor force with tertiary education increased from just over 10 per cent in 1987 to 40 per cent in 2007 (Meng, Shen, and Xue, 2009). Although studies have shown that the return to education increased during the 1990s, the rate of increase has slowed down significantly since the late 1990s (Zhang, Zhao, Park, and Song, 2005 and Meng et al. 2009). It is unfortunate that the drastic expansion of tertiary education in the late 1990s was not based on careful assessments of the returns to education, especially the rate of return for the group whose university attainment is more likely to be affected by the expansion policy. Our estimate of the LATE measures the returns to university education for the group whose scores are close to the cut-off and who are, hence, more likely to be affected by the university expansion policy. It therefore provides an important insight into the effect of the education expansion. In addition, our LATE estimate applies to 45 to 48 per cent of our sample who participated in the NCEE (the compliers), and hence, should have a relatively general relevance.

Using fuzzy RD with IV estimation, we find that the LATE of obtaining a four-year university degree relative to a three-year college qualification on annual earnings is very large: an increase of around 40% and 60% for males and females, respectively. If compared to the unsuccessful NCEE examinees (all with high school education), the effects are enlarged to around 112% and 95% for males and females, respectively.
The remainder of the paper is structured as follows. Section 2 describes the institutional background which affects our research design. Section 3 discusses the methodologies. Data are presented in Section 4, which is followed by sections which discuss the RD-LATE results. Conclusions are given in Section 7.

[1] China is not alone in expanding education investment. Despite the lack of consensus as to the size of the causal effect of education on earnings, governments in many parts of the world are investing heavily in education. The 1993 World Bank Report The East Asian Miracle identifies the rapid growth of human capital as one of the two principal engines of economic growth in East Asian countries (World Bank, 1993).

2 Background

The Chinese schooling system is quite similar to that of the West. Figure 1 provides a sketch of the system. Students begin primary school at the age of 6-7.[2] Primary school normally requires six years to complete, and this is followed by three years at junior high school. Upon completion of junior high school, students have the option to continue studying for three years in an academic senior high school or to enter a vocational secondary school for 2-4 years. Normally, those who complete the academic senior high school program participate in the NCEE to gain their undergraduate admission. Chinese higher education at the undergraduate level is divided into three-year college and four-year university programs. There are two tiers of four-year universities and the first tier is of higher quality and hence attracts greater central government funding.[3] Within these tiers, undergraduate education is divided into two streams: a humanities/social sciences stream and a sciences stream.

The National College Entrance Examination and College Admission System were established in 1952. During the Cultural Revolution (1966-1976) the system stopped operating for 10 years and resumed after the Cultural Revolution in 1977.
The system operates on the basis of universal examination papers and marking standards across all regions in China.[4] The subjects tested for the humanities/social sciences stream include Politics, Chinese, Math, Foreign Language, and History, while those for the sciences stream include Chinese, Math, Foreign Language, Physics and Chemistry.[5] Although, in general, the total score for different provinces within the same year is the same, it varies over the years because the number of subjects tested and the total score of each subject vary over the years. For example, in 1977, the year the NCEE first resumed after the Cultural Revolution, only four subjects for each stream were tested (Foreign Language was excluded) and the full score for each subject was set at 100. Thus, the full score for that year was 400. In later years, however, the number of subjects tested increased to five while the full score for each subject was set at 150 instead of 100. Consequently the total score increased to 750. In addition, in a few provinces for several years[6] the original individuals' test scores were standardized based on their ranking in the distribution of the scores for all students in the province for that year. The full standardized scores are higher than the total original scores. These differences will have some implications later in the paper when we try to normalize the scores across different provinces and different years. We will discuss these implications in detail in the Data Section.

[2] School starting age may differ across regions and over time. Currently in most of the urban areas it is 6 years.

[3] The two-tier system was first established by the central government in 1954. In that year six universities, including Peking University and Tsinghua University, were assigned to the first tier. Afterwards more universities gained first tier entitlement. By 1963, three years before the Cultural Revolution, there were 68 first tier universities.
In 1978, two years after the Cultural Revolution, the central government released a new list of 88 first tier universities. In the 1990s this number reached 100 (China Ministry of Education, 2006 and Harbin Institute of Technology, 2008).

[4] For a detailed discussion of the NCEE and CAS see Meng et al. (1989).

[5] The exact subjects tested for different streams in different years vary slightly.

[6] For detailed information on which provinces implemented "standardized scores" and in which years they did so, please see the Data Appendix.

Admissions into colleges are based on the NCEE scores. In addition to taking the NCEE, students are required to submit an application form specifying the list of preferred colleges to the admission offices in their province. The list covers three tiers of colleges: major universities, general colleges or universities, and junior colleges. Major universities, also called Zhongdian Daxue, are mainly supported by the nation; general universities, also called Yiban Benke, are the responsibility of the local governments. Students need to study four years in both of these types of university for their bachelor degrees. The third tier schools are called junior colleges. Their programs are generally three years long. Schools in all three tiers are regarded as colleges, or Daxue. The application forms may be submitted before or after the students take the NCEE. In some cases, submission may be after they know their final score. In any case, the application occurs before the publication of the cutoff scores for the different types of universities. In this paper, though, we do not consider the division between the first and second tier universities, nor do we examine the rate of return for the three-year colleges, as we only have limited cutoff score data for the first tier universities and the three-year colleges.
Once all the NCEE results are known, each province determines its own cut-offs for the different tiers of universities and for three-year colleges, based on the quota given to the province[7] and the distribution of the current year NCEE results. This design ensures that before participating in the NCEE no student would have any knowledge of the cutoff scores. In addition, the cut-offs are normally set at a percentile which is 10 to 20 per cent higher than that implied by the quota. In other words, 10 to 20 per cent more students may have their NCEE scores above the cut-offs than the actual number of students who can be admitted. These cutoff scores are then made publicly available through schools, local education bureaus, local newspapers, the internet and television channels.

The university/college admission process follows the rule of "better school, earlier admission". In particular, after the NCEE scores are known, all the application forms submitted, and cut-offs published, the first tier universities start their admission process and continue until all their quotas are filled, followed by the second tier universities, and then the three-year colleges. Based on the cut-offs published by each province and the number of admissions (quota) the university/college has allocated to the province, each school processes the admissions in priority order. That is, students who exceeded the cut-off score and listed a particular university as their first preference will be considered first, based on the rank of their NCEE score among all students who applied to that school. If the university's quota for the province is less than the number of students in the province whose NCEE scores exceeded the cutoff, those whose NCEE scores ranked lower may not be admitted. If the quota is greater than the number of students with NCEE scores exceeding the cut-off, the school will process students who listed the school as their second preference and so on.
In this case, the process will stop at the point where all the quotas are filled. Inevitably, due to lack of demand, some schools may end up admitting students whose NCEE scores are below the cut-off score.

[7] The quota is first given by the central government to each university. The universities then divide their quota among the different provinces.

Three features of this admission system are worth emphasizing. First, for any individual student, the cut-offs are exogenously determined. Second, the design ensures that before participating in the NCEE, a student will have no knowledge about the exact cut-off points, implying that it is impossible for any student to exercise complete control over his/her test score around the cutoff points. This feature satisfies the primary requirement for a valid RD design. The third feature is about non-compliance. The discussion above indicates that the cut-offs will be fuzzy by design. This is because: (i) Some universities may admit students with scores lower than their cut-off because of lack of interest in the university; (ii) Normally the cut-offs are set at the point where there are 10 to 20 per cent more students with scores exceeding the cut-offs than the quotas. Hence, students with NCEE scores above the cut-offs may not necessarily be admitted. Because students submit their application forms before they know the cut-offs for different schools and sometimes even before they know their own NCEE scores, some students may misjudge their own ability/performance. Hence, some with higher scores than the cutoff may miss out on admission because they listed lower schools as their first and second preferences in their application. (iii) Some students might have bonus scores due to various reasons.[8] (iv) Finally, there may also exist corruption, which may allow individuals with a score lower than the cutoff to be admitted. These non-compliance cases will have significant implications for our research.
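The way these rules make the cut-off fuzzy can be illustrated with a small simulation. The following is only a minimal sketch under simplifying assumptions, not the actual admission procedure: there is one province, one type of school, and no preference lists; `simulate_admission`, the 15 per cent oversubscription rate, and the single 20-point bonus are all hypothetical choices made for illustration.

```python
def simulate_admission(scores, quota, oversubscription=0.15, bonus=None):
    """Toy admission round for one province-year (hypothetical helper).

    scores: raw NCEE scores, one per student.
    quota: number of places available.
    oversubscription: share of extra students placed above the published cutoff.
    bonus: optional bonus points per student, same length as scores.
    """
    bonus = bonus or [0] * len(scores)
    # The cutoff is set so that more students clear it than there are places.
    n_above = min(len(scores), round(quota * (1 + oversubscription)))
    cutoff = sorted(scores, reverse=True)[n_above - 1]
    # Schools rank applicants by bonus-adjusted score and stop at the quota.
    effective = [s + b for s, b in zip(scores, bonus)]
    ranked = sorted(range(len(scores)), key=lambda i: effective[i], reverse=True)
    admitted = [False] * len(scores)
    for i in ranked[:quota]:
        if effective[i] >= cutoff:
            admitted[i] = True
    return cutoff, admitted


scores = list(range(400, 500))   # 100 students with raw scores 400..499
bonus = [0] * 100
bonus[70] = 20                   # the student with raw score 470 holds a bonus
cutoff, admitted = simulate_admission(scores, quota=20, bonus=bonus)

# Students above the cutoff who are not admitted (quota exhausted).
eligible_not_admitted = [s for s, a in zip(scores, admitted) if s >= cutoff and not a]
# Students below the cutoff who are admitted (bonus points).
admitted_below_cutoff = [s for s, a in zip(scores, admitted) if s < cutoff and a]
print(cutoff, eligible_not_admitted, admitted_below_cutoff)
# prints: 477 [477, 478, 479, 480] [470]
```

In this toy round, four students clear the published cutoff but miss out on a place, and one student below the cutoff is admitted, so crossing the cutoff raises the probability of admission without determining it.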
In this paper, we look at the rate of return to going to a four-year university relative to going to a third-tier three-year college. Since the cutoffs of major universities are higher than the cutoffs of general universities, the cutoff of a four-year university is the same as the cutoff of a second-tier four-year university. Because the four-year universities are generally better than the three-year colleges, the return to going to a four-year university relative to going to a third-tier three-year college consists of two parts: one part is from an extra year of college and the other part is from better education at a better college. Thus, unless otherwise specified, the cutoff we mention hereafter is the cutoff for four-year universities.

[8] Those cases include: 1) Minority students; 2) Students who earn the provincial award of "excellent student cadre"; 3) Students who win awards in provincial-level or national-level academic competitions; 4) Students who win awards in sports competitions; 5) Children who are certified athletes; 6) Students who specialize in playing instruments or are very good at singing, dancing, handwriting, drawing, painting or playing chess; 6) Children of overseas Chinese or returned Chinese; 7) Children of martyrs (soldiers or policemen); 8) Children of demobilized soldiers who find their jobs by themselves; 9) Students who earn the provincial-level award of "excellent student"; 10) Children of seriously disabled soldiers or policemen; 11) Foreign students or students from Hong Kong, Macau or Taiwan. All the students above can have 10 or 20 bonus points added to their scores. For (5), (6) and (11), some can have many more bonus points, and the bonus can be higher than 200 points. In reality, some colleges set a much lower cutoff for students whose parents work in those colleges although it is not allowed. In addition, since only certificates are needed for these bonus points, some parents might use fake certificates or bribes to get those certificates.
Unfortunately, we cannot get detailed information on the fraction each category accounts for.

Since the RD estimates we obtain in this paper are Local Average Treatment Effects (LATE), which are the average treatment effects among those who comply with treatment assignment, it will be helpful to explain the estimates in our context. According to Angrist and Imbens (1994), there are four kinds of individuals: Compliers, Always-takers, Never-takers and Defiers. Defiers are the people who do the opposite of the treatment assignment. In our context, defiers are the students who are not admitted into four-year universities if their scores are above the cutoff but are admitted into four-year universities if their scores are below the cutoff. Angrist and Imbens (1994) show that the no-defiers or monotonicity assumption is one of the critical assumptions needed to get a valid Local Average Treatment Effect (LATE). It is impossible to test the assumption, but in general people assume the condition holds since the behavior of defiers is irrational. So in this paper we assume that monotonicity holds, and thus that there are no defiers.

Compliers are the people who comply with treatment assignment. In our context, compliers are the students who are admitted into four-year universities if their scores are above the cutoff and are not admitted into four-year universities if their scores are below the cutoff. There are two types of compliers: eligible compliers and ineligible compliers. Eligible compliers are the compliers who are admitted into four-year universities and whose scores are above the cutoff. Ineligible compliers are the compliers who are not admitted into four-year universities and whose scores are below the cutoff.

Always-takers are the people who are always treated whether or not they are assigned to treatment. In this context, always-takers are the students who are admitted to four-year universities whether or not their scores are above the cutoff.
In addition, there are two types of always-takers: eligible always-takers and ineligible always-takers. Eligible always-takers are the always-takers whose scores are above the cutoff while ineligible always-takers are the always-takers whose scores are below the cutoff. Ineligible always-takers are observed since they are admitted into four-year universities even though their scores are below the cutoff. However, eligible always-takers are not observed since they are mixed with eligible compliers.

Never-takers are the people who are never treated whether or not they are assigned to treatment. In this context, never-takers are the students who are not admitted to four-year universities whether or not their scores are above the cutoff. In addition, there are two types of never-takers: eligible never-takers and ineligible never-takers. Eligible never-takers are the never-takers whose scores are above the cutoff while ineligible never-takers are the never-takers whose scores are below the cutoff. Eligible never-takers are observed since they are not admitted into four-year universities even though their scores are above the cutoff. However, ineligible never-takers are not observed since they are mixed with ineligible compliers.

The analysis of the non-compliance feature indicates an endogeneity problem in the OLS estimation. For example, students who have bonus scores are likely to be always-takers, especially when they are around the cutoff. The characteristics they have not only give them a bigger chance to enter four-year universities but also might affect their future earnings. In that case the OLS estimate is biased.

3 Methodology

In this paper, we examine the causal effect on earnings of having a four-year university degree relative to a three-year college degree.
Consider the following equation:

$$Y_i = \alpha + \beta D_i + \gamma X_i + \varepsilon_i \qquad (1)$$

where $Y_i$ refers to the logarithm of annual earnings for individual $i$; $D_i$ is a dummy variable indicating whether the individual possesses a four-year university degree; $X_i$ is a vector of control variables, and $\varepsilon_i$ is the error term. The OLS estimation of Equation (1) may provide a biased estimate of $\beta$ because $\varepsilon_i$ may include components, such as ability and drive, which are correlated with $D_i$ and $Y_i$.

To resolve this problem, we adopt the Regression Discontinuity (RD) design. The basic idea of the RD design is to utilize the fact that a treatment is given to a group of people for whom a measurable characteristic (forcing variable) is equal to, or greater than, an exogenously set threshold value. This generates a sharp discontinuity in the treatment, which is a function of the forcing variable. If individuals are unable to precisely manipulate the forcing variable it is reasonable to attribute the discontinuous jump in the outcome to the causal effect of the treatment (Lee and Lemieux, 2009). To avoid the possible omitted variable problem, Heckman and Robb (1985) propose estimating the effect of the treatment by adding a flexible function of the forcing variable into the estimating equation. Thus, in our case Equation (1) may be re-written as:

$$Y_i = \alpha + \beta D_i + \gamma X_i + f(S_i) + \varepsilon_i \qquad (1a)$$

$$D_i = 1\{S_i + v_i \geq c\} \qquad (1b)$$

where $S_i$ is the NCEE test score for individual $i$, $f(S_i)$ is a flexible function of $S_i$, which can be a vector of high order polynomial terms, and $c$ is the cutoff score. Equation (1b) indicates the eligibility rule: if an individual's test score is equal to or greater than $c$, they will gain admission to a four-year university. Otherwise, they will be placed in the control group.
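To make the estimating equation concrete, the sketch below fits Equation (1a) by OLS on simulated data for the sharp case, where treatment is fully determined by the score crossing the cutoff. It is a minimal illustration under stated assumptions, not the estimation code used in this paper: the data are simulated, the cutoff of 500, the true effect of 0.4, and the quadratic form of the flexible function are hypothetical, other controls are omitted, and `ols` is a small helper written for this example.

```python
import random


def ols(X, y):
    """Least squares via the normal equations, solved by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * v for r, v in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


random.seed(0)
CUTOFF = 500.0      # hypothetical cutoff score
BETA_TRUE = 0.4     # hypothetical treatment effect used to generate the data

X, y = [], []
for _ in range(4000):
    score = random.uniform(400, 600)             # forcing variable
    treated = 1.0 if score >= CUTOFF else 0.0    # sharp eligibility rule
    s = (score - CUTOFF) / 100                   # centred, rescaled score
    log_earnings = (9 + BETA_TRUE * treated + 0.5 * s + 0.2 * s ** 2
                    + random.gauss(0, 0.3))
    X.append([1.0, treated, s, s ** 2])          # quadratic flexible function
    y.append(log_earnings)

beta_hat = ols(X, y)[1]   # coefficient on the treatment dummy
print(round(beta_hat, 3))
```

Because the simulated outcome is generated with the same quadratic in the score that the regression controls for, the coefficient on the treatment dummy should land close to the assumed effect of 0.4; with real data the choice of polynomial order is itself a specification decision.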
In the case where the forcing variable perfectly predicts treatment receipt ($v_i$ in Equation (1b) is a constant; a sharp RD design) and the treatment effect is heterogeneous, the estimate emanating from the RD design is a 'weighted average treatment effect'. The weights are directly proportional to the ex ante probability of an individual's realized value for the forcing variable being close to the cutoff point (Lee and Lemieux, 2009). In the case where the forcing variable does not relate to the treatment receipt in a deterministic way ($v_i$ is a variable), we have a "fuzzy" RD design. In this case the OLS estimation of Equation (1a) is biased. However, an IV estimate can provide an unbiased estimate of a weighted local average treatment effect (LATE) for the compliers if the treatment effect is heterogeneous, and of the weighted average treatment effect (ATE) for the population if the treatment effect is homogeneous across subpopulations of various compliance types. The natural candidate for the instrument should be the eligibility rule (Hahn et al., 2001 and Lee and Lemieux, 2009).

In this paper, we have a fuzzy RD design. Although whether or not a student passes the cutoff score is the most important criterion for university admission, there does exist noncompliance. As discussed in the background section, there are situations where individuals with scores lower than the cut-offs are admitted and those with results higher than the cutoff scores miss out on admission. In this case we may rewrite Equation (1b) as:

$$\Pr(D_i = 1 \mid S_i \geq c) \neq \Pr(D_i = 1 \mid S_i < c) \qquad (1b2)$$

Using a dummy variable indicating whether an individual's NCEE score ($S_i$) is equal to or greater than the cutoff ($c$) as the instrument (in other words, eligibility for admission), and provided that the assumptions of monotonicity and excludability are satisfied,[9] we are able to estimate an unbiased local average treatment effect (LATE).
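The IV logic can be sketched on simulated data as follows. This is a hedged illustration rather than the estimation code used in this paper: the data-generating process (an unobserved "ability" term that drives both non-compliance and earnings), the cutoff of 500, the homogeneous effect of 0.4, and the helper `ols` are all assumptions made for the example. Two-stage least squares is carried out by hand: the treatment dummy is first regressed on the eligibility dummy and a quadratic in the score, and earnings are then regressed on the fitted treatment and the same quadratic. The last lines compute the shares of always-takers, never-takers and compliers implied by the definitions in Section 2.

```python
import random


def ols(X, y):
    """Least squares via the normal equations, solved by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * v for r, v in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


random.seed(1)
CUTOFF, BETA_TRUE = 500.0, 0.4   # hypothetical values used only to generate data
earnings, treated, eligible, controls = [], [], [], []
for _ in range(8000):
    score = random.uniform(400, 600)
    z = 1.0 if score >= CUTOFF else 0.0          # instrument: eligibility
    ability = random.gauss(0, 1)                 # unobserved by the analyst
    # Eligibility raises the chance of admission but does not determine it.
    d = 1.0 if -0.5 + z + 0.6 * ability > 0 else 0.0
    s = (score - CUTOFF) / 100
    earnings.append(9 + BETA_TRUE * d + 0.5 * s + 0.3 * ability
                    + random.gauss(0, 0.2))
    treated.append(d)
    eligible.append(z)
    controls.append([1.0, s, s * s])

# Naive OLS of earnings on treatment: biased because ability is omitted.
naive = ols([[d] + c for d, c in zip(treated, controls)], earnings)[0]

# First stage: treatment on eligibility and the polynomial in the score.
Z = [[z] + c for z, c in zip(eligible, controls)]
first_stage = ols(Z, treated)
d_hat = [sum(g * v for g, v in zip(first_stage, row)) for row in Z]

# Second stage: earnings on fitted treatment (point estimate only; standard
# errors from this manual second stage would not be valid).
late = ols([[dh] + c for dh, c in zip(d_hat, controls)], earnings)[0]

# Compliance-type shares under monotonicity (no defiers).
n_ineligible = eligible.count(0.0)
n_eligible = len(eligible) - n_ineligible
always_takers = sum(d for d, z in zip(treated, eligible) if z == 0.0) / n_ineligible
never_takers = sum(1 - d for d, z in zip(treated, eligible) if z == 1.0) / n_eligible
compliers = 1 - always_takers - never_takers
print(round(naive, 2), round(late, 2), round(compliers, 2))
```

In this simulated setting the naive OLS coefficient overstates the assumed effect because high-ability students are both more likely to be admitted and better paid, while the IV estimate should be close to 0.4. Since the simulated effect is homogeneous, the LATE and the ATE coincide here, which need not hold in the actual data.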
The LATE gives us the causal effect of attending a four-year university on earnings for a group of individuals whose university participation is induced by their eligibility status.

The empirical importance of estimating the LATE in our case lies in its policy relevance. As discussed in the Introduction Section, over the past ten or more years, China has implemented a policy which significantly expanded university admission. To understand whether and to what extent the policy is beneficial, it is important to know the magnitude of the causal effect of university education on the group of individuals whose university attainment can be affected by the policy.

In the case where the effect of university attainment on earnings is the same for the compliers and non-compliers (homogeneous effect), the LATE can also be the average treatment effect (ATE) for the entire population. To gauge whether in our case the treatment effect is homogeneous, we follow Imbens and Wooldridge (2007) to examine the unconditional mean payoffs for the never-takers and always-takers, which gives us some information on which to infer whether the LATE is close to the ATE. More specifically, we calculate the proportions of compliers, never-takers, and always-takers in the population, and then use these to calibrate separately the average earnings for (1) compliers if eligible, (2) compliers if ineligible, (3) always-takers if ineligible, and (4) never-takers if eligible. Imbens and Wooldridge (2007) argue that if a substantial difference in the levels of earnings is found between (1) and (3) and/or (2) and (4), it is then less plausible that the LATE is indicative of the treatment effects for other compliance types.

[9] As pointed out by Hahn et al. (2001), two assumptions, 'monotonicity' and 'excludability', are required for the LATE to be interpreted as a causal effect. The monotonicity assumption states that the forcing variable crossing the cutoff point cannot cause some individuals to accept and others to reject the treatment at the same time. The excludability assumption demands that the forcing variable crossing the cutoff point can only affect the outcome variable through its impact on the treatment.

The purpose of this paper is to estimate the returns to a four-year university education relative to a three-year college education. Throughout the paper the treatment group is defined as individuals who possess a four-year university degree, whereas the control group is defined as individuals who possess three-year college degrees.

4 Data

The main data used in this paper are from the Urban Residents Education and Employment Survey (UREES) conducted in 2005 by the National Bureau of Statistics (NBS) of China. The survey covers 10,000 urban households from 12 provinces.[10] It uses the same sampling frame as the NBS Urban Household Income and Expenditure Survey (UHIES), which is based on Probability Proportional to Size (PPS) sampling with stratifications at the provincial, city, county, town, and neighborhood community levels. Households are randomly selected within each chosen neighborhood community (see Han, Wailes, and Cramer, 1995; Fang, Zhang, and Fan, 2002; Gibson, Huang, and Rozelle, 2003; and Meng, Gregory and Wang, 2005 for detailed discussion of the sampling).

In addition to individual demographic characteristics, income and wages in 2004, the UREES focuses mainly on the education and employment status of household members. There are several unique features of the survey. The one which is particularly useful for this study is that the survey asks a set of retrospective questions regarding the respondent's participation in the National College Entrance Examination.
The questions include whether the individual participated in the NCEE and, if so, the year and province of participation, the total test score, whether he/she was admitted, the type of education completed (three-year college or four-year university), the name of the university/college, and the subject major. In addition to the information on tertiary education, the survey also asks about the quality of the senior high school the individual attended and the household's relative income/expenditure level at the time when the individual graduated from senior high school.

The purpose of this study is to evaluate the rate of return to a four-year university education relative to a three-year college education. To this end, our sample includes everybody who has completed at least senior high school education, participated in the NCEE from 1977 onwards, was working and reported positive earnings for the year 2004. Thus, those who participated in the NCEE after the year 2000 are excluded as they were not due to graduate from universities until 2005 and hence did not report labor market outcome variables in the survey. Similarly, those who participated in the NCEE before 1977 are also excluded because of the lack of information on cutoff scores for this earlier period. Restricting the sample to our groups of interest and excluding observations with missing values on the NCEE test score, education level and other important demographic variables, our final sample includes 702 individuals with a four-year university degree (the treatment group), 693 with a three-year college degree (the first control group), and 919 who were not admitted to the university (the second control group).

[10] The provinces where the survey was conducted are: Beijing, Shanxi, Liaoning, Heilongjiang, Zhejiang, Anhui, Hubei, Guangdong, Sichuan, Guizhou, Shaanxi, and Gansu.
The dependent variable used is the logarithm of 2004 annual earnings.11 Table 1 reports the summary statistics of the variables for the three subsamples. On average, the treatment group earns 31 per cent higher wages than the control group. The difference in earnings is larger for females than for males. The average age of the treatment and the control groups is about the same. Males are more likely to have a higher level of education than their female counterparts, whereas very little difference is detected in terms of individual ethnicity across different education levels or gender groups. Father’s years of schooling are slightly higher for the treatment group than for the control group. This is especially true for women. Furthermore, significantly more individuals in the treatment group are from richer families than are their counterparts in the control group, especially for women. This is indicated by the proportion of individuals who reported that, at the time of their senior high school graduation, their family’s relative consumption level was very high for their city. In addition, almost half of the sample in the treatment group attended the best local senior high schools. The ratio for those with three-year college education and those with a senior high school education is 16 and 26 percentage points lower, respectively. Finally, as expected, the average NCEE test score is highest for the treatment group, followed by the three-year college degree holders, and then those who failed to be admitted to either of these two education levels.

11 We also have a sub-sample of individuals with information on their hourly earnings and we test the sensitivity of our estimation with annual or hourly earnings in Section 5.3.

Another important data set we use in this paper is the cutoff scores for four-year universities over the period 1977 to 2000 across different provinces for the humanities-social sciences stream and the sciences stream. We collected these data ourselves from various sources, including published books (for example, Meng, Yi, Xue, Qi, Xu, Liu, and Xia, 1988), local newspapers, and some official internet sites.12 Despite our widespread search effort, 8 per cent of the year-province cells still lack cutoff data.13 To handle the problem of the missing cut-offs, we use existing data to impute missing values. The basic idea is to use variations within a province over time and within one year across different provinces to extrapolate the missing cutoff scores. The details of our imputation method are presented in the Data Appendix. While the imputed cut-offs are included in our main estimation, we also test the robustness of the results to excluding them.

12 Detailed data sources are listed in the Data Appendix.

13 We have some information on the cutoff scores for the first tier universities and for the three-year colleges. But a much larger proportion of them are missing.

Figure 2 presents the NCEE cut-offs for the humanities-social sciences and the sciences streams by year. The hollow triangles show the original scores, while the solid dots are the imputed scores. The figure shows a significant increase in the value of the cutoff scores between 1977 and 1988, and they have changed little since then. As discussed in the background section, the early increase in cutoff scores was mainly due to the change in the NCEE settings (variations in the number of subjects examined and the full scores for each subject). Another important point revealed by Figure 2 is that since the late 1980s there are a few outlier provinces, where the cut-off scores are much higher than those for other provinces. These outliers are the provinces which adopted the standardized scores (see Background Section for detailed discussion). Finally, the figure also shows that including or excluding imputed missing cutoff scores does not change the ranges and the trend of the cutoff scores.
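The imputation idea mentioned above (borrowing within-province variation over time and within-year variation across provinces) can be illustrated with a two-way additive model. The sketch below is an assumption made for illustration only and is not the procedure given in the Data Appendix: it regresses the observed cutoffs on province and year dummies and predicts the missing cells. The function name impute_cutoffs and the toy numbers are hypothetical.

```python
import numpy as np


def impute_cutoffs(cutoffs):
    """Fill NaN cells of a (province x year) array with fitted province + year effects.

    Illustrative two-way additive model (an assumption), not the Data Appendix method.
    """
    cutoffs = np.asarray(cutoffs, dtype=float)
    n_prov, n_year = cutoffs.shape
    prov_idx, year_idx = np.indices(cutoffs.shape)
    observed = ~np.isnan(cutoffs)

    def design(p, y):
        # Intercept, province dummies and year dummies (first level of each dropped).
        X = np.zeros((p.size, n_prov + n_year - 1))
        X[:, 0] = 1.0
        rows = np.arange(p.size)
        X[rows[p > 0], p[p > 0]] = 1.0
        X[rows[y > 0], n_prov - 1 + y[y > 0]] = 1.0
        return X

    # Fit on the observed cells only.
    X_obs = design(prov_idx[observed], year_idx[observed])
    beta, *_ = np.linalg.lstsq(X_obs, cutoffs[observed], rcond=None)

    # Predict the missing cells and leave observed cells untouched.
    filled = cutoffs.copy()
    missing = ~observed
    filled[missing] = design(prov_idx[missing], year_idx[missing]) @ beta
    return filled


if __name__ == "__main__":
    # Hypothetical cutoffs: 3 provinces x 4 years, one missing cell.
    toy = np.array([
        [400.0, 410.0, 425.0, 430.0],
        [420.0, 430.0, np.nan, 450.0],
        [450.0, 460.0, 475.0, 480.0],
    ])
    print(impute_cutoffs(toy))
```

A missing cell can only be predicted this way when its province and its year each appear in at least one observed cell.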
As indicated earlier, the range and the distribution of the NCEE scores vary significantly across years, and in some years, even across provinces within the same year. In addition, the cutoff scores also differ across provinces and years. It is, therefore, important to standardize the NCEE scores so that our forcing variable is comparable across different years and different provinces. To do so, we take residuals from a linear regression of raw scores on a full set of the provincial and year dummy variables, plus a dummy variable indicating whether a province was using standardized scores in a particular year.

5 Fuzzy RD results – LATE

5.1 Validity of the RD design

Before presenting our fuzzy RD results (LATE), it is important to conduct the validity tests for the RD design. The most important assumption underlying the validity of the RD design is that each individual cannot exercise precise control over the forcing variable around the cutoff point. Although this assumption cannot be directly tested (Lee and Lemieux, 2009), it is difficult to imagine that individuals have precise control over the test scores around the cutoff point, based on our description of the Chinese National College Entrance Examination and the Chinese College Admission system. This is mainly because the cut-offs are determined after the NCEE is finished each year. However, because our data on NCEE scores are collected retrospectively through individual self-reporting rather than through administrative records, it is possible that individuals have forgotten what their original scores were and reported them based on their knowledge of the cutoff scores.14 If this is the case, our estimation may suffer from a violation of this important assumption. Fortunately, there are two implicit features of the RD underlying assumption that may be testable. First, if individuals do not have precise control over the forcing variable around the cutoff point, the density of the forcing variable should not exhibit any discontinuity around the cutoff. Second, the means of the baseline covariates should be continuous at the cutoff. Below, we test these two implications.

14 This issue is similar to the misreporting problem raised in Lemieux and Milligan (2008).

We adopt a test suggested by McCrary (2008) to examine whether the density of the forcing variable exhibits any discontinuity around the cutoff. A jump in the density at the cutoff is direct evidence of some degree of sorting around the threshold, and should cast serious doubt on the appropriateness of the RD design (Lee and Lemieux, 2009). The t test proposed by McCrary (2008) cannot reject the hypothesis that the density distribution is continuous around the cutoff at the 5% significance level for both males (t = 1.44) and females (t = 1.64), though for females the test result is marginal.15

15 The figure presenting the density distribution of the normalized difference taken between each individual’s raw NCEE score is available upon request from the authors.

To test whether the conditional means of the observable characteristics are continuous at the cutoff, we first present a group of graphs to show that the outcome and treatment variables are discontinuous at the cutoff (Figures 3 and 4) but all the other covariates are not (Figure 5). The plots in these figures are non-parametric predictions from a local polynomial smoother and the dotted lines mark the 95% confidence interval. The figures are plotted for the positive and negative normalized test scores, separately. Figures 3 and 4 show a very clear discontinuity of the outcome (log annual earnings) and treatment (having a four-year university degree) variables at the cutoff point for both male and female samples. Figure 4 also reveals that we do not have a sharp discontinuity, but rather a fuzzy one for both males and females.
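The test scores plotted in these figures rest on the standardization described in Section 4: residuals from a linear regression of raw scores on province dummies, year dummies, and an indicator for standardized scoring. A minimal Python sketch of that residualization follows. The function name residualize_scores and the toy inputs are hypothetical, and the sketch is only one way to implement the regression; it does not reproduce the authors' code.

```python
import numpy as np


def residualize_scores(raw, province, year, standardized):
    """Residuals from OLS of raw scores on province dummies, year dummies,
    and a standardized-score indicator (illustrative sketch)."""
    raw = np.asarray(raw, dtype=float)
    columns = [np.ones_like(raw)]  # intercept
    for codes in (np.asarray(province), np.asarray(year)):
        levels = np.unique(codes)
        # One dummy per level, dropping the first level as the reference.
        for level in levels[1:]:
            columns.append((codes == level).astype(float))
    columns.append(np.asarray(standardized, dtype=float))
    X = np.column_stack(columns)
    # lstsq returns a least-squares fit even if some columns are collinear.
    beta, *_ = np.linalg.lstsq(X, raw, rcond=None)
    return raw - X @ beta


if __name__ == "__main__":
    # Hypothetical records: raw score, province, exam year, standardized flag.
    raw = [450, 470, 430, 455, 620, 640]
    province = ["A", "A", "B", "B", "C", "C"]
    year = [1990, 1991, 1990, 1991, 1990, 1991]
    standardized = [0, 0, 0, 0, 1, 1]
    print(residualize_scores(raw, province, year, standardized))
```

By construction the residuals average to zero within each province and within each year, which is what makes scores from different cells comparable.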
Figure 5 tests whether the conditional means of the baseline covariates included in our RD regression (age, father’s years of schooling, and whether the household consumption level was high relative to the local average at the time the individual graduated from senior high school) jump discontinuously at the cutoff point. The top and bottom panels present the figures for the male and female samples, respectively. The figure shows that none of the variables is statistically significantly different at the cutoff for either sample. The slight difference in age for the male and female samples and the difference in father’s years of schooling for the male sample are all within the 95% confidence interval.

More formally, following Lee and Lemieux (2009), we also estimate Seemingly Unrelated Regressions for the three covariates to test whether they are jointly significantly different on the two sides of the cutoff point. Two sets of results are reported in Table 2: one regresses the covariates on the dummy variable that indicates eligibility for university, and the other regresses them on the dummy variable for university, using eligibility as the IV. We observe no statistically significant difference between the two sides of the cutoff in any of the individual regressions, and the χ² tests likewise fail to reject the null hypothesis that the covariates are jointly continuous at the cutoff (p-values between 0.19 and 0.92 in Table 2).

5.2 Estimation results

As discussed earlier, due to the fuzziness of our treatment, we employ the IV approach to estimate the local average treatment effect (LATE). The instrument used is the ‘eligibility’ dummy variable indicating whether an individual passed the cutoff score in the year and province where he/she participated in the NCEE. Our estimations compare the four-year university group with the control group, that is, the three-year college group.
The control variables include age, father’s years of schooling, a dummy variable indicating household consumption level at the time of high school graduation, and a vector of regional dummy variables which is used to capture regional cost of living differences. The flexible function of the forcing variable includes a fifth-order polynomial in the standardized NCEE scores.16 The results are presented in Table 3 for the male (left panel) and female (right panel) samples, using three-year colleges as the control group. Within each quadrant, we also present the results using the full sample and those using the sample with the optimal bandwidth of the forcing variable.

Before discussing the estimated results we first examine the results from the first stage and reduced form estimations. These results are presented in columns 1 and 2 of the left and right panels in Table 3. The results from the first stage estimation show that the instrument is very strong in all the cases, as indicated by the F-tests presented in the bottom row of each panel. All of them pass the rule-of-thumb test of F-statistics being greater than 10. The reduced form results all have the correct signs and are statistically significant. The IV results are reported in the last column of each quadrant. All the results are positive and statistically significant at the 1 to 5 per cent significance level.

Let us examine the results using three-year college as the control group first (Panel A). Relative to this control group, a four-year university degree provides about 45 and 49 per cent additional earnings to male and female individuals, respectively (coefficients of 0.455 and 0.490 in Table 3). These estimates are based on the full sample. However, as the basic idea of RD design is to evaluate the effect at the cutoff, it is important to choose the bandwidth of estimation so that it optimizes the tradeoff between precision and bias.
Following Lee and Lemieux (2009), we use the cross-validation (CV) method to estimate the optimal bandwidth for each subsample. The resulting estimates, using the three-year college as the control group, are presented at the bottom of Panel A. Compared with the full-sample results, the results using the optimal bandwidth change slightly: the RD-IV estimate for the male sample falls to 0.40 and that for the female sample rises to 0.60. These results rely more heavily on the information close to the cutoff points and hence are likely to be less biased.

16 We examine the robustness to various polynomial orders later in this section.

Considering that there is only a difference of one year of education between the treatment group (four-year university) and the control group (three-year college), the estimates presented above seem to be very large. Previous estimates for the return to one year of education have been much lower. For example, using a simple OLS estimation, Zhang et al. (2005) report that the average rate of return to an additional year of schooling in urban China is around 10 per cent in 2001, which is less than one quarter of our estimate. To further illustrate the difference between our fuzzy-RD estimates and the OLS estimates, we estimate the OLS regression for the same treatment and control groups, using our data as well as the data from the National Bureau of Statistics (NBS) Urban Household Income and Expenditure Survey for the year 2004 (the same year as our data). We find that the returns to a four-year university degree relative to a three-year college degree for males and females are 26 and 34 per cent, respectively, using our own data; and 22 and 27 per cent, respectively, using the NBS data. These estimates are around half of what we estimated using the fuzzy-RD design.17

5.3 Robustness tests

We conduct several robustness tests. First, we test whether imposing different functional forms on the function of the forcing variable in Equation (1a) makes a difference.
We use polynomials of the forcing variable from the first to the eighth order. The results are presented in the first panel of Table 4. We find that the estimated coefficients change only slightly across specifications, especially those for the male sample, which seem to stabilize at the fifth or higher orders of polynomial specifications. For the female sample, the change is more obvious, but not large enough to cause any concern.

The second panel of Table 4 presents results excluding all the individuals whose year-province cutoff scores are missing but were previously included using predicted values extrapolated from the available cutoff data for other year-province cells. These results are very close to the IV results presented in Table 3.

We also test the robustness of our results using a subsample for which information on hours worked is available. In the survey only the household heads and spouses are asked about the number of days per week and the number of hours per day they worked in 2004. The last panel of Table 4 presents the results using log hourly earnings as well as log annual earnings as dependent variables, based on the consistent sub-sample of individuals who reported information on hours worked. We find that with the restricted sample, the estimated returns using log annual earnings are reduced somewhat relative to the full sample shown in Table 3. However, on average, the result using log hourly earnings seems to suggest a higher return to a four-year university degree for both males and females when compared to the three-year college group.

17 The regression using our own data includes the same covariates, whereas the regression using the NBS data includes age and its squared term, the provincial dummy variables and an indicator for having a four-year university degree. The results using our own data are presented in Table 6. The results using the NBS data are available upon request from the authors.
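In a just-identified IV model with the same controls in both stages, the IV coefficient equals the reduced-form coefficient divided by the first-stage coefficient, so the columns of Table 3 can be cross-checked against one another. The short Python sketch below does this with the coefficients copied from Table 3. The function name wald_ratio is introduced here for illustration, and small gaps between the ratio and the reported estimate are to be expected from rounding in the published coefficients.

```python
# Coefficients copied from Table 3: (first stage, reduced form, reported RD-IV).
TABLE3 = {
    ("male", "full"): (0.295, 0.134, 0.455),
    ("female", "full"): (0.350, 0.171, 0.490),
    ("male", "trimmed"): (0.288, 0.115, 0.400),
    ("female", "trimmed"): (0.333, 0.201, 0.604),
}


def wald_ratio(first_stage, reduced_form):
    """IV estimate implied by a just-identified model: reduced form / first stage."""
    return reduced_form / first_stage


if __name__ == "__main__":
    for (sample, window), (fs, rf, iv) in TABLE3.items():
        implied = wald_ratio(fs, rf)
        print(f"{sample:6s} {window:8s} implied={implied:.3f} reported={iv:.3f}")
```

The implied and reported estimates agree to within rounding error in all four cells.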
6 More discussion on LATE

The preceding section presented the local average treatment effect (LATE) of four-year university education relative to three-year college education on earnings for the compliers. As indicated in Oreopoulos (2006), the average treatment effect (ATE) for the population is also important, as it offers a theoretically more stable parameter than the LATE when considering potential gains for anyone receiving university education. In this section, we examine whether the LATE can carry over to the whole population.

Following Imbens and Wooldridge (2007), we calculate the following proportions, where the subscripts a, n and c denote always-takers, never-takers and compliers, and the superscripts 1 and 0 denote eligible and ineligible individuals:
(i) those who went to university but were not eligible (observed always-takers) out of total ineligibles ($\pi_a^{0}$);
(ii) those who did not go to university but were eligible (observed never-takers) out of total eligibles ($\pi_n^{1}$);
(iii) those who went to university and were eligible (including compliers and unobserved always-takers) out of total eligibles ($\pi_c^{1} + \pi_a^{1}$); and
(iv) those who did not go to university and were not eligible (including compliers and unobserved never-takers) out of total ineligibles ($\pi_c^{0} + \pi_n^{0}$).
The fact that eligibility status is random implies that $\pi_a^{1} = \pi_a^{0}$ and $\pi_n^{0} = \pi_n^{1}$. Thus we can calculate the proportions of eligible and ineligible compliers ($\pi_c^{1}$ and $\pi_c^{0}$). Using these calculated proportions we then calculate the unconditional average earnings for the eligible compliers, ineligible compliers, observed always-takers and observed never-takers.18 These calculated results for the full samples and the samples with the optimal bandwidths, using the three-year college as the control group, are reported in Table 5.

The results in Table 5 show that in almost all the cases, and in all the samples, the difference in payoff between the compliers and never-takers and between the compliers and always-takers is quite large.
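The compliance-type calculation described above can be sketched in a few lines of Python. The counts and means below are hypothetical, the function names are introduced here for illustration, and the mixture formula for the treated-complier mean is the standard decomposition under random eligibility and monotonicity rather than a transcription of the authors' code.

```python
def compliance_shares(n_treated_ineligible, n_ineligible,
                      n_untreated_eligible, n_eligible):
    """Shares of always-takers, never-takers and compliers.

    Always-takers are revealed among ineligibles, never-takers among eligibles;
    random eligibility lets each share stand for the whole sample.
    """
    always = n_treated_ineligible / n_ineligible
    never = n_untreated_eligible / n_eligible
    compliers = 1.0 - always - never
    return always, never, compliers


def treated_complier_mean(mean_treated_eligible, mean_always, always, compliers):
    """Mean outcome of treated compliers.

    Treated eligibles are a mixture of compliers and always-takers, so the
    complier mean is backed out from the mixture mean.
    """
    mixture_weight = compliers + always
    return (mixture_weight * mean_treated_eligible - always * mean_always) / compliers


if __name__ == "__main__":
    # Hypothetical counts: 100 ineligibles (28 treated), 100 eligibles (26 untreated).
    always, never, compliers = compliance_shares(28, 100, 26, 100)
    print(always, never, compliers)
    # Hypothetical mean log earnings for treated eligibles and always-takers.
    print(treated_complier_mean(9.90, 9.76, always, compliers))
```

The untreated-complier mean follows symmetrically from the untreated ineligibles and the never-takers.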
For example, for the male full sample, the average log earnings for eligible compliers is 10.01, while for always-takers it is 9.76, which is 24 per cent lower. Similarly, for ineligible compliers the average log earnings is 9.45, while for never-takers it is 9.71. These differences indicate that the effects are likely to be heterogeneous and, hence, that the estimated LATE for the compliers is less likely to carry over to the non-compliers.

Another important point to note is that the proportions of compliers in our male and female samples are 45 and 48 per cent, respectively. These are quite large proportions of the population, and hence the LATE estimates should have relatively general implications. In particular, our LATE estimates carry some policy implications for the potential effect of the post-1999 university expansion in China, which allows individuals who otherwise would have failed to acquire a university degree to obtain one. Our estimates suggest that at the cut-off point the four-year university degree brings a 40 to 60 per cent increase in earnings relative to the three-year college group. These estimates are most informative for individuals whose scores are around the admission thresholds and who are most likely to have been affected by the university expansion policy. That being said, we must acknowledge that our results may not carry full weight in predicting the possible effect of the 1999 university expansion program, as most students admitted after the university expansion had not yet entered the labor market in 2004.

18 For the detailed method of these calculations, see Imbens and Wooldridge (2007).

7 Conclusions

Exploiting the special feature of the Chinese University Admission system and a unique data set that provides individuals’ NCEE scores, we have estimated the local average treatment effect (LATE) of university education on earnings using a fuzzy RD design.
The empirical results suggest that the average return to obtaining a four-year university degree for the compliers is 40 and 60 per cent for the male and female samples, respectively, using the three-year college group as the control group. These estimates are much larger than the rate of return to university education reported in the existing literature for urban China for a similar period. Further investigation in the paper indicates that in our sample a relatively large proportion of individuals are compliers (45 per cent for males and 48 per cent for females). Thus, the LATE estimated in this paper should have relatively general implications. We also find that the average earnings for the always-takers and never-takers are very different from those of the compliers, indicating heterogeneous treatment effects across different compliance types.

Given that few studies apply the RD design to estimating the returns to education, this paper contributes to the literature by using it to evaluate the returns to higher education. Empirically, the LATE carries valuable implications for the effects of university expansion in China, which exhibits an increasing trend of annual enrollment at universities from the late 1970s to the 1990s, followed by a drastic three-fold jump since 1999. The LATE estimates offer solid evidence on the earnings effect on individuals whose scores are around the admission thresholds and who are most likely to have been affected by the university expansion.

References

[1] Acemoglu, Daron and Joshua Angrist (2001), "How large are human capital externalities? Evidence from compulsory schooling laws", NBER Macroeconomics Annual, 15, 9-59.
[3] Angrist, Joshua D. (1998), "Estimating the labor market impact of voluntary military service using social security data on military applicants", Econometrica, 66(2), 249-288.
[4] Angrist, Joshua D. and Alan Krueger (1991), "Does compulsory school attendance affect schooling and earnings?", The Quarterly Journal of Economics, 106(4), 979-1014.
[5] Ashenfelter, Orley and Alan Krueger (1994), "Estimates of the economic return to schooling from a new sample of twins", American Economic Review, 84, 1157-1174.
[6] Behrman, Jere, Mark Rosenzweig and Paul Taubman (1994), "Endowments and the allocation of schooling in the family and in the marriage market: The twins experiment", Journal of Political Economy, 102, 1131-1174.
[7] Bingley, Paul, Kaare Christensen and Vibeke Myrup Jensen (2009), "Parental schooling and child development: learning from twin parents", The Danish National Centre for Social Research Working Paper 07: 2009.
[8] Black, Dan A. and Jeffrey A. Smith (2004), "How robust is the evidence on the effects of college quality? Evidence from matching", Journal of Econometrics, 121, 99-124.
[9] Bound, J. and G. Solon (1999), "Double trouble: On the value of twins-based estimation of the return to schooling", Economics of Education Review, 18(2), 169-182.
[10] Bound, John, David A. Jaeger and Regina M. Baker (1995), "Problems with instrumental variables estimation when the correlation between the instruments and the endogeneous explanatory variable is weak", Journal of the American Statistical Association, 90(430), 443-450.
[11] Card, D. (1995), "Using geographic variation in college proximity to estimate the return to schooling", in L. N. Christofides, E. K. Grant, and R. Swidinsky (Eds.), Aspects of labour market behaviour: Essays in honour of John Vanderkamp (pp. 201-222). Toronto, Buffalo and London: University of Toronto Press.
[12] China Ministry of Education, 2006, http://www.moe.edu.cn/edoas/website18/52/info25052.htm
[13] Dale, S. and A. Krueger (2002), "Estimating the payoff of attending a more selective college: an application of selection on observables and unobservables", Quarterly Journal of Economics, 117(4), 1491-1527.
[14] Fang, Cheng, Xiaobo Zhang and Shenggen Fan (2002), "Emergence of urban poverty and inequality in China: evidence from household survey", China Economic Review, 13(4), 430-443.
[15] Gibson, John, Jikun Huang and Scott Rozelle (2003), "Improving estimates of inequality and poverty from urban China’s household income and expenditure survey", Review of Income and Wealth, 49(1), 53-68.
[16] Hahn, Jinyong, Petra Todd and Wilbert Van der Klaauw (2001), "Identification and estimation of treatment effects with a regression-discontinuity design", Econometrica, 69(1), 201-209.
[17] Han, Tong, Eric J. Wailes and Gail L. Cramer (1995), "Rural and urban data collection in the People’s Republic of China", in The China Market Data and Information Systems, Proceedings of WRCC-101 Symposium, Washington, D.C.
[18] Harbin Institute of Technology, 2008, http://news.hit.edu.cn/articles/2008/06-21/06154802.htm
[19] Heckman, James J. and Richard Robb Jr. (1985), "Alternative methods for evaluating the impact of interventions: an overview", Journal of Econometrics, 30, 239-267.
[20] Heckman, James and Xuesong Li (2004), "Selection bias, comparative advantage and heterogeneous returns to education: evidence from China in 2000", Pacific Economic Review, 9(3), 155-171.
[21] Isacsson, Gunnar (1999), "Estimates of the return to schooling in Sweden from a large sample of twins", Labour Economics, 6, 471-489.
[22] Lee, D. and T. Lemieux (2009), "Regression discontinuity designs in economics", NBER Working Paper 14723.
[23] Leigh, A. and C. Ryan (2008), "Estimating returns to education using different natural experiment techniques", Economics of Education Review, 27(2), 149-160.
[24] Lochner, Lance and Enrico Moretti (2004), "The effect of education on crime: evidence from prison inmates, arrests, and self-reports", American Economic Review, 94(1), 155-189.
[25] McCrary, Justin (2008), "Manipulation of the running variable in the regression discontinuity design: a density test", Journal of Econometrics, 142, 698-714. [26] Meng, Mingyi, Yi, Guohua, Xue, Yipeng, Qi, Lin, Xu, Zhong, Liu, Ziqiang, Xia, Guangjin,(1988), The Complete Collection of China’s College Entrance Examination (Zhong Guo GaoKao Da Quan), Jilin People’s Press, Changchun. [27] Meng Xin, Robert Gregory and Youjuan Wang (2005), "Poverty, inequality, and growth in urban China, 1986-2000", Journal of Comparative Economics, 33(4), 710-729. [28] Meng, Xin, Kailing Shen and Xue Sen (2009), "Economic reform, education expansion, 75 and earnings inequality for urban males in China, 1988-2007", Unpublished manuscript, Canberra: Australian National University. [29] Miller, Paul, Charles Mulvey and Nick Martin (1995), "What do twins studies reveal about the economic returns to education? a comparison of Australian and U.S. findings", American Economic Review, 85(3), 586-599. [30] Neumark, D. (1999), "Biases in twin estimates of the return to schooling", Economics of Education Review, 18(2), 143-148. [31] Oreopoulos, Philip (2006), "Estimating average and local average treatment effects of education when compulsory schooling laws really matter", American Economic Review, 96(1), 152-175. [32] Zhang, Junsen, Yaohui Zhao, Albert Park and Xiaoqing Song (2005), "Economic returns to schooling in urban China, 1988 to 2001", Journal of Comparative Economics, 33. 730-750. 76   77 78 79 Table 1- Statistical Descriptions Total Males Females 4-Year University (treatment group) Mean Std.Dev. Mean Std.Dev. Mean Std.Dev. Ln(annual earnings) 9.82 0.64 9.87 0.65 9.75 0.62 Age 34.74 7.42 36.20 7.51 32.36 6.63 Dummy for males 0.62 Dummy for Han ethnicity 0.95 0.95 0.94 Father’s years of schooling 9.03 4.65 8.30 4.68 10.21 3.99 a Dummy for high family cnsmpt. 
level 0.12 0.09 0.16 b Dummy for quality of the SHS : best 0.47 0.49 0.44 NCEEc test score 469.18 89.86 467.73 90.92 471.53 88.24 No. of observations 702 435 267 3-Year College (Control group) Mean Std.Dev. Mean Std.Dev. Mean Std.Dev. Ln(annual earnings) 9.51 0.65 9.59 0.63 9.40 0.66 Age 34.82 7.04 35.81 7.27 33.45 6.47 Dummy for males 0.58 Dummy for Han ethnicity 0.96 0.97 0.94 Father’s years of schooling 8.71 4.55 8.21 4.69 9.39 4.26 Dummy for high family cnsmpt. levela 0.09 0.10 0.08 Dummy for quality of the SHSb: best 0.31 0.31 0.31 c NCEE test score 423.24 86.99 419.74 89.20 428.07 83.76 No. of observations 693 402 291 Note: a: Dummy for family consumption level being high relative to the level in the city the respondent lived at the time graduation. b. Senior high school. c.National College Entrance Examination 80 Table 2- Validity tests for joint significance of baseline covariates Male sample Female Sample SUR: Age Father years High Age Father years High of HH of HH schooling cnsmpt. schooling cnsmpt. eligible 0.246 -0.066 -0.013 -0.492 -0.381 0.042 [0.053] [0.371] [0.022] [0.636] [0.491] [0.028] 5 order polynomial terms of Yes Yes Yes Yes Yes Yes forcing variables Provincial fixed effect Yes Yes Yes Yes Yes Yes Observations 1255 1255 1255 1059 1059 1059 R-squared 0.05 0.05 0.02 0.09 0.06 0.03 2 Chi -test on joint significance Chi2=0.53,prob>chi2=0.92 Chi2=4.78, prob>chi2=0.19 3SLS Age Father years High Age Father years High 4-year Univ (Eligibility as IV) of HH of HH schooling cnsmpt. schooling cnsmpt. 
0.800 -0.214 -0.043 -1.307 -1.014 0.113 [1.807] [1.206] [0.072] [1.659] [1.136] [0.074] 5 order polynomial terms of Yes Yes Yes Yes Yes Yes forcing variables Provincial fixed effect Yes Yes Yes Yes Yes Yes Observations 1255 1255 1255 1059 1059 1059 R-squared 0.04 0.05 0.01 0.13 0.02 0.04 2 Chi -test on joint significance Chi2=0.53,prob>chi2=0.91 Chi2=4.80, prob>chi2=0.19 Note: Standard errors in brackets 81 Table 3- Results from the fuzzy regression Discontinuity Design 4-year University vs. 3-year College Male sample Female Sample st 1 stage Reduced RD-IV 1st stage Reduced RD-IV form form Full sample 4-year univ. degree 0.455** 0.490** [0.177] [0.198] Eligibility 0.295*** 0.134*** 0.350*** 0.171** [0.042] [0.052] [0.053] [0.070] Observations 837 837 837 557 557 557 R-squared 0.233 0.338 0.290 0.282 F-test for instrument 48.11 42.69 Trimmed sample (with optimal bandwidth) 4-year univ. degree 0.400** 0.604*** [0.184] [0.217] Eligibility 0.288*** 0.115** 0.333*** 0.201*** [0.045] [0.053] [0.056] [0.072] Observations 723 723 723 469 469 469 R-squared 0.201 0.337 0.286 0.249 F-test for 41.82 35.26 instrument Note: Other control variables included are: age and its squared term, father's years of schooling, dummy for high level household dummies, and 5 order of polynomial of standardized test scores. Standard errors in brackets, * significant at 10%; ** significant at 5%; *** significant at 1% 82 Table 4- Sensitivity tests for RD regressions 1. functional form test linear Quadratic Cubic 4th order 5th order 6th order 7th order 8th order Univ. vs. College 0.386** 0.401** 0.415** 0.407** 0.455** 0.460*** 0.449** 0.457** (males) [0.153] [0.156] [0.167] [0.167] [0.177] [0.177] [0.184] [0.183] Univ. vs. College 0.421*** 0.433** 0.421** 0.462** 0.490** 0.489** 0.528** 0.561** (females) [0.152] [0.170] [0.187] [0.195] [0.198] [0.198] [0.212] [0.219] 2. 
Excluding predicted cutoffs Univ v.s college (males) Univ v.s college (females) 4-year university degree 0.485*** 0.475** [0.182] [0.215] Observations 775 521 3.sample with hourly earnings Hourly earnings Annual earnings Males Females Males Females 4-year university degree 0.484** 0.360* 0.345* 0.301 [0.188] [0.196] [0.176] [0.185] Observations 691 432 752 674 Note: Standard errors in brackets, * significant at 10%; ** significant at 5%; *** significant at 1% 83 Table 5: Estimated proportion and average log earnings for different compliance types Male sample Female sample Full sample proportions Always takers (1) 0.28 0.23 Never takers (2) 0.26 0.29 Compliers (3) 0.45 0.48 Average age log earnings Treated always takers (4) 9.76 9.58 Untreated never takers (5) 9.71 9.46 Treated compliers (6) 10.01 9.96 Untreated compliers (7) 9.45 9.33 (6)-(4) 0.24 0.38 (7)-(5) -0.26 -0.13 Optimal bandwidth sample proportions Always takers (1’) 0.31 0.26 Never takers (2’) 0.28 0.28 Compliers (3’) 0.41 0.46 Average age log earnings Treated always takers (4’) 9.78 0.57 Untreated never takers (5’) 9.70 9.44 Treated compliers (6’) 9.94 9.91 Untreated compliers (7’) 9.47 9.42 (6’)-(4’) 0.15 0.34 (7’)-(5’) -0.23 -0.03 Note: (i) Under the monotonicity assumption, there is no defier. The sample, therefore, is comprised of three groups ‐ the compliers, the never‐takers, and the always‐takers. (ii) The randomness of the instrumental variable implies that the ratio of eligible individuals to ineligible individuals is constant across the three groups. Thus, the proportion of always‐takers, either (1) or (1'), is calculated as ineligible individuals who went to university (their non‐compliance is thus revealed) out of total ineligible individuals. Similarly, the proportion of never‐takers, either (2) or (2'), is calculated as eligible individuals who did not go to university (their non‐compliance is thus revealed) out of total eligible individuals. 
Finally, the proportion of compliers, either (3) or (3'), is calculated as one minus the proportions of always-takers and never-takers. (iii) Using these proportions and the formula provided by Imbens and Wooldridge (2007), we are able to calibrate the expected log earnings for the four groups listed in (4) to (7) (and (4') to (7')).

A Data Appendix

This appendix provides detailed information on how the cut-offs of the NCEE scores were collected. There is no central source for a complete data set of the cut-offs, so we collected these data from multiple scattered sources, a tedious and time-consuming task. Despite our extensive search, the resulting collection is incomplete. We therefore also explain in this appendix how we impute predicted values for the missing cutoff data.

A.1 Data sources

The data were retrieved from three sources: (1) published books, monographs and theses; (2) archives of local newspapers and periodicals of the provinces and cities that participated in the 2004 Urban Residents Education and Employment survey; (3) websites, such as the web pages of China Education Online. The complete list of these materials, except the newspapers, is provided in the references at the end of this Appendix.19 In a limited number of cases the recorded cut-offs from different sources are not consistent. We therefore prepare two versions of the cutoff points: version A is based on Meng et al. (1988), while version B is based on the data reported in newspapers. The following points present the details of missing values and some inconsistencies in the cutoff data.

(1) In each province-year cell, there are in general six different cutoffs. First, there are three levels of cutoffs: 1) admissions into first-tier universities; 2) second-tier universities; 3) three-year colleges.
Then, at each of these three levels, there are two different cut-offs: one for admissions into the humanities and social sciences stream and the other for the sciences stream. Approximately eight percent of the cut-offs for the second-tier universities are missing, while the missing values amount to 57% for three-year colleges and 43% for the first-tier universities.

(2) In Shandong province in 1991 and 1992, different major cities announced their own cutoff points. We use the average of these city-level cut-offs as the provincial figure.

(3) In 1991 and 1992 there were three examination papers for Sciences in Hunan, Hainan and Yunnan provinces, and the corresponding cutoff points are inconsistent. In these cases, again, we use average values.

(4) The NCEE scores in some provinces and in some years are standardized: Guangdong (1988-2006), Shaanxi (1994-2001), Fujian (1997-2001), Hainan (1993-2009), Shandong (1996-2000), Henan (1994-2000) and Guangxi (1996-2004). The basic idea of standardization is to re-scale the raw scores according to some presumed distribution. The cut-offs for standardized scores are usually much higher than for the raw scores, as indicated in Figure 2.

19 The long list of references to the newspaper issues from which we obtained our information is available upon request from the authors.

A.2 Imputing the missing cut-offs

We use existing data to impute missing values of the second-tier university cut-offs for the purpose of this paper. The basic idea of this imputation is to use within-province variation over time and within-year variation across provinces to extrapolate the missing cutoff scores. The strategy is detailed as follows.

Let $CL^{2}$ denote the cut-offs for second-tier universities, and let $\widehat{CL}^{2}$ denote the corresponding predicted values. The model used to predict the missing values is:

$$CL^{2}_{pt} = \alpha + \beta P_{p} + \gamma T_{t} + \delta S_{pt} + \varepsilon_{pt} \qquad \text{(A1)}$$

where the subscripts $p$ and $t$ represent province and year, and $S_{pt}$ indicates whether scores are standardized. Thus, $P_{p}$, $T_{t}$ and $S_{pt}$ are three sets of dummy variables for province, year and standardized scores, and $\varepsilon_{pt}$ is the error term. We run this regression using non-missing data, and use the predicted values for the missing cutoff scores.

There are many more missing values in the cutoff lines for first-tier universities and three-year colleges. Therefore, the loss of accuracy may be substantial if we rely on Equation (A1) to impute these missing cut-offs. An alternative method that can enhance the accuracy is to use Equation (A1) with the dependent variable replaced by $(CL^{1} - CL^{2})$ or $(CL^{c} - CL^{2})$, where $CL^{1}$ and $CL^{c}$ indicate the cut-offs for the first-tier universities and three-year colleges, respectively. In this case, the predicted cutoff points for first-tier universities or three-year colleges are simply the sum of the predicted dependent variable and $CL^{2}$ (or $\widehat{CL}^{2}$ if $CL^{2}$ is missing). This method should be an improvement on Equation (A1) because it imposes useful information: the relationship that $CL^{1}$ is greater than or equal to $CL^{2}$, and $CL^{c}$ is smaller than or equal to $CL^{2}$.

Due to the large number of missing values, the above model may generate unreasonable predicted cut-offs. In these cases we correct the values using linear interpolation and other methods.

A.3 References

1981, 1981 Nian Gao Kao Bu Fen Sheng Shi Zui Di Lu Qu Xian, Zhong Xue Jiao Yan, 6, 22.
1982, 22 Sheng Shi 1982 Nian Gao Kao Lu Qu Fen Shu Xian, Zhong Xue Jiao Yan, 5, 24.
Admission Office of Harbin Engineering University 1988, Bao Kao Da Xue Zhi Nan, Harbin Engineering University Press, Harbin.
Chen, Dahui and Zhang, Falu 1988, Fujian Sheng Gao Xiao Zhao Sheng Tian Bao Zhi Yuan Zhi Nan, Fujian Education Press, Fuzhou.
China Education Online 2006, http://www.eol.cn/include/cer.net/gaokao/zhuanti/2006_fenshuxian.shtml.
Gao, Yucun 1991, Gao Kao Da Xue Zhi Yuan Zhi Dao, Tianjin Academy of Social Sciences Publishing House, Tianjin.
Hebei Jiao Yu Kao Shi Yuan 2007, Hebei Gao Kao 30 Nian, Social Sciences Academic Press (China), Beijing.
Jiangsu Sheng Gao Xiao Zhao Sheng Ban Gong Shi 1992, Gao Xiao Bao Kao Zhi Dao, Phoenix Science Press, Nanjing.
Li, Lifeng 2006, ‘The Researches into the Regional Equity of Entrance Examination in China (Wo Guo Gao Xiao Zhao Sheng Kao Shi Zhong De Qu Yu Gong Ping Yan Jiu)’, PhD thesis, Xiamen University, Xiamen.
Meng, Mingyi; Yi, Guohua; Xue, Yipeng; Qi, Lin; Xu, Zhong; Liu, Ziqiang; and Xia, Guangjin 1988, The Complete Collection of China’s College Entrance Examination (Zhong Guo Gao Kao Da Quan), Jilin People’s Press, Changchun.
Wang, Xiyu 1991, Gao Kao Zhi Dao, Sichuan People’s Publishing House, Chengdu.
Yan, Mingming 1998, Hubei Gao Xiao Zhao Sheng 20 Nian, China University of Geosciences Press, Wuhan.
Zhong Guo Zhao Sheng Lian Meng 2006, http://www.zgedu.net/kspd/gk/gd.asp
Zhong, Shurong 1993, Gao Kao Zhi Nan, Jiangxi High Education Press, Nanchang.

Chapter Three

Once A Loser, Always A Loser? Evidence from the Football League in England

Better to be the head of a dog than the tail of a lion
--Old Saying

1. Introduction

A growing literature finds that adverse labor market conditions affect the quality of job opportunities and therefore have both short-term and long-term effects on workers' well-being. For example, a vast literature documents the negative long-term effect of adverse initial labor market conditions on the earnings of college graduates. Oyer (2006, 2008) finds such evidence among MBA graduates and Ph.D. economists. Kahn (2010) finds similar evidence among college graduates in the 1982 recession, and Genda, Kondo, and Ohta (2010) support the findings by a comparison of U.S.
and Japanese college graduates. Ellwood (1982), Beaudry and DiNardo (1991), Baker, Gibbs, and Holmstrom (1994) and Devereux (2003) find persistent effects of cyclical fluctuations for non-college workers.

For workers who have already acquired specific skills and/or experience, a related branch of the literature studies the costs of labor market adjustment to external factors such as trade, immigration, innovations in labor demand and environmental policies. Henderson (1996) and Greenstone (2002) find that production is typically reallocated away from newly regulated industries, which creates a broad set of private and social costs. Walker (2012) documents the effect of job loss caused by the 1990 Clean Air Act Amendments on the long-term wages of workers with industry-specific skills and/or experience.1

Oreopoulos, von Wachter and Heisz (2012) summarize four potential mechanisms through which adverse labor market conditions could affect the long-term income of the workers exposed to them. First, workers could be "misplaced" at a lower-quality firm, which offers limited opportunities for promotion and training. Second, within a typical firm, there is a wage adjustment: workers are paid less than in their more productive years. Third, the lower-quality firm could send a negative signal and affect potential employers' perception of the workers' productivity in the long term. Finally, the lower-quality firm could have a negative effect on workers' human capital accumulation.

Prior research on adverse labor market conditions faces the challenge of disentangling these channels. In particular, it is hard to isolate the effect on workers' welfare of being “randomly” misplaced at a low-quality firm.
Because labor market conditions are likely to affect many firms and even many industries, it is difficult, if not impossible, to detect the effect on workers' long-term well-being when a bad shock hits a single firm.

1 Borjas and Ramey (1995), Menezes-Filho and Muendler (2007), Artuc, Chaudhuri, McLaren (2010), Ebenstein, Harrison, McMillan, and Phillips (2011), Feenstra (2010), Topalova (2010), Autor, Dorn, and Hanson (2011), and Dix-Carneiro (2011) study the impact of international trade on labor markets.

In this paper, I study the effect of a low-quality firm or working environment on workers' long-term well-being in the soccer industry, using the unique setting of relegation in the English league system. Each year, the three lowest-ranked teams are relegated to a lower-level league for the next season, which is generally perceived as a career setback for the players on those teams. The remaining teams stay in the league. I collect longitudinal information on the players from the team ranked third from the bottom (and relegated) and the team ranked fourth from the bottom in the English Premier League from 1991 to 2002. Following the idea of Regression Discontinuity (Greenstone, Hornbeck and Moretti, 2010), the teams ranked fourth from the bottom form a natural control group for the teams that ranked third from the bottom and were relegated. The rank of the teams captures the heterogeneity of the players, but the pre-test shows that players' characteristics are well balanced between the team ranked fourth from the bottom and the team ranked third from the bottom. Given the fierce competition in the English league, it is plausible that there is little room for manipulation of the final ranking. Thus, this design enables me to explore the causal effect of team relegation on the short-term and long-term well-being of the players on these two teams after the season of relegation.
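As a rough illustration of how the comparison pair is formed in a given season, the sketch below picks the lowest surviving team (the control) and the highest-ranked relegated team (the treated) from a final league table. The team names and points are invented, the function name `marginal_teams` is hypothetical, and ties are assumed away (real league tables break ties by further rules); it is not the procedure actually used to build the data.

```python
# Hypothetical final standings for one season: (team, points).
# Names and numbers are invented for illustration only.
standings = [
    ("Team A", 82), ("Team B", 75), ("Team C", 60), ("Team D", 52),
    ("Team E", 44), ("Team F", 41), ("Team G", 40), ("Team H", 33),
]


def marginal_teams(table, n_relegated=3):
    """Return (control, treated): the lowest team that stays up and
    the highest-ranked team that is relegated. Assumes no ties in points."""
    ranked = sorted(table, key=lambda row: row[1], reverse=True)
    treated = ranked[-n_relegated]        # third from the bottom: relegated
    control = ranked[-n_relegated - 1]    # fourth from the bottom: stays up
    return control, treated


control, treated = marginal_teams(standings)
gap = control[1] - treated[1]  # points margin at the relegation cutoff
print(control, treated, gap)
```

A small points gap between the two teams is what makes the comparison resemble a discontinuity design: the teams on either side of the cutoff finished the season almost level.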
According to the prior literature on adverse labor market conditions, team relegation should unambiguously hurt the players in both the short run and the long run. In the context of this paper, relegation sends a negative signal about a player's productivity, which lowers his chances of landing a good position in the future. A lower-ranked club also receives less funding and fewer of the other resources that help players' career development.

The literature on peers at work (Rasu 2004; Mas and Moretti, 2009) predicts an ambiguous sign of the effect. First, if the peers in a lower-ranked league are weaker, interacting and/or competing with less productive peers limits the externality from learning. This channel predicts a negative effect of relegation. Second, the incentives of players on the relegated team could be either reduced or increased by easier competition from the other teams at the same level of the league. The incentive is lower if the rival teams are so weak that little effort is needed to defeat them, but it could be higher if the chance of winning a game is slim in the first level of the league but greater in the second level.

I find that, compared to those on non-relegated teams, players on the relegated teams played in lower-ranked clubs in the first 3 years after relegation, but in higher-ranked clubs after that. In the long term, the relegated players also commanded higher transfer fees and made more appearances for the clubs where they played. The effect is more significant and larger in magnitude for players under 25 at the time of relegation. All of this evidence is consistent with the hypothesis that relegated players face weaker internal and external competition. Therefore, they are more likely to be top players and have more appearances.
Those who stayed in the first-level league might have had similar productivity, but were more prone to be bench warmers and so lost the opportunity to play and accumulate human capital, which had a negative impact on their long-term careers. More opportunities to play are critical for developing human capital, which benefits players in the long term and makes higher transfer fees and better placements possible. This channel should be especially effective for young players, who have a greater need to accumulate skills and experience. The evidence and the hypothesis illustrate how the working environment affects workers' long-term well-being through its impact on human capital accumulation in a high-turnover, highly competitive industry.

This hypothesis rests on two key assumptions. First, most players should not transfer to other clubs shortly after the relegation season. Second, the relegated club is less likely to recruit higher-quality players in the following seasons. The first assumption ensures that relegation affects players through their interaction with the team, for example through their appearances. The second assumption directly explains why players on the relegated teams have more appearances afterwards. In the background section, I discuss in greater detail that over 70% of the players stay during the first year after relegation, and at least 50% of the players stay during the first two years after relegation. Regarding the second assumption, the relegated teams recruit lower-quality players either because they face weaker rivals in the lower-level league and have no demand for high-quality players, or because they receive less funding after relegation and cannot afford high-quality players. I find that the transfer fees paid for new players are significantly higher for the non-relegated clubs during the first 3 years after the relegation season.
This study documents the effects of an exogenous adverse shock to firm quality in one unique industry. I recognize that the performance and market value of soccer players may not be of immediate scholarly concern, but I believe this study has methodological advantages and results that make it of broader interest. Another advantage of this study compared with other firm-level studies is that the quality of the firm, the opportunities within the firm, and the work performance and market value of the worker are easier to measure, as the following small table shows.

General labor market                   The soccer industry
Quality of the firm/Peers at work      Rank of the club
Productivity                           Appearances/Goals
Market value/Income                    Fees paid by clubs

Therefore, identifying the effects of an exogenous shock to the team on players' appearances, goals, the rank of the club they work for, and the fees paid by clubs is also useful because, as discussed above, these measures closely mirror those in more immediately relevant domains. European soccer also has significant economic impact: the television contracts for European soccer games are worth several billion dollars.

Similarly, a recent study by Kotchen and Potoski (2011) explores conflicts of interest using poll data collected from college football coaches participating in the USA Today Coaches Poll of the top 25 teams. They review the growing tradition of research that exploits the wealth of data and well-defined incentives often found in sports to investigate more general economic phenomena. These include studies on globalization and technological progress in track and field (Munasinghe, O'Flaherty, and Danninger, 2001), corruption in sumo wrestling (Duggan and Levitt, 2002), maximization behavior in football (Romer, 2006), racial discrimination in basketball (Price and Wolfers, 2010), game theory in chess (Levitt, List, and Sadoff, 2011), and many others.

2. Background

The English football league system is a series of interconnected leagues for football clubs in England. The system has a hierarchical format with promotion and relegation between leagues at different levels. Clubs at the top of their league can rise higher in the hierarchy, while those at the bottom go further down. The top division is the Premier League, which contains 20 clubs. The Football League has three divisions: The Championship (Level 2), League One (Level 3) and League Two (Level 4).2

Promotion and relegation allow the leagues to maintain a hierarchy of leagues and/or divisions. Through promotion and relegation, teams are transferred between two divisions based on their performance (points or relative rankings) in each season. The top teams in the lower division are promoted to the upper division, and at the same time, the worst-ranked teams in the higher division are relegated to the lower division.

Take the Premier League (Level 1) as an example. Each year the lowest three teams, ranked by points, are relegated to Level 2. The Premier League has 20 teams in total and the competition is fierce. It is hard to predict which team will be champion, and much harder to predict which teams will be relegated at the end of the season. According to the rules, the 4th lowest team stays in the league while the 3rd lowest team is relegated. However, these two teams are usually so close that the outcome is not known until the last round of the season. Figure 1 shows the points of four different groups of teams in each season over the period 1950-2010. Although the gap in points between the champion and the lowest team is big, the gap between the 4th lowest team and the 3rd lowest team is very small. This indicates that the 3rd lowest team narrowly lost out, which means that which of the two teams is relegated is close to random.
[Figure 1: Points of Various Teams Over Time in the Premier League. X axis: season, 1950-2010; Y axis: points. Series: champion, 4th lowest team, 3rd lowest team, 1st lowest team.]

2 Clubs outside this group are referred to as 'non-League' clubs.

Relegation has a big impact on the relegated teams, since there is a large and growing disparity between the Premier League and the Championship. Since the Premier League began as the FA Premier League at the start of the 1992-93 season, its member teams have received larger amounts of money in TV rights than their Football League counterparts. Prior to the formation of the Premier League, television revenues from top-flight matches were shared between the 92 Football League clubs across 4 unified national professional divisions. The breakaway of 22 clubs to form the Premier League resulted in top-flight revenues being shared exclusively between Premier League clubs. Thus the Premier League clubs invest much more money in ground improvements and the player transfer market than their Championship counterparts.

Once teams are relegated, they can fight to be promoted in the next season. If they succeed, they play in the higher league after their promotion. However, it is not that easy. Figure 2 shows that on average it takes the just-relegated teams at least three years to reach the same level as the just-not-relegated teams.

[Figure 2: Percentage of Playing in the Premier League of Two Groups. Series: just relegated, just not relegated, differences.]

Note: X axis is the normalized year; Y axis is the percentage of playing in the Premier League.
For now, I focus only on the teams around the cutoff in the Premier League (Level One in the English football league system) over the period 1950-2010. In each year I include only two teams: the team just above the cutoff, which was not relegated, and the team just below the cutoff, which was relegated. The teams just above the cutoff form the group shown in blue in the figure; the teams just below the cutoff form the group shown in red. I normalize the year of the just-relegated/just-not-relegated outcome to time zero. Then I calculate the percentage of teams playing in the Premier League (tier one) in each normalized year for each group. I also calculate the difference between the two groups.

Football players on the relegated teams are also affected by the relegation. If they could transfer to another club similar to the one they worked for before relegation in a very short time, the costs should be negligible. However, immediate transfers are unlikely for three reasons. First, it takes time for players to find another job, especially because the skills of soccer players are not easily substitutable between positions. Second, relegation is generally regarded as a setback for the players on the team as a group. It sends a negative signal about the players' productivity and affects the perception of potential employers. Finally, players are constrained by the contracts signed with their teams. On average, a contract lasts for 4 years, and the new club must pay a transfer fee if the transfer occurs before the end of the term. Figure 3 shows that 70% of the football players stay with the same team 12 months after relegation. Two years later the percentage drops to around 50%. Hence most of the football players do not move shortly after relegation.
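The retention shares quoted above can be computed along the following lines. This is a minimal sketch on invented data: the tuple layout, the player labels and the function name `share_staying` are assumptions made for illustration, and a player is counted as staying at month m if he leaves the team strictly after month m.

```python
# Hypothetical spells: (player, relegated_team_flag, months_with_team_after_time_zero).
# All values are invented for illustration.
spells = [
    ("p1", 1, 30), ("p2", 1, 8), ("p3", 1, 14), ("p4", 1, 40),
    ("q1", 0, 10), ("q2", 0, 26), ("q3", 0, 60), ("q4", 0, 5),
]


def share_staying(rows, group, month):
    """Share of players in `group` (1 = just relegated, 0 = just not relegated)
    who are still with the same team `month` months after time zero."""
    members = [r for r in rows if r[1] == group]
    stayed = [r for r in members if r[2] > month]
    return len(stayed) / len(members)


for m in (12, 24):
    print(m, share_staying(spells, 1, m), share_staying(spells, 0, m))
```

Evaluating the function on a grid of months for each group would trace out one retention curve per group, which is the kind of series a figure of this type displays.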
[Figure 3: Percentage of Players' Staying in the Same Team. Series: just relegated, just not relegated.]

Note: X axis is the normalized month; Y axis is the percentage of players staying in the same team. The left red solid line indicates 12 months after the relegation; the right red solid line indicates 24 months after the relegation. Here I focus only on the teams around the cutoff in the Premier League (Level One in the English football league system) over the period 1991-2002. I include only the football players in two groups of teams: the teams just above the cutoff, which were not relegated, and the teams just below the cutoff, which were relegated. The teams just above the cutoff form the group shown in blue in the figure; the teams just below the cutoff form the group shown in red. I normalize the month of the just-relegated/just-not-relegated outcome to time zero. Then I calculate the percentage of players staying with the same team in the months after relegation for the two groups.

Since most of the players do not leave the club shortly after relegation, they are affected by the relegation of their teams in the short term. In addition, the relegation of their teams might also affect them in the long run.

3. Data

For the analysis of teams, I use club-level panel data from 1991 to 2002 that I collected from Wikipedia. I focus on the ranking history of the teams that were just relegated and those that were just not relegated over this period. For the analysis of the career development of players, I mainly use two data sources to explore my hypothesis.
First, I collect all the information available from the online data set Soccer Base to construct an unbalanced panel of career information for the players on the teams of interest. The advantage of this data set is the long span of history for each player. For each football player, it contains the date of birth, birth place, the teams and periods he played for in his career (including periods when he was loaned or swapped to other clubs), the rankings of the teams he played for, transfer fees, and appearances and goals for each team. The disadvantage is that players' performance, especially the key variable explored in this paper (players' appearances), is aggregated over the whole period during which the player belongs to a given club. For example, if a player worked for a club for 5 continuous years, Soccer Base only provides the total appearances and goals within those 5 years. Put differently, variation in performance (appearances and goals) is driven only by transfers across clubs. Using this data set, I show the negative relationship between appearances and the ranking of the club, which is consistent with my hypothesis and an important piece of evidence. I also use this data set for the main empirical analysis of the long-term placement of players, including the ranking of the club where he played and the transfer fee at each change in contract.

However, a critical part of the story is that players on non-relegated teams have fewer appearances than players on relegated teams while staying with the same club right after the relegation season, which hinders their human capital accumulation. With the aggregated data from Soccer Base, I cannot show any direct evidence of this. I therefore turn to the Player History data set. Player History aggregates players' performance to each season.
It does not contain detailed information about loans, swaps or transfers in the middle of a season, or about transfer fees, but it makes it possible to study the change in appearances for those who stayed with the club before and after relegation. I use this data set for the analysis of performance across seasons but within each club, which is direct evidence on the mechanism behind the long-term effects.

4. Empirical Strategy

4.1 The pre-test

My empirical strategy is a Differences-in-Differences approach. The "treatment" is defined as "relegation", and the sample is confined to players on the 3rd lowest and 4th lowest teams. The Differences-in-Differences approach is valid if the outcome variable for players on relegated and non-relegated teams would follow the same trend in the absence of relegation. This assumption is supported by the Regression Discontinuity design: the team just relegated and the team just not relegated are arguably homogeneous (Greenstone, Hornbeck and Moretti, 2010), which is a stronger condition than parallel trends. I run t-tests to check whether all the available player characteristics are balanced between relegated and non-relegated teams. The results are shown in Table 1.

4.2 Average Treatment Effect on the Treated

$$ y_{it} = \alpha + \sum_{j=-4}^{9} \beta_{j} \, R \cdot 1\{Year = j\}_{t} + \sum_{j=-4}^{9} \gamma_{j} \, 1\{Year = j\}_{t} + \delta R + \mu_{i} + \varepsilon_{it} $$

where i indexes the i-th player and t is the calendar year/month. The outcome variables y that I explore using data from Soccer Base include the ranking of the club he signed a contract with, the ranking of the club where he played (note that he can be loaned or swapped to clubs other than the contract club), and the transfer fee in each transaction. In these regressions, t is the calendar month, because each transaction is recorded to the month.
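To make the comparison behind the event-time interaction coefficients concrete, the sketch below computes a difference-in-differences of cell means on a toy panel: for each event year, the gap in mean outcomes between relegated and non-relegated players, minus the same gap in a baseline year. In a specification that is saturated in relegation status and event year, and that leaves out the player-specific term, the interaction coefficient for year j equals exactly this quantity. The data, the baseline year of -1 and the function name `event_study` are assumptions for illustration; player effects and standard errors are ignored, so this is not the estimation reported in the paper.

```python
from collections import defaultdict
from statistics import mean

# Toy observations: (relegated, event_year, outcome). All values are invented.
obs = [
    (1, -1, 1.0), (1, -1, 1.2), (0, -1, 1.0), (0, -1, 1.0),
    (1, 1, 2.0), (1, 1, 2.2), (0, 1, 1.1), (0, 1, 1.3),
    (1, 4, 2.4), (1, 4, 2.6), (0, 4, 2.9), (0, 4, 3.1),
]


def event_study(rows, baseline=-1):
    """Difference-in-differences of cell means for each non-baseline event year."""
    cells = defaultdict(list)
    for relegated, year, y in rows:
        cells[(relegated, year)].append(y)

    def gap(year):
        # Mean outcome of relegated minus non-relegated players in `year`.
        return mean(cells[(1, year)]) - mean(cells[(0, year)])

    years = sorted({year for _, year, _ in rows})
    return {year: gap(year) - gap(baseline) for year in years if year != baseline}


effects = event_study(obs)
for year, effect in effects.items():
    print(year, round(effect, 2))
```

If the outcome were the league level (a larger number meaning a lower league), a positive value shortly after time zero and a negative value in later years would correspond to playing at a lower level at first and at a higher level in the long run.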
When there is no transaction, swap, loan or other change in the club where he played, the performance measure is simply the average over the period that he belongs to the club. The results are shown in Tables 2-4.

The outcome variables that I use from Player History include appearances for players at all positions and goals for forwards. I confine the sample to those who stay with the same club to examine whether performance changes within a club. The results are to be added.

4.3 Heterogeneity across players

My story has another prediction. For younger players, human capital accumulation should have a greater effect, because their remaining professional life after the relegation season is longer than that of older players. This argument suggests a triple-difference strategy:

$$ y_{it} = \alpha + \sum_{j=-4}^{9} \beta_{j} \, Hetero_{i} \cdot R \cdot 1\{year = j\}_{t} + \sum_{j=-4}^{9} \gamma_{j} \, Hetero_{i} \cdot 1\{year = j\}_{t} + \sum_{j=-4}^{9} \theta_{j} \, 1\{year = j\}_{t} \cdot R + \lambda \, Hetero_{i} \cdot R + \delta R + \mu_{i} + \varepsilon_{it} $$

I use the median player age in the relegation year as a cutoff to generate a dummy indicator of whether a player has a longer remaining professional life. The results are shown in Tables 5-7.

5. Results and Interpretations

Table 1 shows that players' characteristics are not significantly different between the two groups, one formed by the teams just relegated and the other by the teams just not relegated. Before presenting regression results, I show figures drawn from the raw data as preliminary evidence.

Figure 4 shows the average league level of the teams the players are on over time. The two groups are indistinguishable most of the time before the relegation year, which is confirmed by Figure 5 (see footnote 3). In the first year after the relegation year, most players stay with their team, so there is a big gap between the two groups in league levels right after the relegation.
However, Figure 3 shows that around 30% of players leave the club in the first year after the relegation. This is also visible in Figure 4. The average league level gradually increases after the relegation year for the just-not-relegated group, which indicates that some players transfer to clubs in lower leagues even though their teams stay in the Premier League. In general, this happens when players do not get many appearances with their team and want to keep themselves active. At the end of the first season after the relegation season, the just-relegated group jumps down a bit and the just-not-relegated group jumps up a bit more. This is caused by the relegation and promotion of their teams. Since their teams were on the margin in the relegation year, they are likely to be on the margin one year later. Because most players stay with the team in the first two years, they are promoted or relegated along with their teams. In the second year after the relegation year, players from the just-not-relegated teams on average play at higher levels than players from the just-relegated teams. However, the gap is quite small in the second year after the relegation year. So in the short term, team relegation causes the players to play on lower-level teams than their counterparts.

[Figure 4: Average League Level of Players. Series: just relegated, just not relegated. X axis: normalized year, -8 to 8; Y axis: level.]

Note: X axis is the normalized year, with year 0 as the time when relegation happens. Y axis is the average league level of the teams the players are on. Here I focus only on the players on the teams around the cutoff in the Premier League (Level One in the English football league system) over the period 1991-2002.
I include only the football players in two groups of teams at the time of relegation: teams that were just above the cutoff and not relegated, and teams that were just below the cutoff and relegated. The teams just above the cutoff in this period form the group shown in blue in the figure; the teams just below the cutoff form the group shown in red. I normalize the month when the just-relegated/just-not-relegated outcome occurred to time zero. I then calculate the average league level of the teams the players are in for the months after relegation for the two groups. League levels are defined as follows: level 1: Premier League; level 2: Football League Championship; level 3: Football League One; level 4: Football League Two; level 5: Conference National; level 6: Conference North and Conference South; level 7: Northern Premier League Premier Division, Southern Football League Premier Division and Isthmian League Premier Division. For other levels, please see the appendix for details.

Footnote 3: The two groups are significantly different in the season right before the season when the relegation happens. That is because I have only twelve years of observations, so the sample size in terms of clubs is not large enough.

Figure 5: Differences in League Levels the Players Are In Over Time
[Chart with confidence bars. Y axis: difference in level (-1.5 to 1.5). X axis: normalized year (-8 to 8).]
Note: X axis is the normalized year, with year 0 as the time when relegation happens. Y axis measures the difference in the levels of the teams the players are in. The bars show the 95% confidence interval. For now, I focus only on players in teams that were around the cutoff in the Premier League (level one of the English football league system) over the period 1991-2002.
I include only the football players in two groups of teams at the time of relegation: teams that were just above the cutoff and not relegated, and teams that were just below the cutoff and relegated. The teams just above the cutoff in this period form the group shown in blue in the figure; the teams just below the cutoff form the group shown in red. I normalize the month when the just-relegated/just-not-relegated outcome occurred to time zero. I then calculate the average league level of the teams the players are in for the months after relegation for the two groups. League levels are defined as follows: level 1: Premier League; level 2: Football League Championship; level 3: Football League One; level 4: Football League Two; level 5: Conference National; level 6: Conference North and Conference South; level 7: Northern Premier League Premier Division, Southern Football League Premier Division and Isthmian League Premier Division. For other levels, please see the appendix for details.

From the end of the second year after the relegation year, the two lines in Figure 4 gradually rise, indicating that the players move to lower-level teams as time goes by. The surprising evidence appears at the end of the third year after the relegation season. From then on, the average league level at which players from the just-not-relegated group land becomes lower (the level number is larger) than that of players from the just-relegated group, and the gap keeps widening. So in the long run, team relegation seems to have a positive impact on team players. This is surprising because relegation, as an adverse shock to the firm (club), limits the players' access to media exposure, better facilities, and opportunities to play with and learn from the top teams.
However, I find an adverse long-run effect of staying in a higher-ranking team, which contradicts the conventional wisdom. A good test of whether this pattern is driven by the hypothesis of this paper is to confine the sample to younger cohorts. They have a longer professional life ahead, so any effect operating through human capital accumulation should be greater for them. Figures 6A-6C show an even stronger departure from the conventional wisdom. For players younger than 30, the pattern is similar to that in the overall sample, and the long-term positive effect of relegation is more pronounced for the younger players. For players older than 30, the pattern is the opposite: team relegation has an adverse effect on them both in the short term and in the long term.

Figure 6A: Average League Level of Players. Young players: younger than 25 in the relegation year
[Line chart. Y axis: Level (1 to 4.5). X axis: normalized year (-6 to 8). Series: Just Relegated, Just Not Relegated.]
Note: Same as Figure 4.

Figure 6B: Average League Level of Players. Players older than 25 but younger than 30 in the relegation year
[Line chart. Y axis: Level (1 to 4.5). X axis: normalized year (-8 to 8). Series: Just Relegated, Just Not Relegated.]
Note: Same as Figure 4.

Figure 6C: Average League Level of Players. Players older than 30 in the relegation year
[Line chart. Y axis: Level (1 to 5). X axis: normalized year (-8 to 8). Series: Just Relegated, Just Not Relegated.]
Note: Same as Figure 4.

The only plausible channel through which relegation affects human capital accumulation positively is that the relatively inexperienced young cohorts are more involved in games after relegation than they were before. Figures 6A-6C show that this effect is large enough to offset the negative effect predicted by the literature on adverse labor market conditions. Figures 7A and 7B show league appearances and their differences for the two groups.
The just-relegated group has more appearances a few years after the relegation season. Figures 7C and 7D confirm this and further show that the effect is larger for young players.

Figure 7A: Appearances of all players
[Line chart. Y axis: yearly appearances (15 to 27). X axis: normalized year (-5 to 6). Series: relegate, not relegate.]
Note: X axis is the normalized year, with year 0 as the time when relegation happens. Y axis is the average league appearances of players. For now, I focus only on players in teams that were around the cutoff in the Premier League (level one of the English football league system) over the period 1991-2002. I include only the football players in two groups of teams at the time of relegation: teams that were just above the cutoff and not relegated, and teams that were just below the cutoff and relegated. The teams just above the cutoff in this period form the group shown in blue in the figure; the teams just below the cutoff form the group shown in red. I normalize the month when the just-relegated/just-not-relegated outcome occurred to time zero.

Figure 7B: Difference in appearances for all players
[Chart with confidence bars. Y axis: difference in appearances (-10 to 8). X axis: normalized year (-5 to 6).]
Note: Same as Figure 7A. The bars show the 95% confidence interval.

Figure 7C: Difference in appearances for players under 25
[Chart. Y axis: difference in appearances (-15 to 15). X axis: normalized year (-5 to 6).]
Note: Same as Figure 7A.

Figure 7D: Difference in appearances for players under 25
[Chart. Y axis: difference in appearances (-15 to 15). X axis: normalized year (-5 to 6).]
Note: Same as Figure 7B.

Figure 8: Transfer Fee
[Line chart. Y axis: pounds (0 to 1,400,000). X axis: half-year period (-16 to 16). Series: not relegated, relegated.]
Note: The X axis is measured in half-years.
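The figures above rest on three simple steps: normalize each season by the relegation season, average the outcome within each group, and compare the gap between groups (overall, and by age). The following is a minimal sketch of those steps, not the code used for the figures. The record layout, the function names, and the toy numbers are hypothetical. The confidence interval is a plain normal approximation that ignores the player-level clustering used in the regression tables. The sign convention (relegated minus not relegated) and the choice of baseline year are assumptions. The last function is a raw-means analogue of the triple-difference idea in Section 4.3, not the regression itself.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance


def cells_by_event_year(records):
    """Group outcomes by (young, relegated, normalized year).

    Each record is a hypothetical tuple:
    (young, relegated, season, relegation_season, outcome).
    """
    cells = defaultdict(list)
    for young, relegated, season, relegation_season, outcome in records:
        event_year = season - relegation_season  # year 0 = relegation season
        cells[(young, relegated, event_year)].append(outcome)
    return cells


def gap_with_ci(cells, event_year, young=None, z=1.96):
    """Relegated-minus-not-relegated mean gap with a normal-approximation CI.

    young=None pools both age groups; True or False restricts to one group.
    Each group needs at least two observations for the variance.
    """
    def pick(relegated):
        ages = (True, False) if young is None else (young,)
        return [y for a in ages for y in cells.get((a, relegated, event_year), [])]

    treated, control = pick(True), pick(False)
    gap = mean(treated) - mean(control)
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return gap, gap - z * se, gap + z * se


def raw_triple_difference(cells, event_year, baseline_year):
    """(young gap - old gap) at event_year, net of the same contrast at baseline."""
    def contrast(year):
        young_gap = gap_with_ci(cells, year, young=True)[0]
        old_gap = gap_with_ci(cells, year, young=False)[0]
        return young_gap - old_gap

    return contrast(event_year) - contrast(baseline_year)


# Toy data (invented): yearly appearances, relegation season 2000.
records = [
    # young, relegated, season, relegation_season, appearances
    (True, True, 1995, 2000, 20), (True, True, 1995, 2000, 22),
    (True, False, 1995, 2000, 20), (True, False, 1995, 2000, 22),
    (False, True, 1995, 2000, 24), (False, True, 1995, 2000, 26),
    (False, False, 1995, 2000, 24), (False, False, 1995, 2000, 26),
    (True, True, 2003, 2000, 27), (True, True, 2003, 2000, 29),
    (True, False, 2003, 2000, 21), (True, False, 2003, 2000, 23),
    (False, True, 2003, 2000, 18), (False, True, 2003, 2000, 20),
    (False, False, 2003, 2000, 20), (False, False, 2003, 2000, 22),
]
cells = cells_by_event_year(records)
pooled_gap, ci_low, ci_high = gap_with_ci(cells, event_year=3)
ddd = raw_triple_difference(cells, event_year=3, baseline_year=-5)
```

In this toy example the pooled gap in year 3 is small relative to its interval, while the age-split contrast is large. This mirrors, in stylized form, why splitting by age can reveal an effect that the pooled comparison blurs.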
The pattern found for the ranking of the clubs that players sign with also appears in the transfer fee, as shown in Figure 8. At around the end of the third year, the transfer fee for players from the relegated teams exceeds that for players from the non-relegated teams. Figure 9 shows the significance of this difference. The large standard deviation of the transfer fee generates wide confidence intervals in Figure 9; only the difference at two and a half years is statistically significant.

Figure 9: Difference in Transfer Fee
[Chart. Y axis from -1,500,000 to 1,500,000. X axis from -16 to 16.]

I then group the data into 2-year periods in Figures 10 and 11 and confine the sample to players under age 25, the median player age in the sample. The magnitude of the difference appears greater, but the significance level does not improve much.

Figure 10: Transfer fee for under 25
[Line chart. Y axis: pounds (0 to 1,400,000). X axis: 2-year period (-3 to 5). Series: not relegated, relegated.]

Figure 11: Difference in Transfer Fee for under age 25
[Chart. Y axis from -1,000,000 to 1,500,000. X axis from -3 to 5.]

6. Concluding Remarks

The promotion and relegation design of the English Football League allows the teams that are just not relegated to serve as a valid control group for the teams that are just relegated. I find that team relegation, which is widely regarded as a setback for team players, has a negative impact on players in the short run but a positive impact in the long run. This is because players have more chances to appear on the field and thus accumulate more human capital, which is essential for their long-term career. This better-be-the-head-of-a-dog-than-the-tail-of-a-lion effect is more important for young players.

Table 1 - Player characteristics in the year when the team was on the edge of relegation or relegated.
Player characteristics                   Teams just     Teams just       Difference
                                         relegated      not relegated
Height                                   1.808          1.803            -.0047
                                         [.0036]        [.0038]          [.0052]
Weight                                   76.73          76.43            -0.296
                                         [.507]         [.421]           [.657]
Foreigner                                .377           .346             -.019
                                         [.0266]        [.0265]          [.038]
Age when the team is (not) relegated     26.05          26.05            .005
                                         [.27]          [.29]            [.395]
Years of experience before               7.357576       7.377778         -.020202
                                         [.3688691]     [.4017467]       [.5462223]
Ever been loaned to other club before    .230303        .2666667         -.0363636
                                         [.0328767]     [.0382017]       [.0501499]
Player's fee before                      244059.7       194463.5         49596.27
                                         [31947.88]     [28311.36]       [43469.08]
obs                                      340            330

Table 2  Difference-In-Difference estimation on the effect of relegation on the ranking of the club signed up for players in the relegated club at the season (1) (2) (3) (4) (5) (6) VARIABLES c_rank1 c_rank1 c_rank1 c_rank3 c_rank3 c_rank3 Relegate*Season -4 -0.094 -0.121 -0.067 -2.150 -2.724 -1.584 (0.112) (0.109) (0.102) (2.566) (2.505) (2.306) Relegate*Season -3 -0.128 -0.164 -0.129 -3.595 -4.400* -3.692 (0.118) (0.116) (0.112) (2.688) (2.625) (2.535) Relegate*Season -2 -0.100 -0.155 -0.080 -0.158 -1.411 0.107 (0.116) (0.113) (0.114) (2.639) (2.559) (2.537) Relegate*Season -1 0.175 0.104 0.171 2.025 0.460 1.853 (0.111) (0.110) (0.112) (2.522) (2.483) (2.495) Relegate*Season 0 0.057 -0.026 0.040 0.790 -0.999 0.421 (0.106) (0.105) (0.112) (2.409) (2.385) (2.505) Relegate*Season 1 0.487*** 0.413*** 0.510*** 5.078** 3.516 5.634** (0.111) (0.112) (0.118) (2.561) (2.552) (2.657) Relegate*Season 2 0.180 0.123 0.235* 1.288 0.106 2.404 (0.120) (0.120) (0.125) (2.803) (2.781) (2.854) Relegate*Season 3 -0.065 -0.107 0.044 -2.464 -3.360 -0.268 (0.132) (0.131) (0.135) (3.034) (3.002) (3.078) Relegate*Season 4 -0.179 -0.211 -0.067 -4.376 -5.087 -1.918 (0.148) (0.146) (0.151) (3.371) (3.337) (3.427) Relegate*Season 5 -0.235 -0.267* -0.146 -5.570 -6.266* -3.610 (0.156) (0.156) (0.160) (3.514) (3.494) (3.590) Relegate*Season 6 -0.333** -0.359** -0.202 -7.296* -7.865** -4.254 (0.166) (0.166) (0.172) (3.732)
(3.723) (3.832) Relegate*Season 7 -0.378** -0.384** -0.214 -8.498** -8.559** -4.624 (0.172) (0.172) (0.173) (3.848) (3.831) (3.860) Relegate*Season 8 -0.437** -0.452** -0.244 -11.514*** -11.793*** -7.368* (0.181) (0.180) (0.182) (4.060) (4.032) (4.071) Relegate*Season 9 -0.406** -0.437** -0.186 -10.321** -10.924*** -5.383 (0.194) (0.191) (0.190) (4.287) (4.226) (4.189) Relegate*Season 10 -0.210 -0.230 0.014 -5.261 -5.575 -0.240 (0.212) (0.208) (0.199) (4.761) (4.651) (4.503) Season -4 -0.298*** -0.268*** -0.335*** -6.339*** -5.647*** -7.142*** (0.092) (0.090) (0.080) (2.067) (2.000) (1.776) Season -3 -0.328*** -0.289*** -0.356*** -7.160*** -6.248*** -7.775*** (0.094) (0.091) (0.085) (2.141) (2.064) (1.933) Season -2 -0.462*** -0.408*** -0.548*** -11.219*** -9.964*** -13.035*** (0.094) (0.091) (0.089) (2.161) (2.068) (2.008) Season -1 -0.714*** -0.645*** -0.799*** -15.204*** -13.666*** -17.067*** (0.090) (0.089) (0.087) (2.046) (2.006) (1.947) Season 0 -0.927*** -0.847*** -1.015*** -14.973*** -13.208*** -16.981*** (0.086) (0.086) (0.088) (1.960) (1.940) (1.966) Season 1 -0.751*** -0.680*** -0.883*** -10.908*** -9.388*** -13.966*** (0.091) (0.092) (0.093) (2.052) (2.049) (2.059) 106    Season 2 -0.293*** -0.240** -0.457*** -4.288* -3.162 -8.000*** (0.096) (0.096) (0.096) (2.204) (2.188) (2.181) Season 3 0.015 0.053 -0.219** 1.908 2.718 -3.286 (0.105) (0.104) (0.102) (2.409) (2.376) (2.337) Season 4 0.290** 0.318*** 0.029 7.706*** 8.301*** 1.855 (0.117) (0.115) (0.116) (2.661) (2.625) (2.629) Season 5 0.417*** 0.444*** 0.172 10.035*** 10.609*** 4.446 (0.121) (0.121) (0.122) (2.721) (2.699) (2.727) Season 6 0.576*** 0.595*** 0.298** 13.583*** 13.945*** 7.172** (0.129) (0.129) (0.130) (2.885) (2.881) (2.913) Season 7 0.775*** 0.780*** 0.462*** 18.173*** 18.194*** 10.959*** (0.130) (0.130) (0.125) (2.909) (2.904) (2.795) Season 8 0.884*** 0.894*** 0.548*** 21.530*** 21.660*** 13.803*** (0.137) (0.136) (0.130) (3.055) (3.039) (2.888) Season 9 0.918*** 0.942*** 0.570*** 
21.888*** 22.218*** 13.640*** (0.149) (0.148) (0.137) (3.249) (3.225) (2.984) Season 10 1.031*** 1.033*** 0.689*** 23.529*** 23.294*** 15.311*** (0.157) (0.155) (0.141) (3.437) (3.380) (3.051) relegate 0.067 1.956 (0.102) (2.364) Constant 2.056*** 2.079*** 2.195*** 33.370*** 34.138*** 36.814*** (0.085) (0.048) (0.040) (1.939) (1.113) (0.892) F.E. No Club f.e. Player f.e. No Club f.e. Player f.e. Se clustered at player Yes Yes Yes Yes Yes Yes level Observations 113,022 113,022 113,022 113,080 113,080 113,080 R-squared 0.124 0.145 0.415 0.110 0.136 0.411 Note: c_rank1 is the league level of the team the player is contracted with, c_rank3 is the overall ranking in the whole England Football League of the team the player is contracted with. Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1 107    Table 3- Difference-In-Difference estimation on the effect of relegation on the ranking of club playing (Including the cases of loan and swap) for players in the relegated club at the season (1) (2) (3) (4) (5) (6) VARIABLES p_rank1 p_rank1 p_rank1 p_rank3 p_rank3 p_rank3 ryr23 -0.093 -0.118 -0.056 -2.096 -2.631 -1.352 (0.108) (0.106) (0.100) (2.481) (2.431) (2.271) ryr24 -0.139 -0.172 -0.133 -3.909 -4.643* -3.847 (0.114) (0.113) (0.110) (2.606) (2.559) (2.497) ryr25 -0.121 -0.173 -0.081 -0.717 -1.890 -0.015 (0.112) (0.110) (0.111) (2.543) (2.482) (2.475) ryr26 0.116 0.048 0.122 0.615 -0.883 0.642 (0.108) (0.107) (0.109) (2.438) (2.416) (2.430) ryr27 0.007 -0.073 0.000 -0.383 -2.117 -0.492 (0.103) (0.103) (0.109) (2.354) (2.335) (2.434) ryr28 0.402*** 0.332*** 0.431*** 3.346 1.888 4.047 (0.108) (0.109) (0.114) (2.493) (2.491) (2.563) ryr29 0.165 0.111 0.225* 0.963 -0.123 2.229 (0.117) (0.117) (0.121) (2.728) (2.716) (2.775) ryr30 -0.080 -0.119 0.033 -2.908 -3.708 -0.618 (0.128) (0.128) (0.131) (2.956) (2.938) (2.987) ryr31 -0.233 -0.262* -0.114 -5.637* -6.254* -2.982 (0.144) (0.143) (0.147) (3.285) (3.265) (3.337) ryr32 -0.292* -0.322** -0.198 -6.826** 
-7.474** -4.758 (0.153) (0.153) (0.156) (3.444) (3.437) (3.505) ryr33 -0.374** -0.399** -0.246 -8.280** -8.801** -5.308 (0.164) (0.164) (0.168) (3.671) (3.670) (3.754) ryr34 -0.398** -0.403** -0.236 -9.250** -9.289** -5.468 (0.171) (0.171) (0.171) (3.813) (3.802) (3.796) ryr35 -0.467*** -0.480*** -0.272 -12.218*** -12.446*** -8.065** (0.179) (0.179) (0.179) (4.014) (3.989) (3.997) ryr36 -0.449** -0.479** -0.233 -11.293*** -11.867*** -6.492 (0.191) (0.189) (0.185) (4.224) (4.161) (4.092) ryr37 -0.247 -0.268 -0.026 -6.202 -6.515 -1.236 (0.213) (0.209) (0.197) (4.775) (4.663) (4.458) yr23 -0.289*** -0.261*** -0.336*** -6.185*** -5.517*** -7.177*** (0.088) (0.086) (0.078) (1.971) (1.917) (1.729) yr24 -0.318*** -0.282*** -0.356*** -6.903*** -6.047*** -7.739*** (0.090) (0.088) (0.083) (2.048) (1.992) (1.889) yr25 -0.448*** -0.397*** -0.552*** -10.884*** -9.687*** -13.092*** (0.090) (0.087) (0.086) (2.058) (1.989) (1.947) yr26 -0.663*** -0.596*** -0.761*** -13.983*** -12.497*** -16.145*** (0.086) (0.086) (0.084) (1.955) (1.935) (1.875) yr27 -0.857*** -0.779*** -0.958*** -13.584*** -11.863*** -15.954*** (0.083) (0.083) (0.084) (1.889) (1.877) (1.888) yr28 -0.681*** -0.615*** -0.830*** -9.479*** -8.066*** -12.954*** (0.087) (0.088) (0.088) (1.966) (1.973) (1.949) yr29 -0.278*** -0.229** -0.459*** -3.949* -2.906 -8.066*** (0.092) (0.092) (0.092) (2.115) (2.114) (2.095) yr30 0.038 0.073 -0.212** 2.429 3.156 -3.143 108    (0.101) (0.101) (0.098) (2.324) (2.308) (2.239) yr31 0.316*** 0.340*** 0.039 8.286*** 8.805*** 2.053 (0.112) (0.112) (0.111) (2.571) (2.552) (2.529) yr32 0.449*** 0.475*** 0.188 10.662*** 11.204*** 4.727* (0.117) (0.117) (0.117) (2.636) (2.630) (2.623) yr33 0.597*** 0.614*** 0.308** 14.092*** 14.421*** 7.440*** (0.125) (0.126) (0.126) (2.801) (2.809) (2.807) yr34 0.786*** 0.790*** 0.459*** 18.557*** 18.565*** 11.059*** (0.128) (0.129) (0.122) (2.869) (2.873) (2.717) yr35 0.897*** 0.906*** 0.546*** 21.822*** 21.922*** 13.778*** (0.135) (0.135) (0.126) (3.009) 
(2.999) (2.805) yr36 0.946*** 0.970*** 0.586*** 22.494*** 22.823*** 14.026*** (0.146) (0.145) (0.133) (3.186) (3.159) (2.874) yr37 1.050*** 1.052*** 0.696*** 23.987*** 23.760*** 15.491*** (0.157) (0.154) (0.137) (3.410) (3.351) (2.970) relegate 0.093 0.000 0.000 2.605 0.000 0.000 (0.098) (0.000) (0.000) (2.275) (0.000) (0.000) Constant 2.062*** 2.099*** 2.223*** 33.535*** 34.660*** 37.509*** (0.080) (0.047) (0.039) (1.840) (1.080) (0.868) F.E. No Club f.e. Player f.e. No Club f.e. Player f.e. Se clustered at player Yes Yes Yes Yes Yes Yes level Observations 112,907 112,907 112,907 112,987 112,987 112,987 R-squared 0.118 0.139 0.411 0.106 0.130 0.407 Note: p_rank1 is the league level of the team the player plays, p_rank3 is the overall ranking in the whole England Football League of the team the player plays. Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1 109    Table 4  Difference-In-Difference estimation on the effect of relegation on the transfer fee for players in the relegated club at the season. 
(1) (2) (3) VARIABLES fee_pound fee_pound fee_pound Relegate*Season -4 -48,407.177 -54,363.095 -295,124.153 (140,078.780) (140,172.362) (213,075.683) Relegate*Season -3 -82,677.053 -76,472.486 -265,909.875 (131,331.597) (127,038.489) (191,687.705) Relegate*Season -2 14,541.054 16,912.936 -46,487.679 (113,227.529) (111,555.667) (131,505.001) Relegate*Season -1 -37,341.910 -48,576.166 -138,731.819 (174,583.523) (172,264.787) (247,725.277) Relegate*Season 0 -59,308.761 -211.932 -147,310.248 (112,229.513) (110,831.593) (155,562.698) Relegate*Season 1 310,060.860* 260,744.033 83,967.831 (184,266.562) (177,518.608) (181,085.527) Relegate*Season 2 -180,997.077 -233,821.726* -354,786.711** (123,433.656) (127,034.759) (161,678.502) Relegate*Season 3 -59,203.310 -70,007.515 -63,641.260 (111,532.839) (113,629.212) (122,012.637) Relegate*Season 4 -111,566.301 -120,516.895 -323,435.679 (217,302.434) (215,477.001) (236,702.864) Relegate*Season 5 257,807.959 239,246.635 180,373.068 (242,884.018) (237,414.380) (228,568.433) Relegate*Season 6 146,866.165 143,371.901 -12,849.853 (283,689.014) (272,994.684) (238,768.606) Relegate*Season 7 260,721.252 246,985.656 131,414.907 (242,084.795) (242,659.982) (213,617.193) Relegate*Season 8 172,471.458 190,129.559 -24,178.822 (185,333.221) (177,373.157) (137,994.897) Relegate*Season 9 248,341.556 291,105.844 25,735.687 (242,833.961) (239,398.718) (186,480.490) Relegate*Season 10 -384,006.926 -316,833.964 -112,553.754 (250,832.246) (255,371.252) (194,334.784) Season -4 302,655.196*** 270,612.027*** 347,926.050*** (98,961.569) (100,581.024) (114,411.748) Season -3 325,425.072*** 298,192.202*** 290,759.724** (93,983.294) (92,869.998) (113,995.467) Season -2 263,014.488*** 264,778.369*** 375,451.400*** (85,983.306) (84,243.746) (86,865.806) Season -1 485,938.243*** 475,687.515*** 475,220.552*** (102,222.820) (100,506.659) (135,187.658) Season 0 362,143.056*** 317,312.394*** 461,161.849*** (77,622.852) (75,065.235) (89,392.113) Season 1 
290,878.566** 293,597.123*** 416,695.715*** (114,082.040) (113,015.486) (119,658.973) Season 2 229,715.684*** 248,350.394*** 397,530.896*** 110    (86,280.958) (87,071.799) (100,185.899) Season 3 128,565.272 105,137.822 193,870.432** (82,108.363) (80,757.726) (88,883.205) Season 4 257,421.041* 245,275.623* 454,144.911** (146,532.159) (147,596.780) (177,724.815) Season 5 72,820.794 72,404.838 80,851.418 (83,088.794) (81,429.447) (63,317.855) Season 6 150,164.462 139,077.677 316,785.052** (119,988.963) (123,484.879) (135,552.850) Season 7 13,710.675 -21,364.189 73,025.992 (73,964.436) (73,681.385) (95,649.491) Season 8 -29,898.764 -68,490.367 20,770.957 (52,428.141) (63,178.213) (80,734.551) Season 9 3,521.848 -51,739.273 86,492.395 (82,939.034) (88,557.423) (81,975.384) Season 10 215,296.041 178,381.757 57,926.470 (249,180.556) (251,569.021) (182,956.974) relegate 58,048.022* (34,761.060) Constant 124,703.959*** 167,916.117*** 146,520.902*** (21,591.902) (18,408.953) (26,641.436) F.E. No Club f.e. Player f.e. Se clustered at player Yes Yes Yes level Observations 4,165 4,165 4,165 R-squared 0.023 0.053 0.424 Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1 111    Table 5 Triple Difference estimation of the relegation effect on ranking of club signed up among cohorts under 25 in relegation season. 
(1) (2) (3) (4) (5) (6) VARIABLES c_rank1 c_rank1 c_rank1 c_rank3 c_rank3 c_rank3 age<25*Relegate*Season -4 0.593** 0.540** 0.467** 14.075** 12.827** 11.107** (0.255) (0.247) (0.226) (5.668) (5.507) (5.003) age<25*Relegate*Season -3 0.635** 0.566** 0.472* 13.338** 11.618** 9.425* (0.263) (0.256) (0.241) (5.841) (5.715) (5.415) age<25*Relegate*Season -2 0.711*** 0.626** 0.600*** 17.766*** 15.612*** 14.728*** (0.251) (0.245) (0.231) (5.570) (5.410) (5.114) age<25*Relegate*Season -1 1.196*** 1.095*** 1.047*** 25.574*** 23.173*** 21.812*** (0.239) (0.235) (0.223) (5.219) (5.136) (4.871) age<25*Relegate*Season 0 1.246*** 1.137*** 1.107*** 22.519*** 20.053*** 19.216*** (0.227) (0.223) (0.211) (5.013) (4.923) (4.667) age<25*Relegate*Season 1 1.083*** 0.980*** 1.004*** 18.249*** 15.942*** 16.475*** (0.231) (0.227) (0.214) (5.157) (5.065) (4.765) age<25*Relegate*Season 2 0.678*** 0.588** 0.637*** 13.714** 11.648** 12.649*** (0.237) (0.232) (0.216) (5.342) (5.242) (4.882) age<25*Relegate*Season 3 0.388 0.317 0.452** 5.977 4.343 7.382 (0.250) (0.245) (0.228) (5.649) (5.540) (5.201) age<25*Relegate*Season 4 0.069 0.010 0.136 0.497 -0.835 2.213 (0.263) (0.256) (0.242) (5.920) (5.793) (5.526) age<25*Relegate*Season 5 -0.069 -0.124 -0.041 -4.079 -5.295 -3.290 (0.276) (0.269) (0.255) (6.196) (6.048) (5.824) age<25*Relegate*Season 6 -0.233 -0.274 -0.270 -8.222 -9.092 -8.738 (0.290) (0.282) (0.272) (6.441) (6.275) (6.091) age<25*Relegate*Season 7 -0.984*** -1.029*** -0.981*** -24.372*** -25.369*** -24.026*** (0.308) (0.301) (0.291) (6.812) (6.670) (6.456) age<25*Relegate*Season 8 -1.235*** -1.284*** -1.174*** -27.687*** -28.741*** -26.299*** (0.349) (0.342) (0.334) (7.978) (7.811) (7.697) age<25*Relegate*Season 9 -1.187*** -1.256*** -1.190*** -27.002*** -28.371*** -26.772*** (0.365) (0.357) (0.342) (8.151) (7.946) (7.626) age<25*Relegate*Season 10 -1.385*** -1.432*** -1.362*** -28.659*** -29.369*** -27.756*** (0.408) (0.398) (0.361) (9.539) (9.254) (8.584) F.E. No Club f.e. 
Player f.e. No Club f.e. Player f.e. Se clustered at player level Yes Yes Yes Yes Yes Yes Observations 113,022 113,022 113,022 113,080 113,080 113,080 R-squared 0.115 0.139 0.414 0.105 0.132 0.412 Note: c_rank1 is the league level of the team the player is contracted with, c_rank3 is the overall ranking in the whole England Football League of the team the player is contracted with. Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1 112    Table 6 Triple Difference estimation of the relegation effect on ranking of club playing at among cohorts under 25 in relegation season (1) (2) (3) (4) (5) (6) VARIABLES p_rank1 p_rank1 p_rank1 p_rank3 p_rank3 p_rank3 age<25*Relegate*Season -4 0.472** 0.422* 0.374* 11.605** 10.460** 9.167* (0.236) (0.231) (0.221) (5.290) (5.189) (4.900) age<25*Relegate*Season -3 0.612** 0.548** 0.472** 13.177** 11.594** 9.738* (0.245) (0.242) (0.236) (5.509) (5.454) (5.348) age<25*Relegate*Season -2 0.643*** 0.564** 0.584*** 16.401*** 14.406*** 14.455*** (0.233) (0.230) (0.224) (5.182) (5.104) (4.977) age<25*Relegate*Season -1 1.079*** 0.984*** 0.953*** 23.176*** 20.950*** 19.884*** (0.220) (0.220) (0.215) (4.853) (4.850) (4.738) age<25*Relegate*Season 0 1.157*** 1.052*** 1.033*** 20.916*** 18.557*** 18.010*** (0.210) (0.208) (0.201) (4.666) (4.636) (4.485) age<25*Relegate*Season 1 0.968*** 0.869*** 0.906*** 16.021*** 13.858*** 14.682*** (0.213) (0.212) (0.203) (4.800) (4.770) (4.566) age<25*Relegate*Season 2 0.669*** 0.585*** 0.651*** 13.754*** 11.867** 13.238*** (0.219) (0.217) (0.207) (4.966) (4.936) (4.705) age<25*Relegate*Season 3 0.353 0.289 0.440** 5.347 3.883 7.248 (0.231) (0.229) (0.219) (5.259) (5.222) (5.005) age<25*Relegate*Season 4 0.004 -0.049 0.089 -1.088 -2.254 1.034 (0.245) (0.242) (0.234) (5.567) (5.513) (5.356) age<25*Relegate*Season 5 -0.167 -0.219 -0.124 -6.013 -7.120 -4.870 (0.260) (0.256) (0.248) (5.884) (5.803) (5.685) age<25*Relegate*Season 6 -0.320 -0.358 -0.346 -9.889 -10.667* -10.152* (0.275) (0.270) 
(0.266) (6.143) (6.042) (5.959) age<25*Relegate*Season 7 -1.065*** -1.107*** -1.045*** -26.179*** -27.087*** -25.541*** (0.295) (0.290) (0.285) (6.557) (6.477) (6.327) age<25*Relegate*Season 8 -1.285*** -1.330*** -1.206*** -28.661*** -29.613*** -26.943*** (0.338) (0.333) (0.330) (7.750) (7.637) (7.599) age<25*Relegate*Season 9 -1.246*** -1.313*** -1.239*** -28.242*** -29.535*** -27.843*** (0.354) (0.348) (0.337) (7.921) (7.768) (7.523) age<25*Relegate*Season 10 -1.447*** -1.493*** -1.398*** -29.966*** -30.630*** -28.526*** (0.401) (0.393) (0.357) (9.403) (9.162) (8.492) F.E. No Club f.e. Player f.e. No Club f.e. Player f.e. Se clustered at player level Yes Yes Yes Yes Yes Yes Observations 112,907 112,907 112,907 112,987 112,987 112,987 R-squared 0.108 0.131 0.409 0.099 0.126 0.407 Note: p_rank1 is the league level of the team the player plays, p_rank3 is the overall ranking in the whole England Football League of the team the player plays. Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1 113    Table 7 Triple Difference estimation of the relegation effect on transfer fee among cohorts under 25 in relegation season (1) (2) (3) VARIABLES fee_pound fee_pound fee_pound age<25*Relegate*Season -4 -537,865.761*** -532,392.290*** -1301477.203*** (197,263.728) (200,150.205) (385,572.617) age<25*Relegate*Season -3 -73,410.739 -213,095.742 -380,337.653 (184,952.077) (173,408.090) (328,982.604) age<25*Relegate*Season -2 -529,809.606*** -550,160.374*** -584,822.014** (191,917.954) (188,855.190) (241,245.266) age<25*Relegate*Season -1 -535,185.690* -586,156.495* -687,670.938 (317,082.159) (305,535.816) (449,316.247) age<25*Relegate*Season 0 -238,845.084 -328,343.463* -430,939.267 (183,574.969) (189,405.161) (287,684.460) age<25*Relegate*Season 1 646,300.470* 513,492.815 318,807.613 (351,402.802) (344,487.115) (351,196.143) age<25*Relegate*Season 2 213,043.944 9,663.478 119,986.228 (243,047.043) (251,322.137) (334,220.145) age<25*Relegate*Season 3 234,745.604 
217,581.985 134,775.665 (192,300.767) (202,481.431) (229,120.358) age<25*Relegate*Season 4 45,727.574 1,812.783 -164,714.130 (374,508.744) (365,227.761) (428,284.936) age<25*Relegate*Season 5 763,362.032* 706,905.520 853,736.897* (449,040.441) (442,076.030) (445,074.299) age<25*Relegate*Season 6 544,188.871 486,899.282 419,704.043 (452,492.142) (427,342.827) (383,881.710) age<25*Relegate*Season 7 693,400.538* 684,316.125** 595,684.160* (366,163.101) (348,447.375) (337,404.914) age<25*Relegate*Season 8 473,438.955* 491,634.439* 304,601.883 (251,527.356) (258,878.016) (224,598.848) age<25*Relegate*Season 9 602,480.152* 645,287.124** 584,889.985** (319,952.230) (311,265.179) (268,126.546) age<25*Relegate*Season 10 -241,042.561 -269,678.753 8,628.498 (315,789.745) (322,827.013) (245,167.788) F.E. No Club f.e. Player f.e. Se clustered at player level Yes Yes Yes Observations 4,165 4,165 4,165 R-squared 0.043 0.072 0.443 Robust standard errors in parentheses *** p<0.01, ** p<0.05, * p<0.1 114    References: Baker, George, Michael Gibbs, and Bengt Holmstrom. 1994. ‘The Wage Policy of a Firm.’ Quarterly Journal of Economics, 109: 881-91 Bandiera, Oriana, Iwan Barankay and Imran, Rasul, “Social Connections and Incentives: Evidence From Personnel Data,” Econometrica, July 2009, 77: 1047-94. Bandiera, Oriana, Iwan Barankay and Imran, Rasul, “Social Incentives in the Workplace,” Review of Economic Studies, April 2010, Vol. 77, Issue. 2: 417-59. Beaudry, Paul and John DiNardo, 1991, “The Effect of Implicit Contracts on the Movements of Wages over the Business Cycle”, Journal of Political Economy: 99(4): 665-688. Devereux, Paul. 2003. ‘The Importance of Obtaining a High-Paying Job.’ Mimeo, University of California, Los Angeles. Ellwood, David. 1982. "Teenage Unemployment: Permanent Scars or Temporary Blemishes?" In: Richard B. Freeman and David A. Wise (Eds). The Youth Labor Market Problem: Its Nature, Causes and Consequences, pp. 349-390. Chicago: University of Chicago Press. 
Genda, Yuji & Ayako Kondo & Souichi Ohta, 2010, "Long-Term Effects of a Recession at Labor Market Entry in Japan and the United States," Journal of Human Resources, University of Wisconsin Press, vol. 45(1).

Greenstone, Michael, Rick Hornbeck and Enrico Moretti, 2010, "Identifying Agglomeration Spillovers: Evidence from Winners and Losers of Large Plant Openings," Journal of Political Economy, 118(3): 536-598.

Greenstone, Michael, 2002, "The Impacts of Environmental Regulations on Industrial Activity: Evidence from the 1970 and 1977 Clean Air Act Amendments and the Census of Manufactures," Journal of Political Economy, 110(6).

Henderson, Vernon, 1996, "Effects of Air Quality Regulation," American Economic Review, 86: 789-813.

Kahn, Lisa. 2006. "The Long-Term Labor Market Consequences of Graduating College in a Bad Economy." Labour Economics, (Forthcoming).

Kotchen, Matthew, Matthew Potoski, 2011, "How Private Incentives Distort Public Evaluations: Evidence from the NCAA Football Coaches' Top 25 Ballots," NBER Working Paper.

Levitt, Steven D., John A. List, and Sally E. Sadoff. 2011. "Checkmate: Exploring Backward Induction among Chess Players." American Economic Review, 101(2): 975-90.

Moretti, Enrico, Mas, Alexandre, 2009, "Peers at Work," American Economic Review, 99(1).

Munasinghe, Lalith, Brendan O'Flaherty, Stephan Danninger, 2001. "Globalization and the Rate of Technological Progress: What Track and Field Records Show," Journal of Political Economy, University of Chicago Press, vol. 109(5), pages 1132-1149, October.

Neal, Derek, 1995, "Industry-Specific Human Capital: Evidence from Displaced Workers," Journal of Labor Economics, October, vol. 13: 653-7

Oreopoulos, Philip, Till von Wachter, and Andrew Heisz, 2012, "The Short- and Long-Term Career Effects of Graduating in a Recession: Hysteresis and Heterogeneity in the Market for College Graduates." American Economic Journal: Applied Economics, pp 1-29.

Oyer, Paul. 2006. "The Macro-Foundations of Microeconomics: Initial Labor Market Conditions and Long Term Outcomes for Economists." NBER Working Paper No. 12157.

Oyer, Paul. 2008. "The Making of an Investment Banker: Macroeconomic Shocks, Career Choice, and Lifetime Income," The Journal of Finance, 63 (December): 2601-2628.

Price, Joseph, Justin Wolfers, 2010, "Racial Discrimination Among NBA Referees," The Quarterly Journal of Economics, MIT Press, vol. 125(4).

von Wachter, Till and Stefan Bender. 2006. "At the Right Place at the Wrong Time: The Role of Firms and Luck in Young Workers' Careers." American Economic Review 96(5): 1679-1705.