Zipf Law Analysis of Urban Scale in China

Using China's urban population data from 1990 to 2010 and a double logarithmic regression model, this paper tests the relationship between urban scale and urban rank in China against Zipf's law. We find that China's urban scale distribution is relatively balanced and broadly conforms to Zipf's law. Analyzing the urban scale distribution of each province, we also find that the provincial distributions are relatively complex and that their Zipf indices are not close to 1. The increase of the Zipf index over time means that the urban scale distribution is moving from a primacy distribution toward the ideal Zipf distribution. By further analyzing and categorizing the provincial Zipf indices, we conclude that the provincial urban scale distributions fall into four main categories: those that have achieved the ideal Zipf distribution; those approaching the ideal Zipf distribution; those in transition from a primacy distribution to the Zipf distribution; and typical primacy distributions.


Introduction
In research on city size distribution, the relationship between urban scale and urban rank is remarkably regular. The earliest research on urban scale and rank is by Auerbach (1913) and Singer (1936). They argue that the relationship between city size and rank can be characterized by a Pareto distribution, $R_i = A S_i^{-\alpha}$, where $S_i$ is the size of city $i$, $R_i$ is its rank, $A$ is a constant and $\alpha$ is the Pareto exponent. Zipf (1949) argues that city size not only follows a Pareto distribution but that, when $\alpha = 1$, $R_i S_i = A$. With $\alpha$ called the Zipf index, this relationship between the size of a city and its rank is called Zipf's law.
Since the 1950s, a large number of empirical studies have tested Zipf's law. Berry (1961) tested urban population data for 38 countries and divided city size distributions into three categories: first, 13 countries that fully conform to the rank-size distribution, that is, Zipf's law; second, 15 countries whose urban structure is dominated by the largest city; third, 10 countries that fall between the two. Madden used United States city population data from 1790 to 1950 and a logarithmic model to test the stability of Zipf's law. The results show that the sizes of individual cities change very quickly and that their ranks change as well, but that the city size distribution itself is stable over long periods. Soo (2004) [...]
The third part analyzes the empirical results; the fourth part is a brief conclusion of the research.

Model Settings
We mainly use the following model to test the size distribution of cities in China: $\ln R = \ln A - \alpha \ln S$, where $R$ is the city rank, $S$ is the city size, $\alpha$ is the Zipf index (the Pareto exponent), and $A$ is a constant; both $\alpha$ and $A$ are estimated from the data. The main purpose of the study is to examine the magnitude and trend of the Zipf index of the urban size distribution in China.
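The regression above can be sketched in a few lines. This is a minimal illustration, assuming the rank-size form $\ln R = \ln A - \alpha \ln S$, with rank 1 assigned to the largest city and no special handling of ties; the function name `zipf_index_ols` and the synthetic sizes are hypothetical and are not taken from the paper's data.

```python
import numpy as np

def zipf_index_ols(sizes):
    """Estimate (alpha, ln_A) in ln R = ln A - alpha * ln S by ordinary least squares."""
    s = np.sort(np.asarray(sizes, dtype=float))[::-1]  # largest city first
    ranks = np.arange(1, len(s) + 1)                   # rank 1 = largest city
    # Fit ln R as a linear function of ln S; polyfit returns [slope, intercept].
    slope, intercept = np.polyfit(np.log(s), np.log(ranks), 1)
    return -slope, intercept                           # alpha is minus the slope

# Synthetic sizes that satisfy R * S = 1000 exactly (illustrative, not real data).
sizes = [1000.0 / rank for rank in range(1, 51)]
alpha, ln_a = zipf_index_ols(sizes)
```

Because the synthetic sizes follow the rank-size rule exactly, the fitted `alpha` should be 1 and `ln_a` should equal `ln 1000`, up to floating-point error.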
In the city size distribution, the Zipf index ($\alpha$) indicates how concentrated the distribution is. The natural logarithms of city size and rank are computed and regressed on each other; a significant log-linear fit is the rank-size rule. If the slope of the regression is -1 ($\alpha = 1$), the city size distribution follows Zipf's law. If $\alpha > 1$, the urban population is relatively dispersed: small and medium-sized cities are well developed and the highest-ranked cities are not prominent. If $\alpha < 1$, the distribution is concentrated: small and medium-sized cities are underdeveloped, large cities dominate, and the primacy of the urban size distribution is high. For the trend of the Zipf index, an increasing $\alpha$ indicates that the forces dispersing the city size distribution outweigh the forces concentrating it; a decreasing $\alpha$ indicates that the concentrating forces outweigh the dispersing ones.
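A small numerical illustration may help with this reading of the index. The sketch below assumes an idealized system of cities that satisfies $R = A S^{-\alpha}$ exactly, so that the city of rank $R$ has size proportional to $R^{-1/\alpha}$, and computes the population share of the largest city. The function name `primacy_share` and the choice of 100 cities are hypothetical.

```python
def primacy_share(alpha, n_cities=100):
    """Population share of the largest city when R = A * S**(-alpha) holds exactly."""
    # Inverting the rank-size relation gives S proportional to R**(-1/alpha).
    sizes = [rank ** (-1.0 / alpha) for rank in range(1, n_cities + 1)]
    return sizes[0] / sum(sizes)

# Compare a low, an ideal and a high Zipf index (illustrative values only).
shares = {alpha: primacy_share(alpha) for alpha in (0.7, 1.0, 1.3)}
```

In this idealized setting the largest city's share falls as the index rises, which matches the interpretation that a small index signals primacy and a large index signals a more even distribution.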
Gabaix and Ioannides (1975) proposed using the Hill method instead of least squares to estimate the Zipf index of the urban size distribution. The Hill method is essentially a maximum likelihood estimator under the assumption that the Pareto model holds. If the sample contains $n$ cities ordered by size, $S_1 \ge S_2 \ge \dots \ge S_n$, the Zipf index of the city size distribution can be calculated by the following equation: $\hat{\alpha} = (n-1) / \sum_{i=1}^{n-1} (\ln S_i - \ln S_n)$. For the empirical test of Zipf's law for China's city size distribution, we adopt ordinary least squares. Its fitting error is relatively small, and it fits the relationship between city size and rank well. The Hill method presupposes that city sizes follow a Pareto distribution, so testing the Zipf index with it is neither representative nor general.
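For comparison with least squares, the Hill estimator can be sketched as follows. This is a minimal version that assumes the standard form $\hat{\alpha} = (n-1) / \sum_{i=1}^{n-1} (\ln S_i - \ln S_n)$ with the smallest city in the sample as the threshold; the function name `hill_zipf_index` and the synthetic sample are hypothetical and are not the paper's data.

```python
import math
import random

def hill_zipf_index(sizes):
    """Hill (maximum likelihood) estimate of the Pareto exponent.

    alpha_hat = (n - 1) / sum_{i=1}^{n-1} (ln S_i - ln S_n),
    where S_n is the smallest city in the sample.
    """
    s = sorted((float(x) for x in sizes), reverse=True)  # S_1 >= ... >= S_n
    n = len(s)
    if n < 2:
        raise ValueError("need at least two cities")
    log_smallest = math.log(s[-1])
    denom = sum(math.log(x) - log_smallest for x in s[:-1])
    if denom <= 0.0:
        raise ValueError("all city sizes are equal; the estimator is undefined")
    return (n - 1) / denom

# Hand-checkable case: the log sizes are 2, 1 and 0, so alpha_hat = 2 / (2 + 1).
small_example = hill_zipf_index([math.exp(2), math.exp(1), 1.0])

# Synthetic Pareto sample with true exponent 1 (illustrative data only).
rng = random.Random(0)
sample = [rng.paretovariate(1.0) for _ in range(5000)]
estimate = hill_zipf_index(sample)
```

On data that really are Pareto distributed, the estimate should land close to the true exponent; on data that are not, the estimator has no such guarantee, which is the limitation noted above.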

Data Description
Urban scale is the degree of aggregation of population, economy, and science and technology in a given area. In the broad sense, it includes population size, land use, buildings and facilities, and the scale of production and consumption. In the narrow sense, it refers only to the population size of a city.
Compared with other measures of city size, data on urban population size are easier to collect, and population is also the most commonly used indicator. The non-agricultural population residing in China's cities is the main measure of urban scale here. The non-agricultural population is used for three reasons: first, it is close in number to the resident population of the built-up area; second, across the urban system, each city's non-agricultural population and built-up-area resident population are generally linearly related; third, non-agricultural population statistics maintain, to some extent, consistency and comparability.
Second, because the four municipalities of Chongqing, Beijing, Shanghai and Tianjin have their own special characteristics of urban scale, these four cities are excluded from the regressions. In addition, Macau and Hong Kong returned to China only in the late 1990s, so their urban population data have serious gaps, and we do not consider them when building the econometric model. Similarly, the cities of Taiwan are not considered for now. Finally, owing to China's vast territory, the statistical data on the urban population of the Tibet Autonomous Region are seriously deficient in recent editions of the China Statistical Yearbook, and the cities Tibet does have are relatively small. To prevent large errors in the statistical results and the model, we exclude the Tibet Autonomous Region from the city size distribution tests.
According to the original data of the China Urban Statistical Yearbook for 1990-2010, a descriptive statistical table of the urban population was obtained, shown in Table 1. According to the results in Table 2, the p-values are all 0 in the double logarithmic regressions, so every regression passes the hypothesis test at the 1% significance level. We can therefore conclude that the regression model is significant and that it captures a good linear relationship between urban scale and urban rank. To better describe the trend of the Zipf index of the urban scale distribution from 1990 to 2010, we made Figure 1 based on Table 2.

Testing Analysis of Urban Scale in China's Provinces from 1990 to 2010
From 1990 to 2010, China's urban size distribution was balanced and close to the ideal Zipf distribution. At the same time, China's economy developed well in recent years and the size distribution of cities changed significantly. To explore the state of and changes in the urban size distribution within individual provinces, we apply the Zipf's law test to the cities of each province.
Table 3 shows that, from 1990 to 2010, the Zipf distribution of China's provinces is relatively complex and conformity to Zipf's law is not pronounced. The urban size distribution differs greatly between provinces and between periods. Further analysis finds that in 1990 only 13 provinces had an urban size distribution in the ideal Zipf state ($0.8 \le \alpha \le 1.2$), which shows that the size distribution of cities in many provinces had not yet reached the ideal state. In 1995, 2000, 2005 and 2010 the numbers of provinces with a balanced distribution were 19, 19, 26 and 26, respectively. From a nationwide perspective, the provincial Zipf indices have, in the most recent ten years, generally reached the ideal Zipf state.
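The counting step behind these figures is simple to express in code. The sketch below assumes the band is closed at both ends, 0.8 to 1.2 inclusive, which the garbled source does not confirm; the function name `count_near_ideal` is hypothetical, and the province labels and index values are placeholders rather than results from Table 3.

```python
def count_near_ideal(index_by_province, low=0.8, high=1.2):
    """Count provinces whose Zipf index lies in the closed band [low, high]."""
    return sum(1 for alpha in index_by_province.values() if low <= alpha <= high)

# Placeholder indices for one year (hypothetical values, not from the paper).
placeholder = {
    "Province A": 0.75,
    "Province B": 0.95,
    "Province C": 1.10,
    "Province D": 1.35,
    "Province E": 1.20,
}
n_ideal = count_near_ideal(placeholder)
```

Running the same count on each year's estimates would reproduce the kind of year-by-year tally reported in the text.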
To further study the urban scale distribution of the provinces from 1990 to 2010, we analyze the similarity of the evolution of the estimated Zipf indices and classify the provinces accordingly. The results show that the provincial city size distributions fall mainly into the following categories.
The first category has a relatively high Zipf index, indicating that the cities of these provinces are well developed, that the largest city has no obvious dominance over the province's population, and that city size and rank are distributed in a balanced way. It includes Liaoning, Hunan, Sichuan, Xinjiang, Henan and Anhui.
The second category comprises provinces whose city size distribution is basically consistent with the ideal Zipf distribution. These provinces have central cities as well as well-developed small and medium-sized cities and a complete hierarchy of high-ranking cities; they are generally China's economically more developed provinces or provinces with relatively large economies. It includes Guangdong, Zhejiang, Jiangsu, Jilin, Hubei, Shandong and Fujian.
The third category has a relatively small Zipf index. These are generally economically less developed provinces: the regional economy is relatively backward and the system of central cities is incomplete, so the central city stands out in size and the distribution shows obvious primacy. At the same time, rapid economic growth in recent years has promoted the rapid development of second-tier cities in these provinces, which are in transition from a primacy distribution to the ideal balanced Zipf distribution. These provinces include Guangxi, Jiangxi, Shanxi, Hebei, Shaanxi, Guizhou and Gansu.
The fourth category has a very small Zipf index. In these provinces, apart from the provincial capital, other cities are underdeveloped and the urban population is highly concentrated in the capital; this is typical primacy, and the city size distribution is a primacy distribution. It includes Ningxia, Qinghai, Inner Mongolia, Yunnan and Heilongjiang.

Conclusion
Using urban population data from 1990 to 2010, we tested Zipf's law for urban size and urban rank and found that the size distribution of Chinese cities is relatively balanced and basically consistent with Zipf's law. The Zipf index rose continuously from 1990 to 2005; in 2010 it fell, moving closer to the ideal Zipf value.
Analysis of the city size data of each province from 1990 to 2010 shows that the provincial city size distributions are relatively complex and that their Zipf indices are not close to 1. The Zipf index gradually increased, indicating that the city size distribution gradually changed from a primacy distribution toward Zipf's law. Further analysis and classification of the provincial Zipf indices shows that the provincial city size distributions can be divided into four types: those that have already reached the ideal Zipf distribution; those close to the ideal Zipf distribution; those in transition from a primacy distribution to the Zipf distribution; and typical primacy distributions.
Using urban population data for 75 countries, least squares fitting and maximum likelihood estimation were applied to compare the Zipf indices of countries around the world. The results show that city size distributions around the world are basically in accordance with Zipf's law, and that the law is stable. The domestic scholar Xu Xueqiang (1993) tracked the urban scale distribution of the top 1953 in China in the 1960-1990, and predicted the situation in 2000. This paper analyzes the characteristics and the changing trend of the city size distribution in China through a Zipf's law test. The study is divided into four parts. The first part is a literature review; the second part sets up the empirical research, briefly describing the model and the research data.

Figure 1. Trend of the Zipf index of urban scale distribution from 1990 to 2010

Table 1. Descriptive statistics of urban population in China from 1990 to 2010

Table 2. Zipf index regression results for urban scale distribution in provinces, 1990-2010