Running Head: INTERPRETING SOCIAL DATA

Name:

University:

Course:

Date:

Abstract:

The theory of education screening by Stiglitz (1975) describes the relationship between education and income: it states that educational attainment acts as a screening device in the labor market. A related theory is human capital theory, which links education to development. These theories are discussed in relation to the findings of this paper, which show a positive relationship between education and personal income.

Introduction:

The paper describes the characteristics of three variables, namely educational attainment, income and gender. It defines the various measurement levels and determines the distribution of each variable. The last section focuses on the relationship between income and educational attainment and relates these findings to theories that depict this relationship.

Data selected:

There are four measurement levels: nominal, ordinal, interval and ratio. According to Bryman (1988), nominal data represent categories that are not ranked. An example of a nominal variable is the number on a football player's shirt, which does not represent any rank. Another example is gender, where the number 1 may be used to represent male and the number 2 to represent female. The sex variable from the data set is therefore nominal: the number represents a category but not a rank. The mode can be calculated for this type of data, whereas the mean and median have no meaning.

According to Bryman (1988), the second type of measure is ordinal data, where numbers represent categories that can be ranked. An example is a set of numbers representing the income group of individuals. In our case the income levels are grouped into twelve categories, where 12 represents a high income group and 1 represents a low income group. Because the numbers represent categories that can be ranked, the individual income variable from the data set is ordinal. The median and mode can be calculated for this type of data, but the mean has no meaning.

Bryman (1988) also defines interval and ratio data. Interval data are units of data whose differences can be calculated and have meaning. The ratio between two values cannot be interpreted, although the ratio of differences can. The mean, median, mode and standard deviation can be calculated for this type of data. For ratio data the ratio between values can also be calculated, in addition to all the statistical operations that can be undertaken on interval data. The number of years in school is therefore classified as interval data and included in this study.


From the above discussion, the selected variables are: sex, which represents gender and is measured on a nominal scale; individual income, which is ordinal, since its values can be ranked from low to high income groups; and number of years in school, which is interval data, since its values can be ranked, their differences determined, and the mean, median, mode and standard deviation calculated.
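The classification above can be summarized as a small lookup, sketched here in Python. The variable labels are illustrative names rather than the data set's actual column names, and the sets of statistics simply restate what the text says is meaningful at each level.

```python
# Statistics that are meaningful at each measurement level,
# restating the classification described in the text.
PERMISSIBLE = {
    "nominal": {"mode"},
    "ordinal": {"mode", "median"},
    "interval": {"mode", "median", "mean", "standard deviation"},
    "ratio": {"mode", "median", "mean", "standard deviation"},
}

# Illustrative labels (assumed), not the data set's column names.
VARIABLES = {
    "sex": "nominal",
    "individual income": "ordinal",
    "years in school": "interval",
}


def is_meaningful(variable, statistic):
    """Return True if the statistic is meaningful for the variable's level."""
    return statistic in PERMISSIBLE[VARIABLES[variable]]


for name, level in VARIABLES.items():
    print(f"{name} ({level}): {sorted(PERMISSIBLE[level])}")
```

For example, `is_meaningful("sex", "mean")` is false, while `is_meaningful("years in school", "mean")` is true.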

Distribution:

The distribution of each of the three selected variables is determined from its frequencies. The following is a summary of results obtained using SPSS.

Personal income:

The mode of the income variable is 12, which means that the most common income category among respondents was 25,000 and above. The number of included cases is n = 959. The following table shows the results:

Statistics: RESPONDENTS INCOME

N (Valid)                  959
N (Missing)                541
Mode                       12
Std. Deviation             2.716
Variance                   7.376
Skewness                   -2.126
Std. Error of Skewness     .079
Kurtosis                   3.870
Std. Error of Kurtosis     .158
Minimum                    1
Maximum                    13

The skewness value is -2.126, which means that the distribution is skewed to the left. The following chart shows the frequency distribution:

From the chart, most respondents fall in the higher income categories, and only a few fall in the lower categories. The standard deviation is 2.716, which means that income typically deviates by about 2.7 categories from the mean. It is therefore evident that income does not follow a normal distribution: it is skewed to the left, with the majority of observations at the high end of the scale and only a few at the low end.
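As a rough check on the reported figures, the standard errors of skewness and kurtosis can be recomputed from the sample size alone. The sketch below uses the usual large-sample formulas, which appear to match the values in the table. The ratio of skewness to its standard error is a common rule-of-thumb indicator that is added here for illustration and is not part of the original analysis.

```python
import math


def se_skewness(n):
    """Standard error of skewness for a sample of size n."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n):
    """Standard error of kurtosis for a sample of size n."""
    return 2.0 * se_skewness(n) * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))


n_income = 959      # valid cases for the income variable
skewness = -2.126   # skewness reported in the table

ses = se_skewness(n_income)
sek = se_kurtosis(n_income)
z_skew = skewness / ses  # rule of thumb: |z| well above 2 suggests non-normality

print(round(ses, 3), round(sek, 3), round(z_skew, 1))
```

The ratio is far beyond the conventional threshold of about 2 in absolute value, which agrees with the conclusion that income is not normally distributed.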

Sex:

Sex records whether the respondent is male or female. The total number of respondents was n = 1500, of whom 705 were male and 795 were female. This means that 53% of the respondents were female and 47% were male, so females were the majority. The table below summarizes the results:

RESPONDENTS SEX

                  Frequency   Percent   Valid Percent   Cumulative Percent
Valid   MALE      705         47.0      47.0            47.0
        FEMALE    795         53.0      53.0            100.0
        Total     1500        100.0     100.0


The chart below represents the proportion of males and females in the data set:

The pie chart is appropriate because the variable contains only two categories, male and female. From the chart it is evident that there are more females than males in the data set. Given that this variable is nominal, the mean and standard deviation cannot be calculated; only the mode, which is the category with the highest frequency, can be calculated.
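The percentages and the mode for the sex variable follow directly from the two counts. A minimal sketch, using only the frequencies reported in the table:

```python
# Frequencies reported for the sex variable.
counts = {"MALE": 705, "FEMALE": 795}
total = sum(counts.values())

cumulative = 0.0
for category, count in counts.items():
    percent = 100.0 * count / total
    cumulative += percent
    print(f"{category:<8}{count:>6}{percent:>8.1f}{cumulative:>8.1f}")

# For nominal data the mode is the category with the highest frequency.
mode_category = max(counts, key=counts.get)
print("Mode:", mode_category)
```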

Education:

The education variable represents the number of years in school. This is interval-level data showing the number of years respondents have spent in school. The number of valid cases is 1499 and the mean number of years spent in school is 13.69. The highest number of years spent in school is 20, while the lowest is zero. The table below shows the results:

Statistics: HIGHEST YEAR OF SCHOOL COMPLETED

N (Valid)                  1499
N (Missing)                1
Mean                       13.69
Median                     14.00
Mode                       12
Std. Deviation             2.888
Variance                   8.342
Skewness                   -.301
Std. Error of Skewness     .063
Kurtosis                   1.314
Std. Error of Kurtosis     .126
Minimum                    0
Maximum                    20

The skewness value from the table is -0.301, which shows that the distribution is skewed to the left. The skewness of a normal distribution is zero, and comparing this value with that of the income distribution above, it is evident that the shape of this distribution is relatively close to normal. The mode for this variable is 12, meaning that the most common number of years spent in school is 12. The median is 14, and since this is greater than the mean, it supports the conclusion that the variable is negatively skewed. The chart below shows the frequency distribution:

From the chart, somewhat more respondents fall in the higher range of years spent in school than in the lower range, in line with the median lying above the mean. The standard deviation is 2.888, which means that the number of years in school typically deviates by about 2.9 years from the mean.
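The comparison of mean and median used above can be made concrete with Pearson's second skewness coefficient, 3 × (mean − median) / standard deviation. This coefficient is a different measure from the moment-based skewness that SPSS reports, and it is introduced here only as an illustrative cross-check on the sign of the skew.

```python
# Summary statistics reported for years of school completed.
mean = 13.69
median = 14.00
std_dev = 2.888

# Pearson's second skewness coefficient: negative when the median exceeds the mean.
pearson_skew = 3 * (mean - median) / std_dev

print(round(pearson_skew, 3))
```

The coefficient is negative, which is consistent with the sign of the reported skewness of -0.301.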

Independent variable:

According to Blanden and Gregg (2004), the level of education determines the level of income achieved: individuals who have achieved a high level of education are more likely to earn more. They also state that family income influences the educational attainment of children in a family. Children whose parents have high incomes are likely to achieve higher education levels than those whose parents earn less.

Michael Coelli (2007) also discusses the relationship between income and education. In his paper he states that parental income determines the level of education achieved by young people, and that the level of education they achieve in turn determines their future income. He states that those who spend more time in school have a higher probability of receiving higher incomes than those who spend fewer years in school.

Following the idea that income earned is influenced by the level of education, income is the dependent variable and education is the independent variable. The independent variable is the variable that influences another variable, while the dependent variable is the one that is influenced. In this case, therefore, the number of years spent in school is the independent variable and the income level is the dependent variable.


The hypothesis in this case will therefore state that the level of education will influence the level of income achieved by a respondent. The chi square test is appropriate in testing this hypothesis and the following table shows the results of the tests:

Chi-Square Tests

                                Value         df     Asymp. Sig. (2-sided)
Pearson Chi-Square              312.899(a)    216    .000
Likelihood Ratio                263.496       216    .015
Linear-by-Linear Association    37.286        1      .000
N of Valid Cases                959

a 214 cells (86.6%) have expected count less than 5. The minimum expected count is .01.

From the table, the calculated chi-square value is 312.899 with 216 degrees of freedom, and the reported significance is .000, that is, p < 0.001. At the conventional 0.05 significance level the critical value from the chi-square table is less than the calculated value, and when the calculated value is greater than the critical value the null hypothesis is rejected. The null hypothesis in this case was that the dependent variable does not depend on the independent variable. The test therefore indicates a relationship between the two variables, in which income level depends on the level of education. Because 86.6% of the cells have an expected count below 5, however, this result should be interpreted with some caution.
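The decision rule for the chi-square test can be sketched numerically. The example below makes two assumptions that are not stated in the paper: a significance level of 0.05, and a 13 × 19 cross-tabulation, which is inferred only because it reproduces both the 216 degrees of freedom and the 86.6% figure in the table footnote. The critical value is obtained with the Wilson-Hilferty approximation rather than a printed chi-square table, so it is approximate.

```python
from statistics import NormalDist


def degrees_of_freedom(rows, cols):
    """Degrees of freedom for a chi-square test of independence."""
    return (rows - 1) * (cols - 1)


def chi2_critical_approx(df, alpha=0.05):
    """Wilson-Hilferty approximation to the upper-tail chi-square critical value."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    term = 2.0 / (9.0 * df)
    return df * (1.0 - term + z * term ** 0.5) ** 3


ROWS, COLS = 13, 19  # assumed table size, inferred from df = 216
df = degrees_of_freedom(ROWS, COLS)

# Share of cells with an expected count below 5 (214 cells per the footnote).
share_small = 100.0 * 214 / (ROWS * COLS)

chi2_observed = 312.899  # Pearson chi-square from the table
critical = chi2_critical_approx(df)
reject_null = chi2_observed > critical

print(df, round(share_small, 1), round(critical, 1), reject_null)
```

Under these assumptions the observed statistic exceeds the approximate critical value, so the null hypothesis of independence is rejected.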

The table below shows the correlation coefficient of the two variables:

Symmetric Measures

                                               Value   Asymp. Std. Error(a)   Approx. T(b)   Approx. Sig.
Interval by Interval   Pearson's R             .197    .031                   6.225          .000(c)
Ordinal by Ordinal     Spearman Correlation    .282    .030                   9.102          .000(c)
N of Valid Cases                               959

a Not assuming the null hypothesis
b Using the asymptotic standard error assuming the null hypothesis
c Based on normal approximation

From the table, the Pearson correlation coefficient is 0.197. This positive value shows that there is a weak but positive relationship between the two variables: as one variable increases, the other tends to increase as well. Therefore, as the level of education increases income tends to increase, and as the level of education declines income tends to decline. The Spearman correlation of 0.282, which is the more suitable measure for the ordinal income variable, points in the same direction. Both values are relatively close to zero, which indicates a weak relationship. This can be explained by the fact that income is influenced by factors other than educational attainment, such as experience.
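To illustrate how weak the relationship is, the coefficient of determination and the approximate t statistics can be recomputed from the reported coefficients. The sketch assumes the standard formula t = r × sqrt(n − 2) / sqrt(1 − r²). Because the coefficients in the table are rounded to three decimals, the results are expected to be close to, but not identical with, the Approx. T column.

```python
import math


def t_from_r(r, n):
    """t statistic for testing that a correlation coefficient is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


n = 959               # valid cases
pearson_r = 0.197     # Pearson's R from the table
spearman_rho = 0.282  # Spearman correlation from the table

# Share of the variance in one variable linearly associated with the other.
r_squared = pearson_r ** 2

t_pearson = t_from_r(pearson_r, n)
t_spearman = t_from_r(spearman_rho, n)

print(round(r_squared, 3), round(t_pearson, 2), round(t_spearman, 2))
```

An r² of about 0.04 means that education is linearly associated with only around 4% of the variation in the income categories, even though the relationship is statistically significant.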

Existing theories:

The above results relate to theories that have been developed over the years. One of these is the theory of screening by Joseph Stiglitz (1975), who discusses results from Spence and Arrow (1973) stating that education is a screening device in the job market. This means that employers select employees with reference to their educational attainment, so individuals with higher education levels are more likely to acquire highly paid positions.

This shows that educational attainment helps determine an individual's future employment position. Given that different positions carry different compensation packages, those with high educational attainment are likely to be employed in positions with high compensation. The more years an individual spends in school, therefore, the more likely that individual is to have higher earnings than those who spend fewer years in school.

The human capital theory also supports the above findings. According to this theory, the level of education determines the level of human capital: a higher level of education means that an individual has a higher level of human capital. This relates to development theories stating that one way to realize economic development is to increase human capital by investing in education. For this reason, individuals with high levels of human capital are more likely to earn more than those with low levels of human capital.

Peter Wiles (1974) also discusses the correlation between income and educational attainment. He relates the human capital theory to the college degree, stating that a degree confers social status, so that individuals who have attained high levels of education have high status in society. As a result of this status, these individuals are more likely to earn higher incomes than others in society.

Conclusion:

The above analysis shows a positive relationship between education and income: the higher the education level attained, the higher the level of income earned. This is supported by the chi-square test, which shows that there is a relationship between the independent and dependent variables. The correlation coefficient shows that as the level of education attained increases, the income level also tends to increase, although the relationship is weak. Finally, the theory of screening and the theory of human capital support these findings, since both depict a positive relationship between education and income.

References:

Bernard, H. Russell (2002) Research Methods in Anthropology: Qualitative and Quantitative Approaches, California: AltaMira Press.


Black, T. (1993) Evaluating Social Science Research. An Introduction, London: Sage.

Bryman, A. (1988) Quantity and Quality in Social Research, London: Unwin Hyman.

Blanden, J. and Gregg, P. (2004) Family Income and Educational Attainment: A Review of Approaches and Evidence for Britain, Oxford Review of Economic Policy, 20(2), pp. 245-263.

Gilbert, N. (2008) Researching Social Life, London: Sage.

Stiglitz, J. (1975) The Theory of Screening: Education and the Distribution of Income, American Economic Journal, 4(2), pp. 34-37.

Locke, L. and Spirduso, W. (1998) Reading and Understanding Research, London: Sage.

May, T. (2001) Social Research: Issues, Methods and Process, Buckingham: Open University Press.

January, from https://editorialexpress.com/cgi-bin/conference/download.cgi?db_name=ESAM07&paper_id=148

Miles, B. and Huberman, M. (1994) Qualitative Data Analysis: An Expanded Sourcebook, London: Sage.

Wiles, P. (1974) The Correlation between Education and Earnings: The External Test, Higher Education Journal, 3(1), pp. 43-58.


Rose, D. and Sullivan, O. (1996, 2nd ed.) Introducing Data Analysis for Social Scientists, Buckingham: Open University Press.

Shipman, M. (1981) The Limitations of Social Research, London: Longman.
