Analyzing Cross-sectional Data and Correlation and Regression

Part one

Analyzing cross-sectional data

This report describes the results of a survey of our members, both full members and weekend members. The survey aimed to measure the level of customer satisfaction with our services and equipment. There were 100 participants: 55 full members and 45 weekend members, of whom 47 were male and 53 were female. This is shown in the graphs below:

With membership type coded as 1 for full members and 2 for weekend members, the mode and mean were 1 and 1.45 respectively. The mode is 1 because this code represents the full members, who are the majority.
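As a quick check of these two figures, the following is a minimal Python sketch. It assumes membership type is coded 1 for full members and 2 for weekend members, which is the coding implied by the reported mean; the variable names are illustrative and do not come from the survey data file.

```python
from statistics import mean, mode

# Assumed coding: 1 = full member, 2 = weekend member.
membership = [1] * 55 + [2] * 45

membership_mean = mean(membership)   # (55*1 + 45*2) / 100
membership_mode = mode(membership)   # most frequent code

print(f"mean = {membership_mean:.2f}, mode = {membership_mode}")
```

Because the variable is only a two-value label, the mean is best read as a share: the 0.45 above 1 is the proportion of weekend members.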

Data was also collected on the quality of the services rendered by our instructors. Out of 100 participants, 12 rated the services as bad, 35 as neither bad nor good, 39 as good and only 14 as very good. The mode was therefore the good rating, the one chosen by the largest group of participants. These results are shown graphically below:

Our services were also rated on the quality of the equipment used. Out of 100 participants, 6 rated the quality of the equipment as very bad, 25 as bad, 33 as neither bad nor good, 27 as good and only 9 as very good. The mode was therefore the neither bad nor good rating, the one chosen by the largest group of participants. This is shown graphically below:

Participants also rated the range of facilities available. Out of 100 participants, only 1 rated the range of facilities as very bad, 6 as bad, 20 as neither good nor bad and 38 as good; the remaining 35 rated it as very good. Therefore 73 participants rated the range of facilities we offer as good or very good. This is shown graphically below:

The mean was 4, which shows that on average the participants rated the range of facilities available as good; the mode was also 4, which indicates that good was the most common rating. When the mode, the median and the mean are equal, a distribution is often symmetric or bell-shaped, with the deviations on either side of the mean mirroring each other. Here, however, the negative value of skewness indicates that the distribution is skewed to the left.

The ratings on the cost of membership indicate that most people rate the cost as bad or very bad and only a few rate it as very good. Only 3 out of 100 rated the cost of membership as very good, 16 as good, 21 as neither good nor bad, 37 as bad and 23 as very bad. The mode was the bad rating, so bad was the most common rating. The mean rating was 2.39, which is closest to the bad rating. Positive skewness in this distribution shows that it is skewed to the right.

This is shown graphically below:
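The mean, mode and direction of skew reported for the facilities and cost ratings can be recomputed from the frequency counts. The sketch below assumes the ratings are coded 1 (very bad) to 5 (very good), which is consistent with the reported means. The helper `describe` is a hypothetical name, and the moment-based skewness it uses may differ slightly in size (not in sign) from the adjusted estimator a statistics package would report.

```python
def describe(counts):
    """Return (mean, mode, skewness) for a dict mapping rating -> frequency."""
    n = sum(counts.values())
    mean = sum(rating * freq for rating, freq in counts.items()) / n
    mode = max(counts, key=counts.get)  # rating with the highest frequency
    # Population central moments; skewness = m3 / m2**1.5
    m2 = sum(freq * (rating - mean) ** 2 for rating, freq in counts.items()) / n
    m3 = sum(freq * (rating - mean) ** 3 for rating, freq in counts.items()) / n
    return mean, mode, m3 / m2 ** 1.5


# Assumed coding: 1 = very bad, 2 = bad, 3 = neither, 4 = good, 5 = very good.
facilities = {1: 1, 2: 6, 3: 20, 4: 38, 5: 35}
cost = {1: 23, 2: 37, 3: 21, 4: 16, 5: 3}

for name, counts in (("facilities", facilities), ("cost", cost)):
    mean, mode, skew = describe(counts)
    print(f"{name}: mean = {mean:.2f}, mode = {mode}, skewness = {skew:+.2f}")
```

A negative skewness means the longer tail lies toward the low ratings, and a positive one means it lies toward the high ratings.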

Part two


Correlation and regression

We will investigate the increase in unemployment over time using the data available, taking Denmark as our example. Time in years is the independent variable and the unemployment rate is the dependent variable.

Rate of unemployment in Denmark from the year 1970 to 1994

| Year | Denmark |
| --- | --- |
| 1970 | 0.6 |
| 1971 | 0.9 |
| 1972 | 0.8 |
| 1973 | 0.7 |
| 1974 | 2.8 |
| 1975 | 3.9 |
| 1976 | 5.1 |
| 1977 | 5.9 |
| 1978 | 6.7 |
| 1979 | 4.8 |
| 1980 | 5.2 |
| 1981 | 8.3 |
| 1982 | 8.9 |
| 1983 | 9 |
| 1984 | 8.5 |
| 1985 | 7.1 |
| 1986 | 5.4 |
| 1987 | 5.4 |
| 1988 | 6.1 |
| 1989 | 7.4 |
| 1990 | 7.7 |
| 1991 | 8.4 |
| 1992 | 9.2 |
| 1993 | 10.1 |
| 1994 | 8.2 |

Scatter diagram

Correlation coefficient (r)

It is a measure of the degree of linear relationship between two variables. In our case we determine the correlation coefficient using the computational form of the product-moment formula, where X is the year index (1 for 1970 up to 25 for 1994), Y is the unemployment rate and n is the number of observations [1]:

$$ r = \frac{n\sum XY - \sum X \sum Y}{\left(n\sum X^2 - \left(\sum X\right)^2\right)^{1/2}\left(n\sum Y^2 - \left(\sum Y\right)^2\right)^{1/2}} $$

| X | Year | Y (Denmark) | XY | X² | Y² |
| --- | --- | --- | --- | --- | --- |
| 1 | 1970 | 0.6 | 0.6 | 1 | 0.36 |
| 2 | 1971 | 0.9 | 1.8 | 4 | 0.81 |
| 3 | 1972 | 0.8 | 2.4 | 9 | 0.64 |
| 4 | 1973 | 0.7 | 2.8 | 16 | 0.49 |
| 5 | 1974 | 2.8 | 14 | 25 | 7.84 |
| 6 | 1975 | 3.9 | 23.4 | 36 | 15.21 |
| 7 | 1976 | 5.1 | 35.7 | 49 | 26.01 |
| 8 | 1977 | 5.9 | 47.2 | 64 | 34.81 |
| 9 | 1978 | 6.7 | 60.3 | 81 | 44.89 |
| 10 | 1979 | 4.8 | 48 | 100 | 23.04 |
| 11 | 1980 | 5.2 | 57.2 | 121 | 27.04 |
| 12 | 1981 | 8.3 | 99.6 | 144 | 68.89 |
| 13 | 1982 | 8.9 | 115.7 | 169 | 79.21 |
| 14 | 1983 | 9 | 126 | 196 | 81 |
| 15 | 1984 | 8.5 | 127.5 | 225 | 72.25 |
| 16 | 1985 | 7.1 | 113.6 | 256 | 50.41 |
| 17 | 1986 | 5.4 | 91.8 | 289 | 29.16 |
| 18 | 1987 | 5.4 | 97.2 | 324 | 29.16 |
| 19 | 1988 | 6.1 | 115.9 | 361 | 37.21 |
| 20 | 1989 | 7.4 | 148 | 400 | 54.76 |
| 21 | 1990 | 7.7 | 161.7 | 441 | 59.29 |
| 22 | 1991 | 8.4 | 184.8 | 484 | 70.56 |
| 23 | 1992 | 9.2 | 211.6 | 529 | 84.64 |
| 24 | 1993 | 10.1 | 242.4 | 576 | 102.01 |
| 25 | 1994 | 8.2 | 205 | 625 | 67.24 |
| Total: 325 | | 147.1 | 2334.2 | 5525 | 1066.93 |
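The column totals and the formula for r can be checked with a short Python sketch. It is an illustration rather than the procedure used in this report; the function name `pearson_r` and the variable names are assumptions introduced here, and X is taken to be the year index 1 to 25 as in the working table.

```python
from math import sqrt

# Unemployment rate in Denmark, 1970-1994, as tabulated above.
rates = [0.6, 0.9, 0.8, 0.7, 2.8, 3.9, 5.1, 5.9, 6.7, 4.8, 5.2, 8.3, 8.9,
         9.0, 8.5, 7.1, 5.4, 5.4, 6.1, 7.4, 7.7, 8.4, 9.2, 10.1, 8.2]
x = list(range(1, len(rates) + 1))   # year index: 1 = 1970, ..., 25 = 1994

# Column totals of the working table.
n = len(rates)
sum_x = sum(x)
sum_y = sum(rates)
sum_xy = sum(xi * yi for xi, yi in zip(x, rates))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in rates)


def pearson_r(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Correlation coefficient from the raw sums (computational formula)."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


r = pearson_r(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print(f"sums: {sum_x}, {sum_y:.1f}, {sum_xy:.1f}, {sum_x2}, {sum_y2:.2f}")
print(f"r = {r:.4f}")
```

Working from the raw sums mirrors the hand calculation in the table, so each intermediate total can be compared with the corresponding column.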

Substituting the column totals into the formula, with n = 25, gives

$$ r = \frac{25(2334.2) - (325)(147.1)}{\left(25(5525) - 325^2\right)^{1/2}\left(25(1066.93) - 147.1^2\right)^{1/2}} = \frac{10547.5}{(32500)^{1/2}(5034.84)^{1/2}} \approx 0.8245 $$

Therefore our correlation coefficient (r) is approximately 0.8245.

Regression line

We use the classical estimation model, which states that when Y = α + βX, we estimate the parameters as

$$ \alpha = \bar{Y} - \beta \bar{X} $$

[2] and

$$ \beta = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2} $$

[3] Therefore in our case

β = 10547.5 / 32500 ≈ 0.3245

And, with the mean of X equal to 325 / 25 = 13 and the mean of Y equal to 147.1 / 25 = 5.884,

α = 5.884 - 0.3245 × 13 ≈ 1.665

Our model therefore will be stated as

Y = 1.665 + 0.3245 X

The autonomous level of unemployment, that is, the fitted value at X = 0 (the year before the sample begins), is 1.665, and the model states that an increase of one unit of time (one year) increases the level of unemployment by 0.3245 units.

Over time there has been a rise in the level of unemployment despite the high economic growth in developed countries. The data on Denmark’s unemployment rate show an increase over the years. The rising unemployment rate is a matter of concern to all economies in the world, which is why there have been growing efforts to use policy to bring down both unemployment and the level of inflation.

In the model Y = 1.665 + 0.3245 X, Y is the level of unemployment and X is time in years, with X = 1 for 1970. The positive slope shows that the level of unemployment increases over time. The autonomous level of unemployment is greater than zero, which is consistent with the low but non-zero unemployment rates recorded at the start of the period.

The correlation coefficient for the two variables is approximately 0.8245. The value shows a positive and fairly strong linear relationship between the two variables: its square is about 0.68, so the time trend accounts for roughly two thirds of the variation in the unemployment rate. The remaining variation could be due to other important variables that we have omitted and that also determine the level of unemployment, for example price levels or inflation, the level of national income and government policies.
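The slope and intercept formulas from the regression section can be sketched in Python as follows. This is a minimal illustration under the same assumption as the working table, namely that X is the year index 1 to 25; `fit_line` is a hypothetical helper name, not something taken from the cited texts.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for the model y = alpha + beta * x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    beta = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    alpha = sum_y / n - beta * (sum_x / n)   # mean(y) - beta * mean(x)
    return alpha, beta


# Unemployment rate in Denmark, 1970-1994.
rates = [0.6, 0.9, 0.8, 0.7, 2.8, 3.9, 5.1, 5.9, 6.7, 4.8, 5.2, 8.3, 8.9,
         9.0, 8.5, 7.1, 5.4, 5.4, 6.1, 7.4, 7.7, 8.4, 9.2, 10.1, 8.2]
x = list(range(1, len(rates) + 1))   # 1 = 1970, ..., 25 = 1994

alpha, beta = fit_line(x, rates)
print(f"Y = {alpha:.3f} + {beta:.4f} X")

# Fitted value for the last year in the sample (X = 25, i.e. 1994).
fitted_1994 = alpha + beta * 25
print(f"fitted value for 1994: {fitted_1994:.2f}")
```

Indexing the years from 1 keeps the intercept interpretable; regressing on the calendar year itself would give the same slope but an intercept that refers to year 0.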


## References

Schmidt, P. (1976). Econometrics. Marcel Dekker, USA.

Rey, Sergio J. (1956). Advances in Spatial Econometrics: Methodology, Tools and Applications. Springer, USA.

Wooldridge, J. (2002). Econometric Analysis of Cross Section and Panel Data. MIT Press, USA.

[1] P. Schmidt (1976)

[2] P. Schmidt (1976)

[3] P. Schmidt (1976)
