Questions 3

Question 1:

I. 5% of items are defective; find the probability that at least one out of 4 is defective

The outcomes are assumed to be independent, so the number of defectives follows a binomial distribution. The binomial probability is calculated as:

P(X = k) = nCk × p^k × (1 – p)^(n – k)

p – probability of success (here, an item being defective)

k – number of successes

n – number of trials

Using the binomial function in Excel, the table below shows the results:

success   n   k   binomial      binomial (k ≥ 1)
0         4   0   0.81450625
1         4   1   0.171475      0.171475
2         4   2   0.0135375     0.0135375
3         4   3   0.000475      0.000475
4         4   4   0.00000625    0.00000625
Total             1             0.18549375

The probability that at least one is defective is the sum P(1) + P(2) + P(3) + P(4) = 0.1855, which is the same as 1 – P(0) = 1 – 0.8145.
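The same figure can be checked outside Excel. The following is a minimal Python sketch, assuming independent items with a defect probability of 0.05 as stated in the question; the helper name `binomial_pmf` is illustrative and does not come from the original.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 4, 0.05  # 4 items, 5% defective (values from the question)

# P(at least one defective) as the sum over k = 1..4
p_sum = sum(binomial_pmf(k, n, p) for k in range(1, n + 1))

# The same probability through the complement, 1 - P(0)
p_complement = 1 - binomial_pmf(0, n, p)

print(round(p_sum, 4), round(p_complement, 4))
```

The complement form needs only one term, which is usually the simpler route for "at least one" questions.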

II. Battery lasts 40 hours; battery life is normally distributed with mean = 50 and variance = 16.

Therefore standard deviation = square root of variance = 4

Z score for the value 40:

Z score = (x – mean) / standard deviation

Z score = (40 – 50) / 4

Z score = -2.5

From the Z table, the area between 0 and 2.5 = 0.4938

The following diagram demonstrates the probabilities

Probability that battery life is greater than 40 hours = 0.4938 + 0.5 = 0.9938

Probability that battery life is less than 40 hours = 1 – 0.9938 = 0.0062
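The table lookup can be cross-checked numerically. This is a small sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later), with the mean and variance taken from the question; the variable names are illustrative.

```python
from statistics import NormalDist

mean, variance = 50, 16
battery = NormalDist(mu=mean, sigma=variance ** 0.5)  # sigma = 4

z = (40 - mean) / battery.stdev   # standardized value of 40 hours
p_less = battery.cdf(40)          # P(life < 40)
p_greater = 1 - p_less            # P(life > 40)

print(z, round(p_less, 4), round(p_greater, 4))
```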

III. 60,000 students

To compare the above, the appropriate method is stratified sampling. This involves subdividing the population into the required categories (strata), determining the equal sample size to be selected from each category, and then selecting a random sample of that size from each category.
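The procedure can be sketched in a few lines of Python. Everything specific here is an assumption for illustration only: the three category labels, the sample size of 100 per category and the function name `stratified_sample` are hypothetical, since the original does not state the categories.

```python
import random

def stratified_sample(population, stratum_of, per_stratum, seed=0):
    """Draw `per_stratum` random members from each stratum of the population."""
    rng = random.Random(seed)
    strata = {}
    # Step 1: subdivide the population into categories
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    # Steps 2 and 3: take an equal-sized random sample from each category
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical population: 60,000 student IDs in three made-up categories
students = [(i, ("A", "B", "C")[i % 3]) for i in range(60_000)]
sample = stratified_sample(students, stratum_of=lambda s: s[1], per_stratum=100)

print({name: len(members) for name, members in sample.items()})
```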

Question 2:

I. The one-sample Z test is applied; the calculated Z value is obtained as follows:

Z calculated = (x – mean) / (standard deviation / square root of n)

Sample data:

3101   2984   3333   2416
3102   1897   2851   2134
3415   3050   3002   2910
3222   3872   2999   2806

mean                 2943.375
n                    16
standard deviation   478.9157
1980                 3011
Z calculated         0.564818

Z calculated = (3011 – 2943.375) / (478.9157 / square root of 16)

Z calculated = 0.564818

Z critical value at the 95% level (two-tailed) = 1.96

|Z calculated| < Z critical

Therefore the null hypothesis that the two values are equal is not rejected; this means that the 1987 and 1980 tax deductions were not significantly different.

Confidence interval

P [(mean – (Z critical * Stdev / square root of n)) < mean < (mean + (Z critical * Stdev / square root of n))] = 95%

P [(2943.375 – (1.96 * 478.9157 / 4)) < mean < (2943.375 + (1.96 * 478.9157 / 4))] = 95%

P [(2708.706) < mean < (3178.044)] = 95%

Answer: the 1980 and 1987 tax deductions were not significantly different at the 95% level of the test

Confidence interval: we are 95% confident that the mean is between 2708.71 and 3178.04; the 1980 value of 3011 lies inside this interval.
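The test statistic, critical value and interval can be recomputed from the raw data. The sketch below uses only the Python standard library and assumes a two-tailed test at the 95% level with the sample standard deviation standing in for the population value; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

sample = [3101, 2984, 3333, 2416, 3102, 1897, 2851, 2134,
          3415, 3050, 3002, 2910, 3222, 3872, 2999, 2806]
reference = 3011  # the 1980 value from the table

n = len(sample)
x_bar = mean(sample)
std_err = stdev(sample) / sqrt(n)        # sample standard deviation / sqrt(n)

z_calc = (reference - x_bar) / std_err   # same sign convention as the text
z_crit = NormalDist().inv_cdf(0.975)     # two-tailed critical value, 95% level

reject_h0 = abs(z_calc) > z_crit
ci = (x_bar - z_crit * std_err, x_bar + z_crit * std_err)

print(round(z_calc, 4), round(z_crit, 2), reject_h0)
print(tuple(round(bound, 2) for bound in ci))
```

With only 16 observations and an estimated standard deviation, a t distribution with 15 degrees of freedom would be the more cautious choice; the Z version is kept here because it is the test named in the answer.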

II. Unemployment = 30%

Sample of 400, 90 are unemployed

Proportion = 90/400 = 0.225

Proportion hypothesis test:

Hypothesis:

H0: P ≥ 0.30

H1: P < 0.30

Z calculated = (P1 – P0) / square root (P0 (1 – P0) / n)

Where P0 is the hypothesized proportion and P1 is the sample proportion

Z calculated = (0.225 – 0.3) / square root (0.3 (1 – 0.3) / 400)

Z calculated = -3.2733

Z critical at 5% (one-tailed, lower tail) = -1.645

Z calculated < Z critical, therefore the null hypothesis is rejected; this means that the proportion of unemployed is less than 0.3
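The arithmetic of the proportion test can be checked with a short script. This is a sketch that assumes a lower-tailed test at the 5% level, matching H1: P < 0.30; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.30                 # hypothesized proportion
n, unemployed = 400, 90
p_hat = unemployed / n    # sample proportion, 0.225

# Standard error uses the hypothesized proportion p0
z_calc = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
z_crit = NormalDist().inv_cdf(0.05)   # lower-tail critical value at the 5% level

reject_h0 = z_calc < z_crit           # reject when Z falls in the lower tail
print(round(z_calc, 4), round(z_crit, 3), reject_h0)
```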

III. Weight gain in five different groups:

ANOVA is appropriate given that it compares more than two mean values:

     A    B    C    D    E
1    10   9    12   6    11
2    11   6    11   8    9
3    7    8    8    9    10
4    8    7    10   8    7

The assumptions are that the values in each group are normally distributed, that the group standard deviations are equal, and that the group means may differ.

Hypothesis:

H0: M1=M2=M3=M4=M5

H1: not all of the means are equal

Using the Excel data analysis tool, the following tables summarize the results:

ANOVA: Single Factor

SUMMARY
Groups   Count   Sum   Average   Variance
Row 1    5       48    9.6       5.3
Row 2    5       45    9         4.5
Row 3    5       42    8.4       1.3
Row 4    5       40    8         1.5

ANOVA
Source of Variation   SS      df   MS     F             P-value      F crit
Between Groups        7.35    3    2.45   0.777777778   0.52339371   5.292214
Within Groups         50.4    16   3.15
Total                 57.75   19

From the table, the F critical value is 5.292214 and F calculated is 0.777777778. F critical > F calculated, therefore the null hypothesis of equal means is not rejected.

Answer: H0 is not rejected, meaning that there is no significant difference in the mean weight gain of the groups.
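The sums of squares and the F ratio in the Excel output can be reproduced by hand. The sketch below is a plain-Python one-way ANOVA; it groups the data by row because the Excel SUMMARY lists Row 1 to Row 4 as the groups. The function name `one_way_anova` is illustrative.

```python
def one_way_anova(groups):
    """Return (SS_between, SS_within, df_between, df_within, F)."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    # Between-group variation: group size times squared distance of group mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: squared distance of each value from its group mean
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat

# Rows 1-4 of the data table, grouped as in the Excel SUMMARY
rows = [[10, 9, 12, 6, 11],
        [11, 6, 11, 8, 9],
        [7, 8, 8, 9, 10],
        [8, 7, 10, 8, 7]]

ss_b, ss_w, df_b, df_w, f_stat = one_way_anova(rows)
print(round(ss_b, 2), df_b, round(ss_w, 2), df_w, round(f_stat, 6))
```

To treat the columns A to E as the five groups instead, the same function can be called with `list(zip(*rows))`, which gives 4 and 15 degrees of freedom rather than 3 and 16.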

Question 3:

Relationship between electricity usage and temperature:

elect usage   average temp
1000          18
420           50
400           55
705           30
550           45
850           25
1020          17
670           35
610           38
560           42

I. The following is a scatter diagram that shows the relationship between electricity usage and temperature:

From the chart, there is an inverse relationship between the two variables: as temperature increases, electricity usage declines.

II. Regression

The following table shows the regression output from Excel:

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.987489
R Square            0.975135
Adjusted R Square   0.972027
Standard Error      36.54977
Observations        10

ANOVA
             df   SS         MS         F             Significance F
Regression   1    419115.4   419115.4   313.7359736   1.0558E-07
Residual     8    10687.09   1335.886
Total        9    429802.5

               Coefficients   Standard Error   t Stat     P-value       Lower 95%
Intercept      1268.277       35.24601         35.98356   3.89722E-10   1186.99947
average temp   -16.6134       0.937945         -17.7126   1.05581E-07   -18.776339

The estimated model is as follows:

Electricity usage = 1268.28 – 16.6134 temperature

This model means that as temperature increases, electricity usage declines: holding all other factors constant, a one-unit increase in temperature reduces electricity usage by 16.6134 units. The intercept means that the estimated electricity usage is about 1268 units when the temperature is zero.
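The coefficients can be reproduced with the ordinary least squares formulas. This is a minimal sketch in plain Python using the ten observations from the data table; the helper `predict` and the variable names are illustrative.

```python
usage = [1000, 420, 400, 705, 550, 850, 1020, 670, 610, 560]
temp = [18, 50, 55, 30, 45, 25, 17, 35, 38, 42]

n = len(temp)
mean_x, mean_y = sum(temp) / n, sum(usage) / n
s_xx = sum((x - mean_x) ** 2 for x in temp)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(temp, usage))
s_yy = sum((y - mean_y) ** 2 for y in usage)

slope = s_xy / s_xx                     # change in usage per unit of temperature
intercept = mean_y - slope * mean_x     # fitted usage at temperature 0
r_squared = s_xy ** 2 / (s_xx * s_yy)   # share of variation explained

def predict(temperature):
    """Fitted electricity usage at the given temperature."""
    return intercept + slope * temperature

print(round(intercept, 3), round(slope, 4), round(r_squared, 6))
```

Calling `predict` with a temperature inside the observed range (17 to 55) gives the fitted usage; values far outside that range are extrapolations.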

The following is the fitted regression model:

III. Hypothesis test on significance of temperature coefficient:

From the Excel output, the t statistic for the slope = -17.7126. The critical t value at the 0.01 (two-tailed) level of significance with 8 degrees of freedom is ±3.355. Since |t calculated| > t critical, the null hypothesis that the slope is equal to zero is rejected; the slope is therefore significant at the 0.01 significance level.
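The t statistic for the slope can be derived from the data rather than read from the output. The sketch below recomputes the fit, the residual standard error and the slope's standard error in plain Python. The critical value is hard-coded as an assumption taken from a standard t table (two-tailed 0.01, n - 2 = 8 degrees of freedom), because the standard library has no t distribution.

```python
from math import sqrt

usage = [1000, 420, 400, 705, 550, 850, 1020, 670, 610, 560]
temp = [18, 50, 55, 30, 45, 25, 17, 35, 38, 42]

n = len(temp)
mean_x, mean_y = sum(temp) / n, sum(usage) / n
s_xx = sum((x - mean_x) ** 2 for x in temp)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(temp, usage)) / s_xx
intercept = mean_y - slope * mean_x

# Residual standard error with n - 2 degrees of freedom
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(temp, usage))
resid_se = sqrt(sse / (n - 2))

slope_se = resid_se / sqrt(s_xx)   # standard error of the slope
t_stat = slope / slope_se          # test of H0: slope = 0

T_CRIT = 3.355  # assumed table value: two-tailed 0.01, 8 degrees of freedom
significant = abs(t_stat) > T_CRIT

print(round(slope_se, 6), round(t_stat, 4), significant)
```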

IV. Predict usage when temperature is 40 degrees

Electricity usage = 1268.28 – 16.6134 temperature

Electricity usage = 1268.28 – 16.6134 (40)

Electricity usage = 603.7395