Methodology of Econometric Research:

Econometric research proceeds through four main stages:

1. Model specification

2. Estimation of the model

3. Evaluation of the estimates

4. Evaluation of the forecasting validity of the model

Model specification

Model specification is based on economic theory and the available literature, which help us to specify the relationship between the dependent variable and the independent variables in the form Y = F(X) or Y = F(X1, X2). We then choose the mathematical form of the model, that is, we determine whether the relationship between the variables is linear, quadratic or of some other form. To do this we can plot the data on a scatter diagram and inspect the relationship that exists, or we can experiment with different specifications. Model specification is the most difficult stage, yet also the most important, because of the specification errors that can occur. These include errors of omission, where relevant variables are left out, and the choice of a mistaken mathematical form for the model.

Estimation of the model:



This stage involves obtaining numerical estimates of the coefficients of the model. It consists of the following steps:

Gathering the data

We gather the data and determine its type. The data may be time series data, which are collected on the same variables from one period to the next over a long period of time, or cross sectional data, which are collected at a single point in time from different units such as countries or regions.

Aggregation of units

This entails adding up observations collected at different levels and from different units into the aggregate variables used in the model.

Choosing appropriate econometric technique

We choose between a single equation technique and a simultaneous equation technique, depending on the nature of the relationship between the variables.

Evaluation of the estimates:



This stage involves determining whether the estimates of the model are theoretically and statistically meaningful. We do this using three sets of criteria. Under the economic criteria we check for unexpected signs and sizes of the parameters, which could be the result of deficiencies in the data or a poor choice of model.

Under the statistical criteria we test the statistical reliability of the estimates by calculating the correlation coefficient (r), whose square measures the goodness of fit, and the standard errors of the estimates, which measure their dispersion. The econometric criteria involve checking whether the assumptions of the econometric model are satisfied; if they are not, we must re-specify the model. Here we check for autocorrelation, examine the standard errors and ask whether the estimators are the best linear unbiased estimators (BLUE).

Evaluation of the forecasting power of the model

Here we test the predictive power of the model before we use it for forecasting. We also investigate the stability of the estimates by changing the sample size or by re-estimating the model on different samples.

In our case we are to investigate the effects of war on economic development. The occurrence of war is qualitative in nature, so we represent it with a dummy variable, which takes only the values zero and one: one if there is war and zero if there is no war. We first specify the function of economic development and then add the dummy variable to the model. In the data we therefore add another column, say W, which is given the value one in the years in which there is war and zero in the years in which there is no war.

We may assume that economic development E is equal to a constant C plus a coefficient D times the dummy variable W, so our equation will be E = C + DW. Once we have specified the equation in this way, we estimate it using the classical linear regression model.
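The dummy-variable equation can be illustrated with a minimal sketch. It assumes that W is the war dummy column and D its coefficient, and the figures for E are hypothetical values invented purely for illustration; the helper name `ols_simple` is likewise an assumption, not part of the original text.

```python
from statistics import mean

# Hypothetical data: E is a development indicator for eight years,
# W is the dummy column (1 in war years, 0 otherwise).
E = [5.0, 4.8, 2.1, 1.9, 5.2, 2.0, 5.0, 4.9]
W = [0, 0, 1, 1, 0, 1, 0, 0]


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = c + d*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    d = sxy / sxx
    c = y_bar - d * x_bar
    return c, d


# Estimate E = C + D*W
C, D = ols_simple(W, E)
print(f"C = {C:.3f}, D = {D:.3f}")
```

With a single dummy regressor, C works out to the mean of E in the years without war and D to the difference between the mean in war years and the mean in years without war, so a negative D would suggest lower development in war years.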

The classical linear regression model has the following assumptions:

1. The error term is a real random variable

2. The mean value of the error term is equal to zero

3. The variance of the error term is constant across observations, in other words it is homoscedastic and not heteroscedastic

4. The error term is normally distributed

5. The error terms of different observations are independent and therefore not correlated

6. The error term is independent of the explanatory variable, so there is no correlation between the error term and the independent variable

According to the Gauss-Markov theorem, given the assumptions of the classical linear regression model, the OLS estimators have the minimum possible variance in the class of linear unbiased estimators, that is to say they are BLUE (best linear unbiased estimators).

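Given a model of the form Y = α + βx, the OLS estimates $\hat{\alpha}$ and $\hat{\beta}$ of α and β are

\[ \hat{\alpha} = \bar{Y} - \hat{\beta}\,\bar{x} \]

where $\bar{Y}$ is the mean of Y and $\bar{x}$ is the mean of x, and

\[ \hat{\beta} = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - \left(\sum x\right)^2} \]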
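The two formulas above can be evaluated directly from the sums they contain. The following is a minimal sketch; the function name `ols_from_sums` and the data are illustrative assumptions, with the data chosen to lie exactly on the line Y = 1 + 2x so that the result is easy to check by hand.

```python
def ols_from_sums(x, y):
    """Estimate alpha and beta in Y = alpha + beta*x from the raw sums."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    # Slope: (n*sum(xy) - sum(x)*sum(y)) / (n*sum(x^2) - (sum(x))^2)
    beta = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: mean of Y minus beta times mean of x
    alpha = sum_y / n - beta * (sum_x / n)
    return alpha, beta


# Hypothetical data lying exactly on Y = 1 + 2x
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
alpha_hat, beta_hat = ols_from_sums(x, y)
print(alpha_hat, beta_hat)
```

For these figures the sums are ∑x = 15, ∑y = 35, ∑xy = 125 and ∑x² = 55, so the slope is (625 - 525) / (275 - 225) = 2 and the intercept is 7 - 2 × 3 = 1.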


Heteroscedasticity is most common in cross sectional data; it is a situation in which the variance of the error term is not constant across observations.

Consequences of heteroscedasticity:

The OLS estimators are still linear and unbiased, but they no longer have minimum variance, so they are no longer efficient. In addition, the usual formulas for the variances of the estimators are biased, so confidence intervals, t and F tests and other hypothesis tests are not reliable.

Detection of heteroscedasticity:



The first and easiest method is to plot the residuals against the independent variable. The resulting graph helps us to detect heteroscedasticity: if the spread of the residuals increases as the value of the independent variable increases, then there is heteroscedasticity.

Another method is the Park test. Because the variance of the error term is not observed, the squared residuals are used in its place: their logarithm is regressed on the logarithm of one of the explanatory variables, and if the estimated slope is statistically significant then there is heteroscedasticity. Other methods include the Glejser test and White's general test.
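The Park test can be sketched as two successive regressions. The example below is only an illustration under stated assumptions: the data are simulated with an error whose standard deviation grows with x, the helper `ols` is a hypothetical name, and the log-log form of the auxiliary regression is the usual textbook version of the test rather than something spelled out in the text.

```python
import math
import random


def ols(x, y):
    """Return intercept, slope and the standard error of the slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, se_slope


# Simulated (hypothetical) data: the error standard deviation equals x,
# so the error variance is not constant across observations.
random.seed(0)
x = [random.uniform(1, 10) for _ in range(500)]
y = [2 + 3 * xi + random.gauss(0, xi) for xi in x]

# Step 1: estimate the original model and keep the residuals.
a, b, _ = ols(x, y)
residuals = [yi - a - b * xi for xi, yi in zip(x, y)]

# Step 2: regress ln(residual^2) on ln(x) and examine the slope.
ln_e2 = [math.log(e * e) for e in residuals]
ln_x = [math.log(xi) for xi in x]
_, park_slope, park_se = ols(ln_x, ln_e2)
park_t = park_slope / park_se
print(f"Park slope = {park_slope:.2f}, t ratio = {park_t:.2f}")
```

A t ratio well above the critical value from the t table points to heteroscedasticity; for data simulated with a constant error variance the slope would be expected to be close to zero.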


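A remedy for heteroscedasticity is to re-specify the model. Another remedy applies where the true variance of the error term is known: we use the weighted least squares method, in which the whole model is divided by the standard deviation of the error term (the square root of its variance) so that the transformed error has constant variance. Where the variance is not known, we make an assumption about its form; for example, if the variance is assumed to be proportional to the independent variable, we divide the whole model by the square root of the independent variable.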
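Dividing the model by the square root of x is equivalent to minimising the sum of squared residuals with each observation weighted by 1/x. The sketch below uses that equivalent weighted form; the function name `wls` and the data are hypothetical, and the weights rest on the assumption that the error variance is proportional to x.

```python
def wls(x, y, w):
    """Weighted least squares for Y = alpha + beta*x with weights w."""
    sw = sum(w)
    # Weighted means of x and y
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta


# Hypothetical data in which the scatter widens as x grows
x = [1.0, 2.0, 4.0, 6.0, 8.0, 10.0]
y = [3.2, 4.9, 9.5, 12.1, 18.4, 19.0]

# Assumed variance proportional to x, so each weight is 1/x
weights = [1.0 / xi for xi in x]
alpha_wls, beta_wls = wls(x, y, weights)
print(f"alpha = {alpha_wls:.3f}, beta = {beta_wls:.3f}")
```

Observations with large x, which are assumed to be noisier, receive less weight. If the true variance were known, the weights would instead be the reciprocal of that variance.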

Hypothesis testing:

This is the statistical process of testing the estimated parameters using a t table. First you state the null hypothesis and the alternative hypothesis. Next you obtain Z calculated by dividing the estimated parameter by its standard error. You then specify the level of significance, for example 1% or 5%, and find Z critical from the t table, taking into consideration whether it is a two tail or a one tail test and the degrees of freedom.



If Z calculated is greater than Z critical in absolute value, then you reject the null hypothesis and accept the alternative hypothesis. The hypotheses are stated as follows:

For the slope β

Null hypothesis

H0: β = 0

Alternative hypothesis

H1: β ≠ 0

For the constant α you apply the following

Null hypothesis

H0: α = 0

Alternative hypothesis

H1: α ≠ 0



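Calculation of Z calculated

For β

\[ Z_{\beta\ \text{calculated}} = \frac{\hat{\beta}}{S_{\beta}} \]

where $\hat{\beta}$ is the estimate of β and $S_{\beta}$ is the standard error of β.

For α

\[ Z_{\alpha\ \text{calculated}} = \frac{\hat{\alpha}}{S_{\alpha}} \]

where $\hat{\alpha}$ is the estimate of α and $S_{\alpha}$ is the standard error of α.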

You specify the level of significance. If you choose a 5% level and it is a two tail test, you divide this percentage by two and therefore read the value for 0.025 on the t table. For a one tail test you read the value for 0.05 on the t table.

The decision rule, with Z calculated taken in absolute value for a two tail test, is as follows:

If Z calculated > Z critical, you reject the null hypothesis

If Z calculated < Z critical, you accept the null hypothesis

When you reject the null hypothesis, it means that the parameter you are testing is statistically significant, that is, significantly different from zero. If you accept the null hypothesis, the parameter being tested is not statistically significant and cannot be distinguished from zero at the chosen level. This can be shown diagrammatically on the normal distribution graph, where the rejection zones lie in the tails.

Therefore, if Z calculated lies in the rejection zone, you reject the null hypothesis.
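The testing procedure for the slope can be sketched as follows. The data are hypothetical, the function name `slope_and_se` is an assumption, and the critical value 2.306 is the t table reading for a two tail test at the 5% level with 8 degrees of freedom (10 observations minus 2 estimated parameters).

```python
import math


def slope_and_se(x, y):
    """Return the OLS slope estimate and its standard error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    # Residual variance uses n - 2 degrees of freedom
    se_beta = math.sqrt(rss / (n - 2) / sxx)
    return beta, se_beta


# Hypothetical sample of 10 observations
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

beta_hat, se_beta = slope_and_se(x, y)
z_calculated = beta_hat / se_beta  # estimate divided by its standard error

Z_CRITICAL = 2.306  # t table: 5% level, two tail, 8 degrees of freedom
reject_null = abs(z_calculated) > Z_CRITICAL
print(f"Z calculated = {z_calculated:.2f}, reject H0: {reject_null}")
```

Here the null hypothesis is that β = 0, so rejecting it means the slope is statistically significant. The same steps apply to α using its own estimate and standard error.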

Confidence interval:

A confidence interval is an interval estimate of a parameter together with a stated level of confidence. It is calculated for each of the two estimated parameters in the model. To build the confidence interval we standardise, as in the case of a population mean:

Lower limit of µ:

\[ \bar{x} - Z\,\sigma_{\bar{x}} \]

Upper limit of µ:

\[ \bar{x} + Z\,\sigma_{\bar{x}} \]

µ is the population mean while $\bar{x}$ is the sample mean, $\sigma_{\bar{x}}$ is the standard error of the sample mean, and Z is the reading from a normal distribution table for the chosen level of confidence, for example 99%, 98% or 95%. For a 95% confidence level the remaining 5% is split between the two tails, so we read the value for 2.5%, which gives Z = 1.96.

For α we therefore represent the confidence interval as follows:

\[ P\left(\hat{\alpha} - Z\,S_{\alpha} \le \alpha \le \hat{\alpha} + Z\,S_{\alpha}\right) = Y\% \]

where $\hat{\alpha}$ is the estimate of α, Y is the confidence level in percent, Z is the reading from the normal distribution table and $S_{\alpha}$ is the standard error of the parameter in question.

For β the same applies, and we represent this as follows:

\[ P\left(\hat{\beta} - Z\,S_{\beta} \le \beta \le \hat{\beta} + Z\,S_{\beta}\right) = Y\% \]
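The interval for either parameter can be computed from its estimate and standard error. In this minimal sketch the function name `confidence_interval` and the numerical values are hypothetical, and the normal table reading is obtained from Python's standard library rather than looked up by hand.

```python
from statistics import NormalDist


def confidence_interval(estimate, std_error, level=0.95):
    """Return (lower, upper) limits: estimate -/+ Z * standard error."""
    # Two-sided interval: split the remaining probability between the tails
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return estimate - z * std_error, estimate + z * std_error


# Hypothetical slope estimate and standard error
lower, upper = confidence_interval(estimate=2.0, std_error=0.5, level=0.95)
print(f"95% interval: ({lower:.3f}, {upper:.3f})")
```

A higher confidence level gives a larger Z and therefore a wider interval. If the interval does not contain zero, this agrees with rejecting the null hypothesis that the parameter equals zero in a two tail test at the corresponding level.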

The above discussion applies to simple models with one dependent variable and only one independent variable. However, there are cases where we use more than one independent variable; these are known as multiple regression models and take the form Y = F(X1, X2), that is to say Y depends on the variations of X1 and X2. These models are more complex, and their estimation differs from the estimation described above.



Correlation (r)

Correlation is the degree of relationship that exists between two or more variables. In a function of the form Y = F(X), the correlation between Y and X is measured by the Pearson correlation coefficient, which is calculated as follows:

\[ r = \frac{n \sum xy - \sum x \sum y}{\left(n \sum x^2 - \left(\sum x\right)^2\right)^{1/2} \left(n \sum y^2 - \left(\sum y\right)^2\right)^{1/2}} \]

The coefficient lies between -1 and 1. The closer it is to one in absolute value, the stronger the linear relationship between the two variables.
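The Pearson formula can be computed from the same sums used for the slope. This is a minimal sketch with hypothetical data; the function name `pearson_r` is an assumption.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient computed from the raw sums."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    sum_y2 = sum(yi * yi for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(n * sum_x2 - sum_x ** 2) * math.sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


# Hypothetical observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(f"r = {r:.4f}")
```

For these figures the numerator is 5 × 66 - 15 × 20 = 30 and the denominator is the square root of 50 × 30 = 1500, so r is about 0.77, a fairly strong positive relationship.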


Suppose that the cost of running a school depends on the number of students and the type of school, and that the general cost function is given by

Cost = B1 + B2N

where N is the number of students. If we want to determine the additional costs associated with the type of school, assume that we have two types of school, occupational and regular.
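One way to complete this example, assuming the same dummy-variable approach described earlier for war, is to add a dummy variable for the type of school. The symbols D and B3 and the choice of which type is coded as one are assumptions made here for illustration only:

```latex
\[
\text{Cost} = B_1 + B_2 N + B_3 D,
\qquad
D =
\begin{cases}
1 & \text{if the school is occupational} \\
0 & \text{if the school is regular}
\end{cases}
\]
```

Under this specification a regular school has the cost function B1 + B2N and an occupational school has (B1 + B3) + B2N, so B3 would measure the additional cost associated with an occupational school at any given number of students.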


The above discussion covers the main econometric techniques used in the estimation of a model. The nature of the data helps us to choose which model to apply and estimate. Statistical testing must be applied to every estimated model and the results determined accurately. The hypothesis test helps us to determine the importance of an estimated parameter: if the null hypothesis is accepted, the parameter is not statistically significant. Cross sectional data are often affected by heteroscedasticity, which violates the classical assumption that the error term has a constant variance across observations. A non-constant error variance may indicate that a relevant independent variable has been omitted, so this should be checked.


Jeffrey M. Wooldridge ( ) Introductory Econometrics