Data Analysis 2

Task 1 (30 marks)

Task 1 is based on a dataset that you prepare as follows:
A.    Go to http://data.worldbank.org/ .
Click on the option By Country on the top left-hand side of the page and select a country from the list. For this country, collect the data on (i) Life expectancy at birth, total (years), (ii) GNI per capita, Atlas method (current US$) and (iii) Improved water source, rural (% of rural population with access).
B.    Acquire the data for a total of n randomly selected countries, where n is an integer of your choice satisfying 15 ≤ n ≤ 30.

C.    Questions:
Consider the Classical Linear Regression Model (CLRM) Y = α + βX + ε, where X denotes the independent variable GNI per capita, Atlas method (current US$), Y is the dependent variable Life expectancy at birth, total (years), α and β are unknown constants and ε is a random variable. Use a calculator and your sample to calculate ∑X, ∑Y, ∑XY and ∑X². Use these values to write down the pair of ‘normal equations’ whose solutions give the constant term (a) and the slope coefficient (b) of the fitted Ordinary Least Squares line Y = a + bX.
i.    Write down the equivalent matrix representation of the normal equations you have written above. (15 marks)
ii.    Explain how matrix algebra can be used to solve for the terms a and b. (15 marks)
[Note: You are not required to give the solutions; simply explain the method.]
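The steps in Task 1 can be sketched in a few lines of Python. This is an illustrative sketch only, not a model answer: the four (X, Y) pairs are made-up placeholders rather than World Bank figures, and the function names normal_equation_sums and solve_normal_equations are hypothetical. It assumes the standard OLS normal equations n·a + b·∑X = ∑Y and a·∑X + b·∑X² = ∑XY, written in matrix form as A·[a, b]ᵀ = c and solved with the inverse of the 2×2 matrix A.

```python
# Illustrative sketch only: the data are made-up placeholders, not World Bank figures.

def normal_equation_sums(xs, ys):
    """Return n, sum(X), sum(Y), sum(XY) and sum(X^2) for the sample."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    return n, sum_x, sum_y, sum_xy, sum_x2


def solve_normal_equations(n, sum_x, sum_y, sum_xy, sum_x2):
    """Solve A [a, b]^T = c, where
         A = [[n,     sum_x ],
              [sum_x, sum_x2]]   and   c = [sum_y, sum_xy]^T,
    by multiplying c with the inverse of the 2x2 matrix A."""
    det = n * sum_x2 - sum_x * sum_x
    if det == 0:
        raise ValueError("A is singular: all X values are identical")
    # inverse(A) = (1 / det) * [[sum_x2, -sum_x], [-sum_x, n]]
    a = (sum_x2 * sum_y - sum_x * sum_xy) / det
    b = (n * sum_xy - sum_x * sum_y) / det
    return a, b


# Hypothetical sample lying exactly on the line Y = 1 + 2X.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

n, sum_x, sum_y, sum_xy, sum_x2 = normal_equation_sums(xs, ys)
a, b = solve_normal_equations(n, sum_x, sum_y, sum_xy, sum_x2)
print(f"normal equations: {n}a + {sum_x}b = {sum_y};  {sum_x}a + {sum_x2}b = {sum_xy}")
print(f"a = {a}, b = {b}")
```

With a real sample of 15 to 30 countries the same four sums would come from the calculator step, and only the entries of A and c change.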

Task 2 (70 marks)
Consider the Classical Linear Regression Model (CLRM) Y = β₁ + β₂X₂ + β₃X₃ + β₄X₄ + ε, where
Y ≡ IN ≡ Internet users; X₂ ≡ Urban population; X₃ ≡ Literacy rate and X₄ ≡ GDP.
Stata performed an OLS estimation of the model and produced the regression output below:

Use the regression output to answer the following questions:
i. Identify the 95% confidence interval for the coefficient β₂ and show the calculations that explain its derivation. (20 marks)
ii. Test (at the 5% level of significance) the hypothesis that GDP has a positive effect on IN. (20 marks)
iii. Explain why and how the F-statistic was calculated by the software and how it can be used to test for the overall relevance of the regression model. (30 marks)
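The three calculations asked for above can be sketched as follows. Every number in this sketch is a hypothetical placeholder, because the Stata output is not reproduced here: the sample size, estimates, standard errors and R² are invented purely to show the arithmetic, and the function names are illustrative. The critical values are assumed to be read from standard t and F tables for 26 and (3, 26) degrees of freedom. The sketch uses the textbook formulas: estimate ± t-critical × standard error for the interval, t = estimate / standard error for a right-tailed test of H0: coefficient ≤ 0 against H1: coefficient > 0, and F = [R²/(k − 1)] / [(1 − R²)/(n − k)] for the joint test that all slope coefficients are zero.

```python
# All figures below are hypothetical placeholders, NOT values from the Stata output.

def confidence_interval(coef, std_err, t_crit):
    """Two-sided interval: coef +/- t_crit * std_err."""
    half_width = t_crit * std_err
    return coef - half_width, coef + half_width


def t_statistic(coef, std_err, hypothesised=0.0):
    """t ratio for H0: coefficient equals the hypothesised value."""
    return (coef - hypothesised) / std_err


def f_statistic(r_squared, n_obs, n_params):
    """Overall F statistic; n_params (k) counts the intercept."""
    explained = r_squared / (n_params - 1)
    unexplained = (1.0 - r_squared) / (n_obs - n_params)
    return explained / unexplained


# Assumed sample size and number of estimated parameters (intercept + 3 slopes).
n_obs, n_params = 30, 4
df_resid = n_obs - n_params      # 26 residual degrees of freedom

# Critical values taken from standard tables for these degrees of freedom.
T_CRIT_TWO_SIDED = 2.056         # t table, 2.5% in each tail, 26 df
T_CRIT_ONE_SIDED = 1.706         # t table, 5% in the upper tail, 26 df
F_CRIT = 2.98                    # F table, 5% level, (3, 26) df

# (i) 95% confidence interval for the coefficient on X2 (hypothetical estimate).
b2, se_b2 = 0.50, 0.10
lower, upper = confidence_interval(b2, se_b2, T_CRIT_TWO_SIDED)

# (ii) Right-tailed test for the coefficient on X4 (hypothetical estimate).
b4, se_b4 = 0.30, 0.12
t4 = t_statistic(b4, se_b4)
reject_h0 = t4 > T_CRIT_ONE_SIDED

# (iii) Overall significance of the regression (hypothetical R-squared).
r_squared = 0.60
f_value = f_statistic(r_squared, n_obs, n_params)
model_is_relevant = f_value > F_CRIT

print(f"95% CI: ({lower:.4f}, {upper:.4f})")
print(f"t = {t4:.3f}, reject H0: {reject_h0}")
print(f"F = {f_value:.3f}, reject H0 of no overall relevance: {model_is_relevant}")
```

When working from the actual regression output, the estimates, standard errors, R², number of observations and degrees of freedom should be read from the table, and the critical values looked up for those degrees of freedom.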

General guidelines:
(i)    Present your explanations clearly, using a word processor such as Word.
(ii)    Do not write anything that is unrelated to the numeric analysis you have performed.
(iii)    Remember that a proper explanation is as important as correct numerical calculations.
(iv)    Reference your work accurately, acknowledging any source of information you have used.
(v)    Stay within the specified word limit (± 10%).
