Using Excel to Work on Correlation and Regression Examples - Statistics HW Help

Personal Income x

Personal outlay, y

5.6

4.6

5.8

4.9

6.2

5.2

6.5

5.5

6.9

5.8

7.4

6.1

7.8

6.5

8.4

7

8.7

7.3

8.9

7.7

9.2

8




  • Our company offers quality and prompt Statistical Homework Help Online.


  • We offer personalized help for any kind of Statistics subjects including Elementary Statistics, Stats 101, Business Statistics, Biostatistics, Probabilities, Advanced Statistics, etc.


  • Our service is convenient, efficient and confidential. We can solve stats problems for you.


  • We can help you with your EXCEL, SPSS, SAS, STATA, JMP and MINITAB assignments and projects.


  • Our rate starts $35/hour. We provide a Free Quote in hours. Quick turnaround!





Question 2: Calculate the correlation coefficient \(r\). What can you conclude?

Personal Income x

Personal outlay, y

Personal Income x

1

Personal outlay, y

0.995917753

1

\[r=0.9959\]

Solution: We need to test the hypotheses

\[\begin{array}{cc}

& {{H}_{0}}:\rho =0 \\ & {{H}_{A}}:\rho \ne 0 \\ \end{array}\]

We use a t-test, and the t-statistics is given by:

\[t=r\sqrt{\frac{n-2}{1-{{r}^{2}}}}=0.9959\sqrt{\frac{9}{1-{{0.9959}^{2}}}}=33.0996\]

The critical t-value for 9 degrees of freedom and \(\alpha =0.05\) is \({{t}_{c}}=2.262\), which means that we reject he null hypothesis. This means that the correlation is significantly different from zero.

Solution: We use Excel to find the regression equation. The results are shown in the table below:

SUMMARY OUTPUT

Regression Statistics

Multiple R

0.995918

R Square

0.991852

Adjusted

R Square

0.990947

Standard Error

0.109835

Observations

11

ANOVA

df

SS

MS

F

Significance F

Regression

1

13.21688

13.21688

1095.589

1.03E-10

Residual

9

0.108574

0.012064

Total

10

13.32545

Coefficients

Std. Error

t Stat

P-value

Lower 95%

Upper 95%

Intercept

-0.35871

0.201982

-1.77594

0.109475

-0.81562

0.098207

Personal Income x

0.891226

0.026926

33.09968

1.03E-10

0.830316

0.952136

This means that the regression equation is given by

\[\hat{y}=-0.35871+0.891226x \]

The scatter plot with the regression line is shown below:

regression line example

Question 5: Use the regression line to predict the personal outlays when the personal income is 5.3 trillion dollars.

\[\hat{y}=-0.35871+0.891226\times 5.3=4.364789\]

which means that predicted personal outlay is $4.364789 trillion dollars

Question 6: Find the coefficient of determination and interpret the results.

\[{{r}^{2}}={{0.995918}^{2}}=0.991852\]

This means that 99.1852% of the variation Personal Outlay is explained by Personal Income.

Question 7: Find the standard error of the estimate \({{s}_{e}}\) and interpret the results.

\[{{s}_{e}}=\sqrt{\frac{\sum{{{\left( {{Y}_{i}}-{{{\hat{Y}}}_{i}} \right)}^{2}}}}{n-2}}=0.109835\]

Question 8: Construct a 95% prediction interval for personal outlays when personal income is 6.4 trillion dollars. Interpret the results.

\[\hat{y}=-0.35871+0.891226\times 6.4=5.345138\] \[CI=\left( \hat{y}-E,\text{ }\hat{y}+E \right)=\left( 5.345138-E,\text{ }5.345138+E \right) \] \[E={{t}_{\alpha /2}}\times {{s}_{e}}\sqrt{\frac{1}{n}+\frac{{{\left( X-\bar{X} \right)}^{2}}}{\sum{{{X}^{2}}-n{{{\bar{X}}}^{2}}}}}=2.262159\times 0.109835\sqrt{\frac{1}{11}+\frac{{{\left( 6.4-7.4 \right)}^{2}}}{619-11\times {{7.4}^{2}}}}=0.096552\] \[CI=\left( 5.345138-E,\text{ }5.345138+E \right)=\left( 5.248586,\text{ }5.44169 \right) \]
\[\hat{y}=1257-1.34{{x}_{1}}+1.41{{x}_{2}}\]

where \({{x}_{1}}\) is the number of acres planted (in thousands) and \({{x}_{2}}\) is the number of acres harvested (in thousands). Use the regression equation to predict the y-values for the given values of the independent variable listed below. Then determine which variable has a greater influence on the value of \(y\)

(a) \({{x}_{1}}=2103,\text{ }{{x}_{2}}=2037\) (b) \({{x}_{1}}=3387,\text{ }{{x}_{2}}=3009\)

(c) \({{x}_{1}}=2185,\text{ }{{x}_{2}}=1980\) (d) \({{x}_{1}}=3485,\text{ }{{x}_{2}}=3404\)

21-29

30-39

40-49

50-59

60 and above

Jury

45

128

244

224

359

Population

20.50%

21.70%

18.10%

17.30%

22.40%

\[{{H}_{0}}:{{p}_{1}}=0.205,\text{ }{{p}_{2}}=0.217,\text{ }{{p}_{3}}=0.181,\text{ }{{p}_{4}}=0.173,\text{ }{{p}_{5}}=0.224\]

21-29

30-39

40-49

50-59

60 and above

Total

Jury (Observed)

45

128

244

224

359

1000

Expected

205

217

181

173

224

1000

\[{{\chi }^{2}}=\sum{\frac{{{\left( {{O}_{i}}-{{E}_{i}} \right)}^{2}}}{{{E}_{i}}}}=279.7048\] \[\chi _{C}^{2}=13.2767\]

Gender

16-20

21-30

31-40

41-50

51-60

61+

Male

155

280

265

225

155

50

Female

60

145

145

130

65

15

Observed Frequencies

Age Group

Gender

16-20

21-30

31-40

41-50

51-60

61+

Total

Male

155

280

265

225

155

50

1130

Female

60

145

145

130

65

15

560

Total

215

425

410

355

220

65

1690

Expected Frequencies

Age Group

Gender

16-20

21-30

31-40

41-50

51-60

61+

Total

Male

143.7574

284.1716

274.142

237.3669

147.1006

43.46154

1130

Female

71.2426

140.8284

135.858

117.6331

72.89941

21.53846

560

Total

215

425

410

355

220

65

1690

\[{{\chi }^{2}}=\sum{\frac{{{\left( {{O}_{ij}}-{{E}_{ij}} \right)}^{2}}}{{{E}_{ij}}}}=9.95144\] \[\chi _{C}^{2}=11.07048\]

Question 20: \(\alpha =0.05\), d.f.N = 6, d.f.D = 50

Question 21: \(\alpha =0.10\), d.f.N = 5, d.f.D = 12

\[\begin{array}{cc} &{{H}_{0}}:{{\sigma }_{1}}={{\sigma }_{2}} \\ &{{H}_{A}}:{{\sigma }_{1}}<{{\sigma }_{2}} \\ \end{array}\] \[F=\frac{s_{2}^{2}}{s_{1}^{2}}=\frac{{{25.4}^{2}}}{{{1.31}^{2}}}=375.9455\]