## Statistics exercises in R (answers and code in a single Rmd file)

These are statistics exercises. Use R to answer them, and include all the answers, together with the code, in a single Rmd file.

There are two exercises and each of them has at least 5 questions.

Exercise 1:

e) Test the null hypothesis that the slope parameter β1 is at least 160 versus the research hypothesis that it is less than 160. State the null and alternative hypotheses. Use α = 0.02, but be sure to provide a p-value.
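
A minimal sketch of the lower-tailed test in (e), assuming a simple linear regression of a response y on a predictor x. The data here are simulated placeholders (the exercise's data set is not shown), so the numbers it produces are not the answer; only the steps carry over.

```r
# Simulated placeholder data (an assumption); replace x and y with the exercise's variables.
set.seed(1)
x <- seq(1, 20, length.out = 25)
y <- 50 + 150 * x + rnorm(25, sd = 40)

fit <- lm(y ~ x)
coefs <- summary(fit)$coefficients
b1 <- coefs["x", "Estimate"]        # estimated slope
se_b1 <- coefs["x", "Std. Error"]   # its standard error
df_resid <- df.residual(fit)        # n - 2 for simple linear regression

# H0: beta1 >= 160 versus Ha: beta1 < 160, so small t values count against H0.
t_stat <- (b1 - 160) / se_b1
p_value <- pt(t_stat, df = df_resid)   # lower-tail probability
reject <- p_value < 0.02               # decision at alpha = 0.02
```

The test statistic is centered at 160 rather than 0, which is why it cannot be read directly off the summary() output.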

f) Provide a 98% confidence interval for the slope parameter. (You can also compute this from the estimate and standard error reported by summary(lm(y~x)) in R.)
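
One way to get the interval in (f) directly is confint() with level = 0.98. This is a sketch on simulated placeholder data (an assumption, since the real data are not shown); x and y stand in for the exercise's variables.

```r
# Simulated placeholder data (an assumption); replace with the exercise's variables.
set.seed(1)
x <- seq(1, 20, length.out = 25)
y <- 50 + 150 * x + rnorm(25, sd = 40)

fit <- lm(y ~ x)

# 98% confidence interval for the slope only
ci_slope <- confint(fit, parm = "x", level = 0.98)
ci_slope
```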

g) R gives a t-statistic for the slope parameter, along with a p-value. State the hypotheses that these are for.
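
The sketch below, again on simulated placeholder data (an assumption), recomputes the t value and p-value in the slope row of summary() by hand. It shows that they correspond to the two-sided test of H0: β1 = 0 against Ha: β1 ≠ 0.

```r
# Simulated placeholder data (an assumption); replace with the exercise's variables.
set.seed(1)
x <- seq(1, 20, length.out = 25)
y <- 50 + 150 * x + rnorm(25, sd = 40)

fit <- lm(y ~ x)
coefs <- summary(fit)$coefficients

# t statistic for H0: beta1 = 0 is estimate / standard error
t_manual <- coefs["x", "Estimate"] / coefs["x", "Std. Error"]

# two-sided p-value: probability in both tails beyond |t|
p_manual <- 2 * pt(-abs(t_manual), df = df.residual(fit))

# these match the values R prints in the slope row
all.equal(t_manual, coefs["x", "t value"])
all.equal(p_manual, coefs["x", "Pr(>|t|)"])
```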

h) Plot the residuals versus density and judge whether they show evidence of either lack of model fit or non-constant variance (that is, whether at least one of the assumptions for the statistical analysis does not hold). Hint: there is something to find.

i) Even though part (h) suggests some assumptions are not appropriate, obtain a normal quantile plot of the residuals. What is evident here (that can be explained by the problem observed in part (h))?
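
The two diagnostic plots in (h) and (i) can be sketched as follows. The predictor name density comes from the exercise; the response name y and the simulated values are assumptions, so the pattern in these plots says nothing about the real data.

```r
# Simulated placeholder data (an assumption); the real density data are not shown here.
set.seed(3)
density <- seq(0.5, 3, length.out = 30)
y <- 100 + 160 * density + rnorm(30, sd = 20)

fit <- lm(y ~ density)
res <- resid(fit)

# (h) residuals versus the predictor, with a reference line at zero
plot(density, res, xlab = "density", ylab = "residual")
abline(h = 0, lty = 2)

# (i) normal quantile plot of the residuals with a reference line
qqnorm(res)
qqline(res)
```

In the residual plot, curvature points to lack of fit and a fan shape points to non-constant variance.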

Exercise 2:

a) Plot viability vs. time. Is a straight line fit appropriate?

b) Plot log10 of viability vs. time. Is a straight line appropriate here?

c) Obtain the least squares fitted regression line for part (b) on top of the data. Use this fit also to express the prediction of viability (not just log viability) as a function of time. What is your prediction for two hours?
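
A sketch of (c), assuming the data frame is called disease (as in part (e)) with columns named time and viability; the column names and the simulated values are assumptions. It also assumes time is recorded in hours, so "two hours" is time = 2; convert if the data use other units.

```r
# Simulated placeholder data (an assumption); column names are hypothetical.
set.seed(2)
disease <- data.frame(time = 0:10)
disease$viability <- 1e6 * 10^(-0.3 * disease$time + rnorm(11, sd = 0.05))

# Least squares fit on the log10 scale, drawn on top of the data
fit_log <- lm(log10(viability) ~ time, data = disease)
plot(log10(viability) ~ time, data = disease)
abline(fit_log)

# Back-transform: viability = 10^(b0 + b1 * time)
b <- coef(fit_log)
predict_viability <- function(t) unname(10^(b[1] + b[2] * t))

pred_2h <- predict_viability(2)   # assumes time is measured in hours
```

Because the line is fitted to log10(viability), raising 10 to the fitted value returns the prediction to the original viability scale.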

d) Obtain the analysis of variance table for the regression in (c) and test H0: β1 = 0. Use α = 0.05.
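
For (d), anova() on the fitted model gives the analysis of variance table. The sketch below uses simulated placeholder data with hypothetical column names time and viability (assumptions); only the data frame name disease comes from the exercise.

```r
# Simulated placeholder data (an assumption); column names are hypothetical.
set.seed(2)
disease <- data.frame(time = 0:10)
disease$viability <- 1e6 * 10^(-0.3 * disease$time + rnorm(11, sd = 0.05))

fit_log <- lm(log10(viability) ~ time, data = disease)

aov_table <- anova(fit_log)
aov_table

# F test of H0: beta1 = 0 at alpha = 0.05
f_stat <- aov_table["time", "F value"]
p_value <- aov_table["time", "Pr(>F)"]
reject <- p_value < 0.05
```

In simple linear regression this F statistic equals the square of the slope's t statistic, so the two tests agree.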

e) Provide a 98% confidence interval for β1. Get the standard error from the computer, but otherwise show the computation of the interval. Use: summary(lm(y~x, data = disease)).
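
The hand computation in (e) is estimate ± t(0.99, n − 2) × standard error. A sketch on simulated placeholder data follows; the column names time and viability are hypothetical, and the regression is assumed to be the log10 fit from (c).

```r
# Simulated placeholder data (an assumption); column names are hypothetical.
set.seed(2)
disease <- data.frame(time = 0:10)
disease$viability <- 1e6 * 10^(-0.3 * disease$time + rnorm(11, sd = 0.05))

fit_log <- lm(log10(viability) ~ time, data = disease)
coefs <- summary(fit_log)$coefficients

b1 <- coefs["time", "Estimate"]
se_b1 <- coefs["time", "Std. Error"]    # standard error taken from the computer
df_resid <- df.residual(fit_log)        # n - 2

# 98% interval leaves 1% in each tail, so use the 0.99 quantile of t
t_crit <- qt(0.99, df = df_resid)
ci_manual <- c(lower = b1 - t_crit * se_b1, upper = b1 + t_crit * se_b1)
ci_manual
```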

f) Plot the residuals vs. time and judge whether there are any serious problems with model fit or the constant spread assumption.

g) Check the normality assumption of the residuals.
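
Parts (f) and (g) can be approached with a residual plot, a normal quantile plot, and, optionally, a Shapiro-Wilk test. The exercise does not ask for the Shapiro-Wilk test; it is included here as one possible supplement. The data below are simulated placeholders with hypothetical column names (assumptions).

```r
# Simulated placeholder data (an assumption); column names are hypothetical.
set.seed(2)
disease <- data.frame(time = 0:10)
disease$viability <- 1e6 * 10^(-0.3 * disease$time + rnorm(11, sd = 0.05))

fit_log <- lm(log10(viability) ~ time, data = disease)
res <- resid(fit_log)

# (f) residuals versus time: look for curvature or changing spread
plot(disease$time, res, xlab = "time", ylab = "residual")
abline(h = 0, lty = 2)

# (g) normal quantile plot, plus a formal test as a supplement
qqnorm(res)
qqline(res)
sw <- shapiro.test(res)
sw$p.value
```

With a small sample the Shapiro-Wilk test has little power, so the quantile plot is usually the more informative check.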

h) Calculate and interpret the R² coefficient. Why is “correlation” not an appropriate concept for these data?
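
For (h), R² can be read from summary() or computed as 1 − SSE/SST. The sketch below does both on simulated placeholder data with hypothetical column names (assumptions), using the log10 fit from (c).

```r
# Simulated placeholder data (an assumption); column names are hypothetical.
set.seed(2)
disease <- data.frame(time = 0:10)
disease$viability <- 1e6 * 10^(-0.3 * disease$time + rnorm(11, sd = 0.05))

fit_log <- lm(log10(viability) ~ time, data = disease)

# R^2 as reported by R
r2 <- summary(fit_log)$r.squared

# R^2 by hand: proportion of variation in log10(viability) explained by time
y_log <- log10(disease$viability)
sse <- sum(resid(fit_log)^2)
sst <- sum((y_log - mean(y_log))^2)
r2_manual <- 1 - sse / sst
```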