*Data Analytics*

**Analysis Assignment #3**

*Use the LifeStyle.sav data (in the “Assignment 3” folder on Canvas) for this assignment.*

*The sample data (named “LifeStyle” on BlackBoard) is survey data from a questionnaire collected in 1998 by Prof. Putnam at Harvard. It contains information on internet usage, politically sensitive topics, and demographics.*

*Suppose you are the marketing manager of your favorite political party, and management asks you to investigate whether the party should redirect its PR and discussion of sensitive topics to the internet.*

*(Total 7 points + Bonus 1 point).*

*0.5+1.2+1.5+1.2+1.8+0.6+1.2=8*

- Build the frequency table with the variables ‘gender’ and ‘emplmerg’ and answer the following questions (total 0.5 points).

(a) How many respondents make an annual income between $70,000 and $89,999? (0.2 point)

(b) What percentage of the sample makes an annual income of less than $20,000? (0.3 point)

- Confidence Interval. (total 1.2 points)

(a) What is the mean age? Is it a population mean or sample mean? (0.3 point)

(b) What is the 95% confidence interval for the mean of Age? (0.3 point)

(c) What is your interpretation? (0.3 point)

(d) What is the 99% confidence interval for the mean of Age? (0.3 point)
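As an illustration of how the two intervals above relate, here is a minimal Python sketch, not the SPSS procedure itself. It assumes a large-sample normal (z) approximation, whereas SPSS reports a t-based interval that is slightly wider for small samples. The function name `mean_confidence_interval` and the `ages` list are hypothetical; the values are made up and are not taken from LifeStyle.sav.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, level=0.95):
    """Large-sample (z-based) confidence interval for a population mean."""
    n = len(values)
    sample_mean = mean(values)
    standard_error = stdev(values) / sqrt(n)  # sample SD divided by sqrt(n)
    # Two-sided critical value: about 1.96 for 95%, about 2.576 for 99%.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return sample_mean - z * standard_error, sample_mean + z * standard_error


# Made-up ages for illustration only (assumption, not the survey data).
ages = [23, 35, 41, 52, 29, 64, 47, 38, 55, 31]
low95, high95 = mean_confidence_interval(ages, 0.95)
low99, high99 = mean_confidence_interval(ages, 0.99)
print(f"95% CI: ({low95:.2f}, {high95:.2f})")
print(f"99% CI: ({low99:.2f}, {high99:.2f})")
```

Both intervals are centered on the same sample mean; the 99% interval is wider because a higher confidence level requires a larger critical value.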

- **Mean Comparison:** Suppose the null hypothesis is that heavy and light internet users have the same opinion on the death penalty (in this problem, “heavy users” use the internet 12 or more times a year and “light users” fewer than 12 times a year). Note: assume unequal variances between the two samples. (total 1.5 points)

(a) What is the sample mean of ‘deathpen’ for heavy users? (0.3 point)

(b) What is the p-value for testing the null hypothesis (mean comparison of ‘deathpen’ between heavy users and light users)? (0.3 point)

(c) Do you reject the null hypothesis or not? Note: use a 5% significance level for your hypothesis test. (0.3 point)

(d) What is your interpretation based on the test result of (c)? (0.3 point)

(e) What is the mean difference between heavy and light users? (0.3 point)

- **ANOVA:** Suppose the null hypothesis is that opinions on the death penalty (“I am in favor of the death penalty”) are the same regardless of employment status. (total 1.2 points)

(a) What is the sample mean of the ‘retired’ group? (0.3 point)

(b) What is the p-value for the ANOVA test? (0.3 point)

(c) What is your statistical decision (e.g., Reject or Fail to reject)? Use a 5% significance level for your hypothesis test. (0.3 point)

(d) What is your interpretation based on the test result of (c)? (0.3 point)
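The test statistics behind the Mean Comparison and ANOVA sections above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the SPSS output: `welch_t` computes the unequal-variance (Welch) t statistic with Welch-Satterthwaite degrees of freedom, and `anova_f` computes the one-way ANOVA F statistic. Both function names and all scores below are hypothetical, made-up values. The p-values, which come from the t and F distributions, are left to the statistical software.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a's mean
    vb = variance(b) / len(b)  # squared standard error of group b's mean
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def anova_f(groups):
    """One-way ANOVA F statistic with between- and within-group df."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Made-up agreement scores for illustration only (assumption).
heavy = [4, 5, 3, 6, 4, 5]
light = [3, 2, 4, 3, 5, 2]
retired = [5, 6, 4, 5]

t_stat, t_df = welch_t(heavy, light)
f_stat, df1, df2 = anova_f([heavy, light, retired])
print(f"Welch t = {t_stat:.3f} with df = {t_df:.2f}")
print(f"ANOVA F = {f_stat:.3f} with df = ({df1}, {df2})")
```

In both tests the decision rule is the same: reject the null hypothesis when the reported p-value is below the 0.05 significance level.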

- CrossTab [Run Crosstabs in Excel]: Use CrossTab analysis to examine whether Liberal (“Generally speaking, would you consider yourself to be…”) is independent of Employment Status of Respondent. (total 1.8 points)

a) What is the observed count for the cell of “Generally agree yourself to be liberal” and “Retired” in the CrossTab table [Do NOT report two frequency numbers separately]? (0.3 point)

b) Test the hypothesis using the chi-square statistic. What is the null hypothesis? (0.3 point)

c) What is the expected value for the cell of “Generally agree yourself to be liberal” and “Retired” in the CrossTab table? (0.3 point)

d) What is the value of the χ² test statistic? (0.3 point)

e) What is your statistical decision (e.g., Reject or Fail to reject)? Use a 5% significance level for your hypothesis test. (0.3 point)

f) What is your interpretation of your statistical decision? (0.3 point)

- Correlation (Use Bivariate correlation): (total 0.6 point)

a) What is the correlation between the question “I am in favor of the death penalty” and “respondent’s age”? (0.3 point)

b) Consider the hypothesis test for the correlation. What is your null hypothesis? (0.3 point)
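The two association analyses above (the CrossTab independence test and the bivariate correlation) rest on short formulas that can be sketched in Python. This is a hedged illustration with made-up numbers, not the survey results. In a contingency table, the expected count of a cell is its row total times its column total divided by the grand total, and the chi-square statistic sums (observed − expected)² / expected over all cells. The function names `chi_square` and `pearson_r` are hypothetical.

```python
from math import sqrt


def chi_square(observed):
    """Expected counts, Pearson chi-square statistic, and degrees of freedom."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, stat, df


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up 2x2 table and paired scores for illustration only (assumption).
table = [[10, 20], [30, 40]]
expected, stat, df = chi_square(table)
r = pearson_r([25, 35, 45, 55, 65], [3, 4, 4, 5, 6])
print("expected counts:", expected)
print(f"chi-square = {stat:.4f} with df = {df}")
print(f"r = {r:.4f}")
```

For the chi-square test the null hypothesis is that the two variables are independent; for the correlation test it is that the population correlation is zero.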

- Regression: Construct and analyze a simple regression model with the dependent variable “I don’t have a clue what the internet is and what it can do for me” and the independent variable “Respondents Age”. (total 1.2 points)

a) What is the R-square value? What is your interpretation of the R-square (one sentence)? (0.3 point)

b) What is the estimated coefficient of the independent variable “Respondents Age”? Is it significant at α = 0.05? (0.3 point)

c) Based on the Coefficients table, write the estimated simple regression equation. (0.3 point)

d) Predict the scale value of “I don’t have a clue what the internet is and what it can do for me” for a respondent aged 70, using your estimated regression model. (0.3 point)
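The regression steps above (fit the line, read off R-square, then predict) can be sketched as follows. This is a minimal ordinary least squares illustration on made-up data, not the LifeStyle.sav results. The function name `fit_simple_regression` and the `ages` and `scores` lists are hypothetical. Testing whether the slope is significant requires its standard error and a t distribution, which the sketch leaves to the statistical software.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x, with R-square."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    r_square = 1 - ss_res / ss_tot  # share of variance in y explained by x
    return intercept, slope, r_square


# Made-up ages and scale responses for illustration only (assumption).
ages = [20, 30, 40, 50, 60]
scores = [1, 2, 2, 3, 4]

intercept, slope, r_square = fit_simple_regression(ages, scores)
predicted_at_70 = intercept + slope * 70
print(f"score = {intercept:.3f} + {slope:.3f} * age")
print(f"R-square = {r_square:.4f}")
print(f"predicted score at age 70 = {predicted_at_70:.2f}")
```

The prediction step is a substitution of the age into the fitted equation. Age 70 lies outside the range of the made-up ages here, so in this sketch the prediction is an extrapolation.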