Week 1. Measurement and Description - chapters 1 and 2
1 Measurement issues. Data, even numerically coded variables, can be at one of 4 levels -
nominal, ordinal, interval, or ratio. It is important to identify which level a variable is, as
this impacts the kind of analysis we can do with the data. For example, descriptive statistics
such as means can only be calculated for interval or ratio level data.
a. Please list, under each label, the variables in our data set that belong in each group.
Nominal Ordinal Interval Ratio
b. For each variable that you did not call ratio, why did you make that decision?
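As a rough illustration of why the level matters, the sketch below picks a different summary statistic for each level. The three variables are hypothetical and invented for this example; they are not from the course data set.

```python
from statistics import mean, median, mode

# Hypothetical variables, invented for illustration (not from the course data set).
region_code = [3, 1, 3, 2, 3]                # nominal: numbers are only labels
satisfaction = [2, 5, 4, 4, 1]               # ordinal: 1-5 ranking, gaps not equal
weight_kg = [61.0, 72.5, 80.0, 55.5, 68.0]   # ratio: true zero, arithmetic is meaningful

typical_region = mode(region_code)            # nominal -> most frequent category
typical_satisfaction = median(satisfaction)   # ordinal -> middle rank
average_weight = mean(weight_kg)              # interval/ratio -> mean is meaningful

print(typical_region, typical_satisfaction, average_weight)
```

Python will happily compute mean(region_code) as well, but the result would have no interpretation, which is the point of identifying the level first.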
2 The first step in analyzing data sets is to find some summary descriptive statistics for key variables.
For salary, compa, age, performance rating, and service, find the mean, standard deviation, and range for 3 groups: overall sample, Females, and Males.
You can use either the Data Analysis Descriptive Statistics tool or the Fx =average and =stdev functions.
(The range must be found using the difference between the Fx =max and =min functions.)
Note: Place data to the right. If you use Descriptive Statistics, place that output to the right as well.
Salary Compa Age Perf. Rat. Service
Overall Mean
Standard Deviation
Range
Female Mean
Standard Deviation
Range
Male Mean
Standard Deviation
Range
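For readers working outside Excel, the same three statistics per group can be sketched as follows. The rows are hypothetical, and the describe helper is a name made up for this example. Like Excel's =stdev, statistics.stdev uses the sample (n - 1) formula.

```python
from statistics import mean, stdev

# Hypothetical rows: (gender, salary). Values are illustrative, not the course data.
rows = [("F", 34.0), ("F", 41.0), ("F", 23.0), ("M", 58.0), ("M", 47.0), ("M", 66.0)]

def describe(values):
    """Return mean, sample standard deviation, and range (max - min)."""
    return {"mean": mean(values), "stdev": stdev(values), "range": max(values) - min(values)}

groups = {
    "Overall": [s for _, s in rows],
    "Female": [s for g, s in rows if g == "F"],
    "Male": [s for g, s in rows if g == "M"],
}
summary = {name: describe(vals) for name, vals in groups.items()}

for name, stats in summary.items():
    print(name, {k: round(v, 2) for k, v in stats.items()})
```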
3 What is the probability for a: Probability
a. Randomly selected person being a male in grade E?
b. Randomly selected male being in grade E?
Note: part b is the same as asking, given a male, what is the probability of being in grade E?
c. Why are the results different?
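The distinction between parts a and b is the distinction between a joint and a conditional probability: the numerator is the same count, but the denominator changes. A minimal sketch with hypothetical records (not the course data):

```python
# Hypothetical (gender, grade) records; not the course data.
people = [
    ("M", "E"), ("M", "E"), ("M", "A"), ("M", "C"),
    ("F", "E"), ("F", "A"), ("F", "B"), ("F", "A"),
]

n_total = len(people)
n_male = sum(1 for g, _ in people if g == "M")
n_male_and_e = sum(1 for g, grade in people if g == "M" and grade == "E")

# a. Joint probability: male AND grade E, out of everyone.
p_male_and_e = n_male_and_e / n_total

# b. Conditional probability: grade E GIVEN male, out of males only.
p_e_given_male = n_male_and_e / n_male

print(p_male_and_e, p_e_given_male)
```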
4 For each group (overall, females, and males) find: Overall Female Male
a. The value that cuts off the top 1/3 salary in each group.
b. The z score for each value:
c. The normal curve probability of exceeding this score:
d. What is the empirical probability of being at or exceeding this salary value?
e. The value that cuts off the top 1/3 compa in each group.
f. The z score for each value:
g. The normal curve probability of exceeding this score:
h. What is the empirical probability of being at or exceeding this compa value?
i. How do you interpret the relationship between the data sets? What do they mean about our equal pay for equal work question?
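Parts a through d (and e through h for compa) follow the same four steps, sketched below on hypothetical salaries. The cutoff here uses the "inclusive" interpolation rule, which is one common percentile definition; the worksheet does not say which rule to use, so treat that choice as an assumption.

```python
from statistics import NormalDist, mean, stdev, quantiles

# Hypothetical salaries (in thousands); not the course data.
salaries = [22.0, 24.0, 27.0, 34.0, 41.0, 47.0, 52.0, 58.0, 66.0]

# a. Value cutting off the top 1/3: the upper tercile (about the 66.7th percentile).
cutoff = quantiles(salaries, n=3, method="inclusive")[1]

# b. z score of the cutoff relative to the group's own mean and sample stdev.
z = (cutoff - mean(salaries)) / stdev(salaries)

# c. Probability of exceeding z under a standard normal curve.
p_normal = 1 - NormalDist().cdf(z)

# d. Empirical probability: share of observed values at or above the cutoff.
p_empirical = sum(s >= cutoff for s in salaries) / len(salaries)

print(round(cutoff, 2), round(z, 2), round(p_normal, 3), round(p_empirical, 3))
```

A gap between the normal curve probability and the empirical probability suggests the data are not well described by a normal distribution.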
5 What conclusions can you make about the issue of male and female pay equality? Are all of the results consistent?
What is the difference between the sal and compa measures of pay?
Conclusions from looking at salary results:
Conclusions from looking at compa results:
Do both salary measures show the same results?
Can we make any conclusions about equal pay for equal work yet?
Week 2 Testing means
In questions 2 and 3, be sure to include the null and alternate hypotheses you will be testing.
In the first 3 questions use alpha = 0.05 in making your decisions on rejecting or not rejecting the null hypothesis.
1 Below are 2 one-sample t-tests comparing male and female average salaries to the overall sample mean.
(Note: a one-sample t-test in Excel can be performed by selecting the 2-sample unequal variance t-test and making the second variable = Ho value -- see column S)
Based on our sample, how do you interpret the results and what do these results suggest about the population means for male and female average salaries?
Males Females
Ho: Mean salary = 45 Ho: Mean salary = 45
Ha: Mean salary =/= 45 Ha: Mean salary =/= 45
Note: While both results below are actually from Excel's t-Test: Two-Sample Assuming Unequal Variances,
having no variance in the Ho variable makes the calculations default to the one-sample t-test outcome - we are tricking Excel into doing a one-sample test for us.
                              Male          Ho    Female        Ho
Mean                          52            45    38            45
Variance                      316           0     334.6666667   0
Observations                  25            25    25            25
Hypothesized Mean Difference  0                   0
df                            24                  24
t Stat                        1.968903827         -1.913206357
P(T<=t) one-tail              0.03030785          0.033862118
t Critical one-tail           1.71088208          1.71088208
P(T<=t) two-tail              0.060615701         0.067724237
t Critical two-tail           2.063898562         2.063898562
Conclusion: Do not reject Ho; mean equals 45 Conclusion: Do not reject Ho; mean equals 45
Is this a 1 or 2 tail test? Is this a 1 or 2 tail test?
- why? - why?
P-value is: P-value is:
Is P-value > 0.05? Is P-value > 0.05?
Why do we not reject Ho? Why do we not reject Ho?
Interpretation:
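The t Stat values in the output above can be checked by hand from the reported mean, variance, and sample size, using t = (mean - Ho value) / sqrt(variance / n). The sketch below does this and compares each result with the two-tail critical value from the same output; the function name is made up for this example.

```python
from math import sqrt

def one_sample_t(sample_mean, sample_var, n, mu0):
    """t statistic for a one-sample test of Ho: population mean = mu0."""
    return (sample_mean - mu0) / sqrt(sample_var / n)

# Summary values copied from the Excel output above.
t_male = one_sample_t(52, 316, 25, 45)
t_female = one_sample_t(38, 334.6666667, 25, 45)
T_CRIT_TWO_TAIL = 2.063898562  # df = 24, alpha = 0.05, from the output above

for label, t in (("Male", t_male), ("Female", t_female)):
    decision = "reject Ho" if abs(t) > T_CRIT_TWO_TAIL else "do not reject Ho"
    print(label, round(t, 4), decision)
```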
2 Based on our sample data set, perform a 2-sample t-test to see if the population male and female average salaries could be equal to each other.
(Since we have not yet covered testing for variance equality, assume the data sets have statistically equal variances.)
Ho:
Ha:
Test to use:
Place B43 in Output range box.
P-value is:
Is P-value < 0.05?
Reject or do not reject Ho:
If the null hypothesis was rejected, what is the effect size value:
Meaning of effect size measure:
Interpretation:
b. Since the one- and two-tail t-test results provided different outcomes, which is the proper/correct approach to comparing salary equality? Why?
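As a cross-check on the Excel tool, the equal-variance (pooled) two-sample t statistic and a Cohen's d effect size can be computed from summary statistics alone. The sketch below reuses the means, variances, and sample sizes reported in question 1. Using Cohen's d as the effect size measure is an assumption, since the worksheet does not name one, and the question itself expects the test to be run on the raw data.

```python
from math import sqrt

# Summary values taken from the question 1 output (Male, Female).
mean_m, var_m, n_m = 52, 316, 25
mean_f, var_f, n_f = 38, 334.6666667, 25

# Pooled variance: weighted average of the two sample variances.
df = n_m + n_f - 2
pooled_var = ((n_m - 1) * var_m + (n_f - 1) * var_f) / df

# t statistic for Ho: the two population means are equal.
std_error = sqrt(pooled_var * (1 / n_m + 1 / n_f))
t_stat = (mean_m - mean_f) / std_error

# Cohen's d: mean difference in pooled standard deviation units (assumed measure).
cohens_d = (mean_m - mean_f) / sqrt(pooled_var)

print(df, round(t_stat, 3), round(cohens_d, 3))
```

The p-value still has to come from the t distribution with the df shown, for example from the Excel output.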
3 Based on our sample data set, can the male and female compas in the population be equal to each other? (Another 2-sample t-test.)
Ho:
Ha:
Statistical test to use:
Place B75 in Output range box.
Week 3
At this point we know the following about male and female salaries.
a. Male and female overall average salaries are not equal in the population.
b. Male and female overall average compas are equal in the population, but males are a bit more spread out.
c. The male and female salary ranges are almost the same, as are their ages and service.
d. Average performance ratings per gender are equal.
Let's look at some other factors that might influence pay - education(degree) and performance ratings.
1 Last week, we found that average performance ratings do not differ between males and females in the population.
Now we need to see if they differ among the grades. Is the average performance rating the same for all grades?
(Assume variances are equal across the grades for this ANOVA.) A B C
Null Hypothesis:
Alt. Hypothesis:
Place B17 in Output range box.
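The F statistic that the single-factor ANOVA tool reports is the ratio of between-group to within-group mean squares. A minimal sketch on hypothetical ratings for three grades (the values are invented, not the course data):

```python
from statistics import mean

# Hypothetical performance ratings for three grades; not the course data.
groups = {
    "A": [80.0, 85.0, 90.0],
    "B": [70.0, 75.0, 80.0],
    "C": [90.0, 95.0, 100.0],
}

all_values = [v for vals in groups.values() for v in vals]
grand_mean = mean(all_values)

# Between-group sum of squares: how far each group mean sits from the grand mean.
ss_between = sum(len(vals) * (mean(vals) - grand_mean) ** 2 for vals in groups.values())

# Within-group sum of squares: spread of values around their own group mean.
ss_within = sum((v - mean(vals)) ** 2 for vals in groups.values() for v in vals)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)
print(df_between, df_within, f_stat)
```

A large F relative to the critical value for (df_between, df_within) leads to rejecting the null hypothesis that all group means are equal.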
Week 5 Correlation and Regression
1 Create a correlation table for the variables in our data set. (Use Analysis ToolPak or StatPlus:mac LE function Correlation.)
a. Reviewing the data levels from week 1, what variables can be used in a Pearson's Correlation table (which is what Excel produces)?
b. Place table here (C8 in Output range box):
c. Using r = approximately .28 as the significant r value (at p = 0.05) for a correlation between 50 values, what variables are
significantly related to Salary?
To compa?
d. Looking at the above correlations - both significant or not - are there any surprises? By that I
mean any relationships you expected to be meaningful that are not, and vice versa.
e. Does this help us answer our equal pay for equal work question?
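The sketch below shows how a single Pearson correlation is computed and where the cutoff of about .28 in part c comes from: r_crit = t_crit / sqrt(t_crit^2 + df) with df = n - 2. The x and y values are hypothetical, and the t critical value of 2.0106 (two-tail, alpha = 0.05, df = 48) is a standard table value supplied here as an assumption, not a figure from the worksheet.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired values; not the course data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
r = pearson_r(x, y)

# Smallest |r| that is significant at alpha = 0.05 for n = 50 pairs.
n = 50
t_crit = 2.0106  # table value for df = 48 (assumption, see lead-in)
r_crit = t_crit / sqrt(t_crit ** 2 + (n - 2))

print(round(r, 4), round(r_crit, 2))
```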
2 Below is a regression analysis for salary being predicted/explained by the other variables in our sample (Midpoint,
age, performance rating, service, gender, and degree). (Note: since salary and compa are different ways of
expressing an employee's salary, we do not want to have both used in the same regression.)
Please interpret the findings.
Ho: The regression equation is not significant.
Ha: The regression equation is significant.
Ho: The regression coefficient for each variable is not significant.
Ha: The regression coefficient for each variable is significant.
Note: Technically we have one for each input variable; listing it this way saves space.
Sal
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.991559075
R Square 0.983189399
Adjusted R Square 0.980843733
Standard Error 2.657592573
Observations 50
ANOVA
df SS MS F Significance F
Regression 6 17762.29967 2960.383279 419.1516111 1.81215E-36
Residual 43 303.7003261 7.062798282
Total 49 18066
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept -1.749621212 3.618367658 -0.483538816 0.63116649 -9.046755043 5.547512618 -9.046755043 5.547512618
Midpoint 1.216701051 0.031902351 38.13828812 8.66416E-35 1.152363828 1.281038273 1.152363828 1.281038273
Age -0.00462801 0.065197212 -0.070984788 0.943738987 -0.136110719 0.126854699 -0.136110719 0.126854699
Performance Rating -0.056596441 0.034495068 -1.640711097 0.108153182 -0.126162375 0.012969494 -0.126162375 0.012969494
Service -0.042500357 0.084336982 -0.503935003 0.616879352 -0.212582091 0.127581377 -0.212582091 0.127581377
Gender 2.420337212 0.860844318 2.81158528 0.007396619 0.684279192 4.156395232 0.684279192 4.156395232
Degree 0.275533414 0.799802305 0.344501901 0.732148119 -1.337421655 1.888488483 -1.337421655 1.888488483
Note: since Gender and Degree are expressed as 0 and 1, they are considered dummy variables and can be used in a multiple regression equation.
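Several figures in the summary output are linked by simple identities, which makes them easy to sanity-check. The sketch below recomputes R Square, Adjusted R Square, Standard Error, F, and the Gender t Stat from the sums of squares, degrees of freedom, and coefficient columns reported above.

```python
from math import sqrt

# Values copied from the ANOVA and coefficient tables above.
ss_regression, ss_residual, ss_total = 17762.29967, 303.7003261, 18066
df_regression, df_residual = 6, 43
n_obs = 50
gender_coef, gender_se = 2.420337212, 0.860844318

r_square = ss_regression / ss_total
adj_r_square = 1 - (1 - r_square) * (n_obs - 1) / df_residual

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

standard_error = sqrt(ms_residual)       # "Standard Error" under Regression Statistics
gender_t = gender_coef / gender_se       # t Stat = coefficient / its standard error

print(round(r_square, 6), round(adj_r_square, 6), round(f_stat, 4),
      round(standard_error, 6), round(gender_t, 6))
```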
Interpretation:
For the Regression as a whole:
What is the value of the F statistic:
What is the p-value associated with this value:
Is the p-value <0.05?
Do you reject or not reject the null hypothesis:
What does this decision mean for our equal pay question:
For each of the coefficients: Intercept Midpoint Age Perf. Rat. Service Gender Degree
What is the coefficient's p-value for each of the variables:
Is the p-value < 0.05?
Do you reject or not reject each null hypothesis:
What are the coefficients for the significant variables?
Using only the significant variables, what is the equation? Salary =
Is gender a significant factor in salary:
If so, who gets paid more with all other things being equal?
How do we know?
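The coefficient questions above amount to comparing each p-value with 0.05 and building an equation from the terms that pass. The sketch below does this with the values from the output. Two assumptions are worth flagging: it drops every non-significant term, including the intercept (many analysts would keep the intercept regardless), and the example employee is hypothetical. This excerpt does not state which gender is coded 1, so the sign of the Gender coefficient has to be read against the data set's coding.

```python
# Coefficients and p-values copied from the regression output above.
coefficients = {
    "Intercept": (-1.749621212, 0.63116649),
    "Midpoint": (1.216701051, 8.66416e-35),
    "Age": (-0.00462801, 0.943738987),
    "Performance Rating": (-0.056596441, 0.108153182),
    "Service": (-0.042500357, 0.616879352),
    "Gender": (2.420337212, 0.007396619),
    "Degree": (0.275533414, 0.732148119),
}
ALPHA = 0.05

# Keep only the terms whose p-value is below alpha.
significant = {name: coef for name, (coef, p) in coefficients.items() if p < ALPHA}

def predict_salary(inputs):
    """Sum coefficient * value over the significant terms only."""
    return sum(coef * inputs[name] for name, coef in significant.items())

# Hypothetical employee: midpoint of 48 and a Gender code of 1.
example = predict_salary({"Midpoint": 48, "Gender": 1})
print(list(significant), round(example, 2))
```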
3 Perform a regression analysis using compa as the dependent variable and the same independent
variables as used in question 2. Show the result, and interpret your findings by answering the same questions.
Note: be sure to include the appropriate hypothesis statements.
Regression hypotheses
Ho:
Ha:
Coefficient hypotheses (one to stand for all the separate variables)
Ho:
Ha:
Put C94 in output range box
The question to address is: "What have you learned about statistics?" In developing your responses, consider - at a minimum - and discuss the application of each of the course elements in analyzing and making decisions about data (counts and/or measurements).
The course elements include:
• Descriptive statistics
• Inferential statistics
• Hypothesis development and testing
• Selection of appropriate statistical tests
• Evaluating statistical results
Writing the Final Paper
The Final Paper:
1. Must be three to five double-spaced pages in length, and formatted according to APA style as outlined
3. Must begin with an introductory paragraph that has a succinct thesis statement.
4. Must address the topic of the paper with critical thought.
5. Must end with a conclusion that reaffirms your thesis.
6. Must use at least three scholarly sources, in addition to the text.
7. Must document all sources in APA style.