Demonstrating Quantitative Analysis using SPSS
Demonstrate some Quantitative Analysis
In this case, I will analyze the dataset using binary logistic regression to evaluate the impact of the independent variables (age, highest education level, weight, height, general health, alcohol drinks per day, caffeine drinks per day, physical fitness, current weight, hours of sleep on weekends, and hours of sleep needed) on the dependent variable (problem with sleep). Binary logistic regression is appropriate because the outcome is binary/dichotomous, with only two possible values, YES or NO. The model tries to predict which of the two categories, coded 0 and 1, a respondent falls into: having a problem with sleep or not having a problem with sleep.
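As a rough illustration of what the model does, the sketch below turns a linear combination of predictors into a probability with the logistic function and classifies the case at a 0.5 cut-off. The intercept, coefficients, and predictor values are hypothetical placeholders, not estimates from this dataset.

```python
import math

def logistic(z):
    """Map a log-odds value z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(intercept, coefs, values):
    """Probability of the category coded 1 for one respondent."""
    log_odds = intercept + sum(b * x for b, x in zip(coefs, values))
    return logistic(log_odds)

# Hypothetical intercept, coefficients, and predictor values (illustration only).
p = predict_probability(-1.0, [0.3, -0.5], [4.0, 1.0])

# Classify into category 1 when the predicted probability reaches 0.5.
predicted_class = 1 if p >= 0.5 else 0
print(round(p, 3), predicted_class)
```

A log-odds value of 0 corresponds to a probability of exactly 0.5, which is why 0.5 is the usual cut-off between the two categories.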
Some assumptions need to be considered in logistic regression analysis.
· Logistic regression does not require a linear relationship between the dependent and independent variables (although continuous predictors are assumed to be linearly related to the log odds of the outcome).
· Homoscedasticity of variance and interval-level measurement of the independent variables are not required.
· The residuals (error terms) need not be normally distributed.
· Binary logistic regression requires little or no multicollinearity among the independent variables.
· Logistic regression requires larger samples.
To run the analysis in SPSS, go to Analyze – Regression – Binary Logistic.
A dialogue box appears.
Dependent –
Covariates – All
the independent variables such as
Select the Categorical button. Move all categorical variables into the Categorical Covariates box on the right; the continuous independent variables remain in the Covariates box. Select Continue.
Select the Options button, tick Hosmer-Lemeshow goodness-of-fit and CI for exp(B) at 95%, then select Continue.
Select OK.
The result gives:
The output summarizes the cases in the analysis. There are 271 respondents in total: 191 cases included in the analysis and 80 missing cases.
The coding for the selected variable is shown in the table above.
Respondents who chose YES (has a problem with sleep) are coded 0, and those who chose NO (does not have a problem with sleep) are coded 1.
In this case, the target category is 1; in other words, we want to see which factors make people less likely to have problems with sleep.
Moving to Block 0: Beginning Block
Block 0 is the output of the analysis when none of the independent variables are considered. It will be used as a reference point for comparing the model once the independent variables have been included in the analysis.
Note that the block above is only a baseline and is not usable on its own without the independent variables.
Block 1: Method = Enter
Block 1 above contains the goodness-of-fit statistics, which indicate whether the proposed model describes the data at hand adequately.
The omnibus tests are used to test the model fit. A significant test indicates a significant improvement in fit compared with the initial/null model. Therefore, the omnibus tests here show a good fit.
Another test of model fit is the Hosmer and Lemeshow Test. In this test, a significant value (p < 0.05) indicates a poor fit, while a non-significant value (p > 0.05) indicates a good fit.
The Hosmer and Lemeshow Test above shows that the model adequately fits the data; there is no significant difference between the observed and the predicted values.
Apart from the significance test, the contingency table for the Hosmer and Lemeshow Test shows that the observed and expected values are almost equal for both choices. Hence, the model adequately fits the data.
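The comparison of observed and expected counts described above can be sketched as follows. This is a minimal illustration of how a Hosmer and Lemeshow statistic is built from grouped counts, not a reproduction of the SPSS output: the ten groups below are hypothetical, and the p-value uses a closed-form chi-square tail that only holds for an even number of degrees of freedom (here groups minus 2 = 8).

```python
import math

def hosmer_lemeshow(groups):
    """groups: list of (observed_events, expected_events, group_size)."""
    stat = 0.0
    for observed, expected, n in groups:
        # Contribution of the event cell of this group.
        stat += (observed - expected) ** 2 / expected
        # Contribution of the non-event cell of this group.
        stat += ((n - observed) - (n - expected)) ** 2 / (n - expected)
    return stat

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

# Hypothetical grouped counts (illustration only, not taken from the SPSS table).
hypothetical_groups = [
    (2, 2.6, 19), (4, 4.1, 19), (6, 5.8, 19), (7, 7.5, 19), (9, 9.2, 19),
    (11, 10.6, 19), (12, 12.3, 19), (14, 13.9, 19), (15, 15.4, 19), (17, 16.8, 20),
]
hl_stat = hosmer_lemeshow(hypothetical_groups)
hl_p = chi2_sf_even_df(hl_stat, df=len(hypothetical_groups) - 2)
print(round(hl_stat, 3), hl_p > 0.05)
```

When observed and expected counts are close in every group, the statistic is small and the p-value is well above 0.05, which is the "good fit" pattern described above.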
Model Summary
Step | -2 Log likelihood | Cox & Snell R Square | Nagelkerke R Square
1 | 221.342a | .190 | .254
a. Estimation terminated at iteration number 5 because parameter estimates changed by less than .001.
This is the model summary showing the R2 values. Nagelkerke’s R Square will be used because it is an adjusted version of Cox & Snell R2 that is commonly used in logistic regression; it is rescaled to range from 0 to 1, where a higher value indicates a better fit.
Note that Nagelkerke’s R Square and Cox & Snell R Square are both pseudo R2 measures. Pseudo R-Square is a term used to describe a measure of goodness of fit. While it does not measure explained variation precisely, it gives an approximation of the variation explained by the model.
The table above shows that approximately 25.4% of the variation in the dependent variable can be explained by the predictor variables in the model.
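To show how the two pseudo R2 values relate to the -2 Log likelihood, here is a small sketch of the standard Cox & Snell and Nagelkerke formulas. The -2 Log likelihood of the null model is not reported in the text, so it is computed under an assumption: that the 191 analyzed cases split into 83 coded 0 and 108 coded 1, a split inferred from the classification percentages discussed in this post rather than read from the output.

```python
import math

def neg2_log_likelihood_null(n_zero, n_one):
    """-2 log likelihood of the intercept-only model from the two group counts."""
    n = n_zero + n_one
    return -2.0 * (n_zero * math.log(n_zero / n) + n_one * math.log(n_one / n))

def cox_snell_r2(neg2ll_null, neg2ll_model, n):
    """Cox & Snell R2 from the null and fitted -2 log likelihoods."""
    return 1.0 - math.exp(-(neg2ll_null - neg2ll_model) / n)

def nagelkerke_r2(neg2ll_null, neg2ll_model, n):
    """Cox & Snell R2 divided by its maximum attainable value."""
    max_cox_snell = 1.0 - math.exp(-neg2ll_null / n)
    return cox_snell_r2(neg2ll_null, neg2ll_model, n) / max_cox_snell

# ASSUMPTION: 83 cases coded 0 and 108 coded 1 (inferred, not reported).
null_dev = neg2_log_likelihood_null(83, 108)

# 221.342 is the -2 Log likelihood from the Model Summary table; n = 191 cases.
cs = cox_snell_r2(null_dev, 221.342, 191)
nk = nagelkerke_r2(null_dev, 221.342, 191)
print(round(cs, 3), round(nk, 3))
```

Under that assumed split, the two values come out close to the .190 and .254 in the Model Summary, which illustrates why Nagelkerke is always the larger of the two: it divides Cox & Snell by a maximum that is below 1.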
The classification table above demonstrates how the added predictors in the study contribute to predicting the correct category in the model. This will be used for comparison with the classification table presented in Block 0 earlier to assess the improvement in the model after adding the predictor variables.
· The table is arranged with rows representing the observed categories of "problem with sleep": YES or NO.
· The columns represent the predicted categories based on the model's predictions.
· Each cell represents the count of observations falling into each combination of observed and predicted categories.
· For instance, in the cell where PROBLEM WITH SLEEP is observed as YES and predicted as YES there are 42 cases.
· The PERCENTAGE CORRECT values give the accuracy of the model's predictions.
· The rows present information on the specificity and sensitivity of the model.
· For cases where the observed PROBLEM WITH SLEEP is YES, the model correctly predicted 50.6% of them as YES.
· The specificity of this model is therefore 50.6%: the percentage of cases outside the target group (those who have a problem with their sleep, Y=0) that the model correctly predicted. This is also called the “True Negative Rate”.
· For cases where the observed PROBLEM WITH SLEEP is NO, the model correctly predicted 81.5% of them as NO.
· The sensitivity of this model is therefore 81.5%: the percentage of cases in the target group (those who do not have a problem with their sleep, Y=1) that the model correctly predicted. This is also called the “True Positive Rate”.
· The table shows the overall percentage correct as 68.1%. This indicates that, based on the model's predictions, 68.1% of cases were correctly classified.
· Hence, I got good sensitivity in my model, the classification is appropriate, and the overall accuracy rate of 68.1% was good.
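The percentages in the classification table can be reproduced from the four cell counts, as sketched below. Only the count of 42 (observed YES, predicted YES), the total of 191 cases, and the three percentages come from the text; the other three cell counts (41, 20, 88) are inferred so that the reported percentages are reproduced, and should be treated as assumptions.

```python
def classification_summary(true_neg, false_pos, false_neg, true_pos):
    """Return specificity, sensitivity, and overall accuracy as percentages."""
    specificity = 100.0 * true_neg / (true_neg + false_pos)
    sensitivity = 100.0 * true_pos / (true_pos + false_neg)
    total = true_neg + false_pos + false_neg + true_pos
    accuracy = 100.0 * (true_neg + true_pos) / total
    return specificity, sensitivity, accuracy

# Target category is NO (coded 1), so observed YES (coded 0) is the "negative" group.
# 42 is reported in the text; 41, 20, and 88 are INFERRED from the percentages.
spec, sens, acc = classification_summary(true_neg=42, false_pos=41,
                                         false_neg=20, true_pos=88)
print(round(spec, 1), round(sens, 1), round(acc, 1))
```

Which rate is called sensitivity depends on which category is treated as the target; here it follows the coding in which NO (no problem with sleep) is 1.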
Finally, which of the independent variables has a significant impact on the problem with sleep?
· The Variables in the Equation table depicts the relationships between the dependent variable and the independent variables.
· The odds are the ratio of the probability that an outcome occurs to the probability that it does not occur.
· Exp(B) is the exponentiation of the beta coefficient (B). The beta coefficient in logistic regression reflects the change in the log odds of the dependent variable for a one-unit change in the independent variable. Exp(B) expresses the effect of the independent variable as an odds ratio.
· Beta represents the predicted change in log odds; in other words, for every one-unit change in the independent variable, the odds of the dependent variable are multiplied by Exp(B).
· The beta coefficients in this table are either positive or negative, which indicates the direction of the effect, and each is reported with its Wald statistic and significance level.
· When the Odds Ratio is 1, a one-unit increase in the independent variable leaves the odds of not having a problem with sleep unchanged.
· When the Odds Ratio is greater than 1, a one-unit increase in the independent variable increases the odds of not having a problem with sleep.
· When the Odds Ratio is less than 1, a one-unit increase in the independent variable decreases the odds of not having a problem with sleep.
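The link between B, Exp(B), and the percentage change in odds can be checked with a few lines of arithmetic. The sketch below uses four of the B values reported for this analysis; the function names are illustrative only.

```python
import math

# B coefficients as reported in this analysis for a few of the predictors.
reported_b = {
    "height": 0.029,
    "caffeine drinks per day": 0.289,
    "age": -0.018,
    "hours of sleep needed": -0.507,
}

def odds_ratio(b):
    """Exp(B): the factor by which the odds are multiplied per one-unit increase."""
    return math.exp(b)

def percent_change_in_odds(b):
    """Percentage change in the odds per one-unit increase in the predictor."""
    return (math.exp(b) - 1.0) * 100.0

for name, b in reported_b.items():
    print(f"{name}: Exp(B) = {odds_ratio(b):.3f}, "
          f"change in odds = {percent_change_in_odds(b):+.1f}%")
```

Note that B itself is a change in log odds, not a percentage; the percentage only follows after exponentiating, and the two diverge as B moves away from zero.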
In this data
analysis:
Ø Height
· Beta (.029): This suggests that a one-unit increase in height is associated with an increase of 0.029 in the log odds of not having a problem with sleep.
· P-value (.209): The significance level is greater than 0.05 (p > .05), which suggests that the effect of height is not statistically significant.
· Exp(B) = 1.029: If height increases by 1 unit, the odds of not having a problem with sleep are expected to increase by approximately 2.9%.
· Confidence Interval: The 95% CI for the odds ratio is from 0.984 to 1.077.
Ø General health
· Beta (.254): This suggests that a one-unit increase in general health is associated with an increase of 0.254 in the log odds of not having a problem with sleep.
· P-value (.068): The significance level is greater than 0.05 (p > .05), which suggests that the effect of general health is not statistically significant.
· Exp(B) = 1.290: If general health increases by 1 unit, the odds of not having a problem with sleep are expected to increase by approximately 29%.
· Confidence Interval: The 95% CI for the odds ratio is from 0.981 to 1.695.
Ø Caffeine drinks per day
· Beta (.289): This suggests that a one-unit increase in caffeine drinks per day is associated with an increase of 0.289 in the log odds of not having a problem with sleep.
· P-value (.004): The significance level is less than 0.05 (p < .05), which suggests that the effect of caffeine drinks per day is statistically significant.
· Exp(B) = 1.335: If caffeine drinks per day increase by 1 unit, the odds of not having a problem with sleep are expected to increase by approximately 33.5%.
· Confidence Interval: The 95% CI for the odds ratio is from 1.098 to 1.624.
Ø Physical fitness
· Beta (.367): This suggests that a one-unit increase in physical fitness is associated with an increase of 0.367 in the log odds of not having a problem with sleep.
· P-value (.004): The significance level is less than 0.05 (p < .05), which suggests that the effect of physical fitness is statistically significant.
· Exp(B) = 1.444: If physical fitness increases by 1 unit, the odds of not having a problem with sleep are expected to increase by approximately 44.4%.
· Confidence Interval: The 95% CI for the odds ratio is from 1.127 to 1.850.
Ø Current weight
· Beta (.317): This suggests that a one-unit increase in current weight is associated with an increase of 0.317 in the log odds of not having a problem with sleep.
· P-value (.083): The significance level is greater than 0.05 (p > .05), which suggests that the effect of current weight is not statistically significant.
· Exp(B) = 1.374: If current weight increases by 1 unit, the odds of not having a problem with sleep are expected to increase by approximately 37.4%.
· Confidence Interval: The 95% CI for the odds ratio is from .959 to 1.967.
Ø Hours of sleep/weekends
· Beta (.317): This suggests that a one-unit increase in hours of sleep on weekends is associated with an increase of 0.317 in the log odds of not having a problem with sleep.
· P-value (.083): The significance level is greater than 0.05 (p > .05), which suggests that the effect of hours of sleep on weekends is not statistically significant.
· Exp(B) = 1.970: This is the value as reported. It does not match exp(.317), which is approximately 1.373, and it lies outside the confidence interval below. The beta, p-value, and interval listed here also repeat those given for current weight, so the figures for this variable should be checked against the SPSS output.
· Confidence Interval: The 95% CI for the odds ratio is from .959 to 1.967.
Ø Age
· Beta (-.018): This suggests that with every one-unit increase in age, the log odds of not having a problem with sleep are expected to decrease by 0.018. Since the beta is negative, this implies that as age increases, the probability of not having a problem with sleep decreases.
· P-value (.286): The significance level is greater than 0.05 (p > .05). This suggests there is not enough evidence to reject the null hypothesis. Hence, the relationship between age and problems with sleep is not statistically significant.
· Exp(B) = .982: For every one-unit increase in age, the odds of not having a problem with sleep decrease by approximately 1.8%.
· Confidence Interval: The 95% CI for the odds ratio is from .951 to 1.014. This provides a range of values for the true effect. Since the interval includes 1, the decrease in the odds of not having a problem with sleep for each one-unit increase in age is not statistically significant.
Ø Highest education level achieved
· Beta (-.109): This suggests that with every one-unit increase in the highest education level achieved, the log odds of not having a problem with sleep are expected to decrease by 0.109. Since the beta is negative, this implies that as the highest education level achieved increases, the probability of not having a problem with sleep decreases.
· P-value (.544): The significance level is greater than 0.05 (p > .05). This suggests there is not enough evidence to reject the null hypothesis. Hence, the relationship between the highest education level achieved and problems with sleep is not statistically significant.
· Exp(B) = .897: For every one-unit increase in the highest education level achieved, the odds of not having a problem with sleep decrease by approximately 10.3%.
· Confidence Interval: The 95% CI for the odds ratio is from .631 to 1.274. This provides a range of values for the true effect. Since the interval includes 1, the decrease in the odds of not having a problem with sleep for each one-unit increase in the highest education level achieved is not statistically significant.
Ø Weight
· Beta (-.011): This suggests that with every one-unit increase in weight, the log odds of not having a problem with sleep are expected to decrease by 0.011. Since the beta is negative, this implies that as weight increases, the probability of not having a problem with sleep decreases.
· P-value (.521): The significance level is greater than 0.05 (p > .05). This suggests there is not enough evidence to reject the null hypothesis. Hence, the relationship between weight and problems with sleep is not statistically significant.
· Exp(B) = .989: For every one-unit increase in weight, the odds of not having a problem with sleep decrease by approximately 1.1%.
· Confidence Interval: The 95% CI for the odds ratio is from .956 to 1.023. This provides a range of values for the true effect. Since the interval includes 1, the decrease in the odds of not having a problem with sleep for each one-unit increase in weight is not statistically significant.
Ø Alcohol drinks/day
· Beta (-.199): This suggests that with every one-unit increase in alcohol drinks/day, the log odds of not having a problem with sleep are expected to decrease by 0.199. Since the beta is negative, this implies that as alcohol drinks/day increase, the probability of not having a problem with sleep decreases.
· P-value (.165): The significance level is greater than 0.05 (p > .05). This suggests there is not enough evidence to reject the null hypothesis. Hence, the relationship between alcohol drinks/day and problems with sleep is not statistically significant.
· Exp(B) = .820: For every one-unit increase in alcohol drinks/day, the odds of not having a problem with sleep decrease by approximately 18%.
· Confidence Interval: The 95% CI for the odds ratio is from .619 to 1.085. This provides a range of values for the true effect. Since the interval includes 1, the decrease in the odds of not having a problem with sleep for each one-unit increase in alcohol drinks/day is not statistically significant.
Ø Hours of sleep needed
· Beta (-.507): This suggests that with every one-unit increase in hours of sleep needed, the log odds of not having a problem with sleep are expected to decrease by 0.507. Since the beta is negative, this implies that as the hours of sleep needed increase, the probability of not having a problem with sleep decreases.
· P-value (.008): The significance level is less than 0.05 (p < .05). This suggests that we can reject the null hypothesis. Hence, the relationship between hours of sleep needed and problems with sleep is statistically significant.
· Exp(B) = .602: For every one-unit increase in the hours of sleep needed, the odds of not having a problem with sleep decrease by approximately 39.8%.
· Confidence Interval: The 95% CI for the odds ratio is from .413 to 0.878. This provides a range of values for the true effect. Since the interval does not include 1, the decrease in the odds of not having a problem with sleep for each one-unit increase in hours of sleep needed is statistically significant.
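The agreement between the p-values and the confidence intervals above can be checked approximately. The sketch below recovers a standard error from the reported 95% CI for Exp(B), then computes a Wald z and a two-sided p-value. It assumes the intervals are the usual Wald intervals, exp(B ± 1.96 × S.E.); because the reported B and CI values are rounded, the recovered p-values are only approximate.

```python
import math

def wald_from_ci(b, ci_lower, ci_upper, z_crit=1.96):
    """Recover S.E., Wald z, and two-sided p from B and a 95% CI for Exp(B)."""
    # ASSUMPTION: the CI was built as exp(B +/- z_crit * S.E.).
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z_crit)
    z = b / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return se, z, p

def ci_excludes_one(ci_lower, ci_upper):
    """A 95% CI for the odds ratio that excludes 1 corresponds to p < .05."""
    return ci_lower > 1.0 or ci_upper < 1.0

# B and CI values as reported in the text.
_, _, p_caffeine = wald_from_ci(0.289, 1.098, 1.624)
_, _, p_sleep_needed = wald_from_ci(-0.507, 0.413, 0.878)
_, _, p_age = wald_from_ci(-0.018, 0.951, 1.014)

print(round(p_caffeine, 3), ci_excludes_one(1.098, 1.624))
print(round(p_sleep_needed, 3), ci_excludes_one(0.413, 0.878))
print(round(p_age, 3), ci_excludes_one(0.951, 1.014))
```

A predictor whose 95% interval excludes 1 will always have p below .05, which is why the two criteria give the same verdict for every variable in this analysis.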