Demonstrating Quantitative Analysis using SPSS

 Demonstrate some Quantitative Analysis

In this case, I will analyze the dataset using binary logistic regression to evaluate the impact of the independent variables (age, highest education level, weight, height, general health, alcohol drinks per day, caffeine drinks per day, physical fitness, current weight, hours of sleep on weekends, and hours of sleep needed) on the dependent variable (problem with sleep). I chose this method because the outcome is binary/dichotomous, and binary logistic regression assumes only two possible outcomes, i.e. YES or NO. Binary logistic regression tries to predict whether someone has a problem with sleep (coded 0 in this analysis) or does not have a problem with sleep (coded 1).
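To make the prediction step concrete, here is a minimal sketch of how a fitted logistic regression equation turns predictor values into a probability and then into a predicted category. The intercept, coefficients, and predictor values are hypothetical and are not taken from the SPSS output; the .5 cut-off is the SPSS default classification cut value.

```python
import math


def predicted_probability(intercept, coefficients, values):
    """Return P(outcome = 1) from a logistic regression equation."""
    # Linear predictor: the log odds of the category coded 1.
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, values))
    # The logistic function maps log odds onto a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical intercept, coefficients, and predictor values (illustration only).
p = predicted_probability(-1.0, [0.5, -0.2], [3.0, 1.0])

# A case is assigned to category 1 when its probability reaches the cut value.
predicted_category = 1 if p >= 0.5 else 0
print(round(p, 3), predicted_category)
```

The model never predicts YES or NO directly; it predicts a probability, and the category comes from comparing that probability with the cut value.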

Some assumptions need to be met in logistic regression analysis.

·      In logistic regression a linear relationship between dependent and independent variables is not required.

·      Homoscedasticity, linearity, and interval-level measurement of the independent variables are not required.

·      The residuals (error term) need not be normally distributed.

·      Binary logistic regression requires little or no multicollinearity among the independent variables.

·      Larger samples are required for logistic regression.

To run the analysis in SPSS, go to Analyze – Regression – Binary Logistic.

A dialogue box comes up.

Dependent –

Covariates – All the independent variables such as

Select the Categorical button. All categorical variables go into the Categorical Covariates box on the right, while the continuous independent variables remain in the Covariates box. Select Continue.

Select the Options button and tick Hosmer-Lemeshow goodness-of-fit and CI for exp(B) at 95%, then select Continue.

Select OK.

The result gives:


The output summarizes the cases in the analysis. There are 271 respondents in total: 191 cases included in the analysis and 80 missing cases.

The coding for the selected variable is shown in the table above.

Respondents who chose YES (has a problem with sleep) are coded 0, and those who chose NO (does not have a problem with sleep) are coded 1.

In this case, the target category is 1; in other words, we want to see which factors are associated with people not having problems with sleep.

Moving to Block 0: Beginning Block


Block 0 is the output of the analysis when none of the independent variables are considered. It will be used as a reference point against which to compare the model once the independent variables have been included in the analysis.

Note that the block above is incomplete and not usable without the independent variables.

 

Block 1: Method = Enter 


     
 

Block 1 above presents the goodness-of-fit statistics, which indicate whether the model describes the data at hand adequately. In other words, does the proposed model describe the data well or not?

The omnibus tests are used to test the model fit. A significant test indicates a significant improvement in fit when compared to the initial/null model. Therefore, the omnibus tests here show a good fit.

 

This is another test of model fit, the Hosmer and Lemeshow Test. In this test, a significant value (p < 0.05) indicates a poor fit, while a non-significant value (p > 0.05) indicates a good fit.

The Hosmer and Lemeshow Test above shows that the model adequately fits the data. Therefore, there is no significant difference between the observed and the predicted values.

 

Apart from the significance test, here we can see a contingency table for the Hosmer and Lemeshow Test, indicating that the observed and expected values are almost equal for both choices. Hence, the model adequately fits the data.
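As a rough illustration of what the contingency table feeds into, the sketch below computes a Hosmer-Lemeshow style chi-square from observed and expected counts per group. The counts are hypothetical and are not the values from the SPSS output; SPSS typically forms about ten groups and refers the statistic to a chi-square distribution with (number of groups - 2) degrees of freedom.

```python
def hosmer_lemeshow_statistic(groups):
    """Sum (observed - expected)^2 / expected over both outcomes in every group.

    groups: list of (observed_1, expected_1, observed_0, expected_0) tuples.
    """
    return sum(
        (o1 - e1) ** 2 / e1 + (o0 - e0) ** 2 / e0
        for o1, e1, o0, e0 in groups
    )


# Hypothetical counts for three risk groups (illustration only).
example_groups = [
    (5, 4.2, 15, 15.8),
    (9, 9.6, 11, 10.4),
    (14, 14.1, 6, 5.9),
]

hl = hosmer_lemeshow_statistic(example_groups)
print(round(hl, 3))
```

The closer the observed counts are to the expected counts, the smaller the statistic and the larger the p-value, which is why a non-significant result is read as a good fit.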

 

Model Summary

Step    -2 Log likelihood    Cox & Snell R Square    Nagelkerke R Square

1       221.342a             .190                    .254

a. Estimation terminated at iteration number 5 because parameter estimates changed by less than .001.

 

This is the model summary showing the R Square values. Nagelkerke's R Square will be used because it is an adjusted version of the Cox & Snell R Square that can range from 0 to 1, where a higher value indicates a better fit.

Note that Nagelkerke's R Square and Cox & Snell R Square are both pseudo R Square measures. Pseudo R Square is a term for a measure of goodness of fit that does not give the exact proportion of variance explained, as R Square does in linear regression, but provides an approximation of it.

The table above shows that approximately 25.4% of the variation in the dependent variable can be explained by the predictor variables in the model.
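The relationship between the two pseudo R Square measures can be sketched with their standard formulas. The -2 Log likelihood of the null model is not reported in this post, so the sketch back-calculates it from the reported Cox & Snell value (.190), the model -2 Log likelihood (221.342), and the 191 analysed cases; that back-calculated figure is an assumption, not SPSS output.

```python
import math


def cox_snell_r2(neg2ll_null, neg2ll_model, n):
    """Cox & Snell R Square from the null and fitted -2 log likelihoods."""
    return 1.0 - math.exp(-(neg2ll_null - neg2ll_model) / n)


def nagelkerke_r2(neg2ll_null, neg2ll_model, n):
    """Cox & Snell R Square divided by its maximum attainable value."""
    max_r2 = 1.0 - math.exp(-neg2ll_null / n)
    return cox_snell_r2(neg2ll_null, neg2ll_model, n) / max_r2


n = 191                     # cases included in the analysis
neg2ll_model = 221.342      # reported in the Model Summary
reported_cox_snell = 0.190  # reported in the Model Summary

# Assumption: null-model -2LL recovered by inverting the Cox & Snell formula.
neg2ll_null = neg2ll_model - n * math.log(1.0 - reported_cox_snell)

print(round(cox_snell_r2(neg2ll_null, neg2ll_model, n), 3))
print(round(nagelkerke_r2(neg2ll_null, neg2ll_model, n), 3))
```

The Nagelkerke value computed this way lands close to the reported .254; the small difference comes from the Cox & Snell value being rounded to three decimals.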


The classification table above demonstrates how the added predictors in the study contribute to predicting the correct category in the model. This will be used for comparison with the classification table presented in Block 0 earlier to assess the improvement in the model after adding the predictor variables.

·       The rows of the table represent the observed categories: "problem with sleep" YES or NO.

·       The columns represent the predicted categories based on the model's predictions.

·       Each cell represents the count of observations falling into each combination of observed and predicted categories.

·       For instance, in the cell where PROBLEM WITH SLEEP is observed as YES and predicted as YES, there are 42 cases.

·       The PERCENTAGE CORRECT values provide the accuracy of the model's predictions.

·       The rows present information on the specificity and sensitivity of the model.

·       For cases where the observed PROBLEM WITH SLEEP is YES, the model correctly predicted 50.6% of them as YES.

·       The specificity of this model is therefore 50.6%: these are the cases outside the target group (i.e. those who have a problem with their sleep, Y=0) that were correctly predicted by the model. This is also called the “True Negative Rate”.

·       For cases where the observed PROBLEM WITH SLEEP is NO, the model correctly predicted 81.5% of them as NO.

·       The sensitivity of this model is therefore 81.5%: these are the cases in the target group (i.e. those who do not have a problem with their sleep, Y=1) that were correctly predicted by the model. This is also called the “True Positive Rate”.

·       The table shows the overall percentage correct as 68.1%. This indicates that, based on the model's predictions, 68.1% of cases were correctly classified.

·       Hence, the model has good sensitivity, the classification is appropriate, and the overall accuracy rate of 68.1% is good.
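The percentages in the classification table can be reproduced from the four cell counts. Only the 42 cases (observed YES, predicted YES) are stated in this post; the other three counts in the sketch are back-calculated from the reported percentages and the 191 analysed cases, so they are assumptions rather than figures read from the output.

```python
def classification_summary(true_neg, false_pos, false_neg, true_pos):
    """Return (specificity, sensitivity, overall accuracy) as percentages."""
    specificity = 100.0 * true_neg / (true_neg + false_pos)
    sensitivity = 100.0 * true_pos / (true_pos + false_neg)
    total = true_neg + false_pos + false_neg + true_pos
    overall = 100.0 * (true_neg + true_pos) / total
    return specificity, sensitivity, overall


# Target category is NO (coded 1), so "positive" means no problem with sleep.
observed_yes_predicted_yes = 42  # stated in the post (true negatives)
observed_yes_predicted_no = 41   # assumption: back-calculated from 50.6%
observed_no_predicted_yes = 20   # assumption: back-calculated from 81.5%
observed_no_predicted_no = 88    # assumption: back-calculated from 81.5%

specificity, sensitivity, overall = classification_summary(
    observed_yes_predicted_yes,
    observed_yes_predicted_no,
    observed_no_predicted_yes,
    observed_no_predicted_no,
)
print(round(specificity, 1), round(sensitivity, 1), round(overall, 1))
```

With these counts the three values round to the 50.6%, 81.5%, and 68.1% reported in the table.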


Finally, which of the independent variables have a significant impact on the problem with sleep?

·       The Variables in the Equation table shows the relationship between the dependent variable and each independent variable.

·       The odds are the ratio of the probability that an outcome occurs to the probability that it does not occur.

·       Exp(B) represents the exponentiation of the beta coefficient (B). The beta coefficient in logistic regression reflects the change in the log odds of the dependent variable for a one-unit change in the independent variable. Exp(B) expresses that effect as an odds ratio.

·       Beta represents the predicted change in log odds; in other words, for every one-unit change in the independent variable, the odds of the dependent variable are multiplied by Exp(B).

·       The beta coefficients in this table are either positive or negative, and each is reported with its Wald statistic and significance level.

·       When the odds ratio is 1, a one-unit change in the predictor leaves the odds of not having a problem with sleep unchanged.

·       When the odds ratio is greater than 1, a one-unit increase in the predictor raises the odds of not having a problem with sleep.

·       When the odds ratio is less than 1, a one-unit increase in the predictor lowers the odds of not having a problem with sleep.
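The link between B and Exp(B) described above can be checked with a few lines of Python. This is a small sketch using three of the beta coefficients reported further down in this post; the function names are my own and are not SPSS terms.

```python
import math


def odds_ratio(beta):
    """Exp(B): the factor by which the odds change per one-unit increase."""
    return math.exp(beta)


def percent_change_in_odds(beta):
    """Percentage change in the odds per one-unit increase in the predictor."""
    return (math.exp(beta) - 1.0) * 100.0


# Beta coefficients as reported in the Variables in the Equation table.
reported_betas = {
    "general health": 0.254,
    "caffeine drinks per day": 0.289,
    "hours of sleep needed": -0.507,
}

for name, beta in reported_betas.items():
    print(name, round(odds_ratio(beta), 3), round(percent_change_in_odds(beta), 1))
```

Note that the percentage change in the odds comes from Exp(B), not from B itself: a beta of .254 corresponds to odds about 29% higher, not 25.4% higher.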

In this data analysis:

Ø  Height

·       Beta (.029): This suggests that a one-unit increase in height is associated with an increase of 0.029 in the log odds of not having a problem with sleep.

·       P-value (.209): Since the significance level is greater than 0.05 (p > .05), the effect of height is not statistically significant.

·       Exp(B) = 1.029: If height increases by 1 unit, the odds of not having a problem with sleep are expected to increase by approximately 2.9%.

·       Confidence Interval: 95% CI for the Odds ratio is from 0.984 to 1.077.

Ø  General health

·       Beta (.254): This suggests that a one-unit increase in general health is associated with an increase of 0.254 in the log odds of not having a problem with sleep.

·       P-value (.068): Since the significance level is greater than 0.05 (p > .05), the effect of general health is not statistically significant.

·       Exp(B) = 1.290: If general health increases by 1 unit, the odds of not having a problem with sleep are expected to increase by approximately 29%.

·       Confidence Interval: 95% CI for the Odds ratio is from 0.981 to 1.695.

Ø  Caffeine drinks per day

·       Beta (.289): This suggests that a one-unit increase in caffeine drinks per day is associated with an increase of 0.289 in the log odds of not having a problem with sleep.

·       P-value (.004): Since the significance level is less than 0.05 (p < .05), the effect of caffeine drinks per day is statistically significant.

·       Exp(B) = 1.335: If caffeine drinks increase by 1 unit, the odds of not having a problem with sleep are expected to increase by approximately 33.5%.

·       Confidence Interval: 95% CI for the Odds ratio is from 1.098 to 1.624.

Ø  Physical fitness

·       Beta (.367): This suggests that a one-unit increase in physical fitness is associated with an increase of 0.367 in the log odds of not having a problem with sleep.

·       P-value (.004): Since the significance level is less than 0.05 (p < .05), the effect of physical fitness is statistically significant.

·       Exp(B) = 1.444: If physical fitness increases by 1 unit, the odds of not having a problem with sleep are expected to increase by approximately 44.4%.

·       Confidence Interval: 95% CI for the Odds ratio is from 1.127 to 1.850.

Ø  Current weight

·       Beta (.317): This suggests that a one-unit increase in current weight is associated with an increase of 0.317 in the log odds of not having a problem with sleep.

·       P-value (.083): Since the significance level is greater than 0.05 (p > .05), the effect of current weight is not statistically significant.

·       Exp(B) = 1.374: If current weight increases by 1 unit, the odds of not having a problem with sleep are expected to increase by approximately 37.4%.

·       Confidence Interval: 95% CI for the Odds ratio is from .959 to 1.967.

Ø  Hours of sleep/weekends

·       Beta (.317): This suggests that a one-unit increase in hours of sleep on weekends is associated with an increase of 0.317 in the log odds of not having a problem with sleep.

·       P-value (.083): Since the significance level is greater than 0.05 (p > .05), the effect of hours of sleep on weekends is not statistically significant.

·       Exp(B) = 1.970: If hours of sleep on weekends increase by 1 unit, the odds of not having a problem with sleep are expected to be multiplied by Exp(B).

·       Confidence Interval: 95% CI for the Odds ratio is from .959 to 1.967.

Ø  Age

·       Beta (-.018): This suggests that with every one-unit increase in age, the log odds of not having a problem with sleep are expected to decrease by 0.018. Since the beta is negative, this implies that as age increases, the probability of not having a problem with sleep decreases.

·       P-value (.286): Since the significance level is greater than 0.05 (p > .05), there is not enough evidence to reject the null hypothesis. Hence, the relationship between age and having problems with sleep is not statistically significant.

·       Exp(B) = .982: For every one-unit increase in age, the odds of not having a problem with sleep decrease by approximately 1.8%.

·       Confidence Interval: The 95% CI for the odds ratio is from .951 to 1.014. This provides a range of values for the true effect. Since the interval includes 1, the decrease in the odds of not having a problem with sleep for each one-unit increase in age is not statistically significant.

Ø  Highest education level achieved

·       Beta (-.109): This suggests that with every one-unit increase in the highest education level achieved, the log odds of not having a problem with sleep are expected to decrease by 0.109. Since the beta is negative, this implies that as the highest education level achieved increases, the probability of not having a problem with sleep decreases.

·       P-value (.544): Since the significance level is greater than 0.05 (p > .05), there is not enough evidence to reject the null hypothesis. Hence, the relationship between the highest education level achieved and having problems with sleep is not statistically significant.

·       Exp(B) = .897: For every one-unit increase in the highest education level achieved, the odds of not having a problem with sleep decrease by approximately 10.3%.

·       Confidence Interval: The 95% CI for the odds ratio is from .631 to 1.274. This provides a range of values for the true effect. Since the interval includes 1, the decrease in the odds of not having a problem with sleep for each one-unit increase in the highest education level achieved is not statistically significant.

Ø  Weight

·       Beta (-.011): This suggests that with every one-unit increase in weight, the log odds of not having a problem with sleep are expected to decrease by 0.011. Since the beta is negative, this implies that as weight increases, the probability of not having a problem with sleep decreases.

·       P-value (.521): Since the significance level is greater than 0.05 (p > .05), there is not enough evidence to reject the null hypothesis. Hence, the relationship between weight and having problems with sleep is not statistically significant.

·       Exp(B) = .989: For every one-unit increase in weight, the odds of not having a problem with sleep decrease by approximately 1.1%.

·       Confidence Interval: The 95% CI for the odds ratio is from .956 to 1.023. This provides a range of values for the true effect. Since the interval includes 1, the decrease in the odds of not having a problem with sleep for each one-unit increase in weight is not statistically significant.

Ø  Alcohol drinks/day

·       Beta (-.199): This suggests that with every one-unit increase in alcohol drinks/day, the log odds of not having a problem with sleep are expected to decrease by 0.199. Since the beta is negative, this implies that as alcohol drinks/day increase, the probability of not having a problem with sleep decreases.

·       P-value (.165): Since the significance level is greater than 0.05 (p > .05), there is not enough evidence to reject the null hypothesis. Hence, the relationship between alcohol drinks/day and having problems with sleep is not statistically significant.

·       Exp(B) = .820: For every one-unit increase in alcohol drinks/day, the odds of not having a problem with sleep decrease by approximately 18%.

·       Confidence Interval: The 95% CI for the odds ratio is from .619 to 1.085. This provides a range of values for the true effect. Since the interval includes 1, the decrease in the odds of not having a problem with sleep for each one-unit increase in alcohol drinks/day is not statistically significant.

Ø  Hours of sleep needed

·       Beta (-.507): This suggests that with every one-unit increase in hours of sleep needed, the log odds of not having a problem with sleep are expected to decrease by 0.507. Since the beta is negative, this implies that as the hours of sleep needed increase, the probability of not having a problem with sleep decreases.

·       P-value (.008): Since the significance level is less than 0.05 (p < .05), we can reject the null hypothesis. Hence, the relationship between hours of sleep needed and having problems with sleep is statistically significant.

·       Exp(B) = .602: For every one-unit increase in the hours of sleep needed, the odds of not having a problem with sleep decrease by approximately 39.8%.

·       Confidence Interval: The 95% CI for the odds ratio is from .413 to .878. This provides a range of values for the true effect. Since the interval does not include 1, the decrease in the odds of not having a problem with sleep for each one-unit increase in hours of sleep needed is statistically significant.
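The agreement between the confidence intervals and the p-values in the sections above can be checked by hand. The sketch below assumes the usual Wald interval, exp(B ± 1.96 × SE), recovers the standard error from the reported interval, and converts the resulting z-value into a two-sided p-value. It is an approximation built from rounded figures, not a reproduction of the SPSS calculation.

```python
import math


def ci_includes_one(ci_lower, ci_upper):
    """True when the 95% CI for the odds ratio contains 1 (not significant)."""
    return ci_lower <= 1.0 <= ci_upper


def wald_p_value_from_ci(beta, ci_lower, ci_upper, z_crit=1.96):
    """Approximate two-sided p-value from B and the 95% CI for Exp(B)."""
    # On the log scale the interval is B +/- z_crit * SE, so its width gives SE.
    standard_error = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z_crit)
    z = beta / standard_error
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))


# Values as reported above for two of the predictors.
p_sleep_needed = wald_p_value_from_ci(-0.507, 0.413, 0.878)
p_caffeine = wald_p_value_from_ci(0.289, 1.098, 1.624)

print(ci_includes_one(0.413, 0.878), round(p_sleep_needed, 3))
print(ci_includes_one(0.951, 1.014))  # age: interval spans 1
```

Both approaches lead to the same decision at the 5% level: an interval that excludes 1 goes with a p-value below .05, and an interval that includes 1 goes with a p-value above .05.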

 
