A large bank wants to gain insight into their employees’ job satisfaction. They carried out a survey, the results of which are in bank_clean.sav. The survey included the number of hours people work per week and their gross monthly salaries.

## Research Question

It seems obvious that working hours are related to monthly salaries: employees who work more hours earn more money. But we’d like to know more about this relationship, so our research question is: how (strongly) is monthly salary related to working hours?

Since we already inspected this data file (and set missing values), we can simply run `correlations whours salary.` and see that the correlation is 0.648, quite a strong linear relation. Some would leave it at that. However, a scatterplot will show that there’s *way* more to this relation.
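For reference, the one-liner above is shorthand. The Correlations dialog pastes a longer form along the lines of the sketch below; the subcommands shown are the usual dialog defaults and are not taken from the original analysis.

```spss
*Pearson correlation between weekly hours and monthly salary.
*TWOTAIL and PAIRWISE are assumed here as the usual dialog defaults.
CORRELATIONS
  /VARIABLES=whours salary
  /PRINT=TWOTAIL NOSIG
  /MISSING=PAIRWISE.
```

With only two variables, pairwise and listwise exclusion of missing values give the same result.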

## SPSS Scatterplot Creation

We’ll first run our scatterplot the way most users find easiest: by following the screenshots below.

The aforementioned steps result in the syntax below. Running it creates our first basic scatterplot.

## SPSS Scatterplot Syntax

***Minimal scatterplot syntax from legacy dialogs.**

GRAPH

/SCATTERPLOT(BIVAR)=whours WITH salary

/MISSING=LISTWISE.

Note: you’ll get the exact same result by running `graph/scatter whours with salary.` You’ll probably prefer this second version if you want to create multiple scatterplots by copy-paste-editing the syntax. If you want to create a *huge* number of scatterplots, see SPSS with Python – Looping over Scatterplots.

## Result

As we see, this is not a simple linear relation. First, we see that our dots become more dispersed as our respondents work more hours; the more hours people work, the greater the standard deviation of monthly salary. This is a textbook example of **heteroscedasticity**, the opposite of homoscedasticity, an important assumption for regression.

Second, we see the pattern of dots “bend upwards” towards the right side of our chart. This is a clear indication of **nonlinearity**, which also violates the regression assumptions.
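The growing spread from the first point can also be checked numerically. The sketch below groups weekly hours into three bands and compares the standard deviation of salary across them. The cutoffs (30 and 40 hours) and the variable name `hrsband` are assumptions made purely for illustration; choose bands that fit the actual distribution of `whours`.

```spss
*Hypothetical cutoffs: group weekly hours into three bands.
*Missing values are kept out of the bands explicitly.
recode whours (missing = sysmis)(lo thru 30 = 1)(30 thru 40 = 2)(40 thru hi = 3) into hrsband.
value labels hrsband 1 'Up to 30 hours' 2 'Over 30 up to 40 hours' 3 'Over 40 hours'.

*Standard deviation of salary per band of weekly hours.
means tables=salary by hrsband
  /cells=count mean stddev.
```

If the standard deviations increase from the first band to the last, that matches the fanning-out pattern seen in the scatterplot.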

So why do we see heteroscedasticity and nonlinearity in our scatterplot? Well, perhaps the higher hourly wages are only available in the higher-level jobs, which also require more hours per week. Interestingly, we have “job type” in our data, which comes somewhat close to job levels. Let’s now add it to our scatterplot by following the screenshot below. Tip: use the dialog recall button for quick access to the scatter dialog.

## SPSS Scatterplot with Legend

Move jtype (job type) into the Set Markers by box.

The Label Cases by box should label each dot with the value of a (unique identifier) variable but it **doesn’t work**. We’ll leave it empty.

Optionally, let’s add a nice title to our chart.

## SPSS Scatterplot with Legend Syntax

***Scatterplot with different colors for different job types.**

GRAPH

/SCATTERPLOT(BIVAR)=whours WITH salary BY jtype

/MISSING=LISTWISE

/TITLE "Monthly Salary by Weekly Hours | n = 464".

## Result

And there we have it. The cause of the heteroscedasticity and nonlinearity is that middle and upper managers have (very) **high hourly wages and typically work more hours** than the other employees.
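One way to put numbers on this is to compute an approximate hourly wage and compare it across job types. This is only a sketch: the variable name `hwage` is hypothetical, and the conversion assumes a month holds 52 / 12 (roughly 4.33) working weeks.

```spss
*Approximate hourly wage: monthly salary divided by monthly hours.
*Assumption: monthly hours = weekly hours * 52 / 12.
compute hwage = salary / (whours * 52 / 12).
variable labels hwage 'Approximate gross hourly wage'.

*Compare mean hourly wage and mean weekly hours across job types.
means tables=hwage whours by jtype
  /cells=count mean stddev.
```

Cases with missing or zero working hours simply end up with a system missing value on `hwage`.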

This plot also suggests that we should perhaps **not lump together all job types**: for sales employees (red dots), the relation between hours and salary looks very linear, presumably because their hourly wages are rather fixed. The precise opposite holds for upper management (black dots). We’ll now confirm this by inspecting the correlation for each group separately.

## Correlations for Job Types Separately

***Sort cases needed for split file.**

sort cases by jtype.

***Split file.**

split file by jtype.

***Separate correlations for job types.**

correlations salary with whours.

split file off.

## Result

Indeed, the correlation between hours and salary is **0.79 for sales employees** and 0.21 for upper management. We’ll leave it as an exercise to the reader to create scatterplots for separate job type groups.
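As a starting point for that exercise, one possible approach reuses split file, which should make GRAPH produce a separate chart for each job type. This is a minimal sketch, not the only way to do it.

```spss
*Sort cases needed for split file.
sort cases by jtype.

*Separate output per job type.
split file separate by jtype.

*One scatterplot per job type.
graph
  /scatterplot(bivar)=whours with salary.

*Switch split file off again.
split file off.
```

Note that each chart gets its own axis ranges by default, so compare the patterns rather than the absolute positions of the dots.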

## Final Notes

Our first finding on these data was simply a correlation of 0.65 between working hours and salary. However, a scatterplot suggested that it wasn’t quite as simple as that. I hope we gave you an idea of how to create scatterplots easily in SPSS and why they can be very useful indeed.

Thanks for reading!