# Matplotlib Series 4: Scatter plot

This post shows how to create a scatter plot, a connected scatter plot, and a bubble chart with matplotlib in Python.

This post is part of the Matplotlib Series.

## Scatter plot

A scatter plot (also called a scatter graph, scatter chart, scattergram, or scatter diagram) is a type of plot or mathematical diagram using Cartesian coordinates to display values for typically two variables for a set of data.

### When to use it?

Scatter plots are used when you want to show the relationship between two variables. Scatter plots are sometimes called correlation plots because they show how two variables are correlated.

### Example 1

This plot shows a positive relationship between a store's surface and its turnover (k euros), which is reasonable: the larger a store is, the more clients it can accept and the more turnover it generates.
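A plot like this one can be sketched with `Axes.scatter`. The numbers below are randomly generated for illustration only (they are not the data behind the original figure), and the variable names `surface` and `turnover` are assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up data for illustration: store surface (m2) and turnover (k euros).
rng = np.random.default_rng(0)
surface = rng.uniform(50, 500, size=40)
turnover = 0.8 * surface + rng.normal(0, 30, size=40)  # positive trend plus noise

fig, ax = plt.subplots()
ax.scatter(surface, turnover)  # one dot per store
ax.set_xlabel("Store surface (m2)")
ax.set_ylabel("Turnover (k euros)")
ax.set_title("Turnover vs. store surface")
```

Call `plt.show()` to display the figure in a script, or `fig.savefig("scatter.png")` to write it to a file.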

### Example 2

This chart shows a negative relationship between two variables: temperature and average volume of hot soup. When it gets colder, people look for something hot to keep warm; when it gets hotter, the demand for hot soup decreases.

### Example 3

This plot shows no relationship between clients' ages and their weekly purchase cost. There is therefore no point in studying the relationship between these two variables in this case.

## Connected scatter plot

A connected scatter plot is a mix between a scatter plot and a line chart: it uses line segments to connect consecutive scatter plot points, for example to illustrate trajectories over time.

### When to use it?

The connected scatterplot visualizes two related time series in a scatterplot and connects the points with a line in temporal sequence.

### Example

Suppose that this plot describes the turnover (k euros) of hot soup sales over one year. According to the plot, sales clearly reach a peak in winter, then fall from spring to summer, which is logical.
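One way to draw a connected scatter plot is to call `Axes.plot` with both a marker and a line style. The monthly values below are invented so that they peak in winter and dip in summer; they are not the original data.

```python
import matplotlib.pyplot as plt

# Made-up monthly turnover (k euros) of hot soup sales, for illustration only.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
turnover = [95, 90, 70, 55, 40, 30, 25, 28, 45, 60, 80, 98]
positions = list(range(1, 13))  # numeric x positions, one per month

fig, ax = plt.subplots()
# marker="o" draws the scatter points, linestyle="-" connects them in order.
ax.plot(positions, turnover, marker="o", linestyle="-")
ax.set_xticks(positions)
ax.set_xticklabels(months)
ax.set_xlabel("Month")
ax.set_ylabel("Turnover (k euros)")
ax.set_title("Hot soup sales over one year")
```

The points are connected in the order in which they are passed, so the data should already be sorted in time.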

## Bubble chart

A bubble chart is a type of chart that displays three dimensions of data: the value of an additional variable is represented by the size of the dots.

### When to use it?

Use a bubble chart to convey a third data element for each observation.

### Example

Here the number of clients is added as the size of each point. The result is consistent with the explanation of the first scatter plot above: larger stores can accept more clients.
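A bubble chart can be sketched by passing an array to the `s` parameter of `Axes.scatter`, which sets each marker's area in points squared. The data and the `0.3` scaling factor below are assumptions chosen only to give readable bubble sizes.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up data for illustration: surface (m2), turnover (k euros), number of clients.
rng = np.random.default_rng(1)
surface = rng.uniform(50, 500, size=30)
turnover = 0.8 * surface + rng.normal(0, 30, size=30)
clients = np.clip(4 * surface + rng.normal(0, 100, size=30), 50, None)

sizes = 0.3 * clients  # hypothetical scaling from client count to marker area

fig, ax = plt.subplots()
# s is the marker area (points squared); alpha keeps overlapping bubbles visible.
ax.scatter(surface, turnover, s=sizes, alpha=0.5)
ax.set_xlabel("Store surface (m2)")
ax.set_ylabel("Turnover (k euros)")
ax.set_title("Bubble size = number of clients")
```

Because `s` controls area rather than diameter, a store with twice as many clients gets a bubble with twice the area.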

## Scatter plot with different colors

Matplotlib's `scatter` does not map the values of a categorical variable to colors on its own. A simple workaround is to overlay one scatter plot per category, each with its own color.

### Example 1

This two-color scatter plot clearly shows the difference in weekly purchase cost between young people and middle-aged or elderly people: the average weekly purchase of younger people is nearly twice that of middle-aged or elderly people.
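The overlay approach can be sketched with one `scatter` call per category on the same axes. The data is randomly generated, and the age threshold of 35 used to define "young" is an assumption made for this sketch.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up data for illustration: client age and weekly purchase cost.
rng = np.random.default_rng(2)
age = rng.integers(18, 80, size=60)
is_young = age < 35  # hypothetical threshold for the "young" category
purchase = np.where(is_young, 100, 55) + rng.normal(0, 10, size=60)

fig, ax = plt.subplots()
# One scatter call per category, drawn on the same axes with its own color.
ax.scatter(age[is_young], purchase[is_young],
           color="tab:blue", label="Young")
ax.scatter(age[~is_young], purchase[~is_young],
           color="tab:orange", label="Middle-aged or elderly")
ax.set_xlabel("Age")
ax.set_ylabel("Weekly purchase cost")
ax.legend()
```

Giving each call a `label` lets `ax.legend()` build the color key automatically.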

### Example 2

In this plot, some points overlap, which affects the analysis. In this case, it is better to separate the samples of "Paris (75)" and "Val de Marne (94)" into two plots.

Compared with the first plot of this example, the separate graphs are clearer and easier to interpret. The rent price per m2 in Val de Marne is almost half of the rent price per m2 in Paris.
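Splitting the two groups into separate panels could look like the sketch below, which uses `plt.subplots` with `sharey=True` so both panels keep the same vertical scale. The data is invented, and using apartment surface as the x variable is an assumption, since the original figure is not available here.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up data for illustration: (surface in m2, rent price per m2) per department.
rng = np.random.default_rng(3)
data = {
    "Paris (75)": (rng.uniform(15, 100, 40), rng.normal(30, 4, 40)),
    "Val de Marne (94)": (rng.uniform(15, 100, 40), rng.normal(16, 3, 40)),
}

# One panel per department; sharey=True keeps the y scales identical.
fig, axes = plt.subplots(1, 2, sharey=True, figsize=(10, 4))
for ax, (name, (surface, rent)) in zip(axes, data.items()):
    ax.scatter(surface, rent)
    ax.set_title(name)
    ax.set_xlabel("Surface (m2)")
axes[0].set_ylabel("Rent price per m2")
```

Sharing the y axis is a design choice: it keeps the two panels directly comparable, so a gap in price level stays visible even though the points no longer sit on the same axes.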