using Plots
11 Plotting outputs with Plots
While computers are happy enough to deal with large quantities of data, humans are not. Therefore, whenever we have data, it’s important to be able to create visualisations of it, to aid comprehensibility and understanding. Julia does not do this by default, since it requires a lot of machinery which would be costly for loading times if it were included within Base, and would not always be essential. Instead, we need to turn to the landscape of packages, and in particular, Plots.
11.1 Installing the Plots package
Plots is installed just like any other package, through Pkg mode in the REPL:
(@v1.9) pkg> add Plots
You’ll note that this takes quite a while, and this is for two reasons. Firstly, Plots has a lot of dependencies which need to be downloaded alongside it, some of which are reasonably large. The major reason, however, is that these larger modules are precompiled by Pkg, into a format that can be loaded more quickly into Julia when required.
Once installed, we don’t need to run any of this again (until we update the package anyway). Instead, we need only type using Plots to be able to access it in the current environment.
11.2 Collecting data
Data could come in many formats, which we have investigated in earlier chapters. It could have been generated within Julia, and stored in an Array or similar container (Chapter 9). It may be from elsewhere, stored in an external file and needing to be read into our program (Chapter 10). It may not have even been calculated yet, instead stored abstractly as a function, ready to be evaluated on a range of values (Chapter 6).
Plots understands this, and via multiple dispatch (Chapter 7) is able to plot data coming in many forms. However, for simplicity, we’ll assume to begin with that the data we have is numeric, and in the form of two Vectors, one consisting of x-values, and the other consisting of corresponding y-values. Most numeric data can be turned into this format, for instance a list of x-values and a function can give rise to a Vector of y-values by broadcasting (Chapter 9), or values from a .csv file could be read in this format (Chapter 10). We’ll see how we may wish to plot non-numeric data when we consider other types of graph later.
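As a minimal sketch of the first of these conversions, the following takes a range of x-values and a function, and broadcasts one over the other to get a Vector of y-values. The function g and the values used are made up for illustration, and are not part of the data used in this chapter.

```julia
# Hypothetical example function, not from the chapter's data
g(x) = 2x + 1

# A range of x-values; `collect` turns it into a Vector
xvals = collect(0:0.5:2)

# Broadcasting with the dot syntax applies g to every element
yvals = g.(xvals)

println(xvals)
println(yvals)
```

Both xvals and yvals are now Vectors of the same length, which is the format assumed for the rest of this section.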
We’ll use some example data about London Underground stations for the rest of this chapter. This comes from a Freedom of Information request online, in the form of a .csv file:
# Data about London Underground Stations, from:
url = "https://www.whatdotheyknow.com/request/512947/response/1238210/attach/3/Stations%2020180921.csv.txt"
using CSV
# Vector of names of stations
names = String[]
# Vector of zones that the stations are in
zones = Int64[]
# Vector of latitudes of stations
lats = Float64[]
# Vector of longitudes of stations
longs = Float64[]
for row ∈ CSV.Rows(download(url))
    # We want only the London Underground stations
    if row.NETWORK == "London Underground"
        push!(names, string(row.NAME))
        push!(zones, parse(Int64, row.Zone))
        push!(lats, parse(Float64, row.y))
        push!(longs, parse(Float64, row.x))
    end
end
To get an idea of the data we’ve got here, let’s see what some of it looks like:
indices = [1, 51, 101, 151, 201, 251]
[names[indices];; zones[indices];; lats[indices];; longs[indices]]
6×4 Matrix{Any}:
"Temple" 1 51.5105 -0.112644
"Edgware Road (Bakerloo)" 1 51.52 -0.168646
"Archway" 2 51.564 -0.133444
"Acton Town" 3 51.5021 -0.278433
"East Ham" 4 51.5389 0.0544549
"Upminster Bridge" 6 51.558 0.237477
11.3 Plotting data
The basic function for creating a plot in Plots is, aptly, plot. If we have a Vector of x values (in this case the longitudes longs) and a Vector of y values (in this case the latitudes lats) of the same length, then we simply plot them by:
plot(longs, lats)
Aargh! That’s a mess! The default behaviour of plot is to assume you want a line graph, which makes sense for lots of data, but not ours. For instance, if you plot a function, you would expect a line graph as an output:
f(x) = x^2 + 7x - 5
# Not a `Vector`, but can be turned into one by `convert`
xs = -5:0.01:5
plot(xs, f.(xs))
What Figure 11.1 and Figure 11.2 show us though is some of the other behaviour of plot:
- Although it’s not obvious from the first example, since we haven’t seen the contents of longs and lats, the data is always joined in the order it lies in these Vectors
- The x and y axes have been automatically translated and scaled to better fit our data on the plot
- The use of plot has created a legend, with a name for the data we’ve just plotted, y1. If we add more data, this legend will be added to
As we’ll see later, plot is highly customisable, and every aspect of the output can be tweaked and changed as we need.
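Because the points are joined in the order they are stored, unordered data gives a tangled line. The following is a minimal sketch, using made-up values rather than the station data, of reordering two Vectors by their x-values with sortperm before plotting them as a line:

```julia
# Made-up, deliberately unordered data (an assumption for illustration)
xunordered = [3.0, 1.0, 4.0, 2.0]
yunordered = [9.0, 1.0, 16.0, 4.0]

# sortperm gives the permutation of indices that sorts the x-values
order = sortperm(xunordered)

# Apply the same permutation to both Vectors so that the pairs stay together
xsorted = xunordered[order]
ysorted = yunordered[order]
```

Passing xsorted and ysorted to plot would then join the points from left to right. This only helps when the data has a natural order in x, which the station coordinates do not.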
11.4 Other types of graph
One of the more consequential changes we want to make is to change the type of graph that is being plotted, which we can do using the keyword argument seriestype. A series is a single set of data points that has been plotted, so for instance if we plotted two functions on the same graph, each would be its own series. Many series types are available in Plots, such as:
# Histogram
# One dimensional data input, bins and frequencies automatically calculated
plot(lats, seriestype = :histogram)
# Pie chart
zonenums = 1:9
zonecounts = [count(==(i), zones) for i ∈ zonenums]
# First input is a Vector of names, second is a Vector of frequencies
plot(zonenums, zonecounts, seriestype = :pie)
# Bar chart
letters = Char.(65:90)
lettercounts = [count(x -> x[1] == c, names) for c ∈ letters]
# First input is a Vector of names, second is a Vector of frequencies
plot(letters, lettercounts, seriestype = :bar)
Each of these cases has parameters that could be changed to make the graphs look better, or to have them better represent our data. For instance, the bins of the histogram could be customised, the wedges of the pie chart could be labelled directly instead of through the legend, or the missing letters on the x-axis of the bar chart could be re-added.
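As a minimal sketch of one such change, the histogram bins can be set through the bins keyword argument. The sample values and the choice of bin edges below are assumptions made for illustration, standing in for a Vector such as lats:

```julia
using Plots

# Made-up sample data standing in for a Vector such as lats
samples = [0.1, 0.15, 0.3, 0.35, 0.4, 0.55, 0.6, 0.62, 0.8, 0.95]

# Explicit bin edges: five bins of width 0.2 between 0 and 1
binplot = plot(samples, seriestype = :histogram, bins = 0:0.2:1)
```

Giving the edges explicitly keeps the bins the same from one run to the next, which can matter when comparing several histograms.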
The series type that we need for our latitude/longitude data is :scatter. With many series types, including :scatter, there is a second way to create them, which is to use their own functions instead of plot:
scatter(longs, lats)
11.5 Changing the look
While the default look of the series we’ve plotted so far shows the data reasonably well (if we chose the right series type for the data, that is!), it could be improved. Moreover, there are other dials we might want to turn to tweak the look, or information we want to add in for clarity or take away for decluttering. These are defined by keyword arguments to plot (or to another function specific to a series type), that is, arguments like seriestype which need naming before we can give them a value. A selection of useful ones is given in the table below.
| Name | Possible values | Function |
|---|---|---|
| :seriestype | Symbol, e.g. :bar | Changes the graph type of plotted data |
| :size | Tuple of two Integers | Changes the output image size |
| :xlims | Tuple of two Reals | Changes the range of x values seen |
| :ylims | Tuple of two Reals | Changes the range of y values seen |
| :framestyle | Symbol, e.g. :box | Changes the layout of the axes |
| :legend | Various, e.g. false, (0.5, 0.7) | Changes the legend position, or disables it |
| :label | String | Changes the name for a series in the legend |
| :ticks | Various, e.g. false, xticks = 0:0.05:1 | Changes the ticks on one or both axes |
| :xlabel | | |
| :title | String | Adds a title to the plot |
| :annotations | Tuple of two Reals and a String | Adds text annotations at given coordinates |
| :markercolour | Symbol, e.g. :red, or RGB, HSL, etc. | Changes the colour of marked points |
| :linecolour | Symbol, e.g. :red, or RGB, HSL, etc. | Changes the colour of plotted lines |
| :fillcolour | Symbol, e.g. :red, or RGB, HSL, etc. | Changes the colour of filled regions |
| :layout | Tuple of two Integers, or e.g. grid() | Defines the layout of multiple combined plots |
A comprehensive list of attributes can be found by using the function plotattr() in the REPL. This gives a list of attributes, which you can move up and down with the arrow keys ↑ and ↓, and select with Enter ⮠, which will then give you a bit of information about the usage of that attribute. You can also get a list of attributes in one of four categories – :Series, :Subplot, :Plot, :Axis – by using that Symbol as an argument:
plotattr(:Axis)
Defined Axis attributes are:
discrete_values, draw_arrow, flip, foreground_color_axis, foreground_color_border, foreground_color_grid, foreground_color_guide, foreground_color_minor_grid, foreground_color_text, formatter, grid, gridalpha, gridlinewidth, gridstyle, guide, guide_position, guidefontcolor, guidefontfamily, guidefonthalign, guidefontrotation, guidefontsize, guidefontvalign, lims, link, minorgrid, minorgridalpha, minorgridlinewidth, minorgridstyle, minorticks, mirror, rotation, scale, showaxis, tick_direction, tickfontcolor, tickfontfamily, tickfonthalign, tickfontrotation, tickfontsize, tickfontvalign, ticks, unitformat, widen
Additionally, using the name of an attribute as a String input gives more detail on that specific attribute (note that this doesn’t work for all possible attributes, however):
plotattr("ticks")
:ticks
Tick values, (tickvalues, ticklabels), `:auto`/`true`, `:none`/`false`/`nothing` (ticks disabled), or `:native` (tells backend to calculate ticks by itself; good idea for interactive backends with mouse zooming).
Aliases: (:tick,).
Type: Union{Nothing, Bool, Symbol, Tuple{AbstractVector{Real}, AbstractVector{AbstractString}}, AbstractVector{Real}}.
`Axis` attribute, defaults to `auto`.
Let’s now put some of these into action:
tubeplot = plot(
    title = "Locations of London Underground stations",
    xlims = (-0.7, 0.3),
    xticks = -0.7:0.1:0.3,
    xlabel = "Longitude",
    ylims = (51.35, 51.75),
    yticks = 51.35:0.05:51.75,
    ylabel = "Latitude",
    legend = false
)
Long lines of code are difficult to read, so changing many attributes should be split over several lines.
This is clearly missing our data! Luckily, we can modify an earlier plot that we made using the mutating variant (e.g. plot! instead of plot), and giving the variable corresponding to that earlier plot as the first argument. Alternatively, using the mutating variant function without specifying a plot modifies the last plot that was created (found by current()). We’ll use both of these methods to add our data onto the graph:
# Adds red circles at every lat/long
scatter!(
    tubeplot,
    longs,
    lats,
    markercolor = :white,
    markerstrokecolor = :red,
    markerstrokewidth = 2
)
# Adds blue rectangles at every lat/long
scatter!(
    longs,
    lats,
    markercolor = :blue,
    markershape = Shape([-2, -2, 2, 2], [-0.4, 0.4, 0.4, -0.4]),
    markerstrokewidth = 0
)
An aspect of Plots that we haven’t mentioned is how it actually turns our data into a plot. This is done by the backend, which in general is a separate module to Plots, perhaps not even written in Julia, that can visualise data. Indeed, Plots is better described as a consistent interface unifying the backends such that we can talk to them all using the same language, rather than a plotting package in its own right.
The default backend is called GR, which comes prepackaged with Plots, but for different series types and customisation options, you may wish to choose a different backend. To do this, you’ll need to download the backend as a package, load it in with using, and then run a function to switch backend (for example plotly() changes the backend to Plotly, gr() changes it back to GR, and so on). For our purposes, we’re sticking to GR, since the syntax for using other backends is more or less identical, and all that changes is the look of the output.
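A minimal sketch of checking which backend is active, assuming only that Plots is installed with its default GR backend. The function backend_name is provided by Plots and returns a Symbol:

```julia
using Plots

# Switch to (or confirm) the GR backend
gr()

# backend_name returns a Symbol naming the backend currently in use
println(backend_name())
```

Calling another backend function, such as plotly(), in place of gr() would follow the same pattern, provided that backend’s package is installed.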
11.6 Saving plots
Once we’ve generated a plot like Figure 11.7, and we like the look of it, we may wish to save it separately as a file. While this would be possible by messing around with output streams as we did in Chapter 10, luckily Plots provides us with the function savefig that does this all for us.
savefig(tubeplot, "tubeplot.png")
The first argument to the function is the Plot that we want to save, and the second is the name that we want to give the file. Notice that we’ve included a file extension, in this case .png. Plots can infer from this the format we want the image saved in, and several such formats are supported (depending on backend). If no file extension is specified, PNG is the default. Indeed, we can also use the function png, which has the same effect as savefig with a png file extension:
# Equivalent to the above
png(tubeplot, "tubeplot")
The location where this will be saved by default can be found by running pwd(), which tells us the current working directory. To save to a different location, you can include the path to that location either relative to the current working directory, or in full. For instance, savefig(tubeplot, "figures/tubeplot.png") would save tubeplot as a PNG file called tubeplot.png, which lies inside a folder called figures within the current working directory. Bear in mind that this won’t create new folders that don’t already exist, so if the folder figures doesn’t exist, we’ll get an error.
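One way around a missing folder is to create it first with mkpath from Base. The following is a minimal sketch, assuming a throwaway plot and a temporary directory (both hypothetical, chosen so that the example does not touch your working directory):

```julia
using Plots

# Throwaway plot standing in for tubeplot
demoplot = plot(1:10, (1:10) .^ 2, legend = false)

# Hypothetical output folder inside a temporary directory
outdir = joinpath(mktempdir(), "figures")

# mkpath creates the folder and any missing parents,
# and does nothing if the folder already exists
mkpath(outdir)

# Build the full file name and save the plot there
outfile = joinpath(outdir, "demoplot.png")
savefig(demoplot, outfile)
```

Using joinpath rather than writing the separator by hand keeps the path valid on different operating systems.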
For more details on saving to files from Julia, refer to Chapter 10.
11.7 Alternatives to Plots
Plots is not the only package that can create graphical visualisations of data. Some alternatives include:
- Colors can be used to create some visualisations as we saw in Chapter 9. A Matrix of colours can be interpreted as an image, although adding axes or annotations is not possible
- Makie describes itself as a ‘data visualisation ecosystem’. It is similar to Plots in that it comes with many different backends, and may well serve as the successor to Plots in the years to come
- Gadfly was previously available as a Plots backend, but now only functions as a standalone plotting package
- VegaLite is a Julia interface for the Vega-Lite ‘visualisation grammar’ to describe graphs in a more human readable JSON fashion, and still be able to plot them
- Instead of creating images, UnicodePlots keeps its outputs text-based within the REPL. Despite this limitation, it’s possible to create a wide variety of graphs
What’s more, Plots itself has a great deal of flexibility with its choice of backend. It may be that Plots with the default GR backend doesn’t suit your needs, but a different backend may be better.