11  Plotting outputs with Plots

Prerequisites

For this chapter, you should know what a keyword argument is (Chapter 6), how to use a package (Chapter 8), and what a Vector is (Chapter 9).

While computers are happy enough to deal with large quantities of data, humans are not. Therefore, whenever we have data, it’s important to be able to create visualisations of it to aid understanding. Julia does not do this by default, since it requires a lot of machinery which would be costly for loading times if it were included within Base, and would not always be essential. Instead, we need to turn to the landscape of packages, and in particular, Plots.

11.1 Installing the Plots package

Plots is installed just like any other package, through Pkg mode in the REPL:

(@v1.9) pkg> add Plots

You’ll note that this takes quite a while, and this is for two reasons. Firstly, Plots has a lot of dependencies which need to be downloaded alongside it, some of which are reasonably large. The major reason, however, is that these larger modules are precompiled by Pkg, into a format that can be loaded more quickly into Julia when required.

Once installed, we don’t need to run any of this again (until we update the package anyway). Instead, we need only type using Plots to be able to access it in the current environment.

using Plots

11.2 Collecting data

Data could come in many formats, which we have investigated in earlier chapters. It could have been generated within Julia, and stored in an Array or similar container (Chapter 9). It may be from elsewhere, stored in an external file and needing to be read into our program (Chapter 10). It may not have even been calculated yet, instead stored abstractly as a function, ready to be evaluated on a range of values (Chapter 6).

Plots understands this, and via multiple dispatch (Chapter 7) is able to plot data coming in many forms. However, for simplicity, we’ll assume to begin with that the data we have is numeric, and in the form of two Vectors, one consisting of x-values, and the other consisting of corresponding y-values. Most numeric data can be turned into this format: for instance, a list of x-values and a function can give rise to a Vector of y-values by broadcasting (Chapter 9), or values from a .csv file could be read in this format (Chapter 10). We’ll see how we may wish to plot non-numeric data when we consider other types of graph later.
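
As a small illustration of the broadcasting route, the following sketch turns a range of x-values and a function into two Vectors of equal length. The function g and the variable names xvals and yvals are made up for this example and have nothing to do with the Underground data used later.

```julia
# Hypothetical function, chosen purely for illustration
g(x) = 2x + 1

# `collect` turns the range into a Vector of x-values
xvals = collect(0:0.5:2)

# Broadcasting with `.` applies g to each element, giving the y-values
yvals = g.(xvals)

# Both Vectors hold one entry per data point
length(xvals) == length(yvals)
```

The two Vectors could then be passed to a plotting function as the x and y data.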

We’ll use some example data about London Underground stations for the rest of this chapter. This comes from a Freedom of Information request online, in the form of a .csv file:

# Data about London Underground Stations, from:
url = "https://www.whatdotheyknow.com/request/512947/response/1238210/attach/3/Stations%2020180921.csv.txt"

using CSV

# Vector of names of stations
names = String[]
# Vector of zones that the stations are in
zones = Int64[]
# Vector of latitudes of stations
lats = Float64[]
# Vector of longitudes of stations
longs = Float64[]

for row in CSV.Rows(download(url))
    # We want only the London Underground stations
    if row.NETWORK == "London Underground"
        push!(names, string(row.NAME))
        push!(zones, parse(Int64, row.Zone))
        push!(lats, parse(Float64, row.y))
        push!(longs, parse(Float64, row.x))
    end
end

To get an idea of the data we’ve got here, let’s see what some of it looks like:

indices = [1, 51, 101, 151, 201, 251]
[names[indices];; zones[indices];; lats[indices];; longs[indices]]
6×4 Matrix{Any}:
 "Temple"                   1  51.5105  -0.112644
 "Edgware Road (Bakerloo)"  1  51.52    -0.168646
 "Archway"                  2  51.564   -0.133444
 "Acton Town"               3  51.5021  -0.278433
 "East Ham"                 4  51.5389   0.0544549
 "Upminster Bridge"         6  51.558    0.237477

11.3 Plotting data

The basic function for creating a plot in Plots is, aptly, plot. If we have a Vector of x values (in this case the longitudes longs) and a Vector of y values (in this case the latitudes lats) of the same length, then we simply plot them by:

plot(longs, lats)
Figure 11.1: First attempt at plotting

Aargh! That’s a mess! The default behaviour of plot is to assume you want a line graph, which makes sense for lots of data, but not ours. For instance, if you plot a function, you would expect a line graph as an output:

f(x) = x^2 + 7x - 5
# Not a `Vector`, but can be turned into one by `convert`
xs = -5:0.01:5
plot(xs, f.(xs))
Figure 11.2: Plot of the function \(f(x) = x^2 + 7x - 5\)

What Figure 11.1 and Figure 11.2 show us though is some of the other behaviour of plot:

  • Although it’s not obvious from the first example, since we haven’t seen the contents of longs and lats, the data is always joined in the order it lies in these Vectors

  • The x and y axes have been automatically translated and scaled to better fit our data on the plot

  • The use of plot has created a legend, with a name for the data we’ve just plotted, y1. If we add more data, this legend will be added to

As we’ll see later, plot is highly customisable, and every aspect of the output can be tweaked and changed as we need.
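
To see the legend grow with more than one series, here is a minimal sketch that plots two functions in a single call. It assumes the Plots convention that each column of a matrix is treated as a separate series, and that labels are supplied as a one-row matrix; the choice of sin and cos is purely illustrative.

```julia
using Plots

xs = 0:0.1:2π

# Each column of this matrix is treated as its own series
ys = [sin.(xs) cos.(xs)]

# Labels are given as a 1×2 row matrix, one per series,
# replacing the default names y1 and y2 in the legend
p = plot(xs, ys, label = ["sin(x)" "cos(x)"])
```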

11.4 Other types of graph

One of the more consequential changes we want to make is to change the type of graph that is being plotted, which we can do using the keyword argument seriestype. A series is a single set of data points that has been plotted, so for instance if we plotted two functions on the same graph, each would be its own series. Many series types are available in Plots, such as:

# Histogram
# One dimensional data input, bins and their frequencies automatically calculated
plot(lats, seriestype = :histogram)
Figure 11.3: Histogram of latitudes
# Pie chart
zonenums = 1:9
zonecounts = [count(==(i), zones) for i in zonenums]
# First input is a Vector of names, second is a Vector of frequencies
plot(zonenums, zonecounts, seriestype = :pie)
Figure 11.4: Pie chart of zones
# Bar chart
letters = Char.(65:90)
lettercounts = [count(x -> x[1] == c, names) for c in letters]
# First input is a Vector of names, second is a Vector of frequencies
plot(letters, lettercounts, seriestype = :bar)
Figure 11.5: Bar chart of starting letters

Each of these cases has parameters that could be changed to make the graphs look better, or to have them better represent our data. For instance, the bins of the histogram could be customised, the wedges of the pie chart could be labelled directly instead of through the legend, or the missing letters on the x-axis of the bar chart could be re-added.
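
As one example of such a tweak, the sketch below asks for a particular number of histogram bins through the bins keyword. The data here is randomly generated as a stand-in for the latitudes, and the value 20 is an arbitrary choice; the backend may treat it as an approximate request rather than an exact count.

```julia
using Plots

# Made-up data standing in for the latitudes: 200 values between 51.4 and 51.7
data = 51.4 .+ 0.3 .* rand(200)

# `bins` requests a number of bins; `legend = false` hides the legend
h = histogram(data, bins = 20, legend = false)
```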

The series type that we need for our latitude/longitude data is :scatter. With many series types, including :scatter, there is a second way to create them, which is to use their own functions instead of plot:

scatter(longs, lats)
Figure 11.6: Scatter plot showing locations of London Underground stations
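
To make the equivalence between the two approaches concrete, this sketch builds the same scatter plot both ways. The coordinates are made up for illustration, so that the example runs without the station data.

```julia
using Plots

# Made-up coordinates, purely for illustration
x = [0.1, 0.4, 0.2, 0.9]
y = [1.0, 0.5, 0.8, 0.3]

# Using the general function with the seriestype keyword
p1 = plot(x, y, seriestype = :scatter)

# Using the dedicated function for this series type
p2 = scatter(x, y)
```

Both p1 and p2 should show the same four unconnected points.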

11.5 Changing the look

While the default look of the series we’ve plotted so far shows the data reasonably well (if we chose the right series type for the data, that is!), it could be improved. Moreover, there are other dials we might want to turn to tweak the look, or information we want to add in for clarity or take away for decluttering. These are defined by keyword arguments to plot (or another series type specific function), that is, arguments like seriestype which need naming before we can give them a value. A selection of useful ones is given in the table below.

Table 11.1: Some of the more common plot attributes
| Name | Possible values | Function |
|---|---|---|
| :seriestype | Symbol, e.g. :bar | Changes the graph type of plotted data |
| :size | Tuple of two Integers | Changes the output image size |
| :xlims | Tuple of two Reals | Changes the range of x values seen |
| :ylims | Tuple of two Reals | Changes the range of y values seen |
| :framestyle | Symbol, e.g. :box | Changes the layout of the axes |
| :legend | Various, e.g. false, (0.5, 0.7) | Changes the legend position, or disables it |
| :label | String | Changes the name for a series in the legend |
| :ticks | Various, e.g. false, xticks = 0:0.05:1 | Changes the ticks on one or both axes |
| :xlabel | String | Adds a label to the x-axis (similarly :ylabel for the y-axis) |
| :title | String | Adds a title to the plot |
| :annotations | Tuple of two Reals and a String | Adds text annotations at given coordinates |
| :markercolour | Symbol, e.g. :red, or RGB, HSL, etc. | Changes the colour of marked points |
| :linecolour | Symbol, e.g. :red, or RGB, HSL, etc. | Changes the colour of plotted lines |
| :fillcolour | Symbol, e.g. :red, or RGB, HSL, etc. | Changes the colour of filled regions |
| :layout | Tuple of two Integers, or e.g. grid() | Defines the layout of multiple combined plots |
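
Here is a minimal sketch of a few of these attributes working together, including :layout for combining plots. The functions plotted and the variable names left, right and combined are chosen only for illustration.

```julia
using Plots

xs = 0:0.1:10

# Two separate plots, each with its own attributes
left = plot(xs, sin.(xs), label = "sin(x)", title = "Sine", framestyle = :box)
right = plot(xs, cos.(xs), legend = false, title = "Cosine", linecolor = :red)

# Combine them side by side: 1 row, 2 columns, in an 800×300 pixel image
combined = plot(left, right, layout = (1, 2), size = (800, 300))
```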

A comprehensive list of attributes can be found by using the function plotattr() in the REPL. This gives a list of attributes, which you can move up and down with the arrow keys ↑ and ↓, and select with Enter ⮠, which will then give you a bit of information about the usage of that attribute. You can also get a list of attributes in one of four categories – :Series, :Subplot, :Plot, :Axis – by using that Symbol as an argument:

plotattr(:Axis)
Defined Axis attributes are:
discrete_values, draw_arrow, flip, foreground_color_axis, foreground_color_border, foreground_color_grid, foreground_color_guide, foreground_color_minor_grid, foreground_color_text, formatter, grid, gridalpha, gridlinewidth, gridstyle, guide, guide_position, guidefontcolor, guidefontfamily, guidefonthalign, guidefontrotation, guidefontsize, guidefontvalign, lims, link, minorgrid, minorgridalpha, minorgridlinewidth, minorgridstyle, minorticks, mirror, rotation, scale, showaxis, tick_direction, tickfontcolor, tickfontfamily, tickfonthalign, tickfontrotation, tickfontsize, tickfontvalign, ticks, unitformat, widen

Additionally, using the name of an attribute as a String input gives more detail on that specific attribute (note that this doesn’t work for all possible attributes, however):

plotattr("ticks")
:ticks

Tick values, (tickvalues, ticklabels), `:auto`/`true`, `:none`/`false`/`nothing` (ticks disabled), or `:native` (tells backend to calculate ticks by itself; good idea for interactive backends with mouse zooming).

Aliases: (:tick,).

Type: Union{Nothing, Bool, Symbol, Tuple{AbstractVector{Real}, AbstractVector{AbstractString}}, AbstractVector{Real}}.

`Axis` attribute, defaults to `auto`.

Let’s now put some of these into action:

tubeplot = plot(
    title = "Locations of London Underground stations",
    xlims = (-0.7, 0.3),
    xticks = -0.7:0.1:0.3,
    xlabel = "Longitude",
    ylims = (51.35, 51.75),
    yticks = 51.35:0.05:51.75,
    ylabel = "Latitude",
    legend = false
)
Convention

Long lines of code are difficult to read, so a function call that sets many attributes should be split over several lines, with one attribute per line.

This is clearly missing our data! Luckily, we can modify an earlier plot by using the mutating variant of a plotting function (e.g. plot! instead of plot), and giving the variable corresponding to that earlier plot as the first argument. Alternatively, using the mutating variant without specifying a plot modifies the last plot that was created (found by current()). We’ll use both of these methods to add our data onto the graph:

# Adds red circles at every lat/long
scatter!(
    tubeplot,
    longs,
    lats,
    markercolor = :white,
    markerstrokecolor = :red,
    markerstrokewidth = 2
)

# Adds blue rectangles at every lat/long
scatter!(
    longs,
    lats,
    markercolor = :blue,
    markershape = Shape([-2, -2, 2, 2], [-0.4, 0.4, 0.4, -0.4]),
    markerstrokewidth = 0
)
Figure 11.7: Locations of London Underground stations (but improved!)

An aspect of Plots that we haven’t mentioned is how it actually turns our data into a plot. This is done by the backend, which in general is a separate module to Plots, perhaps not even written in Julia, that can visualise data. Indeed, Plots is better described as a consistent interface unifying the backends such that we can talk to them all using the same language, rather than a plotting package in its own right.

The default backend is called GR, which comes prepackaged with Plots, but for different series types and customisation options, you may wish to choose a different backend. To do this, you’ll need to download the backend as a package, load it in with using, and then run a function to switch backend (for example plotly() changes the backend to Plotly, gr() changes it back to GR, and so on). For our purposes, we’re sticking to GR, since the syntax for using other backends is more or less identical, and all that changes is the look of the output.
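
As a short sketch of what switching looks like, the following selects GR explicitly and then asks Plots which backend is active. Only GR is used here, since it needs no extra installation; the call to Plots.backend_name() is included as a way of checking the current backend and is assumed to report it as a Symbol.

```julia
using Plots

# Select the GR backend explicitly (it is already the default)
gr()

# Ask Plots which backend is currently active
Plots.backend_name()
```

Any plots created after a switch are drawn by the newly selected backend.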

11.6 Saving plots

Once we’ve generated a plot like Figure 11.7, and we like the look of it, we may wish to save it separately as a file. While this would be possible by messing around with output streams as we did in Chapter 10, luckily Plots provides us with the function savefig that does this all for us.

savefig(tubeplot, "tubeplot.png")

The first argument to the function is the Plot that we want to save, and the second is the name that we want to give the file. Notice that we’ve included a file extension, in this case .png. Plots can infer from this the format we want the image saved in, and several such formats are supported (depending on backend). If no file extension is specified, PNG is the default. Indeed, we can also use the function png, which has the same effect as savefig with a png file extension:

# Equivalent to the above
png(tubeplot, "tubeplot")

The location where this will be saved by default can be found by running pwd(), which tells us the current working directory. To save to a different location, you can include the path to that location either relative to the current working directory, or in full. For instance, savefig(tubeplot, "figures/tubeplot.png") would save tubeplot as a PNG file called tubeplot.png, which lies inside a folder called figures within the current working directory. Bear in mind that this won’t create new folders that don’t already exist, so if the folder figures doesn’t exist, we’ll get an error.
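
One way to avoid that error is to create the folder before saving. The sketch below uses mkpath and joinpath from Base; the plot p and the file name example.png are stand-ins chosen for illustration.

```julia
using Plots

# A small stand-in plot; any Plot object would do here
p = plot(1:10, (1:10) .^ 2, legend = false)

# Create the folder (and any missing parents); does nothing if it already exists
mkpath("figures")

# joinpath builds the path with the right separator for the operating system
savefig(p, joinpath("figures", "example.png"))
```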

For more details on saving to files from Julia, refer to Chapter 10.

11.7 Alternatives to Plots

Plots is not the only package that can create graphical visualisations of data. Some alternatives include:

  • Colors can be used to create some visualisations as we saw in Chapter 9. A Matrix of colours can be interpreted as an image, although adding axes or annotations is not possible

  • Makie describes itself as a ‘data visualisation ecosystem’. It is similar to Plots in that it comes with many different backends, and may well serve as the successor to Plots in the years to come

  • Gadfly was previously available as a Plots backend, but now only functions as a standalone plotting package

  • VegaLite is a Julia interface to the Vega-Lite ‘visualisation grammar’, which describes graphs in a more human-readable, JSON-like form that can then be plotted

  • Instead of creating images, UnicodePlots keeps its outputs text-based within the REPL. Despite this limitation, it’s possible to create a wide variety of graphs

What’s more, Plots itself has a great deal of flexibility with its choice of backend. It may be that Plots with the default GR backend doesn’t suit your needs, but a different backend may be better.