Data Assignment 1
For what you see below, I’ve used the following packages and settings. Use this as a guide to set up your document:
library(kableExtra)
library(readxl)
library(janitor)
library(tidyverse)
library(lubridate)
library(scales)
library(viridis)
library(ggthemes)
options(scipen = 999)
The Canadian Energy Regulator produces a semi-annual-ish report called Canada’s Energy Future which provides scenario-based analysis of global and local energy issues and the effects of these trends and issues on Canada’s Energy Economy.
For the first part of your first data assignment, I’m going to ask you to produce a few graphs and tables based on publicly-available data from the 2023 Canada’s Energy Future Report. There are multiple ways for you to access the data for this report, but I’d recommend going directly to the Open Government portal to access the data sets to complete the main deliverables for this assignment. You’ll need to figure out how to access the data you need and how to read it successfully into R.
The last graph on the assignment is more of a current events challenge. I want you to make a graph of Canadian oil exports by destination using these data from the Canadian International Merchandise Trade Web Application. To make your life a little easier, I have processed the data into a nicer file available for you here.
This assignment tests your skills in filtering, grouping, and presenting data in table and graph form. It also asks you to draw some basic conclusions from the data.
There are five (5) graded deliverables in this assignment.
I am going to ask you for four (4) specific R outputs in this assignment, each of which you can complete using coding techniques that you’ve already seen, with some minor modifications. Each of these is worth 2 points, graded as follows: 0 for not attempted, 1 for attempted with reasonable effort but not completed, 1.5 for satisfactory completion, 2 for excellent work.
Next, in addition to these specific visualization deliverables, which you can complete using code that we’ve already used during the term, I’m going to ask you to answer a question on the data. Here, again, I’ll grade you on a scale of 0 for not attempted, 1 for attempted with reasonable effort but not completed, 1.5 for a solid explanation with no errors, and 2 for a compelling explanation of the implications of the data.
Deliverable 1
The first deliverable for this assignment is a simple table, but based on filtered data. I would like you to filter your prices data to keep only Western Canadian Select (WCS) prices for the Canada Net-Zero scenario, and only for years that are multiples of 5, starting from 2005 (2005, 2010, 2015, …, 2045, 2050).
Year | Price ($US 2022/bbl) |
---|---|
2005 | 54.33 |
2010 | 87.08 |
2015 | 42.43 |
2020 | 29.61 |
2025 | 62.33 |
2030 | 49.00 |
2035 | 48.00 |
2040 | 47.00 |
2045 | 46.00 |
2050 | 45.00 |
I’ll give you a couple of hints for how to make this happen:
- So far, we’ve been using xls or xlsx files. Depending on how you access the data, you may need a read_csv or read.csv command if you’re using a CSV data file. Also, make sure you don’t give the file a different extension when you download it (e.g. don’t set your destfile="foo.xlsx" if you’re downloading a CSV file; use destfile="foo.csv").
- If you want to force a number to have two decimal places, you can use format(round(value, 2), nsmall = 2), but remember that it will create a character string, not a numerical value, once you do that. Test it out: type format(round(20.2565, 2), nsmall = 2) and you’ll get back “20.26”, a character string. Note that nsmall only sets the minimum number of decimal places, so format(20.2565, nsmall = 2) on its own returns “20.2565”; the round() does the rounding.
- I don’t want you to get caught up on styling your tables, so here are the three lines of code that I used to create the table itself above:
kbl(escape = FALSE,table.attr = "style='width:80%;'",digits=2,align=rep('c', 2)) %>%
kable_styling(fixed_thead = T,bootstrap_options = c("hover", "condensed","responsive"),full_width = T)%>%
add_header_above(header = c("Western Canadian Select (WCS) prices in the Canada Net-Zero scenario of the Canadian Energy Regulator's Canada's Energy Future (2023) report"=2))
- R will calculate a modulo (remainder) using %%, so if you were to enter something like 2025 %% 5, R would return 0, which you can use to design your filtering for years that divide evenly by 5.
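To make the modulo and formatting hints concrete, here is a minimal sketch on toy data. The column names (year, value) and every number in toy_prices are placeholders I made up for illustration; they are not taken from the CER data set, and your own column names may differ.

```r
library(dplyr)

# Toy data: placeholder years and prices, NOT values from the CER report
toy_prices <- tibble(
  year  = 2005:2012,
  value = c(50.5, 60, 70.256, 80, 90, 85.126, 75, 65)
)

toy_table <- toy_prices %>%
  # keep only years that divide evenly by 5 (remainder of 0)
  filter(year %% 5 == 0) %>%
  # round first, then pad to two decimals; the result is a character column
  mutate(price = format(round(value, 2), nsmall = 2))

toy_table
```

The price column is now text, so do any arithmetic or sorting on value before you format it.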
Deliverable 2
The second deliverable for this assignment is a graph of oil prices, which you’ll again have to base on filtered data. I’d like you to graph Brent, WTI, and WCS prices for the Global Net-zero scenario over time, beginning from 2015.
I’ll give you a couple of hints for how to make this happen:
- I used gsub to strip the 2022 US $/bbl from each of the variable names, to clean up the graph a bit. You can see a little bit of help for how to do that here. I used variable = gsub(" - 2022 US\\$/bbl","",variable) to make that happen. Dollar signs are generally used as indicators for math in markdown, so if you’re using a dollar sign in your text, you need to lead it with a \ like this: \$. In gsub, you need to lead special symbols such as $ with \\ to tell it to look for the literal symbol; otherwise it will read the symbol as part of what’s called a regular expression.
- I used expand_limits(y=c(0,105)) to make my axes look nice, and scale_color_viridis("",option = "A",discrete=T,begin = 0,end = .8, direction=-1) for the color palette. I’ve also used theme(legend.position = "bottom") for the position of the legend.
- You will likely need to change the margins around your plot so that things aren’t getting cut off. Use theme(plot.margin = unit(c(1,1,1,1), "cm")); the four values are in order top, right, bottom, left (think trouble: t r b l).
- If you want to filter your data to include multiple items, use filter(variable %in% c("string 1","string 2","string 3")). The c() creates a vector of elements, and %in% tells filter to keep the rows where the variable is an element of that set. If you wanted to eliminate any variable that is an element of the set, you could use filter(!variable %in% c("string 1","string 2","string 3")), where the ! acts as a ‘not’ symbol.
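Here is a small sketch of the gsub and %in% hints working together. The series labels and values in toy_long are assumptions made for illustration (I am guessing at labels of the form "Brent - 2022 US$/bbl" from the pattern above); check the actual names in your data before relying on them.

```r
library(dplyr)

# Toy data: assumed label format and placeholder values, NOT the CER data
toy_long <- tibble(
  variable = c("Brent - 2022 US$/bbl",
               "WTI - 2022 US$/bbl",
               "Some other series - 2022 US$/MMBtu"),
  value = c(1, 2, 3)
)

cleaned <- toy_long %>%
  # \\$ matches a literal dollar sign instead of the regex end-of-string anchor
  mutate(variable = gsub(" - 2022 US\\$/bbl", "", variable)) %>%
  # keep only rows whose label is one of the listed strings
  filter(variable %in% c("Brent", "WTI"))

# Alternative: fixed = TRUE treats the whole pattern as plain text, no escaping
same_result <- gsub(" - 2022 US$/bbl", "", toy_long$variable, fixed = TRUE)
```

Either form of gsub works; fixed = TRUE can be easier to read when the pattern contains no real regular expression.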
Deliverable 3
The third deliverable for this assignment should be easy if you got the last one. I want you to graph WCS prices for each of the report’s scenarios over time, again starting from 2015.
You should only need one hint for this one: in this case, your observation groups are going to be the three scenarios, so you need something like geom_line(aes(year,value,group=scenario,color=scenario),linewidth=.65)
to get the graph to separate the three series. I also used mutate(scenario=as_factor(scenario))
to order the scenarios in the same order as they appear in the data, and the same color scale as the previous graph to set the line colors here. You don’t have to replicate all of these elements exactly, and I encourage you to do what works for you.
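As a rough illustration of the grouping and ordering ideas above, the sketch below uses a toy data frame. The scenario labels and all values are placeholders I chose for the example, not numbers from the report, and the column names are assumptions.

```r
library(ggplot2)
library(forcats)

# Toy data: three illustrative scenarios with placeholder values
toy <- data.frame(
  year     = rep(2015:2017, times = 3),
  scenario = rep(c("Global Net-zero", "Canada Net-zero", "Current Measures"),
                 each = 3),
  value    = c(40, 42, 44, 45, 47, 49, 50, 52, 54)
)

# as_factor() keeps levels in order of first appearance in the data;
# base factor() would sort them alphabetically instead
toy$scenario <- as_factor(toy$scenario)
appearance_order   <- levels(toy$scenario)
alphabetical_order <- levels(factor(as.character(toy$scenario)))

# One line per scenario: group splits the series, color maps the legend
p <- ggplot(toy) +
  geom_line(aes(year, value, group = scenario, color = scenario),
            linewidth = 0.65) +
  theme(legend.position = "bottom")
```

The legend follows the factor levels, so the as_factor() step is what controls the order you see under the plot.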
Deliverable 4
I have processed the CIMT data for you here. Use these data to produce an area plot like the one shown below:
I’ll give you a couple of hints for how to make this happen:
- You don’t have to do everything exactly as I have. Make the graph your own.
- If you want to change the order of the trade route categories, use mutate(trade_route=as.factor(trade_route)) and mutate(trade_route=fct_relevel(trade_route,"SK to the United States",after=2)).
- You might need to change the margins around your plot so that things aren’t getting cut off. Use theme(plot.margin = unit(c(1,1,1,1), "cm")); the four values are in order top, right, bottom, left (think trouble: t r b l).
- I used the combination of a geom_area(...,position="stack"), where … is your usual graph aesthetics, theme_economist_white(), and a scale_fill_manual("",values=c(list of colours)) to make my plot. You do not have to mirror my graph exactly, but try to do something that looks good!
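To see what the after argument of fct_relevel does, here is a minimal sketch. Only "SK to the United States" comes from the hint above; the other category names ("Route A", "Route B", "Route C") are hypothetical stand-ins for whatever trade routes appear in your file.

```r
library(forcats)

# Hypothetical categories; only "SK to the United States" is from the hint
trade_route <- as.factor(c("Route A", "Route B", "Route C",
                           "SK to the United States"))

# as.factor() orders levels alphabetically
before <- levels(trade_route)

# after = 2 moves the named level so that it sits right after the second level
trade_route <- fct_relevel(trade_route, "SK to the United States", after = 2)
after_relevel <- levels(trade_route)
```

In a stacked area plot the level order sets the stacking order, so this is how you decide which category sits next to which.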
Deliverable 5
What does this graph tell you about how pending US tariffs might affect Canadian oil production values? (200 words, maximum)
RMD File and HTML/PDF Preparation
I have made you a basic RMD file for your use in completing this (and future) assignments.
Before you start making any changes to the markdown, test your ability to knit to html with this file, and make sure you can make an html file before you proceed.
You need only submit an html file.