Data Assignment #2 (due March 8, 2023 by 11:59pm)
This assignment addresses two related subjects: the fiscal impact of oil and gas in Alberta and the pipelines we use to transport our oil and gas.
Canada is an energy exporting country, and our energy exports are an important source of income to governments, in particular in Alberta.
Let’s start with the pipelines that transport some of those exports. The Canadian Energy Regulator produces pipeline profiles for the largest oil and gas pipelines in Canada, and also tracks other export-related data. The pipeline throughput and capacity data are available here. Crude oil export data sets are available here, while gas trade data are available here.
The US EIA also tracks crude oil imports from Canada and elsewhere, and often provides much more detail than the Canadian export data. Unfortunately, their new Application Programming Interface (API) is very cumbersome and so I’ve skipped teaching you that step and downloaded some data from their API for you to use for this assignment.
There are five deliverables for this assignment, and they build on skills we have already learned:
- a graph of throughput by grade and capacity on the Enbridge mainline ex-Gretna;
- a graph of throughput by grade and destination on the TransMountain pipeline;
- a graph of throughput and capacity on the TC Energy Canadian Mainline;
- a graph of US oil imports by origin (Canada vs Rest of the World) and refining region (PADDs 1 through 5);
- a graph of Alberta resource royalties over time.
To execute my versions of the deliverables that you see below, I’ve used the following packages. Use this as a guide to set up your document:
library(kableExtra)
library(readxl)
library(janitor)
library(tidyverse)
library(lubridate)
library(scales)
library(viridis)
library(cowplot)
You may also find it useful to have a look at the Functions Demo before you start this assignment.
# I made a downloader function for the pipeline profiles data
<-function(pipe_name="Enbridge-Mainline"){
get_pipe_data<-c("Alliance","Cochin","Enbridge-Mainline","Norman-Wells",
names"Keystone","MNP","Trans-Mountain","TQM","tcpl-mainline","Westcoast","ngtl")
<-"No data"
pipe_dataif(pipe_name %in% names)
{<-paste("https://www.cer-rec.gc.ca/open/energy/throughput-capacity/",pipe_name,"-throughput-and-capacity.csv",sep="")
file_name<-paste(pipe_name,".csv",sep="")
local_filedownload.file(file_name,local_file,mode="wb")
<- read_csv(file = local_file)%>%clean_names()%>%mutate(pipe_name=pipe_name)
pipe_data
}
pipe_data }
Deliverable 1: the Enbridge Mainline
The Enbridge Mainline moves more crude oil than all other systems combined. For this graph, I expect you to download data from the pipeline profiles and produce a graph in R that shows exports ex-Gretna by grade as well as capacity at that point. You’ll need to filter the data to make it useful for your graph, and you’ll need to combine an area plot and a line plot.
You should produce something like (but not necessarily identical to) this:
<-get_pipe_data()%>%filter(key_point=="ex-Gretna",direction_of_flow=="east") %>% mutate(
enb_datatrade_type=str_to_title(trade_type) ,
product=gsub("domestic light / ngl", "Light / NGL", product),
product=gsub("domestic heavy", "Domestic Heavy", product),
product=gsub("foreign light", "Foreign Light", product),
pair=paste(product," (",trade_type,")",sep=""))
#I used this because the capacity *used* to be separate data. Now it's combined, so see below for a better version
<-enb_data%>%group_by(date)%>%summarize(capacity=mean(available_capacity_1000_m3_d,na.rm=T))
enb_capacity
ggplot(enb_data) +
geom_area(aes(date,throughput_1000_m3_d,group = pair,fill=pair),position = "stack",color="black",linewidth=.25) +
geom_line(data=enb_capacity,aes(date,capacity,color="Available Capacity"),linewidth=.85, lty="21") +
scale_fill_manual("",values=viridis(n=4,alpha=1,begin=.7,end=0,option = "E",direction=-1))+
scale_x_date(name=NULL,date_breaks = "2 years", date_labels = "%b\n%Y",expand=c(0,0)) +
scale_color_manual("",values=c("black"))+
scale_y_continuous(expand = c(0, 0),
sec.axis = sec_axis( trans=~.*1/.16, name="Shipments (Monthly, Thousands of Barrels per Day)")) +
guides(alpha = guide_legend(override.aes = list(fill=viridis(n=3,alpha=1,begin=.8,end=0,option = "E",direction=-1)[1]),order = 10) ,
fill = guide_legend(order = 1) )+
theme_classic() +
theme(panel.border = element_blank(),
panel.grid = element_blank(),
panel.grid.major.y = element_line(color = "gray",linetype="dotted"),
axis.line.x = element_line(color = "gray"),
axis.line.y = element_line(color = "gray"),
axis.text = element_text(size = 12),
axis.text.x = element_text(margin = margin(t = 10)),
axis.title = element_text(size = 12),
#axis.label.x = element_text(size=20,vjust=+5),
plot.subtitle = element_text(size = 12,hjust=0.5),
plot.caption = element_text(face="italic",size = 12,hjust=0),
legend.key.width=unit(2,"line"),
legend.position = "bottom",
legend.box = "horizontal",
#legend.direction = "horizontal",
#legend.box = "horizontal",
legend.text = element_text(size = 12),
plot.title = element_text(hjust=0.5,size = 14))+
labs(y="Shipments (Monthly, Thousands of Cubic Metres per Day)",x="Date",
title=paste("Enbridge Canadian Mainline Shipments by Product",sep=""),
caption="Source: CER Data for Enbridge Mainline (ex-Gretna), graph by Andrew Leach.")
#same graph, but using the fact that capacity is embedded in the data
<-get_pipe_data()%>%filter(key_point=="ex-Gretna",direction_of_flow=="east") %>% mutate(
enb_datatrade_type=str_to_title(trade_type) ,
product=gsub("domestic light / ngl", "Light / NGL", product),
product=gsub("domestic heavy", "Domestic Heavy", product),
product=gsub("foreign light", "Foreign Light", product),
pair=paste(product," (",trade_type,")",sep=""))
ggplot(enb_data) +
geom_area(aes(date,throughput_1000_m3_d,group = pair,fill=pair),position = "stack",color="black",linewidth=.25) +
geom_line(aes(date,available_capacity_1000_m3_d,color="Available Capacity"),linewidth=.85, lty="21") +
scale_fill_manual("",values=viridis(n=4,alpha=1,begin=.7,end=0,option = "E",direction=-1))+
scale_x_date(name=NULL,date_breaks = "2 years", date_labels = "%b\n%Y",expand=c(0,0)) +
scale_color_manual("",values=c("black"))+
scale_y_continuous(expand = c(0, 0),
sec.axis = sec_axis( trans=~.*1/.16, name="Shipments (Monthly, Thousands of Barrels per Day)")) +
guides(alpha = guide_legend(override.aes = list(fill=viridis(n=3,alpha=1,begin=.8,end=0,option = "E",direction=-1)[1]),order = 10) ,
fill = guide_legend(order = 1) )+
theme_classic() +
theme(panel.border = element_blank(),
panel.grid = element_blank(),
panel.grid.major.y = element_line(color = "gray",linetype="dotted"),
axis.line.x = element_line(color = "gray"),
axis.line.y = element_line(color = "gray"),
axis.text = element_text(size = 12),
axis.text.x = element_text(margin = margin(t = 10)),
axis.title = element_text(size = 12),
#axis.label.x = element_text(size=20,vjust=+5),
plot.subtitle = element_text(size = 12,hjust=0.5),
plot.caption = element_text(face="italic",size = 12,hjust=0),
legend.key.width=unit(2,"line"),
legend.position = "bottom",
legend.box = "horizontal",
#legend.direction = "horizontal",
#legend.box = "horizontal",
legend.text = element_text(size = 12),
plot.title = element_text(hjust=0.5,size = 14))+
labs(y="Shipments (Monthly, Thousands of Cubic Metres per Day)",x="Date",
title=paste("Enbridge Canadian Mainline Shipments by Product",sep=""),
caption="Source: CER Data for Enbridge Mainline (ex-Gretna), graph by Andrew Leach.")
Hints:
To add the second y axis, use
scale_y_continuous(expand = c(0, 0),sec.axis = sec_axis( trans=~.*1/.16, name="Shipments (Monthly, Thousands of Barrels per Day)"))
which translates the units using the multiplier in trans=~;
You’ll probably want to download a csv and look at it. You’ll also likely want to do some fixing of cases. You can do this in R or in your CSV (but it’s easier in R);
There are a couple of ways to make a variable out of two existing varibles for graphing. We’ve previously used the
interaction()
function, e.g.group=interaction(product,trade_type)
, or you can also use paste, as I did in another part of the functions demo. For example, you could create a new variable usingmutate(pair=paste(product," (",trade_type,")",sep=""))
which will create, for example, a variable like Light / NGL (Export);
If you’re getting a weird looking graph, you’re probably graphing by year. Check your data to see what you really should be using.
To add a capacity line, there are a couple of ways to do it, but I’ll show you the way I did it here. First, make a separate data set for capacity, and use the mean in each month:
enb_capacity<-enb_data%>%group_by(date)%>%summarize(capacity=mean(available_capacity_1000_m3_d,na.rm=T))
And then, you can add a line into your graph using those data. This let’s me show you a neat trick too - if you name the colour in a ggplot aesthetic, this will carry through to your legends.
geom_line(data=enb_capacity,aes(date,capacity,color="Available Capacity"),linewidth=.85, lty="21") + scale_color_manual("",values=c("black"))
You should also note that, for a time before the opening of the Dakota Access Pipeline, there was some US-produced crude moving east on the Mainline.
Deliverable 2: the Trans-Mountain Pipeline
The Trans-Mountain pipeline has been around since the 1950s and is the only major crude oil pipeline to cross the continental divide. Trans-Mountain is also unique in that it ships both crude oil and refined products in batches (other pipelines ship in batches too, just not with refined products in the mix). The following description is taken from the Trans-Mountain website”
Transmountain also has three delivery points: exports to the US via Sumas, deliveries to domestic refining in Burnaby, and a port delivery point at Westridge.
To see what gets delivered where on TransMountain, you should produce something like (but not necessarily identical to) this for TransMountain:
<-get_pipe_data(pipe_name="Trans-Mountain")%>% filter(key_point!="system")%>%mutate(
tm_datatrade_type=str_to_title(trade_type) ,
product=str_to_title(product),
pair=paste(key_point," (",trade_type,")",sep=""))
ggplot(tm_data,aes(date,throughput_1000_m3_d,group = product,fill=product)) +
geom_area(position = "stack",,color="black",linewidth=.25) +
scale_fill_manual("",values=viridis(n=4,alpha=1,begin=.7,end=0,option = "E",direction=-1))+
scale_x_date(name=NULL,date_breaks = "2 years", date_labels = "%b\n%Y",expand=c(0,0)) +
scale_y_continuous(expand = c(0, 0),
sec.axis = sec_axis( trans=~.*1/.16, name="Shipments (Monthly, Thousands of Barrels per Day)")) +
guides(alpha = guide_legend(override.aes = list(fill=viridis(n=3,alpha=1,begin=.8,end=0,option = "E",direction=-1)[1]),order = 10) ,
fill = guide_legend(order = 1) )+
facet_wrap(~pair,ncol = 1, scales = "free_y")+
theme_classic() +
theme(panel.border = element_blank(),
panel.grid = element_blank(),
panel.grid.major.y = element_line(color = "gray",linetype="dotted"),
axis.line.x = element_line(color = "gray"),
axis.line.y = element_line(color = "gray"),
axis.text = element_text(size = 12),
axis.text.x = element_text(margin = margin(t = 10)),
axis.title = element_text(size = 12),
#axis.label.x = element_text(size=20,vjust=+5),
plot.subtitle = element_text(size = 12,hjust=0.5),
plot.caption = element_text(face="italic",size = 12,hjust=0),
legend.key.width=unit(2,"line"),
legend.position = "bottom",
legend.box = "vertical",
#legend.direction = "horizontal",
#legend.box = "horizontal",
legend.text = element_text(size = 12),
plot.title = element_text(hjust=0.5,size = 14))+
labs(y="Shipments (Monthly, Thousands of Cubic Metres per Day)",x="Date",
title=paste("Trans-Mountain Pipeline Shipments by Product and Destination",sep=""),
caption="Source: CER Data for Trans-Mountain, graph by Andrew Leach.")
Hint: use
facet_wrap(~key_point,ncol = 1, scales = "free_y")
to allow different y-axes on each of the plots, and to stack the plots vertically if you like. You are not required to present them this way, but I do want to see a three-facet plot
Deliverable 3: US imports of Canadian crude
We haven’t talked a lot about the US market, but it’s worth having a sense of how the US relies on imports. Using data that I’ve provided for you here, you should be able to reproduce a graph that looks like this one for use imports by grade and refining region. Note that refining regions give you a better sense of where demand lies, rather than a focus on the import port or pipeline border crossing point.
<-read_csv("assignment_2_data.csv")
assignment_data_conf
<-
import_plotggplot(
%>%
assignment_data_conf#mutate(origin_name=as_factor(origin_name))%>%
#mutate(origin_name=fct_relevel(origin_name,"ROW"))%>%
#mutate(origin_name=fct_recode(origin_name,"Rest of the World"="ROW")
mutate(origin_name=gsub("ROW","Rest of the World",origin_name)
+
))geom_area(aes(period,quantity,group=origin_name,fill=origin_name),color="black",linewidth=0.25,position="stack")+
scale_x_date(expand=c(0,0))+
scale_y_continuous(breaks=pretty_breaks(), expand=c(0,0))+
#look here, adding in the facet wrap and the labeler
facet_grid(cols=vars(destination_name),rows=vars(grade_name))+
expand_limits(y=c(0,140))+
#scale_fill_manual("",values=c("darkblue","red"))+
scale_fill_manual("",values=c("red","darkblue"))+
theme_classic() +
theme(panel.border = element_blank(),
panel.grid = element_blank(),
panel.grid.major.y = element_line(color = "gray",linetype="dotted"),
axis.line.x = element_line(color = "gray"),
axis.line.y = element_line(color = "gray"),
axis.text = element_text(size = 12),
axis.text.x = element_text(margin = margin(t = 10)),
axis.title = element_text(size = 12),
#axis.label.x = element_text(size=20,vjust=+5),
plot.subtitle = element_text(size = 12,hjust=0.5),
plot.caption = element_text(face="italic",size = 12,hjust=0),
legend.key.width=unit(2,"line"),
legend.position = "bottom",
legend.box = "vertical",
#legend.direction = "horizontal",
#legend.box = "horizontal",
legend.text = element_text(size = 12),
plot.title = element_text(hjust=0.5,size = 14))+
theme(text = element_text(size=16))+
theme(legend.position = "bottom", legend.box = "vertical",
legend.margin=margin(t = 0,b=0, unit='cm'))+
#t, r, b, l (To remember order, think trouble)
theme(plot.margin = unit(c(1,1,0.2,1), "cm"))+
guides(color = guide_legend(keywidth = unit(1.6,"cm"),nrow = 1),
linetype = guide_legend(keywidth = unit(1.6,"cm"),nrow = 1))+
labs(y="Imports (thousands of barrels per day)",x="",
title="US Crude Imports from Canada and the Rest of the World (ROW)",
subtitle="Imports by Refining PADD and Grade",
caption = paste("Data via US Energy Information Administration, current to ",format(max(assignment_data$period),"%B %Y"),". Graph by Andrew Leach.",sep="")
+
)NULL
# I use this one for my class deck: it's the same except I re-order the origin names using fct_relevel to put Canada on the bottom.
<-
import_plot_factorsggplot(
%>%
assignment_data_confmutate(origin_name=as_factor(origin_name))%>%
mutate(origin_name=fct_relevel(origin_name,"ROW"))%>%
mutate(origin_name=fct_recode(origin_name,"Rest of the World"="ROW")
#mutate(origin_name=gsub("ROW","Rest of the World",origin_name)
+
))geom_area(aes(period,quantity,group=origin_name,fill=origin_name),color="black",linewidth=0.25,position="stack")+
scale_x_date(expand=c(0,0))+
scale_y_continuous(breaks=pretty_breaks(), expand=c(0,0))+
#look here, adding in the facet wrap and the labeler
facet_grid(cols=vars(destination_name),rows=vars(grade_name))+
expand_limits(y=c(0,140))+
scale_fill_manual("",values=c("darkblue","red"))+
#scale_fill_manual("",values=c("red","darkblue"))+
theme_classic() +
theme(panel.border = element_blank(),
panel.grid = element_blank(),
panel.grid.major.y = element_line(color = "gray",linetype="dotted"),
axis.line.x = element_line(color = "gray"),
axis.line.y = element_line(color = "gray"),
axis.text = element_text(size = 12),
axis.text.x = element_text(margin = margin(t = 10)),
axis.title = element_text(size = 12),
#axis.label.x = element_text(size=20,vjust=+5),
plot.subtitle = element_text(size = 12,hjust=0.5),
plot.caption = element_text(face="italic",size = 12,hjust=0),
legend.key.width=unit(2,"line"),
legend.position = "bottom",
legend.box = "vertical",
#legend.direction = "horizontal",
#legend.box = "horizontal",
legend.text = element_text(size = 12),
plot.title = element_text(hjust=0.5,size = 14))+
theme(text = element_text(size=16))+
theme(legend.position = "bottom", legend.box = "vertical",
legend.margin=margin(t = 0,b=0, unit='cm'))+
#t, r, b, l (To remember order, think trouble)
theme(plot.margin = unit(c(1,1,0.2,1), "cm"))+
guides(color = guide_legend(keywidth = unit(1.6,"cm"),nrow = 1),
linetype = guide_legend(keywidth = unit(1.6,"cm"),nrow = 1))+
labs(y="Imports (thousands of barrels per day)",x="",
title="US Crude Imports from Canada and the Rest of the World (ROW)",
subtitle="Imports by Refining PADD and Grade",
caption = paste("Data via US Energy Information Administration, current to ",format(max(assignment_data$period),"%B %Y"),". Graph by Andrew Leach.",sep="")
+
)NULL
ggsave("../../slides/oil_transp/import_plot.png",plot = import_plot_factors,dpi=400,width=14,height=7,bg="white")
import_plot
Hints:
Think about your facet_grid elements (rows and cols), your grouping of data, and your fill aesthetic.
Deliverable 4: The TC Energy Canadian Mainline
The first pipeline controversy in Canadian political history was not about an oil pipeline, but about a gas pipeline. The Pipeline Debate of 1956 eventually led to the construction of the Canadian mainline. The Mainline also became the subject of one of the most important regulatory decisions in recent history, as the NEB refused to allow the pipeline to enter into a downward spiral and capped tolls and directed TransCanada to find new business to improve the viability of its asset. Both tolls and throughputs are available on the pipeline profile site.
To see how flows have changed on the Mainline over time, and to see when the regulatory changes really took hold, you should produce a graph like this:
options(timeout=300)
if(!file.exists("mainline_data.rdata"))
{<-get_pipe_data(pipe_name="tcpl-mainline")%>%
mainline_datafilter(key_point=="Prairies",direction_of_flow=="east")
save(mainline_data,file = "mainline_data.rdata")
}load(file = "mainline_data.rdata")
<-ggplot(mainline_data%>%group_by(month,year)%>%summarize(throughput=mean(throughput_1000_m3_d,na.rm=T),
throughput_plotcapacity=mean(capacity_1000_m3_d,na.rm=T)
%>%
)mutate(date=ymd(paste(year,month,1,sep="-")))%>%
filter(date>=ymd("2008-01-01"))) +
geom_area(aes(date,throughput/1000,fill="Throughput"),linewidth=.25) +
geom_line(aes(date,capacity/1000,color="Available Capacity"),linewidth=.85, lty="21") +
scale_fill_manual("",values=viridis(n=4,alpha=1,begin=.7,end=0,option = "E",direction=-1))+
scale_x_date(name=NULL,date_breaks = "1 year", date_labels = "%b\n%Y",expand=c(0,0)) +
scale_color_manual("",values=c("black"))+
scale_y_continuous(expand = c(0, 0)) +
guides(alpha = guide_legend(override.aes = list(fill=viridis(n=3,alpha=1,begin=.8,end=0,option = "E",direction=-1)[1]),order = 10) ,
fill = guide_legend(order = 1) )+
theme_classic() +
theme(panel.border = element_blank(),
panel.grid = element_blank(),
panel.grid.major.y = element_line(color = "gray",linetype="dotted"),
axis.line.x = element_line(color = "gray"),
axis.line.y = element_line(color = "gray"),
axis.text = element_text(size = 12),
#axis.label.x = element_text(size=20,vjust=+5),
plot.subtitle = element_text(size = 12,hjust=0.5),
#plot.caption = element_text(face="italic",size = 12,hjust=0),
legend.key.width=unit(2,"line"),
legend.position = "bottom",
legend.box = "horizontal",
#legend.direction = "horizontal",
#legend.box = "horizontal",
plot.caption=element_blank(),
axis.text.x = element_blank(),
axis.title.x = element_blank(),
legend.text = element_text(size = 12),
legend.title = element_text(size = 12),
plot.title = element_text(hjust=0.5,size = 14))+
labs(y="Throughput or Capacity\n(Millions of Cubic Metres per Day)",x="Date",
title=paste("TCPL Canadian Mainline Throughput and Capacity (Prairies)",sep=""),
caption="Source: CER Data for TCPL Canadian Mainline (Prairies), graph by Andrew Leach.")
#tolls
if(!file.exists("tcpl-tolls.csv"))
download.file("https://www.cer-rec.gc.ca/open/tolls/tcpl-tolls.csv",destfile ="tcpl-tolls.csv",mode="wb" )
<-read_csv("tcpl-tolls.csv")%>%clean_names()%>%
tcpl_tollsfilter(grepl("Empress",path))%>%
filter(!grepl("Emerson",path))%>%
mutate(month=month(date),year=year(date),
path=gsub("Empress to ","",path))%>%
group_by(month,year,path)%>%summarize(toll=mean(toll,na.rm=T))%>%
mutate(date=ymd(paste(year,month,1,sep="-")))%>%
filter(date>=ymd("2008-01-01")#,
#date<=max(mainline_data$date)
)
<-ggplot(tcpl_tolls)+
tollsgeom_line(aes(date,toll,group=path,color=path),linewidth=.85, lty="solid") +
scale_color_manual("Destination:",values=colors_tableau10())+
scale_x_date(name=NULL,date_breaks = "1 years", date_labels = "%b\n%Y",expand=c(0,0)) +
scale_y_continuous(expand = c(0, 0))+
expand_limits(y=0)+
theme_classic() +
theme(panel.border = element_blank(),
panel.grid = element_blank(),
panel.grid.major.y = element_line(color = "gray",linetype="dotted"),
axis.line.x = element_line(color = "gray"),
axis.line.y = element_line(color = "gray"),
axis.text = element_text(size = 12),
axis.text.x = element_text(margin = margin(t = 10)),
axis.title = element_text(size = 12),
#axis.label.x = element_text(size=20,vjust=+5),
plot.subtitle = element_text(size = 12,hjust=0.5),
plot.caption = element_text(face="italic",size = 12,hjust=0),
legend.key.width=unit(2,"line"),
legend.position = "bottom",
legend.box = "horizontal",
#legend.direction = "horizontal",
#legend.box = "horizontal",
legend.text = element_text(size = 12),
legend.title = element_text(size = 12),
plot.title = element_text(hjust=0.5,size = 14))+
labs(y="Pipeline Tolls ($/GJ)",x="Date",
title=paste("TCPL Canadian Mainline Tolls from Empress",sep=""),
caption="Source: CER data for TCPL Canadian Mainline, graph by Andrew Leach.")
plot_grid(throughput_plot, tolls, ncol = 1, align = "v")
Hints:
The Mainline data files are large, so depending on the speed of your conneciton, you may get timeout errors. If you do, just add the command options(timeout=300) to your code before you call the download command. You likely also want to make sure you’re not downloading them every time you run your code, so use an if(!file.exists..) in your code as we’ve done before;
You’re going to want to use either plotgrid or cowplot to combine your plots. I used
plot_grid(throughput_plot, tolls, ncol = 1, align = "v")
;
You’ll want to use a group_by and summarize to make average monthly plots. For the toll data, I used
group_by(month,year,path)%>%summarize(toll=mean(toll,na.rm=T))
andmutate(date=ymd(paste(year,month,1,sep="-")))
to accomplish this.
Use theme commands to suppress the x axis names and titles on your top graph;
Filter the data to keep the same starting date in both cases.
Deliverable 5: Alberta Resource Royalties
Alberta’s budget comes out this week, and resource royalties are always a big part of the story. In the last full fiscal year, resource royalties hit record levels and the forecast of future royalties will drive government spending and savings decisions. To have some perspective on these decisions, let’s make a plot of Alberta resource royalties by product over time. Use the newest data here to create a graph similar to the following:
options(timeout=300)
#tolls
if(!file.exists("royalties.xlsx"))
download.file("https://open.alberta.ca/dataset/382b7a1e-9c34-47c7-9531-38e67ca5441d/resource/4cb804eb-1844-4316-9849-4ace8b6f2355/download/em-alberta-historical-royalty-revenue-2022-2023.xlsx",destfile ="royalties.xlsx",mode="wb" )
<-read_excel("royalties.xlsx",range="B26:BC35")%>%clean_names()%>%slice(-1)%>%slice(-c(7,8))%>%
royaltiesrename(product=1)%>%
pivot_longer(-product,names_to = "year",values_to = "royalties")%>%
mutate(year = as.numeric(gsub("x","",str_split(year, "_", simplify = TRUE)[ , 1])))
ggplot(royalties%>%mutate(product=as_factor(product)))+
geom_area(aes(year,royalties/1000,fill=product),linewidth=.25) +
scale_fill_manual("",values=colors_ua10())+
#scale_x_discrete(name=NULL,expand=c(0,0)) +
scale_color_manual("",values=c("black"))+
scale_y_continuous(expand = c(0, 0)) +
guides(alpha = guide_legend(override.aes = list(fill=viridis(n=3,alpha=1,begin=.8,end=0,option = "E",direction=-1)[1]),order = 10) ,
fill = guide_legend(order = 1) )+
theme_classic() +
theme(panel.border = element_blank(),
panel.grid = element_blank(),
panel.grid.major.y = element_line(color = "gray",linetype="dotted"),
axis.line.x = element_line(color = "gray"),
axis.line.y = element_line(color = "gray"),
axis.text = element_text(size = 12),
#axis.label.x = element_text(size=20,vjust=+5),
plot.subtitle = element_text(size = 12,hjust=0.5),
#plot.caption = element_text(face="italic",size = 12,hjust=0),
legend.key.width=unit(2,"line"),
legend.position = "bottom",
legend.box = "horizontal",
#legend.direction = "horizontal",
#legend.box = "horizontal",
plot.caption=element_blank(),
#axis.text.x = element_blank(),
#axis.title.x = element_blank(),
legend.text = element_text(size = 12),
legend.title = element_text(size = 12),
plot.title = element_text(hjust=0.5,size = 14))+
labs(y="Royalty Revenue (CA$ Billions)",x="Year",
title=paste("Alberta Royalty Revenue",sep=""),
caption="Source: Government of Alberta data, graph by Andrew Leach.")
Hints:
To turn the fiscal years into single years, I used this command:
mutate(year = as.numeric(gsub("x","",str_split(year, "_", simplify = TRUE)[ , 1])))
. This splits the string at the “”, and then replaces the x with a blank. So, for example, if your year entry was ”x1970_71”, you run this command and in first grabs the first part of the outcome of splitting the strong at the ”” (x1970) and then uses gsub to replace the x with a blank.
Other than altering how the fiscal years are read in, you should not need any other new commands. You do not need to replicate the U of A colours.
RMD File and HTML/PDF Preparation
As before, use the basic RMD file to complete this (and future) assignments, just rename it assignment 2. You only need to submit the HTML.