Experimental Economics: Data Workflow
Data Presentation
1 Data Representation
1.1 A Grammar for graphics
- Wickham (2010) defines a layered grammar of graphics
- Building a graph from multiple layers of data
- The main layers of a graph are
- data and aesthetic mappings
- geometry objects
- scales
- facet specification
- In addition we may have
- statistical transformations
- coordinate system
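As a rough sketch of how the layers listed above could map onto ggplot2 calls (ggplot2 is introduced in the next section). The data frame `df` and its column `group` are invented here purely for illustration, and the choice of a linear smoother is an assumption, not part of the original material.

```r
library(ggplot2)
library(tibble)

# Hypothetical data, invented for this illustration only
df <- tibble(x = 1:10, y = x^2, group = rep(c("A", "B"), 5))

p <- ggplot(data = df, mapping = aes(x = x, y = y)) +       # data and aesthetic mappings
  geom_point() +                                             # geometry object
  scale_y_continuous(breaks = seq(0, 100, 25)) +             # scale
  facet_wrap(~group) +                                       # facet specification
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +  # statistical transformation
  coord_cartesian(xlim = c(0, 10))                           # coordinate system
p
```

Each `+` adds one layer, so a layer can be removed or swapped without touching the others.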
1.2 ggplot
- We rely on the library ggplot2 to provide a graphical representation of data
- Provide a data frame in the form of a tibble
- Define the aesthetic mapping gathered from the data frame
- x: x-dimension
- y: y-dimension
- fill: color to fill the graph
- color: color of the graph
- size: dimension of graph elements
- label: labels in the graph
ggplot(data=DATA, mapping=aes(x=X, y=Y, ...))
- This “sets the ground” for the graph
- Still need to specify the exact “geometry” of the graph
dt = tibble(x=1:10,y=x^2)
dt
# A tibble: 10 × 2
x y
<int> <dbl>
1 1 1
2 2 4
3 3 9
4 4 16
5 5 25
6 6 36
7 7 49
8 8 64
9 9 81
10 10 100
ggplot(data=dt, aes(x=x,y=y))
1.3 Geometry
1.3.1 geom_point()
ggplot(data=dt, aes(x=x,y=y))+
geom_point()
1.3.2 geom_line()
ggplot(data=dt, aes(x=x,y=y))+
geom_line()
1.3.3 geom_col()
ggplot(data=dt, aes(x=x,y=y))+
geom_col()
1.3.4 Combined
ggplot(data=dt, aes(x=x,y=y))+
geom_col()+
geom_line()+
geom_point()
1.4 Markers
1.4.1 Lines
- linetype controls the style of line
- Can also be used inside aes(linetype=) when line type is conditional upon a variable
1.4.2 Points
- pch controls the style of points
- Can be used inside aes(pch=) when point type is conditional upon a variable
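A minimal sketch of the conditional case, where line type and point style depend on a variable. The data frame `dm` and its column `series` are hypothetical and exist only for this example. `shape` is the ggplot2 name for the aesthetic that the slides call `pch`.

```r
library(ggplot2)
library(tibble)

# Hypothetical data: two series distinguished by the invented column `series`
dm <- tibble(
  x = rep(1:10, 2),
  y = c((1:10)^2, 10 * (1:10)),
  series = rep(c("quadratic", "linear"), each = 10)
)

# linetype and shape vary with `series`, so they are mapped inside aes()
p_markers <- ggplot(data = dm, aes(x = x, y = y)) +
  geom_line(aes(linetype = series)) +
  geom_point(aes(shape = series))
p_markers
```

When the style is the same for all observations, it goes outside aes() as a fixed value, as in the example that follows.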
1.5 Markers: example
ggplot(data=dt, aes(x=x,y=y))+
geom_point(pch=4)+
geom_line(linetype=2)
1.6 Axes
- Axes provide a guide to read the graph
- Possible to control
- axis dimensions
- axis type
- tick marks
- tick mark labels
ggplot(data=dt, aes(x=x,y=y))+
geom_point()+
scale_x_continuous(
limits=c(0,max(dt$x)),
minor_breaks=seq(0,10,1),
breaks=seq(0,10,1),
labels=seq(0,10,1)
)+
scale_y_continuous(
limits=c(0,max(dt$y)),
minor_breaks=seq(0,max(dt$y),1),
breaks=seq(0,max(dt$y),10),
labels=seq(0,max(dt$y),10)
)
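The example above controls dimensions, tick marks and tick labels. To illustrate the remaining item, axis type, the sketch below switches the y-axis to a base-10 logarithmic scale. The break values are an arbitrary choice made for this illustration.

```r
library(ggplot2)
library(tibble)

dt <- tibble(x = 1:10, y = x^2)

# Same points as before, drawn on a log10-transformed y-axis
p_log <- ggplot(data = dt, aes(x = x, y = y)) +
  geom_point() +
  scale_y_log10(breaks = c(1, 10, 100))
p_log
```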
2 Non-graphical elements
2.1 Labels
- We can specify labels of the graph
- Title
- Axis labels
- Caption
ggplot(data=dt, aes(x=x,y=y,color=y))+
geom_point()+
labs(
title="THIS IS THE TITLE",
subtitle="SUBTITLE HERE",
y = "This is the y-axis",
x= "This is the x-axis",
caption="CAPTION HERE"
)
2.2 Themes
- You can easily control the size, the orientation, and the color of non-graphical elements with theme
- Axis
- axis.title, axis.text, legend.key, legend.key.size …
- Legend
- legend.background, legend.margin, legend.spacing, legend.key.height, legend.key.width, legend.text, legend.text.align, legend.title, legend.position …
- Facets
- strip.background, strip.text …
ggplot(data=dt, aes(x=x,y=y,color=y))+
geom_point()+
theme(
legend.position="right",
axis.text=element_text(size=8),
axis.title=element_text(size=14,face="bold"),
legend.background = element_rect(fill="grey", size=2, linetype="solid")
)
2.3 Colors
- Colours can be used to fill a geometric element (e.g., bars, points) or to define its color (e.g., points, lines, …)
- Colours can also be used to map variables to colors
- aes(… color=VAR, fill=VAR)
- Fill and color for a discrete variable
- scale_fill_brewer() or scale_color_brewer() to use library(RColorBrewer) palettes
- scale_fill_manual() or scale_color_manual() to manually define colors
- Fill and color for a continuous variable
- scale_fill_gradient() or scale_color_gradient() two-color gradient
- scale_fill_gradientn() or scale_color_gradientn() n-color gradient, equally spaced
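A short sketch of the two scale functions from the list that have no example elsewhere in these slides: a manual scale for a discrete variable and an n-color gradient for a continuous one. The specific colors are arbitrary choices made for illustration.

```r
library(ggplot2)
library(tibble)

dt <- tibble(x = 1:10, y = x^2)

# Discrete variable: assign one color per factor level by hand
p_manual <- ggplot(dt, aes(x = x, y = y, color = factor(x %% 2))) +
  geom_point() +
  scale_color_manual(values = c("0" = "darkorange", "1" = "steelblue"))

# Continuous variable: gradient running through three colors
p_gradn <- ggplot(dt, aes(x = x, y = y, color = y)) +
  geom_point() +
  scale_color_gradientn(colours = c("blue", "yellow", "red"))

p_manual
p_gradn
```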
2.4 RColorBrewer palettes
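The available palettes can be inspected directly from R. The snippet below is a small sketch that assumes the RColorBrewer package is installed. The choice of "Set1" with three colors is only an example.

```r
library(RColorBrewer)

# Draw every available palette in a single overview plot
display.brewer.all()

# Extract the hex codes of a palette, here 3 colors from "Set1"
set1 <- brewer.pal(n = 3, name = "Set1")
set1
```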
2.5 Colors
- Use color to provide a measure of the y-value
ggplot(data=dt, aes(x=x,y=y,color=y))+
geom_point()+
scale_colour_gradient(low="Blue",high="Red")
- Use color to distinguish between Odd and Even x-values (discrete mapping)
ggplot(data=dt, aes(x=x,y=y,colour=as_factor(x%%2)))+
geom_point()+
scale_colour_brewer(palette="Set1")
2.6 Default themes
- We can modify the theme of the graph with theme
- Overall look
- Position of the legend
- Font size
- …
- A gallery of themes can be found here
ggplot(data=dt, aes(x=x,y=y, color=x))+
geom_point()+
theme_dark()+
labs(
title="THIS IS THE TITLE"
)
ggplot(data=dt, aes(x=x,y=y, color=x))+
geom_point()+
theme_bw()+
labs(
title="THIS IS THE TITLE"
)
ggplot(data=dt, aes(x=x,y=y, color=x))+
geom_point()+
theme_classic()+
labs(
title="THIS IS THE TITLE"
)
2.7 Facets
- We can “split” different values into different independent panels
- Facets are conditional upon a variable
- As an example, divide odd and even outcomes
ggplot(data=dt, aes(x=x,y=y, color=x))+
geom_point()+
facet_wrap(~ifelse(x%%2==0,"Even","Odd"))
2.8 Combining layers
- Now we can combine different layers of graphical and non-graphical elements to get the desired output
g <- ggplot(data=dt, aes(x=x,y=y, color=as_factor(x%%2)))+
geom_line(linetype=2, size=1, color="grey" )+
geom_point(size=4, alpha=.5)+
scale_x_continuous(
limits=c(0,max(dt$x)),
minor_breaks=seq(0,10,1),
breaks=seq(0,10,1),
labels=seq(0,10,1)
)+
scale_y_continuous(
limits=c(0,max(dt$y)),
minor_breaks=seq(0,max(dt$y),1),
breaks=seq(0,max(dt$y),10),
labels=seq(0,max(dt$y),10)
)+
facet_wrap(~ifelse(x%%2==0,"Even","Odd"))+
scale_colour_brewer(palette="Set1")+
theme_bw()+
labs(
title="My first graph",
y = "This is the y-axis",
x= "This is the x-axis",
caption="Proudly made by me",
color="Odd"
)+
theme(
legend.position="bottom",
axis.text=element_text(size=8),
axis.title=element_text(size=14,face="bold"),
legend.background = element_rect(fill="grey",
size=2, linetype="solid")
)
3 Reporting
3.1 R Markdown
- What is R Markdown?
- It combines the simple style of Markdown and the powerful computational environment of R
- R Markdown provides an authoring framework for data science.
- A single R Markdown file to both save and execute code
- Generate high quality reports that can be shared with an audience.
- We rely on Quarto to create a report from R Markdown
- A multi-language, next generation version of R Markdown from RStudio, with many new features and capabilities.
- See the official guide here
3.2 R Markdown: elements
- An R Markdown document is made of 3 main components
- Markdown textual elements
- Contains the text that comments on or describes the output of the R code chunk
- R code chunk
- Contains the R code to generate the desired output (table, graph, results…)
- YAML header
- Contains information about the document and formatting styles
3.3 Textual elements
- Markdown is a lightweight markup language with plain text formatting syntax
Plain text
*italics* and _italics_
**bold** and __bold__
superscript^2^
~~strikethrough~~
[link](www.rstudio.com)
inline equation: $A = \pi*r^{2}$
Plain text
italics and italics
bold and bold
superscript2
strikethrough
link
inline equation: \(A = \pi*r^{2}\)
3.4 What is a code chunk?
- You can insert R code into a code chunk
- The chunk has this generic format
```{r}
Your code here
```
- Several features of the code chunk can be controlled (see here for a detailed list)
- eval
- =FALSE \(\Rightarrow\) the chunk is not evaluated
- echo
- =FALSE \(\Rightarrow\) no source code is printed in the output, only the result of the code
- include
- =FALSE \(\Rightarrow\) the chunk is excluded from the output, but still evaluated
- warning, message, and error
- =FALSE \(\Rightarrow\) warnings, messages and errors are not printed in the output
- One can also control figure dimensions (in inches) with fig.width and fig.height
```{r}
#| echo: false
#| include: true
#| eval: true
#| fig.width: 9
#| fig.height: 6
Your code here
```
3.5 What is the YAML header?
- The YAML header contains the properties of the document
- Title, author, date
- The output format
- html, pdf, word …
- Reference to external sources
- .bib for bibliography
- .css for style elements
---
title: 'YOURTITLE'
subtitle: 'Yoursubtitle'
author: 'YOU'
date: 'today'
date-format: long
format: html
---
3.6 How to render your document
- Quarto is fully supported in RStudio v2022.07 and later
- Click on Render
3.7 An example of an R Markdown report
- See
- Source file: Report.qmd
- Rendered file: Report.html
4 Appendix
4.1 Assignment
- Create slides from Report.qmd
- Hint: use the following YAML header
---
title: 'Risk preferences'
subtitle: 'Evidence from a MPL experiment'
author: 'Me'
date: 'today'
date-format: long
format:
revealjs: default
---
4.2 References
Wickham, Hadley. 2010. “A Layered Grammar of Graphics.” Journal of Computational and Graphical Statistics 19 (1): 3–28.