Experimental Economics: Data Workflow

R: Data wrangling

Author

Matteo Ploner

Published

May 2, 2023

1 Introduction

1.1 Data manipulation, visualization and reporting

  • This course illustrates techniques for data manipulation, visualization and reporting using R and R Markdown
    • The course draws on the following sources
      • Chang, Winston. 2012. R Graphics Cookbook: Practical Recipes for Visualizing Data. O’Reilly Media, Inc.
      • Wickham, Hadley, and Garrett Grolemund. 2016. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media, Inc.
      • Venables, W. N., D. M. Smith, and R Core Team. 2019. An Introduction to R.
  • Many useful resources can be found online

1.2 R

  • From https://cran.r-project.org/
    • R is an integrated suite of software facilities for data manipulation, calculation and graphical display.
    • R can be regarded as an implementation of the S language, which was developed at Bell Laboratories by Rick Becker, John Chambers and Allan Wilks, and also forms the basis of the S-PLUS systems.

R version 3.6.2 (2019-12-12) -- "Dark and Stormy Night"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin15.6.0 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type ‘license()’ or ‘licence()’ for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors. Type ‘contributors()’ for more information and ‘citation()’ on how to cite R or R packages in publications.

Type ‘demo()’ for some demos, ‘help()’ for on-line help, or ‘help.start()’ for an HTML browser interface to help. Type ‘q()’ to quit R.

1.3 RStudio

  • From https://www.rstudio.com/products/RStudio/
    • RStudio is an integrated development environment (IDE) for R
    • RStudio is available in open source and commercial editions and runs on the desktop (Windows, Mac, and Linux) or in a browser connected to RStudio Server or RStudio Server Pro (Debian/Ubuntu, RedHat/CentOS, and SUSE Linux).

1.4 Visual Studio Code

2 Tidy data

2.1 Your data in good shape

  • Data wrangling is “the art of getting your data into R in a useful form for visualization and modeling” (Wickham and Grolemund 2016)

  • We rely on the tidyverse library

    • The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures.
library(tidyverse)

2.2 The Workflow

  • Wickham and Grolemund (2016) describe the typical workflow in data management
[Figure: workflow diagram. Import -> Organize (Tidy) -> Transform -> Visualize -> Communicate. Inside the "Explore" block, Transform -> Visualize -> Model -> Transform form a loop.]

2.3 Tidy data

  • See Wickham () for a description of good practices in organizing your data.

  • 3 rules to have tidy data

    • Each variable forms a column.
    • Each observation forms a row.
    • Each type of observational unit forms a table.


Source: Wickham and Grolemund (2016)

  • Data that are not tidy are untidy!

2.4 Example of tidy data

Name     Height(mt)  Weight(kg)
Venonat        1.24       46.59
Ledyba         1.20       14.80
Wingull        0.49        7.05
Treeko         0.51        3.90
Zubat          0.90        8.66
  • Each variable forms a column.
  • Each observation forms a row.
  • Each type of observational unit forms a table.

2.5 Example of untidy data

Name     Key         Value
Ledyba   Height(mt)   1.20
Ledyba   Weight(kg)  14.80
Treeko   Height(mt)   0.51
Treeko   Weight(kg)   3.90
Venonat  Height(mt)   1.24
Venonat  Weight(kg)  46.59
Wingull  Height(mt)   0.49
Wingull  Weight(kg)   7.05
Zubat    Height(mt)   0.90
Zubat    Weight(kg)   8.66
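
The untidy table stores two variables (height and weight) in a single Key/Value pair of columns. As a minimal sketch, assuming the tibble and tidyr packages from the tidyverse are installed, the long table can be brought back to the tidy shape with pivot_wider. The object names untidy and tidy are illustrative, and only the first two names of the table are used.

```r
library(tibble)
library(tidyr)

# Toy data copied from the first rows of the untidy table
untidy <- tribble(
  ~Name,    ~Key,         ~Value,
  "Ledyba", "Height(mt)",  1.20,
  "Ledyba", "Weight(kg)", 14.80,
  "Treeko", "Height(mt)",  0.51,
  "Treeko", "Weight(kg)",  3.90
)

# Spread the Key/Value pairs so that each variable gets its own column
tidy <- pivot_wider(untidy, names_from = Key, values_from = Value)
tidy
```

The result has one row per name and one column per variable, which satisfies the three rules listed earlier.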

3 Data wrangling

3.1 Coding Style

  • We rely on tidyverse syntax style
    • The code we are going to write relies on pipes: %>%
    • We use %>% to emphasize a sequence of actions
      • From R 4.1.0 onwards, a native pipe is also available: |>
    • With %>%, . is a placeholder for the value passed along the pipe
1:10 %>%
sum(.) %>%
.^2
[1] 3025
  • Without pipes
sum(1:10)^2
[1] 3025
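
For comparison, the same computation can be sketched with the native pipe. This assumes R 4.1.0 or later. The native pipe does not understand the . placeholder, so the squaring step is written here as an anonymous function; the names result and total are illustrative.

```r
# Native pipe (assumes R >= 4.1.0); no extra package is needed
result <- 1:10 |>
  sum() |>
  (\(total) total^2)()  # anonymous function applied to the piped value
result
```

From R 4.2.0 the native pipe also accepts _ as a placeholder, but only for a named argument, so the anonymous-function form is the more portable choice.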

3.2 Data transformation

  • We will perform a series of data transformations
    • Import
    • Mutate
    • Select
    • Rename
    • Arrange
    • Filter
    • Join
    • Gather
    • Spread
    • Summarise

3.3 Import

  • The first important step is to import your data into the R workspace
  • R can import several data formats
    • My suggestion is to work with comma-separated (.csv) data
    • In the tidyverse library, the following function is available
read_csv("PATH_TO_YOUR_FILE/FILENAME.csv")
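
As a hedged illustration of the import step, the sketch below writes a tiny CSV to a temporary file and reads it back with explicit column types. The file, the column names id and choice, and the values are hypothetical; it assumes the readr package (part of the tidyverse) is installed.

```r
library(readr)

# Hypothetical file: write a tiny CSV to a temporary location
# so that the example is self-contained
path <- tempfile(fileext = ".csv")
write_csv(data.frame(id = c("a", "b"), choice = c(1, 2)), path)

# Declaring the column types makes the import explicit instead of guessed
d <- read_csv(
  path,
  col_types = cols(id = col_character(), choice = col_integer())
)
d
```

Without col_types, read_csv guesses the types from the data and prints the specification it chose, which is worth checking after every import.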

3.4 Tibbles

  • Data are imported as a tibble
  • A tibble is a special form of data frame used in the tidyverse
  • Its refined print method shows only the first 10 rows and all the columns that fit on screen.
tibble(
  Col1= 1:3,
  Col2=4:6,
  Col3=7:9,
  Col4=10:12
)
# A tibble: 3 × 4
   Col1  Col2  Col3  Col4
  <int> <int> <int> <int>
1     1     4     7    10
2     2     5     8    11
3     3     6     9    12

3.5 Mutate

  • Adds new variables and preserves existing ones
tibble(Sequence=1:5) %>%
mutate(Sequence_rev=rev(Sequence))
# A tibble: 5 × 2
  Sequence Sequence_rev
     <int>        <int>
1        1            5
2        2            4
3        3            3
4        4            2
5        5            1
  • You can also overwrite old variables
tibble(Sequence=1:5) %>%
mutate(Sequence=rev(Sequence))
# A tibble: 5 × 1
  Sequence
     <int>
1        5
2        4
3        3
4        2
5        1

3.6 Select

  • We drop one of the variables (e.g., z)
tibble(x=runif(5,0,1),y=runif(5,0,2),z=runif(5,-1,1)) %>%
select(-z)
# A tibble: 5 × 2
      x      y
  <dbl>  <dbl>
1 0.404 1.23  
2 0.336 1.36  
3 0.164 0.0738
4 0.454 0.451 
5 0.144 0.851 
  • We explicitly select the variables of interest
tibble(x=runif(5,0,1),y=runif(5,0,2),z=runif(5,-1,1)) %>%
select(x,y)
# A tibble: 5 × 2
       x      y
   <dbl>  <dbl>
1 0.444  0.481 
2 0.0192 0.155 
3 0.0169 0.427 
4 0.341  2.00  
5 0.249  0.0300

3.7 Rename

  • Rename variables
    • Make names more self-explanatory
tmp <- tibble(x=runif(5,1,4),y=runif(5,1,2),z=runif(5,1,20))
tmp
# A tibble: 5 × 3
      x     y     z
  <dbl> <dbl> <dbl>
1  1.46  1.96  3.59
2  2.40  1.22  3.25
3  1.75  1.53 15.6 
4  1.21  1.81 15.2 
5  1.26  1.97  2.58
  • We may want to change the name of variables, to make them more self-explanatory
tmp <- tmp %>% rename(Length=x,Width=y,Height=z)
tmp
# A tibble: 5 × 3
  Length Width Height
   <dbl> <dbl>  <dbl>
1   1.46  1.96   3.59
2   2.40  1.22   3.25
3   1.75  1.53  15.6 
4   1.21  1.81  15.2 
5   1.26  1.97   2.58

3.8 Arrange

# A tibble: 6 × 3
  Type      x     y
  <chr> <dbl> <dbl>
1 A      2.90  1.66
2 B      1.75  3.14
3 C      3.89  3.83
4 A      1.22  2.66
5 B      3.41  1.90
6 C      2.64  2.11
  • Sort your data according to the values in one column
    • Important to get a first impression of rankings
tmp %>% arrange(Type)
# A tibble: 6 × 3
  Type      x     y
  <chr> <dbl> <dbl>
1 A      2.90  1.66
2 A      1.22  2.66
3 B      1.75  3.14
4 B      3.41  1.90
5 C      3.89  3.83
6 C      2.64  2.11
  • Possible to hierarchically sort according to several columns
    • Sort first according to Type and then according to x
tmp %>% arrange(Type,x)
# A tibble: 6 × 3
  Type      x     y
  <chr> <dbl> <dbl>
1 A      1.22  2.66
2 A      2.90  1.66
3 B      1.75  3.14
4 B      3.41  1.90
5 C      2.64  2.11
6 C      3.89  3.83
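
Rankings usually run from the largest value down, and arrange sorts in ascending order by default. A minimal sketch with hypothetical values, assuming dplyr and tibble are installed, shows desc() for a descending sort, alone and nested within Type.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data
toy <- tibble(
  Type = c("A", "B", "C", "A"),
  x    = c(2.9, 1.7, 3.9, 1.2)
)

# Largest x first
ranked <- toy %>% arrange(desc(x))

# Sort by Type (ascending), then by x (descending) within each Type
nested <- toy %>% arrange(Type, desc(x))

ranked
nested
```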

3.9 Filter

  • To choose rows/cases where conditions are true
tmp <- tibble(x=runif(5,1,4),y=runif(5,1,2),z=runif(5,1,20))
tmp
# A tibble: 5 × 3
      x     y     z
  <dbl> <dbl> <dbl>
1  3.42  1.94 15.2 
2  3.52  1.54 19.6 
3  3.07  1.03  1.46
4  2.70  1.42  6.59
5  2.72  1.26 14.5 
tmp %>% filter(x>2 & y<2)
# A tibble: 5 × 3
      x     y     z
  <dbl> <dbl> <dbl>
1  3.42  1.94 15.2 
2  3.52  1.54 19.6 
3  3.07  1.03  1.46
4  2.70  1.42  6.59
5  2.72  1.26 14.5 
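
In the output above every row satisfies both conditions, so nothing is dropped. The following sketch uses hypothetical values chosen so that the filter actually removes rows; it assumes dplyr and tibble are installed.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data, including one missing value
toy <- tibble(
  x = c(1.5, 2.5, 3.5, 2.8),
  y = c(1.2, 1.8, 2.4, NA)
)

# Conditions separated by commas are combined with AND, like &
kept <- toy %>% filter(x > 2, y < 2)
kept
```

Only the row with x = 2.5 and y = 1.8 is kept: filter drops rows where a condition is FALSE and also rows where it evaluates to NA.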

3.10 Join

  • We may want to put two different data frames in relation
    • The two data frames contain a common column
  • Values x and y for Obs are in table tmp.1
  • Values z and w for Obs are in table tmp.2
tmp.1 <- tibble(x=runif(5,1,4),y=runif(5,1,2),Obs=3:7)
tmp.1
# A tibble: 5 × 3
      x     y   Obs
  <dbl> <dbl> <int>
1  3.55  1.03     3
2  2.77  2.00     4
3  1.74  1.22     5
4  1.71  1.13     6
5  1.10  1.80     7
tmp.2 <- tibble(z=runif(5,1,8),w=runif(5,0,2),Obs=1:5)
tmp.2
# A tibble: 5 × 3
      z     w   Obs
  <dbl> <dbl> <int>
1  4.53 1.11      1
2  7.46 1.52      2
3  2.41 1.58      3
4  1.89 0.974     4
5  5.60 0.963     5

3.11 Join: full

full_join(tmp.1,tmp.2,by="Obs")
# A tibble: 7 × 5
      x     y   Obs     z      w
  <dbl> <dbl> <int> <dbl>  <dbl>
1  3.55  1.03     3  2.41  1.58 
2  2.77  2.00     4  1.89  0.974
3  1.74  1.22     5  5.60  0.963
4  1.71  1.13     6 NA    NA    
5  1.10  1.80     7 NA    NA    
6 NA    NA        1  4.53  1.11 
7 NA    NA        2  7.46  1.52 
  • ! Not all Obs are in both tables

3.12 Join: partial

  • We may want to have all values of the “left” data frame (tmp.1)
left_join(tmp.1,tmp.2,by="Obs")
# A tibble: 5 × 5
      x     y   Obs     z      w
  <dbl> <dbl> <int> <dbl>  <dbl>
1  3.55  1.03     3  2.41  1.58 
2  2.77  2.00     4  1.89  0.974
3  1.74  1.22     5  5.60  0.963
4  1.71  1.13     6 NA    NA    
5  1.10  1.80     7 NA    NA    
  • We may want to have all values of the “right” data frame (tmp.2)
right_join(tmp.1,tmp.2,by="Obs")
# A tibble: 5 × 5
      x     y   Obs     z     w
  <dbl> <dbl> <int> <dbl> <dbl>
1  3.55  1.03     3  2.41 1.58 
2  2.77  2.00     4  1.89 0.974
3  1.74  1.22     5  5.60 0.963
4 NA    NA        1  4.53 1.11 
5 NA    NA        2  7.46 1.52 
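
Two further joins from dplyr are often useful alongside the ones above: inner_join keeps only the keys present in both tables, and anti_join lists the rows of the left table that have no match. A minimal sketch with hypothetical tables left and right, assuming dplyr and tibble are installed:

```r
library(dplyr)
library(tibble)

# Hypothetical tables sharing the key column Obs
left  <- tibble(Obs = 3:7, x = c(10, 20, 30, 40, 50))
right <- tibble(Obs = 1:5, z = c(1, 2, 3, 4, 5))

# Rows whose key appears in both tables
both <- inner_join(left, right, by = "Obs")

# Rows of left with no partner in right (a quick check for unmatched keys)
only_left <- anti_join(left, right, by = "Obs")

both
only_left
```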

3.13 Gather

  • pivot_longer takes multiple columns and collapses them into key-value pairs, duplicating all other columns as needed
tmp.1
# A tibble: 5 × 3
      x     y   Obs
  <dbl> <dbl> <int>
1  3.55  1.03     3
2  2.77  2.00     4
3  1.74  1.22     5
4  1.71  1.13     6
5  1.10  1.80     7
tmp.2 <-
tmp.1 %>%
pivot_longer(names_to="Key",values_to="Value",c(x,y))
tmp.2
# A tibble: 10 × 3
     Obs Key   Value
   <int> <chr> <dbl>
 1     3 x      3.55
 2     3 y      1.03
 3     4 x      2.77
 4     4 y      2.00
 5     5 x      1.74
 6     5 y      1.22
 7     6 x      1.71
 8     6 y      1.13
 9     7 x      1.10
10     7 y      1.80

3.14 Spread

  • pivot_wider spreads a key-value pair across multiple columns (inverse of pivot_longer)
tmp.1 <- tmp.2 %>%
pivot_wider(names_from="Key",values_from="Value")
tmp.1
# A tibble: 5 × 3
    Obs     x     y
  <int> <dbl> <dbl>
1     3  3.55  1.03
2     4  2.77  2.00
3     5  1.74  1.22
4     6  1.71  1.13
5     7  1.10  1.80

3.15 Summarise

# A tibble: 6 × 3
  Type      x     y
  <chr> <dbl> <dbl>
1 A      2.59  3.54
2 B      3.68  3.56
3 C      3.36  3.04
4 A      2.55  2.40
5 B      3.54  2.31
6 C      3.58  1.58
  • Apply functions to your variables
    • Mean, SD, …
tmp %>% summarise_at("x",list(~mean(.),~sd(.)))
# A tibble: 1 × 2
   mean    sd
  <dbl> <dbl>
1  3.22 0.511
  • Very powerful in combination with group_by
    • “Cluster” the application of functions
tmp %>% group_by(Type) %>%  summarise_at("x",list(~n(),~mean(.),~sd(.)))
# A tibble: 3 × 4
  Type      n  mean     sd
  <chr> <int> <dbl>  <dbl>
1 A         2  2.57 0.0310
2 B         2  3.61 0.0957
3 C         2  3.47 0.152 
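
Since dplyr 1.0.0, summarise_at is superseded by across(). The sketch below shows one possible equivalent of the grouped summary, using hypothetical values; the object names toy and by_type are illustrative, and it assumes dplyr (>= 1.0.0) and tibble are installed.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data
toy <- tibble(
  Type = c("A", "B", "A", "B"),
  x    = c(1, 2, 3, 6)
)

# Group size plus mean and sd of x for each Type;
# across() names the new columns <column>_<function>
by_type <- toy %>%
  group_by(Type) %>%
  summarise(
    n = n(),
    across(x, list(mean = mean, sd = sd)),
    .groups = "drop"
  )
by_type
```

Note that the output columns are called x_mean and x_sd rather than mean and sd, which helps when several variables are summarised at once.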

4 An application to experimental data

4.1 Risk elicitation Task

  • We apply some of the transformation functions to a real dataset generated by oTree, with data collected on Prolific
    • Risk elicitation in an MPL (multiple price list) format (see lecture on Individual Decision Making)
      • Participants choose 10 times between prospect A and B
        • Switching between A and B informs us about risk propensity
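
To make the idea of switching concrete, here is a small base R sketch on one hypothetical participant. The coding 1 = prospect A and 2 = prospect B is an assumption made for illustration only; the source does not state how the choices are coded.

```r
# Hypothetical coding (an assumption): 1 = prospect A, 2 = prospect B
choices <- c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2)

# Number of times A is chosen over the 10 decisions
n_A <- sum(choices == 1)

# Number of times the choice changes from one decision to the next;
# a single switch is the pattern expected from a consistent participant
n_switches <- sum(diff(choices) != 0)

n_A
n_switches
```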

4.2 Import data: oTree

  • First we import our data
  • all_apps_wide.csv: choices originated by the oTree app (tibble: 45 x 54)
    • assign to d.1
d.1 <- read_csv("all_apps_wide.csv")
participant.id_in_session participant.code participant.label participant._is_bot participant._index_in_pages participant._max_page_index participant._current_app_name participant._current_page_name participant.time_started participant.visited participant.mturk_worker_id participant.mturk_assignment_id participant.payoff participant.payoff_plus_participation_fee session.code session.label session.mturk_HITId session.mturk_HITGroupId session.comment session.is_demo session.config.treatment session.config.real_world_currency_per_point session.config.participation_fee prolific_id.1.player.id_in_group prolific_id.1.player.prolific_id prolific_id.1.player.code prolific_id.1.player.payoff prolific_id.1.group.id_in_subsession prolific_id.1.subsession.round_number prolific_MPL.1.player.id_in_group prolific_MPL.1.player.check_instr_1 prolific_MPL.1.player.check_instr_2 prolific_MPL.1.player.HL_1 prolific_MPL.1.player.HL_2 prolific_MPL.1.player.HL_3 prolific_MPL.1.player.HL_4 prolific_MPL.1.player.HL_5 prolific_MPL.1.player.HL_6 prolific_MPL.1.player.HL_7 prolific_MPL.1.player.HL_8 prolific_MPL.1.player.HL_9 prolific_MPL.1.player.HL_10 prolific_MPL.1.player.HL prolific_MPL.1.player.sex prolific_MPL.1.player.age prolific_MPL.1.player.comment prolific_MPL.1.player.like prolific_MPL.1.player.payoff prolific_MPL.1.group.id_in_subsession prolific_MPL.1.subsession.round_number prolific_end.1.player.id_in_group prolific_end.1.player.payoff prolific_end.1.group.id_in_subsession prolific_end.1.subsession.round_number
1 ft3rev9i NA 0 6 6 prolific_end End 2021-02-04 20:11:47 1 NA NA 0.7 0.7 ssimrba9 NA NA NA NA 1 control 0 0 1 123456789123456789123456 XC97R 0 1 1 1 0 0 1 1 1 2 2 2 2 2 2 2 NA He 19 NA 3 0.7 1 1 1 0 1 1
2 odu6nfat NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 ssimrba9 NA NA NA NA 1 control 0 0 2 NA NA 0 1 1 2 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 2 0 1 1
3 qxjmt2s0 NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 ssimrba9 NA NA NA NA 1 control 0 0 3 NA NA 0 1 1 3 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 3 0 1 1
4 zfqc76t8 NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 ssimrba9 NA NA NA NA 1 control 0 0 4 NA NA 0 1 1 4 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 4 0 1 1
5 dyloyzoh NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 ssimrba9 NA NA NA NA 1 control 0 0 5 NA NA 0 1 1 5 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 5 0 1 1
1 a8eikyet NA 0 6 6 prolific_end End 2021-02-05 09:35:08 1 NA NA 0.1 0.1 l7au0ml3 NA NA NA NA 0 control 0 0 1 123456789123456789123456 XC97R 0 1 1 1 0 0 1 1 2 2 2 1 2 1 2 1 NA He 21 just a trial, but wonderful experiment 5 0.1 1 1 1 0 1 1
2 kd2dns9t NA 0 6 6 prolific_end End 2021-02-05 09:54:40 1 NA NA 1.8 1.8 l7au0ml3 NA NA NA NA 0 control 0 0 2 5fb1bdd5dc59737020cdfb5e XC97R 0 1 1 2 0 0 2 2 1 1 2 1 2 2 2 1 2 She 20 NA 3 1.8 1 1 2 0 1 1
3 jpswpu7b NA 0 6 6 prolific_end End 2021-02-05 09:55:23 1 NA NA 0.7 0.7 l7au0ml3 NA NA NA NA 0 control 0 0 3 5f51292bd904fa31bd519280 XC97R 0 1 1 3 0 0 1 1 1 1 2 2 2 2 2 2 2 He 25 NA 5 0.7 1 1 3 0 1 1
4 szsfzz0x NA 0 6 6 prolific_end End 2021-02-05 09:55:38 1 NA NA 0.7 0.7 l7au0ml3 NA NA NA NA 0 control 0 0 4 60193745c8295b026e58ffb0 XC97R 0 1 1 4 0 0 1 1 1 1 1 1 1 2 2 2 1 He 19 It made me feel that I'm risking something even when I was above to win in every case. Interesting! 5 0.7 1 1 4 0 1 1
5 7srwjcj0 NA 0 6 6 prolific_end End 2021-02-05 09:55:39 1 NA NA 1.8 1.8 l7au0ml3 NA NA NA NA 0 control 0 0 5 5d95b8fb7c6bbc0013674753 XC97R 0 1 1 5 0 0 2 1 1 2 2 1 2 1 1 2 NA She 30 NA 4 1.8 1 1 5 0 1 1
6 d3tt3lcx NA 0 1 6 prolific_id ProlificID 2021-02-05 09:55:44 1 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 6 NA NA 0 1 1 6 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 6 0 1 1
7 q9ks08ta NA 0 6 6 prolific_end End 2021-02-05 09:55:45 1 NA NA 0.7 0.7 l7au0ml3 NA NA NA NA 0 control 0 0 7 5fba5e6d9d68a19e88fe7ef5 XC97R 0 1 1 7 0 0 1 1 1 1 1 2 2 2 2 2 2 She 23 NA 5 0.7 1 1 7 0 1 1
8 ndgiif9e NA 0 6 6 prolific_end End 2021-02-05 09:55:47 1 NA NA 0.7 0.7 l7au0ml3 NA NA NA NA 0 control 0 0 8 5f560c6dc8978c2120ef359c XC97R 0 1 1 8 0 0 1 1 1 1 1 2 2 2 2 2 NA He 26 NA 5 0.7 1 1 8 0 1 1
9 gulplfcl NA 0 6 6 prolific_end End 2021-02-05 09:55:57 1 NA NA 0.1 0.1 l7au0ml3 NA NA NA NA 0 control 0 0 9 5fe0d5101f2ef9a760961e24 XC97R 0 1 1 9 0 0 1 1 1 2 2 2 2 2 2 2 2 He 39 NA 4 0.1 1 1 9 0 1 1
10 dd7t9n6v NA 0 6 6 prolific_end End 2021-02-05 09:56:09 1 NA NA 0.9 0.9 l7au0ml3 NA NA NA NA 0 control 0 0 10 5f862e8d85b71e1791feb730 XC97R 0 1 1 10 0 0 1 1 1 1 1 1 1 2 2 2 NA He 31 NA 5 0.9 1 1 10 0 1 1
11 05erea5x NA 0 6 6 prolific_end End 2021-02-05 09:56:27 1 NA NA 0.1 0.1 l7au0ml3 NA NA NA NA 0 control 0 0 11 5c548bf9aba7d60001f1f41a XC97R 0 1 1 11 0 0 1 1 1 2 2 2 2 2 2 2 1 He 21 NA 4 0.1 1 1 11 0 1 1
12 46vjajdd NA 0 6 6 prolific_end End 2021-02-05 10:10:45 1 NA NA 1.8 1.8 l7au0ml3 NA NA NA NA 0 control 0 0 12 5fcfca16b5f39515037d6724 XC97R 0 1 1 12 0 0 2 1 2 1 2 1 1 2 2 2 NA She 28 This was fun. 5 1.8 1 1 12 0 1 1
13 0d9lf0dh NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 13 NA NA 0 1 1 13 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 13 0 1 1
14 va6bfq05 NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 14 NA NA 0 1 1 14 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 14 0 1 1
15 by28russ NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 15 NA NA 0 1 1 15 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 15 0 1 1
16 3mks59xu NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 16 NA NA 0 1 1 16 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 16 0 1 1
17 y6gt6rnl NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 17 NA NA 0 1 1 17 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 17 0 1 1
18 xpn1xgr0 NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 18 NA NA 0 1 1 18 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 18 0 1 1
19 7ixk54kz NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 19 NA NA 0 1 1 19 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 19 0 1 1
20 2pfl4oo8 NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 20 NA NA 0 1 1 20 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 20 0 1 1
21 wnaf0dta NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 21 NA NA 0 1 1 21 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 21 0 1 1
22 3cgunh1s NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 22 NA NA 0 1 1 22 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 22 0 1 1
23 dvxwurfs NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 23 NA NA 0 1 1 23 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 23 0 1 1
24 1s0lb1e7 NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 24 NA NA 0 1 1 24 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 24 0 1 1
25 5yn5cxch NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 25 NA NA 0 1 1 25 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 25 0 1 1
26 8h2onzcf NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 26 NA NA 0 1 1 26 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 26 0 1 1
27 gfs16q68 NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 27 NA NA 0 1 1 27 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 27 0 1 1
28 sx3xya15 NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 28 NA NA 0 1 1 28 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 28 0 1 1
29 su5rlis4 NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 29 NA NA 0 1 1 29 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 29 0 1 1
30 5e3vz07z NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 30 NA NA 0 1 1 30 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 30 0 1 1
31 vn56qpbb NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 31 NA NA 0 1 1 31 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 31 0 1 1
32 dgq2tvu8 NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 32 NA NA 0 1 1 32 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 32 0 1 1
33 ldf8zs56 NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 33 NA NA 0 1 1 33 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 33 0 1 1
34 afuj0rbn NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 34 NA NA 0 1 1 34 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 34 0 1 1
35 0mno6ipy NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 35 NA NA 0 1 1 35 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 35 0 1 1
36 rcoye50u NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 36 NA NA 0 1 1 36 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 36 0 1 1
37 0nwyzv5u NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 37 NA NA 0 1 1 37 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 37 0 1 1
38 w22bfd1j NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 38 NA NA 0 1 1 38 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 38 0 1 1
39 k4kmo6z2 NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 39 NA NA 0 1 1 39 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 39 0 1 1
40 ue3idaco NA 0 0 6 NA NA NA 0 NA NA 0.0 0.0 l7au0ml3 NA NA NA NA 0 control 0 0 40 NA NA 0 1 1 40 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 0.0 1 1 40 0 1 1

4.3 Import data: Prolific

  • prolific_export.csv: background information about the participants (tibble: 11 x 19)
    • assign to d.2
d.2 <- read_csv("prolific_export.csv")
session_id participant_id status started_datetime completed_date_time time_taken age num_approvals num_rejections prolific_score reviewed_at_datetime entered_code Country of Birth Current Country of Residence Employment Status First Language Nationality Sex Student Status
601d15de0fddd503b8c494d0 5fb1bdd5dc59737020cdfb5e AWAITING REVIEW 2021-02-05 09:54:38 2021-02-05 09:59:19 280.907 20 25 0 100 NA 7D04665D Poland Poland Unemployed (and job seeking) DATA EXPIRED Poland Female Yes
601d1606c12f23058f90508d 5f51292bd904fa31bd519280 AWAITING REVIEW 2021-02-05 09:55:21 2021-02-05 09:59:13 231.871 25 27 0 100 NA 7D04665D Iran Italy Unemployed (and job seeking) DATA EXPIRED Iran Male Yes
601d1611a29ceb0599852173 5f1f04c1604e3508877cf62a RETURNED 2021-02-05 09:55:42 NA 1354.159 23 37 1 99 NA NA CONSENT REVOKED CONSENT REVOKED CONSENT REVOKED CONSENT REVOKED CONSENT REVOKED CONSENT REVOKED CONSENT REVOKED
601d1618dca960058fab427d 60193745c8295b026e58ffb0 AWAITING REVIEW 2021-02-05 09:55:35 2021-02-05 10:02:55 439.302 19 0 0 100 NA 7D04665D Poland NA Unemployed (and job seeking) NA Poland Male Yes
601d16191b77080528217123 5d95b8fb7c6bbc0013674753 AWAITING REVIEW 2021-02-05 09:55:37 2021-02-05 09:58:50 192.845 30 97 1 100 NA 7D04665D Italy Italy Unemployed (and job seeking) Italian Italy Female Yes
601d161da622f60525b5ffd3 5fba5e6d9d68a19e88fe7ef5 AWAITING REVIEW 2021-02-05 09:55:41 2021-02-05 10:01:00 318.809 23 15 0 100 NA 7D04665D Greece Greece Unemployed (and job seeking) NA Greece Female Yes
601d162069911a051dbfd26c 5f560c6dc8978c2120ef359c AWAITING REVIEW 2021-02-05 09:55:44 2021-02-05 09:58:21 157.279 26 10 0 100 NA 7D04665D Estonia Estonia Full-Time NA Estonia Male No
601d162a32e53504a7ffcaa2 5fe0d5101f2ef9a760961e24 AWAITING REVIEW 2021-02-05 09:55:54 2021-02-05 09:59:54 239.591 39 11 0 100 NA 7D04665D Italy Italy Full-Time DATA EXPIRED Italy Male No
601d16350225a805953edcc0 5f862e8d85b71e1791feb730 AWAITING REVIEW 2021-02-05 09:56:05 2021-02-05 10:02:58 413.000 30 13 0 100 NA 7D04665D Italy Italy Full-Time DATA EXPIRED Italy Male Yes
601d1644c67525051b837656 5c548bf9aba7d60001f1f41a AWAITING REVIEW 2021-02-05 09:56:26 2021-02-05 10:01:52 326.129 21 82 2 98 NA 7D04665D Poland Poland Unemployed (and job seeking) Polish Poland Male Yes
601d18a0a0bd00058f79b0b2 5fcfca16b5f39515037d6724 AWAITING REVIEW 2021-02-05 10:10:32 2021-02-05 10:16:32 360.250 28 22 1 99 NA 7D04665D South Africa South Africa Full-Time DATA EXPIRED South Africa Female No

4.4 Filter data

  • We keep only the observations of interest
    • The oTree database might contain observations that belong to other apps/sessions
d.1 <- 
    d.1 %>% filter(session.code=="l7au0ml3") %>% # code of the session
    filter(participant._current_app_name=="prolific_end") %>% # only those who reached the last page
    filter(prolific_id.1.player.prolific_id!="123456789123456789123456") # fake ID 
  • We obtain a tibble 10 (rows) x 54 (columns)
    • Each participant is a row, each variable a column
      • Tidy data

4.5 Select

  • Our databases contain several “useless” variables (columns)
    • Select only the relevant columns from the oTree output
      • ID + 10 choices
d.1.sel <- 
d.1 %>% 
select(prolific_id.1.player.prolific_id,paste("prolific_MPL.1.player.HL_",1:10,sep=""))
prolific_id.1.player.prolific_id prolific_MPL.1.player.HL_1 prolific_MPL.1.player.HL_2 prolific_MPL.1.player.HL_3 prolific_MPL.1.player.HL_4 prolific_MPL.1.player.HL_5 prolific_MPL.1.player.HL_6 prolific_MPL.1.player.HL_7 prolific_MPL.1.player.HL_8 prolific_MPL.1.player.HL_9 prolific_MPL.1.player.HL_10
5fb1bdd5dc59737020cdfb5e 2 2 1 1 2 1 2 2 2 1
5f51292bd904fa31bd519280 1 1 1 1 2 2 2 2 2 2
60193745c8295b026e58ffb0 1 1 1 1 1 1 1 2 2 2
5d95b8fb7c6bbc0013674753 2 1 1 2 2 1 2 1 1 2
5fba5e6d9d68a19e88fe7ef5 1 1 1 1 1 2 2 2 2 2
5f560c6dc8978c2120ef359c 1 1 1 1 1 2 2 2 2 2
5fe0d5101f2ef9a760961e24 1 1 1 2 2 2 2 2 2 2
5f862e8d85b71e1791feb730 1 1 1 1 1 1 1 2 2 2
5c548bf9aba7d60001f1f41a 1 1 1 2 2 2 2 2 2 2
5fcfca16b5f39515037d6724 2 1 2 1 2 1 1 2 2 2
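
An alternative to building the column names with paste is a tidyselect helper. The sketch below uses a small stand-in for the oTree export with hypothetical values and only three of the ten choice columns; it assumes dplyr and tibble are installed.

```r
library(dplyr)
library(tibble)

# Toy stand-in for the oTree export (hypothetical values)
toy <- tibble(
  prolific_id.1.player.prolific_id = c("id1", "id2"),
  prolific_MPL.1.player.HL_1       = c(1, 2),
  prolific_MPL.1.player.HL_2       = c(1, 1),
  prolific_MPL.1.player.HL_10      = c(2, 2),
  prolific_MPL.1.player.HL         = c(NA, 2),
  prolific_MPL.1.player.age        = c(20, 25)
)

# Keep the ID and every column whose name starts with the HL_ prefix
sel <- toy %>%
  select(
    prolific_id.1.player.prolific_id,
    starts_with("prolific_MPL.1.player.HL_")
  )
names(sel)
```

Because the prefix ends with an underscore, the column prolific_MPL.1.player.HL (without a choice number) is not selected.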

4.6 Join

  • We join the two databases
    • Use the common column given by the prolific ID
      • Important: to join with by="prolificID", the key must have the same name in the two databases
        • We therefore rename the ID column in both databases
d.1.sel <- 
d.1.sel %>% 
rename(prolificID=prolific_id.1.player.prolific_id)
d.2 <- 
d.2 %>% 
rename(prolificID=participant_id)
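
Renaming keeps the later code readable, but it is not the only option: dplyr joins also accept a named vector in by that maps the key name in the left table to the key name in the right table. A minimal sketch on hypothetical stand-ins for the two databases, assuming dplyr and tibble are installed:

```r
library(dplyr)
library(tibble)

# Hypothetical stand-ins for the oTree and Prolific tables
otree <- tibble(
  prolific_id.1.player.prolific_id = c("id1", "id2"),
  HL_1 = c(1, 2)
)
prolific <- tibble(
  participant_id = c("id2", "id3"),
  age = c(25, 30)
)

# by = c("left key" = "right key") joins on columns with different names
joined <- left_join(
  otree, prolific,
  by = c("prolific_id.1.player.prolific_id" = "participant_id")
)
joined
```

The key column keeps the name it has in the left table, and participants without a match in the right table get NA in the added columns.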

4.7 Join (ii)

  • Now we can join
    • We only want information about those who completed the task according to the oTree DB
d <- 
  left_join(
          d.1.sel,
          d.2,
          by="prolificID"
          )
prolificID prolific_MPL.1.player.HL_1 prolific_MPL.1.player.HL_2 prolific_MPL.1.player.HL_3 prolific_MPL.1.player.HL_4 prolific_MPL.1.player.HL_5 prolific_MPL.1.player.HL_6 prolific_MPL.1.player.HL_7 prolific_MPL.1.player.HL_8 prolific_MPL.1.player.HL_9 prolific_MPL.1.player.HL_10 session_id status started_datetime completed_date_time time_taken age num_approvals num_rejections prolific_score reviewed_at_datetime entered_code Country of Birth Current Country of Residence Employment Status First Language Nationality Sex Student Status
5fb1bdd5dc59737020cdfb5e 2 2 1 1 2 1 2 2 2 1 601d15de0fddd503b8c494d0 AWAITING REVIEW 2021-02-05 09:54:38 2021-02-05 09:59:19 280.907 20 25 0 100 NA 7D04665D Poland Poland Unemployed (and job seeking) DATA EXPIRED Poland Female Yes
5f51292bd904fa31bd519280 1 1 1 1 2 2 2 2 2 2 601d1606c12f23058f90508d AWAITING REVIEW 2021-02-05 09:55:21 2021-02-05 09:59:13 231.871 25 27 0 100 NA 7D04665D Iran Italy Unemployed (and job seeking) DATA EXPIRED Iran Male Yes
60193745c8295b026e58ffb0 1 1 1 1 1 1 1 2 2 2 601d1618dca960058fab427d AWAITING REVIEW 2021-02-05 09:55:35 2021-02-05 10:02:55 439.302 19 0 0 100 NA 7D04665D Poland NA Unemployed (and job seeking) NA Poland Male Yes
5d95b8fb7c6bbc0013674753 2 1 1 2 2 1 2 1 1 2 601d16191b77080528217123 AWAITING REVIEW 2021-02-05 09:55:37 2021-02-05 09:58:50 192.845 30 97 1 100 NA 7D04665D Italy Italy Unemployed (and job seeking) Italian Italy Female Yes
5fba5e6d9d68a19e88fe7ef5 1 1 1 1 1 2 2 2 2 2 601d161da622f60525b5ffd3 AWAITING REVIEW 2021-02-05 09:55:41 2021-02-05 10:01:00 318.809 23 15 0 100 NA 7D04665D Greece Greece Unemployed (and job seeking) NA Greece Female Yes
5f560c6dc8978c2120ef359c 1 1 1 1 1 2 2 2 2 2 601d162069911a051dbfd26c AWAITING REVIEW 2021-02-05 09:55:44 2021-02-05 09:58:21 157.279 26 10 0 100 NA 7D04665D Estonia Estonia Full-Time NA Estonia Male No
5fe0d5101f2ef9a760961e24 1 1 1 2 2 2 2 2 2 2 601d162a32e53504a7ffcaa2 AWAITING REVIEW 2021-02-05 09:55:54 2021-02-05 09:59:54 239.591 39 11 0 100 NA 7D04665D Italy Italy Full-Time DATA EXPIRED Italy Male No
5f862e8d85b71e1791feb730 1 1 1 1 1 1 1 2 2 2 601d16350225a805953edcc0 AWAITING REVIEW 2021-02-05 09:56:05 2021-02-05 10:02:58 413.000 30 13 0 100 NA 7D04665D Italy Italy Full-Time DATA EXPIRED Italy Male Yes
5c548bf9aba7d60001f1f41a 1 1 1 2 2 2 2 2 2 2 601d1644c67525051b837656 AWAITING REVIEW 2021-02-05 09:56:26 2021-02-05 10:01:52 326.129 21 82 2 98 NA 7D04665D Poland Poland Unemployed (and job seeking) Polish Poland Male Yes
5fcfca16b5f39515037d6724 2 1 2 1 2 1 1 2 2 2 601d18a0a0bd00058f79b0b2 AWAITING REVIEW 2021-02-05 10:10:32 2021-02-05 10:16:32 360.250 28 22 1 99 NA 7D04665D South Africa South Africa Full-Time DATA EXPIRED South Africa Female No

4.8 Gather

  • We want to create a DB with all choices in MPL in a column
    • Column Prospect identifies the prospect
    • Column Choice identifies the choice (for that prospect)
      • Useful for data analysis
d.g <- 
d %>% 
select(prolificID,prolific_MPL.1.player.HL_1:prolific_MPL.1.player.HL_10) %>%
  pivot_longer(names_to="Prospect",values_to="Choice",2:11)
prolificID Prospect Choice
5fb1bdd5dc59737020cdfb5e prolific_MPL.1.player.HL_1 2
5fb1bdd5dc59737020cdfb5e prolific_MPL.1.player.HL_2 2
5fb1bdd5dc59737020cdfb5e prolific_MPL.1.player.HL_3 1
5fb1bdd5dc59737020cdfb5e prolific_MPL.1.player.HL_4 1
5fb1bdd5dc59737020cdfb5e prolific_MPL.1.player.HL_5 2
5fb1bdd5dc59737020cdfb5e prolific_MPL.1.player.HL_6 1
5fb1bdd5dc59737020cdfb5e prolific_MPL.1.player.HL_7 2
5fb1bdd5dc59737020cdfb5e prolific_MPL.1.player.HL_8 2
5fb1bdd5dc59737020cdfb5e prolific_MPL.1.player.HL_9 2
5fb1bdd5dc59737020cdfb5e prolific_MPL.1.player.HL_10 1
5f51292bd904fa31bd519280 prolific_MPL.1.player.HL_1 1
5f51292bd904fa31bd519280 prolific_MPL.1.player.HL_2 1
5f51292bd904fa31bd519280 prolific_MPL.1.player.HL_3 1
5f51292bd904fa31bd519280 prolific_MPL.1.player.HL_4 1
5f51292bd904fa31bd519280 prolific_MPL.1.player.HL_5 2
5f51292bd904fa31bd519280 prolific_MPL.1.player.HL_6 2
5f51292bd904fa31bd519280 prolific_MPL.1.player.HL_7 2
5f51292bd904fa31bd519280 prolific_MPL.1.player.HL_8 2
5f51292bd904fa31bd519280 prolific_MPL.1.player.HL_9 2
5f51292bd904fa31bd519280 prolific_MPL.1.player.HL_10 2
60193745c8295b026e58ffb0 prolific_MPL.1.player.HL_1 1
60193745c8295b026e58ffb0 prolific_MPL.1.player.HL_2 1
60193745c8295b026e58ffb0 prolific_MPL.1.player.HL_3 1
60193745c8295b026e58ffb0 prolific_MPL.1.player.HL_4 1
60193745c8295b026e58ffb0 prolific_MPL.1.player.HL_5 1
60193745c8295b026e58ffb0 prolific_MPL.1.player.HL_6 1
60193745c8295b026e58ffb0 prolific_MPL.1.player.HL_7 1
60193745c8295b026e58ffb0 prolific_MPL.1.player.HL_8 2
60193745c8295b026e58ffb0 prolific_MPL.1.player.HL_9 2
60193745c8295b026e58ffb0 prolific_MPL.1.player.HL_10 2
5d95b8fb7c6bbc0013674753 prolific_MPL.1.player.HL_1 2
5d95b8fb7c6bbc0013674753 prolific_MPL.1.player.HL_2 1
5d95b8fb7c6bbc0013674753 prolific_MPL.1.player.HL_3 1
5d95b8fb7c6bbc0013674753 prolific_MPL.1.player.HL_4 2
5d95b8fb7c6bbc0013674753 prolific_MPL.1.player.HL_5 2
5d95b8fb7c6bbc0013674753 prolific_MPL.1.player.HL_6 1
5d95b8fb7c6bbc0013674753 prolific_MPL.1.player.HL_7 2
5d95b8fb7c6bbc0013674753 prolific_MPL.1.player.HL_8 1
5d95b8fb7c6bbc0013674753 prolific_MPL.1.player.HL_9 1
5d95b8fb7c6bbc0013674753 prolific_MPL.1.player.HL_10 2
5fba5e6d9d68a19e88fe7ef5 prolific_MPL.1.player.HL_1 1
5fba5e6d9d68a19e88fe7ef5 prolific_MPL.1.player.HL_2 1
5fba5e6d9d68a19e88fe7ef5 prolific_MPL.1.player.HL_3 1
5fba5e6d9d68a19e88fe7ef5 prolific_MPL.1.player.HL_4 1
5fba5e6d9d68a19e88fe7ef5 prolific_MPL.1.player.HL_5 1
5fba5e6d9d68a19e88fe7ef5 prolific_MPL.1.player.HL_6 2
5fba5e6d9d68a19e88fe7ef5 prolific_MPL.1.player.HL_7 2
5fba5e6d9d68a19e88fe7ef5 prolific_MPL.1.player.HL_8 2
5fba5e6d9d68a19e88fe7ef5 prolific_MPL.1.player.HL_9 2
5fba5e6d9d68a19e88fe7ef5 prolific_MPL.1.player.HL_10 2
5f560c6dc8978c2120ef359c prolific_MPL.1.player.HL_1 1
5f560c6dc8978c2120ef359c prolific_MPL.1.player.HL_2 1
5f560c6dc8978c2120ef359c prolific_MPL.1.player.HL_3 1
5f560c6dc8978c2120ef359c prolific_MPL.1.player.HL_4 1
5f560c6dc8978c2120ef359c prolific_MPL.1.player.HL_5 1
5f560c6dc8978c2120ef359c prolific_MPL.1.player.HL_6 2
5f560c6dc8978c2120ef359c prolific_MPL.1.player.HL_7 2
5f560c6dc8978c2120ef359c prolific_MPL.1.player.HL_8 2
5f560c6dc8978c2120ef359c prolific_MPL.1.player.HL_9 2
5f560c6dc8978c2120ef359c prolific_MPL.1.player.HL_10 2
5fe0d5101f2ef9a760961e24 prolific_MPL.1.player.HL_1 1
5fe0d5101f2ef9a760961e24 prolific_MPL.1.player.HL_2 1
5fe0d5101f2ef9a760961e24 prolific_MPL.1.player.HL_3 1
5fe0d5101f2ef9a760961e24 prolific_MPL.1.player.HL_4 2
5fe0d5101f2ef9a760961e24 prolific_MPL.1.player.HL_5 2
5fe0d5101f2ef9a760961e24 prolific_MPL.1.player.HL_6 2
5fe0d5101f2ef9a760961e24 prolific_MPL.1.player.HL_7 2
5fe0d5101f2ef9a760961e24 prolific_MPL.1.player.HL_8 2
5fe0d5101f2ef9a760961e24 prolific_MPL.1.player.HL_9 2
5fe0d5101f2ef9a760961e24 prolific_MPL.1.player.HL_10 2
5f862e8d85b71e1791feb730 prolific_MPL.1.player.HL_1 1
5f862e8d85b71e1791feb730 prolific_MPL.1.player.HL_2 1
5f862e8d85b71e1791feb730 prolific_MPL.1.player.HL_3 1
5f862e8d85b71e1791feb730 prolific_MPL.1.player.HL_4 1
5f862e8d85b71e1791feb730 prolific_MPL.1.player.HL_5 1
5f862e8d85b71e1791feb730 prolific_MPL.1.player.HL_6 1
5f862e8d85b71e1791feb730 prolific_MPL.1.player.HL_7 1
5f862e8d85b71e1791feb730 prolific_MPL.1.player.HL_8 2
5f862e8d85b71e1791feb730 prolific_MPL.1.player.HL_9 2
5f862e8d85b71e1791feb730 prolific_MPL.1.player.HL_10 2
5c548bf9aba7d60001f1f41a prolific_MPL.1.player.HL_1 1
5c548bf9aba7d60001f1f41a prolific_MPL.1.player.HL_2 1
5c548bf9aba7d60001f1f41a prolific_MPL.1.player.HL_3 1
5c548bf9aba7d60001f1f41a prolific_MPL.1.player.HL_4 2
5c548bf9aba7d60001f1f41a prolific_MPL.1.player.HL_5 2
5c548bf9aba7d60001f1f41a prolific_MPL.1.player.HL_6 2
5c548bf9aba7d60001f1f41a prolific_MPL.1.player.HL_7 2
5c548bf9aba7d60001f1f41a prolific_MPL.1.player.HL_8 2
5c548bf9aba7d60001f1f41a prolific_MPL.1.player.HL_9 2
5c548bf9aba7d60001f1f41a prolific_MPL.1.player.HL_10 2
5fcfca16b5f39515037d6724 prolific_MPL.1.player.HL_1 2
5fcfca16b5f39515037d6724 prolific_MPL.1.player.HL_2 1
5fcfca16b5f39515037d6724 prolific_MPL.1.player.HL_3 2
5fcfca16b5f39515037d6724 prolific_MPL.1.player.HL_4 1
5fcfca16b5f39515037d6724 prolific_MPL.1.player.HL_5 2
5fcfca16b5f39515037d6724 prolific_MPL.1.player.HL_6 1
5fcfca16b5f39515037d6724 prolific_MPL.1.player.HL_7 1
5fcfca16b5f39515037d6724 prolific_MPL.1.player.HL_8 2
5fcfca16b5f39515037d6724 prolific_MPL.1.player.HL_9 2
5fcfca16b5f39515037d6724 prolific_MPL.1.player.HL_10 2

4.9 Mutate

  • Shorten the Prospect label to its numeric index and convert it to an integer
# Characters 26-27 of the label hold the index that follows "HL_"
d.g <-
  d.g %>%
  mutate(Prospect = substr(Prospect, 26, 27)) %>%
  mutate(Prospect = as.integer(Prospect))
prolificID Prospect Choice
5fb1bdd5dc59737020cdfb5e 1 2
5fb1bdd5dc59737020cdfb5e 2 2
5fb1bdd5dc59737020cdfb5e 3 1
5fb1bdd5dc59737020cdfb5e 4 1
5fb1bdd5dc59737020cdfb5e 5 2
5fb1bdd5dc59737020cdfb5e 6 1
5fb1bdd5dc59737020cdfb5e 7 2
5fb1bdd5dc59737020cdfb5e 8 2
5fb1bdd5dc59737020cdfb5e 9 2
5fb1bdd5dc59737020cdfb5e 10 1
5f51292bd904fa31bd519280 1 1
5f51292bd904fa31bd519280 2 1
5f51292bd904fa31bd519280 3 1
5f51292bd904fa31bd519280 4 1
5f51292bd904fa31bd519280 5 2
5f51292bd904fa31bd519280 6 2
5f51292bd904fa31bd519280 7 2
5f51292bd904fa31bd519280 8 2
5f51292bd904fa31bd519280 9 2
5f51292bd904fa31bd519280 10 2
60193745c8295b026e58ffb0 1 1
60193745c8295b026e58ffb0 2 1
60193745c8295b026e58ffb0 3 1
60193745c8295b026e58ffb0 4 1
60193745c8295b026e58ffb0 5 1
60193745c8295b026e58ffb0 6 1
60193745c8295b026e58ffb0 7 1
60193745c8295b026e58ffb0 8 2
60193745c8295b026e58ffb0 9 2
60193745c8295b026e58ffb0 10 2
5d95b8fb7c6bbc0013674753 1 2
5d95b8fb7c6bbc0013674753 2 1
5d95b8fb7c6bbc0013674753 3 1
5d95b8fb7c6bbc0013674753 4 2
5d95b8fb7c6bbc0013674753 5 2
5d95b8fb7c6bbc0013674753 6 1
5d95b8fb7c6bbc0013674753 7 2
5d95b8fb7c6bbc0013674753 8 1
5d95b8fb7c6bbc0013674753 9 1
5d95b8fb7c6bbc0013674753 10 2
5fba5e6d9d68a19e88fe7ef5 1 1
5fba5e6d9d68a19e88fe7ef5 2 1
5fba5e6d9d68a19e88fe7ef5 3 1
5fba5e6d9d68a19e88fe7ef5 4 1
5fba5e6d9d68a19e88fe7ef5 5 1
5fba5e6d9d68a19e88fe7ef5 6 2
5fba5e6d9d68a19e88fe7ef5 7 2
5fba5e6d9d68a19e88fe7ef5 8 2
5fba5e6d9d68a19e88fe7ef5 9 2
5fba5e6d9d68a19e88fe7ef5 10 2
5f560c6dc8978c2120ef359c 1 1
5f560c6dc8978c2120ef359c 2 1
5f560c6dc8978c2120ef359c 3 1
5f560c6dc8978c2120ef359c 4 1
5f560c6dc8978c2120ef359c 5 1
5f560c6dc8978c2120ef359c 6 2
5f560c6dc8978c2120ef359c 7 2
5f560c6dc8978c2120ef359c 8 2
5f560c6dc8978c2120ef359c 9 2
5f560c6dc8978c2120ef359c 10 2
5fe0d5101f2ef9a760961e24 1 1
5fe0d5101f2ef9a760961e24 2 1
5fe0d5101f2ef9a760961e24 3 1
5fe0d5101f2ef9a760961e24 4 2
5fe0d5101f2ef9a760961e24 5 2
5fe0d5101f2ef9a760961e24 6 2
5fe0d5101f2ef9a760961e24 7 2
5fe0d5101f2ef9a760961e24 8 2
5fe0d5101f2ef9a760961e24 9 2
5fe0d5101f2ef9a760961e24 10 2
5f862e8d85b71e1791feb730 1 1
5f862e8d85b71e1791feb730 2 1
5f862e8d85b71e1791feb730 3 1
5f862e8d85b71e1791feb730 4 1
5f862e8d85b71e1791feb730 5 1
5f862e8d85b71e1791feb730 6 1
5f862e8d85b71e1791feb730 7 1
5f862e8d85b71e1791feb730 8 2
5f862e8d85b71e1791feb730 9 2
5f862e8d85b71e1791feb730 10 2
5c548bf9aba7d60001f1f41a 1 1
5c548bf9aba7d60001f1f41a 2 1
5c548bf9aba7d60001f1f41a 3 1
5c548bf9aba7d60001f1f41a 4 2
5c548bf9aba7d60001f1f41a 5 2
5c548bf9aba7d60001f1f41a 6 2
5c548bf9aba7d60001f1f41a 7 2
5c548bf9aba7d60001f1f41a 8 2
5c548bf9aba7d60001f1f41a 9 2
5c548bf9aba7d60001f1f41a 10 2
5fcfca16b5f39515037d6724 1 2
5fcfca16b5f39515037d6724 2 1
5fcfca16b5f39515037d6724 3 2
5fcfca16b5f39515037d6724 4 1
5fcfca16b5f39515037d6724 5 2
5fcfca16b5f39515037d6724 6 1
5fcfca16b5f39515037d6724 7 1
5fcfca16b5f39515037d6724 8 2
5fcfca16b5f39515037d6724 9 2
5fcfca16b5f39515037d6724 10 2
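
The fixed positions 26-27 work because every label shares the prefix "prolific_MPL.1.player.HL_". As a minimal sketch of an alternative that does not depend on the prefix length, the index can be taken as whatever follows the last underscore. The data frame `toy` and its IDs are hypothetical stand-ins for d.g, not part of the original data.

```r
library(dplyr)

# Hypothetical two-row stand-in for d.g in long format (IDs are made up)
toy <- data.frame(
  prolificID = c("p01", "p01"),
  Prospect   = c("prolific_MPL.1.player.HL_1", "prolific_MPL.1.player.HL_10"),
  Choice     = c(2L, 1L),
  stringsAsFactors = FALSE
)

toy_idx <- toy %>%
  # Drop everything up to and including the last underscore, then convert
  mutate(Prospect = as.integer(sub(".*_", "", Prospect)))

toy_idx$Prospect
```

On these labels the result matches the `substr(Prospect, 26, 27)` approach; the pattern-based version would keep working if the app or round prefix changed length.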
  • Create a new variable with the following content
    • 1 → A; 2 → B
# Label the choice: 1 becomes "A", any other value becomes "B"
d.g <-
  d.g %>%
  mutate(Choice_2 = ifelse(Choice == 1, "A", "B"))
prolificID Prospect Choice Choice_2
5fb1bdd5dc59737020cdfb5e 1 2 B
5fb1bdd5dc59737020cdfb5e 2 2 B
5fb1bdd5dc59737020cdfb5e 3 1 A
5fb1bdd5dc59737020cdfb5e 4 1 A
5fb1bdd5dc59737020cdfb5e 5 2 B
5fb1bdd5dc59737020cdfb5e 6 1 A
5fb1bdd5dc59737020cdfb5e 7 2 B
5fb1bdd5dc59737020cdfb5e 8 2 B
5fb1bdd5dc59737020cdfb5e 9 2 B
5fb1bdd5dc59737020cdfb5e 10 1 A
5f51292bd904fa31bd519280 1 1 A
5f51292bd904fa31bd519280 2 1 A
5f51292bd904fa31bd519280 3 1 A
5f51292bd904fa31bd519280 4 1 A
5f51292bd904fa31bd519280 5 2 B
5f51292bd904fa31bd519280 6 2 B
5f51292bd904fa31bd519280 7 2 B
5f51292bd904fa31bd519280 8 2 B
5f51292bd904fa31bd519280 9 2 B
5f51292bd904fa31bd519280 10 2 B
60193745c8295b026e58ffb0 1 1 A
60193745c8295b026e58ffb0 2 1 A
60193745c8295b026e58ffb0 3 1 A
60193745c8295b026e58ffb0 4 1 A
60193745c8295b026e58ffb0 5 1 A
60193745c8295b026e58ffb0 6 1 A
60193745c8295b026e58ffb0 7 1 A
60193745c8295b026e58ffb0 8 2 B
60193745c8295b026e58ffb0 9 2 B
60193745c8295b026e58ffb0 10 2 B
5d95b8fb7c6bbc0013674753 1 2 B
5d95b8fb7c6bbc0013674753 2 1 A
5d95b8fb7c6bbc0013674753 3 1 A
5d95b8fb7c6bbc0013674753 4 2 B
5d95b8fb7c6bbc0013674753 5 2 B
5d95b8fb7c6bbc0013674753 6 1 A
5d95b8fb7c6bbc0013674753 7 2 B
5d95b8fb7c6bbc0013674753 8 1 A
5d95b8fb7c6bbc0013674753 9 1 A
5d95b8fb7c6bbc0013674753 10 2 B
5fba5e6d9d68a19e88fe7ef5 1 1 A
5fba5e6d9d68a19e88fe7ef5 2 1 A
5fba5e6d9d68a19e88fe7ef5 3 1 A
5fba5e6d9d68a19e88fe7ef5 4 1 A
5fba5e6d9d68a19e88fe7ef5 5 1 A
5fba5e6d9d68a19e88fe7ef5 6 2 B
5fba5e6d9d68a19e88fe7ef5 7 2 B
5fba5e6d9d68a19e88fe7ef5 8 2 B
5fba5e6d9d68a19e88fe7ef5 9 2 B
5fba5e6d9d68a19e88fe7ef5 10 2 B
5f560c6dc8978c2120ef359c 1 1 A
5f560c6dc8978c2120ef359c 2 1 A
5f560c6dc8978c2120ef359c 3 1 A
5f560c6dc8978c2120ef359c 4 1 A
5f560c6dc8978c2120ef359c 5 1 A
5f560c6dc8978c2120ef359c 6 2 B
5f560c6dc8978c2120ef359c 7 2 B
5f560c6dc8978c2120ef359c 8 2 B
5f560c6dc8978c2120ef359c 9 2 B
5f560c6dc8978c2120ef359c 10 2 B
5fe0d5101f2ef9a760961e24 1 1 A
5fe0d5101f2ef9a760961e24 2 1 A
5fe0d5101f2ef9a760961e24 3 1 A
5fe0d5101f2ef9a760961e24 4 2 B
5fe0d5101f2ef9a760961e24 5 2 B
5fe0d5101f2ef9a760961e24 6 2 B
5fe0d5101f2ef9a760961e24 7 2 B
5fe0d5101f2ef9a760961e24 8 2 B
5fe0d5101f2ef9a760961e24 9 2 B
5fe0d5101f2ef9a760961e24 10 2 B
5f862e8d85b71e1791feb730 1 1 A
5f862e8d85b71e1791feb730 2 1 A
5f862e8d85b71e1791feb730 3 1 A
5f862e8d85b71e1791feb730 4 1 A
5f862e8d85b71e1791feb730 5 1 A
5f862e8d85b71e1791feb730 6 1 A
5f862e8d85b71e1791feb730 7 1 A
5f862e8d85b71e1791feb730 8 2 B
5f862e8d85b71e1791feb730 9 2 B
5f862e8d85b71e1791feb730 10 2 B
5c548bf9aba7d60001f1f41a 1 1 A
5c548bf9aba7d60001f1f41a 2 1 A
5c548bf9aba7d60001f1f41a 3 1 A
5c548bf9aba7d60001f1f41a 4 2 B
5c548bf9aba7d60001f1f41a 5 2 B
5c548bf9aba7d60001f1f41a 6 2 B
5c548bf9aba7d60001f1f41a 7 2 B
5c548bf9aba7d60001f1f41a 8 2 B
5c548bf9aba7d60001f1f41a 9 2 B
5c548bf9aba7d60001f1f41a 10 2 B
5fcfca16b5f39515037d6724 1 2 B
5fcfca16b5f39515037d6724 2 1 A
5fcfca16b5f39515037d6724 3 2 B
5fcfca16b5f39515037d6724 4 1 A
5fcfca16b5f39515037d6724 5 2 B
5fcfca16b5f39515037d6724 6 1 A
5fcfca16b5f39515037d6724 7 1 A
5fcfca16b5f39515037d6724 8 2 B
5fcfca16b5f39515037d6724 9 2 B
5fcfca16b5f39515037d6724 10 2 B
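
Note that `ifelse(Choice==1,"A","B")` labels every value other than 1 as "B". The data shown here contain only 1 and 2, so this is harmless. As a hedged sketch for data where other codes might appear, `case_when()` can map each code explicitly and leave anything unexpected as missing. The data frame `toy` and the column names `Choice_ifelse` and `Choice_strict` are hypothetical.

```r
library(dplyr)

# Hypothetical choices including an unexpected code (3) and a missing value
toy <- data.frame(Choice = c(1L, 2L, 3L, NA))

toy <- toy %>%
  mutate(
    # Same rule as in the slides: anything that is not 1 becomes "B"
    Choice_ifelse = ifelse(Choice == 1, "A", "B"),
    # Explicit mapping: codes other than 1 and 2 become NA
    Choice_strict = case_when(
      Choice == 1 ~ "A",
      Choice == 2 ~ "B",
      TRUE ~ NA_character_
    )
  )

toy
```

The two columns differ only on the unexpected code 3, which `ifelse()` labels "B" and the explicit mapping leaves as NA.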

4.10 Arrange

  • To perform a quick visual inspection, arrange by Prospect and Choice_2
# Sort rows by Prospect first, then by Choice_2 within each Prospect
d.g <-
  d.g %>%
  arrange(Prospect, Choice_2)
prolificID Prospect Choice Choice_2
5f51292bd904fa31bd519280 1 1 A
60193745c8295b026e58ffb0 1 1 A
5fba5e6d9d68a19e88fe7ef5 1 1 A
5f560c6dc8978c2120ef359c 1 1 A
5fe0d5101f2ef9a760961e24 1 1 A
5f862e8d85b71e1791feb730 1 1 A
5c548bf9aba7d60001f1f41a 1 1 A
5fb1bdd5dc59737020cdfb5e 1 2 B
5d95b8fb7c6bbc0013674753 1 2 B
5fcfca16b5f39515037d6724 1 2 B
5f51292bd904fa31bd519280 2 1 A
60193745c8295b026e58ffb0 2 1 A
5d95b8fb7c6bbc0013674753 2 1 A
5fba5e6d9d68a19e88fe7ef5 2 1 A
5f560c6dc8978c2120ef359c 2 1 A
5fe0d5101f2ef9a760961e24 2 1 A
5f862e8d85b71e1791feb730 2 1 A
5c548bf9aba7d60001f1f41a 2 1 A
5fcfca16b5f39515037d6724 2 1 A
5fb1bdd5dc59737020cdfb5e 2 2 B
5fb1bdd5dc59737020cdfb5e 3 1 A
5f51292bd904fa31bd519280 3 1 A
60193745c8295b026e58ffb0 3 1 A
5d95b8fb7c6bbc0013674753 3 1 A
5fba5e6d9d68a19e88fe7ef5 3 1 A
5f560c6dc8978c2120ef359c 3 1 A
5fe0d5101f2ef9a760961e24 3 1 A
5f862e8d85b71e1791feb730 3 1 A
5c548bf9aba7d60001f1f41a 3 1 A
5fcfca16b5f39515037d6724 3 2 B
5fb1bdd5dc59737020cdfb5e 4 1 A
5f51292bd904fa31bd519280 4 1 A
60193745c8295b026e58ffb0 4 1 A
5fba5e6d9d68a19e88fe7ef5 4 1 A
5f560c6dc8978c2120ef359c 4 1 A
5f862e8d85b71e1791feb730 4 1 A
5fcfca16b5f39515037d6724 4 1 A
5d95b8fb7c6bbc0013674753 4 2 B
5fe0d5101f2ef9a760961e24 4 2 B
5c548bf9aba7d60001f1f41a 4 2 B
60193745c8295b026e58ffb0 5 1 A
5fba5e6d9d68a19e88fe7ef5 5 1 A
5f560c6dc8978c2120ef359c 5 1 A
5f862e8d85b71e1791feb730 5 1 A
5fb1bdd5dc59737020cdfb5e 5 2 B
5f51292bd904fa31bd519280 5 2 B
5d95b8fb7c6bbc0013674753 5 2 B
5fe0d5101f2ef9a760961e24 5 2 B
5c548bf9aba7d60001f1f41a 5 2 B
5fcfca16b5f39515037d6724 5 2 B
5fb1bdd5dc59737020cdfb5e 6 1 A
60193745c8295b026e58ffb0 6 1 A
5d95b8fb7c6bbc0013674753 6 1 A
5f862e8d85b71e1791feb730 6 1 A
5fcfca16b5f39515037d6724 6 1 A
5f51292bd904fa31bd519280 6 2 B
5fba5e6d9d68a19e88fe7ef5 6 2 B
5f560c6dc8978c2120ef359c 6 2 B
5fe0d5101f2ef9a760961e24 6 2 B
5c548bf9aba7d60001f1f41a 6 2 B
60193745c8295b026e58ffb0 7 1 A
5f862e8d85b71e1791feb730 7 1 A
5fcfca16b5f39515037d6724 7 1 A
5fb1bdd5dc59737020cdfb5e 7 2 B
5f51292bd904fa31bd519280 7 2 B
5d95b8fb7c6bbc0013674753 7 2 B
5fba5e6d9d68a19e88fe7ef5 7 2 B
5f560c6dc8978c2120ef359c 7 2 B
5fe0d5101f2ef9a760961e24 7 2 B
5c548bf9aba7d60001f1f41a 7 2 B
5d95b8fb7c6bbc0013674753 8 1 A
5fb1bdd5dc59737020cdfb5e 8 2 B
5f51292bd904fa31bd519280 8 2 B
60193745c8295b026e58ffb0 8 2 B
5fba5e6d9d68a19e88fe7ef5 8 2 B
5f560c6dc8978c2120ef359c 8 2 B
5fe0d5101f2ef9a760961e24 8 2 B
5f862e8d85b71e1791feb730 8 2 B
5c548bf9aba7d60001f1f41a 8 2 B
5fcfca16b5f39515037d6724 8 2 B
5d95b8fb7c6bbc0013674753 9 1 A
5fb1bdd5dc59737020cdfb5e 9 2 B
5f51292bd904fa31bd519280 9 2 B
60193745c8295b026e58ffb0 9 2 B
5fba5e6d9d68a19e88fe7ef5 9 2 B
5f560c6dc8978c2120ef359c 9 2 B
5fe0d5101f2ef9a760961e24 9 2 B
5f862e8d85b71e1791feb730 9 2 B
5c548bf9aba7d60001f1f41a 9 2 B
5fcfca16b5f39515037d6724 9 2 B
5fb1bdd5dc59737020cdfb5e 10 1 A
5f51292bd904fa31bd519280 10 2 B
60193745c8295b026e58ffb0 10 2 B
5d95b8fb7c6bbc0013674753 10 2 B
5fba5e6d9d68a19e88fe7ef5 10 2 B
5f560c6dc8978c2120ef359c 10 2 B
5fe0d5101f2ef9a760961e24 10 2 B
5f862e8d85b71e1791feb730 10 2 B
5c548bf9aba7d60001f1f41a 10 2 B
5fcfca16b5f39515037d6724 10 2 B
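
With many rows, a sorted table can be complemented by a tally. The following is a minimal sketch, not part of the original workflow, that sorts a small hypothetical data frame in the same way and then counts how many participants chose A or B for each Prospect. The data frame `toy` and its values are made up for illustration.

```r
library(dplyr)

# Hypothetical long-format data: three participants, two prospects
toy <- data.frame(
  prolificID = c("p01", "p02", "p03", "p01", "p02", "p03"),
  Prospect   = c(2L, 2L, 2L, 1L, 1L, 1L),
  Choice_2   = c("B", "A", "A", "A", "B", "A"),
  stringsAsFactors = FALSE
)

# Same ordering as above: by Prospect, then by Choice_2
toy_sorted <- toy %>% arrange(Prospect, Choice_2)

# One row per Prospect and Choice_2 combination; the count is stored in n
toy_counts <- toy %>% count(Prospect, Choice_2)

toy_counts
```

`count()` returns one row per observed combination, so a combination that nobody chose is absent from the result rather than listed with a count of zero.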
  • We are ready to perform our analysis!

5 Appendix

5.1 References

Wickham, Hadley. 2014. “Tidy Data.” The Journal of Statistical Software 59. http://www.jstatsoft.org/v59/i10/.
Wickham, Hadley, and Garrett Grolemund. 2016. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media, Inc.

Footnotes

  1. A condensed online version is available here.