Experimental Economics: Data Workflow
Lecture 1: Introduction
1 Our Course
1.1 The Data Workflow
- The course deals with the design of an experiment in social sciences and the management of the related data workflow
- From Design to Data Organization
1.2 The Data Workflow (ii)
- Design
- Methodological aspects of the design of an economic experiment are discussed in the classroom
- Registration
- Illustration of the pre-registration process on OSF with specific attention to hypotheses and data analysis
- Programming
- Introduction to oTree architecture and examples of experimental apps
- Data collection
- Recruitment of participants via Prolific and online deployment on render.com
- Data organization
- Collection, organization and documentation of data in R
1.3 Prerequisites
- Basic knowledge of experimental economics.
- Basic knowledge of computer programming.
- Access to a computer.
- Installation of oTree (open source and free of charge).
- Installation of R (open source and free of charge) and RStudio (open source and free of charge).
2 Software Requirements
2.1 Editor
- I am using Visual Studio Code on macOS
- It is free and open source
- It is cross-platform
- I will use it for oTree, R and GitHub
- It handles Python, R and Markdown files
- However, any text editor can be used
2.2 oTree
2.3 oTree: Code
- Programming language of oTree is Python
- Popular object-oriented programming language
- Developed in the early 1990s by Guido van Rossum
- oTree’s user interface is based on HTML5
- Supported by modern browsers and rich in functionality
- Can be enriched with
- CSS
- JavaScript
- Bootstrap
- …
- All the components of oTree are free and open-source
2.4 oTree: Functioning
- The basic setup consists of
- An app (experiment) written within oTree
- A server computer
- Cloud server, local PC …
- Subjects’ devices with a browser
- PC, Laptop, Tablet, Mobile Phone …
- oTree creates a session on the server and generates links for all participants
- Participants click on the links and are sent to a personal page
- They submit their answers, which are collected by the server
- The experimenter can check the progress on the server
2.5 R
- R is a programming language and software environment for statistical computing and graphics
- Developed in 1993 by Ross Ihaka and Robert Gentleman (Ihaka and Gentleman 1996)
- It is free and open source
- It is cross-platform
R version 4.2.2 (2022-10-31) -- "Innocent and Trusting"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin17.0 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
2.6 Your data in good shape
Data Wrangling is “the art of getting your data into R in a useful form for visualization and modeling” (Wickham and Grolemund 2016)
We rely on the tidyverse library
- The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures.
library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.4.0      ✔ purrr   1.0.1
✔ tibble  3.1.8      ✔ dplyr   1.0.10
✔ tidyr   1.3.0      ✔ stringr 1.5.0
✔ readr   2.1.3      ✔ forcats 0.5.2
2.7 GitHub
- The material of the course is available on GitHub
- The repository is here
- You can download the material from there
- GitHub is a web-based hosting service for version control using Git
- It offers all of the distributed version control and source code management functionality of Git as well as adding its own features
- It provides access control and several collaboration features such as bug tracking, feature requests, task management, and wikis for every project
- Students will also submit their code via GitHub
- You must open a GitHub account!
2.8 What is Git?
- Git is a version control system (VCS)
- It is a tool to manage your source code history
- It allows you to track changes in your files
- It allows you to collaborate with other developers
- Consult the book Pro Git (Chacon and Straub 2014) for more information
- Version control is a system that records changes to a file or set of files over time
- A common approach
- My_thesis.tex → My_thesis_1.tex → My_thesis_2.tex → … My_thesis_29.tex
- Messy!
- In local version control systems (VCSs), a database keeps track of changes to files
- Fully trackable
- Reversible changes
2.9 The lifecycle of a file in Git
- A file in a directory can be either tracked or untracked.
- Tracked files are files that Git knows about
- They can be unmodified, modified, or staged
- Untracked files are everything else
- They can be tracked by adding them to the staging area
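The states listed above can be observed directly with `git status --porcelain`, which prints a two-letter code per file. The following is a minimal sketch, assuming Git is installed; the temporary directory and the "Student" identity are placeholders introduced only for this illustration.

```shell
set -e
# Work in a throwaway directory so that no existing repository is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity (an assumption), needed only so that "git commit" works here.
git config user.name "Student"
git config user.email "student@example.com"

echo "Hello" > Hello_world.txt
untracked=$(git status --porcelain)    # "??" marks an untracked file

git add Hello_world.txt
staged=$(git status --porcelain)       # "A " marks a newly staged file

git commit -q -m "First commit"
unmodified=$(git status --porcelain)   # empty: the file is tracked and unmodified

echo "World!" >> Hello_world.txt
modified=$(git status --porcelain)     # " M" marks a tracked file with unstaged changes

printf 'untracked: [%s]\n' "$untracked"
printf 'staged: [%s]\n' "$staged"
printf 'unmodified: [%s]\n' "$unmodified"
printf 'modified: [%s]\n' "$modified"
```

The `--porcelain` flag is used instead of the plain `git status` output because its format is designed to stay stable for scripts.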
2.10 Git: initialize and commit
- Open a terminal and move to your working directory
cd PATH_TO_YOUR_WORKING_DIRECTORY
- In this example, the directory contains a file ‘Hello_world.txt’ whose only line is “Hello”
- 1) Initialize a Git repository
git init
- 2) Add the file to the staging area
git add Hello_world.txt
- 3) Commit the file
git commit -m "First commit"
- If we modify the file by adding a new line “World!”, git status reports the change
git status
- modified: Hello_world.txt
- Redo steps 2 and 3
git add Hello_world.txt
- If you are happy :) with the change → commit it
git commit -m "Added World!"
- If you are unhappy :( with the change, unstage it and discard it
git reset HEAD Hello_world.txt
git checkout Hello_world.txt
- You are back to the previous version of the file
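The "unhappy" path can be checked end to end in a scratch repository. This is a sketch under assumptions: the temporary directory and the "Student" identity are placeholders, and `git checkout -- Hello_world.txt` is used as the explicit form of the checkout command (the `--` separates the file name from branch names).

```shell
set -e
# Scratch repository in a temporary directory.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity (an assumption) so that commits work in this sandbox.
git config user.name "Student"
git config user.email "student@example.com"

echo "Hello" > Hello_world.txt
git add Hello_world.txt
git commit -q -m "First commit"

# Make a change and stage it, then decide we do not want it.
echo "World!" >> Hello_world.txt
git add Hello_world.txt

# Step 1: unstage. The file on disk still contains the new line.
git reset -q HEAD Hello_world.txt
after_reset=$(cat Hello_world.txt)

# Step 2: discard the change in the working directory.
git checkout -- Hello_world.txt
after_checkout=$(cat Hello_world.txt)

printf 'after reset:\n%s\n' "$after_reset"
printf 'after checkout:\n%s\n' "$after_checkout"
```

Note that the second step permanently deletes the uncommitted edit, so it is worth running `git status` first.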
2.11 Git: branching
- There is a main line of development
- You can diverge from it to “explore” new versions
- It is convenient to develop workflows that branch and merge often
- It is a good practice to keep the main line of development clean
- It is the version that is used in production
- It is the version that is used to create new branches
- When you make your first commit, a master branch is created; it contains all the commits made so far
- HEAD points to the last commit of the branch you are currently on
- To start a new branch
git branch development
2.12 Basic commands: branching (iii)
- To switch to the new branch
git checkout development
- You are now in the new branch
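A quick way to confirm which branch you are on is sketched below. It assumes Git is installed; the temporary directory and the "Student" identity are placeholders. The name of the initial branch depends on your Git configuration (master in these slides, main on some installations), so the sketch reads it into the hypothetical variable `main_branch` instead of hard-coding it.

```shell
set -e
# Scratch repository in a temporary directory.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity (an assumption) so that commits work in this sandbox.
git config user.name "Student"
git config user.email "student@example.com"

echo "Hello World!" > Hello_world.txt
git add Hello_world.txt
git commit -q -m "First commit"

# Name of the branch created with the first commit.
main_branch=$(git symbolic-ref --short HEAD)

git branch development           # create the new branch
git checkout -q development      # switch to it

current=$(git symbolic-ref --short HEAD)
echo "Started on: $main_branch"
echo "Now on: $current"
git branch --list                # the current branch is marked with '*'
```

Creating a branch does not copy any files: right after `git branch development`, both branch names point at the same commit.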
2.13 Development of a new branch
- If you make new commits now, they are recorded in the development branch
- You can make as many commits as you want
- You can make as many branches as you want
- You can switch between branches as you want
- Master is still pointing at Snapshot C
- Replace the line ‘Hello World!’ with ‘Hello World, I am developing’
git commit -a -m "Modified Hello World"
- Replace that line with ‘Hello World, I am developing a new feature!’
git commit -a -m "Modified again Hello World"
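The claim that the main branch does not move while you commit on development can be verified by comparing commit hashes. A minimal sketch, assuming Git is installed: the temporary directory, the "Student" identity and the variable names (`main_branch`, `base`, `ahead`) are hypothetical, and the first commit stands in for the snapshot the main branch points at.

```shell
set -e
# Scratch repository in a temporary directory.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity (an assumption) so that commits work in this sandbox.
git config user.name "Student"
git config user.email "student@example.com"

echo "Hello World!" > Hello_world.txt
git add Hello_world.txt
git commit -q -m "First commit"
main_branch=$(git symbolic-ref --short HEAD)   # master or main, depending on configuration
base=$(git rev-parse HEAD)                     # commit the main branch points at

git branch development
git checkout -q development

echo "Hello World, I am developing" > Hello_world.txt
git commit -q -a -m "Modified Hello World"
echo "Hello World, I am developing a new feature!" > Hello_world.txt
git commit -q -a -m "Modified again Hello World"

main_tip=$(git rev-parse "$main_branch")                    # unchanged
ahead=$(git rev-list --count "$main_branch"..development)   # commits only on development

echo "Main branch moved: $([ "$main_tip" = "$base" ] && echo no || echo yes)"
echo "development is ahead by $ahead commits"
```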
2.14 Merging branches
- When you are done with the development of the new branch
- You can merge it with the master branch
git checkout master
git merge development
- You are now back in the main branch
- The main branch now contains the new commits
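The effect of the merge can be seen by reading the file before and after it. The sketch below assumes Git is installed; the temporary directory, the "Student" identity and the variable `main_branch` are placeholders. Because the main branch received no commits of its own in this example, Git only needs to move the branch pointer forward (a fast-forward merge).

```shell
set -e
# Scratch repository in a temporary directory.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity (an assumption) so that commits work in this sandbox.
git config user.name "Student"
git config user.email "student@example.com"

echo "Hello World!" > Hello_world.txt
git add Hello_world.txt
git commit -q -m "First commit"
main_branch=$(git symbolic-ref --short HEAD)   # master or main, depending on configuration

git branch development
git checkout -q development
echo "Hello World, I am developing a new feature!" > Hello_world.txt
git commit -q -a -m "Modified Hello World"

# Back on the main branch the file has its old content.
git checkout -q "$main_branch"
before_merge=$(cat Hello_world.txt)

# Bring in the commits made on development.
git merge -q development
after_merge=$(cat Hello_world.txt)

echo "Before merge: $before_merge"
echo "After merge: $after_merge"
```

If both branches had changed the same lines, Git would stop and ask you to resolve the conflict by hand; that case is not covered here.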
2.15 Git: status, log, diff
- Commands to understand the state of the repository
git status
- Shows the status of the repository
git log
- Shows the history of the repository
git diff
- Shows the changes in the working directory that are not yet staged (use git diff HEAD to compare with the last commit)
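The three commands answer different questions, which is easiest to see on a small example. This is a sketch assuming Git is installed; the temporary directory, the "Student" identity and the variable names are placeholders. It also shows that plain `git diff` only reports unstaged changes, while `git diff HEAD` compares the working directory with the last commit.

```shell
set -e
# Scratch repository in a temporary directory.
repo=$(mktemp -d)
cd "$repo"
git init -q
# Placeholder identity (an assumption) so that commits work in this sandbox.
git config user.name "Student"
git config user.email "student@example.com"

echo "Hello" > Hello_world.txt
git add Hello_world.txt
git commit -q -m "First commit"
echo "World!" >> Hello_world.txt
git commit -q -a -m "Added World!"

# An edit that is neither staged nor committed.
echo "One more line" >> Hello_world.txt

git status --porcelain                          # state of the working directory
git log --oneline                               # history, one line per commit
n_commits=$(git rev-list --count HEAD)          # number of commits in that history

unstaged=$(git diff --name-only)                # files with unstaged changes
git add Hello_world.txt
unstaged_after_add=$(git diff --name-only)      # empty: the change is now staged
vs_last_commit=$(git diff --name-only HEAD)     # still differs from the last commit

echo "Commits: $n_commits"
echo "Unstaged before add: $unstaged"
echo "Unstaged after add: [$unstaged_after_add]"
echo "Changed since last commit: $vs_last_commit"
```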
3 Appendix
3.1 Assignment
3.2 Resources
3.2.1 oTree
3.2.2 R
3.2.3 Git
3.3 References
Chacon, Scott, and Ben Straub. 2014. Pro Git. Springer Nature.
Holzmeister, Felix. 2017. “oTree: Ready-Made Apps for Risk Preference Elicitation Methods.” Journal of Behavioral and Experimental Finance 16: 33–38.
Ihaka, Ross, and Robert Gentleman. 1996. “R: A Language for Data Analysis and Graphics.” Journal of Computational and Graphical Statistics 5 (3): 299–314.
Wickham, Hadley, and Garrett Grolemund. 2016. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media, Inc.