Experimental Economics: Data Workflow

Lecture 1: Introduction

Author

Matteo Ploner

Published

May 10, 2023

1 Our Course

1.1 The Data Workflow

  • The course deals with the design of an experiment in the social sciences and the management of the related data workflow
    • From Design to Data Organization

1.2 The Data Workflow (ii)

  • Design
    • Methodological aspects of the design of an economic experiment are discussed in the classroom
  • Registration
    • Illustration of the pre-registration process on OSF with specific attention to hypotheses and data analysis
  • Programming
    • Introduction to oTree architecture and examples of experimental apps
  • Data collection
  • Data organization
    • Collection, organization and documentation of data in R

1.3 Prerequisites

  • Basic knowledge of experimental economics.
  • Basic knowledge of computer programming.
  • Access to a computer.
    • Installation of oTree software (open source and free of charge).
    • Installation of R software (open source and free of charge) and RStudio (open source and free of charge).

2 Software Requirements

2.1 Editor

2.2 oTree

  • oTree is a framework based on Python to run controlled experiments
    • Games
    • Questionnaires
  • Support by the community
  • oTree is open-source
    • Licensed under an adaptation of the MIT license.
      • Cite it in the paper when you use it

2.3 oTree: Code

  • Programming language of oTree is Python
    • Popular object-oriented programming language
    • Developed in the early 1990s by Guido van Rossum
  • oTree’s user interface is based on HTML5
    • Supported by modern browsers and rich in functionalities
      • Can be enriched with
        • CSS
        • JavaScript
        • Bootstrap
  • All the components of oTree are free and open-source

2.4 oTree: Functioning

  • The basic setup consists of
    • An app (experiment) written within oTree
    • A server computer
      • Cloud server, local PC …
    • Subjects’ devices with a browser
      • PC, Laptop, Tablet, Mobile Phone …
  • oTree creates a session on the server and generates links for all participants
  • Participants click on the links and are sent to a personal page
    • They submit their answers, which are collected by the server
    • The experimenter can check the progress on the server

2.5 R

  • R is a programming language and software environment for statistical computing and graphics
    • Developed in 1993 by Ross Ihaka and Robert Gentleman (Ihaka and Gentleman 1996)
    • It is free and open source
    • It is cross-platform
R version 4.2.2 (2022-10-31) -- "Innocent and Trusting"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin17.0 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

2.6 Your data in good shape

  • Data wrangling is “the art of getting your data into R in a useful form for visualization and modeling” (Wickham and Grolemund 2016)

  • We rely on the tidyverse library

    • The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures.
library(tidyverse)
── Attaching packages ───────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.4.0      ✔ purrr   1.0.1 
✔ tibble  3.1.8      ✔ dplyr   1.0.10
✔ tidyr   1.3.0      ✔ stringr 1.5.0 
✔ readr   2.1.3      ✔ forcats 0.5.2 

2.7 GitHub

  • The material of the course is available on GitHub
    • The repository is here
      • You can download the material from there
  • GitHub is a web-based hosting service for version control using Git
    • It offers all of the distributed version control and source code management functionality of Git, and adds its own features
    • It provides access control and several collaboration features such as bug tracking, feature requests, task management, and wikis for every project
  • Students will deliver their code on GitHub as well
    • You must open a GitHub account!

2.8 What is Git?

  • Git is a version control system (VCS)
    • It is a tool to manage your source code history
    • It allows you to track changes in your files
    • It allows you to collaborate with other developers
    • Consult the book Pro Git (Chacon and Straub 2014) for more information
  • Version control is a system that records changes to a file or set of files over time
    • A common approach
      • My_thesis.tex → My_thesis_1.tex → My_thesis_2.tex → … My_thesis_29.tex
        • Messy!
  • In Local Version Control Systems (VCSs) a database keeps track of changes to files
    • Fully trackable
    • Reversible changes

2.9 The lifecycle of a file in Git


[Figure: the lifecycle of a file in Git (Chacon and Straub 2014)]

  • A file in a directory can be either tracked or untracked.
    • Tracked files are files that Git knows about
      • They can be unmodified, modified, or staged
    • Untracked files are everything else
      • They can be tracked by adding them to the staging area
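
The states listed above can be observed directly with git status. The following is a minimal sketch, not part of the original slides: it assumes Git is installed and works in a throwaway temporary directory; the file name notes.txt and the identity "Student" are placeholders chosen for illustration.

```shell
set -eu

# Throwaway repository in a temporary directory (notes.txt is a hypothetical file).
repo="$(mktemp -d)"
cd "$repo"
git init -q
# Placeholder identity, stored only in this repository, so that commits work.
git config user.name "Student"
git config user.email "student@example.com"

echo "first line" > notes.txt
untracked="$(git status --porcelain)"    # '?? notes.txt': Git does not know the file yet

git add notes.txt
staged="$(git status --porcelain)"       # 'A  notes.txt': tracked and staged

git commit -q -m "Add notes"
unmodified="$(git status --porcelain)"   # empty: tracked and unmodified

echo "second line" >> notes.txt
modified="$(git status --porcelain)"     # ' M notes.txt': tracked and modified, not staged

printf '%s\n' "untracked:  [$untracked]" "staged:     [$staged]" \
              "unmodified: [$unmodified]" "modified:   [$modified]"
```

The --porcelain option prints one short line per file; the two-character code in front of the file name encodes the state of the file in the staging area and in the working directory.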

2.10 Git: initialize and commit

  • Open a terminal and move to your working directory
    • cd PATH_TO_YOUR_WORKING_DIRECTORY
    • In this example, the directory contains a file ‘Hello_world.txt’ with the single line “Hello”
  • 1) Initialize a Git repository
    • git init
  • 2) Add the file to the staging area
    • git add Hello_world.txt
  • 3) Commit the file
    • git commit -m "First commit"
  • If we modify the file by adding a new line “World!”, Git reports the change
    • git status
      • modified: Hello_world.txt
  • Redo steps 2 and 3
    • git add Hello_world.txt
    • If you are happy :) with the change → commit it
      • git commit -m "Added World!"
    • If you are unhappy :( with the change → remove it
      • git reset HEAD Hello_world.txt
        • The change is removed from the staging area
      • git checkout Hello_world.txt
        • The change is discarded: you are back to the previous version of the file
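
The whole sequence, including the "unhappy" path, can be run as one script. This is a sketch under a few assumptions: Git is installed, a temporary directory stands in for PATH_TO_YOUR_WORKING_DIRECTORY, and the identity "Student" is a placeholder needed only so that the commit succeeds on a fresh machine.

```shell
set -eu

workdir="$(mktemp -d)"             # stands in for PATH_TO_YOUR_WORKING_DIRECTORY
cd "$workdir"
echo "Hello" > Hello_world.txt

git init -q                        # 1) initialize the repository
git config user.name "Student"     # placeholder identity, only for this repository
git config user.email "student@example.com"
git add Hello_world.txt            # 2) add the file to the staging area
git commit -q -m "First commit"    # 3) commit the file

echo "World!" >> Hello_world.txt   # modify the file
git status --short                 # the file is reported as modified
git add Hello_world.txt            # stage the change

# Unhappy path: unstage the change, then discard it from the working directory.
git reset -q HEAD Hello_world.txt
git checkout -- Hello_world.txt

cat Hello_world.txt                # the file contains only "Hello" again
git log --oneline                  # the history still has a single commit
```

In Git 2.23 or later, git restore --staged Hello_world.txt followed by git restore Hello_world.txt performs the same two undo steps.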

2.11 Git: branching

  • There is a main line of development
    • You can diverge from it to “explore” new versions
    • It is convenient to develop workflows that branch and merge often
    • It is a good practice to keep the main line of development clean
      • It is the version that is used in production
      • It is the version that is used to create new branches
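  • When you make your first commit, a master branch is created; it points to the last commit made so far
    • The branch may be called main instead, depending on the Git configuration
    • HEAD refers to the last commit of the branch you are currently on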
  • To start a new branch
    • git branch development

2.12 Git: branching (ii)

  • To switch to the new branch
    • git checkout development
      • You are now in the new branch
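
Creating a branch and switching to it are two separate steps, as the following sketch illustrates. It assumes Git is installed and uses a throwaway repository with a placeholder identity; the name of the initial branch is read from Git rather than assumed, since it can be master or main depending on the configuration.

```shell
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Student"              # placeholder identity
git config user.email "student@example.com"
echo "Hello" > Hello_world.txt
git add Hello_world.txt
git commit -q -m "First commit"

before="$(git symbolic-ref --short HEAD)"   # initial branch: master or main

git branch development                      # create the branch; you stay where you were
git checkout -q development                 # switch to the new branch
after="$(git symbolic-ref --short HEAD)"    # development

git branch                                  # lists both branches; '*' marks the current one
echo "was on $before, now on $after"
```

The two steps can also be combined as git checkout -b development.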

2.13 Development of a new branch

  • If you make new commits now, they are added to the development branch
    • You can make as many commits as you want
    • You can create as many branches as you want
    • You can switch between branches whenever you want
    • Master is still pointing at Snapshot C
      • Modify ‘Hello World!’ with the line ‘Hello World, I am developing’
        • git commit -a -m "Modified Hello World"
      • Modify ‘Hello World!’ with the line ‘Hello World, I am developing a new feature!’
        • git commit -a -m "Modified again Hello World"

2.14 Merging branches

  • When you are done with the development of the new branch
    • You can merge it into the master branch
      • git checkout master
      • git merge development
    • You are now back in the master branch
    • The master branch now contains the new commits
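
A complete branch-and-merge cycle on the Hello World example might look as follows. This is a sketch assuming Git is installed with default merge settings; it runs in a throwaway repository with a placeholder identity, and it stores the name of the initial branch in the variable main_branch (a name chosen here for illustration) instead of assuming it is called master.

```shell
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Student"              # placeholder identity
git config user.email "student@example.com"

echo "Hello World!" > Hello_world.txt
git add Hello_world.txt
git commit -q -m "First commit"
main_branch="$(git symbolic-ref --short HEAD)"   # master or main

# Two commits on a new development branch.
git checkout -q -b development
echo "Hello World, I am developing" > Hello_world.txt
git commit -q -a -m "Modified Hello World"
echo "Hello World, I am developing a new feature!" > Hello_world.txt
git commit -q -a -m "Modified again Hello World"

# Back on the main line the file is unchanged until the merge.
git checkout -q "$main_branch"
before_merge="$(cat Hello_world.txt)"
git merge -q development
after_merge="$(cat Hello_world.txt)"

echo "before merge: $before_merge"
echo "after merge:  $after_merge"
git log --oneline                           # the main line now lists all three commits
```

Since the main line received no commits of its own in the meantime, Git simply moves it forward to the last commit of development (a fast-forward merge) and no conflict can arise.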

2.15 Git: status, log, diff

  • Commands to understand the state of the repository
    • git status
      • Shows the status of the repository
    • git log
      • Shows the history of the repository
    • git diff
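      • Shows the changes that are not yet staged, i.e. the differences between the working directory and the staging area
      • git diff HEAD compares the working directory with the last commit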
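
The three commands can be tried on the Hello World example. The sketch below assumes Git is installed and uses a throwaway repository with a placeholder identity; the variable names are chosen for illustration. It also shows how the output of git diff changes once a modification has been staged.

```shell
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Student"              # placeholder identity
git config user.email "student@example.com"
echo "Hello" > Hello_world.txt
git add Hello_world.txt
git commit -q -m "First commit"

echo "World!" >> Hello_world.txt            # an uncommitted modification

git status --short                          # which files changed
git log --oneline                           # one line per commit

unstaged="$(git diff)"                      # working directory vs staging area
git add Hello_world.txt
after_add="$(git diff)"                     # empty: nothing is left unstaged
staged_diff="$(git diff --staged)"          # staging area vs last commit

printf '%s\n' "$staged_diff"                # the added line appears as '+World!'
```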

3 Appendix

3.1 Assignment

  1. Install Git on your computer
    • Reproduce the lifecycle of the example Hello_world.txt above
  2. Install oTree on your computer
  3. Install R on your computer

3.2 Resources

3.2.1 oTree

3.2.2 R

3.2.3 Git

3.3 References

Chacon, Scott, and Ben Straub. 2014. Pro Git. Springer Nature.
Holzmeister, Felix. 2017. “oTree: Ready-Made Apps for Risk Preference Elicitation Methods.” Journal of Behavioral and Experimental Finance 16: 33–38.
Ihaka, Ross, and Robert Gentleman. 1996. “R: A Language for Data Analysis and Graphics.” Journal of Computational and Graphical Statistics 5 (3): 299–314.
Wickham, Hadley, and Garrett Grolemund. 2016. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media, Inc.