Each model submission must include a verbal description of the model and its implementation (coded in SAS, Matlab, or R).
To facilitate the accumulation of knowledge we will impose three requirements on the submitted models:
First, the model should replicate the 14 qualitative phenomena described in Table 1 (the exact replication criteria are detailed at the bottom of this page and are also specified in the baseline model example code).
Second, the verbal description should be short: The maximal allowed length of this verbal description is 1500 words (the number of words in the current description of BEAST is 1319).
Third, the verbal description must be clear.
The clarity of the model’s description will be evaluated as follows:
Three to five skilled behavioural modellers will be asked to reproduce the model with the lowest MSD (as well as its predictions) from this model's verbal description alone.
At least one of the modellers should be able to reproduce the model and its predictions in order for it to win the competition.
If none of the modellers succeeds, the modellers will be asked to reproduce the second-best model (ranked according to MSD), and if none succeeds with that model either, they will be asked to reproduce the third-best model.
The highest ranked model, among the top-three models, which is successfully reproduced wins the competition.
If no modeller succeeds in reproducing any of the top-three models, then the authors of all three models will be asked to submit a corrected verbal description and the process will start over.
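The selection rule above can be sketched in a few lines of Python. This is only an illustration, not part of the official procedure: the function name `select_winner`, the argument names, and the representation of the modellers' outcomes as lists of booleans are all assumptions made for this example.

```python
def select_winner(ranked_models, reproductions, top_n=3):
    """Return the highest-ranked reproducible model, or None.

    ranked_models: model names sorted by ascending MSD (best first).
    reproductions: hypothetical mapping from model name to a list of
        booleans, one per modeller, True if that modeller reproduced
        the model and its predictions from the verbal description.
    """
    for model in ranked_models[:top_n]:
        # One successful reproduction is enough for the model to qualify.
        if any(reproductions.get(model, [])):
            return model
    # No top-three model was reproduced: descriptions must be corrected
    # and the process starts over.
    return None
```

For example, if the best model is reproduced by nobody but the second-best is reproduced by one modeller, the function returns the second-best model.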
Competition criteria
The current competition focuses on the prediction of the mean B-rates in each of the five blocks of trials for each choice problem. As in Erev et al.’s (2010) competitions, the accuracy of the predictions will be evaluated using a mean squared deviation (MSD) score. We will first compute the squared difference between the observed and predicted rates in each block of five trials, in each of the 60 problems of Study 3 (the competition study), and then compute the mean over the resulting 300 scores (60 problems × 5 blocks). The MSD criterion, which has also been used in previous studies (e.g., Erev, Ert, & Roth, 2010; Erev, Ert, Roth, Haruvy, et al., 2010; Ert et al., 2011), has several advantages over other model estimation techniques (e.g., likelihood criteria). In particular, the MSD score underlies traditional statistical methods (like regression and the t-test) and is a proper scoring rule (Brier, 1950; Selten, 1998) which is less sensitive to large errors than other measures.
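The MSD computation described above can be sketched as follows. This is a minimal illustration in Python (not one of the accepted submission languages), and it assumes that observed and predicted B-rates are supplied as equally shaped nested lists, one row per problem and one value per block; the function name `msd_score` is hypothetical.

```python
def msd_score(observed, predicted):
    """Mean squared deviation between observed and predicted B-rates.

    observed, predicted: lists of rows (one per problem), each row
    holding one B-rate per block. For the competition study this
    would be 60 rows of 5 values, giving 300 squared differences.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted differ in number of problems")
    squared_diffs = []
    for obs_row, pred_row in zip(observed, predicted):
        if len(obs_row) != len(pred_row):
            raise ValueError("observed and predicted differ in number of blocks")
        for obs, pred in zip(obs_row, pred_row):
            squared_diffs.append((obs - pred) ** 2)
    # Average over all problem-by-block cells.
    return sum(squared_diffs) / len(squared_diffs)
```

Whether rates are expressed as fractions or as percentages changes the scale of the score, so observed and predicted values need to share the same unit.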
Replication criteria
This is a brief description of the predictions that the model needs to produce in order to replicate the qualitative phenomena described in Table 1.
In the following pseudocode, pred1(B) and pred5(B) are the predicted proportions of choosing B in trials 1-5 (block 1) and trials 21-25 (block 5), respectively.
From description:
1. Allais/Certainty effect: pred1(B) < 50% and pred1(B’) > 50%
2. Reflection effect: pred1(B) > pred1(B’)
3. Overweighting of rare events: pred1(B) > 50%
4. Loss aversion: pred1(B)* < 50%
5. Low mag eliminates loss aversion: pred1(B) > pred1(B)*
6. St. Petersburg paradox: pred1(B) < 50%
7. Ellsberg/Ambiguity Aversion: pred1(B) < 50%
8. Break even effect: pred1(B) > pred1(B’)
9. Get something effect: pred1(B) < pred1(B’)
10. Splitting effect: pred1(B) < pred1(B’)
From experience:
11. Underweighting of rare events: pred5(B'') < 50%
12. Reversed reflection: pred5(B) < pred5(B’)
13. Payoff variability effect: pred5(B) > pred5(B’)
14. Correlation effect: pred5(B) < pred5(B’)
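The 14 criteria listed above can be checked mechanically. The following Python sketch is only an illustration and is not the official baseline code: it assumes that predictions are given as fractions in [0, 1] rather than percentages, that the values for phenomena 1-10 are block 1 predictions and those for phenomena 11-14 are block 5 predictions, and that each phenomenon's predictions are stored in a dictionary keyed by the labels used in the list ("B", "B'", "B*", "B''"). The names `CRITERIA` and `failed_criteria` are hypothetical.

```python
# Map each phenomenon number to a test on that phenomenon's predictions.
CRITERIA = {
    1: lambda p: p["B"] < 0.5 and p["B'"] > 0.5,  # Allais/Certainty effect
    2: lambda p: p["B"] > p["B'"],                # Reflection effect
    3: lambda p: p["B"] > 0.5,                    # Overweighting of rare events
    4: lambda p: p["B*"] < 0.5,                   # Loss aversion
    5: lambda p: p["B"] > p["B*"],                # Low magnitude eliminates loss aversion
    6: lambda p: p["B"] < 0.5,                    # St. Petersburg paradox
    7: lambda p: p["B"] < 0.5,                    # Ellsberg/Ambiguity aversion
    8: lambda p: p["B"] > p["B'"],                # Break even effect
    9: lambda p: p["B"] < p["B'"],                # Get something effect
    10: lambda p: p["B"] < p["B'"],               # Splitting effect
    11: lambda p: p["B''"] < 0.5,                 # Underweighting of rare events
    12: lambda p: p["B"] < p["B'"],               # Reversed reflection
    13: lambda p: p["B"] > p["B'"],               # Payoff variability effect
    14: lambda p: p["B"] < p["B'"],               # Correlation effect
}


def failed_criteria(predictions):
    """Return the sorted phenomenon numbers whose criterion is not met.

    predictions: dict mapping phenomenon number (1-14) to a dict of
    predicted B-rates for the problems relevant to that phenomenon.
    """
    return [k for k in sorted(CRITERIA) if not CRITERIA[k](predictions[k])]
```

A model replicates all 14 phenomena under these assumptions when `failed_criteria` returns an empty list.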