Computational Game Theory studies and evaluates behaviors with game-theoretic models through agent-based computer simulations. One of the best-known examples of this approach is the Classical Iterated Prisoner's Dilemma (CIPD), popularized by Axelrod in the early eighties, which led him to formulate a successful Theory of Cooperation.
This use of simulations has always been a challenging application of computer science, and of agent-based approaches in particular, to the Social Sciences. It may be viewed as Empirical Game Theory. This kind of approach is often necessary because, in the general case, classical analytical methods do not give suitable results; such tools are also commonly used when a full game-theoretic analysis is intractable.
The usual method for evaluating behaviors consists in collecting strategies, through open contests, and confronting all of them as in a sports championship. It then becomes, or at least seems to become, easy to evaluate and compare the efficiency of these behaviors.
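Such a championship can be sketched as a round-robin iterated Prisoner's Dilemma tournament. The following is a minimal illustration, not the paper's actual setup: the payoff values are the conventional ones (T=5, R=3, P=1, S=0), and the strategy names and `tournament` function are our own assumptions.

```python
# Conventional IPD payoffs: PAYOFF[(my_move, opp_move)] = (my_pay, opp_pay).
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

# Three classic strategies; each sees its own and its opponent's history.
def all_c(my_hist, opp_hist):
    return 'C'                                  # always cooperate

def all_d(my_hist, opp_hist):
    return 'D'                                  # always defect

def tit_for_tat(my_hist, opp_hist):
    return opp_hist[-1] if opp_hist else 'C'    # echo the last move

def play_match(s1, s2, rounds=200):
    """Play one iterated match and return both cumulative scores."""
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        h1.append(m1); h2.append(m2)
        score1 += p1; score2 += p2
    return score1, score2

def tournament(strategies):
    """Every strategy meets every other (and itself), as in a championship."""
    totals = {name: 0 for name in strategies}
    names = list(strategies)
    for i, a in enumerate(names):
        for b in names[i:]:
            sa, sb = play_match(strategies[a], strategies[b])
            totals[a] += sa
            if a != b:
                totals[b] += sb
    return sorted(totals.items(), key=lambda kv: -kv[1])
```

With these three strategies, `tournament({'AllC': all_c, 'AllD': all_d, 'TFT': tit_for_tat})` already hints at the evaluation problem discussed below: the final ranking depends on details of the setup (match length, self-play, scoring rule) as much as on the strategies themselves.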
Evaluating strategies, however, cannot be done reliably without the assurance that the algorithms used are well formed and cannot introduce bias into the computation. Nor can it be done without tools able to prevent, or at least measure, deviation from the object of the study. Unfortunately, people using such simulations often do not take these aspects seriously, sometimes because they are not aware of them, and sometimes precisely because they are. We will try to show the effects of bad simulation practice on the simplest example.
We present methodological issues which have to be addressed, or avoided, in order to prevent trouble in the interpretation of simulation results. Based on some simple illustrations, we exhibit two kinds of bias that can be introduced, which we classify as involuntary or voluntary mistakes. The former can be explained by poor design of experiments, whereas the latter can defeat the purpose of the evaluation using simple ideas of agreement and cooperation. We also show the implications that such errors may have on interpretations and conclusions.
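The "agreement and cooperation" kind of bias can be illustrated with a colluding handshake strategy, a device of the same flavor as the team submissions seen in later IPD competitions. The sketch below is our own illustration, not a strategy from the paper: agents open with a fixed move sequence, cooperate fully with opponents that echo it (presumed teammates), and defect relentlessly against everyone else.

```python
# Conventional IPD payoffs: PAYOFF[(my_move, opp_move)] = (my_pay, opp_pay).
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

SIGNATURE = ['C', 'D', 'C']   # arbitrary identification sequence (our choice)

def handshake(my_hist, opp_hist):
    t = len(my_hist)
    if t < len(SIGNATURE):
        return SIGNATURE[t]                      # transmit the signature
    if opp_hist[:len(SIGNATURE)] == SIGNATURE:
        return 'C'                               # recognized teammate: cooperate
    return 'D'                                   # outsider: exploit

def all_c(my_hist, opp_hist):
    return 'C'                                   # a naive cooperator

def play(s1, s2, rounds=100):
    """Play one iterated match and return both cumulative scores."""
    h1, h2, tot1, tot2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        h1.append(m1); h2.append(m2)
        tot1 += p1; tot2 += p2
    return tot1, tot2
```

Two copies of `handshake` recognize each other after three rounds and cooperate for the rest of the match, while a lone cooperator is exploited; a group of such agents can thus pump up its members' scores and distort the ranking without any single strategy being remarkable on its own.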
We argue that scoring and ranking methods are part of the game, and as such have to be described together with the game. Many of the points made here may seem widely known; we think that, with the growing interest in such methods, they nevertheless have to be detailed and exposed clearly.
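A small sketch makes the point that the scoring rule is part of the game: the same match results can yield opposite rankings under two reasonable rules. The numbers below are invented but plausible 200-round totals for an always-defect player against two Tit-for-Tat players; none of this data comes from an actual tournament.

```python
# results[(a, b)] = (a's score, b's score) for one round-robin (illustrative).
results = {
    ('AllD', 'TFT1'): (204, 199),   # AllD "wins" each head-to-head, narrowly
    ('AllD', 'TFT2'): (204, 199),
    ('TFT1', 'TFT2'): (600, 600),   # mutual cooperation, a drawn match
}

def rank_by_total(results):
    """Rank players by cumulative payoff over all matches."""
    totals = {}
    for (a, b), (sa, sb) in results.items():
        totals[a] = totals.get(a, 0) + sa
        totals[b] = totals.get(b, 0) + sb
    return sorted(totals, key=totals.get, reverse=True)

def rank_by_wins(results):
    """Rank players by number of matches won, as in a sports league."""
    wins = {}
    for (a, b), (sa, sb) in results.items():
        wins.setdefault(a, 0); wins.setdefault(b, 0)
        if sa > sb:
            wins[a] += 1
        elif sb > sa:
            wins[b] += 1
    return sorted(wins, key=wins.get, reverse=True)
```

Here `rank_by_wins` puts AllD first (it won both of its matches) while `rank_by_total` puts it last (408 points against 799 for each Tit-for-Tat player). A tournament report that omits which rule was used therefore omits part of the game being played.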