Repeatability & Workability Evaluation
SIGMOD 2008
was the first database conference that offered to test submitters' programs
against their data to verify the repeatability of the experiments published.
A detailed report on this initiative has been published in
ACM SIGMOD Record, 37(1):39-45, March 2008.
Given the positive experience with and feedback on the SIGMOD 2008
repeatability initiative, SIGMOD 2009 continued the initiative in a
slightly modified and extended form. A report on this effort has been
published in
ACM SIGMOD Record, 38(3):40-43, September 2009.
The Goal
On a voluntary basis, authors of accepted SIGMOD 2009 papers can provide
their code/binaries, experimental setups and data to be tested for
- repeatability of the experiments described in the accepted papers;
- workability in the sense of running different/more experiments
  with different/more parameters than shown in the respective papers;
by a repeatability & workability committee (RWC) under the responsibility
of the repeatability & workability officers (RWO).
The People
The RWO are Ioana Manolescu and Stefan Manegold.
The RWC members are
- Christopher Re (University of Washington)
- Dominique Laurent (Université de Cergy-Pontoise)
- Gang Gou (North Carolina State University)
- Jianlin Feng (Sun Yat-Sen University)
- Konstantinos Karanasos (INRIA Saclay - Ile-de-France)
- Loredana Afanasiev (Universiteit van Amsterdam)
- Marios Hadjieleftheriou (AT&T Shannon Labs)
- Mihai Lupu (National University of Singapore)
- Nicola Onose (University of California, San Diego)
- Panagiotis Kalnis (King Abdullah University of Science and Technology)
- Pierre Senellart (Télécom ParisTech)
- Stavros Harizopoulos (HP Labs)
- Tianyi Wu (University of Illinois at Urbana-Champaign)
- Virginie Sans (Université de Cergy-Pontoise)
Instructions for authors
Detailed instructions for authors, as sent out by email, are also available
here.
The Process
After the acceptance notification, willing authors of an accepted paper are
invited to transmit to the RWO
- the batch of code, including
  - repeatability instructions;
  - workability instructions: how to vary the inputs of the experiments,
    what constitutes legal inputs for each experiment, and what "should
    work" (a hypothetical driver script is sketched after this list);
- the data sets used in the experiments;
- the paper.
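To make these expectations concrete, here is a minimal sketch of what a
submission's top-level driver script might look like, written in Python.
Every name in it (run_experiments.py, bin/query_runner, the data set paths,
and the parameters) is an illustrative assumption, not something prescribed
by the SIGMOD 2009 instructions; real submissions structure their packages
however suits their experiments.

    #!/usr/bin/env python3
    """Hypothetical driver for a repeatability submission (illustrative only).

    All binaries, data set paths, and parameters below are assumptions made
    for the sake of this sketch; they are not prescribed by SIGMOD 2009.
    """
    import subprocess
    import sys
    from pathlib import Path

    # Map each experiment in the paper (e.g. "Figure 4") to the command that
    # reproduces it. Reviewers can vary the parameters to test workability;
    # the legal ranges would be documented in the workability instructions.
    EXPERIMENTS = {
        "figure4": ["./bin/query_runner", "--dataset", "data/tpch_1gb",
                    "--scale", "1"],
        "table2": ["./bin/index_bench", "--dataset", "data/synthetic",
                   "--threads", "4"],
    }

    def main() -> int:
        out_dir = Path("results")
        out_dir.mkdir(exist_ok=True)
        for name, cmd in EXPERIMENTS.items():
            print(f"running {name}: {' '.join(cmd)}")
            result = subprocess.run(cmd, capture_output=True, text=True)
            # Keep the raw output so reviewers can compare it against the
            # numbers reported in the paper.
            (out_dir / f"{name}.log").write_text(result.stdout + result.stderr)
            if result.returncode != 0:
                print(f"{name} failed with exit code {result.returncode}",
                      file=sys.stderr)
                return result.returncode
        return 0

    if __name__ == "__main__":
        sys.exit(main())

A reviewer would run 'python3 run_experiments.py' to repeat the published
experiments, and edit the parameters in EXPERIMENTS to probe workability.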
The RWO will designate for each submission:
- A first reviewer, who will do all the verification work as far as he/she
  is able to (download and install the code, run it, write a
  repeatability/workability report). This should span at most the first
  two thirds of the reviewing period.
- A second reviewer, who will check the report of the first and help
  clarify any pending issues. The second reviewer is expected to intervene,
  if needed, in the last third of the reviewing period.
The first and second reviewer will interact until they are both satisfied
with the terms of the report.
They will both sign the report.
If there is disagreement that the reviewers cannot work out, the RWO have
the final say. They may propose alternative wording for the report, more
tests, and/or endorse responsibility together with one reviewer, if the
other cannot agree with the chosen wording (and thus is unwilling to sign
it).
During the evaluation, the first reviewer and the authors interact via a
Wiki or Blog dedicated to the paper. The second reviewer and the RWO may also
post on that page. The recommendation is that the first reviewer is left
alone with the authors during the first two thirds of the reviewing period,
to avoid confusion. The Wiki/Blog is used to document all interaction between
the reviewers and the authors.
Since it is not our major goal to test/evaluate the authors' packaging and
scripting skills and techniques, the authors will be allowed to provide
support and fixes in case the reviewers have problems installing, setting
up, and/or running the authors' experiments. Like all other communication
between the reviewers and the authors, such support and fixes must be
documented on the Wiki/Blog.
The identity of the reviewers is hidden during the evaluation process, but
will obviously be revealed afterwards when the reviewers sign their report.
The Output
The final RW report on the Wiki/Blog will include
- a summary of the interaction with and fixes by the authors that were
  required to get the experiments running properly;
- a repeatability result as for SIGMOD 2008;
- a description of what else the first reviewer was able to run and to
  what extent the results are as expected.
The final RW report will be visible to all on the Wiki/Blog site, and
will be linked to from the ACM Digital Library site with the paper.
Code Archiving
Participating in the repeatability/workability evaluation does not entail
that the code is archived for everyone to use subsequently.
If the authors agree to provide their code for archiving, it will be
included in the SIGMOD digital library.