SIGMOD 2010 Repeatability & Workability Evaluation
                               for paper #26
Active Knowledge: Dynamically Enriching RDF Knowledge Bases by Web Services
by Nicoleta Preda, Fabian Suchanek, Gjergji Kasneci, Thomas Neumann, Wenjun Yuan

Hardware & Software environment
===============================

       | Paper                               | Review 1
-------+-------------------------------------+-----------------------------------------------------------------
 class | (not specified)                     | desktop
 CPU   | Pentium(R) Dual-Core CPU 2.50GHz    | AMD Athlon(tm) 64 X2 Dual Core Processor 4600+, cpu MHz 1000.000
 RAM   | 2 GB                                | 2 GB
 OS    | Debian 4.3.2, Kernel version 2.6.30 | Fedora 11 Linux version 2.6.30.9-96.fc11.x86_64


Submission
==========

The authors provided
 - source code for their system and for additional software;
 - data generators;
 - scripts to re-run all experiments and collect the output of the
   proposed methods in a text file.

Repeatability Evaluation
========================

Process
-------

The given instructions are clear and simple to follow. The authors
were very responsive in supplying additional or improved code and
explanations.

Detailed Results
----------------

The queries supplied in the benchmark differ from the queries reported
in the paper. It was not possible to consider queries other than
those fixed by the authors. Nevertheless, by the end of this RWE the
authors had produced a user-friendly interface for their system.

* Experiment 1

  Runnable, but the repeated results are partly different from those
  reported in the paper (and also different from those supplied with
  the code).

  For query 1, the repeated results have the same number of web calls
  and total results as reported.

  For query 2, the repeated results (number of calls and total
  answers) differ for DF: in most cases, the first three results
  given with the code are much better in terms of total answers per
  web call, while the remaining three results in the given set are
  worse than in the repeated results.

  In the repeated results, F-RDF consistently needs more time and web
  calls to obtain the same number of total answers. F-RDF(R) obtains
  fewer answers, though with a comparable or larger number of web
  calls.

  For query 3, the repeated results for DF and F-RDF are comparable
  with those reported. For F-RDF(R), we need on average fewer calls
  for the same answers. This differs from the previous cases, where
  we needed more calls for the number of answers obtained.

  
  For query 4, DF and F-RDF need fewer or the same number of calls for
  the same number of answers in the repeated results. F-RDF(R) uses
  about half as many calls to retrieve a third of the number of
  answers given.

  For query 5, DF uses more calls but retrieves far fewer answers.
  F-RDF(R) can even obtain 20 times more answers with an increase of
  about 6% in the number of calls.

  For query 6, DF and F-RDF obtain comparable results (almost
  identical) to those reported. F-RDF(R), however, obtains the
  same number of answers with far fewer calls.

  For query 7, DF and F-RDF obtain comparable results (almost
  identical) to those reported. F-RDF(R), however, obtains the
  same number of answers with far fewer calls.
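
The per-query comparisons above rest on the ratio of total answers to
web calls. A minimal sketch of that comparison follows; it is not part
of the authors' submission, the names Run, answers_per_call and compare
are hypothetical, and the numbers are placeholders that are not taken
from the paper or from the repeated runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One measurement of a strategy on a query (hypothetical structure)."""
    calls: int    # number of web service calls issued
    answers: int  # total answers returned


def answers_per_call(run: Run) -> float:
    """Total answers obtained per web call; 0.0 if no call was made."""
    return run.answers / run.calls if run.calls else 0.0


def compare(reported: Run, repeated: Run) -> str:
    """Classify the repeated run relative to the reported one."""
    a = answers_per_call(reported)
    b = answers_per_call(repeated)
    if b > a:
        return "better"
    if b < a:
        return "worse"
    return "same"


# Placeholder numbers for illustration only; they are NOT taken from the
# paper or from the repeated experiments.
reported = Run(calls=100, answers=50)
repeated = Run(calls=50, answers=50)
print(compare(reported, repeated))  # prints "better"
```

Such a ratio only summarizes one run; for the randomized strategies a
single value may not be representative.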

* Experiment 2

  Repeatable.

  Figure 9(c) could not be fully followed. The plots are generated
  from three data points and it would have been more appropriate to
  plot the data points only and not a continuous line connecting them.
  It is not discussed to what extent the continuous line would truly
  approximate the outcome of the actual runs. This issue has been
  raised with the authors, who agree with it.

* Experiment 3

  The paper does not explain how to repeat this experiment. It only
  states that (i) 100 queries were considered that ask for books
  published by an author whose name is given, and that (ii) 98% of the
  output books were correct answers.

  The authors explained how this test could be run, but it has severe
  external limitations due to the restricted number of calls allowed
  per day.
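
To illustrate why a daily call limit is a severe constraint, the
following rough sketch estimates how many days a full run would take.
Only the figure of 100 queries comes from the paper; the calls per
query and the daily quota are assumed placeholder values, and the
function name days_needed is hypothetical.

```python
def days_needed(total_calls: int, calls_per_day: int) -> int:
    """Smallest number of days needed to issue total_calls under a daily quota."""
    if calls_per_day <= 0:
        raise ValueError("calls_per_day must be positive")
    if total_calls < 0:
        raise ValueError("total_calls must be non-negative")
    # Integer ceiling division: round up without floating point.
    return -(-total_calls // calls_per_day)


QUERIES = 100            # from the paper: 100 queries were considered
CALLS_PER_QUERY = 30     # assumption, for illustration only
DAILY_QUOTA = 1000       # assumption, for illustration only

print(days_needed(QUERIES * CALLS_PER_QUERY, DAILY_QUOTA))  # prints 3
```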


Summary
-------

Some of the experiments of the paper are not repeatable due to (i) the
randomness inherent in some of the algorithms and the fact that no
quality/error guarantees seem to be offered by the proposed
algorithms, (ii) changes in the content of external web sources, as
well as (iii) various access restrictions imposed by external web
sources. This has been raised with the authors, who confirmed it.

The machine used in the repeated experiments needed consistently more
time than reported in the paper. This is most likely due to the
external RDF engine.

Workability Evaluation
======================

Workability was only evaluated to the extent that the experiments were
run on a different hardware platform.