Repeatability and Workability Evaluation of paper 8:
Histograms Reloaded: The Merits of Bucket Diversity

Repeatability Evaluation
It was possible to perform all the experiments in the provided code package.
The package includes:

1) experiments building homogeneous histograms for 6 data sets,
described in sec. 4.4 and Table 2

2) experiments constructing heterogeneous histograms, described in
sec. 5.3 and presented in Table 3.

The paper contains a number of other experiments which are not
included in the provided package. Among those are the evaluation of
the errors produced by the commercial histograms (Table 1), the
evaluation of the algorithm constructing optimal heterogeneous
histograms (Table 4) and statistics about the bucket type
distribution (Table 5).

Experiments 1) and 2) generally confirmed the results presented in
the paper. We observed small differences in the sizes of some
homogeneous histograms in 1). The results in experiment 2) differ
only in the CPU times for histogram construction. This was
expected, since our test machine was less powerful than the
experimental environment described in the paper. The CPU times follow
the same trends with respect to the different data sets and q-error
values.
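
For reference, the q-error is commonly defined as the factor by which
an estimate deviates from the true value, in either direction. The
following is a minimal sketch assuming that standard definition; the
function name is ours and is not taken from the provided package.

```python
def q_error(estimate: float, truth: float) -> float:
    """Multiplicative deviation between an estimate and the true value.

    Assumes the common definition max(estimate/truth, truth/estimate),
    which is only meaningful for strictly positive values.
    """
    if estimate <= 0 or truth <= 0:
        raise ValueError("q-error is defined here only for positive values")
    return max(estimate / truth, truth / estimate)
```

Under this definition, under- and overestimating by the same factor are
penalized equally, and a perfect estimate has q-error 1.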

Workability Evaluation

The package was prepared in a way that made adding new data sets very easy.
 
One of the statistics generated by the package, but not presented in
the paper, shows that 5 of the 6 data sets have a relatively small
number of distinct values (< 2500) and only one has 72512 distinct values.

We prepared an additional data set based on TPC-H SF1 in order to
check the scalability of the approach. The attribute is the
customer key in the orders table, which has ~100000 distinct values over
1.5 M rows.
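
The statistics of such an attribute can be checked with a few lines of
code. The sketch below assumes the pipe-delimited row layout produced by
the TPC-H dbgen tool, in which the customer key is the second column of
the orders table. The function name, the file name in the comment, and
the sample rows are hypothetical, and the input format expected by the
evaluated package may differ.

```python
from collections import Counter
from typing import Iterable


def custkey_frequencies(lines: Iterable[str]) -> Counter:
    """Count how often each customer key occurs in dbgen-style orders rows."""
    counts: Counter = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip empty lines
        fields = line.split("|")
        counts[int(fields[1])] += 1  # second column holds the customer key
    return counts


# Made-up rows in the assumed layout, for illustration only.
sample = [
    "1|7|O|100.00|1996-01-02|5-LOW|Clerk#1|0|comment|",
    "2|9|F|250.50|1996-02-03|1-URGENT|Clerk#2|0|comment|",
    "3|7|O|75.25|1996-03-04|3-MEDIUM|Clerk#3|0|comment|",
]
freqs = custkey_frequencies(sample)
print(len(freqs), sum(freqs.values()))  # distinct values, total rows

# On a real file (hypothetical path):
# with open("orders.tbl") as f:
#     freqs = custkey_frequencies(f)
```

The number of distinct values and the total row count obtained this way
are the two figures quoted above for the additional data set.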

We observed long execution times for the entire experimental set: 6h
for the homogeneous histogram and more than 9h for each of the
heterogeneous ones. For comparison with the other data sets, the
heterogeneous histogram on customer key with q-error=2 has a size of
15709 and took 1 hour to construct, which is an order of magnitude
slower than the construction times for the data sets presented in the paper.

This experiment suggests that some scalability issues (perhaps in the
current implementation) may hinder general application of the idea to
large tables whose attribute domains have a high number of distinct
values.