JudgeD:Probabilistic DatalogBrend Wanders,Maurice van Keulen,Jan Flokstra(University of Twente)

(Use arrow keys to navigate. f to go fullscreen.)

PART IDatalog 101

Facts, Rules and Queries

``````shape(▲, triangle).
shape(▶, triangle).
shape(■, rectangle).
shape(●, circle).

corners(triangle, 3).
corners(rectangle, 4).
corners(circle, 0).
``````
``````query(C,S) :- shape(S, T), corners(T,C).
``````
``````query(3, X)?
``````

Reasoning

• ```query(3, X)?
% use query(C, S) :- shape(S, T), corners(T, C).
% Unification: C = 3, S = X```
• ```query(3,X) :- shape(X, T), corners(T,3).
% use corners(triangle, 3).
% Unification: T = triangle
```
• ```query(3,X) :- shape(X, triangle).
% use shape(▲, triangle).
% Unification: X = ▲
```
• `query(3, ▲).`

Safety First

• Termination is necessary:
We want an answer, no query should run forever.

Variables in head ⊆ variables in body

``````% Unsafe rule:
foo(A, B) :- bar(A). % What 'B'? What do you want?!``````
• No guessing of things:

Variables in negative literals ⊆ variables in positive literals

``````% Another unsafe rule:
foo(A, B) :- bar(A), ~quux(B). % How do I even?``````

Practical Applications

• Connect directly to external data sources

Example:
``employee(ID, N, D)``
Maps naturally to relational:
``````TABLE Employee (
Employee_Id int,
Name varchar(255),
Department_Id int,
)``````
And with very little effort to XQuery or RDF.

Practical Applications, cont'd.

• Recursive queries

``````edge(a, b).
edge(b, c).
edge(c, d).
edge(d, a).

path(X, Y) :- edge(X, Y).
path(X, Y) :- edge(X, Z), path(Z, Y).

path(X, Y)?
``````

PART IIProbabilistic Datalog

Coin Toss

``````coin(c1).

result(c1, tails).

result(c1, X)?
``````
Result:
``````result(c1, heads).
result(c1, tails).
``````

Coin Toss, cont'd.

``````coin(c1).

result(c1, tails) [x=2].

@uniform p(x).

result(c1, X)?
``````
Result:
``````
% iterations: 1000
% root-mean-square error: 0.01570031846810759
result(c1, heads). % p = 0.499
result(c1, tails). % p = 0.501
``````

Confusing Authors

``````% Known facts:
name(purple, "Brend") [u=1].
name(orange, "Jan")   [u=2].

% Reconstruct authors:

author(ID, N, A) :- name(ID, N), address(ID, A) [z=2].
author(solo, N, A) :- name(_, N), address(_, A) [z=1].

% Detail predicates for querying:

actual_name(P, N)    :- author(P, N, _).
actual_address(P, A) :- author(P, _, A).
actual_person(P)     :- author(P, _, _).
``````

Demo

Probability Calculations

Monte Carlo approximation

• Approximates until error is small enough or maximum numbers of iterations reached.
• Reasonably fast for small cases, necessary for complex cases.
• Requires little to no adaption of datalog engine.

Exact

• Calculates exact probability based on descriptive sentence.
• Blazingly fast for small cases, slow for complex cases.
• Requires adaptation of datalog engine.

That's all folks!

• Where to get JudgeD?

https://github.com/utdb/judged
• How to get in touch?

• Brend Wanders <b.wanders@utwente.nl>