MCC experiments

Usage

Install the dependencies, then run:

./mcc.py test.net 500

Notes

20/11/24

Effect of independent attributes

Consider the following example:

D^* (columns A, B, C; na denotes a missing value):

A  B   C
a  na  c1
a  na  c2
a  na  c3

together with the join probabilities P(b1|a) = 2/3 and P(b2|a) = 1/3. The value of B is thus independent of attribute C.

A rewriting exists for the following query:

SELECT B FROM T WHERE A=a

but there is no rewriting for a query with a condition on the independent attribute, e.g.:

SELECT B FROM T WHERE C=c2

Remarks about the BID

  • It represents, in a single structure, the part of the database that is certain (from D^*) together with the part that is known only with some probability.
  • For example, projecting out the null values on one world of the BID gives the same result as on D^*. This is not the case when doing so on a database generated from the join distribution.
  • Using the BID only, it is impossible to distinguish the queries that can be rewritten from the others.
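
The projection remark can be checked on the example above with a small sketch. It is not taken from mcc.py: it assumes that each row of D^* with a missing B forms its own block, with alternatives b1 and b2 weighted by P(B|a), and the names bid_worlds and project_out_b are hypothetical.

```python
from fractions import Fraction
from itertools import product

# D^* from the example: B is missing (None) in every row.
D_STAR = [("a", None, "c1"), ("a", None, "c2"), ("a", None, "c3")]

# P(B | A=a) from the example.
P_B_GIVEN_A = {"a": {"b1": Fraction(2, 3), "b2": Fraction(1, 3)}}


def bid_worlds(rows, cond):
    """Yield (world, probability) pairs.

    Assumption: one block per row; blocks are independent and the
    alternatives inside a block are mutually exclusive.
    """
    blocks = []
    for a, b, c in rows:
        if b is None:
            blocks.append([((a, v, c), p) for v, p in cond[a].items()])
        else:
            blocks.append([((a, b, c), Fraction(1))])
    for choice in product(*blocks):
        prob = Fraction(1)
        for _, p in choice:
            prob *= p
        yield [t for t, _ in choice], prob


def project_out_b(rows):
    """Drop column B, i.e. the column that holds the null values."""
    return sorted((a, c) for a, _, c in rows)


worlds = list(bid_worlds(D_STAR, P_B_GIVEN_A))
```

Under these assumptions there are 2^3 = 8 worlds whose probabilities sum to 1, and project_out_b(world) equals project_out_b(D_STAR) for every one of them.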

Open question

Given an MG and D^*, is there a class C such that:

  1. the distance between the join distribution and the empirical distribution of C is minimal, and
  2. the probability of the class C is strictly positive?
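
As a purely illustrative sketch of the two conditions on the three-row example, the code below makes several assumptions that the notes do not state: a "class" is read as the set of completions of D^* that share the same counts of B values, the distance is taken to be the total variation distance, and the target distribution is P(B|a). All names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

P_B = {"b1": Fraction(2, 3), "b2": Fraction(1, 3)}  # P(B | A=a) from the example
N_ROWS = 3  # rows of D^* whose B value is missing


def tv_distance(p, q):
    """Total variation distance between two finite distributions (assumed metric)."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0) - q.get(k, 0)) for k in keys) / 2


# Maps a class key (sorted value counts) to (class probability, distance).
classes = {}
for completion in product(P_B, repeat=N_ROWS):
    counts = Counter(completion)
    key = tuple(sorted(counts.items()))
    prob = Fraction(1)
    for value in completion:
        prob *= P_B[value]
    empirical = {v: Fraction(n, N_ROWS) for v, n in counts.items()}
    dist = tv_distance(P_B, empirical)
    total, _ = classes.get(key, (Fraction(0), dist))
    classes[key] = (total + prob, dist)

# Class whose empirical distribution is closest to P(B | A=a).
best_key = min(classes, key=lambda k: classes[k][1])
```

On this toy instance the closest class is the one with two b1 and one b2: its distance is 0 and its probability is 4/9, so both conditions hold here. This says nothing about the general question.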