Maxime BURON authored
5d1335af
README.md
mar.net
mcc.py
mnar.net
test.net

MCC experiments

Usage

Install the dependencies:

python -m venv venv
source venv/bin/activate
pip install pyAgrum psycopg2 

Then run:

./mcc.py test.net 500

Notes

20/11/24

Effect of independent attributes

Consider the following example:

D^* (columns A, B, C; na marks a missing value)

A  B   C
a  na  c1
a  na  c2
a  na  c3

and the joint distribution gives P(b1|a) = 2/3 and P(b2|a) = 1/3, so the value of B is independent of the attribute C.

There is a rewriting for the following query:

SELECT B FROM T WHERE A=a

but there is no rewriting for a query with a condition on the independent attribute, e.g.:

SELECT B FROM T WHERE C=c2

Remarks about the BID

  • It represents together the part of the database that is certain in D^* and the part that is known only with some probability.
  • For example, projecting out the null values on one world of the BID gives the same result as on D^*. This is not the case for a database generated from the joint distribution.
  • Using the BID only, it is impossible to distinguish the queries that can be rewritten from the others.
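The second remark can be checked on the small example above. The sketch below is only an illustration under assumptions that are not stated in these notes: the BID is read as a block-independent-disjoint database with one block per row of D^*, "projecting out the null values" is read as dropping the column B, and the names `bid_worlds` and `project_out_b` are hypothetical helpers, not functions of `mcc.py`.

```python
from fractions import Fraction
from itertools import product

# D^* from the example: columns (A, B, C); None stands for "na".
d_star = [("a", None, "c1"), ("a", None, "c2"), ("a", None, "c3")]

# P(B | A=a) from the example.
p_b_given_a = {"b1": Fraction(2, 3), "b2": Fraction(1, 3)}


def bid_worlds(rows, dist):
    """Yield (world, probability) pairs, choosing B independently per row.

    Assumption: every row with a missing B is its own block of the BID.
    """
    null_idx = [i for i, row in enumerate(rows) if row[1] is None]
    for choice in product(dist, repeat=len(null_idx)):
        world = list(rows)
        prob = Fraction(1)
        for i, b in zip(null_idx, choice):
            a, _, c = rows[i]
            world[i] = (a, b, c)
            prob *= dist[b]
        yield world, prob


def project_out_b(rows):
    """Drop the column B (the one holding the nulls) and keep (A, C)."""
    return {(a, c) for a, _, c in rows}


worlds = list(bid_worlds(d_star, p_b_given_a))

# Every world agrees with D^* once B is projected out.
same_projection = all(
    project_out_b(world) == project_out_b(d_star) for world, _ in worlds
)
total_probability = sum(prob for _, prob in worlds)
```

Under these assumptions every world keeps the certain part (A, C) of D^* unchanged, which is the property described in the remark.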

Open question

Given an MG and D^*, is there a class C such that:

  1. the distance between the joint distribution and the empirical distribution of C is minimal
  2. the probability of the class C is strictly positive?
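
To make the two conditions concrete on the example with P(b1|a) = 2/3 and P(b2|a) = 1/3, here is a rough sketch. It rests on several assumptions that these notes do not fix: a class is taken to be the set of completions of D^* with the same number of rows set to b1, the distance is taken to be the total variation distance, and the missing values are filled independently. The helper `class_stats` is hypothetical.

```python
from fractions import Fraction
from math import comb

# Target distribution P(B | A=a), taken from the example in these notes.
target = {"b1": Fraction(2, 3), "b2": Fraction(1, 3)}
n_rows = 3  # rows of D^* with a missing B


def class_stats(k):
    """Return (distance, probability) for the class with k rows set to b1.

    Assumption: a class groups completions by their count of b1 values.
    """
    empirical = {
        "b1": Fraction(k, n_rows),
        "b2": Fraction(n_rows - k, n_rows),
    }
    # Total variation distance between empirical and target distributions.
    distance = sum(abs(empirical[b] - target[b]) for b in target) / 2
    # Probability that an independent completion falls in this class.
    probability = (
        comb(n_rows, k) * target["b1"] ** k * target["b2"] ** (n_rows - k)
    )
    return distance, probability


stats = {k: class_stats(k) for k in range(n_rows + 1)}
# Class whose empirical distribution is closest to the target.
best_k = min(stats, key=lambda k: stats[k][0])
```

In this toy case the class with two b1 values and one b2 value has distance 0 and probability 4/9, so both conditions hold. This says nothing about the general question, where the target probabilities need not be multiples of 1/n.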