Commit 791d015b authored by Maxime BURON

README

parent 94ce96b6
# MCC experiments
## Usage
Install the dependencies, then run:
```
./mcc.py test.nt 500
```
## Notes
### 20/11/24
#### Effect of independent attributes
Consider the following example:
$D^*$:
```
a na c1
a na c2
a na c3
```
and the join probabilities $P(b_1 \mid a) = 2/3$ and $P(b_2 \mid a) = 1/3$. The value of B is therefore independent of the attribute C.
There is a rewriting for the following query:
```sql
SELECT B FROM T WHERE A=a
```
but there is no rewriting for a query with a condition on the independent attribute, e.g.
```sql
SELECT B FROM T WHERE C=c2
```
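To make the example concrete, here is a minimal sketch that enumerates the possible worlds of the toy $D^*$ and computes the probability that `b1` appears in the answer of each query. It is not taken from `mcc.py`: the names `D_STAR`, `P_B_GIVEN_A`, `possible_worlds` and `answer_probability` are hypothetical, and it assumes that each null is filled independently according to the join probabilities. It only illustrates the possible-world semantics and says nothing about how a rewriting is computed.

```python
from fractions import Fraction
from itertools import product

# Toy D^*: columns (A, B, C); None stands for the null written "na" above.
D_STAR = [("a", None, "c1"), ("a", None, "c2"), ("a", None, "c3")]

# Join probabilities P(B | A) from the example.
P_B_GIVEN_A = {"a": {"b1": Fraction(2, 3), "b2": Fraction(1, 3)}}


def possible_worlds(rows, cond):
    """Yield (world, probability) pairs.

    Assumption: every null B is filled independently according to P(B | A).
    """
    choices = [list(cond[a].items()) for a, _, _ in rows]
    for combo in product(*choices):
        prob = Fraction(1)
        world = []
        for (a, _, c), (b, p) in zip(rows, combo):
            world.append((a, b, c))
            prob *= p
        yield world, prob


def answer_probability(rows, cond, predicate, value):
    """Probability that `value` is returned by SELECT B FROM T WHERE predicate."""
    total = Fraction(0)
    for world, prob in possible_worlds(rows, cond):
        if any(row[1] == value for row in world if predicate(row)):
            total += prob
    return total


worlds = list(possible_worlds(D_STAR, P_B_GIVEN_A))
p_a = answer_probability(D_STAR, P_B_GIVEN_A, lambda r: r[0] == "a", "b1")
p_c2 = answer_probability(D_STAR, P_B_GIVEN_A, lambda r: r[2] == "c2", "b1")
print(len(worlds), p_a, p_c2)  # prints: 8 26/27 2/3
```

Under this independence assumption, the query on `A=a` sees all three rows (so `b1` is missing only if all three nulls are filled with `b2`), whereas the query on `C=c2` depends on a single row.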
#### Remarks about the BID
- It represents together the part of $D^*$ that is certain and the part that is known only with some probability.
- For example, projecting out the null values on one world of the BID gives the same result as on $D^*$. This is not the case when doing so on a database generated from the join distribution.
- Using the BID only, it is impossible to distinguish the queries that can be rewritten from the others.
#### Open question
Given an MG and $D^*$, is there a class $C$ such that:
1. the distance between the join distribution and the empirical distribution of $C$ is minimal, and
2. the probability of the class $C$ is strictly positive?