From 6e6216425e2719d81735750dd54092700d2e0faa Mon Sep 17 00:00:00 2001 From: Maxime Buron <maxime.buron@inria.fr> Date: Tue, 27 Jun 2023 15:16:26 +0200 Subject: [PATCH] missing data project --- .gitignore | 1 + projects/index.org | 1 + ...6c9d053b3952bf48300ded8b9cfea3a1e6e881.png | Bin 0 -> 20325 bytes .../best-answer-vs-most-probable.org | 438 ++++++++++++++++++ projects/missingdata/index.org | 19 + projects/missingdata/most-probable-class.org | 94 ++++ projects/missingdata/most-probable-class.tex | 68 +++ 7 files changed, 621 insertions(+) create mode 100644 projects/missingdata/.ob-jupyter/e16c9d053b3952bf48300ded8b9cfea3a1e6e881.png create mode 100644 projects/missingdata/best-answer-vs-most-probable.org create mode 100644 projects/missingdata/index.org create mode 100644 projects/missingdata/most-probable-class.org create mode 100644 projects/missingdata/most-probable-class.tex diff --git a/.gitignore b/.gitignore index 5b96f6c..6adefb3 100644 --- a/.gitignore +++ b/.gitignore @@ -1,2 +1,3 @@ projects/het2onto-benchmark/.tmp-conf.txt *.log +*/ltximg diff --git a/projects/index.org b/projects/index.org index 52b2547..3091d22 100644 --- a/projects/index.org +++ b/projects/index.org @@ -1,5 +1,6 @@ #+TITLE: Projects +- [[file:missingdata/index.org][Databases with missing data]] - [[file:obi-wan/index.org][Obi-Wan]] - [[file:het2onto-benchmark/index.org][Obi-Wan Benchmark]] - [[file:qa-test/qa-test.org][Query Answering Test]] diff --git a/projects/missingdata/.ob-jupyter/e16c9d053b3952bf48300ded8b9cfea3a1e6e881.png b/projects/missingdata/.ob-jupyter/e16c9d053b3952bf48300ded8b9cfea3a1e6e881.png new file mode 100644 index 0000000000000000000000000000000000000000..c879e52584a148a65e2e18a731bae45e78e3b9c6 GIT binary patch literal 20325 zcma*PcRZKx`v&|jg^*Dsdu6YR>=79ik-bM`%gUaW86g$2%g)SJ$Q}{0lI)d{y|?GQ z(f9lL{C>~xc|Cuq^d9$pU)On^*Kr=labC~w%SoTdBE>?XQ0MR5mQX~YP7|O|r|L1$ z;eTZNGK=9aetSuEdnIdQd#49>MyPua>}@Qq?Jdm=s2z>$9-CQPadGf+Tw|j)wYRr< 
zEWpY6=s$1Zu(mVd?8n0Nhc7v2b6evv3WfIo`8ky?nr4PV$#ve55L0%JogZ{le`9nY zy0mHYcJu{S%@gW(iBYMSQVl-6WXS%Y_IsyQM+?tLE0rQZ&SDLFjyk*SPkn@0e?)4( zre}-Nqf|wbUuuIYUDqGC5}pnBroJ9<HFjZ`isLoiH;>wHTvb;Wdvzb{w5+<sM>R68 zeVY1`&_0!PQ5+MWfPkPVOj<et1^@WOfJ++?5YWUyE&=}^ro$INO-oB#W`i##AtB*7 zO8pF8oU0K#18?o~^~QwP^06@p;I)UuxU}i_iXZj+efuWszFbe-xo<B(LFqCPrdqk! z{XxmLSI6lT>i4%O?<Y^5s8SG<kp<m(%}})S_;a*Unw(f#*XW?raFM0k=AhI2t{I!z z?hlT_k^GN;Q}ghse$spNi2ZkKjDJ7?rn9T7!OxEm<U~uR61*rY+&29pBZ)sd{6R<E zmXnjRv&-)i-PMeWiuyidKY#E^qHUz7gYjmLVOQ$C(y4g2k@@-HtSmOmUxnu`GRWF& zd1Pi}e02DuTE6`|hBTJ`UMwGh+r}L7tkMa0gH*HBpH?2M=Q{7N_lCWE`F-|OQU#Co z&<izPm$Nf7GjFdxIE{mjhSJ|&nUvetrgf?kJ5$swdblk=@VH(0t&q$0A=f|b-{OVI zoQ6GkMGm$MlarJA)`ql-EyX-Mgc;=HvpV`5V`pd0x3;$Ir+HiBh20WO+T$rUwpoU~ z4u#!4Jo>qnQWSM`&g<#v#Ux;$LWMUj*VNRo_2*U|xVDNUH0jv#*Pic-b}YWpy3=Qp z@ZeXWne$RL#`35?744NPXJUA*8@@$xT{CJXoa{tLVbUzYomKbulX>m-4`&(t>CJ8Y znP=Eh;pP<6csP8xPbGfq)-|)vE6OS=9}Zu#n`#<t?+zb^JVnE(X>Z57<&RqrPuFg| z>j?vLnb(T^G#c9I)RccmlJt0&ylBy{G_yYa-HrLu!-L(90u%YI?d@ty9y)%05^oH= z26(!wu4|@SJ3F*IJeM+6vrnVkcV^N`XHsKE;7iD?N>6E(SQWW0CV1|ggD{{P8y`o$ zUAMyZ!OFx}KKHGUEA1}xb`5HJ9%tj^ddL3!8uoVh^MmHvHG+eK158ZJDhj88r*Ns_ zEfO?zoF~=#RaH;X@{C(aCtXgd9gvmpZ}g|<<P^u=>Qa=wcklcq4())_QoiYR*R|Om zZ0t*HRrNu{*+WTY=PYXd&X+DXkd2OydrJmgD%x_%%gb{<*q%Ct8XX;l&}#bj?aX&~ z{_!3)Jzsx+43X`Li<z5mxlC~P_V%)J_8~MwY{wgDn3ym<Jv~b&BeXR4+?Hn25|Gfn zdKo7mD5y5!U@c8mCoT1H0+)$g!8}C~&aG}*F)NdexjtG3hSa`|7cX#Z=X2{|N%@3^ zhN_o4+XgL$hlSC=CF(*b1hb0K@Zp~E#W59>l;}M7|L6r(Roz^MDA1T!)6ro+efo5r zmUV@ZgC`75L_%-279$S2xAWGhzodspMbGlJYu60M>Yg`?KJMbLZ)gZ4ppyzX+#fz9 zg775x`t@r~ON&e_pKVS4FpM{Sc}0b!xcD>m605kMg=VjB-?>8|5GUsju@<qfnypEX z3V>;PjzQ&q-+iOcWGpFE<ih&;IxREv9UmW`fKzDb0Zv0M(w6<j_hh5F@1XF5F0o(W z9kB6sKHT4|k6_oTOIJwpsn*Ab@GmNNva-r?Uir#cFHPlf#qROr`i5Y#`Zvr<KDoKM zTkT#4hJBozobo7qh#_-FT^AP@aZDN(7M41Quq<7RA6W!LLqlV2f{TUZo6W4bm~nE9 z3zdhS%kW7h%by>`t}rlQd3kx+PBeuXFM2^<dM+=rEm6KX<d&#c=}9Xf@VS0CGjqPL zKzhJsDz2utx7Ut?o}NB2FHd2>wvhrEn7zuw&BMBl`TlyCM#j;_MNX^oWhs~l!}Vby 
zT3Ye7nJ(!p?N5qxKl5(w!4k4sUj4HDo9vp?g7&GKmV=*F=HS~22?;T2d~V;pd*RNV zI}sjaWMuVmPD9L}ofbLd;soe@@C_!L!%shP-kmRD-`(F|^dg2KmyNnEi9CcmjE_NQ z@|~KlE+uz}%!qxF!@b9&Gc(Lq73->b4{FhoBoG*5{Y7JKVX&-kxFDjgt}dXjz=XHo zbKd2m!xdcGf%{tM1RLh05^uu8pF^yuz&em27qCz7^NitnbRj-IJ~27DZgwvkognwi zPR7#O8ll%#EtTe8tmppDb7rN~!kyTf+FIYQ5A!WqSy(PmQc}{gvbN+u>P}anWn+_7 zR#nw3?&<9fs`kbRN>69%`1zB`XKUM70OrUzA^gpoE2x0|-A%3H@bK_+^78VdfBsn9 z(C4$Ceu<*ib(_BtGo`Di$7yb1A?fa3?mYIKk*!~gE1*=>);4dtD=qXgx9LkJPEKNR z%&>$6&hql|ItV-SB5Na~l&hq{WASba1mc(--@jAS($Ur5;~jkX30EA`cD9>&bbdaY zTmrMh{UwtRKI9xCRNX6jnXKYmtr9uZN79E4b#<6<JGOH_uT7eqTV7pU-(H={E-ta> z;$UYdQczGZs0$z%gP|pW)jd{Hxt~q1tD-{hx;9M%N!`*x`_?V*%P@?WuNn9l^*4pl zNj`j-jAR~5t|-|==}<~?iTjb&lQ?$K<>f*UjWrLiKUf^9fRC6Kc>KuK3%zF8z^zqF zEI#typHf^(3jIfpPLN5wOKN>PCN8ej{rmSb+h>aVt#0Y-znh#|TC(x+^&N$fL*)of zZ0CCIW5aG&;5yRY)pe1}q^&NB%ftpAC3ADHC)cxQ%tTT`f|`b=dgXB_r7-@bOMVY( z{b;UV@67iYsd|di=f^;$=jJw!)d$%^mcNzTTieot?OH+E7)sR$iC$}eszOyog@{2e zHXu5h6j|*Nvc8Y{i{F3wLRnzkDh|nIauV`pa`F}T^&U0U6T`-k(Sc8o-(EBH>*-XJ zmBo<?rF^vM;;}s$iTZB2bKrF_<u$UnIBfYrzv(5-^l)Y6V1kLv?c10?vbFrJzigu? zNry_o@MKeN4OY1Ea%h)6fhg9}No#13+FOaNto!1zdtXJxHzR{(uD=8i(1b7K%gL>( zsw$4<IXs1nBI<YKRu{bv$x>2MoF~K8@QH|e*mTv@{9)Oj;jCCwEKmo0fm}_@8~qN~ zO+XVrQpF^0-@aY&2;KWsPfri($;yfyCXMDdOViKGk>TMQk4nPN!eE3N_piBn6wJf$ zRE3+Ij!sTV84czgsw!v~34-UhUmT)<HQ_w+6oYN*mntO0w>*zXJ@=NS=LSAq?C9u- zP`9<ch8p=DV{Nspju*;0%o}|$CzX`fCnA;9u~68VpMTxi#pMQ>?5w@O<=_TYy~pe? 
z$oVgLqo1!S8TJhFr}DgvBzN^LhiINh(XHd0+NG#CIe!?KnBM2<nIqIS=eM`Esw{bU z`*k%nGiIh_6GSMak_Z62m|9vE3XS}jSZx=TmXtiZ@MU+wxRDeG9WoZ0(||38Rpq`T zB&wo%3{Fu<Ece<!%^2&btN&zkTo~X--83~#?{PGFXn<tRw>K;<UE<WWEUG9h?7r^y zT@_Dci~7sf7`A0!eo9}~{jP)=SeW9nvM*XUO-X%-h=^_k6%}R|iHnQh*V7AKood%Q zf5iu%iG{_Sr_|PXLk1!qmrCS?kI$L%r5fzF+-4Vc)@IN~rv9w${K++h<ibu#MM)X* z`Ln>Zy?}qt>$%05qDpTG3f~qID<;4ekgOxDRjR_R5BiHAb?p_zdmXquG%*?XCDyH~ z@x{(Q0zR<ee9u(Ny%bsj8+}+zOb{#;YJe*6;pl|nqP_k7YjzX2A=OsTl6e^#(ha#S z3QI^z@*9!vYYN1*DYn1+nwH=hSYXorYu&S5^iT-4L0^~^GWc;T*a9eW3U(>jK~LBE z<5@4CMFAwaDIsxY$m`GplZLG^LBxZ!%xSUs&j3n)pwt$XgA<=q?78m>*h^!qE#mcS zZvd)7hkL6{U%!gk*j!DOPYC(-OA}SaUNZ2^YVb2;ch8S5aci!^$gLXGVIsF`?)M=- zKVM_RpB(_p)<W50R`LGUcxY=xSAf6;6WW3AkR_IDusKJ1D-UkOd+wNZcXy8z_3GY& zjHM*9xZFr3ee))Y(tSxx`?LMI+=>lSQZlkUmsR7vgU!lFN<o{En{&U4u;CE}J(q*@ zMj(1hG)ZTAbE(kJllrvA@LD*2(a_MiEb1v_RW{EfwBD`s{m;7k7kB<IhX-!ufD7jp z(U5J+833N`phNEiSn_-}BdCL|M$zI@w&WFftg&!4-ReTKuJ4szUUP$Gy%)GH;~;2^ z0BQzeJ$EE}`CL}6cy71oCAw`cFmZ4+R37eIZ|&|Hb|gwJEiF~aa8~$WP&%U!-0itp zHarU13?Q4mX|(6w)^h21h~Uy(K|7?~(FiT8YIv7FfUN=N@!+@ok5dO12Q__teYfVV zy^uxx@ZrO%+FG<fKeB5%J(sbCR=;sIwY5pXFzI^jJ0R$um_xe?ZWl?V5X<BJ)|K_J zU*s9oW1)`jMDJiF5*d+r_mw-`+}tC0UR!7|XSSm?XD7G6#qwXQ+@H6;=5^q{H>*}z zGt-rp_weg0IUO{gf%}ajy>}>#S|ZpH;0Q=W4FMKxTE*X(`$*K>06Y0cMn;w{e%Y-9 z$jN~Uc+Du!BUqf0w%H!bPnT1+AOK@#5gyX_gRtGA{lkoKjt7=!Z;p-utn!-SVLia& zfeJ}d2>z)vPe^48@YvnB;nogMea-ayyGx>(DXSG51I;Zh^^m|N)z#lFt*%}y8FJ11 zS!AJBz__aQI<8}FxG1H!sE-gq3?+k(F_>6b7tp=M5U^p^m8y}#fkjgXL&6Sd-d8u5 z81{Q&R<)eMsP)I$5eK_iw)BIWfLVzEj@I<|YeD4|0D=AUm003uzv4m$!6asOULhfE z!b_Jl=QkTg4-`=NWV}{|mMyKVL>C$52!&nO@B!{6s^#h;aCWQ~Pqa{AV?0#!(k(w6 z0@n&%KtEdLJF}VD2W~$Ae?+ip&_Xt$b9Z;ITN<sckJR)0I)^i>Y%i{TgYizc-0${; zP%i+}5R^Bi6R@DneST8rQe|EFM=5OEOr7{4tg9zCGhyJ8I5ij7#jvojrS)}5JG<*4 z<oshVlxS#aXICIwKL7ffk?#8S3j%StFh+`$uOZh!YJ-~ZoqU3bYLA9iiy0RI<aF@| z58h8YIAE01%S7nUbf%~mTap08#7OE0SRQXo%FplANV_E=(QTxnqViz+`+Im@BXgs@ zrKJ{N<8#0aBO}$`&I_NWmVO$BKG@$`f3%~W*7JO6WySoOwKnF{xjD<^v@}B~tx->U 
zGS%pWgvc#=bLwy52~T7x!K2N-Vvv1H{fNLOn|Ij#B9zurj*dJ}SZ}y&EDnc??$5~c z?4ByBGxH7@Zwy5q2=+!g>5w{DTODcgR1=ev@6*#UnVFd*JkFz|y?XWPquZucj5lUM zpQXy_;=DWpSSL4OQ!OJrgrVuB_ql0gL=Vu6y+S|seQIih;%GO4c!2RHk9YO=tXJ6; zPzV`1P@(52=XgxTIo(%N!?3nqRILbALMK2j8-2q9Nf3m@#N(5ZdJWSZNdV{xWCphY z4`mtxEY1FWS(lZS6&p$oyM~zy1Yt2uOiUj$S1c?b=~eMwJ(FHJx8!#A%$bqS6dBe2 z>({TpjEb6Bazj&dEiJUC%E`@@HZwEJj2YlJ!na+2(|n$H;7MJb#M*p6G3*<~+}WQT zeR?|6|CnPFb4X}vMo&%!24X?9k%>+cQGYzIL;k$!T1NBr(<>Cpjl4X~PNw(E#XP)) zbk&BLf@8=gmZLRYT(&ZzU1RdFZ#Z-s%5by@7XH?G=~XN2A?CR%prlD3OiZlzs(~{i zI(b=UlBknUHt>Kqa&j^B^Y8OtHD$QF6F2;OFmZ4ScZHP!BtPA0xvDKnw$$B!D0RN5 zbgkAL3Ji4z<{##l67R@tjz$!>6v`~(K;<)FQ+uAzbB`}oHtO_9Gt=~Jm>g6f1qvNK zmiNZbpr0-JK*UYJpIU%NJ4;S3F->I4Sx+rK^~eqX!Nx1R)9%6(hgZlnr)OPmEShg~ zS44(#<H&pO{c@Rpbl8WwIhFmkK`s7?!OM4BgQmfb8V4IIPure~`Fz*eUc5X}w;R)8 zYWYK<I63)@uC8u`RYi6OhXd_5W3CWN;XvSb>gwyOO}S|q85=L;W@k%US#bbd9Hc}7 ze?mwAB@l<kIh9~wV3643aT!%?JsdJx>tC|tf?0LPJ0LCI73<TdPff5QfVSb3jo~@Z z@cJC=(0+1hx6I910HnMKCgUC3U94=3<$sK@4kTR0eB<^&YW)grt6f$j(}8WzAD5!$ z{Vji#_MK57`)T}@)zwqlE?4w_WNAd~PXP<nl_pQe&VC+ne0_5`>oOFYBOlV@&AX-q zjQGcaZrN(J_Vn)TR9u_uyO<;uTr*VRMsV>W8p`>1l*xS^onY7;lFG};ksyZU9rsxk z0C~5{{BR`|l0PS>QcW<S3=W2Ei5T{wh*M`#FendT6F^F!r3Nx<YFK=*H*3ap3Z5di zsxrDBg-=fYVk{p^<f2%2$mbm{0%OVYUr#YuTQVlwaR`|yvxF2Fyo{-Hl@Iids$I6d zUrS1^IBz5W2q)&aJD$!I(U?hyXZ)k5Y5YRs`%AxTy>3bRt}c+=+oZ1zUVXn>ct~@* zL*H+E#4as9Q}x$m^<798B@d=v`SdJvJcIn0KeiTyUXL4+YV(6tk=ufZ>--Iel_<X! 
zW0&#)OSj_p$(jfEc9cK4r;e_j-IlHIyY7NDwY|FH?DFAH8#fA7PTxmG$#!=d=ZJMQ z`@q~%6E?1M=7?B~wzJ_+^2p#ct#N330}~}c)P_)#X3twlj%Y{T<~Ex+E_zzpL^)qB z0tu^z4njF2bq(3Ol-BnNV=~n0EdER`M+-<^W0u59`^vUh=W>ug&)wQZtil|-LMW|5 z-1h2%OI~{Z1r@Bc#Gurfa-pFx=2+ASm8GES;)qGf*+{M<Le#1-wi@9*!y*Pu8EPE- z1mWu4A8Z(a5xp&=-F8HiPTS++(w4>|{GkrH2|}b6PP~7dWs#2UKt4(REImk}$LxOP zQ=u%Tu6I`J*%@f>P1m!{Cm*R6uH@vX^gK<EYzQKnFUMxN;y~LRtAin~J5`+w?=qkC z_@GU>*d&0D!Y8*9RAsbI%3<bph1>YokbDiEl{jK*lj%835L)-E?Z0lXw8IV^#?(rx zXI3kmhK2mzsPfwBT#NfDHY}PhtcrY`%!+(6gv_xfS6~2a6kj!H+-Hy<WpljhTaYN* z5x8p*CN$K+9D8bT(pCeD@kte>7>{9ak!#aoi&;~!F|D>qDXdb7IIQC*ubad<qKA*h zFJOZ2A`DA1Ge#tr_Vljs(bQ^=*`dQ-h9N_x{UW2MBp=r&BfKQP%{wW(eA3pW_bFd) z2HJyHjYbaCsF4@Rmn0PO%5f#k^1L0W1>9JzG5*{<<J+Y5)ZeEsg!gTQ0>h#GJWPSH zNokwxaqwu`B@xQ^@*nMP%Exy=&+Gc})zOyD{^@u;BqZQ%QJB3X3ovF1X*E%piE>C3 zsgx@8H1}7l4rizi6|IM)*yyUJ+BB1iBz$<2b}gRiHI?3d7V9Y;raR+8dTjC?$|*Kj zDy)mm*elM`X*P~rcWbMg3FZ2;znr2L68dhO^`!w#^ls=6{F8V(4w+ZuvPew%0atEs zb^YRTg!v7a;R8KzG<wHlnY_E$a<U@kzil(e29^rdc8(It6(k}5X7ODepjRQ5lf+7= z^<CArP2u*`?R;L4s2pF@r>@Hs?Bq<`Rz=zQv8|f0?YoT&mWjIh6Yin1w0Y-wqnY>7 zC$n(+=u+Gha$E$*%P`;=_yh@_r`<yC5*UA2U@%$+fsDv+wQ&(#pbyMil4qlh+OS%? zXsci(rpl>chz~WFDJx(ker(QD9oh~Qz&8nZpq^M>SuM8U=hkJ5pv=rdQ_x}$&7Ag_ zw3TGFR-p*6dOw(Kt3rWkSel3{kuF3@Y+lwTc`tl#Gi0vzXw~`U*q?TK$RJ+!*p}x= zqTo*U_HH5J;11MxM1tt#-bly1!{HVmQt{qho;ZrGqZj0;Tk@5Qc2pnQi1WqPFw1vj zsU+J(|Cv~Z3>7%#JF96{KAkV*yP9_0Wv>v%@CB`8L1~jCtM$$Jnscb&$yeNM&DUr0 z<m5+jaKikAKl{pzSHE9T9eN<;A|xJGLy~=abnYUpwaQRgOttY`zEj+q`aS;ASem0; zFcDAg!BCU;Eo7tq_)%bbJ2=`U9vT^e?m=_oMo<1d%%>omNJvQm#wP~yl@Cf!w^&>V z$H)TrZ98OBY;43hckUbj?@0SYuf1_9V2hZ53Wkt<y1P(@XJiyeUt+=MfuCnREK<Jy zL+kSOhXFks{|A&*{GxjYNnbPE&P5%-cf5>_4qRKaCl_+YH*9!O?Q`zp%Y=mCu>(D? 
z{Ts%u(Y~J^PmKYOwY5}>hkU=#QWY8kE^%o&bUypIEmjaCg2MNT-S!)OzW7La<mxJr zDi;?B#Gm2%Y)`5C_I2Q}{(M6gqFqMnB#yN1qaynm#cy$f#CSqWD0E!%0Dz|w($a?Y zfkf;Jl}?Mgbqx(;d5x5}w6)0*N>ga<`>me<_+mtQKX?b6#lEBuaK98N{p{kY|3XiM zumLRd6C{nGMI)Y=2iK#rbdhx2Q?w~m@M-e5T~SdHVEZRM!bB)VMMcXYCD_M<q4w8@ z4HjSgU1{lMDvvb+ptpXeKDPfwJ?4!kQU!Pzs_`u2yu?IW)J^A72LT0V=TdsdY$(ys zU|?wJ0+;hEfg2XGcVYDNfRK>l&x=L&qGz7~3V@2Qa3)>=ln<D+!zw7?-uM1_z)jkb zt;Jh@u=D59PJsZuVA8vHs8G=ZK0q|MR8;*g6}@tOm^7Ah<80cc_y9vA67Dfv#JJG? zfG~wLxrkc1k%whgs5^5kP<1FnfS_4*2v!-f1$ic#ZKARY##lagP6nESGE+i*x${9H z_Yid^+NmZBF?)N8fQ1zS{6V*7t*4vHg)1T40zaCyR4`~$V}vMwTL|jX|H>sQRqZ1= zd-g0Q*3Nb`8BXyZeQgizi+#fDP?uF1h6?);>pGu8C?YIu?B%sCdBdS{7iU1|njSfF zBogF&HYlJ_0Ox<Sv@kWTYKs%hv+TQ0F6f9;6;ART5MGN2yxuCMcn{==mD3jp2povS z#Kcs+F||N_0s_0w@`sqCd~0)a4Z?H)$3lv_`mhNfDeu3%z2*$GVfI4nbvcbTM1ewh zBG9EYbb(}Ptk@h1f?wW4Sp@_z@FscWN?KY$0|VMGD20EKq(K~XbNt$^zFVOi3KdwU z8w+Y;V&Z3cn(h~O5=zP+w@|?PqGmHn)gCv?O6coTqvKIp7PSr!Q_;}SM9hEq@Bvhq zYf#7%P*H_~R#Xe37?<%kn+9e`Kt-Xrrd8n)5q?0o8f-0%-j<a`^cUZ~J@;=hym(dN z(x=<vg@bNb{6L|AC4Moyzu=^vXW(nZPj}-65uMbFo4_<=ik8%}sEPo`0CaJ~-u9}3 zu5L&eowNa{YJl|T(vKF<9YCPc8X6?J?#q%;7#l2&RR7F3!Uj}~LIGk_y+McQtj*1L zfhoX&+WrcxaKAjm#xY<XZpDbGG5)%4-SPabECa-DQfXGAiiH$xfs+`lt?Pjpj24s2 z8Kf%S*%4jeqhmi^@hV+vTwU3cr(Z$WN+ZIPix<?Q366|xBD3t#Gu<|G!dCh)f0-)l zO+(y{{v`cHhV!BS!H(nmi`BOlO*VJAf9OheG!F}`YZuiFA+U|qE^UYecA288uqeO7 z1%0I?*!O93qMu)7)pY4)(sWS+IKpZ^!5$uz0e}ReoUy${iElMX89*ECcU|wJ(5v$L z(0gQW-B0J9n;DV4^;Y2^yCgCr+)3U=NJLrWH9pamP=6U?QWW*~#trt?%pon;tcat` zd%hHaS^A&Rqn$TGfLbsq=@>Q99HS=YV^DW%f&1@%Y;R+OKPrwV32%Fdr&7(N5tt@a zt2pSZ6jt+rjUnasD0fd&8O0^Nh_tKmY>{a;znMSV{AMk)X&%ooP;ek7)Zr7f#Y$7d zOX42NHsLiJxJ<wnoT<oXNW>iLaclD4g{0GyUI$n8wwF)6B9ucd^|<|dapT9Ykg)wD z`MvW5GKg5c83V53zp#M&FT^@g;NupW5tMK?;rX#>&+u&ERZ{jBB7d2F4_+A5%vp(a zO%Ju+=07);qb41w0|_0dSKvnuUQ6KkawV9UW%i~v(<kcS-UtVM*!*;{Y58Pb)1B~K zSgjo>FkM&|O%301=VYM~(B_WgXQ!ieC@Emn3_k^2Ix_xf*Ui@3X}9Nj>Gg}FO?G~M zjN=cFfy7q;GpWw7(iI{bqY%%)`D=pSFb8urM^8h0_88GO)!xFu!`n@}=!4c+%nNn` 
z89P2-CBUj-smwCtSr#wTOn<<gvMTXeOe`n1tPr;DVY%$3t*#I@DFqU?J)V;7-10R5 z81J23-m0=v8d$AI8Ckgc!@o((#R=~ZM=%-P1aou-VOwcDXdig%VGVTqlgzR20j0S_ zr$A7R(1-r1upD+@*PZ8GZaV^CfA1}z@3`7<(rg<+Oc1w^qT>iLHF3lu<K6jEu=NCe z(;#(N&xhHYOjwfLtg{kL=3?m(*4@qHDmMT$&_k|1%it2{Z%Dtg_0_;7bD_WlW}`W9 z!h!mm5q&2gOIyo$TMin*fD-8s|DRS<gLGN`Yd-^S!$*cjSDc5hZhzWT>3%w$0uxZH z#u>azMs6|r=zh1|l4@PW70Kvwp---hC$-dV3<Pnf6Z0L#2&}tFoVG;(4Pfm!0^~Ht zNTm>_b6m9~S4Cmk-Dy@G4x}I?iSAS#dLmF?pbCXR5>zP>X*Th>JTl{F!(r$TnH2d} z$z?~&?`v}e_r8GwLQlC)kk7XZA29JC-GC#hNiwZ3*N@9rf;VSo7=t&l!_sV|_}<pc znAjA;T8FjcH#5n67`VW^jm0T**iCwZix>XZS%88Wp0%IYP>$}E{^(Kk>Hk$c0BlWH zQcSgB`DK%#$QQ07KsF~odge*AK>spJflz&cY*>xU#w!+IrJ;JIs2Uf!_!^Cj=^B@9 z_b~3EQRdh|OQ*gwsD%42LUl}wbtEKag7>(7Fv%~!z<Zr$voSxaj%C72)u_nlQ4WiB zD1%ULo7a*PZ+kiLs?e3;5#e7q>R%s+2FAaW$4NzK!;b{|?Wae;8Ad+}^BBC=LFmMO z>kz;XDL$>%^Ynhb#q6B*lt*5ws_Ps-)HSin-A|eB7p`#dr@=-}3lJXJxm5?h;Qn|z z-6PYHqos(QSP58J+o^H(=9yh*nei`Bra)x5GM21F^thB>i1bUDOzdB6m;*rXj?{s$ z-i}wF?a@um)%+{Mj{=2T@Z!|?aePw>#9i-_a3}{A>6rkdw3Z92>(kyF5XX>*GoeVm z*Hf${6rNp<<Yh0GSCI4)vR^jHw1vxztH2st5qBy7{Yofj>>_n2<n-P3H)+m(QcX&H zNr{lqM^_w&dVTxl#$U`Z$alP0dY$Hs&Z5Y-4`Yn3!fFkbI*VkNQkeJ5H^Mg72mOLg z1`kC~2JR#Q7I2Xl0>h3&a{UaT2Oyor4Gd^ScIUWcfa(07SUP}9&KFX?j>|N;SDzu+ zb-;k(9U6*v;lhQF7QGxZ2gU^?peh7BfBv*u|K{))k374{=Hh*gc|ny=pj#vKF+fjI zga<{oQ6SdyaR`A)ewHE=Y0#H%^s~g8>V*_tLnu{f^WAMtPJjb9N~{J2l2cP@g@q|F z@F?p*{=E92=FG&@RKL(~kfy6a$#@E$2#m%MidrC~LjUBH<=jj7^Jek|@WopDd*EEq z2Q2)v%!xAx+&-XR(ZN@Nx_K6rrCUJ?J{0vbM|MirX&RGQB(?5-=P$kTpz*y-BnQIQ zFIS=Ap-}k+1$mG9d9u{=PJ=Adx-tCOffWoDwks2N)~0PIAB_L}`IE;hrz_~*4G<;( z>fQut_rk@CVUdykfOWyIq`q<OG7fkYYQRMB=u2pyr5~vMuJC{kJooTa1y)XfpNlu) ziP73C2XP9(U)AgvEEJHyJd2rBUVGR4@Tfwd26+ecJSe-O@vgJX%4#hIjX9Hr7_4{z z-hfdBI+mK68U?g|-FmN{^){NXzgC(V_-e)hEb|O=o^ZH7W?nUr&*y(=`Yz#ZfC1(1 z;|ihrRl95<--7%GM=VBEZ02W6t_{Nvdz>@2Dp*te+o;deS>@O!Mz&$xA=pzBwkA-P zhraaNubp6ye-IW(C?`qW_C?CYUB6W5UV+og`O*%(_!^QCx<kf3dCt&XXV%Y_kWz?( zmNQEIX>)e$7EDP~=k|4`H6(+kr%f{xxQ-~}KFi7znD4B|&SJj3xC;}w3{O%EB^x}H 
zgqayLqACDQTsZRyGwnPrI?m-MfKw4wO1;uk*ciw@z^F2NzdlgX_}K55jMn(xmXRsg zv%DS^t)majR!HjZ;vqRdB?=1*3!iXp_1zEst4*O6wH`%4tAnm^2<jP7fH)(~ubM!K zLO8;zUNsSQtFb~cAKhH(O3!^13PH#ql`lnB?wxHH*=0k$1G=7oK-6oTiJ94qC-8H0 zwf=qGo5<yU-}aO2T1K&vgs{AS{~mlFmOOu`%W|b8B#?=WICTbvL(T`zBOp=HmzS5b z1$$NJQ%=#G0UCE~Y)qe<H0<?j!{<Fee$evr5`#=`mD`1=As0Zbge8}?kcoN;Z&>}} z+QMEk0>N5pKQlG?72<^e<nn;PK=XAIK_mQSFsnTPwTjPSmJq@DP@p1?f>My2!6Jy1 ztboA{#~XsT4t5u>nY59E-|Rud3z9ZI3U~;|E@jK+Ff}a@{YU3Zh8uvYwOyN55L&Lg z*wo(MSnjfl`VXsX^6l-@moM=LfN}+Y6xGMaQ*uK4Yh8$sO{?O~DLym)B&Z9aNR)W+ zfaaOE_cf?TKpZCmAp`@3D1!fQX#GuZG9x_noIv8-Is~$Us;VENUIIYBEaZH(*s|~0 z+qaidH$n7KX$^St1Sv&EI;5%UL6su7aN(KT!Y2&Q%00XOHkEYB*Ne%r3>V?b3fih2 z88_(6_1fCn(yiZ^sCD<W48POt(Esy_b~ka%E{;21Ps51IF}cq1H#&-(pSAr9p9Zz{ zM+crtWzyWj<-+2#1zgFt#wI_1L=|wsch=l`rl2rd?k3gcG1J!TSk<q#?8c#>=5bDo zvaI#PR0h!oLTz^9YdyWl%w$4JSW&LhrI#Uq@6arOk1)JQ0U8xh!{U<dPt{`C7)tcD z@@s<PHwnUc1=!Y#R#d-9@VTYFmrmv;fA`i5bCEv5Y?^|IEqGI<aK-Uq>M2wW{e1a; zIM;b(_liPq4|y~>KP~HxJt&y+v;Otbro|Bt6@vuUG0OT&pDnEE)1M%Q6Zo}I$tfM; zL_{v}QYl^WCYefhd3b#?0Hb+RtSNcoiL*%|M134Q;ZCZ$Y#B?kpSX1ul5dYH69r{N zq`hR>W(gbyM1xjan{1oJ;#&|2S;fo8kU6&FlxL@NdO|tDyDa@Tx!%mNu1A}bQd|x5 zdDca%g;zBn{*7X9XR40mJUODy&L3~Ie|P`K@4ue$KdChSizD4Xz8Z+n?<q0xmy#^k z?U^mprOeHF-TO2>Bdk$|qxbf>3-P1_t>??`r>G|hZQ6w-g0DHPCv|ky*@BbiUI;DB z3E<NxblTi^mNyf+E+9b~IzxZx4vgBLT7Z+T%(FLrwq{b6QQmtA_cYisVILR#m7(SE z*lvyNBeus=hTwmeK4CfoALbll6om{84a4H%9u++T*%^fM?j?kIZzB3z7<FUC9zWoW zeTeYGS#d*5)gy6qx6q6Ml(!@eb3GlMxKu`o`CCyn=KzuKZGk#a1hH}hH`OLW7eHhh z%S3j0C}Fb_UuZ&D$*PE!O^0z-`BM4XPh23`f+PIONG7w(>5|}&@kDql2EYT1wKsqv zfut_mb(I-Ec%?c7Wk<H@)V$GdSyW`&9&2hO;5-5~4{xK)g^lVt95UC9x#u-C;w|l< z1A$L4I)M<dZswSs-+6E))ej#YI3ach#B2*m_`JOQc6uy4=;u_PRtbc7VU&-rOTP0O zQCA{$U7sI;IJF0ke&+#lAXdnpVsIi^xQat~=6nM-^z9h1YRNQCz%Vl-LHm%nO`?<; zo}cBca(InKFLh`P5p%qXF{mH)z&Q$4D#q7rdxQxcjB5ZZE9MD7=_&_kJDOAYQsiVp z_5`E+dkIr(vr;!7*vFKu;Y|s;EA5pcioiL}suQ>ALog<Y9b7HV1whStXY|a3S!@PJ zZ2T<wJ;}Gsv2jl^$ann~?hY^TBR=kvLNb8x1p3F}-p|rz4n@`s)Bt8pwbf!ibWnO7 
zA=&>F2*}eM&-SM-L_g4ZlhNj#D0_yK$bov6Y|?@9n*-#7_0G}{nnMP<Kt9>L7doJl zWL@k^g9vO(P$^s$@%rWXtq;Pn;dU)e7?eN?6=gEx%N>Ks%VGqlI-L*kp|rW}<q{Wg z6G49znDD1H=XT}DQ2dIDzQ!<GPiu9D^*fHWvk{;F0_Y<i;UTf(@{uZEs?=wpCZGnj z<*dZX(?JmiQ{cG##JgQ0{a2AAExAM~6U;?MkCXS<`B|6<Bp@R#pacx>S#kC>3g<r1 z2@~>E?r7e9L+e{GA^V&=<dxEjbNJqV%HAY%1fbgQx^VLqh7gatKu};vZ^?}ByYfW< zXka}w`FaEzTGY${6%)C<r3@^c8>&_v?C$xWOU!`2;8;^S$y{<%>eWm~gZ_!x$(7;R ziL!N46~|F#O;FcJLOGc}dz+h28zWJSv@$sz&2~evh}$z+VyJmHATfON`D^0j<*{c5 zX4>`-DFQyKyYSr|Y8K>?)5S2jsmyD(=ZBP!Uj`=}XkQ$NbUHr?O~my{3@@pcq1;?~ zbBLW4nE4}LMX5Dk6F-Oh-o1^k{Y|eB@<o<o1_G&VmyU{ueC+=z6WmVg5}Udo2l|QR zakcZ0GD9GiMN%nj^n>gul<$q1e*pU5SRrhyk-5>i)v*d=Ih=6q&bVmYn>6)H#89U5 zUYfi!UV25IKn3e#obe0+C8IDmn<jv=wA-WP>86vyP|cB$U6CK1o3Vb`01Dn0SC~=v zXWjiul$*-1+oYHZ@P*Hbb9GEy$x?^KD1lV82;<-a!qANoyNRD0zu!a!0B<|szS;_u znuNOgWzfN&K|au!r}MdKY|H?j1aP7TO-wu_=eMgPIr3asMaMt4nJ@~SY>A}f;J_6T z5dns?Do3YG(_sK{O;>|S5=`S3t0n?Q_=uYfA*jK$h?uv)>kmxxeU<f9fi3Wvegxhd zA?AS6^X@RM+gcu1(9oE9aime@BQC`5Ze{RkjDwluGM6z9cru;<hd5&kzB$Bc2VQ8C zUXYsc^Pq15)MgCC4fy^2tvvG{7N{=9!SasE`6m9~dZ8Y^D9?4>TpLJE@OwBMvz2&* z?RaXxR`b&w8o>U57?TlF|1zHqHPj3Q6coWktg5#|-KO9BUo~vN0h8*lb?HZu7_6); zIovX{9plYPlaNrsvnPJ{ZlDpBcl8r+4b2Tzl+|xujcL6aC9BE_AgrwcO~79d6+O86 z&G>Opf*P=~Xc)NagG#5U>FMi#0|$?|=D^28O-EO|Aez}ll|?8{4dObOj97tv#X-Rp zPNRJM{k7)F!14#gXUUcblrrjtreTlei{rj_@~J?{Pssmn9q@bsEZYnzO3rNa?Z{7G z^ROe6m#v|s6#e~Gi7jXa7W%p`MI;D1lv4d>u?<5~GkN<v&BJc@l|OQea4W_ql3Zj8 z_22rFVrztzRwCr9lnVgCp$~*Tfli}>8;ut~n1+v$P*6N75~ia=gC>*K4Ni7;Nifd^ zZFZ;b&KHGHiH5+YvI!~;ae4V16&C0*%T08lBV~(UYJ2s(2H6}Ri4^147>8_t4>2qv z!g#yy;q@D3RS>8;l44?~o;`cUCQ630Ldu?9k;cQ#tq*V(9+hBcb8+!Izkmb&@^n(@ z7`U4nfLA9VA~Nc;;?~_}P*qm;L74RoaNQWweP96pmCGsM+6AIjiCDd#J<CETDE9I5 z1Fv{NUzT^an>jddkYW~kCC0*}DNmued3eMz5x=74@D~bL3XcXq6`SYsTo50Dm#h8G zlP!VAhL)qv)tnci9~`9o{QMD*KuT$9ZLJ6EO`Pprc!c@H4{|<qxy4}lL-_yCX{T?> z$;rXkyn9DYF5-R}j1~FiNFNXM1B^h(Ugoz0#=D=HivVf9_*G<qI1SIBqiae@S65eq znacOSzKyLd=eov5!{xDh@SxScAmNHVx}=V(rY5OzTkLbNfzVyMMqpjBehCa4*Q|$x 
zP->Lt_}n(wz_)>hhlkg~6?hMNXGX@yyDZBB!9;YAnI4oj)qXXIKxi~Vq2gVBlPIgH z86NCygRQ&~^sYRQT}QYdfpG->{F0PjYkMo=A<*31yfhZXnfUP|J`7+3bT8l|_6{aC zHsv2LKKjdH8fB)*eSLVHhL7)Zy%ABiCfDo>^5(YtH-2VBHgfhhlbNb$M>GbpWTq@z z%S`kUA2w&cYzQUi3~#`eE8MHIef_I|G1*qRd06jx9|;dor2=%Xzxph_)K^yKTa53p z*K^$uS{x<QKO`QTl?ivYAJ9i!89$!{9ie4(?Kow_Bq1Pa6>`(j9z-;9D)-zSZ)^dE z5%*G4OXT5Ha=Bb>RJnm!`9>`40ZClvpT6(>sOD21=OiK9q9KLC&wXolIm2|)<Cc#* z1C8pnF8sMEb|4ui4%xPxTPoe43{cMg%-?mBqb*MJ>0_idESUz6kEeC<uFRPn`y7ab z?&l{^5uCEd>ytky|Fe<A1@mPeq4Xof5ufq87he0XyeyZ@GESel+LO*#TC5w7=Bfeg zU0#M9si+w+Oq5J78@=u)+gyqT92gDRpTULICgt5#Gj%Fv`Oz1%hJwBqQRr*u@<_-o zuyay!@d@1|U8I<149-;Io2-9_6KJ3?gnP+*McKVYF&m@odnC7=;lheV^NA|^LGH}I zDhhlhAy`9A9RJ=({*q^b;mPMtLDhqPhaD5(_N;&lML?iA{LvMz(z)d+NfA)X*pj@g zBuaE=yvIcD_IN2%F5Ua?d%BI)?|`>0%T*wy`q}G9fr+|3{iy6x-MTQ$5G|T}%JNM; z=|4A{)UH(sPa=|d@hpQK)rlg1f()FR|G`3!-bJJQC+;~RbO%11BE}%Vy<f5u(fEhV za`C<rqx^DKR!(F?dvr$1^1cac%JMCs&0mW)D_!MGP=IpG4l7O1Uert|HDPZ{3|@*c zoR}}q#m<XsAfX(g6Xy9Y9bYq(c6W5`Rj#7z2Tcdov5ee1vZGi2-d+=$fHU7mrqQOn z`QZN|#(!fS=)`$E;jDIIboejpjDabMo5lw7_YKk7NnMEB5}QfQ_VmMqd><b^$L`#c zZ=7I$8@|n$Vyn%^7@D~j@WyXHY4`TH$4*O`4emTKOK@nwxi<~tWnahNpK^`41}R(@ zZg-ykFnVl~=FsZhw-IIc^L8oLcO)pproc3CG}i-f1ldjg)?}pr1LM>Y%EQ3s2l>nG zqY7r>rvK^5K*X2R@2n;V-~D`FEH>0u)@)f8x9KZW)M<#9z<X!>P=GlXZF$@aFfhmO z{VPYsTF!+F^ye`en*Q9Z(<7z{##8EkdV7vNclA5xPw*O`rhi+P;hsUA+4^~qIXTXr zfa(V3G*i<dGk(rE(~5S=;K=C^c4E{>_BNEcz`;#$;!1iOhrr#xaOXr|8I$V+ne8^; zui3KC#bTtKx%q<&yP}f-yvpRY-!xQ;(t^K7@cr#kHujf|D&k2gqYEBmDa!|FP<Bbz z`8SO#h;=$!zO3XPYSkOc5v-y<%rtzNZ0qXMeCWHRTs&qR$e>`7>TQT8{kc=`!3p9% z7S{ZP-g*0@DLtd);3ocGvkk)v?i=M}lMc}00t6>~dFi^WL=w>2w~~aca$cA0<dj+> zLlRqRT6RfdL7q6Jk#bL07b9^3@rit8cG0G1QzgZnXUs*j-KU1drv-A>1zH&M#AhBU z%ldjQ6quskY1~#zU?Q6TEKGRI30Y3IxR+w=-ap6Oq&zBoT7xN{j~tst{Er<jc1a&j z>Wqn_5j;v(ai5)6jxlhMR=Wh#b)g42p!G$YPcX-xC5lX09w^6($=e?$z7s0EnK0hT z7bw>>&eHH@W$DR5=hpY?(atT-oz0gG{=R2hL`9pVN%2;~56yRHCz+#N<+ryqBWlhy z^*fYxmf+Bu3-Y>Nsi`>9IiepYDg4b#ps&wC=LopGr5--K0-hnBN538t5&vKIV{<iu 
z^9LsX8?#z2ltF*k3o21cVYdy5`}gsI@dcW=n8zy>_C;_PBi<6&)_xece|<%-{@I=x ze(?iuI5;y6z(Aj#Rj7j=jSvFdMkahK=9x$rCWzRGG2eNK8Ze7DwAWILidNR|JAnZj z+{=dGu0VYLNLPStybuXc;+9~ToQw*9ck0*poMUhB`G;`?3#DG+dIPA0JjZ!W#QD+2 zhdBH=EWj51R?vw9dTLIgZh}1-X*dR?O3lie*>}*_*QYAVsat*->3d~NaDPT-U7pvI z(7(_w@eg<Me=Q;<t>eI0$}+s}fvq#Wpu=qG|6;Q{cV~N`69Rzxb4uZfsK>qLn{zJ` z6Ink5jgJ0cegb4XpT|yN>o_!Dd^GJy<nf5ekTF5bzD8Cr8`@(u5Ac3Hlz%!8;_gq@ zJLBJWb;A-}7NjA15zy&SAtYiDP>wzI^VEu7ofypuM=Z@LbTgsv(g&FqOA(==6@X#q z3hzibb}<Rk2?+c$e+(Tm*SWZ2=5;kS1O5E`E-hx>+tdz6nsd`LGHQf32f9*~=0Aez ze|6dyzjI*67n_)toxOFX;Oj$k$EUEXeGYu~?DUzl`+L4E`Y*jcng8)9`iiI!pqy9^ zm6JSp@Zga^qa&5X<zTfh)t88ffWEECuK+!uk7?HYAMmn*3n1&i;Ux$F+Ah?O7Y05d zb|G8nUM$(x?%$6>dTlQUO99QPUi=SgcoCt7JC)r`OrR0>%9ShXgKDR!85yNr^*o@z z0ikuXbjmJ4iBLEdIvX1m0Q$x#m_qXRVnL%o+#cZHnI0_TRa8>Sb6z$8kVq#YLK#HN z>6ef|W&8~~>5&VV^HL~$ybgs0(hvs>YuMYj0Z?OLZ*OnE<+Y}K!Ww!eR@m(lG$x+1 zF+yFl_`$X}$>~)MO+su1m%n`Zf{7bZLQxtG%VT$QvGGyKpz-#n84W+6f5xG2y`HOG z$Q$~&za@xLO-@b%uDya-lc68i<k-Z;<FR88O_rne#CiyY43*6bcCBJ+1_pH0eRXwH z6|OO8+hd*m^CwTg`ZSyR$Ey5_8IeIv0UG5;=>m*^{zgXh2_sd0YI7+XnE>TGPR!Xt zW*PShnBAgvi4WWdGnLqu4IvAn48oVr;;L>vboAqHo5M-WsRtM;bmcM;QSl;)rn?mM z5p9qQ2R*sDD>|;2a_%{r8$aU7K-Y$?gS2zD`mWmd6byn4{)lJ<&a7l;nz`$BNSf^h zEn=lCCt=9{!W9tc!WX_xx*(XGz%Alxn#?VdqsMC|^u^!ZEW^K2mvmFDmD~Ai8df>E zX^#gz$N7r{1fRDMMKQgium5mt7VHVfO7v{N+>4OT1qZ|>n{F|3Y{nQcWkKx3P_KVd zI@zI{I5<Qv()i3Pw|(}5f{mzW#XwQI1*Coxf0^Xv#SIxnzEdOJG%nf<j&HeznBM1d zi)=^%QD$f91V|<Z*}8_cNH@FO?vAseOG@Mfo+$Z{VUI6$?^6fUzjbDs|KKYbToLsd zpf`^hF%NA|1}Kvr&^4~+YYe5V%DNPBA1sfqc5YQV9bO6MB_HMtZgFnFi)^qaCE^U0 z|J=b8Jba<&7AY?}02Q$>D}-|1yk(Xb!rbhVrdV{m5aitbp%gCvFUIFTUIP}l)_)<6 z38{7!yCb;G%u6a+x<{RuBla%m{iFavN;tp{(cKTqq{DeKus!~6qTK$<yoGbOkfh8= z)@vrO%L*$JcAge37k~8G6k9LvJ+)gs8U0Ec7~-T1w00T@oql2)DcwWnLct`_+q;h> zI_~+Z?J>ef2JHAgss1ukf%lGIW8T#GZnG^W5&nsi^7qaq!`&c`+oAi@*f0(C2f1^? 
z?5LYdOApQ5Yf4Y?7<Qq9Vmy8eEqQt207itW!jL(dGKX84ftEH6zU1iI$I^6u&@P+t z)nltj-7$@Gg1*P(DiG@_B#u{P!i3c)J`4yH1#JcIjD`pc+VXYD__d#jxgw9J3xdC1 zOkVE02w;}6zCY(ILC2u9tL2k?m6AjtGI8O8eT~5kwA#n>5mnmd9sXCnKeA8X`gY{* z-YxC8^%tg`R5w=K1fx6N0h?b~eTH)L>wlK~5wV}H@RvKJGTQ^Y`BZQiFr1XH90$=k zx33T=aX$|bU_6571k5L+JT<27=*P-#`cI0GfcfY*IBo;MBw^r#;IEy;An-b@9e%f3 z7{|i=_z)XAd0AR&J)(i`^H;$~vnpoMIcqUMLuZf48RdRDGN3!!ZX-jM*yi)MX!sxD zR5&4}^w;Tov?e65bX*w{-G}o-M9zXulk51g%St=w2q-FJ^Zus~RZThpbmy>rd(!f` z?>~0Upj89JJ*cm}MLc#!T6u>pS4|wJsSO4{JKVl^PZAuVxRgQx(58fJ_aMP7Hg&;5 zw_|=(k??zRaunJq5Ru$wX+%tDbKtS_TwdcxXfx*lS0N@%;TGCYL+IB>4vY8*vVTi4 zoJnEO{D!%yx%m~P#ZdzUPVM>1f<8fDdjYN4OvV#XUU&f=IcLtE<)7z5q%-B&pC({> zyL_wPF$XRRW`Tk&XpTfw!Iol|)!(B?=N_CP07ga%o{I{{TouAonY-P+i{-dDG$@Aj zPEWi|#bBjyDe@KG>;nA6-G3M&xKMSlzpJ38);rt2@DYPmMn(qu&YvKiNb`UVfWtB` zt_114&$4(=TS4Iz3jcf&k7(_~>t^=F@P*ARhi;;#p<kV(7lrRN>g|evi51=Zecu7x z<>{NyZyvM@hY%zc7J_;~kS)-Q+urtFhil@3ICnof>X-~uD3~u2%_`m=8l~}iym#h# zKza)kE9)&An~zFETF}D{Ej1z>zeYE)Ffni17q0Ju*ARNvx_)2Fu1IL!ur@1`^*>Mh z*MIp&Sy8dm(*6wUKb=jmox-6M=&)PL?fOnPgKn6Zl%#O~{z&0^bxvS!n`gOW$mn{3 z9&{4BZY~5<3cK>nb3u#UOW31-Y4&qXWcPHU-lb?@zMJ1vLV-DxePMBO()Z=dm#PUS zaOGoR7(`)jV9NUQg&7$!oCY1#3icmCLlOqYD(7o7z!}c_b9ELbqPl{OOFMn-wY8r@ zPwzl`b7N!UhXmW~;vxA{)Ib3xxUW3#DW2~yF$5Rp(#D3>x`~r%pb79qL!P^jva~;4 zKst9W$_qU`wr1wi$pidrN=aWxgXgWw(kNitZ3o9@ntXy3d<f}F3TIUtgC4tTC3)eV z)b{!BCbUR5E_7#?Z;b^3?^B0RtVS)Dpy|<m!VD(q6JgJO;m42PGmIs`zKm$SjPcH= zt;?6ufm~kxWsy6oR=J}RBT`aJl@%-MB?A6{;oswMm<)8GS>`}TL~g>q^ZZ<+#nfe8 z(cMqAMXv|thdnk2jr)P&TY|O4w-!aAX$cKe0orQ*n@9Pnw+_jj|La(uND7I<BJ0nB zzdk)?M0}@BO}C+I8cx9Z%EIx#>F#7hx#HJf!FvnWcmenYP8N6o%Z$%)?ro1qxo2@U zI7y(fGLKO5F8Etb5<HBbul_31RaLzN)d)Z^_I~Gm&4{R|b8&Je>(@zxBlvP-Hni>V zJAQnbS;)x9zyicppKruoBB7W6@LU9kP9Su*J^;rJ#01hK%fiaqzbz}60DcI7K`3bN zm6Vgi1z;5R>eVyo@NYqGLQ+z46i!~Lhc%byb?EtmLhuqqOXMV#rEb}eKJelo=k9<r zsLot4H{sELIxDx99>~1tZ9V6MO$rAy0PKtJDF)Y#9Mapo_>OCK4;292VVa&+mX?H4 zNeYPTC*skK8#ieA_|%aDf+%eK;<WDEd5)q63ym*+x#z|-@ZHGq94_;k(08o}*fS_L 
zmOKm&wxIOdv!yxXwfPVxv<|3FtryFm`Tbj3C4P4C#znWZwA@lq2<vg+KJwO#IarD* z{n<|aEnM!I-3y5W8t73_hJl56qR1lUUtagsov#$4p5)MMiX0*XsL>);$mIemH>SYe zxxTqLKh0X;Mz4Lg=I0QI--Q6B*+L(DFq70~mkc3pe6*2ui(qhqAY3Ut8ggh32nxta z4_XJ+zl#dy|JvP=PT-(rr}kpRecT#-199&J1_dpzv~U`(O?N<p0%mT^>(T-0#5-EM zsY)%*6g_a*6M;MQb#FnJeoVddz9rdJeH1Q*fDdF3187HmE9}MtBgby5TNXh?NH_xB z=8ez*moiB0vL79D^}(~r!(r%5gr}B3ELb2}ydb@S1&{Va+ls$7{>x0acGtz0@({aF zksT)FI3?t)0?3XL^W~oAa0CNfq85Sn;o-i}zBk6{wT+z{R|6ZxeMQA*;P-~36cCvP zIs~2Pi+V>UCyhQy!6)Gah0;+!GMl|E+k+wg=Pl=Ha~5Jei(n$bHU8PVt$7`3zcA(2 z)M*5Z)5P!JD*|K9C9kC^UCxm4SXMy;@pGtbGWl}#`p?nm^TfE!6T*=C!yu2t#7FeP zE!i$sY>vS%aB2x&t{!Vtt1-E-E0WCjDjrARx+Qm=qH<q)3c2diV)<$aVnLBGH_uqw z+$2tui<_NR`lAf%q8_FX`8kWq0|m(z_DS2FH8UC?eB^8vI2Ys++zq}z9@VAF?N)2R z^FAQLP`Ll5ScI>;1bq5fX6}~@;691y1rdvV7gA`k&8T=DcwNBl#R%s=2qK1?b3m0N zv^UL}TpcZ)xH$MbfgL>L$y>9Txpm-ds)u&QdJx6IAOCK0sOnFua3q|iLQ@B|HVrt? z!EQv%J81u!em%sq*su}z)Vn?l=;Rd-U}cskAt9+T<;L=^rbZ4A2!IiCn!Dq4fnA3e zWzdtM65#<cQwE+Jb}vylg^L+cx=3zfR$KI{_Tth)UO}oT7&IE>Fd{fn<*k522E<_i zeH+|F@o6yI(3X`B91%F({C^?T|M!VykUx)(I&(lyJOY$5Kix%5%lp62H`D0`VU7T7 z0ASy@U;BWsKn$%s3dC#G7*r#OnGc$guR`q@B0@nsn+U*k)*ymYF9a~fgVo=yrI2I< zcI*Svu~%oOFH+OaXX#Z^*4Nh~>jVyN2srxG7^v3WzBpt&IDjeYGE_3DeWs_oXc9R> zyfG6wlC6ioz)4*g$mwHnC{kgPynn!FKB!6@u0caRSO^7Q#>V1+egm?f(;73u$^CKb zSE1aYVFHdu^Gi!(JlLBOJ$^O$qBcUP&<g$)b&qN?<P5jJBZ7E>j}Z3%-^1Vj`N0EG Z^BKP$Qd)B&IPwm4M^a8AM_m8e{{w}q$kYG; literal 0 HcmV?d00001 diff --git a/projects/missingdata/best-answer-vs-most-probable.org b/projects/missingdata/best-answer-vs-most-probable.org new file mode 100644 index 0000000..74fb263 --- /dev/null +++ b/projects/missingdata/best-answer-vs-most-probable.org @@ -0,0 +1,438 @@ +#+TITLE: Difference between most probable and most correct answer +#+PROPERTY: header-args :session main :exports both :results output :tangle yes + +The following python notebook shows that the following example of distribution with n rows is a case where the most probable and most correct answers are different (see the two last blocks) 
+ +#+BEGIN_src jupyter-python :results none + dist = { + 1 : 0.75, + 2 : 0.125, + 3 : 0.125 + } + n = 8 +#+END_src + + +#+BEGIN_src jupyter-python + import pprint + # generate the possible worlds + def possible_worlds(dist, n): + worlds = [[]] + for i in range(n): + new_worlds = [] + for w in worlds: + for v in dist: + if len(w) <= i: + w.append(v) + else: + w[i] = v + new_worlds.append(w.copy()) + worlds = new_worlds + return worlds + + print(len(possible_worlds(dist, n))) +#+END_src + +#+RESULTS: +: 6561 + +#+BEGIN_src jupyter-python :results none + # compute a key for each world based on the values it contains + def world_key(w): + w.sort() + count = 0 + current = w[0] + key = "" + for i in range(len(w)): + count+=1 + if len(w)-1 == i or w[i+1] != current: + key += "{}x{}".format(count, current) + if len(w)-1 != i and w[i+1] != current: + key += " " + current = w[i+1] + count = 0 + return key + + # compute the probability of a world + def world_prob(w, dist): + prob = 1 + for v in w: + prob*=dist[v] + return prob +#+END_src + + +#+BEGIN_src jupyter-python + # computes classes of worlds based on a function computing a key for each world + # worlds are in the same class iff they have the same key + def world_classes(dist, n, class_key=world_key): + worlds = possible_worlds(dist, n) + classes = {} + for w in worlds: + key = class_key(w) + if key in classes: + classes[key]["count"]+=1 + classes[key]["class_prob"]+=world_prob(w, dist) + classes[key]["possible_values"].add(world_key(w)) + else: + wp = world_prob(w, dist) + classes[key] = { + "world_ex": w, + "possible_values": {world_key(w)}, + "class_prob": wp, + "count": 1 + } + return classes + + # the classes of possible worlds based on the function world_key + pprint.pprint(world_classes(dist, n)) +#+END_src + +#+RESULTS: +#+begin_example + {'1x1 1x2 6x3': {'class_prob': 2.002716064453125e-05, + 'count': 56, + 'possible_values': {'1x1 1x2 6x3'}, + 'world_ex': [1, 2, 3, 3, 3, 3, 3, 3]}, + '1x1 2x2 5x3': 
{'class_prob': 6.008148193359375e-05, + 'count': 168, + 'possible_values': {'1x1 2x2 5x3'}, + 'world_ex': [1, 2, 2, 3, 3, 3, 3, 3]}, + '1x1 3x2 4x3': {'class_prob': 0.00010013580322265625, + 'count': 280, + 'possible_values': {'1x1 3x2 4x3'}, + 'world_ex': [1, 2, 2, 2, 3, 3, 3, 3]}, + '1x1 4x2 3x3': {'class_prob': 0.00010013580322265625, + 'count': 280, + 'possible_values': {'1x1 4x2 3x3'}, + 'world_ex': [1, 2, 2, 2, 2, 3, 3, 3]}, + '1x1 5x2 2x3': {'class_prob': 6.008148193359375e-05, + 'count': 168, + 'possible_values': {'1x1 5x2 2x3'}, + 'world_ex': [1, 2, 2, 2, 2, 2, 3, 3]}, + '1x1 6x2 1x3': {'class_prob': 2.002716064453125e-05, + 'count': 56, + 'possible_values': {'1x1 6x2 1x3'}, + 'world_ex': [1, 2, 2, 2, 2, 2, 2, 3]}, + '1x1 7x2': {'class_prob': 2.86102294921875e-06, + 'count': 8, + 'possible_values': {'1x1 7x2'}, + 'world_ex': [1, 2, 2, 2, 2, 2, 2, 2]}, + '1x1 7x3': {'class_prob': 2.86102294921875e-06, + 'count': 8, + 'possible_values': {'1x1 7x3'}, + 'world_ex': [1, 3, 3, 3, 3, 3, 3, 3]}, + '1x2 7x3': {'class_prob': 4.76837158203125e-07, + 'count': 8, + 'possible_values': {'1x2 7x3'}, + 'world_ex': [2, 3, 3, 3, 3, 3, 3, 3]}, + '2x1 1x2 5x3': {'class_prob': 0.0003604888916015625, + 'count': 168, + 'possible_values': {'2x1 1x2 5x3'}, + 'world_ex': [1, 1, 2, 3, 3, 3, 3, 3]}, + '2x1 2x2 4x3': {'class_prob': 0.0009012222290039062, + 'count': 420, + 'possible_values': {'2x1 2x2 4x3'}, + 'world_ex': [1, 1, 2, 2, 3, 3, 3, 3]}, + '2x1 3x2 3x3': {'class_prob': 0.001201629638671875, + 'count': 560, + 'possible_values': {'2x1 3x2 3x3'}, + 'world_ex': [1, 1, 2, 2, 2, 3, 3, 3]}, + '2x1 4x2 2x3': {'class_prob': 0.0009012222290039062, + 'count': 420, + 'possible_values': {'2x1 4x2 2x3'}, + 'world_ex': [1, 1, 2, 2, 2, 2, 3, 3]}, + '2x1 5x2 1x3': {'class_prob': 0.0003604888916015625, + 'count': 168, + 'possible_values': {'2x1 5x2 1x3'}, + 'world_ex': [1, 1, 2, 2, 2, 2, 2, 3]}, + '2x1 6x2': {'class_prob': 6.008148193359375e-05, + 'count': 28, + 'possible_values': {'2x1 6x2'}, 
+ 'world_ex': [1, 1, 2, 2, 2, 2, 2, 2]}, + '2x1 6x3': {'class_prob': 6.008148193359375e-05, + 'count': 28, + 'possible_values': {'2x1 6x3'}, + 'world_ex': [1, 1, 3, 3, 3, 3, 3, 3]}, + '2x2 6x3': {'class_prob': 1.6689300537109375e-06, + 'count': 28, + 'possible_values': {'2x2 6x3'}, + 'world_ex': [2, 2, 3, 3, 3, 3, 3, 3]}, + '3x1 1x2 4x3': {'class_prob': 0.003604888916015625, + 'count': 280, + 'possible_values': {'3x1 1x2 4x3'}, + 'world_ex': [1, 1, 1, 2, 3, 3, 3, 3]}, + '3x1 2x2 3x3': {'class_prob': 0.00720977783203125, + 'count': 560, + 'possible_values': {'3x1 2x2 3x3'}, + 'world_ex': [1, 1, 1, 2, 2, 3, 3, 3]}, + '3x1 3x2 2x3': {'class_prob': 0.00720977783203125, + 'count': 560, + 'possible_values': {'3x1 3x2 2x3'}, + 'world_ex': [1, 1, 1, 2, 2, 2, 3, 3]}, + '3x1 4x2 1x3': {'class_prob': 0.003604888916015625, + 'count': 280, + 'possible_values': {'3x1 4x2 1x3'}, + 'world_ex': [1, 1, 1, 2, 2, 2, 2, 3]}, + '3x1 5x2': {'class_prob': 0.000720977783203125, + 'count': 56, + 'possible_values': {'3x1 5x2'}, + 'world_ex': [1, 1, 1, 2, 2, 2, 2, 2]}, + '3x1 5x3': {'class_prob': 0.000720977783203125, + 'count': 56, + 'possible_values': {'3x1 5x3'}, + 'world_ex': [1, 1, 1, 3, 3, 3, 3, 3]}, + '3x2 5x3': {'class_prob': 3.337860107421875e-06, + 'count': 56, + 'possible_values': {'3x2 5x3'}, + 'world_ex': [2, 2, 2, 3, 3, 3, 3, 3]}, + '4x1 1x2 3x3': {'class_prob': 0.02162933349609375, + 'count': 280, + 'possible_values': {'4x1 1x2 3x3'}, + 'world_ex': [1, 1, 1, 1, 2, 3, 3, 3]}, + '4x1 2x2 2x3': {'class_prob': 0.032444000244140625, + 'count': 420, + 'possible_values': {'4x1 2x2 2x3'}, + 'world_ex': [1, 1, 1, 1, 2, 2, 3, 3]}, + '4x1 3x2 1x3': {'class_prob': 0.02162933349609375, + 'count': 280, + 'possible_values': {'4x1 3x2 1x3'}, + 'world_ex': [1, 1, 1, 1, 2, 2, 2, 3]}, + '4x1 4x2': {'class_prob': 0.0054073333740234375, + 'count': 70, + 'possible_values': {'4x1 4x2'}, + 'world_ex': [1, 1, 1, 1, 2, 2, 2, 2]}, + '4x1 4x3': {'class_prob': 0.0054073333740234375, + 'count': 70, + 
'possible_values': {'4x1 4x3'}, + 'world_ex': [1, 1, 1, 1, 3, 3, 3, 3]}, + '4x2 4x3': {'class_prob': 4.172325134277344e-06, + 'count': 70, + 'possible_values': {'4x2 4x3'}, + 'world_ex': [2, 2, 2, 2, 3, 3, 3, 3]}, + '5x1 1x2 2x3': {'class_prob': 0.0778656005859375, + 'count': 168, + 'possible_values': {'5x1 1x2 2x3'}, + 'world_ex': [1, 1, 1, 1, 1, 2, 3, 3]}, + '5x1 2x2 1x3': {'class_prob': 0.0778656005859375, + 'count': 168, + 'possible_values': {'5x1 2x2 1x3'}, + 'world_ex': [1, 1, 1, 1, 1, 2, 2, 3]}, + '5x1 3x2': {'class_prob': 0.0259552001953125, + 'count': 56, + 'possible_values': {'5x1 3x2'}, + 'world_ex': [1, 1, 1, 1, 1, 2, 2, 2]}, + '5x1 3x3': {'class_prob': 0.0259552001953125, + 'count': 56, + 'possible_values': {'5x1 3x3'}, + 'world_ex': [1, 1, 1, 1, 1, 3, 3, 3]}, + '5x2 3x3': {'class_prob': 3.337860107421875e-06, + 'count': 56, + 'possible_values': {'5x2 3x3'}, + 'world_ex': [2, 2, 2, 2, 2, 3, 3, 3]}, + '6x1 1x2 1x3': {'class_prob': 0.155731201171875, + 'count': 56, + 'possible_values': {'6x1 1x2 1x3'}, + 'world_ex': [1, 1, 1, 1, 1, 1, 2, 3]}, + '6x1 2x2': {'class_prob': 0.0778656005859375, + 'count': 28, + 'possible_values': {'6x1 2x2'}, + 'world_ex': [1, 1, 1, 1, 1, 1, 2, 2]}, + '6x1 2x3': {'class_prob': 0.0778656005859375, + 'count': 28, + 'possible_values': {'6x1 2x3'}, + 'world_ex': [1, 1, 1, 1, 1, 1, 3, 3]}, + '6x2 2x3': {'class_prob': 1.6689300537109375e-06, + 'count': 28, + 'possible_values': {'6x2 2x3'}, + 'world_ex': [2, 2, 2, 2, 2, 2, 3, 3]}, + '7x1 1x2': {'class_prob': 0.13348388671875, + 'count': 8, + 'possible_values': {'7x1 1x2'}, + 'world_ex': [1, 1, 1, 1, 1, 1, 1, 2]}, + '7x1 1x3': {'class_prob': 0.13348388671875, + 'count': 8, + 'possible_values': {'7x1 1x3'}, + 'world_ex': [1, 1, 1, 1, 1, 1, 1, 3]}, + '7x2 1x3': {'class_prob': 4.76837158203125e-07, + 'count': 8, + 'possible_values': {'7x2 1x3'}, + 'world_ex': [2, 2, 2, 2, 2, 2, 2, 3]}, + '8x1': {'class_prob': 0.1001129150390625, + 'count': 1, + 'possible_values': {'8x1'}, + 'world_ex': 
[1, 1, 1, 1, 1, 1, 1, 1]}, + '8x2': {'class_prob': 5.960464477539063e-08, + 'count': 1, + 'possible_values': {'8x2'}, + 'world_ex': [2, 2, 2, 2, 2, 2, 2, 2]}, + '8x3': {'class_prob': 5.960464477539063e-08, + 'count': 1, + 'possible_values': {'8x3'}, + 'world_ex': [3, 3, 3, 3, 3, 3, 3, 3]}} +#+end_example + +#+BEGIN_src jupyter-python + # the classes of possible worlds where a class contains the world having the same sum of values + pprint.pprint(world_classes(dist, n, sum)) +#+END_src + +#+RESULTS: +#+begin_example + {8: {'class_prob': 0.1001129150390625, + 'count': 1, + 'possible_values': {'8x1'}, + 'world_ex': [1, 1, 1, 1, 1, 1, 1, 1]}, + 9: {'class_prob': 0.13348388671875, + 'count': 8, + 'possible_values': {'7x1 1x2'}, + 'world_ex': [1, 1, 1, 1, 1, 1, 1, 2]}, + 10: {'class_prob': 0.2113494873046875, + 'count': 36, + 'possible_values': {'7x1 1x3', '6x1 2x2'}, + 'world_ex': [1, 1, 1, 1, 1, 1, 1, 3]}, + 11: {'class_prob': 0.1816864013671875, + 'count': 112, + 'possible_values': {'5x1 3x2', '6x1 1x2 1x3'}, + 'world_ex': [1, 1, 1, 1, 1, 1, 2, 3]}, + 12: {'class_prob': 0.16113853454589844, + 'count': 266, + 'possible_values': {'4x1 4x2', '5x1 2x2 1x3', '6x1 2x3'}, + 'world_ex': [1, 1, 1, 1, 1, 1, 3, 3]}, + 13: {'class_prob': 0.10021591186523438, + 'count': 504, + 'possible_values': {'5x1 1x2 2x3', '4x1 3x2 1x3', '3x1 5x2'}, + 'world_ex': [1, 1, 1, 1, 1, 2, 3, 3]}, + 14: {'class_prob': 0.062064170837402344, + 'count': 784, + 'possible_values': {'4x1 2x2 2x3', '2x1 6x2', '3x1 4x2 1x3', '5x1 3x3'}, + 'world_ex': [1, 1, 1, 1, 1, 3, 3, 3]}, + 15: {'class_prob': 0.02920246124267578, + 'count': 1016, + 'possible_values': {'1x1 7x2', + '2x1 5x2 1x3', + '3x1 3x2 2x3', + '4x1 1x2 3x3'}, + 'world_ex': [1, 1, 1, 1, 2, 3, 3, 3]}, + 16: {'class_prob': 0.0135384202003479, + 'count': 1107, + 'possible_values': {'1x1 6x2 1x3', + '2x1 4x2 2x3', + '3x1 2x2 3x3', + '4x1 4x3', + '8x2'}, + 'world_ex': [1, 1, 1, 1, 3, 3, 3, 3]}, + 17: {'class_prob': 0.004867076873779297, + 'count': 1016, + 
'possible_values': {'1x1 5x2 2x3', + '2x1 3x2 3x3', + '3x1 1x2 4x3', + '7x2 1x3'}, + 'world_ex': [1, 1, 1, 2, 3, 3, 3, 3]}, + 18: {'class_prob': 0.0017240047454833984, + 'count': 784, + 'possible_values': {'3x1 5x3', '6x2 2x3', '1x1 4x2 3x3', '2x1 2x2 4x3'}, + 'world_ex': [1, 1, 1, 3, 3, 3, 3, 3]}, + 19: {'class_prob': 0.0004639625549316406, + 'count': 504, + 'possible_values': {'5x2 3x3', '2x1 1x2 5x3', '1x1 3x2 4x3'}, + 'world_ex': [1, 1, 2, 3, 3, 3, 3, 3]}, + 20: {'class_prob': 0.00012433528900146484, + 'count': 266, + 'possible_values': {'2x1 6x3', '1x1 2x2 5x3', '4x2 4x3'}, + 'world_ex': [1, 1, 3, 3, 3, 3, 3, 3]}, + 21: {'class_prob': 2.3365020751953125e-05, + 'count': 112, + 'possible_values': {'3x2 5x3', '1x1 1x2 6x3'}, + 'world_ex': [1, 2, 3, 3, 3, 3, 3, 3]}, + 22: {'class_prob': 4.5299530029296875e-06, + 'count': 36, + 'possible_values': {'2x2 6x3', '1x1 7x3'}, + 'world_ex': [1, 3, 3, 3, 3, 3, 3, 3]}, + 23: {'class_prob': 4.76837158203125e-07, + 'count': 8, + 'possible_values': {'1x2 7x3'}, + 'world_ex': [2, 3, 3, 3, 3, 3, 3, 3]}, + 24: {'class_prob': 5.960464477539063e-08, + 'count': 1, + 'possible_values': {'8x3'}, + 'world_ex': [3, 3, 3, 3, 3, 3, 3, 3]}} +#+end_example + +#+BEGIN_src jupyter-python :results output + # returns the class with the highest probability + def most_probable_classes(classes): + keys = [] + prob = 0 + for k in classes: + if prob == classes[k]["class_prob"]: + keys.append(k) + if prob < classes[k]["class_prob"]: + keys = [k] + prob = classes[k]["class_prob"] + return { "keys": keys, "prob": prob } + + # compute the most probable answers for a given aggregate function + def most_probable_ans(dist, n, agg): + classes = world_classes(dist, n, agg) + answers = [] + mc = most_probable_classes(classes) + for k in mc["keys"]: + answers.append({ "ans": k, "prob": mc["prob"], "possible_values": classes[k]["possible_values"] }) + return answers + + print(most_probable_ans(dist, n, sum)) +#+END_src + +#+RESULTS: +: [{'ans': 10, 'prob': 
0.2113494873046875, 'possible_values': {'7x1 1x3', '6x1 2x2'}}] + +#+BEGIN_src jupyter-python + # compute the most correct answer for a given aggregate function: the aggregate of an example world of each most probable class + def most_correct_ans(dist, n, agg): + classes = world_classes(dist, n) + answers = [] + mc = most_probable_classes(classes) + for k in mc["keys"]: + world_of_most_probable_class = classes[k]["world_ex"] + answers.append({ "ans": agg(world_of_most_probable_class), "prob": mc["prob"], "possible_values": k }) + return answers + + + print(most_correct_ans(dist, n, sum)) +#+END_src + +#+RESULTS: +: [{'ans': 11, 'prob': 0.155731201171875, 'possible_values': '6x1 1x2 1x3'}] + + +The same comparison with another distribution. It is almost a good example, but there are two most probable answers, one of which coincides with the most correct answer: +#+BEGIN_src jupyter-python + dist = { + 0 : 0.5, + 1 : 0.25, + 2 : 0.25 + } + n = 4 + + print(most_correct_ans(dist, n, sum)) + print(most_probable_ans(dist, n, sum)) +#+END_src + +#+RESULTS: +: [{'ans': 3, 'prob': 0.1875, 'possible_values': '2x0 1x1 1x2'}] +: [{'ans': 2, 'prob': 0.21875, 'possible_values': {'2x0 2x1', '3x0 1x2'}}, {'ans': 3, 'prob': 0.21875, 'possible_values': {'1x0 3x1', '2x0 1x1 1x2'}}] + +#+BEGIN_src jupyter-python + import statistics + dist = { + 1 : 0.75, + 2 : 0.125, + 3 : 0.125 + } + n = 8 + print(most_correct_ans(dist, n, statistics.median)) +#+END_src + +#+RESULTS: +: [{'ans': 1.0, 'prob': 0.8861846923828125, 'possible_values': {'6x1 1x2 1x3', '5x1 3x3', '8x1', '5x1 3x2', '5x1 1x2 2x3', '7x1 1x2', '5x1 2x2 1x3', '6x1 2x2', '7x1 1x3', '6x1 2x3'}}] diff --git a/projects/missingdata/index.org b/projects/missingdata/index.org new file mode 100644 index 0000000..256b08d --- /dev/null +++ b/projects/missingdata/index.org @@ -0,0 +1,19 @@ +#+TITLE: Databases with missing data + +The aim of this project is to study query answering over databases with missing data, where the missingness is described by a graph of missingness.
+ + * Query answering over block dependent probabilistic databases + +The different notions of query answering for a numerical query q (including Boolean queries: 0 or 1) over a BIPDB D: + +- the *expected value*, defined by $E(q(D))$ +- a *most probable answer* is a possible answer having the highest probability +- a *best answer* is an answer on a most probable distribution of the tuples. In this case, the possible worlds that have the same distribution of tuples are considered equivalent: we say that they form a *class*. + +The answer of a CQ over a BIPDB should be another BIPDB. + +** Open questions + +- The [[file:best-answer-vs-most-probable.org][comparison of the best answer and the most probable answer]] shows that the two notions are different on a small example. +- Are the best answer and the expected value the same notion when the number of rows in D is such that, for every probability p, $|D| \times p$ is an integer? This leads to another question: in this case, is the class of possible worlds whose tuples comply with the distribution the most probable class? I started to work on those questions [[file:most-probable-class.org][here]]. + diff --git a/projects/missingdata/most-probable-class.org b/projects/missingdata/most-probable-class.org new file mode 100644 index 0000000..bf91d94 --- /dev/null +++ b/projects/missingdata/most-probable-class.org @@ -0,0 +1,94 @@ +#+TITLE: Which is the most probable class? +#+PROPERTY: header-args :session most-prob-class :exports both :results output :tangle yes +#+OPTIONS: toc:nil + +* Theoretical result + +We consider a random variable $X$ with a finite range $\{v_{1}, \dots, v_{m}\}$ whose probabilities are rational: there exists a minimal integer $Z$ such that $P(X=v_{i}) = \frac{u_{i}}{Z}$ for integers $u_{i}$. So, we have $\sum_{1\leq i \leq m} u_{i} = Z$.
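The integer weights $u_{i}$ and the minimal $Z$ can be read off a concrete distribution. The snippet below is only a sketch and not part of the original notebook: =integer_weights= is a hypothetical helper name, and it assumes Python 3.9 or later (for =math.lcm=) and probabilities that are exactly representable, such as the dyadic values 0.75 and 0.125 used in the comparison notebook.

```python
from fractions import Fraction
from math import lcm  # assumption: Python >= 3.9


def integer_weights(dist):
    """Return (u, Z) such that P(X = v) = u[v] / Z with Z minimal.

    Hypothetical helper for illustration. Floats are converted exactly,
    so a value such as 0.1 would lead to a very large Z; pass Fraction
    objects or strings like "1/10" in that case.
    """
    probs = {v: Fraction(p) for v, p in dist.items()}
    if sum(probs.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    # the least common multiple of the reduced denominators is the minimal Z
    Z = lcm(*(p.denominator for p in probs.values()))
    u = {v: int(p * Z) for v, p in probs.items()}
    return u, Z


u, Z = integer_weights({1: 0.75, 2: 0.125, 3: 0.125})
print(u, Z)  # {1: 6, 2: 1, 3: 1} 8
```

With this distribution, $Z = 8$ and the weights are $(6, 1, 1)$, which indeed sum to $Z$.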
+ + We perform $n$ independent draws of $X$. The probability of obtaining $k_{i}$ times the value $v_{i}$, with $\sum_{1\leq i \leq m} k_{i} = n$, is: +$$\binom{n}{k_{1}} (\frac{u_{1}}{Z})^{k_{1}} \times \binom{n - k_{1}}{k_{2}} (\frac{u_{2}}{Z})^{k_{2}} \dots \times \binom{n - k_{1} \dots - k_{m-1}}{k_{m}} (\frac{u_{m}}{Z})^{k_{m}}$$ + +We can simplify the formula to obtain: +$$\frac{n!}{Z^{n}} \prod_{1\leq i \leq m} \frac{u_{i}^{k_{i}}}{k_{i}!}$$ + +Ignoring the constraint on the sum for a moment, finding the values $k_{i}$ that maximize the above formula amounts to finding, for each $i$, the $k_{i}$ maximizing $\frac{u_{i}^{k_{i}}}{k_{i}!}$. According to the following section, the maximum is reached when $k_{i} = u_{i}$. However, the additional constraint $\sum_{1\leq i \leq m} k_{i} = n$ means that the choice $k_{i} = u_{i}$ is possible only when $n = Z$; when $n = cZ$ is a multiple of $Z$, the same argument applies with the weights rescaled to $c u_{i}$ and $cZ$ (giving up the minimality of $Z$), and the choice becomes $k_{i} = c u_{i}$. + + +* Analysis of u^k/k! + +In the following, we observe that the maximum of $\frac{u^{k}}{k!}$ for fixed $u$ seems to be reached when $k=u$. + +#+BEGIN_src jupyter-python + import matplotlib.pyplot as plt + import numpy as np + n = 50 + + k = np.arange(0, n) + k[0] = 1 # avoid a division by zero (0! = 1) + u = np.repeat(np.arange(0, n), n).reshape((n, n)) + uoverk = np.divide(u, k) + uoverk[:,0] = 1 # u^0 / 0! = 1 + res = np.cumprod(uoverk, axis=1) + normalized_res = res/res.max(axis=1)[:,None] +#+END_src + +#+RESULTS: + + +#+BEGIN_src jupyter-python + fig, axis = plt.subplots() # using subplots seems to be a good habit + heatmap = axis.pcolor(normalized_res, cmap=plt.cm.Blues) # heatmap holds the values + plt.colorbar(heatmap) + plt.xlabel("k") + plt.ylabel("u", rotation=0) + plt.title("u^k/k!
normalized for fixed u") + plt.show() +#+END_src + +#+RESULTS: +[[file:./.ob-jupyter/e16c9d053b3952bf48300ded8b9cfea3a1e6e881.png]] + + +To justify this observation, it suffices to write $\frac{u^{k}}{k!}$ as a product in which each factor $\frac{u}{j}$ is at least 1 for $j \leq u$ and smaller than 1 for $j > u$, so the product no longer increases after $k = u$: + +$$\frac{u^k}{k!} = \frac{u}{1} \times \frac{u}{2} \dots \times \frac{u}{k}$$ + +* 6k rolls of a die +:PROPERTIES: +:CUSTOM_ID: dice +:END: + +For a given $k$, we choose $n=6k$, $Z=6k$, $m=6$ and $u_i=k$ (the weights of a fair die rescaled by $k$, so $Z$ is not minimal here). The $k_i$ then have to be equal to $k$ to maximize the probability, and the formula becomes: + +$$\frac{(6k)!}{(6k)^{6k}} \prod_{1\leq i \leq 6} \frac{k^k}{k!} = \frac{(6k)!}{6^{6k} (k!)^6}$$ + +#+BEGIN_src jupyter-python + def most_prob_class(k): # (6k)! / (6^(6k) (k!)^6), computed as a running product of i / (6 d) + prob = 1 + for i in range(1, 6*k +1): + d = i % k if (i % k) != 0 else k # d cycles through 1..k six times + prob = prob * (i/(6*d)) + return prob + + print([most_prob_class(1), most_prob_class(10), most_prob_class(100), most_prob_class(300)]) +#+END_src + +#+RESULTS: +: [0.015432098765432098, 7.456270054665195e-05, 2.4632858255234786e-07, 1.5853278892898133e-08] + +* TODO Comparison of expected value and best answer + +In general, a PDB is a triple $(\mathcal D, \mathcal W, P)$ where $\mathcal D$ is the possibly infinite set of possible tuples, $\mathcal W$ is a $\sigma$-algebra on $\mathcal D$ that represents the set of possible database instances (so every member of $\mathcal W$ is a finite set), and $P$ is a probability over $\mathcal W$. + +How should the union or intersection of two instances in $\mathcal W$ be defined under bag semantics? + +How can an independent block PDB be defined as a PDB from the probabilities of the values in each block? It should be easy. Is the order of the tuples taken into account? + +Finally, how do the previous results relate to the probabilities of a BIDPDB?
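As a starting point for that last question, the closed form of the theoretical result can be cross-checked by brute force on a small case. The snippet below is only a sketch and not part of the original notebook: =class_prob= and =most_probable_counts= are hypothetical helper names, and the weights $(6, 1, 1)$ with $Z = 8$ are an assumed encoding of the distribution 0.75/0.125/0.125 used with $n = 8$ in the comparison notebook.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def class_prob(u, k):
    """Exact value of n!/Z^n * prod(u_i^k_i / k_i!) for counts k and integer weights u."""
    n, Z = sum(k), sum(u)
    prob = Fraction(factorial(n), Z ** n)
    for u_i, k_i in zip(u, k):
        prob *= Fraction(u_i ** k_i, factorial(k_i))
    return prob


def most_probable_counts(u, n):
    """Brute force: enumerate every count vector summing to n and keep the most probable ones."""
    best, best_prob = [], Fraction(0)
    for k in product(range(n + 1), repeat=len(u)):
        if sum(k) != n:
            continue
        p = class_prob(u, k)
        if p > best_prob:
            best, best_prob = [k], p
        elif p == best_prob:
            best.append(k)
    return best, best_prob


# assumed weights for the distribution {1: 0.75, 2: 0.125, 3: 0.125} with n = Z = 8
best, prob = most_probable_counts((6, 1, 1), 8)
print(best, float(prob))

# fair die rolled 6 times, i.e. k = 1 in the dice section
print(class_prob((1,) * 6, (1,) * 6))
```

On this case the brute force returns the single count vector $(6, 1, 1)$, that is $k_{i} = u_{i}$ since $n = Z$, with probability 0.155731201171875, the value recorded for the class =6x1 1x2 1x3= in the comparison notebook. The enumeration is exponential in $m$, so it is only meant for such small checks.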
+ + For a given numerical query $Q$ and a PDB $(\mathcal D, \mathcal W, P)$, the /expected value/ of $Q$ on $(\mathcal D, \mathcal W, P)$ is defined by: +$$E(Q(D)) = \int_{D \in \mathcal W} Q(D) dP$$ + + + diff --git a/projects/missingdata/most-probable-class.tex b/projects/missingdata/most-probable-class.tex new file mode 100644 index 0000000..2192bbe --- /dev/null +++ b/projects/missingdata/most-probable-class.tex @@ -0,0 +1,68 @@ +% Created 2023-06-09 ven. 18:11 +% Intended LaTeX compiler: pdflatex +\documentclass[11pt]{article} +\usepackage[utf8]{inputenc} +\usepackage[T1]{fontenc} +\usepackage{graphicx} +\usepackage{longtable} +\usepackage{wrapfig} +\usepackage{rotating} +\usepackage[normalem]{ulem} +\usepackage{amsmath} +\usepackage{amssymb} +\usepackage{capt-of} +\usepackage{hyperref} +\author{Maxime Buron} +\date{\today} +\title{Which is the most probable class?} +\hypersetup{ + pdfauthor={Maxime Buron}, + pdftitle={Which is the most probable class?}, + pdfkeywords={}, + pdfsubject={}, + pdfcreator={Emacs 28.2 (Org mode 9.5.5)}, + pdflang={English}} +\begin{document} + +\maketitle + +\section{Theoretical result} +\label{sec:org8ad4e4a} + +We consider a random variable \(X\) with a finite range \(\{v_{1}, \dots, v_{m}\}\) whose probabilities are rational: there exists a minimal integer \(Z\) such that \(P(X=v_{i}) = \frac{u_{i}}{Z}\) for integers \(u_{i}\). So, we have \(\sum_{1\leq i \leq m} u_{i} = Z\).
+ + We perform \(n\) independent draws of \(X\). The probability of obtaining \(k_{i}\) times the value \(v_{i}\), with \(\sum_{1\leq i \leq m} k_{i} = n\), is: +$$\binom{n}{k_{1}} (\frac{u_{1}}{Z})^{k_{1}} \times \binom{n - k_{1}}{k_{2}} (\frac{u_{2}}{Z})^{k_{2}} \dots \times \binom{n - k_{1} \dots - k_{m-1}}{k_{m}} (\frac{u_{m}}{Z})^{k_{m}}$$ + +We can simplify the formula to obtain: +$$\frac{n!}{Z^{n}} \prod_{1\leq i \leq m} \frac{u_{i}^{k_{i}}}{k_{i}!}$$ + +Ignoring the constraint on the sum for a moment, finding the values \(k_{i}\) that maximize the above formula amounts to finding, for each \(i\), the \(k_{i}\) maximizing \(\frac{u_{i}^{k_{i}}}{k_{i}!}\). According to the following section, the maximum is reached when \(k_{i} = u_{i}\). However, the additional constraint \(\sum_{1\leq i \leq m} k_{i} = n\) means that the choice \(k_{i} = u_{i}\) is possible only when \(n = Z\); when \(n = cZ\) is a multiple of \(Z\), the same argument applies with the weights rescaled to \(c u_{i}\) and \(cZ\) (giving up the minimality of \(Z\)), and the choice becomes \(k_{i} = c u_{i}\). + + +\section{Analysis of u\textsuperscript{k}/k!} +\label{sec:org338bec8} + +In the following, we observe that the maximum of \(\frac{u^{k}}{k!}\) for fixed \(u\) seems to be reached when \(k=u\). + +\begin{center} +\includegraphics[width=.9\linewidth]{./.ob-jupyter/e16c9d053b3952bf48300ded8b9cfea3a1e6e881.png} +\end{center} + + +TODO: show the result theoretically using the sign of \(\frac{u^{k+1}}{(k+1)!} - \frac{u^{k}}{k!}\). + +\section{{\bfseries\sffamily TODO} Comparison of expected value and best answer} +\label{sec:org41aad11} + +In general, a PDB is a triple \((\mathcal D, \mathcal W, P)\) where \(\mathcal D\) is the possibly infinite set of possible tuples, \(\mathcal W\) is a \(\sigma\)-algebra on \(\mathcal D\) that represents the set of possible database instances (so every member of \(\mathcal W\) is a finite set), and \(P\) is a probability over \(\mathcal W\). + +How should the union or intersection of two instances in \(\mathcal W\) be defined under bag semantics? + +How can an independent block PDB be defined as a PDB from the probabilities of the values in each block? It should be easy.
Is the order of the tuples taken into account? + +Finally, how do the previous results relate to the probabilities of a BIDPDB? + +For a given numerical query \(Q\) and a PDB \((\mathcal D, \mathcal W, P)\), the \emph{expected value} of \(Q\) on \((\mathcal D, \mathcal W, P)\) is defined by: +$$E(Q(D)) = \int_{D \in \mathcal W} Q(D) dP$$ +\end{document} \ No newline at end of file -- GitLab