Learning Markov Network Structure using Few Independence Tests.
|Título||Learning Markov Network Structure using Few Independence Tests.|
|Publication Type||Conference Proceedings|
|Year of Conference||2008|
|Authors||Gandhi P, Bromberg F, Margaritis D|
|Conference Name||SIAM Data Mining|
In this paper we present the Dynamic Grow-Shrink Inference-based Markov network learning algorithm (abbreviated DGSIMN), which improves on GSIMN, the state-of-the-art algorithm for learning the structure of the Markov network of a domain from independence tests on data. DGSIMN, like other independence-based algorithms, works by conducting a series of statistical conditional independence tests toward the goal of restricting the number of possible structures to one, thus inferring that structure as the only possibly correct one. During this process, DGSIMN, like the GSIMN algorithm, uses the axioms that govern the proba bilistic independence relation to avoid unnecessary tests i.e.,tests that can be inferred from the results of known ones. This results in both efficiency and reliability advantages over the simple application of statistical tests. However, one weakness of GSIMN is its rigid and heuristic ordering of the execution of tests, which results in potentially inefficient execution. DGSIMN instead uses a principled strategy, dynamically selecting the locally optimal test that is expected to increase the state of our knowledge about the structure the most. This is done by calculating the expected number of independence facts that will become known (through inference) after executing a particular test (before it is actually evaluated on data), and by selecting the one that is expected to maximize the number of such inferences, thus avoiding their potentially expensive evaluation on data. As we demonstrate in our experiments, this results in an overall decrease in the computational requirements of the algorithm, sometimes dramatically, due to the decreased the number of tests required to be evaluated on data. Experiments show that DGSIMN yields savings of up to 88% on both sampled and benchmark data while achieving similar or better accuracy in most cases.
Miembros del DHARMa que son autores::