## Abstract

In many multivariate domains, we are interested in analyzing the dependency structure of the underlying distribution, e.g., whether two variables are in direct interaction. We can represent dependency structures using Bayesian network models. To analyze a given data set, Bayesian model selection attempts to find the most likely (MAP) model, and uses its structure to answer these questions. However, when the amount of available data is modest, there might be many models that have non-negligible posterior. Thus, we want compute the Bayesian posterior of a feature, i.e., the total posterior probability of all models that contain it. In this paper, we propose a new approach for this task. We first show how to efficiently compute a sum over the exponential number of networks that are consistent with a fixed order over network variables. This allows us to compute, for a given order, both the marginal probability of the data and the posterior of a feature. We then use this result as the basis for an algorithm that approximates the Bayesian posterior of a feature. Our approach uses a Markov Chain Monte Carlo (MCMC) method, but over orders rather than over network structures. The space of orders is smaller and more regular than the space of structures, and has much a smoother posterior "landscape". We present empirical results on synthetic and real-life datasets that compare our approach to full model averaging (when possible), to MCMC over network structures, and to a non-Bayesian bootstrap approach.

Original language | American English |
---|---|

Pages (from-to) | 95-125 |

Number of pages | 31 |

Journal | Machine Learning |

Volume | 50 |

Issue number | 1-2 |

DOIs | |

State | Published - Jan 2003 |

### Bibliographical note

Funding Information:The authors thank Yoram Singer for useful discussions and Harald Steck, Nando de Freitas, and the anonymous reviewers for helpful comments and references. This work was supported by ARO grant DAAH04-96-1-0341 under the MURI program “Integrated Approach to Intelligent Systems”, by DARPA’s Information Assurance program under subcontract to SRI International, and by Israel Science Foundation (ISF) grant 244/99. Nir Friedman was also supported through the generosity of the Michael Sacher Trust Alon Fellowship, and Sherman Senior Lectureship. Daphne Koller was also supported through the generosity of the Sloan Foundation and the Powell Foundation. The experiments reported here were performed on computers funded by an ISF infrastructure grant.

## Keywords

- Bayesian model averaging
- Bayesian networks
- MCMC
- Structure learning