Bayesian Interest Group


These are the notes on the summary of today's talk "Influence Diagrams for Bayesian Decision Making"


    Alibaba's Qwen3.5 LLM created this summary, which seems to be a reasonable representation of the chat content, if you need a TL;DR.

    Here are my immediate comments & responses to the summary. -- John Mark

    Based on the provided meeting transcript, here is a structured extraction of key questions,

    comments, and a topical summary regarding Bayesian Methods and Networks.

    1. Topical Summary

    The discussion centered on the theoretical foundations and practical implementation of

    Bayesian Networks (BN) and Influence Diagrams. The conversation moved between three main

    pillars:

    Philosophical Foundations: Determining whether "influence" in network arcs implies

    causality or merely relevance/informational connection, referencing Leonard Savage's work

    versus modern causal inference standards (Judea Pearl). There was debate over the

    exclusivity of Kolmogorov probabilities versus other uncertainty quantifications.

    Savage's writing on personal probabilities has nothing to say about causality. Pearl's 2000

    book "Causality" started the current discussion. The topic is deep and deserves a longer

    discussion. As regards Kolmogorov - the conventional axiomatic definition of probability -

    alternatives such as Dempster-Shafer or fuzzy methods don't add any value for what we are

    doing.

    Model Construction & Interpretation: Clarifying how arrows are drawn (relevance vs.

    causality), the distinction between structure validation and conditional probability updating,

    and the handling of continuous nodes versus discrete variables in popular software

    packages (e.g., Genie, AgenaRisk).

    Hmm, many things are rolled together here. My term is a "structural prior", meaning the

    judgment from which the connections in the network are drawn. The structural prior has

    more importance in model validity than the numerical priors, in part since the independence

    claims made are strong assumptions, and add power to the model.
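    To make the point concrete, here's a back-of-envelope sketch of how a structural prior's independence claims shrink the parameter count. The four-node structure is my own hypothetical example, not from the talk:

    ```python
    # Back-of-envelope: why the structural prior's independence claims add power.
    # For binary variables, the full joint needs 2^n - 1 free parameters; a BN
    # needs one CPT row per parent configuration of each node.

    def bn_param_count(parents):
        """Free parameters of a binary BN: one per parent configuration per node."""
        return sum(2 ** len(p) for p in parents.values())

    structure = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}  # hypothetical

    full_joint = 2 ** len(structure) - 1   # 15 free parameters for 4 binary variables
    bn = bn_param_count(structure)         # 1 + 2 + 2 + 4 = 9
    print(full_joint, bn)
    ```

    The gap widens exponentially with more variables, which is exactly the complexity argument made below for adopting BNs in the first place.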

    Utility & Optimization: Defining the role of utility functions (risk preferences vs. NPV) and the

    computational challenges of fitting Bayesian models to real data, specifically regarding

    multi-objective optimization for likelihood maximization across stochastic nodes.

    This talk barely touched on utility functions, except to say they are straightforward to add to

    the model.

    2. Key Questions Raised on Bayesian Methods

    The following technical questions were posed by participants regarding methodology and implementation:

    Foundations & Probability Theory

    Is the standard strictly Kolmogorov probability, or do we need to stop using alternative forms

    of uncertainty quantification? (George Hazelrigg)

    If by alternate forms you mean e.g. Dempster-Shafer, yes -- if they violate probability

    axioms the whole rationality edifice collapses.

    Is Bayes Theorem simply a matter of set theory, or is it an extension of logic requiring Cox's

    Theorem and Edwin Jaynes' work? (Sheldon Bernard)

    Yes, Bayes rule is just algebra of measures over mutually exclusive events (the reason for adopting BNs is to avoid exponential complexity with large numbers of variables). Cox's Theorem justifies using Bayes rule for information updating.
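    That "algebra of measures" is literally just multiply-and-normalize. A minimal sketch, with a hypothetical three-cause diagnostic example of my own invention:

    ```python
    # Bayes rule as plain algebra over mutually exclusive hypotheses.
    # Hypothetical example: three fault causes with a prior, updated by the
    # likelihood of an observed symptom under each cause.

    def bayes_update(prior, likelihood):
        """Return the posterior P(H | e) given P(H) and P(e | H)."""
        joint = {h: prior[h] * likelihood[h] for h in prior}   # P(H) * P(e | H)
        evidence = sum(joint.values())                         # P(e), the normalizer
        return {h: p / evidence for h, p in joint.items()}

    prior = {"cause_a": 0.5, "cause_b": 0.3, "cause_c": 0.2}
    likelihood = {"cause_a": 0.1, "cause_b": 0.7, "cause_c": 0.2}  # P(symptom | cause)

    posterior = bayes_update(prior, likelihood)
    print(posterior)  # cause_b becomes the most probable explanation
    ```

    The exponential-complexity point is that `prior` here would need one entry per joint configuration of all variables; a BN factors it into small local tables instead.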

    Do we have to "know" causality exists before incorporating it into the analysis, or can it be

    reasonably inferred? (Jim Spanier)

    In short the engineer / expert's explanation of how the system functions is by its nature

    causal, in the (commonly used) sense I'm using. If there's a question whether some relation

    is causal, then the science is incomplete and it is amenable to statistical investigation. On

    the other hand, since we are expressing causes by probabilities, (epistemic) uncertainty

    about the cause can be expressed in the probabilities.

    Network Semantics & Construction

    In ML, are we looking at Decision Forests as well here? (Jamie Marzonie)

    You're referring to using other ML techniques like Random Forests? Any ML method that

    admits of a Bayesian interpretation can be used - linear regression included. I used CART

    for its first class explainability, and for the direct mapping between the classification tree leaf

    nodes and cells in the BN node's CPT.
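    A minimal sketch of that leaf-to-CPT mapping as I read it, assuming scikit-learn and synthetic data of my own (not the speaker's actual pipeline):

    ```python
    # Sketch: CART leaves as cells of a BN node's CPT. Each leaf partitions the
    # continuous parent space; class frequencies in the leaf estimate one CPT row.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 2))                   # two continuous parent variables
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # discrete child variable

    cart = DecisionTreeClassifier(max_leaf_nodes=4, random_state=0).fit(X, y)
    leaves = cart.apply(X)                          # leaf index for every sample

    # One CPT row per leaf: P(child | parents-in-leaf), estimated by leaf frequencies.
    cpt = {}
    for leaf in np.unique(leaves):
        counts = np.bincount(y[leaves == leaf], minlength=2)
        cpt[int(leaf)] = counts / counts.sum()

    for leaf, row in cpt.items():
        print(leaf, row)
    ```

    The explainability claim falls out directly: each CPT row is labeled by the human-readable split conditions on the path to its leaf.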

    Is "influence" a mix of correlation and causation, or could it be either? (MarcyConn)

    If by "influence" we just mean a euphemism for "cause" (my claim) then I think it is clearer

    not to call a correlation an influence. But as for BN structure, there's nothing wrong with

    building it from correlations, just as long as we don't label it causal.

    Where is the boundary between considering correlation versus weak causality when

    quantifying relationships? (George Hazelrigg)

    "Weak causality?" I think the point is to relegate correlation only to a value computed over

    data, and not ascribe it to belief -- for semantic reasons.

    If I have to know causality prior to constructing my model, will many cases not be able to

    even start drawing/using the network? (Reidar Bratvold)

    I think this has to do with the depth of understanding of the experts one works with. Experts with a deep understanding of the behavior of the system at hand naturally "know its causality" simply in their ability to explain how it works or how it fails. But I have worked with engineers who've given me, for example, network logs of observables and have no notion of why the system behaves as it does. "Expert systems" presume expertise.

    When fitting Bayesian Network to real data... is the fitting process itself more like a multi-

    objective optimization that is quite hard to balance? (zihan ren)

    I didn't get into learning network structure, since it can be problematic: one needs to solve an exponentially complex extension of an already exponentially complex problem. It can be useful, but my experience is that the resulting network requires careful review by experts.

    Dale forwarded me a question - what about using Bayesian regression to learn from data?

    Yes - I chose CART just for convenience of explanation. The entire Bayesian ML field is

    applicable and deserves investigation.

    Utility & Decision Logic

    Is the term Utility referring to risk preferences or things like NPV? (Brian Putt)

    Yes, there's the distinction that utility is a function of value that expresses risk preference.

    What does the data update? Is it used to validate structure or just inform the conditional

    probabilities? (ferna02d)

    The problem I posed is just to update the CPTs, which one would think is a straightforward,

    obvious application of ML, granted how it can be made to fit with the BN. Learning structure

    would raise questions about learning causality -- a harder problem.

    Which is the decision node in the diagram notation? (Brian Putt)

    Circles are probabilities, diamonds are values, squares are decisions.

    Practical Implementation & Tools

    BN advocates show examples using binary/discrete nodes... their practicality is limited by

    clumsiness in handling continuous nodes... Is this a significant limitation for problems with

    many continuous variables? (Keith Shepherd)

    In short, yes. I don't know which of the various attempts to approximate continuous distributions fare better. This is the domain of Markov Chain Monte Carlo simulation methods - see Stan.
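    For intuition only, here is a toy random-walk Metropolis sampler for a single continuous node, written from scratch with hypothetical data (Stan itself uses Hamiltonian Monte Carlo, which scales far better):

    ```python
    # Toy Metropolis sampler: posterior of a normal mean mu with a standard-normal
    # prior, given observations with known sd = 1. Illustrative only.
    import math, random

    random.seed(0)
    data = [1.8, 2.1, 2.4, 1.9, 2.2]  # hypothetical observations

    def log_post(mu):
        log_prior = -0.5 * mu * mu                       # N(0, 1) prior on mu
        log_lik = sum(-0.5 * (x - mu) ** 2 for x in data)
        return log_prior + log_lik

    mu, samples = 0.0, []
    for _ in range(20000):
        prop = mu + random.gauss(0.0, 0.5)               # random-walk proposal
        if math.log(random.random()) < log_post(prop) - log_post(mu):
            mu = prop                                    # accept
        samples.append(mu)

    posterior_mean = sum(samples[5000:]) / len(samples[5000:])
    print(round(posterior_mean, 2))
    ```

    The conjugate answer here is sum(data)/(n+1) ≈ 1.73, which the sampler should approach; the "desperately slow" complaint above is about what happens when many such continuous nodes interact.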

    Packages that do dynamic discretisation become desperately slow with more than a handful

    of continuous variables. What is the industry standard workaround? (Keith Shepherd)

    One approach, incidentally, is the CART classification tree algorithm I demonstrated, which chooses a discretization over all continuous input variables based on prediction accuracy. Perhaps it can be extended to optimize for decision EV.

    3. Key Comments & Expert Insights

    The following comments highlight specific insights or consensus reached during the discussion:

    On Relevance vs. Causality: Reidar Bratvold clarified Howard's perspective that arcs in a network represent relevance, not causality. He noted: "Including an arrow means that there

    MAY be a relevance. Not including an arrow is an absolute statement that there is no

    relevance between the events."

    I agree that the absence of an arc is a strong condition, whatever we call it.

    On Savage's Work: The group referenced Leonard Savage's The Foundations of Statistics

    and Von Neumann & Morgenstern's Theory of Games, establishing these as key texts for

    understanding the "Bayesian promise" regarding inductive reasoning.

    On Bayesian Probability Definition: Sheldon Bernard emphasized that "Bayesian

    probability is... more accurately described as the extension of logic" rather than just set

    theory, referencing Cox's Theorem.

    On Causal Inference Resources: To address the difficulty of establishing causality vs.

    correlation, Sheldon recommended Scott Cunningham's Causal Inference: The Mixtape and

    Judea Pearl's work for understanding how modern modelers approach this.

    Just to note, nothing I presented today touches on "establishing causality" - the domain of

    statistics.

    On Fitting & Optimization: While acknowledging that fitting data requires likelihood

    maximization, the group noted that "we are trying to infer cause -- not deductive reasoning

    but inductive reasoning based on plausibility" (Sheldon Bernard), accepting that we cannot

    know "cause" 100%.

    Since we cannot, using probability to reason about it makes sense.

    On Decision Notation: MarcyConn confirmed that rectangular notation is used for decision

    nodes, helping standardize how the model is read by domain experts.

    On SME Elicitation: A distinction was made between Structure (elicitation) and

    Probabilities (data update). The group questioned whether data validates the structure itself

    or simply informs the Conditional Probabilities (CPs).

    As a major take-away: using ML to bring data-driven rigor to influence diagrams -- finding the balance between where judgment is best applied and where data can be applied in this framework -- is key. The techniques are still in their infancy.  _JM



    ------------------------------
    John Mark Agosta
    johnmark.agosta@fondata.ai
    -------
    LinkedIn: https://www.linkedin.com/in/john-mark-agosta/
    Find me at https://medium.com/@johnmark54
    ------------------------------

    Attachment(s)

    pdf
    IDs_for_BDM_annotated.pdf   128 KB 1 version