Information?: An Inquiry

Tuesdays, 9:30-11 am
Science Building, Room 227


For further information contact Paul Grobstein.


An Exchange on Bayesian Inference and Formal Axiomatic Systems

Alan Baker and Paul Grobstein

(continuing a discussion begun at a meeting of the Information Working Group
of the Center for Science in Society at Bryn Mawr College)


Grobstein to Baker, 22 July 2004

Alan -

Pleased you were able to be around today (particularly), and appreciated the issues you raised/helped to raise. Which do relate to/depend on expertise in philosophy of math (that I, obviously, do not have). Two related points in particular seem to me important, worth pursuing further:

I think these are inter-related in the following potentially interesting ways:

Bayes' Theorem, as I currently understand it, requires as "primitives" the concepts of arithmetic operators, of current and next (but not of "time" in any continuous sense), and of "input from outside" (the "new" observation; this is less demanding than a concept of "space"). I accept your assertion that the theorem is derivable from some existing set of axioms but that doesn't, I think, necessarily equate to "can't create anything more than can be created by a formal axiomatic system", which is what I'm interested in. I suspect that any derivation of Bayes' theorem depends ALSO on an "indeterminacy" primitive (something that allows for "probability"). If so, that would, as I understand it, take the axiomatic system which gives rise to Bayes' theorem out of the realm of "formal axiomatic systems" and make it not subject to the Godel/Turing limitations. There exists also the possibility, even if Bayes CAN be derived in a strictly formal axiomatic system, that it could instead be taken as an axiom for a new axiomatic system lacking one or more of the starting points in the system from which it can be derived, and that this in turn could potentially yield a less "incomplete" deductive system.

What particularly intrigues me is the idea (perhaps incorrect?) that the history of math takes the integer as the primitive, and consistency as the sine qua non, and then gets into trouble when one hits successively infinities of various sizes, incompleteness, formally undecidable propositions, and incompressible numbers. I can't help but wonder what would happen if one took as primitives ALL the reals BETWEEN zero and one and a few other things (as perhaps in Bayes' theorem). Might one be able to work backwards to integers in a way that would perhaps create less of a struggle between "consistency" and "completeness"?

Paul


Baker to Grobstein, with further comments by Grobstein indented, 25 July 2004

Paul,

Thanks for your message. I am definitely interested in many of the issues concerning the pros and cons of formal axiomatic systems which seem to have arisen in the course of the Group's discussions over the past couple of months. With regard to the specific topic of probability I have less background, though the question of how exactly to interpret probability remains a philosophically "hot" topic.

The standard, and widely accepted, formal axiomatization of probability theory is due to Kolmogorov. This "classical" theory seems to have a similar status to that of classical logic. Bayes' Theorem can be derived from the Kolmogorov axioms. In general, the notion of consistency in classical logic is replaced - in probability - with the notion of coherence. As you mentioned in discussion, agents are free to assign any probability to any individual (contingent) proposition, but if their overall distribution of probabilities (at a given time) is not coherent then the axioms can be used to derive more than one probability assignment for some propositions.
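[A small sketch may make the relationship concrete. The Python fragment below is an editorial illustration, not part of the correspondence; the joint-distribution numbers are invented. It builds a toy distribution satisfying the Kolmogorov requirements and checks that the ratio definition of conditional probability and Bayes' Theorem agree.]

```python
from fractions import Fraction

# A toy joint distribution over hypothesis H and evidence e (values
# invented for the illustration). Per the Kolmogorov axioms, the four
# cells are non-negative and sum to 1.
joint = {
    ("H", "e"):         Fraction(3, 10),
    ("H", "not-e"):     Fraction(2, 10),
    ("not-H", "e"):     Fraction(1, 10),
    ("not-H", "not-e"): Fraction(4, 10),
}

def p(h=None, e=None):
    """Joint / marginal probability: sum the cells matching the query."""
    return sum(pr for (hv, ev), pr in joint.items()
               if (h is None or hv == h) and (e is None or ev == e))

# Conditional probability by the ratio definition: P(A/B) = P(A & B) / P(B).
posterior = p(h="H", e="e") / p(e="e")

# Bayes' Theorem: P(H/e) = P(H) * P(e/H) / P(e), with P(e/H) likewise
# cashed out as a ratio.
bayes = p(h="H") * (p(h="H", e="e") / p(h="H")) / p(e="e")

assert posterior == bayes == Fraction(3, 4)
```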

Ah so. If I'm understanding correctly, "coherence" basically means that the probability assignments for "related" hypotheses ("statements" in classical logic) must sum to less than 1.0 (I imagine that defining "related" is a major task in its own right?). And this then assures that there is one and only one probability assignment for permitted assemblies of hypotheses (parallel to the truth or falsity of well-constructed compound statements in classical logic, yes?). Is there an accessible (to me) literature on this? Does it follow from the parallels that the Kolmogorov logic generates a Godel-like "incompleteness" as classical logic does? ie are there compound statements whose probability assignment is indeterminate?

The Bayesian approach treats probabilities as subjective (about degrees of belief) rather than objective (about intrinsic properties of the world). One philosophical issue concerns just what is meant by "degree of belief." It's easy enough to see intuitively what this is, but not easy to define it in a non-circular way. [e.g. it won't do to say that your degree of belief measures how likely you think the hypothesis is to be true, because this is still an inherently probabilistic notion] One popular approach is to define degrees of belief in terms of betting quotients (this is due to Ramsey, in the 1920's), i.e. a degree of belief can be expressed in terms of a ratio between a stake and a pay-off, where the pay-off occurs if the hypothesis in question is true. e.g. to say that I have degree of belief .25 that it will rain tomorrow is to say that I would be indifferent w.r.t. a bet which involves staking $1 and pays out $3 if it rains tomorrow.

Think I understand both the problem and the Ramsey approach, both in spirit and in practice. Assume that the Ramsey approach does not, by itself, assure "coherence"?, that that's handled as a separate problem?
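[The betting-quotient idea above can be illustrated numerically. The sketch below is an editorial illustration, reading the letter's $3 pay-off as net winnings - an assumption about how the figures are meant - and checks that a degree of belief of .25 makes the described bet exactly fair.]

```python
# Degree of belief .25, per Ramsey: indifference toward a bet staking $1
# against a $3 pay-off if it rains (reading the $3 as net winnings).
degree_of_belief = 0.25
stake = 1.0
net_winnings = 3.0

# Expected net gain of the bet: win $3 with probability .25, lose the
# $1 stake with probability .75. Indifference = expected gain of zero.
expected_gain = degree_of_belief * net_winnings - (1 - degree_of_belief) * stake
assert abs(expected_gain) < 1e-12

# Conversely, the betting quotient stake / (stake + winnings) recovers
# the degree of belief.
assert stake / (stake + net_winnings) == 0.25
```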
FEELS to me like the K-derived probability logic stems from the same mindset as classical two-valued logic, ie not only does it have the same "problems" but has also the same "disconnect" from how the brain actually works (I think). That "degree of belief" would be a "philosophical issue" might be an indication of that. Could one not simply take "degree of belief" as something along the lines of "existence in brain of some relevant degree of order" defined re "randomness"?

On this interpretation, it is difficult to see how degrees of belief could be irrational, let alone incompressible. [There is also Mark's argument that the things we attach probabilities to tend to involve ratios between discrete quantities, but this seems less convincing for two reasons.]

Like/share your sense of limitations of Mark's argument. Had your first point in my mind; like your second and think it likely to be more compelling to Mark. I was, I freely admit, hoping the incompressibility notion might give some heft to intuitions about how things work that I have trouble describing in terms compelling to most people. So it could well be a red herring. Agree it WOULD be a red herring on the Ramsey/Kolmogorov approach. On flip side, though, I have no reason to believe that the nervous system is constrained in its basic functions to operations that would yield only rational numbers and it is demonstrably capable of conceiving not only irrationals but incompressibles as well.

One other thing which follows from Bayes' Theorem (which I think is contrary to what you said in discussion) is that conditionalizing on evidence that is inconsistent with an hypothesis results in a posterior probability of zero for that hypothesis. This is because the numerator in the likelihood ratio (i.e. P(e/H)) is itself zero when e implies ~H. [Of course, there is a fudge here between "The sun did not come up this morning" and "It appears to me that the sun did not come up this morning."
I agree that conditionalizing on the second of these would not reduce the posterior probability of "The sun rises every day" to zero.]

I had enough trouble persuading myself that I partially understood the "likelihood factor" (and STILL don't understand what constrains it in such a way as to preclude the posterior probability from exceeding one). So I'm more than happy to receive correction/tutelage on this one. Actually, I LIKE the idea that the posterior probability can go to zero. I have for years been arguing that "science" cannot "prove" an hypothesis (an argument for which I will find Bayes' Theorem helpful) but CAN "disprove" one. The latter is important to me for other (though perhaps related) reasons, basically because it's the incentive to "change the story", ie go off looking for observations in some new direction (which I think is one essential component of "science", something akin to Kuhn's "revolutions").

With respect to what Bayes' Theorem does or does not require in terms of assumptions, I'm not entirely convinced that it requires a primitive concept of current/next. For instance, there is nothing in principle to stop us conditionalizing on temporally "older" evidence before we conditionalize on more recent evidence. Also, any "indeterminacy" primitive that is necessary for Bayes' Theorem seems to me to already be necessary anyway for classical probability theory (at least for the theory to be interesting). All "indeterminacy" seems to mean in this context is that propositions about the world can take numerical values between 0 and 1. Hence I remain unclear why Bayes' Theorem takes us "out of the realm of 'formal axiomatic systems'" (as you put it). [We know from examples such as fuzzy logic that formality and indeterminacy/vagueness can be combined]

In here is, I think, the core of what makes this worth talking through/trying to clarify for me (and perhaps for you too?).
Am not of course sure what Bayes had in mind, nor do I have the background to know what potentially related developments there were at the time/have been since (though am intrigued by the notion that Bayes was working BEFORE the rational/analytic mindset became dominant). What I AM interested in is whether "indeterminacy" is ONLY "that propositions about the world can take numerical values between 0 and 1". My intuition is that "indeterminacy" can usefully be seen as more than that. It is, for me, the "wiggle room" that precludes adequate understanding through the mechanism of ANY "formal axiomatic system" and that makes the next iteration fundamentally unpredictable (to some degree) from the results of the previous one. Is not just "vagueness" in understanding; is at the core of the phenomena one is trying to "understand". And if THAT's the case, then one DOES have to recognize the fundamental limits of any/all FAS's (since they are "mechanical" in the specific sense that the next iteration IS fully predictable from the prior one). And one might aspire to a formalization of the inquiry/understanding process that doesn't have those limitations. My guess is that whatever the "primitive" is that represents "indeterminacy" in classical probability theory, it doesn't have the additional character I want, but I'm prepared to be told I'm wrong about this (the issue is not dissimilar from the Copenhagen arguments re quantum theory).

You suggest that even if BT is derivable from the standard probability axioms one could construct an alternate system which takes BT as an axiom and drops one or more of the other standard axioms. In fact it can be shown by logical argument that this scenario is impossible.

Let P, Q, R be the axioms of probability theory. [It doesn't matter exactly how many there are.]

By assumption, {P, Q, R} |- B.

Try dropping an axiom, say R, and adding B to the other axioms, so we have {P, Q, B}. Assume that this allows us to prove something we couldn't prove in the original theory, call it K. So {P, Q, B} |- K and {P, Q, R} does not |- K.

But B is derivable from {P, Q, R}, so {P, Q, R} and {P, Q, R, B} are logically equivalent theories. So {P, Q, R, B} does not |- K. But {P, Q, R, B} is a stronger theory than {P, Q, B} so {P, Q, B} does not |- K, and we have contradicted our assumption above.

Had a suspicion there was a problem along these lines when I made the suggestion. And may be another red herring. Let's see whether it's worth returning to in light of other things. What I had in mind was, I think, not quite the one the proof responds to. The idea was not to substitute Bayes' theorem for an existing axiom or set of axioms but rather to use Bayes' theorem as a starting point and then add additional axioms as seemed necessary.

I don't think that I have a complete handle on your suggestion concerning integers and reals. Perhaps it is because I am not sure what you mean (in this context) by "taking as primitive." In terms of formal systems, we can give an axiomatization of arithmetic (viz. Peano arithmetic) where the only primitives are "0" and the successor relation. So none of the other natural numbers are "primitive" in this formal sense. Real numbers are typically defined in terms of sequences of rational numbers, which are in turn defined in terms of ratios of integers. From this perspective, are you suggesting an axiomatization of the reals which does not proceed in this cumulative way from the integers? I'm pretty sure this can be (and has been) done, by focusing directly on the defining properties of the real numbers, such as being ordered, dense, closed under polynomial composition, etc. But I am unclear how this might (even in principle) avoid the Godel-style incompleteness results, since the resulting theory will be strong enough to embed integer arithmetic.

Yeah, had precisely in mind "an axiomatization of the reals which does not proceed in this cumulative way from the integers". And, here too, may be a red herring (or something already fully explored). I'm not though entirely dissuaded by your notion that doing so would not "avoid the G-style incompleteness results, since the resulting theory would be strong enough to embed integer arithmetic".
It seems to me at least possible (and if possible desirable) that it would further illuminate G-incompleteness by showing it to hold in one realm (integer arithmetic) of a larger space in which it does not generally hold.

On a separate note, relating to Godel and to Thursday's discussion, I think that it is important not to lose sight of the limited scope and strength of Godel's result. He's not saying that there are truths which cannot be captured in any formal system, but that any given formal system will unavoidably leave out some truths. Also Godel does not show that there are any interesting true arithmetical claims which are unprovable (i.e. of independent interest to mathematicians, aside from their unprovability). There has been considerable work done over the past twenty years by logicians and mathematicians to try to find "interesting" arithmetical claims which are not provable in Peano arithmetic (PA) but are provable in Zermelo-Fraenkel set theory (ZF). Increasingly interesting and natural claims have been found, but nothing which approaches the simplicity or intuitiveness of (say) Fermat's Last Theorem, or the Goldbach Conjecture.

Do understand that Godel incompleteness is NOT "absolute", that what one can't reach in one formal system CAN be reached in another. In fact, one of the things that I think is important in all this is that FAS's themselves GENERATE the unreachable (for themselves). Would need some tutoring on PA, ZF etc if it looks relevant in the long run, but had the impression that Chaitin was quite directly concerned with establishing "interesting arithmetical claims" (his omega number and its relation to diophantine equations) not provable in FAS's as he understands them at least.


Baker to Grobstein, 29 July 2004, referring by indent italics to previous Grobstein

Paul,

I think that you are correct to emphasize the parallels between classical logic and the standard axiomatization of probability. One further parallel - often overlooked - is that classical logic also gives you "freedom" in initial assignments. From the point of view of pure logic, each contingent atomic proposition can be assigned either "true" or "false." Below I have some more specific responses to your comments from the last message.

Ah so. If I'm understanding correctly, "coherence" basically means that the probability assignments for "related" hypotheses ("statements" in classical logic) must sum to less than 1.0 (I imagine that defining "related" is a major task in its own right?). And this then assures that there is one and only one probability assignment for permitted assemblies of hypotheses (parallel to the truth or falsity of well-constructed compound statements in classical logic, yes?). Is there an accessible (to me) literature on this? Does it follow from the parallels that the Kolmogorov logic generates a Godel-like "incompleteness" as classical logic does? ie are there compound statements whose probability assignment is indeterminate?

The point about incoherence, for the probability calculus, is not that you have a choice of different probabilities you can assign (since this is also true for coherent systems), but that you can derive - within the (incoherent) system - two (or more) different probability assignments for a single proposition. Hence the analogy with inconsistency in classical logic - where you can derive two truth-value assignments (true and false) for a single proposition. For example, say I assign probability .25 to the proposition that it will rain tomorrow. If I am working with an incoherent set of probability axioms then I will be able to derive, within the system, a second, distinct probability for this same proposition, say .45.

It's not right, I think, to say that the probabilities for "related" hypotheses must sum to less than 1 (at least not on any understanding of "related" that I can think of). Perhaps you are thinking of the theorem of classical probability which says that if A and B are mutually inconsistent propositions then P(A) + P(B) is less than or equal to 1. More generally, there seem to be plenty of counterexamples to your claim.

e.g.
let C = "the dice does not come up 6"
D = "the dice comes up odd"

C and D are related, since D logically implies C.
But (for a fair dice), P(C) + P(D) = 5/6 + 1/2 = 4/3 > 1.
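[As a sanity check, the counterexample can be verified by enumerating the six outcomes. The sketch below is an editorial illustration, not part of the letters.]

```python
from fractions import Fraction

# The six equiprobable outcomes of a fair die.
outcomes = range(1, 7)

p_C = Fraction(sum(1 for o in outcomes if o != 6), 6)      # C: "does not come up 6"
p_D = Fraction(sum(1 for o in outcomes if o % 2 == 1), 6)  # D: "comes up odd"

assert p_C == Fraction(5, 6) and p_D == Fraction(1, 2)
assert p_C + p_D == Fraction(4, 3)  # related hypotheses, yet the sum exceeds 1
```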

There *are* more general constraints on the probability assignments of compound statements, relative to the probability assignments of their atomic components. But there is no single such constraint.

e.g.
P(A & B) <= P(A)
P(A & B) <= P(B)

P(A v B) >= max [P(A), P(B)]

Also, if A -> B then P(B) >= P(A)

The issue of completeness in probability is an interesting one. What the above remarks indicate is that there are "compound statements whose probability is indeterminate" (relative to the probabilities of their components). Perhaps the simplest case is conjunctions of non-independent events.

Let A be the hypothesis that George Bush is re-elected. Let B be the hypothesis that the stock market rises in 2005. Say you assign probabilities of .5 to each. Then the only coherence constraint on P(A & B) is that it be less than or equal to .5. You are free to assign any probability to A & B between 0 and .5.
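[The general form of this constraint is sometimes stated as an interval (the Frechet bounds) rather than a point. A small editorial sketch, with conjunction_bounds a made-up helper name:]

```python
# Given only P(A) and P(B), coherence pins P(A & B) to an interval
# (the Frechet bounds), not a point:
#   max(0, P(A) + P(B) - 1) <= P(A & B) <= min(P(A), P(B))
def conjunction_bounds(p_a, p_b):
    return max(0.0, p_a + p_b - 1.0), min(p_a, p_b)

# Baker's example: P(A) = P(B) = .5 leaves P(A & B) anywhere in [0, .5].
assert conjunction_bounds(0.5, 0.5) == (0.0, 0.5)

# With higher marginals the lower bound bites: P(A) = P(B) = 1 forces
# P(A & B) = 1.
assert conjunction_bounds(1.0, 1.0) == (1.0, 1.0)
```

(If A and B were independent, P(A & B) would be pinned to .25 - but nothing in coherence alone forces independence.)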

When discussing Godel and related results it is important to be clear about what "completeness" means in the logical context. A logic typically consists of a syntax (rules for well-formed formulas, rules for proof) and a semantics (an interpretation of the syntax, in which meaning is assigned to the logical symbols). In classical propositional logic, the semantics is simply the truth-table interpretation of each logical connective. A logic is complete iff every semantically valid claim is syntactically derivable. (Soundness is the converse condition.) And in this sense, classical (1st-order) logic is complete.
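[The truth-table semantics makes propositional validity mechanically checkable, which may help fix ideas. An editorial sketch, testing a formula on every row of its truth table:]

```python
from itertools import product

def implies(p, q):
    """Material conditional: p -> q."""
    return (not p) or q

def is_valid(formula, variables):
    """Semantically valid iff true on every row of the truth table."""
    return all(formula(dict(zip(variables, row)))
               for row in product([True, False], repeat=len(variables)))

# Contraposition, (A -> B) -> (~B -> ~A), is valid on all four rows...
contraposition = lambda v: implies(implies(v["A"], v["B"]),
                                   implies(not v["B"], not v["A"]))
assert is_valid(contraposition, ["A", "B"])

# ...whereas a contingent formula like A -> B is not (false when A and not B).
assert not is_valid(lambda v: implies(v["A"], v["B"]), ["A", "B"])
```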

Once we move beyond purely logical systems, a more useful definition of completeness is iff - for any sentence of the system, S - either S or ~S is provable in the system. This is the sense of completeness relevant to Godel's result. Clearly this is an unreasonable - indeed undesirable - condition to place on logical systems in general. (We don't want to be able to prove logically that it will rain tomorrow or prove logically that it won't rain tomorrow!) But in the special case of arithmetic (and in mathematical systems more generally) this sort of completeness is a desirable goal. We would like our formal mathematical systems to decide every question in the given domain. It is this goal, in the particular case of mathematical systems strong enough to embed arithmetic, which Godel showed is impossible.

Think I understand both the problem and the Ramsey approach, both in spirit and in practice. Assume that the Ramsey approach does not, by itself, assure "coherence"?, that that's handled as a separate problem? FEELS to me like the K-derived probability logic stems from the same mindset as classical two-valued logic, ie not only does it have the same "problems" but has also the same "disconnect" from how the brain actually works (I think). That "degree of belief" would be a "philosophical issue" might be an indication of that. Could one not simply take "degree of belief" as something along the lines of "existence in brain of some relevant degree of order" defined re "randomness"?

I didn't mention this in my previous message, but the Ramsey-style "betting quotient" approach does assure coherence. There are some theorems (known as "Dutch Book" theorems) which show that a person will accept a set of bets in which he is guaranteed to lose money (a so-called "Dutch Book") iff his method for calculating the probabilities of compound statements does not conform to the standard probability axioms.
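[A minimal illustration of how an incoherent assignment invites a Dutch Book; an editorial sketch, with the specific numbers invented for the example:]

```python
# An incoherent agent assigns P(rain) = .25 and P(not-rain) = .85 -- the
# two probabilities sum to 1.1, violating the Kolmogorov axioms.
quotients = {"rain": 0.25, "not-rain": 0.85}

# A bookie sells the agent a $1-stake bet on EACH proposition at the
# agent's own betting quotients: the bet on X costs quotients[X] dollars
# and pays $1 if X is true.
cost = sum(quotients.values())  # agent pays $1.10 up front

# Exactly one of "rain" / "not-rain" comes true, so the agent collects
# exactly $1 whichever way the world goes: a sure loss of about $0.10.
for true_outcome in ("rain", "not-rain"):
    winnings = 1.0
    assert winnings - cost < 0  # guaranteed loss -- the Dutch Book
```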

You write: "Could one not simply take "degree of belief" as something along the lines of "existence in the brain of some relevant degree of order" defined re "randomness"?"

I'm unclear what you mean here. Degrees of belief are supposed to attach to particular hypotheses. These could relate to immediate sensory experience (e.g. "the coin will come up heads on the next flip") or they could relate to much more abstruse matters (e.g. "the 109th element in the periodic table will be created by scientists in the next 24 hours"). I don't see how a particular number - corresponding to the person's degree of belief for each such hypothesis - can (even in principle) be extracted from the physical state of the brain. And where does randomness come in, exactly?

Like/share your sense of limitations of Mark's argument. Had your (i) in my mind; like your (ii) and think it likely to be more compelling to Mark. I was, I freely admit, hoping the incompressibility notion might give some heft to intuitions about how things work that I have trouble describing in terms compelling to most people. So it could well be a red herring. Agree it WOULD be a red herring on the Ramsey/Kolmogorov approach. On flip side, though, I have no reason to believe that the nervous system is constrained in its basic functions to operations that would yield only rational numbers and it is demonstrably capable of conceiving not only irrationals but incompressibles as well.

I totally agree with your observation that there is "no reason to believe that the nervous system is constrained ... to operations that would yield only rational numbers." But I think that it is important not to conflate the numbers which enter into a description of the nervous system with the numbers which our minds can conceive of. I don't see any clear inference, for example, from "We can conceive of incompressibles" to "The best description of the functioning of our nervous system involves incompressibles."

I had enough trouble persuading myself that I partially understood the "likelihood factor" (and STILL don't understand what constrains it in such a way as to preclude the posterior probability from exceeding one). So I'm more than happy to receive correction/tutelage on this one. Actually, I LIKE the idea that the posterior probability can go to zero. I have for years been arguing that "science" cannot "prove" an hypothesis (an argument for which I will find Bayes' Theorem helpful) but CAN "disprove" one.
The latter is important to me for other (though perhaps related) reasons, basically because it's the incentive to "change the story", ie go off looking for observations in some new direction (which I think is one essential component of "science", something akin to Kuhn's "revolutions").

The possibility of posterior probabilities going to zero clearly fits nicely with the Popperian view of scientific hypotheses as falsifiable. The easiest way, I think, to see why the posterior probability cannot exceed 1 is to cash out the conditional probability in the numerator using the formula P(A/B) = P(A & B) / P(B) (assuming P(B) is not 0).

Bayes' Theorem says P(H/e) = P(H) * P(e/H) / P(e).

Rewriting the right-hand side, using the above formula for P(e/H), yields [P(H) * P(e&H) / P(H)] / P(e).

But now we can cancel out the two P(H)'s, so the resulting (posterior) probability is P(e&H) / P(e). And since the probability of a conjunction is less than or equal to the probability of either conjunct, this expression cannot exceed 1.
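[The cancellation argument can also be checked mechanically. An editorial sketch, sweeping a grid of coherent values and using the coherence constraint P(e) >= P(e & H):]

```python
from fractions import Fraction

# Sweep coherent values for P(e & H) and P(e) -- coherence requires
# P(e) >= P(e & H) -- and confirm P(e & H) / P(e) never exceeds 1.
for num in range(0, 11):
    p_e_and_h = Fraction(num, 10)
    for den in range(max(num, 1), 11):  # P(e) must be nonzero
        p_e = Fraction(den, 10)
        posterior = p_e_and_h / p_e
        assert 0 <= posterior <= 1
```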

In here is, I think, the core of what makes this worth talking through/trying to clarify for me (and perhaps for you too?). Am not of course sure what Bayes had in mind, nor do I have the background to know what potentially related developments there were at the time/have been since (though am intrigued by the notion that Bayes was working BEFORE the rational/analytic mindset became dominant). What I AM interested in is whether "indeterminacy" is ONLY "that propositions about the world can take numerical values between 0 and 1". My intuition is that "indeterminacy" can usefully be seen as more than that. It is, for me, the "wiggle room" that precludes adequate understanding through the mechanism of ANY "formal axiomatic system" and that makes the next iteration fundamentally unpredictable (to some degree) from the results of the previous one. Is not just "vagueness" in understanding; is at the core of the phenomena one is trying to "understand". And if THAT's the case, then one DOES have to recognize the fundamental limits of any/all FAS's (since they are "mechanical" in the specific sense that the next iteration IS fully predictable from the prior one). And one might aspire to a formalization of the inquiry/understanding process that doesn't have those limitations. My guess is that whatever the "primitive" is that represents "indeterminacy" in classical probability theory, it doesn't have the additional character I want, but I'm prepared to be told I'm wrong about this (the issue is not dissimilar from the Copenhagen arguments re quantum theory).

The connections between issues concerning formal axiomatic systems, Godel's results, and quantum mechanics are deep and interesting. [The Lucas-Penrose argument is one well-known attempt to exploit such connections.]
When you claim that FAS's are "mechanical in the specific sense that the next iteration IS fully predictable from the prior one," are you referring to iterations *within* the system or iterations *of* the system? In the latter case it seems up to us (to a considerable degree) how we change one FAS to yield another. In the former case, I'm not sure exactly what is being iterated. Are we talking about the repeated application of a rule of the system? But even in this case we may not have full predictability. For example, the Disjunction rule in classical logic allows you to infer from "P" to "P v Q", and you are free to choose for "Q" *any* well-formed formula in the language. (In other cases we do have full predictability. For example, the Modus Ponens rule gives us no choice but to infer "B" from "If A then B" and "A".)

Had a suspicion there was a problem along these lines when I made the suggestion. And may be another red herring. Let's see whether it's worth returning to in light of other things. What I had in mind was, I think, not quite the one the proof responds to. The idea was not to substitute Bayes' theorem for an existing axiom or set of axioms but rather to use Bayes' theorem as a starting point and then add additional axioms as seemed necessary.

Using Bayes' Theorem as a starting point "and then add[ing] other axioms as seem necessary" still sounds weird to me. I think that my worry is that the expressions appearing in BT require more fundamental axioms to define their meaning. (In particular the concept of conditional probability.) It seems analogous to starting, say, with Fermat's Last Theorem as an axiom, without having in place basic axioms for defining the concept of number.

Yeah, had precisely in mind "an axiomatization of the reals which does not proceed in this cumulative way from the integers". And, here too, may be a red herring (or something already fully explored).
I'm not though entirely dissuaded by your notion that doing so would not "avoid the G-style incompleteness results, since the resulting theory would be strong enough to embed integer arithmetic". It seems to me at least possible (and if possible desirable) that it would further illuminate G-incompleteness by showing it to hold in one realm (integer arithmetic) of a larger space in which it does not generally hold.

As I understand the Godel incompleteness results, any formal system which is strong enough to embed arithmetic is either inconsistent or incomplete. Hence it cannot be the case that Godel incompleteness holds only in "one realm of a larger space in which it does not generally hold."

Do understand that Godel incompleteness is NOT "absolute", that what one can't reach in one formal system CAN be reached in another. In fact, one of the things that I think is important in all this is that FAS's themselves GENERATE the unreachable (for themselves). Would need some tutoring on PA, ZF etc if it looks relevant in the long run, but had the impression that Chaitin was quite directly concerned with establishing "interesting arithmetical claims" (his omega number and its relation to diophantine equations) not provable in FAS's as he understands them at least.

I think we need to be careful in interpreting the claim that "FAS's themselves GENERATE the unreachable (for themselves)." Certainly, the demonstrably unprovable statements are typically system-specific. But remember that Godel's system for coding statements about provability-in-a-system into arithmetical statements is itself arbitrary. There are infinitely many different coding schemes for each system, and each coding scheme generates a different set of demonstrably unprovable (in that system) statements. In other words, it is misleading to think that the system itself leads us inexorably to some particular (unprovable) Godel sentence. As for your point about the omega number, I'm sure you're right.
I haven't looked at Chaitin's stuff for a while, and I had forgotten (or was never aware) that the omega number has independent mathematical interest.

As for accessible references concerning probability and its philosophical ramifications, as I said in my original reply this is not my area of professional expertise so I am not really up on the literature. Having said that, one good place to start is the article "Interpretations of Probability" in the Stanford Encyclopedia of Philosophy. This is an online encyclopedia which (in general) has excellent survey articles on a wide range of topics. It is peer reviewed and considered highly respectable within the field. At the end of the article is a fairly extensive bibliography. The author of the article, Alan Hajek, is a good friend of mine from grad school, currently at Caltech. There is another article in the Stanford Encyclopedia specifically on Bayes' Theorem. I haven't read it, but it might be useful.


Grobstein responding to Baker indent italics, 4 August 2004

I think that you are correct to emphasize the parallels between classical logic and the standard axiomatization of probability. One further parallel - often overlooked - is that classical logic also gives you "freedom" in initial assignments. From the point of view of pure logic, each contingent atomic proposition can be assigned either "true" or "false."

Do think it's interesting/potentially productive to ask both about the degree of parallelism and about the inevitability or lack thereof of whatever parallelisms exist. Seems to me possible (subject to some of the below) that parallelisms might reflect the fact that formalization of probability theory occurred during/after people got interested in formalization of arithmetic, and might have gone differently if it had proceeded more independently.

A gingerly suggestion, since I'm finding myself constantly reminded of my naivete in these realms, and there may in consequence be an important something I keep overlooking. Perhaps in the area of what is/is not taken as "givens" in FAS's (aspects of which we touched on earlier). It hadn't been clear in my mind that the "givens" are not only axioms and "sentence generation mechanisms" and "inference rules" (my terms, do they bear some relation to your vocabulary?) but also (in all cases?) "atomic propositions" and arbitrary "value" assignments. Wouldn't mind clearing this up, if you don't mind what are probably some very elementary clarifications. While on this topic, I assume there is nothing in an FAS that corresponds to the "new observation" of the "scientist", ie a something from OUTSIDE the formal system that has to be taken into account? Suspect this bears on a point below where you found me unclear ("I'm not sure what is being iterated"), will see when we get there.

The point about incoherence, for the probability calculus, is not that you have a choice of different probabilities you can assign (since this is also true for coherent systems), but that you can derive - within the (incoherent) system - two (or more) different probability assignments for a single proposition. Hence the analogy with inconsistency in classical logic - where you can derive two truth-value assignments (true and false) for a single proposition. For example, say I assign probability .25 to the proposition that it will rain tomorrow. If I am working with an incoherent set of probability axioms then I will be able to derive, within the system, a second, distinct probability for this same proposition, say .45.

It's not right, I think, to say that the probabilities for "related" hypotheses must sum to less than 1 (at least not on any understanding of "related" that I can think of). Perhaps you are thinking of the theorem of classical probability which says that if A and B are mutually inconsistent propositions then P(A) + P(B) is less than or equal to 1. More generally, there seem to be plenty of counterexamples to your claim.

e.g.
let C = "the dice does not come up 6"
D = "the dice comes up odd"

C and D are related, since D logically implies C.
But (for a fair dice), P(C) + P(D) = 5/6 + 1/2 = 4/3 > 1.
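Baker's counterexample is easy to verify by brute enumeration over the six equally likely outcomes; a minimal sketch in Python (the helper name is mine):

```python
# Verifying the counterexample: P(C) + P(D) can exceed 1 even though
# D logically implies C.
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5, 6]  # a fair die, all outcomes equally likely

def prob(event):
    """Probability of an event (a predicate on outcomes) under a fair die."""
    hits = sum(1 for o in outcomes if event(o))
    return Fraction(hits, len(outcomes))

p_C = prob(lambda o: o != 6)      # C = "the dice does not come up 6"
p_D = prob(lambda o: o % 2 == 1)  # D = "the dice comes up odd"

print(p_C, p_D, p_C + p_D)  # 5/6 1/2 4/3 -- the sum exceeds 1
```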

There *are* more general constraints on the probability assignments of compound statements, relative to the probability assignments of their atomic components. But there is no single such constraint.

e.g.
P(A & B) <= P(A)
P(A & B) <= P(B)

P(A v B) >= max [P(A), P(B)]

Also, if A -> B then P(B) >= P(A)
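Reading these as non-strict inequalities (P(A & B) <= P(A), P(A & B) <= P(B), P(A v B) >= max[P(A), P(B)], and P(B) >= P(A) when A implies B), they can be spot-checked against arbitrary joint distributions; a randomized sketch, purely illustrative:

```python
# Check the compound-statement constraints on 1000 random joint
# distributions over the four truth-value combinations of A and B.
import random

random.seed(0)
for _ in range(1000):
    # random weights for the cells (A&B, A&~B, ~A&B, ~A&~B), normalized
    w = [random.random() for _ in range(4)]
    total = sum(w)
    p_ab, p_anb, p_nab, p_nanb = (x / total for x in w)

    p_a = p_ab + p_anb              # P(A)
    p_b = p_ab + p_nab              # P(B)
    p_a_or_b = p_ab + p_anb + p_nab # P(A v B)

    eps = 1e-12
    assert p_ab <= min(p_a, p_b) + eps       # conjunction constraints
    assert p_a_or_b >= max(p_a, p_b) - eps   # disjunction constraint

    # If A logically implies B, the A & ~B cell is empty; renormalizing
    # the remaining cells gives P(A) = P(A & B) <= P(B).
    rest = p_ab + p_nab + p_nanb
    assert p_ab / rest <= (p_ab + p_nab) / rest + eps

print("all constraints hold")
```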

There IS something here that I'm having trouble getting my head around, and so may require us to backtrack a bit (if you're willing). The "sum to less than 1" idea was my "guess" as to what the requirement was to achieve "coherence" in probability calculus. The assumption I was making (I think) is that what probability calculus does is to derive probabilities given a set of starting conditions (which include some probability assignments to "atomic sentences"?) and that "coherence" (parallel to completeness) meant, as you say, that there would never be "two or more probability assignments for a single proposition". Since I assumed probability assignments (like true/false?) could be ARBITRARILY assigned only to "atomic sentences", I don't quite understand the idea of "probability assignments" of COMPOUND sentences. Am I missing something? On the other hand, the "general constraints" do make sense to me (and indeed probably more so than my "sum to less than 1" idea).

It's interesting though, assuming I'm understanding you, that the "operators" used in setting the constraints are those of "classical logic". Again, I can't help but wonder what kind of "formalization" of probability might have resulted if it had been done (was done) independent of that framework.

The issue of completeness in probability is an interesting one. What the above remarks indicate is that there are "compound statements whose probability is indeterminate" (relative to the probabilities of their components). Perhaps the simplest case is conjunctions of non-independent events. Let A be the hypothesis that George Bush is re-elected. Let B be the hypothesis that the stock market rises in 2005. Say you assign probabilities of .5 to each. Then the only coherence constraint on P(A & B) is that it be no greater than .5. You are free to assign any probability to A&B between 0 and .5. Hmmmm. So there is no "rule" for determining the value of A&B in probability calculus as there is (I thought) in classical logic (ie true if A and B are both true, otherwise false)? Is there a difference between "undetermined" and "inconsistent"? In probability calculus? In classical logic? (are Godel statements simply "undetermined" or do they themselves or indirectly lead to inconsistencies?).
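The indeterminacy point can be made concrete with two explicit joint distributions, both giving P(A) = P(B) = .5 but differing on P(A & B); a sketch (the numbers and names are mine):

```python
# Two joint distributions over (A, B) with identical marginals but
# different conjunction probabilities -- the components alone leave
# P(A & B) undetermined.

def marginals(joint):
    """joint maps (a, b) truth-value pairs to probabilities."""
    p_a = sum(p for (a, b), p in joint.items() if a)
    p_b = sum(p for (a, b), p in joint.items() if b)
    p_ab = joint[(True, True)]
    return p_a, p_b, p_ab

# A and B perfectly correlated: P(A & B) = 0.5
correlated = {(True, True): 0.5, (True, False): 0.0,
              (False, True): 0.0, (False, False): 0.5}

# A and B mutually exclusive: P(A & B) = 0
exclusive = {(True, True): 0.0, (True, False): 0.5,
             (False, True): 0.5, (False, False): 0.0}

print(marginals(correlated))  # (0.5, 0.5, 0.5)
print(marginals(exclusive))   # (0.5, 0.5, 0.0)
```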

When discussing Godel and related results it is important to be clear about what "completeness" means in the logical context. A logic typically consists of a syntax (rules for well-formed formulas, rules for proof) and a semantics (an interpretation of the syntax, in which meaning is assigned to the logical symbols). In classical 1st-order propositional logic, the semantics is simply the truth-table interpretation of each logical connective. A logic is complete iff every semantically valid claim is syntactically derivable. (Soundness is the converse condition.) And in this sense, classical (1st-order) logic is complete.
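The truth-table semantics Baker describes can be sketched as a brute-force validity checker (the representation and function names are mine):

```python
# A formula is semantically valid iff it comes out true under every
# assignment of truth values to its atomic propositions.
from itertools import product

def implies(a, b):
    """Material conditional: 'A -> B' is '(not A) or B'."""
    return (not a) or b

def is_valid(formula, atoms):
    """formula: a function from an assignment (dict of atom -> bool) to bool."""
    return all(formula(dict(zip(atoms, values)))
               for values in product([True, False], repeat=len(atoms)))

# Contraposition, (P -> Q) -> (~Q -> ~P), is valid under every assignment ...
print(is_valid(lambda v: implies(implies(v['P'], v['Q']),
                                 implies(not v['Q'], not v['P'])),
               ['P', 'Q']))  # True

# ... while a contingent atomic proposition is not.
print(is_valid(lambda v: v['P'], ['P']))  # False
```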

Hmmmm. This may be part of where my naivete gets me into trouble. I understand (I think) "syntax" (that is what I called above "construction rules" together with "inference rules", yes?). What's less clear to me is the distinction between that and "semantics". I guess I assumed the "truth table" was part of the definition of the "logic", and am less clear about how "semantics" would be characterized in other cases. Even so, I thought that Godel showed "incompleteness" for 1st order logic. Not so, huh? Only for more sophisticated logics adequate to support arithmetic? And those have a different "semantics"?

Once we move beyond purely logical systems, a more useful definition of completeness is: for any sentence S of the system, either S or ~S is provable in the system. This is the sense of completeness relevant to Godel's result. Clearly this is an unreasonable, indeed undesirable, condition to place on logical systems in general. (We don't want to be able to prove logically that it will rain tomorrow or prove logically that it won't rain tomorrow!) But in the special case of arithmetic (and in mathematical systems more generally) this sort of completeness is a desirable goal. We would like our formal mathematical systems to decide every question in the given domain. It is this goal, in the particular case of mathematical systems strong enough to embed arithmetic, which Godel showed is impossible.

Back on track here (thanks for the detour). And agree that, for ARITHMETIC, "completeness" in the sense defined would be desirable. Is less obvious, MUCH less obvious, that it would be desirable for "science", much less day to day life. Assume this has been appreciated, issue is whether people have proposed alternate formal systems more appropriate for ... science?

I didn't mention this in my previous message, but the Ramsey-style "betting quotient" approach does assure coherence. There are some theorems (known as "Dutch Book" theorems) which show that a person will accept a set of bets in which he is guaranteed to lose money (a so-called "Dutch Book") iff his method for calculating the probabilities of compound statements does not conform to the standard probability axioms.
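The flavor of a Dutch Book argument can be sketched with a small numeric example (the prices are mine, not from the text):

```python
# Suppose an agent's fair prices for $1 bets on "rain tomorrow" and
# "no rain tomorrow" are 0.25 and 0.45. They sum to 0.70 < 1, which
# violates the probability axioms. At a fair price the agent is willing
# to sell either bet, so a bookie buys both -- and profits no matter
# what the weather does.
price_rain = 0.25     # agent's degree of belief in rain
price_no_rain = 0.45  # agent's degree of belief in no rain

for it_rains in (True, False):
    payout_to_bookie = 1  # exactly one of the two bets pays off
    bookie_profit = payout_to_bookie - (price_rain + price_no_rain)
    print(it_rains, round(bookie_profit, 2))  # profit of 0.3 either way
```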

You write: "Could one not simply take 'degree of belief' as something along the lines of 'existence in the brain of some relevant degree of order' defined re 'randomness'?"

I'm unclear what you mean here. Degrees of belief are supposed to attach to particular hypotheses. These could relate to immediate sensory experience (e.g. "the coin will come up heads on the next flip") or they could relate to much more abstruse matters (e.g. "the 109th element in the periodic table will be created by scientists in the next 24 hours"). I don't see how a particular number "corresponding to the person's degree of belief for each such hypothesis" can (even in principle) be extracted from the physical state of the brain. And where does randomness come in, exactly?

May want to come back to interesting "Dutch book" issue, "theorems ... which show that a person" confuses me (observations can show ...., but a theorem?). But, let's stay for the moment with your question to me. If Bayes was right, that probability = degree of certainty, then "degree of certainty" MUST be a feature of the "physical state of the brain" and, if it is, it can be "extracted" by appropriate observations. This is true whether the "cause" of that particular brain state is "immediate sensory experience" or "more abstruse matters". Randomness may be a factor in the "more abstruse matters". But, more immediately, my speculation is that the brain representation of "degree of certainty" uses a state of randomness (ie 0 certainty) as a baseline. I totally agree with your observation that there is "no reason to believe that the nervous system is constrained" to operations that would yield only rational numbers.

But I think that it is important not to conflate the numbers which enter into a description of the nervous system with the numbers which our minds can conceive of. I don't see any clear inference, for example, from "We can conceive of incompressibles" to "The best description of the functioning of our nervous system involves incompressibles."

No argument about lack of "clear inference", is speculative leap. BUT am not conflating description with what "our minds can conceive of". The latter is a very small part of what the nervous system is/does (consciousness, the "story teller"). The rest of the nervous system may well not only be "describable" using incompressible numbers but may operate in ways that involve them (whether consciousness can conceive them or not ... the rest of the ns, for example, clearly works happily with high numbers of degrees of freedom/dimensions whereas most people can only "conceive" three).

The possibility of posterior probabilities going to zero clearly fits nicely with the Popperian view of scientific hypotheses as falsifiable. The easiest way, I think, to see why the posterior probability cannot exceed 1 is to cash out the conditional probability in the numerator using the formula P(A/B) = P(A & B) / P(B) (assuming P(B) is not 0).

Bayes' Theorem says P(H/e) = P(H) . P(e/H) / P(e)

Rewriting the RH side, using the above formula, yields [P(H) . P(e&H)] / [P(H) . P(e)]

But now we can cancel out the two P(H)'s, so the resulting (posterior) probability is P(e&H) / P(e). And since the probability of a conjunction is less than or equal to the probability of one conjunct, this expression cannot exceed 1.
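The derivation can be spot-checked numerically on an arbitrary joint distribution over a hypothesis H and evidence e (the numbers below are mine):

```python
# A joint distribution over the four combinations of H and e.
p_He = 0.30    # P(H & e)
p_Hne = 0.10   # P(H & ~e)
p_nHe = 0.20   # P(~H & e)
p_nHne = 0.40  # P(~H & ~e)

p_H = p_He + p_Hne        # prior P(H) = 0.4
p_e = p_He + p_nHe        # P(e) = 0.5
p_e_given_H = p_He / p_H  # P(e/H) = P(e & H) / P(H)

# Bayes' Theorem: the P(H)'s cancel, leaving P(e & H) / P(e) <= 1.
posterior = p_H * p_e_given_H / p_e
print(posterior)  # approximately 0.6, equal to P(e & H) / P(e)
```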

Nice, got it. Relates to one of the coherence criteria above, yes? No, wait a minute. If that argument holds (as I'm understanding it) then the "likelihood factor" can't exceed one and the posterior probability can never be increased by a new observation? What am I missing?

The connections between issues concerning formal axiomatic systems, Godel's results, and quantum mechanics are deep and interesting. [The Lucas-Penrose argument is one well-known attempt to exploit such connections.] When you claim that FAS's "are "mechanical" in the specific sense that the next iteration IS fully predictable from the prior one," are you referring to iterations *within* the system or iterations *of* the system. In the latter case it seems up to us (to a considerable degree) how we change one FAS to yield another. In the former case, I'm not sure exactly what is being iterated. Are we talking about the repeated application of a rule of the system? But even in this case we may not have full predictability. For example, the Disjunction rule in classical logic allows you to infer from "P" to "P v Q", and you are free to choose for "Q" *any* well-formed formula in the language. (In other cases we do have full predictability. For example, the Modus Ponens rule gives us no choice but to infer "B" from "If A then B" and "A".)
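Baker's contrast between the two rules can be rendered as a toy sketch, with formulas as plain strings (the representation and function names are mine, purely illustrative):

```python
# Modus Ponens forces its conclusion; Disjunction introduction leaves
# the choice of the second disjunct entirely open.

def modus_ponens(conditional, antecedent):
    """From 'A -> B' and 'A', infer 'B': the conclusion is fully determined."""
    a, b = conditional.split(' -> ')
    assert a == antecedent, "premises do not match"
    return b

def disjunction_intro(p, q):
    """From 'P', infer 'P v Q' -- where Q can be ANY well-formed formula."""
    return f"({p} v {q})"

print(modus_ponens('A -> B', 'A'))   # B -- no choice involved
print(disjunction_intro('P', 'Q'))   # (P v Q)
print(disjunction_intro('P', '~R'))  # (P v ~R) -- a different, free choice of Q
```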

IS (perhaps?) the key issue (in my way of thinking/concerns). Had in mind iteration WITHIN the system and the ambition that some scientists have explicitly (and many scientists as well as non-scientists have unconsciously) to achieve a "theory of everything": a small set of starting conditions and transformation rules from which everything else in "reality" follows. The week before the session you were at, I made a strong (I think) argument (see http://serendip.brynmawr.edu/local/scisoc/information/grobstein15july04.html) that trying to describe "reality" as an FAS would inevitably fail (whether it actually IS one or not), and so one needs some formalization of the inquiry process that doesn't include the presumption that reality could be usefully described as a particular FAS. DO understand that "fully predictable iteration" is NOT a property of formal logic (nor of arithmetic): both involve "irreversible" transformations (your examples in logic; 3 + 5 = 8 in arithmetic). But physics tries to use only time-symmetric, hence reversible, hence deterministically iterative transformations, and many other sciences try to copy that. And THAT is what I am fundamentally trying to get people to get away from.

Hmmmm .... effort to clarify here between us useful (yeah other places too but maybe here especially). Am realizing that my quarrel may be less with FAS per se and more with a particular KIND of FAS?, a kind exemplified by Turing computability. I thought FAS's and Turing computability systems were formally equivalent? Does anyone make a distinction between FAS's with and without irreversibility?, with and without transformation rules that have some "arbitrariness" (observer choice) to them? What is presumed to be the "source" of the arbitrary choices in the theory of FAS?

And another line to think more about: do NOT want to argue that FAS's and/or Turing computability are irrelevant to science, only that they cannot serve as the aspired-to goal. So COULD imagine a characterization of science in terms of "how we change one FAS to yield another". Are there efforts to "formalize" science in those terms?

Using Bayes' Theorem as a starting point "and then add[ing] other axioms as seem necessary" still sounds weird to me. I think that my worry is that the expressions appearing in BT require more fundamental axioms to define their meaning. (In particular the concept of conditional probability.) It seems analogous to starting, say, with Fermat's Last Theorem as an axiom, without having in place basic axioms for defining the concept of number.

Yeah, fair enough. But ... what if one DID take the "message" of Bayes Theorem (ie "degree of uncertainty", no presumptions about nature/even existence of external world, iterative change) as a starting point and then attempted to create a "formal system" around that, defining axioms/terms as needed?

As I understand the Godel incompleteness results, any formal system which is strong enough to embed arithmetic is either inconsistent or incomplete. Hence it cannot be the case that Godel incompleteness holds only in "one realm of a larger space in which it does not generally hold."

Maybe issue here, what needs to be clarified, is "formal system", what exactly the requirements/constraints presumed in it are?

I think we need to be careful in interpreting the claim that "FAS's themselves GENERATE the unreachable (for themselves)." Certainly, the demonstrably unprovable statements are typically system-specific. But remember that Godel's system for coding statements about provability-in-a-system into arithmetical statements is itself arbitrary. There are infinitely many different coding schemes for each system, and each coding system generates a different set of demonstrably unprovable (in that system) statements. In other words, it is misleading to think that the system itself leads us inexorably to some particular (unprovable) Godel sentence. As for your point about the omega number, I'm sure you're right. I haven't looked at Chaitin's stuff for a while, and I had forgotten (or was never aware) that the omega number has independent mathematical interest.

Was not aware of the dependence of the definition of the "unreachable" on the choice of coding scheme. Is important to know that, thanks. Was myself impressed by learning about Chaitin's persistence in establishing "independent mathematical interest". All of which makes me wonder still more about my presumption of a formal equivalence between Godel and Turing. There may be more of interest to chew over there.



Director: Liz McCormack -
emccorma@brynmawr.edu | Faculty Steering Committee | Secretary: Lisa Kolonay
© 1994- , by Center for Science in Society, Bryn Mawr College and Serendip

Last Modified: Wednesday, 04-Aug-2004 11:40:53 EDT