202405201744
Status: #idea
Tags: Probability

Probability Measure (According to Kolmogorov)

A Probability measure $P$ is a special type of measure which maps elements from a $\sigma$-algebra to the interval $[0,1]$, with $P(\Omega) = 1$ for the sample space $\Omega$. That is really it.

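It has the same properties as you would expect for a standard measure:

1. $P(\emptyset) = 0$.
2. Countable additivity: if $A_1, A_2, \ldots$ are pairwise disjoint sets in the $\sigma$-algebra, then $P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$.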

This is the standard definition of probability. Note that there is a different but equivalent definition of probability which is built not from Measure Theory axioms, but directly from the concept of Expected Value. This is Probability Measure (Based on Expected Value).

Since the two definitions are equivalent and have the same properties, one can go from one to the other with impunity, but the distinction is worth noting.

Properties of Probability Measures

Formula for Complement

But from that we can derive everything we come to know about probability.
For example, since we know that $P(\Omega) = 1$ (where $\Omega$ is the sample space), and we know that for any set $A$, $A \cup A^c = \Omega$ and $A \cap A^c = \emptyset$, we can use the second axiom of measures to derive this well-known fact:

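$$
\begin{aligned}
\Omega &= A \cup A^c \\
P(\Omega) &= P(A \cup A^c), && \text{using both sides as inputs to the probability measure } P \\
P(\Omega) &= P(A) + P(A^c), && \text{by second axiom} \\
1 &= P(A) + P(A^c), && \text{by definition of a probability measure} \\
P(A^c) &= 1 - P(A)
\end{aligned}
$$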

You can also easily prove that if $A \subseteq B$, then $P(A) \le P(B)$; we leave that as an exercise to the reader. Lel.
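For anyone who wants to check their answer to that exercise, here is one possible route (a sketch, not necessarily the intended one). It splits $B$ into two disjoint pieces and assumes only the second axiom plus the fact that $P$ never goes below $0$:

$$
\begin{aligned}
B &= A \cup (B \setminus A), && \text{a disjoint union, since } A \subseteq B \\
P(B) &= P(A) + P(B \setminus A), && \text{by second axiom} \\
&\ge P(A), && \text{since } P(B \setminus A) \ge 0
\end{aligned}
$$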

Using Measure Theory as the foundation of Probability Theory makes all the other derivations similarly beautiful.
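The complement rule and the monotonicity claim above can be sanity-checked on a small finite example. The sketch below assumes a uniform measure on a fair six-sided die; the names `OMEGA` and `prob` and the events `A` and `B` are made up for illustration and are not part of the note's sources.

```python
from fractions import Fraction

# Hypothetical finite sample space: one roll of a fair six-sided die.
OMEGA = frozenset(range(1, 7))


def prob(event):
    """Uniform probability measure on OMEGA: P(A) = |A| / |OMEGA|."""
    event = frozenset(event)
    assert event <= OMEGA, "an event must be a subset of the sample space"
    return Fraction(len(event), len(OMEGA))


A = frozenset({2, 4, 6})        # "the roll is even"
B = frozenset({2, 3, 4, 5, 6})  # "the roll is at least 2", so A is a subset of B
A_complement = OMEGA - A

# Complement rule: P(A^c) = 1 - P(A)
assert prob(A_complement) == 1 - prob(A)

# Monotonicity: A subset of B implies P(A) <= P(B)
assert A <= B and prob(A) <= prob(B)

print(prob(A), prob(A_complement), prob(B))
```

Exact fractions are used instead of floats so that the equalities hold exactly rather than up to rounding.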

Formula for Unions: The Inclusion-Exclusion Principle

Let's start by proving the following: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
This is my derivation of it.

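$$
\begin{aligned}
A \cup B &= (A \setminus B) \cup B, && \text{we rewrite the LHS in a more convenient fashion} \\
P(A \cup B) &= P((A \setminus B) \cup B) \\
&= P(A \setminus B) + P(B), && \text{by second axiom} \\
&= P(A \cap B^c) + P(B) \\
&= P(A) - P(A \cap B) + P(B), && \text{since } A \setminus B \text{ is the part of } A \text{ that is not in } B \\
&= P(A) + P(B) - P(A \cap B), && \text{simple rearrangement}
\end{aligned}
$$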

Boom!
This gives us a rigorous proof of why this equality holds. Now, what is the general form of the formula? What if, instead of $A$ and $B$, we have $A_1, A_2, A_3, \ldots, A_n$, which are all in our $\sigma$-algebra?

What is the formula for $P\left(\bigcup_{i=1}^{n} A_i\right)$?
Pasted image 20240520183722.png
screencap from Wikipedia cause I ain't typing allat.
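For reference, here is the standard inclusion-exclusion identity typed out, on the assumption that it matches the screenshot:

$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{k=1}^{n} (-1)^{k+1} \sum_{1 \le i_1 < \cdots < i_k \le n} P\left(A_{i_1} \cap \cdots \cap A_{i_k}\right)$$

A minimal sketch that checks this identity on a toy example follows. The uniform 12-outcome sample space, the three events, and the helper names `prob` and `inclusion_exclusion` are all hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical sample space: 12 equally likely outcomes.
OMEGA = frozenset(range(1, 13))


def prob(event):
    """Uniform probability measure on OMEGA."""
    return Fraction(len(frozenset(event) & OMEGA), len(OMEGA))


def inclusion_exclusion(events):
    """Right-hand side of the identity: alternating sum over all k-fold intersections."""
    total = Fraction(0)
    for k in range(1, len(events) + 1):
        sign = 1 if k % 2 == 1 else -1  # (-1)^(k+1)
        for subset in combinations(events, k):
            total += sign * prob(frozenset.intersection(*subset))
    return total


events = [
    frozenset(x for x in OMEGA if x % 2 == 0),  # multiples of 2
    frozenset(x for x in OMEGA if x % 3 == 0),  # multiples of 3
    frozenset(x for x in OMEGA if x > 8),       # outcomes greater than 8
]

# Left-hand side: probability of the union, computed directly.
direct = prob(frozenset.union(*events))
assert inclusion_exclusion(events) == direct
print(direct)
```

The loop visits every non-empty subfamily of the events, so it is exponential in $n$; that is fine for a check like this but not something to run on large families.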

How to Prove it?

Continuity of Probability Measures

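If $A_1, A_2, \ldots \in$ $\sigma$-algebra $\Xi$, then

$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \lim_{m \to \infty} P\left(\bigcup_{i=1}^{m} A_i\right)$$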

This looks rather obvious (I mean, it looks really similar to how we define infinite summations), but there's actually more to this statement than meets the eye. This is a really important theorem that is used all the time in Probability Theory. Also, among other things, we are NOT simply taking the limit inside the brackets; an actual rigorous proof is required to show the equality.

Also, $\bigcup$ shouldn't be seen as a sequential operator which operates on $A_1$ and $A_2$, and then on $A_1 \cup A_2$ and $A_3$, etc. Instead it takes everything at once. While the following statement is true

$$\bigcup_{i=1}^{\infty} A_i = \lim_{m \to \infty} \left( \bigcup_{i=1}^{m} A_i \right)$$

(this limit refers to increasing inclusions of sets), it is not wise to move limits in or out of $P$ without justification.

Indeed, one should be careful when introducing limits in a Measure Theory context, especially when it comes to bringing a limit in and out of a measure. Unless there's a specific argument to support it, or a convergence theorem that says we can, one should not assume it is correct to do so.

A proof of this theorem can be found on YouTube; the second link in the references covers it during the lecture.
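To get a feel for the statement (this is an illustration, not a proof), here is a small sketch under an assumed example: toss a fair coin repeatedly and let $A_i$ be the event "the first head appears on toss $i$", so $P(A_i) = 2^{-i}$. The $A_i$ are disjoint, so the probability of each finite union is a finite sum by the second axiom, and these values climb towards $1$. The function names are hypothetical.

```python
from fractions import Fraction


def prob_first_head_on(i):
    """P(A_i) for a fair coin: i - 1 tails followed by one head."""
    return Fraction(1, 2**i)


def prob_finite_union(m):
    """P(A_1 ∪ ... ∪ A_m); the A_i are disjoint, so the second axiom gives a sum."""
    return sum((prob_first_head_on(i) for i in range(1, m + 1)), Fraction(0))


# The finite-union probabilities increase towards 1 as m grows.
for m in (1, 2, 5, 10, 20):
    p = prob_finite_union(m)
    print(m, p, float(p))
```

The gap $1 - P\left(\bigcup_{i=1}^{m} A_i\right)$ is exactly $2^{-m}$, so the limit on the right-hand side of the theorem is $1$, matching the probability that a head eventually shows up.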

Corollary

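1. If $A_1 \subseteq A_2 \subseteq A_3 \subseteq \cdots$, then by the previous result:

$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \lim_{m \to \infty} P(A_m)$$

2. If $A_1 \supseteq A_2 \supseteq A_3 \supseteq \cdots$, then by the previous result and De Morgan's laws:

$$P\left(\bigcap_{i=1}^{\infty} A_i\right) = \lim_{m \to \infty} P(A_m)$$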

This is not a typo; in fact it makes perfect sense. If I keep taking intersections of a non-increasing sequence of sets (in the sense that $A_i \supseteq A_j$ for all $i, j$ where $i < j$), then the intersection of the first $m$ sets is just $A_m$, the smallest set so far. So at the end of the infinite road, the limit is taken over $P(A_m)$.

Observation: whether the sequence of sets is non-decreasing or non-increasing, the probability of its limit (the infinite union or the infinite intersection, respectively) is simply $\lim_{m \to \infty} P(A_m)$.
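A quick worked example of both cases, assuming (this is not from the note's sources) that $P$ is the uniform measure on $[0,1]$, so that $P([a,b]) = b - a$:

$$
\begin{aligned}
\text{Non-decreasing: } & A_m = \left[0,\, 1 - \tfrac{1}{m}\right], & \bigcup_{m=1}^{\infty} A_m &= [0, 1), & P([0,1)) &= 1 = \lim_{m \to \infty} \left(1 - \tfrac{1}{m}\right) \\
\text{Non-increasing: } & A_m = \left[0,\, \tfrac{1}{m}\right], & \bigcap_{m=1}^{\infty} A_m &= \{0\}, & P(\{0\}) &= 0 = \lim_{m \to \infty} \tfrac{1}{m}
\end{aligned}
$$

In both rows the probability of the limiting set agrees with the limit of the $P(A_m)$, as the corollary predicts.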

Union-Bound Property

Let $A_1, A_2, \ldots$ all be events; then:

$$P\left(\bigcup_{i=1}^{\infty} A_i\right) \le \sum_{i=1}^{\infty} P(A_i)$$

Intuitively, if all the $A_i$ are disjoint, then by the second axiom we have equality, but if even one pair $A_i$, $A_j$ overlaps, the sum counts their intersection twice. The overcounting compounds as more pairs overlap.
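The intuition above can be seen on a finite toy example. The sketch below assumes a uniform measure on 10 outcomes; the helper names and both families of events are hypothetical.

```python
from fractions import Fraction

# Hypothetical sample space: 10 equally likely outcomes.
OMEGA = frozenset(range(1, 11))


def prob(event):
    """Uniform probability measure on OMEGA."""
    return Fraction(len(frozenset(event) & OMEGA), len(OMEGA))


def union_vs_sum(events):
    """Return (P(union of events), sum of P(A_i)); the union bound says first <= second."""
    p_union = prob(frozenset().union(*events))
    p_sum = sum((prob(a) for a in events), Fraction(0))
    return p_union, p_sum


disjoint = [frozenset({1, 2}), frozenset({3}), frozenset({4, 5, 6})]
overlapping = [frozenset({1, 2, 3}), frozenset({3, 4}), frozenset({1, 4, 5})]

# Disjoint events: equality, by the second axiom.
p_union, p_sum = union_vs_sum(disjoint)
assert p_union == p_sum

# Overlapping events: the sum counts shared outcomes more than once.
p_union, p_sum = union_vs_sum(overlapping)
assert p_union < p_sum
print(p_union, p_sum)
```

In the overlapping family the outcomes 1, 3 and 4 each belong to two events, which is exactly where the extra mass in the sum comes from.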

This can be proven pretty neatly using Indicator Variables, but it can also be shown directly using the $A_i$ to $B_i$ transformation that is used to show the continuity of probability measures. The second link in the references is really useful for understanding all of that.
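As a sketch of the direct argument, assume the $A_i$ to $B_i$ transformation meant here is the usual disjointification (the lecture may set it up slightly differently):

$$
\begin{aligned}
B_1 &= A_1, \qquad B_i = A_i \setminus \bigcup_{j=1}^{i-1} A_j \quad \text{for } i \ge 2 \\
P\left(\bigcup_{i=1}^{\infty} A_i\right) &= P\left(\bigcup_{i=1}^{\infty} B_i\right), && \text{since the } B_i \text{ cover the same outcomes} \\
&= \sum_{i=1}^{\infty} P(B_i), && \text{by second axiom, the } B_i \text{ being pairwise disjoint} \\
&\le \sum_{i=1}^{\infty} P(A_i), && \text{since } B_i \subseteq A_i \text{ implies } P(B_i) \le P(A_i)
\end{aligned}
$$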

References

Probability Spaces
Probability Measure (Based on Expected Value)
Probability Measures Lecture