202405291209
Status: #idea
Tags: Probability Theory
State: #nascent

Probability Measure (Based on Expected Value)

Why?

It is a competitor to the trusted definition of Probability Measure (According to Kolmogorov) that is purported to make more sense, because it is founded on a concept that even non-mathematicians are familiar with: the Arithmetic Mean.

Fundamentally, an expected value is an average; in the context of probability, it represents the value we expect some Random Variable to take after $n$ samples (potentially infinitely many), and it is computed by taking a weighted average. This allows us to skip the measure-theoretic nonsense (jk, I like measure theory) and start talking about Probability Theory directly. That is a huge advantage if you're a starry-eyed learner who decided to learn probability for whatever reason and then gets hit with a wall of convergence theorems, limits, and Measure Theory concepts that must be front-loaded before you even get to see what probability means.

Definition of Expected Value

First what is an Expected Value?
For some random variable $X$, the expected value is an operator such that:

  1. If $X \geq 0$, then $E(X) \geq 0$
  2. If $c \in \mathbb{R}$, then $E(cX) = cE(X)$
  3. $E(X_1 + X_2) = E(X_1) + E(X_2)$
  4. For some constant $c$, $E(c) = c$
  5. If $0 \leq X_n(\omega) \uparrow$ in $n$, such that $X_{n+1}(\omega) \geq X_n(\omega)$ for all $n$, then $E(X_n) \uparrow E(X)$ (Monotone Convergence Theorem)
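
Properties 1 through 4 can be checked numerically for a finite discrete case, taking the expected value to be a plain weighted average. This is only an illustrative sketch: the values, weights, and the helper `E` below are made up, not part of any formal definition.

```python
# Sketch: a discrete "expected value" as a weighted average,
# with numeric checks of properties 1-4. All numbers are illustrative.

def E(values, probs):
    """Weighted average: sum of x * p(x) over all values x."""
    return sum(x * p for x, p in zip(values, probs))

xs = [1, 2, 3, 4]          # values the random variable X takes
ps = [0.1, 0.2, 0.3, 0.4]  # their probabilities (they sum to 1)

# 1. X >= 0 implies E(X) >= 0
assert E(xs, ps) >= 0

# 2. E(cX) = c * E(X)
c = 5.0
assert abs(E([c * x for x in xs], ps) - c * E(xs, ps)) < 1e-12

# 3. E(X1 + X2) = E(X1) + E(X2), with X2 built pointwise from X1
ys = [x * x for x in xs]
assert abs(E([x + y for x, y in zip(xs, ys)], ps)
           - (E(xs, ps) + E(ys, ps))) < 1e-12

# 4. E(c) = c for a constant
assert abs(E([c] * len(xs), ps) - c) < 1e-12
```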

Most of these properties are intuitive except 5. The last property essentially states that if $X_n$ is an increasing sequence with respect to $n$, the expected value $E(X_n)$ will grow towards $E(X)$ (or approach it from the left) as $n$ approaches infinity. And the $X$ that seems to come out of nowhere is understood to be the pointwise limit of the sequence $X_n$; in other words, it is what happens when $n$ gets really big, so $X = \lim_{n \to \infty} X_n$.

The idea is that $X$ encapsulates the entire sequence $X_n$, and for that reason we drop the index.

With this in mind, you can see that an equivalent statement to 5 is:
If $X_n(\omega) \uparrow$ in $n$, such that $X_{n+1}(\omega) \geq X_n(\omega)$ for all $n$, then $\lim_{n \to \infty} E(X_n) = E(\lim_{n \to \infty} X_n)$ (Monotone Convergence Theorem)

Fundamentally, these are axioms, so you can just take them as fact and go on your merry way; but if you need intuition (and a rigorous proof), especially for the Monotone Convergence Theorem, it is a theorem from Measure Theory and it is explained in its own note.
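
To get a feel for the Monotone Convergence Theorem numerically, one standard trick is truncation: $X_n = \min(X, n)$ increases pointwise to $X$, so $E(X_n)$ should climb toward $E(X)$. The discrete variable and its weights below are made up purely for illustration:

```python
# Sketch of the Monotone Convergence Theorem for a simple discrete X.
# X_n = min(X, n) is increasing in n and converges pointwise to X,
# so E(X_n) should be non-decreasing and reach E(X).

def E(values, probs):
    """Weighted average: sum of x * p(x) over all values x."""
    return sum(x * p for x, p in zip(values, probs))

xs = [1, 3, 7, 10]        # values of X (illustrative)
ps = [0.4, 0.3, 0.2, 0.1] # their probabilities

target = E(xs, ps)  # E(X)

approximations = []
for n in range(1, 11):
    xn = [min(x, n) for x in xs]  # the truncated variable X_n
    approximations.append(E(xn, ps))

# E(X_n) never decreases, and by n = 10 it has reached E(X)
assert all(a <= b for a, b in zip(approximations, approximations[1:]))
assert abs(approximations[-1] - target) < 1e-12
```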

So?

With the above established, we can say that as long as $E$ is defined in such a way that all these properties are maintained, it is an Expected Value. Obviously, if whatever definition of "expected value" we have fails one of those properties, it is NOT an expected value.

From there, you see that the standard definitions of expected value:
Discrete case

$$\sum_{i=1}^{n} x_i \, p_X(x_i)$$

Continuous case

$$\int_{-\infty}^{\infty} x \, f_X(x) \, dx$$

will hold all the above properties. And since $p_X$ is a probability mass function whose values lie between 0 and 1, and $f_X$ is a probability density that integrates to 1 (neither is defined here yet, don't worry), we are effectively taking a weighted average.
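
A fair six-sided die makes the weighted-average reading of the discrete formula concrete: every outcome gets weight $p_X(x) = 1/6$, so the expected value is just the ordinary arithmetic mean of the faces. (The die is my own illustrative example, not something from a formal definition.)

```python
# Discrete expected value of a fair die: sum of x * p(x) with p(x) = 1/6.
faces = [1, 2, 3, 4, 5, 6]
p = 1 / 6  # uniform probability mass on each face

expected = sum(x * p for x in faces)
# (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5, the familiar arithmetic mean
```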

Probability using Expected Value

Well, we recall Indicator Functions, which are defined in their own note.

For some event $A$ and indicator function $I_A$, we say the following:

$$P(A) = E(I_A(\omega)) = E(I_A)$$

As simple as that!
All the properties we've proved for Probability Measure (According to Kolmogorov) apply here.
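
The definition $P(A) = E(I_A)$ is easy to compute with directly. As a sketch (with a made-up sample space), take a fair die and the event $A$ = "the roll is even": the indicator maps each outcome $\omega$ to 1 if $\omega \in A$ and 0 otherwise, and its weighted average is exactly the probability of $A$.

```python
# P(A) = E(I_A) made concrete for a fair die and A = "roll is even".
omegas = [1, 2, 3, 4, 5, 6]  # sample space
p = 1 / 6                    # uniform weight on each outcome

def indicator_even(omega):
    """I_A: 1 if the outcome lies in A = {2, 4, 6}, else 0."""
    return 1 if omega % 2 == 0 else 0

# E(I_A) as a weighted average over the sample space
prob_even = sum(indicator_even(w) * p for w in omegas)
# equals 3/6 = 0.5, as expected
```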