Fundamentally, an expected value is an average: in the context of probability it represents the value we expect some Random Variable to take over many samples (potentially infinitely many), and it is computed by taking a weighted average. This approach lets us skip the measure-theoretic nonsense (jk, I like measure theory) and start talking about Probability Theory directly. That is a huge advantage if you're a starry-eyed learner who decided to learn probability for whatever reason and then gets hit with a wall of convergence theorems, limits, and Measure Theory concepts that must be front-loaded before you even get to see what probability means.
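To make "weighted average" concrete, here is a minimal Python sketch (my own illustration, not from the original note) for a fair six-sided die; the variable names are arbitrary.

```python
# Expected value as a weighted average: E[X] = sum over x of x * P(X = x).
# Hypothetical example: a fair six-sided die.
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6  # each face equally likely; the weights sum to 1

expected_value = sum(x * p for x, p in zip(values, probs))
print(expected_value)  # 3.5, the average face we expect over many rolls
```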
Most of these properties are intuitive except the last. The last property essentially states that if $(X_n)$ is an increasing sequence with respect to $n$, the expected value of $X_n$ will grow towards (or approach from the left) $\mathbb{E}[X]$ as $n$ approaches infinity. And the $X$ here, which seems to come out of nowhere, is understood to be the pointwise limit of the sequence $(X_n)$, in other words what $X_n$ becomes when $n$ gets really big, so $X = \lim_{n \to \infty} X_n$.
The idea is that $X$ encapsulates the entire sequence $(X_n)$, and for that reason we drop the index.
I currently understand it as follows: $X$ is the pointwise limit of the sequence $(X_n)$, which is increasing. It is so labelled because it encapsulates the behaviour of the entire sequence $(X_n)$; for that reason its definition is $X = \lim_{n \to \infty} X_n$, the value of $X_n$ as $n$ approaches infinity. Now this limit could exist (if $(X_n)$ converges) or not. But since we always assume that each $X_n$ is non-negative (no variables cancelling each other out possible) and the sequence is non-decreasing (no oscillations possible), it follows that the limit always exists in $[0, +\infty]$ (in probability, since we operate in the extended real number system, a limit that is $+\infty$ or $-\infty$ is also said to exist). If the limit is finite, then $\mathbb{E}[X]$ is effectively an upper bound on the expected values of our sequence: since $(X_n)$ is increasing towards $X$, taking the expected value of a part can never be bigger than taking the expected value of the whole, by extension of axiom 1 (and the fact that the sequence is increasing). If it is not finite, it is still an upper bound, but we will require more analysis to see whether such a result is expected or whether it is indicative of a wrong model. Whichever it is will be meaningful for us.
After all, now that we have found our maximum, it follows that under the aforementioned conditions, as we take more and more terms of the sequence, the expected values can only increase (or fill up) towards that upper bound.
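To see this "filling up towards the upper bound" behaviour numerically, here is a small illustrative Python sketch, assuming we take $X \sim \text{Exponential}(1)$ and the truncated sequence $X_n = \min(X, n)$, which is non-negative, non-decreasing, and converges pointwise to $X$; the Monte Carlo estimates of $\mathbb{E}[X_n]$ creep up towards $\mathbb{E}[X] = 1$ from below.

```python
import random

random.seed(0)

# X ~ Exponential(1), so E[X] = 1.  The truncated sequence X_n = min(X, n)
# is non-negative, non-decreasing in n, and converges pointwise to X.
samples = [random.expovariate(1.0) for _ in range(100_000)]

def estimate_E_Xn(n):
    """Monte Carlo estimate of E[min(X, n)]."""
    return sum(min(x, n) for x in samples) / len(samples)

for n in [0.5, 1, 2, 4, 8]:
    print(f"n = {n}: E[X_n] ~= {estimate_E_Xn(n):.4f}")
# The estimates increase with n and approach E[X] = 1 from below.
```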
Fundamentally, these are axioms, so you can just take them as fact and go on your merry way. But if you need intuition (and a rigorous proof), especially for the Monotone Convergence Theorem, know that it is a theorem from Measure Theory and is explained in its own note.
So?
With the above established, we can say that as long as $\mathbb{E}[\cdot]$ is defined in such a way that all of these properties are maintained, it is an Expected Value. Obviously, if whatever definition of "expected value" we have fails one of those properties, it is NOT an expected value.
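As a hypothetical sanity check (assuming linearity is among the properties listed above), the sketch below numerically probes one candidate definition: the sample mean behaves additively, while the sample median does not, so "expectation = median" would fail that property and therefore would not be an expected value in this sense.

```python
import random
import statistics

random.seed(1)

# A candidate "expectation" should satisfy linearity: E[X + Y] = E[X] + E[Y].
X = [random.expovariate(1.0) for _ in range(50_000)]   # skewed sample
Y = [random.expovariate(0.5) for _ in range(50_000)]   # another skewed sample
S = [x + y for x, y in zip(X, Y)]

# Sample mean: additive up to Monte Carlo noise, consistent with linearity.
print(statistics.mean(S), statistics.mean(X) + statistics.mean(Y))

# Sample median: clearly NOT additive here (about 2.46 vs about 2.08), so
# defining "expected value" as the median would fail this property.
print(statistics.median(S), statistics.median(X) + statistics.median(Y))
```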
From there you see that the standard definition of expected values, namely

Discrete case: $\mathbb{E}[X] = \sum_{x} x \, p_X(x)$

Continuous case: $\mathbb{E}[X] = \int_{-\infty}^{\infty} x \, f_X(x) \, dx$

will hold all the above properties, and since the weights $p_X(x)$ (and $f_X(x)\,dx$) come from probability measures (not yet defined here, don't worry) and sum, or integrate, to $1$, we are effectively taking a weighted average.
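As a rough numerical sketch of the continuous formula (my own illustration; the discrete case is exactly the die-style weighted sum shown earlier), approximating $\int x \, f_X(x)\,dx$ by a Riemann sum for the Exponential(1) density recovers its mean of $1$:

```python
import math

# Continuous case: E[X] = integral of x * f_X(x) dx, approximated by a Riemann
# sum.  Here f_X(x) = e^{-x} on [0, infinity) (the Exponential(1) density),
# so the true expected value is 1.
dx = 0.001
E_continuous = sum(i * dx * math.exp(-i * dx) * dx for i in range(1, 50_000))
print(round(E_continuous, 3))  # ~1.0; the weights f_X(x) dx integrate to 1
```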