Let me put it this way:
Define $A$ = heads turns up, $B$ = tails turns up.
If I toss a coin, it is natural to say that the probabilities of $A$ and $B$ are $P(A)=P(B)=\frac{1}{2}$.
Why can't we assign, say, $0.6$ to the probability of each event? That is, $P(A)=P(B)=0.6$.
We can't do this because then $P(A)+P(B)>1$, but why must the total be $1$?
Is there a fundamental reason that the probability of an event must be a number between $0$ and $1$? Or is $1$ just a convention we adopted as a scale?
For any event $A$, a certain event $B$, and an impossible event $C$, where $A$, $B$ and $C$ are all independent, we need $A$ and $B$ happening to be as probable as $A$, $B$ and $C$ happening to be as probable as $C$, and $A$ and $C$ happening to be as probable as $C$. Written out with the definition of independence, this means that:
$$P(AB) = P(A)P(B) = P(A)$$ $$P(BC) = P(B)P(C) = P(C)$$ $$P(AC) = P(A)P(C) = P(C)$$
The events $A$ and $C$ are also disjoint ($C$ won't happen whenever $A$ happens because $C$ can't happen), and since we need the probability of either happening to equal the probability of just $A$ happening, we need: $$P(A \cup C) = P(A) + P(C) = P(A)$$
These are all true only if $P(B) = 1$ and $P(C) = 0$. Put differently, in order for independence to distribute through probabilities, we need certainty to correspond with the multiplicative identity 1 and impossibility to correspond with the additive identity 0. Formally, this is true in any probability space where the events form a field.
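A quick numeric check of this argument (a sketch in Python; the specific value $P(A)=0.3$ and the candidate grid are illustrative, not from the answer):

```python
# Check which values of P(B) and P(C) satisfy the independence and
# disjointness constraints above, for a generic event A with P(A) = 0.3.
p_a = 0.3  # any value strictly between 0 and 1 would do

def constraints_hold(p_b, p_c):
    """True iff P(A)P(B) = P(A), P(B)P(C) = P(C),
    P(A)P(C) = P(C), and P(A) + P(C) = P(A)."""
    return (p_a * p_b == p_a and
            p_b * p_c == p_c and
            p_a * p_c == p_c and
            p_a + p_c == p_a)

# Among a grid of candidate values, only P(B) = 1, P(C) = 0 works:
solutions = [(b, c)
             for b in (0.0, 0.5, 0.6, 1.0)
             for c in (0.0, 0.5, 0.6, 1.0)
             if constraints_hold(b, c)]
print(solutions)  # [(1.0, 0.0)]
```

So certainty is forced to the multiplicative identity and impossibility to the additive identity, exactly as the answer argues.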
Each of the several definitions of "probability" implies that the probabilities of disjoint events can sum to at most $1$.
If you are tossing a fair coin then by definition of "fair" each of the two possibilities is equally likely. Then the probability of $H$ is $$ \frac{\text{number of possible outcomes with } H}{\text{number of possible outcomes}} = \frac{1}{2}. $$
If you have a real coin and you don't know whether or not it's fair then you toss it many times to figure out the probability using that kind of fraction, where you use the total number of heads seen for the numerator and the total number of tosses as the denominator. That might come out to be $0.6$. Since you always see a tail when you don't see a head, that must have happened $40\%$ of the time.
Either definition must always result in a fraction between $0$ and $1$. In either case the probability will be $0$ if the numerator event never happens and $1$ if it always happens.
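As a sketch of that frequentist estimate in Python (the bias of $0.6$ toward heads is a hypothetical value, matching the question's example):

```python
import random

random.seed(0)

# Simulate many tosses of a coin biased 0.6 toward heads and estimate
# P(H) as the fraction of tosses that came up heads.
n_tosses = 100_000
heads = sum(random.random() < 0.6 for _ in range(n_tosses))

p_heads = heads / n_tosses
p_tails = (n_tosses - heads) / n_tosses

# Both estimates are fractions between 0 and 1, and they sum to 1
# (up to floating-point rounding), since every toss is heads or tails.
print(p_heads, p_tails, p_heads + p_tails)
```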
One advantage to having the outcome probabilities for a (discrete) sample space sum to $1$ is that then the probability of an outcome (or event) is the same as the (expected or long-term) relative frequency of the outcome (or event).
For instance, in your coin example, having $P(H)=P(T)=0.6$ may be well and good for some purposes; but still you would get heads and tails each with a relative frequency of $0.5$.
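A sketch of this point in Python: `random.choices` interprets its weights as relative sizes, so giving each face "probability" $0.6$ is silently renormalized, and heads still turns up about half the time:

```python
import random

random.seed(0)

# Assign both faces the same "probability" 0.6. random.choices only
# uses the relative sizes of the weights, so this behaves exactly
# like a fair coin.
tosses = random.choices(["H", "T"], weights=[0.6, 0.6], k=100_000)
freq_heads = tosses.count("H") / len(tosses)
print(freq_heads)  # close to 0.5, not 0.6
```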