Conditional Probability

Conditional probability is all about updating a probability based on some knowledge of the situation being modeled. For instance, does your personal probability of carrying an umbrella with you today change depending on whether or not it is raining outside?

There is a formula to update a probability based on surrounding contextual information. If this extra information has something to say about the probability of interest, then the probability might change. On the other hand, if the extra information is completely irrelevant, then we expect the probability not to change at all.

Let's first review some notation. For a random variable $X$ representing some distribution and an arbitrary set $A$, the probability that the random variable takes on values in the set is written

$$P_X(A).$$

The notation and language can vary across different sources. For example, it's also common just to see $P(A)$ used instead of $P_X(A)$. This is not unreasonable for most introductions to conditional probability (including this one), since the dependence on the random variable $X$ is of secondary importance.

Further, it's common to use language such as the event $A$, instead of the set $A$. This is purely semantics, as we're referring to the same thing, a set, with two different words. However, it does make talking about conditional probability a bit nicer. By referring to sets as events, the outcome of rolling a die is an event that happened, rather than the more verbose "we observed the outcome of the die and the outcome corresponds to the elements of the set".

This introduction to conditional probability will adopt these notational and language conveniences, so that we can better understand questions like

What's the conditional probability that the sum of two dice is at least 6, given that one die shows 3?

For this question, the events are "the sum of the two dice is at least $6$", which we denote $A$, and "one die shows a $3$", which we denote $B$. Given these events, we are interested in the probability of $A$ given $B$, written $P(A \mid B)$.

Formula

Let $A, B$ be two events (sets). The conditional probability of $A$ given the event $B$ is

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)},$$

provided $P(B) > 0$, where $A \cap B$ is the intersection of the two events -- that is, the set of outcomes that are simultaneously in $A$ and $B$.

Notice that the event claimed to be known is called "given" and follows the pipe character $\mid$.
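To make the formula concrete, here is a minimal Python sketch (the helper name `cond_prob` is my own, not from any standard library) that computes a conditional probability from the two probabilities the formula requires:

```python
from fractions import Fraction

def cond_prob(p_a_and_b, p_b):
    """P(A | B) = P(A intersect B) / P(B), defined only when P(B) > 0."""
    if p_b == 0:
        raise ValueError("cannot condition on an event of probability zero")
    return Fraction(p_a_and_b) / Fraction(p_b)

# Using the numbers worked out in Example 1 below:
print(cond_prob(Fraction(7, 36), Fraction(11, 36)))  # 7/11
```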

Example 1

What's the conditional probability that the sum of two dice is at least 6, given that one die shows 3?

Let $A$ represent the event that the sum of the two dice is at least 6, and let $B$ represent the event that one die shows a 3.

Then we are interested in finding $P(A \mid B)$, which is just a matter of using the formula.

The numerator, $P(A \cap B)$, is the probability that the two dice sum to at least $6$ and one die shows a $3$. If one die shows a $3$, then there are $4$ possible values of the other die, namely $3, 4, 5, 6$, for which the sum of the two dice is at least $6$. Counting ordered pairs, these correspond to the $7$ outcomes $(3,3)$, $(3,4)$, $(4,3)$, $(3,5)$, $(5,3)$, $(3,6)$, $(6,3)$. Since there are $36$ possible outcomes of two dice, the numerator probability is $P(A \cap B) = 7/36$.

The difference between $P(A \cap B)$ and $P(A \mid B)$ is subtle, but meaningful. Think of $P(A \cap B)$ as two dice being rolled simultaneously and you see the values of both dice at once.

On the other hand, for $P(A \mid B)$, imagine the event $B$ happens first. Only one die is rolled and you see a $3$[1]. Now that you know $B$, what is the probability that upon rolling the second die the sum of the dice is at least $6$? Mathematically, this is written as the conditional probability $P(A \mid B)$.

For the probability of $B$, $P(B)$, the total number of outcomes is still $36$ even though the event of interest is just about one die. There are only $11$ possible pairs where one of the dice shows a $3$, so $P(B) = 11/36$.

It is only in the conditional probability $P(A \mid B)$ that the number of possible outcomes changes, effectively from $36$ to $11$. Since $B$ is given, we know the value of one of the dice. There are no longer outcomes like $(1, 2)$, $(4, 5)$, $(6, 6)$, since none of these have one of the dice showing a $3$.

Of the $11$ possible pairs where one die is showing a $3$, only $7$ of them sum to at least $6$. Hence, the conditional probability is

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{7/36}{11/36} = \frac{7}{11}.$$
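As a sanity check, we can enumerate all $36$ outcomes in a few lines of Python (a sketch assuming two fair, distinguishable dice) and confirm the counts of $11$ and $7$:

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

b = [o for o in outcomes if 3 in o]         # B: one die shows a 3
a_and_b = [o for o in b if sum(o) >= 6]     # A intersect B: ... and the sum is at least 6

print(len(b), len(a_and_b))                 # 11 7
print(Fraction(len(a_and_b), len(b)))       # 7/11
```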

Example 2

A survey of students at a university revealed the following data about class standing and place of residence[2].

| Residence \ Class | Freshman | Sophomore | Junior | Senior |
|---|---|---|---|---|
| Dormitory | 89 | 34 | 46 | 15 |
| Apartment | 32 | 17 | 22 | 48 |
| With Parents | 13 | 31 | 3 | 0 |

Suppose you meet a fellow student of the university from which the survey was taken and you learn that this student lives in an apartment. What is the probability that this student is a sophomore?

Such a scenario is more commonly written as, "What is the probability that a student is a sophomore given that the student lives in an apartment?"

In either case, this is a question about conditional probability.

Let's first define our events. Let $A$ represent a student being a sophomore, and let $B$ represent a student who lives in an apartment. To find the conditional probability $P(A \mid B)$, we first need to find $P(A \cap B)$ and $P(B)$.

The probability that a student is a sophomore and lives in an apartment is $P(A \cap B) = 17/350$, since $17$ of the $350$ students surveyed are sophomores living in an apartment.

The probability that a student (of any class) lives in an apartment is $P(B) = (32 + 17 + 22 + 48)/350 = 119/350$.

With these two probabilities, we can compute

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{17/350}{119/350} = \frac{17}{119} = \frac{1}{7}.$$
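The same computation in Python, with the survey counts stored in a nested dictionary (a sketch; the variable names are my own):

```python
from fractions import Fraction

# Survey counts: residence by class standing.
counts = {
    "Dormitory":    {"Freshman": 89, "Sophomore": 34, "Junior": 46, "Senior": 15},
    "Apartment":    {"Freshman": 32, "Sophomore": 17, "Junior": 22, "Senior": 48},
    "With Parents": {"Freshman": 13, "Sophomore": 31, "Junior": 3,  "Senior": 0},
}

total = sum(sum(row.values()) for row in counts.values())  # 350 students
apartment = sum(counts["Apartment"].values())              # 119 in apartments
soph_apartment = counts["Apartment"]["Sophomore"]          # 17 sophomores in apartments

# The totals cancel: (17/350) / (119/350) = 17/119 = 1/7.
print(Fraction(soph_apartment, apartment))                 # 1/7
```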

Practice

Suppose two fair dice are rolled[3].

a. What is the conditional probability that one turns up two, given they show different numbers?

b. What is the conditional probability that the first turns up six, given that the sum is $k$, for each $k$ from two through twelve?

c. What is the conditional probability that at least one turns up six, given that the sum is $k$, for each $k$ from two through twelve?
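If you'd like to check your answers, here is a brute-force sketch that enumerates all $36$ outcomes; `cond_prob` is a hypothetical helper (my own naming) taking two predicates, the event of interest and the given event:

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # two fair dice

def cond_prob(a, b):
    """P(A | B) for predicates a and b over the 36 equally likely outcomes."""
    given = [o for o in outcomes if b(o)]
    return Fraction(sum(1 for o in given if a(o)), len(given))

# Part a: one die shows a two, given the dice show different numbers.
print(cond_prob(lambda o: 2 in o, lambda o: o[0] != o[1]))

# Parts b and c, for each sum k from two through twelve.
for k in range(2, 13):
    given_sum = lambda o, k=k: sum(o) == k  # bind k per iteration
    print(k, cond_prob(lambda o: o[0] == 6, given_sum),
             cond_prob(lambda o: 6 in o, given_sum))
```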

Independence

Two events, $A, B$, are said to be independent if the following holds:

$$P(A \cap B) = P(A) \, P(B).$$

Notice that $A \cap B$ must be non-empty; there needs to be some overlap between the two events (sets) $A, B$, because if there is nothing in their intersection then $P(A \cap B) = 0$. So long as $P(A) > 0$ and $P(B) > 0$, such disjoint events $A, B$ can't be independent.

This definition also coincides nicely with the definition of conditional probability. If $A, B$ are independent, then

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A) \, P(B)}{P(B)} = P(A),$$

so that indeed, there is no information in $B$ that might change the probability of $A$.
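We can see this numerically with two fair dice (a sketch; the events chosen here, "the first die is even" and "the second die is greater than three", are my own example):

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

a = lambda o: o[0] % 2 == 0  # A: the first die is even
b = lambda o: o[1] > 3       # B: the second die is greater than 3

# Independence: P(A intersect B) = P(A) P(B) = 1/2 * 1/2 = 1/4.
print(prob(lambda o: a(o) and b(o)) == prob(a) * prob(b))  # True
```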

Law of Total Probability

Consider a picture of a sample space carved into disjoint events $B_1, B_2, \ldots, B_n$ that together cover the entire space, along with another event $A$ overlapping the $B_i$.

Notice that each $B_i$ for $i \in [n]$[4] intersects $A$. We can write $A$ as the union of the intersections of $A$ with each $B_i$:

$$A = \bigcup_{i=1}^{n} (A \cap B_i).$$

The sets $A \cap B_1, \ldots, A \cap B_n$ are all disjoint, so the axioms of probability say we can find the probability of $A$ by summing up all of the unioned intersections:

$$P(A) = \sum_{i=1}^{n} P(A \cap B_i).$$

By the formula for conditional probability, we can write

$$P(A \cap B_i) = P(A \mid B_i) \, P(B_i)$$

or

$$P(A \cap B_i) = P(B_i \mid A) \, P(A).$$

The Law of Total Probability says that we can find $P(A)$ by putting the above formulas together to get

$$P(A) = \sum_{i=1}^{n} P(A \mid B_i) \, P(B_i).$$
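As code, the law is just a weighted sum; here is a small sketch (the function name and the example numbers are mine, chosen only for illustration):

```python
def total_probability(p_a_given_b, p_b):
    """P(A) = sum_i P(A | B_i) P(B_i), where the B_i partition the space."""
    assert abs(sum(p_b) - 1) < 1e-9, "the P(B_i) must sum to 1"
    return sum(pa * pb for pa, pb in zip(p_a_given_b, p_b))

# A partition into two events with P(B_1) = 0.4 and P(B_2) = 0.6:
print(total_probability([0.5, 0.25], [0.4, 0.6]))  # 0.35
```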

Example 1

Suppose we conduct the following experiment[5]. First, we flip a fair coin. If heads comes up, then we roll one die and take the result. If tails comes up, then we roll two dice and take the sum of the two results. What is the probability that this process yields a $3$?

Let $X$ represent the random variable whose value we are interested in being $3$. We want to find $P(X = 3)$, but we don't know the distribution of $X$. Nonetheless, we can use the Law of Total Probability.

Since the coin is fair, we know $P(H) = P(T) = 1/2$. If $H$ is flipped, there's a probability of $1/6$ that we get a $3$ from rolling one die. If $T$ is flipped, there's a probability of $2/36$ that we get a $3$ from summing the outcomes of two dice, since only the outcomes $(1, 2)$ and $(2, 1)$ sum to $3$. Therefore,

$$P(X = 3) = P(X = 3 \mid H) \, P(H) + P(X = 3 \mid T) \, P(T) = \frac{1}{6} \cdot \frac{1}{2} + \frac{2}{36} \cdot \frac{1}{2} = \frac{1}{9}.$$
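A quick Monte Carlo simulation (a sketch using Python's standard random module) agrees with the answer of $1/9 \approx 0.111$:

```python
import random

def trial(rng):
    """Flip a fair coin: heads -> roll one die, tails -> sum of two dice."""
    if rng.random() < 0.5:
        return rng.randint(1, 6)
    return rng.randint(1, 6) + rng.randint(1, 6)

rng = random.Random(0)
n = 1_000_000
print(sum(trial(rng) == 3 for _ in range(n)) / n)  # close to 1/9
```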

Example 2

Suppose[6] the AFC Bournemouth forward Justin Kluivert shoots $70\%$ of his penalty kicks to the goal keeper's left and $30\%$ to the keeper's right. And suppose Kluivert is about to shoot a penalty kick against the Aston Villa keeper Emiliano Martínez. It's known that Martínez blocks $20\%$ of shots kicked to the left and $40\%$ of shots kicked to the right. What is the probability that Martínez blocks Kluivert's shot?

Let $L, R$ be the events that Kluivert kicks the ball to the left and right, respectively. Let $B$ be the event that the shot is blocked.

With this notation, we have $P(L) = 0.7$, $P(R) = 0.3$, $P(B \mid L) = 0.2$, and $P(B \mid R) = 0.4$. Then

$$P(B) = P(B \mid L) \, P(L) + P(B \mid R) \, P(R) = 0.2 \cdot 0.7 + 0.4 \cdot 0.3 = 0.26.$$
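In code, the computation is a one-line weighted sum (using the made-up numbers from the setup):

```python
# Law of Total Probability with the made-up penalty-kick numbers.
p_left, p_right = 0.7, 0.3               # where Kluivert shoots
p_block_left, p_block_right = 0.2, 0.4   # Martinez's block rates by side

p_block = p_block_left * p_left + p_block_right * p_right
print(p_block)  # 0.26
```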

Bayes' Theorem

Bayes' Theorem is often described as the way to reverse conditional probabilities: if you have $P(A \mid B)$ but want $P(B \mid A)$, then this is the formula to use. This description is an oversimplification, because you need a bit more information than that to use the formula.

Symbolically, it's just conditional probability applied twice, together with the Law of Total Probability:

$$P(B \mid A) = \frac{P(A \mid B) \, P(B)}{P(A)} = \frac{P(A \mid B) \, P(B)}{P(A \mid B) \, P(B) + P(A \mid B^c) \, P(B^c)},$$

where $B^c$ denotes the complement of $B$.

Example 1

A factory produces electrical components using two machines: Machine $A$ and Machine $B$. On any given day, Machine $A$ produces $60\%$ of the total output, while Machine $B$ produces the remaining $40\%$. It is known that $4\%$ of the components produced by Machine $A$ are defective, whereas only $2\%$ of the components produced by Machine $B$ are defective. If a randomly selected component is found to be defective, what is the conditional probability that it was produced by Machine $A$?

Let $D$ stand for defective. We want to find $P(A \mid D)$, the probability that a component found to be defective was produced by Machine $A$.

We know that $60\%$ of components are produced by Machine $A$, so $P(A) = 0.6$, and $P(B) = 0.4$. Further, we have that $4\%$ of the components produced by Machine $A$ are defective, $P(D \mid A) = 0.04$, and $P(D \mid B) = 0.02$.

We can now put these pieces together to get

$$P(A \mid D) = \frac{P(D \mid A) \, P(A)}{P(D \mid A) \, P(A) + P(D \mid B) \, P(B)} = \frac{0.04 \cdot 0.6}{0.04 \cdot 0.6 + 0.02 \cdot 0.4} = \frac{0.024}{0.032} = 0.75.$$
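The same arithmetic as a short Python sketch, with the denominator computed via the Law of Total Probability:

```python
# Bayes' Theorem for the defective-component example.
p_a, p_b = 0.6, 0.4                     # share of output from each machine
p_d_given_a, p_d_given_b = 0.04, 0.02   # defect rate of each machine

p_d = p_d_given_a * p_a + p_d_given_b * p_b  # P(D), Law of Total Probability
print(p_d_given_a * p_a / p_d)               # P(A | D) = 0.75
```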

Example 2

Some doctors recommend that men over the age of $50$ undergo screening using a test called prostate specific antigen (PSA) to screen for possible prostate cancer. For the PSA test, we know that when the disease is present, $90\%$ of the time the test is positive. Further, when the disease is not present, $90\%$ of the time the test is negative. Suppose the probability that a man over $50$ has the disease is $0.005$. Given a positive test result, what is the probability that a randomly selected patient has the disease?

Our goal is to find $P(D \mid +)$, where $D$ is the event that the disease is present and $+$ denotes a positive test result.

Since $P(- \mid D^c) = 0.9$, we know that $P(+ \mid D^c) = 0.1$. Similarly, we're given $P(+ \mid D) = 0.9$. Therefore,

$$P(D \mid +) = \frac{P(+ \mid D) \, P(D)}{P(+ \mid D) \, P(D) + P(+ \mid D^c) \, P(D^c)} = \frac{0.9 \cdot 0.005}{0.9 \cdot 0.005 + 0.1 \cdot 0.995} \approx 0.043.$$

The surprising part of this example is the relatively low probability of having the disease even after a positive test result. This fact is due to the incredibly low base rate of anybody having the disease, $P(D) = 0.005$.
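To see the base-rate effect concretely, here is a simulation sketch (standard library only): most positive tests come from the large disease-free population, so only a few percent of positives are true positives.

```python
import random

rng = random.Random(0)
n = 1_000_000
positives = true_positives = 0

for _ in range(n):
    diseased = rng.random() < 0.005        # base rate P(D)
    if diseased:
        positive = rng.random() < 0.9      # sensitivity P(+ | D)
    else:
        positive = rng.random() < 0.1      # false-positive rate P(+ | D^c)
    if positive:
        positives += 1
        true_positives += diseased

print(true_positives / positives)  # close to 0.043
```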


  1. We can relate the difference between $P(A \cap B)$ and $P(A \mid B)$ to our example of how likely you are to carry an umbrella with you today. The probability $P(A \cap B)$ describes the probability that you will carry an umbrella with you today and it is raining, before you have looked outside. The probability $P(A \mid B)$ describes the probability that you carry an umbrella with you today, after having looked outside and observed that it is raining. ↩ī¸Ž

  2. This example was borrowed from Maxie Inigo, Jennifer Jameson, Kathryn Kozak, Maya Lanzetta, & Kim Sonier, taken from the LibreTexts page Conditional Probabilities on 2025-03-24. ↩ī¸Ž

  3. This example was borrowed from Paul Pfeiffer, taken from the LibreTexts page Problems on Conditional Probability on 2025-03-24. ↩ī¸Ž

  4. The notation $[n]$ is the set of all integers from $1$ to $n$ including both $1$ and $n$, namely $\{1, 2, \ldots, n\}$. ↩ī¸Ž

  5. This example was borrowed from Eric Lehman, F. Thomson Leighton, and Albert R. Meyer, taken from the LibreTexts page The Law of Total Probability on 2025-03-31. ↩ī¸Ž

  6. All of these numbers are made up. ↩ī¸Ž