Conditional probability is a difficult topic for students to master. Often counter-intuitive, its central laws are composed of abstract terms and complex equations that do not immediately mesh with subjective intuitions of experience.

I. Definition

In terms of practical range, probability theory is comparable with geometry; both are branches of applied mathematics that are directly linked with the problems of daily life. But while almost anyone has some natural feel for geometry, many people have trouble developing a good intuition for probability. Probability and intuition do not always agree.

In probability theory, we see two types of probabilities: absolute (or unconditional) probabilities and conditional probabilities. Absolute probabilities are probabilities of the form P(A) (read "the probability of A"), while conditional probabilities are probabilities of the form P(A|B) (read "the probability of A, given B"). These two types of probability can be defined in terms of each other.
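One common way to make the link between the two types explicit is sketched below, writing the conditional probability of A given B as P(A | B). The symbol Ω for the whole sample space and the notation A ∩ B for "both A and B occur" are assumptions introduced here for illustration; they do not appear in the text above.

```latex
P(A \mid B) = \frac{P(A \cap B)}{P(B)} \quad \text{for } P(B) > 0,
\qquad
P(A) = P(A \mid \Omega) = \frac{P(A \cap \Omega)}{P(\Omega)} = \frac{P(A)}{1}.
```

Read from left to right, the first identity builds a conditional probability out of absolute ones, and the second recovers an absolute probability as a conditional probability given the entire sample space.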

1. The Law of Large Numbers

Intuitively, the probability of an event is supposed to measure the long-term relative frequency of the event. This concept was in fact taken as the definition of probability by Richard von Mises.

Suppose that we repeat an experiment indefinitely. (Note that this actually creates a new, compound experiment.) For an event A, let N_n(A) denote the number of times A occurred in the first n runs. (Note that this is a random variable in the compound experiment.) Thus,

P_n(A) = N_n(A) / n

is the relative frequency of A in the first n runs (it is also a random variable in the compound experiment). If we have chosen the correct probability measure for the experiment, then in some sense we expect that the relative frequency of each event should converge to the probability of the event:

P_n(A) → P(A) as n → ∞

The precise statement of this convergence is the law of large numbers, or law of averages, and it is one of the fundamental theorems in probability.
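The convergence of relative frequencies can be illustrated with a small simulation. The sketch below is only an illustration, not a proof: it assumes a fair coin with P(A) = 0.5 for the event A = "heads", and the function name `relative_frequency` and the fixed seed are hypothetical choices made here so that the run is reproducible.

```python
import random


def relative_frequency(n, seed=0):
    """Return the relative frequency of heads in n simulated fair-coin tosses."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    heads = 0
    for _ in range(n):
        if rng.random() < 0.5:  # event A: the toss lands heads
            heads += 1
    return heads / n  # number of occurrences of A divided by number of runs


# Relative frequencies for increasingly long runs of the experiment.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(n))
```

As n grows, the printed values should settle near 0.5, although any single run can still deviate slightly.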


Sample Problem 1

Find the probability that when two standard 6-sided dice are rolled, the sum of the numbers on the top faces is 5.

Solution:  There are 6 · 6 = 36 possible outcomes when we roll a pair of dice. We can list the outcomes in which the sum of the top faces is 5:


Value on Die 1    Value on Die 2    Total
      1                 4             5
      2                 3             5
      3                 2             5
      4                 1             5

We can now reach an answer by dividing the number of successful outcomes by the total number of possible outcomes:

P(sum of 5) = (number of successful outcomes) / (total number of possible outcomes) = 4/36 = 1/9
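The same count can be checked by brute-force enumeration. This is a minimal sketch that assumes, as the solution does, that all 36 ordered outcomes are equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes (die 1, die 2) for two standard 6-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Outcomes whose top faces sum to 5.
favorable = [roll for roll in outcomes if sum(roll) == 5]

# Exact probability as a fraction: favorable outcomes over total outcomes.
p_sum_5 = Fraction(len(favorable), len(outcomes))

print(favorable)  # [(1, 4), (2, 3), (3, 2), (4, 1)]
print(p_sum_5)    # 1/9
```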


2. Conditional probability

Suppose we assign a distribution function to a sample space and then learn that an event E has occurred.  How should we change the probabilities of the remaining events?  We call the new probability for an event F the conditional probability of F given E and denote it by P(F|E).

The probability of an event a that is conditional on another event b, written P(a|b), is defined to be P(ab)/P(b), where ab is the conjunction of a and b. (The conditional probability is undefined when the probability of b is zero.) For example, the probability of obtaining three heads on three successive tosses of a coin, conditional on the first toss yielding heads, is the probability of obtaining three heads in a row, one eighth, divided by the probability of obtaining heads on the first toss, one half, which gives a final result of one quarter.
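The coin example can be reproduced by applying the definition directly to the eight equally likely sequences of three tosses. The sketch below assumes a fair coin; the helper names `prob` and `cond_prob` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from itertools import product


def prob(event, space):
    """Probability of an event over a finite space of equally likely outcomes."""
    return Fraction(sum(1 for w in space if event(w)), len(space))


def cond_prob(a, b, space):
    """P(a|b) = P(ab) / P(b); undefined (raises) when P(b) is zero."""
    p_b = prob(b, space)
    if p_b == 0:
        raise ValueError("P(a|b) is undefined when P(b) = 0")
    return prob(lambda w: a(w) and b(w), space) / p_b


def three_heads(w):
    return w == ("H", "H", "H")


def first_heads(w):
    return w[0] == "H"


# The eight equally likely outcomes of three tosses of a fair coin.
tosses = list(product("HT", repeat=3))

print(prob(three_heads, tosses))                    # 1/8
print(prob(first_heads, tosses))                    # 1/2
print(cond_prob(three_heads, first_heads, tosses))  # 1/4
```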

If students are to acquire the mathematical skills necessary for rational judgment, teaching must focus on challenging the personal biases and cognitive heuristics identified by psychologists. Teaching must also demonstrate the power of probabilistic reasoning in the most accessible way.


Sample Problem 2

From a data table, one finds that in a population of 100,000 females, 89.835% can expect to live to age 60, while 57.062% can expect to live to age 80. Given that a woman has reached age 60, what is the probability that she lives to age 80? This is an example of a conditional probability. In this case, the original sample space can be thought of as a set of 100,000 females. The events E and F are the subsets of the sample space consisting of all women who live at least 60 years, and at least 80 years, respectively. We consider E to be the new sample space, with F as a subset of E. Thus, the size of E is 89,835, and the size of F is 57,062. So, the probability in question equals 57,062/89,835 ≈ 0.6352. Thus, a woman who is 60 has a 63.52% chance of living to age 80.
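The arithmetic can be verified in a few lines. This sketch takes the two percentages from the problem statement and converts them to head counts in the population of 100,000; the variable names are illustrative.

```python
population = 100_000
alive_at_60 = 89_835  # 89.835% of the population: the event E
alive_at_80 = 57_062  # 57.062% of the population: the event F, a subset of E

p_e = alive_at_60 / population        # P(E)
p_f_and_e = alive_at_80 / population  # P(F and E) = P(F), since F lies inside E

# Conditional probability of living to 80, given survival to 60.
p_f_given_e = p_f_and_e / p_e

print(round(p_f_given_e, 4))  # 0.6352
```

Because F is contained in E, dividing the two probabilities gives the same result as dividing the two head counts directly.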


Sample Problem 3

Suppose that three slips of paper have the names Anne, Ben, and Carol written on them.  These are given at random to three people named Anne, Ben, and Carol.  What is the probability that exactly one person gets the paper matching his or her name?

Solution: There are six possible outcomes to this situation.

paper received by Anne (A)    paper received by Ben (B)    paper received by Carol (C)
Anne                          Ben                          Carol
Anne                          Carol                        Ben                            *
Ben                           Anne                         Carol                          *
Ben                           Carol                        Anne
Carol                         Anne                         Ben
Carol                         Ben                          Anne                           *

Each outcome has probability 1/6. There are exactly three outcomes (those marked with *) in which exactly one person gets the matching paper. The probability is then 3/6 = 1/2.
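The six outcomes can also be listed and counted programmatically. The sketch below assumes, as in the solution, that every way of handing out the three slips is equally likely; the helper name `matches` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations

people = ("Anne", "Ben", "Carol")

# Each permutation lists the slips received by Anne, Ben, and Carol, in that order.
assignments = list(permutations(people))


def matches(assignment):
    """Count how many people received the slip bearing their own name."""
    return sum(person == slip for person, slip in zip(people, assignment))


# Outcomes in which exactly one person holds the matching slip.
exactly_one = [a for a in assignments if matches(a) == 1]

for a in assignments:
    print(a, matches(a))

print(Fraction(len(exactly_one), len(assignments)))  # 1/2
```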


3. Logical probability

Many philosophers—Leibniz, von Kries, Keynes, Wittgenstein, Waismann, Carnap, and others—have tried to explain the following “logical” concept of conditional probability:

P(p|q) = the proportion of logically possible worlds in which both p and q are true / the proportion of logically possible worlds in which q is true

An obvious problem has been justifying a measure of the proportion of logically possible worlds in which a proposition is true.

Probability is of fundamental importance not only in logic but also in statistical and physical science.


II. Research in Conditional Probability

Recent research in cognitive and language development suggests that infants and young children are capable of complex computations and statistical inference.  The present studies investigated whether 4-year-old children can solve simple probabilistic reasoning problems.  Two experiments investigated children’s ability to use information from a sample to make generalizations about a population and vice versa.  Results suggest that even young children can use the random sampling assumption and base rate information in simple probabilistic reasoning tasks.  Future studies for addressing alternative interpretations and implications for learning and conceptual development are discussed. 
(Stephanie Denison, Kathleen Konopczynski, Vashti Garcia, Fei Xu. Probabilistic Reasoning in Preschoolers: Random Sampling and Base Rate)

Research has documented that students frequently hold misconceptions about probability that are not necessarily resolved by traditional instruction.  The purpose of this study was threefold:  survey college professors of introductory applied statistics about their awareness of students’ misconceptions; design and validate a test instrument to identify some prevalent misconceptions of probability; and investigate approaches and strategies used by college professors to facilitate the resolution of these misconceptions.  The results revealed that, in the opinion of survey respondents, misconceptions about probability interfere with students’ ability to master inferential statistics.  Some evidence was obtained that instructors who targeted misconceptions directly in instruction achieved better results in facilitating the resolution of misconceptions than those who used formal instruction. 
(Leonid Khazanov. An Investigation of Approaches and Strategies for Resolving Students’ Misconceptions about Probability in Introductory College Statistics.  Proceedings of the AMATYC 31st Annual Conference, San Diego, California, 2005, pp. 40-48.)

“We develop a scenario of probability teaching in Brazil: curricular issues, teacher formation, didactic books, and activities developed by researchers.  Next, we present the following inquiry:  which pedagogical actions may contribute to an effective improvement in probability teaching?  From our research, we conclude that, in the short time, such actions may include:  investing in the continuing process of teachers’ formation, and proposing pedagogical activities which allow the pupils to develop probabilistic reasoning, and then to formalize concepts.  According to these ideas, we have designed special workshops for teacher in-service.  Each of the workshops for a group of basic mathematics education teachers was carried out in three meetings of 4 hours each, in which we worked on several topics in probability and statistics.  In this paper, we describe the last meeting with the title “Influence of previous knowledge in a Bayesian approach”, in which we dealt with the following concepts: prior information; concepts of probability like classic, frequentist, geometric and conditional probability, and the Bayes’ theorem. Beyond the positive results reached in this workshop, we detected the need of planning and elaborating more didactical sequences that allow the pupils to improve their probabilistic reasoning, and to follow-up these teachers’ work at school in order to verify pupils’ reactions to  those kinds of activities, and therefore, making possible an effective improvement in probability teaching in basic education.”
(Verônica Yumi Kataoka, Ademária Aparecida de Souza, Anderson de Castro Soares de Oliveira, Fabrícia de Matos O. Fernandes, Patrícia Ferreira Paranaíba, Marcelo Silva de Oliveira. Probability Teaching in Brazilian Basic Education: Evaluation and Intervention)

The cyclic system is one of the most important models used to describe the real world and pervades daily life. The main contribution of this paper is a detailed discussion of probabilistic reasoning, especially the application of Bayesian networks, in the cyclic system. First, the authors analyze the cyclic system, from the rationality of its existence to its explicit description in the probabilistic reasoning domain. From this, they find that it is impossible to apply the Bayesian network to the cyclic system directly because of the special nature of the cyclic structure. Then, they define the virtual Bayesian network for the cyclic system and discuss concrete reasoning by means of it. As a result, they apply the main idea of the Bayesian network to the cyclic mechanism indirectly to handle the challenges to probabilistic reasoning.
(Kedian Mu, Zuoquan Lin. Probabilistic Reasoning in the Cyclic System)