ELEC 321

Conditional Probability

Updated 2017-09-28

Conditional Probability

The outcome could be any element of the sample space $\Omega$, but the range of possibilities is restricted due to partial information.

Partial Information: insufficient or fuzzy information about the outcome

Example: examples of partial information

Conditioning Event: the event, denoted $B$, that represents the partial information. The event of interest is denoted by $A$.

Example: rolling a die, with a conditioning event and an event of interest

(conditioning event)

(event of interest)

Example: the final grade of ELEC 321

(conditioning event)

(event of interest)

Definition of Conditional Probability

Suppose that the probability of $B$ is not 0, i.e. $P(B) > 0$; then

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$

This reads: “the probability of $A$ given $B$ equals the probability of $A$ and $B$, divided by the probability of $B$.”

Rearranging, we get useful formulas:

$$P(A \cap B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$$

Conditional Probability and Probability Axioms

$P(\cdot \mid B)$ is a function of $A$, and for fixed $B$ with $P(B) > 0$ (otherwise the axioms don't hold) it satisfies all of the probability axioms listed in module 1.

  1. If $A_1, A_2, \ldots$ are disjoint ($A_i \cap A_j = \emptyset$ for $i \neq j$), then: $P\left(\bigcup_i A_i \mid B\right) = \sum_i P(A_i \mid B)$

Example: die roll from above (assuming each side of the die is equally likely)
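The definition can be made concrete with a short Python sketch. The specific events used in the lecture were lost, so the ones below are hypothetical stand-ins: a fair die, conditioning on an even roll.

```python
from fractions import Fraction

# Hypothetical events (the symbols in the original notes were lost):
# roll a fair six-sided die; conditioning event B = "roll is even",
# event of interest A = "roll is a 2".
omega = {1, 2, 3, 4, 5, 6}   # sample space, all outcomes equally likely
B = {2, 4, 6}                # conditioning event
A = {2}                      # event of interest

def prob(event, space):
    """Equally likely outcomes: P(E) = |E| / |Omega|."""
    return Fraction(len(event), len(space))

# P(A | B) = P(A ∩ B) / P(B)
p_A_given_B = prob(A & B, omega) / prob(B, omega)
print(p_A_given_B)   # 1/3
```

Exact fractions make it easy to see that conditioning simply renormalizes over the restricted sample space $B$.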

Example: ELEC 321 grades

We suppose that each percentage is equally likely.

Screening Tests

Consider a screening test for defective iPhones. The screening test can result in either a positive or a negative result.

But the screening test itself can make two types of errors: it can come back positive for a non-defective iPhone (a false positive), or negative for a defective one (a false negative).

Given these outcomes, there are a total of four possible combinations of iPhone status and test result:

For iPhone status: defective ($D$) or not defective ($D^c$)

For test result: positive ($+$) or negative ($-$)

In this scenario, we will arbitrarily define the sensitivity (probability of a positive test given that the iPhone is defective) of the test to be 0.95, which also implies that the probability of a negative test given a defective iPhone is $1 - 0.95 = 0.05$.

We will also arbitrarily define the specificity (probability of a negative test given that the iPhone is not defective) of the test to be 0.99. Similarly, the probability that the test is positive when the device is not defective is $1 - 0.99 = 0.01$.

The proportion of defective items, $P(D)$, is also assumed known.

Given the conditions, we can compute:
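A small sketch of these computations, assuming a hypothetical defect rate $P(D) = 0.01$ (the notes say the proportion is known, but the value did not survive):

```python
# Screening-test joint probabilities via the multiplication rule.
# The defect rate p_d = 0.01 is an assumed value for illustration.
sens = 0.95   # P(+ | D), sensitivity from the notes
spec = 0.99   # P(- | D^c), specificity from the notes
p_d  = 0.01   # P(D), assumed

# Multiplication rule: P(+ ∩ D) = P(+ | D) P(D), etc.
p_pos_and_def    = sens * p_d
p_neg_and_def    = (1 - sens) * p_d
p_pos_and_nondef = (1 - spec) * (1 - p_d)
p_neg_and_nondef = spec * (1 - p_d)

# Total probability: P(+) = P(+ ∩ D) + P(+ ∩ D^c)
p_pos = p_pos_and_def + p_pos_and_nondef
print(round(p_pos, 5))   # 0.0194
```

The four joint probabilities sum to 1, which is a quick sanity check on the table.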

Bayes’ Theorem

Bayes’ theorem is a formula that describes how to update the probability of a hypothesis given some evidence.

Throughout, $H$ denotes the hypothesis and $E$ the evidence.

The simple form of Bayes’ formula is:

$$P(H \mid E) = \frac{P(E \cap H)}{P(E)} = \frac{P(E \mid H)\,P(H)}{P(E \mid H)\,P(H) + P(E \mid H^c)\,P(H^c)}$$

How did we get to the expression on the right? By definition, $P(H \mid E) = \frac{P(E \cap H)}{P(E)}$, in which $P(E \cap H) = P(E \mid H)\,P(H)$. The denominator can be broken down intuitively into $P(E) = P(E \cap H) + P(E \cap H^c)$, since $H$ and $H^c$ partition the sample space. And then we turn the intersections in the denominator into conditional-probability form: $P(E) = P(E \mid H)\,P(H) + P(E \mid H^c)\,P(H^c)$.

The general form of Bayes’ formula is:

$$P(A_j \mid B) = \frac{P(B \mid A_j)\,P(A_j)}{\sum_i P(B \mid A_i)\,P(A_i)}$$

This is identical to the simple form above, except that some conditions need to be satisfied: the events $A_1, A_2, \ldots$ must be disjoint, each must have $P(A_i) > 0$, and together they must cover the sample space.

Example: three prisoners

Three prisoners, A, B, and C, are sentenced, but exactly one of them, chosen uniformly at random, will be pardoned. Prisoner A asks the warden to name one of B or C who will not be pardoned; if A is the one being pardoned, the warden secretly flips a fair coin to choose between naming B and C. Suppose the warden answers that B will not be pardoned.
Result: given this information, C is now twice as likely to be pardoned as A. Why?



Let $A$, $B$, and $C$ denote the events that prisoner A, B, or C is pardoned, respectively, and let $b$ denote the event that the warden says prisoner B will not be pardoned.

so we can say that the probability of each prisoner being pardoned is $P(A) = P(B) = P(C) = 1/3$. These are the prior probabilities. Since exactly one prisoner is pardoned, $A$, $B$, and $C$ are also disjoint, which satisfies the conditions for the general Bayes formula.

Since we can safely assume that the warden never lies, we can list the conditional probability of $b$ given each of the events $A$, $B$, $C$. The probability of $b$ given $B$ (the probability of the warden saying prisoner B is not pardoned while prisoner B is in fact pardoned) is 0. Next, the probability of $b$ given $A$ (prisoner A being pardoned) is $1/2$, because of the random coin toss. Last, the probability of $b$ given $C$ (prisoner C being pardoned) is 1.

Now we may use Bayes’ formula to compute $P(A \mid b)$:

$$P(A \mid b) = \frac{P(b \mid A)\,P(A)}{P(b \mid A)\,P(A) + P(b \mid B)\,P(B) + P(b \mid C)\,P(C)} = \frac{(1/2)(1/3)}{(1/2)(1/3) + 0 + (1)(1/3)} = \frac{1}{3}$$

… and, by the same computation, $P(C \mid b) = \frac{(1)(1/3)}{1/2} = \frac{2}{3}$.

The probability of C being pardoned is thus twice the probability of A being pardoned.
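The whole computation can be checked with a few lines of Python, using exact fractions and the quantities from the notes:

```python
from fractions import Fraction

# Priors: each prisoner is equally likely to be pardoned.
priors = {"A": Fraction(1, 3), "B": Fraction(1, 3), "C": Fraction(1, 3)}
# P(warden says "B is not pardoned" | each prisoner is the one pardoned)
likelihood = {"A": Fraction(1, 2), "B": Fraction(0), "C": Fraction(1)}

# General Bayes formula over the disjoint events A, B, C
evidence = sum(likelihood[k] * priors[k] for k in priors)   # P(b)
posterior = {k: likelihood[k] * priors[k] / evidence for k in priors}

print(posterior["A"], posterior["C"])   # 1/3 2/3
```

Note that the warden's answer leaves A's probability unchanged at $1/3$; all of B's probability mass moves to C.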

Example: Screening Example II


Also let $D$ denote the event that the iPhone is defective, and suppose $P(D)$ is known.

Given this information, we may ask:

Probability of positive test:

Probability of defective:

Probability of positive test:

see slides
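As a sketch of the kind of computation in the slides, assuming a hypothetical defect rate $P(D) = 0.01$ (the actual value was lost) together with the sensitivity and specificity from the screening section:

```python
# Bayes' inversion for the screening test.
# p_d = 0.01 is an assumed defect rate for illustration.
sens, spec, p_d = 0.95, 0.99, 0.01

# Total probability: P(+) = P(+|D)P(D) + P(+|D^c)P(D^c)
p_pos = sens * p_d + (1 - spec) * (1 - p_d)

# Bayes: P(D | +) and P(D | -)
p_def_given_pos = sens * p_d / p_pos
p_def_given_neg = (1 - sens) * p_d / (1 - p_pos)

print(round(p_def_given_pos, 3))   # 0.49 — about half of positives are false alarms
print(round(p_def_given_neg, 6))
```

With a low defect rate, even a very specific test produces many false positives, which is the classic base-rate effect Bayes' theorem captures.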


Independence

Events $A$ and $B$ are independent if the probability of the intersection of $A$ and $B$ equals the product of the probability of each:

$$P(A \cap B) = P(A)\,P(B)$$

If $A$ and $B$ are independent and $P(B) > 0$, then $P(A \mid B) = P(A)$.

If $P(B) = 0$ or $P(B) = 1$, then $B$ is independent of all events $A$.

If $A$ and $B$ are non-trivial ($P(A) > 0$ and $P(B) > 0$) and mutually exclusive ($A \cap B = \emptyset$), then they cannot be independent.

If $A \subseteq B$ with $P(A) > 0$ and $P(B) < 1$, then they also cannot be independent.

Even if $A$ and $B$ are dependent, the probability of $A \cap B$ can still be calculated: $P(A \cap B) = P(A \mid B)\,P(B)$.
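Independence can be checked by direct enumeration. The example below is illustrative (not from the notes): two fair coin flips, where the two flips are independent but two disjoint non-trivial events are not.

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair coin flips, 4 equally likely outcomes.
omega = list(product("HT", repeat=2))

def prob(pred):
    """P(E) by counting outcomes satisfying the predicate."""
    return Fraction(sum(pred(w) for w in omega), len(omega))

A = lambda w: w[0] == "H"   # first flip heads
B = lambda w: w[1] == "H"   # second flip heads
C = lambda w: w[0] == "T"   # first flip tails (disjoint from A)

both_AB = prob(lambda w: A(w) and B(w))
print(both_AB == prob(A) * prob(B))   # True: A and B are independent

both_AC = prob(lambda w: A(w) and C(w))
print(both_AC == prob(A) * prob(C))   # False: disjoint non-trivial events
                                      # are never independent
```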

System of Independent Components

Consider a system of components connected in series and in parallel.

In series:

graph LR
  i((input)) --> a
  a --> b
  b --> c
  c --> o((output))

In parallel:

graph LR
  s((input)) --> a
  s --> b
  s --> c
  a --> o((output))
  b --> o
  c --> o

The reliability of the system is the probability of getting a correct output given an input.


We assume that $A$, $B$, and $C$ (the events that each component works) are independent, so:

$$R_{\text{series}} = P(A \cap B \cap C) = P(A)\,P(B)\,P(C)$$

$$R_{\text{parallel}} = 1 - P(A^c \cap B^c \cap C^c) = 1 - P(A^c)\,P(B^c)\,P(C^c)$$

Example: suppose that each of the components $A$, $B$, and $C$ has a reliability of $p$.

In series, the reliability is:

$$R_{\text{series}} = p^3$$

In parallel, we compute the reliability by computing the contrary (the system fails only if all components fail):

$$R_{\text{parallel}} = 1 - (1 - p)^3$$
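A quick numeric check, assuming an illustrative component reliability $p = 0.9$ (the value used in the lecture was lost):

```python
# Series vs. parallel reliability for three independent components.
p = 0.9   # assumed per-component reliability

r_series   = p ** 3              # all three must work
r_parallel = 1 - (1 - p) ** 3    # fails only if all three fail

print(round(r_series, 3))    # 0.729
print(round(r_parallel, 3))  # 0.999
```

Redundancy (parallel) raises the reliability well above a single component's, while chaining components in series lowers it.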

Conditional Independence


Events $A$ and $B$ are conditionally independent given the event $C$ if:

$$P(A \cap B \mid C) = P(A \mid C)\,P(B \mid C)$$
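Conditional independence does not imply unconditional independence. The sketch below, with illustrative numbers (not from the notes), builds two tests that are independent given the device status $D$, then shows they are dependent once $D$ is summed out.

```python
# Two screening tests, conditionally independent given device status D.
# All numeric values here are assumed for illustration.
p_d = 0.1             # P(D)
sens, spec = 0.95, 0.99

def p_joint(t1, t2, d):
    """P(T1=t1, T2=t2, D=d), tests conditionally independent given D."""
    p_status = p_d if d else 1 - p_d
    def p_test(t):
        p_pos = sens if d else 1 - spec
        return p_pos if t else 1 - p_pos
    return p_status * p_test(t1) * p_test(t2)

# Marginals of (T1, T2): sum over D.
p_t1 = sum(p_joint(True, t2, d) for t2 in (True, False) for d in (True, False))
p_t2 = sum(p_joint(t1, True, d) for t1 in (True, False) for d in (True, False))
p_both = sum(p_joint(True, True, d) for d in (True, False))

print(abs(p_both - p_t1 * p_t2) > 1e-6)   # True: marginally dependent
```

Intuitively, one positive test raises the probability of a defect, which in turn makes the other test more likely to be positive.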

Sequential Bayes’ Formula

Let $B_i$ be the outcome of the $i$-th test:

Our probability for the event of interest will evolve as we obtain more information. $P(A)$ is evaluated with no historical information, or it is given (the prior probability).

where $B_1, B_2, \ldots$ are the outcomes as a sequence, and the ‘data’ at step $n$ is $D_n = B_1 \cap B_2 \cap \cdots \cap B_n$.

So assume that $B_1, \ldots, B_n$ are independent given some event $A$ and also given $A^c$; then for $n \geq 1$:

$$P(A \mid D_n) = \frac{P(B_n \mid A)\,P(A \mid D_{n-1})}{P(B_n \mid A)\,P(A \mid D_{n-1}) + P(B_n \mid A^c)\,P(A^c \mid D_{n-1})}$$

This means that the new probability depends on $P(A \mid D_{n-1})$, which is the previous probability, and on $P(B_n \mid A)$ and $P(B_n \mid A^c)$. Here $B_n$ is the new piece of data, and $D_n$ is the intersection of $B_n$ and all the previous data ($D_{n-1}$).

Example: pseudo code


Outcomes for the tests:

Probability of the event of interest $A$:

Sensitivity of test:

Specificity of test:
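The parameters above can be wired into a short Python sketch of the sequential update. All numeric values below (prior, sensitivity, specificity, and the test sequence) are illustrative stand-ins for the ones lost from the notes.

```python
# Sequential Bayes update, one test outcome at a time.
# All parameter values are assumed for illustration.
p_a  = 0.01   # prior P(A): event of interest (e.g., device is faulty)
sens = 0.95   # P(test positive | A)
spec = 0.99   # P(test negative | not A)
tests = [True, True, False, True]   # hypothetical outcomes B_1, ..., B_n

posterior = p_a
for positive in tests:
    # Likelihood of this outcome under A and under not-A
    # (tests assumed conditionally independent given A and given not-A).
    l_a    = sens if positive else 1 - sens
    l_nota = 1 - spec if positive else spec
    num = l_a * posterior
    posterior = num / (num + l_nota * (1 - posterior))

print(round(posterior, 4))
```

Each pass through the loop is exactly one application of the sequential formula: yesterday's posterior becomes today's prior.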


Example: determine whether a component of a device is faulty or not, based on experiments

Example: whether a patient has cancer or not

Example: Spam e-mail detection