Naïve Bayesian Inference

We consider 10 sick people named A, B, C, D, E, F, G, H, I, and J. We assume there are three diseases a person can have {flu, drunk, allergy} and five symptoms {fever, dizzy, tired, chest pain, headache}. Recall that a disease implies its symptoms, and not the other way around. Typically, however, a sick person presents a doctor with symptoms and asks for the corresponding disease, so it appears we are asking the doctor to make a logically invalid inference of the form "symptom implies disease".

The following data is collected:
 

             Fever     Dizzy   Tired   Chest Pain   Headache   Total
    Flu      A,D,E,I   D,I     D       A,I          A,D,E      4
    Drunk    F         B,F,G   B,G     -            -          3
    Allergy  C,H       H,J     H       J            C,H,J      3
    Total    7         7       4       3            6


The following probability matrix is formed:


             Fever        Dizzy        Tired        Chest Pain   Headache     Total
    Flu      4/10 = 0.4   2/10 = 0.2   1/10 = 0.1   2/10 = 0.2   3/10 = 0.3   4/10 = 0.4
    Drunk    0.1          0.3          0.2          0            0            0.3
    Allergy  0.2          0.2          0.1          0.1          0.3          0.3
    Total    0.7          0.7          0.4          0.3          0.6


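This table has the following interpretation: the Total column gives the probability of each disease, the Total row gives the probability of each symptom, and every other cell gives the probability of a disease and a symptom occurring together. For example:
    P(flu) = 0.4
    P(fever) = 0.7
    P(flu and fever) = 0.4
    P(flu and dizzy) = 0.2, etc.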
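
The probability matrix can be rebuilt from the raw patient lists. The following is a minimal Python sketch, assuming the data table is encoded as strings of patient initials. The names DATA, joint_table and marginals are illustrative and do not come from the notes.

```python
from fractions import Fraction

# Patients showing each symptom, grouped by disease (copied from the data table).
# Each letter is one patient; an empty string means no patient in that cell.
DATA = {
    "flu":     {"fever": "ADEI", "dizzy": "DI",  "tired": "D",  "chest pain": "AI", "headache": "ADE"},
    "drunk":   {"fever": "F",    "dizzy": "BFG", "tired": "BG", "chest pain": "",   "headache": ""},
    "allergy": {"fever": "CH",   "dizzy": "HJ",  "tired": "H",  "chest pain": "J",  "headache": "CHJ"},
}
N_PATIENTS = 10


def joint_table(data, n):
    """Return {(disease, symptom): P(disease and symptom)} as exact fractions."""
    return {(d, s): Fraction(len(patients), n)
            for d, row in data.items()
            for s, patients in row.items()}


def marginals(data, joint, n):
    """Return (P(disease), P(symptom)), i.e. the Total column and the Total row."""
    # A disease's patients are the union of the patients listed in its row
    # (this assumes every patient shows at least one symptom, as in the data).
    p_disease = {d: Fraction(len(set("".join(row.values()))), n)
                 for d, row in data.items()}
    p_symptom = {}
    for (_, s), p in joint.items():
        p_symptom[s] = p_symptom.get(s, 0) + p
    return p_disease, p_symptom


joint = joint_table(DATA, N_PATIENTS)
p_disease, p_symptom = marginals(DATA, joint, N_PATIENTS)
print(float(joint["flu", "fever"]), float(p_disease["flu"]), float(p_symptom["fever"]))
# prints: 0.4 0.4 0.7
```

Exact fractions are used here only to avoid floating-point noise when the values are compared with the table.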

We now ask the following question: what is the probability that somebody with the flu has chest pains? Stated another way, what proportion of flu sufferers also have chest pains? The number of people with the flu is 4, and among those flu sufferers the number with chest pains is 2. So the probability of having chest pains given that one has the flu is 2/4 = 0.5. This is called a conditional probability and is denoted by
        P(chest pains | flu)
In general, P(A|B) is defined as follows:
        P(A|B) = P( A and B) / P(B)
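
As a quick check of this definition, the sketch below computes P(chest pains | flu) in two ways: by counting patients directly, and by dividing the joint probability by P(flu). The set and function names are illustrative; the patient sets are read off the data table.

```python
# Patient sets read off the data table (illustrative variable names).
flu = {"A", "D", "E", "I"}        # everyone with the flu
chest_pain = {"A", "I", "J"}      # everyone with chest pain, whatever the disease
N_PATIENTS = 10


def conditional(a, b):
    """P(A|B) by counting: the fraction of B's members that are also in A."""
    return len(a & b) / len(b)


def conditional_from_probs(p_a_and_b, p_b):
    """P(A|B) from the definition P(A and B) / P(B)."""
    return p_a_and_b / p_b


p_flu = len(flu) / N_PATIENTS                          # P(flu)
p_chest_and_flu = len(chest_pain & flu) / N_PATIENTS   # P(chest pains and flu)
print(conditional(chest_pain, flu))                    # prints: 0.5
```

Both routes agree because the factor 1/10 cancels in the ratio P(A and B) / P(B).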

The table of conditional probabilities P(symptom | disease) is given below:


             Fever            Dizzy            Tired            Chest Pain       Headache
    Flu      0.4/0.4 = 1.0    0.2/0.4 = 0.5    0.1/0.4 = 0.25   0.2/0.4 = 0.5    0.3/0.4 = 0.75
    Drunk    0.1/0.3 = 0.33   0.3/0.3 = 1.0    0.2/0.3 = 0.67   0                0
    Allergy  0.2/0.3 = 0.67   0.2/0.3 = 0.67   0.1/0.3 = 0.33   0.1/0.3 = 0.33   0.3/0.3 = 1.0
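
Each row of this table is the matching row of the probability matrix divided by that disease's total. A minimal sketch of that step, assuming the matrix is stored as plain Python lists (SYMPTOMS, JOINT, PRIOR and conditional_table are illustrative names):

```python
# Joint probabilities P(disease and symptom) and priors P(disease),
# copied from the probability matrix.
SYMPTOMS = ["fever", "dizzy", "tired", "chest pain", "headache"]
JOINT = {
    "flu":     [0.4, 0.2, 0.1, 0.2, 0.3],
    "drunk":   [0.1, 0.3, 0.2, 0.0, 0.0],
    "allergy": [0.2, 0.2, 0.1, 0.1, 0.3],
}
PRIOR = {"flu": 0.4, "drunk": 0.3, "allergy": 0.3}


def conditional_table(joint, prior, symptoms):
    """Return {disease: {symptom: P(symptom | disease)}}."""
    return {d: {s: p / prior[d] for s, p in zip(symptoms, row)}
            for d, row in joint.items()}


cond = conditional_table(JOINT, PRIOR, SYMPTOMS)
for disease, row in cond.items():
    # Round only for display; the stored values keep full precision.
    print(disease, {s: round(p, 2) for s, p in row.items()})
```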


Let us now consider the following: a patient walks into the doctor’s office and complains of fever. What disease does the patient have? To answer the question, we compare the following quantities:
    P(flu | fever)     P(drunk | fever)     P(allergy | fever)

Whichever of these quantities is largest determines the most likely disease. How does one determine P(flu | fever)? We know
    P(flu), P(fever) and P(fever | flu).
By definition,
    P(flu | fever) = P(flu and fever) / P(fever).
Also note that P(flu and fever) = P(fever and flu). We are already given P(fever | flu) = P(fever and flu) / P(flu), so
    P(fever and flu) = P(flu and fever) = P(fever | flu) * P(flu).
Thus,
    P(flu | fever) = P(flu and fever) / P(fever) = P(fever | flu) * P(flu) / P(fever)
This is called Bayes' formula. Generally, if we are given P(A), P(B) and P(A|B), then Bayes' formula is
    P(B|A) = P(A|B) * P(B) / P(A)
Returning to our example,
    P(flu | fever) = 1.0 * 0.4 / 0.7 = 0.571,
    P(drunk | fever) = (1/3) * 0.3 / 0.7 = 0.143,
    P(allergy | fever) = (2/3) * 0.3 / 0.7 = 0.286.

We see that flu is the most likely explanation; it is twice as likely as the next most likely disease, allergy.
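
The single-symptom diagnosis can be sketched in a few lines of Python. The function and variable names below are illustrative. The conditional probabilities are entered as exact thirds, so the last printed digit may differ slightly from a hand calculation that uses rounded inputs.

```python
def bayes(p_a_given_b, p_b, p_a):
    """Bayes' formula: P(B|A) = P(A|B) * P(B) / P(A)."""
    return p_a_given_b * p_b / p_a


P_FEVER = 0.7
PRIOR = {"flu": 0.4, "drunk": 0.3, "allergy": 0.3}               # P(disease)
P_FEVER_GIVEN = {"flu": 1.0, "drunk": 1 / 3, "allergy": 2 / 3}   # P(fever | disease)

# P(disease | fever) for every disease, then pick the largest.
posterior = {d: bayes(P_FEVER_GIVEN[d], PRIOR[d], P_FEVER) for d in PRIOR}
diagnosis = max(posterior, key=posterior.get)
print(diagnosis, {d: round(p, 3) for d, p in posterior.items()})
```

Because the three diseases cover every patient in the data, the three posteriors sum to 1.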

Naïve Bayes Theorem
Let us continue with this example and suppose that the patient complains of both a fever and of being dizzy. To calculate P(flu | fever and dizzy), we would have:
    P(flu | fever and dizzy) = P(fever and dizzy | flu) * P(flu) / P(fever and dizzy)
In general, we don’t have all this information, and it would be impractical to store it all anyway (a table covering every combination of symptoms takes too much space). We make the following assumption: all symptoms are independent, both overall and among the patients with a given disease D. That is, if S1 and S2 are symptoms, then
    P(S1 and S2) = P(S1) * P(S2)   and   P(S1 and S2 | D) = P(S1 | D) * P(S2 | D).
Thus,
    P(flu | fever and dizzy) = P(flu) * P(fever | flu) * P(dizzy | flu) / (P(fever) * P(dizzy)).

Summarizing this, if S1 and S2 are independent symptoms, then
    P(D | S1 and S2) = P(D) * ( P(S1|D) / P(S1) ) * ( P(S2|D) / P(S2) )
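
The formula translates directly into a small function that accepts any number of symptoms. This is a sketch under the independence assumption stated above; naive_bayes_score is an illustrative name, and the numbers in the example call are hypothetical values chosen only to exercise the function.

```python
from math import prod


def naive_bayes_score(p_disease, p_symptoms_given_disease, p_symptoms):
    """Return P(D) * product over i of P(S_i | D) / P(S_i).

    p_symptoms_given_disease and p_symptoms are parallel sequences,
    one entry per observed symptom.
    """
    if len(p_symptoms_given_disease) != len(p_symptoms):
        raise ValueError("need one P(S|D) and one P(S) per symptom")
    ratios = (c / m for c, m in zip(p_symptoms_given_disease, p_symptoms))
    return p_disease * prod(ratios)


# Hypothetical inputs: P(D) = 0.5, P(S1|D) = 0.6, P(S2|D) = 0.4,
# P(S1) = 0.5, P(S2) = 0.4.
score = naive_bayes_score(0.5, [0.6, 0.4], [0.5, 0.4])
```

With no observed symptoms the product is empty and the function returns the prior P(D) unchanged.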

This result can be extended to multiple symptoms. Returning to our example:
    P(flu | fever and dizzy) = 0.4 * (1.0/0.7) * (0.5/0.7) = 0.2/0.49 = 20/49, or about 0.41
    P(drunk | fever and dizzy) = 0.3 * ((1/3)/0.7) * (1.0/0.7) = 0.1/0.49, or about 0.20
    P(allergy | fever and dizzy) = 0.3 * ((2/3)/0.7) * ((2/3)/0.7) = 0.133/0.49, or about 0.27

Thus, we would still conclude that the patient has the flu, but the diagnosis is less clear-cut than in the single-symptom case: flu is now only about 1.5 times as likely as allergy, rather than twice as likely.
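
One way to see how much the independence assumption costs is to compare the naive scores with the posteriors obtained by counting the patients who actually have both symptoms. The sketch below does this for fever and dizzy. All names are illustrative, the probabilities are copied from the tables with 1/3 and 2/3 kept exact, and the BOTH sets are read off the data table.

```python
PRIOR = {"flu": 0.4, "drunk": 0.3, "allergy": 0.3}   # P(disease)
P_GIVEN = {                                          # P(symptom | disease)
    "flu":     {"fever": 1.0,   "dizzy": 0.5},
    "drunk":   {"fever": 1 / 3, "dizzy": 1.0},
    "allergy": {"fever": 2 / 3, "dizzy": 2 / 3},
}
P_SYMPTOM = {"fever": 0.7, "dizzy": 0.7}             # P(symptom)

# Patients with BOTH fever and dizziness, per disease.
BOTH = {"flu": {"D", "I"}, "drunk": {"F"}, "allergy": {"H"}}


def naive_score(disease, symptoms):
    """P(D) times the product of P(S|D) / P(S) over the observed symptoms."""
    score = PRIOR[disease]
    for s in symptoms:
        score *= P_GIVEN[disease][s] / P_SYMPTOM[s]
    return score


naive = {d: naive_score(d, ["fever", "dizzy"]) for d in PRIOR}
n_both = sum(len(patients) for patients in BOTH.values())
counted = {d: len(patients) / n_both for d, patients in BOTH.items()}

for d in PRIOR:
    print(f"{d:8s} naive={naive[d]:.3f}  counted={counted[d]:.3f}")
```

In this small data set both methods rank flu first, but the naive scores do not sum to 1, because the symptoms are only approximately independent. Dividing each score by their sum gives normalized values if probabilities are needed.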