The limitation of discrete random variables is that they cannot describe continuous data. In engineering we often need to model continuous random data, and the most popular random variable for that purpose is the Gaussian random variable. A discrete random variable is fully described by its PMF (Probability Mass Function), but we cannot use the same tool in the continuous case, because the probability of any single value would be 0. An example makes this clear.
For background on discrete random variables, please see this article.
Suppose you want to measure the temperature of a room. Because the temperature fluctuates randomly, it can be modeled by a continuous random variable. As an interesting note, its variance is not large: the temperature observed over an hour might be {35.1, 35.2, 35.3, 35.2, 34.9, and so on}, so the changes between values are small and the variance is low (we will talk about variance later on). Given this sample data, what is the probability that at any instant the temperature is exactly 35.1? Treat the temperature as continuous, i.e. able to take infinitely many values.
Answer: It will always be zero. The temperature can take any of infinitely many values in a range, so the probability of it being exactly 35.1 (and not, say, 35.1000001) is zero.
Now the question arises: is there any mathematical tool to model these continuous random variables? The answer is YES, and the tools are the Cumulative Distribution Function (CDF) and the Probability Density Function (PDF).
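We can check this idea with a small simulation. The sketch below assumes, purely for illustration, that the room temperature follows a Gaussian distribution with mean 35.1 and standard deviation 0.15 (these numbers are our assumption, not measured data). It counts how often a sample equals exactly 35.1 and how often it falls in a small interval around 35.1.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Assumed model (illustrative only): temperature ~ Gaussian(mean=35.1, std=0.15)
MEAN, STD, N = 35.1, 0.15, 100_000
samples = [random.gauss(MEAN, STD) for _ in range(N)]

# Estimate of P(X = 35.1): fraction of samples exactly equal to 35.1
p_exact = sum(1 for t in samples if t == 35.1) / N

# Estimate of P(35.05 < X <= 35.15): fraction of samples in a small interval
p_interval = sum(1 for t in samples if 35.05 < t <= 35.15) / N

print("Estimate of P(X = 35.1):", p_exact)
print("Estimate of P(35.05 < X <= 35.15):", p_interval)
```

Under these assumptions the first estimate should come out as 0 while the second is clearly positive, which is why for continuous random variables we ask about intervals rather than single values.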
Cumulative Distribution Function:
The PMF gives the probability that a random variable [latex]X[/latex] takes the value [latex]x[/latex], which is a number between 0 and 1; that is why it is defined as [latex]P(X=x)[/latex]. This does not work in the continuous case, since [latex]P(X=x) = 0[/latex] for every [latex]x[/latex]. So, we define the CDF as:
[latex]F_X(x) = P(X \leq x)[/latex]
Please don’t forget that the PMF applies only to discrete random variables, while the CDF is defined for continuous random variables as well (in fact, for any random variable). Both are probabilities.
In short, the CDF tells us the chance of [latex]X[/latex] taking a value less than or equal to [latex]x[/latex]. The idea will become clearer by solving multiple problems related to the CDF. Some important properties of the CDF are listed below; they will help clarify how continuous random variables are characterized by the CDF.
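To make the definition concrete, here is a minimal sketch that estimates the CDF from data by counting the fraction of samples less than or equal to x. The helper name `empirical_cdf` is ours, and the five readings are the ones from the temperature example above; with so few samples this is only a rough estimate of the true CDF.

```python
def empirical_cdf(samples, x):
    """Fraction of samples that are <= x, a data-based estimate of P(X <= x)."""
    return sum(1 for s in samples if s <= x) / len(samples)


# The five readings quoted in the temperature example
readings = [35.1, 35.2, 35.3, 35.2, 34.9]

for x in (34.8, 35.0, 35.2, 35.3):
    print(x, empirical_cdf(readings, x))
```

Notice that the estimate never decreases as x grows, and that it starts at 0 below the smallest reading and reaches 1 at the largest one.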
Properties of CDF:
- [latex]F_X(-\infty) = 0, \; \; F_X(\infty) = 1[/latex].
The above statement makes sense, doesn’t it? The CDF is defined as [latex]P(X \leq x)[/latex], so when [latex]x = -\infty[/latex] the probability [latex]P(X \leq -\infty)[/latex] should be 0, as there are no values below [latex]-\infty[/latex]. The same reasoning applies to [latex]P(X \leq \infty)[/latex], i.e. positive infinity: every value that X can take lies below infinity, so the probability is 1.
- [latex]0 \leq F_X(x) \leq 1[/latex]
The CDF is a probability, and a probability is always between 0 and 1; hence the CDF always lies between 0 and 1.
- If [latex]x_1 < x_2 \; \text{then} \; F_X(x_1) \leq F_X(x_2)[/latex]
We can prove the above property. Do you remember that if [latex]A \subset B[/latex] then [latex]P(A) \leq P(B)[/latex]? If not, please see this article.
[latex]F_X(x_1) = P(X \leq x_1)[/latex]
[latex]F_X(x_2) = P(X \leq x_2)[/latex]
Since [latex]x_1 < x_2[/latex] then [latex]\{ X \leq x_1 \} \subset \{ X \leq x_2 \}[/latex]
then we have [latex]F_X(x_1) \leq F_X(x_2)[/latex]
- Given [latex]x_1 < x_2[/latex] then [latex]P(x_1 < X \leq x_2) = F_X(x_2) - F_X(x_1)[/latex]
The last property can be proven by simply noting that [latex]\{ X \leq x_2 \} = \{ X \leq x_1 \} \cup \{ x_1 < X \leq x_2 \}[/latex], and the two events on the right-hand side are disjoint,
then [latex]P(X \leq x_2) = P(X \leq x_1) + P(x_1 < X \leq x_2)[/latex]
[latex]\Rightarrow \; P(x_1 < X \leq x_2) = P(X \leq x_2) - P(X \leq x_1)[/latex]
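The four properties can also be checked numerically. The sketch below uses the CDF of a standard Gaussian random variable as an example, written with the error function from Python's `math` module. The helper name `gaussian_cdf`, the grid of test points, and the choice of x1 = -1 and x2 = 1 are our assumptions for illustration only.

```python
import math


def gaussian_cdf(x, mean=0.0, std=1.0):
    """CDF of a Gaussian random variable: P(X <= x)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


# Property 1: the limits, approximated here with very large finite values
low, high = gaussian_cdf(-1e6), gaussian_cdf(1e6)

# Properties 2 and 3: checked on a grid of points from -5.0 to 5.0
grid = [i / 10 for i in range(-50, 51)]
values = [gaussian_cdf(x) for x in grid]
bounded = all(0.0 <= v <= 1.0 for v in values)
non_decreasing = all(a <= b for a, b in zip(values, values[1:]))

# Property 4: P(x1 < X <= x2) as a difference of two CDF values
x1, x2 = -1.0, 1.0
p_between = gaussian_cdf(x2) - gaussian_cdf(x1)

print("Limits:", low, high)
print("Bounded in [0, 1]:", bounded)
print("Non-decreasing:", non_decreasing)
print("P(-1 < X <= 1):", p_between)
```

A check on a finite grid is of course not a proof; it only illustrates the properties for one particular distribution.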
That’s all for now. We will add more problems and examples to our problem database. If you have any suggestions or questions, just leave a comment below.