Effects of CAS on understanding probabilistic concepts
Mihály Klincsik, University of Pécs
Pollack Mihály Faculty of Engineering
Defining random variables and working with them
Creating random variables and using their main characteristic features
> | restart:with(Statistics): |
Create random variables with known distribution
A random variable X can be created by the Maple command "RandomVariable" from the Statistics package.
Choose one of the known distributions from the Maple list.
> | X:=RandomVariable(Normal(mu,sigma)); |
X is a normally distributed continuous random variable with mean mu and standard deviation sigma.
Calculations of main features of the random variables
The PDF procedure gives the formula of the probability density function.
> | assume(sigma>0):f:=unapply(PDF(X,u),u); |
The expected value of X can be obtained from the Mean or ExpectedValue procedures, or from the definition by integration:
> | Int(x*f(x),x=-infinity..infinity) = ExpectedValue(X);value(%); |
The standard deviation of X can be obtained with the StandardDeviation procedure, or from the definition, reduced to the ExpectedValue procedure:
> | sqrt(ExpectedValue((X-mu)^2))=StandardDeviation(X); |
Maple can give the formulas of
(i) the cumulative distribution function of X with the "CDF" command;
(ii) the characteristic function of X with the "CharacteristicFunction" command.
> | int(f(t),t=-infinity..x)=CDF(X,u); |
> | Mean(exp(I*t*X))=CharacteristicFunction(X,t); |
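The closed forms Maple produces here can be cross-checked outside the worksheet. The following Python sketch is our addition (it assumes scipy is available; the values mu = 1, sigma = 2 and t = 0.7 are arbitrary choices, not from the worksheet): it verifies the normal PDF formula, the mean and standard deviation, and the characteristic function E[exp(i*t*X)] = exp(i*t*mu - sigma^2*t^2/2) by numerical integration.

```python
# Cross-check of the normal-distribution formulas (our addition, via scipy).
import math
from scipy.integrate import quad
from scipy.stats import norm

mu, sigma = 1.0, 2.0          # sample parameter values of our choosing
X = norm(loc=mu, scale=sigma)

# PDF(x) = exp(-(x-mu)^2/(2*sigma^2)) / (sigma*sqrt(2*pi))
x = 2.5
pdf_closed_form = math.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))
assert abs(X.pdf(x) - pdf_closed_form) < 1e-12

# Mean and standard deviation recover the parameters.
assert abs(X.mean() - mu) < 1e-12
assert abs(X.std() - sigma) < 1e-12

# Characteristic function: E[exp(i*t*X)] = exp(i*t*mu - sigma^2*t^2/2),
# checked by numerically integrating cos(t*u) and sin(t*u) against the PDF.
t = 0.7
re = quad(lambda u: math.cos(t * u) * X.pdf(u), -40, 40)[0]
im = quad(lambda u: math.sin(t * u) * X.pdf(u), -40, 40)[0]
cf_closed = complex(math.cos(t * mu), math.sin(t * mu)) * math.exp(-sigma**2 * t**2 / 2)
assert abs(complex(re, im) - cf_closed) < 1e-8
print("normal PDF / mean / characteristic function checks passed")
```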
We can get the list of all procedures available for a distribution by using the "Distribution" command.
> | Z:=Distribution(Normal(mu,sigma)); |
These theoretically known features are implemented in practice. The "Conditions" procedure gives the assumptions about the parameters mu and sigma.
> | Z:-Conditions; |
The features of a discrete random variable differ from those of a continuous one.
> | U:=Distribution(Poisson(lambda)); |
We know from the theory that the parameter lambda of the Poisson distribution is non-negative.
> | U:-Conditions; |
> | assume(lambda>0): Y:=RandomVariable(Poisson(lambda)); |
> | p:=unapply(ProbabilityFunction(Y,k),k):p(k); |
> | F:=unapply(CDF(Y,k),k); |
> | plot(subs(lambda=2,p(floor(k))),k=-1..9,title=`Poisson distribution`); |
> | plot(subs(lambda=2,F(k)),k=-1..8,thickness=3,discont=true,title=`Cumulative distribution function` ); |
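As a numerical counterpart to these Poisson plots (our addition, using scipy's `poisson` object; lambda = 2 as in the plots, with the variable name `lam` being our choice), the probability function and the step-shaped CDF can be checked against their closed forms:

```python
# Poisson probability function and CDF cross-check (our addition).
import math
from scipy.stats import poisson

lam = 2.0                 # same parameter value as in the Maple plots
Y = poisson(mu=lam)

# P(Y = k) = lam^k * exp(-lam) / k!
for k in range(6):
    closed = lam**k * math.exp(-lam) / math.factorial(k)
    assert abs(Y.pmf(k) - closed) < 1e-12

# The CDF is the step function F(k) = sum of P(Y = j) for j <= k.
assert abs(Y.cdf(3) - sum(Y.pmf(j) for j in range(4))) < 1e-12
print("Poisson probability function / CDF checks passed")
```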
Create random variables with new distribution
The following function f is a valid probability density function.
> | f:=t -> piecewise(t < 0, 0, t < 1, 1-exp(-t), exp(1-t)-exp(-t)):f(t); |
> | plot(f(t),t=0..4); |
> | Int(f(t),t=0..infinity)=int(f(t),t=0..infinity);simplify(%); |
With the "Distribution" command, create a new distribution object T for which the PDF is the function f.
> | T := Distribution(PDF = f ); |
Let X be a random variable with distribution type T.
> | X:=RandomVariable(T); |
Its probability density function is the given f:
> | PDF(X,u); |
We can get the mean and standard deviation of X.
> | Mean(X),StandardDeviation(X); |
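The same density can be sanity-checked numerically outside Maple. This Python sketch is our addition (scipy assumed available): it confirms that the piecewise f integrates to 1 and computes the mean by the defining integral, which comes out as 3/2.

```python
# Numerical check of the custom density f (our addition, via scipy.integrate).
import math
from scipy.integrate import quad

def f(t: float) -> float:
    # Same piecewise density as the Maple definition.
    if t < 0:
        return 0.0
    if t < 1:
        return 1.0 - math.exp(-t)
    return math.exp(1.0 - t) - math.exp(-t)

# Split the integrals at the kink t = 1 for accuracy; the tail beyond 60
# is negligible (exp(-60)).
total = quad(f, 0, 1)[0] + quad(f, 1, 60)[0]
assert abs(total - 1.0) < 1e-8          # f integrates to 1

mean = quad(lambda t: t * f(t), 0, 1)[0] + quad(lambda t: t * f(t), 1, 60)[0]
assert abs(mean - 1.5) < 1e-7           # the mean is 3/2
print("custom density integrates to 1, mean = 1.5")
```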
Manipulating with continuous random variables
> | restart:with(Statistics): |
Linear operations on a continuous random variable
Consider Z as a standard normal random variable, i.e. with mean 0 and standard deviation 1.
> | Z:=RandomVariable(Normal(0,1)); fZ:=PDF(Z,u); |
Define a new random variable R by the linear operation R = 2Z + 3, where Z ~ N(0,1).
> | R:=2*Z+3; |
Maple can calculate the probability density function of the new random variable R, just as for the original Z.
> | fR:=PDF(R,u); |
> | plot([fZ,fR],u=-3..9,color=[blue,red],thickness=[3,3]); |
If we compare this density function with the general form of the normal density, we can recognize that 2Z + 3 ~ Normal(3,2), where Z ~ Normal(0,1).
> | PDF(Normal(3,2),u); |
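The same identity follows from the change-of-variables formula: the density of R = 2Z + 3 is fZ((u-3)/2)/2. A short Python check (our addition, using scipy) confirms that this transformed density coincides with the Normal(3,2) density:

```python
# Linear-transform density check (our addition): density of 2Z+3 vs Normal(3,2).
from scipy.stats import norm

for u in (-1.0, 0.0, 3.0, 5.5):
    # Change of variables: f_R(u) = f_Z((u-3)/2) * |1/2|
    lhs = norm(0, 1).pdf((u - 3) / 2) / 2
    assert abs(lhs - norm(3, 2).pdf(u)) < 1e-12
print("linear-transform density check passed")
```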
Maple identifies the repeated sum as a multiplication:
> | Z+Z+Z; |
Operations on the discrete versions do not work
Similar simple operations do not work in the discrete case.
> | U:=RandomVariable(Poisson(1)); |
> | ProbabilityFunction(U,k); |
> | U2:=U+U; |
> | ProbabilityFunction(U2,k); |
Nonlinear operations on continuous random variable
Multiplication or power
> | R2:=Z*Z; |
> | PDF(R2,u)=PDF(ChiSquare(1),u); |
We compared the probability density function of the square of Z ~ N(0,1) with that of a chi-square distribution with 1 degree of freedom.
We obtain the theoretical result that Z^2 ~ ChiSquare(1).
Operations between two random variables
We can continue these investigations with two independent random variables having N(0,1) distribution.
> | Z1:=RandomVariable(Normal(0,1)); Z2:=RandomVariable(Normal(0,1)); |
In this case the random variables Z1 and Z2 are independent. Consider the sum of their squares!
> | C:=Z1^2+Z2^2; |
> | PDF(C,u) = PDF(RandomVariable(ChiSquare(2)),u); |
The well-known result is that Z1^2 + Z2^2 ~ ChiSquare(2), where Z1, Z2 ~ N(0,1).
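Both chi-square identities can also be illustrated by simulation. The following Monte Carlo sketch is our addition (numpy and scipy assumed; sample size and seed are arbitrary): it compares empirical CDFs of Z^2 and Z1^2 + Z2^2 with the ChiSquare(1) and ChiSquare(2) CDFs.

```python
# Monte Carlo illustration (our addition) of Z^2 ~ ChiSquare(1) and
# Z1^2 + Z2^2 ~ ChiSquare(2) for independent standard normals.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
z1 = rng.standard_normal(200_000)
z2 = rng.standard_normal(200_000)

# Compare empirical CDFs with the chi-square CDFs at a few points;
# with 200k samples the Monte Carlo error is well below 0.01.
for q in (0.5, 1.0, 2.0, 4.0):
    assert abs(np.mean(z1**2 <= q) - chi2(df=1).cdf(q)) < 0.01
    assert abs(np.mean(z1**2 + z2**2 <= q) - chi2(df=2).cdf(q)) < 0.01
print("chi-square Monte Carlo checks passed")
```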
Operations between two random variables with different distributions
Consider a random variable X having exponential distribution with parameter 1 and Y having uniform distribution on interval [0,1].
> | X:=RandomVariable(Exponential(1)); Y:=RandomVariable(Uniform(0,1)); |
Form the sum of X and Y
> | Z:=X+Y; |
We can get the probability density function of the sum.
> | PDF(Z,u); |
> | plot(%,u=-4..4); |
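Maple's density for the sum can be checked by simulation. This sketch is our addition (numpy assumed; seed and sample size are our choices); the analytic CDF used below comes from integrating the convolution density of Exp(1) + Uniform(0,1), namely u - 1 + exp(-u) on (0,1) and 1 - (e-1)*exp(-u) for u >= 1.

```python
# Monte Carlo check (our addition) of the density of Exp(1) + Uniform(0,1).
import math
import numpy as np

rng = np.random.default_rng(3)
z = rng.exponential(1.0, 200_000) + rng.random(200_000)

def cdf(u: float) -> float:
    # CDF obtained by integrating the convolution density.
    if u <= 0:
        return 0.0
    if u < 1:
        return u - 1.0 + math.exp(-u)
    return 1.0 - (math.e - 1.0) * math.exp(-u)

for q in (0.5, 1.0, 2.0, 3.0):
    assert abs(np.mean(z <= q) - cdf(q)) < 0.01
print("Exp(1) + Uniform(0,1) distribution check passed")
```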
Theoretical results without great effort
Sum of independent exponentially distributed random variables
Experimenting and making a conjecture
Create two independent random variables, both having exponential distribution with parameter lambda.
> | restart:with(Statistics): |
> | assume(lambda>0): V1:=RandomVariable(Exponential(lambda)); V2:=RandomVariable(Exponential(lambda)); |
> | p(x)=PDF(V1,x); |
Create the sum of the two independent exponentially distributed random variables.
> | V:=V1+V2; |
We obtain the probability density function of the sum:
> | fV:=PDF(V,x); |
Maple cannot recognize this as one of its built-in distributions. However, we can compare it with the Gamma distribution with parameters lambda and 2.
> | PDF(Gamma(lambda,2),x); |
Conjecture
If X1, ..., Xn are independent random variables all having Exponential(lambda) distribution, then the distribution of the sum X1 + ... + Xn is Gamma with parameters lambda and n.
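The n = 2 case of this conjecture can be verified symbolically outside Maple. The following sympy sketch is our addition; note that Maple's Exponential(lambda) and Gamma(lambda, 2) use lambda as a scale parameter, so the densities below follow that convention (exp(-x/lam)/lam and x*exp(-x/lam)/lam^2).

```python
# Symbolic convolution check (our addition, via sympy): the sum of two
# independent Exponential(lam) variables has the Gamma(lam, 2) density.
import sympy as sp

x, t = sp.symbols('x t', positive=True)
lam = sp.symbols('lam', positive=True)

f = sp.exp(-t / lam) / lam                  # Exponential(lam) density at t
g = sp.exp(-(x - t) / lam) / lam            # Exponential(lam) density at x - t
conv = sp.integrate(f * g, (t, 0, x))       # density of the sum V1 + V2

gamma_pdf = x * sp.exp(-x / lam) / lam**2   # Gamma(scale=lam, shape=2) density
assert sp.simplify(conv - gamma_pdf) == 0
print("sum of two Exponential(lam) variables is Gamma(lam, 2)")
```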
Sum of normally distributed random variables
Experimenting and making a conjecture
Create two independent random variables V1 and V2 with normal distributions, letting the parameters be different.
> | restart:with(Statistics): |
> | assume(sigma1>0,sigma2>0): V1:=RandomVariable(Normal(mu1,sigma1)); V2:=RandomVariable(Normal(mu2,sigma2)); |
Let us form the sum of V1 and V2.
> | S:=V1+V2; |
Maple can give the probability density function of the sum S!
> | fS:=PDF(S,x); |
Now we can compare this function with the probability density function of the normal distribution with mean mu1 + mu2 and standard deviation sqrt(sigma1^2 + sigma2^2).
> | f:=PDF(Normal(mu1+mu2,sqrt(sigma1^2+sigma2^2)),x); |
> | fS-f=simplify(fS-f); |
Conjecture
If V1, ..., Vn are independent normally distributed random variables, then the distribution of the sum V1 + ... + Vn is also normal.
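The two-variable case can also be illustrated by simulation. This sketch is our addition (numpy and scipy assumed; the parameter values Normal(1, 2) and Normal(-0.5, 1.5) are our own choices): the empirical distribution of the sum is compared with Normal(0.5, sqrt(2^2 + 1.5^2)) = Normal(0.5, 2.5).

```python
# Monte Carlo check (our addition): sum of independent normals is normal,
# with mean mu1 + mu2 and standard deviation sqrt(sigma1^2 + sigma2^2).
import math
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
s = rng.normal(1.0, 2.0, 200_000) + rng.normal(-0.5, 1.5, 200_000)
target = norm(loc=0.5, scale=math.sqrt(2.0**2 + 1.5**2))

for q in (-2.0, 0.5, 3.0):
    assert abs(np.mean(s <= q) - target.cdf(q)) < 0.01
print("normal-sum Monte Carlo check passed")
```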
Sum of two independent Poisson distributed random variables
It is somewhat harder to get similar results for discrete random variables, because Maple cannot compute the required quantities directly; we have to be more inventive.
Define two independent random variables X and Y both having Poisson distribution with parameters lambda1 and lambda2, respectively.
> | restart:with(Statistics): |
> | assume(lambda1>0,lambda2>0): X:=RandomVariable(Poisson(lambda1)); Y:=RandomVariable(Poisson(lambda2)); |
> | f1:=unapply(ProbabilityFunction(X,k),k); f2:=unapply(ProbabilityFunction(Y,k),k); |
Let S be the sum of X and Y.
> | S:=X+Y; |
Maple can't give the distribution of the random variable S.
> | ProbabilityFunction(S,k); |
We have to determine the probabilities P(S = n) for n = 0, 1, 2, ....
Choose an integer n; since S = X + Y and X and Y take the values 0, 1, 2, ..., the event {S = n} splits into the disjoint events
{X = 0, Y = n}, {X = 1, Y = n-1}, ..., {X = k, Y = n-k}, ..., {X = n, Y = 0}.
From the independence condition we get P(X = k, Y = n-k) = P(X = k)*P(Y = n-k).
Maple can easily give the results for specific n = 0, 1, 2, ... if we write the appropriate sums:
> | seq(factor(sum(f1(k)*f2(n-k),k=0..n)),n=0..5); |
We can form a conjecture on the basis of these values.
However, to prove the desired result for every non-negative integer n, we have to make some further assumptions and symbolic simplifications.
> | assume(n>0):assume(k>=0):assume(n-k>=0): |
> | simplify(sum(f1(k)*f2(n-k),k=0..n),symbolic); |
> | P('S'=n)=convert(%,factorial); |
Theorem
The sum S = X + Y of two independent random variables both having Poisson distribution with parameters lambda1 and lambda2 is also Poisson distributed with parameter lambda1 + lambda2.
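The convolution sum used in the proof can be verified numerically. This sketch is our addition (scipy assumed; the parameter values 1.3 and 2.1 are arbitrary): for each n, the sum over k of P(X=k)*P(Y=n-k) is compared with the Poisson(lambda1 + lambda2) probability at n.

```python
# Numerical convolution check (our addition): sum_k P(X=k)P(Y=n-k) equals
# the Poisson(lam1 + lam2) probability function at n.
from scipy.stats import poisson

lam1, lam2 = 1.3, 2.1   # sample parameter values of our choosing
for n in range(10):
    conv = sum(poisson.pmf(k, lam1) * poisson.pmf(n - k, lam2)
               for k in range(n + 1))
    assert abs(conv - poisson.pmf(n, lam1 + lam2)) < 1e-12
print("Poisson convolution check passed")
```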
A new result
We can easily make interesting discoveries with the built-in functions!
Define two independent random variables having standard normal distributions.
> | restart:with(Statistics): |
> | Z1:=RandomVariable(Normal(0,1)): Z2:=RandomVariable(Normal(0,1)): |
Investigate the product of the two standard normal random variables.
> | A:=Z1*Z2; |
It is remarkable that the probability density function can be expressed with BesselK, the modified Bessel function of the second kind.
> | fA:=unapply(PDF(A,u),u); |
We get a new distribution whose probability density function involves the BesselK function.
> | plot(fA(u),u=-3..3); Int(fA(u),u=-infinity..infinity)=int(fA(u),u=-infinity..infinity); |
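The Bessel-type density of Z1*Z2 is K0(|u|)/pi. The following check is our addition (numpy and scipy assumed; seed and sample size are our choices): it verifies numerically that this density integrates to 1 and that it matches the empirical distribution of a simulated product.

```python
# Checks (our addition) that the density of Z1*Z2 is K0(|u|)/pi.
import math
import numpy as np
from scipy.integrate import quad
from scipy.special import k0   # modified Bessel function of the second kind

# The density is symmetric, with an integrable logarithmic singularity at 0.
total = 2 * quad(lambda u: k0(u) / math.pi, 0, math.inf)[0]
assert abs(total - 1.0) < 1e-6

# Empirical CDF of a simulated product vs. the integral of the Bessel density.
rng = np.random.default_rng(1)
prod = rng.standard_normal(200_000) * rng.standard_normal(200_000)
for q in (0.5, 1.0, 2.0):
    model = 0.5 + quad(lambda u: k0(u) / math.pi, 0, q)[0]   # by symmetry
    assert abs(np.mean(prod <= q) - model) < 0.01
print("Bessel-density checks passed")
```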
Example: Breaking load of a steel cable twisted from wires
Problem
A thick steel cable is twisted from 100 identical thin steel wires. We want to load the cable with a mass of M = 1960 kg. Assume that the breaking load of each thin wire has normal distribution with mean m = 20 kg and standard deviation sigma = 4 kg. The strength of the whole cable is the sum of the strengths of its wires.
What is the probability that the cable does not break under this load?
(We assume that none of the thin wires breaks earlier than the whole cable!)
Solution
Denote by X the breaking load of a thin steel wire, measured in kg. It is a random variable and, by the assumptions, X ~ Normal(20, 4).
Plot the probability density function of X!
> | restart:with(Statistics): |
> | X:=RandomVariable(Normal(20,4)); |
> | plot(PDF(X,u),u=0..40,title=`Probability density function of Normal(20,4)`); |
Theoretical background
The cable consists of n = 100 thin steel wires. The strength of the cable is the sum of the strengths of the wires:
S = X1 + X2 + ... + X100,
where X1, ..., X100 are independent and all distributed as Normal(20, 4). We know from the theory of random variables that
(i) S is normally distributed;
(ii) the expected value E(S) = 100*E(X) = 100*20 = 2000;
(iii) the variance Var(S) = 100*Var(X) = 100*16 = 1600, because the variables are independent.
Practical considerations
We can create 100 random variables and add them by using the following Maple commands:
(a) the "sum" Maple command;
(b) the "add" Maple command;
(c) a "for ... do" loop statement.
However, these must be used carefully!
The "sum" command is not appropriate for creating this sum, because it always reuses the same random variable; the result is therefore 100 times a single variable.
> | n:=100; |
> | U:=sum(RandomVariable(Normal(20,4)),k=1..n); |
> | Variance(U); |
The result obtained by the "add" command is correct.
> | Z:=add(RandomVariable(Normal(20,4)),k=1..n); |
The sum is also acceptable using the "for ... do" loop statement.
> | S:=RandomVariable(Normal(20,4)); for i from 2 to 100 do S:=S+RandomVariable(Normal(20,4)); end do:S; |
The expected value and the variance of S coincide with the theoretical values.
> | expected_value=Mean(S); |
> | variance_of_S=Variance(S); |
Plot the density function of the sum variable S!
> | fS:=PDF(S,u); |
> | density:=plot(fS,u=1850..2150,title=`Density function of the breaking-load of the cable`):density; |
The cable does not break under the load M = 1960 kg when its breaking load S is greater than the actual loading mass M, i.e. S > 1960. The probability P(S > 1960) can be visualized by shading the corresponding area under the density function.
> | painted:=plot(PDF(S,u),u=1960..2150,filled=true,color=yellow): plots[display]([density,painted]); |
Calculate the probability from the cumulative distribution function CDF(u) = P(S < u)!
> | P('S'>1960)=1-CDF(S,1960); |
> | P('S'>1960)=evalf(rhs(%)); |
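The final numbers can be recomputed outside Maple. This sketch is our addition (scipy assumed): with S ~ Normal(2000, 40), the event S > 1960 corresponds to a standardized deviation of one, so P(S > 1960) = Phi(1), which is approximately 0.8413.

```python
# Cross-check of the cable probability (our addition, via scipy).
import math
from scipy.stats import norm

n, m, sigma = 100, 20.0, 4.0
mean_S = n * m                    # E(S) = 2000
std_S = math.sqrt(n) * sigma      # sqrt(Var(S)) = sqrt(1600) = 40

p = 1 - norm(loc=mean_S, scale=std_S).cdf(1960)
# (1960 - 2000) / 40 = -1, so P(S > 1960) = 1 - Phi(-1) = Phi(1).
assert abs(p - norm.cdf(1.0)) < 1e-9
assert abs(p - 0.8413) < 5e-4
print(f"P(S > 1960) = {p:.4f}")
```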
So the probability that the cable does not break under the load M = 1960 kg is nearly 84%; there is only a 16% chance that the cable breaks.