Statistics

Note

This module has been deprecated. See the stats module.

The statistics module in SymPy implements standard probability distributions and related tools. Its contents can be imported with the following statements:

>>> from sympy import *
>>> from sympy.statistics import *
>>> init_printing(use_unicode=False, wrap_line=False, no_global=True)

Normal distributions

Normal(mu, sigma) creates a normal distribution with mean value mu and standard deviation sigma. The Normal class defines several useful methods and properties. Various properties can be accessed directly as follows:

>>> N = Normal(0, 1)
>>> N.mean
0
>>> N.median
0
>>> N.variance
1
>>> N.stddev
1

You can generate random numbers from the desired distribution with the random method:

>>> N = Normal(10, 5)
>>> N.random() 
4.914375200829805834246144514
>>> N.random() 
11.84331557474637897087177407
>>> N.random() 
17.22474580071733640806996846
>>> N.random() 
9.864643097429464546621602494
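
Because sympy.statistics is deprecated, the following is a minimal sketch that uses only Python's standard library rather than the module itself. It draws from a normal distribution with the same parameters as Normal(10, 5) and checks that the empirical mean and standard deviation land near 10 and 5. The seed value and the sample size are arbitrary choices made for this illustration.

```python
import random
import statistics

# Private generator with an arbitrary seed so that reruns give the same draws.
rng = random.Random(12345)

# Draw from a normal distribution with mean 10 and standard deviation 5.
draws = [rng.gauss(10, 5) for _ in range(10_000)]

# Empirical summaries of the draws; they should sit close to 10 and 5.
sample_mean = statistics.fmean(draws)
sample_stddev = statistics.pstdev(draws)

print(sample_mean, sample_stddev)
```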

The probability density function (pdf) and cumulative distribution function (cdf) of a distribution can be computed, either in symbolic form or for particular values:

>>> N = Normal(1, 1)
>>> x = Symbol('x')
>>> N.pdf(1)
   ___
 \/ 2
--------
    ____
2*\/ pi
>>> N.pdf(3).evalf()
0.0539909665131880
>>> N.cdf(x)
   /  ___        \
   |\/ 2 *(x - 1)|
erf|-------------|
   \      2      /   1
------------------ + -
        2            2
>>> N.cdf(-oo), N.cdf(1), N.cdf(oo)
(0, 1/2, 1)
>>> N.cdf(5).evalf()
0.999968328758167
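
The numeric values above can be cross-checked without SymPy. The sketch below assumes Python 3.8 or later (for statistics.NormalDist) and is an independent check, not the module's implementation. The helper normal_cdf is a hypothetical name that writes out the erf form shown by N.cdf(x) for general mu and sigma.

```python
import math
from statistics import NormalDist

# Same parameters as Normal(1, 1) in the example above.
dist = NormalDist(mu=1, sigma=1)

pdf_at_mean = dist.pdf(1)   # compare with sqrt(2)/(2*sqrt(pi))
pdf_at_3 = dist.pdf(3)      # compare with N.pdf(3).evalf()
cdf_at_5 = dist.cdf(5)      # compare with N.cdf(5).evalf()


def normal_cdf(x, mu, sigma):
    """Normal CDF written with erf, matching the symbolic form of N.cdf(x)."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))


print(pdf_at_mean, pdf_at_3, cdf_at_5, normal_cdf(5, 1, 1))
```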

The method probability gives the total probability on a given interval (a convenient alternative syntax for cdf(b)-cdf(a)):

>>> N = Normal(0, 1)
>>> N.probability(-oo, 0)
1/2
>>> N.probability(-1, 1)
   /  ___\
   |\/ 2 |
erf|-----|
   \  2  /
>>> N.probability(-1, 1).evalf()
0.682689492137086
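
As a rough illustration of the cdf(b)-cdf(a) identity, the following sketch recomputes the one-standard-deviation probability with the standard library. The function name interval_probability is hypothetical, and statistics.NormalDist stands in for the deprecated Normal class.

```python
import math
from statistics import NormalDist


def interval_probability(dist, a, b):
    """Probability mass on [a, b], computed as cdf(b) - cdf(a)."""
    return dist.cdf(b) - dist.cdf(a)


std_normal = NormalDist(0, 1)

# Numeric counterpart of N.probability(-1, 1).
p_one_sigma = interval_probability(std_normal, -1, 1)

# The closed form printed above: erf(sqrt(2)/2).
closed_form = math.erf(math.sqrt(2) / 2)

print(p_one_sigma, closed_form)
```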

You can also generate a symmetric confidence interval from a given desired confidence level (given as a fraction between 0 and 1). For the normal distribution, 68%, 95% and 99.7% confidence levels respectively correspond to approximately 1, 2 and 3 standard deviations:

>>> N = Normal(0, 1)
>>> N.confidence(0.68)
(-0.994457883209753, 0.994457883209753)
>>> N.confidence(0.95)
(-1.95996398454005, 1.95996398454005)
>>> N.confidence(0.997)
(-2.96773792534178, 2.96773792534178)

Plug the interval back in to see that the value is correct:

>>> N.probability(*N.confidence(0.95)).evalf()
0.950000000000000
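
One way to obtain such an interval is to invert the CDF at the two tail probabilities. The sketch below does this with statistics.NormalDist.inv_cdf (Python 3.8 or later). It is an assumption about how a symmetric interval can be computed, not a description of how confidence is implemented, and symmetric_confidence is a hypothetical helper name.

```python
from statistics import NormalDist


def symmetric_confidence(dist, p):
    """Return (lo, hi) centred on the mean with total probability p, for 0 < p < 1."""
    tail = (1 - p) / 2  # probability left outside the interval on each side
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


std_normal = NormalDist(0, 1)

lo68, hi68 = symmetric_confidence(std_normal, 0.68)
lo95, hi95 = symmetric_confidence(std_normal, 0.95)
lo997, hi997 = symmetric_confidence(std_normal, 0.997)

# Plug the 95% interval back into the CDF, as in the example above.
recovered = std_normal.cdf(hi95) - std_normal.cdf(lo95)

print((lo68, hi68), (lo95, hi95), (lo997, hi997), recovered)
```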

Other distributions

Besides the normal distribution, uniform continuous distributions are also supported. Uniform(a, b) represents the distribution with uniform probability on the interval [a, b] and zero probability everywhere else. The Uniform class supports the same methods as the Normal class.
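
For the uniform case the density and the cumulative distribution have simple closed forms. The sketch below spells them out with exact fractions. The function names are hypothetical, integer endpoints are assumed, and treating the endpoints a and b as part of the support is an assumption based on the closed interval [a, b] described above.

```python
from fractions import Fraction


def uniform_pdf(x, a, b):
    """Density of the uniform distribution on [a, b]: 1/(b - a) inside, 0 outside."""
    return Fraction(1, b - a) if a <= x <= b else Fraction(0)


def uniform_cdf(x, a, b):
    """Cumulative distribution: rises linearly from 0 at a to 1 at b."""
    if x <= a:
        return Fraction(0)
    if x >= b:
        return Fraction(1)
    return Fraction(x - a, b - a)


print(uniform_pdf(1, 1, 5), uniform_cdf(2, 1, 5), uniform_cdf(4, 1, 5))
```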

Additional distributions, including support for arbitrary user-defined distributions, are planned for the future.

API Reference

Sample

class sympy.statistics.distributions.Sample

Sample([x1, x2, x3, ...]) represents a collection of samples. Sample parameters like mean, variance and stddev can be accessed as properties. The sample will be sorted.

Examples

>>> from sympy.statistics.distributions import Sample
>>> Sample([0, 1, 2, 3])
Sample([0, 1, 2, 3])
>>> Sample([8, 3, 2, 4, 1, 6, 9, 2])
Sample([1, 2, 2, 3, 4, 6, 8, 9])
>>> s = Sample([1, 2, 3, 4, 5])
>>> s.mean
3
>>> s.stddev
sqrt(2)
>>> s.median
3
>>> s.variance
2
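
The values above (variance 2 and standard deviation sqrt(2) for [1, 2, 3, 4, 5]) are consistent with the population formulas, which divide by n rather than n - 1. The following standard-library sketch makes that comparison. It is a cross-check on the numbers, not the Sample implementation.

```python
import statistics

# Sorting reproduces the ordering shown for Sample([8, 3, 2, 4, 1, 6, 9, 2]).
ordered = sorted([8, 3, 2, 4, 1, 6, 9, 2])

values = [1, 2, 3, 4, 5]
mean = statistics.mean(values)
median = statistics.median(values)

# Population variance and standard deviation (divide by n).
population_variance = statistics.pvariance(values)
population_stddev = statistics.pstdev(values)

# Sample variance (divide by n - 1), shown only for contrast.
sample_variance = statistics.variance(values)

print(ordered, mean, median, population_variance, population_stddev, sample_variance)
```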

Continuous Probability Distributions

class sympy.statistics.distributions.ContinuousProbability

Base class for continuous probability distributions

probability(s, a, b)

Calculate the probability that a random number x generated from the distribution satisfies a <= x <= b.

Examples

>>> from sympy.statistics import Normal
>>> from sympy.core import oo
>>> Normal(0, 1).probability(-1, 1)
erf(sqrt(2)/2)
>>> Normal(0, 1).probability(1, oo)
-erf(sqrt(2)/2)/2 + 1/2

random(s, n=None)

random() – generate a random number from the distribution. random(n) – generate a Sample of n random numbers.

Examples

>>> from sympy.statistics import Uniform
>>> x = Uniform(1, 5).random()
>>> x < 5 and x > 1
True
>>> x = Uniform(-4, 2).random()
>>> x < 2 and x > -4
True

class sympy.statistics.distributions.Normal(mu, sigma)

Normal(mu, sigma) represents the normal or Gaussian distribution with mean value mu and standard deviation sigma.

Examples

>>> from sympy.statistics import Normal
>>> from sympy import oo
>>> N = Normal(1, 2)
>>> N.mean
1
>>> N.variance
4
>>> N.probability(-oo, 1)   # probability on an interval
1/2
>>> N.probability(1, oo)
1/2
>>> N.probability(-oo, oo)
1
>>> N.probability(-1, 3)
erf(sqrt(2)/2)
>>> _.evalf()
0.682689492137086

cdf(s, x)

Return the cumulative distribution function as an expression in x

Examples

>>> from sympy.statistics import Normal
>>> Normal(1, 2).cdf(0)
-erf(sqrt(2)/4)/2 + 1/2
>>> from sympy.abc import x
>>> Normal(1, 2).cdf(x)
erf(sqrt(2)*(x - 1)/4)/2 + 1/2

confidence(s, p)

Return a symmetric (p*100)% confidence interval. For example, p=0.95 gives a 95% confidence interval. Currently this function only handles numerical values except in the trivial case p=1.

For example, one standard deviation:

>>> from sympy.statistics import Normal
>>> N = Normal(0, 1)
>>> N.confidence(0.68)
(-0.994457883209753, 0.994457883209753)
>>> N.probability(*_).evalf()
0.680000000000000

Two standard deviations:

>>> N = Normal(0, 1)
>>> N.confidence(0.95)
(-1.95996398454005, 1.95996398454005)
>>> N.probability(*_).evalf()
0.950000000000000

static fit(sample)

Create a normal distribution fit to the mean and standard deviation of the given distribution or sample.

Examples

>>> from sympy.statistics import Normal
>>> Normal.fit([1,2,3,4,5])
Normal(3, sqrt(2))
>>> from sympy.abc import x, y
>>> Normal.fit([x, y])
Normal(x/2 + y/2, sqrt((-x/2 + y/2)**2/2 + (x/2 - y/2)**2/2))
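
Judging from the outputs above, the fit appears to use the mean together with the population standard deviation (dividing by n). The sketch below reproduces the numeric case under that assumption. fit_normal is a hypothetical helper, and statistics.NormalDist (Python 3.8 or later) stands in for the deprecated Normal class.

```python
import math
import statistics
from statistics import NormalDist


def fit_normal(sample):
    """Moment fit: match the sample mean and the population standard deviation."""
    mu = statistics.fmean(sample)
    sigma = statistics.pstdev(sample)  # divides by n, not n - 1
    return NormalDist(mu, sigma)


fitted = fit_normal([1, 2, 3, 4, 5])

# Compare with Normal(3, sqrt(2)) from the example above.
print(fitted.mean, fitted.stdev, math.sqrt(2))
```

Note that NormalDist.from_samples would give a different result here, because it uses the sample standard deviation (dividing by n - 1).
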
pdf(s, x)

Return the probability density function as an expression in x

Examples

>>> from sympy.statistics import Normal
>>> Normal(1, 2).pdf(0)
sqrt(2)*exp(-1/8)/(4*sqrt(pi))
>>> from sympy.abc import x
>>> Normal(1, 2).pdf(x)
sqrt(2)*exp(-(x - 1)**2/8)/(4*sqrt(pi))

class sympy.statistics.distributions.Uniform(a, b)

Uniform(a, b) represents a probability distribution with uniform probability density on the interval [a, b] and zero density everywhere else.

cdf(s, x)

Return the cumulative distribution function as an expression in x

Examples

>>> from sympy.statistics import Uniform
>>> Uniform(1, 5).cdf(2)
1/4
>>> Uniform(1, 5).cdf(4)
3/4

confidence(s, p)

Generate a symmetric (p*100)% confidence interval.

>>> from sympy import Rational
>>> from sympy.statistics import Uniform
>>> U = Uniform(1, 2)
>>> U.confidence(1)
(1, 2)
>>> U.confidence(Rational(1,2))
(5/4, 7/4)
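
For a uniform distribution, a symmetric interval with probability p is the middle fraction p of [a, b]. The sketch below computes it with exact fractions and reproduces the two results above. The name uniform_confidence is hypothetical, and the code illustrates the arithmetic only.

```python
from fractions import Fraction


def uniform_confidence(a, b, p):
    """Middle interval of [a, b] that carries probability p, for 0 <= p <= 1."""
    a, b, p = Fraction(a), Fraction(b), Fraction(p)
    centre = (a + b) / 2
    half_width = p * (b - a) / 2
    return centre - half_width, centre + half_width


full = uniform_confidence(1, 2, 1)
half = uniform_confidence(1, 2, Fraction(1, 2))

print(full, half)
```
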
static fit(sample)

Create a uniform distribution fit to the mean and standard deviation of the given distribution or sample.

Examples

>>> from sympy.statistics import Uniform
>>> Uniform.fit([1, 2, 3, 4, 5])
Uniform(-sqrt(6) + 3, sqrt(6) + 3)
>>> Uniform.fit([1, 2])
Uniform(-sqrt(3)/2 + 3/2, sqrt(3)/2 + 3/2)
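
These results are consistent with matching moments: a uniform distribution on [a, b] has mean (a + b)/2 and variance (b - a)**2/12, so the half-width (b - a)/2 equals sqrt(3*variance). The sketch below applies that relation with the population variance. fit_uniform is a hypothetical helper, and the choice of population variance is inferred from the outputs above.

```python
import math
import statistics


def fit_uniform(sample):
    """Return (a, b) so that Uniform(a, b) matches the sample mean and population variance."""
    mean = statistics.fmean(sample)
    half_width = math.sqrt(3 * statistics.pvariance(sample))
    return mean - half_width, mean + half_width


a5, b5 = fit_uniform([1, 2, 3, 4, 5])  # compare with (3 - sqrt(6), 3 + sqrt(6))
a2, b2 = fit_uniform([1, 2])           # compare with (3/2 - sqrt(3)/2, 3/2 + sqrt(3)/2)

print((a5, b5), (a2, b2))
```
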
pdf(s, x)

Return the probability density function as an expression in x

Examples

>>> from sympy.statistics import Uniform
>>> Uniform(1, 5).pdf(1)
1/4
>>> Uniform(2, 4).pdf(2)
1/2

class sympy.statistics.distributions.PDF(func, (x, a, b))

PDF(func, (x, a, b)) represents a continuous probability distribution with probability density function func(x) on the interval (a, b).

If func is not normalized so that integrate(func, (x, a, b)) == 1, it can be normalized using the PDF.normalize() method.

Examples

>>> from sympy import Symbol, exp, oo
>>> from sympy.statistics.distributions import PDF
>>> from sympy.abc import x
>>> a = Symbol('a', positive=True)
>>> exponential = PDF(exp(-x/a)/a, (x,0,oo))
>>> exponential.pdf(x)
exp(-x/a)/a
>>> exponential.cdf(x)
1 - exp(-x/a)
>>> exponential.mean
a
>>> exponential.variance
a**2

cdf(x)

Return the cumulative distribution function as an expression in x

Examples

>>> from sympy.statistics.distributions import PDF
>>> from sympy import exp, oo
>>> from sympy.abc import x, y
>>> PDF(exp(-x/y), (x,0,oo)).cdf(4)
y - y*exp(-4/y)
>>> PDF(2*x + y, (x, 10, oo)).cdf(0)
-10*y - 100

normalize()

Normalize the probability distribution function so that integrate(self.pdf(x), (x, a, b)) == 1

Examples

>>> from sympy import Symbol, exp, oo
>>> from sympy.statistics.distributions import PDF
>>> from sympy.abc import x
>>> a = Symbol('a', positive=True)
>>> exponential = PDF(exp(-x/a), (x,0,oo))
>>> exponential.normalize().pdf(x)
exp(-x/a)/a
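
Normalization amounts to dividing the function by its integral over the interval. The sketch below shows a numeric analogue using only the standard library: it is not how PDF.normalize works symbolically. The helper names, the midpoint rule, the value chosen for the scale, and the finite cutoff standing in for oo are all assumptions made for this illustration.

```python
import math


def integrate_midpoint(func, a, b, steps=100_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return width * sum(func(a + (i + 0.5) * width) for i in range(steps))


def normalize(func, a, b):
    """Return func divided by its integral over [a, b]."""
    total = integrate_midpoint(func, a, b)
    return lambda x: func(x) / total


scale = 2.0         # plays the role of the positive symbol a
upper = 60 * scale  # finite stand-in for oo; the tail beyond it is about exp(-60)


def unnormalized(x):
    return math.exp(-x / scale)


pdf = normalize(unnormalized, 0.0, upper)

# The normalized density at 0 should be close to 1/scale, matching exp(-x/a)/a at x = 0.
print(pdf(0.0), 1 / scale)
```
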
transform(func, var)

Return a probability distribution of the random variable func(x). Currently only some simple injective functions are supported.

Examples

>>> from sympy.statistics.distributions import PDF
>>> from sympy import oo
>>> from sympy.abc import x, y
>>> PDF(2*x + y, (x, 10, oo)).transform(x, y)
PDF(0, ((_w,), x, x))