Numerical characteristics of a system of two random variables. Covariance and correlation coefficient

Above we became acquainted with the distribution laws of random variables. Each distribution law describes a random variable exhaustively and makes it possible to calculate the probability of any event associated with that variable. In many practical problems, however, such a complete description is unnecessary: it is often enough to indicate a few numerical parameters that capture the essential features of the distribution, for example the average around which the values of the random variable are scattered, and some number characterizing the magnitude of that scatter. Such numbers, intended to express the most significant features of a distribution in concise form, are called the numerical characteristics of a random variable.

Among the numerical characteristics of random variables, we first consider those that fix the position of the random variable on the numerical axis, i.e. some average value around which its possible values are grouped. Of the position characteristics in probability theory, the greatest role is played by the mathematical expectation, which is sometimes simply called the mean of the random variable.

Let us assume that the discrete random variable ξ takes the values x_1, x_2, ..., x_n with probabilities p_1, p_2, ..., p_n, i.e. is given by a distribution series.

Suppose that in N experiments the value x_1 was observed N_1 times, the value x_2 was observed N_2 times, ..., the value x_n was observed N_n times, where N_1 + N_2 + ... + N_n = N.

The arithmetic mean of the observation results is

(x_1·N_1 + x_2·N_2 + ... + x_n·N_n)/N = x_1·(N_1/N) + x_2·(N_2/N) + ... + x_n·(N_n/N).

If N is large, i.e. N → ∞, the frequencies N_i/N approach the probabilities p_i, so that

x_1·(N_1/N) + ... + x_n·(N_n/N) → x_1·p_1 + ... + x_n·p_n,

describing the center of the distribution. The average value of a random variable obtained in this way will be called the mathematical expectation. Let us give a verbal formulation of the definition.

Definition 3.8. The mathematical expectation (ME) of a discrete random variable ξ is the number equal to the sum of the products of all its possible values and the probabilities of these values (notation Mξ):

Mξ = x_1·p_1 + x_2·p_2 + ... + x_n·p_n = Σ_{i=1}^{n} x_i·p_i.

Now consider the case when the number of possible values of the discrete random variable ξ is countable, i.e. we have x_1, x_2, ..., x_n, ... with probabilities p_1, p_2, ..., p_n, ...

The formula for the mathematical expectation remains the same, only the upper limit n of the sum is replaced by ∞, i.e.

Mξ = Σ_{i=1}^{∞} x_i·p_i.

In this case, we already get a series that may diverge, i.e. the corresponding random variable ξ may not have a mathematical expectation.

Example 3.8. A random variable ξ is given by the distribution series

Let us find the ME of this random variable.

Solution. By definition Mξ = Σ_i x_i·p_i; for this series the sum does not converge absolutely, i.e. Mξ does not exist.

Thus, in the case of a countable number of values of a random variable, we arrive at the following definition.

Definition 3.9. The mathematical expectation, or mean value, of a discrete random variable having a countable number of values is the number equal to the sum of the series of the products of all its possible values by the corresponding probabilities, provided that this series converges absolutely, i.e.

Mξ = Σ_{i=1}^{∞} x_i·p_i, where Σ_{i=1}^{∞} |x_i|·p_i < ∞.

If this series diverges or converges only conditionally, then we say that the random variable ξ does not have a mathematical expectation.

Let us move from a discrete random variable to a continuous one with density p(x).

Definition 3.10. The mathematical expectation, or mean value, of a continuous random variable ξ with density p(x) is the number

Mξ = ∫_{−∞}^{+∞} x·p(x) dx,

provided that this integral converges absolutely.

If this integral diverges or converges conditionally, then we say that the continuous random variable has no mathematical expectation.
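As a numerical sketch of Definition 3.10 (the density here is an assumed exponential one, p(x) = λe^(−λx) for x ≥ 0, not an example from the text), the integral of x·p(x) can be approximated by the trapezoidal rule:

```python
import math

lam = 2.0  # assumed rate parameter; the true expectation is 1/lam = 0.5

def p(x):
    # exponential density p(x) = lam * exp(-lam * x), x >= 0
    return lam * math.exp(-lam * x)

# Trapezoidal approximation of the integral of x * p(x) over [0, 50];
# the tail beyond 50 is negligible for this density.
n = 200_000
a, b = 0.0, 50.0
h = (b - a) / n
mx = 0.5 * h * sum(
    (a + i * h) * p(a + i * h) * (2 if 0 < i < n else 1)
    for i in range(n + 1)
)
print(mx)  # approximately 0.5
```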

Remark 3.8. If all possible values of the random variable ξ belong to the interval (a; b), then a ≤ Mξ ≤ b.

The mathematical expectation is not the only position characteristic used in probability theory. Others, for example the mode and the median, are sometimes used as well.

Definition 3.11. The mode of a random variable ξ (notation Mo ξ) is its most probable value, i.e. the value at which the probability p_i (for a discrete variable) or the probability density p(x) (for a continuous one) reaches its greatest value.

Definition 3.12. The median of a random variable ξ (notation Me ξ) is the value for which P(ξ < Me ξ) = P(ξ > Me ξ) = 1/2.

Geometrically, for a continuous random variable, the median is the abscissa of the point on the Ox axis for which the areas lying to its left and right are the same and equal to 1/2.

Example 3.9. A random variable ξ has the distribution series

x_i | 0   | 1   | 2   | 3
p_i | 0.1 | 0.3 | 0.5 | 0.1

Let us find the mathematical expectation, mode and median of ξ.

Solution. Mξ = 0·0.1 + 1·0.3 + 2·0.5 + 3·0.1 = 1.6; Mo ξ = 2; the median Me ξ does not exist.
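The arithmetic of Example 3.9 can be reproduced directly; the distribution series below is the one used in the solution:

```python
values = [0, 1, 2, 3]
probs = [0.1, 0.3, 0.5, 0.1]

# Mathematical expectation: the sum of values times probabilities.
m = sum(x * p for x, p in zip(values, probs))

# Mode: the most probable value.
mode = max(zip(values, probs), key=lambda vp: vp[1])[0]

print(m)     # the expectation, 1.6 up to floating-point rounding
print(mode)  # 2
```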

Example 3.10. A continuous random variable ξ has the density

Let's find the mathematical expectation, median and mode.

Solution.

The mode is the point at which p(x) reaches its maximum. Obviously the median is equal to the same value, since the areas to the right and to the left of the vertical line through this point are equal.

In addition to position characteristics, a number of numerical characteristics for various purposes are used in probability theory. Among them, the initial and central moments are of particular importance.

Definition 3.13. The initial moment of the k-th order of a random variable ξ is the mathematical expectation of the k-th power of this variable: α_k = M(ξ^k).

From the definitions of the mathematical expectation for discrete and continuous random variables it follows that

α_k = Σ_i x_i^k·p_i (discrete case),  α_k = ∫_{−∞}^{+∞} x^k·p(x) dx (continuous case).
Remark 3.9. Obviously, the initial moment of the 1st order is the mathematical expectation: α_1 = Mξ.

Before defining the central moment, we introduce a new concept of a centered random variable.

Definition 3.14. A centered random variable is the deviation of a random variable from its mathematical expectation, i.e. ξ − Mξ.

It is easy to verify that M(ξ − Mξ) = 0.

Centering a random variable is obviously equivalent to moving the origin to the point Mξ. The moments of a centered random variable are called central moments.

Definition 3.15. The central moment of the k-th order of a random variable ξ is the mathematical expectation of the k-th power of the centered random variable:

μ_k = M[(ξ − Mξ)^k].

From the definition of mathematical expectation it follows that

μ_k = Σ_i (x_i − Mξ)^k·p_i (discrete case),  μ_k = ∫_{−∞}^{+∞} (x − Mξ)^k·p(x) dx (continuous case).
Obviously, for any random variable ξ the central moment of the 1st order is equal to zero: μ_1 = M(ξ − Mξ) = 0.

The second central moment μ_2 is of particular importance in practice. It is called the variance.

Definition 3.16. The variance of a random variable ξ is the mathematical expectation of the square of the corresponding centered variable (notation Dξ):

Dξ = M[(ξ − Mξ)²].

To calculate the variance, the following formulas can be obtained directly from the definition:

Dξ = Σ_i (x_i − Mξ)²·p_i (discrete case),  (3.4)
Dξ = ∫_{−∞}^{+∞} (x − Mξ)²·p(x) dx (continuous case).
Transforming formula (3.4), we can obtain the following convenient formula for calculating Dξ:

Dξ = M(ξ²) − (Mξ)².
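The definition formula and the shortcut formula can be compared on a small made-up distribution series (the numbers are illustrative, not from the text):

```python
values = [1, 2, 5]
probs = [0.2, 0.5, 0.3]

m = sum(x * p for x, p in zip(values, probs))

# Variance by the definition D = M[(x - M)^2] ...
d_def = sum((x - m) ** 2 * p for x, p in zip(values, probs))
# ... and by the shortcut D = M(x^2) - (Mx)^2.
d_short = sum(x * x * p for x, p in zip(values, probs)) - m ** 2

print(abs(d_def - d_short) < 1e-9)  # True: the two formulas agree
```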

The variance of a random variable is a characteristic of dispersion, the scattering of the values of the random variable around its mathematical expectation.

The variance has the dimension of the square of a random variable, which is not always convenient. Therefore, for clarity, it is convenient to use as a characteristic of dispersion a number whose dimension coincides with the dimension of the random variable. To do this, the square root of the variance is taken. The resulting value is called the standard deviation of the random variable. We will denote it σ: σ = √Dξ.

For a non-negative random variable ξ, the coefficient of variation, equal to the ratio of the standard deviation to the mathematical expectation, is sometimes used as a characteristic:

v = σ / Mξ.

Knowing the mathematical expectation and the standard deviation of a random variable, one can get an approximate idea of the range of its possible values. In many cases we may assume that the values of the random variable ξ only occasionally fall outside the interval Mξ ± 3σ. This rule, which we will justify later for the normal distribution, is called the three sigma rule.
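A Monte Carlo sketch of the three sigma rule for a normal random variable (the parameters m and σ below are arbitrary):

```python
import random

random.seed(1)
m, sigma = 5.0, 2.0
n = 100_000

# Count how many normal samples fall inside the interval m +/- 3*sigma.
inside = sum(
    1 for _ in range(n)
    if abs(random.gauss(m, sigma) - m) <= 3 * sigma
)
share = inside / n
print(share)  # close to the theoretical 0.9973
```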

Expectation and variance are the most commonly used numerical characteristics of a random variable. From the definition of mathematical expectation and dispersion, some simple and fairly obvious properties of these numerical characteristics follow.

The simplest properties of mathematical expectation and variance.

1. The mathematical expectation of a non-random value c is equal to c itself: M(c) = c.

Indeed, since the value c takes the single value c with probability 1, M(c) = c·1 = c.

2. The variance of the non-random quantity c is equal to zero, i.e. D(c) = 0.

Indeed, D(c) = M(c − Mc)² = M(c − c)² = M(0) = 0.

3. A non-random multiplier can be taken outside the sign of the mathematical expectation: M(cξ) = c·M(ξ).

Let us demonstrate the validity of this property using the example of a discrete SV.

Let the random variable ξ be given by a distribution series with values x_i and probabilities p_i. Then the variable cξ takes the values c·x_i with the same probabilities p_i, so

M(cξ) = Σ_i c·x_i·p_i = c·Σ_i x_i·p_i = c·Mξ.

The property is proved similarly for a continuous random variable.

4. A non-random multiplier can be taken outside the sign of the variance by squaring it:

D(cξ) = c²·Dξ.

The more moments of a random variable are known, the more detailed an understanding of the distribution law we have.

In probability theory and its applications, two more numerical characteristics of a random variable are used, based on the central moments of the 3rd and 4th orders: the skewness (asymmetry coefficient) and the kurtosis.

For a discrete random variable, the expected value is the sum of the products of its possible values by the corresponding probabilities:

M(X) = Σ_i x_i·p_i.

The mode (Mod) of a random variable X is its most probable value.

For a discrete random variable this is the value with the greatest probability; for a continuous random variable it is the point where the probability density reaches its maximum.


[Figures: a unimodal distribution and a multimodal distribution]

In general, the mode and the expected value do not coincide.

The median (Med) of a random variable X is a value for which P(X < Med) = P(X > Med). Any distribution can have only one median.


Med divides the area under the distribution curve into two equal parts. In the case of a unimodal and symmetric distribution, the median, the mode and the mathematical expectation coincide.
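To illustrate how the median splits the area under the density in half, one can solve F(Med) = 1/2 by bisection; the exponential distribution used below (with CDF F(x) = 1 − e^(−λx) and known median ln 2 / λ) is an assumption for the sake of the example:

```python
import math

lam = 0.5  # assumed rate parameter

def cdf(x):
    # CDF of the exponential distribution: F(x) = 1 - exp(-lam * x)
    return 1.0 - math.exp(-lam * x)

# Bisection on F(x) = 1/2: the median leaves area 1/2 on each side.
lo, hi = 0.0, 100.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if cdf(mid) < 0.5:
        lo = mid
    else:
        hi = mid
median = 0.5 * (lo + hi)

print(median)  # close to ln(2) / lam, about 1.3863
```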

Moments.

Most often in practice, moments of two types are used: initial and central.

The initial moment of order s of a discrete random variable X is the sum

α_s = Σ_i x_i^s·p_i.

For a continuous random variable X, the initial moment of order s is the integral α_s = ∫_{−∞}^{+∞} x^s·f(x) dx. It is obvious that the mathematical expectation of a random variable is its first initial moment.

Using the sign (operator) M, the initial moment of order s can be represented as the mathematical expectation of the s-th power of the random variable: α_s = M(X^s).

The centered random variable corresponding to a random variable X is the deviation of X from its mathematical expectation: X − M(X).

The mathematical expectation of a centered random variable is 0.

For discrete random variables we have:

M(X − M(X)) = Σ_i (x_i − M(X))·p_i = Σ_i x_i·p_i − M(X)·Σ_i p_i = M(X) − M(X) = 0.
The moments of a centered random variable are called central moments.

The central moment of order s of a random variable X is the mathematical expectation of the s-th power of the corresponding centered random variable:

μ_s = M[(X − M(X))^s].

For discrete random variables: μ_s = Σ_i (x_i − M(X))^s·p_i.

For continuous random variables: μ_s = ∫_{−∞}^{+∞} (x − M(X))^s·f(x) dx.

Relationship between central and initial moments of different orders: μ_1 = 0, μ_2 = α_2 − α_1², μ_3 = α_3 − 3·α_1·α_2 + 2·α_1³.
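Two standard identities linking the moments, μ₂ = α₂ − α₁² and μ₃ = α₃ − 3α₁α₂ + 2α₁³, can be checked numerically on a made-up distribution series:

```python
values = [0, 1, 3]
probs = [0.5, 0.3, 0.2]

def alpha(s):
    # initial moment of order s
    return sum(x ** s * p for x, p in zip(values, probs))

def mu(s):
    # central moment of order s
    m = alpha(1)
    return sum((x - m) ** s * p for x, p in zip(values, probs))

a1, a2, a3 = alpha(1), alpha(2), alpha(3)
ok2 = abs(mu(2) - (a2 - a1 ** 2)) < 1e-9
ok3 = abs(mu(3) - (a3 - 3 * a1 * a2 + 2 * a1 ** 3)) < 1e-9
print(ok2, ok3)  # True True
```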

Of all the moments, the first moment (mathematical expectation) and the second central moment are most often used as a characteristic of a random variable.

The second central moment is called the variance of the random variable and is denoted D(X).

By definition, D(X) = μ_2 = M[(X − M(X))²].

For a discrete random variable: D(X) = Σ_i (x_i − M(X))²·p_i.

For a continuous random variable: D(X) = ∫_{−∞}^{+∞} (x − M(X))²·f(x) dx.

The variance of a random variable is a characteristic of the dispersion (scattering) of the values of X around its mathematical expectation.

The word dispersion means scattering. The variance has the dimension of the square of the random variable.

To characterize the scattering visually, it is more convenient to use a quantity whose dimension coincides with that of the random variable. For this purpose the square root of the variance is taken, and the resulting value is called the standard deviation (SD) of the random variable X, with the notation σ(X) = √D(X).

The standard deviation is sometimes called the “standard” of the random variable X.

A normalized random variable V is the ratio of a given random variable X to its standard deviation σ:

V = X/σ.

(The centered and normalized variable (X − M(X))/σ has a variance equal to 1 and a mathematical expectation equal to 0.)

Standard deviation is the square root of the variance

The mathematical expectation and variance of the normalized random variable V are expressed through the characteristics of X as follows:

MV = M(X)/σ = 1/v,  DV = 1,

where v is the coefficient of variation of the original random variable X.

For the distribution function F V (x) and the distribution density f V (x) we have:

F_V(x) = F(σx),  f_V(x) = σ·f(σx),

where F(x) is the distribution function of the original random variable X, and f(x) is its probability density.
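The relations MV = 1/v and DV = 1 for V = X/σ can be checked on a small discrete example (the distribution series is made up):

```python
import math

values = [2.0, 4.0, 7.0]
probs = [0.3, 0.4, 0.3]

mx = sum(x * p for x, p in zip(values, probs))
dx = sum((x - mx) ** 2 * p for x, p in zip(values, probs))
sigma = math.sqrt(dx)
v = sigma / mx  # coefficient of variation of X

# Distribution series of the normalized variable V = X / sigma.
v_values = [x / sigma for x in values]
mv = sum(x * p for x, p in zip(v_values, probs))
dv = sum((x - mv) ** 2 * p for x, p in zip(v_values, probs))

print(abs(mv - 1 / v) < 1e-9, abs(dv - 1.0) < 1e-9)  # True True
```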

Correlation coefficient.

Correlation coefficient is an indicator of the nature of the mutual stochastic influence of changes in two random variables. The correlation coefficient can take values ​​from -1 to +1. If the absolute value is closer to 1, then this means the presence of a strong connection, and if closer to 0, the connection is absent or is significantly nonlinear. When the correlation coefficient is equal in modulus to one, we speak of a functional relationship (namely a linear dependence), that is, changes in two quantities can be described by a linear function.

A process is called stochastic if it is described by random variables whose values change over time.

Pearson correlation coefficient.

For metric quantities, the Pearson correlation coefficient is used; the underlying idea of correlation goes back to Francis Galton. Let X and Y be two random variables defined on the same probability space. Then their correlation coefficient is given by the formula

ρ(X, Y) = cov(X, Y)/(σ_X·σ_Y),  where cov(X, Y) = M[(X − M(X))·(Y − M(Y))].
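A from-scratch computation of the sample Pearson coefficient r = cov(X, Y)/(s_X·s_Y); the paired data below are made up for illustration:

```python
import math

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.9]  # roughly linear in xs

n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
r = cov / (sx * sy)

print(r)  # close to +1: an almost perfectly linear relationship
```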

Chebyshev's inequalities.

Markov's inequality.

Markov's inequality in probability theory gives an estimate of the probability that a random variable exceeds a fixed positive constant in absolute value, in terms of its mathematical expectation. The resulting estimate is usually quite rough, but it allows one to get some idea of the distribution when the latter is not known explicitly.

Let a random variable X be defined on a probability space and let its mathematical expectation M|X| be finite. Then

P(|X| ≥ a) ≤ M|X| / a,

where a > 0.
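The bound is easy to verify exhaustively on a small non-negative distribution series (the numbers are illustrative):

```python
values = [0, 1, 2, 10]
probs = [0.4, 0.3, 0.2, 0.1]

# M|X| for this (non-negative) random variable.
m_abs = sum(abs(x) * p for x, p in zip(values, probs))

# Check P(|X| >= a) <= M|X| / a for several positive constants a.
ok = all(
    sum(p for x, p in zip(values, probs) if abs(x) >= a) <= m_abs / a + 1e-12
    for a in (0.5, 1, 2, 5, 10)
)
print(ok)  # True: Markov's bound holds for every a tried
```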

Bienaymé-Chebyshev inequality.

If M(X²) < ∞ (M is the mathematical expectation), then for any ε > 0

P(|X − M(X)| ≥ ε) ≤ D(X)/ε².
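The same kind of exhaustive check works for the Chebyshev bound (illustrative numbers again):

```python
values = [0, 1, 2, 10]
probs = [0.4, 0.3, 0.2, 0.1]

m = sum(x * p for x, p in zip(values, probs))
d = sum((x - m) ** 2 * p for x, p in zip(values, probs))

# Check P(|X - M(X)| >= eps) <= D(X) / eps^2 for several eps > 0.
ok = all(
    sum(p for x, p in zip(values, probs) if abs(x - m) >= eps) <= d / eps ** 2 + 1e-12
    for eps in (1.0, 2.0, 5.0, 9.0)
)
print(ok)  # True: the Chebyshev bound holds for every eps tried
```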

Law of large numbers.

The law of large numbers states that the empirical mean (arithmetic mean) of a sufficiently large finite sample from a fixed distribution is close to the theoretical mean (mathematical expectation) of that distribution. Depending on the type of convergence, a distinction is made between the weak law of large numbers, where convergence in probability occurs, and the strong law of large numbers, where convergence occurs almost surely.



There will always be a number of trials such that, with any probability given in advance, the frequency of occurrence of some event differs as little as desired from its probability. The general meaning of the law of large numbers is that the combined action of a large number of random factors leads to a result that is almost independent of chance.

The weak law of large numbers.

If X_1, X_2, ... are independent identically distributed random variables with finite mathematical expectation M(X), and S_n = (X_1 + ... + X_n)/n is their arithmetic mean, then S_n → M(X) in probability.

The strong law of large numbers.

Under the same assumptions, S_n → M(X) almost surely.
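A simulation sketch of the law of large numbers for fair die rolls, whose expectation is M(X) = 3.5:

```python
import random

random.seed(7)
n = 200_000

# Arithmetic mean S_n of n i.i.d. rolls of a fair six-sided die.
s_n = sum(random.randint(1, 6) for _ in range(n)) / n
print(s_n)  # close to the expectation 3.5
```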

The mathematical expectation of a discrete random variable is the sum of the products of all its possible values and their probabilities:

M(X) = Σ_i x_i·p_i.

Comment. From the definition it follows that the mathematical expectation of a discrete random variable is a non-random (constant) quantity.

The mathematical expectation of a continuous random variable can be calculated using the formula

M(X) = ∫_{−∞}^{+∞} x·f(x) dx,

where f(x) is its probability density.

The mathematical expectation is approximately equal to the arithmetic mean of the observed values of the random variable (the approximation is more accurate the greater the number of trials).

Properties of mathematical expectation.

Property 1. The mathematical expectation of a constant value is equal to the constant itself: M(C) = C.

Property 2. A constant factor can be taken out of the mathematical expectation sign: M(CX) = C·M(X).

Property 3. The mathematical expectation of the product of two independent random variables is equal to the product of their mathematical expectations:

M(XY) = M(X)·M(Y).

Property 4. The mathematical expectation of the sum of two random variables is equal to the sum of the mathematical expectations of the terms:

M(X + Y) = M(X) + M(Y).
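Properties 3 and 4 can be verified by direct summation over the joint distribution of two independent made-up discrete variables:

```python
xs, px = [0, 1, 2], [0.2, 0.5, 0.3]
ys, py = [1, 4], [0.6, 0.4]

mx = sum(x * p for x, p in zip(xs, px))
my = sum(y * p for y, p in zip(ys, py))

# For independent X and Y, P(X = x, Y = y) = P(X = x) * P(Y = y).
m_sum = sum((x + y) * p * q for x, p in zip(xs, px) for y, q in zip(ys, py))
m_prod = sum(x * y * p * q for x, p in zip(xs, px) for y, q in zip(ys, py))

ok_sum = abs(m_sum - (mx + my)) < 1e-12
ok_prod = abs(m_prod - mx * my) < 1e-12
print(ok_sum, ok_prod)  # True True
```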

12.1. Dispersion of a random variable and its properties.

In practice, it is often necessary to find out the scattering of a random variable around its mean value. For example, in artillery it is important to know how closely the shells will fall near the target that is to be hit.

At first glance, it may seem that the easiest way to estimate scattering is to compute all possible deviations of the random variable and then find their average value. However, this path yields nothing, since the average value of the deviation, i.e. M(X − M(X)), is zero for any random variable.

Therefore, a different path is most often taken: the variance is used.

The variance (scattering) of a random variable is the mathematical expectation of the squared deviation of the random variable from its mathematical expectation:

D(X) = M[(X − M(X))²].

To calculate the variance, it is often convenient to use the following theorem.

Theorem. The variance is equal to the difference between the mathematical expectation of the square of the random variable X and the square of its mathematical expectation:

D(X) = M(X²) − [M(X)]².

Properties of dispersion.

Property 1. The variance of a constant value C is equal to zero: D(C) = 0.

Property 2. A constant factor can be taken out of the variance sign by squaring it:

D(CX) = C²·D(X).

Property 3. The variance of the sum of two independent random variables is equal to the sum of the variances of these variables:

D(X + Y) = D(X) + D(Y).

Property 4. The variance of the difference between two independent random variables is equal to the sum of their variances:

D(X − Y) = D(X) + D(Y).
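Properties 3 and 4 of the variance can be checked the same way, by enumerating the joint distribution of two independent made-up variables:

```python
xs, px = [0, 1, 2], [0.2, 0.5, 0.3]
ys, py = [1, 4], [0.6, 0.4]

def mean(vals, probs):
    return sum(v * p for v, p in zip(vals, probs))

def var(vals, probs):
    m = mean(vals, probs)
    return sum((v - m) ** 2 * p for v, p in zip(vals, probs))

dx, dy = var(xs, px), var(ys, py)

# Joint distribution of the independent pair (X, Y), in matching order.
sum_vals = [x + y for x in xs for y in ys]
diff_vals = [x - y for x in xs for y in ys]
joint = [p * q for p in px for q in py]

ok_sum = abs(var(sum_vals, joint) - (dx + dy)) < 1e-9
ok_diff = abs(var(diff_vals, joint) - (dx + dy)) < 1e-9
print(ok_sum, ok_diff)  # True True
```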
