Mathematical methods of forecasting. Successes of modern natural science

Economic and mathematical methods. When using economic and mathematical methods, the structure of models is established and verified experimentally, under conditions that allow objective observation and measurement.

Determining the system of factors and the cause-and-effect structure of the phenomenon under study is the initial stage of mathematical modeling.

Statistical methods occupy a special place in forecasting. The methods of mathematical and applied statistics are used in planning any work on forecasting, in processing data obtained both by intuitive methods, and by using economic and mathematical methods proper. In particular, they are used to determine the number of expert groups, interviewed citizens, the frequency of data collection, and evaluate the parameters of theoretical economic and mathematical models.

Each of these methods has advantages and disadvantages. All forecasting methods complement each other and can be used together.

The scenario method is an effective tool for organizing forecasting that combines qualitative and quantitative approaches.

A scenario is a model of the future that describes the possible course of events and indicates the probabilities of their occurrence. It identifies the main factors to be taken into account and shows how these factors may affect the anticipated events. As a rule, several alternative scenarios are compiled. A scenario is thus a characterization of the future in an exploratory forecast, not a definition of one possible or desired state of the future. Usually the most probable variant is treated as the base scenario, on which decisions are made. The other variants, treated as alternatives, come into play if reality begins to approach their content more closely than that of the base scenario. Scenarios are usually descriptions of events together with estimates of indicators and characteristics over time.

The scenario method was first used to identify possible outcomes of military operations. Later, scenario forecasting came to be used in economic policy, and then in strategic corporate planning. It is now the best-known integrating mechanism for forecasting economic processes in the market. Scenarios are an effective means of overcoming conventional thinking. A scenario analyzes a rapidly changing present and future, and preparing it forces one to deal with details and processes that may be missed when individual forecasting methods are used in isolation. A scenario therefore differs from a simple forecast: it is a tool for determining which types of forecasts should be developed to describe the future with sufficient completeness, taking all the main factors into account.


The use of scenario forecasting in market conditions provides:

better understanding of the situation, its evolution;

assessment of potential threats;

identifying opportunities;

identification of possible and expedient directions of activity;

increasing the level of adaptation to changes in the external environment.

Scenario forecasting is an effective means of preparing planned decisions both at an enterprise and in states.

Planning is closely related to forecasting, and the division between the two processes is to some extent conditional. The same or closely related methods can therefore be used in both planning and forecasting.

Plan approval decisions. Plans are the result of management decisions that are made on the basis of possible planning alternatives. Management decisions are made according to certain criteria. Using these criteria, alternatives are evaluated in terms of achieving one or more goals. Criteria reflect the goals set by decision makers.

A decision based on a single criterion is considered simple, and a decision based on several criteria is considered complex. The criteria, in which quantitative or ordinal rating scales are formulated, make it possible to use mathematical methods of operations research to prepare solutions.

Plan approval decisions tend to be not only complex due to multiple criteria, but downright difficult due to uncertainty, limited information and high responsibility. Therefore, the final decisions on the approval of plans are made by heuristic, intuitive selection from a limited number of pre-prepared alternatives.

Planning methods are thus methods of preparing planning alternatives, or at least one plan option, for approval by a decision maker or body.

Methods for preparing one or more plan variants differ in the techniques used to draw up the plans, in the means and time frames of their possible implementation, and in the planning objects.

Like forecasting, planning can be based on heuristic and mathematical methods. Among the mathematical methods of operations research, methods of optimal planning occupy a special place.

Methods of optimal planning. In solving the problems of preparing optimal, that is, the best according to certain criteria, plans, methods of mathematical programming can be used.

Mathematical programming problems consist in finding the maximum or minimum of a certain function subject to restrictions on the variables, which are the elements of the solution. Many typical mathematical programming problems are known for which effective methods, algorithms and computer programs have been developed, for example:

Mixture problems, which consist in determining a diet of minimum cost made up of different products with different nutrient content, provided that the content of each nutrient in the diet is not lower than a certain level;

Optimal production plan problems, which consist in determining the best plan for the production of goods in terms of sales volume or profit under limited resources or production capacities;

Transport problems, the essence of which is to choose a transportation plan that minimizes transportation costs while delivering the given volumes to consumers at different points, over different possible routes, from different points where stocks or production capacities are limited.
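As a sketch, the mixture (diet) problem described above can be written as a linear program. The symbols are introduced here only for illustration and do not come from the source: x_j is the amount of product j in the diet, c_j its unit cost, a_ij the content of nutrient i per unit of product j, and b_i the minimum required level of nutrient i.

```latex
\min_{x} \; \sum_{j=1}^{n} c_j x_j
\quad \text{subject to} \quad
\sum_{j=1}^{n} a_{ij} x_j \ge b_i \;\; (i = 1, \dots, m), \qquad
x_j \ge 0 \;\; (j = 1, \dots, n)
```

The production plan and transport problems have the same shape: a linear objective with linear constraints on non-negative variables.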

Game theory methods can be used to plan for uncertain weather conditions or the expected timing of natural disasters. These are "games" with a passive "player" who acts regardless of your plans.

Methods have also been developed for solving problems of game theory with active "players" who act in response to the actions of the opposite side. In addition, methods have been developed for solving problems in which the actions of the parties are characterized by certain strategies - sets of action rules. These decisions can be useful when drawing up plans in the face of possible opposition from competitors, diversity in the actions of partners.

Solutions to game theory problems may depend on the level of risk one is willing to accept, or be based simply on obtaining the maximum guaranteed benefit. Solving certain types of simple game theory problems is reduced to solving linear programming problems.
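The "maximum guaranteed benefit" rule can be illustrated with a minimal sketch for a game against a passive player. The payoff table and plan names below are hypothetical, invented only for the example. For each plan the worst outcome is found, and the plan whose worst outcome is best is chosen.

```python
# Hypothetical payoff table: rows are our plans, columns are possible
# states of "nature" (for example, weather). Numbers are illustrative only.
payoffs = {
    "plan_a": [40, 10, 25],
    "plan_b": [30, 20, 22],
    "plan_c": [50, 5, 15],
}


def maximin(table):
    """Return (plan, guaranteed payoff) with the largest worst-case payoff."""
    best_plan = max(table, key=lambda plan: min(table[plan]))
    return best_plan, min(table[best_plan])


plan, guaranteed = maximin(payoffs)
print(plan, guaranteed)
```

A decision maker willing to accept more risk might instead weight the outcomes by estimated probabilities; the maximin rule shown here ignores probabilities altogether.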


In March 2011, the note "Five Ways to Improve Prediction Accuracy" was published. In it, the author, Aleksey Skripchan, considered competently, simply and in sufficient detail the forecasting that must be carried out as part of marketing and planning. His metaphor in the subsection "The Benefits of Better Forecasting" is striking:

Forecasting becomes the rudder that helps a company stay on course, change direction, or navigate unfamiliar waters with confidence…

I would like to add a few words to what has already been said. Above all, it should be noted that the article mentioned deals with expert forecasting. Two types of forecasting need to be distinguished: expert and formalized.

Expert forecasting

Expert forecasting means that future values are formed by an expert, i.e. a person with deep knowledge of a particular area. The expert often uses mathematical tools, but in this type of forecasting they serve only as an auxiliary means of calculation. The basis is the knowledge and intuition of the expert, which is why these methods are sometimes called intuitive.

Expert forecasting is used when the forecasting object is either too simple or, on the contrary, so complex that the influence of external factors cannot be taken into account analytically. Expert forecasting methods do not involve developing forecasting models; they reflect the individual judgments of specialists (experts) about the prospects for the development of the process. They include the following.

  • Method of expert assessments
  • Method of historical analogies
  • Foresight by pattern
  • Fuzzy logic
  • "What if" scenario modeling

Formalized forecasting

Formalized forecasting is forecasting based on a mathematical model which, by capturing the patterns of the process, produces the future values of the process under study as its output. There are quite a lot of such models: according to a number of reviews, there are currently over 100 classes of forecasting models. The number of general classes that are repeated in one variation or another is, of course, much smaller and can easily be reduced to a dozen.

  • Regression models
  • Autoregressive models (AR)
  • Neural network models (artificial neural networks, ANN)
  • Exponential smoothing models (ES)
  • Models based on Markov chains
  • Classification and regression trees (CART)
  • Support vector machines (SVM)
  • Genetic algorithms (GA)
  • Transfer function models (TF)
  • Formalized fuzzy logic (FL)
  • Fundamental models

The author of the article on forecasting in marketing quite rightly noted that "like any tool, mathematics can be dangerous in the hands of an amateur. To check your own calculations, you can involve someone with strong statistical skills to analyze your information." Mathematical forecasting models require developed competencies not only in mathematics but also in programming and in the use of complex statistical packages, in order to create a model that is not only accurate but also fast.

Improving Prediction Accuracy

Of course, the two types of forecasting often work together. For example, the future values of a time series are calculated by a complex algorithm, and then an expert checks these figures for adequacy. At this stage the expert can make manual adjustments which, given high qualification, can improve the quality of the forecast.

In summary, if you need to improve the accuracy of expert forecasting in marketing tasks, follow the recommendations given in the article directly. If your task is to improve the accuracy of forecasting with complex, fast, software-implemented mathematical models, then you should look toward the consensus forecast, that is, a forecast made on the basis of a set of independent forecasts. I will soon discuss the consensus forecast in more detail in this blog.


Using specific examples, the article considers various mathematical methods of forecasting over time, including simple extrapolation, methods based on growth rates, and mathematical modeling. It shows that the choice of method depends on the forecast base, that is, the information for the previous time period.

forecasting

biostatistics

1. Afanasiev V.N., Yuzbashev M.M. Time Series Analysis and Forecasting: A Textbook. - M.: Finance and statistics, 2001. - 228 p.

2. Petri A., Sabin K. Visual statistics in medicine. - M.: GEOTAR-MED, 2003. - 144 p.

3. Sadovnikova N.A., Shmoylova R.A. Time Series Analysis and Forecasting: Textbook. – M.: Ed. Center EAOI, 2001. - 67 p.

Forecasting is usually understood as the process of predicting the future from data about the past, i.e. the development of the phenomenon of interest is studied over time. The predicted value is then considered as a function of time, y=f(t). However, other types of prediction are also considered in medicine: predicting a diagnosis, the diagnostic value of a new test, the change in one factor under the influence of another, etc.

The purpose of the article was to present various forecasting methods and approaches to their correct use in medicine.

Materials and methods of research

The following forecasting methods are considered in the article: simple extrapolation methods, moving average method, exponential smoothing method, average absolute growth method, average growth rate method, forecasting methods based on mathematical models.

Research results and discussion

As already noted, the forecast is based on some information from the past (the forecast base). Before choosing a forecasting method, it is useful to assess, at least qualitatively, the dynamics of the studied quantity at previous moments of time. The graphs in Fig. 1 show that the dynamics can differ.

Fig. 1. Examples of the dynamics of the studied quantity

In the first case (graph A), relative stability is observed, with slight fluctuations around the average value. In the second case (graph B), the dynamics are linearly increasing. In the third case (graph C), the dependence on time is non-linear (exponential). The fourth case (graph D) is an example of complex fluctuations with several components.

The most common short-term forecasting method (1-3 time periods) is extrapolation, which consists in extending previous patterns into the future. The use of extrapolation in forecasting is based on the following assumptions:

The development of the phenomenon under study as a whole is described by a smooth curve;

The general trend in the development of the phenomenon in the past and present will not undergo major changes in the future.

The first of the simple extrapolation methods is the series average method. The predicted level of the quantity under study is taken to be the average of its past levels. The method is used when the average level does not tend to change, or changes only insignificantly (there is no clear trend; Fig. 1, graph A):

$$y_{prog} = \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$

where $y_{prog}$ is the predicted level of the studied quantity, $y_i$ is the value of the i-th level, and n is the forecast base (the number of past levels).

In a sense, the segment of the time series covered by observation can be likened to a sample, so the resulting forecast is a sample estimate for which a confidence interval can be specified. The interval is built from the standard deviation of the time series and $t_\alpha$, the critical value of Student's t distribution for the given significance level and n-1 degrees of freedom.

Example. Table 1 shows the data of the time series y(t). Calculate the predicted value of y at time t = 13 using the series average method.

Table 1. Time series data y(t): average of the first n levels

n | Calculation | Average
4 | (80+98+94+103)/4 | 93.8
5 | (80+98+94+103+84)/5 | 91.8
6 | (80+98+94+103+84+115)/6 | 95.7
7 | (80+98+94+103+84+115+98)/7 | 96.0
8 | (80+98+94+103+84+115+98+113)/8 | 98.1
9 | (80+98+94+103+84+115+98+113+114)/9 | 99.9
10 | (80+98+94+103+84+115+98+113+114+87)/10 | 98.6
11 | (80+98+94+103+84+115+98+113+114+87+107)/11 | 99.4
12 | (80+98+94+103+84+115+98+113+114+87+107+85)/12 | 98.2

The original and smoothed series are shown in Fig. 2; the calculation of y is given in Table 2.

Fig. 2. Initial and smoothed series

Table 2. Confidence interval for the forecast at time t = 13
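The series average example can be reproduced with a short sketch. The interval formula used by the source is not preserved in this text, so the code assumes the textbook interval for a sample mean, mean ± t·s/√n, and an assumed significance level of 0.05. The critical value 2.201 (two-sided, 11 degrees of freedom) is taken from a standard Student's t table.

```python
import math
import statistics

# Levels of the series from the worked example (t = 1..12).
y = [80, 98, 94, 103, 84, 115, 98, 113, 114, 87, 107, 85]
n = len(y)

y_prog = statistics.mean(y)  # forecast for t = 13: the series average
s = statistics.stdev(y)      # sample standard deviation (n - 1 in the denominator)

# Assumption: alpha = 0.05, two-sided, n - 1 = 11 degrees of freedom.
T_CRIT = 2.201

# Assumption: interval for a sample mean; the source's own formula is not shown.
half_width = T_CRIT * s / math.sqrt(n)

print(round(y_prog, 1), round(y_prog - half_width, 1), round(y_prog + half_width, 1))
```

If the interval is meant for a single future observation rather than for the mean, the factor 1/√n would be replaced by √(1 + 1/n), which gives a noticeably wider interval.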

The moving average method is a short-term forecasting method based on smoothing (filtering) the levels of the studied quantity. Linear smoothing filters with an interval m are predominantly used: each smoothed level is the arithmetic mean of m consecutive levels of the series.

The confidence interval is built from the standard deviation of the time series and $t_\alpha$, the critical value of Student's t distribution for the given significance level and n-1 degrees of freedom.

Example. Table 3 shows the data of the time series y(t). Calculate the predicted value of y at time t = 13 using the moving average method with a smoothing interval m = 3.

The original and smoothed series are shown in Fig. 3; the calculation of y is given in Table 4.

Table 3. Time series data y(t)

Fig. 3. Initial and smoothed series

Table 4. Predicted value of y
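A minimal sketch of the moving average forecast follows. The data of Table 3 are not preserved in this text, so the series from the previous example is reused as an assumption. The window is taken as trailing (the last m levels), which is also an assumption, since the source's exact indexing is not shown.

```python
def smooth(series, m):
    """Trailing moving averages: one value per full window of m levels."""
    return [sum(series[i - m + 1:i + 1]) / m for i in range(m - 1, len(series))]


def moving_average_forecast(series, m):
    """Forecast the next level as the mean of the last m observed levels."""
    if m < 1 or m > len(series):
        raise ValueError("m must be between 1 and len(series)")
    return sum(series[-m:]) / m


# Assumed data: the series from the series average example.
y = [80, 98, 94, 103, 84, 115, 98, 113, 114, 87, 107, 85]

smoothed = smooth(y, 3)
forecast = moving_average_forecast(y, 3)  # forecast for t = 13
print(forecast)
```

A larger m smooths more strongly but reacts more slowly to recent changes in the level.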

The exponential smoothing method uses the values of previous levels, taken with certain weights, when smoothing each level. The further an observation is from the current level, the smaller its weight. The smoothed value of the level at time t is determined by the formula

$$S_t = \alpha\, y_t + (1 - \alpha)\, S_{t-1}$$

where $S_t$ is the current smoothed value, $y_t$ is the current value of the original series, $S_{t-1}$ is the previous smoothed value, and α is the smoothing parameter.

$S_0$ is taken equal to the arithmetic mean of the first few values of the series.

To calculate α, the following formula is proposed

There is no consensus on the choice of α; this model optimization problem has not yet been solved. Some literature recommends choosing 0.1 ≤ α ≤ 0.3.

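The forecast is calculated as follows:

$$y_{prog} = S_t + \alpha\,(y_t - S_t)$$

Confidence interval

Table 5. Time series data y(t), α = 0.3, S_0 = 90.7

t | y_t | Calculation of S_t
1 | 80 | 0.3×80+(1-0.3)×90.7
2 | 98 | 0.3×98+(1-0.3)×87.5
3 | 94 | 0.3×94+(1-0.3)×90.6
4 | 103 | 0.3×103+(1-0.3)×91.6
5 | 84 | 0.3×84+(1-0.3)×95
6 | 115 | 0.3×115+(1-0.3)×91.7
7 | 98 | 0.3×98+(1-0.3)×98.7
8 | 113 | 0.3×113+(1-0.3)×98.5
9 | 114 | 0.3×114+(1-0.3)×102.8
10 | 87 | 0.3×87+(1-0.3)×106.2
11 | 107 | 0.3×107+(1-0.3)×100.4
12 | 85 | 0.3×85+(1-0.3)×102.4
Forecast | | 97.2+0.3×(85-97.2)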

The original and smoothed series are shown in Fig. 4; the calculation of y is given in Table 6.

Fig. 4. Initial and smoothed series

Table 6. Forecast value of y at time t = 11
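The exponential smoothing calculation can be retraced with a short sketch. It uses the recurrence S_t = α·y_t + (1 − α)·S_{t−1} with α = 0.3 and S_0 equal to the mean of the first three levels, as in the worked rows of Table 5. The forecast step S_t + α·(y_t − S_t) is inferred from the last worked row, 97.2 + 0.3×(85 − 97.2). The function name is chosen here for illustration.

```python
def exponential_smoothing(series, alpha, s0):
    """Return S_1..S_n where S_t = alpha * y_t + (1 - alpha) * S_{t-1}."""
    smoothed = []
    prev = s0
    for y_t in series:
        prev = alpha * y_t + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed


y = [80, 98, 94, 103, 84, 115, 98, 113, 114, 87, 107, 85]
alpha = 0.3
s0 = sum(y[:3]) / 3  # mean of the first three levels, about 90.7

s = exponential_smoothing(y, alpha, s0)

# Forecast step as inferred from the worked example's last row.
forecast = s[-1] + alpha * (y[-1] - s[-1])
print(round(s[-1], 1), round(forecast, 1))
```

Intermediate values may differ from the table in the last digit because the source rounds each S_t before the next step.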

The next forecasting method is the method of average absolute growth. The predicted level of the studied quantity changes in accordance with its average absolute growth in the past. The method is applied if the general trend in the dynamics is linear (Fig. 1, graph B):

$$y_{prog} = y_0 + \bar{\Delta}\, l, \qquad \bar{\Delta} = \frac{y_n - y_1}{n - 1}$$

where $y_0$ is the base level of extrapolation, $\bar{\Delta}$ is the average absolute increase in the levels of the series, and l is the number of forecasting intervals.

The base level is taken as the average of the last few values of the original series, three at most.

Table 7. Time series data y(t)

Forecast = y0 + Δ·l

Calculation of the base level | y0
(60+75+70)/3 | 68.3
(75+70+103)/3 | 82.7
(70+103+100)/3 | 91
(103+100+115)/3 | 106
(100+115+125)/3 | 113.3
(115+125+113)/3 | 117.7
(125+113+138)/3 | 125.3
(113+138+136)/3 | 129
(138+136+145)/3 | 139.7
(136+145+150)/3 | 143.7

Forecast interval l | Forecast
1 | 143.7+8.2×1 = 151.9
2 | 143.7+8.2×2 = 160.1
3 | 143.7+8.2×3 = 168.3

The original and smoothed series are shown in Fig. 5.

Fig. 5. Initial and smoothed series
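The average absolute growth forecast can be sketched as follows. The series values are read off the worked rows of Table 7. The average increase is taken as (y_n − y_1)/(n − 1), an assumption that matches the value 8.2 used in the example. The function name and the `base_window` parameter are chosen here for illustration.

```python
def average_absolute_growth_forecast(series, horizon, base_window=3):
    """Forecast l = 1..horizon steps ahead as y0 + delta * l."""
    n = len(series)
    delta = (series[-1] - series[0]) / (n - 1)         # average absolute increase
    y0 = sum(series[-base_window:]) / base_window      # base level: mean of last levels
    return [y0 + delta * l for l in range(1, horizon + 1)]


# Levels read off the worked example (t = 1..12).
y = [60, 75, 70, 103, 100, 115, 125, 113, 138, 136, 145, 150]

forecasts = average_absolute_growth_forecast(y, horizon=3)
print([round(f, 1) for f in forecasts])
```

The results may differ from the table by about 0.1 because the source rounds the base level and the average increase before multiplying.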

Average growth rate method

The predicted level of the quantity under study changes in accordance with its average growth rate in the past. The method is used if the overall trend in the dynamics follows an exponential curve (Fig. 1, graph C):

$$y_{prog} = y_0\, \bar{T}^{\,l}, \qquad \bar{T} = \left(\frac{y_n}{y_1}\right)^{\frac{1}{n-1}}$$

where $\bar{T}$ is the average growth rate in the past and l is the number of prediction intervals.

The forecast depends on the direction in which the base level y0 deviates from the main trend, so it is recommended to calculate y0 as the average of the last few values of the series.

Table 8. Time series data y(t)

Base level y0 | Average growth rate | Forecast
 | | 62.5×1.08^1 = 67.7
 | (70/60)^(1/2) = 1.08 | 65×1.08^1 = 70.2
(65+70+68)/3 = 67.7 | (68/60)^(1/3) = 1.04 | 67.7×1.04^1 = 70.5
(70+68+82)/3 = 73.3 | (82/60)^(1/4) = 1.08 | 73.3×1.08^1 = 79.3
(68+82+80)/3 = 76.7 | (80/60)^(1/5) = 1.06 | 76.7×1.06^1 = 81.2
(82+80+95)/3 = 85.7 | (95/60)^(1/6) = 1.08 | 85.7×1.08^1 = 92.5
(80+95+113)/3 = 96 | (113/60)^(1/7) = 1.09 | 96×1.09^1 = 105.1
(95+113+135)/3 = 114.3 | (135/60)^(1/8) = 1.11 | 114.3×1.11^1 = 126.5
(113+135+140)/3 = 129.3 | (140/60)^(1/9) = 1.10 | 129.3×1.10^1 = 142.1
(135+140+168)/3 = 147.7 | (168/60)^(1/10) = 1.11 | 147.7×1.11^1 = 163.7
(140+168+205)/3 = 171 | (205/60)^(1/11) = 1.12 | 171×1.12^1 = 191.2
 | | 171×1.12^2 = 213.8
 | | 171×1.12^3 = 239.1

The original and smoothed series are shown in Fig. 6.

Fig. 6. Initial and smoothed series
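The average growth rate forecast can be sketched in the same way. The series values are read off the worked rows of Table 8. The average rate is taken as (y_n / y_1)^(1/(n−1)), which matches the row (205/60)^(1/11) = 1.12. The function name and the `base_window` parameter are chosen here for illustration.

```python
def average_growth_rate_forecast(series, horizon, base_window=3):
    """Forecast l = 1..horizon steps ahead as y0 * rate ** l."""
    n = len(series)
    rate = (series[-1] / series[0]) ** (1 / (n - 1))   # average growth rate
    y0 = sum(series[-base_window:]) / base_window      # base level: mean of last levels
    return [y0 * rate ** l for l in range(1, horizon + 1)]


# Levels read off the worked example (t = 1..12).
y = [60, 65, 70, 68, 82, 80, 95, 113, 135, 140, 168, 205]

forecasts = average_growth_rate_forecast(y, horizon=3)
print([round(f, 1) for f in forecasts])
```

Because the forecast grows geometrically, small errors in the estimated rate compound quickly as the horizon l increases.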

Today the most common forecasting method is to find an analytical expression (equation) of the trend. The trend of the extrapolated phenomenon is the main tendency of the time series, to some extent free from random influences.

Developing the forecast consists in determining, from the observed data, the type of extrapolating function y=f(t) that expresses the dependence of the studied quantity on time. The first step is to choose the type of function that best describes the trend. The most commonly used dependencies are:

Linear: $y = a + bt$;

Parabolic: $y = a + bt + ct^2$;

Exponential: $y = a e^{bt}$.

Finding the coefficients of a linear function and forecasting from it are covered in the branch of statistics called "regression analysis". If the curve describing the trend is non-linear, estimating the function y=f(t) becomes more complicated; in that case it is necessary to involve biostatisticians in the analysis and to use computer programs for statistical data processing.
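For the linear case, the coefficients can be found by ordinary least squares. The sketch below fits y = a + b·t, with the symbols a and b chosen here for illustration, to the series used in the average absolute growth example and extrapolates one step ahead. Applying it to that series is an assumption made for the example, not a calculation from the source.

```python
def fit_linear_trend(y):
    """Least-squares fit of y = a + b * t for t = 1..n; returns (a, b)."""
    n = len(y)
    t = range(1, n + 1)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    # Slope: covariance of (t, y) divided by the variance of t.
    b = (sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
         / sum((ti - t_mean) ** 2 for ti in t))
    a = y_mean - b * t_mean
    return a, b


# Assumed data: the series from the average absolute growth example.
y = [60, 75, 70, 103, 100, 115, 125, 113, 138, 136, 145, 150]

a, b = fit_linear_trend(y)
forecast_13 = a + b * 13  # extrapolate the trend to t = 13
print(round(a, 2), round(b, 2), round(forecast_13, 1))
```

Unlike the average absolute growth method, which uses only the first and last levels to estimate the slope, least squares uses every observation.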

In most real cases, the time series is a complex curve that can be represented as the sum or product of the trend, seasonal, cyclical, and random components.

The trend is a smooth change in the process over time and is due to the action of long-term factors. The seasonal effect is associated with the presence of factors that act with a predetermined periodicity (for example, seasons, lunar cycles). The cyclical component describes long periods of relative rise and fall and consists of cycles of variable duration and amplitude (for example, some epidemics have a long cyclical nature). The random component of the series reflects the impact of numerous random factors and can have a varied structure.
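The two decompositions mentioned above can be written out explicitly. The symbols are chosen here for illustration: T_t is the trend, S_t the seasonal component, C_t the cyclical component and ε_t the random component.

```latex
% Additive model
y_t = T_t + S_t + C_t + \varepsilon_t

% Multiplicative model
y_t = T_t \cdot S_t \cdot C_t \cdot \varepsilon_t
```

Taking logarithms turns the multiplicative form into an additive one, provided all components are positive.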

Conclusion

The methods of simple extrapolation, moving averages and exponential smoothing are the simplest and, at the same time, the most approximate; this can be seen from the wide confidence intervals in the examples given. A large forecast error is observed when the levels fluctuate strongly. Note that these methods should not be used if the initial time series has a clear upward (or downward) trend. Nevertheless, their use is justified for short-term forecasts.

Analyzing all the components of a time series and forecasting on their basis is not a trivial task; it is covered in the branch of statistics called "time series analysis" and requires special training.

Bibliographic link

Koichubekov B.K., Sorokina M.A., Mkhitaryan K.E. MATHEMATICAL METHODS OF PREDICTION IN MEDICINE // Successes of modern natural science. - 2014. - No. 4. - P. 29-36;
URL: http://natural-sciences.ru/ru/article/view?id=33316 (date of access: 03/30/2019).

April 23, 2013 at 11:08

Classification of forecasting methods and models


I have been doing time series forecasting for over 5 years. Last year I defended my dissertation on the topic "Time Series Forecasting Model from Maximum Similarity Sample"; however, quite a few questions remained after the defense. Here is one of them: the general classification of forecasting methods and models.

Usually, in the works of both domestic and English-speaking authors, the question of classifying forecasting methods and models is not raised; they are simply listed. But it seems to me that this area has grown and expanded so much that a classification, even the most general one, is necessary. Below is my own version of a general classification.

What is the difference between a forecasting method and a model?

A forecasting method is a sequence of actions that must be performed to obtain a forecasting model. By analogy with cooking, a method is the sequence of actions by which a dish is prepared, that is, a forecast is made.

A forecasting model is a functional representation that adequately describes the process under study and is the basis for obtaining its future values. In the same culinary analogy, the model is the list of ingredients and their proportions needed for our dish, the forecast.

The combination of method and model forms a complete recipe!



It is now customary to use English abbreviations for the names of both models and methods. For example, there is the well-known autoregressive integrated moving average model with exogenous inputs (ARIMAX). This model and its corresponding method are usually called ARIMAX, and sometimes the Box-Jenkins model (method) after the authors.

First we classify the methods

If you look closely, it quickly becomes clear that the concept of "forecasting method" is much broader than the concept of "forecasting model". For this reason, at the first stage of classification, methods are usually divided into two groups: intuitive and formalized.



Recalling our culinary analogy, all recipes can likewise be divided into formalized ones, that is, written down with the quantities of ingredients and the method of preparation, and intuitive ones, that is, not recorded anywhere and drawn from the cook's experience. When do we not use a recipe? When the dish is very simple: to fry potatoes or boil dumplings you don't need a recipe. When else do we not use a recipe? When we want to invent something new!


Intuitive forecasting methods deal with the judgments and assessments of experts. To date, they are often used in marketing, economics, politics, since the system, the behavior of which must be predicted, is either very complex and cannot be described mathematically, or very simple and does not need such a description. Details on such methods can be found in .


Formalized methods are forecasting methods described in the literature that result in forecasting models, that is, they determine a mathematical dependence that makes it possible to calculate the future value of the process, i.e. to make a forecast.

This, in my opinion, completes the general classification of forecasting methods.

Next, we make a general classification of models

Here it is necessary to proceed to the classification of forecasting models. At the first stage, the models should be divided into two groups: domain models and time series models.




Domain models are mathematical forecasting models built using the laws of the subject area. For example, a model used for weather forecasting contains the equations of fluid dynamics and thermodynamics. A population forecast is made with a model built on a differential equation. The blood sugar level of a person with diabetes is predicted from a system of differential equations. In short, such models use dependencies specific to a particular subject area, and each requires an individual approach to development.

Time series models are mathematical forecasting models that seek the dependence of the future value on the past within the process itself and calculate the forecast from this dependence. These models are universal across subject areas, that is, their general form does not change with the nature of the time series. We can use neural networks to predict air temperature and then apply a similar neural network model to predict stock indices. These are generalized models, like boiling water: whatever product you throw into it will be boiled, regardless of its nature.

Classifying time series models

It seems to me that it is not possible to make a general classification of domain models: how many areas, so many models! However, time series models lend themselves easily to simple division. Time series models can be divided into two groups: statistical and structural.




In statistical models, the dependence of the future value on the past is given in the form of an equation. These include:

  1. regression models (linear regression, non-linear regression);
  2. autoregressive models (ARIMAX, GARCH, ARDLM);
  3. exponential smoothing model;
  4. model based on the maximum similarity sample;
  5. etc.
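As an illustration of a statistical model, here is a minimal sketch of the simplest autoregressive equation, y_t = c + φ·y_{t−1}, fitted by least squares. This is far simpler than the ARIMAX or GARCH models listed above and is not the author's method. The series is synthetic, generated from c = 2 and φ = 0.5, so that the fit can be checked by eye.

```python
def fit_ar1(series):
    """Least-squares fit of y_t = c + phi * y_{t-1}; returns (c, phi)."""
    x = series[:-1]  # previous levels
    y = series[1:]   # current levels
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    phi = sxy / sxx
    c = y_mean - phi * x_mean
    return c, phi


# Synthetic series: each level equals 2 + 0.5 * previous level.
series = [10, 7, 5.5, 4.75, 4.375]

c, phi = fit_ar1(series)
next_value = c + phi * series[-1]  # one-step-ahead forecast
print(c, phi, next_value)
```

The whole model is the pair (c, φ): once the equation is fitted, the forecast is obtained by substituting the last observed level.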

In structural models, the dependence of the future value on the past is given in the form of a certain structure and rules for moving along it. These include:

  1. neural network models;
  2. models based on Markov chains;
  3. models based on classification-regression trees;
  4. etc.
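A structural model can be illustrated with a very small Markov chain sketch. The series is reduced to two states, "up" and "down", a simplification assumed here for brevity. The transition frequencies between states form the structure, and the rule for moving along it is to pick the most frequent successor of the current state. The numbers are illustrative.

```python
from collections import Counter, defaultdict


def fit_transitions(states):
    """Estimate transition probabilities P(next | current) from a state sequence."""
    counts = defaultdict(Counter)
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    return {
        state: {nxt: c / sum(successors.values()) for nxt, c in successors.items()}
        for state, successors in counts.items()
    }


def most_likely_next(transitions, state):
    """Return the successor state with the highest estimated probability."""
    return max(transitions[state], key=transitions[state].get)


# Illustrative series, converted to "up"/"down" moves between neighbours.
y = [80, 98, 94, 103, 84, 115, 98, 113, 114, 87, 107, 85]
states = ["up" if b > a else "down" for a, b in zip(y, y[1:])]

transitions = fit_transitions(states)
prediction = most_likely_next(transitions, states[-1])
print(transitions, prediction)
```

Practical Markov chain models use many more states and forecast a value or interval rather than only a direction.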

For both groups, I have indicated the main, that is, the most common and detailed forecasting models. However, today there are already a huge number of time series forecasting models, and for making forecasts, for example, SVM (support vector machine) models, GA (genetic algorithm) models, and many others have begun to be used.

General classification

Thus we got the following classification of models and forecasting methods.




  1. Tikhonov E.E. Forecasting in market conditions. Nevinnomyssk, 2006. 221 p.
  2. Armstrong J.S. Forecasting for Marketing // Quantitative Methods in Marketing. London: International Thompson Business Press, 1999, pp. 92–119.
  3. Jingfei Yang M. Sc. Power System Short-term Load Forecasting: Thesis for Ph.d degree. Germany, Darmstadt, Elektrotechnik und Informationstechnik der Technischen Universitat, 2006. 139 p.
UPD. 11/15/2016.
Gentlemen, this has gone too far! Recently I was sent, for review, an article for a VAK-listed journal that cited this blog post. Please note that a blog must not be cited in diploma theses, in articles, or least of all in dissertations! If you need a reference, use this one: Chuchueva I.A. MODEL OF PREDICTION OF TIME SERIES ON THE SELECTION OF THE MAXIMUM SIMILARITY, dissertation … Cand. Sci. (Eng.) / Moscow State Technical University. N.E. Bauman. Moscow, 2012.

Appendix 1. METHODS OF STATISTICAL ANALYSIS AND FORECASTING IN BUSINESS

4. Mathematical forecasting tools

The mathematical methods and models used in problems of stochastic analysis and forecasting in business belong to various branches of mathematics: regression analysis, time series analysis, formation and evaluation of expert opinions, simulation modeling, systems of simultaneous equations, discriminant analysis, logit and probit models, the apparatus of logical decision functions, analysis of variance or covariance, analysis of rank correlations and contingency tables, etc. However, they are all united by the fact that they represent different approaches to solving the central problem of multivariate statistical analysis and econometrics, the statistical study of dependencies, which is precisely the basic problem of statistical analysis and forecasting in business (its general formulation was given in paragraph 2).

In paragraph 1 it was already noted that the p+k+l+m components of the analyzed multidimensional feature can include quantitative as well as ordinal and nominal variables. The approaches mentioned above to solving the central problem of multivariate statistical analysis were formed with the nature of the variables under study in mind. The corresponding specialization of these approaches is reflected in Table 4, which also contains references to literary sources that describe these approaches fairly completely.

Table 4

Nature of the resulting indicators | Nature of the explanatory variables | Relevant section of multivariate statistical analysis | Literary sources
Quantitative | Quantitative | Regression analysis and systems of simultaneous equations |
Quantitative | A single quantitative variable interpreted as "time" | Time series analysis |
Quantitative | Non-quantitative (ordinal or nominal variables) | Analysis of variance |
Quantitative | | Analysis of covariance, typological regression models |
Non-quantitative (ordinal variables) | Non-quantitative (ordinal and nominal variables) | Analysis of rank correlations and contingency tables |
Non-quantitative (nominal variables) | Quantitative | Discriminant analysis, logit and probit models, cluster analysis, taxonomy, splitting of mixtures of distributions |
Mixed (quantitative and non-quantitative variables) | Mixed (quantitative and non-quantitative variables) | Apparatus of logical decision functions, Data Mining |

Nevertheless, the practice of statistical analysis and forecasting in business shows that, across the whole range of these mathematical tools, three sections clearly lead in prevalence and relevance:
- regression analysis;
- time series analysis;
- the mechanism of formation and statistical analysis of expert assessments.

Let's briefly look at each of these sections.

Regression analysis

As before, we describe the functioning of the real object under study (a firm, a company, a production process, a product distribution process, etc.) by a set of variables (their substantive meaning is described in paragraph 2). Let us introduce a number of definitions and concepts used in regression analysis.

Resulting (dependent, endogenous) variables. A variable that characterizes the result or efficiency of the analyzed system is called resulting (dependent, endogenous). Its values are formed in the course of, and within, the functioning of this system under the influence of a number of other variables and factors, some of which can be registered and, to a certain extent, managed and planned (this part is commonly called the explanatory variables, see below). In regression analysis, the resulting variable acts as a function whose values are determined (though with some random error) by the values of the explanatory variables, which act as its arguments. By its nature, therefore, the resulting variable is always stochastic (random). In the general case, the behavior of several resulting variables $y^{(1)}, \dots, y^{(m)}$ is analyzed.

Explanatory (predictor, exogenous) variables. Variables (or features) that can be registered, that describe the conditions under which the real economic system under study functions, and that largely determine how the values of the resulting variables are formed are called explanatory. As a rule, some of them lend themselves to at least partial regulation and management. The values of some explanatory variables may be set, as it were, from "outside" the analyzed system; in this case they are called exogenous. In regression analysis, they play the role of the arguments of the function that represents the analyzed resulting indicator. By their nature, explanatory variables can be either random or non-random.

Regression residuals are latent (i.e., hidden, not directly measurable) random components $\varepsilon$ that reflect the influence on the resulting variables of factors not included among the explanatory variables, as well as random errors in measuring the analyzed resulting variables. Generally speaking, they may also depend on the explanatory variables $X$, i.e., in the general case $\varepsilon = \varepsilon(X)$.

The general scheme of the interaction of variables in regression analysis is shown in the figure.




Figure. General scheme of the interaction of variables in regression analysis.

Regression function of $y$ on $X$. The function $f(X)$ of the explanatory variables $X = (x^{(1)}, \dots, x^{(p)})^{\top}$ is called the regression function of $y$ on $X$ (or simply the regression of $y$ on $X$) if it describes how the conditional mean value of the resulting variable $y$ (computed with the explanatory variables fixed at the levels $X$) changes as the values of the explanatory variables change. Mathematically, this definition can be written as

$$f(X) = \mathsf{E}(y \mid X),$$

where the symbol $\mathsf{E}$ denotes theoretical averaging (i.e., $\mathsf{E}y$ is the mathematical expectation of the random variable $y$, and $\mathsf{E}(y \mid x^{(1)}, \dots, x^{(p)})$, or simply $\mathsf{E}(y \mid X)$, is the conditional mathematical expectation of the random variable $y$, calculated under the condition that the values of the explanatory variables are fixed at the level $X$).
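As a minimal illustration of this definition, assume a single explanatory variable that takes only a few distinct levels. The conditional mean can then be estimated by averaging the observed values of the resulting variable within each level. The function name and the data below are hypothetical and serve only to show the idea.

```python
from collections import defaultdict


def empirical_regression(xs, ys):
    """Estimate f(x) = E(y | x) by the mean of y at each observed level of x."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)
    return {x: sum(values) / len(values) for x, values in groups.items()}


# Hypothetical data: advertising level (x) and weekly sales (y).
xs = [1, 1, 2, 2, 3, 3]
ys = [10.0, 12.0, 19.0, 21.0, 31.0, 29.0]

# f_hat maps each level of x to the average of y observed at that level.
f_hat = empirical_regression(xs, ys)
```

With continuous explanatory variables this direct averaging is no longer possible, which is why a parametric form of the regression function is usually assumed.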

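If we analyze $m$ resulting variables $y^{(1)}, \dots, y^{(m)}$ simultaneously, then we should consider, respectively, $m$ regression functions $f^{(j)}(X) = \mathsf{E}(y^{(j)} \mid X)$, $j = 1, \dots, m$, or, equivalently, one vector-valued function

$$f(X) = \bigl(f^{(1)}(X), f^{(2)}(X), \dots, f^{(m)}(X)\bigr)^{\top}. \tag{11}$$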

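Then the regression model can be written in the form

$$Y = f(X) + \varepsilon(X), \tag{12}$$

where $Y = (y^{(1)}, \dots, y^{(m)})^{\top}$ is the vector of resulting variables, and it follows from the definition of the regression function that always

$$\mathsf{E}\bigl(\varepsilon(X) \mid X\bigr) \equiv 0_m \tag{12'}$$

(the identity sign in (12') means that the equality holds for any values of $X$; the column vector of zeros $0_m$ on the right-hand side has dimension $m$, the number of resulting variables).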
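The identity (12') follows in a few steps from (11) and (12). The derivation below is a short sketch in assumed notation: $Y$ is the vector of resulting variables, $X$ the explanatory variables, $\varepsilon(X)$ the residuals, $\mathsf{E}$ the mathematical expectation, and $0_m$ the zero column vector whose dimension equals the number of resulting variables.

```latex
\begin{aligned}
\mathsf{E}\bigl(\varepsilon(X) \mid X\bigr)
  &= \mathsf{E}\bigl(Y - f(X) \mid X\bigr)
     && \text{by the model (12)} \\
  &= \mathsf{E}(Y \mid X) - f(X)
     && \text{since } f(X) \text{ is non-random once } X \text{ is fixed} \\
  &= f(X) - f(X) = 0_m
     && \text{by the definition (11) of the regression function.}
\end{aligned}
```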

The regression problem in its most general form can be formulated as follows:

from the results of $n$ measurements

$$(X_i, Y_i), \quad i = 1, 2, \dots, n,$$

of the variables under study on $n$ objects (systems, processes) of the analyzed population, construct a (vector-valued) function (11) that makes it possible to recover, in the best way (in a certain sense), the values of the resulting (predicted) variables $Y$ from given values of the explanatory (exogenous) variables $X$.

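Remark 1. The most common are linear regression models, i.e., models in which the regression functions are linear:

$$f^{(j)}(X) = \theta_0^{(j)} + \theta_1^{(j)} x^{(1)} + \dots + \theta_p^{(j)} x^{(p)}, \quad j = 1, 2, \dots, m. \tag{12''}$$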
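A minimal sketch of fitting a linear regression function for a single resulting variable. It assumes that "best in a certain sense" is read as ordinary least squares, which is a common choice but is not fixed by the text. NumPy is assumed to be available; the function names and the data are hypothetical.

```python
import numpy as np


def fit_linear_regression(X, y):
    """Least-squares estimates of (theta_0, theta_1, ..., theta_p)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that theta_0 plays the role of the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    theta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return theta


def predict(theta, X):
    """Evaluate the fitted linear regression function at new points X."""
    X = np.asarray(X, dtype=float)
    return theta[0] + X @ theta[1:]


# Hypothetical noise-free data generated from y = 2 + 3*x1 - 1*x2.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1]

theta = fit_linear_regression(X, y)
y_new = predict(theta, [[6.0, 2.0]])
```

Because the data here contain no random error, the estimates coincide with the generating coefficients; with real data they would only approximate them.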

Remark 2. There are at least two ways of interpreting the "behavioral", "status", and "external" variables introduced in Section 2 within the framework of the regression model (12)-(12') described above. In the first variant, all three types of variables are treated as explanatory variables, and the regression of the resulting variables on all of them is built. In the other variant, two of these types of variables are interpreted as observation conditions, and a regression model of the form (12) is then built separately for each fixed combination of these conditions (within the linear model (12''), this means that the regression coefficients themselves depend on the condition variables, i.e., are defined as functions of them).

Time series analysis

Any statistical analysis and forecast is based on initial statistical data. Their main types were presented in paragraph 1. If the data are registered over time $t$, and the time itself is recorded along with the values of the analyzed characteristics, then one speaks of the statistical analysis of so-called panel data. If we fix the number $j$ of the variable and the number $i$ of the statistically examined object, then the chronologically ordered sequence of values

$$x_i^{(j)}(t_1),\; x_i^{(j)}(t_2),\; \dots,\; x_i^{(j)}(t_N) \tag{13}$$

is called a one-dimensional time series. If we simultaneously consider $p$ one-dimensional time series of the form (13), i.e., investigate the patterns in the interrelated behavior of the series (13) for $j = 1, 2, \dots, p$, which characterize the dynamics of $p$ variables measured on one ($i$-th) object, then one speaks of the statistical analysis of a multivariate time series. In essence, all problems related to the analysis of economic dynamics and forecasting use time series of certain indicators as their statistical base.

As a rule, business forecasting problems deal only with one-dimensional time series that are discrete (in observation time) and have equally spaced observation moments, i.e., $t_2 - t_1 = t_3 - t_2 = \dots = t_N - t_{N-1} = \Delta$, where $\Delta$ is a given time step (minute, hour, day, week, month, quarter, year, etc.). In these cases it is more convenient to represent the time series under study in the form

$$x(1),\; x(2),\; \dots,\; x(N),$$

where $x(t)$ is the value of the analyzed indicator registered at the $t$-th time step ($t = 1, 2, \dots, N$).

When we speak of using the apparatus of time series analysis in forecasting, we mean short- and medium-term forecasts, because constructing a long-term forecast necessarily requires methods for organizing and statistically analyzing special expert assessments.

Genesis of the observations forming a time series. This concerns the structure and classification of the main factors under whose influence the values of the elements of a time series are formed. It is advisable to distinguish the following four types of such factors.

(A) Long-term factors, which form the general (long-run) trend in the change of the analyzed feature. This trend is usually described by some non-random function $f_{\mathrm{tr}}(t)$, as a rule a monotonic one. This function is called the trend function or simply the trend.

(B) Seasonal factors, which form fluctuations of the analyzed feature that repeat periodically at a certain time of the year. Let us agree to denote the result of the action of seasonal factors by a non-random function $\varphi(t)$. Since this function must be periodic (with periods that are multiples of "seasons", i.e., quarters), its analytical expression involves harmonics (trigonometric functions) whose frequencies are, as a rule, determined by the substance of the problem.

(C) Cyclic (conjunctural) factors, which form changes in the analyzed feature caused by long-term cycles of an economic, demographic or astrophysical nature (Kondratiev waves, demographic "gaps", cycles of solar activity, etc.). The result of the action of cyclic factors will be denoted by a non-random function $\psi(t)$.

(D) Random (irregular) factors, which cannot be accounted for or registered. Their influence on the formation of the values of the time series is precisely what determines the stochastic nature of the elements $x(t)$, and hence the need to interpret $x(1), \dots, x(N)$ as observations of corresponding random variables. We will denote the result of the action of random factors by random variables ("residuals", "errors") $\varepsilon(t)$.

Of course, factors of all four types need not participate simultaneously in forming the values of every time series. In some cases the values of the time series are formed under the influence of factors (A), (B) and (D), in others under the influence of factors (A), (C) and (D), and, finally, in some cases under the influence of random factors (D) alone. In all cases, however, the participation of random (evolutionary) factors (D) is assumed to be indispensable. In addition, an additive structural scheme of the influence of factors (A), (B), (C) and (D) on the formation of the values $x(t)$ is generally accepted (as a hypothesis), which means that the members of the time series can be represented as the decomposition

$$x(t) = f_{\mathrm{tr}}(t) + \varphi(t) + \psi(t) + \varepsilon(t), \quad t = 1, 2, \dots, N, \tag{14}$$

where a term is omitted if factors of the corresponding type do not participate.

Conclusions about whether or not factors of a given type are involved in forming the values $x(t)$ can be based both on an analysis of the substance of the problem (i.e., they can be a priori and expert in nature) and on a special statistical analysis of the time series under study.

Within the framework of the concepts and notation introduced above, the problem of the statistical analysis of a time series can be formulated in general as follows:

from the results of $N$ measurements $x(1), x(2), \dots, x(N)$ of the variable under study over the $N$ time steps of the base period, construct the best (in a certain sense) estimates of the terms of the decomposition (14).

The solution of this problem is used to construct a predicted value $\hat{x}(N + \tau)$ for $\tau$ time steps ahead by formula (14) with $t = N + \tau$, substituting into it the obtained estimates of the components on the right-hand side of the decomposition.
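The forecasting step can be sketched as follows under strong simplifying assumptions that are not taken from the text: the trend is linear, the seasonal component is a single harmonic of known period, there is no cyclic component, and the estimates are obtained by least squares. NumPy is assumed to be available; the function names and the noise-free series are hypothetical.

```python
import numpy as np


def design_matrix(t, period):
    """Columns: intercept, linear trend, one seasonal harmonic (cos, sin)."""
    t = np.asarray(t, dtype=float)
    w = 2.0 * np.pi * t / period
    return np.column_stack([np.ones_like(t), t, np.cos(w), np.sin(w)])


def fit_trend_seasonal(x, period):
    """Least-squares estimates of the trend and seasonal coefficients."""
    x = np.asarray(x, dtype=float)
    t = np.arange(1, len(x) + 1)  # time steps t = 1, ..., N
    coef, *_ = np.linalg.lstsq(design_matrix(t, period), x, rcond=None)
    return coef


def forecast(coef, n_obs, tau, period):
    """Point forecast at t = N + tau; the residual is replaced by zero."""
    row = design_matrix([n_obs + tau], period)
    return float((row @ coef)[0])


# Hypothetical quarterly series: x(t) = 10 + 0.5*t + 2*cos(2*pi*t/4), N = 12.
N, PERIOD = 12, 4
t = np.arange(1, N + 1)
x = 10.0 + 0.5 * t + 2.0 * np.cos(2.0 * np.pi * t / PERIOD)

coef = fit_trend_seasonal(x, PERIOD)
x_hat = forecast(coef, N, tau=4, period=PERIOD)
```

In practice the form of the trend and the set of harmonics would themselves have to be chosen and checked, either from the substance of the problem or by statistical analysis of the series.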

Formation mechanisms and statistical analysis of expert assessments

Usually, the following main types of organization of the work of an expert group are distinguished:

· collegial: the "method of commissions" (an open discussion of the problem); the "court method" (a confrontation between "defence" and "prosecution" for each of the proposed solutions to the problem); "brainstorming", etc.;

· partially collegial: scenario analysis of the "what-if" type; the "Delphi" method (a multi-round discussion of the problem with secret voting by the experts, or the filling out of special anonymous questionnaires, at the end of each round, and the work of an independent analytical group between rounds), etc.;

· individually autonomous: each member of the expert group forms and expresses an opinion (independently of the positions of the other participants) by ranking the discussed solutions (or objects), comparing them in pairs, or assigning each of them to one of the previously described gradations (see the forms for presenting initial statistical data as frequency tables or contingency tables).

[...] between the opinions of the i-th and j-th experts is measured by a value defined through the Spearman rank correlation coefficient of their rankings (see [...], Ch. 11). One can then solve the problem of "clustering" the experts, interpreting each of the clusters found in this way as a group of like-minded experts.
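A minimal sketch of grouping experts by the closeness of their rankings. Two assumptions are made here that the text does not state: the distance between two experts is taken as one minus their Spearman rank correlation, and experts are merged into one cluster whenever a chain of pairwise distances not exceeding a threshold connects them. The rankings are assumed to contain no ties; the function names and the data are hypothetical.

```python
def spearman_rho(r1, r2):
    """Spearman rank correlation for two rankings without ties."""
    n = len(r1)
    d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def cluster_experts(rankings, max_distance):
    """Group experts linked by pairwise distances <= max_distance.

    The distance is taken as 1 - rho (an assumption of this sketch).
    """
    m = len(rankings)
    label = list(range(m))  # every expert starts in a separate cluster
    for i in range(m):
        for j in range(i + 1, m):
            if 1.0 - spearman_rho(rankings[i], rankings[j]) <= max_distance:
                old, new = label[j], label[i]
                # Merge the cluster of expert j into the cluster of expert i.
                label = [new if lab == old else lab for lab in label]
    clusters = {}
    for expert, lab in enumerate(label):
        clusters.setdefault(lab, []).append(expert)
    return sorted(clusters.values())


# Hypothetical rankings of five objects by four experts (1 = best).
rankings = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 4, 5],
    [5, 4, 3, 2, 1],
    [5, 4, 3, 1, 2],
]
groups = cluster_experts(rankings, max_distance=0.5)
```

Here experts 0 and 1 form one group of like-minded experts and experts 2 and 3 another, since the rankings within each pair differ only by one swap of adjacent objects.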

(ii) Analysis of the mutual agreement of the opinions of the group of experts. Given the opinions of a whole group of experts, the statistician seeks to assess the degree of consistency of all these expert assessments, including a statistical test of the hypothesis that there is no consistency at all (in which case one should obviously either clarify the formulation of the problem put to the experts or change the composition of the expert group). This problem is also solved by means of multivariate statistical analysis. The choice of a specific method depends on the form of the initial statistical data. For example, if the opinions of the experts are represented by rankings, then the coefficient of concordance can be used as a measure of their consistency [...]

[...] of objects), i.e., with initial statistical data of the form [...], the unified group opinion is defined as a solution to an optimization problem of the form [...]

[...] the farther the opinion of the j-th expert is from the unified group opinion, the lower his relative competence is estimated to be. Note that if, as a result of studying the structure of the totality of expert opinions, the statistician concludes that there are several subgroups of experts, with homogeneous opinions within each subgroup and a significant difference of opinions between any two subgroups, then the problems of constructing a single group opinion and of assessing the relative competence of each expert are solved separately for each of the identified subgroups.
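For the case in (ii) where expert opinions are given as rankings, a minimal sketch of a consistency measure is given below. It assumes that the measure is Kendall's coefficient of concordance W for rankings without ties, which equals 1 under complete agreement and 0 when the rank sums of all objects coincide. The function name and the data are hypothetical.

```python
def kendall_w(rankings):
    """Kendall's coefficient of concordance for m rankings of n objects (no ties)."""
    m = len(rankings)       # number of experts
    n = len(rankings[0])    # number of ranked objects
    # Sum of the ranks given to each object by all experts.
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2.0
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12.0 * s / (m * m * (n ** 3 - n))


# Hypothetical rankings of four objects by three experts (1 = best).
rankings = [
    [1, 2, 3, 4],
    [2, 1, 3, 4],
    [1, 3, 2, 4],
]
w = kendall_w(rankings)
```

A value of W close to 1 indicates that the experts largely agree, while a value near 0 suggests revisiting the problem statement or the composition of the expert group, as described in (ii).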


Random factors, in turn, can be of two kinds: sudden ("disorder") factors, which lead to abrupt structural changes in the mechanism forming the values $x(t)$ (expressed, for example, in radical, jump-like changes, at a random moment, in the basic structural characteristics of the functions $f_{\mathrm{tr}}(t)$, $\varphi(t)$ and $\psi(t)$ of the analyzed time series), and evolutionary residual factors, which cause relatively small random deviations of the values $x(t)$ from those that would have resulted from factors (A), (B), and (C) alone. In this section, however, only those schemes of time series formation are considered that involve evolutionary residual random factors.
