Team solutions

Antelope SAP

The current paper examines the factors that influence the increase of the
sales volume of a retailer. The aim of the study is to create an accurate model with high explanatory
power which accounts for the promotional and competitor effects on the quantity sold as well
as to identify the main volume uplift drivers. That information could be useful when designing
marketing strategies in order to gain a competitive advantage over the other market players.


SAP case


The current paper aims at providing information for optimizing the decision making process of a retailer. The case study analyses the sales volume uplift drivers, the own promotional effectiveness as well as the effects of the pricing strategies applied by competitors on the quantity sold by the company.  The main goals are estimating the impact of price reduction and promotions on the volume of sales by also accounting for the prices of the other players on the market. With those insights, the organization would be able to understand its customers better. Moreover, it would be able to apply an optimal pricing strategy which best responds to the needs of the clients and is profitable for the enterprise as well, given the evaluation of distinct possible scenarios. The analysis performed as well as the structure of this paper follow the stages of the CRISP-DM methodology. The remaining parts of the study are organized as follows. We begin with reviewing relevant to the case scientific papers, proceed with the business understanding, data understanding and modelling and we conclude with the results and evaluation.

Literature Review

Firstly, we begin with exploring why businesses run sales promotions and how they are perceived by consumers.[1] According to Blattberg and Neslin, sales promotion is “an action-focused marketing event whose purpose is to have a direct impact on the behavior of the firm’s customer.”[2] Blattberg refers to it as a temporary event and a “call-to-action.” The objectives of a retailer when running a sales promotion range from generating visitors to the store or selling excess merchandise (which was caused by overstocking) to increasing category profitability or stimulating people to try new products.

Psychological marketing research is helpful when it comes to understanding why promotions are made. The key concepts related to the topic are the ones of the smart shopper, transactional utility and reference price. Permanent price reductions and sales promotions result in different customer responses. Furthermore, consumers react more strongly to promotions than to price decreases. When buying promotional goods, they perceive themselves as “efficient, effective and smart shoppers” as pointed out by Schindler (as cited by Blattberg, 2010).

Additionally, transactional utility, as introduced by Thaler (as cited by Blattberg, 2010), refers to the gain (or loss) of utility when the customer pays less (or more) than the reference price (the price they believe the good is worth or the price they would expect to pay). If the reference price exceeds the amount the consumer actually pays, there is positive transactional utility. When promotions generate that effect, the response of buyers is higher than to an equivalent reduction of regular price and this means that the company would be better off when running sales promotions.

One of the first studies of the reference price conducted by Winer (as cited by Blattberg, 2010) defines it as a function of past brand or category prices. Research suggests that customers use internal or memory-based reference prices as well as external or stimulus-based prices. According to Howard and Kerin (as cited by Blattberg, 2010), reference prices are context specific so internal reference prices differ when a product is advertised in a retailer’s ad or not. The findings of Blattberg reveal that the greater the frequency of promotions, the more the reference price decreases and this in turn may lead to a reduction of incremental sales resulting from the promotion. On the contrary, Srinivasan (as cited by Blattberg, 2010) suggests that retailer revenue elasticities are higher for brands that have frequent and shallow promotion, for impulse products and low-degree brand proliferation categories. Winer (as cited by Blattberg, 2010) found out that consumers are willing to pay less for products that are heavily promoted. With the increase of the variance of the deal discount, the reference price also goes up, even though the average deal discount remains constant. Thus, as Blattberg concludes, brands should have multiple discount levels and the levels of discount should be varying so that consumers are less able to predict their future savings.

The main drivers of volume uplifts due to promotions that are outlined by the literature and summarised by Blattberg are:

  • customers switching purchases from other brands or brand switching
  • purchasing more for inventory or stockpiling
  • current consumers accelerating their purchase (purchase acceleration)
  • new consumers entering the market (primary demand expansion)

Stockpiling and purchase acceleration are said to be mostly unprofitable for the retailer. They could be profitable if they evoke increased future demand and consumption. Category expansion is seen as profitable by Blattberg. Brand switching is beneficial to the retailer if higher margin items are purchased.

As the study of Blattberg suggests, since the 1980’s there has been an increase in promotional activity due to ability to measure the impact that promotions have on the volume of sales. The basic model that has been used when modeling point-of-sale (POS) data is

lnSt = α+β1Pt + β2Dt + β3FAt + β4DISPt + et,

 where St = unit sales at time t, Pt is the regular price at time t, Dt is the promotional discount at time t, FAt is feature advertising (0,1) at time t, and DISPt (0,1) is presence of a display at time t and et is the error term at time t. The underlying assumption when price and promotional discount are separated is that promotions have a differential effect from a price reduction. The total impact of the promotions might be estimated through adding lags of promotional variables. A drawback of the model is that there is multucollinearity due to the presence of feature advertising, presence of display and price discounts at the same time. This means that the estimates are less precise. Another issue is how to account for the promotional discount. For example, when using a semi-log model, as pointed out by Blattberg, the effect of the promotion accelerates with the size of the discount. In this way, smaller discounts would have a limited effect on sales in the model. Modelers also face the decision whether to represent the discount as an absolute value or as a percentage. Moreover, it is worth noting that distinct consumer segments are thought to respond in a different way to promotions and that should be represented in the models as well.

Consumer expectations about prices and discounts have a key role to modelling the changes in volume as well. Customers are likely to postpone their purchase if it is known that a firm will reduce its price. Sun (as cited by Blattberg, 2010) suggests that consumers anticipate promotions accurately.

For businesses it is important to know the reasons why customers respond strongly to sales promotions in order to be able to model and predict the volume of sales with sufficient precision in different scenarios.

We are next going to explore data mining techniques for marketing as they are highly relevant for the case study below. [3] Even though large amounts of customer data are generated, the consumer characteristics and purchase patterns remain largely hidden, according to Shaw. The data mining tools for decision support may be successfully applied in marketing to manage customer relationships given the large amounts of data that are available to businesses. Truly understanding clients by means of data mining and exploring hidden patterns together with efficient knowledge management turn into successful marketing strategies. The authors of the paper outline that there is plenty of literature on data mining techniques as well as many studies with a focus on customer relationship. However, there is no overall framework that links the two. They argue that customer relationship management is only possible by integrating the knowledge discovery process with the creation of marketing strategies. Shaw et al. [4] aim at proposing a systematic methodology (that uses data mining and knowledge management) which could manage the marketing knowledge, support marketing decisions and become the basis for enhancing customer relationship management. In this way individual needs may be addressed to provide a better customer service.

The paper first discusses different data mining tasks and knowledge management as an iterative process, afterwards it continues with useful marketing and customer knowledge obtained through data mining as well as knowledge-based marketing. At the end, possible issues when managing the marketing knowledge are presented.

Pattern extraction is an important part of data mining and is related to finding relationships between subsets of data. It is done through performing different data mining tasks, which could be broadly summarized to five groups according to the knowledge that is extracted by them: dependency analysis, class identification, concept description, deviation detection and data visualization. Dependency knowledge could be used in order to perform market basket analysis and thus to develop promotional strategies. Class identification might be useful in defining customer groups. It could be done through mathematical taxonomy algorithms which, according to Shaw et al., maximize the similarity within the classes and minimize similarity between classes. The other option, when it comes to class identification, is concept clustering which refers to clustering according to similarity between attributes and conceptual cohesiveness in the way it is defined by the domain knowledge. The groups of clients which are created in the process of concept description are based on domain knowledge and the database and there is no forced definition of the groups, as the authors clarify. This technique facilitates summarization (grouping clients according to their characteristics and creating customer profiles), discrimination (differentiation of one class from another) and comparison of marketing and customer knowledge (comparing the class with other records). Deviation detection helps when identifying anomalies and differences from the normal or usual situation. Data visualization proves to be a valuable tool when it comes to observing the existing complex patterns in detail.

Knowledge management, as described in the paper of Shaw, is an iterative process through which data mining techniques are extended to a knowledge management framework. The outcome of the distinct data mining techniques is evaluated and the iterative learning process continues until the model is acceptable. The authors point out that a systematic way to retain, refine and use the model is vital for effectively making decisions. Knowledge might be obtained through data mining, from third parties or it could be refined or refreshed knowledge. When it comes to shared marketing knowledge, its sources are retailers, consumer knowledge based on market research and market knowledge from third-party providers. It should then be integrated and distributed to the decision support applications (those that Shaw et al. outline are category analysis, shelf space management, product development, product promotion and customer relationship management). In that way the customer relationship marketing is facilitated and improved.

Integrating the customer knowledge obtained through data mining with marketing decisions is called knowledge-based marketing. Data mining for knowledge-based marketing could be applied in the areas of customer profiling, deviation analysis and trend analysis. The customer profile is based on information such as frequency, size or recency of the purchases, which could then be used for prospecting, computing customer lifetime values, identifying customer groups, measuring the success of marketing programs or the deviations from normal behavior. Trends analysis involves the evaluation of product performance or forecasting future sales. Marketers are typically interested in knowing how particular marketing programs affect future sales. Data mining could be a powerful tool to discover relationships or specific patterns. Challenges to the marketing knowledge management might be web mining of client data; making the process better structured and, additionally, managing the customer knowledge which is distributed across the supply chain partners. To conclude, data mining could be successfully integrated into a marketing knowledge management framework. That process facilitates knowing customers better and eventually providing them with better service.

The study of Shaw is well-structured and manages to capture all the relevant information clearly and systematically. In the paper the best practices from data mining and customer relationship management are united. The paper does an excellent job in creating an overall framework which could be used to support better informed marketing decisions. One of the main strengths of the study is that it outlines the relationship between data mining and marketing and suggests that the integration of those could lead to improving marketing strategies. Another strength is that it proposes a solution to the changing focus of marketing (from mass marketing to one-to-one relationship marketing), namely through extracting knowledge from existing data and implementing it to support decision-making. Among the weaknesses of the work of Shaw is that there are no metrics to quantify the results of implementing such a framework into businesses. It would be more convincing to present key performance indicators of companies which have tried to improve their customer relationship management in similar ways (before and after the implementation of such a framework. Nevertheless, the paper provides useful insights which could be of help to any company which aims at creating informed marketing strategies and providing better customer service.

Business Understanding

The case study focuses on analyzing the impact of promotions and reducing prices on the volume of sales given the 3 years of weekly data provided by SAP. Additionally, the effect of the actions of the competitors in terms of pricing and promotions on the quantity sold is to be observed as well. This paper aims at identifying the drivers of volume uplift and examining the cannibalization and competitors’ effects as well as the promotional effectiveness. The volume uplift drivers might be the promotion itself, the seasonality of the product, the percentage of the discount, the prices or the quantity that was bought in previous periods, the prices of other competitors on the market, how competitive the market is (how many the competing players are). At the end of the study, the effect of promotions on the volume of sales is evaluated as well as the impact of the main competitors. Consequently, to accomplish those objectives, we would have to complete several tasks. In order to understand how the changes in price affect consumer behavior, we need to calculate a base price which shows an initial level of the price without promotions. We should also calculate the percentage of the discount that the customers see when shopping and test if it has a significant impact on the amounts they buy. If we want to trace the effects of the price or the quantity bought from previous periods, we would have to investigate the lags of the actual prices or the volume. The average market price for a given week or the average prices of the competitors might also be of importance. Another factor we should also examine is the percentage changes in the prices of the retailer and its competitors.

Apart from baseline price, we also estimate the baseline volume, which gives us the amount of product sold given that there is no promotion. By comparing the baseline volume and the actual volume during promotions, we can measure the immediate effect on sales from the promotions.

Data Understanding

SAP provided 3 years of weekly data, which contain information on the sales volume of a specific product, the price of the product as well as the prices of 7 competitors for the same product and whether there is a promotion in a specific week and what its type is.

The prices of the retailer range from 0.758 to 1.222 currency units for the observed period. Only competitor 1 supplies the target product during all of the weeks of interest and his prices are 30% lower than ours on average. The other competitors offer the product for less than half of the time (ranging from 17% to 47% of the time, which means from 6 months during the period to roughly a year and a half).  The prices of all of the competitors are always lower than the own prices, except for Competitor 3 whose price is always higher (by at least approximately 70%).

During the observed period Competitors 4 and 5 appear 3 times on the market for the approximately same periods (around 22 weeks, which is a little more than 5 months). Competitor 2 is present for the same periods, except in the first year. In the last 22 weeks there are 8 market players. This has to be taken into account when the data is split into train and test samples, since it can lead to incorrect model specification. Additionally, this means that the market is becoming more competitive and possibly the number of factors that influence our volume of sales are increasing.

There is no starting date provided in the original data and we assume that this product is seasonal only by looking at the occurrences of the different competitors. This hypothesis needs to be further verified.

As shown in Figure 1, it is evident that when there is a reduction in the price of the retailer, the sales volume increases rapidly. Moreover, it can be noticed (Figure 2) that during promotions the large increase in volume in the first week is followed by a reduction almost to the levels of sales volume without promotion. Thus, it is important to account for this pattern in the data when modelling the relationships.

Figure 1: Volume of sales plotted along with the prices of all competitors on the market for the whole period

The frequency of promotions is another important feature. 58% of the observed time there are promotions. Figure 2 shows sales volume by weeks and types of promotion. The most frequent promotion is C – 37% of the time, and the least frequent is E – 10% of the time. Promotions A and B have approximately the same frequency – around 15% of the time and promotion D occurs 21% of the time. It can also be traced that promotions continue between 3 and 5 weeks, but there is no specific order of occurrence. In the period between weeks 118 and 146 can be seen that almost all of the time promotions are present. Since this is the period when market players are 8 (the highest number), it can be inferred that these frequent promotions aim at pulling customers to this particular product. The more the competitors on the market, the greater the difficulty in retaining the clients. Furthermore, from Figure 1 is evident that the price of the retailer is the highest among competitors’ prices, except for Competitor 3. This means that customers might prefer buying the products of the competitors with lower prices, which would lead to lowering the sales volume of the retailer’s product.

Figure 2: Sales volume by the types of promotion

If we observe the level of correlation between the variables at hand, we would notice that the correlation between the actual price and the volume of sales is high (as expected) and consequently the actual price might turn into a highly significant predictor of the total quantity sold. The relationship is negative indicating that the lower the price, the greater the volume sold, as shown in Figure 1.

From the correlation matrix can be seen that our actual price is positively correlated with the price of competitor 1 (0.64). This means that they might be following our actions (we cannot be sure since correlation is not causation) or if not, at least their pricing strategy resembles ours very much. The rest of the competitors have negative correlation with our price (very close to 0), which means that either they do the opposite of what we do or, which is more likely, their pricing decisions do not take into account ours. All those conclusions are valid for the correlation between competitor 1 and the rest of the market players.

The correlations between competitor 2 and the rest (3, 4, 5, 6, 7) are all positive and relatively high – only with competitor 3 it is smaller – 0.17. This means that competitor 2 and the rest have very similar pricing strategies.

Between competitor 3 and competitor 4 and 5 there is a low negative correlation, but with the rest it is higher and positive. Therefore, between competitor 3, 6 and 7 there are similar decisions regarding prices.

Competitors 4 and 5 have almost 0.96 correlation between prices, which means that they follow the same strategy. With the rest of the market players the correlations of those competitors are also positive, but much lower – almost 0.3.

Between Competitors 6 and 7 there is almost 0.93 correlation, which means that there is a match of their strategies when it comes to pricing.

To sum up, the most significant correlation occurs between Competitors 4 and 5 and between competitors 6 and 7 – above 0.9. Between the retailer and Competitor 1 it is also very high – 0.65, and between Competitors 2 and 4 it is 0.7.

The volume of sales is strongly correlated with our price and the price of Competitor 1. The lowest correlations of the volume of sales, which are very close to 0, are with Competitors 4 and 5, followed by Competitor 2. The rest are also low – around 0.25. We can draw the conclusion that mainly the weekly pricing decisions of competitor 1 influence the quantity that we sell. Therefore, we are interested in the decisions of our main competitor, namely competitor 1, the most, since the others do not seem to be significant.

When it comes to our promotional prices, the correlation between only the promotional prices and the volume of sales is almost 0.5. There is a negative correlation between our promo prices and those of Competitor 1 – 0.37, and between our promo prices and those of Competitor 3 – 0.2. This might mean that these two competitors also make promotions when we do and thus they affect our volume of sales negatively.

Data Preparation

Firstly, we begin with examining the sales volume as a time series. We decompose it in order to see whether there are significant seasonal or trend patterns. Figure 3 presents the sales volume in the observed period and Figure 4 presents the obtained results from the decomposition:


Figure 3: Sales volume in the observed period

The frequency is 52 since there are approximately 52 weeks in a year. From Figure 3 can be seen that in the end of the period the frequency of promotions is very high, as it was mentioned in the previous section (see Figure 2). A downward trend can also be spotted until the beginning of the second year when the maximum sales volume begins to increase slightly.

Figure 4: Sales volume decomposition into seasonal and trend components

The observed results show that there is some trend in the series, but regarding seasonality it is not capturing it well. It seems that the seasonal component in this case cannot be well separated from the promotional effect. Thus, if subtracted from the original series, some of the promotional effects will be lost. Therefore, we believe it is preferable to continue working with the original series.

The next step in the current analysis is to derive baseline sales volume and baseline price from the original sales volume and actual price. These new variables are obtained by using the original values for the periods with no promotion and then replacing the missing values by using exponential weighted moving average[5]. The results are presented on the figures below:

Figure 5: Original sales volume (in black) and the derived baseline volume (in red)

As can be seen from Figure 5, the baseline volume tries to represent the case when there were no promotional activities. This is important for the estimation of the promotional effectiveness. The derived baseline volume is not as smooth as it has to be, thus this may impose problems on the accurate estimation of the promotional effect.

Figure 6: Original price values (in black) and derived baseline price (in red)

The baseline price has the highest values, since it represents the case when there are no promotional activities. Similarly to the baseline sales volume, the difference between the baseline price and the actual price measures the effect from promotions.

These series, baseline price and baseline sales volume, are not as smooth as they have to be in order to achieve good estimation of the promotional effect. However, for the purposes of the current analysis no further corrections are made.

Further examination of the series is conducted regarding stationarity and significance of past values. The Dickey-Fuller test for stationarity results in the rejection of the null hypothesis of stationarity. Figure 7 shows the autocorrelation function of the sales volume series:

Figure 7: Autocorrelation function of sales volume

From the figure above can be seen that there are significant lags at weeks 3, 5, 8, 11, 16, 24, 27 and 35. This may be due to the frequency of promotions or the seasonal participation of some of the competitors.

Figure 8: Partial autocorrelation function of sales volume

The PACF shows that there are significant lags up to lag 31 – 5, 6, 8, 16 and 31. Again, these significant lags may be caused by the promotional activities and the appearance of different competitors in different times of the year.

To further account for all possible effects the following variables were calculated:

  • percentage difference between the actual price and the baseline price – captures the “real” change in the price that resulted from a promotion – during non-promotional periods its value is 0; during promotional periods it gives the promotional discount as a fraction of the baseline price
  • ratio between the actual price and the baseline price
  • change in the actual price on week-to-week basis
  • average market pricederived using the actual price plus the prices of the firms participating in the market for the respective week
  • ratio between the actual price and the average market priceshows the relative place of the particular market player
  • ratios between the actual price and the prices of all other competitorsshows the position of the company against all its competitors
  • percentage difference in the price of the competitors on a week-to-week basisshows the effect of a price reduction or a promotion run by a competitor
  • a vector in which are included only the values of the actual price during promotion weeks and zeros during non-promotional weeks – captures the effect from promotions
  • a vector including only change of actual price during promotional weeks – again for capturing the effect from promotions
  • percentage change in the sales volume
  • percentage change in the sales volume with lag 1 – taking the lag of the percentage change in volume accounts for the fact that customers tend to pile up and stock goods during promotion periods. This changes their shopping behavior and it is plausible to expect that they will buy less of the product in the periods following the promotion
  • sales volume with lags 3, 5, 6, 8, 11, 16, 24, 27, 31 and 35

Modeling and Results

This section discusses the details regarding model specification.

The analysis of the effect of promotions on the volume of sales begins with creating a simple linear model with the initially available variables – the actual price of the product and the prices of all the competitors. Coefficients were estimated using the ordinary least squares (OLS) method. By fitting only the available in the beginning variables, the model was not sufficiently accurate. Its coefficient of determination (R2) was about 0.55 on the training set.

To increase the accuracy of the model are taken into account all new variables (see previous section). Feature selection is conducted by using algorithms like best subset, forward subset and backward subset selection. Different models are compared and some of the models do not include features suggested by some of the abovementioned algorithms.

The train and test split is as follows: train set – 110 observations, test set – 36 observations. There are several specifics of the dataset, which are taken into account. First, this is a time series dataset, which means that train and test split should not be random. However, at the end of the period, between weeks 119 and 146, there are 8 market players and before that some of them have never occurred. This might impose problems, since the train data would not include all competitors, but only some of them, thus every trained model might achieve low performance on the test set. Therefore, it is more preferable to use random split so that the two sets to be similar as much as possible.

Regarding the functional forms, several options were considered – linear, log-linear and log-log. Additionally, taking into account the dependencies between the response variable and the explanatory variables, were also considered spline regression models. However, the best functional form in this case turned out to be log-linear.

If the actual price of the product is the only explanatory variable included in the model, then the R2 is 0.5179[6] with significant intercept and coefficient of the actual price. This coefficient is negative (-2.9831), which is expected, since when the price rises, the sales volume decreases by the amount of the estimated coefficient value. The same is true for the relative change in actual price. The baseline price turns out to be insignificant in the explanation of the variation in the sales volume. The inclusion of the price of Competitor 3 turns out to be significant along with the relative change in the actual price. This model gives R2 equal to 0.5894. The same is true for the prices of competitors 1, 3, 6 and 7. However, the inclusion of the remaining competitors’ prices is insignificant to the explanation of the variance in the response variable.  If only the actual price during promotional weeks is taken into account, then 0.2361 of the variance is explained. Together with the relative change in actual price 0.62 of the variance is explained. The coefficient for the actual price in promotional weeks only is smaller than 1 (0.26), but it is positive, which is opposite to what is expected. However, the effect is very small, thus it could be neglected. The inclusion of the price of Competitor 3 in the previous model leads to increase in R2 – 0.6596. The coefficient for this variable is positive, which means that the presence of this competitor in the market leads to increase in our sales volume. Although the value of the coefficient is less than 1 and close to 0 (0.14) this effect is expected, since the price of this competitor is always higher than the actual price of the product, which means that customers prefer the product with the lower price. The further inclusion of the lag values in the model does not lead to a significant change in R2.

Taking into account the described above analytical steps, the following variables were included in the final log-linear model[7]:

  • the percentage discount in the actual price,
  • the prices of competitors 3,6 and 7,
  • the two vectors containing prices in promotional or non-promotional weeks only,
  • the ratio between the actual price and the prices of competitors 6 and 7
  • the lag of the percentage change in the volume of sales.

All of the variables were statistically significant at the 95% level. The achieved an adjusted R2 was 0.7474 on the training set. After the model was trained two formal tests were implemented to check for heteroscedasticity or non-normal distribution of the residuals of the model. Breusch-Pagan (BP) test was used for the detection of heteroscedasticity. The BP test statistic was 3.4023 and the p-value stood at 0.9462, therefore it can be concluded that no heteroscedasticity is present in the residuals of the model. Next, the Shapiro-Wilk test was used to determine whether the residuals were normally distributed. By looking at the histogram of the residuals it is possible to conclude that they were indeed normally distributed. The Shapiro-Wilk test statistics was 0.98902 and the p-value stood at 0.5161. Therefore, it is not possible to reject the null hypothesis of normally distributed residuals. The partial autocorrelation function of the residuals showed no significant lag values.

The model was then fit on the validation set. The results were an R2 of 0.677 and RMSE (root mean squared error) of 0.23 and MAE (mean absolute error) of 0.163.

The coefficients with the largest values are those for the prices of competitors 6 and 7, respectively the ratio between the actual price and each of these prices. The coefficient for the price of Competitor 6 is positive, which means that its presence in the market leads to a large increase in the sales volume. The opposite is true for Competitor 7, since the coefficient is negative. The net result from the inclusion of these two market players is positive, i.e. increase in sales volume. These large coefficients might be caused by two facts – first, these two competitors enter the market in the last 27 weeks when all market players are already there, making it even more competitive; second – since in these weeks the market players are 8, which is the maximum for the observed period, the promotions become highly frequent and this leads to higher sales volume on average for this period.

The inclusion of competitor 3 in the market leads to the same effect as already discussed – very small positive change in sales volume.

The percentage discount in the actual price could take both positive and negative values depending on whether there is promotion or no in a specific week. The coefficient is negative (-1.63), which is expected. If there is a discount in the actual price, then the value of the variable will be negative, thus making the effect positive, i.e. increase in sales volume.

The coefficient for the two variables with either only prices in promotional weeks or only prices in non-promotional weeks are negative and below 1 (-0.58 and -0.82 respectively). The fact that the coefficient for the promotional weeks is lower than the coefficient of the non-promotional weeks means that promotions have a positive effect on sales volume.

Finally, the last variable included in the model is the lag of the percentage change in the volume of sales. The coefficient is very close to 0 and is negative (-0.06), which implies that if the change in volume is positive then for the current week lower sales volume will be expected and vice versa.


The final model that was chosen has a relatively high explanatory power and it is an improvement of the regression with the initial variables only. It accounts for the effects of promotions and the volume uplift drivers (through the percentage discount in the actual price and the two vectors containing prices in promotional or non-promotional weeks only) as well as for the effects that competitors have on total amount of sales (through the the prices of competitors 3,6 and 7, the ratio between the actual price and the prices of competitors 6 and 7). Moreover, the quantity that was bought in previous periods might impact the current volume of sales and that is considered through including the lag of the percentage change in the volume of sales. In addition, the model that was composed attempts at following the business logic behind the problem so that each of the resulting coefficients is meaningful to the retailer. A major drawback is that there is slight multicollinearity and that might lead to less precise results. Another weakness is that in the residuals appears a cyclical component (see Appendix) and efforts might be made in order to remove its effects. Exploring the seasonality patterns of the data in greater depth and better accounting for the existing trend might lead to improvements of the explanatory power of the model and the significance of the obtained results. Furthermore, better smoothing of the baseline price and baseline volume would prove of importance when explaining the promotional effects on the volume of sales.


The current paper presents the results of the analysis performed on the SAP case study data. The finally chosen model has relatively strong explanatory power. However, a number of improvements could be made in order to increase the precision and the significance of the results. A possibility for further research would be to apply a threshold autoregressive model in order to account better for the entrance and presence of some of the competitors. With refinements, the model could prove to be useful to the retailer for the decision making processes and the creation of marketing strategies.





COMP6_PERC + COMP7_PERC + LAG_PERC_VOL, data = training)



Min       1Q   Median       3Q      Max

-0.71961 -0.14021 -0.00557  0.13744  0.44300



Estimate Std. Error t value Pr(>|t|)

(Intercept)               11.95330    0.19132  62.478  < 2e-16 ***

DICOUNT_PRICE_PERCENTAGE  -1.62887    0.20021  -8.136 1.16e-12 ***

COMPETITOR3_PRICE          0.10959    0.04026   2.722 0.007646 **

COMPETITOR6_PRICE         59.25102   15.80614   3.749 0.000298 ***

COMPETITOR7_PRICE        -48.29877   12.55855  -3.846 0.000211 ***

PRICE_NOPROMO             -0.82354    0.18495  -4.453 2.21e-05 ***

PRICE_PROMO               -0.58865    0.19556  -3.010 0.003306 **

COMP6_PERC               -54.21621   14.54790  -3.727 0.000321 ***

COMP7_PERC                44.41125   11.55255   3.844 0.000213 ***

LAG_PERC_VOL              -0.06395    0.03088  -2.071 0.040909 *

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


Residual standard error: 0.2203 on 100 degrees of freedom

Multiple R-squared:  0.7683,   Adjusted R-squared:  0.7474

F-statistic: 36.84 on 9 and 100 DF,  p-value: < 2.2e-16


postResample(pr, ppt)     RMSE  Rsquared       MAE

0.2304883 0.6771820 0.1630477




Austin, C., & Stayerberg, E. (2015). The number of subjects per variable requeired in linear regression analyses. Jounral of Clinical Epidemiology 68, 627-636.

Blattberg, R., & Briesch, R. (2010). Sales promotions. Oxford Handbook of Pricing Management.

Hanley, J. (2016). Simple and multiple linear regression: sample size consideratioms. Journal of Clinical Epidemiology 79, 112-119.

Shaw, M., Subramaniam, C., Tan, G., & Welge, M. (2001). Knowledge management and data mining for marketing. Decision Support Systems 31, 127-137.


[1] Blattberg,R., Briesch, R. (2010) Sales Promotions, Oxford Handbook of Pricing Management

[2] Blattberg,R., Briesch, R. (2010) Sales Promotions, Oxford Handbook of Pricing Management

[3] Shaw, M., Subramaniam, C., Tan, G., Welge, M. (2001) Knowledge management and data mining for marketing. Decision Support Systems 31 (2001) 127–137

[4] Shaw, M., Subramaniam, C., Tan, G., Welge, M. (2001) Knowledge management and data mining for marketing. Decision Support Systems 31 (2001) 127–137

[5] The moving average window used is 3 observations – 3 left and 3 right

[6] These R2 values are taken only on the train set

[7] Detailed results regarding the model specification is included in the Appendix

Share this

2 thoughts on “Antelope SAP

  1. 0

    The most difficult part of this challenge is to understand the data, create new features and rerun the predictive models till you achieve a good accuracy.
    As you may mentioned if you run a predictive model with the initial dataset you will get an extremely low modelling accuracy.

    I will vote based on the below criteria:

    1. business understanding
    2. feature engineering
    3. modelling accuracy
    4. insights & final results

    Your approach was on the right path but you didn’t managed to create the new features that will help you identify the volume uplift drivers and increase the accuracy of the model.
    You understand the data and the business challenge but you didn’t manage to increase the accuracy of the 1st model by using feature engineering.
    You could increase the accuracy of the model by implementing a base price algorithm and then by taking the % of difference between the actual and the base price you could extract the weekly promotional price reduction and use it as input parameters for your regression model.

  2. 1

    Hi team,
    You have underestimated the data understanding, EDA and feature engineering. It is an important part of data science. Having a visual representations would be nice. Also tables and some numbers are welcome in the paper.
    – zenpanik

Leave a Reply