Navigating Price Elastic Modelling (Part-3): Building the Price Elasticity Model
Welcome to the thrilling conclusion of our Price Elasticity Modelling series! In this final installment, we embark on the most critical phase of our project — building the Price Elasticity Model. Armed with a meticulously cleaned dataset and insightful exploratory analysis, we are now ready to unlock the secrets of price elasticity for various electronic products. Join us as we dive into the world of statistical analysis and unleash the power of data-driven pricing strategies.
Recap of the Journey:
In our previous blogs, we laid the foundation for this adventure. In an introductory blog we explored the concept of price elasticity, a vital economic metric that measures the responsiveness of product demand to changes in price. In Part 1, we embark on the crucial first step — Data Cleaning. As any seasoned data scientist will tell you, a robust analysis begins with clean, reliable data. Part 2 saw us conducting extensive Exploratory Data Analysis (EDA), unearthing valuable insights about product prices, sales trends, and the influence of various factors on demand. If you missed the earlier parts of our journey, catch up here: [Link to Part 1] [Link to Part 2]
Building the Price Elasticity Model:
To construct the Price Elasticity Model, we follow these key steps:
Importing Essential Libraries:
We begin by importing essential Python libraries, such as Matplotlib for data visualization, Seaborn for enhanced visualizations, and Statsmodels for statistical analysis.
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
import warnings
warnings.filterwarnings("ignore")
sns.set_style("darkgrid")
Loading the Cleaned Dataset:
We load the cleaned dataset we prepared in Part 1 of our journey. This dataset contains valuable information about electronic products, including prices, availability, conditions, and more.
df = pd.read_csv('Cleaned_dataset.csv')
Preparing the Data:
To simplify the analysis, we narrow our focus to a specific merchant and category, Walmart.com and TVs. We then convert week-level price change data for all products and calculate the mean impression count for each week.
#filtered only Walmart store and Tv category just to keep it simple ( your choice)
walmart_df = df[df['merchant'] == 'Walmart.com']
walmart_df = walmart_df[walmart_df.Category_name == 'tv, television, led']
# Converting week level data Price change for all the different products
Week_price_df = walmart_df.groupby(['name', 'Week']).agg({'Discount_price':'mean','Impression':'sum'}).reset_index()
x_pivot = Week_price_df.pivot(index='Week', columns='name' ,values='Discount_price')
Imputing Missing Values:
We handle missing values in the data by imputing them using forward fill. This assumption is based on the idea that if a week’s price data is missing, the price does not change in the following weeks.
x_values = pd.DataFrame(x_pivot.to_records()) # This step is performed to ensure compatibility with the fillna() method since it requires a 2D array-like input.
x_values.fillna(method='ffill', inplace=True)
Calculating Price Elasticity:
The heart of our Price Elasticity Model lies in calculating the price elasticity for each product’s demand. We use Ordinary least-squares (OLS) models to achieve this. This technique fits a model of a relationship between one or more explanatory variables and a continuous or at least interval outcome variable that minimizes the sum of square errors.
final_df = pd.DataFrame()
for col in x_values.columns[1:]:
#to store the results of the analysis for each column.
results_values = {
"name": [],
"price_elasticity": [],
"price_mean": [],
"quantity_mean": [],
"intercept": [],
"t_score":[],
"slope": [],
"coefficient_pvalue" : [],
"rsquared": [],
}
#to hold the data for the current column being processed.
temp_df1 = pd.DataFrame()
temp_df1['x'] = x_values[col]
temp_df1['y'] = y_values[col]
#Drops any rows with missing values in temp_df1
temp_df1.dropna(inplace=True)
x_value = temp_df1['x']
y_value = temp_df1['y']
X = sm.add_constant(x_value)
#Constructs an Ordinary Least Squares (OLS) regression model using sm.OLS() from statsmodels, with y_value as the dependent variable and x_value as the independent variable.
model = sm.OLS(y_value, X)
result = model.fit()
#choose only those whose P-value is less than 5% errornous (5% significance level)
if result.f_pvalue < 0.05:
rsquared = result.rsquared
coefficient_pvalue = result.f_pvalue
try:
intercept,slope = result.params
except:
slope = result.params
mean_price = np.mean(x_value)
mean_quantity = np.mean(y_value)
try:
tintercept, t_score = result.tvalues
except:
pass
#Price elasticity Formula
price_elasticity = (slope)*(mean_price/mean_quantity)
#Append results into dictionary for dataframe
results_values["name"].append(col)
results_values["price_elasticity"].append(price_elasticity)
results_values["price_mean"].append(mean_price)
results_values["quantity_mean"].append(mean_quantity)
results_values["intercept"].append(intercept)
results_values['t_score'].append(t_score)
results_values["slope"].append(slope)
results_values["coefficient_pvalue"].append(coefficient_pvalue)
results_values["rsquared"].append(rsquared)
final_df = pd.concat([final_df,pd.DataFrame.from_dict(results_values)],axis=0,ignore_index=True)
#After processing all the columns in x_values, the resulting DataFrame final_df contains the summarized analysis results for each column, including the price elasticity, mean price, mean quantity, intercept, t-score, slope, coefficient p-value, and R-squared value.
Analyzing the Results:
The resulting DataFrame contains the product name, price elasticity, mean price, and mean quantity for each product. We sort the products based on their price elasticity to identify the most and least price-sensitive products.
final_df = final_df.sort_values(by=['price_elasticity'])
Visualizing Price Elasticity:
We use Seaborn to create a captivating bar plot showcasing the price elasticity scores for different products. This visualization helps us identify the products with the most significant impact on demand due to price changes.
# temp_df
plt.figure(figsize=(12,10))
ax= sns.barplot(x=temp_df['price_elasticity'],y=temp_df['name'])
ax.bar_label(ax.containers[0])
In the above we notice that Samsung-65 Class LED TV with a negative price elasticity of -17.68, it is described as follows: “A 10% price decrease in Samsung-65 Class LED, it increases sales demand by 170.6% or a 10% price increase, it decreases sales demand by 170.6%”
Whereas in other case, Sony XBR-X850E-Series 75-Class TV with a negative price elasticity of 7.19 & interpreted as — “A 10% price decrease in XBR-X850E-Series 75-Class TV, it decreases sales demand by 71.2% or a 10% price increase, it increases sales demand by 71.2%”
Conclusion
Throughout this enthralling journey, we learned the significance of price elasticity in pricing strategies and explored its application in the real world of electronic products. Armed with the Price Elasticity Model, businesses can now make informed decisions about pricing and maximize their profits. Thank you for joining me on this exhilarating adventure.