Homework 2: Replication of Berry, Levinsohn & Pakes (1995)

Hasan K. Tosun, December 2017

In this homework, we are going to replicate "Automobile Prices in Market Equilibrium" paper by Berry, Levinsohn and Pakes (1995, Econometrica) [BLP hereafter].

We use the data provided by Prof. Amil Petrin.

We use R (in Part 1) and Python (in Part 2) for this homework.

Let's get started.

Part 1: Data Preparation and Preliminary Estimation

Step 0: Import the necessary R packages.

In [1]:
library(MASS)
library(tidyverse)
library(bbmle)
library(pracma)
library(AER)

Step 1: Load the dataset into R, and name the columns.

In [2]:
data <- read_tsv("cardata.txt", col_types = cols());
names(data) <- c("modelid", "year", "hpwt","size","air","mpd","price","share","firmid")

Step 2: Take a look at the data.

In [3]:
glimpse(data)
Observations: 2,217
Variables: 9
$ modelid <chr> "AD100L", "ADSUPE", "AMAMBS", "AMGREM", "AMHORN", "AMJAVL",...
$ year    <int> 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71,...
$ hpwt    <dbl> 0.4879084, 0.4657662, 0.4524887, 0.5289969, 0.4943244, 0.46...
$ size    <dbl> 1.2627, 1.1136, 1.6458, 1.1502, 1.2780, 1.4592, 1.6068, 1.7...
$ air     <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
$ mpd     <dbl> 1.982720, 2.201909, 1.504286, 1.888146, 1.935989, 1.716800,...
$ price   <dbl> 8.876543, 7.641975, 8.928395, 4.935803, 5.516049, 7.108642,...
$ share   <dbl> 0.0002810, 0.0000376, 0.0004424, 0.0010513, 0.0006701, 0.00...
$ firmid  <int> 7, 7, 15, 15, 15, 15, 15, 19, 19, 19, 19, 19, 19, 19, 19, 1...
In [4]:
summary(data)
   modelid               year            hpwt             size      
 Length:2217        Min.   :71.00   Min.   :0.1705   Min.   :0.756  
 Class :character   1st Qu.:77.00   1st Qu.:0.3366   1st Qu.:1.131  
 Mode  :character   Median :82.00   Median :0.3750   Median :1.270  
                    Mean   :81.54   Mean   :0.3944   Mean   :1.310  
                    3rd Qu.:87.00   3rd Qu.:0.4275   3rd Qu.:1.453  
                    Max.   :90.00   Max.   :0.9476   Max.   :1.888  
      air              mpd             price            share          
 Min.   :0.0000   Min.   :0.8461   Min.   : 3.393   Min.   :0.0000007  
 1st Qu.:0.0000   1st Qu.:1.5570   1st Qu.: 6.714   1st Qu.:0.0001863  
 Median :0.0000   Median :2.0104   Median : 8.729   Median :0.0005698  
 Mean   :0.2418   Mean   :2.0849   Mean   :11.761   Mean   :0.0009732  
 3rd Qu.:0.0000   3rd Qu.:2.4828   3rd Qu.:13.074   3rd Qu.:0.0012855  
 Max.   :1.0000   Max.   :6.4368   Max.   :68.597   Max.   :0.0094728  
     firmid     
 Min.   : 1.00  
 1st Qu.: 8.00  
 Median :16.00  
 Mean   :13.74  
 3rd Qu.:19.00  
 Max.   :26.00  

So, the data covers the period between 1971 and 1990. There are different number of car models provided. Call it $J_t$. Define $J \equiv \sum_{t=71}^{90} J_t$.

Model Description

Suppose the utility to an individual is: $$ u_{ij} = \alpha (y_{i}-p_j) + x_j\beta + \epsilon_{ij} $$ with $j=1,\ldots,J_t$ being the cars in year $t$, $i\in [0,1]$ the individuals, $y_{i}$ income, $x_j$ observed car characteristics, $p_j$ price. Suppose also that $\epsilon_{ij}$ has type-1 extreme value distribution. Note that the market shares in the data accounts for outside good, i.e. share sums are equal to the fraction of people who have bought a car. Denote the outside good as good 0, i.e. $j=0$ for each year.

Suppose we have the aggregate data (market shares) rather than the micro-level data (who buys what).

Let's first derive the likelihood function for the aggregate data.

Maximum Likelihood Estimation

So, if individual $i$ buys good $j$, it means that good $j$ provides him more utility than any other good $k$, including the outside good. Therefore, we can say that the probability of individual $i$ buying $j$ at year $t$ is equal to

$$ \begin{align} Pr(i \text{ buys good } j \text{ in year } t) &= Pr(u_{ijt}> u_{ikt}, \forall k= {0,1,\ldots,J_t}, k\neq j) \\ &= Pr(\alpha (y_{it}-p_{jt}) + x_{kt}\beta + \epsilon_{ijt}> \alpha (y_{it}-p_{kt}) + x_{kt}\beta + \epsilon_{ikt} , \forall k= {0,1,\ldots,J_t}, k\neq j) \\ &= Pr(-\alpha p_{jt} + x_{jt}\beta + \epsilon_{ijt} > - \alpha p_{kt} + x_{kt}\beta + \epsilon_{ikt}, \forall k= {0,1,\ldots,J_t}, k\neq j)\\ &= \frac{\exp(-\alpha p_{jt} + x_{jt}\beta)}{1 + \sum_{k=1}^{J_t} \exp(-\alpha p_{kt} + x_{kt}\beta)} \end{align} $$

This is true for every $j$, i.e. for every good in each year. Assuming that the individuals are distributed with cdf $T$, we can write the probability of good $j$ being sold in year $t$ as

$$ \begin{align} Pr(\text{good } j \text{ being sold in year } t) &= \int_i Pr( i \text{ buys good } j \text{ in year } t) dT(i) \\ & = \int_i \frac{\exp(-\alpha p_{jt} + x_{jt}\beta)}{1 + \sum_{k=1}^{J_t} \exp(-\alpha p_{kt} + x_{kt}\beta)} dT(i) \\ & = \frac{\exp(-\alpha p_{jt} + x_{jt}\beta)}{1 + \sum_{k=1}^{J_t} \exp(-\alpha p_{kt} + x_{kt}\beta)} \int_i dT(i) \\ & = \frac{\exp(-\alpha p_{jt} + x_{jt}\beta)}{1 + \sum_{k=1}^{J_t} \exp(-\alpha p_{kt} + x_{kt}\beta)} \end{align} $$

Note that the data counterpart of $s_{jt}$ is the market share of good $j$ in year $t$. That is, we can say that if individuals are in the unit interval, $s_{jt}$ many individuals buy good $j$ in year $t$. Therefore, we can write the likelihood function as $$ \begin{align} L & = \prod_{t=71}^{90} \prod_{j=0}^{J_t} (Pr(\text{good } j \text{ being sold in year } t))^{s_{jt}} \\ & = \prod_{t=71}^{90} \prod_{j=0}^{J_t} \left(\frac{exp(-\alpha p_{jt} + x_{jt}\beta)}{1 + \sum_{k=1}^{J_t} \exp(-\alpha p_{kt} + x_{kt}\beta)}\right)^{s_{jt}} \end{align} $$

So, the log-likelihood function is $$ \begin{align} logL \equiv log(L) & = \sum_{t=71}^{90} \sum_{j=0}^{J_t} s_{jt} \left[\log\left(\frac{exp(-\alpha p_{jt} + x_{jt}\beta)}{1 + \sum_{k=1}^{J_t} \exp(-\alpha p_{kt} + x_{kt}\beta)}\right)\right] \\ & = \sum_{t=71}^{90} \sum_{j=0}^{J_t}\left[ s_{jt} (-\alpha p_{jt} + x_{jt}\beta) - s_{jt}\log\left(1 + \sum_{k=1}^{J_t} \exp(-\alpha p_{kt} + x_{kt}\beta)\right)\right] \end{align} $$

For notational simplicity, define $\delta_{jt} \equiv -\alpha p_{jt} + x_{jt}\beta$. Also, normalize $\delta_{0t} = 0$. Therefore,

$$ \begin{align} logL & = \sum_{t=71}^{90} \sum_{j=0}^{J_t} \left[s_{jt} \delta_{jt} - s_{jt}\log\left(1 + \sum_{k=1}^{J_t} \exp(\delta_{kt})\right)\right]\\ & = \sum_{t=71}^{90} s_{0t}\delta_{0t} - \sum_{t=71}^{90} s_{0t}\log\left(1 + \sum_{k=1}^{J_t} \exp(\delta_{kt})\right) + \sum_{t=71}^{90} \sum_{j=1}^{J_t} \left[s_{jt} \delta_{jt} - s_{jt}\log\left(1 + \sum_{k=1}^{J_t} \exp(\delta_{kt})\right)\right]\\ & = - \sum_{t=71}^{90} s_{0t}\log\left(1 + \sum_{k=1}^{J_t} \exp(\delta_{kt})\right) + \sum_{t=71}^{90} \sum_{j=1}^{J_t} \left[s_{jt} \delta_{jt} - s_{jt}\log\left(1 + \sum_{k=1}^{J_t} \exp(\delta_{kt})\right)\right]\\ & = -\sum_{t=71}^{90} \log\left(1 + \sum_{k=1}^{J_t} \exp(\delta_{kt})\right) \left(s_{0t} + \sum_{j=1}^{J_t} s_{jt}\right) + \sum_{t=71}^{90} \sum_{j=1}^{J_t} s_{jt} \delta_{jt} \\ & = -\sum_{t=71}^{90} \log\left(1 + \sum_{k=1}^{J_t} \exp(\delta_{kt})\right) + \sum_{t=71}^{90} \sum_{j=1}^{J_t} s_{jt} \delta_{jt} \\ \end{align} $$

We have the data for market share $s_{jt}$, prices $p_{jt}$, and observable car characteristics $x_{jt}$. Therefore, we can minimize the function $logL$ with respect to parameters $\alpha$ and $\beta$ to get ML estimates for them.

Now, we are ready estimate our model by using the loglikelihood function given above. But first, let's describe what observable characteristics, $x_{jt}$ are.

In our dataset, four series of car characteristics that can influence consumer decision are provided. They are: 0. Constant (1), 1. Horsepower/Weight, 2. Air Conditioner is standard or not, 3. Miles per dollar, 4. Car size.

Now, let's translate the mathematical expression for log-likelihood into a computer code:

In [5]:
minusLL <- function(beta0, beta1, beta2, beta3, beta4, alpha){
    data <- data %>%  mutate(delta = beta0 + beta1 * hpwt + beta2 * air + beta3 * mpd + beta4 * size - alpha * price)
    term1 <- data %>%  mutate(temp1 = exp(delta)) %>%
                       group_by(year) %>%
                       summarize(temp2 = sum(temp1)) %>% 
                       mutate(temp3 = log(1+temp2)) %>%
                       summarize(t = -sum(temp3))
    
    term2 <- data %>% mutate(temp1 = share * delta) %>%
                      summarize(t = sum(temp1))
  return(-(term1$t+term2$t))
}

Now, it's time to find the maximum likelihood. So, we minimize the function $minusLL(\beta_0, \beta_1,\beta_2,\beta_3,\beta_4,\alpha)$.

In [6]:
mlest <- mle2(minusLL, start = list(beta0 = 0, beta1 = 0, beta2 = 0, beta3 = 0, beta4 = 0, alpha = 0))
summary(mlest)
Maximum likelihood estimation

Call:
mle2(minuslogl = minusLL, start = list(beta0 = 0, beta1 = 0, 
    beta2 = 0, beta3 = 0, beta4 = 0, alpha = 0))

Coefficients:
      Estimate Std. Error z value  Pr(z)
beta0 -8.02514    7.79831 -1.0291 0.3034
beta1 -0.19475    9.89757 -0.0197 0.9843
beta2  0.19176    3.07492  0.0624 0.9503
beta3  0.12150    1.35427  0.0897 0.9285
beta4  1.73457    3.77460  0.4595 0.6458
alpha  0.13824    0.31409  0.4401 0.6598

-2 log L: 33.19545 

Own-Price Elasticities with MLE

We can calculate the own-price elasticities by $$\begin{equation} \epsilon = \alpha p_{jt} (1-s_{jt}) \end{equation} $$

In [7]:
ownprice_MLE <-  as.numeric(coef(mlest)["alpha"])*data$price*(1-data$share)
In [8]:
HDACCO90_own_elas_mle <-  as.numeric(data[data$year == 90 & data$modelid == "HDACCO",]["price"] * coef(mlest)["alpha"] * (1-data[data$year == 90 & data$modelid == "HDACCO",]["share"]))
FDESCO90_own_elas_mle <-  as.numeric(data[data$year == 90 & data$modelid == "FDESCO",]["price"] * coef(mlest)["alpha"] * (1-data[data$year == 90 & data$modelid == "FDESCO",]["share"]))
TYCORO73_own_elas_mle <-  as.numeric(data[data$year == 73 & data$modelid == "TYCORO",]["price"] * coef(mlest)["alpha"] * (1-data[data$year == 73 & data$modelid == "TYCORO",]["share"]))
HDCIVI74_own_elas_mle <-  as.numeric(data[data$year == 74 & data$modelid == "HDCIVI",]["price"] * coef(mlest)["alpha"] * (1-data[data$year == 74 & data$modelid == "HDCIVI",]["share"]))
cat("The own-price elasticity of 1990 Honda Accord is ", HDACCO90_own_elas_mle, ".\n",sep ="")
cat("The own-price elasticity of 1990 Ford Escort is ", FDESCO90_own_elas_mle,".\n",sep ="")
cat("The own-price elasticity of 1973 Toyota Corolla is ", TYCORO73_own_elas_mle,".\n",sep ="")
cat("The own-price elasticity of 1974 Honda Civic is ", HDCIVI74_own_elas_mle,".\n", sep ="")
The own-price elasticity of 1990 Honda Accord is 1.27888.
The own-price elasticity of 1990 Ford Escort is 0.7805036.
The own-price elasticity of 1973 Toyota Corolla is 0.8333769.
The own-price elasticity of 1974 Honda Civic is 0.6657317.

OLS

In [9]:
data <- data %>% group_by(year) %>% mutate(s0 = 1-sum(share))
data <- data %>% mutate(logdif = log(share)-log(s0))

lmodel <- lm(logdif ~ hpwt + air + mpd + size + price, data = data)
summary(lmodel)
Call:
lm(formula = logdif ~ hpwt + air + mpd + size + price, data = data)

Residuals:
    Min      1Q  Median      3Q     Max 
-6.3550 -0.6534  0.1371  0.7134  2.4603 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -10.071557   0.252911 -39.823  < 2e-16 ***
hpwt         -0.124185   0.277269  -0.448    0.654    
air          -0.034350   0.072815  -0.472    0.637    
mpd           0.264999   0.043123   6.145 9.45e-10 ***
size          2.342060   0.125196  18.707  < 2e-16 ***
price        -0.088638   0.004026 -22.015  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.083 on 2211 degrees of freedom
Multiple R-squared:  0.3871,	Adjusted R-squared:  0.3857 
F-statistic: 279.2 on 5 and 2211 DF,  p-value: < 2.2e-16

Own-Price Elasticities with OLS

In [10]:
HDACCO90_own_elas_ols <-  as.numeric(data[data$year == 90 & data$modelid == "HDACCO",]["price"] * coefficients(lmodel)["price"] * (1-data[data$year == 90 & data$modelid == "HDACCO",]["share"]))
FDESCO90_own_elas_ols <-  as.numeric(data[data$year == 90 & data$modelid == "FDESCO",]["price"] * coefficients(lmodel)["price"] * (1-data[data$year == 90 & data$modelid == "FDESCO",]["share"]))
TYCORO73_own_elas_ols <-  as.numeric(data[data$year == 73 & data$modelid == "TYCORO",]["price"] * coefficients(lmodel)["price"] * (1-data[data$year == 73 & data$modelid == "TYCORO",]["share"]))
HDCIVI74_own_elas_ols <-  as.numeric(data[data$year == 74 & data$modelid == "HDCIVI",]["price"] * coefficients(lmodel)["price"] * (1-data[data$year == 74 & data$modelid == "HDCIVI",]["share"]))

cat("The own-price elasticity of 1990 Honda Accord is ", HDACCO90_own_elas_ols, ".\n",sep ="")
cat("The own-price elasticity of 1990 Ford Escort is ", FDESCO90_own_elas_ols,".\n",sep ="")
cat("The own-price elasticity of 1973 Toyota Corolla is ", TYCORO73_own_elas_ols,".\n",sep ="")
cat("The own-price elasticity of 1974 Honda Civic is ", HDCIVI74_own_elas_ols,".\n", sep ="")
The own-price elasticity of 1990 Honda Accord is -0.8200015.
The own-price elasticity of 1990 Ford Escort is -0.5004491.
The own-price elasticity of 1973 Toyota Corolla is -0.5343508.
The own-price elasticity of 1974 Honda Civic is -0.4268588.

2SLS

We know that price is an endogenous variable in demand estimation. So, we cannot use it directly in our estimations. So, we need good instruments for price. In BLP, instruments for $p_{jt}$ are defined as $$ x_{jtk}, \sum_{r\in \mathcal{J_{ft}},r\neq j } x_{rtk},\sum_{r\notin \mathcal{J_{tf}},r\neq j } x_{rtk}$$ where $x_{jtk}$ is the $k$th characteristic of good $j$ at time $t$, and $\mathcal{J}_{ft}$ is the set of goods produced by the firm who produces $j$, at time $t$. So, the instruments are,

  1. the observable characteristics,
  2. the sum of the characteristics of the goods produced by the same firm (except the good itself), and
  3. the sum of the characteristics of the goods produced by other firms.
In [11]:
data <- data %>% group_by(year,firmid) %>% 
  mutate(const = 1) %>% 
  mutate(own_const = sum(const) - const,
         own_hpwt = sum(hpwt) - hpwt,
         own_air = sum(air) - air, 
         own_mpd = sum(mpd) - mpd, 
         own_size = sum(size) - size)
data <- data %>% group_by(year) %>%
  mutate(all_const = sum(const) - const - own_const,
         all_hpwt = sum(hpwt) - hpwt - own_hpwt, 
         all_air = sum(air) - air - own_air, 
         all_mpd = sum(mpd) - mpd - own_mpd, 
         all_size = sum(size) - size - own_size)
In [12]:
tslsmodel <- ivreg(logdif  ~ const + hpwt + air + mpd + size + price | const + hpwt + air + mpd + size + own_const + own_hpwt + own_air + own_mpd + own_size + all_const + all_hpwt + all_air + all_mpd + all_size, data=data)

summary(tslsmodel)
Call:
ivreg(formula = logdif ~ const + hpwt + air + mpd + size + price | 
    const + hpwt + air + mpd + size + own_const + own_hpwt + 
        own_air + own_mpd + own_size + all_const + all_hpwt + 
        all_air + all_mpd + all_size, data = data)

Residuals:
     Min       1Q   Median       3Q      Max 
-6.54921 -0.64342  0.09796  0.71322  4.22830 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -9.91531    0.26269 -37.745  < 2e-16 ***
hpwt         1.22597    0.40364   3.037 0.002415 ** 
air          0.48628    0.13311   3.653 0.000265 ***
mpd          0.17155    0.04862   3.528 0.000427 ***
size         2.29157    0.12945  17.703  < 2e-16 ***
price       -0.13571    0.01077 -12.599  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.116 on 2211 degrees of freedom
Multiple R-Squared: 0.3492,	Adjusted R-squared: 0.3477 
Wald test: 203.5 on 5 and 2211 DF,  p-value: < 2.2e-16 
In [13]:
HDACCO90_own_elas_tsls <-  as.numeric(data[data$year == 90 & data$modelid == "HDACCO",]["price"] * coefficients(tslsmodel)["price"] * (1-data[data$year == 90 & data$modelid == "HDACCO",]["share"])) 
FDESCO90_own_elas_tsls <-  as.numeric(data[data$year == 90 & data$modelid == "FDESCO",]["price"] * coefficients(tslsmodel)["price"] * (1-data[data$year == 90 & data$modelid == "FDESCO",]["share"]))
TYCORO73_own_elas_tsls <-  as.numeric(data[data$year == 73 & data$modelid == "TYCORO",]["price"] * coefficients(tslsmodel)["price"] * (1-data[data$year == 73 & data$modelid == "TYCORO",]["share"]))
HDCIVI74_own_elas_tsls <-  as.numeric(data[data$year == 74 & data$modelid == "HDCIVI",]["price"] * coefficients(tslsmodel)["price"] * (1-data[data$year == 74 & data$modelid == "HDCIVI",]["share"]))


cat("The own-price elasticity of 1990 Honda Accord is ", HDACCO90_own_elas_tsls, ".\n",sep ="")
cat("The own-price elasticity of 1990 Ford Escort is ", FDESCO90_own_elas_tsls,".\n",sep ="")
cat("The own-price elasticity of 1973 Toyota Corolla is ", TYCORO73_own_elas_tsls,".\n",sep ="")
cat("The own-price elasticity of 1974 Honda Civic is ", HDCIVI74_own_elas_tsls,".\n", sep ="")
The own-price elasticity of 1990 Honda Accord is -1.255451.
The own-price elasticity of 1990 Ford Escort is -0.7662054.
The own-price elasticity of 1973 Toyota Corolla is -0.8181101.
The own-price elasticity of 1974 Honda Civic is -0.653536.

Now, it's time to switch gears. We export the data and the instruments to a csv file, and will continue in Python.

In [14]:
write_csv(data,"data.csv",col_names = TRUE)

Part 2: Random Coefficient Model

We assumed that everyone has the same taste for each characteristic. However, it's not the case. Additionally, there may be some features of a good that are observable to the consumer, but not to the econometrician. To allow for these, we introduce a new model for the utility (following BLP (1999)): $$ \begin{equation} u_{ijt} = -\alpha \frac{p_{jt}}{y_{it}} + x_{jt}\beta + \sum_{k=1}^K \sigma_k v_{ik} x_{jkt} + \xi_{jt} + \epsilon_{ijt} \end{equation} $$ where $K$ is the number of (observable) characteristics, $\xi$ is the observable-to-consumer-but-not-to-econometrician component, $v_{ik}$ is a random variable that is distributed standard normal, and $y_{it}$ is the income of individual $i$ at time $t$.

The first term in the utility function was $\log(y_{it}-p_{jt})$ in BLP (1995). But that term is not defined if the price is higher than the income (and in our case, we have some observations where this is true). To eliminate this case, we used the utility function in BLP (1999).

Now, define $$\delta_{jt} \equiv x_{jt}\beta + \xi_{jt} $$ and $$\mu_{ijt} \equiv -\alpha\frac{p_{jt}}{y_{it}} + \sum_{k=1}^K \sigma_k v_{ik} x_{jkt} $$ Now, we have the following model: $$ \begin{equation} u_{ijt} = \delta_{jt} + \mu_{ijt} + \epsilon_{ijt} \end{equation} $$

So, again, if individual $i$ buys good $j$, it means that good $j$ provides him more utility than any other good $k$, including the outside good. Therefore, we can say that the probability of individual $i$ buying $j$ at year $t$ is equal to

$$ \begin{align} Pr(i \text{ buys good } j \text{ in year } t) &= Pr(u_{ijt}> u_{ikt}, \forall k= {0,1,\ldots,J_t}, k\neq j) \\ &= Pr(\delta_{jt} + \mu_{ijt} + \epsilon_{ijt} > \delta_{kt} + \mu_{ikt} + \epsilon_{ikt}, \forall k= {0,1,\ldots,J_t}, k\neq j)\\ &= \frac{\exp(\delta_{jt} + \mu_{ijt})}{1 + \sum_{k=1}^{J_t} \exp(\delta_{kt} + \mu_{ikt})} \end{align} $$

Assuming that the individuals are distributed with cdf $T$, we can write the probability of good $j$ being sold in year $t$ as $$ \begin{align} Pr(\text{good } j \text{ being sold in year } t) &= \int_i Pr( i \text{ buys good } j \text{ in year } t) dT(i) \\ & = \int_i \frac{\exp(\delta_{jt} + \mu_{ijt})}{1 + \sum_{k=1}^{J_t} \exp(\delta_{kt} + \mu_{ikt})} dT(i) \\ & \approx \frac{1}{ns} \sum_{i=1}^{ns} \frac{\exp(\delta_{jt} + \mu_{ijt})}{1 + \sum_{k=1}^{J_t} \exp(\delta_{kt} + \mu_{ikt})} \\ \end{align} $$

Here is how we code this function.

In [9]:
def s_comp(expdelta0,expmu):
    nom = expdelta0 @ np.ones((1,ns)) * expmu
    denom = np.dot(A,np.dot(A.T,nom))
    frac = nom/(1+denom)
    s_computed = (1/ns)*(frac.sum(1))
    return(s_computed)

To calculate the probabilities, we need $\delta$'s and $\mu$'s. However, we don't know what $\delta$ is. Why? We don't have the data for $\xi_j$. So, how do we calculate $\delta_{jt}$? We approximate it numerically.

Let's write the following recursion

$$ \begin{equation} \delta^{n+1} = \delta^{n} + \log(s^{DATA}) - \log(s(\theta)) \; \forall n=0,1,2,\ldots \end{equation} $$

where $\theta \equiv (\beta,\alpha,\sigma)$, and $\sigma \equiv (\sigma_0,\ldots,\sigma_k)$. Define $\theta_B \equiv (\alpha,\sigma)$ to be the set of parameters outside of $\delta$ for later use.

Define $s(\theta) \equiv Pr(\text{good } j \text{ being sold in year } t)$. So, $\theta$ is the set of parameters that determines $Pr(\text{good } j \text{ being sold in year } t)$.

BLP shows that the above equation is a "contraction", i.e. for all $\epsilon>0$, there exists an integer $J_\epsilon$ such that for all $m>J_\epsilon$, $||(\delta^{m+1}-\delta^{m}||<\epsilon$. Therefore, we can use the function above to find the "best" $\delta$ that explains the market shares in the data.

In [10]:
def contraction(sigma,alpha):
    expdelta0 = deltam
    expdelta0.shape = (nrow,1)
    draws = v * np.kron(np.ones((ns,1)),sigma)
    expmu = (np.exp(np.dot(X,draws.T) - py * alpha))
    dif = 1
    it = 0
    while(dif > tol and it < 1000):
        rat = s_data/s_comp(expdelta0,expmu)
        rat.shape = (nrow,1)
        expdelta1 = expdelta0 * rat
        dif = (abs(expdelta1/expdelta0-1)).max()
        expdelta0 = expdelta1
        it += 1
    return(np.log(expdelta0))

For computational speed, I transformed the recursion above by taking exponentials of both sides. What next?

GMM estimation

The moment condition we use for the estimation is $$ \begin{equation} E[\xi\mid Z] = 0 \end{equation} $$ where $Z$ is the instrument matrix. Before going into the GMM objective function, let's first calculate $\xi$'s.

We noted earlier that $ \delta_{jt} \equiv x_{jt}\beta + \xi_{jt}$. Now that we "estimated" $\delta_{jt}$, we can back out $\xi_{jt}$. But first, we need $\beta$.

We can get $\beta$'s by using 2SLS. How? Use the above equation to regress the $\delta$ we got from the contraction on the product characteristics $X$ and $\xi$, by using $Z$ as the instrument for $\xi$.

$$ \begin{equation} \hat{\beta} = (X^{'} Z \Phi^{-1} Z^{'} X)^{-1} X^{'} Z \Phi^{-1} Z^{'} \delta(\theta_B) \end{equation} $$

where $\Phi \equiv Z^{'}Z$ being the weighting matrix.

Then, use $\hat{\beta}$ to back out $\xi$: $$ \xi_{jt} = \delta_{jt}(\theta) - x_{jt}\beta $$

Next, define $$g \equiv \frac{1}{J} Z^{'}\xi$$

Finally, define the GMM objective function as

$$ Q(\theta_B) \equiv g^{'}Wg $$

Any symmetric and positive-definite $W$ gives us the consistent estimates for the parameters. So we use the identity matrix, i.e. $W = I$

In [11]:
def obj(params):
    sigma = params[0:5]
    alpha = params[5]
    delta = contraction(sigma,alpha)
    beta = np.dot(prem,delta)
    deltam = delta
    xi = delta - np.dot(X,beta)
    g = (1/nrow)* (np.dot(Z.T,xi))
    ob = np.dot(g.T,g)
    ob = ob[0,0]
    return(ob)

Finally, we find the parameters $\sigma$ and $\alpha$ by solving

$$ \begin{equation} \min_{\theta_B = (\sigma,\alpha)} Q(\theta_B) \end{equation} $$

Implementation

Step 0: Load necessary packages.

In [12]:
import pandas as pd
import numpy as np
from scipy import stats
from numba import jit
import time
from numpy.linalg import inv
from scipy.optimize import minimize

Step 1: Import the data (prepared earlier with the R code above).

In [13]:
data = pd.read_csv("data.csv")

Step 2: Preparation.

In [14]:
ns = 50000
nrow = data.shape[0]

X = data[["const", "hpwt", "air", "mpd", "size"]]
X = np.asarray(X)
sizeX = X.shape[1]
Z = data[["const", "hpwt", "air", "mpd", "size", "own_const", "own_hpwt", "own_air", "own_mpd", "own_size", "all_const", "all_hpwt", "all_air", "all_mpd", "all_size"]]
Z = np.asarray(Z)
phi = np.dot(Z.T,Z)
prem = inv(X.T @ Z @ inv(phi) @ Z.T @ X) @ X.T @ Z @ inv(phi) @ Z.T


A = pd.get_dummies(data["year"])
A = np.asarray(A, dtype = np.float32)
AAT =  np.dot(A,A.T)
nyears = A.shape[1]


np.random.seed(6128)
v = np.random.randn(ns,sizeX)



ym = np.array([2.01156, 2.06526, 2.07843, 2.05775, 2.02915, 2.05346, 2.06745, 2.09805, 2.10404, 2.07208, 2.06019, 2.06561, 2.07672, 2.10437, 2.12608, 2.16426, 2.18071, 2.18856, 2.21250, 2.18377])
sigma_y = 1.72
ym.shape = (nyears,1)
np.random.seed(6128)
ar = np.random.randn(nyears,ns)
y = A @ np.exp(ym + sigma_y*ar)




global deltam 
deltam = np.exp(data["logdif"])
deltam = np.asarray(deltam)


s_data = np.asarray(data["share"])

price = np.asarray(data["price"])
price.shape = (nrow,1)


py = np.dot(price,np.ones((1,ns)))/y

tol = 1e-6

Step 3: Estimation via minimization of GMM objective function with respect to the parameters $\sigma$ and $\alpha$.

In [15]:
init = np.array([3.612, 4.628, 1.818, 1.050, 2.056, 43.501])
t0 = time.time()
soln = minimize(obj,init,method="L-BFGS-B", tol = 1e-2, options={'disp': True})
elapsed_time = time.time()-t0
In [16]:
print("Time elapsed:",elapsed_time)
print("GMM objective:", soln.fun)
print("sigma: ", abs(soln.x[0:sizeX]),"\n", "alpha: ", soln.x[sizeX],sep="")
print("beta:", (prem @ contraction(soln.x[0:sizeX],soln.x[sizeX])).T[0])
Time elapsed: 8464.339102983475
GMM objective: 54.972915367
sigma: [ 3.65318032  4.60393565  1.78346108  0.07123388  2.03939633]
alpha: 43.5160476233
beta: [-6.33286408  1.95140956  0.55924833  0.19367693  3.54098235]