In this exercise, I do some empirical work with the data analyzed in “A study of cartel stability: the Joint Executive Committee” by Robert H. Porter (1983, The Bell Journal of Economics) and “Theories of cartel stability and the Joint Executive Committee” by Glenn Ellison (1994, RAND Journal of Economics). I follow the instructions and use the dataset provided by Glenn Ellison and Sara Ellison. All code is written in R.

Part 1: Some basics

Step 0: First, load the libraries that we will use throughout this exercise.

library(tidyverse)
library(AER)
library(stats)
library(stargazer)


Step 1: I import the dataset into R. The column names (week, quantity, price, lakes, collusion, and the dm and seas dummies) are read from the file's header row.

data <- read.table("https://ocw.mit.edu/courses/economics/14-271-industrial-organization-i-fall-2013/assignments/Porter.prn", header = TRUE)
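With header = TRUE, the column names are taken from the first row of the file. If the raw .prn file did not include a header row, the same names (as they appear in the summary table below) would have to be assigned manually; the commented sketch below assumes the columns arrive in exactly this order.

# Only needed if the file had no header row; assumes the columns are in this order.
# colnames(data) <- c("week", "quantity", "price", "lakes", "collusion",
#                     paste0("dm", 1:4), paste0("seas", 1:13))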

Let’s first look at summary statistics for the data.

stargazer::stargazer(data, type = 'html', digits = 2, style = "qje")
Statistic N Mean St. Dev. Min Pctl(25) Pctl(75) Max
week 328 164.50 94.83 1 82.8 246.2 328
quantity 328 25,384.40 11,632.77 4,810 16,604 32,389 76,407
price 328 0.25 0.07 0.12 0.20 0.30 0.40
lakes 328 0.57 0.50 0 0 1 1
collusion 328 0.62 0.49 0 0 1 1
dm1 328 0.42 0.49 0 0 1 1
dm2 328 0.05 0.21 0 0 0 1
dm3 328 0.43 0.50 0 0 1 1
dm4 328 0.02 0.12 0 0 0 1
seas1 328 0.09 0.28 0 0 0 1
seas2 328 0.09 0.28 0 0 0 1
seas3 328 0.09 0.28 0 0 0 1
seas4 328 0.09 0.28 0 0 0 1
seas5 328 0.07 0.26 0 0 0 1
seas6 328 0.07 0.26 0 0 0 1
seas7 328 0.07 0.26 0 0 0 1
seas8 328 0.07 0.26 0 0 0 1
seas9 328 0.07 0.26 0 0 0 1
seas10 328 0.07 0.26 0 0 0 1
seas11 328 0.07 0.26 0 0 0 1
seas12 328 0.07 0.26 0 0 0 1
seas13 328 0.07 0.26 0 0 0 1

To start with, let’s run a simple OLS regression of log(quantity) on a constant, log(price), lakes, and the seasonal dummies. Since a constant is already included, one seasonal dummy has to be dropped; I exclude seas13.

# Build the demand-side formula: seasonal dummies seas1 through seas12 (seas13 omitted).
seas_dums <- paste("seas", 1:12, sep = "")
seas_dummies <- paste(seas_dums, collapse = " + ")
fmla_ols <- as.formula(paste("log(quantity) ~ log(price) + lakes +", seas_dummies))

# Naive OLS regression of log quantity on log price and the demand shifters.
olsmodel <- lm(fmla_ols, data = data)

stargazer(olsmodel, type = "html", style = "aer")
log(quantity)
log(price) -0.639***
(0.082)
lakes -0.448***
(0.120)
seas1 -0.133
(0.111)
seas2 0.067
(0.111)
seas3 0.111
(0.111)
seas4 0.155
(0.111)
seas5 0.110
(0.130)
seas6 0.047
(0.160)
seas7 0.123
(0.160)
seas8 -0.235
(0.160)
seas9 0.004
(0.160)
seas10 0.169
(0.161)
seas11 0.215
(0.160)
seas12 0.220
(0.159)
Constant 9.309***
(0.140)
Observations 328
R2 0.313
Adjusted R2 0.282
Residual Std. Error 0.397 (df = 313)
F Statistic 10.169*** (df = 14; 313)
Notes: ***Significant at the 1 percent level.
**Significant at the 5 percent level.
*Significant at the 10 percent level.

In this (naive) OLS regression, the coefficient on log(price), which we can read as the price elasticity of demand, is -0.639. Note that it is less than 1 in absolute value, which implies negative marginal revenue. Per-period profit maximization requires marginal cost to equal marginal revenue; since marginal cost is positive and marginal revenue is negative, the two cannot be equal. The elasticity estimate is therefore not reasonable: it contradicts per-period profit maximization.
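To make the marginal revenue point explicit, recall the standard relationship between marginal revenue and the price elasticity of demand \(\varepsilon\):

\[\begin{equation} MR = P\left(1 + \frac{1}{\varepsilon}\right), \qquad \hat{\varepsilon} = -0.639 \;\Rightarrow\; MR \approx P\,(1 - 1.56) < 0. \end{equation}\]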

So, let’s try something different.

Step 2: We knew in advance that the regression above would not give consistent estimates because of endogeneity bias: price and quantity are determined simultaneously, so log(price) is correlated with the demand error. To overcome this problem, we use instrumental variables (IV) estimation.

Following the instructions in the homework, I use collusion as an instrument for price.
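Before running the IV regression, a quick first-stage check (a sketch, reusing the seas_dummies string built above) confirms that collusion is correlated with log(price) after conditioning on the exogenous regressors:

# First stage: regress log(price) on the instrument and the exogenous regressors,
# then inspect the coefficient on collusion (estimate, s.e., t value, p value).
fmla_fs <- as.formula(paste("log(price) ~ collusion + lakes +", seas_dummies))
firststage <- lm(fmla_fs, data = data)
summary(firststage)$coefficients["collusion", ]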

# IV estimation: instrument log(price) with the collusion indicator; lakes and
# the seasonal dummies serve as their own instruments.
fmla_iv1 <- as.formula(paste("log(quantity) ~ log(price) + lakes +", seas_dummies, "| collusion + lakes +", seas_dummies))

ivmodel1 <- ivreg(fmla_iv1, data = data)

stargazer(olsmodel, ivmodel1, type = "html", style = "qje", column.labels = c("OLS", "IV"), title = "Regression Results", model.names = FALSE)
Regression Results
log(quantity)
OLS IV
(1) (2)
log(price) -0.639*** -0.867***
(0.082) (0.132)
lakes -0.448*** -0.423***
(0.120) (0.122)
seas1 -0.133 -0.131
(0.111) (0.112)
seas2 0.067 0.091
(0.111) (0.113)
seas3 0.111 0.136
(0.111) (0.113)
seas4 0.155 0.153
(0.111) (0.112)
seas5 0.110 0.074
(0.130) (0.132)
seas6 0.047 -0.006
(0.160) (0.163)
seas7 0.123 0.060
(0.160) (0.164)
seas8 -0.235 -0.294*
(0.160) (0.164)
seas9 0.004 -0.058
(0.160) (0.164)
seas10 0.169 0.086
(0.161) (0.168)
seas11 0.215 0.152
(0.160) (0.165)
seas12 0.220 0.179
(0.159) (0.162)
Constant 9.309*** 8.996***
(0.140) (0.199)
N 328 328
R2 0.313 0.296
Adjusted R2 0.282 0.264
Residual Std. Error (df = 313) 0.397 0.402
F Statistic 10.169*** (df = 14; 313)
Notes: ***Significant at the 1 percent level.
**Significant at the 5 percent level.
*Significant at the 10 percent level.

Now the results look more reasonable. The coefficient on log(price), the price elasticity of demand, is -0.867, so the new estimate is much closer to 1 in absolute value. The estimates are also close to those reported in Ellison (1994), Table 2 (the demand estimates without serial correlation).

We can interpret the coefficient on lakes and the seasonal dummies as follows. When the lakes are open to navigation, the quantity demanded of railroad transport is roughly 42 percent lower than when the lakes are frozen (reading the log coefficient of -0.423 as an approximate percentage change; exact figures are computed below). In season 1, the quantity demanded is about 13 percent lower than in season 13, ceteris paribus; in season 2 it is about 9 percent higher; and so on. However, none of the seasonal dummies is statistically significant.

The R-squared of this IV regression is 0.296.
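For larger coefficients, reading 100 times the log coefficient as a percentage change overstates the effect, so it is worth computing the exact percentage changes implied by the log specification; a quick sketch using the fitted IV model:

# Exact percentage changes implied by the log-linear IV coefficients. For example,
# the lakes coefficient of -0.423 corresponds to exp(-0.423) - 1, i.e. roughly a
# 35 percent reduction rather than 42 percent.
round(100 * (exp(coef(ivmodel1)[c("lakes", "seas1", "seas2")]) - 1), 1)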

Alternatively, let’s use dm1, dm2, dm3, and dm4 as instruments for price, in addition to collusion.

# Add the regime dummies dm1 through dm4 to the instrument set, alongside collusion.
dm_dums <- paste("dm", 1:4, sep = "")
dm_dummies <- paste(dm_dums, collapse = " + ")
fmla_iv2 <- as.formula(paste("log(quantity) ~ log(price) + lakes +", seas_dummies, "|", dm_dummies, "+ collusion + lakes +", seas_dummies))

ivmodel2 <- ivreg(fmla_iv2, data = data)
stargazer(olsmodel, ivmodel1, ivmodel2, type = "html", style = "qje", column.labels = c("OLS", "IV", "IV with more instruments"), title = "Regression Results", model.names = FALSE)
Regression Results
log(quantity)
OLS IV IV with more instruments
(1) (2) (3)
log(price) -0.639*** -0.867*** -0.735***
(0.082) (0.132) (0.120)
lakes -0.448*** -0.423*** -0.437***
(0.120) (0.122) (0.120)
seas1 -0.133 -0.131 -0.132
(0.111) (0.112) (0.111)
seas2 0.067 0.091 0.077
(0.111) (0.113) (0.112)
seas3 0.111 0.136 0.122
(0.111) (0.113) (0.112)
seas4 0.155 0.153 0.154
(0.111) (0.112) (0.111)
seas5 0.110 0.074 0.094
(0.130) (0.132) (0.131)
seas6 0.047 -0.006 0.025
(0.160) (0.163) (0.161)
seas7 0.123 0.060 0.096
(0.160) (0.164) (0.162)
seas8 -0.235 -0.294* -0.260
(0.160) (0.164) (0.162)
seas9 0.004 -0.058 -0.023
(0.160) (0.164) (0.162)
seas10 0.169 0.086 0.134
(0.161) (0.168) (0.165)
seas11 0.215 0.152 0.188
(0.160) (0.165) (0.162)
seas12 0.220 0.179 0.202
(0.159) (0.162) (0.160)
Constant 9.309*** 8.996*** 9.177***
(0.140) (0.199) (0.184)
N 328 328 328
R2 0.313 0.296 0.310
Adjusted R2 0.282 0.264 0.279
Residual Std. Error (df = 313) 0.397 0.402 0.398
F Statistic 10.169*** (df = 14; 313)
Notes: ***Significant at the 1 percent level.
**Significant at the 5 percent level.
*Significant at the 10 percent level.

The results of the new estimation are shown above. Adding the extra instruments does not improve the estimates: the elasticity estimate, -0.735, moves back toward the OLS value, whereas the previous IV estimate of -0.867 was closer to -1.
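With dm1-dm4 and collusion all in the instrument set, the model is overidentified, so AER's summary method can report weak-instrument, Wu-Hausman, and Sargan tests as a quick check on the instruments; a minimal sketch:

# Diagnostics for the overidentified IV model: weak instruments (first-stage F test),
# Wu-Hausman (endogeneity of log(price)), and Sargan (overidentifying restrictions).
summary(ivmodel2, diagnostics = TRUE)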

Step 3: Now, let’s estimate the supply equation instead.

# Supply equation: regress log(price) on log(quantity), the collusion indicator, and
# the structural and seasonal dummies, instrumenting log(quantity) with lakes.
fmla_iv3 <- as.formula(paste("log(price) ~ log(quantity) + collusion +", dm_dummies, "+", seas_dummies, "| lakes + collusion +", dm_dummies, "+", seas_dummies))
ivmodel3 <- ivreg(fmla_iv3, data = data)

stargazer(ivmodel3, type = "html", style = "qje", title = "Supply Equation Regression Results", model.names = FALSE)
Supply Equation Regression Results
log(price)
log(quantity) 0.253
(0.173)
collusion 0.368***
(0.054)
dm1 -0.202***
(0.055)
dm2 -0.173**
(0.081)
dm3 -0.319***
(0.065)
dm4 -0.208
(0.172)
seas1 0.030
(0.072)
seas2 0.092
(0.069)
seas3 0.130*
(0.071)
seas4 -0.001
(0.082)
seas5 -0.043
(0.076)
seas6 -0.043
(0.090)
seas7 -0.084
(0.082)
seas8 0.072
(0.127)
seas9 0.041
(0.099)
seas10 -0.111
(0.077)
seas11 -0.097
(0.074)
seas12 0.017
(0.078)
Constant -3.975**
(1.778)
N 328
R2 0.316
Adjusted R2 0.276
Residual Std. Error 0.246 (df = 309)
Notes: ***Significant at the 1 percent level.
**Significant at the 5 percent level.
*Significant at the 10 percent level.

The coefficient on collusion tells us that when the firms collude, the price is about 37 percent higher than in the non-collusive regime (reading the log coefficient as an approximate percentage change; the exact figure is computed below). The coefficient on log(quantity), 0.253, is positive (though imprecisely estimated), which implies an upward-sloping marginal cost curve. That happens only if the production function has decreasing returns to scale [proof omitted]; the firm's cost is therefore a convex function of its output.
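The same caveat about the log approximation applies to the collusion coefficient; a quick check of the exact effect:

# Exact price effect of collusion implied by the log specification:
# exp(0.368) - 1 is roughly 0.445, i.e. prices about 44.5 percent higher
# during collusive periods than during price wars.
round(100 * (exp(coef(ivmodel3)["collusion"]) - 1), 1)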

Part 2: Model derivation and interpretation

Suppose we have a linear demand specification:

\[\begin{equation} Q_t = \alpha_0 + \alpha_1 P_t + \alpha_2 Lakes_t + u_t \end{equation}\]

It implies that the slope of the inverse demand curve is \(1/\alpha_1\), so the slope of the marginal revenue curve is \(2/\alpha_1\). Write the inverse demand curve as \(P = K + \frac{1}{\alpha_1} Q\), where \(K\) collects the intercept terms; the marginal revenue curve is then \(MR(Q) = K + \frac{2}{\alpha_1} Q\). At the optimal quantity \(Q^*\), marginal revenue equals marginal cost \(c\), so \(K + \frac{2}{\alpha_1} Q^* = c\) and hence \(K = c - \frac{2}{\alpha_1} Q^*\). The price the monopolist charges is \(D(Q^*)\); plugging \(K\) back into the demand equation gives \(P^* = c - \frac{2}{\alpha_1} Q^* + \frac{1}{\alpha_1} Q^* = c - \frac{1}{\alpha_1} Q^*\). Since \(\alpha_1 < 0\) for a downward-sloping demand curve, \(-1/\alpha_1 > 0\), so I would choose a linear functional form for the supply curve, such as \(P = a_0 + a_1 Q\) with \(a_1 > 0\).
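Collecting the steps of the derivation in one display:

\[\begin{equation} P = K + \frac{1}{\alpha_1} Q \;\Rightarrow\; MR(Q) = K + \frac{2}{\alpha_1} Q, \qquad MR(Q^*) = c \;\Rightarrow\; K = c - \frac{2}{\alpha_1} Q^* \;\Rightarrow\; P^* = c - \frac{1}{\alpha_1} Q^*. \end{equation}\]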

All of the calculations above assume that the cartel acts as a monopolist, so the supply relation should allow conduct to vary with the regime; other structural factors also play a role in determining the price. We can therefore write the supply equation as follows, where \(S_t\) collects the structural and seasonal shifters and \(I_t\) indicates the collusive regime:

\[\begin{equation} P_t = \beta_0 + \beta_1 Q_t + \beta_2 S_t + \beta_3 I_t + v_t \end{equation}\]

If the industry is instead composed of competitive firms with total cost \(c(Q_t) = c_0 Q_t + c_1 Q_t^2\), then price-taking behavior (\(P = MC\)) implies the pricing rule \(P_t = c_0 + 2c_1 Q_t\).
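Explicitly, marginal cost is the derivative of total cost, and competition equates price to it:

\[\begin{equation} MC(Q_t) = \frac{d\, c(Q_t)}{d Q_t} = c_0 + 2 c_1 Q_t, \qquad P_t = MC(Q_t) = c_0 + 2 c_1 Q_t. \end{equation}\]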

It is important to note that if competitive firms have this cost structure, Porter's approach to identifying industry conduct is no longer useful: both competitive and collusive behavior imply a pricing rule of the form \(P = a_0 + a_1 Q\), so the regime cannot be identified using the method described by Porter.

Part 3: Causes of Price Wars

Now, let’s go back to the data and try to find the causes of price wars by using a probit regression.

First, we create a new variable that indicates the start of a price war, conditional on the previous period being collusive. Then we use quantity, lakes, dm1, dm2, dm3, and dm4 as the explanatory variables and estimate the probability that a price war starts with a probit regression.

# war_start marks the first week of a price war: it is defined only for weeks
# whose previous week was collusive, equals 1 if collusion breaks down in the
# current week, 0 if collusion continues, and is NA otherwise (including week 1).
war_start <- rep(NA, nrow(data))
for (i in 2:nrow(data)) {
    if (data$collusion[i - 1] == 1) {
        war_start[i] <- 1 - data$collusion[i]
    }
}
data$war_start <- war_start   # store in the data frame for the probit below
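The same indicator can be built in one vectorized step with dplyr's lag (loaded via the tidyverse); a sketch that reproduces the war_start vector above:

# Vectorized equivalent: NA whenever the previous week was not collusive
# (including week 1), and 1 - collusion otherwise.
war_start_vec <- ifelse(dplyr::lag(data$collusion) == 1, 1 - data$collusion, NA)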

# Rescale quantity to thousands of units so the probit coefficient is easier to read.
data$quantity <- data$quantity / 1000
fmla_probit <- as.formula(paste("war_start ~ quantity + lakes +", dm_dummies))
probitmodel <- glm(fmla_probit, family = binomial(link = "probit"), data = data)

stargazer(probitmodel, type = "html", style = "qje", covariate.labels = c("quantity/1000"))
war_start
quantity/1000 0.001
(0.019)
lakes -0.246
(0.337)
dm1 3.934
(305.054)
dm2 4.400
(305.055)
dm3 4.578
(305.054)
dm4 -0.068
(841.486)
Constant -5.700
(305.055)
N 202
Log Likelihood -38.419
Akaike Inf. Crit. 90.839
Notes: ***Significant at the 1 percent level.
**Significant at the 5 percent level.
*Significant at the 10 percent level.

The coefficient on quantity is positive, implying that the probability that a price war starts increases with quantity; in other words, price wars are more likely to begin in booms. The coefficient on lakes implies that price wars are less likely to start when the lakes are open to navigation. However, none of these coefficients is significantly different from zero.
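Probit coefficients are not directly interpretable as changes in probability. A rough average-partial-effects calculation (a sketch; packages such as margins automate this) scales each slope by the average standard normal density evaluated at the fitted index:

# Average partial effects: mean of the normal density at the fitted linear index,
# times each slope coefficient (intercept dropped).
xb  <- predict(probitmodel, type = "link")
ape <- mean(dnorm(xb)) * coef(probitmodel)[-1]
round(ape, 4)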