In this exercise, I do some empirical work using the data described in “A study of cartel stability: the Joint Executive Committee” by Robert H. Porter (1983, The Bell Journal of Economics) and “Theories of cartel stability and the Joint Executive Committee” by Glenn Ellison (1994, RAND Journal of Economics). I follow the instructions and use the dataset provided by Glenn Ellison and Sara Ellison. The code is written in R.
Step 0: First, load the libraries used later in this exercise.
library(tidyverse)
library(AER)
library(stats)
library(stargazer)
Step 1: I import the dataset into R; the column names are read from the header row of the file.
data <- read.table("https://ocw.mit.edu/courses/economics/14-271-industrial-organization-i-fall-2013/assignments/Porter.prn", header = TRUE)
Let’s first see how the data looks.
stargazer::stargazer(data, type = 'html', digits = 2, style = "qje")
Statistic | N | Mean | St. Dev. | Min | Pctl(25) | Pctl(75) | Max |
week | 328 | 164.50 | 94.83 | 1 | 82.8 | 246.2 | 328 |
quantity | 328 | 25,384.40 | 11,632.77 | 4,810 | 16,604 | 32,389 | 76,407 |
price | 328 | 0.25 | 0.07 | 0.12 | 0.20 | 0.30 | 0.40 |
lakes | 328 | 0.57 | 0.50 | 0 | 0 | 1 | 1 |
collusion | 328 | 0.62 | 0.49 | 0 | 0 | 1 | 1 |
dm1 | 328 | 0.42 | 0.49 | 0 | 0 | 1 | 1 |
dm2 | 328 | 0.05 | 0.21 | 0 | 0 | 0 | 1 |
dm3 | 328 | 0.43 | 0.50 | 0 | 0 | 1 | 1 |
dm4 | 328 | 0.02 | 0.12 | 0 | 0 | 0 | 1 |
seas1 | 328 | 0.09 | 0.28 | 0 | 0 | 0 | 1 |
seas2 | 328 | 0.09 | 0.28 | 0 | 0 | 0 | 1 |
seas3 | 328 | 0.09 | 0.28 | 0 | 0 | 0 | 1 |
seas4 | 328 | 0.09 | 0.28 | 0 | 0 | 0 | 1 |
seas5 | 328 | 0.07 | 0.26 | 0 | 0 | 0 | 1 |
seas6 | 328 | 0.07 | 0.26 | 0 | 0 | 0 | 1 |
seas7 | 328 | 0.07 | 0.26 | 0 | 0 | 0 | 1 |
seas8 | 328 | 0.07 | 0.26 | 0 | 0 | 0 | 1 |
seas9 | 328 | 0.07 | 0.26 | 0 | 0 | 0 | 1 |
seas10 | 328 | 0.07 | 0.26 | 0 | 0 | 0 | 1 |
seas11 | 328 | 0.07 | 0.26 | 0 | 0 | 0 | 1 |
seas12 | 328 | 0.07 | 0.26 | 0 | 0 | 0 | 1 |
seas13 | 328 | 0.07 | 0.26 | 0 | 0 | 0 | 1 |
To start with, let’s run a simple OLS regression of log(quantity) on a constant, log(price), lakes, and the seasonal dummies. Since a constant is already included, one seasonal dummy has to be dropped; I exclude seas13.
seas_dums <- paste("seas",1:12, sep="")
seas_dummies <- paste(seas_dums, collapse= "+")
fmla_ols <- as.formula(paste("log(quantity) ~ log(price) + lakes + ", seas_dummies ))
olsmodel <- lm(fmla_ols, data = data)
stargazer(olsmodel, type = "html", style = "aer")
log(quantity) | |
log(price) | -0.639^{***} |
(0.082) | |
lakes | -0.448^{***} |
(0.120) | |
seas1 | -0.133 |
(0.111) | |
seas2 | 0.067 |
(0.111) | |
seas3 | 0.111 |
(0.111) | |
seas4 | 0.155 |
(0.111) | |
seas5 | 0.110 |
(0.130) | |
seas6 | 0.047 |
(0.160) | |
seas7 | 0.123 |
(0.160) | |
seas8 | -0.235 |
(0.160) | |
seas9 | 0.004 |
(0.160) | |
seas10 | 0.169 |
(0.161) | |
seas11 | 0.215 |
(0.160) | |
seas12 | 0.220 |
(0.159) | |
Constant | 9.309^{***} |
(0.140) | |
Observations | 328 |
R^{2} | 0.313 |
Adjusted R^{2} | 0.282 |
Residual Std. Error | 0.397 (df = 313) |
F Statistic | 10.169^{***} (df = 14; 313) |
Notes: | ^{***}Significant at the 1 percent level. |
^{**}Significant at the 5 percent level. | |
^{*}Significant at the 10 percent level. |
In this (naive) OLS regression, the coefficient on log(price), which we can interpret as the price elasticity of demand, is -0.639. Note that it is less than 1 in absolute value, so demand is estimated to be inelastic. This implies negative marginal revenue: when demand is inelastic, selling one more unit lowers the price by enough that total revenue falls. Per-period profit maximization requires marginal revenue to equal marginal cost. Since marginal cost is positive and marginal revenue is negative, the two cannot be equal. The estimated elasticity is therefore not reasonable, because it contradicts per-period profit maximization.
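The link between the elasticity and the sign of marginal revenue can be written out explicitly. As a sketch, write revenue as \(R(Q) = P(Q)\,Q\) and the price elasticity of demand as \(\varepsilon = \frac{dQ}{dP}\frac{P}{Q}\) (standard textbook definitions, not notation taken from the homework):
\[ MR = \frac{dR}{dQ} = P + Q\frac{dP}{dQ} = P\left(1 + \frac{1}{\varepsilon}\right) \]
With \(\varepsilon = -0.639\), we get \(1 + 1/\varepsilon \approx 1 - 1.565 = -0.565\), so marginal revenue would be roughly \(-0.565\,P\), which is negative at any positive price.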
So, let’s try something different.
Step 2: We already knew that the above regression would not give consistent estimates because of endogeneity bias: price and quantity are determined simultaneously, so log(price) is correlated with the error term of the demand equation. To overcome this problem, we use instrumental variable (IV) estimation.
Following the instructions in the homework, I use collusion as an instrument for price.
fmla_iv1 <- as.formula(paste("log(quantity) ~ log(price) + lakes + ", seas_dummies, "| collusion + lakes + ",seas_dummies))
ivmodel1 <- ivreg(fmla_iv1,data=data)
stargazer(olsmodel, ivmodel1, type = "html", style = "qje", column.labels = c("OLS", "IV"), title = "Regression Results", model.names = FALSE)
log(quantity) | ||
OLS | IV | |
(1) | (2) | |
log(price) | -0.639^{***} | -0.867^{***} |
(0.082) | (0.132) | |
lakes | -0.448^{***} | -0.423^{***} |
(0.120) | (0.122) | |
seas1 | -0.133 | -0.131 |
(0.111) | (0.112) | |
seas2 | 0.067 | 0.091 |
(0.111) | (0.113) | |
seas3 | 0.111 | 0.136 |
(0.111) | (0.113) | |
seas4 | 0.155 | 0.153 |
(0.111) | (0.112) | |
seas5 | 0.110 | 0.074 |
(0.130) | (0.132) | |
seas6 | 0.047 | -0.006 |
(0.160) | (0.163) | |
seas7 | 0.123 | 0.060 |
(0.160) | (0.164) | |
seas8 | -0.235 | -0.294^{*} |
(0.160) | (0.164) | |
seas9 | 0.004 | -0.058 |
(0.160) | (0.164) | |
seas10 | 0.169 | 0.086 |
(0.161) | (0.168) | |
seas11 | 0.215 | 0.152 |
(0.160) | (0.165) | |
seas12 | 0.220 | 0.179 |
(0.159) | (0.162) | |
Constant | 9.309^{***} | 8.996^{***} |
(0.140) | (0.199) | |
N | 328 | 328 |
R^{2} | 0.313 | 0.296 |
Adjusted R^{2} | 0.282 | 0.264 |
Residual Std. Error (df = 313) | 0.397 | 0.402 |
F Statistic | 10.169^{***} (df = 14; 313) | |
Notes: | ^{***}Significant at the 1 percent level. | |
^{**}Significant at the 5 percent level. | ||
^{*}Significant at the 10 percent level. |
Now the results look better. Why? Look at the coefficient on log(price), which is the price elasticity of demand: it is -0.867. The new estimate is much closer to 1 in absolute value, although demand is still estimated to be inelastic. The estimates are also closer to those reported in Ellison (1994), Table 2 (the estimation of demand with no serial correlation).
We can interpret the coefficients on the lakes variable and the seasonal dummies as follows. If the lakes are open to navigation, the quantity demanded for railroad transport is 0.423 log points lower than when the lakes are frozen, which corresponds to a decrease of about 34.5 percent (exp(-0.423) - 1 ≈ -0.345). In the first season, the quantity demanded is about 12.3 percent lower than in season 13, ceteris paribus. In season 2, the quantity demanded is about 9.5 percent higher than in season 13. And so on. However, none of the seasonal dummies is significant at the 5 percent level; only seas8 is significant at the 10 percent level.
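Because the dependent variable is in logs, a dummy coefficient b translates into an exact change of 100 * (exp(b) - 1) percent, and reading 100 * b directly as a percentage is only an approximation that loosens as b grows. A minimal sketch comparing the two, using the rounded IV coefficients from column (2) of the table above (the variable names are mine):

```r
# Rounded IV coefficients copied from column (2) of the table above.
b <- c(lakes = -0.423, seas1 = -0.131, seas2 = 0.091)

# Exact percentage change implied by a dummy in a log-linear model.
pct_exact <- 100 * (exp(b) - 1)

# Log-point approximation (coefficient times 100), for comparison.
pct_approx <- 100 * b

round(cbind(pct_approx, pct_exact), 1)
```

The gap is small for the seasonal dummies but sizeable for lakes, where the approximation overstates the drop by several percentage points.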
The R-squared of this regression is 0.2958746.
Alternatively, let’s use dm1, dm2, dm3 and dm4 as instruments for price in addition to collusion.
dm_dums <- paste("dm",1:4, sep="")
dm_dummies <- paste(dm_dums,collapse = "+")
fmla_iv2 <- as.formula(paste("log(quantity) ~ log(price) + lakes + ", seas_dummies, "|", dm_dummies , "+ collusion + lakes + ", seas_dummies))
ivmodel2 <- ivreg(fmla_iv2,data=data)
stargazer(olsmodel, ivmodel1, ivmodel2, type = "html", style = "qje", column.labels = c("OLS", "IV", "IV with more instruments"), title = "Regression Results", model.names = FALSE)
log(quantity) | |||
OLS | IV | IV with more instruments | |
(1) | (2) | (3) | |
log(price) | -0.639^{***} | -0.867^{***} | -0.735^{***} |
(0.082) | (0.132) | (0.120) | |
lakes | -0.448^{***} | -0.423^{***} | -0.437^{***} |
(0.120) | (0.122) | (0.120) | |
seas1 | -0.133 | -0.131 | -0.132 |
(0.111) | (0.112) | (0.111) | |
seas2 | 0.067 | 0.091 | 0.077 |
(0.111) | (0.113) | (0.112) | |
seas3 | 0.111 | 0.136 | 0.122 |
(0.111) | (0.113) | (0.112) | |
seas4 | 0.155 | 0.153 | 0.154 |
(0.111) | (0.112) | (0.111) | |
seas5 | 0.110 | 0.074 | 0.094 |
(0.130) | (0.132) | (0.131) | |
seas6 | 0.047 | -0.006 | 0.025 |
(0.160) | (0.163) | (0.161) | |
seas7 | 0.123 | 0.060 | 0.096 |
(0.160) | (0.164) | (0.162) | |
seas8 | -0.235 | -0.294^{*} | -0.260 |
(0.160) | (0.164) | (0.162) | |
seas9 | 0.004 | -0.058 | -0.023 |
(0.160) | (0.164) | (0.162) | |
seas10 | 0.169 | 0.086 | 0.134 |
(0.161) | (0.168) | (0.165) | |
seas11 | 0.215 | 0.152 | 0.188 |
(0.160) | (0.165) | (0.162) | |
seas12 | 0.220 | 0.179 | 0.202 |
(0.159) | (0.162) | (0.160) | |
Constant | 9.309^{***} | 8.996^{***} | 9.177^{***} |
(0.140) | (0.199) | (0.184) | |
N | 328 | 328 | 328 |
R^{2} | 0.313 | 0.296 | 0.310 |
Adjusted R^{2} | 0.282 | 0.264 | 0.279 |
Residual Std. Error (df = 313) | 0.397 | 0.402 | 0.398 |
F Statistic | 10.169^{***} (df = 14; 313) | ||
Notes: | ^{***}Significant at the 1 percent level. | ||
^{**}Significant at the 5 percent level. | |||
^{*}Significant at the 10 percent level. |
The results for the new estimation are above. It turns out that adding the extra instruments does not improve the estimates: the elasticity estimate moves from -0.867 to -0.735, further from -1 and back toward the OLS estimate.
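One way to examine whether additional instruments actually help is to look at the first-stage F statistic for the excluded instruments; a common rule of thumb treats values below about 10 as a sign of weak instruments. The sketch below is not part of the homework: `first_stage_F` is a hypothetical helper, and the data are synthetic, generated so that `z` is a relevant instrument and `w` is pure noise.

```r
# Hypothetical helper (not from the homework): F statistic for the excluded
# instruments in the first-stage regression of the endogenous regressor.
first_stage_F <- function(df, endog, instruments, exog) {
  f_restricted <- as.formula(paste(endog, "~", paste(exog, collapse = " + ")))
  f_full <- as.formula(paste(endog, "~", paste(c(exog, instruments), collapse = " + ")))
  restricted <- lm(f_restricted, data = df)
  full <- lm(f_full, data = df)
  # Second row of the anova table holds the F test of restricted vs. full model.
  anova(restricted, full)$F[2]
}

# Synthetic data for illustration only: z is relevant, w is irrelevant.
set.seed(1)
n <- 300
sim <- data.frame(z = rbinom(n, 1, 0.6), w = rnorm(n), x = rnorm(n))
sim$p <- 0.5 * sim$z + 0.2 * sim$x + rnorm(n, sd = 0.3)

F_one <- first_stage_F(sim, "p", "z", "x")
F_two <- first_stage_F(sim, "p", c("z", "w"), "x")
c(one_instrument = F_one, with_noise_instrument = F_two)
```

In the synthetic example the joint F statistic falls once the irrelevant instrument is added. On the exercise data, the analogous calls would be `first_stage_F(data, "log(price)", "collusion", c("lakes", seas_dums))` and the same with `c("collusion", dm_dums)` as instruments.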
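Step 3: Now, let’s estimate the supply equation instead. Here log(quantity) is the endogenous regressor, and lakes, which shifts demand but is excluded from the supply equation, serves as its instrument.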
fmla_iv3 <- as.formula(paste("log(price) ~ log(quantity) + collusion + ", dm_dummies ,"+",seas_dummies,"|", " lakes + collusion + ", dm_dummies, "+", seas_dummies))
ivmodel3 <- ivreg(fmla_iv3, data=data)
stargazer(ivmodel3, type = "html", style = "qje", title = "Supply Equation Regression Results", model.names = FALSE)
log(price) | |
log(quantity) | 0.253 |
(0.173) | |
collusion | 0.368^{***} |
(0.054) | |
dm1 | -0.202^{***} |
(0.055) | |
dm2 | -0.173^{**} |
(0.081) | |
dm3 | -0.319^{***} |
(0.065) | |
dm4 | -0.208 |
(0.172) | |
seas1 | 0.030 |
(0.072) | |
seas2 | 0.092 |
(0.069) | |
seas3 | 0.130^{*} |
(0.071) | |
seas4 | -0.001 |
(0.082) | |
seas5 | -0.043 |
(0.076) | |
seas6 | -0.043 |
(0.090) | |
seas7 | -0.084 |
(0.082) | |
seas8 | 0.072 |
(0.127) | |
seas9 | 0.041 |
(0.099) | |
seas10 | -0.111 |
(0.077) | |
seas11 | -0.097 |
(0.074) | |
seas12 | 0.017 |
(0.078) | |
Constant | -3.975^{**} |
(1.778) | |
N | 328 |
R^{2} | 0.316 |
Adjusted R^{2} | 0.276 |
Residual Std. Error | 0.246 (df = 309) |
Notes: | ^{***}Significant at the 1 percent level. |
^{**}Significant at the 5 percent level. | |
^{*}Significant at the 10 percent level. |
The coefficient on collusion tells us that when the firms collude, the price is 0.368 log points higher than in the no-collusion case, i.e. about 44.5 percent higher (exp(0.368) - 1 ≈ 0.445). The positive coefficient on log(quantity) points to decreasing returns to scale. Why? It implies that the marginal cost curve is upward-sloping, which happens only if the production function has decreasing returns to scale [proof omitted], so that the cost of the firm is a convex function of its output. Note, however, that this estimate (0.253, with a standard error of 0.173) is not significantly different from zero, so the conclusion is tentative.
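The two numbers that drive this interpretation can be checked by hand. The sketch below uses the rounded estimates from the supply-equation table above and a t distribution with the reported 309 residual degrees of freedom; the variable names are mine, and the p-value is an approximation based on the rounded figures.

```r
# Rounded estimates copied from the supply-equation table above.
b_collusion <- 0.368
b_logq <- 0.253
se_logq <- 0.173
df_resid <- 309

# Exact percentage price premium implied by the collusion dummy.
premium_pct <- 100 * (exp(b_collusion) - 1)

# Two-sided t test of H0: the coefficient on log(quantity) equals zero.
t_logq <- b_logq / se_logq
p_logq <- 2 * pt(-abs(t_logq), df = df_resid)

round(c(premium_pct = premium_pct, t = t_logq, p = p_logq), 3)
```

The t statistic is about 1.46, so the upward slope of the supply relation is not statistically distinguishable from zero at conventional levels.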
Suppose we have a linear demand specification:
\[\begin{equation} Q_t = \alpha_0 + \alpha_1 P_t + \alpha_2 Lakes_t + u_t \end{equation}\]
It implies that the slope of the inverse demand curve is \(1/\alpha_1\), and therefore that the slope of the marginal revenue curve is \(2/\alpha_1\). Write the inverse demand curve as \(P = K + \frac{1}{\alpha_1} Q\), so that the marginal revenue curve is \(MR = K + \frac{2}{\alpha_1} Q\). Let \(c\) denote the (constant) marginal cost. Since \(MR=MC\) at the optimal quantity, define \(Q^*\) by \(MR(Q^*) = c\), which gives \(K+\frac{2}{\alpha_1}Q^* = c\) and therefore \(K = c - \frac{2}{\alpha_1}Q^*\). The price that the monopolist charges is \(P^* = D(Q^*)\). Plugging \(K\) into the inverse demand equation, we get \(P^* = c - \frac{2}{\alpha_1}Q^* + \frac{1}{\alpha_1}Q^* = c - \frac{1}{\alpha_1}Q^*\). Since \(\alpha_1 < 0\), the price is increasing in quantity along this rule. Given this result, I would choose a linear functional form for the supply curve, such as \(P = a_0 + a_1 Q\), where \(a_1 > 0\).
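The pricing rule \(P^* = c - \frac{1}{\alpha_1}Q^*\) can be checked numerically. The sketch below picks illustrative parameter values (assumptions, not estimates from the data), maximizes monopoly profit with `optimize`, and compares the resulting price with the closed-form rule.

```r
# Illustrative parameter values (assumptions, not estimates from the data).
alpha1 <- -2   # slope of demand, dQ/dP < 0
K <- 10        # intercept of the inverse demand curve
c_mc <- 2      # constant marginal cost

inverse_demand <- function(Q) K + Q / alpha1
profit <- function(Q) (inverse_demand(Q) - c_mc) * Q

# Maximize profit numerically over a range that contains the optimum.
Q_star <- optimize(profit, interval = c(0, 20), maximum = TRUE)$maximum
P_star <- inverse_demand(Q_star)

# Compare with the closed-form rule P* = c - Q*/alpha1.
c(Q_star = Q_star, P_star = P_star, rule = c_mc - Q_star / alpha1)
```

With these values the profit function is \(8Q - Q^2/2\), so the optimum is at \(Q^* = 8\) and both the inverse demand curve and the rule give a price of 6, up to the tolerance of the numerical optimizer.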
All of the calculations above assumed that the firm is a monopolist, so we should control for the regime (collusive or not) in constructing a supply equation. Other structural factors also play a role in determining the price. Therefore, we can construct the supply equation as
\[\begin{equation} P_t = \beta_0 + \beta_1 Q_t + \beta_2 S_t + \beta_3 I_t + v_t \end{equation}\]
where \(S_t\) collects the structural dummies and \(I_t\) indicates whether the industry is in a collusive regime.
If the industry is composed of competitive firms with total cost \(c(Q_t) = c_0 Q_t + c_1 Q_t^2\), then \(MC = P\) implies the pricing rule \(P_t = c_0 + 2c_1 Q_t\).
It is important to note that if firms have this type of total cost structure, Porter’s approach to identifying industry behavior is no longer useful, since both types of behavior imply a pricing rule of the form \(P = a_0 + a_1 Q\). Therefore, the regime type cannot be identified with the method described by Porter.
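To see why, one can redo the monopoly calculation with the same quadratic cost function. This is a sketch that reuses the inverse demand \(P = K + \frac{1}{\alpha_1}Q\) from above and sets \(MR = MC\) with \(MC = c_0 + 2c_1 Q\):
\[ K + \frac{2}{\alpha_1} Q = c_0 + 2 c_1 Q \quad\Longrightarrow\quad P = K + \frac{1}{\alpha_1} Q = c_0 + \left(2c_1 - \frac{1}{\alpha_1}\right) Q \]
The competitive rule is \(P = c_0 + 2c_1 Q\). Both rules are linear in \(Q\) with the same intercept and a positive slope (given \(c_1 > 0\) and \(\alpha_1 < 0\)), so an estimated slope \(a_1\) on its own does not reveal which regime generated the data unless \(c_1\) is known separately.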
Now, let’s go back to the data and try to find the causes of price wars by using a probit regression.
First, we create a new variable that indicates the start of a price war, conditional on the previous period being collusive. Then, we use quantity, lakes, dm1, dm2, dm3 and dm4 as the explanatory variables, and estimate the probability that a price war starts using a probit regression.
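To make the definition of the price-war indicator concrete, here is a toy example with made-up values (not taken from the dataset). It is a vectorized sketch of the same rule, assuming the indicator is left missing whenever the previous period was not collusive; the names `prev` and `war_start_toy` are mine.

```r
# Toy collusion indicator (made-up values, not taken from the dataset).
collusion <- c(1, 1, 0, 0, 1, 0)

# Collusion status in the previous period; undefined for the first period.
prev <- c(NA, head(collusion, -1))

# 1 = price war starts, 0 = collusion continues, NA = previous period not collusive.
war_start_toy <- ifelse(prev == 1, 1 - collusion, NA)
war_start_toy
```

The result is NA, 0, 1, NA, NA, 1: a war starts in periods 3 and 6, collusion continues in period 2, and the remaining periods drop out of the probit sample.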
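# war_start[i] is defined only if week i-1 was collusive:
# 1 if collusion breaks down in week i (a price war starts), 0 if it continues.
war_start <- rep(NA_real_, nrow(data))
for (i in 2:nrow(data)) {
  if (data$collusion[i - 1] == 1) {
    war_start[i] <- 1 - data$collusion[i]
  }
}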
data$quantity <- data$quantity/1000
fmla_probit <- as.formula(paste("war_start ~ quantity + lakes + ", dm_dummies))
probitmodel <- glm(fmla_probit, family = binomial(link = "probit"), data = data)
stargazer(probitmodel, type = "html", style = "qje", covariate.labels = c("quantity/1000"))
war_start | |
quantity/1000 | 0.001 |
(0.019) | |
lakes | -0.246 |
(0.337) | |
dm1 | 3.934 |
(305.054) | |
dm2 | 4.400 |
(305.055) | |
dm3 | 4.578 |
(305.054) | |
dm4 | -0.068 |
(841.486) | |
Constant | -5.700 |
(305.055) | |
N | 202 |
Log Likelihood | -38.419 |
Akaike Inf. Crit. | 90.839 |
Notes: | ^{***}Significant at the 1 percent level. |
^{**}Significant at the 5 percent level. | |
^{*}Significant at the 10 percent level. |
The coefficient on quantity is positive, which would imply that the probability of a price war increases as quantity increases. In other words, price wars would be more likely to occur in booms. Additionally, the coefficient on lakes would imply that price wars are less likely if the lakes are open to navigation. However, both estimates are small relative to their standard errors, and none of the coefficients is significantly different from zero.
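The standard errors of around 305 on dm1 to dm3 and on the constant are the kind of values that typically appear when a dummy (almost) perfectly predicts the outcome for part of the sample, a situation known as quasi-complete separation. Whether that is what happens here would need to be checked on the data, for instance by cross-tabulating war_start against the dm dummies. The sketch below reproduces the symptom on synthetic data (made up for illustration).

```r
# Synthetic data for illustration only: when d == 0 the outcome is always 0,
# so d quasi-completely separates the outcome.
d <- rep(c(0, 1), each = 50)
y <- c(rep(0, 50), rep(c(1, 0), times = c(10, 40)))

# A cross-tabulation reveals the empty cell (d == 0, y == 1).
tab <- table(d = d, y = y)
tab

# The probit fit still runs, but with warnings and inflated standard errors.
fit <- suppressWarnings(glm(y ~ d, family = binomial(link = "probit")))
se <- sqrt(diag(vcov(fit)))
se
```

In such a case the individual coefficients and their z statistics are not informative, even though the fitted probabilities may still be sensible.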