Utility Models


# Week 7: .fancy[Utility Models]

### EMSE 6035: Marketing Analytics for Design Decisions
### John Paul Helveston
### October 12, 2022


---

# Week 7: .fancy[Utility Models]

### 1. Utility models
### 2. Exploring choice data
### 3. Linear & discrete parameters

### BREAK

### 4. Outside good
### 5. Team project utility models

---

# Week 7: .fancy[Utility Models]

### 1. .orange[Utility models]
### 2. Exploring choice data
### 3. Linear & discrete parameters

### BREAK

### 4. Outside good
### 5. Team project utility models

---

# Random utility model

<br>

## The utility for alternative `$j$` is
# `$$\tilde{u}_j = v_j + \tilde{\varepsilon}_j$$`

## `$v_j$` = Things we observe (non-random variables)
## `$\tilde{\varepsilon}_j$` = Things we _don't_ observe (random variable)

---

# `$$\tilde{u}_j = v_j + \tilde{\varepsilon}_j$$`

---

# Practice Question 1

a) A random variable, `$\tilde{x}$`, has the PDF, `$f_{\tilde{x}}(x)$`. Write the equation to compute its total probability (hint: think area under the curve!). What is the answer to the equation?

b) A random variable, `$\tilde{x}$`, has a uniform distribution between the values 0 and 1. Draw the probability density function (PDF) and cumulative distribution function (CDF) of `$\tilde{x}$`.

c) The value of a random variable, `$\tilde{x}$`, is determined by rolling one fair, 6-sided die. Draw the PDF and CDF of `$\tilde{x}$`.

---

## **Logit model**: Assume that `$\tilde{\varepsilon}_j$` ~ [Gumbel Distribution](https://en.wikipedia.org/wiki/Gumbel_distribution)

## `$$\tilde{u}_j = v_j + \tilde{\varepsilon}_j$$`


## Probability of choosing alternative `$j$`:

# `$$P_j = \frac{e^{v_j}}{\sum_k{e^{v_k}}}$$`
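
As a minimal sketch of the formula above, the base R snippet below computes logit probabilities for three made-up observed utilities and compares them with choice shares simulated from `$\tilde{u}_j = v_j + \tilde{\varepsilon}_j$` with standard Gumbel errors. The function name `logit_probs` and the values in `v` are assumptions for illustration only.

```r
# Logit choice probabilities: P_j = exp(v_j) / sum_k exp(v_k)
logit_probs <- function(v) {
  exp_v <- exp(v - max(v))  # subtracting max(v) avoids overflow; the ratios are unchanged
  exp_v / sum(exp_v)
}

# Hypothetical observed utilities for three alternatives
v <- c(1.0, 2.0, 0.5)
p <- logit_probs(v)

# Check by simulation: u_j = v_j + e_j, with e_j drawn from a standard Gumbel
set.seed(123)
n <- 100000
e <- matrix(-log(-log(runif(n * length(v)))), nrow = n)
u <- sweep(e, 2, v, "+")            # add v_j to every draw in column j
choice <- apply(u, 1, which.max)    # each simulated person picks the highest utility
shares <- as.numeric(table(factor(choice, levels = seq_along(v)))) / n

round(rbind(formula = p, simulated = shares), 3)
```

The two rows should agree closely, which is the sense in which the closed-form probability follows from the Gumbel assumption.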


---

# Practice Question 2

a) A consumer is making a choice between two bars of chocolate:

- Milk chocolate `$(m)$`
- Dark chocolate `$(d)$`

Assume that we know the observed utility of each bar to be `$v_m = 3$` and `$v_d = 4$`. Using a logit model, compute the probabilities of choosing each bar: `$P_m$` and `$P_d$`.

b) A third bar of chocolate is now added to the choice set. It is identical to the milk chocolate bar except for a slightly different wrapper (which has no effect on the consumer's utility). Now, `$v_{m1} = v_{m2} = 3$` and `$v_d = 4$`. Based on the probabilities from part a), what would we expect the probabilities of choosing each bar to be? What probabilities does the logit model produce?

---

### **"Observed utility" `$(v_j)$` is a weighted sum of attribute values**

<br>

## `$$v_j = \beta_1 x_{j}^{\mathrm{A}} + \beta_2 x_j^{\mathrm{B}} +  \dots$$`

## Each `$x_j$` is an observable attribute (_price_, etc.)

<br>

## We know `$x_{j}^{\mathrm{A}}, x_{j}^{\mathrm{B}}, \dots$`,<br>**we want to _estimate_** `$\beta_1, \beta_2, \dots$`
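
The weighted sum can be sketched in base R as a matrix product. In practice the coefficients are unknown and must be estimated. Here the attribute values, the attribute names (`price`, `size`), and the coefficients are all hypothetical, chosen only to show the arithmetic.

```r
# Hypothetical attribute values for three alternatives (one row per alternative)
X <- cbind(
  price = c(1.0, 2.0, 3.0),
  size  = c(10, 20, 30)
)

# Hypothetical coefficients, in the same order as the columns of X
beta <- c(price = -0.5, size = 0.1)

# Observed utility of each alternative: v_j = beta_1 * x_j^A + beta_2 * x_j^B
v <- as.numeric(X %*% beta)
v
```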

---

# .center[Notation Convention]

## Continuous: `$x_j$`

## `$$u_j = \beta_1 x_{j}^{\mathrm{price}} + \dots$$`

```
#>   price
#> 1     1
#> 2     2
#> 3     3
```


## Discrete: `$\delta_j$`

## `$$u_j = \beta_1 \delta_{j}^{\mathrm{ford}} + \beta_2 \delta_{j}^{\mathrm{gm}} + \dots$$`

```
#>   brand brand_BMW brand_Ford brand_GM
#> 1  Ford         0          1        0
#> 2    GM         0          0        1
#> 3   BMW         1          0        0
```
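
A small base R sketch of how the dummy columns enter the utility: BMW is left out of the model, so it acts as the reference level with a brand utility of zero. The coefficient values `beta_ford` and `beta_gm` are hypothetical.

```r
# Brands from the table above; BMW is omitted from the model as the reference level
cars <- data.frame(brand = c("Ford", "GM", "BMW"))
cars$brand_Ford <- as.integer(cars$brand == "Ford")
cars$brand_GM   <- as.integer(cars$brand == "GM")

# Hypothetical coefficients for the two included dummies
beta_ford <- -1.0
beta_gm   <- -0.5

# v_j = beta_1 * delta_j^ford + beta_2 * delta_j^gm
cars$v <- beta_ford * cars$brand_Ford + beta_gm * cars$brand_GM
cars
```

Each coefficient is therefore read as the utility of that brand relative to BMW.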


---

# Practice Question 3

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> Attribute </th>
   <th style="text-align:left;"> Bar 1 </th>
   <th style="text-align:left;"> Bar 2 </th>
   <th style="text-align:left;"> Bar 3 </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Price </td>
   <td style="text-align:left;"> $1.20 </td>
   <td style="text-align:left;"> $1.50 </td>
   <td style="text-align:left;"> $3.00 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> % Cacao </td>
   <td style="text-align:left;"> 10% </td>
   <td style="text-align:left;"> 60% </td>
   <td style="text-align:left;"> 80% </td>
  </tr>
</tbody>
</table>

a) Write out a model for the _observed_ utility of each chocolate bar in the above set.

b) If the coefficient for the _price_ attribute is -0.1 and the coefficient for the _% Cacao_ attribute is 0.1, what is the difference in observed utility between bars 3 and 1?

c) With the addition of the _brand_ attribute, repeat part a.


<table class="table table-hover table-condensed" style="width: auto !important; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;"> Attribute </th>
   <th style="text-align:left;"> Bar 1 </th>
   <th style="text-align:left;"> Bar 2 </th>
   <th style="text-align:left;"> Bar 3 </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Price </td>
   <td style="text-align:left;"> $1.20 </td>
   <td style="text-align:left;"> $1.50 </td>
   <td style="text-align:left;"> $3.00 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> % Cacao </td>
   <td style="text-align:left;"> 10% </td>
   <td style="text-align:left;"> 60% </td>
   <td style="text-align:left;"> 80% </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Brand </td>
   <td style="text-align:left;"> Hershey </td>
   <td style="text-align:left;"> Lindt </td>
   <td style="text-align:left;"> Ghirardelli </td>
  </tr>
</tbody>
</table>


---

## Your Turn

Let's say our utility function is:

.font80[$$v_j = \beta_1 x_j^{\mathrm{price}} + \beta_2 x_j^{\mathrm{cacao}} + \beta_3 \delta_j^{\mathrm{hershey}} + \beta_4 \delta_j^{\mathrm{lindt}}$$]

And we estimate the following coefficients:

Parameter | Coefficient 
----------|-----------
`$\beta_1$` | -0.1
`$\beta_2$` | 0.1
`$\beta_3$` | -2.0
`$\beta_4$` | -0.1


a) What are the expected probabilities of choosing each of these bars using a logit model?

b) What price would Bar 2 have to be to get a 50% market share?


---

# Week 7: .fancy[Utility Models]

### 1. Utility models
### 2. .orange[Exploring choice data]
### 3. Linear & discrete parameters

### BREAK

### 4. Outside good
### 5. Team project utility models

---

## Download the [logitr-cars](https://github.com/emse-madd-gwu/logitr-cars) repo from GitHub

---

# .center[Exploring choice data]

<br>

## 1. Open `logitr-cars.Rproj`

## 2. Open `code/2.1-explore-data.R`


---

# Week 7: .fancy[Utility Models]

### 1. Utility models
### 2. Exploring choice data
### 3. .orange[Linear & discrete parameters]

### BREAK

### 4. Outside good
### 5. Team project utility models

---

# .center[Dummy-coded variables]

Data frame with one variable: _price_

```r
data <- data.frame(price = c(10, 20, 30))

data
```

```
#>   price
#> 1    10
#> 2    20
#> 3    30
```


Add dummy columns for each price "level"

```r
library(fastDummies)

dummy_cols(data, "price")
```

```
#>   price price_10 price_20 price_30
#> 1    10        1        0        0
#> 2    20        0        1        0
#> 3    30        0        0        1
```


---

### Model _price_ as _continuous_

`$v_j = \beta_1 x_j^{\mathrm{price}}$`

```r
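library(logitr)

model <- logitr(
    data    = data,
    outcome = "choice",  # column marking the chosen alternative (1 or 0)
    obsID   = "obsID",   # column identifying each choice observation
    pars    = "price"    # price enters as a single continuous variable
)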
```

<br>

Coef. | Interpretation
------|------------------
β1 | change in utility per unit increase in _price_


### .center[Model _price_ as _discrete_]

`$v_j = \beta_1 \delta_j^{\mathrm{price = 20}} + \beta_2 \delta_j^{\mathrm{price = 30}}$`

```r
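library(logitr)

model <- logitr(
    data    = data,
    outcome = "choice",  # column marking the chosen alternative (1 or 0)
    obsID   = "obsID",   # column identifying each choice observation
    pars    = c("price_20", "price_30")  # price_10 is the reference level
)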
```

Coef. | Interpretation
------|------------------
β1 | utility for _price=20_ relative to _price=10_
β2 | utility for _price=30_ relative to _price=10_
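
To see how the two specifications differ, the sketch below computes the utility each one implies at the three price levels, relative to _price=10_. All coefficient values are hypothetical. The linear model forces equal utility steps between levels, while the discrete model lets each step differ.

```r
price_levels <- c(10, 20, 30)

# Continuous: one hypothetical slope, so each unit of price shifts utility equally
beta_price <- -0.05
v_linear <- beta_price * price_levels
v_linear_rel <- v_linear - v_linear[1]   # re-express relative to price = 10

# Discrete: one hypothetical coefficient per non-reference level
beta_20 <- -0.3
beta_30 <- -1.2
v_discrete <- c(0, beta_20, beta_30)     # the reference level has utility 0

data.frame(price = price_levels, linear = v_linear_rel, discrete = v_discrete)
```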


---

# .center[Estimating utility models]

<br>

## 1. Open `logitr-cars.Rproj`

## 2. Open `code/3.1-model-mnl.R`


---

# `mnl_dummy`

All dummy-coded variables

```r
pars = c(
  "price_20", "price_25",
  "fuelEconomy_25", "fuelEconomy_30",
  "accelTime_7", "accelTime_8",
  "powertrain_Electric")
```

Reference Levels:

- Price: 15
- Fuel Economy: 20
- Accel. Time: 6
- Powertrain: "Gasoline"


# `mnl_linear`

All continuous (linear), except for `powertrain_Electric`

```r
pars = c(
  "price", "fuelEconomy", "accelTime",
  "powertrain_Electric")
```

Reference Levels:

- Powertrain: "Gasoline"


---

## Your Turn

1) Run the code chunk that reads in the `data.csv` file in the "data" folder, which contains choice observations for chocolate bars with the following attributes:

Attribute | Description 
----------|----------------------
`price` | Price in $
`percent_cacao` | % Cacao (how "dark" the chocolate is)
`crispy_rice` | 1 if the bar contains crispy rice, 0 otherwise
`brand` | "Hershey", "Lindt", or "Ghirardelli"


2) Write code to estimate the following utility model<br>(HINT: you may need to make some dummy-coded variables!):

`$$\tilde{u}_j = \beta_1 x_j^{\mathrm{price}} + \beta_2 x_j^{\mathrm{\%cacao}} + \beta_3 \delta_j^{\mathrm{crispy}} + \beta_4 \delta_j^{\mathrm{hershey}} + \beta_5 \delta_j^{\mathrm{lindt}} + \tilde{\varepsilon}_j$$`

3) Write code to plot the change in utility for the _price_ attribute.

---

# Week 7: .fancy[Utility Models]

### 1. Utility models
### 2. Exploring choice data
### 3. Linear & discrete parameters

### BREAK

### 4. .orange[Outside good]
### 5. Team project utility models

---

## .center[Estimating utility models with an _Outside Good_]

<br>

## 1. Open `logitr-cars.Rproj`

## 2. Open `code/4.1-model-nochoice.R`


---

# Week 7: .fancy[Utility Models]

### 1. Utility models
### 2. Exploring choice data
### 3. Linear & discrete parameters

### BREAK

### 4. Outside good
### 5. .orange[Team project utility models]

---

# .center[Simulating choice data]

```r
library(cbcTools)

# No priors supplied: choices are assigned at random
data <- cbc_choices(
    design = design,
    obsID = "obsID"
)
```


`$v_j = -0.7 x_j^{\mathrm{price}} + 0.1 x_j^{\mathrm{fuelEconomy}} - 0.2 x_j^{\mathrm{accelTime}} -4 \delta_j^{\mathrm{electric}}$`


```r
data <- cbc_choices(
    design = design,
    obsID = "obsID",
    priors = list(
        price       = -0.7,
        fuelEconomy = 0.1,
        accelTime   = -0.2,
        powertrain_Electric = -4.0
    )
)
```


---

# .center[Estimate a choice model]

`$$v_j = \beta_1 x_j^{\mathrm{price}} + \beta_2 x_j^{\mathrm{fuelEconomy}} + \beta_3 x_j^{\mathrm{accelTime}} + \beta_4 \delta_j^{\mathrm{electric}}$$`

```r
model <- logitr(
  data    = data,
  outcome = "choice",
  obsID   = "obsID",
  pars    = c(
      "price", "fuelEconomy", "accelTime", "powertrain_Electric"
  )
)
```
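
To make "estimate" concrete, here is a rough base R sketch of maximum likelihood estimation for a two-parameter logit model on simulated data. It is only an illustration under stated assumptions, not how `logitr` is implemented: the sample sizes, attribute levels, "true" coefficients, and the helper `neg_log_lik` are all made up for this example.

```r
set.seed(42)
n_obs <- 2000   # number of choice questions
n_alt <- 3      # alternatives per question

# Simulated attributes, one row per alternative
d <- data.frame(
  obsID       = rep(seq_len(n_obs), each = n_alt),
  price       = sample(c(15, 20, 25), n_obs * n_alt, replace = TRUE),
  fuelEconomy = sample(c(20, 25, 30), n_obs * n_alt, replace = TRUE)
)
X <- as.matrix(d[, c("price", "fuelEconomy")])

# Hypothetical "true" parameters, used only to generate the fake choices
beta_true <- c(price = -0.2, fuelEconomy = 0.1)

# Simulate choices: utility = v + Gumbel error; the max within each obsID is chosen
u <- as.numeric(X %*% beta_true) - log(-log(runif(nrow(d))))
d$choice <- as.integer(u == ave(u, d$obsID, FUN = max))

# Negative log-likelihood of the logit model: -sum over chosen rows of log(P_j)
neg_log_lik <- function(beta) {
  v <- as.numeric(X %*% beta)
  # log of the denominator sum_k exp(v_k), computed stably within each obsID
  log_denom <- ave(v, d$obsID, FUN = function(x) max(x) + log(sum(exp(x - max(x)))))
  -sum(d$choice * (v - log_denom))
}

# Minimize the negative log-likelihood, starting from all-zero coefficients
fit <- optim(c(0, 0), neg_log_lik, method = "BFGS")
beta_hat <- setNames(fit$par, colnames(X))

round(rbind(true = beta_true, estimated = beta_hat), 3)
```

With enough simulated observations the estimated coefficients should land near the values used to generate the data, which is the same check the team exercise asks for.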

---

## Your Turn

### As a team:

1. Go back to your code from last week where you created your choice questions.

2. Write out a utility model for your project.

3. Write code to simulate data according to your utility model - pick some fake parameter values.

4. Write code to estimate a model using your simulated data.