Attribute | Range | Explanation |
Upfront Incentive: | $100, $300, $500 | It means the amount of money you will be instantly paid if you join the smart charging program. The unit is in USD. |
Free Level 2 Charger: | No or Yes | A "Yes" means you will get a free level 2 charger. A level 2 charger enables faster charging than a standard home charger. |
Electricity Price Discount: | 10%, 25%, 50% | If you choose to join this smart charging program, this will be your electricity price discount. |
Override Window: | 0.5 ~ 4 hrs | This is the longest time window per day that you can force override to regular charging, during which time your EV is not controlled by the smart charging program. |
Guaranteed Range if charged for 8 hrs: | 25%, 50%, 75% | It means the guranteed range if your car is charged for 8 hrs during smart charging. It shows percentage here, but will be converted to miles. |
Final Analysis - Smart Charging for PEVs
Quantifying the Benefits and Constraints of Plug-In Electric Vehicle Smart Charging Adoption
Abstract
The product being studied in the project proposal is “Smart Charging Solutions for Plug-in Electric Vehicles (PEVs).” The initiative aims to investigate the reception and potential adoption of these technologies among users who own or might consider owning PEVs. The focus is on understanding consumer attitudes, preferences, and potential barriers regarding the adoption of smart charging solutions for electric vehicles. The project intends to delve into aspects such as awareness, perceived benefits, and potential hindrances associated with smart charging technologies for PEVs, exploring attributes like compensation, electricity pricing, charging speed, and more.
The investigation of consumer inclinations and considerations within the project was centered on a strategic analysis of distinct product attributes and their corresponding design decisions. These key attributes included Upfront Incentive, Free Level 2 Charger, Electricity Price Discount, Override Window, and the commitment to a Guaranteed Range if smart charged for 8 hours. Notably, the design decisions for these attributes were structured with deliberate variations: the Upfront Incentive ranged across $100, $300, and $500, the provision of a Free Level 2 Charger was delineated as ‘Yes’ or ‘No’, and the Electricity Price Discount was presented at three distinct rates: 10, 25, and 50. Additionally, the Guaranteed Range if smart charged for 8 hours offered choices of 25, 50, and 75 units. These carefully curated design decisions were aimed at assessing consumer responsiveness to diverse incentive structures, charging infrastructure availability, pricing models, and the potential impact of guaranteed range, thereby offering comprehensive insights into the dynamics of smart charging solution adoption among consumers.
Introduction
The global drive for sustainable energy has propelled Plug-in Electric Vehicles (PEVs) to the forefront in the fight against climate change. Smart charging technologies, integrating advanced grid systems and digital connectivity, offer a transformative opportunity to enhance PEVs’ role in a sustainable future by optimizing charging processes, improving grid stability, and reducing costs.
This project proposal aims to gauge user acceptance of smart charging solutions for PEVs. It delves into factors influencing consumer choices, uncovering awareness levels, benefits, and barriers associated with smart charging among PEV owners. Using surveys, interviews, and data analysis, the project aims to reveal user attitudes, preferences, and concerns about smart charging.
The comprehensive approach considers various attributes like compensation, electricity price, charging speed, window, and percentage, exploring user attitudes toward smart charging. It also evaluates the impact of compensation on adoption rates, offering valuable insights for stakeholders in the electric vehicle industry.
In an era focused on environmental awareness and sustainable transportation, this project accelerates the shift toward cleaner and more efficient mobility solutions. It aligns with global energy sustainability goals, emphasizing the need to reduce carbon emissions and create a smarter, more connected, and environmentally responsible landscape for electric vehicle charging.
Survey Design
The survey is divided into three sections. Section 1 covers PEV Ownership and Charging Habits Questions, Section 2 addresses Smart Charging Problems, and Section 3 gathers Demographic Information.
Before starting the actual survey, participants will see a consent page and a 30-second educational video to introduce them to Smart Charging. The video explains smart charging and how it allows the utility to control the PEV charging process, including the time window and charging amount. In order to ensure engagement of all users, this survey does not fail any test takers if they are not EV owners. However, it turned out the majority of test takers are EV owners. This survey provides a “not interested” option if the test takers don’t want to participate.
Section 1 begins by asking participants about their interest in PEVs and whether they own, lease, plan to have one in the future, or do not have one. It also inquires about their preferred PEV model, preferred mileage, and charging habits at home and work. The respondents who have PEVs or tend to have PEVs are within our eligibility.
Section 2, focusing on Smart Charging Problems, consists of eight questions per respondent. This section addresses five main issues faced by smart charging: Upfront Incentive, Free Level 2 charger, Electricity Price discount, Override Window, and Guaranteed Range if charged for 8 hours. Upfront Incentive refers to the instant payment when joining the smart charging program, ranging from $100 to $500. Free Level 2 charger is a faster charging option, with respondents choosing between “Yes” or “No.” Electricity Price discount offers a discount ranging from 10% to 50% if they join the smart charging program. Override Window refers to the longest daily period where users can override regular charging, with values ranging from 0.5 to 4 hours. Guaranteed Range if charged for 8 hours specifies the range achieved if the car is charged for 8 hours during the smart charging period. The value shown ranges from 25% to 75% of the maximum range in miles, relying on users’ input. Each of the attributes with their respective levels and brief explanation is summarized in the table below:
This part of our survey involves a small twist: the respondent’s choice in Section 1 question determines the percentage attribute. In the guaranteed range section, we’ll calculate mileage based on the user’s input from Section 1. The guaranteed range in percentage ranges from 25% to 75% for the user’s input. Each question has three alternatives, with the third option being “I am not interested.” Since these attributes are randomly generated, some options may seem illogical or overly obvious. To ensure respondent understanding, we’ve included a sentence asking them to choose between two options if those were the only choices available. Additionally, a third option, “I am not interested,” is provided for those who do not show interest in the available options. Similarly, to ensure clarity, Section 2 starts with a description of the attributes, their values, and a sample question to guide respondents. Below is an example of our conjoint question.
Code
<- tibble(
alts `Options:` = c("Option 1", "Option 2"),
`Upfront Incentive:` = scales::dollar(c(100, 500)),
`Free Level 2 Charger:` = c("No", "Yes"),
`Electricity Price Discount:` = c("25%", "50%"),
`Override Window:` = c("0.5 hrs", "4 hrs"),
`Guaranteed Range if charged for 8 hrs:` = c("100 miles", "300 miles"))
<- as.data.frame(t(alts)) %>%
alts_t rownames_to_column(var = "Attribute")
colnames(alts_t) <- NULL
kable(alts_t, escape = FALSE, align = c('r', 'c', 'c')) %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = FALSE,
position = "center") %>%
column_spec(column = 1, width = '20em') %>%
column_spec(column = 2, width = '10em') %>%
column_spec(column = 3, width = '10em') %>%
row_spec(row = 1, bold = TRUE)
Options: | Option 1 | Option 2 |
Upfront Incentive: | $100 | $500 |
Free Level 2 Charger: | No | Yes |
Electricity Price Discount: | 25% | 50% |
Override Window: | 0.5 hrs | 4 hrs |
Guaranteed Range if charged for 8 hrs: | 100 miles | 300 miles |
Section 3 collects demographic information, travel habits, opinions regarding climate change and commuting preferences. This includes birth year, gender identity, race, educational background, household income, interest in the environment and climate change, political inclination, ZIP code, commute method, number of cars in their household, presence of EV charger, and the presence of backup cars, if any, along with their quantity.
There have not been any major changes between the pilot and final surveys. The only modification is an increase in the number of questions per respondent, from 5 in the pilot survey to 8 in the final version.
Data Analysis
This section describes the analysis that the team carried out using our pilot data. It contains 3 subsections: sample description, data cleaning, and modeling. The analysis breakdown is described in the following 3 subsections, and the data dictionary is show in the appendix (see 2. Data Dictionary).
1. Sample Description
The team generated a randomized full-factorial design of cbc choice question set using cbc_design()
and fed into the survey. The design is generated using the following codes:
Code
# Generate the profiles of attributes
<- cbc_profiles(
profiles upfront = c(100, 300, 500), # USD
lv_2_charger = c('No', 'Yes'),
discount = c(10, 25, 50), # percentage
window = c(0.5, 1, 2, 4), # hrs
guaranteed = c(25, 50, 75) # percentage
)
# Randomized full-factorial design
<- cbc_design(
design profiles = profiles,
n_resp = 500, # Number of respondents
n_alts = 2, # Number of alternatives per question
n_q = 8, # Number of questions per respondent
no_choice = TRUE
)
# Save design
write_csv(design, here('data', 'choice_questions.csv'))
The choice_questions.csv
is saved to the data
folder. However, in the survey analysis, the choice questions refered to is from an online source with the same format. See appendix 1. Conjoint Survey Documentation for the documentation of the conjoint survey questions.
2. Data Cleaning
The team had 122 complete sets of survey results, each with 8 conjoint questions answered. The data cleaning process is written in the code chunk below:
Code
# 1. Read survey data
<- read_csv(here("data", "pevFinal1.csv"))
p1 <- read_csv(here("data", "pevFinal2.csv"))
p2 <- read_csv(here("data", "pevFinal3.csv"))
p3
# 2. Format and combine the 3 dataframes
# 2.1 Format p1
<- p1 %>%
p1 mutate(
created = ymd_hms(created, tz = "EST"),
ended = ymd_hms(ended, tz = "EST"),
time_sec_p1 = as.numeric(ended - created, units = "secs")) %>%
# Select important columns
select(session, time_sec_p1, starts_with("mc"))
# 2.2 Format p2
<- p2 %>%
p2 mutate(
created = ymd_hms(created),
ended = ymd_hms(ended),
time_sec_p2 = as.numeric(ended - created, units = "secs")) %>%
# Select important columns
select(session, time_sec_p2, respondentID, cbc_all_same, starts_with("mc"))
# 2.3 Format p3
<- p3 %>%
p3 mutate(
created = ymd_hms(created),
ended = ymd_hms(ended),
time_sec_p3 = as.numeric(ended - created, units = "secs")) %>%
# Select important columns
select(session, time_sec_p3, completion_code, starts_with("mc"))
# 2.4 Combine all parts together using the session variable
<- p1 %>%
choice_data_wider left_join(p2, by = "session") %>%
left_join(p3, by = "session") %>%
# No longer need session variable
select(-session)
# 3. Drop unnecessary parts
# 3.1 Drop anyone who didn't complete all choice questions
<- choice_data_wider %>%
choice_data_wider filter(!is.na(mc_cbc_1)) %>%
filter(!is.na(mc_cbc_2)) %>%
filter(!is.na(mc_cbc_3)) %>%
filter(!is.na(mc_cbc_4)) %>%
filter(!is.na(mc_cbc_5)) %>%
filter(!is.na(mc_cbc_6)) %>%
filter(!is.na(mc_cbc_7)) %>%
filter(!is.na(mc_cbc_8))
# 3.2 Drop respondents who went too fast
<- choice_data_wider %>%
choice_data_wider mutate(
# First replace NA values with 0 seconds
time_sec_p1 = ifelse(is.na(time_sec_p1), 0, time_sec_p1),
time_sec_p2 = ifelse(is.na(time_sec_p2), 0, time_sec_p2),
time_sec_p3 = ifelse(is.na(time_sec_p3), 0, time_sec_p3),
# Now compute the total time
time_min_total = (time_sec_p1 + time_sec_p2 + time_sec_p3) / 60
)
# 3.3 Drop anyone who finished in under the 10th percentile of completion times
<- quantile(choice_data_wider$time_min_total, 0.1)
time_10 <- choice_data_wider %>%
choice_data_wider filter(time_min_total >= time_10)
# 3.4 Drop respondents that got the attention check question wrong
# choice_data_wider <- choice_data_wider %>%
# filter(mc_cbc_practice == 'option_2')
# 4. Generate choice data csv file
# 4.1 Convert the data to long format
<- choice_data_wider %>%
choice_data pivot_longer(
cols = mc_cbc_1:mc_cbc_8,
names_to = "qID",
values_to = "choice") %>%
# Convert the qID variable to a number
mutate(qID = parse_number(qID))
# 4.2 Read in choice questions and join it to choice_data
<- read_csv("https://github.com/pingfan0727/Public-Resources/raw/main/datasets/cbc_questions_final.csv")
survey <- choice_data %>%
choice_data rename(respID = respondentID) %>%
left_join(survey, by = c("respID", "qID"), relationship = "many-to-many")
# 4.3 Convert choice column to 1 or 0 based on if the alternative was chosen
<- choice_data %>%
choice_data mutate(
choice = case_when(
str_detect(choice, "^option_") ~
as.numeric(str_replace(choice, "option_", "")),
== "not_interested" ~ 3,
choice TRUE ~ as.numeric(choice)
),choice = ifelse(choice == altID, 1, 0)
%>%
) select(-mc_cbc_practice, -cbc_all_same)
# 4.4 Create new values for respID & obsID
<- nrow(choice_data_wider)
nRespondents <- max(survey$altID)
nAlts <- max(survey$qID)
nQuestions $respID <- rep(seq(nRespondents), each = nAlts*nQuestions)
choice_data$obsID <- rep(seq(nRespondents*nQuestions), each = nAlts)
choice_data
# 4.5 Reorder columns - it's nice to have the "ID" variables first
<- choice_data %>%
choice_data select(ends_with("ID"), "choice", "upfront",
"lv_2_charger_Yes", "lv_2_charger_No",
"discount", "window", "guaranteed",
"no_choice", everything())
# 4.6 Clean up names for choice_data
<- clean_names(choice_data)
choice_data
# 4.7 Save the choice_data_wider for future use
save(
choice_data_wider,file = here("data", "choice_data_wider.RData")
)
# 4.8 Save cleaned data for modeling
write_csv(choice_data, here("data", "choice_data.csv"))
The cleaned choice_data.csv
file is saved to the data
folder.
The team followed Prof. Helveston’s instructions on dropping criteria, including those were:
- Incomplete
- Too fast
- Under 10th percentile of completion time
We intentionally kept those who answered the sample question wrong (which is obviously a dominant question), since by later data analysis we can see they answered the conjoint questions properly. It can be assumed that they were just not familiar with the conjoint question in the beginning.
3. Summary Statistics
This is instructed to be in the Sample Description subsection, but in fact can only be processed after Data Cleaning. The summary statistics will be shown in 3 parts:
- Vehicle Info Summary
- Charging Preference Summary
- Demographic Info Summary
3.1 Vehicle Info Summary
The vehicle info summary generates and illustrates the vehicle ownership (or leasing) of the users. Below is the code chunk:
Code
# 1.1 Load and pivot the data
load(here("data", "choice_data_wider.RData"))
<- choice_data_wider %>%
choice_data_vehicle pivot_longer(cols = c(mc_own_ev, mc_max_range, mc_commute,
mc_car_number, mc_secondary),names_to = 'category',
values_to = 'value')
# 1.2 Reorder and relabel the values based on category
<- choice_data_vehicle %>%
choice_data_vehicle mutate(
category = case_when(
== "mc_own_ev" ~ "Own EV",
category == "mc_max_range" ~ "Max Range",
category == "mc_car_number" ~ "Car Number",
category == "mc_secondary" ~ "Secondary Car",
category == "mc_commute" ~ "Commute",
category TRUE ~ category
),value = case_when(
== "Own EV" ~
category factor(value, levels = c("yes_own", "yes_lease", "no_but_next",
"no_but_interested", "no"),
labels = c("Own", "Lease", "Plan", "Interested", "No")),
== "Max Range" ~
category factor(value, levels = c(
"below_100", "range_100_to_149", "range_150_to_199",
"range_200_to_249", "range_250_to_299", "over_300"
),labels = c("100", "150", "200", "250", "300", ">300")),
== "Car Number" ~
category factor(value, levels = c("zero", "one", "two", "three", "four"),
labels = c("0", "1", "2", "3", "4")),
== "Secondary Car" ~
category factor(value, levels = c("yes_ev", "yes_gas", "yes_both", "no"),
labels = c("EV", "Gas", "Both", "Nil")),
== "Commute" ~
category factor(value, levels = c("on_foot", "by_bus", "by_train_subway",
"self_driving", "others"),
labels = c("On Foot", "By Bus", "By Train",
"Self-driving", "Others")),
TRUE ~ as.factor(value)
)
)
# 1.3 Plot the vehicle info summary
<- choice_data_vehicle %>%
vehicle_plot ggplot(aes(x = value)) +
geom_bar(width = 0.6) +
scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
facet_wrap(~fct_relevel(category, "Own EV", "Max Range",
"Car Number", "Secondary Car", "Commute"),
scales = "free_x", nrow = 3) +
labs(x = NULL,
y = "Count",
title = "Vehicle Info Summary Plots",
subtitle = "Summary Statistics Part 1") +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '1_2_vehicle_plot.png'),
plot = vehicle_plot,
width = 7,
height = 7 * 0.8
)
3.2 Charging Preference Summary
The charging preference summary generates and illustrates the workplace and home charging preferences of the users. Below is the code chunk:
Code
# 2.1 Pivot the data
<- choice_data_wider %>%
choice_data_charging pivot_longer(cols = c(mc_at_home, mc_at_work),
names_to = 'category',
values_to = 'value') %>%
separate_rows(value, sep = ",\\s*")
# 2.2 Reorder and relabel the values based on category
<- choice_data_charging %>%
choice_data_charging mutate(
category = case_when(
== "mc_at_home" ~ "At Home",
category == "mc_at_work" ~ "At Work",
category TRUE ~ category
),value = case_when(
== "At Home" ~
category factor(value, levels = c(
"home_morning", "home_afternoon", "home_evening",
"home_night", "no_home"),
labels = c("Morning", "Afternoon", "Evening",
"Night", "No")),
== "At Work" ~
category factor(value, levels = c(
"work_morning", "work_noon", "work_afternoon",
"work_evening", "work_night", "no_work"),
labels = c("Morning", "Noon", "Afternoon",
"Evening", "Night", "No")),
TRUE ~ as.factor(value)
)
)
# 2.3 Plot the charging preference summary
<- choice_data_charging %>%
charging_plot ggplot(aes(x = value)) +
geom_bar(width = 0.6) +
scale_x_discrete(limits = c("Morning", "Noon", "Afternoon",
"Evening", "Night", "No")) +
scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
facet_wrap(~fct_relevel(category, "At Home", "At Work"),
scales = "free_x", nrow = 2) +
labs(x = NULL,
y = "Count",
title = "Charging Preference Summary Plots",
subtitle = "Summary Statistics Part 2") +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '1_2_charging_plot.png'),
plot = charging_plot,
width = 6,
height = 6 * 0.8
)
3.3 Demographic Info Summary
The demographic info summary is broken into 3 parts:
- Part 1 - Gender, Race, Climate Caring, Party
- Part 2 - Education
- Part 3 - Income
Below is the code chunk for demographic part 1:
Code
# 3.1 First Part - gender, race, climate caring, and party
# 3.1.1 Pivot the data
<- choice_data_wider %>%
choice_data_demo mutate(across(c(mc_gender, mc_race, mc_climate, mc_party), as.character)) %>%
pivot_longer(cols = c(mc_gender, mc_race, mc_climate, mc_party),
names_to = 'category',
values_to = 'value') %>%
separate_rows(value, sep = ",\\s*")
# 3.1.2 Reorder and relabel the values based on category
<- choice_data_demo %>%
choice_data_demo mutate(
category = case_when(
== "mc_gender" ~ "Gender",
category == "mc_race" ~ "Race",
category == "mc_climate" ~ "Climate Caring",
category == "mc_party" ~ "Party",
category TRUE ~ category
),value = case_when(
== "Gender" ~
category factor(value, levels = c(
"male", "female", "transMale", "transFemale",
"genderNonconform", "others", "prefer_not_say"),
labels = c("Male", "Female", "Trans Male", "Trans Female",
"Genderqueer", "Others", "Not Say")),
== "Race" ~
category factor(value, levels = c(
"asian", "black", "white", "hispanic", "native",
"pacific", "others", "prefer_not_say"),
labels = c("Asian", "Black", "White", "Hispanic", "Native",
"Pacific", "Other", "Not Say")),
== "Party" ~
category factor(value, levels = c(
"democratic", "republican", "independent", "prefer_not_say"),
labels = c("Democratic", "Republican", "Independent", "Not say")),
TRUE ~ as.factor(value)
)
)
# 3.1.3 Plot the first part of demographic info summary
<- choice_data_demo %>%
demo_plot_1 ggplot(aes(x = value)) +
geom_bar(width = 0.6) +
scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
facet_wrap(~fct_relevel(category, "Gender", "Race", "Climate Caring", "Party"),
scales = "free_x", nrow = 2) +
labs(x = NULL,
y = "Count",
title = "Demographic Summary Plots Part 1",
subtitle = "Summary Statistics Part 3-1") +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '1_2_demo_plot_1.png'),
plot = demo_plot_1,
width = 7,
height = 7 * 0.8
)
Below is the code chunk for demographic part 2:
Code
# 3.2 Second Part - education
# 3.2.1 Pivot the data
<- choice_data_wider %>%
choice_data_demo_2 mutate(across(c(mc_education, mc_income), as.character)) %>%
pivot_longer(cols = c(mc_education, mc_income),
names_to = 'category',
values_to = 'value') %>%
separate_rows(value, sep = ",\\s*")
# 3.2.2 Reorder and relabel the values based on category
<- choice_data_demo_2 %>%
choice_data_demo_2 mutate(
category = case_when(
== "mc_education" ~ "Education",
category == "mc_income" ~ "Income",
category TRUE ~ category
),value = case_when(
== "Education" ~
category factor(value, levels = c(
"no_hs", "hs", "college_some", "vocational", "degree_associate",
"degree_bs", "degree_grad", "others", "prefer_not_say"),
labels = c("< High School", "High School", "Some College",
"Vocational", "AA/AS", "BA/BS", "Grad",
"Others", "Not Say")),
== "Income" ~
category factor(value, levels = c(
"under60", "inc_60to80", "inc_80to100", "inc_100to150",
"inc_150to200", "inc_200to250", "inc_250to300",
"inc_300to350", "inc_350to400", "inc_over400",
"prefer_not_say"),
labels = c("< $60K", "$60K to $80K", "$80K to $100K",
"$100K to $150K", "$150K to $200K",
"$200K to $250K", "$250K to $300K",
"$300K to $350K", "$350K to $400K",
"> $400K", "Not Say")),
TRUE ~ as.factor(value)
)
)
# 3.2.3 Plot the education distribution
<- choice_data_demo_2 %>%
demo_plot_2 filter(category == "Education") %>%
ggplot(aes(x = value)) +
geom_bar(width = 0.6) +
scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
labs(x = NULL,
y = "Count",
title = "Demographic Summary Plots Part 2 - Education",
subtitle = "Summary Statistics Part 3-2") +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '1_2_demo_plot_2.png'),
plot = demo_plot_2,
width = 6,
height = 6 / 1.618
)
Below is the code chunk for demographic part 3:
Code
# 3.3 Third Part - income
# 3.3.1 Plot the income distribution
<- choice_data_demo_2 %>%
demo_plot_3 filter(category == "Income") %>%
ggplot(aes(x = value)) +
geom_bar(width = 0.6) +
scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
labs(x = NULL,
y = "Count",
title = "Demographic Summary Plots Part 3 - Income",
subtitle = "Summary Statistics Part 3-3") +
coord_flip() +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '1_2_demo_plot_3.png'),
plot = demo_plot_3,
width = 6,
height = 6 / 1.618
)
3.4 Summary Statistics Data Viz
The summary plots are shown below in different tabs:
Interpretation of the summary statistics:
The vehicle info bar chart shows the majority of the test takers are EV owners. The max range of their EVs are mostly over 250 miles. They usually have at least 2 vehicles, with the other vehicle mostly being gas powered, which means EVs are probably used as a backup car. Most of the users drive to commute, which is reasonable.
The charging preference bar chart shows most people charges during evening and night time at home, but it’s harder to find charging places at work time.
The demographic plots show the basic information of the users. Male users is almost twice as females. Most of the users are white and supports the democratic party, and they do care about the environment very much, which encourages them to become EV users. The users mostly have bachelor’s degree, with moderate annual income of around $100k.
4. Modeling
The team attempted 5 models:
mnl_pref
- simple logit preference space model,mxl_pref
- mixed logit preference space model,mnl_pref_groups
- simple logit preference space model by groups,mnl_wtp
- simple logit WTP space model, andmxl_wtp
- mixed logit WTP space model.
4.1 Simple logit preference space
The mnl_pref
model is the starting point of utility modeling. Below is the code chunk:
Code
# 1. Model Summary
# 1.1 Use the logitr package to generate the model
<- read_csv(here("data", "choice_data.csv")) %>%
data_mnl_pref mutate(upfront = -1 * upfront / 100,
discount = discount / 10,
guaranteed = guaranteed / 10)
<- logitr(
mnl_pref data = data_mnl_pref,
outcome = "choice",
obsID = "obs_id",
pars = c("upfront", "lv_2_charger_yes", "discount",
"window", "guaranteed", "no_choice")
)
# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mnl_pref)
# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mnl_pref$gradient
# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mnl_pref$hessian)$values
# 2.3 Save the model
save(
mnl_pref,file = here("models", "mnl_pref.RData")
)
# 3. Model Equation
# 3.1 Estimated values and std err
<- data.frame(
table_mnl_pref Coefficient = c("β<sub>1</sub>", "β<sub>2</sub>",
"β<sub>3</sub>", "β<sub>4</sub>",
"β<sub>5</sub>", "β<sub>6</sub>"),
Meaning = c("Upfront", "Lv 2 Charger", "Discount",
"Window", "Guaranteed", "No Choice"),
Estimate = coef(mnl_pref),
StdError = se(mnl_pref),
Level = c("1, 3, 5", "-", "1, 2.5, 5",
"0.5, 1, 2, 4", "25, 50, 75", "-"),
Unit = c("100 USD", "-", "10%", "1 hr", "10%", "-")
)$Estimate <- round(table_mnl_pref$Estimate, 4)
table_mnl_pref$StdError <- round(table_mnl_pref$StdError, 4)
table_mnl_pref
<- table_mnl_pref %>%
mnl_pref_kable kable(format = 'html', escape = FALSE,
col.names = c("Coefficient", "Meaning", "Estimate", "Std Error",
"Level", "Unit"),
align = c('c', 'l', 'c', 'c', 'c', 'c'),
caption = "Summary of Model Coefficients") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# 3.2 Model figure
# 3.2.1 Construct the data frame
<- data.frame(
figure_mnl_pref Attribute = factor(
c("Upfront", "Charger", "Discount", "Window", "Guaranteed", "No Choice"),
levels = c("No Choice", "Guaranteed", "Window",
"Discount", "Charger", "Upfront")
),Estimate = coef(mnl_pref),
StdError = se(mnl_pref)
)
$LowerBound =
figure_mnl_pref$Estimate - 1.96 * figure_mnl_pref$StdError
figure_mnl_pref$UpperBound =
figure_mnl_pref$Estimate + 1.96 * figure_mnl_pref$StdError
figure_mnl_pref
# 3.2.2 Create the plot
<- figure_mnl_pref %>%
mnl_pref_plot ggplot(aes(x = Attribute, y = Estimate)) +
geom_point() +
geom_errorbar(aes(ymin = LowerBound, ymax = UpperBound), width = 0.2) +
labs(title = "Coefficient Estimates with Uncertainty Bounds",
subtitle = "Simple logit preference space",
x = "Attribute",
y = "Estimate") +
coord_flip() +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '2_1_mnl_pref_plot.png'),
plot = mnl_pref_plot,
width = 6,
height = 6 / 1.618
)
Below is the generated model coefficient table and plot for mnl_pref
:
Coefficient | Meaning | Estimate | Std Error | Level | Unit | |
---|---|---|---|---|---|---|
upfront | β1 | Upfront | -0.0151 | 0.0305 | 1, 3, 5 | 100 USD |
lv_2_charger_yes | β2 | Lv 2 Charger | 0.5884 | 0.0989 | - | - |
discount | β3 | Discount | 0.1657 | 0.0298 | 1, 2.5, 5 | 10% |
window | β4 | Window | 0.0835 | 0.0375 | 0.5, 1, 2, 4 | 1 hr |
guaranteed | β5 | Guaranteed | 0.2242 | 0.0253 | 25, 50, 75 | 10% |
no_choice | β6 | No Choice | 0.8345 | 0.2361 | - | - |
Utility equation of the mnl_pref
model:
\[ \begin{align} u_j &= \beta_1 x_j^{upfront} + \beta_2 \delta_j^{charger} + \beta_3 x_j^{discount} + \beta_4 x_j^{window} + \beta_5 x_j^{guaranteed} + \beta_6 x_j^{no} + \epsilon_j \\ u_j &= -0.02 x_j^{upfront} + 0.59 \delta_j^{charger} + 0.17 x_j^{discount} + 0.08 x_j^{window} + 0.22 x_j^{guaranteed} + 0.83 x_j^{no} + \epsilon_j \end{align} \]
Interpretation of the mnl_pref
model:
This is the single logit preference space model. It contains point estimates of each coefficient, along with the stand error values.
We start with the point estimates: the coefficients The team intentionally set the upfront
attribute as a negative number, so that in the model it shows a negative coefficient. However, this does mean that the upfront incentive is still “the more the better”. The other coefficients are all positive, meaning the users will tend to agree with either a larger value, or agree with a “YES” option.
The interesting finding is the “No Choice”: Users tend to simply NOT participate in the smart charging program. It is a reasonable result, since the users are sacrificing their rights, which is the reason why we need monetary incentives to them.
The standard errors reveal how the users are confident with each utility. For example, the attributes of upfront, discount, window, and guaranteed have small SE values, indicating the users’ high confidence when making decisions. The users are not quite sure if they want a free charger, or to participate in this program.
An important notice based on this very first model: the simple logit preference space model (and the one with groups) is the ONLY model that does not require multi-start, due to the fact of being a convex model with a single global maximum likelihood. All the multiple logit or WTP space models are non-convex and require multi-starts. We’ll explain more in their own sections.
4.2 Mixed logit preference space
This part is the 2nd model: mixed logit preference space, mxl_pref
. Below is the code chunk:
Code
# 1. Model Summary
# 1.1 Use the logitr package to generate the model
<- read_csv(here("data", "choice_data.csv")) %>%
data_mxl_pref mutate(upfront = -1 * upfront / 100,
discount = discount / 10,
guaranteed = guaranteed / 10)
set.seed(456)
<- logitr(
mxl_pref data = data_mxl_pref,
outcome = "choice",
obsID = "obs_id",
pars = c("upfront", "lv_2_charger_yes", "discount",
"window", "guaranteed", "no_choice"),
randPars = c(upfront = "n",
lv_2_charger_yes = "n",
discount = "n",
window = "n",
guaranteed = "n",
no_choice = "n"),
numMultiStarts = 30,
drawType = 'sobol',
numDraws = 200
)
# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mxl_pref)
# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mxl_pref$gradient
# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mxl_pref$hessian)$values
# 2.3 Save the model
save(
mxl_pref,file = here("models", "mxl_pref.RData")
)
# 3. Model Equation
# 3.1 Estimated values and std err
<- data.frame(
table_mxl_pref Coefficient = c("β<sub>1</sub>", "β<sub>2</sub>",
"β<sub>3</sub>", "β<sub>4</sub>",
"β<sub>5</sub>", "β<sub>6</sub>",
"SD β<sub>1</sub>", "SD β<sub>2</sub>",
"SD β<sub>3</sub>", "SD β<sub>4</sub>",
"SD β<sub>5</sub>", "SD β<sub>6</sub>"),
Meaning = c("Upfront", "Lv 2 Charger", "Discount",
"Window", "Guaranteed", "No Choice",
"SD Upfront", "SD Lv 2 Charger", "SD Discount",
"SD Window", "SD Guaranteed", "SD No Choice"),
Estimate = coef(mxl_pref),
StdError = se(mxl_pref),
Level = c("1, 3, 5", "-", "1, 2.5, 5",
"0.5, 1, 2, 4", "25, 50, 75", "-",
"-", "-", "-", "-", "-", "-"),
Unit = c("100 USD", "-", "10%", "hour", "10%", "-",
"-", "-", "-", "-", "-", "-")
)$Estimate <- round(table_mxl_pref$Estimate, 4)
table_mxl_pref$StdError <- round(table_mxl_pref$StdError, 4)
table_mxl_pref<- table_mxl_pref %>%
mxl_pref_kable kable(format = 'html', escape = FALSE,
col.names = c("Coefficient", "Meaning", "Estimate", "Std Error",
"Level", "Unit"),
align = c('c', 'l', 'c', 'c', 'l', 'l'),
caption = "Summary of Model Coefficients") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# 3.2 Model figure
# 3.2.1 Construct the data frame
<- data.frame(
figure_mxl_pref Attribute = factor(
c("Upfront", "Charger", "Discount", "Window", "Guaranteed", "No Choice"),
levels = c("No Choice", "Guaranteed", "Window",
"Discount", "Charger", "Upfront")
),Estimate = head(coef(mxl_pref), 6),
StdError = head(se(mxl_pref), 6)
)
$LowerBound =
figure_mxl_pref$Estimate - 1.96 * figure_mxl_pref$StdError
figure_mxl_pref$UpperBound =
figure_mxl_pref$Estimate + 1.96 * figure_mxl_pref$StdError
figure_mxl_pref
# 3.2.2 Create the plot
<- figure_mxl_pref %>%
mxl_pref_plot ggplot(aes(x = Attribute, y = Estimate)) +
geom_point() +
geom_errorbar(aes(ymin = LowerBound, ymax = UpperBound), width = 0.2) +
labs(title = "Coef Estimates with Uncertainty Bounds",
subtitle = "Mixed logit preference space",
x = "Attribute",
y = "Estimate") +
coord_flip() +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '2_2_mxl_pref_plot.png'),
plot = mxl_pref_plot,
width = 6,
height = 6 / 1.618
)
Below is the generated model coefficient table and plot for mxl_pref
:
Coefficient | Meaning | Estimate | Std Error | Level | Unit | |
---|---|---|---|---|---|---|
upfront | β1 | Upfront | -0.0151 | 0.0305 | 1, 3, 5 | 100 USD |
lv_2_charger_yes | β2 | Lv 2 Charger | 0.5882 | 0.1001 | - | - |
discount | β3 | Discount | 0.1655 | 0.0298 | 1, 2.5, 5 | 10% |
window | β4 | Window | 0.0835 | 0.0375 | 0.5, 1, 2, 4 | hour |
guaranteed | β5 | Guaranteed | 0.2241 | 0.0253 | 25, 50, 75 | 10% |
no_choice | β6 | No Choice | 0.8340 | 0.2359 | - | - |
sd_upfront | SD β1 | SD Upfront | 0.0002 | 0.2045 | - | - |
sd_lv_2_charger_yes | SD β2 | SD Lv 2 Charger | -0.0035 | 2.4494 | - | - |
sd_discount | SD β3 | SD Discount | 0.0030 | NaN | - | - |
sd_window | SD β4 | SD Window | 0.0007 | NaN | - | - |
sd_guaranteed | SD β5 | SD Guaranteed | -0.0006 | NaN | - | - |
sd_no_choice | SD β6 | SD No Choice | 0.0050 | NaN | - | - |
Interpretation of the mxl_pref
model:
This is the mixed logit preference space model. The team applied multi-start and saved the result using the set.seed()
function. This model has similar coefficients with the simple logit model, and provided the standard deviation values of each coefficient.
The standard deviation values indicate that the coefficient of each utility follows normal distribution and has a span of its estimated value. The span (aka SD) is not high compared with the estimated values.
4.3 Simple logit preference space with groups
This part is the 3rd model: simple logit preference space with groups, mnl_pref_groups
. Below is the code chunk:
Code
# 1. Model Summary
# 1.1 Use the logitr package to generate the model
<- read_csv(here("data", "choice_data.csv")) %>%
data_mnl_pref_groups mutate(upfront = -1 * upfront / 100,
discount = discount / 10,
guaranteed = guaranteed / 10) %>%
mutate(group_A = ifelse(mc_income == "under60", 1, 0),
group_B = ifelse(mc_income != "under60", 1, 0)) %>%
mutate(upfront_B = upfront * group_B,
lv_2_charger_yes_B = lv_2_charger_yes * group_B,
discount_B = discount * group_B,
window_B = window * group_B,
guaranteed_B = guaranteed * group_B,
no_choice_B = no_choice * group_B)
<- logitr(
mnl_pref_groups data = data_mnl_pref_groups,
outcome = "choice",
obsID = "obs_id",
pars = c("upfront", "lv_2_charger_yes", "discount",
"window", "guaranteed", "no_choice",
"upfront_B", "lv_2_charger_yes_B", "discount_B",
"window_B", "guaranteed_B", "no_choice_B")
)
# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mnl_pref_groups)
# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mnl_pref_groups$gradient
# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mnl_pref_groups$hessian)$values
# 2.3 Save the model
save(
mnl_pref_groups,file = here("models", "mnl_pref_groups.RData")
)
# 3. Model Equation
# 3.1 Estimated values and std err
<- data.frame(
table_mnl_pref_groups Coefficient = c("β<sub>1</sub>", "β<sub>2</sub>",
"β<sub>3</sub>", "β<sub>4</sub>",
"β<sub>5</sub>", "β<sub>6</sub>",
"β<sub>7</sub>", "β<sub>8</sub>",
"β<sub>9</sub>", "β<sub>10</sub>",
"β<sub>11</sub>", "β<sub>12</sub>"),
Meaning = c("Upfront", "Lv 2 Charger", "Discount",
"Window", "Guaranteed", "No Choice",
"Upfront B", "Lv 2 Charger B", "Discount B",
"Window B", "Guaranteed B", "No Choice B"),
Estimate = coef(mnl_pref_groups),
StdError = se(mnl_pref_groups),
Level = c("1, 3, 5", "-", "1, 2.5, 5",
"0.5, 1, 2, 4", "25, 50, 75", "-",
"1, 3, 5", "-", "1, 2.5, 5",
"0.5, 1, 2, 4", "25, 50, 75", "-"),
Unit = c("100 USD", "-", "10%", "hour", "10%", "-",
"100 USD", "-", "10%", "hour", "10%", "-")
)$Estimate <- round(table_mnl_pref_groups$Estimate, 4)
table_mnl_pref_groups$StdError <- round(table_mnl_pref_groups$StdError, 4)
table_mnl_pref_groups<- table_mnl_pref_groups %>%
mnl_pref_groups_kable kable(format = 'html', escape = FALSE,
col.names = c("Coefficient", "Meaning", "Estimate", "Std Error",
"Level", "Unit"),
align = c('c', 'l', 'c', 'c', 'l', 'l'),
caption = "Summary of Model Coefficients") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# 3.2 Model figure
# 3.2.1 Construct the data frame
<- data.frame(
figure_mnl_pref_groups Attribute = factor(
c("Upfront", "Charger", "Discount", "Window", "Guaranteed", "No Choice",
"Upfront B", "Charger B", "Discount B",
"Window B", "Guaranteed B", "No Choice B"),
levels = c("No Choice B", "Guaranteed B", "Window B",
"Discount B", "Charger B", "Upfront B",
"No Choice", "Guaranteed", "Window",
"Discount", "Charger", "Upfront")
),Estimate = coef(mnl_pref_groups),
StdError = se(mnl_pref_groups)
)
$LowerBound =
figure_mnl_pref_groups$Estimate - 1.96 * figure_mnl_pref_groups$StdError
figure_mnl_pref_groups$UpperBound =
figure_mnl_pref_groups$Estimate + 1.96 * figure_mnl_pref_groups$StdError
figure_mnl_pref_groups
# 3.2.2 Create the plot
<- figure_mnl_pref_groups %>%
mnl_pref_groups_plot ggplot(aes(x = Attribute, y = Estimate)) +
geom_point() +
geom_errorbar(aes(ymin = LowerBound, ymax = UpperBound), width = 0.2) +
labs(title = "Coefficient Estimates with Uncertainty Bounds",
subtitle = "Simple logit preference space with groups",
x = "Attribute",
y = "Estimate") +
coord_flip() +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '2_3_mnl_pref_groups_plot.png'),
plot = mnl_pref_groups_plot,
width = 6,
height = 6 / 1.618
)
# 4. Hypothesis Test
# 4.1 Perform the simulation
<- coef(mnl_pref_groups)
beta <- mnl_pref_groups$hessian
hessian = -1*(solve(hessian))
covariance = as.data.frame(MASS::mvrnorm(10^5, beta, covariance))
draws
# 4.2 Define the coefficients for groups A and B
<- draws$upfront
b1_A <- draws$lv_2_charger_yes
b2_A <- draws$discount
b3_A <- draws$window
b4_A <- draws$guaranteed
b5_A <- draws$no_choice
b6_A
<- draws$upfront + draws$upfront_B
b1_B <- draws$lv_2_charger_yes + draws$lv_2_charger_yes_B
b2_B <- draws$discount + draws$discount_B
b3_B <- draws$window + draws$window_B
b4_B <- draws$guaranteed + draws$guaranteed_B
b5_B <- draws$no_choice + draws$no_choice_B
b6_B
# 4.3 Construct the data frame
= list(b1_A, b2_A, b3_A, b4_A, b5_A, b6_A,
coefficients
b1_B, b2_B, b3_B, b4_B, b5_B, b6_B)
<- data.frame(
draw_sim_df Coefficient = c("b1_A", "b2_A", "b3_A", "b4_A", "b5_A", "b6_A",
"b1_B", "b2_B", "b3_B", "b4_B", "b5_B", "b6_B"),
Mean = sapply(coefficients, mean),
LowerQuantile = sapply(coefficients, function(x) quantile(x, probs = 0.025)),
UpperQuantile = sapply(coefficients, function(x) quantile(x, probs = 0.975))
)
# 4.4 Sketch the plot
<- draw_sim_df %>%
hypothesis_test_plot ggplot(aes(x = Coefficient, y = Mean)) +
geom_point() +
geom_errorbar(aes(ymin = LowerQuantile, ymax = UpperQuantile), width = 0.2) +
labs(title = "Hypothesis Test on Groups A and B",
x = "Coefficient",
y = "Values") +
coord_flip() +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '2_3_hypothesis_test_plot.png'),
plot = hypothesis_test_plot,
width = 6,
height = 6 / 1.618
)
Below is the generated model coefficient table and plot for mnl_pref_groups
:
Coefficient | Meaning | Estimate | Std Error | Level | Unit | |
---|---|---|---|---|---|---|
upfront | β1 | Upfront | 0.0795 | 0.0812 | 1, 3, 5 | 100 USD |
lv_2_charger_yes | β2 | Lv 2 Charger | 0.0577 | 0.2587 | - | - |
discount | β3 | Discount | 0.1084 | 0.0759 | 1, 2.5, 5 | 10% |
window | β4 | Window | 0.1789 | 0.0957 | 0.5, 1, 2, 4 | hour |
guaranteed | β5 | Guaranteed | 0.2391 | 0.0645 | 25, 50, 75 | 10% |
no_choice | β6 | No Choice | 0.7396 | 0.5704 | - | - |
upfront_B | β7 | Upfront B | -0.1155 | 0.0877 | 1, 3, 5 | 100 USD |
lv_2_charger_yes_B | β8 | Lv 2 Charger B | 0.6234 | 0.2803 | - | - |
discount_B | β9 | Discount B | 0.0678 | 0.0826 | 1, 2.5, 5 | 10% |
window_B | β10 | Window B | -0.1069 | 0.1041 | 0.5, 1, 2, 4 | hour |
guaranteed_B | β11 | Guaranteed B | -0.0128 | 0.0702 | 25, 50, 75 | 10% |
no_choice_B | β12 | No Choice B | 0.1405 | 0.6275 | - | - |
Interpretation of the mnl_pref_groups
model:
This is the simple logit preference space model with groups. This model splits the users into based on their annual income: below or above $60k. The results of the hypothesis test are NOT significant, since every single coefficient has obvious overlapping.
The team continued with other grouping techniques, including gender, party, and education. None of them revealed significant differences. It is a proof that this survey result might be generated by professional test takers with pre-set personas, that their results look convincing by individual questions, but are NOT consistent in the whole picture.
4.4 Simple logit WTP space
This part is the 4th model: simple logit WTP space, mnl_wtp
. Below is the code chunk:
Code
# 1. Model Summary
# 1.1 Use the logitr package to generate the model
<- read_csv(here("data", "choice_data.csv")) %>%
data_mnl_pref mutate(upfront = -1 * upfront / 100,
discount = discount / 10,
guaranteed = guaranteed / 10)
set.seed(234)
<- logitr(
mnl_wtp data = data_mnl_pref,
outcome = "choice",
obsID = "obs_id",
pars = c("lv_2_charger_yes", "discount",
"window", "guaranteed", "no_choice"),
scalePar = "upfront",
numMultiStarts = 30
)
# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mnl_wtp)
# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mnl_wtp$gradient
# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mnl_wtp$hessian)$values
# 2.3 Compare preference vs WTP spaces
load(here("models", "mnl_pref.RData"))
<- wtpCompare(mnl_pref, mnl_wtp, scalePar = 'upfront')
mnl_wtp_compare <- mnl_wtp_compare %>% kable(
mnl_wtp_compare_kable format = 'html', escape = FALSE,
col.names = c("Coefficient", "mnl_pref Space",
"mnl_wtp Space", "Difference"),
caption = "Model Coefficients Compare") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# 2.4 Save the model
save(
mnl_wtp,file = here("models", "mnl_wtp.RData")
)
Below is the comparison between mnl_pref
and mnl_wtp
:
Coefficient | mnl_pref Space | mnl_wtp Space | Difference |
---|---|---|---|
scalePar | 0.0150763 | 0.01484121 | -0.00023509 |
lv_2_charger_yes | 39.0272499 | 39.66775455 | 0.64050465 |
discount | 10.9895955 | 11.17393094 | 0.18433547 |
window | 5.5370587 | 5.62073081 | 0.08367215 |
guaranteed | 14.8688168 | 15.10740414 | 0.23858730 |
no_choice | 55.3548416 | 56.16701223 | 0.81217065 |
logLik | -880.1725938 | -880.17269689 | -0.00010313 |
Interpretation of the mnl_wtp
model:
This is the simple logit WTP space model. This WTP model directly generates the willingness-to-pay coefficients. The wtpCompare()
function can be used to compare mnl_pref
and mnl_wtp
. The result shows a very low difference, indicating the success of our algorithm, that a local maximum likelihood close to global maximum is captured.
4.5 Mixed logit WTP space
This part is the last model: mixed logit WTP space, mxl_wtp
. Below is the code chunk:
Code
# 1. Model Summary
# 1.1 Use the logitr package to generate the model
<- read_csv(here("data", "choice_data.csv")) %>%
data_mxl_pref mutate(upfront = -1 * upfront / 100,
discount = discount / 10,
guaranteed = guaranteed / 10)
set.seed(89)
<- logitr(
mxl_wtp data = data_mxl_pref,
outcome = "choice",
obsID = "obs_id",
pars = c("lv_2_charger_yes", "discount",
"window", "guaranteed", "no_choice"),
scalePar = "upfront",
randPars = c(lv_2_charger_yes = "n",
discount = "n",
window = "n",
guaranteed = "n",
no_choice = "n"),
numMultiStarts = 30
)
# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mxl_wtp)
# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mxl_wtp$gradient
# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mxl_wtp$hessian)$values
# 2.3 Save the coefficients
<- head(coef(mxl_wtp), 6)
coef_mxl_wtp <- as.data.frame(coef_mxl_wtp)
coef_mxl_wtp_df
<- cbind(coef_mxl_wtp_df, head(mnl_wtp_compare, 6)) %>%
mxl_wtp_compare select(pref, coef_mxl_wtp, difference) %>%
mutate(difference = coef_mxl_wtp - as.numeric(pref))
<- mxl_wtp_compare %>% kable(
mxl_wtp_compare_kable format = 'html', escape = FALSE,
col.names = c("Coefficient", "mnl_pref Space",
"mxl_wtp Space", "Difference"),
caption = "Model Coefficients Compare") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# 2.4 Save the model
save(
mxl_wtp,file = here("models", "mxl_wtp.RData")
)
Here we reveal the coefficients in this model. The team has made a great many numbers of attempts and captured this result. It’s not perfect but close enough:
Coefficient | mnl_pref Space | mxl_wtp Space | Difference |
---|---|---|---|
scalePar | 0.0150763 | 0.0143826 | -0.0006937 |
lv_2_charger_yes | 39.0272499 | 51.8474599 | 12.8202100 |
discount | 10.9895955 | 13.8418559 | 2.8522604 |
window | 5.5370587 | 4.9340397 | -0.6030190 |
guaranteed | 14.8688168 | 17.8647569 | 2.9959401 |
no_choice | 55.3548416 | 56.3146482 | 0.9598066 |
Interpretation of the mxl_wtp
model:
This is the mixed logit WTP space model. to reveal the complete image of the model, run summary(mxl_wtp)
. The team made attempts to find a good local maximum and found a (not perfect) result.
Results
1. WTP Analysis
Different from 4.4 Simple logit WTP space, which directly generates the WTP model, this part calculates WTP from the simple preference space model generated in 4.1 Simple logit preference space. Below is the code chunk:
Code
# 1. Load the mnl_pref model
load(here("models", "mnl_pref.RData"))
# Skipped here. Please refer to the R file in the data folder.
# summary(mnl_pref)
# 2. Compute WTP estimates
<- coef(mnl_pref)
coefs_mnl_pref <- coefs_mnl_pref / (-1*coefs_mnl_pref['upfront'])
wtp_mnl_pref
# 3. Simulate WTP with uncertainty
<- vcov(mnl_pref)
cov_mnl_pref set.seed(3)
<- as.data.frame(
coef_draws_mnl_pref ::mvrnorm(10^6, coefs_mnl_pref, cov_mnl_pref)
MASS
)<- -1*(coef_draws_mnl_pref[,2:6] / coef_draws_mnl_pref[,1])
wtp_draws_mnl_pref <- ci(wtp_draws_mnl_pref)
wtp_ci_mnl_pref
# 4. Create kables for WTP with uncertainty
<- wtp_mnl_pref %>% kable(
wtp_mnl_pref_kable format = 'html', escape = FALSE,
col.names = c("Attribute", "Coefficient"),
caption = "Computed WTP") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
<- wtp_ci_mnl_pref %>% kable(
wtp_ci_mnl_pref_kable format = 'html', escape = FALSE,
col.names = c("Attribute", "Mean", "Lower", "Upper"),
caption = "Simulated WTP with CI") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# 5. Save WTP with uncertainty
save(
wtp_mnl_pref,
wtp_ci_mnl_pref,file = here("data", "wtp_from_pref.RData")
)
The computed and simulated WTP values are shown below:
Attribute | Coefficient |
---|---|
upfront | -1.000000 |
lv_2_charger_yes | 39.027250 |
discount | 10.989596 |
window | 5.537059 |
guaranteed | 14.868817 |
no_choice | 55.354842 |
Attribute | Mean | Lower | Upper |
---|---|---|---|
lv_2_charger_yes | 35.184767 | -267.42093 | 281.31683 |
discount | 11.800483 | -74.84438 | 78.77620 |
window | 5.304263 | -38.33406 | 40.24955 |
guaranteed | 14.546457 | -101.89208 | 107.08355 |
no_choice | 44.555414 | -353.89815 | 378.77140 |
Interpretation of WTP analysis:
The WTP values in this section are from the preference space model, either calculated or simulated. It is the same as the WTP values in the WTP model.
2. Market Simulation
The market simulation is based on the simple logit preference space model, mnl_pref
. Two market simulations were performed:
- Single market simulation with only one scenario considered. In our case, we did a “Monetary” case vs a “Flexibility” case, in which the users are provided with either the best monetary incentives (highest upfront, free charger, and best electricity price discount) or the best flexibility (highest override time interval and largest amount of guaranteed battery percentage).
- Multiple market simulation with 9 different scenarios. In our case, we considered: polarized, incentive only, charger only, discount only, window only, guaranteed only, monetary vs flexibility (same as single simulation), monetary, and flexibility.
Below is the code chunk:
Code
# 1. Load the mnl_pref model
load(here("models", "mnl_pref.RData"))
# 2. Market Simulation
# 2.1 Single market simulation
<- data.frame(
scenarios_sing alt_id = c(1, 2, 3),
obs_id = c(1, 1, 1),
upfront = c(-5, -1, 0),
lv_2_charger_yes = c(1, 0, 0),
discount = c(5, 1, 0),
window = c(1, 4, 0),
guaranteed = c(2.5, 7.5, 0),
no_choice = c(0, 0, 1)
)
<- predict(
sim_mnl_pref_sing
mnl_pref,newdata = scenarios_sing,
obsID = 'obs_id',
level = 0.95,
interval = 'confidence',
returnData = TRUE
)
# 2.2 Multiple simulations
<- read_csv(here("data", "scenarios.csv"))
scenarios_mult
<- predict(
sim_mnl_pref_mult
mnl_pref,newdata = scenarios_mult,
obsID = 'obs_id',
level = 0.95,
interval = 'confidence',
returnData = TRUE
)
# 2.3 Save simulations
save(
sim_mnl_pref_sing,
sim_mnl_pref_mult,file = here("sims", "sim_mnl_pref.RData")
)
# 3. Market Simulation Plots
# 3.1 Single market simulation plot
<- sim_mnl_pref_sing %>%
sim_sing_plot mutate(label = factor(c("Monetary", "Flexibility", "Not Enroll"),
levels = c("Monetary",
"Flexibility",
"Not Enroll"))) %>%
ggplot(aes(
x = label, y = predicted_prob,
ymin = predicted_prob_lower, ymax = predicted_prob_upper)) +
geom_col(fill = "grey", width = 0.6) +
geom_errorbar(width = 0.3) +
scale_y_continuous(limits = c(0, 1), labels = scales::percent) +
labs(x = 'Options',
y = 'Probability',
title = 'Single Market Simulation Plot',
subtitle = 'Monetary vs Flexibility') +
theme_bw(base_family = 'Ubuntu')
ggsave(
filename = here('figs_manu', '3_2_sim_sing_plot.png'),
plot = sim_sing_plot,
width = 6,
height = 6 / 1.618
)
# 3.2 Multiple market simulation plot
<- setNames(c(
scenarios_mult_labels "Polarized", "Incentive Only", "Charger Only", "Discount Only",
"Window Only", "Guaranteed Only", "Monetary vs Flexibility",
"Monetary", "Flexibility"), 1:9)
<- sim_mnl_pref_mult %>%
sim_mult_plot ggplot(aes(
x = as.factor(alt_id), y = predicted_prob,
ymin = predicted_prob_lower, ymax = predicted_prob_upper)) +
geom_col(fill = "grey", width = 0.6) +
geom_errorbar(width = 0.3) +
facet_wrap(~obs_id, labeller = labeller(obs_id = scenarios_mult_labels)) +
scale_x_discrete(labels = c("1" = "1", "2" = "2", "3" = "No")) +
scale_y_continuous(limits = c(0, 1), labels = scales::percent) +
labs(x = 'Options',
y = 'Probability',
title = 'Multiple Market Simulation Plot') +
theme_bw(base_family = 'Ubuntu')
ggsave(
filename = here('figs_manu', '3_2_sim_mult_plot.png'),
plot = sim_mult_plot,
width = 7,
height = 7 * 0.8
)
The datasets are all saved to the sims
folder, and the plots are shown below:
Note: The single market simulation is case 7 of multiple market simulation.
Cases of the multiple market simulation plots:
No. | Option 1 | Option 2 | No Option |
---|---|---|---|
1 | Best levels of all attributes | Lowest level for all | No participation |
2 | Best upfront incentive | Lowest level for all | No participation |
3 | Free charger | Lowest level for all | No participation |
4 | Best electricity price discount | Lowest level for all | No participation |
5 | Best override window | Lowest level for all | No participation |
6 | Best guaranteed battery | Lowest level for all | No participation |
7 | Best upfront, discount, and a free charger | Best window and guaranteed battery | No participation |
8 | Best upfront, discount, and a free charger | Lowest level for all | No participation |
9 | Best window and guaranteed battery | Lowest level for all | No participation |
3. Market Sensitivity
The market sensitivity is also generated with respect to the simple logit preference model, mnl_pref
. This part consists of the following:
- Sensitivity of market share to changes in “upfront”, where there are 2 options. In the baseline, option 1 is set as a no option, and option 2 is provided with moderate values. We evaluate the sensitivity by changing the upfront value in option 2.
- Sensitivity of market share to changes in multiple attributes, where we study the sensitivity of all attributes.
- Plots of market share and cost sensitivity to the change of the “upfront” attribute. We assume a market size of 1000 users.
- Plot of market share sensitivity to the change of multiple attributes. This is called the “Tornado Plot”, and is generated using the
ggplot
-like function in Prof John Helveston’s package,jph::ggtornado()
.
The code scripts are broken into 2 parts: sensitivity calculation and plot generation.
Below is the sensitivity calculation code chunk:
Code
# 1. Load the mnl_pref model
load(here("models", "mnl_pref.RData"))
# 2. Sensitivity Analysis
# 2.1 Sensitivity of market share to changes in "upfront"
# Set a baseline scenario
# Option 1 is a no option
# Option 2 is provided with moderate values
# We will change the upfront values for Option 2
<- data.frame(
baseline alt_id = c(1, 2),
obs_id = c(1, 1),
upfront = c(0, -2),
lv_2_charger_yes = c(0, 1),
discount = c(0, 2.5),
window = c(0, 2),
guaranteed = c(0, 5),
no_choice = c(1, 0)
)
<- baseline %>% kable(
baseline_kable format = 'html', escape = FALSE,
caption = "Single Attribute Baseline") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# Set the sequence of upfront levels from 1000 to 100 USD
<- seq(-10, -1, by = 1)
upfront_levels
# Count the numbers of upfront scenarios
<- length(upfront_levels)
upfront_numbers
# Create scenarios with different upfront levels in option 2
<- do.call(
scenario_upfront replicate(upfront_numbers, baseline, simplify = FALSE)
rbind,
)$obs_id <- rep(seq(upfront_numbers), each = 2)
scenario_upfront$upfront[which(scenario_upfront$alt_id == 2)] <- upfront_levels
scenario_upfront
# Simulate the upfront sensitivity
<- predict(
sens_upfront
mnl_pref,newdata = scenario_upfront,
obsID = 'obs_id',
level = 0.95,
interval = 'confidence',
returnData = TRUE
%>%
) # Keep only option 2
filter(alt_id == 2) %>%
# Keep only upfront and predictions
select(upfront, starts_with("predicted_"))
<- sens_upfront %>% kable(
sens_upfront_kable format = 'html', escape = FALSE,
caption = "Upfront Sensitivity Table") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# 2.2 Sensitivity of market share to changes in multiple attributes
# We continue to work on Option 2
# This time, we modify all continuous attributes
# The attributes are: upfront, discount, window, and guaranteed
# Construct the attribute_levels, like we constructed upfront_levels
# But attribute_levels is a df, while upfront_levels is a sequence
<- tribble(
attribute_levels ~obs_id, ~alt_id, ~attribute, ~case, ~value,
2, 2, 'upfront', 'high', -5,
3, 2, 'upfront', 'low', -1,
4, 2, 'discount', 'high', 5,
5, 2, 'discount', 'low', 1,
6, 2, 'window', 'high', 4,
7, 2, 'window', 'low', 0.5,
8, 2, 'guaranteed', 'high', 7.5,
9, 2, 'guaranteed', 'low', 2.5
)
<- attribute_levels %>%
attribute_levels_kable kable(format = 'html', escape = FALSE,
col.names = c("Obs ID", "Alt ID", "Attribute",
"Case", "Value"),
caption = "Attribute Levels") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# Define the numbers of different scenarios
<- 9
scenario_numbers
# Create scenarios with different upfront levels in option 2
<- do.call(
scenario_table replicate(scenario_numbers, baseline, simplify = FALSE)
rbind,
)$obs_id <- rep(seq(scenario_numbers), each = 2)
scenario_table<- scenario_table %>%
scenario_table left_join(attribute_levels, by = c("alt_id", "obs_id")) %>%
mutate(
attribute = ifelse(is.na(attribute), "other", attribute),
case = ifelse(is.na(case), "base", case),
upfront = ifelse(attribute == 'upfront', value, upfront),
discount = ifelse(attribute == 'discount', value, discount),
window = ifelse(attribute == 'window', value, window),
guaranteed = ifelse(attribute == 'guaranteed', value, guaranteed)
)
<- scenario_table %>%
scenario_kable kable(format = 'html', escape = FALSE,
col.names = c("Alt ID", "Obs ID", "Upfront", "Charger",
"Discount", "Window", "Guaranteed",
"No Choice", "Attribute", "Case", "Value"),
caption = "Scenario Table") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# Simulate the multiple change sensitivity
<- predict(
sens_multiple
mnl_pref,newdata = scenario_table,
obsID = 'obs_id',
level = 0.95,
interval = 'confidence',
returnData = TRUE
%>%
) # Keep only option 2
filter(alt_id == 2) %>%
# Keep only upfront and predictions
select(upfront, attribute, case, value, predicted_prob)
# Save simulations
save(
sens_upfront,
sens_multiple,file = here("sims", "sens_mnl_pref.RData")
)
Below is the code chunk for generating the plots:
Code
# 3. Sensitivity Plots
# 3.1 Plot of market share change to "upfront"
<- sens_upfront %>%
sens_upfront_plot ggplot(aes(x = upfront * -100,
y = predicted_prob,
ymin = predicted_prob_lower,
ymax = predicted_prob_upper)) +
geom_ribbon(alpha = 0.2) +
geom_line(linetype = "dashed") +
geom_line(
data = sens_upfront %>% filter(upfront <= -1, upfront >= -5),
linetype = "solid") +
expand_limits(x = c(-10, -1), y = c(0, 1)) +
labs(x = 'Upfront (USD)',
y = 'Market Share',
title = 'Sensitivity of market share change to upfront') +
theme_bw(base_family = "Ubuntu")
# Save plot
ggsave(
filename = here('figs_manu', '3_3_sens_upfront_plot.png'),
plot = sens_upfront_plot,
width = 6,
height = 6 / 1.618
)
# 3.2 Plot of policy cost change to "upfront"
# Assume a market size of 1000 users
<- 1000
market_size <- sens_upfront %>%
sens_cost mutate(
cost_mean = upfront*market_size*predicted_prob*-100 / 1000,
cost_low = upfront*market_size*predicted_prob_lower*-100 / 1000,
cost_high = upfront*market_size*predicted_prob_upper*-100 / 1000)
<- sens_cost %>%
sens_cost_plot ggplot(aes(x = upfront * -100,
y = cost_mean,
ymin = cost_low,
ymax = cost_high)) +
geom_ribbon(alpha = 0.2) +
geom_line(linetype = "dashed") +
geom_line(
data = sens_cost %>% filter(upfront <= -1, upfront >= -5),
linetype = "solid") +
expand_limits(x = c(-10, -1), y = c(0, 1)) +
labs(x = 'Upfront (USD)',
y = 'Cost ($1000)',
title = 'Sensitivity of policy cost change to upfront') +
theme_bw(base_family = "Ubuntu")
# Save plot
ggsave(
filename = here('figs_manu', '3_3_sens_cost_plot.png'),
plot = sens_cost_plot,
width = 6,
height = 6 / 1.618
)
# 3.2 Tornado Plot of market share change to multiple attributes
<- data.frame(
labels attribute = c('upfront', 'discount', 'window', 'guaranteed'),
label = c('Upfront Incentive (USD)',
'Electricity Discount (%)',
'Override Window (hrs)',
'Guaranteed Battery (%)')
)
<- sens_multiple %>%
tornado_data select(-upfront) %>%
filter(case != 'base') %>%
left_join(labels, by = 'attribute') %>%
mutate(value = ifelse(attribute == "upfront", -100*value, value),
value = ifelse(attribute == "discount" | attribute == "guaranteed",
10*value, value))
<- jph::ggtornado(
tornado_plot data = tornado_data,
baseline = sens_multiple$predicted_prob[1],
var = 'label',
level = 'case',
value = 'value',
result = 'predicted_prob'
+
) scale_fill_manual(values = c("#67a9cf", "#ef8a62")) +
labs(x = 'Market Share',
y = 'Attribute',
title = 'Plot of market share change to multiple attributes') +
theme_bw(base_family = 'Ubuntu') +
theme(legend.position = 'none')
# Save plot
ggsave(
filename = here('figs_manu', '3_3_tornado_plot.png'),
plot = tornado_plot,
width = 6,
height = 6 / 1.618
)
The sensitivity results of single attribute (upfront) change are illustrated below:
alt_id | obs_id | upfront | lv_2_charger_yes | discount | window | guaranteed | no_choice |
---|---|---|---|---|---|---|---|
1 | 1 | 0 | 0 | 0.0 | 0 | 0 | 1 |
2 | 1 | -2 | 1 | 2.5 | 2 | 5 | 0 |
upfront | predicted_prob | predicted_prob_lower | predicted_prob_upper | |
---|---|---|---|---|
2 | -10 | 0.8329388 | 0.7577337 | 0.8886528 |
4 | -9 | 0.8308303 | 0.7643835 | 0.8820927 |
6 | -8 | 0.8287008 | 0.7702253 | 0.8749223 |
8 | -7 | 0.8265500 | 0.7754630 | 0.8679891 |
10 | -6 | 0.8243779 | 0.7800687 | 0.8616038 |
12 | -5 | 0.8221845 | 0.7833330 | 0.8557803 |
14 | -4 | 0.8199697 | 0.7844276 | 0.8511014 |
16 | -3 | 0.8177334 | 0.7837099 | 0.8478217 |
18 | -2 | 0.8154755 | 0.7792340 | 0.8465033 |
20 | -1 | 0.8131961 | 0.7728263 | 0.8474078 |
The sensitivity results of multiple attributes change are illustrated below:
Obs ID | Alt ID | Attribute | Case | Value |
---|---|---|---|---|
2 | 2 | upfront | high | -5.0 |
3 | 2 | upfront | low | -1.0 |
4 | 2 | discount | high | 5.0 |
5 | 2 | discount | low | 1.0 |
6 | 2 | window | high | 4.0 |
7 | 2 | window | low | 0.5 |
8 | 2 | guaranteed | high | 7.5 |
9 | 2 | guaranteed | low | 2.5 |
Alt ID | Obs ID | Upfront | Charger | Discount | Window | Guaranteed | No Choice | Attribute | Case | Value |
---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | 0 | 0 | 0.0 | 0.0 | 0.0 | 1 | other | base | NA |
2 | 1 | -2 | 1 | 2.5 | 2.0 | 5.0 | 0 | other | base | NA |
1 | 2 | 0 | 0 | 0.0 | 0.0 | 0.0 | 1 | other | base | NA |
2 | 2 | -5 | 1 | 2.5 | 2.0 | 5.0 | 0 | upfront | high | -5.0 |
1 | 3 | 0 | 0 | 0.0 | 0.0 | 0.0 | 1 | other | base | NA |
2 | 3 | -1 | 1 | 2.5 | 2.0 | 5.0 | 0 | upfront | low | -1.0 |
1 | 4 | 0 | 0 | 0.0 | 0.0 | 0.0 | 1 | other | base | NA |
2 | 4 | -2 | 1 | 5.0 | 2.0 | 5.0 | 0 | discount | high | 5.0 |
1 | 5 | 0 | 0 | 0.0 | 0.0 | 0.0 | 1 | other | base | NA |
2 | 5 | -2 | 1 | 1.0 | 2.0 | 5.0 | 0 | discount | low | 1.0 |
1 | 6 | 0 | 0 | 0.0 | 0.0 | 0.0 | 1 | other | base | NA |
2 | 6 | -2 | 1 | 2.5 | 4.0 | 5.0 | 0 | window | high | 4.0 |
1 | 7 | 0 | 0 | 0.0 | 0.0 | 0.0 | 1 | other | base | NA |
2 | 7 | -2 | 1 | 2.5 | 0.5 | 5.0 | 0 | window | low | 0.5 |
1 | 8 | 0 | 0 | 0.0 | 0.0 | 0.0 | 1 | other | base | NA |
2 | 8 | -2 | 1 | 2.5 | 2.0 | 7.5 | 0 | guaranteed | high | 7.5 |
1 | 9 | 0 | 0 | 0.0 | 0.0 | 0.0 | 1 | other | base | NA |
2 | 9 | -2 | 1 | 2.5 | 2.0 | 2.5 | 0 | guaranteed | low | 2.5 |
Interpretation of market sensitivity:
From single attribute change, we see that users are not sensitive to upfront incentive. This encourages us to generate the study of multiple attribute change.
From the multiple attribute change (aka the Tornado plot), we see that users are most sensitive to guaranteed battery, followed by electricity discount, override window, and finally the upfront incentive.
This implies that the utility providers do not need to sacrifice too much on upfront incentive.
4. Limitation of Analysis
The analysis can be limited in several aspects. Below is a breakdown:
- Results: The results of the survey might be biased or unstable, either due to the limited amount of sampling, or due to the possibility of professional test takers mocking the results with their pre-constructed personas.
- Profiles: The cbc profiles of the conjoint questions obey full factorial randomized design, which is a ready-to-use design but lacks the filtering of unreasonable questions. If the profiles could be better, we can prepare better survey question sheets.
- Grouping: The grouping technique was arbitrarily chosen. It could be improved by better reasoning processes.
- Categorizing: The team categorizes upfront incentive, free charger, and discounted electricity price as “Monetary”, and the rest as “Flexibility”. There might be other categorizing techniques to yield better results.
- Incentive Range: The market share plot usually has range from 0 to 100%, indicating a full range of market share in different prices, but the upfront incentive sensitivity is incredibally low in our model, making it impossible to cover the full range.
Final Recommendations and Conclusions
The fluctuation in market share shows a remarkable sensitivity to initial incentives, especially when investment levels are low, i.e., a $100 or $200 incentive. Even a minor increase in upfront incentives like these can result in a significant surge in market share. However, as the upfront incentive escalates further, the correlation weakens, suggesting that higher incentives, such as $500 or more, have negligible impact on market share.
Further analysis indicates that market share is most attuned to Guaranteed Battery percentage, followed by Discount Electricity percentage, Override Window hours, and Upfront Incentive. A marginal alteration in Guaranteed Battery percentage significantly influences market share, whereas a similar adjustment in Upfront Incentive has a comparatively minor impact. Yet, exceptions exist. For instance, an excessive increase in Override Window hours can paradoxically result in a decline in market share, likely due to customers’ inclination toward products with shorter Override Windows.
In conclusion, the relationship between upfront incentives and market share demonstrates diminishing returns as incentives increase. Additionally, the pivotal factors affecting market share sensitivity vary, with Guaranteed Battery percentage being the most influential. Factoring in these dynamics can help optimize strategies to maximize market share effectiveness and anticipate consumer preferences for sustained growth.
Limitations
This section delves into the constraints encountered in sample representation, hypothetical preferences, and short-term evaluations, shedding light on the need for comprehensive approaches to gather more accurate and forward-thinking data. Simultaneously, it explores the significance of longitudinal data collection, behavioral insights, and regional variations in understanding the diverse landscape of smart charging adoption among PEV users. It highlights the critical unknowns—technological advancements, regulatory changes, and user experience feedback—that could significantly impact findings. This critical examination of existing limitations, identification of future data collection avenues, and emphasis on pivotal yet uncertain factors shapes the future landscape of smart charging programs for electric vehicles.
1. Limitations of the current survey
Sample Representation:
The sample gathered in the survey may not accurately reflect the wide range of PEV owners or potential buyers because of potential biases in how participants were recruited or limited outreach efforts. This might result in an overrepresentation of certain demographics, geographic areas, or particular EV models within the sample.
Hypothetical Preferences:
Survey participants could shape their preferences for smart charging based on hypothetical scenarios outlined in the survey, potentially deviating from factors that truly influence real-life decisions. The actual adoption of these practices might be swayed by other unaddressed factors that play a role in real-world decision-making.
Short-Term Evaluation:
The survey’s range might concentrate on current preferences for smart charging features without considering how consumer perspectives, technological progressions, or changes in regulations might evolve over time, potentially influencing long-term adoption significantly.
2. Valuable Information for Future Collection
Longitudinal Data:
Gathering data gradually allows for a deeper understanding of how preferences and behaviors among PEV owners change over time, offering valuable insights into the evolving trends in smart charging adoption.
Behavioral Context:
By collecting qualitative data or conducting follow-up interviews, we can uncover the motivations, barriers, and decision-making processes that drive individuals’ choices regarding smart charging programs. This approach allows for a deeper understanding of the underlying factors shaping these decisions.
Regional Variations:
Expanding the sample size to include greater geographical diversity could illuminate regional disparities in smart charging preferences. This broader scope considers variations in utility programs, infrastructure accessibility, and environmental policies across different areas, offering insights into how these factors influence preferences.
3. Important Unknowns Affecting Findings
Technological Advancements:
Future technological innovations, like advancements in battery technology or shifts in charging infrastructure, have the potential to substantially transform smart charging preferences and behaviors. These changes could reshape how individuals perceive and engage with smart charging options.
Regulatory Changes:
Changes in governmental regulations or incentives linked to electric vehicles and grid management could markedly impact the appeal and viability of smart charging programs. These alterations have the potential to significantly shift the landscape of smart charging initiatives, affecting their attractiveness and feasibility for consumers.
User Experience:
Gaining insights into the practical experiences and satisfaction levels of individuals currently enrolled in smart charging programs is essential. Understanding their challenges, benefits, and feedback can serve as valuable input, significantly guiding future decisions in this domain.
Attribution
This final analysis report is done by all members of the PEV team. Below is the breakdown:
- Vedanth worked on Abstract and Introduction.
- Sampada worked on Survey Design.
- Pingfan worked on Data Analysis and Results.
- Bharath worked on Final Recommendations and Conclusions and Limitations.
In the end, Pingfan filled out Attribution and Appendix, and conducted a final check.
Appendix
1. Conjoint Survey Documentation
1.1 Section 2 - Smart Charging Programs
1.2 Start of Section 2
Now that we have known basics about your EV ownership and your charging habits, let’s move on to Smart Charging program selections!
A Smart Charging program consists of 5 attributes: Upfront Incentive, Free Level 2 Charger, Electricity Price Discount, Override Window, and Guaranteed Range if charged for 8 hrs.
Here is a brief explanation of each attribute:
Attribute | Range | Explanation |
Upfront Incentive: | $100, $300, $500 | It means the amount of money you will be instantly paid if you join the smart charging program. The unit is in USD. |
Free Level 2 Charger: | No or Yes | A "Yes" means you will get a free level 2 charger. A level 2 charger enables faster charging than a standard home charger. |
Electricity Price Discount: | 10%, 25%, 50% | If you choose to join this smart charging program, this will be your electricity price discount. |
Override Window: | 0.5 ~ 4 hrs | This is the longest time window per day that you can force override to regular charging, during which time your EV is not controlled by the smart charging program. |
Guaranteed Range if charged for 8 hrs: | 25%, 50%, 75% | It means the guranteed range if your car is charged for 8 hrs during smart charging. It shows percentage here, but will be converted to miles. |
The above table can be accessed via this link. It’s available on top of each question so that you can refer at any time.
The next page provides a sample question to enhance your understanding.
1.3 Section 2 - Sample Question
You are provided with 2 options of smart charging programs, and an “I am not interested” option. You need to treat them as if they were your only options and choose the one that you most prefer. You can choose the “I am not interested” option if you are not interested in either.
You can access the attribute table using this link.
Options: | Option 1 | Option 2 |
Upfront Incentive: | $100 | $500 |
Free Level 2 Charger: | No | Yes |
Electricity Price Discount: | 25% | 50% |
Override Window: | 0.5 hrs | 4 hrs |
Guaranteed Range if charged for 8 hrs: | 100 miles | 300 miles |
- Option 1
- Option 2
- I am not interested
1.4 Section 2 - Ready to Start the Questions
The sample question is quite straight-forward. Since Option 2 has every attribute better than Option 1, I bet you have chosen it.
However, this simplicity is not always the case. There will be trade-offs between the 2 options, and you need to make your own decision. For example, one option might have a large up-front but a low discount, while the other one is the inverse. Then, you need to make decisions based on your preference.
There are 8 choice questions in total. For each question, you have 2 valid options to choose from. Again, you make your choice as if they were your ONLY options. If you are really not interested in either of them, you can choose “I am not interested”.
Click on “Next Page” to proceed.
1.5 Question 1 of 8
(1 of 8) If your utility offers you these 2 smart charging programs, which one do you prefer?
You can access the attribute table using this link.
Options: | Option 1 | Option 2 |
Upfront Incentive: | $100 | $300 |
Free Level 2 Charger: | No | No |
Electricity Price Discount: | 10% | 10% |
Override Window: | 4 hrs | 2 hrs |
Guaranteed Range if charged for 8 hrs: | 75 % | 50 % |
- Option 1
- Option 2
- I am not interested
The other 7 questions are in similar format and are thus omitted.
2. Data Dictionary
2.1 Dictionary of R codes:
No. | Name | Type | Description |
---|---|---|---|
1 | 0_make_choice_questions | R | Construct the conjoint questions. |
2 | 1_1_clean_data | R | Clean and export survey data. |
3 | 1_2_summary_statistics | R | Generate summary statistics of test takers. |
4 | 2_1_mnl_pref_model | R | The simple logit preference space model. |
5 | 2_2_mxl_pref_model | R | The mixed logit preference space model. |
6 | 2_3_mnl_pref_groups_model | R | The simple logit preference space model by groups. |
7 | 2_4_mnl_wtp_model | R | The simple logit WTP space model. |
8 | 2_5_mxl_wtp_model | R | The mixed logit WTP space model. |
9 | 3_1_wtp_from_pref | R | Calculation of WTP using mnl_pref model. |
10 | 3_2_market_sim | R | Market simulations |
11 | 3_3_sensitivity_analysis | R | Market sensitivity analysis |
2.2 Dictionary of Files:
No. | Name | Type | Description |
---|---|---|---|
1 | choice_data | CSV | Cleaned results from the final survey. |
2 | choice_questions | CSV | Local version of Conjoint questions. |
3 | pevFinal1 | CSV | Part 1 of final survey result from formr. |
4 | pevFinal2 | CSV | Part 2 of final survey result from formr. |
5 | pevFinal3 | CSV | Part 3 of final survey result from formr. |
6 | scenarios | CSV | Scenarios of multiple attributes for sensitivity analysis. |
7 | choice_data_wider | RData | Wider version of survey results |
8 | wtp_from_pref | RData | WTP with uncertainty calculated from mnl_pref. |
9 | mnl_pref_groups | RData | The simple logit preference space model. |
10 | mnl_pref | RData | The mixed logit preference space model. |
11 | mnl_wtp | RData | The simple logit preference space model by groups. |
12 | mxl_pref | RData | The simple logit WTP space model. |
13 | mxl_wtp | RData | The mixed logit WTP space model. |
14 | sens_mnl_pref | RData | Market sensitivity of mnl_pref model. |
15 | sim_mnl_pref | RData | Market simulation of mnl_pref model. |
2.3 Dictionary of Data Frames:
No. | Name | Type | Description |
---|---|---|---|
1 | choice_data | A tibble: 2,928 × 40 | Conjoint questions for the pilot survey. |
2 | choice_data_charging | A tibble: 336 × 38 | Longer version of choice data about charging. |
3 | choice_data_demo | A tibble: 499 × 36 | Longer version of choice data about main demography. |
4 | choice_data_demo_2 | A tibble: 244 × 38 | Longer version of choice data about education and income. |
5 | choice_data_vehicle | A tibble: 610 × 35 | Longer version of choice data about vehicle. |
6 | choice_data_wider | A tibble: 122 × 38 | Original wider version of choice data. |
7 | coef_mxl_wtp_df | A data frame: 6 × 1 | Coefficients of the mxl_wtp model. |
8 | data_mnl_pref | A tibble: 2,928 × 40 | Choice data for the mnl_pref model. |
9 | data_mnl_pref_groups | A tibble: 2,928 × 48 | Choice data for the mnl_pref_groups model. |
10 | data_mxl_pref | A tibble: 2,928 × 40 | Choice data for the mxl_pref model. |
11 | figure_mnl_pref | A data frame: 6 × 5 | Data frame for mnl_pref plot. |
12 | figure_mnl_pref_groups | A data frame: 12 × 5 | Data frame for mnl_pref_groups plot. |
13 | figure_mxl_pref | A data frame: 6 × 5 | Data frame for mxl_pref plot. |
14 | mnl_wtp_compare | A data frame: 7 × 3 | Compare data between mnl_pref and mnl_wtp. |
15 | mxl_wtp_compare | A data frame: 6 × 3 | Compare data between mnl_pref and mxl_wtp. |
16 | profiles | A data frame: 216 × 6 | CBC profiles of the conjoint survey. |
17 | scenario_table | A data frame: 18 × 11 | Scenarios of multiple attributes for market sensitivity analysis. |
18 | scenario_upfront | A data frame: 20 × 8 | Scenarios of upfront attribute for market sensitivity analysis. |
19 | scenarios_mult | A tibble: 27 × 8 | Multiple scenarios for maket simulation. |
20 | scenarios_sing | A data frame: 3 × 8 | Single scenarios for maket simulation. |
21 | sens_cost | A data frame: 10 × 7 | Sensitivity of cost. |
22 | sens_multiple | A data frame: 9 × 5 | Sensitivity of multiple attributes. |
23 | sens_upfront | A data frame: 10 × 4 | Sensitivity of upfront attribute. |
24 | sim_cases | A tibble: 9 × 4 | Cases of multiple market simulation. |
25 | sim_mnl_pref_mult | A data frame: 27 × 11 | Multiple simulation using mnl_pref model. |
26 | sim_mnl_pref_sing | A data frame: 3 × 11 | Single simulation using mnl_pref model. |
27 | survey | A tibble: 12,000 × 12 | Survey data. |
28 | table_mnl_pref | A data frame: 6 × 6 | Details of the mnl_pref model. |
29 | table_mnl_pref_groups | A data frame: 12 × 6 | Details of the mnl_pref_groups model. |
30 | table_mxl_pref | A data frame: 12 × 6 | Details of the mxl_pref model. |
31 | tornado_data | A data frame: 8 × 5 | Data of tornado plot. |
32 | wtp_ci_mnl_pref | A data frame: 5 × 3 | WTP with CI calculated from mnl_pref model. |
3. All Code
Code
::opts_chunk$set(
knitrcomment = "#>",
fig.align = "center",
fig.show = "hold",
fig.width = 6.5,
fig.asp = 0.7,
fig.retina = 3,
fig.path = "figs_auto/",
out.width = "90%"
)
# Load libraries here
library(tidyverse)
library(here)
library(cbcTools)
library(logitr)
library(janitor)
library(lubridate)
library(fastDummies)
library(kableExtra)
library(tibble)
library(dplyr)
library(kableExtra)
<- tibble(
brief `Attribute` = c("Range", "Explanation"),
`Upfront Incentive:` = c("$100, $300, $500", "It means the amount of money you will be instantly paid if you join the smart charging program. The unit is in USD."),
`Free Level 2 Charger:` = c("No or Yes", "A \"Yes\" means you will get a free level 2 charger. A level 2 charger enables faster charging than a standard home charger."),
`Electricity Price Discount:` = c("10%, 25%, 50%", "If you choose to join this smart charging program, this will be your electricity price discount."),
`Override Window:` = c("0.5 ~ 4 hrs", "This is the longest time window per day that you can force override to regular charging, during which time your EV is not controlled by the smart charging program."),
`Guaranteed Range if charged for 8 hrs:` = c("25%, 50%, 75%", "It means the guranteed range if your car is charged for 8 hrs during smart charging. It shows percentage here, but will be converted to miles."))
row.names(brief) <- NULL
kable(t(brief), escape = FALSE) %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = FALSE,
position = "center") %>%
row_spec(row = 1, bold = TRUE)
<- tibble(
alts `Options:` = c("Option 1", "Option 2"),
`Upfront Incentive:` = scales::dollar(c(100, 500)),
`Free Level 2 Charger:` = c("No", "Yes"),
`Electricity Price Discount:` = c("25%", "50%"),
`Override Window:` = c("0.5 hrs", "4 hrs"),
`Guaranteed Range if charged for 8 hrs:` = c("100 miles", "300 miles"))
<- as.data.frame(t(alts)) %>%
alts_t rownames_to_column(var = "Attribute")
colnames(alts_t) <- NULL
kable(alts_t, escape = FALSE, align = c('r', 'c', 'c')) %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = FALSE,
position = "center") %>%
column_spec(column = 1, width = '20em') %>%
column_spec(column = 2, width = '10em') %>%
column_spec(column = 3, width = '10em') %>%
row_spec(row = 1, bold = TRUE)
# Generate the profiles of attributes
<- cbc_profiles(
profiles upfront = c(100, 300, 500), # USD
lv_2_charger = c('No', 'Yes'),
discount = c(10, 25, 50), # percentage
window = c(0.5, 1, 2, 4), # hrs
guaranteed = c(25, 50, 75) # percentage
)
# Randomized full-factorial design
<- cbc_design(
design profiles = profiles,
n_resp = 500, # Number of respondents
n_alts = 2, # Number of alternatives per question
n_q = 8, # Number of questions per respondent
no_choice = TRUE
)
# Save design
write_csv(design, here('data', 'choice_questions.csv'))
# 1. Read survey data
<- read_csv(here("data", "pevFinal1.csv"))
p1 <- read_csv(here("data", "pevFinal2.csv"))
p2 <- read_csv(here("data", "pevFinal3.csv"))
p3
# 2. Format and combine the 3 dataframes
# 2.1 Format p1
<- p1 %>%
p1 mutate(
created = ymd_hms(created, tz = "EST"),
ended = ymd_hms(ended, tz = "EST"),
time_sec_p1 = as.numeric(ended - created, units = "secs")) %>%
# Select important columns
select(session, time_sec_p1, starts_with("mc"))
# 2.2 Format p2
<- p2 %>%
p2 mutate(
created = ymd_hms(created),
ended = ymd_hms(ended),
time_sec_p2 = as.numeric(ended - created, units = "secs")) %>%
# Select important columns
select(session, time_sec_p2, respondentID, cbc_all_same, starts_with("mc"))
# 2.3 Format p3
<- p3 %>%
p3 mutate(
created = ymd_hms(created),
ended = ymd_hms(ended),
time_sec_p3 = as.numeric(ended - created, units = "secs")) %>%
# Select important columns
select(session, time_sec_p3, completion_code, starts_with("mc"))
# 2.4 Combine all parts together using the session variable
<- p1 %>%
choice_data_wider left_join(p2, by = "session") %>%
left_join(p3, by = "session") %>%
# No longer need session variable
select(-session)
# 3. Drop unnecessary parts
# 3.1 Drop anyone who didn't complete all choice questions
<- choice_data_wider %>%
choice_data_wider filter(!is.na(mc_cbc_1)) %>%
filter(!is.na(mc_cbc_2)) %>%
filter(!is.na(mc_cbc_3)) %>%
filter(!is.na(mc_cbc_4)) %>%
filter(!is.na(mc_cbc_5)) %>%
filter(!is.na(mc_cbc_6)) %>%
filter(!is.na(mc_cbc_7)) %>%
filter(!is.na(mc_cbc_8))
# 3.2 Drop respondents who went too fast
<- choice_data_wider %>%
choice_data_wider mutate(
# First replace NA values with 0 seconds
time_sec_p1 = ifelse(is.na(time_sec_p1), 0, time_sec_p1),
time_sec_p2 = ifelse(is.na(time_sec_p2), 0, time_sec_p2),
time_sec_p3 = ifelse(is.na(time_sec_p3), 0, time_sec_p3),
# Now compute the total time
time_min_total = (time_sec_p1 + time_sec_p2 + time_sec_p3) / 60
)
# 3.3 Drop anyone who finished in under the 10th percentile of completion times
<- quantile(choice_data_wider$time_min_total, 0.1)
time_10 <- choice_data_wider %>%
choice_data_wider filter(time_min_total >= time_10)
# 3.4 Drop respondents that got the attention check question wrong
# choice_data_wider <- choice_data_wider %>%
# filter(mc_cbc_practice == 'option_2')
# 4. Generate choice data csv file
# 4.1 Convert the data to long format
<- choice_data_wider %>%
choice_data pivot_longer(
cols = mc_cbc_1:mc_cbc_8,
names_to = "qID",
values_to = "choice") %>%
# Convert the qID variable to a number
mutate(qID = parse_number(qID))
# 4.2 Read in choice questions and join it to choice_data
<- read_csv("https://github.com/pingfan0727/Public-Resources/raw/main/datasets/cbc_questions_final.csv")
survey <- choice_data %>%
choice_data rename(respID = respondentID) %>%
left_join(survey, by = c("respID", "qID"), relationship = "many-to-many")
# 4.3 Convert choice column to 1 or 0 based on if the alternative was chosen
<- choice_data %>%
choice_data mutate(
choice = case_when(
str_detect(choice, "^option_") ~
as.numeric(str_replace(choice, "option_", "")),
== "not_interested" ~ 3,
choice TRUE ~ as.numeric(choice)
),choice = ifelse(choice == altID, 1, 0)
%>%
) select(-mc_cbc_practice, -cbc_all_same)
# 4.4 Create new values for respID & obsID
<- nrow(choice_data_wider)
nRespondents <- max(survey$altID)
nAlts <- max(survey$qID)
nQuestions $respID <- rep(seq(nRespondents), each = nAlts*nQuestions)
choice_data$obsID <- rep(seq(nRespondents*nQuestions), each = nAlts)
choice_data
# 4.5 Reorder columns - it's nice to have the "ID" variables first
<- choice_data %>%
choice_data select(ends_with("ID"), "choice", "upfront",
"lv_2_charger_Yes", "lv_2_charger_No",
"discount", "window", "guaranteed",
"no_choice", everything())
# 4.6 Clean up names for choice_data
<- clean_names(choice_data)
choice_data
# 4.7 Save the choice_data_wider for future use
save(
choice_data_wider,file = here("data", "choice_data_wider.RData")
)
# 4.8 Save cleaned data for modeling
write_csv(choice_data, here("data", "choice_data.csv"))
# 1.1 Load and pivot the data
load(here("data", "choice_data_wider.RData"))
<- choice_data_wider %>%
choice_data_vehicle pivot_longer(cols = c(mc_own_ev, mc_max_range, mc_commute,
mc_car_number, mc_secondary),names_to = 'category',
values_to = 'value')
# 1.2 Reorder and relabel the values based on category
<- choice_data_vehicle %>%
choice_data_vehicle mutate(
category = case_when(
== "mc_own_ev" ~ "Own EV",
category == "mc_max_range" ~ "Max Range",
category == "mc_car_number" ~ "Car Number",
category == "mc_secondary" ~ "Secondary Car",
category == "mc_commute" ~ "Commute",
category TRUE ~ category
),value = case_when(
== "Own EV" ~
category factor(value, levels = c("yes_own", "yes_lease", "no_but_next",
"no_but_interested", "no"),
labels = c("Own", "Lease", "Plan", "Interested", "No")),
== "Max Range" ~
category factor(value, levels = c(
"below_100", "range_100_to_149", "range_150_to_199",
"range_200_to_249", "range_250_to_299", "over_300"
),labels = c("100", "150", "200", "250", "300", ">300")),
== "Car Number" ~
category factor(value, levels = c("zero", "one", "two", "three", "four"),
labels = c("0", "1", "2", "3", "4")),
== "Secondary Car" ~
category factor(value, levels = c("yes_ev", "yes_gas", "yes_both", "no"),
labels = c("EV", "Gas", "Both", "Nil")),
== "Commute" ~
category factor(value, levels = c("on_foot", "by_bus", "by_train_subway",
"self_driving", "others"),
labels = c("On Foot", "By Bus", "By Train",
"Self-driving", "Others")),
TRUE ~ as.factor(value)
)
)
# 1.3 Plot the vehicle info summary
<- choice_data_vehicle %>%
vehicle_plot ggplot(aes(x = value)) +
geom_bar(width = 0.6) +
scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
facet_wrap(~fct_relevel(category, "Own EV", "Max Range",
"Car Number", "Secondary Car", "Commute"),
scales = "free_x", nrow = 3) +
labs(x = NULL,
y = "Count",
title = "Vehicle Info Summary Plots",
subtitle = "Summary Statistics Part 1") +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '1_2_vehicle_plot.png'),
plot = vehicle_plot,
width = 7,
height = 7 * 0.8
)# 2.1 Pivot the data
<- choice_data_wider %>%
choice_data_charging pivot_longer(cols = c(mc_at_home, mc_at_work),
names_to = 'category',
values_to = 'value') %>%
separate_rows(value, sep = ",\\s*")
# 2.2 Reorder and relabel the values based on category
<- choice_data_charging %>%
choice_data_charging mutate(
category = case_when(
== "mc_at_home" ~ "At Home",
category == "mc_at_work" ~ "At Work",
category TRUE ~ category
),value = case_when(
== "At Home" ~
category factor(value, levels = c(
"home_morning", "home_afternoon", "home_evening",
"home_night", "no_home"),
labels = c("Morning", "Afternoon", "Evening",
"Night", "No")),
== "At Work" ~
category factor(value, levels = c(
"work_morning", "work_noon", "work_afternoon",
"work_evening", "work_night", "no_work"),
labels = c("Morning", "Noon", "Afternoon",
"Evening", "Night", "No")),
TRUE ~ as.factor(value)
)
)
# 2.3 Plot the charging preference summary
<- choice_data_charging %>%
charging_plot ggplot(aes(x = value)) +
geom_bar(width = 0.6) +
scale_x_discrete(limits = c("Morning", "Noon", "Afternoon",
"Evening", "Night", "No")) +
scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
facet_wrap(~fct_relevel(category, "At Home", "At Work"),
scales = "free_x", nrow = 2) +
labs(x = NULL,
y = "Count",
title = "Charging Preference Summary Plots",
subtitle = "Summary Statistics Part 2") +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '1_2_charging_plot.png'),
plot = charging_plot,
width = 6,
height = 6 * 0.8
)# 3.1 First Part - gender, race, climate caring, and party
# 3.1.1 Pivot the data
<- choice_data_wider %>%
choice_data_demo mutate(across(c(mc_gender, mc_race, mc_climate, mc_party), as.character)) %>%
pivot_longer(cols = c(mc_gender, mc_race, mc_climate, mc_party),
names_to = 'category',
values_to = 'value') %>%
separate_rows(value, sep = ",\\s*")
# 3.1.2 Reorder and relabel the values based on category
<- choice_data_demo %>%
choice_data_demo mutate(
category = case_when(
== "mc_gender" ~ "Gender",
category == "mc_race" ~ "Race",
category == "mc_climate" ~ "Climate Caring",
category == "mc_party" ~ "Party",
category TRUE ~ category
),value = case_when(
== "Gender" ~
category factor(value, levels = c(
"male", "female", "transMale", "transFemale",
"genderNonconform", "others", "prefer_not_say"),
labels = c("Male", "Female", "Trans Male", "Trans Female",
"Genderqueer", "Others", "Not Say")),
== "Race" ~
category factor(value, levels = c(
"asian", "black", "white", "hispanic", "native",
"pacific", "others", "prefer_not_say"),
labels = c("Asian", "Black", "White", "Hispanic", "Native",
"Pacific", "Other", "Not Say")),
== "Party" ~
category factor(value, levels = c(
"democratic", "republican", "independent", "prefer_not_say"),
labels = c("Democratic", "Republican", "Independent", "Not say")),
TRUE ~ as.factor(value)
)
)
# 3.1.3 Plot the first part of demographic info summary
<- choice_data_demo %>%
demo_plot_1 ggplot(aes(x = value)) +
geom_bar(width = 0.6) +
scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
facet_wrap(~fct_relevel(category, "Gender", "Race", "Climate Caring", "Party"),
scales = "free_x", nrow = 2) +
labs(x = NULL,
y = "Count",
title = "Demographic Summary Plots Part 1",
subtitle = "Summary Statistics Part 3-1") +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '1_2_demo_plot_1.png'),
plot = demo_plot_1,
width = 7,
height = 7 * 0.8
)# 3.2 Second Part - education
# 3.2.1 Pivot the data
<- choice_data_wider %>%
choice_data_demo_2 mutate(across(c(mc_education, mc_income), as.character)) %>%
pivot_longer(cols = c(mc_education, mc_income),
names_to = 'category',
values_to = 'value') %>%
separate_rows(value, sep = ",\\s*")
# 3.2.2 Reorder and relabel the values based on category
<- choice_data_demo_2 %>%
choice_data_demo_2 mutate(
category = case_when(
== "mc_education" ~ "Education",
category == "mc_income" ~ "Income",
category TRUE ~ category
),value = case_when(
== "Education" ~
category factor(value, levels = c(
"no_hs", "hs", "college_some", "vocational", "degree_associate",
"degree_bs", "degree_grad", "others", "prefer_not_say"),
labels = c("< High School", "High School", "Some College",
"Vocational", "AA/AS", "BA/BS", "Grad",
"Others", "Not Say")),
== "Income" ~
category factor(value, levels = c(
"under60", "inc_60to80", "inc_80to100", "inc_100to150",
"inc_150to200", "inc_200to250", "inc_250to300",
"inc_300to350", "inc_350to400", "inc_over400",
"prefer_not_say"),
labels = c("< $60K", "$60K to $80K", "$80K to $100K",
"$100K to $150K", "$150K to $200K",
"$200K to $250K", "$250K to $300K",
"$300K to $350K", "$350K to $400K",
"> $400K", "Not Say")),
TRUE ~ as.factor(value)
)
)
# 3.2.3 Plot the education distribution
<- choice_data_demo_2 %>%
demo_plot_2 filter(category == "Education") %>%
ggplot(aes(x = value)) +
geom_bar(width = 0.6) +
scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
labs(x = NULL,
y = "Count",
title = "Demographic Summary Plots Part 2 - Education",
subtitle = "Summary Statistics Part 3-2") +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '1_2_demo_plot_2.png'),
plot = demo_plot_2,
width = 6,
height = 6 / 1.618
)# 3.3 Third Part - income
# 3.3.1 Plot the income distribution
<- choice_data_demo_2 %>%
demo_plot_3 filter(category == "Income") %>%
ggplot(aes(x = value)) +
geom_bar(width = 0.6) +
scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
labs(x = NULL,
y = "Count",
title = "Demographic Summary Plots Part 3 - Income",
subtitle = "Summary Statistics Part 3-3") +
coord_flip() +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '1_2_demo_plot_3.png'),
plot = demo_plot_3,
width = 6,
height = 6 / 1.618
)
vehicle_plot
charging_plot
demo_plot_1
demo_plot_2
demo_plot_3# 1. Model Summary
# 1.1 Use the logitr package to generate the model
<- read_csv(here("data", "choice_data.csv")) %>%
data_mnl_pref mutate(upfront = -1 * upfront / 100,
discount = discount / 10,
guaranteed = guaranteed / 10)
<- logitr(
mnl_pref data = data_mnl_pref,
outcome = "choice",
obsID = "obs_id",
pars = c("upfront", "lv_2_charger_yes", "discount",
"window", "guaranteed", "no_choice")
)
# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mnl_pref)
# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mnl_pref$gradient
# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mnl_pref$hessian)$values
# 2.3 Save the model
save(
mnl_pref,file = here("models", "mnl_pref.RData")
)
# 3. Model Equation
# 3.1 Estimated values and std err
<- data.frame(
table_mnl_pref Coefficient = c("β<sub>1</sub>", "β<sub>2</sub>",
"β<sub>3</sub>", "β<sub>4</sub>",
"β<sub>5</sub>", "β<sub>6</sub>"),
Meaning = c("Upfront", "Lv 2 Charger", "Discount",
"Window", "Guaranteed", "No Choice"),
Estimate = coef(mnl_pref),
StdError = se(mnl_pref),
Level = c("1, 3, 5", "-", "1, 2.5, 5",
"0.5, 1, 2, 4", "25, 50, 75", "-"),
Unit = c("100 USD", "-", "10%", "1 hr", "10%", "-")
)$Estimate <- round(table_mnl_pref$Estimate, 4)
table_mnl_pref$StdError <- round(table_mnl_pref$StdError, 4)
table_mnl_pref
<- table_mnl_pref %>%
mnl_pref_kable kable(format = 'html', escape = FALSE,
col.names = c("Coefficient", "Meaning", "Estimate", "Std Error",
"Level", "Unit"),
align = c('c', 'l', 'c', 'c', 'c', 'c'),
caption = "Summary of Model Coefficients") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# 3.2 Model figure
# 3.2.1 Construct the data frame
<- data.frame(
figure_mnl_pref Attribute = factor(
c("Upfront", "Charger", "Discount", "Window", "Guaranteed", "No Choice"),
levels = c("No Choice", "Guaranteed", "Window",
"Discount", "Charger", "Upfront")
),Estimate = coef(mnl_pref),
StdError = se(mnl_pref)
)
$LowerBound =
figure_mnl_pref$Estimate - 1.96 * figure_mnl_pref$StdError
figure_mnl_pref$UpperBound =
figure_mnl_pref$Estimate + 1.96 * figure_mnl_pref$StdError
figure_mnl_pref
# 3.2.2 Create the plot
<- figure_mnl_pref %>%
mnl_pref_plot ggplot(aes(x = Attribute, y = Estimate)) +
geom_point() +
geom_errorbar(aes(ymin = LowerBound, ymax = UpperBound), width = 0.2) +
labs(title = "Coefficient Estimates with Uncertainty Bounds",
subtitle = "Simple logit preference space",
x = "Attribute",
y = "Estimate") +
coord_flip() +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '2_1_mnl_pref_plot.png'),
plot = mnl_pref_plot,
width = 6,
height = 6 / 1.618
)
mnl_pref_kable
mnl_pref_plot# 1. Model Summary
# 1.1 Use the logitr package to generate the model
<- read_csv(here("data", "choice_data.csv")) %>%
data_mxl_pref mutate(upfront = -1 * upfront / 100,
discount = discount / 10,
guaranteed = guaranteed / 10)
set.seed(456)
<- logitr(
mxl_pref data = data_mxl_pref,
outcome = "choice",
obsID = "obs_id",
pars = c("upfront", "lv_2_charger_yes", "discount",
"window", "guaranteed", "no_choice"),
randPars = c(upfront = "n",
lv_2_charger_yes = "n",
discount = "n",
window = "n",
guaranteed = "n",
no_choice = "n"),
numMultiStarts = 30,
drawType = 'sobol',
numDraws = 200
)
# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mxl_pref)
# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mxl_pref$gradient
# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mxl_pref$hessian)$values
# 2.3 Save the model
save(
mxl_pref,file = here("models", "mxl_pref.RData")
)
# 3. Model Equation
# 3.1 Estimated values and std err
<- data.frame(
table_mxl_pref Coefficient = c("β<sub>1</sub>", "β<sub>2</sub>",
"β<sub>3</sub>", "β<sub>4</sub>",
"β<sub>5</sub>", "β<sub>6</sub>",
"SD β<sub>1</sub>", "SD β<sub>2</sub>",
"SD β<sub>3</sub>", "SD β<sub>4</sub>",
"SD β<sub>5</sub>", "SD β<sub>6</sub>"),
Meaning = c("Upfront", "Lv 2 Charger", "Discount",
"Window", "Guaranteed", "No Choice",
"SD Upfront", "SD Lv 2 Charger", "SD Discount",
"SD Window", "SD Guaranteed", "SD No Choice"),
Estimate = coef(mxl_pref),
StdError = se(mxl_pref),
Level = c("1, 3, 5", "-", "1, 2.5, 5",
"0.5, 1, 2, 4", "25, 50, 75", "-",
"-", "-", "-", "-", "-", "-"),
Unit = c("100 USD", "-", "10%", "hour", "10%", "-",
"-", "-", "-", "-", "-", "-")
)$Estimate <- round(table_mxl_pref$Estimate, 4)
table_mxl_pref$StdError <- round(table_mxl_pref$StdError, 4)
table_mxl_pref<- table_mxl_pref %>%
mxl_pref_kable kable(format = 'html', escape = FALSE,
col.names = c("Coefficient", "Meaning", "Estimate", "Std Error",
"Level", "Unit"),
align = c('c', 'l', 'c', 'c', 'l', 'l'),
caption = "Summary of Model Coefficients") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# 3.2 Model figure
# 3.2.1 Construct the data frame
<- data.frame(
figure_mxl_pref Attribute = factor(
c("Upfront", "Charger", "Discount", "Window", "Guaranteed", "No Choice"),
levels = c("No Choice", "Guaranteed", "Window",
"Discount", "Charger", "Upfront")
),Estimate = head(coef(mxl_pref), 6),
StdError = head(se(mxl_pref), 6)
)
$LowerBound =
figure_mxl_pref$Estimate - 1.96 * figure_mxl_pref$StdError
figure_mxl_pref$UpperBound =
figure_mxl_pref$Estimate + 1.96 * figure_mxl_pref$StdError
figure_mxl_pref
# 3.2.2 Create the plot
<- figure_mxl_pref %>%
mxl_pref_plot ggplot(aes(x = Attribute, y = Estimate)) +
geom_point() +
geom_errorbar(aes(ymin = LowerBound, ymax = UpperBound), width = 0.2) +
labs(title = "Coef Estimates with Uncertainty Bounds",
subtitle = "Mixed logit preference space",
x = "Attribute",
y = "Estimate") +
coord_flip() +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '2_2_mxl_pref_plot.png'),
plot = mxl_pref_plot,
width = 6,
height = 6 / 1.618
)
mxl_pref_kable
mxl_pref_plot# 1. Model Summary
# 1.1 Use the logitr package to generate the model
<- read_csv(here("data", "choice_data.csv")) %>%
data_mnl_pref_groups mutate(upfront = -1 * upfront / 100,
discount = discount / 10,
guaranteed = guaranteed / 10) %>%
mutate(group_A = ifelse(mc_income == "under60", 1, 0),
group_B = ifelse(mc_income != "under60", 1, 0)) %>%
mutate(upfront_B = upfront * group_B,
lv_2_charger_yes_B = lv_2_charger_yes * group_B,
discount_B = discount * group_B,
window_B = window * group_B,
guaranteed_B = guaranteed * group_B,
no_choice_B = no_choice * group_B)
<- logitr(
mnl_pref_groups data = data_mnl_pref_groups,
outcome = "choice",
obsID = "obs_id",
pars = c("upfront", "lv_2_charger_yes", "discount",
"window", "guaranteed", "no_choice",
"upfront_B", "lv_2_charger_yes_B", "discount_B",
"window_B", "guaranteed_B", "no_choice_B")
)
# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mnl_pref_groups)
# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mnl_pref_groups$gradient
# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mnl_pref_groups$hessian)$values
# 2.3 Save the model
save(
mnl_pref_groups,file = here("models", "mnl_pref_groups.RData")
)
# 3. Model Equation
# 3.1 Estimated values and std err
<- data.frame(
table_mnl_pref_groups Coefficient = c("β<sub>1</sub>", "β<sub>2</sub>",
"β<sub>3</sub>", "β<sub>4</sub>",
"β<sub>5</sub>", "β<sub>6</sub>",
"β<sub>7</sub>", "β<sub>8</sub>",
"β<sub>9</sub>", "β<sub>10</sub>",
"β<sub>11</sub>", "β<sub>12</sub>"),
Meaning = c("Upfront", "Lv 2 Charger", "Discount",
"Window", "Guaranteed", "No Choice",
"Upfront B", "Lv 2 Charger B", "Discount B",
"Window B", "Guaranteed B", "No Choice B"),
Estimate = coef(mnl_pref_groups),
StdError = se(mnl_pref_groups),
Level = c("1, 3, 5", "-", "1, 2.5, 5",
"0.5, 1, 2, 4", "25, 50, 75", "-",
"1, 3, 5", "-", "1, 2.5, 5",
"0.5, 1, 2, 4", "25, 50, 75", "-"),
Unit = c("100 USD", "-", "10%", "hour", "10%", "-",
"100 USD", "-", "10%", "hour", "10%", "-")
)$Estimate <- round(table_mnl_pref_groups$Estimate, 4)
table_mnl_pref_groups$StdError <- round(table_mnl_pref_groups$StdError, 4)
table_mnl_pref_groups<- table_mnl_pref_groups %>%
mnl_pref_groups_kable kable(format = 'html', escape = FALSE,
col.names = c("Coefficient", "Meaning", "Estimate", "Std Error",
"Level", "Unit"),
align = c('c', 'l', 'c', 'c', 'l', 'l'),
caption = "Summary of Model Coefficients") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# 3.2 Model figure
# 3.2.1 Construct the data frame
<- data.frame(
figure_mnl_pref_groups Attribute = factor(
c("Upfront", "Charger", "Discount", "Window", "Guaranteed", "No Choice",
"Upfront B", "Charger B", "Discount B",
"Window B", "Guaranteed B", "No Choice B"),
levels = c("No Choice B", "Guaranteed B", "Window B",
"Discount B", "Charger B", "Upfront B",
"No Choice", "Guaranteed", "Window",
"Discount", "Charger", "Upfront")
),Estimate = coef(mnl_pref_groups),
StdError = se(mnl_pref_groups)
)
$LowerBound =
figure_mnl_pref_groups$Estimate - 1.96 * figure_mnl_pref_groups$StdError
figure_mnl_pref_groups$UpperBound =
figure_mnl_pref_groups$Estimate + 1.96 * figure_mnl_pref_groups$StdError
figure_mnl_pref_groups
# 3.2.2 Create the plot
<- figure_mnl_pref_groups %>%
mnl_pref_groups_plot ggplot(aes(x = Attribute, y = Estimate)) +
geom_point() +
geom_errorbar(aes(ymin = LowerBound, ymax = UpperBound), width = 0.2) +
labs(title = "Coefficient Estimates with Uncertainty Bounds",
subtitle = "Simple logit preference space with groups",
x = "Attribute",
y = "Estimate") +
coord_flip() +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '2_3_mnl_pref_groups_plot.png'),
plot = mnl_pref_groups_plot,
width = 6,
height = 6 / 1.618
)
# 4. Hypothesis Test
# 4.1 Perform the simulation
<- coef(mnl_pref_groups)
beta <- mnl_pref_groups$hessian
hessian = -1*(solve(hessian))
covariance = as.data.frame(MASS::mvrnorm(10^5, beta, covariance))
draws
# 4.2 Define the coefficients for groups A and B
<- draws$upfront
b1_A <- draws$lv_2_charger_yes
b2_A <- draws$discount
b3_A <- draws$window
b4_A <- draws$guaranteed
b5_A <- draws$no_choice
b6_A
<- draws$upfront + draws$upfront_B
b1_B <- draws$lv_2_charger_yes + draws$lv_2_charger_yes_B
b2_B <- draws$discount + draws$discount_B
b3_B <- draws$window + draws$window_B
b4_B <- draws$guaranteed + draws$guaranteed_B
b5_B <- draws$no_choice + draws$no_choice_B
b6_B
# 4.3 Construct the data frame
= list(b1_A, b2_A, b3_A, b4_A, b5_A, b6_A,
coefficients
b1_B, b2_B, b3_B, b4_B, b5_B, b6_B)
<- data.frame(
draw_sim_df Coefficient = c("b1_A", "b2_A", "b3_A", "b4_A", "b5_A", "b6_A",
"b1_B", "b2_B", "b3_B", "b4_B", "b5_B", "b6_B"),
Mean = sapply(coefficients, mean),
LowerQuantile = sapply(coefficients, function(x) quantile(x, probs = 0.025)),
UpperQuantile = sapply(coefficients, function(x) quantile(x, probs = 0.975))
)
# 4.4 Sketch the plot
<- draw_sim_df %>%
hypothesis_test_plot ggplot(aes(x = Coefficient, y = Mean)) +
geom_point() +
geom_errorbar(aes(ymin = LowerQuantile, ymax = UpperQuantile), width = 0.2) +
labs(title = "Hypothesis Test on Groups A and B",
x = "Coefficient",
y = "Values") +
coord_flip() +
theme_bw(base_family = "Ubuntu")
ggsave(
filename = here('figs_manu', '2_3_hypothesis_test_plot.png'),
plot = hypothesis_test_plot,
width = 6,
height = 6 / 1.618
)
mnl_pref_groups_kable
mnl_pref_groups_plot
hypothesis_test_plot# 1. Model Summary
# 1.1 Use the logitr package to generate the model
<- read_csv(here("data", "choice_data.csv")) %>%
data_mnl_pref mutate(upfront = -1 * upfront / 100,
discount = discount / 10,
guaranteed = guaranteed / 10)
set.seed(234)
<- logitr(
mnl_wtp data = data_mnl_pref,
outcome = "choice",
obsID = "obs_id",
pars = c("lv_2_charger_yes", "discount",
"window", "guaranteed", "no_choice"),
scalePar = "upfront",
numMultiStarts = 30
)
# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mnl_wtp)
# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mnl_wtp$gradient
# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mnl_wtp$hessian)$values
# 2.3 Compare preference vs WTP spaces
load(here("models", "mnl_pref.RData"))
<- wtpCompare(mnl_pref, mnl_wtp, scalePar = 'upfront')
mnl_wtp_compare <- mnl_wtp_compare %>% kable(
mnl_wtp_compare_kable format = 'html', escape = FALSE,
col.names = c("Coefficient", "mnl_pref Space",
"mnl_wtp Space", "Difference"),
caption = "Model Coefficients Compare") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# 2.4 Save the model
save(
mnl_wtp,file = here("models", "mnl_wtp.RData")
)
mnl_wtp_compare_kable# 1. Model Summary
# 1.1 Use the logitr package to generate the model
<- read_csv(here("data", "choice_data.csv")) %>%
data_mxl_pref mutate(upfront = -1 * upfront / 100,
discount = discount / 10,
guaranteed = guaranteed / 10)
set.seed(89)
<- logitr(
mxl_wtp data = data_mxl_pref,
outcome = "choice",
obsID = "obs_id",
pars = c("lv_2_charger_yes", "discount",
"window", "guaranteed", "no_choice"),
scalePar = "upfront",
randPars = c(lv_2_charger_yes = "n",
discount = "n",
window = "n",
guaranteed = "n",
no_choice = "n"),
numMultiStarts = 30
)
# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mxl_wtp)
# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mxl_wtp$gradient
# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mxl_wtp$hessian)$values
# 2.3 Save the coefficients
<- head(coef(mxl_wtp), 6)
coef_mxl_wtp <- as.data.frame(coef_mxl_wtp)
coef_mxl_wtp_df
<- cbind(coef_mxl_wtp_df, head(mnl_wtp_compare, 6)) %>%
mxl_wtp_compare select(pref, coef_mxl_wtp, difference) %>%
mutate(difference = coef_mxl_wtp - as.numeric(pref))
<- mxl_wtp_compare %>% kable(
mxl_wtp_compare_kable format = 'html', escape = FALSE,
col.names = c("Coefficient", "mnl_pref Space",
"mxl_wtp Space", "Difference"),
caption = "Model Coefficients Compare") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# 2.4 Save the model
save(
mxl_wtp,file = here("models", "mxl_wtp.RData")
)
mxl_wtp_compare_kable# 1. Load the mnl_pref model
load(here("models", "mnl_pref.RData"))
# Skipped here. Please refer to the R file in the data folder.
# summary(mnl_pref)
# 2. Compute WTP estimates
<- coef(mnl_pref)
coefs_mnl_pref <- coefs_mnl_pref / (-1*coefs_mnl_pref['upfront'])
wtp_mnl_pref
# 3. Simulate WTP with uncertainty
<- vcov(mnl_pref)
cov_mnl_pref set.seed(3)
<- as.data.frame(
coef_draws_mnl_pref ::mvrnorm(10^6, coefs_mnl_pref, cov_mnl_pref)
MASS
)<- -1*(coef_draws_mnl_pref[,2:6] / coef_draws_mnl_pref[,1])
wtp_draws_mnl_pref <- ci(wtp_draws_mnl_pref)
wtp_ci_mnl_pref
# 4. Create kables for WTP with uncertainty
<- wtp_mnl_pref %>% kable(
wtp_mnl_pref_kable format = 'html', escape = FALSE,
col.names = c("Attribute", "Coefficient"),
caption = "Computed WTP") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
<- wtp_ci_mnl_pref %>% kable(
wtp_ci_mnl_pref_kable format = 'html', escape = FALSE,
col.names = c("Attribute", "Mean", "Lower", "Upper"),
caption = "Simulated WTP with CI") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# 5. Save WTP with uncertainty
save(
wtp_mnl_pref,
wtp_ci_mnl_pref,file = here("data", "wtp_from_pref.RData")
)
wtp_mnl_pref_kable
wtp_ci_mnl_pref_kable# 1. Load the mnl_pref model
load(here("models", "mnl_pref.RData"))
# 2. Market Simulation
# 2.1 Single market simulation
<- data.frame(
scenarios_sing alt_id = c(1, 2, 3),
obs_id = c(1, 1, 1),
upfront = c(-5, -1, 0),
lv_2_charger_yes = c(1, 0, 0),
discount = c(5, 1, 0),
window = c(1, 4, 0),
guaranteed = c(2.5, 7.5, 0),
no_choice = c(0, 0, 1)
)
<- predict(
sim_mnl_pref_sing
mnl_pref,newdata = scenarios_sing,
obsID = 'obs_id',
level = 0.95,
interval = 'confidence',
returnData = TRUE
)
# 2.2 Multiple simulations
<- read_csv(here("data", "scenarios.csv"))
scenarios_mult
<- predict(
sim_mnl_pref_mult
mnl_pref,newdata = scenarios_mult,
obsID = 'obs_id',
level = 0.95,
interval = 'confidence',
returnData = TRUE
)
# 2.3 Save simulations
save(
sim_mnl_pref_sing,
sim_mnl_pref_mult,file = here("sims", "sim_mnl_pref.RData")
)
# 3. Market Simulation Plots
# 3.1 Single market simulation plot
<- sim_mnl_pref_sing %>%
sim_sing_plot mutate(label = factor(c("Monetary", "Flexibility", "Not Enroll"),
levels = c("Monetary",
"Flexibility",
"Not Enroll"))) %>%
ggplot(aes(
x = label, y = predicted_prob,
ymin = predicted_prob_lower, ymax = predicted_prob_upper)) +
geom_col(fill = "grey", width = 0.6) +
geom_errorbar(width = 0.3) +
scale_y_continuous(limits = c(0, 1), labels = scales::percent) +
labs(x = 'Options',
y = 'Probability',
title = 'Single Market Simulation Plot',
subtitle = 'Monetary vs Flexibility') +
theme_bw(base_family = 'Ubuntu')
ggsave(
filename = here('figs_manu', '3_2_sim_sing_plot.png'),
plot = sim_sing_plot,
width = 6,
height = 6 / 1.618
)
# 3.2 Multiple market simulation plot
<- setNames(c(
scenarios_mult_labels "Polarized", "Incentive Only", "Charger Only", "Discount Only",
"Window Only", "Guaranteed Only", "Monetary vs Flexibility",
"Monetary", "Flexibility"), 1:9)
<- sim_mnl_pref_mult %>%
sim_mult_plot ggplot(aes(
x = as.factor(alt_id), y = predicted_prob,
ymin = predicted_prob_lower, ymax = predicted_prob_upper)) +
geom_col(fill = "grey", width = 0.6) +
geom_errorbar(width = 0.3) +
facet_wrap(~obs_id, labeller = labeller(obs_id = scenarios_mult_labels)) +
scale_x_discrete(labels = c("1" = "1", "2" = "2", "3" = "No")) +
scale_y_continuous(limits = c(0, 1), labels = scales::percent) +
labs(x = 'Options',
y = 'Probability',
title = 'Multiple Market Simulation Plot') +
theme_bw(base_family = 'Ubuntu')
ggsave(
filename = here('figs_manu', '3_2_sim_mult_plot.png'),
plot = sim_mult_plot,
width = 7,
height = 7 * 0.8
)
sim_sing_plot
sim_mult_plot<- tibble(
sim_cases `No.` = c(seq(1, 9)),
`Option 1` = c(
"Best levels of all attributes",
"Best upfront incentive",
"Free charger",
"Best electricity price discount",
"Best override window",
"Best guaranteed battery",
"Best upfront, discount, and a free charger",
"Best upfront, discount, and a free charger",
"Best window and guaranteed battery"),
`Option 2` = c(
"Lowest level for all",
"Lowest level for all",
"Lowest level for all",
"Lowest level for all",
"Lowest level for all",
"Lowest level for all",
"Best window and guaranteed battery",
"Lowest level for all",
"Lowest level for all"
),`No Option` = c(
"No participation", "No participation", "No participation",
"No participation", "No participation", "No participation",
"No participation", "No participation", "No participation"
)
)
<- sim_cases %>%
sim_cases_kable kable(format = 'html', escape = FALSE,
align = c('c', 'l', 'l', 'l'),
caption = "Multiple Market Simulation Cases") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
sim_cases_kable# 1. Load the mnl_pref model
load(here("models", "mnl_pref.RData"))
# 2. Sensitivity Analysis
# 2.1 Sensitivity of market share to changes in "upfront"
# Set a baseline scenario
# Option 1 is a no option
# Option 2 is provided with moderate values
# We will change the upfront values for Option 2
<- data.frame(
baseline alt_id = c(1, 2),
obs_id = c(1, 1),
upfront = c(0, -2),
lv_2_charger_yes = c(0, 1),
discount = c(0, 2.5),
window = c(0, 2),
guaranteed = c(0, 5),
no_choice = c(1, 0)
)
<- baseline %>% kable(
baseline_kable format = 'html', escape = FALSE,
caption = "Single Attribute Baseline") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# Set the sequence of upfront levels from 1000 to 100 USD
<- seq(-10, -1, by = 1)
upfront_levels
# Count the numbers of upfront scenarios
<- length(upfront_levels)
upfront_numbers
# Create scenarios with different upfront levels in option 2
<- do.call(
scenario_upfront replicate(upfront_numbers, baseline, simplify = FALSE)
rbind,
)$obs_id <- rep(seq(upfront_numbers), each = 2)
scenario_upfront$upfront[which(scenario_upfront$alt_id == 2)] <- upfront_levels
scenario_upfront
# Simulate the upfront sensitivity
<- predict(
sens_upfront
mnl_pref,newdata = scenario_upfront,
obsID = 'obs_id',
level = 0.95,
interval = 'confidence',
returnData = TRUE
%>%
) # Keep only option 2
filter(alt_id == 2) %>%
# Keep only upfront and predictions
select(upfront, starts_with("predicted_"))
<- sens_upfront %>% kable(
sens_upfront_kable format = 'html', escape = FALSE,
caption = "Upfront Sensitivity Table") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# 2.2 Sensitivity of market share to changes in multiple attributes
# We continue to work on Option 2
# This time, we modify all continuous attributes
# The attributes are: upfront, discount, window, and guaranteed
# Construct the attribute_levels, like we constructed upfront_levels
# But attribute_levels is a df, while upfront_levels is a sequence
<- tribble(
attribute_levels ~obs_id, ~alt_id, ~attribute, ~case, ~value,
2, 2, 'upfront', 'high', -5,
3, 2, 'upfront', 'low', -1,
4, 2, 'discount', 'high', 5,
5, 2, 'discount', 'low', 1,
6, 2, 'window', 'high', 4,
7, 2, 'window', 'low', 0.5,
8, 2, 'guaranteed', 'high', 7.5,
9, 2, 'guaranteed', 'low', 2.5
)
<- attribute_levels %>%
attribute_levels_kable kable(format = 'html', escape = FALSE,
col.names = c("Obs ID", "Alt ID", "Attribute",
"Case", "Value"),
caption = "Attribute Levels") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# Define the numbers of different scenarios
<- 9
scenario_numbers
# Create scenarios with different upfront levels in option 2
<- do.call(
scenario_table replicate(scenario_numbers, baseline, simplify = FALSE)
rbind,
)$obs_id <- rep(seq(scenario_numbers), each = 2)
scenario_table<- scenario_table %>%
scenario_table left_join(attribute_levels, by = c("alt_id", "obs_id")) %>%
mutate(
attribute = ifelse(is.na(attribute), "other", attribute),
case = ifelse(is.na(case), "base", case),
upfront = ifelse(attribute == 'upfront', value, upfront),
discount = ifelse(attribute == 'discount', value, discount),
window = ifelse(attribute == 'window', value, window),
guaranteed = ifelse(attribute == 'guaranteed', value, guaranteed)
)
<- scenario_table %>%
scenario_kable kable(format = 'html', escape = FALSE,
col.names = c("Alt ID", "Obs ID", "Upfront", "Charger",
"Discount", "Window", "Guaranteed",
"No Choice", "Attribute", "Case", "Value"),
caption = "Scenario Table") %>%
kable_styling(bootstrap_options = c("striped", "hover"))
# Simulate the multiple change sensitivity
<- predict(
sens_multiple
mnl_pref,newdata = scenario_table,
obsID = 'obs_id',
level = 0.95,
interval = 'confidence',
returnData = TRUE
%>%
) # Keep only option 2
filter(alt_id == 2) %>%
# Keep only upfront and predictions
select(upfront, attribute, case, value, predicted_prob)
# Save simulations
save(
sens_upfront,
sens_multiple,file = here("sims", "sens_mnl_pref.RData")
)# 3. Sensitivity Plots
# 3.1 Plot of market share change to "upfront"
<- sens_upfront %>%
sens_upfront_plot ggplot(aes(x = upfront * -100,
y = predicted_prob,
ymin = predicted_prob_lower,
ymax = predicted_prob_upper)) +
geom_ribbon(alpha = 0.2) +
geom_line(linetype = "dashed") +
geom_line(
data = sens_upfront %>% filter(upfront <= -1, upfront >= -5),
linetype = "solid") +
expand_limits(x = c(-10, -1), y = c(0, 1)) +
labs(x = 'Upfront (USD)',
y = 'Market Share',
title = 'Sensitivity of market share change to upfront') +
theme_bw(base_family = "Ubuntu")
# Save plot
ggsave(
filename = here('figs_manu', '3_3_sens_upfront_plot.png'),
plot = sens_upfront_plot,
width = 6,
height = 6 / 1.618
)
# 3.2 Plot of policy cost change to "upfront"
# Assume a market size of 1000 users
<- 1000
market_size <- sens_upfront %>%
sens_cost mutate(
cost_mean = upfront*market_size*predicted_prob*-100 / 1000,
cost_low = upfront*market_size*predicted_prob_lower*-100 / 1000,
cost_high = upfront*market_size*predicted_prob_upper*-100 / 1000)
<- sens_cost %>%
sens_cost_plot ggplot(aes(x = upfront * -100,
y = cost_mean,
ymin = cost_low,
ymax = cost_high)) +
geom_ribbon(alpha = 0.2) +
geom_line(linetype = "dashed") +
geom_line(
data = sens_cost %>% filter(upfront <= -1, upfront >= -5),
linetype = "solid") +
expand_limits(x = c(-10, -1), y = c(0, 1)) +
labs(x = 'Upfront (USD)',
y = 'Cost ($1000)',
title = 'Sensitivity of policy cost change to upfront') +
theme_bw(base_family = "Ubuntu")
# Save plot
ggsave(
filename = here('figs_manu', '3_3_sens_cost_plot.png'),
plot = sens_cost_plot,
width = 6,
height = 6 / 1.618
)
# 3.2 Tornado Plot of market share change to multiple attributes
<- data.frame(
labels attribute = c('upfront', 'discount', 'window', 'guaranteed'),
label = c('Upfront Incentive (USD)',
'Electricity Discount (%)',
'Override Window (hrs)',
'Guaranteed Battery (%)')
)
<- sens_multiple %>%
tornado_data select(-upfront) %>%
filter(case != 'base') %>%
left_join(labels, by = 'attribute') %>%
mutate(value = ifelse(attribute == "upfront", -100*value, value),
value = ifelse(attribute == "discount" | attribute == "guaranteed",
10*value, value))
<- jph::ggtornado(
tornado_plot data = tornado_data,
baseline = sens_multiple$predicted_prob[1],
var = 'label',
level = 'case',
value = 'value',
result = 'predicted_prob'
+
) scale_fill_manual(values = c("#67a9cf", "#ef8a62")) +
labs(x = 'Market Share',
y = 'Attribute',
title = 'Plot of market share change to multiple attributes') +
theme_bw(base_family = 'Ubuntu') +
theme(legend.position = 'none')
# Save plot
ggsave(
filename = here('figs_manu', '3_3_tornado_plot.png'),
plot = tornado_plot,
width = 6,
height = 6 / 1.618
)
baseline_kable
sens_upfront_kable
sens_upfront_plot
sens_cost_plot
attribute_levels_kable
scenario_kable
tornado_plot# Read in the choice questions
library(tidyverse)
<- read_csv("https://raw.githubusercontent.com/pingfan0727/Public-Resources/main/datasets/cbc_questions_final.csv")
survey
# Define the respondent ID
<- sample(survey$respID, 1)
respondentID
# Create the subset of rows for that respondent ID
<- survey %>%
df filter(respID == respondentID)
# Convert df to json
<- jsonlite::toJSON(df) # Here the df_json is NOT serialized since we are using Kables instead of buttons.
df_json library(dplyr)
library(kableExtra)
<- tibble(
brief `Attribute` = c("Range", "Explanation"),
`Upfront Incentive:` = c("$100, $300, $500", "It means the amount of money you will be instantly paid if you join the smart charging program. The unit is in USD."),
`Free Level 2 Charger:` = c("No or Yes", "A \"Yes\" means you will get a free level 2 charger. A level 2 charger enables faster charging than a standard home charger."),
`Electricity Price Discount:` = c("10%, 25%, 50%", "If you choose to join this smart charging program, this will be your electricity price discount."),
`Override Window:` = c("0.5 ~ 4 hrs", "This is the longest time window per day that you can force override to regular charging, during which time your EV is not controlled by the smart charging program."),
`Guaranteed Range if charged for 8 hrs:` = c("25%, 50%, 75%", "It means the guranteed range if your car is charged for 8 hrs during smart charging. It shows percentage here, but will be converted to miles."))
row.names(brief) <- NULL
kable(t(brief), escape = FALSE) %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = FALSE,
position = "center") %>%
row_spec(row = 1, bold = TRUE)
library(dplyr)
library(kableExtra)
library(tibble)
<- tibble(
alts `Options:` = c("Option 1", "Option 2"),
`Upfront Incentive:` = scales::dollar(c(100, 500)),
`Free Level 2 Charger:` = c("No", "Yes"),
`Electricity Price Discount:` = c("25%", "50%"),
`Override Window:` = c("0.5 hrs", "4 hrs"),
`Guaranteed Range if charged for 8 hrs:` = c("100 miles", "300 miles"))
<- as.data.frame(t(alts)) %>%
alts_t rownames_to_column(var = "Attribute")
colnames(alts_t) <- NULL
kable(alts_t, escape = FALSE, align = c('r', 'c', 'c')) %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = FALSE,
position = "center") %>%
column_spec(column = 1, width = '20em') %>%
column_spec(column = 2, width = '10em') %>%
column_spec(column = 3, width = '10em') %>%
row_spec(row = 1, bold = TRUE)
library(dplyr)
library(kableExtra)
library(tibble)
<- jsonlite::fromJSON(df_json) %>%
alts filter(qID == 1 & altID != 3) %>%
mutate(
altID = paste("Option", altID),
upfront = scales::dollar(upfront),
discount = paste0(discount,"%"),
window = paste(window, "hrs"),
guaranteed = paste(guaranteed, "%"),
lv_2_charger_Yes = case_when(lv_2_charger_Yes == 1 ~ "Yes",
== 0 ~ "No")) %>%
lv_2_charger_Yes select(
`Options:` = altID,
`Upfront Incentive:` = upfront,
`Free Level 2 Charger:` = lv_2_charger_Yes,
`Electricity Price Discount:` = discount,
`Override Window:` = window,
`Guaranteed Range if charged for 8 hrs:` = guaranteed)
<- as.data.frame(t(alts)) %>%
alts_t rownames_to_column(var = "Attribute")
colnames(alts_t) <- NULL
kable(alts_t, escape = FALSE, align = c('r', 'c', 'c')) %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = FALSE,
position = "center") %>%
column_spec(column = 1, width = '20em') %>%
column_spec(column = 2, width = '10em') %>%
column_spec(column = 3, width = '10em') %>%
row_spec(row = 1, bold = TRUE)
<- data.frame(
dict_code name = c('0_make_choice_questions',
'1_1_clean_data',
'1_2_summary_statistics',
'2_1_mnl_pref_model',
'2_2_mxl_pref_model',
'2_3_mnl_pref_groups_model',
'2_4_mnl_wtp_model',
'2_5_mxl_wtp_model',
'3_1_wtp_from_pref',
'3_2_market_sim',
'3_3_sensitivity_analysis'),
type = 'R',
description = c('Construct the conjoint questions.',
'Clean and export survey data.',
'Generate summary statistics of test takers.',
'The simple logit preference space model.',
'The mixed logit preference space model.',
'The simple logit preference space model by groups.',
'The simple logit WTP space model.',
'The mixed logit WTP space model.',
'Calculation of WTP using mnl_pref model.',
'Market simulations',
'Market sensitivity analysis')
%>%
) mutate(count = row_number()) %>%
select(count, everything())
%>%
dict_code kable(format = 'html',
escape = FALSE,
align = c('c', 'l', 'c', 'l'),
col.names = c('No.', 'Name', 'Type', 'Description'),
caption = 'Dictionary of R Codes') %>%
kable_styling(bootstrap_options = c('striped', 'hover'))
<- data.frame(
dict_file name = c('choice_data',
'choice_questions',
'pevFinal1',
'pevFinal2',
'pevFinal3',
'scenarios',
'choice_data_wider',
'wtp_from_pref',
'mnl_pref_groups',
'mnl_pref',
'mnl_wtp',
'mxl_pref',
'mxl_wtp',
'sens_mnl_pref',
'sim_mnl_pref'),
type = c('CSV', 'CSV', 'CSV', 'CSV', 'CSV', 'CSV',
'RData', 'RData', 'RData', 'RData', 'RData',
'RData', 'RData', 'RData', 'RData'),
description = c(
'Cleaned results from the final survey.',
'Local version of Conjoint questions.',
'Part 1 of final survey result from formr.',
'Part 2 of final survey result from formr.',
'Part 3 of final survey result from formr.',
'Scenarios of multiple attributes for sensitivity analysis.',
'Wider version of survey results',
'WTP with uncertainty calculated from mnl_pref.',
'The simple logit preference space model.',
'The mixed logit preference space model.',
'The simple logit preference space model by groups.',
'The simple logit WTP space model.',
'The mixed logit WTP space model.',
'Market sensitivity of mnl_pref model.',
'Market simulation of mnl_pref model.'
)%>%
) mutate(count = row_number()) %>%
select(count, everything())
%>%
dict_file kable(format = 'html',
escape = FALSE,
align = c('c', 'l', 'c', 'l'),
col.names = c('No.', 'Name', 'Type', 'Description'),
caption = 'Dictionary of Files') %>%
kable_styling(bootstrap_options = c('striped', 'hover'))
<- data.frame(
dict_df name = c('choice_data',
'choice_data_charging',
'choice_data_demo',
'choice_data_demo_2',
'choice_data_vehicle',
'choice_data_wider',
'coef_mxl_wtp_df',
'data_mnl_pref',
'data_mnl_pref_groups',
'data_mxl_pref',
'figure_mnl_pref',
'figure_mnl_pref_groups',
'figure_mxl_pref',
'mnl_wtp_compare',
'mxl_wtp_compare',
'profiles',
'scenario_table',
'scenario_upfront',
'scenarios_mult',
'scenarios_sing',
'sens_cost',
'sens_multiple',
'sens_upfront',
'sim_cases',
'sim_mnl_pref_mult',
'sim_mnl_pref_sing',
'survey',
'table_mnl_pref',
'table_mnl_pref_groups',
'table_mxl_pref',
'tornado_data',
'wtp_ci_mnl_pref'),
type = c('A tibble: 2,928 × 40',
'A tibble: 336 × 38',
'A tibble: 499 × 36',
'A tibble: 244 × 38',
'A tibble: 610 × 35',
'A tibble: 122 × 38',
'A data frame: 6 × 1',
'A tibble: 2,928 × 40',
'A tibble: 2,928 × 48',
'A tibble: 2,928 × 40',
'A data frame: 6 × 5',
'A data frame: 12 × 5',
'A data frame: 6 × 5',
'A data frame: 7 × 3',
'A data frame: 6 × 3',
'A data frame: 216 × 6',
'A data frame: 18 × 11',
'A data frame: 20 × 8',
'A tibble: 27 × 8',
'A data frame: 3 × 8',
'A data frame: 10 × 7',
'A data frame: 9 × 5',
'A data frame: 10 × 4',
'A tibble: 9 × 4',
'A data frame: 27 × 11',
'A data frame: 3 × 11',
'A tibble: 12,000 × 12',
'A data frame: 6 × 6',
'A data frame: 12 × 6',
'A data frame: 12 × 6',
'A data frame: 8 × 5',
'A data frame: 5 × 3'),
description = c(
'Conjoint questions for the pilot survey.',
'Longer version of choice data about charging.',
'Longer version of choice data about main demography.',
'Longer version of choice data about education and income.',
'Longer version of choice data about vehicle.',
'Original wider version of choice data.',
'Coefficients of the mxl_wtp model.',
'Choice data for the mnl_pref model.',
'Choice data for the mnl_pref_groups model.',
'Choice data for the mxl_pref model.',
'Data frame for mnl_pref plot.',
'Data frame for mnl_pref_groups plot.',
'Data frame for mxl_pref plot.',
'Compare data between mnl_pref and mnl_wtp.',
'Compare data between mnl_pref and mxl_wtp.',
'CBC profiles of the conjoint survey.',
'Scenarios of multiple attributes for market sensitivity analysis.',
'Scenarios of upfront attribute for market sensitivity analysis.',
'Multiple scenarios for maket simulation.',
'Single scenarios for maket simulation.',
'Sensitivity of cost.',
'Sensitivity of multiple attributes.',
'Sensitivity of upfront attribute.',
'Cases of multiple market simulation.',
'Multiple simulation using mnl_pref model.',
'Single simulation using mnl_pref model.',
'Survey data.',
'Details of the mnl_pref model.',
'Details of the mnl_pref_groups model.',
'Details of the mxl_pref model.',
'Data of tornado plot.',
'WTP with CI calculated from mnl_pref model.'
)%>%
) mutate(count = row_number()) %>%
select(count, everything())
%>%
dict_df kable(format = 'html',
escape = FALSE,
align = c('c', 'l', 'l', 'l'),
col.names = c('No.', 'Name', 'Type', 'Description'),
caption = 'Dictionary of Data Frames') %>%
kable_styling(bootstrap_options = c('striped', 'hover'))