Final Analysis - Smart Charging for PEVs

Quantifying the Benefits and Constraints of Plug-In Electric Vehicle Smart Charging Adoption

Authors

Pingfan Hu

Bharath Ravindra

Sampada Dhakal

Vedanth Surendra Hegde

Published

December 10, 2023

Abstract

The product being studied in the project proposal is “Smart Charging Solutions for Plug-in Electric Vehicles (PEVs).” The initiative aims to investigate the reception and potential adoption of these technologies among users who own or might consider owning PEVs. The focus is on understanding consumer attitudes, preferences, and potential barriers regarding the adoption of smart charging solutions for electric vehicles. The project intends to delve into aspects such as awareness, perceived benefits, and potential hindrances associated with smart charging technologies for PEVs, exploring attributes like compensation, electricity pricing, charging speed, and more.

The investigation of consumer inclinations and considerations within the project was centered on a strategic analysis of distinct product attributes and their corresponding design decisions. These key attributes included Upfront Incentive, Free Level 2 Charger, Electricity Price Discount, Override Window, and the commitment to a Guaranteed Range if smart charged for 8 hours. Notably, the design decisions for these attributes were structured with deliberate variations: the Upfront Incentive ranged across $100, $300, and $500, the provision of a Free Level 2 Charger was delineated as ‘Yes’ or ‘No’, and the Electricity Price Discount was presented at three distinct rates: 10, 25, and 50. Additionally, the Guaranteed Range if smart charged for 8 hours offered choices of 25, 50, and 75 units. These carefully curated design decisions were aimed at assessing consumer responsiveness to diverse incentive structures, charging infrastructure availability, pricing models, and the potential impact of guaranteed range, thereby offering comprehensive insights into the dynamics of smart charging solution adoption among consumers.

Introduction

The global drive for sustainable energy has propelled Plug-in Electric Vehicles (PEVs) to the forefront in the fight against climate change. Smart charging technologies, integrating advanced grid systems and digital connectivity, offer a transformative opportunity to enhance PEVs’ role in a sustainable future by optimizing charging processes, improving grid stability, and reducing costs.

This project proposal aims to gauge user acceptance of smart charging solutions for PEVs. It delves into factors influencing consumer choices, uncovering awareness levels, benefits, and barriers associated with smart charging among PEV owners. Using surveys, interviews, and data analysis, the project aims to reveal user attitudes, preferences, and concerns about smart charging.

The comprehensive approach considers various attributes like compensation, electricity price, charging speed, window, and percentage, exploring user attitudes toward smart charging. It also evaluates the impact of compensation on adoption rates, offering valuable insights for stakeholders in the electric vehicle industry.

In an era focused on environmental awareness and sustainable transportation, this project accelerates the shift toward cleaner and more efficient mobility solutions. It aligns with global energy sustainability goals, emphasizing the need to reduce carbon emissions and create a smarter, more connected, and environmentally responsible landscape for electric vehicle charging.

Survey Design

The survey is divided into three sections. Section 1 covers PEV Ownership and Charging Habits Questions, Section 2 addresses Smart Charging Problems, and Section 3 gathers Demographic Information.

Before starting the actual survey, participants will see a consent page and a 30-second educational video to introduce them to Smart Charging. The video explains smart charging and how it allows the utility to control the PEV charging process, including the time window and charging amount. In order to ensure engagement of all users, this survey does not fail any test takers if they are not EV owners. However, it turned out the majority of test takers are EV owners. This survey provides a “not interested” option if the test takers don’t want to participate.

Section 1 begins by asking participants about their interest in PEVs and whether they own, lease, plan to have one in the future, or do not have one. It also inquires about their preferred PEV model, preferred mileage, and charging habits at home and work. The respondents who have PEVs or tend to have PEVs are within our eligibility.

Section 2, focusing on Smart Charging Problems, consists of eight questions per respondent. This section addresses five main issues faced by smart charging: Upfront Incentive, Free Level 2 charger, Electricity Price discount, Override Window, and Guaranteed Range if charged for 8 hours. Upfront Incentive refers to the instant payment when joining the smart charging program, ranging from $100 to $500. Free Level 2 charger is a faster charging option, with respondents choosing between “Yes” or “No.” Electricity Price discount offers a discount ranging from 10% to 50% if they join the smart charging program. Override Window refers to the longest daily period where users can override regular charging, with values ranging from 0.5 to 4 hours. Guaranteed Range if charged for 8 hours specifies the range achieved if the car is charged for 8 hours during the smart charging period. The value shown ranges from 25% to 75% of the maximum range in miles, relying on users’ input. Each of the attributes with their respective levels and brief explanation is summarized in the table below:

Attribute Range Explanation
Upfront Incentive: $100, $300, $500 It means the amount of money you will be instantly paid if you join the smart charging program. The unit is in USD.
Free Level 2 Charger: No or Yes A "Yes" means you will get a free level 2 charger. A level 2 charger enables faster charging than a standard home charger.
Electricity Price Discount: 10%, 25%, 50% If you choose to join this smart charging program, this will be your electricity price discount.
Override Window: 0.5 ~ 4 hrs This is the longest time window per day that you can force override to regular charging, during which time your EV is not controlled by the smart charging program.
Guaranteed Range if charged for 8 hrs: 25%, 50%, 75% It means the guranteed range if your car is charged for 8 hrs during smart charging. It shows percentage here, but will be converted to miles.

This part of our survey involves a small twist: the respondent’s choice in Section 1 question determines the percentage attribute. In the guaranteed range section, we’ll calculate mileage based on the user’s input from Section 1. The guaranteed range in percentage ranges from 25% to 75% for the user’s input. Each question has three alternatives, with the third option being “I am not interested.” Since these attributes are randomly generated, some options may seem illogical or overly obvious. To ensure respondent understanding, we’ve included a sentence asking them to choose between two options if those were the only choices available. Additionally, a third option, “I am not interested,” is provided for those who do not show interest in the available options. Similarly, to ensure clarity, Section 2 starts with a description of the attributes, their values, and a sample question to guide respondents. Below is an example of our conjoint question.

Code
alts <- tibble(
    `Options:` = c("Option 1", "Option 2"),
    `Upfront Incentive:` = scales::dollar(c(100, 500)),
    `Free Level 2 Charger:` = c("No", "Yes"), 
    `Electricity Price Discount:` = c("25%", "50%"), 
    `Override Window:` = c("0.5 hrs", "4 hrs"),
    `Guaranteed Range if charged for 8 hrs:` = c("100 miles", "300 miles"))

alts_t <- as.data.frame(t(alts)) %>% 
  rownames_to_column(var = "Attribute")

colnames(alts_t) <- NULL

kable(alts_t, escape = FALSE, align = c('r', 'c', 'c')) %>% 
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE, 
                position = "center") %>%
  column_spec(column = 1, width = '20em') %>%
  column_spec(column = 2, width = '10em') %>%
  column_spec(column = 3, width = '10em') %>%
  row_spec(row = 1, bold = TRUE)
Options: Option 1 Option 2
Upfront Incentive: $100 $500
Free Level 2 Charger: No Yes
Electricity Price Discount: 25% 50%
Override Window: 0.5 hrs 4 hrs
Guaranteed Range if charged for 8 hrs: 100 miles 300 miles

Section 3 collects demographic information, travel habits, opinions regarding climate change and commuting preferences. This includes birth year, gender identity, race, educational background, household income, interest in the environment and climate change, political inclination, ZIP code, commute method, number of cars in their household, presence of EV charger, and the presence of backup cars, if any, along with their quantity.

There have not been any major changes between the pilot and final surveys. The only modification is an increase in the number of questions per respondent, from 5 in the pilot survey to 8 in the final version.

Data Analysis

This section describes the analysis that the team carried out using our pilot data. It contains 3 subsections: sample description, data cleaning, and modeling. The analysis breakdown is described in the following 3 subsections, and the data dictionary is show in the appendix (see 2. Data Dictionary).

1. Sample Description

The team generated a randomized full-factorial design of cbc choice question set using cbc_design() and fed into the survey. The design is generated using the following codes:

Code
# Generate the profiles of attributes
profiles <- cbc_profiles(
    upfront      = c(100, 300, 500), # USD
    lv_2_charger = c('No', 'Yes'),
    discount     = c(10, 25, 50),    # percentage
    window       = c(0.5, 1, 2, 4),  # hrs
    guaranteed   = c(25, 50, 75)     # percentage
)

# Randomized full-factorial design
design <- cbc_design(
    profiles = profiles,
    n_resp   = 500,  # Number of respondents
    n_alts   = 2,    # Number of alternatives per question
    n_q      = 8,    # Number of questions per respondent
    no_choice = TRUE
)

# Save design
write_csv(design, here('data', 'choice_questions.csv'))

The choice_questions.csv is saved to the data folder. However, in the survey analysis, the choice questions refered to is from an online source with the same format. See appendix 1. Conjoint Survey Documentation for the documentation of the conjoint survey questions.

2. Data Cleaning

The team had 122 complete sets of survey results, each with 8 conjoint questions answered. The data cleaning process is written in the code chunk below:

Code
# 1. Read survey data
p1 <- read_csv(here("data", "pevFinal1.csv"))
p2 <- read_csv(here("data", "pevFinal2.csv"))
p3 <- read_csv(here("data", "pevFinal3.csv"))

# 2. Format and combine the 3 dataframes
# 2.1 Format p1
p1 <- p1 %>% 
    mutate(
        created = ymd_hms(created, tz = "EST"),
        ended =  ymd_hms(ended, tz = "EST"),
        time_sec_p1 = as.numeric(ended - created, units = "secs")) %>%
    # Select important columns
    select(session, time_sec_p1, starts_with("mc"))

# 2.2 Format p2
p2 <- p2 %>% 
    mutate(
        created = ymd_hms(created),
        ended =  ymd_hms(ended),
        time_sec_p2 = as.numeric(ended - created, units = "secs")) %>%
    # Select important columns
    select(session, time_sec_p2, respondentID, cbc_all_same, starts_with("mc"))

# 2.3 Format p3
p3 <- p3 %>% 
    mutate(
        created = ymd_hms(created),
        ended =  ymd_hms(ended),
        time_sec_p3 = as.numeric(ended - created, units = "secs")) %>%
    # Select important columns
    select(session, time_sec_p3, completion_code, starts_with("mc"))

# 2.4 Combine all parts together using the session variable
choice_data_wider <- p1 %>% 
    left_join(p2, by = "session") %>% 
    left_join(p3, by = "session") %>% 
    # No longer need session variable
    select(-session)

# 3. Drop unnecessary parts
# 3.1 Drop anyone who didn't complete all choice questions
choice_data_wider <- choice_data_wider %>% 
    filter(!is.na(mc_cbc_1)) %>% 
    filter(!is.na(mc_cbc_2)) %>% 
    filter(!is.na(mc_cbc_3)) %>% 
    filter(!is.na(mc_cbc_4)) %>% 
    filter(!is.na(mc_cbc_5)) %>% 
    filter(!is.na(mc_cbc_6)) %>% 
    filter(!is.na(mc_cbc_7)) %>% 
    filter(!is.na(mc_cbc_8))

# 3.2 Drop respondents who went too fast
choice_data_wider <- choice_data_wider %>% 
    mutate(
        # First replace NA values with 0 seconds
        time_sec_p1 = ifelse(is.na(time_sec_p1), 0, time_sec_p1),
        time_sec_p2 = ifelse(is.na(time_sec_p2), 0, time_sec_p2),
        time_sec_p3 = ifelse(is.na(time_sec_p3), 0, time_sec_p3),
        # Now compute the total time
        time_min_total = (time_sec_p1 + time_sec_p2 + time_sec_p3) / 60
    )

# 3.3 Drop anyone who finished in under the 10th percentile of completion times
time_10 <- quantile(choice_data_wider$time_min_total, 0.1)
choice_data_wider <- choice_data_wider %>% 
    filter(time_min_total >= time_10)

# 3.4 Drop respondents that got the attention check question wrong
# choice_data_wider <- choice_data_wider %>% 
#    filter(mc_cbc_practice == 'option_2')

# 4. Generate choice data csv file
# 4.1 Convert the data to long format
choice_data <- choice_data_wider %>% 
    pivot_longer(
        cols = mc_cbc_1:mc_cbc_8,
        names_to = "qID",
        values_to = "choice") %>% 
    # Convert the qID variable to a number
    mutate(qID = parse_number(qID))

# 4.2 Read in choice questions and join it to choice_data
survey <- read_csv("https://github.com/pingfan0727/Public-Resources/raw/main/datasets/cbc_questions_final.csv")
choice_data <- choice_data %>% 
    rename(respID = respondentID) %>% 
    left_join(survey, by = c("respID", "qID"), relationship = "many-to-many")

# 4.3 Convert choice column to 1 or 0 based on if the alternative was chosen 
choice_data <- choice_data %>% 
    mutate(
        choice = case_when(
            str_detect(choice, "^option_") ~
                as.numeric(str_replace(choice, "option_", "")),
            choice == "not_interested" ~ 3,
            TRUE ~ as.numeric(choice)
        ),
        choice = ifelse(choice == altID, 1, 0)
    ) %>% 
    select(-mc_cbc_practice, -cbc_all_same)

# 4.4 Create new values for respID & obsID
nRespondents <- nrow(choice_data_wider)
nAlts <- max(survey$altID)
nQuestions <- max(survey$qID)
choice_data$respID <- rep(seq(nRespondents), each = nAlts*nQuestions)
choice_data$obsID <- rep(seq(nRespondents*nQuestions), each = nAlts)

# 4.5 Reorder columns - it's nice to have the "ID" variables first
choice_data <- choice_data %>% 
    select(ends_with("ID"), "choice", "upfront",
           "lv_2_charger_Yes", "lv_2_charger_No",
           "discount", "window", "guaranteed",
           "no_choice", everything())

# 4.6 Clean up names for choice_data
choice_data <- clean_names(choice_data)

# 4.7 Save the choice_data_wider for future use
save(
    choice_data_wider,
    file = here("data", "choice_data_wider.RData")
)

# 4.8 Save cleaned data for modeling
write_csv(choice_data, here("data", "choice_data.csv"))

The cleaned choice_data.csv file is saved to the data folder.

The team followed Prof. Helveston’s instructions on dropping criteria, including those were:

  • Incomplete
  • Too fast
  • Under 10th percentile of completion time

We intentionally kept those who answered the sample question wrong (which is obviously a dominant question), since by later data analysis we can see they answered the conjoint questions properly. It can be assumed that they were just not familiar with the conjoint question in the beginning.

3. Summary Statistics

This is instructed to be in the Sample Description subsection, but in fact can only be processed after Data Cleaning. The summary statistics will be shown in 3 parts:

  1. Vehicle Info Summary
  2. Charging Preference Summary
  3. Demographic Info Summary

3.1 Vehicle Info Summary

The vehicle info summary generates and illustrates the vehicle ownership (or leasing) of the users. Below is the code chunk:

Code
# 1.1 Load and pivot the data
load(here("data", "choice_data_wider.RData"))
choice_data_vehicle <- choice_data_wider %>% 
    pivot_longer(cols = c(mc_own_ev, mc_max_range, mc_commute,
                          mc_car_number, mc_secondary),
                 names_to = 'category',
                 values_to = 'value')

# 1.2 Reorder and relabel the values based on category
choice_data_vehicle <- choice_data_vehicle %>% 
    mutate(
        category = case_when(
            category == "mc_own_ev" ~ "Own EV",
            category == "mc_max_range" ~ "Max Range",
            category == "mc_car_number" ~ "Car Number",
            category == "mc_secondary" ~ "Secondary Car",
            category == "mc_commute" ~ "Commute",
            TRUE ~ category
        ),
        value = case_when(
            category == "Own EV" ~
                factor(value, levels = c("yes_own", "yes_lease", "no_but_next",
                                         "no_but_interested", "no"), 
                       labels = c("Own", "Lease", "Plan", "Interested", "No")),
            category == "Max Range" ~
                factor(value, levels = c(
                    "below_100", "range_100_to_149", "range_150_to_199",
                    "range_200_to_249", "range_250_to_299", "over_300"
                ),
                labels = c("100", "150", "200", "250", "300", ">300")),
            category == "Car Number" ~
                factor(value, levels = c("zero", "one", "two", "three", "four"),
                       labels = c("0", "1", "2", "3", "4")),
            category == "Secondary Car" ~
                factor(value, levels = c("yes_ev", "yes_gas", "yes_both", "no"),
                       labels = c("EV", "Gas", "Both", "Nil")),
            category == "Commute" ~
                factor(value, levels = c("on_foot", "by_bus", "by_train_subway",
                                         "self_driving", "others"),
                       labels = c("On Foot", "By Bus", "By Train",
                                  "Self-driving", "Others")),
            TRUE ~ as.factor(value)
        )
    )

# 1.3 Plot the vehicle info summary
vehicle_plot <- choice_data_vehicle %>%
    ggplot(aes(x = value)) + 
    geom_bar(width = 0.6) +
    scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
    facet_wrap(~fct_relevel(category, "Own EV", "Max Range",
                            "Car Number", "Secondary Car", "Commute"),
               scales = "free_x", nrow = 3) +
    labs(x = NULL,
         y = "Count",
         title = "Vehicle Info Summary Plots",
         subtitle = "Summary Statistics Part 1") +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '1_2_vehicle_plot.png'), 
    plot = vehicle_plot,
    width = 7,
    height = 7 * 0.8
)

3.2 Charging Preference Summary

The charging preference summary generates and illustrates the workplace and home charging preferences of the users. Below is the code chunk:

Code
# 2.1 Pivot the data
choice_data_charging <- choice_data_wider %>% 
    pivot_longer(cols = c(mc_at_home, mc_at_work),
                 names_to = 'category',
                 values_to = 'value') %>% 
    separate_rows(value, sep = ",\\s*")

# 2.2 Reorder and relabel the values based on category
choice_data_charging <- choice_data_charging %>% 
    mutate(
        category = case_when(
            category == "mc_at_home" ~ "At Home",
            category == "mc_at_work" ~ "At Work",
            TRUE ~ category
        ),
        value = case_when(
            category == "At Home" ~
                factor(value, levels = c(
                    "home_morning", "home_afternoon", "home_evening",
                    "home_night", "no_home"), 
                    labels = c("Morning", "Afternoon", "Evening",
                               "Night", "No")),
            category == "At Work" ~
                factor(value, levels = c(
                    "work_morning", "work_noon", "work_afternoon",
                    "work_evening", "work_night", "no_work"), 
                    labels = c("Morning", "Noon", "Afternoon",
                               "Evening", "Night", "No")),
            TRUE ~ as.factor(value)
        )
    )

# 2.3 Plot the charging preference summary
charging_plot <- choice_data_charging %>%
    ggplot(aes(x = value)) + 
    geom_bar(width = 0.6) +
    scale_x_discrete(limits = c("Morning", "Noon", "Afternoon",
                                "Evening", "Night", "No")) + 
    scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
    facet_wrap(~fct_relevel(category, "At Home", "At Work"),
               scales = "free_x", nrow = 2) +
    labs(x = NULL,
         y = "Count",
         title = "Charging Preference Summary Plots",
         subtitle = "Summary Statistics Part 2") +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '1_2_charging_plot.png'), 
    plot = charging_plot,
    width = 6,
    height = 6 * 0.8
)

3.3 Demographic Info Summary

The demographic info summary is broken into 3 parts:

  • Part 1 - Gender, Race, Climate Caring, Party
  • Part 2 - Education
  • Part 3 - Income

Below is the code chunk for demographic part 1:

Code
# 3.1 First Part - gender, race, climate caring, and party
# 3.1.1 Pivot the data
choice_data_demo <- choice_data_wider %>% 
    mutate(across(c(mc_gender, mc_race, mc_climate, mc_party), as.character)) %>%
    pivot_longer(cols = c(mc_gender, mc_race, mc_climate, mc_party),
                 names_to = 'category',
                 values_to = 'value') %>% 
    separate_rows(value, sep = ",\\s*")

# 3.1.2 Reorder and relabel the values based on category
choice_data_demo <- choice_data_demo %>% 
    mutate(
        category = case_when(
            category == "mc_gender" ~ "Gender",
            category == "mc_race" ~ "Race",
            category == "mc_climate" ~ "Climate Caring",
            category == "mc_party" ~ "Party",
            TRUE ~ category
        ),
        value = case_when(
            category == "Gender" ~
                factor(value, levels = c(
                    "male", "female", "transMale", "transFemale",
                    "genderNonconform", "others", "prefer_not_say"), 
                    labels = c("Male", "Female", "Trans Male", "Trans Female",
                               "Genderqueer", "Others", "Not Say")),
            category == "Race" ~
                factor(value, levels = c(
                    "asian", "black", "white", "hispanic", "native",
                    "pacific", "others", "prefer_not_say"), 
                    labels = c("Asian", "Black", "White", "Hispanic", "Native",
                               "Pacific", "Other", "Not Say")),
            category == "Party" ~
                factor(value, levels = c(
                    "democratic", "republican", "independent", "prefer_not_say"), 
                    labels = c("Democratic", "Republican", "Independent", "Not say")),
            TRUE ~ as.factor(value)
        )
    )

# 3.1.3 Plot the first part of demographic info summary
demo_plot_1 <- choice_data_demo %>%
    ggplot(aes(x = value)) + 
    geom_bar(width = 0.6) +
    scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
    facet_wrap(~fct_relevel(category, "Gender", "Race", "Climate Caring", "Party"),
               scales = "free_x", nrow = 2) +
    labs(x = NULL,
         y = "Count",
         title = "Demographic Summary Plots Part 1",
         subtitle = "Summary Statistics Part 3-1") +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '1_2_demo_plot_1.png'), 
    plot = demo_plot_1,
    width = 7,
    height = 7 * 0.8
)

Below is the code chunk for demographic part 2:

Code
# 3.2 Second Part - education
# 3.2.1 Pivot the data
choice_data_demo_2 <- choice_data_wider %>% 
    mutate(across(c(mc_education, mc_income), as.character)) %>%
    pivot_longer(cols = c(mc_education, mc_income),
                 names_to = 'category',
                 values_to = 'value') %>% 
    separate_rows(value, sep = ",\\s*")

# 3.2.2 Reorder and relabel the values based on category
choice_data_demo_2 <- choice_data_demo_2 %>% 
    mutate(
        category = case_when(
            category == "mc_education" ~ "Education",
            category == "mc_income" ~ "Income",
            TRUE ~ category
        ),
        value = case_when(
            category == "Education" ~
                factor(value, levels = c(
                    "no_hs", "hs", "college_some", "vocational", "degree_associate",
                    "degree_bs", "degree_grad", "others", "prefer_not_say"), 
                    labels = c("< High School", "High School", "Some College",
                               "Vocational", "AA/AS", "BA/BS", "Grad",
                               "Others", "Not Say")),
            category == "Income" ~
                factor(value, levels = c(
                    "under60", "inc_60to80", "inc_80to100", "inc_100to150",
                    "inc_150to200", "inc_200to250", "inc_250to300",
                    "inc_300to350", "inc_350to400", "inc_over400",
                    "prefer_not_say"), 
                    labels = c("< $60K", "$60K to $80K", "$80K to $100K",
                               "$100K to $150K", "$150K to $200K",
                               "$200K to $250K", "$250K to $300K",
                               "$300K to $350K", "$350K to $400K",
                               "> $400K", "Not Say")),
            TRUE ~ as.factor(value)
        )
    )

# 3.2.3 Plot the education distribution
demo_plot_2 <- choice_data_demo_2 %>% 
    filter(category == "Education") %>% 
    ggplot(aes(x = value)) + 
    geom_bar(width = 0.6) +
    scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
    labs(x = NULL,
         y = "Count",
         title = "Demographic Summary Plots Part 2 - Education",
         subtitle = "Summary Statistics Part 3-2") +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '1_2_demo_plot_2.png'), 
    plot = demo_plot_2,
    width = 6,
    height = 6 / 1.618
)

Below is the code chunk for demographic part 3:

Code
# 3.3 Third Part - income
# 3.3.1 Plot the income distribution
demo_plot_3 <- choice_data_demo_2 %>% 
    filter(category == "Income") %>% 
    ggplot(aes(x = value)) + 
    geom_bar(width = 0.6) +
    scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
    labs(x = NULL,
         y = "Count",
         title = "Demographic Summary Plots Part 3 - Income",
         subtitle = "Summary Statistics Part 3-3") +
    coord_flip() +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '1_2_demo_plot_3.png'), 
    plot = demo_plot_3,
    width = 6,
    height = 6 / 1.618
)

3.4 Summary Statistics Data Viz

The summary plots are shown below in different tabs:

Interpretation of the summary statistics:

The vehicle info bar chart shows the majority of the test takers are EV owners. The max range of their EVs are mostly over 250 miles. They usually have at least 2 vehicles, with the other vehicle mostly being gas powered, which means EVs are probably used as a backup car. Most of the users drive to commute, which is reasonable.

The charging preference bar chart shows most people charges during evening and night time at home, but it’s harder to find charging places at work time.

The demographic plots show the basic information of the users. Male users is almost twice as females. Most of the users are white and supports the democratic party, and they do care about the environment very much, which encourages them to become EV users. The users mostly have bachelor’s degree, with moderate annual income of around $100k.

4. Modeling

The team attempted 5 models:

  1. mnl_pref - simple logit preference space model,
  2. mxl_pref - mixed logit preference space model,
  3. mnl_pref_groups - simple logit preference space model by groups,
  4. mnl_wtp - simple logit WTP space model, and
  5. mxl_wtp - mixed logit WTP space model.

4.1 Simple logit preference space

The mnl_pref model is the starting point of utility modeling. Below is the code chunk:

Code
# 1. Model Summary
# 1.1 Use the logitr package to generate the model
data_mnl_pref <- read_csv(here("data", "choice_data.csv")) %>% 
    mutate(upfront = -1 * upfront / 100,
           discount = discount / 10,
           guaranteed = guaranteed / 10)

mnl_pref <- logitr(
    data    = data_mnl_pref,
    outcome = "choice",
    obsID   = "obs_id",
    pars    = c("upfront", "lv_2_charger_yes", "discount",
                "window", "guaranteed", "no_choice")
    )

# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mnl_pref)

# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mnl_pref$gradient

# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mnl_pref$hessian)$values

# 2.3 Save the model
save(
    mnl_pref,
    file = here("models", "mnl_pref.RData")
)

# 3. Model Equation
# 3.1 Estimated values and std err
table_mnl_pref <- data.frame(
    Coefficient = c("&beta;<sub>1</sub>", "&beta;<sub>2</sub>",
                    "&beta;<sub>3</sub>", "&beta;<sub>4</sub>",
                    "&beta;<sub>5</sub>", "&beta;<sub>6</sub>"),
    Meaning = c("Upfront", "Lv 2 Charger", "Discount",
                "Window", "Guaranteed", "No Choice"),
    Estimate = coef(mnl_pref),
    StdError = se(mnl_pref),
    Level = c("1, 3, 5", "-", "1, 2.5, 5",
               "0.5, 1, 2, 4", "25, 50, 75", "-"),
    Unit = c("100 USD", "-", "10%", "1 hr", "10%", "-")
)
table_mnl_pref$Estimate <- round(table_mnl_pref$Estimate, 4)
table_mnl_pref$StdError <- round(table_mnl_pref$StdError, 4)

mnl_pref_kable <- table_mnl_pref %>% 
    kable(format = 'html', escape = FALSE,
          col.names = c("Coefficient", "Meaning", "Estimate", "Std Error",
                        "Level", "Unit"),
          align = c('c', 'l', 'c', 'c', 'c', 'c'),
          caption = "Summary of Model Coefficients") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# 3.2 Model figure
# 3.2.1 Construct the data frame
figure_mnl_pref <- data.frame(
    Attribute = factor(
        c("Upfront", "Charger", "Discount", "Window", "Guaranteed", "No Choice"),
        levels = c("No Choice", "Guaranteed", "Window", 
                   "Discount", "Charger", "Upfront")
    ),
    Estimate = coef(mnl_pref),
    StdError = se(mnl_pref)
)

figure_mnl_pref$LowerBound =
    figure_mnl_pref$Estimate - 1.96 * figure_mnl_pref$StdError
figure_mnl_pref$UpperBound =
    figure_mnl_pref$Estimate + 1.96 * figure_mnl_pref$StdError

# 3.2.2 Create the plot
mnl_pref_plot <- figure_mnl_pref %>% 
    ggplot(aes(x = Attribute, y = Estimate)) +
    geom_point() +
    geom_errorbar(aes(ymin = LowerBound, ymax = UpperBound), width = 0.2) +
    labs(title = "Coefficient Estimates with Uncertainty Bounds",
         subtitle = "Simple logit preference space",
         x = "Attribute",
         y = "Estimate") +
    coord_flip() +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '2_1_mnl_pref_plot.png'), 
    plot = mnl_pref_plot,
    width = 6, 
    height = 6 / 1.618
)

Below is the generated model coefficient table and plot for mnl_pref:

Summary of Model Coefficients
Coefficient Meaning Estimate Std Error Level Unit
upfront β1 Upfront -0.0151 0.0305 1, 3, 5 100 USD
lv_2_charger_yes β2 Lv 2 Charger 0.5884 0.0989 - -
discount β3 Discount 0.1657 0.0298 1, 2.5, 5 10%
window β4 Window 0.0835 0.0375 0.5, 1, 2, 4 1 hr
guaranteed β5 Guaranteed 0.2242 0.0253 25, 50, 75 10%
no_choice β6 No Choice 0.8345 0.2361 - -

Utility equation of the mnl_pref model:

\[ \begin{align} u_j &= \beta_1 x_j^{upfront} + \beta_2 \delta_j^{charger} + \beta_3 x_j^{discount} + \beta_4 x_j^{window} + \beta_5 x_j^{guaranteed} + \beta_6 x_j^{no} + \epsilon_j \\ u_j &= -0.02 x_j^{upfront} + 0.59 \delta_j^{charger} + 0.17 x_j^{discount} + 0.08 x_j^{window} + 0.22 x_j^{guaranteed} + 0.83 x_j^{no} + \epsilon_j \end{align} \]

Interpretation of the mnl_pref model:

This is the single logit preference space model. It contains point estimates of each coefficient, along with the stand error values.

We start with the point estimates: the coefficients The team intentionally set the upfront attribute as a negative number, so that in the model it shows a negative coefficient. However, this does mean that the upfront incentive is still “the more the better”. The other coefficients are all positive, meaning the users will tend to agree with either a larger value, or agree with a “YES” option.

The interesting finding is the “No Choice”: Users tend to simply NOT participate in the smart charging program. It is a reasonable result, since the users are sacrificing their rights, which is the reason why we need monetary incentives to them.

The standard errors reveal how the users are confident with each utility. For example, the attributes of upfront, discount, window, and guaranteed have small SE values, indicating the users’ high confidence when making decisions. The users are not quite sure if they want a free charger, or to participate in this program.

An important notice based on this very first model: the simple logit preference space model (and the one with groups) is the ONLY model that does not require multi-start, due to the fact of being a convex model with a single global maximum likelihood. All the multiple logit or WTP space models are non-convex and require multi-starts. We’ll explain more in their own sections.

4.2 Mixed logit preference space

This part is the 2nd model: mixed logit preference space, mxl_pref. Below is the code chunk:

Code
# 1. Model Summary
# 1.1 Use the logitr package to generate the model
data_mxl_pref <- read_csv(here("data", "choice_data.csv")) %>% 
    mutate(upfront = -1 * upfront / 100,
           discount = discount / 10,
           guaranteed = guaranteed / 10)
set.seed(456)
mxl_pref <- logitr(
    data    = data_mxl_pref,
    outcome = "choice",
    obsID   = "obs_id",
    pars    = c("upfront", "lv_2_charger_yes", "discount",
                "window", "guaranteed", "no_choice"),
    randPars = c(upfront          = "n",
                 lv_2_charger_yes = "n",
                 discount         = "n",
                 window           = "n",
                 guaranteed       = "n",
                 no_choice        = "n"),
    numMultiStarts = 30,
    drawType = 'sobol',
    numDraws = 200
    )

# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mxl_pref)

# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mxl_pref$gradient

# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mxl_pref$hessian)$values

# 2.3 Save the model
save(
    mxl_pref,
    file = here("models", "mxl_pref.RData")
)

# 3. Model Equation
# 3.1 Estimated values and std err
table_mxl_pref <- data.frame(
    Coefficient = c("&beta;<sub>1</sub>", "&beta;<sub>2</sub>",
                    "&beta;<sub>3</sub>", "&beta;<sub>4</sub>",
                    "&beta;<sub>5</sub>", "&beta;<sub>6</sub>",
                    "SD &beta;<sub>1</sub>", "SD &beta;<sub>2</sub>",
                    "SD &beta;<sub>3</sub>", "SD &beta;<sub>4</sub>",
                    "SD &beta;<sub>5</sub>", "SD &beta;<sub>6</sub>"),
    Meaning = c("Upfront", "Lv 2 Charger", "Discount",
                "Window", "Guaranteed", "No Choice",
                "SD Upfront", "SD Lv 2 Charger", "SD Discount",
                "SD Window", "SD Guaranteed", "SD No Choice"),
    Estimate = coef(mxl_pref),
    StdError = se(mxl_pref),
    Level = c("1, 3, 5", "-", "1, 2.5, 5",
               "0.5, 1, 2, 4", "25, 50, 75", "-",
              "-", "-", "-", "-", "-", "-"),
    Unit = c("100 USD", "-", "10%", "hour", "10%", "-",
             "-", "-", "-", "-", "-", "-")
)
table_mxl_pref$Estimate <- round(table_mxl_pref$Estimate, 4)
table_mxl_pref$StdError <- round(table_mxl_pref$StdError, 4)
mxl_pref_kable <- table_mxl_pref %>% 
    kable(format = 'html', escape = FALSE,
          col.names = c("Coefficient", "Meaning", "Estimate", "Std Error",
                        "Level", "Unit"),
          align = c('c', 'l', 'c', 'c', 'l', 'l'),
          caption = "Summary of Model Coefficients") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# 3.2 Model figure
# 3.2.1 Construct the data frame
figure_mxl_pref <- data.frame(
    Attribute = factor(
        c("Upfront", "Charger", "Discount", "Window", "Guaranteed", "No Choice"),
        levels = c("No Choice", "Guaranteed", "Window",
                   "Discount", "Charger", "Upfront")
    ),
    Estimate = head(coef(mxl_pref), 6),
    StdError = head(se(mxl_pref), 6)
)

figure_mxl_pref$LowerBound =
    figure_mxl_pref$Estimate - 1.96 * figure_mxl_pref$StdError
figure_mxl_pref$UpperBound =
    figure_mxl_pref$Estimate + 1.96 * figure_mxl_pref$StdError

# 3.2.2 Create the plot
mxl_pref_plot <- figure_mxl_pref %>% 
    ggplot(aes(x = Attribute, y = Estimate)) +
    geom_point() +
    geom_errorbar(aes(ymin = LowerBound, ymax = UpperBound), width = 0.2) +
    labs(title = "Coef Estimates with Uncertainty Bounds",
         subtitle = "Mixed logit preference space",
         x = "Attribute",
         y = "Estimate") +
    coord_flip() +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '2_2_mxl_pref_plot.png'), 
    plot = mxl_pref_plot,
    width = 6, 
    height = 6 / 1.618
)

Below is the generated model coefficient table and plot for mxl_pref:

Summary of Model Coefficients
Coefficient Meaning Estimate Std Error Level Unit
upfront β1 Upfront -0.0151 0.0305 1, 3, 5 100 USD
lv_2_charger_yes β2 Lv 2 Charger 0.5882 0.1001 - -
discount β3 Discount 0.1655 0.0298 1, 2.5, 5 10%
window β4 Window 0.0835 0.0375 0.5, 1, 2, 4 hour
guaranteed β5 Guaranteed 0.2241 0.0253 25, 50, 75 10%
no_choice β6 No Choice 0.8340 0.2359 - -
sd_upfront SD β1 SD Upfront 0.0002 0.2045 - -
sd_lv_2_charger_yes SD β2 SD Lv 2 Charger -0.0035 2.4494 - -
sd_discount SD β3 SD Discount 0.0030 NaN - -
sd_window SD β4 SD Window 0.0007 NaN - -
sd_guaranteed SD β5 SD Guaranteed -0.0006 NaN - -
sd_no_choice SD β6 SD No Choice 0.0050 NaN - -

Interpretation of the mxl_pref model:

This is the mixed logit preference space model. The team applied multi-start and saved the result using the set.seed() function. This model has similar coefficients with the simple logit model, and provided the standard deviation values of each coefficient.

The standard deviation values indicate that the coefficient of each utility follows normal distribution and has a span of its estimated value. The span (aka SD) is not high compared with the estimated values.

4.3 Simple logit preference space with groups

This part is the 3rd model: simple logit preference space with groups, mnl_pref_groups. Below is the code chunk:

Code
# 1. Model Summary
# 1.1 Use the logitr package to generate the model
data_mnl_pref_groups <- read_csv(here("data", "choice_data.csv")) %>% 
    mutate(upfront = -1 * upfront / 100,
           discount = discount / 10,
           guaranteed = guaranteed / 10) %>% 
    mutate(group_A = ifelse(mc_income == "under60", 1, 0),
           group_B = ifelse(mc_income != "under60", 1, 0)) %>% 
    mutate(upfront_B          = upfront * group_B,
           lv_2_charger_yes_B = lv_2_charger_yes * group_B,
           discount_B         = discount * group_B,
           window_B           = window * group_B,
           guaranteed_B       = guaranteed * group_B,
           no_choice_B        = no_choice * group_B)

mnl_pref_groups <- logitr(
    data    = data_mnl_pref_groups,
    outcome = "choice",
    obsID   = "obs_id",
    pars    = c("upfront", "lv_2_charger_yes", "discount",
                "window", "guaranteed", "no_choice",
                "upfront_B", "lv_2_charger_yes_B", "discount_B",
                "window_B", "guaranteed_B", "no_choice_B")
    )

# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mnl_pref_groups)

# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mnl_pref_groups$gradient

# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mnl_pref_groups$hessian)$values

# 2.3 Save the model
save(
    mnl_pref_groups,
    file = here("models", "mnl_pref_groups.RData")
)

# 3. Model Equation
# 3.1 Estimated values and std err
table_mnl_pref_groups <- data.frame(
    Coefficient = c("&beta;<sub>1</sub>", "&beta;<sub>2</sub>",
                    "&beta;<sub>3</sub>", "&beta;<sub>4</sub>",
                    "&beta;<sub>5</sub>", "&beta;<sub>6</sub>",
                    "&beta;<sub>7</sub>", "&beta;<sub>8</sub>",
                    "&beta;<sub>9</sub>", "&beta;<sub>10</sub>",
                    "&beta;<sub>11</sub>", "&beta;<sub>12</sub>"),
    Meaning = c("Upfront", "Lv 2 Charger", "Discount",
                "Window", "Guaranteed", "No Choice",
                "Upfront B", "Lv 2 Charger B", "Discount B",
                "Window B", "Guaranteed B", "No Choice B"),
    Estimate = coef(mnl_pref_groups),
    StdError = se(mnl_pref_groups),
    Level = c("1, 3, 5", "-", "1, 2.5, 5",
              "0.5, 1, 2, 4", "25, 50, 75", "-",
              "1, 3, 5", "-", "1, 2.5, 5",
              "0.5, 1, 2, 4", "25, 50, 75", "-"),
    Unit = c("100 USD", "-", "10%", "hour", "10%", "-",
             "100 USD", "-", "10%", "hour", "10%", "-")
)
table_mnl_pref_groups$Estimate <- round(table_mnl_pref_groups$Estimate, 4)
table_mnl_pref_groups$StdError <- round(table_mnl_pref_groups$StdError, 4)
mnl_pref_groups_kable <- table_mnl_pref_groups %>% 
    kable(format = 'html', escape = FALSE,
          col.names = c("Coefficient", "Meaning", "Estimate", "Std Error",
                        "Level", "Unit"),
          align = c('c', 'l', 'c', 'c', 'l', 'l'),
          caption = "Summary of Model Coefficients") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# 3.2 Model figure
# 3.2.1 Construct the data frame
figure_mnl_pref_groups <- data.frame(
    Attribute = factor(
        c("Upfront", "Charger", "Discount", "Window", "Guaranteed", "No Choice",
          "Upfront B", "Charger B", "Discount B",
          "Window B", "Guaranteed B", "No Choice B"),
        levels = c("No Choice B", "Guaranteed B", "Window B",
                   "Discount B", "Charger B", "Upfront B",
                   "No Choice", "Guaranteed", "Window",
                   "Discount", "Charger", "Upfront")
    ),
    Estimate = coef(mnl_pref_groups),
    StdError = se(mnl_pref_groups)
)

figure_mnl_pref_groups$LowerBound =
    figure_mnl_pref_groups$Estimate - 1.96 * figure_mnl_pref_groups$StdError
figure_mnl_pref_groups$UpperBound =
    figure_mnl_pref_groups$Estimate + 1.96 * figure_mnl_pref_groups$StdError

# 3.2.2 Create the plot
mnl_pref_groups_plot <- figure_mnl_pref_groups %>% 
    ggplot(aes(x = Attribute, y = Estimate)) +
    geom_point() +
    geom_errorbar(aes(ymin = LowerBound, ymax = UpperBound), width = 0.2) +
    labs(title = "Coefficient Estimates with Uncertainty Bounds",
         subtitle = "Simple logit preference space with groups",
         x = "Attribute",
         y = "Estimate") +
    coord_flip() +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '2_3_mnl_pref_groups_plot.png'), 
    plot = mnl_pref_groups_plot,
    width = 6, 
    height = 6 / 1.618
)

# 4. Hypothesis Test
# 4.1 Perform the simulation
beta <- coef(mnl_pref_groups)
hessian <- mnl_pref_groups$hessian
covariance = -1*(solve(hessian))
draws = as.data.frame(MASS::mvrnorm(10^5, beta, covariance))

# 4.2 Define the coefficients for groups A and B
b1_A <- draws$upfront
b2_A <- draws$lv_2_charger_yes
b3_A <- draws$discount
b4_A <- draws$window
b5_A <- draws$guaranteed
b6_A <- draws$no_choice

b1_B <- draws$upfront + draws$upfront_B
b2_B <- draws$lv_2_charger_yes + draws$lv_2_charger_yes_B
b3_B <- draws$discount + draws$discount_B
b4_B <- draws$window + draws$window_B
b5_B <- draws$guaranteed + draws$guaranteed_B
b6_B <- draws$no_choice + draws$no_choice_B

# 4.3 Construct the data frame
coefficients = list(b1_A, b2_A, b3_A, b4_A, b5_A, b6_A,
                    b1_B, b2_B, b3_B, b4_B, b5_B, b6_B)

draw_sim_df <- data.frame(
    Coefficient = c("b1_A", "b2_A", "b3_A", "b4_A", "b5_A", "b6_A",
                    "b1_B", "b2_B", "b3_B", "b4_B", "b5_B", "b6_B"),
    Mean = sapply(coefficients, mean),
    LowerQuantile = sapply(coefficients, function(x) quantile(x, probs = 0.025)),
    UpperQuantile = sapply(coefficients, function(x) quantile(x, probs = 0.975))
)

# 4.4 Sketch the plot
hypothesis_test_plot <- draw_sim_df %>% 
    ggplot(aes(x = Coefficient, y = Mean)) +
    geom_point() +
    geom_errorbar(aes(ymin = LowerQuantile, ymax = UpperQuantile), width = 0.2) +
    labs(title = "Hypothesis Test on Groups A and B",
         x = "Coefficient",
         y = "Values") +
    coord_flip() +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '2_3_hypothesis_test_plot.png'), 
    plot = hypothesis_test_plot,
    width = 6, 
    height = 6 / 1.618
)

Below is the generated model coefficient table and plot for mnl_pref_groups:

Summary of Model Coefficients
Coefficient Meaning Estimate Std Error Level Unit
upfront β1 Upfront 0.0795 0.0812 1, 3, 5 100 USD
lv_2_charger_yes β2 Lv 2 Charger 0.0577 0.2587 - -
discount β3 Discount 0.1084 0.0759 1, 2.5, 5 10%
window β4 Window 0.1789 0.0957 0.5, 1, 2, 4 hour
guaranteed β5 Guaranteed 0.2391 0.0645 25, 50, 75 10%
no_choice β6 No Choice 0.7396 0.5704 - -
upfront_B β7 Upfront B -0.1155 0.0877 1, 3, 5 100 USD
lv_2_charger_yes_B β8 Lv 2 Charger B 0.6234 0.2803 - -
discount_B β9 Discount B 0.0678 0.0826 1, 2.5, 5 10%
window_B β10 Window B -0.1069 0.1041 0.5, 1, 2, 4 hour
guaranteed_B β11 Guaranteed B -0.0128 0.0702 25, 50, 75 10%
no_choice_B β12 No Choice B 0.1405 0.6275 - -

Interpretation of the mnl_pref_groups model:

This is the simple logit preference space model with groups. This model splits the users into based on their annual income: below or above $60k. The results of the hypothesis test are NOT significant, since every single coefficient has obvious overlapping.

The team continued with other grouping techniques, including gender, party, and education. None of them revealed significant differences. It is a proof that this survey result might be generated by professional test takers with pre-set personas, that their results look convincing by individual questions, but are NOT consistent in the whole picture.

4.4 Simple logit WTP space

This part is the 4th model: simple logit WTP space, mnl_wtp. Below is the code chunk:

Code
# 1. Model Summary
# 1.1 Use the logitr package to generate the model
data_mnl_pref <- read_csv(here("data", "choice_data.csv")) %>% 
    mutate(upfront = -1 * upfront / 100,
           discount = discount / 10,
           guaranteed = guaranteed / 10)

set.seed(234)
mnl_wtp <- logitr(
    data    = data_mnl_pref,
    outcome = "choice",
    obsID   = "obs_id",
    pars    = c("lv_2_charger_yes", "discount",
                "window", "guaranteed", "no_choice"),
    scalePar = "upfront",
    numMultiStarts = 30
    )

# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mnl_wtp)

# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mnl_wtp$gradient

# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mnl_wtp$hessian)$values

# 2.3 Compare preference vs WTP spaces
load(here("models", "mnl_pref.RData"))
mnl_wtp_compare <- wtpCompare(mnl_pref, mnl_wtp, scalePar = 'upfront')
mnl_wtp_compare_kable <- mnl_wtp_compare %>% kable(
    format = 'html', escape = FALSE,
    col.names = c("Coefficient", "mnl_pref Space",
                  "mnl_wtp Space", "Difference"),
    caption = "Model Coefficients Compare") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# 2.4 Save the model
save(
    mnl_wtp,
    file = here("models", "mnl_wtp.RData")
)

Below is the comparison between mnl_pref and mnl_wtp:

Model Coefficients Compare
Coefficient mnl_pref Space mnl_wtp Space Difference
scalePar 0.0150763 0.01484121 -0.00023509
lv_2_charger_yes 39.0272499 39.66775455 0.64050465
discount 10.9895955 11.17393094 0.18433547
window 5.5370587 5.62073081 0.08367215
guaranteed 14.8688168 15.10740414 0.23858730
no_choice 55.3548416 56.16701223 0.81217065
logLik -880.1725938 -880.17269689 -0.00010313

Interpretation of the mnl_wtp model:

This is the simple logit WTP space model. This WTP model directly generates the willingness-to-pay coefficients. The wtpCompare() function can be used to compare mnl_pref and mnl_wtp. The result shows a very low difference, indicating the success of our algorithm, that a local maximum likelihood close to global maximum is captured.

4.5 Mixed logit WTP space

This part is the last model: mixed logit WTP space, mxl_wtp. Below is the code chunk:

Code
# 1. Model Summary
# 1.1 Use the logitr package to generate the model
data_mxl_pref <- read_csv(here("data", "choice_data.csv")) %>% 
    mutate(upfront = -1 * upfront / 100,
           discount = discount / 10,
           guaranteed = guaranteed / 10)

set.seed(89)
mxl_wtp <- logitr(
    data     = data_mxl_pref,
    outcome  = "choice",
    obsID    = "obs_id",
    pars    = c("lv_2_charger_yes", "discount",
                "window", "guaranteed", "no_choice"),
    scalePar = "upfront",
    randPars = c(lv_2_charger_yes = "n",
                 discount         = "n",
                 window           = "n",
                 guaranteed       = "n",
                 no_choice        = "n"),
    numMultiStarts = 30
    )

# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mxl_wtp)

# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mxl_wtp$gradient

# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mxl_wtp$hessian)$values

# 2.3 Save the coefficients
coef_mxl_wtp <- head(coef(mxl_wtp), 6)
coef_mxl_wtp_df <- as.data.frame(coef_mxl_wtp)

mxl_wtp_compare <- cbind(coef_mxl_wtp_df, head(mnl_wtp_compare, 6)) %>% 
    select(pref, coef_mxl_wtp, difference) %>% 
    mutate(difference = coef_mxl_wtp - as.numeric(pref))

mxl_wtp_compare_kable <- mxl_wtp_compare %>% kable(
    format = 'html', escape = FALSE,
    col.names = c("Coefficient", "mnl_pref Space",
                  "mxl_wtp Space", "Difference"),
    caption = "Model Coefficients Compare") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# 2.4 Save the model
save(
    mxl_wtp,
    file = here("models", "mxl_wtp.RData")
)

Here we reveal the coefficients in this model. The team has made a great many numbers of attempts and captured this result. It’s not perfect but close enough:

Model Coefficients Compare
Coefficient mnl_pref Space mxl_wtp Space Difference
scalePar 0.0150763 0.0143826 -0.0006937
lv_2_charger_yes 39.0272499 51.8474599 12.8202100
discount 10.9895955 13.8418559 2.8522604
window 5.5370587 4.9340397 -0.6030190
guaranteed 14.8688168 17.8647569 2.9959401
no_choice 55.3548416 56.3146482 0.9598066

Interpretation of the mxl_wtp model:

This is the mixed logit WTP space model. to reveal the complete image of the model, run summary(mxl_wtp). The team made attempts to find a good local maximum and found a (not perfect) result.

Results

1. WTP Analysis

Different from 4.4 Simple logit WTP space, which directly generates the WTP model, this part calculates WTP from the simple preference space model generated in 4.1 Simple logit preference space. Below is the code chunk:

Code
# 1. Load the mnl_pref model
load(here("models", "mnl_pref.RData"))

# Skipped here. Please refer to the R file in the data folder.
# summary(mnl_pref)

# 2. Compute WTP estimates
coefs_mnl_pref <- coef(mnl_pref)
wtp_mnl_pref <- coefs_mnl_pref / (-1*coefs_mnl_pref['upfront'])

# 3. Simulate WTP with uncertainty
cov_mnl_pref <- vcov(mnl_pref)
set.seed(3)
coef_draws_mnl_pref <- as.data.frame(
    MASS::mvrnorm(10^6, coefs_mnl_pref, cov_mnl_pref)
    )
wtp_draws_mnl_pref <- -1*(coef_draws_mnl_pref[,2:6] / coef_draws_mnl_pref[,1])
wtp_ci_mnl_pref <- ci(wtp_draws_mnl_pref)

# 4. Create kables for WTP with uncertainty
wtp_mnl_pref_kable <- wtp_mnl_pref %>% kable(
    format = 'html', escape = FALSE,
    col.names = c("Attribute", "Coefficient"),
    caption = "Computed WTP") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

wtp_ci_mnl_pref_kable <- wtp_ci_mnl_pref %>% kable(
    format = 'html', escape = FALSE,
    col.names = c("Attribute", "Mean", "Lower", "Upper"),
    caption = "Simulated WTP with CI") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# 5. Save WTP with uncertainty
save(
    wtp_mnl_pref,
    wtp_ci_mnl_pref,
    file = here("data", "wtp_from_pref.RData")
)

The computed and simulated WTP values are shown below:

Computed WTP
Attribute Coefficient
upfront -1.000000
lv_2_charger_yes 39.027250
discount 10.989596
window 5.537059
guaranteed 14.868817
no_choice 55.354842
Simulated WTP with CI
Attribute Mean Lower Upper
lv_2_charger_yes 35.184767 -267.42093 281.31683
discount 11.800483 -74.84438 78.77620
window 5.304263 -38.33406 40.24955
guaranteed 14.546457 -101.89208 107.08355
no_choice 44.555414 -353.89815 378.77140

Interpretation of WTP analysis:

The WTP values in this section are from the preference space model, either calculated or simulated. It is the same as the WTP values in the WTP model.

2. Market Simulation

The market simulation is based on the simple logit preference space model, mnl_pref. Two market simulations were performed:

  1. Single market simulation with only one scenario considered. In our case, we did a “Monetary” case vs a “Flexibility” case, in which the users are provided with either the best monetary incentives (highest upfront, free charger, and best electricity price discount) or the best flexibility (highest override time interval and largest amount of guaranteed battery percentage).
  2. Multiple market simulation with 9 different scenarios. In our case, we considered: polarized, incentive only, charger only, discount only, window only, guaranteed only, monetary vs flexibility (same as single simulation), monetary, and flexibility.

Below is the code chunk:

Code
# 1. Load the mnl_pref model
load(here("models", "mnl_pref.RData"))

# 2. Market Simulation
# 2.1 Single market simulation
scenarios_sing <- data.frame(
    alt_id           = c(1, 2, 3),
    obs_id           = c(1, 1, 1),
    upfront          = c(-5, -1, 0),
    lv_2_charger_yes = c(1, 0, 0),
    discount         = c(5, 1, 0),
    window           = c(1, 4, 0),
    guaranteed       = c(2.5, 7.5, 0),
    no_choice        = c(0, 0, 1)
)

sim_mnl_pref_sing <- predict(
    mnl_pref,
    newdata    = scenarios_sing,
    obsID      = 'obs_id',
    level      = 0.95,
    interval   = 'confidence',
    returnData = TRUE
)

# 2.2 Multiple simulations
scenarios_mult <- read_csv(here("data", "scenarios.csv"))

sim_mnl_pref_mult <- predict(
    mnl_pref,
    newdata    = scenarios_mult, 
    obsID      = 'obs_id', 
    level      = 0.95,
    interval   = 'confidence',
    returnData = TRUE
)

# 2.3 Save simulations
save(
    sim_mnl_pref_sing,
    sim_mnl_pref_mult,
    file = here("sims", "sim_mnl_pref.RData")
)

# 3. Market Simulation Plots
# 3.1 Single market simulation plot
sim_sing_plot <- sim_mnl_pref_sing %>% 
    mutate(label = factor(c("Monetary", "Flexibility", "Not Enroll"),
                          levels = c("Monetary",
                                     "Flexibility",
                                     "Not Enroll"))) %>% 
    ggplot(aes(
        x = label, y = predicted_prob, 
        ymin = predicted_prob_lower, ymax = predicted_prob_upper)) +
    geom_col(fill = "grey", width = 0.6) +
    geom_errorbar(width = 0.3) +
    scale_y_continuous(limits = c(0, 1), labels = scales::percent) +
    labs(x = 'Options',
         y = 'Probability',
         title = 'Single Market Simulation Plot',
         subtitle = 'Monetary vs Flexibility') +
    theme_bw(base_family = 'Ubuntu')

ggsave(
    filename = here('figs_manu', '3_2_sim_sing_plot.png'),
    plot = sim_sing_plot,
    width = 6,
    height = 6 / 1.618
)

# 3.2 Multiple market simulation plot
scenarios_mult_labels <- setNames(c(
    "Polarized", "Incentive Only", "Charger Only", "Discount Only",
    "Window Only", "Guaranteed Only", "Monetary vs Flexibility",
    "Monetary", "Flexibility"), 1:9)

sim_mult_plot <- sim_mnl_pref_mult %>%
    ggplot(aes(
        x = as.factor(alt_id), y = predicted_prob, 
        ymin = predicted_prob_lower, ymax = predicted_prob_upper)) +
    geom_col(fill = "grey", width = 0.6) +
    geom_errorbar(width = 0.3) +
    facet_wrap(~obs_id, labeller = labeller(obs_id = scenarios_mult_labels)) +
    scale_x_discrete(labels = c("1" = "1", "2" = "2", "3" = "No")) +
    scale_y_continuous(limits = c(0, 1), labels = scales::percent) +
    labs(x = 'Options',
         y = 'Probability',
         title = 'Multiple Market Simulation Plot') +
    theme_bw(base_family = 'Ubuntu')

ggsave(
    filename = here('figs_manu', '3_2_sim_mult_plot.png'),
    plot = sim_mult_plot,
    width = 7,
    height = 7 * 0.8
)

The datasets are all saved to the sims folder, and the plots are shown below:

Note: The single market simulation is case 7 of multiple market simulation.

Cases of the multiple market simulation plots:

Multiple Market Simulation Cases
No. Option 1 Option 2 No Option
1 Best levels of all attributes Lowest level for all No participation
2 Best upfront incentive Lowest level for all No participation
3 Free charger Lowest level for all No participation
4 Best electricity price discount Lowest level for all No participation
5 Best override window Lowest level for all No participation
6 Best guaranteed battery Lowest level for all No participation
7 Best upfront, discount, and a free charger Best window and guaranteed battery No participation
8 Best upfront, discount, and a free charger Lowest level for all No participation
9 Best window and guaranteed battery Lowest level for all No participation

3. Market Sensitivity

The market sensitivity is also generated with respect to the simple logit preference model, mnl_pref. This part consists of the following:

  1. Sensitivity of market share to changes in “upfront”, where there are 2 options. In the baseline, option 1 is set as a no option, and option 2 is provided with moderate values. We evaluate the sensitivity by changing the upfront value in option 2.
  2. Sensitivity of market share to changes in multiple attributes, where we study the sensitivity of all attributes.
  3. Plots of market share and cost sensitivity to the change of the “upfront” attribute. We assume a market size of 1000 users.
  4. Plot of market share sensitivity to the change of multiple attributes. This is called the “Tornado Plot”, and is generated using the ggplot-like function in Prof John Helveston’s package, jph::ggtornado().

The code scripts are broken into 2 parts: sensitivity calculation and plot generation.

Below is the sensitivity calculation code chunk:

Code
# 1. Load the mnl_pref model
load(here("models", "mnl_pref.RData"))

# 2. Sensitivity Analysis
# 2.1 Sensitivity of market share to changes in "upfront"

# Set a baseline scenario
# Option 1 is a no option
# Option 2 is provided with moderate values
# We will change the upfront values for Option 2
baseline <- data.frame(
    alt_id           = c(1, 2), 
    obs_id           = c(1, 1),
    upfront          = c(0, -2),
    lv_2_charger_yes = c(0, 1),
    discount         = c(0, 2.5),
    window           = c(0, 2),
    guaranteed       = c(0, 5),
    no_choice        = c(1, 0)
)

baseline_kable <- baseline %>% kable(
    format = 'html', escape = FALSE,
    caption = "Single Attribute Baseline") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# Set the sequence of upfront levels from 1000 to 100 USD
upfront_levels <- seq(-10, -1, by = 1)

# Count the numbers of upfront scenarios
upfront_numbers <- length(upfront_levels)

# Create scenarios with different upfront levels in option 2
scenario_upfront <- do.call(
    rbind, replicate(upfront_numbers, baseline, simplify = FALSE)
    )
scenario_upfront$obs_id <- rep(seq(upfront_numbers), each = 2)
scenario_upfront$upfront[which(scenario_upfront$alt_id == 2)] <- upfront_levels 

# Simulate the upfront sensitivity
sens_upfront <- predict(
    mnl_pref,
    newdata = scenario_upfront,
    obsID = 'obs_id',
    level = 0.95,
    interval = 'confidence',
    returnData = TRUE
    ) %>% 
    # Keep only option 2
    filter(alt_id == 2) %>% 
    # Keep only upfront and predictions
    select(upfront, starts_with("predicted_"))

sens_upfront_kable <- sens_upfront %>% kable(
    format = 'html', escape = FALSE,
    caption = "Upfront Sensitivity Table") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# 2.2 Sensitivity of market share to changes in multiple attributes
# We continue to work on Option 2
# This time, we modify all continuous attributes
# The attributes are: upfront, discount, window, and guaranteed

# Construct the attribute_levels, like we constructed upfront_levels
# But attribute_levels is a df, while upfront_levels is a sequence
attribute_levels <- tribble(
    ~obs_id, ~alt_id, ~attribute,    ~case,  ~value,
    2,       2,      'upfront',     'high',  -5,
    3,       2,      'upfront',     'low',   -1,
    4,       2,      'discount',    'high',  5,
    5,       2,      'discount',    'low',   1,
    6,       2,      'window',      'high',  4,
    7,       2,      'window',      'low',   0.5,
    8,       2,      'guaranteed',  'high',  7.5,
    9,       2,      'guaranteed',  'low',   2.5
)

attribute_levels_kable <- attribute_levels %>% 
    kable(format = 'html', escape = FALSE,
          col.names = c("Obs ID", "Alt ID", "Attribute",
                        "Case", "Value"),
          caption = "Attribute Levels") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# Define the numbers of different scenarios
scenario_numbers <- 9

# Create scenarios with different upfront levels in option 2
scenario_table <- do.call(
    rbind, replicate(scenario_numbers, baseline, simplify = FALSE)
)
scenario_table$obs_id <- rep(seq(scenario_numbers), each = 2)
scenario_table <- scenario_table %>% 
    left_join(attribute_levels, by = c("alt_id", "obs_id")) %>% 
    mutate(
        attribute = ifelse(is.na(attribute), "other", attribute),
        case = ifelse(is.na(case), "base", case),
        upfront = ifelse(attribute == 'upfront', value, upfront),
        discount = ifelse(attribute == 'discount', value, discount),
        window = ifelse(attribute == 'window', value, window),
        guaranteed = ifelse(attribute == 'guaranteed', value, guaranteed)
    )

scenario_kable <- scenario_table %>% 
    kable(format = 'html', escape = FALSE,
          col.names = c("Alt ID", "Obs ID", "Upfront", "Charger",
                        "Discount", "Window", "Guaranteed",
                        "No Choice", "Attribute", "Case", "Value"),
          caption = "Scenario Table") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# Simulate the multiple change sensitivity
sens_multiple <- predict(
    mnl_pref,
    newdata = scenario_table,
    obsID = 'obs_id',
    level = 0.95,
    interval = 'confidence',
    returnData = TRUE
    ) %>% 
    # Keep only option 2
    filter(alt_id == 2) %>% 
    # Keep only upfront and predictions
    select(upfront, attribute, case, value, predicted_prob)

# Save simulations
save(
    sens_upfront,
    sens_multiple,
    file = here("sims", "sens_mnl_pref.RData")
)

Below is the code chunk for generating the plots:

Code
# 3. Sensitivity Plots
# 3.1 Plot of market share change to "upfront"
sens_upfront_plot <- sens_upfront %>% 
    ggplot(aes(x = upfront * -100,
               y = predicted_prob,
               ymin = predicted_prob_lower,
               ymax = predicted_prob_upper)) +
    geom_ribbon(alpha = 0.2) +
    geom_line(linetype = "dashed") +
    geom_line(
        data = sens_upfront %>% filter(upfront <= -1, upfront >= -5), 
        linetype = "solid") +
    expand_limits(x = c(-10, -1), y = c(0, 1)) +
    labs(x = 'Upfront (USD)',
         y = 'Market Share',
         title = 'Sensitivity of market share change to upfront') +
    theme_bw(base_family = "Ubuntu")

# Save plot
ggsave(
    filename = here('figs_manu', '3_3_sens_upfront_plot.png'), 
    plot = sens_upfront_plot,
    width = 6, 
    height = 6 / 1.618
)

# 3.2 Plot of policy cost change to "upfront"
# Assume a market size of 1000 users
market_size <- 1000
sens_cost <- sens_upfront %>%
    mutate(
        cost_mean = upfront*market_size*predicted_prob*-100 / 1000,
        cost_low  = upfront*market_size*predicted_prob_lower*-100 / 1000,
        cost_high = upfront*market_size*predicted_prob_upper*-100 / 1000)
sens_cost_plot <- sens_cost %>% 
    ggplot(aes(x = upfront * -100,
               y = cost_mean,
               ymin = cost_low,
               ymax = cost_high)) +
    geom_ribbon(alpha = 0.2) +
    geom_line(linetype = "dashed") +
    geom_line(
        data = sens_cost %>% filter(upfront <= -1, upfront >= -5), 
        linetype = "solid") +
    expand_limits(x = c(-10, -1), y = c(0, 1)) +
    labs(x = 'Upfront (USD)',
         y = 'Cost ($1000)',
         title = 'Sensitivity of policy cost change to upfront') +
    theme_bw(base_family = "Ubuntu")

# Save plot
ggsave(
    filename = here('figs_manu', '3_3_sens_cost_plot.png'), 
    plot = sens_cost_plot,
    width = 6, 
    height = 6 / 1.618
)

# 3.2 Tornado Plot of market share change to multiple attributes
labels <- data.frame( 
    attribute = c('upfront', 'discount', 'window', 'guaranteed'), 
    label = c('Upfront Incentive (USD)',
              'Electricity Discount (%)',
              'Override Window (hrs)',
              'Guaranteed Battery (%)')
)

tornado_data <- sens_multiple %>% 
    select(-upfront) %>% 
    filter(case != 'base') %>% 
    left_join(labels, by = 'attribute') %>% 
    mutate(value = ifelse(attribute == "upfront", -100*value, value),
           value = ifelse(attribute == "discount" | attribute == "guaranteed",
                          10*value, value))

tornado_plot <- jph::ggtornado(
    data = tornado_data,
    baseline = sens_multiple$predicted_prob[1],
    var = 'label',
    level = 'case',
    value = 'value',
    result = 'predicted_prob'
    ) +
    scale_fill_manual(values = c("#67a9cf", "#ef8a62")) + 
    labs(x = 'Market Share',
         y = 'Attribute',
         title = 'Plot of market share change to multiple attributes') +
    theme_bw(base_family = 'Ubuntu') +
    theme(legend.position = 'none')

# Save plot
ggsave(
    filename = here('figs_manu', '3_3_tornado_plot.png'), 
    plot = tornado_plot,
    width = 6,
    height = 6 / 1.618
)

The sensitivity results of single attribute (upfront) change are illustrated below:

Single Attribute Baseline
alt_id obs_id upfront lv_2_charger_yes discount window guaranteed no_choice
1 1 0 0 0.0 0 0 1
2 1 -2 1 2.5 2 5 0
Upfront Sensitivity Table
upfront predicted_prob predicted_prob_lower predicted_prob_upper
2 -10 0.8329388 0.7577337 0.8886528
4 -9 0.8308303 0.7643835 0.8820927
6 -8 0.8287008 0.7702253 0.8749223
8 -7 0.8265500 0.7754630 0.8679891
10 -6 0.8243779 0.7800687 0.8616038
12 -5 0.8221845 0.7833330 0.8557803
14 -4 0.8199697 0.7844276 0.8511014
16 -3 0.8177334 0.7837099 0.8478217
18 -2 0.8154755 0.7792340 0.8465033
20 -1 0.8131961 0.7728263 0.8474078

The sensitivity results of multiple attributes change are illustrated below:

Attribute Levels
Obs ID Alt ID Attribute Case Value
2 2 upfront high -5.0
3 2 upfront low -1.0
4 2 discount high 5.0
5 2 discount low 1.0
6 2 window high 4.0
7 2 window low 0.5
8 2 guaranteed high 7.5
9 2 guaranteed low 2.5
Scenario Table
Alt ID Obs ID Upfront Charger Discount Window Guaranteed No Choice Attribute Case Value
1 1 0 0 0.0 0.0 0.0 1 other base NA
2 1 -2 1 2.5 2.0 5.0 0 other base NA
1 2 0 0 0.0 0.0 0.0 1 other base NA
2 2 -5 1 2.5 2.0 5.0 0 upfront high -5.0
1 3 0 0 0.0 0.0 0.0 1 other base NA
2 3 -1 1 2.5 2.0 5.0 0 upfront low -1.0
1 4 0 0 0.0 0.0 0.0 1 other base NA
2 4 -2 1 5.0 2.0 5.0 0 discount high 5.0
1 5 0 0 0.0 0.0 0.0 1 other base NA
2 5 -2 1 1.0 2.0 5.0 0 discount low 1.0
1 6 0 0 0.0 0.0 0.0 1 other base NA
2 6 -2 1 2.5 4.0 5.0 0 window high 4.0
1 7 0 0 0.0 0.0 0.0 1 other base NA
2 7 -2 1 2.5 0.5 5.0 0 window low 0.5
1 8 0 0 0.0 0.0 0.0 1 other base NA
2 8 -2 1 2.5 2.0 7.5 0 guaranteed high 7.5
1 9 0 0 0.0 0.0 0.0 1 other base NA
2 9 -2 1 2.5 2.0 2.5 0 guaranteed low 2.5

Interpretation of market sensitivity:

  1. From single attribute change, we see that users are not sensitive to upfront incentive. This encourages us to generate the study of multiple attribute change.

  2. From the multiple attribute change (aka the Tornado plot), we see that users are most sensitive to guaranteed battery, followed by electricity discount, override window, and finally the upfront incentive.

  3. This implies that the utility providers do not need to sacrifice too much on upfront incentive.

4. Limitation of Analysis

The analysis can be limited in several aspects. Below is a breakdown:

  1. Results: The results of the survey might be biased or unstable, either due to the limited amount of sampling, or due to the possibility of professional test takers mocking the results with their pre-constructed personas.
  2. Profiles: The cbc profiles of the conjoint questions obey full factorial randomized design, which is a ready-to-use design but lacks the filtering of unreasonable questions. If the profiles could be better, we can prepare better survey question sheets.
  3. Grouping: The grouping technique was arbitrarily chosen. It could be improved by better reasoning processes.
  4. Categorizing: The team categorizes upfront incentive, free charger, and discounted electricity price as “Monetary”, and the rest as “Flexibility”. There might be other categorizing techniques to yield better results.
  5. Incentive Range: The market share plot usually has range from 0 to 100%, indicating a full range of market share in different prices, but the upfront incentive sensitivity is incredibally low in our model, making it impossible to cover the full range.

Final Recommendations and Conclusions

The fluctuation in market share shows a remarkable sensitivity to initial incentives, especially when investment levels are low, i.e., a $100 or $200 incentive. Even a minor increase in upfront incentives like these can result in a significant surge in market share. However, as the upfront incentive escalates further, the correlation weakens, suggesting that higher incentives, such as $500 or more, have negligible impact on market share.

Further analysis indicates that market share is most attuned to Guaranteed Battery percentage, followed by Discount Electricity percentage, Override Window hours, and Upfront Incentive. A marginal alteration in Guaranteed Battery percentage significantly influences market share, whereas a similar adjustment in Upfront Incentive has a comparatively minor impact. Yet, exceptions exist. For instance, an excessive increase in Override Window hours can paradoxically result in a decline in market share, likely due to customers’ inclination toward products with shorter Override Windows.

In conclusion, the relationship between upfront incentives and market share demonstrates diminishing returns as incentives increase. Additionally, the pivotal factors affecting market share sensitivity vary, with Guaranteed Battery percentage being the most influential. Factoring in these dynamics can help optimize strategies to maximize market share effectiveness and anticipate consumer preferences for sustained growth.

Limitations

This section delves into the constraints encountered in sample representation, hypothetical preferences, and short-term evaluations, shedding light on the need for comprehensive approaches to gather more accurate and forward-thinking data. Simultaneously, it explores the significance of longitudinal data collection, behavioral insights, and regional variations in understanding the diverse landscape of smart charging adoption among PEV users. It highlights the critical unknowns—technological advancements, regulatory changes, and user experience feedback—that could significantly impact findings. This critical examination of existing limitations, identification of future data collection avenues, and emphasis on pivotal yet uncertain factors shapes the future landscape of smart charging programs for electric vehicles.

1. Limitations of the current survey

Sample Representation:

The sample gathered in the survey may not accurately reflect the wide range of PEV owners or potential buyers because of potential biases in how participants were recruited or limited outreach efforts. This might result in an overrepresentation of certain demographics, geographic areas, or particular EV models within the sample.

Hypothetical Preferences:

Survey participants could shape their preferences for smart charging based on hypothetical scenarios outlined in the survey, potentially deviating from factors that truly influence real-life decisions. The actual adoption of these practices might be swayed by other unaddressed factors that play a role in real-world decision-making.

Short-Term Evaluation:

The survey’s range might concentrate on current preferences for smart charging features without considering how consumer perspectives, technological progressions, or changes in regulations might evolve over time, potentially influencing long-term adoption significantly.

2. Valuable Information for Future Collection

Longitudinal Data:

Gathering data gradually allows for a deeper understanding of how preferences and behaviors among PEV owners change over time, offering valuable insights into the evolving trends in smart charging adoption.

Behavioral Context:

By collecting qualitative data or conducting follow-up interviews, we can uncover the motivations, barriers, and decision-making processes that drive individuals’ choices regarding smart charging programs. This approach allows for a deeper understanding of the underlying factors shaping these decisions.

Regional Variations:

Expanding the sample size to include greater geographical diversity could illuminate regional disparities in smart charging preferences. This broader scope considers variations in utility programs, infrastructure accessibility, and environmental policies across different areas, offering insights into how these factors influence preferences.

3. Important Unknowns Affecting Findings

Technological Advancements:

Future technological innovations, like advancements in battery technology or shifts in charging infrastructure, have the potential to substantially transform smart charging preferences and behaviors. These changes could reshape how individuals perceive and engage with smart charging options.

Regulatory Changes:

Changes in governmental regulations or incentives linked to electric vehicles and grid management could markedly impact the appeal and viability of smart charging programs. These alterations have the potential to significantly shift the landscape of smart charging initiatives, affecting their attractiveness and feasibility for consumers.

User Experience:

Gaining insights into the practical experiences and satisfaction levels of individuals currently enrolled in smart charging programs is essential. Understanding their challenges, benefits, and feedback can serve as valuable input, significantly guiding future decisions in this domain.

Attribution

This final analysis report is done by all members of the PEV team. Below is the breakdown:

In the end, Pingfan filled out Attribution and Appendix, and conducted a final check.

Appendix

1. Conjoint Survey Documentation

1.1 Section 2 - Smart Charging Programs

1.2 Start of Section 2

Now that we have known basics about your EV ownership and your charging habits, let’s move on to Smart Charging program selections!

A Smart Charging program consists of 5 attributes: Upfront Incentive, Free Level 2 Charger, Electricity Price Discount, Override Window, and Guaranteed Range if charged for 8 hrs.

Here is a brief explanation of each attribute:

Attribute Range Explanation
Upfront Incentive: $100, $300, $500 It means the amount of money you will be instantly paid if you join the smart charging program. The unit is in USD.
Free Level 2 Charger: No or Yes A "Yes" means you will get a free level 2 charger. A level 2 charger enables faster charging than a standard home charger.
Electricity Price Discount: 10%, 25%, 50% If you choose to join this smart charging program, this will be your electricity price discount.
Override Window: 0.5 ~ 4 hrs This is the longest time window per day that you can force override to regular charging, during which time your EV is not controlled by the smart charging program.
Guaranteed Range if charged for 8 hrs: 25%, 50%, 75% It means the guranteed range if your car is charged for 8 hrs during smart charging. It shows percentage here, but will be converted to miles.

The above table can be accessed via this link. It’s available on top of each question so that you can refer at any time.

The next page provides a sample question to enhance your understanding.

1.3 Section 2 - Sample Question

You are provided with 2 options of smart charging programs, and an “I am not interested” option. You need to treat them as if they were your only options and choose the one that you most prefer. You can choose the “I am not interested” option if you are not interested in either.

You can access the attribute table using this link.

Options: Option 1 Option 2
Upfront Incentive: $100 $500
Free Level 2 Charger: No Yes
Electricity Price Discount: 25% 50%
Override Window: 0.5 hrs 4 hrs
Guaranteed Range if charged for 8 hrs: 100 miles 300 miles
  • Option 1
  • Option 2
  • I am not interested

1.4 Section 2 - Ready to Start the Questions

The sample question is quite straight-forward. Since Option 2 has every attribute better than Option 1, I bet you have chosen it.

However, this simplicity is not always the case. There will be trade-offs between the 2 options, and you need to make your own decision. For example, one option might have a large up-front but a low discount, while the other one is the inverse. Then, you need to make decisions based on your preference.

There are 8 choice questions in total. For each question, you have 2 valid options to choose from. Again, you make your choice as if they were your ONLY options. If you are really not interested in either of them, you can choose “I am not interested”.

Click on “Next Page” to proceed.

1.5 Question 1 of 8

(1 of 8) If your utility offers you these 2 smart charging programs, which one do you prefer?

You can access the attribute table using this link.

Options: Option 1 Option 2
Upfront Incentive: $100 $300
Free Level 2 Charger: No No
Electricity Price Discount: 10% 10%
Override Window: 4 hrs 2 hrs
Guaranteed Range if charged for 8 hrs: 75 % 50 %
  • Option 1
  • Option 2
  • I am not interested

The other 7 questions are in similar format and are thus omitted.

2. Data Dictionary

2.1 Dictionary of R codes:

Dictionary of R Codes
No. Name Type Description
1 0_make_choice_questions R Construct the conjoint questions.
2 1_1_clean_data R Clean and export survey data.
3 1_2_summary_statistics R Generate summary statistics of test takers.
4 2_1_mnl_pref_model R The simple logit preference space model.
5 2_2_mxl_pref_model R The mixed logit preference space model.
6 2_3_mnl_pref_groups_model R The simple logit preference space model by groups.
7 2_4_mnl_wtp_model R The simple logit WTP space model.
8 2_5_mxl_wtp_model R The mixed logit WTP space model.
9 3_1_wtp_from_pref R Calculation of WTP using mnl_pref model.
10 3_2_market_sim R Market simulations
11 3_3_sensitivity_analysis R Market sensitivity analysis

2.2 Dictionary of Files:

Dictionary of Files
No. Name Type Description
1 choice_data CSV Cleaned results from the final survey.
2 choice_questions CSV Local version of Conjoint questions.
3 pevFinal1 CSV Part 1 of final survey result from formr.
4 pevFinal2 CSV Part 2 of final survey result from formr.
5 pevFinal3 CSV Part 3 of final survey result from formr.
6 scenarios CSV Scenarios of multiple attributes for sensitivity analysis.
7 choice_data_wider RData Wider version of survey results
8 wtp_from_pref RData WTP with uncertainty calculated from mnl_pref.
9 mnl_pref_groups RData The simple logit preference space model.
10 mnl_pref RData The mixed logit preference space model.
11 mnl_wtp RData The simple logit preference space model by groups.
12 mxl_pref RData The simple logit WTP space model.
13 mxl_wtp RData The mixed logit WTP space model.
14 sens_mnl_pref RData Market sensitivity of mnl_pref model.
15 sim_mnl_pref RData Market simulation of mnl_pref model.

2.3 Dictionary of Data Frames:

Dictionary of Data Frames
No. Name Type Description
1 choice_data A tibble: 2,928 × 40 Conjoint questions for the pilot survey.
2 choice_data_charging A tibble: 336 × 38 Longer version of choice data about charging.
3 choice_data_demo A tibble: 499 × 36 Longer version of choice data about main demography.
4 choice_data_demo_2 A tibble: 244 × 38 Longer version of choice data about education and income.
5 choice_data_vehicle A tibble: 610 × 35 Longer version of choice data about vehicle.
6 choice_data_wider A tibble: 122 × 38 Original wider version of choice data.
7 coef_mxl_wtp_df A data frame: 6 × 1 Coefficients of the mxl_wtp model.
8 data_mnl_pref A tibble: 2,928 × 40 Choice data for the mnl_pref model.
9 data_mnl_pref_groups A tibble: 2,928 × 48 Choice data for the mnl_pref_groups model.
10 data_mxl_pref A tibble: 2,928 × 40 Choice data for the mxl_pref model.
11 figure_mnl_pref A data frame: 6 × 5 Data frame for mnl_pref plot.
12 figure_mnl_pref_groups A data frame: 12 × 5 Data frame for mnl_pref_groups plot.
13 figure_mxl_pref A data frame: 6 × 5 Data frame for mxl_pref plot.
14 mnl_wtp_compare A data frame: 7 × 3 Compare data between mnl_pref and mnl_wtp.
15 mxl_wtp_compare A data frame: 6 × 3 Compare data between mnl_pref and mxl_wtp.
16 profiles A data frame: 216 × 6 CBC profiles of the conjoint survey.
17 scenario_table A data frame: 18 × 11 Scenarios of multiple attributes for market sensitivity analysis.
18 scenario_upfront A data frame: 20 × 8 Scenarios of upfront attribute for market sensitivity analysis.
19 scenarios_mult A tibble: 27 × 8 Multiple scenarios for maket simulation.
20 scenarios_sing A data frame: 3 × 8 Single scenarios for maket simulation.
21 sens_cost A data frame: 10 × 7 Sensitivity of cost.
22 sens_multiple A data frame: 9 × 5 Sensitivity of multiple attributes.
23 sens_upfront A data frame: 10 × 4 Sensitivity of upfront attribute.
24 sim_cases A tibble: 9 × 4 Cases of multiple market simulation.
25 sim_mnl_pref_mult A data frame: 27 × 11 Multiple simulation using mnl_pref model.
26 sim_mnl_pref_sing A data frame: 3 × 11 Single simulation using mnl_pref model.
27 survey A tibble: 12,000 × 12 Survey data.
28 table_mnl_pref A data frame: 6 × 6 Details of the mnl_pref model.
29 table_mnl_pref_groups A data frame: 12 × 6 Details of the mnl_pref_groups model.
30 table_mxl_pref A data frame: 12 × 6 Details of the mxl_pref model.
31 tornado_data A data frame: 8 × 5 Data of tornado plot.
32 wtp_ci_mnl_pref A data frame: 5 × 3 WTP with CI calculated from mnl_pref model.

3. All Code

Code
knitr::opts_chunk$set(
    comment = "#>",
    fig.align = "center",
    fig.show = "hold",
    fig.width = 6.5,
    fig.asp = 0.7,
    fig.retina = 3,
    fig.path = "figs_auto/",
    out.width = "90%"
)

# Load libraries here
library(tidyverse)
library(here)
library(cbcTools)
library(logitr)
library(janitor)
library(lubridate)
library(fastDummies)
library(kableExtra)
library(tibble)
library(dplyr)
library(kableExtra)
brief <- tibble(
    `Attribute` = c("Range", "Explanation"),
    `Upfront Incentive:` = c("$100, $300, $500", "It means the amount of money you will be instantly paid if you join the smart charging program. The unit is in USD."),
    `Free Level 2 Charger:` = c("No or Yes", "A \"Yes\" means you will get a free level 2 charger. A level 2 charger enables faster charging than a standard home charger."), 
    `Electricity Price Discount:` = c("10%, 25%, 50%", "If you choose to join this smart charging program, this will be your electricity price discount."), 
    `Override Window:` = c("0.5 ~ 4 hrs", "This is the longest time window per day that you can force override to regular charging, during which time your EV is not controlled by the smart charging program."),
    `Guaranteed Range if charged for 8 hrs:` = c("25%, 50%, 75%", "It means the guranteed range if your car is charged for 8 hrs during smart charging. It shows percentage here, but will be converted to miles."))
row.names(brief) <- NULL

kable(t(brief), escape = FALSE) %>% 
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE, 
                position = "center") %>%
  row_spec(row = 1, bold = TRUE)
alts <- tibble(
    `Options:` = c("Option 1", "Option 2"),
    `Upfront Incentive:` = scales::dollar(c(100, 500)),
    `Free Level 2 Charger:` = c("No", "Yes"), 
    `Electricity Price Discount:` = c("25%", "50%"), 
    `Override Window:` = c("0.5 hrs", "4 hrs"),
    `Guaranteed Range if charged for 8 hrs:` = c("100 miles", "300 miles"))

alts_t <- as.data.frame(t(alts)) %>% 
  rownames_to_column(var = "Attribute")

colnames(alts_t) <- NULL

kable(alts_t, escape = FALSE, align = c('r', 'c', 'c')) %>% 
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE, 
                position = "center") %>%
  column_spec(column = 1, width = '20em') %>%
  column_spec(column = 2, width = '10em') %>%
  column_spec(column = 3, width = '10em') %>%
  row_spec(row = 1, bold = TRUE)
# Generate the profiles of attributes
profiles <- cbc_profiles(
    upfront      = c(100, 300, 500), # USD
    lv_2_charger = c('No', 'Yes'),
    discount     = c(10, 25, 50),    # percentage
    window       = c(0.5, 1, 2, 4),  # hrs
    guaranteed   = c(25, 50, 75)     # percentage
)

# Randomized full-factorial design
design <- cbc_design(
    profiles = profiles,
    n_resp   = 500,  # Number of respondents
    n_alts   = 2,    # Number of alternatives per question
    n_q      = 8,    # Number of questions per respondent
    no_choice = TRUE
)

# Save design
write_csv(design, here('data', 'choice_questions.csv'))
# 1. Read survey data
p1 <- read_csv(here("data", "pevFinal1.csv"))
p2 <- read_csv(here("data", "pevFinal2.csv"))
p3 <- read_csv(here("data", "pevFinal3.csv"))

# 2. Format and combine the 3 dataframes
# 2.1 Format p1
p1 <- p1 %>% 
    mutate(
        created = ymd_hms(created, tz = "EST"),
        ended =  ymd_hms(ended, tz = "EST"),
        time_sec_p1 = as.numeric(ended - created, units = "secs")) %>%
    # Select important columns
    select(session, time_sec_p1, starts_with("mc"))

# 2.2 Format p2
p2 <- p2 %>% 
    mutate(
        created = ymd_hms(created),
        ended =  ymd_hms(ended),
        time_sec_p2 = as.numeric(ended - created, units = "secs")) %>%
    # Select important columns
    select(session, time_sec_p2, respondentID, cbc_all_same, starts_with("mc"))

# 2.3 Format p3
p3 <- p3 %>% 
    mutate(
        created = ymd_hms(created),
        ended =  ymd_hms(ended),
        time_sec_p3 = as.numeric(ended - created, units = "secs")) %>%
    # Select important columns
    select(session, time_sec_p3, completion_code, starts_with("mc"))

# 2.4 Combine all parts together using the session variable
choice_data_wider <- p1 %>% 
    left_join(p2, by = "session") %>% 
    left_join(p3, by = "session") %>% 
    # No longer need session variable
    select(-session)

# 3. Drop unnecessary parts
# 3.1 Drop anyone who didn't complete all choice questions
choice_data_wider <- choice_data_wider %>% 
    filter(!is.na(mc_cbc_1)) %>% 
    filter(!is.na(mc_cbc_2)) %>% 
    filter(!is.na(mc_cbc_3)) %>% 
    filter(!is.na(mc_cbc_4)) %>% 
    filter(!is.na(mc_cbc_5)) %>% 
    filter(!is.na(mc_cbc_6)) %>% 
    filter(!is.na(mc_cbc_7)) %>% 
    filter(!is.na(mc_cbc_8))

# 3.2 Drop respondents who went too fast
choice_data_wider <- choice_data_wider %>% 
    mutate(
        # First replace NA values with 0 seconds
        time_sec_p1 = ifelse(is.na(time_sec_p1), 0, time_sec_p1),
        time_sec_p2 = ifelse(is.na(time_sec_p2), 0, time_sec_p2),
        time_sec_p3 = ifelse(is.na(time_sec_p3), 0, time_sec_p3),
        # Now compute the total time
        time_min_total = (time_sec_p1 + time_sec_p2 + time_sec_p3) / 60
    )

# 3.3 Drop anyone who finished in under the 10th percentile of completion times
time_10 <- quantile(choice_data_wider$time_min_total, 0.1)
choice_data_wider <- choice_data_wider %>% 
    filter(time_min_total >= time_10)

# 3.4 Drop respondents that got the attention check question wrong
# choice_data_wider <- choice_data_wider %>% 
#    filter(mc_cbc_practice == 'option_2')

# 4. Generate choice data csv file
# 4.1 Convert the data to long format
choice_data <- choice_data_wider %>% 
    pivot_longer(
        cols = mc_cbc_1:mc_cbc_8,
        names_to = "qID",
        values_to = "choice") %>% 
    # Convert the qID variable to a number
    mutate(qID = parse_number(qID))

# 4.2 Read in choice questions and join it to choice_data
survey <- read_csv("https://github.com/pingfan0727/Public-Resources/raw/main/datasets/cbc_questions_final.csv")
choice_data <- choice_data %>% 
    rename(respID = respondentID) %>% 
    left_join(survey, by = c("respID", "qID"), relationship = "many-to-many")

# 4.3 Convert choice column to 1 or 0 based on if the alternative was chosen 
choice_data <- choice_data %>% 
    mutate(
        choice = case_when(
            str_detect(choice, "^option_") ~
                as.numeric(str_replace(choice, "option_", "")),
            choice == "not_interested" ~ 3,
            TRUE ~ as.numeric(choice)
        ),
        choice = ifelse(choice == altID, 1, 0)
    ) %>% 
    select(-mc_cbc_practice, -cbc_all_same)

# 4.4 Create new values for respID & obsID
nRespondents <- nrow(choice_data_wider)
nAlts <- max(survey$altID)
nQuestions <- max(survey$qID)
choice_data$respID <- rep(seq(nRespondents), each = nAlts*nQuestions)
choice_data$obsID <- rep(seq(nRespondents*nQuestions), each = nAlts)

# 4.5 Reorder columns - it's nice to have the "ID" variables first
choice_data <- choice_data %>% 
    select(ends_with("ID"), "choice", "upfront",
           "lv_2_charger_Yes", "lv_2_charger_No",
           "discount", "window", "guaranteed",
           "no_choice", everything())

# 4.6 Clean up names for choice_data
choice_data <- clean_names(choice_data)

# 4.7 Save the choice_data_wider for future use
save(
    choice_data_wider,
    file = here("data", "choice_data_wider.RData")
)

# 4.8 Save cleaned data for modeling
write_csv(choice_data, here("data", "choice_data.csv"))
# 1.1 Load and pivot the data
load(here("data", "choice_data_wider.RData"))
choice_data_vehicle <- choice_data_wider %>% 
    pivot_longer(cols = c(mc_own_ev, mc_max_range, mc_commute,
                          mc_car_number, mc_secondary),
                 names_to = 'category',
                 values_to = 'value')

# 1.2 Reorder and relabel the values based on category
choice_data_vehicle <- choice_data_vehicle %>% 
    mutate(
        category = case_when(
            category == "mc_own_ev" ~ "Own EV",
            category == "mc_max_range" ~ "Max Range",
            category == "mc_car_number" ~ "Car Number",
            category == "mc_secondary" ~ "Secondary Car",
            category == "mc_commute" ~ "Commute",
            TRUE ~ category
        ),
        value = case_when(
            category == "Own EV" ~
                factor(value, levels = c("yes_own", "yes_lease", "no_but_next",
                                         "no_but_interested", "no"), 
                       labels = c("Own", "Lease", "Plan", "Interested", "No")),
            category == "Max Range" ~
                factor(value, levels = c(
                    "below_100", "range_100_to_149", "range_150_to_199",
                    "range_200_to_249", "range_250_to_299", "over_300"
                ),
                labels = c("100", "150", "200", "250", "300", ">300")),
            category == "Car Number" ~
                factor(value, levels = c("zero", "one", "two", "three", "four"),
                       labels = c("0", "1", "2", "3", "4")),
            category == "Secondary Car" ~
                factor(value, levels = c("yes_ev", "yes_gas", "yes_both", "no"),
                       labels = c("EV", "Gas", "Both", "Nil")),
            category == "Commute" ~
                factor(value, levels = c("on_foot", "by_bus", "by_train_subway",
                                         "self_driving", "others"),
                       labels = c("On Foot", "By Bus", "By Train",
                                  "Self-driving", "Others")),
            TRUE ~ as.factor(value)
        )
    )

# 1.3 Plot the vehicle info summary
vehicle_plot <- choice_data_vehicle %>%
    ggplot(aes(x = value)) + 
    geom_bar(width = 0.6) +
    scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
    facet_wrap(~fct_relevel(category, "Own EV", "Max Range",
                            "Car Number", "Secondary Car", "Commute"),
               scales = "free_x", nrow = 3) +
    labs(x = NULL,
         y = "Count",
         title = "Vehicle Info Summary Plots",
         subtitle = "Summary Statistics Part 1") +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '1_2_vehicle_plot.png'), 
    plot = vehicle_plot,
    width = 7,
    height = 7 * 0.8
)
# 2.1 Pivot the data
choice_data_charging <- choice_data_wider %>% 
    pivot_longer(cols = c(mc_at_home, mc_at_work),
                 names_to = 'category',
                 values_to = 'value') %>% 
    separate_rows(value, sep = ",\\s*")

# 2.2 Reorder and relabel the values based on category
choice_data_charging <- choice_data_charging %>% 
    mutate(
        category = case_when(
            category == "mc_at_home" ~ "At Home",
            category == "mc_at_work" ~ "At Work",
            TRUE ~ category
        ),
        value = case_when(
            category == "At Home" ~
                factor(value, levels = c(
                    "home_morning", "home_afternoon", "home_evening",
                    "home_night", "no_home"), 
                    labels = c("Morning", "Afternoon", "Evening",
                               "Night", "No")),
            category == "At Work" ~
                factor(value, levels = c(
                    "work_morning", "work_noon", "work_afternoon",
                    "work_evening", "work_night", "no_work"), 
                    labels = c("Morning", "Noon", "Afternoon",
                               "Evening", "Night", "No")),
            TRUE ~ as.factor(value)
        )
    )

# 2.3 Plot the charging preference summary
charging_plot <- choice_data_charging %>%
    ggplot(aes(x = value)) + 
    geom_bar(width = 0.6) +
    scale_x_discrete(limits = c("Morning", "Noon", "Afternoon",
                                "Evening", "Night", "No")) + 
    scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
    facet_wrap(~fct_relevel(category, "At Home", "At Work"),
               scales = "free_x", nrow = 2) +
    labs(x = NULL,
         y = "Count",
         title = "Charging Preference Summary Plots",
         subtitle = "Summary Statistics Part 2") +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '1_2_charging_plot.png'), 
    plot = charging_plot,
    width = 6,
    height = 6 * 0.8
)
# 3.1 First Part - gender, race, climate caring, and party
# 3.1.1 Pivot the data
choice_data_demo <- choice_data_wider %>% 
    mutate(across(c(mc_gender, mc_race, mc_climate, mc_party), as.character)) %>%
    pivot_longer(cols = c(mc_gender, mc_race, mc_climate, mc_party),
                 names_to = 'category',
                 values_to = 'value') %>% 
    separate_rows(value, sep = ",\\s*")

# 3.1.2 Reorder and relabel the values based on category
choice_data_demo <- choice_data_demo %>% 
    mutate(
        category = case_when(
            category == "mc_gender" ~ "Gender",
            category == "mc_race" ~ "Race",
            category == "mc_climate" ~ "Climate Caring",
            category == "mc_party" ~ "Party",
            TRUE ~ category
        ),
        value = case_when(
            category == "Gender" ~
                factor(value, levels = c(
                    "male", "female", "transMale", "transFemale",
                    "genderNonconform", "others", "prefer_not_say"), 
                    labels = c("Male", "Female", "Trans Male", "Trans Female",
                               "Genderqueer", "Others", "Not Say")),
            category == "Race" ~
                factor(value, levels = c(
                    "asian", "black", "white", "hispanic", "native",
                    "pacific", "others", "prefer_not_say"), 
                    labels = c("Asian", "Black", "White", "Hispanic", "Native",
                               "Pacific", "Other", "Not Say")),
            category == "Party" ~
                factor(value, levels = c(
                    "democratic", "republican", "independent", "prefer_not_say"), 
                    labels = c("Democratic", "Republican", "Independent", "Not say")),
            TRUE ~ as.factor(value)
        )
    )

# 3.1.3 Plot the first part of demographic info summary
demo_plot_1 <- choice_data_demo %>%
    ggplot(aes(x = value)) + 
    geom_bar(width = 0.6) +
    scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
    facet_wrap(~fct_relevel(category, "Gender", "Race", "Climate Caring", "Party"),
               scales = "free_x", nrow = 2) +
    labs(x = NULL,
         y = "Count",
         title = "Demographic Summary Plots Part 1",
         subtitle = "Summary Statistics Part 3-1") +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '1_2_demo_plot_1.png'), 
    plot = demo_plot_1,
    width = 7,
    height = 7 * 0.8
)
# 3.2 Second Part - education
# 3.2.1 Pivot the data
choice_data_demo_2 <- choice_data_wider %>% 
    mutate(across(c(mc_education, mc_income), as.character)) %>%
    pivot_longer(cols = c(mc_education, mc_income),
                 names_to = 'category',
                 values_to = 'value') %>% 
    separate_rows(value, sep = ",\\s*")

# 3.2.2 Reorder and relabel the values based on category
choice_data_demo_2 <- choice_data_demo_2 %>% 
    mutate(
        category = case_when(
            category == "mc_education" ~ "Education",
            category == "mc_income" ~ "Income",
            TRUE ~ category
        ),
        value = case_when(
            category == "Education" ~
                factor(value, levels = c(
                    "no_hs", "hs", "college_some", "vocational", "degree_associate",
                    "degree_bs", "degree_grad", "others", "prefer_not_say"), 
                    labels = c("< High School", "High School", "Some College",
                               "Vocational", "AA/AS", "BA/BS", "Grad",
                               "Others", "Not Say")),
            category == "Income" ~
                factor(value, levels = c(
                    "under60", "inc_60to80", "inc_80to100", "inc_100to150",
                    "inc_150to200", "inc_200to250", "inc_250to300",
                    "inc_300to350", "inc_350to400", "inc_over400",
                    "prefer_not_say"), 
                    labels = c("< $60K", "$60K to $80K", "$80K to $100K",
                               "$100K to $150K", "$150K to $200K",
                               "$200K to $250K", "$250K to $300K",
                               "$300K to $350K", "$350K to $400K",
                               "> $400K", "Not Say")),
            TRUE ~ as.factor(value)
        )
    )

# 3.2.3 Plot the education distribution
demo_plot_2 <- choice_data_demo_2 %>% 
    filter(category == "Education") %>% 
    ggplot(aes(x = value)) + 
    geom_bar(width = 0.6) +
    scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
    labs(x = NULL,
         y = "Count",
         title = "Demographic Summary Plots Part 2 - Education",
         subtitle = "Summary Statistics Part 3-2") +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '1_2_demo_plot_2.png'), 
    plot = demo_plot_2,
    width = 6,
    height = 6 / 1.618
)
# 3.3 Third Part - income
# 3.3.1 Plot the income distribution
demo_plot_3 <- choice_data_demo_2 %>% 
    filter(category == "Income") %>% 
    ggplot(aes(x = value)) + 
    geom_bar(width = 0.6) +
    scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
    labs(x = NULL,
         y = "Count",
         title = "Demographic Summary Plots Part 3 - Income",
         subtitle = "Summary Statistics Part 3-3") +
    coord_flip() +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '1_2_demo_plot_3.png'), 
    plot = demo_plot_3,
    width = 6,
    height = 6 / 1.618
)
vehicle_plot
charging_plot
demo_plot_1
demo_plot_2
demo_plot_3
# 1. Model Summary
# 1.1 Use the logitr package to generate the model
data_mnl_pref <- read_csv(here("data", "choice_data.csv")) %>% 
    mutate(upfront = -1 * upfront / 100,
           discount = discount / 10,
           guaranteed = guaranteed / 10)

mnl_pref <- logitr(
    data    = data_mnl_pref,
    outcome = "choice",
    obsID   = "obs_id",
    pars    = c("upfront", "lv_2_charger_yes", "discount",
                "window", "guaranteed", "no_choice")
    )

# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mnl_pref)

# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mnl_pref$gradient

# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mnl_pref$hessian)$values

# 2.3 Save the model
save(
    mnl_pref,
    file = here("models", "mnl_pref.RData")
)

# 3. Model Equation
# 3.1 Estimated values and std err
table_mnl_pref <- data.frame(
    Coefficient = c("&beta;<sub>1</sub>", "&beta;<sub>2</sub>",
                    "&beta;<sub>3</sub>", "&beta;<sub>4</sub>",
                    "&beta;<sub>5</sub>", "&beta;<sub>6</sub>"),
    Meaning = c("Upfront", "Lv 2 Charger", "Discount",
                "Window", "Guaranteed", "No Choice"),
    Estimate = coef(mnl_pref),
    StdError = se(mnl_pref),
    Level = c("1, 3, 5", "-", "1, 2.5, 5",
               "0.5, 1, 2, 4", "25, 50, 75", "-"),
    Unit = c("100 USD", "-", "10%", "1 hr", "10%", "-")
)
table_mnl_pref$Estimate <- round(table_mnl_pref$Estimate, 4)
table_mnl_pref$StdError <- round(table_mnl_pref$StdError, 4)

mnl_pref_kable <- table_mnl_pref %>% 
    kable(format = 'html', escape = FALSE,
          col.names = c("Coefficient", "Meaning", "Estimate", "Std Error",
                        "Level", "Unit"),
          align = c('c', 'l', 'c', 'c', 'c', 'c'),
          caption = "Summary of Model Coefficients") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# 3.2 Model figure
# 3.2.1 Construct the data frame
figure_mnl_pref <- data.frame(
    Attribute = factor(
        c("Upfront", "Charger", "Discount", "Window", "Guaranteed", "No Choice"),
        levels = c("No Choice", "Guaranteed", "Window", 
                   "Discount", "Charger", "Upfront")
    ),
    Estimate = coef(mnl_pref),
    StdError = se(mnl_pref)
)

figure_mnl_pref$LowerBound =
    figure_mnl_pref$Estimate - 1.96 * figure_mnl_pref$StdError
figure_mnl_pref$UpperBound =
    figure_mnl_pref$Estimate + 1.96 * figure_mnl_pref$StdError

# 3.2.2 Create the plot
mnl_pref_plot <- figure_mnl_pref %>% 
    ggplot(aes(x = Attribute, y = Estimate)) +
    geom_point() +
    geom_errorbar(aes(ymin = LowerBound, ymax = UpperBound), width = 0.2) +
    labs(title = "Coefficient Estimates with Uncertainty Bounds",
         subtitle = "Simple logit preference space",
         x = "Attribute",
         y = "Estimate") +
    coord_flip() +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '2_1_mnl_pref_plot.png'), 
    plot = mnl_pref_plot,
    width = 6, 
    height = 6 / 1.618
)
mnl_pref_kable
mnl_pref_plot
# 1. Model Summary
# 1.1 Use the logitr package to generate the model
data_mxl_pref <- read_csv(here("data", "choice_data.csv")) %>% 
    mutate(upfront = -1 * upfront / 100,
           discount = discount / 10,
           guaranteed = guaranteed / 10)
set.seed(456)
mxl_pref <- logitr(
    data    = data_mxl_pref,
    outcome = "choice",
    obsID   = "obs_id",
    pars    = c("upfront", "lv_2_charger_yes", "discount",
                "window", "guaranteed", "no_choice"),
    randPars = c(upfront          = "n",
                 lv_2_charger_yes = "n",
                 discount         = "n",
                 window           = "n",
                 guaranteed       = "n",
                 no_choice        = "n"),
    numMultiStarts = 30,
    drawType = 'sobol',
    numDraws = 200
    )

# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mxl_pref)

# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mxl_pref$gradient

# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mxl_pref$hessian)$values

# 2.3 Save the model
save(
    mxl_pref,
    file = here("models", "mxl_pref.RData")
)

# 3. Model Equation
# 3.1 Estimated values and std err
table_mxl_pref <- data.frame(
    Coefficient = c("&beta;<sub>1</sub>", "&beta;<sub>2</sub>",
                    "&beta;<sub>3</sub>", "&beta;<sub>4</sub>",
                    "&beta;<sub>5</sub>", "&beta;<sub>6</sub>",
                    "SD &beta;<sub>1</sub>", "SD &beta;<sub>2</sub>",
                    "SD &beta;<sub>3</sub>", "SD &beta;<sub>4</sub>",
                    "SD &beta;<sub>5</sub>", "SD &beta;<sub>6</sub>"),
    Meaning = c("Upfront", "Lv 2 Charger", "Discount",
                "Window", "Guaranteed", "No Choice",
                "SD Upfront", "SD Lv 2 Charger", "SD Discount",
                "SD Window", "SD Guaranteed", "SD No Choice"),
    Estimate = coef(mxl_pref),
    StdError = se(mxl_pref),
    Level = c("1, 3, 5", "-", "1, 2.5, 5",
               "0.5, 1, 2, 4", "25, 50, 75", "-",
              "-", "-", "-", "-", "-", "-"),
    Unit = c("100 USD", "-", "10%", "hour", "10%", "-",
             "-", "-", "-", "-", "-", "-")
)
table_mxl_pref$Estimate <- round(table_mxl_pref$Estimate, 4)
table_mxl_pref$StdError <- round(table_mxl_pref$StdError, 4)
mxl_pref_kable <- table_mxl_pref %>% 
    kable(format = 'html', escape = FALSE,
          col.names = c("Coefficient", "Meaning", "Estimate", "Std Error",
                        "Level", "Unit"),
          align = c('c', 'l', 'c', 'c', 'l', 'l'),
          caption = "Summary of Model Coefficients") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# 3.2 Model figure
# 3.2.1 Construct the data frame
figure_mxl_pref <- data.frame(
    Attribute = factor(
        c("Upfront", "Charger", "Discount", "Window", "Guaranteed", "No Choice"),
        levels = c("No Choice", "Guaranteed", "Window",
                   "Discount", "Charger", "Upfront")
    ),
    Estimate = head(coef(mxl_pref), 6),
    StdError = head(se(mxl_pref), 6)
)

figure_mxl_pref$LowerBound =
    figure_mxl_pref$Estimate - 1.96 * figure_mxl_pref$StdError
figure_mxl_pref$UpperBound =
    figure_mxl_pref$Estimate + 1.96 * figure_mxl_pref$StdError

# 3.2.2 Create the plot
mxl_pref_plot <- figure_mxl_pref %>% 
    ggplot(aes(x = Attribute, y = Estimate)) +
    geom_point() +
    geom_errorbar(aes(ymin = LowerBound, ymax = UpperBound), width = 0.2) +
    labs(title = "Coef Estimates with Uncertainty Bounds",
         subtitle = "Mixed logit preference space",
         x = "Attribute",
         y = "Estimate") +
    coord_flip() +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '2_2_mxl_pref_plot.png'), 
    plot = mxl_pref_plot,
    width = 6, 
    height = 6 / 1.618
)
mxl_pref_kable
mxl_pref_plot
# 1. Model Summary
# 1.1 Use the logitr package to generate the model
data_mnl_pref_groups <- read_csv(here("data", "choice_data.csv")) %>% 
    mutate(upfront = -1 * upfront / 100,
           discount = discount / 10,
           guaranteed = guaranteed / 10) %>% 
    mutate(group_A = ifelse(mc_income == "under60", 1, 0),
           group_B = ifelse(mc_income != "under60", 1, 0)) %>% 
    mutate(upfront_B          = upfront * group_B,
           lv_2_charger_yes_B = lv_2_charger_yes * group_B,
           discount_B         = discount * group_B,
           window_B           = window * group_B,
           guaranteed_B       = guaranteed * group_B,
           no_choice_B        = no_choice * group_B)

mnl_pref_groups <- logitr(
    data    = data_mnl_pref_groups,
    outcome = "choice",
    obsID   = "obs_id",
    pars    = c("upfront", "lv_2_charger_yes", "discount",
                "window", "guaranteed", "no_choice",
                "upfront_B", "lv_2_charger_yes_B", "discount_B",
                "window_B", "guaranteed_B", "no_choice_B")
    )

# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mnl_pref_groups)

# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mnl_pref_groups$gradient

# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mnl_pref_groups$hessian)$values

# 2.3 Save the model
save(
    mnl_pref_groups,
    file = here("models", "mnl_pref_groups.RData")
)

# 3. Model Equation
# 3.1 Estimated values and std err
table_mnl_pref_groups <- data.frame(
    Coefficient = c("&beta;<sub>1</sub>", "&beta;<sub>2</sub>",
                    "&beta;<sub>3</sub>", "&beta;<sub>4</sub>",
                    "&beta;<sub>5</sub>", "&beta;<sub>6</sub>",
                    "&beta;<sub>7</sub>", "&beta;<sub>8</sub>",
                    "&beta;<sub>9</sub>", "&beta;<sub>10</sub>",
                    "&beta;<sub>11</sub>", "&beta;<sub>12</sub>"),
    Meaning = c("Upfront", "Lv 2 Charger", "Discount",
                "Window", "Guaranteed", "No Choice",
                "Upfront B", "Lv 2 Charger B", "Discount B",
                "Window B", "Guaranteed B", "No Choice B"),
    Estimate = coef(mnl_pref_groups),
    StdError = se(mnl_pref_groups),
    Level = c("1, 3, 5", "-", "1, 2.5, 5",
              "0.5, 1, 2, 4", "25, 50, 75", "-",
              "1, 3, 5", "-", "1, 2.5, 5",
              "0.5, 1, 2, 4", "25, 50, 75", "-"),
    Unit = c("100 USD", "-", "10%", "hour", "10%", "-",
             "100 USD", "-", "10%", "hour", "10%", "-")
)
table_mnl_pref_groups$Estimate <- round(table_mnl_pref_groups$Estimate, 4)
table_mnl_pref_groups$StdError <- round(table_mnl_pref_groups$StdError, 4)
mnl_pref_groups_kable <- table_mnl_pref_groups %>% 
    kable(format = 'html', escape = FALSE,
          col.names = c("Coefficient", "Meaning", "Estimate", "Std Error",
                        "Level", "Unit"),
          align = c('c', 'l', 'c', 'c', 'l', 'l'),
          caption = "Summary of Model Coefficients") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# 3.2 Model figure
# 3.2.1 Construct the data frame
figure_mnl_pref_groups <- data.frame(
    Attribute = factor(
        c("Upfront", "Charger", "Discount", "Window", "Guaranteed", "No Choice",
          "Upfront B", "Charger B", "Discount B",
          "Window B", "Guaranteed B", "No Choice B"),
        levels = c("No Choice B", "Guaranteed B", "Window B",
                   "Discount B", "Charger B", "Upfront B",
                   "No Choice", "Guaranteed", "Window",
                   "Discount", "Charger", "Upfront")
    ),
    Estimate = coef(mnl_pref_groups),
    StdError = se(mnl_pref_groups)
)

figure_mnl_pref_groups$LowerBound =
    figure_mnl_pref_groups$Estimate - 1.96 * figure_mnl_pref_groups$StdError
figure_mnl_pref_groups$UpperBound =
    figure_mnl_pref_groups$Estimate + 1.96 * figure_mnl_pref_groups$StdError

# 3.2.2 Create the plot
mnl_pref_groups_plot <- figure_mnl_pref_groups %>% 
    ggplot(aes(x = Attribute, y = Estimate)) +
    geom_point() +
    geom_errorbar(aes(ymin = LowerBound, ymax = UpperBound), width = 0.2) +
    labs(title = "Coefficient Estimates with Uncertainty Bounds",
         subtitle = "Simple logit preference space with groups",
         x = "Attribute",
         y = "Estimate") +
    coord_flip() +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '2_3_mnl_pref_groups_plot.png'), 
    plot = mnl_pref_groups_plot,
    width = 6, 
    height = 6 / 1.618
)

# 4. Hypothesis Test
# 4.1 Perform the simulation
beta <- coef(mnl_pref_groups)
hessian <- mnl_pref_groups$hessian
covariance = -1*(solve(hessian))
draws = as.data.frame(MASS::mvrnorm(10^5, beta, covariance))

# 4.2 Define the coefficients for groups A and B
b1_A <- draws$upfront
b2_A <- draws$lv_2_charger_yes
b3_A <- draws$discount
b4_A <- draws$window
b5_A <- draws$guaranteed
b6_A <- draws$no_choice

b1_B <- draws$upfront + draws$upfront_B
b2_B <- draws$lv_2_charger_yes + draws$lv_2_charger_yes_B
b3_B <- draws$discount + draws$discount_B
b4_B <- draws$window + draws$window_B
b5_B <- draws$guaranteed + draws$guaranteed_B
b6_B <- draws$no_choice + draws$no_choice_B

# 4.3 Construct the data frame
coefficients = list(b1_A, b2_A, b3_A, b4_A, b5_A, b6_A,
                    b1_B, b2_B, b3_B, b4_B, b5_B, b6_B)

draw_sim_df <- data.frame(
    Coefficient = c("b1_A", "b2_A", "b3_A", "b4_A", "b5_A", "b6_A",
                    "b1_B", "b2_B", "b3_B", "b4_B", "b5_B", "b6_B"),
    Mean = sapply(coefficients, mean),
    LowerQuantile = sapply(coefficients, function(x) quantile(x, probs = 0.025)),
    UpperQuantile = sapply(coefficients, function(x) quantile(x, probs = 0.975))
)

# 4.4 Sketch the plot
hypothesis_test_plot <- draw_sim_df %>% 
    ggplot(aes(x = Coefficient, y = Mean)) +
    geom_point() +
    geom_errorbar(aes(ymin = LowerQuantile, ymax = UpperQuantile), width = 0.2) +
    labs(title = "Hypothesis Test on Groups A and B",
         x = "Coefficient",
         y = "Values") +
    coord_flip() +
    theme_bw(base_family = "Ubuntu")

ggsave(
    filename = here('figs_manu', '2_3_hypothesis_test_plot.png'), 
    plot = hypothesis_test_plot,
    width = 6, 
    height = 6 / 1.618
)
mnl_pref_groups_kable
mnl_pref_groups_plot
hypothesis_test_plot
# 1. Model Summary
# 1.1 Use the logitr package to generate the model
data_mnl_pref <- read_csv(here("data", "choice_data.csv")) %>% 
    mutate(upfront = -1 * upfront / 100,
           discount = discount / 10,
           guaranteed = guaranteed / 10)

set.seed(234)
mnl_wtp <- logitr(
    data    = data_mnl_pref,
    outcome = "choice",
    obsID   = "obs_id",
    pars    = c("lv_2_charger_yes", "discount",
                "window", "guaranteed", "no_choice"),
    scalePar = "upfront",
    numMultiStarts = 30
    )

# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mnl_wtp)

# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mnl_wtp$gradient

# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mnl_wtp$hessian)$values

# 2.3 Compare preference vs WTP spaces
load(here("models", "mnl_pref.RData"))
mnl_wtp_compare <- wtpCompare(mnl_pref, mnl_wtp, scalePar = 'upfront')
mnl_wtp_compare_kable <- mnl_wtp_compare %>% kable(
    format = 'html', escape = FALSE,
    col.names = c("Coefficient", "mnl_pref Space",
                  "mnl_wtp Space", "Difference"),
    caption = "Model Coefficients Compare") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# 2.4 Save the model
save(
    mnl_wtp,
    file = here("models", "mnl_wtp.RData")
)
mnl_wtp_compare_kable
# 1. Model Summary
# 1.1 Use the logitr package to generate the model
data_mxl_pref <- read_csv(here("data", "choice_data.csv")) %>% 
    mutate(upfront = -1 * upfront / 100,
           discount = discount / 10,
           guaranteed = guaranteed / 10)

set.seed(89)
mxl_wtp <- logitr(
    data     = data_mxl_pref,
    outcome  = "choice",
    obsID    = "obs_id",
    pars    = c("lv_2_charger_yes", "discount",
                "window", "guaranteed", "no_choice"),
    scalePar = "upfront",
    randPars = c(lv_2_charger_yes = "n",
                 discount         = "n",
                 window           = "n",
                 guaranteed       = "n",
                 no_choice        = "n"),
    numMultiStarts = 30
    )

# 1.2 Use the summary() function to see model summary
# Skipped here. Please refer to the R file in the data folder.
# summary(mxl_wtp)

# 2 Model Evaluation
# 2.1 First order condition - Is the gradient close to zero?
# Skipped here. Please refer to the R file in the data folder.
# mxl_wtp$gradient

# 2.2 Second order condition - Is the hessian negative definite?
# Skipped here. Please refer to the R file in the data folder.
# eigen(mxl_wtp$hessian)$values

# 2.3 Save the coefficients
coef_mxl_wtp <- head(coef(mxl_wtp), 6)
coef_mxl_wtp_df <- as.data.frame(coef_mxl_wtp)

mxl_wtp_compare <- cbind(coef_mxl_wtp_df, head(mnl_wtp_compare, 6)) %>% 
    select(pref, coef_mxl_wtp, difference) %>% 
    mutate(difference = coef_mxl_wtp - as.numeric(pref))

mxl_wtp_compare_kable <- mxl_wtp_compare %>% kable(
    format = 'html', escape = FALSE,
    col.names = c("Coefficient", "mnl_pref Space",
                  "mxl_wtp Space", "Difference"),
    caption = "Model Coefficients Compare") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# 2.4 Save the model
save(
    mxl_wtp,
    file = here("models", "mxl_wtp.RData")
)
mxl_wtp_compare_kable
# 1. Load the mnl_pref model
load(here("models", "mnl_pref.RData"))

# Skipped here. Please refer to the R file in the data folder.
# summary(mnl_pref)

# 2. Compute WTP estimates
coefs_mnl_pref <- coef(mnl_pref)
wtp_mnl_pref <- coefs_mnl_pref / (-1*coefs_mnl_pref['upfront'])

# 3. Simulate WTP with uncertainty
cov_mnl_pref <- vcov(mnl_pref)
set.seed(3)
coef_draws_mnl_pref <- as.data.frame(
    MASS::mvrnorm(10^6, coefs_mnl_pref, cov_mnl_pref)
    )
wtp_draws_mnl_pref <- -1*(coef_draws_mnl_pref[,2:6] / coef_draws_mnl_pref[,1])
wtp_ci_mnl_pref <- ci(wtp_draws_mnl_pref)

# 4. Create kables for WTP with uncertainty
wtp_mnl_pref_kable <- wtp_mnl_pref %>% kable(
    format = 'html', escape = FALSE,
    col.names = c("Attribute", "Coefficient"),
    caption = "Computed WTP") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

wtp_ci_mnl_pref_kable <- wtp_ci_mnl_pref %>% kable(
    format = 'html', escape = FALSE,
    col.names = c("Attribute", "Mean", "Lower", "Upper"),
    caption = "Simulated WTP with CI") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# 5. Save WTP with uncertainty
save(
    wtp_mnl_pref,
    wtp_ci_mnl_pref,
    file = here("data", "wtp_from_pref.RData")
)
wtp_mnl_pref_kable
wtp_ci_mnl_pref_kable
# 1. Load the mnl_pref model
load(here("models", "mnl_pref.RData"))

# 2. Market Simulation
# 2.1 Single market simulation
scenarios_sing <- data.frame(
    alt_id           = c(1, 2, 3),
    obs_id           = c(1, 1, 1),
    upfront          = c(-5, -1, 0),
    lv_2_charger_yes = c(1, 0, 0),
    discount         = c(5, 1, 0),
    window           = c(1, 4, 0),
    guaranteed       = c(2.5, 7.5, 0),
    no_choice        = c(0, 0, 1)
)

sim_mnl_pref_sing <- predict(
    mnl_pref,
    newdata    = scenarios_sing,
    obsID      = 'obs_id',
    level      = 0.95,
    interval   = 'confidence',
    returnData = TRUE
)

# 2.2 Multiple simulations
scenarios_mult <- read_csv(here("data", "scenarios.csv"))

sim_mnl_pref_mult <- predict(
    mnl_pref,
    newdata    = scenarios_mult, 
    obsID      = 'obs_id', 
    level      = 0.95,
    interval   = 'confidence',
    returnData = TRUE
)

# 2.3 Save simulations
save(
    sim_mnl_pref_sing,
    sim_mnl_pref_mult,
    file = here("sims", "sim_mnl_pref.RData")
)

# 3. Market Simulation Plots
# 3.1 Single market simulation plot
sim_sing_plot <- sim_mnl_pref_sing %>% 
    mutate(label = factor(c("Monetary", "Flexibility", "Not Enroll"),
                          levels = c("Monetary",
                                     "Flexibility",
                                     "Not Enroll"))) %>% 
    ggplot(aes(
        x = label, y = predicted_prob, 
        ymin = predicted_prob_lower, ymax = predicted_prob_upper)) +
    geom_col(fill = "grey", width = 0.6) +
    geom_errorbar(width = 0.3) +
    scale_y_continuous(limits = c(0, 1), labels = scales::percent) +
    labs(x = 'Options',
         y = 'Probability',
         title = 'Single Market Simulation Plot',
         subtitle = 'Monetary vs Flexibility') +
    theme_bw(base_family = 'Ubuntu')

ggsave(
    filename = here('figs_manu', '3_2_sim_sing_plot.png'),
    plot = sim_sing_plot,
    width = 6,
    height = 6 / 1.618
)

# 3.2 Multiple market simulation plot
scenarios_mult_labels <- setNames(c(
    "Polarized", "Incentive Only", "Charger Only", "Discount Only",
    "Window Only", "Guaranteed Only", "Monetary vs Flexibility",
    "Monetary", "Flexibility"), 1:9)

sim_mult_plot <- sim_mnl_pref_mult %>%
    ggplot(aes(
        x = as.factor(alt_id), y = predicted_prob, 
        ymin = predicted_prob_lower, ymax = predicted_prob_upper)) +
    geom_col(fill = "grey", width = 0.6) +
    geom_errorbar(width = 0.3) +
    facet_wrap(~obs_id, labeller = labeller(obs_id = scenarios_mult_labels)) +
    scale_x_discrete(labels = c("1" = "1", "2" = "2", "3" = "No")) +
    scale_y_continuous(limits = c(0, 1), labels = scales::percent) +
    labs(x = 'Options',
         y = 'Probability',
         title = 'Multiple Market Simulation Plot') +
    theme_bw(base_family = 'Ubuntu')

ggsave(
    filename = here('figs_manu', '3_2_sim_mult_plot.png'),
    plot = sim_mult_plot,
    width = 7,
    height = 7 * 0.8
)
sim_sing_plot
sim_mult_plot
sim_cases <- tibble(
    `No.` = c(seq(1, 9)),
    `Option 1` = c(
        "Best levels of all attributes",
        "Best upfront incentive",
        "Free charger",
        "Best electricity price discount",
        "Best override window",
        "Best guaranteed battery",
        "Best upfront, discount, and a free charger",
        "Best upfront, discount, and a free charger",
        "Best window and guaranteed battery"),
    `Option 2` = c(
        "Lowest level for all",
        "Lowest level for all",
        "Lowest level for all",
        "Lowest level for all",
        "Lowest level for all",
        "Lowest level for all",
        "Best window and guaranteed battery",
        "Lowest level for all",
        "Lowest level for all"
    ),
    `No Option` = c(
        "No participation", "No participation", "No participation",
        "No participation", "No participation", "No participation",
        "No participation", "No participation", "No participation"
    )
)

sim_cases_kable <- sim_cases %>% 
    kable(format = 'html', escape = FALSE,
          align = c('c', 'l', 'l', 'l'),
          caption = "Multiple Market Simulation Cases") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

sim_cases_kable
# 1. Load the mnl_pref model
load(here("models", "mnl_pref.RData"))

# 2. Sensitivity Analysis
# 2.1 Sensitivity of market share to changes in "upfront"

# Set a baseline scenario
# Option 1 is a no option
# Option 2 is provided with moderate values
# We will change the upfront values for Option 2
baseline <- data.frame(
    alt_id           = c(1, 2), 
    obs_id           = c(1, 1),
    upfront          = c(0, -2),
    lv_2_charger_yes = c(0, 1),
    discount         = c(0, 2.5),
    window           = c(0, 2),
    guaranteed       = c(0, 5),
    no_choice        = c(1, 0)
)

baseline_kable <- baseline %>% kable(
    format = 'html', escape = FALSE,
    caption = "Single Attribute Baseline") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# Set the sequence of upfront levels from 1000 to 100 USD
upfront_levels <- seq(-10, -1, by = 1)

# Count the numbers of upfront scenarios
upfront_numbers <- length(upfront_levels)

# Create scenarios with different upfront levels in option 2
scenario_upfront <- do.call(
    rbind, replicate(upfront_numbers, baseline, simplify = FALSE)
    )
scenario_upfront$obs_id <- rep(seq(upfront_numbers), each = 2)
scenario_upfront$upfront[which(scenario_upfront$alt_id == 2)] <- upfront_levels 

# Simulate the upfront sensitivity
sens_upfront <- predict(
    mnl_pref,
    newdata = scenario_upfront,
    obsID = 'obs_id',
    level = 0.95,
    interval = 'confidence',
    returnData = TRUE
    ) %>% 
    # Keep only option 2
    filter(alt_id == 2) %>% 
    # Keep only upfront and predictions
    select(upfront, starts_with("predicted_"))

sens_upfront_kable <- sens_upfront %>% kable(
    format = 'html', escape = FALSE,
    caption = "Upfront Sensitivity Table") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# 2.2 Sensitivity of market share to changes in multiple attributes
# We continue to work on Option 2
# This time, we modify all continuous attributes
# The attributes are: upfront, discount, window, and guaranteed

# Construct the attribute_levels, like we constructed upfront_levels
# But attribute_levels is a df, while upfront_levels is a sequence
attribute_levels <- tribble(
    ~obs_id, ~alt_id, ~attribute,    ~case,  ~value,
    2,       2,      'upfront',     'high',  -5,
    3,       2,      'upfront',     'low',   -1,
    4,       2,      'discount',    'high',  5,
    5,       2,      'discount',    'low',   1,
    6,       2,      'window',      'high',  4,
    7,       2,      'window',      'low',   0.5,
    8,       2,      'guaranteed',  'high',  7.5,
    9,       2,      'guaranteed',  'low',   2.5
)

attribute_levels_kable <- attribute_levels %>% 
    kable(format = 'html', escape = FALSE,
          col.names = c("Obs ID", "Alt ID", "Attribute",
                        "Case", "Value"),
          caption = "Attribute Levels") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# Define the numbers of different scenarios
scenario_numbers <- 9

# Create scenarios with different upfront levels in option 2
scenario_table <- do.call(
    rbind, replicate(scenario_numbers, baseline, simplify = FALSE)
)
scenario_table$obs_id <- rep(seq(scenario_numbers), each = 2)
scenario_table <- scenario_table %>% 
    left_join(attribute_levels, by = c("alt_id", "obs_id")) %>% 
    mutate(
        attribute = ifelse(is.na(attribute), "other", attribute),
        case = ifelse(is.na(case), "base", case),
        upfront = ifelse(attribute == 'upfront', value, upfront),
        discount = ifelse(attribute == 'discount', value, discount),
        window = ifelse(attribute == 'window', value, window),
        guaranteed = ifelse(attribute == 'guaranteed', value, guaranteed)
    )

scenario_kable <- scenario_table %>% 
    kable(format = 'html', escape = FALSE,
          col.names = c("Alt ID", "Obs ID", "Upfront", "Charger",
                        "Discount", "Window", "Guaranteed",
                        "No Choice", "Attribute", "Case", "Value"),
          caption = "Scenario Table") %>% 
    kable_styling(bootstrap_options = c("striped", "hover"))

# Simulate the multiple change sensitivity
sens_multiple <- predict(
    mnl_pref,
    newdata = scenario_table,
    obsID = 'obs_id',
    level = 0.95,
    interval = 'confidence',
    returnData = TRUE
    ) %>% 
    # Keep only option 2
    filter(alt_id == 2) %>% 
    # Keep only upfront and predictions
    select(upfront, attribute, case, value, predicted_prob)

# Save simulations
save(
    sens_upfront,
    sens_multiple,
    file = here("sims", "sens_mnl_pref.RData")
)
# 3. Sensitivity Plots
# 3.1 Plot of market share change to "upfront"
sens_upfront_plot <- sens_upfront %>% 
    ggplot(aes(x = upfront * -100,
               y = predicted_prob,
               ymin = predicted_prob_lower,
               ymax = predicted_prob_upper)) +
    geom_ribbon(alpha = 0.2) +
    geom_line(linetype = "dashed") +
    geom_line(
        data = sens_upfront %>% filter(upfront <= -1, upfront >= -5), 
        linetype = "solid") +
    expand_limits(x = c(-10, -1), y = c(0, 1)) +
    labs(x = 'Upfront (USD)',
         y = 'Market Share',
         title = 'Sensitivity of market share change to upfront') +
    theme_bw(base_family = "Ubuntu")

# Save plot
ggsave(
    filename = here('figs_manu', '3_3_sens_upfront_plot.png'), 
    plot = sens_upfront_plot,
    width = 6, 
    height = 6 / 1.618
)

# 3.2 Plot of policy cost change to "upfront"
# Assume a market size of 1000 users
market_size <- 1000
sens_cost <- sens_upfront %>%
    mutate(
        cost_mean = upfront*market_size*predicted_prob*-100 / 1000,
        cost_low  = upfront*market_size*predicted_prob_lower*-100 / 1000,
        cost_high = upfront*market_size*predicted_prob_upper*-100 / 1000)
sens_cost_plot <- sens_cost %>% 
    ggplot(aes(x = upfront * -100,
               y = cost_mean,
               ymin = cost_low,
               ymax = cost_high)) +
    geom_ribbon(alpha = 0.2) +
    geom_line(linetype = "dashed") +
    geom_line(
        data = sens_cost %>% filter(upfront <= -1, upfront >= -5), 
        linetype = "solid") +
    expand_limits(x = c(-10, -1), y = c(0, 1)) +
    labs(x = 'Upfront (USD)',
         y = 'Cost ($1000)',
         title = 'Sensitivity of policy cost change to upfront') +
    theme_bw(base_family = "Ubuntu")

# Save plot
ggsave(
    filename = here('figs_manu', '3_3_sens_cost_plot.png'), 
    plot = sens_cost_plot,
    width = 6, 
    height = 6 / 1.618
)

# 3.2 Tornado Plot of market share change to multiple attributes
labels <- data.frame( 
    attribute = c('upfront', 'discount', 'window', 'guaranteed'), 
    label = c('Upfront Incentive (USD)',
              'Electricity Discount (%)',
              'Override Window (hrs)',
              'Guaranteed Battery (%)')
)

tornado_data <- sens_multiple %>% 
    select(-upfront) %>% 
    filter(case != 'base') %>% 
    left_join(labels, by = 'attribute') %>% 
    mutate(value = ifelse(attribute == "upfront", -100*value, value),
           value = ifelse(attribute == "discount" | attribute == "guaranteed",
                          10*value, value))

tornado_plot <- jph::ggtornado(
    data = tornado_data,
    baseline = sens_multiple$predicted_prob[1],
    var = 'label',
    level = 'case',
    value = 'value',
    result = 'predicted_prob'
    ) +
    scale_fill_manual(values = c("#67a9cf", "#ef8a62")) + 
    labs(x = 'Market Share',
         y = 'Attribute',
         title = 'Plot of market share change to multiple attributes') +
    theme_bw(base_family = 'Ubuntu') +
    theme(legend.position = 'none')

# Save plot
ggsave(
    filename = here('figs_manu', '3_3_tornado_plot.png'), 
    plot = tornado_plot,
    width = 6,
    height = 6 / 1.618
)
baseline_kable
sens_upfront_kable
sens_upfront_plot
sens_cost_plot
attribute_levels_kable
scenario_kable
tornado_plot
# Read in the choice questions
library(tidyverse)
survey <- read_csv("https://raw.githubusercontent.com/pingfan0727/Public-Resources/main/datasets/cbc_questions_final.csv")

# Define the respondent ID
respondentID <- sample(survey$respID, 1)

# Create the subset of rows for that respondent ID
df <- survey %>% 
    filter(respID == respondentID)

# Convert df to json
df_json <- jsonlite::toJSON(df) # Here the df_json is NOT serialized since we are using Kables instead of buttons.
library(dplyr)
library(kableExtra)
brief <- tibble(
    `Attribute` = c("Range", "Explanation"),
    `Upfront Incentive:` = c("$100, $300, $500", "It means the amount of money you will be instantly paid if you join the smart charging program. The unit is in USD."),
    `Free Level 2 Charger:` = c("No or Yes", "A \"Yes\" means you will get a free level 2 charger. A level 2 charger enables faster charging than a standard home charger."), 
    `Electricity Price Discount:` = c("10%, 25%, 50%", "If you choose to join this smart charging program, this will be your electricity price discount."), 
    `Override Window:` = c("0.5 ~ 4 hrs", "This is the longest time window per day that you can force override to regular charging, during which time your EV is not controlled by the smart charging program."),
    `Guaranteed Range if charged for 8 hrs:` = c("25%, 50%, 75%", "It means the guranteed range if your car is charged for 8 hrs during smart charging. It shows percentage here, but will be converted to miles."))
row.names(brief) <- NULL

kable(t(brief), escape = FALSE) %>% 
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE, 
                position = "center") %>%
  row_spec(row = 1, bold = TRUE)
library(dplyr)
library(kableExtra)
library(tibble)

alts <- tibble(
    `Options:` = c("Option 1", "Option 2"),
    `Upfront Incentive:` = scales::dollar(c(100, 500)),
    `Free Level 2 Charger:` = c("No", "Yes"), 
    `Electricity Price Discount:` = c("25%", "50%"), 
    `Override Window:` = c("0.5 hrs", "4 hrs"),
    `Guaranteed Range if charged for 8 hrs:` = c("100 miles", "300 miles"))

alts_t <- as.data.frame(t(alts)) %>% 
  rownames_to_column(var = "Attribute")

colnames(alts_t) <- NULL

kable(alts_t, escape = FALSE, align = c('r', 'c', 'c')) %>% 
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE, 
                position = "center") %>%
  column_spec(column = 1, width = '20em') %>%
  column_spec(column = 2, width = '10em') %>%
  column_spec(column = 3, width = '10em') %>%
  row_spec(row = 1, bold = TRUE)
library(dplyr)
library(kableExtra)
library(tibble)

alts <- jsonlite::fromJSON(df_json) %>%
  filter(qID == 1 & altID != 3) %>% 
  mutate(
    altID = paste("Option", altID),
    upfront = scales::dollar(upfront),
    discount = paste0(discount,"%"),
    window = paste(window, "hrs"),
    guaranteed = paste(guaranteed, "%"),
    lv_2_charger_Yes = case_when(lv_2_charger_Yes == 1 ~ "Yes",
                                 lv_2_charger_Yes == 0 ~ "No")) %>%
  select(
    `Options:` = altID,
    `Upfront Incentive:` = upfront,
    `Free Level 2 Charger:` = lv_2_charger_Yes,
    `Electricity Price Discount:` = discount,
    `Override Window:` = window,
    `Guaranteed Range if charged for 8 hrs:` = guaranteed)

alts_t <- as.data.frame(t(alts)) %>% 
  rownames_to_column(var = "Attribute")

colnames(alts_t) <- NULL

kable(alts_t, escape = FALSE, align = c('r', 'c', 'c')) %>% 
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE, 
                position = "center") %>%
  column_spec(column = 1, width = '20em') %>%
  column_spec(column = 2, width = '10em') %>%
  column_spec(column = 3, width = '10em') %>%
  row_spec(row = 1, bold = TRUE)
dict_code <- data.frame(
    name = c('0_make_choice_questions',
             '1_1_clean_data',
             '1_2_summary_statistics',
             '2_1_mnl_pref_model',
             '2_2_mxl_pref_model',
             '2_3_mnl_pref_groups_model',
             '2_4_mnl_wtp_model',
             '2_5_mxl_wtp_model',
             '3_1_wtp_from_pref',
             '3_2_market_sim',
             '3_3_sensitivity_analysis'),
    type = 'R',
    description = c('Construct the conjoint questions.',
                    'Clean and export survey data.',
                    'Generate summary statistics of test takers.',
                    'The simple logit preference space model.',
                    'The mixed logit preference space model.',
                    'The simple logit preference space model by groups.',
                    'The simple logit WTP space model.',
                    'The mixed logit WTP space model.',
                    'Calculation of WTP using mnl_pref model.',
                    'Market simulations',
                    'Market sensitivity analysis')
) %>% 
    mutate(count = row_number()) %>%
    select(count, everything())
dict_code %>% 
    kable(format = 'html',
          escape = FALSE,
          align = c('c', 'l', 'c', 'l'),
          col.names = c('No.', 'Name', 'Type', 'Description'),
          caption = 'Dictionary of R Codes') %>%
    kable_styling(bootstrap_options = c('striped', 'hover'))
dict_file <- data.frame(
    name = c('choice_data',
             'choice_questions',
             'pevFinal1',
             'pevFinal2',
             'pevFinal3',
             'scenarios',
             'choice_data_wider',
             'wtp_from_pref',
             'mnl_pref_groups',
             'mnl_pref',
             'mnl_wtp',
             'mxl_pref',
             'mxl_wtp',
             'sens_mnl_pref',
             'sim_mnl_pref'),
    type = c('CSV', 'CSV', 'CSV', 'CSV', 'CSV', 'CSV',
             'RData', 'RData', 'RData', 'RData', 'RData', 
             'RData', 'RData', 'RData', 'RData'),
    description = c(
        'Cleaned results from the final survey.',
        'Local version of Conjoint questions.',
        'Part 1 of final survey result from formr.',
        'Part 2 of final survey result from formr.',
        'Part 3 of final survey result from formr.',
        'Scenarios of multiple attributes for sensitivity analysis.',
        'Wider version of survey results',
        'WTP with uncertainty calculated from mnl_pref.',
        'The simple logit preference space model.',
        'The mixed logit preference space model.',
        'The simple logit preference space model by groups.',
        'The simple logit WTP space model.',
        'The mixed logit WTP space model.',
        'Market sensitivity of mnl_pref model.',
        'Market simulation of mnl_pref model.'
        )
) %>% 
    mutate(count = row_number()) %>%
    select(count, everything())
dict_file %>% 
    kable(format = 'html',
          escape = FALSE,
          align = c('c', 'l', 'c', 'l'),
          col.names = c('No.', 'Name', 'Type', 'Description'),
          caption = 'Dictionary of Files') %>%
    kable_styling(bootstrap_options = c('striped', 'hover'))
dict_df <- data.frame(
    name = c('choice_data',
             'choice_data_charging',
             'choice_data_demo',
             'choice_data_demo_2',
             'choice_data_vehicle',
             'choice_data_wider',
             'coef_mxl_wtp_df',
             'data_mnl_pref',
             'data_mnl_pref_groups',
             'data_mxl_pref',
             'figure_mnl_pref',
             'figure_mnl_pref_groups',
             'figure_mxl_pref',
             'mnl_wtp_compare',
             'mxl_wtp_compare',
             'profiles',
             'scenario_table',
             'scenario_upfront',
             'scenarios_mult',
             'scenarios_sing',
             'sens_cost',
             'sens_multiple',
             'sens_upfront',
             'sim_cases',
             'sim_mnl_pref_mult',
             'sim_mnl_pref_sing',
             'survey',
             'table_mnl_pref',
             'table_mnl_pref_groups',
             'table_mxl_pref',
             'tornado_data',
             'wtp_ci_mnl_pref'),
    type = c('A tibble: 2,928 × 40',
             'A tibble: 336 × 38',
             'A tibble: 499 × 36',
             'A tibble: 244 × 38',
             'A tibble: 610 × 35',
             'A tibble: 122 × 38',
             'A data frame: 6 × 1',
             'A tibble: 2,928 × 40',
             'A tibble: 2,928 × 48',
             'A tibble: 2,928 × 40',
             'A data frame: 6 × 5',
             'A data frame: 12 × 5',
             'A data frame: 6 × 5',
             'A data frame: 7 × 3',
             'A data frame: 6 × 3',
             'A data frame: 216 × 6',
             'A data frame: 18 × 11',
             'A data frame: 20 × 8',
             'A tibble: 27 × 8',
             'A data frame: 3 × 8',
             'A data frame: 10 × 7',
             'A data frame: 9 × 5',
             'A data frame: 10 × 4',
             'A tibble: 9 × 4',
             'A data frame: 27 × 11',
             'A data frame: 3 × 11',
             'A tibble: 12,000 × 12',
             'A data frame: 6 × 6',
             'A data frame: 12 × 6',
             'A data frame: 12 × 6',
             'A data frame: 8 × 5',
             'A data frame: 5 × 3'),
    description = c(
        'Conjoint questions for the pilot survey.',
        'Longer version of choice data about charging.',
        'Longer version of choice data about main demography.',
        'Longer version of choice data about education and income.',
        'Longer version of choice data about vehicle.',
        'Original wider version of choice data.',
        'Coefficients of the mxl_wtp model.',
        'Choice data for the mnl_pref model.',
        'Choice data for the mnl_pref_groups model.',
        'Choice data for the mxl_pref model.',
        'Data frame for mnl_pref plot.',
        'Data frame for mnl_pref_groups plot.',
        'Data frame for mxl_pref plot.',
        'Compare data between mnl_pref and mnl_wtp.',
        'Compare data between mnl_pref and mxl_wtp.',
        'CBC profiles of the conjoint survey.',
        'Scenarios of multiple attributes for market sensitivity analysis.',
        'Scenarios of upfront attribute for market sensitivity analysis.',
        'Multiple scenarios for maket simulation.',
        'Single scenarios for maket simulation.',
        'Sensitivity of cost.',
        'Sensitivity of multiple attributes.',
        'Sensitivity of upfront attribute.',
        'Cases of multiple market simulation.',
        'Multiple simulation using mnl_pref model.',
        'Single simulation using mnl_pref model.',
        'Survey data.',
        'Details of the mnl_pref model.',
        'Details of the mnl_pref_groups model.',
        'Details of the mxl_pref model.',
        'Data of tornado plot.',
        'WTP with CI calculated from mnl_pref model.'
        )
) %>% 
    mutate(count = row_number()) %>%
    select(count, everything())
dict_df %>% 
    kable(format = 'html',
          escape = FALSE,
          align = c('c', 'l', 'l', 'l'),
          col.names = c('No.', 'Name', 'Type', 'Description'),
          caption = 'Dictionary of Data Frames') %>%
    kable_styling(bootstrap_options = c('striped', 'hover'))