Due: Sep 07 by 11:59pm

Weight: This assignment is worth 3% of your final grade.

Purpose: The purpose of this assignment is to get more familiar with R and RStudio and to develop some basic strategies for working with data in R.

Assessment: This assignment is graded using a check system:

Notice that this is essentially a pass/fail system. I’m not grading your writing ability and I’m not counting the number of words you write - I’m looking for thoughtful engagement.

1. Software

If you haven’t yet, go to the Course Software page and install all the software we’ll need for this course. You’ll need these tools for this assignment.

2. Getting Organized

Download and edit this template when working through this assignment.

3. Readings

Open up a notebook (physical, digital…whatever you take notes in best), and take notes while you go through these readings:

  1. Getting Familiar with the Course: Follow Snoop’s advice and read the entire Course Syllabus (actually read the whole thing). Then review the schedule and make sure to note important upcoming deadlines.
  2. Basics [Optional]: Read through Lessons 1 “Getting Started” and 2 “Data Types & Vectors” in the R4A Primer to get more familiar with the basics. You may also want to watch the recording of last week’s class on Blackboard (see the “Zoom Recordings” section).
  3. Data Frames & Data Wrangling: Read through Lessons 3 “Data Frames” and 4 “Data Wrangling” in the R4A Primer to get more familiar with working with data sets in R.

4. Exercises

RStudio offers many excellent primers to get up and running quickly in R. Working through these exercises will help prepare you for class next week:

  1. Programming Basics [Optional]
  2. Working with Data
  3. Isolating Data with dplyr
  4. Derive Information with dplyr

5. Reflect & Submit

Reflect on what you’ve learned while going through these readings and exercises. Is there anything that jumped out at you? Anything you found particularly interesting or confusing? Write a few sentences in the template you downloaded for this assignment, then create a zip file of everything in your R Project folder and submit the zip file in the “Assignment Submission” page on Blackboard.


Extra Practice

Not required, but probably helpful, especially if you’re new to R.

Inspect data from other packages

Write R code to install the dslabs package from CRAN, then write code to load the library. Then, using some of the techniques we saw in class, write code to preview and inspect the movielens data frame that becomes available when you load the library. For each of the following questions, write code to find your answer and leave a detailed response in a comment:

  • What is this dataset about?
  • How many observations are in the data frame?
  • What is the original source of the data?
  • What type of data is each variable?
  • What are the years of the earliest and most recent observations in the data set?
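
The kinds of inspection calls that help with questions like these can be sketched on a data frame that ships with R. The example below uses the built-in mtcars data set as a stand-in (an assumption for illustration, not the exercise's data); the same calls should work on movielens once dslabs is installed and loaded.

```r
# Stand-in data frame from base R; swap in movielens after running
# install.packages("dslabs") and library(dslabs).
df <- mtcars

head(df)                     # first six rows
str(df)                      # compact overview: dimensions, names, and types

n_obs  <- nrow(df)           # number of observations (rows)
n_vars <- ncol(df)           # number of variables (columns)
types  <- sapply(df, class)  # data type of each variable
rng    <- range(df$mpg)      # smallest and largest value in one column

# For data that comes from a package, the help page usually describes
# the data set and its original source, e.g. ?mtcars
```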

Answer questions about the data

For each of the following questions, write code to find your answer and leave a detailed response in a comment:

  • What are the minimum, mean, and maximum ratings in the data set?
  • How many observations received the maximum rating?
  • What percentage of total observations received the maximum rating?
  • What is the title of the observation with the longest title (in terms of the number of characters in the title)?
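
One way to approach questions like these is sketched below on a tiny made-up data frame. The titles and ratings are hypothetical and are not taken from movielens.

```r
# Hypothetical toy data; replace with the real data frame for the exercise
films <- data.frame(
  title  = c("Up", "Alien", "The Princess Bride", "Heat"),
  rating = c(4.0, 5.0, 5.0, 3.5),
  stringsAsFactors = FALSE
)

# Minimum, mean, and maximum of a numeric column
rating_min  <- min(films$rating)
rating_mean <- mean(films$rating)
rating_max  <- max(films$rating)

# Count rows at the maximum, then express that as a percentage of all rows
n_max   <- sum(films$rating == rating_max)
pct_max <- 100 * n_max / nrow(films)

# Longest title by character count (nchar counts spaces and punctuation too)
longest_title <- films$title[which.max(nchar(films$title))]
```

If a column contains missing values, min(), mean(), and max() return NA unless you pass na.rm = TRUE.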

Installing packages from GitHub: the BRRR library

The vast majority of the time, you will install external packages using the install.packages() function. This installs packages from the Comprehensive R Archive Network (CRAN), where most packages are published. But you can also install packages that are under development or haven’t been published to CRAN yet. Most of the time, these packages are hosted on GitHub - an online platform for sharing code (it’s also where all of the files that make up this website are stored).

To install a package from GitHub, you first need to install the remotes library. Then you can use the remotes::install_github() function to install packages directly from GitHub. To try this out, install the remotes library, then try installing the BRRR package:

remotes::install_github("brooke-watson/BRRR")

Note: Packages on GitHub are in development and often require other packages to work. So if you get an installation error about some other package dependency, try restarting your R session and installing again.
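
Putting the two steps together, a minimal sketch might look like the following. The install_if_missing() helper is a hypothetical convenience function written for this example (it is not part of remotes or base R), and the installation lines only run in an interactive session with an internet connection.

```r
# Hypothetical helper: install a CRAN package only if it is not already available.
# Returns (invisibly) TRUE if the package can be loaded afterwards.
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Only attempt the installs when running interactively (e.g. in the RStudio console)
if (interactive()) {
  install_if_missing("remotes")                  # step 1: get remotes from CRAN
  remotes::install_github("brooke-watson/BRRR")  # step 2: install BRRR from GitHub
}
```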

Not sure what this package does? Well, one of the other nice things about packages listed on GitHub is that the authors tend to write detailed descriptions - check out the GitHub page for the BRRR package. Then try using the BRRR::skrrrahh() function with different numeric arguments (turn your volume up). In the #welcome channel on Slack, post your favorite argument to skrrrahh() (mine is 24).


EMSE 6035: Marketing Analytics for Design Decisions (Fall 2021)
Wednesdays | 6:10 - 8:40 PM | SEH 7040 | Dr. John Paul Helveston | jph@gwu.edu
LICENSE: CC-BY-SA