Appex 17 – Decision Trees
STA 363 - Spring 2023
Set up
Log in to RStudio Pro
- Note: if you are off campus, you will need to use a VPN to connect
- Go to rstudio.deac.wfu.edu
Step 1: Create a New Project
Click File > New Project
Step 2: Click “Version Control”
Click the third option.
Step 3: Click Git
Click the first option
Step 4: Copy my starter files
Paste this link in the top box (Repository URL):
https://github.com/sta-363-s23/appex-20.git
Part 1
- Pull in the application exercise files from: https://github.com/sta-363-s23/appex-20.git
- Run install.packages("keras") once in the console
- Run keras::install_keras() once in the console
- When it asks if you want to install Miniconda, type Y
Part 2
- Create a recipe that uses all of the included variables to predict Salary
- Remove any missing data for the outcome, impute data for the remaining predictors (HINT: in step_naomit() set skip = FALSE to make sure the rows are also dropped when the recipe is applied to new data, such as the testing set)
- Make all nominal variables dummy variables
- Normalize all predictors using step_normalize()
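The recipe steps above could be sketched roughly as follows. This is one possible layout, not the official solution: it assumes the Hitters data is loaded from the ISLR package, uses mean imputation because the exercise does not name a method, and the object name hitters_rec is made up for illustration.

```r
library(tidymodels)
library(ISLR)  # assumption: Hitters is loaded from ISLR (ISLR2 also ships it)

hitters_rec <- recipe(Salary ~ ., data = Hitters) %>%
  # drop rows with a missing outcome; skip = FALSE keeps this step active in bake()
  step_naomit(Salary, skip = FALSE) %>%
  # assumption: mean imputation for numeric predictors (no method is specified)
  step_impute_mean(all_numeric_predictors()) %>%
  # turn every nominal predictor into dummy (indicator) columns
  step_dummy(all_nominal_predictors()) %>%
  # center and scale all predictors, which are all numeric at this point
  step_normalize(all_predictors())
```

Placing step_dummy() before step_normalize() means the dummy columns are normalized along with the numeric predictors, which matches "normalize all predictors".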
Part 3
- Using the Hitters data, split into 2/3 training and 1/3 testing data
- Create datasets for the training and testing data
- Run the recipe on the training and testing data
- Extract just the predictors into a data frame and then convert this into a matrix for both the training and testing pre-processed data
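A minimal sketch of the split, prep, and bake steps above, under a few assumptions: Hitters comes from the ISLR package, the seed is arbitrary, the recipe uses mean imputation, and all object names (hitters_split, x_train, and so on) are hypothetical.

```r
library(tidymodels)
library(ISLR)  # assumption: Hitters is loaded from ISLR

set.seed(363)  # arbitrary seed so the split is reproducible
hitters_split <- initial_split(Hitters, prop = 2 / 3)
hitters_train <- training(hitters_split)
hitters_test  <- testing(hitters_split)

# recipe as described in Part 2, here built on the training data
hitters_rec <- recipe(Salary ~ ., data = hitters_train) %>%
  step_naomit(Salary, skip = FALSE) %>%
  step_impute_mean(all_numeric_predictors()) %>%  # assumption: mean imputation
  step_dummy(all_nominal_predictors()) %>%
  step_normalize(all_predictors())

# estimate the preprocessing on the training data, then apply it to both sets
prepped     <- prep(hitters_rec, training = hitters_train)
train_baked <- bake(prepped, new_data = hitters_train)
test_baked  <- bake(prepped, new_data = hitters_test)

# predictors as a data frame first, then as a matrix; outcome kept separately
x_train <- train_baked %>% select(-Salary) %>% as.matrix()
y_train <- train_baked$Salary
x_test  <- test_baked %>% select(-Salary) %>% as.matrix()
y_test  <- test_baked$Salary
```

Because the recipe is prepped on the training data only, the testing data is normalized with the training means and standard deviations.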
Part 4
- Fit a model with 40 activations in a hidden layer with 50% dropout and a batch size of 30
- Run your model for 1,000 iterations (epochs)
- Use your testing data as validation
- Plot the output
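One way the model above might be set up with keras is sketched below. The helper name fit_hitters_nn and its arguments are hypothetical, and the ReLU activation, mean squared error loss, and RMSprop optimizer are assumptions, since the exercise only fixes the 40 hidden units, 50% dropout, batch size of 30, and 1,000 epochs.

```r
library(keras)

# Hypothetical helper: expects predictor matrices and numeric outcome vectors,
# such as those built in Part 3.
fit_hitters_nn <- function(x_train, y_train, x_test, y_test) {
  model <- keras_model_sequential() %>%
    # one hidden layer with 40 units (assumption: ReLU activation)
    layer_dense(units = 40, activation = "relu",
                input_shape = ncol(x_train)) %>%
    # randomly drop 50% of the hidden activations during training
    layer_dropout(rate = 0.5) %>%
    # single output unit for the numeric outcome (Salary)
    layer_dense(units = 1)

  # assumption: mean squared error loss with the RMSprop optimizer
  model %>% compile(
    loss = "mse",
    optimizer = optimizer_rmsprop(),
    metrics = list("mean_absolute_error")
  )

  # 1,000 epochs, batch size 30, testing data used as validation
  model %>% fit(
    x_train, y_train,
    epochs = 1000,
    batch_size = 30,
    validation_data = list(x_test, y_test)
  )
}

# Usage, assuming the Part 3 matrices exist:
# history <- fit_hitters_nn(x_train, y_train, x_test, y_test)
# plot(history)
```

The object returned by fit() records the training and validation metrics per epoch, so plot(history) shows both curves side by side.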