Predicting whether Titanic passengers survived, using the Titanic datasets, the R language, and random forest
This program explains how to use the Titanic datasets to predict whether a passenger survived or died. The variables are:
survival (0 = No, 1 = Yes)
pclass (1 = 1st, 2 = 2nd, 3 = 3rd)
age
sex
sibsp (siblings / spouses aboard the Titanic)
parch (parents / children aboard the Titanic)
ticket (ticket number)
fare (passenger fare)
cabin (cabin number)
embarked (port of embarkation: C = Cherbourg, Q = Queenstown, S = Southampton)
The steps used to train the model on Windows 10 in the RStudio Desktop (R-Studio 8.12 build 175481) environment are described below.
Visit https://rstudio.com/products/rstudio/ and download and install RStudio Desktop.
Visit https://www.kaggle.com/c/titanic/data and download train.csv and test.csv. Place the downloaded files in a folder (say, titanic) on any drive.
Load the training and testing data sets into RStudio:
train <- read.csv("train.csv", header = TRUE)
test <- read.csv("test.csv", header = TRUE)
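The two read.csv calls above assume that the R working directory is the folder holding the downloaded files. As a minimal sketch, the loading step could be wrapped in a helper that takes the folder explicitly and fails with a clear message when a file is missing. The name load_titanic and the example path "C:/titanic" are assumptions made for illustration only.

```r
# Hypothetical helper: read train.csv and test.csv from one folder.
load_titanic <- function(data_dir) {
  paths <- file.path(data_dir, c("train.csv", "test.csv"))
  missing_files <- paths[!file.exists(paths)]
  if (length(missing_files) > 0) {
    stop("File(s) not found: ", paste(missing_files, collapse = ", "))
  }
  list(
    train = read.csv(paths[1], header = TRUE, stringsAsFactors = FALSE),
    test  = read.csv(paths[2], header = TRUE, stringsAsFactors = FALSE)
  )
}

# Example usage; "C:/titanic" is an assumed location, adjust as needed.
# titanic <- load_titanic("C:/titanic")
# str(titanic$train)   # inspect column names and types
```

Calling str() on the loaded training set is a quick way to confirm that the columns listed earlier were read as expected.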
If a passenger's title is "Mr" and he is in the third class, he is most likely to perish.
A passenger with the title "Mr" is more likely to perish whether he is in the 1st, 2nd, or 3rd class.
Most of the females are in the 1st and 2nd class.
Females in the 1st and 2nd class are more likely to survive.
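The observations above rely on a title for each passenger, which is not a column in the Kaggle files and has to be derived. A minimal sketch, assuming the title is embedded in the name string in the form "Surname, Title. Given names" and that every title other than Mr, Mrs, Miss, and Master is grouped as "Other". The function name extract_title is hypothetical, and the last example name is made up.

```r
# Hypothetical helper: map one passenger name to a title label.
extract_title <- function(name) {
  if (grepl("Master.", name, fixed = TRUE)) return("Master.")
  if (grepl("Miss.", name, fixed = TRUE)) return("Miss.")
  if (grepl("Mrs.", name, fixed = TRUE)) return("Mrs.")
  if (grepl("Mr.", name, fixed = TRUE)) return("Mr.")
  "Other"  # assumption: all remaining titles share one level
}

# Example names in the assumed format; "Doe, Dr. John" is invented.
example_names <- c(
  "Braund, Mr. Owen Harris",
  "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
  "Heikkinen, Miss. Laina",
  "Palsson, Master. Gosta Leonard",
  "Doe, Dr. John"
)

titles <- vapply(example_names, extract_title, character(1), USE.NAMES = FALSE)
print(titles)
```

On the real data the same vapply call could be applied to the name column of train, and the result stored as a factor.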
The pclass, title, sibsp, parch, and family size variables have an influence on passenger survival. The ticket number, fare, cabin, and embarked variables have no influence on passenger survival. Therefore pclass, title, sibsp, parch, and family size are used for building the machine learning model.
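Family size is also a derived variable. A minimal sketch, assuming it counts the passenger plus siblings/spouses plus parents/children, and that the data frame uses the Kaggle column names SibSp and Parch. The function name add_family_size and the toy rows are illustrative only.

```r
# Hypothetical helper: add a family.size column to a data frame.
add_family_size <- function(df) {
  stopifnot(all(c("SibSp", "Parch") %in% names(df)))
  # Assumption: the passenger counts as one family member.
  df$family.size <- df$SibSp + df$Parch + 1
  df
}

# Toy rows, not taken from the real data set.
toy <- data.frame(SibSp = c(1, 0, 8), Parch = c(0, 0, 2))
toy <- add_family_size(toy)
print(toy$family.size)  # family sizes 2, 1 and 11
```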
OOB estimate of error rate: 18.18% (minimum). The accepted variables pclass, title, and family.size predict the survival of a Titanic passenger.
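The out-of-bag (OOB) error comes from fitting a random forest on the chosen variables. The following is a sketch of how such a fit could look with the randomForest package, not necessarily the exact call used here. The helper names, ntree = 1000, and the seed are assumptions, and the derived columns title and family.size are assumed to already exist in train.

```r
# Hypothetical helper: fit a classification forest on selected features.
fit_survival_forest <- function(features, survived, ntree = 1000, seed = 1234) {
  if (!requireNamespace("randomForest", quietly = TRUE)) {
    stop("Package 'randomForest' is required: install.packages('randomForest')")
  }
  # randomForest needs factors, not character columns.
  features[] <- lapply(features, function(col) {
    if (is.character(col)) as.factor(col) else col
  })
  set.seed(seed)  # assumed seed, only for reproducibility
  randomForest::randomForest(x = features, y = as.factor(survived),
                             importance = TRUE, ntree = ntree)
}

# OOB error rate after the last tree has been added.
oob_error <- function(rf) {
  unname(rf$err.rate[rf$ntree, "OOB"])
}

# Example usage, assuming train already has title and family.size columns:
# features <- train[, c("Pclass", "title", "family.size")]
# rf <- fit_survival_forest(features, train$Survived)
# oob_error(rf)
# randomForest::varImpPlot(rf)   # compare variable importance
```

Refitting with different feature subsets and comparing oob_error() is one way to arrive at an accepted set of variables.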
The titles (Mr, Mrs, Miss, Master) have more value in predicting passenger survival than pclass (1, 2, 3), family size, and parch.
Titles of "Mr." and "Other" are predicted to perish, with an overall accuracy of 83.2%.
Titles of "Master.", "Miss.", and "Mrs." in 1st & 2nd class are predicted to survive
Titles of "Master.", "Miss.", and "Mrs." in 3rd class with family sizes equal to 5, 6, 8, & 11 are predicted to perish with 100% accuracy.
Titles of "Master.", "Miss.", and "Mrs." in 3rd class with family sizes not equal to 5, 6, 8, or 11 are predicted to survive with 59.6% accuracy.
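The four rules above can be transcribed directly into a small function. This is only a hand-written restatement of those rules, not the fitted model itself. The function name predict_by_rules and the "perish"/"survive" labels are chosen for illustration.

```r
# Hypothetical helper: apply the four rules to one passenger.
predict_by_rules <- function(title, pclass, family.size) {
  # Rule 1: "Mr." and "Other" are predicted to perish.
  if (title %in% c("Mr.", "Other")) return("perish")
  # Rule 2: remaining titles in 1st or 2nd class are predicted to survive.
  if (pclass %in% c(1, 2)) return("survive")
  # Rule 3: 3rd class with family size 5, 6, 8, or 11 is predicted to perish.
  if (family.size %in% c(5, 6, 8, 11)) return("perish")
  # Rule 4: all other 3rd-class passengers are predicted to survive.
  "survive"
}

print(predict_by_rules("Mr.", 1, 1))      # "perish"
print(predict_by_rules("Mrs.", 2, 3))     # "survive"
print(predict_by_rules("Miss.", 3, 6))    # "perish"
print(predict_by_rules("Master.", 3, 2))  # "survive"
```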
If a passenger has the title "Mr" or "Other": 91 instances of survival.
If the title is neither "Mr" nor "Other" (for example Mrs or Miss) and the passenger is in the 3rd class: with a large family size, none of them survive; with a small family size, 83 instances of survival.
If the title is neither "Mr" nor "Other" (for example Mrs or Miss) and the passenger is not in the 3rd class: 168 instances of survival.
The variables *pclass, title, & family.size* determine (predict) the survival of a Titanic passenger.