To analyze the Titanic dataset and summarize the main information it contains.
To predict which passengers survived the Titanic shipwreck and describe the passengers' survival rate.
The data about the Titanic comes mainly from the Encyclopedia Titanica (https://www.encyclopedia-titanica.org/). A description of the dataset is available on the Kaggle website, where we obtained the data (https://www.kaggle.com/c/titanic/data).
The data was compiled by a variety of researchers from primary sources produced at the time of the event, such as newspapers and photographs.
These sources include official inquiries in Britain and the USA and newspaper articles related to the sinking of the Titanic.
The event happened in 1912, when technology was far less sophisticated than today, which made it hard to collect proper data for analysis.
A couple of columns in the dataset had missing values, and some fields were invalid for the analysis.
We used a single table, the “passengers” table.
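As a minimal sketch of how the missing values and the survival rate could be checked on the passengers table, the Python example below counts empty fields per column and computes the share of survivors. It uses only the standard library. The sample rows are hypothetical and made up for illustration. The column names follow the Kaggle file layout, and the helper names `load_passengers`, `count_missing`, and `survival_rate` are our own assumptions, not part of any library.

```python
import csv
import io

# Hypothetical sample rows in the layout of the Kaggle Titanic table;
# the values are made up for illustration and are not real passenger records.
SAMPLE_CSV = """PassengerId,Survived,Pclass,Sex,Age,Cabin,Embarked
1,0,3,male,22,,S
2,1,1,female,38,C85,C
3,1,3,female,,,S
4,0,2,male,35,,
"""


def load_passengers(source):
    """Read the passengers table from a file-like object into a list of dicts."""
    return list(csv.DictReader(source))


def count_missing(rows):
    """Return, for each column, how many rows have an empty value."""
    counts = {column: 0 for column in rows[0]}
    for row in rows:
        for column, value in row.items():
            # DictReader yields None for fields absent from a short row.
            if value is None or value.strip() == "":
                counts[column] += 1
    return counts


def survival_rate(rows):
    """Return the fraction of rows whose Survived field is '1'."""
    survived = sum(1 for row in rows if row["Survived"] == "1")
    return survived / len(rows)


passengers = load_passengers(io.StringIO(SAMPLE_CSV))
missing = count_missing(passengers)
rate = survival_rate(passengers)

print(missing)
print(f"Survival rate in the sample: {rate:.2f}")
```

To run the same check on the real data, the in-memory sample can be replaced with the downloaded file, for example `open("train.csv", newline="")`, assuming the Kaggle training file was saved locally under that name.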