Data cleaning (or cleansing) is often said to take about 80% of the data analysis process, and it must be repeated for every new data set in every project. Data sets obtained from real-world problems typically violate the standards of clean data in different ways, and they cannot be analyzed reliably until they have been cleaned. When cleaning data, we try to remove as many of these problems as possible and organize the values in a standard manner.
In this workshop, we focus on a small number of core aspects of data cleaning, including:
- Importing and exporting data while avoiding problems such as:
  - Column headers that are values, not variable names
  - Data stored with the wrong type, which then has to be changed
  - Multiple variables stored in one column
- Detecting and locating errors such as:
  - Missing values (and how to impute them)
  - Special values
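As a rough illustration of the problems listed above, the base R sketch below works through a tiny made-up table. The data, the column names (id, site, sex, value) and the use of mean imputation are all assumptions chosen for the example, not the workshop's actual material.

```r
# Hypothetical toy data: the headers "2019" and "2020" are values of a
# "year" variable, and "id" packs two variables (site and sex) together.
wide <- data.frame(
  id     = c("A_f", "B_m"),
  `2019` = c(10, NA),
  `2020` = c(12, Inf),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# 1. Column headers are values: stack the year columns into long format.
years <- c("2019", "2020")
long <- data.frame(
  id    = rep(wide$id, times = length(years)),
  year  = rep(years, each = nrow(wide)),
  value = unlist(wide[years], use.names = FALSE),
  stringsAsFactors = FALSE
)

# 2. Changing type of data: year was read as text, so convert it to integer.
long$year <- as.integer(long$year)

# 3. Multiple variables in one column: split "id" into site and sex.
parts <- strsplit(long$id, "_", fixed = TRUE)
long$site <- vapply(parts, `[`, character(1), 1)
long$sex  <- vapply(parts, `[`, character(1), 2)
long$id   <- NULL

# 4. Locate missing values (NA) and special values (Inf, -Inf, NaN).
long$is_missing <- is.na(long$value)
long$is_special <- !is.na(long$value) & !is.finite(long$value)

# 5. One simple imputation choice (an assumption, not a recommendation):
#    replace missing values with the mean of the finite values.
finite_mean <- mean(long$value[is.finite(long$value)])
long$value_imputed <- ifelse(long$is_missing, finite_mean, long$value)

print(long)
```

Packages such as tidyr offer more convenient tools for the reshaping and splitting steps; base R is used here only to keep the sketch free of extra dependencies.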
Note: Please bring your own fully charged laptop with the latest versions of R and RStudio installed.