Data Wrangling
This section collects the work I produced for the SIT731 Data Wrangling unit. The unit focuses on data wrangling, a foundational skill in both Data Science (DS) and Artificial Intelligence (AI). As data is collected at scale from increasingly diverse sources, preparing it correctly is critical for accurate analysis and modelling. Throughout the unit, I learned to:
- Program in Python to perform a wide range of data wrangling tasks
- Extract data from various sources and formats
- Handle different data types, and store and retrieve data efficiently
- Apply sampling techniques and inspect data distributions
- Clean data by identifying and managing outliers, anomalies, and missing values
- Transform, select, and extract features for modelling
- Conduct exploratory data analysis and create visualisations
- Summarise data and perform basic statistical analysis
- Build simple machine learning models for initial insights
The unit also covered ethical considerations and data privacy techniques, ensuring responsible data handling practices.
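
As a rough illustration of the cleaning step listed above (managing missing values and outliers), here is a minimal sketch using only the Python standard library. It is not taken from the unit's coursework: the `clean_readings` function and the sample `readings` are hypothetical, and median imputation with the 1.5 × IQR rule is just one common choice among many.

```python
import statistics


def clean_readings(values):
    """Fill missing entries with the median, then split off IQR outliers.

    Hypothetical helper for illustration only; `None` marks a missing value.
    """
    observed = [v for v in values if v is not None]

    # Impute missing values with the median of the observed values.
    median = statistics.median(observed)
    filled = [median if v is None else v for v in values]

    # Quartiles of the observed data (default "exclusive" method).
    q1, _, q3 = statistics.quantiles(observed, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    # Anything outside [low, high] is treated as an outlier.
    kept = [v for v in filled if low <= v <= high]
    outliers = [v for v in filled if v < low or v > high]
    return kept, outliers


# Made-up sample data: two missing readings and one implausible spike.
readings = [12.0, 11.5, None, 12.3, 11.9, 98.0, 12.1, None, 11.8]
kept, outliers = clean_readings(readings)
print("kept:", kept)
print("outliers:", outliers)
```

In practice a library such as pandas would usually handle this, and whether to drop, cap, or keep outliers depends on the dataset and the question being asked.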