Machine Learning Fundamentals
The “Machine Learning Fundamentals” course provides a comprehensive overview of core concepts and techniques in machine learning. On the first day, students are introduced to the Jupyter Notebook environment, beginning with an optional introductory session for those new to it. The course then explores key concepts such as model representation, cost functions, gradient descent, and vectorization. Through a series of practical lessons, students gain hands-on experience with multiple linear regression, learning-rate tuning, and feature engineering using Python and the Scikit-Learn library.
On the second day, the course turns to classification, covering logistic regression, decision boundaries, the logistic loss, and gradient descent for logistic regression. Through interactive labs and coding exercises, participants implement these algorithms both from scratch and with Scikit-Learn. The day closes with essential concepts in model evaluation, including overfitting and regularization.
CODE: DSAI200
Category: Artificial Intelligence
Teaching methodology
The course includes hands-on laboratory sessions in which each student completes training exercises, gaining practical experience with the tools covered for each topic in the course.
Prerequisites
- Basic programming skills in Python
- Familiarity with fundamental concepts in linear algebra and calculus
- Understanding of basic statistics and probability theory
- Experience with data manipulation libraries such as NumPy and Pandas
- Knowledge of data visualization techniques using libraries like Matplotlib and Seaborn
The following is an overview of course content:
Day 1:
- Notebooks: Introduction to the Jupyter Notebook for interactive computing and data analysis.
- Model Representation: Understanding how machine learning models are represented and structured.
- Cost Function: Exploring the concept of cost functions and their role in model optimization.
- Gradient Descent: Learning the optimization algorithm used to minimize the cost function and update model parameters.
- Vectorization: Introduction to vectorization for efficient computation in machine learning algorithms.
- Multiple Linear Regression: Implementing and understanding multiple linear regression models for predicting continuous outcomes.
- Learning Rate: Exploring the significance of the learning rate parameter in gradient descent optimization.
- Feature Engineering: Techniques for selecting, transforming, and creating features to improve model performance.
- Linear Regression: Further exploration and application of linear regression models in machine learning tasks.
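To give a flavor of the Day 1 material, the following sketch (an illustration of the kind of exercise involved, not official course material; the toy dataset and coefficient values are our own assumptions) implements vectorized batch gradient descent for multiple linear regression in NumPy and compares the result against Scikit-Learn's `LinearRegression`:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy dataset: two features with a known linear relationship plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0 + rng.normal(scale=0.1, size=100)

# Vectorized batch gradient descent on the mean-squared-error cost.
w = np.zeros(2)
b = 0.0
alpha = 0.1                              # learning rate
for _ in range(1000):
    pred = X @ w + b                     # vectorized predictions for all samples
    err = pred - y
    w -= alpha * (X.T @ err) / len(y)    # gradient of the cost w.r.t. the weights
    b -= alpha * err.mean()              # gradient w.r.t. the bias

# The same fit with Scikit-Learn for comparison.
model = LinearRegression().fit(X, y)
print(w, b)                              # close to [3, -2] and 5
print(model.coef_, model.intercept_)
```

Note how the vectorized update (`X @ w`, `X.T @ err`) processes all 100 samples at once instead of looping over them, which is the point of the vectorization lesson.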
Day 2:
- Classification: Introduction to classification problems and the distinction between different classes.
- Logistic Regression: Understanding logistic regression models for binary classification tasks.
- Decision Boundary: Exploring the decision boundary concept and its significance in classification algorithms.
- Logistic Loss: Understanding the logistic loss function used in logistic regression for model evaluation.
- Cost Function for Logistic Regression: The specific cost function tailored for logistic regression models.
- Gradient Descent for Logistic Regression: Applying gradient descent optimization to logistic regression for parameter estimation.
- Logistic Regression using Scikit-Learn: Implementing logistic regression models using the Scikit-Learn library for Python.
- Overfitting: Understanding the concept of overfitting and its implications for model performance.
- Regularization: Techniques for preventing overfitting by adding penalty terms to the cost function.
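The Day 2 topics can be sketched in a few lines with Scikit-Learn (again an illustrative example under our own assumptions, not course material; the synthetic dataset is made up). The sketch fits a logistic regression classifier, evaluates the logistic loss by hand, and shows where regularization enters through the `C` parameter:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy binary classification: the class depends on the sign of a linear score,
# so the true decision boundary is the line 2*x0 - x1 = 0.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (2.0 * X[:, 0] - X[:, 1] > 0).astype(int)

# C is the INVERSE of the regularization strength: a smaller C means a
# stronger penalty on the weights, which helps combat overfitting.
clf = LogisticRegression(C=1.0).fit(X, y)

# The learned decision boundary is the line where the predicted probability
# is 0.5, i.e. where coef_[0] @ x + intercept_ = 0.
print(clf.coef_, clf.intercept_)
print(clf.score(X, y))                   # accuracy on the training data

# The logistic loss, computed by hand from the predicted probabilities.
p = clf.predict_proba(X)[:, 1]
log_loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
print(log_loss)
```

Rerunning with a smaller `C` (e.g. `C=0.01`) shrinks the learned weights, which is exactly the effect the regularization lesson examines.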
Students will acquire knowledge of:
- Jupyter Notebook.
- Concepts of model representation.
- Cost function.
- Gradient descent.
- Concept of vectorization.
- Multiple linear regression.
- Importance of learning rate.
- Feature engineering.
- Implementation of linear regression using Scikit-Learn.
- Assessment of skills acquired on the first day.
- Concepts of classification.
- Logistic regression.
- Concept of decision boundary.
- Logistic loss.
- Cost function for logistic regression.
- Gradient descent for logistic regression.
- Implementation of logistic regression using Scikit-Learn.
- Discussion on overfitting.
- Regularized cost and gradient.
- Assessment of skills acquired on the second day.
Duration – 2 days
Delivery – in Classroom, On Site, Remote
PC and software requirements:
- Internet connection
- Web browser (Google Chrome)
- Zoom
Language
- Instructor: English
- Workshops: English
- Slides: English