Preface & Introduction to Machine Learning
ML From Scratch Series
Suitable for those who want to understand the theory behind machine learning.
Why I Wrote This Article
This document aims to consolidate and deepen my understanding of the theory behind machine learning. In my daily work, I often reach directly for packages like TensorFlow or PyTorch to get the results I need, and in doing so I frequently overlook or forget the knowledge that underlies these tools.
With roughly 2–3 years of experience in this field, I decided in 2023 to write down my own understanding of the theory behind machine learning.
My goal is to internalize this knowledge and, through these notes, provide a reference for others or motivate myself to delve deeper. If you have any questions or concerns about the content, please feel free to contact me.
What is Machine Learning?
Machine learning is one of the methods to achieve Artificial Intelligence (AI). It involves using designed algorithms to analyze data and make decisions or predictions after learning from that data.
As Professor Hung-yi Lee puts it:
Machine learning can be simplified into three stages:
1. Defining a set of functions for the dataset (Define a set of functions)
2. Evaluating the goodness of different functions through specific methods (Goodness of function)
3. Selecting the best function based on the results of the second step (Pick the best function)
For example, suppose we need to predict the output for an unseen input, given known data points such as (2, 10) and (9, 31). Following the three steps above:
- We can define the set of candidate functions as the linear functions y = ax + b.
- Using the known data, we find that the function fits both points exactly when a = 3 and b = 4.
- Given an unseen input x = 7, we can use this function to predict the output y = 3 × 7 + 4 = 25.
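The three steps above can be sketched in a few lines of Python. This is only a minimal illustration, not Professor Lee's formulation: the integer search range for a and b and the use of squared error as the measure of goodness are assumptions chosen to keep the example small.

```python
# Known data points (x, y) from the example above.
data = [(2, 10), (9, 31)]

# Step 1: define a set of functions. Here: all lines y = a*x + b with
# integer a and b in a small, arbitrarily chosen range (an assumption).
candidates = [(a, b) for a in range(-10, 11) for b in range(-10, 11)]


# Step 2: goodness of function, measured as the sum of squared errors
# (lower is better).
def squared_error(params, points):
    a, b = params
    return sum((a * x + b - y) ** 2 for x, y in points)


# Step 3: pick the best function, i.e. the candidate with the lowest error.
best_a, best_b = min(candidates, key=lambda p: squared_error(p, data))


def predict(x):
    """Predict y for an unseen x using the selected function."""
    return best_a * x + best_b


print(best_a, best_b)  # 3 4
print(predict(7))      # 25
```

In practice the set of functions is far too large to enumerate, which is why later steps usually rely on optimization methods rather than a brute-force search like this one.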
Machine Learning and Its Relationships
- Artificial Intelligence:
Machine learning is not equivalent to artificial intelligence; it is just one method to achieve AI.
- Deep Learning:
Deep learning is a subset of machine learning that uses multi-layer artificial neural networks to learn from data, for tasks such as classification.
- Data Science:
In the field of data science, machine learning is used for data analysis. However, data science encompasses more than just machine learning; it also covers collecting, processing, and interpreting data to extract practical value from it. Machine learning is just one tool within data science.
Types of Machine Learning
Machine learning can be categorized into the following types:
- Supervised Learning:
This involves training a model with labeled data and using it to predict outcomes for new, unseen data.
- Semi-Supervised Learning:
In this approach, a model is trained with a small amount of labeled data together with a larger amount of unlabeled data.
- Unsupervised Learning:
Unsupervised learning builds models from unlabeled data, and the model groups input data based on patterns it identifies.
- Reinforcement Learning:
This type involves learning through interactions with an environment to maximize expected rewards. In most cases, the problem is modeled as a Markov decision process.
Supervised learning, especially in the forms of classification and regression, is the most common type of machine learning.
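To make the difference between labeled and unlabeled data concrete, here is a small sketch in plain Python. The data, the label names, and the choice of a 1-nearest-neighbor classifier and a two-cluster k-means are all illustrative assumptions, not something the categories above prescribe.

```python
# Supervised: each example carries a label; predict with 1-nearest neighbor.
labeled = [(1.0, "small"), (1.5, "small"), (8.0, "large"), (9.0, "large")]


def predict_label(x):
    """Return the label of the closest labeled example."""
    nearest = min(labeled, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


# Unsupervised: the same features without labels; group them into two
# clusters with a tiny 1-D k-means (k = 2).
unlabeled = [1.0, 1.5, 8.0, 9.0]


def two_means(points, iterations=10):
    centers = [min(points), max(points)]  # simple deterministic initialization
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            # Assign each point to the nearer of the two centers.
            idx = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            groups[idx].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


print(predict_label(2.0))          # small
centers, groups = two_means(unlabeled)
print(centers)                     # [1.25, 8.5]
print(groups)                      # [[1.0, 1.5], [8.0, 9.0]]
```

The supervised model can name its prediction because the names were given in the training data. The unsupervised model only discovers that there are two groups; what those groups mean is left to the reader.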
Machine Learning Algorithms
Here are some of the major machine learning algorithms, grouped by learning type:
Supervised Learning
- Classification
- Regression
  - Linear Regression
  - Polynomial Regression
  - Support Vector Regression
  - Decision Tree Regression
  - Random Forest Regression
  - Ridge Regression
  - Lasso Regression
- Logistic Regression
- Neural Network
  - Perceptron
  - Feed Forward Neural Network
  - Multilayer Perceptron
  - Convolutional Neural Network
  - Recurrent Neural Network
  - Long Short-Term Memory Recurrent Neural Network
  - Sequence-to-Sequence Models
- Naive Bayes
- Support Vector Machine
- K-Nearest Neighbors
Semi-Supervised Learning
- Transductive Learning
- Inductive Learning
- Generative Adversarial Network
- Self-training
- Co-training
Unsupervised Learning
- K-Means
- Autoencoder
- Gaussian Mixture Model
Reinforcement Learning
- Q Learning
- Deep Q-Network
- Actor-Critic
- Policy Gradient
- Proximal Policy Optimization
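As a small illustration of the reinforcement learning entries above, the following sketch runs tabular Q-learning on a toy Markov decision process. The environment (a five-state corridor with a reward at the right end), the hyperparameter values, and all function names are hypothetical choices made for this example only.

```python
import random

N_STATES = 5      # states 0..4; state 4 is terminal (hypothetical toy environment)
ACTIONS = (0, 1)  # 0 = move left, 1 = move right
GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.2  # discount, learning rate, exploration rate


def step(state, action):
    """Deterministic transition; reward 1 only when reaching the last state."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection, with random tie-breaking.
            if rng.random() < EPSILON or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Q-learning update: move Q toward reward + discounted best next value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy for the non-terminal states: 1 means "move right".
policy = [0 if left > right else 1 for left, right in q_table[:-1]]
print(policy)  # [1, 1, 1, 1]
```

The agent is never told that moving right is correct; it learns this only from the rewards it receives while interacting with the environment.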