Introduction to Machine Learning

Exploring Machine Learning: Basics and Applications

Machine learning is a rapidly growing field that continues to evolve and impact various aspects of our lives. In this article, we will discuss the basics of machine learning and why it is essential in today’s data-driven world.

Tom Mitchell, a prominent machine learning researcher, defines the concept as:

A computer program is said to learn from experience E with respect to some class of tasks T, and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Mitchell, Tom M. Machine Learning. McGraw-Hill, 1997.

How Machine Learning Works

Consider a scenario where you want your computer to play tic-tac-toe (the task T) and have played 100 games with it so far (the experience E). The performance P is measured by the fraction of games the computer wins. If that fraction rises as you play more games with the computer, the program is learning in Mitchell’s sense.

Machine learning employs training data to learn a model. Imagine a human baby who is naturally curious about their surroundings. Their primary way of learning is to ask questions like, “What is this?” and learn from the answers provided.

When a parent or caregiver answers, the child learns to recognize patterns and distinguish between different objects. For instance, if the child first encounters a Persian cat and later sees a Sphynx cat, they might ask about the animal again. Over time, the child learns to identify various breeds of cats without needing to ask.

Analogy to Machine Learning

In machine learning, instead of a child, we have a computer, and instead of a parent, we have a dataset with known answers. The machine is first trained by being introduced to an image of a cat (from the dataset). It then queries the dataset (asking what kind of cat it is) and receives the answer. This process of querying the dataset is called training or learning, and the data used for this purpose is called training data.

After training, if a new image is shown to the machine, it can identify the type of cat. This process is called testing, and the data used for this purpose is called testing data.
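
The split between training data and testing data can be sketched in a few lines of Python. This is only an illustration: the two features (fur length and weight) and all the numbers are invented, and the 1-nearest-neighbor rule used here is just one simple way to learn from labeled examples.

```python
import math

# Hypothetical training data: (fur_length_cm, weight_kg) -> breed label.
# The numbers are made up purely for illustration.
training_data = [
    ((6.0, 4.5), "persian"),
    ((5.5, 5.0), "persian"),
    ((0.1, 3.5), "sphynx"),
    ((0.2, 4.0), "sphynx"),
]

# Testing data: examples the model did not see during training.
testing_data = [
    ((5.8, 4.8), "persian"),
    ((0.3, 3.8), "sphynx"),
]


def predict(features, examples):
    """Return the label of the training example closest to `features`."""
    nearest = min(examples, key=lambda ex: math.dist(features, ex[0]))
    return nearest[1]


def accuracy(test_set, examples):
    """Fraction of test examples whose predicted label matches the true label."""
    correct = sum(predict(x, examples) == y for x, y in test_set)
    return correct / len(test_set)


print(accuracy(testing_data, training_data))
```

The important point is that accuracy is measured on examples that were held out of training, which is what tells you whether the model generalizes rather than memorizes.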

Types of Machine Learning Problems

Machine learning problems are categorized based on the type of data and problem definition:

Classification: Identifying the type of cat or distinguishing between a cat and a dog.
Regression: Predicting a numeric value, such as a future stock price.
Clustering: Grouping articles based on their properties, like all sports-related articles or weather news.
Ranking: Ranking CVs based on specific criteria.
Recommendation System: Suggesting movies based on viewing history.
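
Of the problem types listed above, regression is probably the easiest to show compactly. The sketch below fits a straight line to a handful of invented (day, price) pairs using ordinary least squares and then extrapolates one day ahead. The data is made up and perfectly linear, so this shows the mechanics only; it is not a realistic price model.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: price observed on days 1..5 (invented values).
days = [1, 2, 3, 4, 5]
prices = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(days, prices)
predicted_day_6 = slope * 6 + intercept
print(slope, intercept, predicted_day_6)
```

Unlike classification, the output here is a number on a continuous scale rather than one label out of a fixed set.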

Why Do We Need Machine Learning?

In a nutshell: Data is everywhere.

Machine learning is a subfield of the broader domain known as “Artificial Intelligence.” It involves using computer algorithms to learn from data, enabling machines to make decisions and predictions.

Although machine learning algorithms have existed for decades, recent advances in data collection and computational power have made it practical to apply them at scale and extract valuable insights.

Types of Machine Learning Algorithms

Various algorithms have been developed to solve different types of problems. Based on the nature of the data they learn from, they fall into three major categories:

1. Supervised Learning

Algorithms are provided with labeled data. For instance, if you have a basket filled with red and green balls, you train the computer to identify the ball’s color using labeled examples. Applications include image classification, stock-price predictions, and speech recognition.
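
As a rough sketch of the ball example, the code below learns one average color per label from a few labeled balls and then classifies new balls by the closest average. The RGB readings are invented, and this nearest-centroid rule is an assumption chosen for brevity, not the only way to do supervised learning.

```python
import math

# Hypothetical labeled examples: (R, G, B) readings with their known color.
labeled_balls = [
    ((220, 30, 40), "red"),
    ((200, 50, 30), "red"),
    ((30, 210, 50), "green"),
    ((50, 190, 40), "green"),
]


def train(examples):
    """Learn one average color (centroid) per label."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(v / counts[label] for v in total)
        for label, total in sums.items()
    }


def classify(model, features):
    """Predict the label whose centroid is closest to the given features."""
    return min(model, key=lambda label: math.dist(features, model[label]))


model = train(labeled_balls)
print(classify(model, (180, 80, 60)))   # a reddish ball not seen in training
print(classify(model, (70, 170, 60)))   # a greenish ball not seen in training
```

The labels are what make this supervised: the model is only as good as the known answers it was given during training.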

2. Unsupervised Learning

Algorithms do not rely on labeled data; instead, they detect patterns within the data themselves. An example is news clustering, where an algorithm groups news articles with similar content without being told the topics in advance. Google News is often cited as an application of this idea.
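
Grouping articles by content can be sketched with k-means, one common clustering algorithm. Each article is reduced here to two invented word counts, and the starting centroids are picked by hand so the run is deterministic. Both choices are simplifying assumptions made for illustration.

```python
import math

# Hypothetical, unlabeled articles described by two word counts:
# (number of sports-related words, number of weather-related words).
articles = [(9, 1), (8, 2), (10, 0), (1, 9), (2, 8), (0, 10)]


def k_means(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                mean = tuple(sum(c) / len(cluster) for c in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid alone
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return centroids, clusters


# Starting centroids are chosen by hand to keep the example deterministic.
centroids, clusters = k_means(articles, centroids=[(8, 2), (2, 8)])
print(clusters)
```

No labels are supplied anywhere. The two groups emerge from the distances between the points, and it is up to a human to notice afterwards that one group looks like sports and the other like weather.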

3. Reinforcement Learning

A reward-based technique where algorithms learn through trial and error. If they perform well, they receive rewards; if not, they are penalized. This approach is used in game playing (e.g., DeepMind’s AlphaGo) and in autonomous driving research.
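
The reward-and-penalty loop can be illustrated with tabular Q-learning on a toy problem. Everything below is a hypothetical setup chosen for brevity: an agent on a line of five positions earns a reward for reaching the rightmost one and pays a small penalty for every other move. The learning rate, discount, and exploration rate are arbitrary illustrative values.

```python
import random

N_STATES = 5          # positions 0..4 on a line; position 4 is the goal
ACTIONS = (-1, +1)    # step left or step right
GOAL_REWARD = 1.0     # reward for reaching the goal
STEP_PENALTY = -0.01  # small penalty for every move that does not reach it


def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, GOAL_REWARD, True
    return next_state, STEP_PENALTY, False


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: usually pick the best-known action, sometimes explore."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    # Break ties at random so the untrained agent does not always go left.
    return rng.choice([a for a, value in enumerate(q[state]) if value == best])


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning; returns the table of learned action values."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


q_table = q_learning()
# Greedy policy per non-goal state: index 0 means "left", 1 means "right".
policy = [values.index(max(values)) for values in q_table[:-1]]
print(policy)
```

With these settings the learned policy should prefer moving right in every non-goal state. The agent is never told that rule; it discovers it from the rewards and penalties it collects while acting.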

In this set of tutorials, we will delve into different problems, the datasets, and the suitable algorithms to solve these problems.