From self-driving cars to multimodal chatbots, there’s no question that artificial intelligence (AI) is making rapid progress. But behind these seemingly magical innovations is a set of fairly standard (and quite old) algorithms that have been refined and optimized over many years. If you want to understand AI better, you’ll definitely want to know the algorithms in this article.
First, what are AI algorithms? Simply put, they are mathematical procedures that enable machines to learn patterns from data. They come in different forms, including supervised learning, unsupervised learning, and reinforcement learning (RL).
Supervised learning algorithms learn from labeled examples, whereas unsupervised learning algorithms learn from unlabeled data. (Labeled data has been annotated with predefined target values; unlabeled data has not.) Reinforcement learning algorithms learn by trial and error, which has made them popular in game playing (chess, Go) and robotics.
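As a minimal sketch of that distinction (using scikit-learn with made-up toy data), a supervised model is fit on features and targets, while an unsupervised one sees only the features:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X = [[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0]]  # features
y = [0, 0, 1, 1]                                      # labels (target values)

LogisticRegression().fit(X, y)          # supervised: learns from X and y
KMeans(n_clusters=2, n_init=10).fit(X)  # unsupervised: learns from X alone
```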
The Algorithms
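Short, illustrative Python sketches for these algorithms follow the list, so you can see roughly what using each one looks like in code.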
- Artificial Neural Networks (ANNs): This one you’ve probably heard of. ANNs are inspired by the brain and are used for image recognition, speech recognition, and natural language processing. The basic idea is that you feed in data, and the network passes it through layers of artificial neurons. Each neuron takes in the outputs of the previous layer, computes its own output, and passes it on to the next layer. Deep learning uses ANNs with many layers and is the architecture of choice for nearly every high-profile AI application today. ANNs themselves were first implemented in the 1950s.
- Support Vector Machines (SVMs): SVMs are used for classification and regression problems. They work by finding the boundary (called a “hyperplane”) that best separates different groups of data points; with kernel functions, that boundary can even be curved in the original feature space. The hyperplane can then be used to predict which group a new data point belongs to. SVMs can tell you whether an email is spam and are widely used in areas such as bioinformatics, finance, and computer vision.
- Decision Trees: Decision trees are a supervised learning algorithm used to make predictions. They work by recursively partitioning the data into subsets, choosing at each split the feature value that best separates the targets.
- Random Forests: Random forests are an extension of decision trees. They improve prediction accuracy by combining the results of many decision trees, each trained on a random subset of the data and features.
- K-Means Clustering: K-means clustering is an unsupervised machine learning algorithm that partitions data points into K clusters (distinct subsets) based on their similarity. The value of K is chosen by the user or estimated with heuristics such as the elbow method. It is useful in areas such as image segmentation and document clustering.
- Gradient Boosting: Gradient boosting builds a predictive model by sequentially adding weak models (typically shallow decision trees), each one trained to correct the errors of the ensemble so far. It is used in web search ranking and online advertising.
- Convolutional Neural Networks (CNNs): CNNs are inspired by the visual cortex of the human brain and can automatically learn features such as edges and corners from images. While ANNs are general-purpose, CNNs are specialized networks designed to process grid-like data (such as the pixel grid of an image) and so are used for image and video processing.
- Long Short-Term Memory Networks (LSTMs): LSTMs are a type of recurrent neural network designed to handle sequential data such as speech and text, which makes them useful for speech recognition, machine translation, and handwriting recognition.
- Principal Component Analysis (PCA): PCA is a technique for reducing the dimensionality of data by projecting it onto a lower-dimensional space that preserves as much of the variance as possible. It is used in facial recognition and image compression.
- Apriori Algorithm: Apriori is an algorithm for association rule learning, a technique for discovering relationships between variables in large datasets by identifying items that frequently occur together. It is popular in market basket analysis for finding products that are often purchased together.
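The bullet points above describe what each algorithm does; the minimal Python sketches below show roughly what using them looks like. They rely on toy data and common open-source libraries (NumPy, scikit-learn, PyTorch), and every dataset and number in them is invented for illustration. First, the layered forward pass at the heart of an ANN, written by hand with NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)            # one input example with 3 features

# Each layer: every neuron computes a weighted sum of the previous
# layer's outputs, then applies a nonlinearity (here, ReLU).
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # 3 inputs -> 4 neurons
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)   # 4 neurons -> 2 outputs

h = np.maximum(0, W1 @ x + b1)    # hidden layer activations
out = W2 @ h + b2                 # network output (e.g., class scores)
print(out)
```

In a real network the weights would be learned by gradient descent rather than drawn at random; this only shows the layer-by-layer flow of data.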
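A hedged SVM sketch with scikit-learn; the two “spam” features here are invented purely for illustration:

```python
from sklearn.svm import SVC

# Toy "spam" features: e.g., [number of links, number of exclamation marks]
X = [[0, 1], [1, 0], [8, 9], [9, 7]]
y = [0, 0, 1, 1]                   # 0 = not spam, 1 = spam

clf = SVC(kernel="linear").fit(X, y)  # finds the separating hyperplane
print(clf.predict([[7, 8]]))          # which side is this new point on?
```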
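A sketch contrasting a single decision tree with a random forest, using scikit-learn’s built-in iris dataset:

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

tree = DecisionTreeClassifier(max_depth=3).fit(X, y)          # one tree
forest = RandomForestClassifier(n_estimators=100).fit(X, y)   # many trees, combined
print(tree.predict(X[:1]), forest.predict(X[:1]))
```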
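A K-means sketch with scikit-learn; the two Gaussian blobs stand in for any unlabeled data:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two blobs of points; no labels are given to the algorithm.
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])

km = KMeans(n_clusters=2, n_init=10).fit(X)  # K is chosen by the user
print(km.cluster_centers_)                   # the learned cluster centers
```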
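A gradient boosting sketch with scikit-learn; the synthetic dataset is just a placeholder:

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, random_state=0)

# Each new shallow tree is fit to the errors of the ensemble so far.
gb = GradientBoostingClassifier(n_estimators=100, max_depth=3).fit(X, y)
print(gb.score(X, y))
```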
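A minimal CNN in PyTorch (assuming PyTorch is installed); the 28×28 input mimics a small grayscale image:

```python
import torch
import torch.nn as nn

# A tiny CNN: convolution layers slide learned filters over the image grid.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),  # 1 input channel -> 8 filters
    nn.ReLU(),
    nn.MaxPool2d(2),                            # downsample 28x28 -> 14x14
    nn.Flatten(),
    nn.Linear(8 * 14 * 14, 10),                 # 10 class scores
)

image = torch.randn(1, 1, 28, 28)               # batch of one grayscale image
print(model(image).shape)                       # -> torch.Size([1, 10])
```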
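An LSTM sketch in PyTorch; the random tensor stands in for a real sequence such as audio frames or word embeddings:

```python
import torch
import torch.nn as nn

# An LSTM reads a sequence step by step, carrying a memory cell forward.
lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)

sequence = torch.randn(1, 10, 16)   # batch of 1, 10 time steps, 16 features
output, (h_n, c_n) = lstm(sequence)
print(output.shape)                 # one 32-dim output per step: [1, 10, 32]
```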
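A PCA sketch with scikit-learn, projecting the 4-dimensional iris data down to 2 dimensions:

```python
from sklearn.decomposition import PCA
from sklearn.datasets import load_iris

X, _ = load_iris(return_X_y=True)   # 4 features per flower

pca = PCA(n_components=2)           # keep the 2 directions of most variance
X_2d = pca.fit_transform(X)         # project 4-D data down to 2-D
print(X_2d.shape, pca.explained_variance_ratio_)
```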
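Finally, a from-scratch Apriori sketch in plain Python (a real project would likely reach for a library such as mlxtend); the baskets are toy transactions:

```python
# Toy market baskets (one set of items per transaction).
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
min_support = 0.5  # "frequent" = appears in at least 50% of baskets

def support(itemset):
    return sum(itemset <= b for b in baskets) / len(baskets)

# Apriori's key idea: a larger itemset can only be frequent if all of its
# subsets are, so candidates are grown level by level from frequent items.
items = sorted({i for b in baskets for i in b})
frequent = [frozenset([i]) for i in items if support({i}) >= min_support]
level = frequent
while level:
    candidates = {a | b for a in level for b in level if len(a | b) == len(a) + 1}
    level = [c for c in candidates if support(c) >= min_support]
    frequent += level

for itemset in frequent:
    print(set(itemset), support(itemset))
```

On this data it finds, among others, that bread and butter are frequently bought together, which is exactly the kind of rule market basket analysis is after.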
When you interact with AI, you are interacting with these algorithms (and many others). There is a tendency to anthropomorphize AI systems, but that isn’t necessary to understand them. It’s just math, and there are limitations. One limitation is the dependence on data: AI algorithms require vast quantities of high-quality data to be trained effectively. You need both quality and quantity. A person, by contrast, can learn something from a single example.
To achieve AI systems that are generally intelligent, at least one of the following needs to be true:
- The scaling hypothesis is correct (simply adding more data and compute will deliver artificial general intelligence, or AGI).
- Large language models (LLMs) represent a viable alternative path to general intelligence, distinct from the biological path (much as airplanes achieve flight without being built like birds).
- New algorithms and architectures emerge that enable AI systems to learn anything from one or a few examples (such a system might require a cohesive world model and virtual or physical embodiment).
What Have We Learned?
AI, while incredibly powerful, is a set of optimized algorithms built on well-established mathematics, probability, and statistics. There is no agreement on the point at which (if ever, with current approaches) an AI-based information processing system becomes generally intelligent and exceeds the human mind. What is clear is that we’re entering a new era, and the growing demand for automation means that AI will change the world as we know it.