Class 5 - Deep Dive Into AI
Traditional machine learning relies on manual feature engineering, where the data scientist
must carefully select and transform features to improve model performance.
https://www.ibm.com/topics/artificial-intelligence
https://mitsloan.mit.edu/ideas-made-to-matter/machine-learning-explained
Deep Learning
https://www.ibm.com/topics/artificial-intelligence
Machine Learning vs Deep Learning
CNNs are a specific type of neural network composed of node layers: an input layer, one or
more hidden layers, and an output layer. Each node connects to others and has an
associated weight and threshold.
https://www.linkedin.com/pulse/what-convolutional-neural-network-cnn-deep-learning-nafiz-shahriar/
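To make the layer structure concrete, here is a minimal CNN sketch. It assumes PyTorch; the layer sizes, 28x28 grayscale input, and 10 output classes are illustrative choices, not from the slides:

    import torch
    import torch.nn as nn

    # Input layer -> hidden (convolutional) layers -> output layer.
    model = nn.Sequential(
        nn.Conv2d(1, 16, kernel_size=3, padding=1),   # hidden layer 1: learned weights
        nn.ReLU(),                                    # threshold-like nonlinearity
        nn.MaxPool2d(2),                              # 28x28 -> 14x14
        nn.Conv2d(16, 32, kernel_size=3, padding=1),  # hidden layer 2
        nn.ReLU(),
        nn.MaxPool2d(2),                              # 14x14 -> 7x7
        nn.Flatten(),
        nn.Linear(32 * 7 * 7, 10),                    # output layer: class scores
    )

    x = torch.randn(1, 1, 28, 28)  # one fake grayscale image
    print(model(x).shape)          # torch.Size([1, 10])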
Recurrent neural networks (RNNs)
RNNs use sequential or time-series data.
RNNs use their “memory”: they carry information from prior inputs forward to influence the
current input and output.
RNNs: A Simplified Explanation
https://mohamedbakrey094.medium.com/all-about-recurrent-neural-network-rnn-a236ad5c59f4
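A minimal sketch of that “memory” in PyTorch: the hidden state returned at each step carries information from earlier inputs into later ones. All sizes are illustrative assumptions:

    import torch
    import torch.nn as nn

    # The hidden state h is the network's "memory" across time steps.
    rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)

    seq = torch.randn(1, 5, 8)  # one sequence of 5 time steps
    out, h = rnn(seq)           # h is the memory after the final step
    print(out.shape, h.shape)   # (1, 5, 16) step outputs, (1, 1, 16) final state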
RNN Applications
• Language translation: Translating text from one language to another.
• Speech recognition: Converting spoken words into text.
• Generating text: Creating new text, such as poems or articles.
• Time series analysis: Predicting future trends based on past data.
Autoencoders and variational autoencoders (VAEs)
Deep learning extended autoencoders beyond numerical data to the analysis of images,
speech, and other complex data types.
VAEs are used to generate realistic images and speech, a cornerstone of what we think of
as generative AI.
Variational autoencoders added the critical ability not just to reconstruct data, but also
to output variations on the original data.
Two types:
Regular autoencoders: These try to reconstruct the input data as accurately as possible.
VAEs: These not only try to reconstruct the input data but also generate new, similar data.
How does it work?
1. Encoder: A module that compresses the input data into an encoded representation
that is typically several orders of magnitude smaller than the input.
2. Bottleneck: A module that contains the compressed knowledge
representation.
3. Decoder: A module that helps the network “decompress” the knowledge
representation and reconstruct the data from its encoded form.
The output is then compared with a ground truth.
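A minimal sketch of these three modules in PyTorch; the layer sizes and 784-dimensional (flattened 28x28) input are assumptions for illustration. A VAE would additionally learn a distribution at the bottleneck and sample from it to produce variations:

    import torch
    import torch.nn as nn

    # 1. Encoder: compress the input to a much smaller representation.
    encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(),
                            nn.Linear(128, 16))       # 2. Bottleneck: 16 numbers
    # 3. Decoder: reconstruct the input from the bottleneck.
    decoder = nn.Sequential(nn.Linear(16, 128), nn.ReLU(),
                            nn.Linear(128, 784))

    x = torch.randn(4, 784)                  # e.g. flattened images
    recon = decoder(encoder(x))              # reconstruction
    loss = nn.functional.mse_loss(recon, x)  # compare output with the ground truth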
Autoencoders & VAEs: A Simplified Explanation
https://www.v7labs.com/blog/autoencoders-guide
Autoencoders & VAEs - Applications
Image Compression: Reducing the size of images while preserving their
quality.
Generative Adversarial Networks (GANs)
https://developers.google.com/machine-learning/gan/gan_structure
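Per the linked Google page, a GAN pairs a generator, which creates samples from random noise, with a discriminator, which learns to tell generated samples from real ones. Below is a sketch of one adversarial training step in PyTorch; the networks, sizes, and stand-in data are illustrative assumptions:

    import torch
    import torch.nn as nn

    # Generator maps noise to fake samples; discriminator scores real vs. fake.
    G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
    D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))

    bce = nn.BCEWithLogitsLoss()
    real = torch.randn(16, 2) + 3.0  # stand-in "real" data
    fake = G(torch.randn(16, 8))     # generated samples

    # Discriminator: label real as 1, fake as 0.
    d_loss = bce(D(real), torch.ones(16, 1)) + \
             bce(D(fake.detach()), torch.zeros(16, 1))
    # Generator: try to make the discriminator call fakes real.
    g_loss = bce(D(fake), torch.ones(16, 1))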
GAN applications
• Creating realistic images: GANs can be used to generate new images that look like real
photos or paintings.
• Generating music: GANs can create new music pieces in different styles.
• Enhancing images: GANs can be used to improve the quality of images or videos.
Limitations:
• Training a GAN can be time-consuming and computationally expensive.
• GANs may generate images or other data that don't make sense in the real world.
Diffusion Models
Diffusion models are generative models that are trained using the forward and reverse
diffusion process of progressive noise-addition and denoising.
They generate data similar to the data on which they are trained: during training, the data
is progressively overwritten with noise, and the model learns to reverse that corruption.
A diffusion model learns to minimize the difference between its generated samples and the
desired target.
Diffusion models have the advantage of not requiring adversarial training, which speeds up
learning and offers more control over the generation process.
Diffusion Models – How They Work
1. Add noise: The model starts by adding random noise to the data. This makes the data look
unrecognizable.
2. Learn to denoise: The model learns to remove the noise and recover the original data.
3. Generate new data: Once the model has learned to denoise the data, it can be used to generate
new, similar data.
https://encord.com/blog/diffusion-models/
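A toy version of steps 1 and 2 as one training step in PyTorch. The denoising network, noise level, and data are illustrative assumptions, and timestep conditioning used by real diffusion models is omitted for brevity:

    import torch
    import torch.nn as nn

    # Train a network to predict (and thus remove) the noise added to the data.
    denoiser = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))

    x0 = torch.randn(16, 2)     # clean training data
    noise = torch.randn_like(x0)
    alpha = 0.7                 # one step of an assumed noise schedule
    xt = alpha**0.5 * x0 + (1 - alpha)**0.5 * noise  # 1. add noise

    pred = denoiser(xt)                         # 2. learn to denoise
    loss = nn.functional.mse_loss(pred, noise)  # predicted vs. true noise

Generating new data (step 3) then works by starting from pure noise and applying the learned denoiser repeatedly.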
Advantages of diffusion models over other generative models
• Diffusion models are generally easier to train than other generative models, such as GANs.
• Diffusion models can generate very realistic images and other data.
• Mode collapse, where a model generates the same few samples over and over again, is a
problem for other generative models; diffusion models are less likely to suffer from it.
Image Generation Models
Well-known image generation models include GANs (e.g., BigGAN, ProGAN), VAEs (e.g., DRAW,
BetaVAE), and diffusion models (e.g., DALL-E 2, Stable Diffusion, Imagen).
Art and design: Creating new forms of art and design, such as
digital paintings and fashion designs.
Entertainment: Generating special effects and animations for
movies and games.
Product development: Designing new products and prototypes.
Transformer Models
Transformer models combine an encoder-decoder architecture with an attention-based
text-processing mechanism.
An encoder converts raw, unannotated text into representations known as embeddings;
the decoder takes these embeddings together with the model's previous outputs and
successively predicts each word in a sentence.
The encoder learns how words and sentences relate to each other, building up a powerful
representation of language without having to label parts of speech and other grammatical
features.
After these powerful representations are learned, the models can later be specialized—
with much less data—to perform a requested task.