DL MID2 Bit Bank 2024-25
MULTIPLE CHOICE
3.Which of the following activation functions is often used in hidden layers for deep learning?
A.Sigmoid
B.ReLU
C.Softmax
D.Step Function
ANS: B
6.In convolutional neural networks (CNNs), what is the purpose of pooling layers?
A.To increase the size of the input
B.To reduce the dimensionality of the feature map
C.To fully connect the layers
D.To apply activation functions
ANS: B
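Note: a minimal sketch of how pooling shrinks a feature map, assuming 2x2 max pooling with stride 2 on a small hand-written input. The function name `max_pool_2x2` and the sample values are illustrative only.

```python
def max_pool_2x2(feature_map):
    """Apply 2x2 max pooling with stride 2 to a 2D list (even height and width assumed)."""
    pooled = []
    for r in range(0, len(feature_map), 2):
        row = []
        for c in range(0, len(feature_map[0]), 2):
            # Collect the four values covered by the current window.
            window = [
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 2],
    [7, 2, 9, 5],
    [0, 1, 3, 8],
]
pooled = max_pool_2x2(feature_map)  # the 4x4 input becomes a 2x2 output
```

Each output value summarises one 2x2 window, so height and width are both halved.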
7.Which type of neural network is typically used for sequence data, like time series or language?
A.Convolutional Neural Network (CNN)
B.Recurrent Neural Network (RNN)
C.Feedforward Neural Network (FNN)
D.Generative Adversarial Network (GAN)
ANS: B
8.The "vanishing gradient problem" is commonly associated with which type of neural network?
A.Convolutional Neural Networks (CNNs)
B.Recurrent Neural Networks (RNNs)
C.Feedforward Neural Networks
D.Graph Neural Networks
ANS: B
10.In representation learning, what is a commonly used model for dimensionality reduction?
A.Decision Trees
B.Autoencoders
C.Random Forests
D.k-Nearest Neighbours
ANS: B
12.Which of the following is a popular loss function for binary classification tasks?
A.Mean Squared Error (MSE)
B.Cross-Entropy Loss
C.Hinge Loss
D.L2 Regularization
ANS: B
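Note: a small sketch of binary cross-entropy written in plain Python, assuming labels in {0, 1} and predicted probabilities of the positive class. The helper name and the clipping constant `eps` are assumptions made for this illustration.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy over a batch of labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip the probability so that log(0) never occurs.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


confident = binary_cross_entropy([1, 0], [0.9, 0.1])  # good predictions, small loss
wrong = binary_cross_entropy([1, 0], [0.1, 0.9])      # bad predictions, large loss
```

The loss grows sharply when the model assigns a high probability to the wrong class.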
16.In convolutional neural networks, what is the primary role of convolutional layers?
A.To reduce dimensionality of data
B.To detect spatial hierarchies in the input data
C.To fully connect layers
D.To apply dropout to prevent overfitting
ANS: B
20.Which of the following terms refers to the depth of the input in a convolutional operation?
A.Channels
B.Filters
C.Stride
D.Pooling
ANS: A
21.If an input has multiple channels (e.g., RGB image), how does a convolutional layer process it?
A.By applying a single filter to each channel
B.By applying a separate filter to each channel and summing the results
C.By ignoring all but one channel
D.By converting it to grayscale before processing
ANS: B
22.In multichannel convolution, if the input has C channels, how many filters are typically applied?
A. 1
B. C
C.C * C
D.It depends on the number of desired output channels
ANS: D
23.What happens when you increase the number of filters in a convolutional layer?
A.The output spatial dimensions increase
B.The depth of the output increases
C.The stride of the convolution decreases
D.The size of each filter decreases
ANS: B
24.When using a 3x3 filter on a multichannel input, how does the filter interact with the channels?
A.A separate 3x3 filter is applied to each channel independently
B.A 3D filter is applied across all channels simultaneously
C.Only the first channel is used for the filter operation
D.The filter size changes depending on the number of channels
ANS: B
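Note: a rough sketch of what "a 3D filter applied across all channels" means for a single output position. It assumes the cross-correlation form used by most deep learning libraries, with the image stored as [channels][height][width]. The function name `conv_at` and the toy values are illustrative.

```python
def conv_at(image, kernel, row, col):
    """Compute one output value of a multichannel convolution.

    image:  nested list shaped [channels][height][width]
    kernel: nested list shaped [channels][k][k], same channel count as image
    """
    total = 0.0
    for ch in range(len(kernel)):
        for i in range(len(kernel[ch])):
            for j in range(len(kernel[ch][i])):
                # Products from every channel are summed into one number.
                total += image[ch][row + i][col + j] * kernel[ch][i][j]
    return total


# Three 3x3 channels filled with 1.0, 2.0 and 3.0, and a 3x3x3 kernel of ones.
image = [[[float(ch + 1)] * 3 for _ in range(3)] for ch in range(3)]
kernel = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
value = conv_at(image, kernel, 0, 0)
```

One filter therefore yields one output channel regardless of how many input channels it spans.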
26.In a convolution operation with multichannel input, what determines the depth of the output feature
map?
A.The number of channels in the input
B.The size of the filters
C.The number of filters used
D.The padding used
ANS: C
27.If a convolutional layer has 64 filters and receives a 32x32x3 RGB image as input, what will the depth of
the output feature map be?
A. 3
B. 32
C. 64
D.Depends on the stride
ANS: C
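Note: a short sketch relating the number of filters to output depth and parameter count. The question does not state a kernel size, so the 3x3 kernel and the one-bias-per-filter convention below are assumptions.

```python
def conv_layer_summary(in_channels, num_filters, kernel_size):
    """Return the output depth and learnable parameter count of a conv layer."""
    # Each filter spans every input channel.
    weights_per_filter = kernel_size * kernel_size * in_channels
    # One bias per filter is assumed here.
    parameters = num_filters * (weights_per_filter + 1)
    return {"output_depth": num_filters, "parameters": parameters}


# 32x32x3 RGB input, 64 filters, assumed 3x3 kernels.
summary = conv_layer_summary(in_channels=3, num_filters=64, kernel_size=3)
```

The output depth equals the number of filters; the input depth only affects how many weights each filter holds.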
28.In a convolution operation, what does the term "kernel size" refer to?
A.The number of filters in the layer
B.The dimensions of the filter matrix
C.The stride used in the convolution
D.The type of padding used
ANS: B
29.How does the convolution operation handle depth when applied to multichannel inputs like RGB images?
A.Each channel is processed independently and outputs are stacked
B.All channels are combined into a single-channel output
C.A filter with a depth matching the input is applied across all channels
D.It ignores all but one channel
ANS: C
30.Which of the following operations can reduce the computational cost in a convolutional layer?
A.Increasing the filter size
B.Decreasing the stride
C.Reducing the number of filters
D.Increasing padding
ANS: C
31.Which of the following is the main feature of a Recurrent Neural Network (RNN) that distinguishes it from
other neural networks?
A) Feed-forward connections
B) Convolutional layers
C) Recurrent connections
D) Skip connections
ANS: C
32.Which activation function is commonly used in the hidden layers of RNNs to handle the vanishing
gradient problem?
A) ReLU
B) Sigmoid
C) Tanh
D) Softmax
ANS: C
33.What is the primary issue faced by vanilla RNNs when dealing with long sequences?
A) Overfitting
B) High computation cost
C) Vanishing and exploding gradients
D) Lack of training data
ANS: C
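Note: a toy illustration of why gradients vanish or explode over long sequences. It assumes, as a simplification, that backpropagation through time multiplies the gradient by the same scalar factor at every step; real networks multiply by Jacobian matrices.

```python
def gradient_scale(factor, steps):
    """Multiply a unit gradient by the same factor once per time step."""
    scale = 1.0
    for _ in range(steps):
        scale *= factor
    return scale


vanishing = gradient_scale(0.5, 50)  # factor below 1: shrinks toward zero
exploding = gradient_scale(1.5, 50)  # factor above 1: grows very large
```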
34.Which type of RNN is designed to address the vanishing gradient problem?
A) Convolutional RNN
B) Long Short-Term Memory (LSTM)
C) Gated Recurrent Unit (GRU)
D) Both B and C
ANS: D
35.In RNNs, the hidden state at time t depends on which of the following?
A) Only the input at time t
B) The hidden state at time t-1 and the input at time t
C) Only the hidden state at time t-1
D) Only the output at time t-1
ANS: B
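Note: a minimal sketch of one vanilla RNN step in plain Python, assuming the common form h_t = tanh(W_xh x_t + W_hh h_prev + b_h). The weight values and names (`w_xh`, `w_hh`, `b_h`) are made up for illustration.

```python
import math


def rnn_step(x_t, h_prev, w_xh, w_hh, b_h):
    """Compute the new hidden state from the current input and previous hidden state."""
    h_t = []
    for i in range(len(b_h)):
        pre = b_h[i]
        pre += sum(w_xh[i][j] * x_t[j] for j in range(len(x_t)))        # input term
        pre += sum(w_hh[i][j] * h_prev[j] for j in range(len(h_prev)))  # recurrent term
        h_t.append(math.tanh(pre))
    return h_t


w_xh = [[0.5, -0.2], [0.1, 0.3]]
w_hh = [[0.4, 0.0], [0.0, 0.4]]
b_h = [0.0, 0.0]

h = [0.0, 0.0]  # hidden state initialised to zeros at the start of the sequence
for x_t in [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]:
    h = rnn_step(x_t, h, w_xh, w_hh, b_h)
```

The same weights are reused at every time step; only the hidden state carries information forward.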
36.In PyTorch, which function is commonly used to define an RNN layer?
A) torch.nn.Conv2d
B) torch.nn.Linear
C) torch.nn.RNN
D) torch.nn.ReLU
ANS: C
37.Which library can be used to easily implement RNNs in Python?
A) NumPy
B) SciPy
C) PyTorch
D) Matplotlib
ANS: C
38.In RNNs, which function is used to initialize the hidden state to zeros at the start of a sequence?
A) torch.zeros
B) torch.ones
C) torch.randn
D) torch.zeros_like
ANS: A
39.Which of the following is an advantage of using an RNN over a feed-forward neural network?
A) Ability to process inputs of fixed size
B) Ability to model sequential dependencies
C) Faster training time
D) Lower memory requirement
ANS: B
40.What is a tensor in PyTorch?
A) A scalar
B) A matrix with only 2 dimensions
C) A multidimensional array
D) A data loader in PyTorch
ANS: C
41.Which of the following commands initializes a tensor of all zeros with shape (2,3)?
A) torch.zeros(2,3)
B) torch.ones(2,3)
C) torch.rand(2,3)
D) torch.randn(2,3)
ANS: A
42.What function would you use to change the shape of a tensor in PyTorch?
A) torch.reshape()
B) torch.view()
C) torch.resize()
D) torch.size()
ANS: A
43.How can you move a tensor to a GPU in PyTorch?
A) tensor.gpu()
B) tensor.to('gpu')
C) tensor.cuda()
D) torch.gpu(tensor)
ANS: C
44.Which function in PyTorch is used to compute the gradients of tensors?
A) torch.grad()
B) torch.backprop()
C) tensor.backward()
D) tensor.gradient()
ANS: C
45.What does the nn.Module class represent in PyTorch?
A) A base class for all neural network layers and models
B) A function for calculating loss
C) A function for optimization
D) A class for handling datasets
ANS: A
46.Which function is used to apply an optimizer in PyTorch?
A) optimizer.update()
B) optimizer.apply()
C) optimizer.step()
D) optimizer.backward()
ANS: C
47.Which of the following PyTorch functions is typically used to split data into training and testing sets?
A) torch.split()
B) torch.utils.data.random_split()
C) torch.divide()
D) torch.subset()
ANS: B
48.In CNNs, what does the term “convolution” refer to?
A) Merging multiple channels
B) A mathematical operation combining two functions
C) A pooling layer operation
D) Adjusting model weights
ANS: B
49.Which layer in CNNs is primarily responsible for down-sampling the feature maps?
A) Convolution layer
B) Fully connected layer
C) Pooling layer
D) Dropout layer
ANS: C
50.What does nn.Conv2d in PyTorch define?
A) A dense layer
B) A 1D convolution layer
C) A 2D convolution layer
D) A 3D convolution layer
ANS: C
51.Which activation function is most commonly used in CNNs?
A) Sigmoid
B) ReLU
C) Tanh
D) Softmax
ANS: B
52.In a CNN model in PyTorch, which layer would you typically use to flatten a 2D feature map into a 1D
tensor before passing it to a fully connected layer?
A) torch.nn.ReLU
B) torch.nn.Flatten
C) torch.nn.Softmax
D) torch.nn.Conv2d
ANS: B
53.In a typical CNN architecture, which of the following orders of layers is most common?
A) Convolution → Pooling → Activation
B) Activation → Convolution → Pooling
C) Pooling → Activation → Convolution
D) Convolution → Activation → Pooling
ANS: D
54.What does the parameter padding in torch.nn.Conv2d affect?
A) It changes the number of output channels
B) It modifies the kernel size
C) It adds extra pixels around the input to control output dimensions
D) It alters the activation function
ANS: C
55.Which of the following terms describes the process of adjusting the weights of a CNN during training?
A) Pooling
B) Backpropagation
C) Padding
D) Convolution
ANS: B
56.Which of the following is a common approach to prevent overfitting in CNNs?
A) Increasing the number of convolutional layers
B) Adding dropout layers
C) Using larger filters
D) Reducing the learning rate
ANS: B
57.Which of the following operations is typically used in a CNN to reduce the spatial dimensions of the
input?
A) Convolution
B) Pooling
C) Padding
D) Fully Connected Layer
ANS: B
58.What is the primary purpose of torch.nn.Conv2d in a CNN model in PyTorch?
A) To perform linear transformations
B) To apply a 2D convolution operation
C) To downsample the input
D) To apply activation functions
ANS: B
59.In PyTorch, what does the stride parameter in torch.nn.Conv2d control?
A) The number of output channels
B) The spacing between kernel elements
C) The downsampling factor applied to the input
D) The activation function used
ANS: C
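Note: a small sketch of the usual output-size formula for one spatial dimension of a convolution, assuming no dilation: floor((n + 2p - k) / s) + 1. The function name is illustrative.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length along one spatial dimension (dilation of 1 assumed)."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


no_padding = conv_output_size(32, 3)                       # 3x3 kernel, stride 1
same_size = conv_output_size(32, 3, padding=1)             # padding keeps the size
downsampled = conv_output_size(32, 3, stride=2, padding=1) # stride 2 halves the size
```

Padding controls how much the border shrinks the output, while a stride of s reduces each spatial dimension by roughly a factor of s.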
60.In PyTorch, which loss function is most commonly used for classification problems?
A) MSELoss
B) CrossEntropyLoss
C) L1Loss
D) SmoothL1Loss
ANS: B
61. Which of the following is an interactive application of deep learning?
A. Image Classification
B. Autonomous Driving
C. Recommender Systems
D. All of the above
ANS: D
62. In deep learning applications for natural language processing (NLP), which model is commonly
used for tasks like language translation and chatbots?
A. Convolutional Neural Network (CNN)
B. Recurrent Neural Network (RNN)
C. Random Forest
D. Support Vector Machine (SVM)
ANS: B
65. In recommender systems, which deep learning approach is commonly employed to analyze user
behavior and suggest personalized content?
A. CNN
B. Collaborative Filtering
C. Reinforcement Learning
D. GAN (Generative Adversarial Network)
ANS: B
66. What role does reinforcement learning play in interactive deep learning applications like gaming
and robotics?
A. It learns from labeled data.
B. It follows a supervised learning approach.
C. It learns by interacting with the environment and receiving feedback.
D. It uses unsupervised clustering methods.
ANS: C
67. For real-time speech recognition systems, which deep learning model is preferred?
A. Transformer
B. CNN
C. SVM
D. k-Nearest Neighbors (k-NN)
ANS: A
68. Which of the following is a common deep learning library used for developing interactive
applications?
A. Scikit-Learn
B. TensorFlow
C. OpenCV
D. NLTK
ANS: B
69. In interactive applications, what is the main purpose of using Generative Adversarial Networks
(GANs)?
A. To classify objects
B. To generate synthetic data
C. To analyze customer data
D. To perform clustering
ANS: B
70. Which of the following describes a major challenge in deploying deep learning models in
interactive applications?
A. Low accuracy
B. High computational requirements
C. Simplicity of model design
D. Limited applications
ANS: B
71.Which deep learning model is commonly used for image classification tasks in machine vision?
A. Recurrent Neural Network (RNN)
B. Convolutional Neural Network (CNN)
C. Support Vector Machine (SVM)
D. Decision Tree
ANS: B
72.Which dataset is commonly used as a benchmark for object detection tasks in machine vision?
A. MNIST
B. CIFAR-10
C. ImageNet
D. IMDB
ANS: C
73.What is the purpose of using Generative Adversarial Networks (GANs) in machine vision?
A. Object detection
B. Image synthesis and enhancement
C. Image segmentation
D. Classification
ANS: B
74.Which technique is often used in machine vision to locate objects within an image?
A. Image classification
B. Object detection
C. Sentiment analysis
D. Data augmentation
ANS: B
75.In machine vision, what is a key benefit of using CNNs over traditional algorithms?
A. Faster processing without GPUs
B. Improved performance with large datasets
C. Better results on small datasets
D. Reduced need for labeled data
ANS: B
78.Which of the following models is specifically optimized for real-time object detection?
A. Faster R-CNN
B. YOLO (You Only Look Once)
C. VGGNet
D. AlexNet
ANS: B
79.In machine vision applications, which type of neural network is most effective for analyzing
spatial hierarchies in images?
A. Convolutional Neural Network (CNN)
B. Recurrent Neural Network (RNN)
C. Feedforward Neural Network
D. Long Short-Term Memory (LSTM)
ANS: A
80.Which method is often used for identifying the edges or boundaries of objects within an image in
machine vision?
A. Image classification
B. Edge detection
C. Sentiment analysis
D. Reinforcement learning
ANS: B
81. Which model is widely used in NLP for machine translation tasks?
A. Convolutional Neural Network (CNN)
B. Recurrent Neural Network (RNN)
C. Support Vector Machine (SVM)
D. Decision Tree
ANS: B
84. Which NLP task involves determining whether a text expresses positive, negative, or neutral sentiment?
A. Text Summarization
B. Sentiment Analysis
C. Machine Translation
D. Named Entity Recognition
ANS: B
85. Which of the following is a state-of-the-art model architecture for NLP tasks?
A. CNN
B. GAN
C. Transformer
D. Decision Tree
ANS: C
86.Which method is commonly used to remove common words like "the," "is," and "and" in NLP?
A. Lemmatization
B. Stemming
C. Stopword Removal
D. Tokenization
ANS: C
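Note: a minimal sketch of stopword removal using whitespace tokenization. The `STOPWORDS` set is a tiny hand-picked list for illustration, not the list shipped with any particular library.

```python
# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "is", "and", "a", "of"}


def remove_stopwords(text):
    """Lowercase the text, split on whitespace, and drop stopwords."""
    tokens = text.lower().split()
    return [token for token in tokens if token not in STOPWORDS]


filtered = remove_stopwords("The movie is long and the ending is great")
```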
88.Which of the following is a common evaluation metric for language generation tasks in NLP?
A. Mean Squared Error (MSE)
B. BLEU Score
C. Accuracy
D. ROC-AUC
ANS: B
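Note: a sketch of clipped (modified) unigram precision, which is one ingredient of BLEU. Full BLEU also combines higher-order n-gram precisions and a brevity penalty, both omitted here. The function name is illustrative.

```python
from collections import Counter


def clipped_unigram_precision(candidate, reference):
    """Fraction of candidate words matched in the reference, with counts clipped."""
    candidate_tokens = candidate.split()
    candidate_counts = Counter(candidate_tokens)
    reference_counts = Counter(reference.split())
    # A word is credited at most as many times as it appears in the reference.
    clipped = sum(
        min(count, reference_counts[word])
        for word, count in candidate_counts.items()
    )
    return clipped / len(candidate_tokens)


precision = clipped_unigram_precision(
    "the the the the the the the",
    "the cat is on the mat",
)
```

Clipping stops a candidate from scoring well simply by repeating a common word.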
89.Which pre-trained NLP model is known for generating high-quality text through transfer
learning?
A. GPT (Generative Pre-trained Transformer)
B. CNN
C. SVM
D. K-Nearest Neighbors
ANS: A
98.What loss function modification helps improve GAN training stability by addressing vanishing
gradients?
A. Cross-Entropy Loss
B. Wasserstein Loss
C. Mean Squared Error
D. Hinge Loss
ANS: B
100.Which GAN variant is designed to learn mappings between two domains, such as translating
images from one style to another?
A. DCGAN
B. CycleGAN
C. WGAN
D. cGAN
ANS: B
102. Which of the following algorithms is commonly used in deep reinforcement learning?
A. K-Nearest Neighbors
B. Q-Learning
C. Principal Component Analysis
D. Support Vector Machine
ANS: B
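Note: a minimal sketch of the tabular Q-learning update, Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)). The states, actions, and hyperparameter values are made up; deep variants replace the table with a neural network.

```python
def q_learning_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Update one Q-table entry in place using the Q-learning rule."""
    best_next = max(q[next_state].values())  # greedy value of the next state
    target = reward + gamma * best_next
    q[state][action] += alpha * (target - q[state][action])


q = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 1.0, "right": 2.0},
}
q_learning_update(q, "s0", "right", reward=1.0, next_state="s1")
```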
104. Which of the following is an example of a deep reinforcement learning algorithm that combines Q-
learning with neural networks?
A. Deep Belief Network
B. Deep Q-Network (DQN)
C. Support Vector Machine
D. Recurrent Neural Network
ANS: B
106. Which function in reinforcement learning estimates the expected cumulative reward from a
given state?
A. Loss function
B. Activation function
C. Value function
D. Kernel function
ANS: C
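Note: a value function estimates the expected discounted return. The sketch below computes the discounted return of a single observed reward sequence, assuming a discount factor `gamma`; averaging such returns over many episodes is one way to estimate a state's value.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards where the reward at step t is weighted by gamma**t."""
    total = 0.0
    # Work backwards so each step adds its reward to the discounted future.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


g = discounted_return([1.0, 1.0, 1.0], gamma=0.5)  # 1 + 0.5 + 0.25
```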
108. Which deep reinforcement learning algorithm uses multiple actors to collect data in parallel
environments?
A. Deep Q-Network (DQN)
B. Monte Carlo Tree Search
C. A3C (Asynchronous Advantage Actor-Critic)
D. K-Means Clustering
ANS: C
109. In reinforcement learning, which method is used to predict the value of future actions based
on past experiences?
A. Gradient Descent
B. Temporal Difference (TD) Learning
C. Clustering
D. Dimensionality Reduction
ANS: B
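Note: a small sketch of the TD(0) state-value update, V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)). The state names and the step size are assumptions chosen for the example.

```python
def td0_update(value, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one TD(0) update in place and return the TD error."""
    td_target = reward + gamma * value[next_state]  # bootstrapped estimate
    td_error = td_target - value[state]
    value[state] += alpha * td_error
    return td_error


value = {"s0": 0.0, "s1": 1.0}
error = td0_update(value, "s0", reward=0.5, next_state="s1")
```

The update uses the current estimate of the next state instead of waiting for the episode to finish.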
110. Which of the following algorithms is known for combining policy gradients with value
function approximation in deep reinforcement learning?
A. Deep Belief Networks
B. Actor-Critic
C. K-Nearest Neighbors
D. Decision Trees
ANS: B
113. Which type of autoencoder is commonly used to handle noisy data by learning to reconstruct
the original input?
A. Convolutional Autoencoder
B. Sparse Autoencoder
C. Denoising Autoencoder
D. Variational Autoencoder
ANS: C
115. Which type of autoencoder is designed to generate new data by sampling from a probability
distribution?
A. Sparse Autoencoder
B. Denoising Autoencoder
C. Convolutional Autoencoder
D. Variational Autoencoder
ANS: D
118. Which autoencoder variant is often used for extracting sparse representations of data?
A. Denoising Autoencoder
B. Sparse Autoencoder
C. Convolutional Autoencoder
D. Variational Autoencoder
ANS: B
119. In a variational autoencoder (VAE), which term in the loss function encourages the latent
space to follow a normal distribution?
A. Reconstruction Loss
B. KL-Divergence Loss
C. Cross-Entropy Loss
D. Softmax Loss
ANS: B
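Note: a sketch of the closed-form KL term commonly used in VAEs, assuming the encoder outputs a diagonal Gaussian (mean `mu`, log-variance `log_var`) and the prior is a standard normal: KL = 0.5 * sum(mu^2 + exp(log_var) - log_var - 1).

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I)."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0
        for m, lv in zip(mu, log_var)
    )


at_prior = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])  # already standard normal
shifted = kl_to_standard_normal([1.0, 0.0], [0.0, 0.0])   # mean moved away from 0
```

The term is zero only when the encoded distribution matches the prior, which pulls the latent codes toward a standard normal.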
120. What type of neural network layers are commonly used in a Convolutional Autoencoder?
A. Fully connected layers
B. Recurrent layers
C. Convolutional and Deconvolutional layers
D. LSTM layers
ANS: C
121. What is the primary difference between a Boltzmann Machine and a Restricted Boltzmann
Machine (RBM)?
A. RBMs have no hidden layer.
B. RBMs have connections only between visible and hidden layers, with no intra-layer connections.
C. RBMs are used only for supervised learning.
D. Boltzmann Machines are only used for image data.
ANS: B
122. What type of learning algorithm is typically used to train Restricted Boltzmann Machines?
A. Backpropagation
B. Contrastive Divergence
C. K-Nearest Neighbors
D. Gradient Descent
ANS: B
123. Which of the following is a common application of Restricted Boltzmann Machines (RBMs)?
A. Image segmentation
B. Dimensionality reduction and feature extraction
C. Supervised classification
D. Reinforcement learning
ANS: B
127. Which of the following is a major limitation of standard Boltzmann Machines that Restricted
Boltzmann Machines address?
A. High computational cost due to intra-layer connections
B. Limited to supervised learning tasks
C. Inability to work with high-dimensional data
D. Overfitting on small datasets
ANS: A
128. What type of energy function is used in Restricted Boltzmann Machines to calculate the
probability of a given configuration?
A. Cross-Entropy Loss
B. Hinge Loss
C. Quadratic Energy Function
D. Energy-based function
ANS: D
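Note: a sketch of the standard RBM energy for binary units, E(v, h) = -(a.v + b.h + v^T W h), where `a` and `b` are visible and hidden biases and `w[i][j]` links visible unit i to hidden unit j. The numeric values are made up; lower energy corresponds to higher probability.

```python
def rbm_energy(v, h, w, a, b):
    """Energy of a joint visible/hidden configuration in an RBM."""
    visible_term = sum(a_i * v_i for a_i, v_i in zip(a, v))
    hidden_term = sum(b_j * h_j for b_j, h_j in zip(b, h))
    # Only visible-hidden pairs interact; there are no intra-layer terms.
    interaction = sum(
        v[i] * w[i][j] * h[j]
        for i in range(len(v))
        for j in range(len(h))
    )
    return -(visible_term + hidden_term + interaction)


energy = rbm_energy(v=[1, 0], h=[1], w=[[2.0], [3.0]], a=[0.5, 0.5], b=[1.0])
```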
129. Which of the following techniques is often used to stack multiple RBMs in deep learning
architectures?
A. Deep Belief Networks (DBNs)
B. Recurrent Neural Networks (RNNs)
C. Autoencoders
D. Convolutional Networks
ANS: A
131. What are Deep Belief Networks (DBNs) primarily used for?
A. Supervised classification
B. Unsupervised feature learning and dimensionality reduction
C. Reinforcement learning
D. Time series analysis
ANS: B
132. Which of the following architectures is used to build Deep Belief Networks?
A. Stacked Restricted Boltzmann Machines (RBMs)
B. Convolutional layers
C. Long Short-Term Memory (LSTM) cells
D. Decision trees
ANS: A
134. What is the primary training technique used in Deep Belief Networks?
A. Backpropagation
B. Contrastive Divergence
C. Gradient Descent
D. Reinforcement Learning
ANS: B
135. DBNs are typically trained in two main phases. What are they?
A. Pretraining and fine-tuning
B. Clustering and classification
C. Gradient descent and backpropagation
D. Labeling and segmentation
ANS: A
136. In the context of DBNs, what is the purpose of the "pretraining" phase?
A. To classify data directly
B. To initialize the weights for each layer in an unsupervised manner
C. To reduce overfitting
D. To increase the learning rate
ANS: B
137.Which loss function is typically minimized in the pretraining phase of a Deep Belief Network?
A. Cross-Entropy Loss
B. Mean Squared Error
C. Negative Log-Likelihood
D. Hinge Loss
ANS: C
138. What is the main advantage of using DBNs over shallow networks?
A. Reduced computational complexity
B. Ability to capture hierarchical feature representations
C. Faster training time
D. Simpler network architecture
ANS: B
140. In Deep Belief Networks, what does each layer learn to represent?
A. Specific labels for classification
B. High-level abstract features of the input data
C. Time dependencies in sequential data
D. Probability distributions for reinforcement learning
ANS: B
142. Which of the following machine learning algorithms is commonly used for binary text
classification tasks like sentiment analysis?
A. K-Nearest Neighbors
B. Support Vector Machine (SVM)
C. Principal Component Analysis (PCA)
D. K-Means Clustering
ANS: B
143. In text classification, which technique is used to convert text data into numerical features?
A. Tokenization
B. Cross-Validation
C. Hyperparameter Tuning
D. Normalization
ANS: A
144. Which of the following is a popular library for processing and tokenizing text data in Python?
A. Pandas
B. NumPy
C. NLTK (Natural Language Toolkit)
D. Matplotlib
ANS: C
145. In binary classification of movie reviews, which metric is often used to evaluate model
performance?
A. Mean Absolute Error
B. R-Squared
C. Accuracy
D. Adjusted R-Squared
ANS: C
146. Which technique is commonly used to reduce overfitting in a binary classification model?
A. Increasing the learning rate
B. Adding more features
C. Cross-validation
D. Ignoring low-frequency words
ANS: C
147. Which deep learning model is commonly used for binary text classification tasks, such as
classifying movie reviews?
A. Convolutional Neural Networks (CNNs)
B. Recurrent Neural Networks (RNNs)
C. Principal Component Analysis (PCA)
D. K-Means Clustering
ANS: B
148. Which of the following representations is commonly used to convert text data into vectors for
sentiment analysis?
A. Word2Vec
B. Image Pixels
C. Edge Detection
D. Fourier Transform
ANS: A
150. Which of the following is an example of a binary classification model output in sentiment
analysis?
A. Multiclass output based on topics
B. Probability score for positive and negative classes
C. Prediction of multiple sentiments simultaneously
D. Cluster labels for unsupervised learning
ANS: B
152. Which type of activation function is commonly used in the output layer for multiclass
classification?
A. Sigmoid
B. ReLU
C. Softmax
D. Tanh
ANS: C
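Note: a minimal softmax sketch in plain Python. Subtracting the maximum logit before exponentiating is a common numerical-stability trick and does not change the result. The sample logits are arbitrary.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # shifted for stability
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])  # one probability per class
```

The class with the largest logit receives the largest probability, and the outputs can be read as a distribution over classes.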
153. In a multiclass classification problem, which metric is typically used to evaluate the overall
performance of the model?
A. Mean Squared Error
B. Accuracy
C. R-Squared
D. Mean Absolute Error
ANS: B
154. What type of loss function is commonly used for multiclass classification tasks?
A. Mean Squared Error
B. Binary Cross-Entropy
C. Categorical Cross-Entropy
D. Hinge Loss
ANS: C
155. Which technique is often used to convert text data, like news articles, into numerical features?
A. Cross-validation
B. Tokenization and Vectorization
C. Normalization
D. Hyperparameter Tuning
ANS: B
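Note: a rough sketch of tokenization followed by bag-of-words vectorization, assuming simple whitespace tokenization. The helper names and the two sample headlines are invented for illustration.

```python
def build_vocabulary(documents):
    """Map each distinct token to an integer index, in order of first appearance."""
    vocab = {}
    for doc in documents:
        for token in doc.lower().split():
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def vectorize(document, vocab):
    """Count how often each vocabulary token occurs in the document."""
    counts = [0] * len(vocab)
    for token in document.lower().split():
        if token in vocab:
            counts[vocab[token]] += 1
    return counts


docs = ["stocks rise as markets rally", "markets fall as stocks slide"]
vocab = build_vocabulary(docs)
vectors = [vectorize(doc, vocab) for doc in docs]
```

Each document becomes a fixed-length count vector that a classifier can consume.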
156. Which of the following libraries is commonly used for text preprocessing in Python for
newswire classification?
A. TensorFlow
B. Matplotlib
C. NLTK (Natural Language Toolkit)
D. Pandas
ANS: C
157. When converting text data for multiclass classification, which representation captures the
semantic meaning of words?
A. Bag of Words
B. Word2Vec
C. One-Hot Encoding
D. Label Encoding
ANS: B
158. Which type of neural network is often used for sequential data like news articles?
A. Convolutional Neural Network (CNN)
B. Recurrent Neural Network (RNN)
C. Decision Tree
D. Support Vector Machine
ANS: B
160. Which of the following evaluation metrics is particularly useful when dealing with
imbalanced classes in multiclass classification?
A. Mean Squared Error
B. Precision, Recall, and F1-Score
C. R-Squared
D. Adjusted Accuracy
ANS: B
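Note: a sketch of per-class precision, recall, and F1 computed one-vs-rest in a multiclass setting. The labels and predictions are made-up sample data, and the function name is illustrative.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Treat one class as positive and all other classes as negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


y_true = ["sports", "sports", "politics", "tech", "tech", "tech"]
y_pred = ["sports", "tech", "politics", "tech", "tech", "sports"]
precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="tech")
```

Unlike accuracy, these scores are computed per class, so a rare class that is often misclassified is not hidden by a dominant class.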