ML Mcqs Without Answers

The document contains questions about neural networks and reinforcement learning concepts. It discusses backpropagation, perceptrons, supervised learning, associative networks, advantages of neural networks over conventional computers, the XOR problem, activation functions, gradient descent, stochastic gradient descent, Q-learning algorithm steps, temporal difference learning in Q-learning, and optimal policies based on state-action values.

Uploaded by

Syeda Maria
Copyright
© All Rights Reserved

Back propagation is a learning technique that adjusts weights in
the neural network by propagating weight changes
A) Forward from source to sink
B) Backward from sink to source
C) Forward from source to hidden nodes
D) Backward from sink to hidden nodes

Which of the following neural networks uses supervised learning?


i) Multilayer perceptron
ii) Self organizing feature map
iii) Hopfield network
A) (i) only
B) (ii) only
C) (i) and (ii) only
D) (i) and (iii) only

Which of the following is true?


(i) On average, neural networks have higher computation rates than
conventional computers.
(ii) Neural networks learn by example
(iii) Neural networks mimic the way the human brain works
A) All of them are true
B) (ii) and (iii) are true
C) (i),(ii) and (iii) are true
D) None of these

An associative network is
A) A neural network that contains no loop
B) A neural network that contains feedback
C) A neural network that has only one loop
D) None of These

Perceptron can learn


A) AND
B) XOR
C) Both A and B
D) None of these

A perceptron is
A) a single layer feed-forward neural network with pre-processing
B) an auto-associative neural network
C) a double layer auto-associative neural network
D) a neural network that contains feedback

Which of the following is true for neural networks?


(i) The training time depends on the size of the network.
(ii) Neural networks can be simulated on a conventional computer.
(iii) Artificial neurons are identical in operation to biological
ones.
A) All of the mentioned
B) (ii) is true
C) (i) and (ii) are true
D) None of the mentioned

Which of the following is true?


(i) On average, neural networks have higher computational rates
than conventional computers.
(ii) Neural networks learn by example.
(iii) Neural networks mimic the way the human brain works.
A) All of the mentioned are true
B) (ii) and (iii) are true
C) (i), (ii) and (iii) are true
D) None of the mentioned

What are the advantages of neural networks over conventional
computers?
(i) They have the ability to learn by example
(ii) They are more fault tolerant
(iii) They are more suited for real time operation due to their
high ‘computational’ rates
A) (i) and (ii) are true
B) (i) and (iii) are true
C) Only (i)
D) All of the mentioned

Why is the XOR problem exceptionally interesting to neural network
researchers?
A) Because it can be expressed in a way that allows you to use a
neural network
B) Because it is a complex binary operation that cannot be solved
using neural networks
C) Because it can be solved by a single layer perceptron
D) Because it is the simplest linearly inseparable problem that
exists.
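
The linear inseparability mentioned in this question can be checked by brute force. The sketch below is only an illustration: the helper names (threshold_unit, find_separator) and the search grid are assumptions, not part of the question bank. It searches a small grid of weights and bias values for a single threshold unit that reproduces a given truth table.

```python
from itertools import product

def threshold_unit(w1, w2, b, x1, x2):
    """Single threshold unit: outputs 1 if the weighted sum exceeds 0."""
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

def find_separator(truth_table, grid):
    """Return the first (w1, w2, b) on the grid matching the table, else None."""
    for w1, w2, b in product(grid, repeat=3):
        if all(threshold_unit(w1, w2, b, x1, x2) == y
               for (x1, x2), y in truth_table.items()):
            return (w1, w2, b)
    return None

AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# Hypothetical search grid: -4.0 to 4.0 in steps of 0.5
grid = [i / 2 for i in range(-8, 9)]

and_solution = find_separator(AND, grid)
xor_solution = find_separator(XOR, grid)
print("AND separator:", and_solution)
print("XOR separator:", xor_solution)
```

A separator is found for AND, while the search for XOR comes back empty. The grid search is only a demonstration; the general result is that no single line separates the XOR outputs.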

A perceptron is
A) a single layer feed-forward neural network with preprocessing
B) an autoassociative neural network
C) a double layer autoassociative neural network
D) None

Which of the following is true?


A) On average, neural networks have higher computational rates than
conventional computers.
B) Neural networks learn by example.
C) Neural networks mimic the way the human brain works.
D) All of them are true

A single perceptron can be used to represent many Boolean
functions
A) TRUE
B) FALSE

Neural network learning methods provide a robust approach to
approximating
A) real-valued functions
B) discrete-valued functions
C) vector-valued target functions
D) All of the above

Artificial neural networks are used for
A) Pattern Recognition
B) Classification
C) Clustering
D) All of these

In an artificial neural network, interconnected processing elements
are called
A) weights
B) nodes or neurons
C) axons
D) soma

Each connection link in an ANN is associated with ________ which has
information about the input signal.
A) neurons
B) weights
C) bias
D) activation function

A neural network can answer
A) For-loop questions
B) What-if questions
C) If-Then-Else analysis questions
D) None of these

A 4-input neuron has weights 1, 2, 3 and 4. The transfer function
is linear. The inputs are 4, 10, 5 and 20 respectively. The output
will be
A) 238
B) 76
C) 119
D) 123
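
The arithmetic behind this question can be sketched as a weighted sum. The question does not state a constant of proportionality for the linear transfer function, so the sketch assumes a slope of 1 (the identity); the function name linear_neuron and the slope parameter are illustrative assumptions.

```python
weights = [1, 2, 3, 4]
inputs = [4, 10, 5, 20]

def linear_neuron(weights, inputs, slope=1.0):
    """Weighted sum of the inputs passed through a linear transfer function."""
    net = sum(w * x for w, x in zip(weights, inputs))
    # Assumption: the transfer function is f(net) = slope * net, with slope = 1
    return slope * net

output = linear_neuron(weights, inputs)
print(output)  # 1*4 + 2*10 + 3*5 + 4*20 = 119.0
```

With a different slope the result scales accordingly, for example a slope of 2 doubles it.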

Which of the following is not a promise of artificial neural
networks?
A) It can explain its results
B) It can survive the failure of some nodes
C) It has inherent parallelism
D) It can handle noise

A perceptron adds up all the weighted inputs it receives, and if
the sum exceeds a certain value, it outputs a 1, otherwise it just
outputs a 0.
A) True
B) False
C) Sometimes; it can also output intermediate values
D) Can’t say

A perceptron can be viewed as representing a hyperplane decision
surface in the n-dimensional space
A) True
B) False

Linearly nonseparable training examples can be based on the
A) AND function
B) OR function
C) XOR function
D) NOT function

Some examples of popular weight-determining algorithms are
A) Delta rule
B) Perceptron rule
C) Stochastic gradient descent
D) All of the above

Convergence fails in the ________ learning procedure when the
training examples are not linearly separable
A) Delta rule
B) Stochastic gradient descent rule
C) Perceptron rule
D) Gradient descent rule

Sequence the flow of the perceptron rule
i) Choose random weights
ii) Modify the perceptron weights on misclassification
iii) Iteratively apply the perceptron
iv) Iterate through the training examples for proper
classification
A) iii, ii, iv, i
B) i, iii, ii, iv
C) i, ii, iii, iv
D) iv, iii, ii, i
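
The four activities listed in this question can be sketched as a small training loop. This is a minimal illustration on the AND function, not a reference implementation; the learning rate, epoch limit, seed and the function names (predict, train_perceptron) are all assumptions.

```python
import random

def predict(weights, bias, x):
    """Thresholded perceptron output for one input vector."""
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0

def train_perceptron(examples, lr=0.1, max_epochs=1000, seed=0):
    rng = random.Random(seed)
    n = len(examples[0][0])
    # Start from random weights
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    bias = rng.uniform(-0.5, 0.5)
    # Repeat passes over the training examples until all are classified
    for _ in range(max_epochs):
        errors = 0
        for x, target in examples:
            out = predict(weights, bias, x)  # apply the perceptron
            if out != target:                # update only on a mistake
                errors += 1
                weights = [w + lr * (target - out) * xi
                           for w, xi in zip(weights, x)]
                bias += lr * (target - out)
        if errors == 0:
            break
    return weights, bias

AND_examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_examples)
predictions = [predict(weights, bias, x) for x, _ in AND_examples]
print(predictions)
```

Because AND is linearly separable, the loop stops once every example is classified correctly.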

The key idea behind the delta rule is to use ________ to search the
hypothesis space
A) Stochastic gradient descent
B) Linear programming
C) Gradient descent
D) Both A and B

The delta training rule is best understood by considering the task
of training an
A) thresholded perceptron
B) unthresholded perceptron
C) randomised perceptron
D) None of the above
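
As a rough illustration of the delta rule, the sketch below runs batch gradient descent on a linear unit with output o = w . x and squared error E = 1/2 * sum((t - o)^2). The training data, learning rate and the name train_linear_unit are hypothetical choices made for the example.

```python
def train_linear_unit(examples, lr=0.05, epochs=500):
    """Batch gradient descent on E = 1/2 * sum((t - o)^2) for o = w . x."""
    n = len(examples[0][0])
    w = [0.0] * n
    for _ in range(epochs):
        step = [0.0] * n
        for x, t in examples:
            o = sum(wi * xi for wi, xi in zip(w, x))  # no threshold applied
            for i in range(n):
                step[i] += (t - o) * x[i]             # this is -dE/dw_i
        w = [wi + lr * si for wi, si in zip(w, step)]
    return w

# Hypothetical data generated by the target t = 2*x1 - 1*x2
examples = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
w = train_linear_unit(examples)
print([round(wi, 4) for wi in w])
```

The learned weights approach the generating weights (2, -1) because the error surface of a linear unit has a single global minimum.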

In the gradient descent algorithm, the direction of steepest descent
along the error surface can be found by ________ with respect to
each component of the weight vector
A) computing the derivative of E(error)
B) computing the derivative of E(error)
C) computing the threshold of E(error)
D) both B and C

Gradient descent can be applied when
A) the hypothesis space contains weights in a linear unit
B) the error can be differentiated with respect to the weights in a
linear unit
C) both A and B
D) None of the above

Practical difficulties in applying gradient descent are
A) Converging to a local minimum can sometimes be quite slow
B) There is no guarantee of finding the global minimum in the
presence of multiple local minima
C) Both A and B
D) Only A

Activation functions are used to bring ________ into neural
networks
A) Linearity
B) Non-linearity
C) Both A and B
D) None of the above

A single layer perceptron has ______ layers
A) 2
B) 3
C) 4
D) Zero

A multilayer perceptron has a minimum of ______ layers


A) 1
B) 2
C) 3
D) 4

A neural network is a _______ learning model
A) Supervised
B) Unsupervised
C) Reinforced
D) None of the above

Gradient Descent is used for updating _____


A) Weights
B) No. of nodes in input layer
C) No. of nodes in hidden layer
D) No. of nodes in output layer

Stochastic Gradient Descent is used for updating ______


A) Weights
B) No. of nodes in input layer
C) No. of nodes in hidden layer
D) No. of nodes in output layer

In order to attain a greater cumulative future reward when
V*(s1) > V*(s2), the evaluation function used by the agent to learn
is ________
A) state S1
B) state S2
C) both S1 and S2
D) None of the above

The agent acquires the optimal policy by learning V*
A) immediate reward function
B) state transition function
C) both of the above
D) None of the above

Q-learning does not need to represent or learn a model; this
makes the implementation of Q-learning
A) easy
B) moderate
C) difficult
D) none

Order the steps of the Q-learning algorithm
1. Observe the new state S'
2. Update the table entry for Q(S,a)
3. Receive the immediate reward
4. Select an action a
5. Execute action a

A) 4-5-3-1-2
B) 2-5-3-4-2
C) 4-1-2-3-5
D) 4-5-3-2-1
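
The five steps named in this question can be sketched as a tabular Q-learning loop. The environment below is an assumption made purely for illustration: a deterministic four-state chain with an absorbing goal, a discount factor of 0.9 and uniformly random action selection. The update used is the deterministic form Q(S,a) = r + gamma * max Q(S',a').

```python
import random

GOAL = 3                       # hypothetical absorbing goal state
ACTIONS = ("left", "right")
GAMMA = 0.9                    # assumed discount factor

def step(state, action):
    """Deterministic chain world: reward 1 on entering the goal, else 0."""
    nxt = min(state + 1, GOAL) if action == "right" else max(state - 1, 0)
    reward = 1.0 if nxt == GOAL else 0.0
    return reward, nxt

def q_learning(episodes=200, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            a = rng.choice(ACTIONS)       # select an action
            r, s_next = step(s, a)        # execute it, get reward and new state
            # update the table entry for Q(s, a)
            q[(s, a)] = r + GAMMA * max(q[(s_next, b)] for b in ACTIONS)
            s = s_next
    return q

q = q_learning()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
print(policy)
```

No transition model is stored anywhere in the loop; the table is filled in from observed (state, action, reward, next state) experience alone.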

One step error is used in Q-Learning algorithm


A) True
B) False
C) None
D) Either
ANSWER : A

In model-free reinforcement learning, learning is from
A) Optimal value function V
B) Optimal Q function
C) None of the above
D) Both

Given Q(S,a) = 12, Q(S,b) = 100 and Q(S,c) = 67, which Q-function
value is chosen by a greedy policy?
A) Q(S,a) = 12
B) Q(S,b) = 100
C) Q(S,c) = 67
D) none
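
Greedy selection over the values given in this question can be sketched in a couple of lines. The dictionary layout and variable names are illustrative assumptions.

```python
# Q-values for state S, taken from the question
q_values = {"a": 12, "b": 100, "c": 67}

# A greedy policy picks the action with the largest Q-value
greedy_action = max(q_values, key=q_values.get)
print(greedy_action, q_values[greedy_action])  # b 100
```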

Thinking about reinforcement learning, which of the following
statements are true (multiple choice):
A) The maximization of the future cumulative reward allows
reinforcement learning to perform global decisions with local
information
B) Q-learning is a temporal difference RL method that does not
need a model of the task to learn the action value function
C) Reinforcement learning can only be applied to problems with a
finite number of states
D) In Markov Decision Problems (MDP) the future actions from a
state depend on the previous states

Optimal policy of agents is based on ---------------


A) actions
B) state
C) both of the above
D) None of the above

In the Q-learning algorithm, at each step choose the action a
which ________ the function Q(S,a)
A) Minimizes
B) Maximizes
C) Stabilizes
D) None of the above

Q learning is based on learning from


A) experience
B) model of the real world
C) experience and model
D) none

ANSWER : A

The data point is that the agent received the future value of
$r + \gamma V(s')$, where $V(s') = \max_{a'} Q(s', a')$; this is the
actual current reward plus the discounted estimated future value.
This new data point is called a ___________.
A) Return
B) Spatial
C) Global
D) Local
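
The quantity described in this question, the current reward plus the discounted estimate of future value, can be computed as below. The numbers, the learning rate alpha and the function names are hypothetical; the blended update shown is the common temporal-difference form, given here as an assumption rather than as the source's method.

```python
def td_target(reward, next_q_values, gamma):
    """r + gamma * V(s'), with V(s') = max over a' of Q(s', a')."""
    return reward + gamma * max(next_q_values)

def td_update(q_sa, reward, next_q_values, gamma=0.9, alpha=0.5):
    """Move the old estimate Q(s, a) part of the way toward the new data point."""
    target = td_target(reward, next_q_values, gamma)
    return q_sa + alpha * (target - q_sa)

# Hypothetical numbers: r = 1, Q(s', .) = [2, 5], old Q(s, a) = 3
target = td_target(1.0, [2.0, 5.0], gamma=0.9)
new_q = td_update(3.0, 1.0, [2.0, 5.0])
print(target, new_q)  # 5.5 4.25
```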

In Q-learning, the agent was in state s, it did action a, it
received reward r, and it went into state s'. This experience tuple
can be given as ___________
A) ⟨s, r, a, s'⟩
B) ⟨s', a, r, s⟩
C) ⟨s, a, r, s'⟩
D) None

Q-learning uses ____________ differences to estimate the value of
Q*(s,a).
A) spatial
B) temporal
C) both
D) None

ANSWER : B

As an example, consider the process of boarding a train, in which
the reward is measured by the negative of the total time spent
boarding (alternatively, the cost of boarding the train is equal
to the boarding time). One strategy is to enter the train doors as
soon as they open, minimizing the initial wait time for yourself.
If the train is crowded, however, then you will have a slow entry
after the initial action of entering the door, as people are
fighting you to depart the train as you attempt to board. The
total boarding time, or cost, is then the wait time plus the fight
time. Which is the better option for the above scenario?
A) 0 seconds wait time + 15 seconds fight time
B) 5 seconds wait time + 0 seconds fight time
C) Both
D) None

What are the advantages of biological neural networks (BNNs)
compared to conventional Von Neumann computers:
(i) BNNs have the ability to learn from examples.
(ii) BNNs have a high degree of parallelism.
(iii) BNNs require a mathematical model of the problem.
(iv) BNNs can acquire knowledge by “trial and error”.
(v) BNNs use a sequential algorithm to solve problems.

A) (i), (ii), (iii), (iv) and (v).
B) (i), (ii) and (iii).
C) (i), (ii) and (iv).
D) (i), (iii) and (iv).

Which of the following techniques can NOT be used for
pre-processing the inputs to an artificial neural network:
A) Normalization.
B) Winner-takes-all.
C) Fast Fourier Transform (FFT).
D) Principal component analysis (PCA).

Which of the following neural networks uses supervised learning?:


A) Self-organizing feature map (SOFM).
B) The Hopfield network.
C) Simple recurrent network (SRN).
D) All of the above answers.

Which of the following algorithms can be used to train a
single-layer feedforward network:
A) Hard competitive learning.
B) Soft competitive learning.
C) A genetic algorithm.
D) All of the above answers.

What is the credit assignment problem in a multi-layer feedforward
network:
A) The problem of adjusting the weights for the output units.
B) The problem of adapting the neighbours of the winning unit.
C) The problem of defining an error function for linearly
inseparable problems.
D) The problem of adjusting the weights for the hidden units.

Which of the following equations best describes the Generalized
Delta Rule with momentum?:
A) $\Delta w_{ji}(t+1) = \eta \delta_j x_i$
B) $\Delta w_{ji}(t+1) = \alpha \delta_j x_i$
C) $\Delta w_{ji}(t+1) = \eta \delta_j x_i + \alpha \Delta w_{ji}(t)$
D) $\Delta w_{ji}(t+1) = \eta \delta_j x_i + \alpha \delta_j x_i(t)$
where $\Delta w_{ji}(t)$ is the change to the weight from unit i to
unit j at time t, $\eta$ is the learning rate, $\alpha$ is the
momentum coefficient, $\delta_j$ is the error term for unit j, and
$x_i$ is the ith input to unit j.
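
To make the role of the symbols concrete, the sketch below computes a weight change made of a gradient term (learning rate times error term times input) plus a momentum term (momentum coefficient times the previous weight change). The numeric values and the function name momentum_step are hypothetical, chosen only for illustration.

```python
def momentum_step(eta, alpha, delta_j, x_i, prev_change):
    """Weight change: gradient term plus a fraction of the previous change."""
    return eta * delta_j * x_i + alpha * prev_change

# Hypothetical values: eta = 0.1, alpha = 0.9, delta_j = 0.5, x_i = 2.0
first = momentum_step(0.1, 0.9, 0.5, 2.0, prev_change=0.0)
second = momentum_step(0.1, 0.9, 0.5, 2.0, prev_change=first)
print(first, second)  # approximately 0.1 and 0.19
```

With the same error term and input on consecutive steps, the second change is larger than the first because part of the previous change is carried forward.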

One method for dealing with local minima is to use a committee of
networks. What does this mean:
A) A large number of different networks are trained and tested. The
network with the lowest sum squared error on a separate validation
set is chosen as the best network.
B) A large number of different networks are trained and tested. All
of the networks are used to solve the real-world problem by taking
the average output of all the networks.
C) A large number of different networks are trained and tested.
D) The networks are then combined together to make a network of
networks, which is biologically more realistic and computationally
more powerful than a single network.

What is the most general type of decision region that can be
formed by a feedforward network with NO hidden layers?:
A) Convex decision regions – for example, the network can
approximate any Boolean function.
B) Arbitrary decision regions – the network can approximate any
function (the accuracy of the approximation depends on the
number of hidden units).
C) Decision regions separated by a line, plane or hyperplane.
D) None of the above answers.

Which of the following statements is the best description of
overfitting:
A) The network becomes “specialized” and learns the training set
too well.
B) The network can predict the correct outputs for test examples
which lie outside the range of the training examples.
C) The network does not contain enough adjustable parameters
(e.g., hidden units) to find a good approximation to the unknown
function which generated the training data.
D) The network cannot predict the correct outputs.

Neural Networks:
A) Nerve cells in the brain are called neurons
B) The output from the neuron is called dendrite
C) One kind of neurons is called synapses
D) Learning takes place in the synapses

Multilayer perceptron network:


A) Is a neural network with several layers of nodes (or weights)
B) There are connections both between and within each layer
C) The number of units in each layer must be equal
D) Multiple layers of neurons do not allow for more complex
decision boundaries than a single layer

Backpropagation:
A) Is a learning algorithm for multilayer perceptron networks
B) Is applicable for testing
C) Is based on a gradient descent technique to maximize the mean
square difference between the desired and actual outputs
D) Is also applicable to self-organizing feature maps

Weight updates in Back propagation


A) Usually, the weights are initially set to 0
B) Are proportional to the difference between the desired and
actual outputs
C) The weight change is also proportional to the input to the
weight layer
D) All of the above

$f(x) = 1 / (1 + e^{-x})$


A) f(x) is called a sigmoid function
B) It is beneficial because it does not limit the output value
C) It is called an activation function and such a function is used
on every multilayer perceptron output
D) Is called a hyperbolic function
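
The behaviour of the function in this question is easy to inspect numerically. The sketch below evaluates 1 / (1 + e^-x) at a few sample inputs; the function name and the chosen inputs are illustrative assumptions.

```python
import math

def f(x):
    """The function from the question: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

samples = [-10, -1, 0, 1, 10]
values = [f(x) for x in samples]
for x, y in zip(samples, values):
    print(x, round(y, 5))
```

Every value falls strictly between 0 and 1, the values increase with x, and f(0) is exactly 0.5.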

What is the learning which addresses the question of how an
autonomous agent that senses and acts in the environment can learn
to choose optimal actions to achieve its goals?
A) Supervised
B) Unsupervised
C) Semi-supervised
D) Reinforcement

Reinforcement algorithm is used for


A) Market-basket analysis
B) Diagnosing a diabetic patient
C) To control a mobile robot
D) Identification of similar objects

To optimize operations in factories _________ learning is used.


A) Supervised
B) Reinforcement
C) Unsupervised
D) Semi-supervised

Each time the agent performs an action in its environment, a
trainer may provide a ________ to indicate the desirability of the
resulting state.
A) Reward or Penalty
B) Reward only
C) Reward and Penalty
D) Penalty only

An agent will receive a ________ reward when the game is won and
zero reward in all other states.
A) Positive
B) Negative
C) Either
D) Neither

____________ is one of the algorithms that can acquire optimal
control strategies from delayed rewards, even when the agent has
no prior knowledge of the effects of its actions on the
environment.

A) Q learning
B) Supervised Learning
C) Deep learning
D) None of the above

Reinforcement learning algorithms are related to dynamic
programming algorithms frequently used to solve _____________
problems.
A) Classification
B) Optimization
C) Association
D) Clustering

The TD-GAMMON program used ______________ learning to become a
world-class backgammon player.
A) Q learning
B) Supervised Learning
C) Deep learning
D) Reinforcement

The task of the agent is to learn a target function that maps the
__________ to ____________.
A) previous state, current state
B) previous state, optimal action
C) current state, optimal action
D) previous state, next state

As the trainer provides only a sequence of immediate reward values
as the agent executes its sequence of actions, the agent faces the
problem of ______________________
A) Temporary credit assignment
B) Temporal credit assignment
C) Partially observable state
D) Exploration

In an MDP, at each discrete time t, the agent senses the _________
state, chooses the current action and performs it.
A) Previous
B) Current
C) Previous and current
D) Current and future

The goal state G is


A) current state
B) future state
C) succeeding state
D) absorbing state

In a simple grid-world environment diagram, each grid square
represents _________, each arrow represents a _________.
A) penalty, action
B) distinct state, distinct action
C) reward, action
D) penalty, other action

Exploration means
A) gathering new information
B) optimizing the existing solutions
C) exploring unknown states and actions
D) both a & c

Which of the following techniques can NOT be used for
pre-processing the inputs to an artificial neural network:
A) Normalization.
B) Winner-takes-all.
C) Fast Fourier Transform (FFT).
D) Principal component analysis (PCA).
E) Deleting outliers from the training set.

Which of the following neural networks uses supervised learning?:


A) Self-organizing feature map (SOFM).
B) The Hopfield network.
C) Simple recurrent network (SRN).
D) All of the above answers.
E) None of the above answers.

Which of the following algorithms can be used to train a
single-layer feedforward network?:
A) Hard competitive learning.
B) Soft competitive learning.
C) A genetic algorithm.
D) All of the above answers.
E) None of the above answers.

What is Artificial intelligence


A) Putting your intelligence into Computer
B) Programming with your own intelligence
C) Making a Machine intelligent
D) Playing a Game

Which is the best approach for a game-playing problem


A) Linear approach
B) Heuristic approach
C) Random approach
D) Optimal approach
ANSWER: B

Which is not a commonly used programming language for AI
A) PROLOG
B) Java
C) LISP
D) Perl

In unsupervised learning
A) Specific output values are given
B) Specific output values are not given
C) No specific Inputs are given
D) Both inputs and outputs are given

A perceptron is a ________.
A) Feed-forward neural network
B) Back-propagation algorithm
C) Back-tracking algorithm
D) Feed Forward-backward algorithm

Neural networks are complex ________ with many
parameters
A) Linear Functions
B) Nonlinear Functions
C) Discrete Functions
D) Exponential Functions

What is the goal of artificial intelligence


A) To solve real-world problems
B) To solve artificial problems
C) To explain various sorts of intelligence
D) To extract scientific causes

Machine learning is
A) The autonomous acquisition of knowledge through the use of
computer programs
B) The autonomous acquisition of knowledge through the use of
manual programs
C) The selective acquisition of knowledge through the use of
computer programs
D) The selective acquisition of knowledge through the use of
manual programs

Factors which affect the performance of a learner system do not
include
A) Representation scheme used
B) Training scenario
C) Type of feedback
D) Good data structures

Perception involves
A) Sights, sounds, smell and touch
B) Hitting
C) Boxing
D) Dancing

What are the advantages of biological neural networks (BNNs)
compared to conventional Von Neumann computers?
(i) BNNs have the ability to learn from examples.
(ii) BNNs have a high degree of parallelism.
(iii) BNNs require a mathematical model of the problem.
(iv) BNNs can acquire knowledge by “trial and error”.
(v) BNNs use a sequential algorithm to solve problems.

A) (i), (ii), (iii), (iv) and (v)
B) (i), (ii) and (iii)
C) (i), (ii) and (iv)
D) (i), (iii) and (iv)
E) (i), (iv) and (v)

Which of the following algorithms can be used to train a
single-layer feedforward network?
A) Hard competitive learning
B) Soft competitive learning
C) A genetic algorithm
D) All of the above answers

What is the credit assignment problem in a multi-layer feedforward
network?
A) The problem of adjusting the weights for the output units.
B) The problem of adapting the neighbours of the winning unit.
C) The problem of defining an error function for linearly
inseparable problems.
D) The problem of adjusting the weights for the hidden units.

The gradient of a continuous and differentiable function
A) is zero at a minimum
B) is zero at a saddle point
C) decreases as you get closer to the minimum
D) All of the above
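
The statements in this question can be probed numerically. The sketch below estimates gradients by central differences for two hypothetical test functions, a bowl x^2 + y^2 (minimum at the origin) and a saddle x^2 - y^2 (saddle point at the origin); the function names and step size are assumptions.

```python
def numerical_gradient(f, point, h=1e-6):
    """Central-difference estimate of the gradient of f at the given point."""
    grad = []
    for i in range(len(point)):
        fwd = list(point)
        bwd = list(point)
        fwd[i] += h
        bwd[i] -= h
        grad.append((f(fwd) - f(bwd)) / (2 * h))
    return grad

def bowl(p):
    return p[0] ** 2 + p[1] ** 2    # minimum at (0, 0)

def saddle(p):
    return p[0] ** 2 - p[1] ** 2    # saddle point at (0, 0)

grad_min = numerical_gradient(bowl, [0.0, 0.0])
grad_saddle = numerical_gradient(saddle, [0.0, 0.0])
grad_far = numerical_gradient(bowl, [1.0, 1.0])
grad_near = numerical_gradient(bowl, [0.1, 0.1])
print(grad_min, grad_saddle, grad_far, grad_near)
```

For these two functions the gradient vanishes at both the minimum and the saddle point, and for the bowl its magnitude shrinks as the point moves toward the minimum.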

The computational complexity of gradient descent is
A) linear in D
B) linear in N
C) polynomial in D
D) dependent on the number of iterations

Reinforcement learning is all about
A) making decisions sequentially
B) decision made on the initial input
C) both of the above
D) none of the above

Practical applications of reinforcement learning
A) robotics for industrial automation
B) machine learning and data processing
C) create training systems that provide custom instruction
D) All of the above

RL can be used in large environments in the following situations
A) A model of the environment is known, but an analytic solution is
not available
B) Only a simulation model of the environment is given
C) The only way to collect information about the environment is to
interact with it
D) All of the above

An artificial neuron receives n inputs $x_1, x_2, x_3, \ldots, x_n$
with weights $w_1, w_2, \ldots, w_n$ attached to the input links. The
weighted sum ________ is computed to be passed on to a
non-linear filter F, called the activation function, to release
the output.
A) $\sum w_i$
B) $\sum x_i$
C) $\sum w_i + \sum x_i$
D) $\sum w_i x_i$

What is back propagation?
A) It is another name given to the curvy function in the perceptron
B) It is the transmission of error back through the network to
adjust the inputs
C) It is the transmission of error back through the network to
allow weights to be adjusted so that the network can learn
D) None of the mentioned

A 3-input neuron is trained to output a zero when the input is 110
and a one when the input is 111.
After generalization, the output will be zero when and only when
the input is
A) 000 or 110 or 011 or 101
B) 010 or 100 or 110 or 101
C) 000 or 010 or 110 or 100
D) 100 or 111 or 101 or 001

Which of the following is true?
Single layer associative neural networks do not have the ability
to:
(i) perform pattern recognition
(ii) find the parity of a picture
(iii) determine whether two or more shapes in a picture are
connected or not

A) (ii) and (iii) are true
B) (ii) is true
C) All of the mentioned
D) None of the mentioned

The process by which you become aware of messages through your
senses is called
A) Organization
B) Sensation
C) Interpretation-Evaluation
D) Perception

What is a perception check


A) a cognitive bias that makes us listen only to information we
already agree with
B) a method teachers use to reward good listeners in the classroom
C) any factor that gets in the way of good listening and decreases
our ability to interpret correctly
D) a response that allows you to state your interpretation and ask
your partner whether or not that interpretation is correct

What is used in determining the nature of the learning problem


A) Environment
B) Feedback
C) Problem
D) All of the mentioned

Which of the following is a characteristic of the best machine
learning method
A. Fast
B. Accuracy
C. Scalable
D. All of the mentioned

Supervised learning and unsupervised clustering both require at
least one
A. hidden attribute.
B. output attribute.
C. input attribute.
D. categorical attribute.

Data used to build a machine learning model
A. Validation data
B. Training data
C. Test data
D. Hidden data

Another name for an output attribute
A. Predictive variable
B. Independent variable
C. Estimated variable
D. Dependent variable

Computers are best at learning
A. Facts
B. Concepts
C. Procedures
D. Principles

Data used to optimize the parameter settings of a supervised
learner model
A. Training
B. Test
C. Verification
D. Validation

Machine learning techniques differ from statistical techniques in
that machine learning methods
A. typically assume an underlying distribution for the data
B. are better able to deal with missing and noisy data
C. are not able to explain their behaviour
D. have trouble with large sized data sets

Suppose your model is overfitting. Which of the following is NOT
a valid way to try and reduce the overfitting?
A. Increase the amount of training data
B. Improve the optimisation algorithm being used for error
minimisation
C. Decrease the model complexity.
D. Reduce the noise in the training data.

Two types of reinforcement learning
A. Positive
B. Negative
C. Both of the above
D. None of the above
