MCA part 3 paper 21
Email-Id: kiranpandey.nou@gmail.com
INTRODUCTION
Searching is the universal technique of problem solving in AI. There are some single-player
games such as tile games, Sudoku, crossword, etc. The search algorithms help you to search
for a particular position in such games. We will also discuss the concept of problem reduction
and constraint satisfaction in this unit. This will give an insight into various searching techniques
and the use of AI in applying them.
TYPES OF SEARCHES
There are two types of search, based on whether they use information about the goal.
(i) Uninformed search (ii) Informed Search
Uninformed Search
This type of search does not use any domain knowledge. This means that it does not use any
information that helps it reach the goal, such as the closeness or location of the goal. The
strategies or algorithms using this form of search ignore where they are going until they find a
goal and report success.
Uninformed search, also called blind search, is a class of general-purpose search algorithms
that operate in a brute-force way. These algorithms can be applied to a wide variety of search
problems, but since they do not take any knowledge of the target problem into account, they can
be very inefficient. They are the simplest strategies, as they do not need any domain-specific
knowledge, and they work fine when the number of possible states is small.
• BFS (Breadth-First Search): expands the shallowest node (the node at the lowest depth) first.
• DFS (Depth-First Search): expands the deepest node in the current frontier first.
• DLS (Depth-Limited Search): DFS with a predetermined depth limit.
• IDDFS (Iterative Deepening Depth-First Search): depth-limited search with an increasing limit.
• UCS (Uniform Cost Search): expands the node with the least path cost (the cost of the path
used to reach it).
• Bidirectional Search: searches forward from the initial state and backward from the goal state.
Breadth-First Search (BFS)
It is a simple strategy in which the root node is expanded first, then all of its successor nodes are
expanded next, then their successors, and so on. Thus all the nodes at a given depth in the
search tree are expanded before any node at the next level is expanded, and the search proceeds
level by level until the solution is found. This method uses a First-In-First-Out (FIFO) queue,
ensuring that the nodes that are visited first will be expanded first.
Disadvantage: The memory requirement is high. Since every level of nodes must be saved in
order to create the next one, it consumes a lot of memory; the space needed to store nodes grows
exponentially with depth. For example, for a problem whose shallowest solution lies at depth 12,
breadth-first search can take on the order of 35 years of computing time under typical
assumptions about node-generation speed.
If the branching factor (the average number of child nodes of a node) is b and the depth is d, then
the number of nodes at level d is b^d. The total number of nodes created in the worst case is
b + b^2 + b^3 + … + b^d, which is O(b^d). Its complexity therefore depends on the number of
nodes. BFS can check for duplicate nodes.
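To make the FIFO-queue idea concrete, here is a minimal Python sketch of BFS over an explicitly
stored graph; the toy graph, the neighbors function and the goal test are illustrative assumptions,
not part of the original text.

from collections import deque

def bfs(start, goal, neighbors):
    # Breadth-first search: expand the shallowest node first using a FIFO queue.
    frontier = deque([[start]])              # queue of paths, shallowest first
    visited = {start}                        # duplicate-node check
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path                      # solution with the fewest steps
        for nxt in neighbors(node):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None                              # no solution exists

# Illustrative toy graph (an assumption for this example)
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D', 'E'], 'D': ['F'], 'E': ['F'], 'F': []}
print(bfs('A', 'F', lambda n: graph[n]))     # ['A', 'B', 'D', 'F']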
Depth-First Search (DFS)
It always expands the deepest node in the current frontier, following one path from the root down
to a leaf before backtracking. As the nodes on a single path from the root to a leaf are stored at
each iteration, the space requirement is linear: with branching factor b and maximum depth m,
the storage space is O(bm).
Advantages: DFS has very modest memory requirements. It needs to store only a single path
from the root to a leaf node, along with the remaining unexpanded sibling nodes for each node
on the path. Once a node has been expanded, it can be removed from memory as soon as all of
its descendants have been fully explored.
Disadvantage: The algorithm may not terminate and can go on infinitely along one path. The
solution to this issue is to choose a cut-off depth. If the ideal cut-off is d and the chosen cut-off is
less than d, the algorithm may fail to find the goal; if the chosen cut-off is greater than d, the
execution time increases. Its complexity depends on the number of paths, and it cannot check
for duplicate nodes.
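As a rough sketch of how DFS keeps only the current path in memory, the following Python
function (over the same kind of assumed graph interface as before) explores one path fully before
backtracking; the check against the current path is one simple way to avoid cycling forever along
a single path.

def dfs(node, goal, neighbors, path=None):
    # Depth-first search: follow one path to a leaf before backtracking.
    # Only the current root-to-node path is kept, so space grows linearly.
    path = (path or []) + [node]
    if node == goal:
        return path
    for nxt in neighbors(node):
        if nxt not in path:                  # avoid cycling along the current path
            result = dfs(nxt, goal, neighbors, path)
            if result is not None:
                return result
    return None

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D', 'E'], 'D': ['F'], 'E': ['F'], 'F': []}
print(dfs('A', 'F', lambda n: graph[n]))     # e.g. ['A', 'B', 'D', 'F']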
Depth-Limited Search (DLS)
It is DFS with a limit on the depth: at depth l, nodes are treated as if they have no successors.
This solves the infinite-path problem. DFS can be viewed as a special case of depth-limited
search with l = ∞. DLS also introduces an additional source of incompleteness if we choose
l < d, that is, if the shallowest goal is beyond the depth limit. DLS will also be non-optimal if we
choose l > d. Its time complexity is O(b^l) and its space complexity is O(bl).
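A minimal sketch of depth-limited search, assuming the same neighbors-function interface as
the earlier examples; nodes at the depth limit are simply treated as having no successors.

def depth_limited_search(node, goal, neighbors, limit):
    # DFS with a cut-off: at depth 'limit' a node is treated as a leaf.
    if node == goal:
        return [node]
    if limit == 0:
        return None                          # cut off here
    for nxt in neighbors(node):
        result = depth_limited_search(nxt, goal, neighbors, limit - 1)
        if result is not None:
            return [node] + result
    return None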
Iterative Deepening Depth-First Search (IDDFS)
It is a general strategy, often used in combination with DFS, that finds the best depth limit. It is
DFS with an increasing depth limit, and it combines the benefits of DFS and BFS: like DFS its
memory requirements are very modest, O(bd), but like BFS it is complete when the branching
factor is finite. It is optimal when the path cost is a non-decreasing function of the depth of the
node.
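A sketch of iterative deepening, again over an assumed neighbors-function interface: a
depth-limited search is simply restarted with limits 0, 1, 2, … so that the goal is found at the
shallowest depth at which it exists.

def iterative_deepening_search(start, goal, neighbors, max_depth=50):
    # Repeat depth-limited DFS with an increasing limit.
    def dls(node, limit):
        if node == goal:
            return [node]
        if limit == 0:
            return None
        for nxt in neighbors(node):
            result = dls(nxt, limit - 1)
            if result is not None:
                return [node] + result
        return None

    for limit in range(max_depth + 1):
        result = dls(start, limit)
        if result is not None:
            return result                    # found at the shallowest possible depth
    return None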
Uniform Cost Search (UCS)
It expands the node with the least path cost. It is identical to BFS when all step costs are equal.
UCS does not care about the number of steps a path has, but only about its total cost; therefore
it will get stuck in an infinite loop if it ever expands a node that has a zero-cost action leading
back to the same state. We can guarantee completeness provided the cost of every step is
greater than or equal to some small positive constant c. This condition is also sufficient to
ensure optimality, because it means that the cost of a path always increases as we go along it.
UCS is guided by path costs rather than depths, so its complexity cannot easily be characterized
in terms of b and d.
Disadvantage: There can be many long paths with cost ≤ C* (the cost of the optimal solution),
and uniform cost search must explore them all.
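A sketch of uniform cost search using a priority queue ordered by path cost; the successors
function, which is assumed to yield (neighbor, step_cost) pairs with strictly positive costs, is an
illustrative interface rather than anything from the original text.

import heapq

def uniform_cost_search(start, goal, successors):
    # Expand the frontier node with the least path cost g(n).
    frontier = [(0, start, [start])]         # (path cost, node, path)
    best_cost = {start: 0}
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path                # cheapest path, not necessarily the shortest
        if cost > best_cost.get(node, float('inf')):
            continue                         # stale queue entry, cheaper path already found
        for nxt, step in successors(node):
            new_cost = cost + step
            if new_cost < best_cost.get(nxt, float('inf')):
                best_cost[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, nxt, path + [nxt]))
    return None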
Bidirectional Search
It searches forward from initial state and backward from goal state till both meet to identify a
common state. The path from initial state is concatenated with the inverse path from the goal
state. Each search is done only up to half of the total path. The motivation is that
b^(d/2) + b^(d/2) is much less than b^d. The time and space complexity of bidirectional search is
O(b^(d/2)). The algorithm
is complete and optimal if both searches are BFS; other combinations may sacrifice
completeness, optimality or both.
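The sketch below runs one BFS from the start and one from the goal and stops as soon as a state
appears in both searches; it assumes the actions are reversible, so the same neighbors function
can be used in both directions, which is an assumption for this example.

from collections import deque

def bidirectional_search(start, goal, neighbors):
    # Alternate BFS steps from both ends until the frontiers share a common state.
    if start == goal:
        return [start]
    parents_f, parents_b = {start: None}, {goal: None}   # parent pointers per direction
    frontier_f, frontier_b = deque([start]), deque([goal])

    def join(meet):
        forward = []                         # path from start to the meeting state ...
        n = meet
        while n is not None:
            forward.append(n)
            n = parents_f[n]
        forward.reverse()
        n = parents_b[meet]                  # ... concatenated with the inverse path to the goal
        while n is not None:
            forward.append(n)
            n = parents_b[n]
        return forward

    while frontier_f and frontier_b:
        for frontier, parents, other in ((frontier_f, parents_f, parents_b),
                                         (frontier_b, parents_b, parents_f)):
            node = frontier.popleft()
            for nxt in neighbors(node):
                if nxt not in parents:
                    parents[nxt] = node
                    if nxt in other:         # common state found by both searches
                        return join(nxt)
                    frontier.append(nxt)
    return None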
The time and space complexities of these strategies can be summarized as follows (b = branching
factor, d = depth of the shallowest solution, m = maximum depth, l = depth limit):

Strategy                Time          Space
BFS                     O(b^d)        O(b^d)
DFS                     O(b^m)        O(bm)
DLS                     O(b^l)        O(bl)
IDDFS                   O(b^d)        O(bd)
Bidirectional Search    O(b^(d/2))    O(b^(d/2))
Informed Search
Informed (heuristic) search uses problem-specific knowledge in the form of a heuristic function:
h(n) = estimated cost of the cheapest path from node n to a goal node.
• Greedy search (best-first search): expands the node that appears to be closest to the goal, i.e.
the node with the lowest h(n).
• A* search: minimizes the total estimated solution cost f(n) = g(n) + h(n), which includes the
cost of reaching a state and the estimated cost of reaching a goal from that state.
A* Search
The A* algorithm generates all successor nodes and computes an estimate of the distance (cost)
from the start node to a goal node through each of the successors. It then chooses the
successor with the shortest estimated distance for expansion. The successors of this node are
then generated, their distances estimated, and the process continues until a goal is found or the
search ends in failure.
The A* algorithm is complete, and when the heuristic h is admissible (it never overestimates the
true cost) it is also optimal. Thus A* will always find an optimal path if one exists. The efficiency
of the A* algorithm depends on how closely h approximates the true cost h* and on the cost of
computing f.
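A sketch of A* under the assumption that successors(n) yields (neighbor, step_cost) pairs and
that h is an admissible heuristic supplied by the caller; it minimizes f(n) = g(n) + h(n) using a
priority queue.

import heapq

def a_star_search(start, goal, successors, h):
    # Expand the node with the smallest estimate f(n) = g(n) + h(n).
    frontier = [(h(start), 0, start, [start])]   # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float('inf')):
            continue                             # stale entry, cheaper path already found
        for nxt, step in successors(node):
            new_g = g + step
            if new_g < best_g.get(nxt, float('inf')):
                best_g[nxt] = new_g
                heapq.heappush(frontier, (new_g + h(nxt), new_g, nxt, path + [nxt]))
    return None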
Hill-Climbing Search
It is like DFS in that the most promising node is selected for expansion. It is an iterative algorithm
that starts with an arbitrary solution to a problem and attempts to find a better solution by
incrementally changing a single element of the solution. If the change produces a better
solution, the changed solution is taken as the new current solution. This process is repeated
until there are no further improvements. An example of hill climbing is shown below:
Figure 6: Hill-Climbing
Advantages: Hill climbing can produce substantial savings over blind searches when an
informative, reliable function is available to guide the search to a global goal. It suffers from
some serious drawbacks when this is not the case.
Disadvantage: This algorithm is neither complete nor optimal. Potential problem types, named
after certain terrestrial anomalies, are the foothill, the ridge, and the plateau.
A foothill trap results when a local maximum or peak is reached. In this case the children all have
less promising goal distances than the parent node, so the search is essentially trapped at the
local node. Possible remedies include moving in some arbitrary direction for a few generations in
the hope that the real goal direction will become evident, backtracking to an ancestor node and
trying a secondary path choice, or altering the computation procedure to expand ahead a few
generations each time before choosing a path.
A ridge occurs when several adjoining nodes have higher values than the surrounding nodes,
whereas a plateau occurs when all the neighboring nodes in an area of the search space have
the same value.
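The following sketch makes the incremental-improvement loop concrete: neighbors(s) generates
the solutions reachable by changing a single element of s, and value(s) is the guiding function;
both are assumptions supplied by the caller, and the loop simply stops when no neighbor is
better, which is exactly where foothills, ridges and plateaus cause trouble.

def hill_climbing(initial, neighbors, value, max_steps=1000):
    # Move to the best neighboring solution until no neighbor improves.
    current = initial
    for _ in range(max_steps):
        candidates = neighbors(current)
        if not candidates:
            break
        best = max(candidates, key=value)
        if value(best) <= value(current):
            return current                   # local maximum, ridge or plateau
        current = best
    return current

# Illustrative one-dimensional objective (an assumption): maximize -(x - 3)^2
print(hill_climbing(0, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2))   # 3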
PROBLEM REDUCTION
When a problem can be divided into a set of sub-problems, each sub-problem can be solved
separately and a combination of their solutions gives a solution to the original problem. AND-OR
graphs, searched with the AO* algorithm, are used for representing such solutions. The
decomposition of the problem, or problem reduction, generates AND arcs. One AND arc may
point to any number of successor nodes, all of which must be solved in order for the arc to point
to a solution. Several arcs may also emerge from a single node, indicating several possible ways
of solving the problem. Hence the graph is known as an AND-OR graph rather than simply an
AND graph. The figure below shows an AND-OR graph:
The above graph represents the search space for solving the problem P, using the goal-
reduction methods:
(i) P if Q and R, (ii) P if S, (iii) Q if T, (iv) Q if U
The algorithm is a variation of the original given by Nilsson (1971). It requires that nodes
traversed in the tree be labeled as solved or unsolved in the solution process to account for
AND node solutions which require solutions to all successor nodes. A solution is found when
the start node is labeled as solved.
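As a small illustration of the solved/unsolved labelling, the sketch below encodes the
goal-reduction rules listed above (P if Q and R, P if S, Q if T, Q if U) as a dictionary of
OR-alternatives, each alternative being a list of AND-successors; the set of primitively solved
nodes passed in as facts is an assumption for the example, and the rules are taken to be acyclic.

def solved(node, rules, facts):
    # A node is solved if it is a primitive (a known fact), or if some alternative
    # (OR choice) has all of its AND-successors solved.
    if node in facts:
        return True
    return any(all(solved(sub, rules, facts) for sub in alternative)
               for alternative in rules.get(node, []))

# Goal-reduction rules from the text: P if Q and R, P if S, Q if T, Q if U
rules = {'P': [['Q', 'R'], ['S']], 'Q': [['T'], ['U']]}
print(solved('P', rules, facts={'U', 'R'}))   # True: Q is solved via U, so Q and R solve P
print(solved('P', rules, facts={'T'}))        # False: neither R nor S can be solved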
AO* algorithm
CONSTRAINT SATISFACTION
Many problems in AI can be viewed as constraint satisfaction problems, in which the goal is to
find a state that satisfies a given set of constraints. The following examples are instances of this
pattern.
Example 1: The N-queens problem: place N queens on an N x N chessboard so that no two
queens attack each other, i.e. no two share a row, column or diagonal. The expected output is a
binary matrix which has 1s for the blocks where queens are placed. For example, the following is
the output matrix for a 4-queens solution:
{0, 1, 0, 0}
{0, 0, 0, 1}
{1, 0, 0, 0}
{0, 0, 1, 0}
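A backtracking sketch that produces exactly this kind of 0/1 matrix: queens are placed row by
row, and a placement is undone as soon as it cannot be extended to a full solution.

def solve_n_queens(n):
    # Place one queen per row; check columns and both diagonals against earlier rows.
    board = [[0] * n for _ in range(n)]

    def safe(row, col):
        for r in range(row):
            c = board[r].index(1)            # column of the queen already placed in row r
            if c == col or abs(c - col) == abs(r - row):
                return False
        return True

    def place(row):
        if row == n:
            return True
        for col in range(n):
            if safe(row, col):
                board[row][col] = 1
                if place(row + 1):
                    return True
                board[row][col] = 0          # undo and try the next column
        return False

    return board if place(0) else None

for row in solve_n_queens(4):
    print(row)                               # a valid 4-queens placement (matches the matrix above)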
Example 2: A map coloring problem: We are given a map, i.e. a planar graph, and we are told
to color it using k colors, so that no two neighboring countries have the same color.
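A minimal backtracking sketch of map coloring; the adjacency map of regions and the list of
k colors below are illustrative assumptions.

def color_map(neighbors, colors):
    # Assign each region a color different from all already-colored neighbors.
    regions = list(neighbors)
    assignment = {}

    def assign(i):
        if i == len(regions):
            return True
        region = regions[i]
        for color in colors:
            if all(assignment.get(nb) != color for nb in neighbors[region]):
                assignment[region] = color
                if assign(i + 1):
                    return True
                del assignment[region]       # backtrack
        return False

    return assignment if assign(0) else None

# Illustrative planar adjacency (an assumption), colored with k = 3 colors
neighbors = {'A': ['B', 'C'], 'B': ['A', 'C'], 'C': ['A', 'B', 'D'], 'D': ['C']}
print(color_map(neighbors, ['red', 'green', 'blue']))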
Example 4: The Boolean satisfiability problem (SAT) is a decision problem considered in
complexity theory. An instance of the problem is a Boolean expression written using only AND,
OR, NOT, variables, and parentheses. The question is: given the expression, is there some
assignment of TRUE and FALSE values to the variables that will make the entire expression
true? In mathematics, a formula of propositional logic is said to be satisfiable if truth-values can
be assigned to its variables in a way that makes the formula true. The class of satisfiable
propositional formulas is NP-complete. The propositional satisfiability problem (SAT), which
decides whether or not a given propositional formula is satisfiable, is of central importance in
various areas of computer science, including theoretical computer science, algorithmics, artificial
intelligence, and hardware design and verification.
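The decision question is easy to state in code: the brute-force sketch below tries every
assignment of TRUE and FALSE to the variables; the particular formula used is an illustrative
assumption.

from itertools import product

def satisfiable(formula, variables):
    # Try all 2^n assignments; return a satisfying one if it exists.
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if formula(assignment):
            return assignment
    return None                              # unsatisfiable

# Illustrative formula (an assumption): (x OR y) AND (NOT x OR z) AND (NOT y OR NOT z)
formula = lambda v: (v['x'] or v['y']) and (not v['x'] or v['z']) and (not v['y'] or not v['z'])
print(satisfiable(formula, ['x', 'y', 'z']))   # e.g. {'x': True, 'y': False, 'z': True}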
Example 5: A crypt-arithmetic problem: In the following pattern
SEND
+MORE
=========
MONEY
We have to replace each letter by a distinct digit so that the resulting sum is correct. All these
examples and other real life problems like time table scheduling, transport scheduling, floor
planning etc. are instances of the same pattern.
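A simple (if slow) brute-force sketch for the crypt-arithmetic puzzle above: it tries every
assignment of distinct digits to the eight letters and keeps the one that makes the sum correct,
rejecting assignments that give a leading zero.

from itertools import permutations

def solve_send_more_money():
    letters = 'SENDMORY'                     # the eight distinct letters in the puzzle
    for digits in permutations(range(10), len(letters)):
        a = dict(zip(letters, digits))
        if a['S'] == 0 or a['M'] == 0:
            continue                         # leading digits may not be zero
        send = int(''.join(str(a[c]) for c in 'SEND'))
        more = int(''.join(str(a[c]) for c in 'MORE'))
        money = int(''.join(str(a[c]) for c in 'MONEY'))
        if send + more == money:
            return send, more, money
    return None

print(solve_send_more_money())               # (9567, 1085, 10652)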
QUESTIONS
1. Discuss different types of un-informed search.
2. Discuss various heuristic search techniques.
3. Compare and contrast between DFS and BFS.
4. Describe Hill climbing search method with an example.
5. What is Best First Search?
6. Discuss A* search algorithm.
7. What are the desirable properties of heuristic search algorithms? Explain.
8. Explain AO* algorithm.
SUGGESTED READINGS
1. Introduction to Artificial Intelligence and Expert Systems by Dan W. Patterson.
2. Artificial Intelligence: A Modern Approach by Stuart Russell and Peter Norvig.