Ai3 1
SELECTION
Selecting: this process picks the node in the tree with the highest probability of winning. For example, consider child moves with win records 2/3, 0/1 and 1/2 under a first move scored 4/6; the node 2/3 has the highest probability of winning.
The selected node is reached by searching down from the current state of the tree, and it lies at the end of a branch. Since the selected node has the highest probability of winning, that path is also the most likely to reach the solution faster than the other paths in the tree.
EXPANSION
Expanding: once the best node has been selected, it is expanded by adding its child nodes (the moves available from that state) to the tree, so that previously unexplored moves can be evaluated.
SIMULATION
Simulating (exploring): since nobody knows which child/leaf of the group is the best, i.e. which move will perform best and lead to the correct answer down the tree, a simulation (random playout) is run from the expanded node. Suppose the simulation gives optimistic results for the node's future, so it receives a positive score of 1/1.
After all the nodes are updated (back-propagation), the loop begins again: selecting the best node in the tree → expanding the selected node → using RL to simulate exploration → back-propagating the updated scores → and finally selecting a new node further down the tree that is the actual required final winning result.
Conclusion
Instead of brute-forcing through millions of possible ways to find the right path, MCTS samples and evaluates only the most promising branches of the tree.
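The selection → expansion → simulation → back-propagation loop described above can be sketched in a few dozen lines. The sketch below is an illustrative toy implementation, not the exact algorithm of any particular system: the game (a pile of stones, each player takes 1 or 2, whoever takes the last stone wins) and the UCT exploration constant are our own assumptions.

```python
import math
import random

class Node:
    """A node in the MCTS tree for a toy game: a pile of stones, players
    alternately take 1 or 2 stones, and whoever takes the last stone wins."""
    def __init__(self, stones, player, parent=None, move=None):
        self.stones, self.player = stones, player   # player (0 or 1) to move
        self.parent, self.move = parent, move
        self.children = []
        self.wins, self.visits = 0, 0               # wins counted for parent's player
        self.untried = [m for m in (1, 2) if m <= stones]

def uct(node, c=1.4):
    # Selection policy: exploit a high win rate, but explore rarely visited nodes.
    return node.wins / node.visits + c * math.sqrt(math.log(node.parent.visits) / node.visits)

def rollout(stones, player):
    # Simulation: play uniformly random moves to the end; return the winner.
    while True:
        stones -= random.choice([m for m in (1, 2) if m <= stones])
        if stones == 0:
            return player
        player = 1 - player

def mcts(stones, player, iterations=2000):
    root = Node(stones, player)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend via UCT while the node is fully expanded.
        while not node.untried and node.children:
            node = max(node.children, key=uct)
        # 2. Expansion: add one untried move as a new child.
        if node.untried:
            m = node.untried.pop()
            child = Node(node.stones - m, 1 - node.player, node, m)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node (unless terminal).
        winner = node.parent.player if node.stones == 0 else rollout(node.stones, node.player)
        # 4. Back-propagation: update scores along the path back to the root.
        while node is not None:
            node.visits += 1
            if node.parent is not None and node.parent.player == winner:
                node.wins += 1
            node = node.parent
    return max(root.children, key=lambda c: c.visits).move
```

With a pile of 4 stones, `mcts(4, 0)` learns to take one stone, leaving the opponent at the losing pile size of 3.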
Stochastic Games
A stochastic game:
A repeated interaction between several participants in which the underlying state of the environment changes stochastically, in a way that depends on the decisions of the participants.
A strategy:
A rule that dictates how a participant in an interaction makes his
decisions as a function of the observed behavior of the other participants
and of the evolution of the environment.
An equilibrium:
A collection of strategies, one for each player, such that each player
maximizes (or minimizes, in case of stage costs) his evaluation of stage
payoffs given the strategies of the other players.
A correlated equilibrium:
An equilibrium in an extended game in which at the outset of the game
each player receives a private signal, and the vector of private signals is
chosen according to a known joint probability distribution. In the
extended game, a strategy of a player depends, in addition to past play, on
the signal he received.
This is a standard backgammon position. The object of the game is to get all of
one’s pieces off the board as quickly as possible. White moves in a clockwise
direction toward 25, while Black moves in a counterclockwise direction toward 0. A piece can advance to any position unless several opponent pieces occupy it; if exactly one opponent piece is there, it is captured and must start over. White has rolled a 6–5 and must choose among four legal moves: (5–10,5–11), (5–11,19–24), (5–10,10–16), and (5–11,11–16), where the notation (5–11,11–16) denotes moving one piece from position 5 to 11 and then another from 11 to 16.
Stochastic game tree for a backgammon position
White knows his or her own legal moves, but has no idea how Black will roll, and thus no idea what Black's legal moves will be.
That means White cannot build an ordinary game tree of the kind used in chess or tic-tac-toe. In backgammon, in addition to MAX and MIN nodes, a game tree must include chance nodes. The figure below depicts chance nodes as circles. The possible dice rolls are indicated by the branches leading from each chance node; each branch is labelled with the roll and its probability. There are 36 ways to roll two dice, each equally likely, yet there are only 21 distinct rolls because a 6–5 is the same as a 5–6. Each of the six doubles (1–1 through 6–6) has a probability of 1/36, so P(1–1) = 1/36. Each of the other 15 distinct rolls has a probability of 1/18.
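The roll probabilities above are easy to verify by enumeration, and a tree with chance nodes is evaluated by expectiminimax: MAX and MIN nodes take the best child as usual, while a chance node takes the probability-weighted average of its children. A small sketch (the tiny example tree at the end is made up for illustration):

```python
from fractions import Fraction

# Enumerate the 36 equally likely ordered rolls; (a, b) and (b, a) are the
# same backgammon roll, so doubles occur once (1/36) and others twice (1/18).
roll_prob = {}
for a in range(1, 7):
    for b in range(a, 7):
        roll_prob[(a, b)] = Fraction(1, 36) if a == b else Fraction(1, 18)

print(len(roll_prob))       # 21 distinct rolls
print(roll_prob[(1, 1)])    # 1/36
print(roll_prob[(5, 6)])    # 1/18

def expectiminimax(node):
    """node is a leaf value, ('max', children), ('min', children),
    or ('chance', [(probability, child), ...])."""
    if isinstance(node, (int, float, Fraction)):
        return node
    kind, children = node
    if kind == 'max':
        return max(expectiminimax(c) for c in children)
    if kind == 'min':
        return min(expectiminimax(c) for c in children)
    # chance node: expected value over the random outcomes
    return sum(p * expectiminimax(c) for p, c in children)

# MAX prefers a certain 3 over a fair gamble worth 0 on average.
tree = ('max', [3, ('chance', [(0.5, 10), (0.5, -10)])])
print(expectiminimax(tree))  # 3
```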
Game search algorithms have several limitations. Firstly, they rely on the assumption that the game is fully observable, deterministic, and has perfect information.
Secondly, their effectiveness is highly dependent on the complexity of the game
and the branching factor of the game tree.
Thirdly, they can be computationally expensive and time-consuming to run.
Fourthly, the assumption that players are rational may not always hold, leading
to suboptimal decisions.
Fifthly, game search algorithms cannot solve games beyond their rule-based
definitions and cannot cope with games that may incorporate random events.
Lastly, they may not always result in an optimal or even satisfactory solution as
they do not take into account human behaviour or intuition, which can
sometimes lead to unexpected outcomes.
3.7 Constraint Satisfaction Problems (CSP)
Constraint satisfaction problems (CSPs) are a class of AI problems whose goal is to find a solution that satisfies a set of constraints.
The aim of a constraint satisfaction problem is to find values for a group of variables that satisfy a set of constraints or rules.
CSPs are frequently employed in AI for tasks such as resource allocation, planning, scheduling, and decision-making.
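As an illustration, a CSP can be written down as variables, domains, and constraints, and solved by plain backtracking search. The sketch below uses a small made-up map-colouring instance (the region names and adjacencies are our own example, not from this text):

```python
# A CSP instance: variables, their domains, and "adjacent regions must
# receive different colours" constraints, on a made-up four-region map.
variables = ["A", "B", "C", "D"]
domains = {v: ["red", "green", "blue"] for v in variables}
neighbors = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}

def consistent(var, value, assignment):
    """A value is consistent if no already-assigned neighbour uses it."""
    return all(assignment.get(n) != value for n in neighbors[var])

def backtrack(assignment):
    """Plain backtracking search: assign variables one at a time and
    undo any assignment that leads to a dead end."""
    if len(assignment) == len(variables):
        return assignment
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        if consistent(var, value, assignment):
            assignment[var] = value
            if backtrack(assignment):
                return assignment
            del assignment[var]   # dead end: undo and try the next value
    return None

solution = backtrack({})
print(solution)
```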
2) Forward-checking algorithm
The forward-checking algorithm is a variation of the backtracking algorithm that prunes the search space using a form of local consistency.
For each unassigned variable, the method keeps a set of remaining values and applies local constraints to eliminate inconsistent values from these sets. Whenever a variable is assigned a value, the algorithm examines that variable's neighbours to see whether any of their remaining values have become inconsistent, and removes them from the sets if they have. If, after forward checking, a variable has no remaining values, the algorithm backtracks.
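A minimal sketch of the idea, assuming a variables/domains/neighbours representation like the map-colouring one above (the tiny graphs at the end are made up for illustration). After every assignment the value is pruned from each neighbour's remaining values, and an emptied domain triggers backtracking:

```python
import copy

def forward_check(domains, var, value, neighbors):
    """Prune value from each neighbour's remaining values; return the
    pruned domains, or None if some domain is wiped out."""
    new_domains = copy.deepcopy(domains)
    new_domains[var] = [value]
    for n in neighbors[var]:
        if value in new_domains[n]:
            new_domains[n] = [v for v in new_domains[n] if v != value]
            if not new_domains[n]:
                return None          # wipe-out: this assignment cannot work
    return new_domains

def search(domains, assignment, variables, neighbors):
    """Backtracking search augmented with forward checking."""
    if len(assignment) == len(variables):
        return assignment
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        pruned = forward_check(domains, var, value, neighbors)
        if pruned is not None:       # recurse only when no domain was emptied
            result = search(pruned, {**assignment, var: value}, variables, neighbors)
            if result:
                return result
    return None                      # no value worked: go backward (backtrack)

# A triangle cannot be 2-coloured; forward checking detects the dead ends early.
tri = {"X": ["Y", "Z"], "Y": ["X", "Z"], "Z": ["X", "Y"]}
doms = {v: ["red", "green"] for v in tri}
print(search(doms, {}, list(tri), tri))   # None
```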
Constraint-propagation algorithms are a class of algorithms that use local consistency and inference to prune the search space. They operate by propagating constraints between variables and using the information obtained to remove inconsistent values from the variable domains.
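A well-known example of constraint propagation is arc consistency (AC-3). The sketch below is a minimal version; the two-variable `X < Y` example at the end is made up for illustration:

```python
from collections import deque

def revise(domains, x, y, allowed):
    """Remove values of x that have no supporting value in y's domain."""
    removed = False
    for vx in domains[x][:]:
        if not any(allowed(vx, vy) for vy in domains[y]):
            domains[x].remove(vx)
            removed = True
    return removed

def ac3(domains, constraints):
    """AC-3: constraints maps each directed arc (x, y) to a predicate
    allowed(vx, vy); propagate until every arc is consistent."""
    queue = deque(constraints)
    while queue:
        x, y = queue.popleft()
        if revise(domains, x, y, constraints[(x, y)]):
            if not domains[x]:
                return False          # an empty domain means no solution
            # x's domain shrank, so every arc ending at x must be rechecked
            queue.extend(arc for arc in constraints if arc[1] == x and arc[0] != y)
    return True

# Example: X, Y ∈ {1, 2, 3} with the constraint X < Y.
domains = {"X": [1, 2, 3], "Y": [1, 2, 3]}
constraints = {("X", "Y"): lambda a, b: a < b,
               ("Y", "X"): lambda a, b: a > b}
ac3(domains, constraints)
print(domains)   # {'X': [1, 2], 'Y': [2, 3]}
```

Propagation alone removed 3 from X (no Y can exceed it) and 1 from Y (no X is below it) before any search is run.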