Decision Making IV



BRAIN-INSPIRED MODEL FOR DECISION-MAKING IN THE SELECTION OF

BENEFICIAL INFORMATION AMONG SIGNALS RECEIVED BY AN

UNPREDICTABLE INFORMATION-DEVELOPMENT ENVIRONMENT

Alireza Asgari,1 Yvan Beauregard 1


1Mechanical Engineering Department, École de Technologie Supérieure, Montréal, Canada.

ABSTRACT

With its diversification in products and services, today’s marketplace makes competition wildly dynamic and unpredictable

for industries. In such an environment, daily operational decision-making has a vital role in producing value for products

and services while avoiding the risk of loss and hazard to human health and safety. However, it accounts for a large portion of

operational costs for industries. The main reason is that decision-making belongs to the operational tasks dominated by

humans. Less involvement of humans, as a less controllable entity, in industrial operations could also be favorable for

improving workplace health and safety. To this end, artificial intelligence is proposed as an alternative to doing human

decision-making tasks. Still, some of the functional characteristics of the brain that allow humans to make decisions in

unpredictable environments like the current industry, especially knowledge generalization, are challenging for artificial

intelligence. To find an applicable solution, we study the principles that underlie the human brain functions in decision-

making. The relative base functions are realized to develop a model in a simulated unpredictable environment for a

decision-making system that could decide which information is beneficial to choose. The method used to build our

model's neuronal interactions is unique in that it aims to mimic some simple functions of the brain in decision-making. It has

the potential to be developed for systems acting at higher abstraction levels and complexities in real-world environments.

This system and our study will help to integrate more artificial intelligence into industrial operations and settings. The more

successful the implementation of artificial intelligence, the steeper the decrease in operational costs and risks.

KEYWORDS: Artificial intelligence, Decision-making, Brain-inspired systems, Unpredictable environments,

Dynamic tasks, Knowledge generalization



1. INTRODUCTION

Today’s marketplace is seeing more and more diversification in products and services, a situation that has

generated unprecedented competition for industries. The unpredictability and dynamism of such competition have

forced industries to confront the complex operational environments that have emerged. One of the main challenges in such

operational environments is managing tasks like decision-making, which, despite growing industrial automation,

require more human involvement in the process. This has caused higher operating costs and a less controllable

environment. To some extent, the industry has mitigated the resulting cost by deploying artificial intelligence

(AI) in operational systems [1]. However, current AI systems are limited by algorithmic complexity, unpredictable

behaviors, and difficulty working at abstract levels [2, 3]. They lack the ability to adapt to changing environments,

generalize knowledge, and do causal reasoning [2, 4]. They cannot generalize acquired knowledge for any further

changes in the environment that transcend their limitations [5].

AI’s two foremost current platforms are reinforcement learning (RL) and deep learning (DL). RL algorithms have been

the base procedures in decision-making problems for decades. They execute predefined symbolic reasoning to

process the environment and classify actions [6]. On this platform, an agent chooses an action from an action-policy

for a state of a predefined environment; the agent then receives the related reward and adjusts the choice [7]. Here, a

state is a segment of a path toward a given goal and is valued against that goal. The RL platform has been applied to

robotics, data processing, business strategy planning, training systems, and aircraft control. On the other side, the DL

platform provides a system with the concept of neural networking to recognize a detected state of an environment

through its predefined states for that environment. It has entered recent AI investments in image, text, and

pattern recognition, object detection, and bioinformatics applications [8]. A system on the DL platform tries, through

a neural network, to correlate the output result with input data even if the correlation does not follow any logic. This platform lacks the

applicable features for the abstraction of the relationships between environmental states. Meanwhile, the symbolic

manipulation characteristic of the RL platform decreases the responsiveness to unknown and unpredictable

environments, and a state cannot be abstracted from its environment. The literature presents some combinative

approaches on both platforms as a solution for the discussed issues. However, how to overcome

the discussed problem is still an open debate [9].

An AI system could function in an unpredictable environment if it learns from experience and generalizes

knowledge to choose an alternative and take action [10]. This is what the brain does in decision-making tasks. The

brain could separately abstract a state's features or its interrelation with the other states of an environment. The brain

does not make complex decisions by calculation but through cooperation between the neurobiological,

neuro-structural, neurochemical, and psychological mechanisms [11]. It makes adequate, not perfect, decisions based on

imperfect information received [12]. It realizes data rapidly most of the time [12, 13] in an episode of sequential

actions [14] within a vast decentralized network of neurons [15, 16] through unconscious layers [17]. Here, the

key to success is knowledge generalization and learning through the abstraction of similarities between experienced

states’ features and their interrelations. The brain receives information from its sensory system. If it discovers

similarity in some aspects between the detected information and any records of experience, it generalizes that for the

realization of the new state [17, 18]. Hence, to overcome this problem, we aim to model a decision-making system

inspired by the brain function to generalize knowledge, learn from experiences, and improve knowledge on imperfect

information in unpredictable environments. The system seeks benefits in the unpredictable states of a simulated

information development environment. However, there are some key questions before building this system that we

should answer: How does the brain find similarities between experienced states and generalize their values for new

states? What is the method it uses to judge values and make decisions? Finally, what is its fundamental function to

learn from experiencing a new state and to value similar experienced states?

We argue that an intelligent decision-making system generalizes knowledge like the brain if it clusters neurons to

classify common valued-patterns of information features received by a sensory gate. It should realize similarities

between experienced states at different levels, from the least to the most similar, so that it can share values even

in environments with scarce information. With the proposed classification, an emerged common pattern represents a

set of states detected by the information sensory gate and carries their similar features and their values for the

realization of any new similar states. That pattern shares the values achieved by a state between its commonality

cohort. To this end, a neuron in a neural cluster abstracts that common pattern. That neuron then participates in

making decisions about those states according to values achieved by the abstracted pattern. And any new value

received reframes the pattern values.

To evaluate our claims, we divided our study into six sections. The first section introduces the problem with a brief

background and outline of the study. Section two investigates the principles underlying the processes of learning and

generalizing knowledge through experiences and updating alternatives values for decision-making in the brain. The

derived principles form the basis for structuring the important aspects of an intelligent decision-making model. In the

third section, we realize the development steps of the model based on two outstanding methods of the brain: selecting the best and worst

alternatives in value-add achievements, and revaluating those values with new achievements. Section four

describes the experiments, which verify our system in knowledge generalization and learning from

experience. The results of our experiments are presented in the fifth section, and we conclude with our

analysis of our results and suggestions for future work in section six.

2. BRAIN FUNCTION PRINCIPLES FOR DECISION-MAKING

Studies show that the brain uses simple principles to interpret information signals received on its sensory systems [19,

20]. Helmholtz (1867) discussed that the brain interprets the sensory data in the convergence of information flow

from the brain’s accumulated knowledge [21]. For example, the brain interpolates knowledge from the visual cortex

into the data stream coming through the retina to determine the final visual information [22]; then, through the

realized visual information, an action could be initiated [23]. The familiarity between the received information

(current state) and knowledge (those experienced) is extracted for selecting a proper alternative [12, 22]. The brain

regulates the motivation values applied to alternatives by setting the loss and likelihood of achievement [19, 24]. In

this way, different neurotransmitters (chemical messengers in the brain) are associated with the motivation and

nurturing of the excitatory and inhibitory neuronal circuits to locate beneficial information, support alternatives for

action, and develop knowledge [25]. In a neuronal circuit, a neuron activates the next neurons in the path when it

reaches its activation threshold through previous neurons' applied activation. The excitatory and inhibitory

connections to a neuron determine when it activates the next neurons in a neuronal circuit [26]. Those excitatory and

inhibitory neuronal circuits are cultivated between neuronal clusters (grouped same-characteristic neurons) in the

brain [27]. Every neuron here links to several neurons [28], with an average of 255 ± 13 connections [29].

Normally, in the decision-making process, each circuit supports a definite choice [27]. The winner circuit induces

turning off the other active circuits [30]. The reward of the decision made modifies and strengthens the winner

neuronal circuit [31]. To this end, neuronal circuits remember detected information patterns with relative alternatives

and achieved values [32], and the brain establishes knowledge over time [5, 33].

We have found two studies that explicitly describe the brain's process of valuing the states of an environment

and choosing an alternative for decision-making. Both form the main body of our decision-making model. In the

first, Groman (2019) shows the brain’s performance when taking or avoiding choices in the information received

from a state. They show that the brain activates the neuronal circuits with similar experienced states for finding the

best and the worst rewarded alternatives. From each category (best and worst), the highest valued one goes to the

final race of the decision-making process. Then, the alternative with the highest motivation value determines whether to

choose or avoid a choice in the decision-making process [34]. In the second study, Liu (2019) depicts the

modification of alternative-values through the reward received after a decision is made on the current information.

They designed a set of sequential experiments to study the brain's function during action selection and alternative

revaluation. They found that the brain continuously reorganizes its alternatives by revaluating them with new

information [18].

This work relies on the studies discussed above and does not aim to structure the brain’s exact functions. We seek

to examine if our model could make decisions through realizing common patterns between experienced states. The

modeled system aims to generalize knowledge through the achieved worst and best values of the experienced common

patterns. Then, this study verifies if the model learns through updating its knowledge. The modeled system should

learn by reevaluating the best and worst achieved values.

3. DEVELOPMENT OF A BRAIN-INSPIRED MODEL FOR DECISION-MAKING

Following the investigation of the brain’s decision-making functions, we developed a system to mimic the brain in

evaluating the best and worst alternatives. The system should provide adequate decisions in response to

unpredictable states. The states were in a simulated information development environment. Such an environment

experiences unpredictable information flow from the resource provider’s (newly available, effective, and ineffective

resource information) and the customer’s sides (the data flow about satisfied or failed information) [35, 36].

Specifically, the system makes decisions to proceed with or avoid the development ("develop information" or "skip

information") of newly received information signals from the resource provider side. It compares the information

features with similar experienced states and determines alternatives with the highest motivation values (derived from

the customer side). We developed the model to verify if our system could generalize knowledge for new states and

update it. The system should label valued patterns at different realization levels for states detected by the information

sensory gate. In this way, the system abstracts common patterns from vague to exact similarities to run its decision-

making process. Its ability to learn from valuing experienced states is evaluated too. Table 1 presents our model

realization steps with the relative principles inspired by brain function.

Table 1. The steps to develop a brain-inspired intelligent decision-making system

Brain function → Aspect realized in the decision-making system

1. Chaining the evaluation and revision episodes of neuronal circuits to develop alternatives → Build an episodic process of realizing patterns, choosing between alternatives, decision-making, and altering the alternative by achieved values.

2. Triggering current in a neuronal circuit by detecting sensory data and ending by selecting a choice → Structure a decision-making process, which begins from the information sensory gate, continues in the information pattern realization stage, and ends with the alternative realization stage before decision-making.

3. The same characteristic neurons contribute to the same neuronal cluster → Place the same characteristic neurons in the same matrix at each stage of the decision-making process. Each matrix realizes a certain level of information commonality between states.

4. Generalizing information patterns similarities for any new state and discriminating them to find the most similar one → Arrange neuronal matrices in three levels: vague, approximate, and explicit, to generalize knowledge for detected patterns and discriminate the most similar ones for decision-making.

5. Each neuron is interconnected with 255 ± 13 of neurons in other neuronal clusters → A neuron in each matrix is connectable with all neurons of the other levels to realize a larger number of patterns and alternatives.

6. Motivating and demotivating neuronal circuits to contrast the benefits and disadvantages of choices → Build neuronal circuits between matrices through their values’ similarities and differences to find alternatives and discriminate them.

7. A neuron in a neuronal circuit activated by its upstream neurons and reaches the action threshold then activates its downstream neurons → Define a process of activation and action initiation for neurons during the information transportation.

8. Gleaning information from data received “the environment’s states” → Define information features of the environment’s states on the information-sensory gate.

9. Accumulating knowledge over time through the similarities between information patterns and achieved rewards → Record and classify received signals according to their frequencies’ common patterns and corresponding values achieved on the value-adding sensory gate.

10. Interconnecting the sensory information with achieved knowledge to realize choices; generalizing experienced alternatives for new data → Compare the detected signal patterns with the common patterns of experienced states on the pattern realization stage to find similar ones and determine a proper alternative.

11. Controlling neuronal communication by the motivation values of excitatory and inhibitory circuits to determine which neurons communicate when and with whom → Consolidate the neuronal circuits at the pattern realization stage with higher motivated alternatives.

12. Realizing alternatives by acquired knowledge → Realize alternatives through the experienced states.

13. Making the same reference for reward and loss; attaching the best and the worst rewarded alternatives for similar states → Put for each pattern matrix’s neuron at the same level the best and worst alternatives.

14. Comparing the best and the worst rewarded alternatives from similar states for decision-making → Decide on “do” or “skip” the “development of information” according to the best and the worst alternatives.

15. Learning through updating the motivation level of the involved neuronal circuits → Give a value to each alternative based on the relative achieved value and modify it by new experiences and value-added achievement.

Neurons involved in our system are clustered in four classes of matrices (information signal representation, pattern

label realization, alternative realization, and value-add realization matrices) aside from decision-maker neurons

(Figure 1). These matrices realize and classify information into three levels of similarities (vague, approximate, and

explicit). At each level, every experienced pattern links up to two alternatives with the best and worst achievements.

The value-adding sensory gate receives valuing signals from the customer side for the decisions made. The system then

records values with the relative information pattern and modifies one of the pattern's linked alternatives if it achieved a

higher score. The following discussions introduce the stages of the system.

Figure 1. The diagram of our decision-making system. The information-sensory gate receives a new information signal

of a resource, derives the common patterns of information features, and sends them to the pattern realization stage. The pattern-

realization stage at each level assigns a label to the corresponding pattern. If the label has been experienced before, it activates the

attached alternatives at the alternative realization stage. Those alternatives determine the motivation values for the relative

decision-making neurons. After making the decision, the value-adding sensory gate gets the resulting value-added data. The

system records that value for the label. If this new value proposes to the labeled neuron a new option on top of one of the best or

worst alternatives, the option replaces that best (or worst) alternative.

3.1. The information sensory gate

The information-sensory gate detects signals whose patterns represent the information characteristics of the resource-

provider. The signal frequencies represent information features. Here, eight frequencies represent eight features of

information. The literature presents eight features of information: source, accuracy, validity, completeness,

consistency, accessibility, uniqueness, and timeliness [37]. Each frequency and its relative intensity respectively

represent a feature of information and the feature value (Figure 2). The system projects the signal patterns on three

levels: the common patterns with vague similarities (the low and high values), approximate similarities (the lowest,

low, high, and highest values), and explicit similarities. At this last level, explicit-similar patterns are detailed into two more significant

measures. Therefore, based on the defined numbers of eight frequencies and eight intensities, the information

sensory gate is structured on a matrix of 64 neurons. From the received information, the turned-on information

sensory neurons turn on the counterpart neurons at all pattern representative levels of the pattern realization stage

(the vague level (level-I), approximate level (level-II), and explicit level (level-III)) (the left side of Figure 3) and

trigger the neurons in the pattern-realization matrix-I. The triggered neurons begin a countdown process at a certain

time. The most motivated neuron in the matrix is the one that finishes the process first, does the action, and

then triggers or turns on its connected neurons.
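
As a minimal sketch of the 64-neuron gate described above, the following Python snippet turns one signal into an 8 × 8 on/off matrix. The function name `encode_signal`, the row and column layout, and the choice to switch on exactly one neuron per frequency are assumptions made for illustration; the text specifies only eight frequencies, eight intensities, and 64 neurons.

```python
def encode_signal(intensities):
    """Map eight feature intensities (1..8) onto an 8 x 8 matrix of gate neurons.

    Row f stands for frequency f+1, column i for intensity i+1.
    Assumption: each frequency switches on exactly one neuron.
    """
    if len(intensities) != 8 or any(not 1 <= v <= 8 for v in intensities):
        raise ValueError("expected eight intensities between 1 and 8")
    gate = [[0] * 8 for _ in range(8)]
    for frequency, intensity in enumerate(intensities):
        gate[frequency][intensity - 1] = 1
    return gate


gate = encode_signal([5, 1, 3, 8, 2, 2, 7, 4])
active = sum(sum(row) for row in gate)  # one active neuron per frequency
```

Under this layout, every signal activates eight of the 64 neurons, and the position of each active neuron carries the value of one feature.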

Figure 2. An example of the information pattern on the information sensory gate. A signal on the information-sensory

gate makes a pattern of eight different frequencies representing the embedded information features. The intensities in the vertical axis show the

value of each feature.



Figure 3. The schematic of the pattern realization stage. The pattern realization stage is in two parts, pattern representative

matrices (left) and pattern-labels’ matrices (right).

3.2. Pattern realization stage

We structured three levels for the pattern realization stage (Figure 3). Level-I recognizes a common pattern with

vague feature-similarity to the detected signal. Level-II finds one with approximate feature-similarity. Level-III then

discriminates the signal pattern from the most-similar one. A realized pattern (a pattern that has participated at least once in

realizing a state) has at least one linkage to a neuron at the alternative stage and the value-adding sensory

gate.

Each matrix comprises 256 neurons (the maximum number of labels for the common patterns possible at each

level). The realization process begins from the vague-similarities at level-I and ends up with the most-similarities at

level-III. From level-I, on-neurons at the signal-representative matrix motivate the connected neurons at the pattern-

realization matrix-I for the identical pattern and most-similar ones with linked alternatives (the best or worst links).

The eight features of an identical pattern are rival to its corresponding representative. The matrix's most motivated

neuron (the absolute-similar one) does the action and triggers the matrix-II neurons to do the same. The most

motivated one with alternative linkage (each of the most-similar patterns) does action and turns on the linked neuron

on matrix-I of the alternative stage. This process continues up to level-III. Since there is no matrix-IV, at matrix-III the

most motivated neuron triggers the decision-making neurons at level-III to do the action. By determining all three

levels’ interrelations, the pattern-realization stage could realize up to 16,777,216 different patterns. Level-I could

find a maximum of 65,536 similar patterns for each detected signal. Level-II could catch only up to 256 similar

patterns. Then, level-III could discriminate between at most the 256 most similar patterns. Adding another level to

this stage augments this ability up to the realization of 4,294,967,296 different patterns.
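
One way to reconcile 256 neurons per matrix with 16,777,216 patterns overall is to read each intensity (1 to 8) as three successive binary refinements, one per level, so that each level holds one bit per feature. The sketch below illustrates that reading in Python. The bit coding and the name `level_labels` are assumptions introduced here, not a description of the implemented system; only the counts (256, 65,536, 16,777,216, and 4,294,967,296) come from the text.

```python
FEATURES = 8   # eight frequencies (information features)
LEVELS = 3     # vague, approximate, explicit


def level_labels(intensities):
    """Return one 8-bit label (0..255) per level for intensities in 1..8."""
    if len(intensities) != FEATURES or any(not 1 <= v <= 8 for v in intensities):
        raise ValueError("expected eight intensities between 1 and 8")
    labels = [0] * LEVELS
    for value in intensities:
        code = value - 1  # 0..7, i.e. three bits per feature
        for level in range(LEVELS):
            # Most significant bit first: level-I only separates low from high.
            bit = (code >> (LEVELS - 1 - level)) & 1
            labels[level] = (labels[level] << 1) | bit
    return tuple(labels)


# Capacity figures that follow from 256 labels per level.
patterns_three_levels = 256 ** 3          # 16,777,216, also equal to 8 ** 8
sharing_level_one = 256 ** 2              # 65,536 patterns share one level-I label
patterns_four_levels = 256 ** 4           # 4,294,967,296
```

With this coding, two signals share a level-I label whenever each feature falls on the same side of the low/high split, which is consistent with level-I grouping up to 65,536 patterns and level-II narrowing that group to 256.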

3.3. Alternative realization stage

At the alternative realization stage, our model followed the finding in the study of decision-making in the brain [34].

Accordingly, each labeled neuron in the pattern-realization matrices could link up to the best alternative and the

worst one of the same-level. Each alternative realization matrix is divided into two areas shown in the middle of

Figure 4. The area above with gray color displays the best alternatives and the one below represents the worst

alternatives. The best and the worst alternatives motivate the decision-making neurons located at the same level for

“develop” and “skip” decisions.



Figure 4. The schematic of the alternative realization stage. The left side shows the decision-making neurons motivated by the

same-level best and worst alternatives in the middle (the gray area represents the best alternatives). Those alternatives are linked to the same level

realized pattern-labels. They could be replaced by received value-added achievements. The right side depicts the matrices that record receiving

signals on the value-adding sensory gate for the realized pattern-labels.

3.3.1. Decision-making levels

The best and the worst alternatives, which are attached to a pattern-realization neuron, motivate the decision-making

neurons that respectively stand for “develop” and “skip” decisions. The most motivated one between the two is the

first one that does the action and triggers the same-objective neuron at the next level (the left side of Figure 4). At the

final level, the first one that does the action settles the final decision. Then, the system outputs its result.
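
The race between the two decision-making neurons at one level can be sketched as a countdown whose length shrinks as motivation grows. This is only an illustration: the inverse relation between motivation and countdown time, the `threshold` parameter, and the function names are assumptions. The fallback to "develop" when no neuron is motivated follows the default decision described in the experiment section.

```python
def first_to_fire(motivations, threshold=1.0):
    """Return the neuron whose countdown ends first, or None if none is motivated.

    Assumption: countdown time = threshold / motivation, so a more
    motivated neuron acts earlier. Ties go to the first entry given.
    """
    countdowns = {
        name: threshold / motivation
        for name, motivation in motivations.items()
        if motivation > 0
    }
    if not countdowns:
        return None
    return min(countdowns, key=countdowns.get)


def decide(motivations, default="develop"):
    """Pick "develop" or "skip" for one level, falling back to the default."""
    winner = first_to_fire(motivations)
    return winner if winner is not None else default


choice = decide({"develop": 3.0, "skip": 1.5})   # "develop" fires first
fallback = decide({"develop": 0.0, "skip": 0.0})  # no knowledge: default
```

How the winner at one level biases the same-objective neuron at the next level is not modeled here, since the text does not specify that coupling.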

3.4. Value-adding sensory gate

For each “Develop” decision, the system receives value-adding data on the value-adding sensory gate. That gate

transfers the value for the last decision made to the three corresponding value-added matrices (right side of

Figure 4). The neuron with the received signal assigns the new value to the on-neuron in the pattern realization

matrix. The on-neuron in the pattern matrix temporarily assigns an alternative to the new value. If the new

alternative sets a new record relative to the old one (the best or the worst alternative), it replaces that alternative.
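
The replacement rule above can be sketched as a small record kept per pattern label. The class `LabelRecord`, the function `record_value`, and the choice to let the first received value fill both slots are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LabelRecord:
    """Best and worst value-add achievements linked to one pattern label."""
    best: Optional[float] = None
    worst: Optional[float] = None


def record_value(record, value):
    """Replace the best or worst alternative when the new value sets a record."""
    # Assumption: the first value received fills both empty slots.
    if record.best is None or value > record.best:
        record.best = value
    if record.worst is None or value < record.worst:
        record.worst = value
    return record


label = LabelRecord()
for achieved in (10, 4, 7, 12):
    record_value(label, achieved)
# After these four values: best is 12, worst is 4; the value 7 changed nothing.
```

Values that fall between the current best and worst leave the record untouched, so each label carries at most two alternatives regardless of how often its pattern repeats.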

4. EXPERIMENT TO TEST THE SYSTEM’S PERFORMANCE

We structured an experimental setup to test if the system learns through experiences and generalizes knowledge for

decision-making. The first step in evaluating the system's knowledge generalization and decision-making method was

to verify whether the system could make decisions by itself (non-default decisions) through its realization process.

Second, we aimed to verify whether the system acquires knowledge to make decisions with more than 50% correct results

through experiences. Moreover, we measured the effect of each realization level on making successful decisions. To

verify the system on learning and updating knowledge, we evaluated whether the system learns information features

through experiencing different states.

The information-sensory gate received random signals whose features each took a random value between 1 and

8. Those signals did not represent any real resource provider's information and were only generated to test the system.

Likewise, the decision values applied to the value-adding sensory gate were not from a real customer side.

We estimated each before application according to the ranks of features and the intensities of frequencies. We had

ranked each feature as listed in Table 3. We note that the combinatory ranks of features would play an

important role in creating value for information in a real environment. However, we ignored that to allow a simple

configuration of input data. Here, a signal value was determined by summing the products of each feature's rank and the signal

intensity of its frequency.

The value-added achievements for each experienced state determined its alternatives scores, and those scores

asserted the motivation rates on "Develop" and “Skip” decisions for any new similar information signal. If there were

no significant alternatives for a new information pattern, a default decision was made, which was delivering the

"Develop" action. The decisions made were evaluated by comparing them to the expected results (estimated

information values for the non-default "develop" decisions). Here, a non-default decision or a “certain-decision” was

made based on acquired knowledge. The experiment was designed to prove or disprove if, based on principles

derived from human brain function, the system learns from experience and generalizes knowledge for decision-

making. In each experiment's state, a random signal was applied to the information-sensory gate. The system then

realized the information pattern to address the best and the worst alternatives in each realization level. Then, those

alternatives motivated the decision-makers from level three to level one. The last one decided to develop or skip the

information signal. Finally, an estimated value for the non-default “develop” decision, if made, was applied to the

value-adding sensory gate to evaluate the decision and modify the alternatives.

Table 3. Defined information features’ ranks for our experiments

Feature (frequency)    Rank
Accuracy (f1) 5
Completeness (f2) 1
Consistency (f3) 3
Uniqueness (f4) 4
Validity (f5) 6
Accessibility (f6) 4
Timeliness (f7) 1
Source (f8) 2
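
The estimated signal value described above, a sum of rank times intensity over the eight features, can be sketched with the ranks from Table 3. The names `RANKS` and `signal_value` are illustrative, and the random signal generation is an assumption about how the test inputs could be drawn.

```python
import random

# Ranks from Table 3, ordered f1..f8: accuracy, completeness, consistency,
# uniqueness, validity, accessibility, timeliness, source.
RANKS = (5, 1, 3, 4, 6, 4, 1, 2)


def signal_value(intensities):
    """Sum of rank x intensity over the eight frequencies (intensities 1..8)."""
    if len(intensities) != len(RANKS):
        raise ValueError("expected one intensity per ranked feature")
    return sum(rank * intensity for rank, intensity in zip(RANKS, intensities))


lowest = signal_value([1] * 8)    # 26, the sum of the ranks
highest = signal_value([8] * 8)   # 208

# A random test signal, each feature drawn uniformly from 1..8 (assumed).
signal = [random.randint(1, 8) for _ in RANKS]
estimate = signal_value(signal)
```

Because the ranks sum to 26, every estimated value lies between 26 and 208, and high-rank features such as validity and accuracy dominate the estimate.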

To analyze the experiment’s results, we first measured the number of non-default decisions made. If the system

encountered an unknown state and could not realize it, it made a default decision, a decision with no value, which

gave the order to “develop”. Then, we compared the values of the non-default “develop” decisions made at all

realization levels with the estimated values from the experiment to determine how many of the decisions made were

successful. To complete the verification of knowledge generalization and decision-making, we also examined

whether the system could make more than 50% successful decisions, after how many experienced states, and to what extent each

realization level contributed to increasing the successful decisions. Then, we examined the distribution of

the successful decisions based on the information features and compared it with the ranks considered for the information

features to verify the system learning process.



5. RESULTS

From the first state of the experiment to state number 1000, the tendency of the system to make “certain decisions”

(non-default decisions) increased continuously to 94%. The system kept that percentage of certainty roughly the

same until the experiment's last state (Figure 6, the red curve). Nonetheless, at state 1000, only 4 of 10 non-default

“develop” decisions were successful (Figure 6, the purple curve). We continued the experiment to

determine whether the number of successful decisions increases and, if so, at what point the system passes

50% success. Afterwards, the percentage of successful decisions grew with a smooth slope until the last

state. The failed "Develop" decisions decreased from 60% after state number 1000 to 40% at the end of the experiment

(Figure 6, the purple curve). Around state 37,000, the system passed the threshold of 50% of successful decisions. At

this point, the system had experienced 0.2% of the total possible states.

Figure 6. The system trend in non-default decision-making and the trends of successful decisions.

For the red curve: The horizontal axis represents the number of experienced states. The vertical axis shows the “certain decisions” (non-default

decisions’) percentages, in which "0" and "1" are respectively for “default decisions” and “certain decisions”. For the other curves: The horizontal

axis represents the number of the experienced states, and the vertical axis depicts the successful-decisions ratios. The tendency of the total

successful decisions (purple line) is compared with levels I, II, and III (the blue, brown, and gray lines, respectively). After state number 1000,

40% of non-default “develop” decisions have the expected value-add achievements, which increase to above 60% before state number 69,000. Level-

I (vague similarity level) shows a quick adaptation to the environment. The impacting percentages of level-II and level-III on the successful

decisions continuously increase toward the end, from zero to a total value of 18%, before state number 69,000.

The three realization levels processed the detected information signals step by step, from roughly similar patterns to the most familiar ones, so that enough alternatives were available for responding to any new state. Level-III was intended to discriminate between patterns, recognizing the most similar pattern among similar ones; this level was not tested in this experiment. We evaluated the effect of the first and second levels on the success of each state’s final non-default “Develop” decision. Figure 6 shows that, at the threshold of 50% success, 46 of the 50 percentage points of successful decisions were attributable to level-I and the remaining 4 to level-II. The effects of level-II and level-III had the same value because of the low rate of pattern repetition at level-II. We continued the experiment to test whether the trend of successful decisions was consistent. The results show that the system kept the same trend until the end. Figure 6 compares the trend of the total successful decisions with levels I, II, and III. Level-I shows a quick adaptation to the environment. Before state 69,000, the system made 60% successful decisions, of which 49 percentage points were attributable to level-I and 11 to the other levels. Here, the second level found on average two similar patterns for each new state. Expressed as a share of the successful decisions, level-I was responsible for 100% at state 1000, declining to about 82% at state 69,000, while the other levels rose to a combined maximum of about 18%.
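The relation between the two ways of reporting these figures (percentage points of all non-default decisions versus share of the successful decisions) can be checked with a few lines of arithmetic. The sketch below uses only the values quoted in the text; the function name `level_shares` and the grouping of level-II and level-III into one entry are choices made for this illustration.

```python
def level_shares(total_pct, by_level_pct):
    """Convert per-level success, given in percentage points of all
    non-default decisions, into each level's share (in %) of the
    successful decisions."""
    if total_pct <= 0:
        raise ValueError("total_pct must be positive")
    if abs(sum(by_level_pct.values()) - total_pct) > 1e-9:
        raise ValueError("per-level values must add up to total_pct")
    return {level: 100.0 * pct / total_pct for level, pct in by_level_pct.items()}


# Values quoted in the text (state numbers are approximate).
checkpoints = {
    "state 1,000": (40, {"level-I": 40, "levels II-III": 0}),
    "state 37,000": (50, {"level-I": 46, "levels II-III": 4}),
    "state 69,000": (60, {"level-I": 49, "levels II-III": 11}),
}

for label, (total_pct, by_level) in checkpoints.items():
    shares = level_shares(total_pct, by_level)
    summary = ", ".join(f"{level}: {share:.1f}%" for level, share in shares.items())
    print(f"{label}: {summary}")
```

The computed shares reproduce the 92% quoted for level-I at the 50% threshold (46/50) and the roughly 82% and 18% quoted near state 69,000 (49/60 and 11/60).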

Figure 7-a shows that level-I reached an average of 4 repetitions of each pattern before state 1000 and was responsible for the entire 40% decision success at that point. Further increases in the average repetition affected the success of decisions only slowly. Accordingly, at the point of 50% successful decisions, level-I, with an average of 147 repetitions of each pattern, accounted for only 46% success. At the end of the experiment, level-I, with 271 repetitions of each pattern, accounted for 49% of the successful decisions. Figure 7-b shows that the number of repetitions of each pattern after state 100 was directly related to the percentage of decision success.

Figure 7. The effects of all levels on successful “Develop” decisions over 70,000 states.

The accumulation of repeated patterns for level-I over the 70,000 states (the vertical axis shows the pattern repetitions). a- The trend of pattern repetitions (blue line) compared with the trend of successful decisions (red line) for level-I. b- The trend of pattern repetitions (green line) compared with the trend of successful decisions (red line) for level-II.

To evaluate how the system learned from experience, we measured the accumulation of successful “Develop” decisions on each frequency (each information feature) across all states. This tested whether the system recognized the differences between the values of the information features. Figure 8 shows the realized average influence of each frequency on successful “Develop” decisions, compared with the ranks assigned to the information features. The value one in the left graph of Figure 8 represents the frequency with the maximum accumulation of successful “Develop” decisions; the other frequencies are shown relative to that frequency. The highest accumulated value was for f5, which corresponded to the highest-ranked information feature. The lowest accumulated values were for f2 and f7, which likewise corresponded to the minimum information rank.

Figure 8. Left: the accumulation of successful “Develop” decisions on the frequencies over 70,000 states. Right: the ranks assigned to the features.

The vertical axis of the left-side graph shows the ratios of the accumulated successful decisions, and the vertical axis of the right-side graph shows the ranks of the information features (between 1 and 8). These graphs show the system's ability to differentiate between beneficial and non-beneficial features.
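The comparison behind Figure 8 can be sketched as two steps: scaling the accumulated successes so that the strongest frequency equals one, and checking how often the resulting ordering agrees with the assigned ranks. The counts and ranks below are invented for illustration and are chosen only to match the qualitative statement in the text (f5 highest, f2 and f7 lowest); the function names `normalize_to_max` and `pairwise_agreement` are hypothetical, and the pairwise check is a simple stand-in for whatever comparison the authors used.

```python
from itertools import combinations


def normalize_to_max(counts):
    """Scale accumulated counts so that the largest value becomes 1.0."""
    peak = max(counts.values())
    if peak <= 0:
        raise ValueError("at least one count must be positive")
    return {name: value / peak for name, value in counts.items()}


def pairwise_agreement(values, ranks):
    """Fraction of feature pairs with different ranks that are ordered
    the same way by `values` and by `ranks`."""
    agree = total = 0
    for a, b in combinations(sorted(values), 2):
        rank_diff = ranks[a] - ranks[b]
        if rank_diff == 0:
            continue  # tied ranks carry no ordering information
        total += 1
        if (values[a] - values[b]) * rank_diff > 0:
            agree += 1
    return agree / total if total else float("nan")


# Hypothetical accumulated successes and assigned ranks (not the paper's data).
success_counts = {"f1": 5200, "f2": 1500, "f3": 4100, "f4": 6300,
                  "f5": 9000, "f6": 3000, "f7": 1500, "f8": 7400}
feature_ranks = {"f1": 4, "f2": 1, "f3": 3, "f4": 5,
                 "f5": 8, "f6": 2, "f7": 1, "f8": 7}

relative = normalize_to_max(success_counts)
print({name: round(value, 2) for name, value in relative.items()})
print("pairwise agreement with ranks:", pairwise_agreement(relative, feature_ranks))
```

An agreement close to one would indicate that the system's accumulated successes order the features in the same way as the assigned ranks.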

6. DISCUSSION

We reviewed the importance of decision-making in unpredictable environments, the superiority of the brain’s decision-making, and the limitations of brain-inspired systems. We then developed a decision-making system based on the brain functions of knowledge generalization and action selection. We evaluated the model by progressing through the states of the experiment. In each state, the system received information with a random pattern and then decided whether to develop or skip it. For each "Develop" decision, we calculated a value from the features’ ranks and the frequencies’ intensities and applied it to the value-adding sensory gate. System performance was measured using the results of each state's realization, the determination of relevant alternatives, the decision-making, and the value-added achievements. The results allowed us to reach the following conclusions.

For the generalization of knowledge and decision-making:

Our system showed that it could make decisions based on its acquired knowledge. Default decisions dominated the first 500 states. However, as the number of experienced states and the associated value-added achievements increased, the system tended to make more “certain decisions”. The certainty percentage stayed around 94% until the last state (Figure 6, the red curve). Nonetheless, at that stage less than 40% of the non-default “Develop” decisions were successful relative to the expected values. As it experienced more valued states, the system adjusted itself to the environment, enhanced its process, and increased the decision success rate. The percentage of successful decisions grew continuously until the last state (Figure 6, the purple curve). We concluded that the system could make more than 50% successful decisions with a small amount of experience. Figure 6 shows that the system was able to make decisions autonomously and improve its performance. It achieved 50% successful decisions after experiencing only 0.002 (0.2%) of the possible states.
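To give a sense of scale, the size of the state space implied by these two figures can be estimated by simple division. The sketch below takes the approximate threshold state (37,000) and the quoted fraction (0.002) at face value; because both are rounded in the text, the result is only an order-of-magnitude estimate and not a number reported by the authors. The function name `implied_total_states` is hypothetical.

```python
def implied_total_states(experienced, fraction):
    """Estimate the total number of possible states from the number of
    experienced states and the fraction of the state space they cover."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return experienced / fraction


threshold_state = 37_000  # approximate state at which success passed 50%
coverage = 0.002          # fraction of possible states experienced (0.2%)

total = implied_total_states(threshold_state, coverage)
print(f"implied number of possible states: about {total:,.0f}")
```

Under these assumptions the state space is on the order of tens of millions of states, which illustrates how sparse the system's experience was when it crossed the 50% threshold.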

For the ability of learning and modifying alternatives:

The system showed that it was able to improve its decision-making process. It improved the process of realizing alternatives by abstracting different patterns from the information features of experienced states. The value of a common pattern was generalized to all relevant new states. We evaluated the ability of level-I and level-II to generalize knowledge for decision-making. However, we did not examine the ability to discriminate between the most similar states. Level-III was intended to realize this characteristic, which could be important in complex environments and needs to be investigated in the future. The decision-making behavior of level-I is represented by the blue line in Figure 6. It shows that level-I, by abstracting the more similar features from experiences, could help the system adapt faster to unpredictable environments. At the brink of 50% success, level-I was responsible for 92% of successful decisions. Through previous experiences, this level made further alternatives available for responding to new states. Level-I provided on average 147 values for each common pattern, compared with an average of 1.5 values for level-II. The other two levels increased the precision of decision-making and the rate of higher achievements, but over a narrower scope of similarities. According to Figure 7, the rise in successful decisions beyond 50% correlated with the participation of level-II and level-III in decision-making. The effects of both levels were the same because of the low number of repetitions possible for level-II during our experiment. Nevertheless, both levels showed a slow rise, which suggests that a large number of experiences is required for making more precise decisions.
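The idea of generalizing a common pattern's value to new states, with a coarse level that matches often and a finer level that matches rarely, can be illustrated with a toy lookup structure. The sketch below is not the authors' neuronal-circuit implementation: it swaps in plain dictionaries keyed by two pattern abstractions. The class `TwoLevelMemory`, the key functions, the rounding of intensities to one decimal, and the rule "develop if the mean stored value is positive" are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean


class TwoLevelMemory:
    """Toy store of value-added outcomes keyed by a coarse and a fine
    abstraction of the state's feature pattern."""

    def __init__(self):
        self.coarse = defaultdict(list)  # level-I-like: which features are present
        self.fine = defaultdict(list)    # level-II-like: features with quantized intensity

    @staticmethod
    def coarse_key(state):
        return frozenset(f for f, x in state.items() if x > 0)

    @staticmethod
    def fine_key(state):
        return frozenset((f, round(x, 1)) for f, x in state.items() if x > 0)

    def record(self, state, value):
        """Store the achieved value under both abstractions of the state."""
        self.coarse[self.coarse_key(state)].append(value)
        self.fine[self.fine_key(state)].append(value)

    def decide(self, state):
        """Return (decision, source): prefer the fine match, fall back to the
        coarse match, and make a default decision when nothing matches."""
        candidates = (
            ("fine", self.fine.get(self.fine_key(state))),
            ("coarse", self.coarse.get(self.coarse_key(state))),
        )
        for source, values in candidates:
            if values:
                return ("develop" if mean(values) > 0 else "skip"), source
        return "default", "none"


memory = TwoLevelMemory()
memory.record({"f5": 0.8, "f2": 0.3}, value=1.0)
print(memory.decide({"f5": 0.6, "f2": 0.1}))  # same features, different intensities
print(memory.decide({"f1": 0.5}))             # unseen pattern
```

In this toy version the coarse level accumulates many values per pattern while the fine level accumulates few, which mirrors the contrast reported between level-I (147 values per common pattern on average) and level-II (1.5 values).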

We verified the system's ability to learn information features and differentiate between them by evaluating the accumulation of successful decisions on the detected signal frequencies (Figure 8). The system developed its learning process by modifying the motivation scores of the alternatives. It could recognize the feature values in the detected frequencies by learning from experience and generalize this knowledge to new, unpredictable states.

In this study, the pattern and alternative realization stages mimicked the brain's principles of decision-making. The similarities between valued states were derived from the neuronal circuits built between the realization matrices. The system generalized knowledge for decision-making through experience. The alternative realization stage modified alternatives based on the value-added achievements of the decisions made. This study could be extended for implementation in real-world environments, with real unpredictable information signals and real value-added achievements, taking into account unknown interrelations between information aspects. Our next study will develop a decision-making system for dynamic tasks involving hazardous materials in an unpredictable environment. We note that the system could be used as a building block for developing artificial general intelligence (AGI) capable of configuring complex tasks derived from a combination of different data, as the brain does.

ACKNOWLEDGMENT

The authors would like to acknowledge the financial support provided by NSERC for this research under grant

RGPIN 2015-06253.

REFERENCES

1. Narayan, P., et al. An intelligent control architecture for unmanned aerial systems (UAS) in the national
airspace system (NAS). in Proceedings of AIAC12: 2nd Australasian Unmanned Air Vehicles Conference.
2007. Waldron Smith Management.
2. Yang, Q., et al. Re-examining Whether, Why, and How Human-AI Interaction Is Uniquely Difficult to
Design. in Proceedings of the 2020 chi conference on human factors in computing systems. 2020.
3. Deng, C., et al., Integrating Machine Learning with Human Knowledge. Iscience, 2020: p. 101656.
4. Booch, G., et al., Thinking fast and slow in ai. arXiv preprint arXiv:2010.06002, 2020.
5. Azulay, A. and Y. Weiss, Why do deep convolutional networks generalize so poorly to small image
transformations? arXiv preprint arXiv:1805.12177, 2018.
6. de Callatay, A., Natural and artificial intelligence: misconceptions about brains and neural networks. 2014:
Elsevier.
7. Sutton, R.S. and A.G. Barto, Reinforcement learning: an introduction Cambridge. MA: MIT Press.[Google
Scholar], 1998.
8. Patil, T., S. Pandey, and K. Visrani, A review on basic deep learning technologies and applications, in Data
Science and Intelligent Applications. 2021, Springer. p. 565-573.
9. Dong, T., Conclusions and Outlooks, in A Geometric Approach to the Unification of Symbolic Structures
and Neural Networks. 2021, Springer. p. 117-127.
10. Andresen, K. and N. Gronau, An approach to increase adaptability in ERP systems. 2005.
11. Potemkowski, A., Neurobiology of Decision Making: Methodology in Decision-Making Research.
Neuroanatomical and Neurobiochemical Fundamentals, in Neuroeconomic and Behavioral Aspects of
Decision Making. 2017, Springer. p. 3-18.
12. Sołtys, A., et al., Emotions in Decision Making, in Neuroeconomic and Behavioral Aspects of Decision
Making. 2017, Springer. p. 35-47.
13. McDonnell, M.D., et al., Engineering intelligent electronic systems based on computational neuroscience
20

[scanning the issue]. Proceedings of the IEEE, 2014. 102(5): p. 646-651.


14. Tyburski, E., Psychological Determinants of Decision Making, in Neuroeconomic and Behavioral Aspects
of Decision Making. 2017, Springer. p. 19-34.
15. Banich, M.T., Executive function: The search for an integrated account. Current directions in psychological
science, 2009. 18(2): p. 89-94.
16. O’Reilly, R.C., S.A. Herd, and W.M. Pauli, Computational models of cognitive control. Current opinion in
neurobiology, 2010. 20(2): p. 257-261.
17. Williams, C., Brain imaging spots our abstract choices before we do. New Scientist, 2013. 10.
18. Liu, Y., et al., Human Replay Spontaneously Reorganizes Experience. Cell, 2019.
19. Santos, L.R. and A.G. Rosati, The evolutionary roots of human decision making. Annual review of
psychology, 2015. 66: p. 321-347.
20. Goschke, T., Dysfunctions of decision‐making and cognitive control as transdiagnostic mechanisms of
mental disorders: advances, gaps, and needs in current research. International journal of methods in
psychiatric research, 2014. 23(S1): p. 41-57.
21. Von Helmholtz, H., Handbuch der physiologischen Optik. Vol. 9. 1867: Voss.
22. Qamar, A.T., et al., Trial-to-trial, uncertainty-based adjustment of decision boundaries in visual
categorization. Proceedings of the National Academy of Sciences, 2013. 110(50): p. 20332-20337.
23. O'Reilly, R.C., et al., Goal-driven cognition in the brain: a computational framework. arXiv preprint
arXiv:1404.7591, 2014.
24. Wang, Y., et al., Novelty seeking is related to individual risk preference and brain activation associated
with risk prediction during decision making. Scientific reports, 2015. 5: p. 10534.
25. Panzeri, S., et al., Cracking the neural code for sensory perception by combining statistics, intervention,
and behavior. Neuron, 2017. 93(3): p. 491-507.
26. Bogacz, R., et al., The physics of optimal decision making: a formal analysis of models of performance in
two-alternative forced-choice tasks. Psychological review, 2006. 113(4): p. 700.
27. Luczak, A., B.L. McNaughton, and K.D. Harris, Packet-based communication in the cortex. Nature
Reviews Neuroscience, 2015. 16(12): p. 745.
28. Cossell, L., et al., Functional organization of excitatory synaptic strength in primary visual cortex. Nature,
2015. 518(7539): p. 399.
29. Markram, H., et al., Reconstruction and simulation of neocortical microcircuitry. Cell, 2015. 163(2): p.
456-492.
30. Wang, X.-J., Probabilistic decision making by slow reverberation in cortical circuits. Neuron, 2002. 36(5):
p. 955-968.
31. Lerner, J.S., et al., Emotion and decision making. Annual review of psychology, 2015. 66.
32. Vinyals, O., et al. Matching networks for one shot learning. in Advances in Neural Information Processing
Systems. 2016.
33. Churchland, A.K., et al., Variance as a signature of neural computations during decision making. Neuron,
2011. 69(4): p. 818-831.
34. Groman, S.M., et al., Orbitofrontal Circuits Control Multiple Reinforcement-Learning Processes. Neuron,
2019.
35. Reinertsen, D.G., The principles of product development flow, second generation, lean product development.
2009.
36. Tatikonda, M.V. and S.R. Rosenthal, Technology novelty, project complexity, and product development
project execution success: a deeper look at task uncertainty in product innovation. IEEE Transactions on
engineering management, 2000. 47(1): p. 74-87.
37. Wang, R.Y., A product perspective on total data quality management. Communications of the ACM, 1998.
41(2): p. 58-66.
