[Figure 2: Traditional CRISP-DM process with an additional optimization step]

It is important to note that the original CRISP model deals with a largely iterative approach used by data scientists to analyze data manually, which is reflected in the iterations between business understanding and data understanding as well as data preparation and modeling. However, evaluating the modeling results with the relevant application experts in the evaluation step can also result in having to start the process all over again from the business understanding sub-step, making it necessary to go through all the sub-steps again partially or completely (e.g., if additional data needs to be incorporated).

The manual, iterative procedure is also due to the fact that the basic idea behind this approach – as up-to-date as it may be for the majority of applications – is now almost 20 years old and certainly only partially compatible with a big data strategy. The fact is that, in addition to the use of nonlinear modeling methods (in contrast to the usual generalized linear models derived from statistical modeling) and knowledge extraction from data, data mining rests on the fundamental idea that models can be derived from data with the help of algorithms and that this modeling process can run automatically for the most part – because the algorithm "does the work."

In applications where a large number of models need to be created, for example for use in making forecasts (e.g., sales forecasts for individual vehicle models and markets based on historical data), automatic modeling plays an important role. The same applies to the use of online data mining, in which, for example, forecast models (e.g., for forecasting product quality) are not only constantly used for a production process, but also adapted (i.e., retrained) continuously whenever individual process aspects change (e.g., when a new raw material batch is used). This type of application requires the technical ability to automatically control the respective process. If sensor systems are also integrated directly into the production process – to collect data in real time – this results in a self-learning cyber-physical system³ that facilitates implementation of the Industry 4.0 vision⁴ in the field of production engineering.

[Figure 3: Architecture of an Industry 4.0 model for optimizing analytics – the layers multi-criteria optimization, automatic modeling, and data management sit between the sensor system and the control system/actuators of the machine/system]

This approach is depicted schematically in Figure 3. Data from the system is acquired with the help of sensors and integrated into the data management system. Using this as a basis, forecast models for the system's relevant outputs (quality, deviation from target value, process variance, etc.) are used continuously in order to forecast the system's output. Other machine learning options can be used within this context in order, for example, to predict maintenance results (predictive maintenance) or to identify anomalies in the process. The corresponding models are monitored continuously and, if necessary, automatically retrained if any process drift is observed. Finally, the multi-criteria optimization uses the models to continuously compute

3 Systems "in which information and software components are connected to mechanical and electronic components and in which data is transferred and exchanged, and monitoring and control tasks are carried out, in real time, using infrastructures such as the Internet." (Translation of the following article in Gabler Wirtschaftslexikon, Springer: http://wirtschaftslexikon.gabler.de/Definition/cyber-physische-systeme.html)

4 Industry 4.0 is defined therein as "a marketing term that is also used in science communication and refers to a 'future project' of the German federal government. The so-called 'Fourth Industrial Revolution' is characterized by the customization and hybridization of products and the integration of customers and business partners into business processes." (Translation of the following article in Gabler Wirtschaftslexikon, Springer: http://wirtschaftslexikon.gabler.de/Definition/industrie-4-0.html)
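To make the monitor-and-retrain loop of Figure 3 concrete, the following minimal Python sketch illustrates the idea. It is not the architecture's actual implementation; scikit-learn is assumed, and read_sensor_batch is a hypothetical hook standing in for the data management layer.

```python
# Minimal sketch of the monitor-and-retrain loop in Figure 3.
# Assumptions (not from the article): scikit-learn is available, and
# read_sensor_batch() is a hypothetical data-management hook that
# returns the latest batch of process features and measured outputs.
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error

DRIFT_FACTOR = 1.5  # flag drift when the error grows 50% over the baseline

def fit_model(X, y):
    """Automatic modeling step: train a forecast model for one output."""
    return RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

def run_forecast_loop(model, baseline_mae, read_sensor_batch):
    while True:
        X_new, y_new = read_sensor_batch()        # data management layer
        mae = mean_absolute_error(y_new, model.predict(X_new))
        if mae > DRIFT_FACTOR * baseline_mae:     # process drift observed
            model = fit_model(X_new, y_new)       # automatic retraining
            baseline_mae = mean_absolute_error(y_new, model.predict(X_new))
        # the current model and its forecasts would be handed to the
        # multi-criteria optimization layer here
```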
In order to differentiate it from “traditional” data mining, An early definition of artificial intelligence from the IEEE
the term “big data” is frequently defined now with three Neural Networks Council was “the study of how to make
(sometimes even four or five) essential characteristics: computers do things at which, at the moment, people are
volume, velocity, and variety, which refer to the large better.”5 Although this still applies, current research is also
volume of data, the speed at which data is generated, and the focused on improving the way that software does things at
heterogeneity of the data to be analyzed, which can no which computers have always been better, such as analyzing
longer be categorized into the conventional relational large amounts of data. Data is also the basis for developing
database schema. Veracity, i.e., the fact that large artificially intelligent software systems not only to collect
uncertainties may also be hidden in the data (e.g., information, but also to:
measurement inaccuracies), and finally value, i.e., the value
• Learn
that the data and its analysis represents for a company's
business processes, are often cited as additional • Understand and interpret information
characteristics. So it is not just the pure data volume that • Behave adaptively
distinguishes previous data analytics methods from big data,
but also other technical factors that require the use of new • Plan
methods– such as Hadoop and MapReduce – with • Make inferences
appropriately adapted data analysis algorithms in order to
• Solve problems
allow the data to be saved and processed. In addition, so-
called “in-memory databases” now also make it possible to • Think abstractly
apply traditional learning and modeling algorithms in main
• Understand and interpret ideas and language
memory to large data volumes.
This means that if one were to establish a hierarchy of data 3.1 Machine learning
analysis and modeling methods and techniques, then, in
At the most general level, machine learning (ML)
very simplistic terms, statistics would be a subset of data
algorithms can be subdivided into two categories:
mining, which in turn would be a subset of big data. Not
supervised and unsupervised, depending on whether or not
every application requires the use of data mining or big data
the respective algorithm requires a target variable to be
technologies. However, a clear trend can be observed, which
specified.
indicates that the necessities and possibilities involved in the
use of data mining and big data are growing at a very rapid Supervised learning algorithms
pace as increasingly large data volumes are being collected Apart from the input variables (predictors), supervised
and linked across all processes and departments of a learning algorithms also require the known target values
company. Nevertheless, conventional hardware architecture (labels) for a problem. In order to train an ML model to
with additional main memory is often more than sufficient identify traffic signs using cameras, images of traffic signs –
for analyzing large data volumes in the gigabyte range. preferably with a variety of configurations – are required as
Although optimizing analytics is of tremendous importance, input variables. In this case, light conditions, angles, soiling,
it is also crucial to always be open to the broad variety of etc. are compiled as noise or blurring in the data;
applications when using artificial intelligence and machine nonetheless, it must be possible to recognize a traffic sign in
learning algorithms. The wide range of learning and search rainy conditions with the same accuracy as when the sun is
methods, with potential use in applications such as image shining. The labels, i.e., the correct designations, for such
and language recognition, knowledge learning, control and
5
E. Rich, K. Knight: Artificial Intelligence, 5, 1990
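As a minimal illustration of the supervised setup just described – predictors plus known labels – the following sketch trains a classifier with scikit-learn. The built-in digits dataset is used here only as a stand-in, since the article does not specify a traffic-sign dataset.

```python
# Minimal supervised-learning sketch: predictors X plus known labels y.
# scikit-learn's digits dataset stands in for labeled traffic-sign images.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)        # images as pixel vectors + labels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr)                        # learn feature-label associations
print("held-out accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```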
• Emulating biological visual perception in order to better understand which physical and biological processes are involved, how the wetware works, and how the corresponding interpretation and understanding work.

• Technical research and development focuses on efficient, algorithmic solutions – when it comes to CV software, problem-specific solutions that only have limited commonalities with the visual perception of biological organisms are often developed.

All three areas overlap and influence each other. If, for example, the focus in an application is on obstacle recognition in order to initiate an automated braking maneuver in the event of a pedestrian appearing in front of the vehicle, the most important thing is to identify the pedestrian as an obstacle. Interpreting the entire scene – e.g., understanding that the vehicle is moving towards a family having a picnic in a field – is not necessary in this case. In contrast, understanding a scene is an essential prerequisite if context is a relevant input, such as is the case when developing domestic robots that need to understand that an occupant who is lying on the floor not only represents an

• Object detectors, in which case a window moves over the image and a filter response is determined for each position by comparing a template and the sub-image (window content), with each new object parameterization requiring a separate scan. More sophisticated algorithms simultaneously make calculations based on various scales and apply filters that have been learned from a large number of images.

• Segment-based techniques extract a geometrical description of an object by grouping pixels that define the dimensions of an object in an image. Based on this, a fixed feature set is computed, i.e., the features in the set retain the same values even when subjected to various image transformations, such as changes in light conditions, scaling, or rotation. These features are used to clearly identify objects or object classes, one example being the aforementioned identification of traffic signs.

7 R. Bajcsy: Active Perception, Proceedings of the IEEE 76:996-1005, 1988

8 J. L. Crowley, H. I. Christensen: Vision as a Process: Basic Research on Computer Vision Systems, Berlin: Springer, 1995
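The sliding-window detector described in the first bullet above can be illustrated with a short NumPy sketch – a hypothetical example, not the article's code. It computes a normalized-correlation filter response for every window position and returns the best match; as the text notes, each new object parameterization (scale, rotation) would require a separate scan.

```python
# Sliding-window detector sketch: compare a template against every
# window position; the filter response is the normalized correlation.
import numpy as np

def best_match(image, template):
    """Return the top-left window position with the highest response."""
    th, tw = template.shape
    t = (template - template.mean()) / (template.std() + 1e-9)
    best_score, best_pos = -np.inf, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            window = image[r:r + th, c:c + tw]
            w = (window - window.mean()) / (window.std() + 1e-9)
            score = float((w * t).mean())   # filter response at (r, c)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score

img = np.random.rand(64, 64)
img[20:28, 30:38] += 2.0                    # embed a bright patch
print(best_match(img, img[20:28, 30:38]))   # -> ((20, 30), ~1.0)
```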
Mathematical logic is the formal basis for many applications in the real world, including calculation theory, our legal system and corresponding arguments, and theoretical developments and evidence in the field of research and development. The initial vision was to represent every type of knowledge in the form of logic and use universal algorithms to make inferences from it, but a number of challenges arose – for example, not all types of knowledge can be represented simply. Moreover, compiling the knowledge required for complex applications can become very complex, and it is not easy to learn this type of knowledge in a logical, highly expressive language.¹⁶ In addition, it is not easy to make inferences with the required highly expressive language – in extreme cases, such scenarios cannot be implemented computationally, even if the first two challenges are overcome.

Currently, there are three ongoing debates on this subject. The first focuses on the argument that logic is unable to represent many concepts, such as space, analogy, shape, uncertainty, etc., and consequently cannot be included as an active part in developing AI to a human level. The counterargument states that logic is simply one of many tools and that, at present, the combination of representative expressiveness, flexibility, and clarity cannot be achieved with any other method or system. The second debate revolves around the argument that logic is too slow for making inferences and will therefore never play a role in a productive system. The counterargument here is that ways exist to approximate the inference process with logic, so processing is coming close to remaining within the required time limits, and progress is being made with regard to logical inference. Finally, the third debate revolves around the argument that it is extremely difficult, or even impossible, to develop systems based on logical axioms into applications for the real world. The counterarguments in this debate are primarily based on the research of individuals currently researching techniques for learning logical axioms from natural-language texts.

In principle, a distinction is made between four different types of logic,¹⁷ which are not discussed any further in this article:

• Propositional logic
• First-order predicate logic
• Modal logic
• Non-monotonic logic

Automated decision-making, such as that found in autonomous robots (vehicles), WWW agents, and communications agents, is also worth mentioning at this point. This type of decision-making is particularly relevant when it comes to representing expert decision-making processes with logic and automating them. Very frequently, this type of decision-making process takes account of the dynamics of the surroundings, for example when a transport robot in a production plant needs to evade another transport robot. However, this is not a basic prerequisite, for example, if a decision-making process without a clearly defined direction is undertaken in future, e.g., the decision to rent a warehouse at a specific price at a specific location.

Decision-making as a field of research encompasses multiple domains, such as computer science, psychology, economics, and all engineering disciplines. Several fundamental questions need to be answered to enable development of automated decision-making systems:

• Is the domain dynamic to the extent that a sequence of decisions is required, or static in the sense that a single decision or multiple simultaneous decisions need to be made?
• Is the domain deterministic, non-deterministic, or stochastic?
• Is the objective to optimize benefits or to achieve a goal?
• Is the domain known to its full extent at all times, or is it only partially known?

Logical decision-making problems are non-stochastic in nature as far as planning and conflicting behavior are concerned. Both require that the available information regarding the initial and intermediate states be complete, that actions have exclusively deterministic, known effects, and that a specific defined goal exists. These problem types are often applied in the real world, for example in robot control, logistics, complex behavior in the WWW, and in computer and network security.

In general, planning problems consist of an initial (known)

16 N. Lavrac, S. Dzeroski: Inductive Logic Programming, Vol. 3: Non-Monotonic Reasoning and Uncertain Reasoning, Oxford: Oxford University Press, 1994

17 K. Frankish, W. M. Ramsey: The Cambridge Handbook of Artificial Intelligence, Cambridge: Cambridge University Press, 2014
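The planning paragraph breaks off above; as a minimal sketch of the problem class it introduces – a known initial state, actions with preconditions and exclusively deterministic effects, and a defined goal – the following hypothetical toy domain (a transport robot moving a part between stations, with invented state and action names) finds a plan by breadth-first search.

```python
# Deterministic planning sketch: known initial state, actions with
# preconditions and deterministic effects, and a defined goal state.
from collections import deque

# Hypothetical toy domain: a transport robot delivering a part.
ACTIONS = {
    "pick":  (lambda s: s == "at_store", lambda s: "carrying"),
    "move":  (lambda s: s == "carrying", lambda s: "at_line"),
    "place": (lambda s: s == "at_line",  lambda s: "delivered"),
}

def plan(initial, goal):
    """Breadth-first search over states; returns a shortest action sequence."""
    frontier, seen = deque([(initial, [])]), {initial}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        for name, (precondition, effect) in ACTIONS.items():
            if precondition(state):            # action is applicable
                nxt = effect(state)            # deterministic, known effect
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, path + [name]))
    return None

print(plan("at_store", "delivered"))           # ['pick', 'move', 'place']
```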
• Position 3: In general, the predicates of logic and formal systems only appear to be different from human language, but their terms are in actuality the words as which they appear.

The introduction of statistical and AI methods into the field is the latest trend within this context. The general strategy is to learn how language is processed – ideally in the way that humans do this, although this is not a basic prerequisite. In terms of ML, this means learning based on extremely large corpora that have been translated manually by humans. This often means that it is necessary to learn (algorithmically) how annotations are assigned or how part-of-speech categories (the classification of words and punctuation marks in a text into word types) or semantic markers or primes are added to corpora, all based on corpora that have been prepared by humans (and are therefore correct). In the case of supervised learning, and with reference to ML, it is possible to learn potential associations of part-of-speech tags with words that have been annotated by humans in the text, so that the algorithms are also able to annotate new,

In traditional AI, people focused primarily on individual, isolated software systems that acted relatively inflexibly according to predefined rules. However, new technologies and applications have established a need for artificial entities that are more flexible, adaptive, and autonomous, and that act as social units in multi-agent systems. In traditional AI (see also the "physical symbol system hypothesis"²⁰ that has been embedded into so-called "deliberative" systems), an action theory that establishes how systems make decisions and act is represented logically in individual systems that must execute actions. Based on these rules, the system must prove a theorem – the prerequisite here being that the system must receive a description of the world in which it currently finds itself, the desired target state, and a set of actions, together with the prerequisites for executing these

18 G. Leech, R. Garside, M. Bryant: CLAWS4: The Tagging of the British National Corpus. In: Proceedings of the 15th International Conference on Computational Linguistics (COLING 94), Kyoto, Japan, pp. 622-628, 1994

19 K. Spärck Jones: Information Retrieval and Artificial Intelligence, Artificial Intelligence 141:257-281, 1999

20 A. Newell, H. A. Simon: Computer Science as Empirical Enquiry: Symbols and Search, Communications of the ACM 19:113-126
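As a minimal sketch of the supervised part-of-speech association described above, the following most-frequent-tag baseline learns word-tag associations from a hypothetical human-annotated corpus; real taggers such as CLAWS4 are far more sophisticated.

```python
# Most-frequent-tag baseline: learn word -> tag associations from a
# human-annotated corpus, then annotate new text with them.
from collections import Counter, defaultdict

# Hypothetical annotated corpus of (word, part-of-speech) pairs.
corpus = [("the", "DET"), ("robot", "NOUN"), ("moves", "VERB"),
          ("the", "DET"), ("part", "NOUN"), ("moves", "NOUN")]

counts = defaultdict(Counter)
for word, pos in corpus:
    counts[word][pos] += 1                  # tally the human annotations

model = {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag(sentence):
    # Unknown words default to NOUN, a common baseline heuristic.
    return [(w, model.get(w, "NOUN")) for w in sentence.split()]

print(tag("the robot moves"))  # [('the','DET'), ('robot','NOUN'), ('moves','VERB')]
```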
purchasing of goods, a large amount of historical price information is available for data mining purposes, which can be used to generate price forecasts and, in combination with delivery reliability data, to analyze supplier performance. As for shipment, optimizing analytics can be used to identify and optimize the key cost factors.

A similar situation applies to production logistics, which deals with planning, controlling, and monitoring internal transportation, handling, and storage processes. Depending on the granularity of the available data, it is possible, for example, to identify bottlenecks, optimize stock levels, and minimize the time required.

Distribution logistics deals with all aspects involved in transporting products to customers, and can refer to both new and used vehicles for OEMs. Since the primary considerations here are the relevant costs and delivery reliability, all the subcomponents of the multimodal supply chain need to be taken into account – from rail, ship, and truck transportation through to subaspects such as the optimal combination of individual vehicles on a truck. In terms of used-vehicle logistics, optimizing analytics can be used to assign vehicles to individual distribution channels (e.g., auctions, Internet) on the basis of a suitable, vehicle-specific resale value forecast in order to maximize total sale proceeds. GM implemented this approach as long ago as

supplier network not only allows this type of bottleneck to be identified, but also countermeasures to be optimized. Experience has shown, however, that mapping all subprocesses and interactions between suppliers in the detail required for an accurate simulation becomes too complex, and too opaque for the automobile manufacturer, as soon as attempts are made to include Tier 2 and Tier 3 suppliers as well.

This is why data-driven modeling should be considered as an alternative. With this approach, a model of the supplier network (suppliers, products, dates, delivery periods, etc.) and the logistics (stock levels, delivery frequencies, production sequences) is learned from the available data by means of data mining methods. The model can then be used as a forecast model, for example to predict the effects of a delivery delay for specific parts on the production process. Furthermore, the use of optimizing analytics in this case makes it possible to perform a worst-case analysis, i.e., to identify the parts and suppliers that would bring about production stoppages the fastest if their delivery were to be delayed. This example very clearly shows that optimization, in the sense of scenario analysis, can also be used to determine the worst-case scenario for an automaker (and then to optimize countermeasures in future).

35 http://www.automotive-fleet.com/news/story/2003/02/gm-installs-nutechs-vehicle-distribution-system.aspx

38 http://www.syntragy.com/doc/q3-05%5B1%5D.pdf
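A minimal sketch of the worst-case analysis described above: rank parts (and thus suppliers) by how quickly a delivery delay would halt production, i.e., by the days of buffer stock remaining. All figures and names are invented for illustration, not real supplier data.

```python
# Worst-case analysis sketch: rank parts/suppliers by the number of
# production days until a delivery delay causes a stoppage.
# All figures are hypothetical illustrations, not real supplier data.
parts = {
    # part: (supplier, units in stock, units consumed per day)
    "headlamp":   ("Supplier A", 1200, 600),
    "wiring set": ("Supplier B", 300,  300),
    "seat frame": ("Supplier C", 2500, 500),
}

def days_until_stoppage(stock, usage_per_day):
    return stock / usage_per_day

# Sort ascending: the first entry is the worst case for the automaker.
ranked = sorted(parts.items(),
                key=lambda kv: days_until_stoppage(kv[1][1], kv[1][2]))

for part, (supplier, stock, usage) in ranked:
    print(f"{part:<10} ({supplier}): stoppage after "
          f"{days_until_stoppage(stock, usage):.1f} days of delay")
```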
5.1 Vision – Vehicles as autonomous, adaptive, and social agents & cities as super-agents

Research into self-driving cars is here to stay in the automotive industry, and the "mobile living room" is no longer an implausible scenario, but is instead meeting with an increasingly positive response. Today, the focus of development is on autonomy, and for good reason: in most parts of the world, self-driving cars are not permitted on

• Autonomously in the sense that they automatically follow a route to a destination

• Adaptively in the sense that they can react to unforeseen events, such as road closures and breakdowns

• Socially in the sense that they work together to achieve the common goals of optimizing the flow of traffic and preventing accidents (although the actual situation is naturally more complex and many subgoals need to be defined in order for this to be achieved)

39 L. Gräning, B. Sendhoff: Shape Mining: A Holistic Data Mining Approach to Engineering Design. Advanced Engineering Informatics 28(2):166-185, 2014
In order to learn from data, a robot must not just operate according to static programming; it must also be able to use ML methods to work autonomously towards defined learning goals. With regard to any production errors that may occur, this means, first and foremost, that the actions being carried out that result in these errors will have been learned, and not programmed based on a flowchart and an event diagram. Assume, for example, that the aforementioned parking light problem has not only been identified, but that its cause can also be traced back to an issue in production, e.g., a robot that is pushing a headlamp into its socket too hard. All that is now required is to define the learning goal for the corrective measure. Let us also assume that the production error is not occurring with robots in other production plants, and that left-hand headlamps are being installed correctly in general. In the best-case

Stage 2 – Overcoming the limitations of programming – smart factories as individuals

What if the production plant needs to learn things for which even the flexibility of one or more ML methods used by individual agents (such as production or handling robots) is insufficient? Just like a biological organism, a production plant could act as a separate entity composed of subcomponents – similar to a human – that can be addressed using natural language, understands context, and is capable of interpreting it. The understanding and interpretation of context have always been a challenge in the field of AI research. AI theory views context as a shared (or common) interpretation of a situation, with the context of a situation and the context of an entity relative to a situation being relevant here. Contexts relevant to a production plant include everything that is relevant to production when
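To make the idea of a defined learning goal concrete for the headlamp example above, here is a deliberately simple hypothetical sketch: the robot adapts its insertion force until the observed error rate meets the target. The hook check_error_rate and all parameters are invented for illustration.

```python
# Hypothetical sketch: the learning goal for the corrective measure is an
# error-rate target; the robot lowers its insertion force until it is met.
TARGET_ERROR_RATE = 0.001   # learning goal: at most 0.1% faulty headlamps
STEP = 0.5                  # force reduction per iteration (newtons)

def corrective_learning(force, check_error_rate, min_force=5.0):
    """check_error_rate(force) is an invented quality-inspection hook that
    returns the observed error rate at the given insertion force."""
    while force > min_force:
        if check_error_rate(force) <= TARGET_ERROR_RATE:
            return force                   # learning goal reached
        force -= STEP                      # adapt the action, re-observe
    return min_force                       # fall back to the safe minimum
```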