Key Learning Points
- Introduction to artificial intelligence (AI)
- Define intelligence
- Define AI
- Types of AI
- Purpose of AI
- Human senses
- Define an AI project
- Learn and understand machine learning
Introduction to Artificial Intelligence (AI)
What Is Intelligence?
Intelligence is the ability to understand or deduce information, retain it as knowledge, and apply it in context within an environment (Barrat 2013). It includes logic, self-awareness, learning, emotional knowledge, reasoning, planning, creativity, and problem solving. Intelligence is found in humans and animals; extending it to machines is what strong artificial intelligence (AI) is about. This book progresses from basic to advanced AI.
What Is AI?
People all over the world have offered definitions of AI; here are some of them. AI is the science and engineering of making intelligent machines, especially intelligent computer programs, and of using computers to understand human intelligence and that of other living things. About 60 years ago, John McCarthy called on a group of computer scientists to discern whether computers could learn like a child (McCarthy 1959). The project objective was to see if computers could solve the sorts of problems then reserved for humans and improve themselves, especially when addressing huge amounts of data. Since then, AI has advanced in university laboratories and super-secret labs. In 1955, McCarthy framed AI in terms of seven characteristics:
- Simulating the higher functions of the human brain.
- Programming a computer to use general language.
- Arranging hypothetical neurons so that they can form concepts.
- Finding ways to determine and measure problem complexity.
- Self-improvement.
- Abstraction, the quality of addressing ideas rather than events.
- Randomness and creativity.
McCarthy further conjectured that every aspect of learning, and any other feature of intelligence, can in principle be described so precisely that a machine can be built to simulate it.
In 1995, Jack Copeland described AI further, averring that AI has no settled technical definition. He analyzed what those working in AI must achieve before claiming to have built a thinking machine, and itemized key areas of AI as follows:
- Generalize learning
- Recognize human faces
- Reason to draw conclusions based on information gathered
- Problem solving
- Understand human language
Other people define AI as computers that seem to have human intelligence: not merely the ability to obey road signs and drive forward, but to show human emotions such as road rage. This is not a new concept; recall that Dartmouth professor John McCarthy coined the term AI in 1956.
Recently, a huge amount of data is being generated. Technology giants such as Google, Facebook, Twitter, Microsoft, Amazon, and IBM embrace AI to solve problems of various magnitudes. AI is being used in robotics to solve complex empirical problems. AI can manifest in many ways, such as forecasting the weather from source data; the same data can yield a different forecast depending on the intent of the question. Thus, a machine is capable of thinking based on how it is programmed, and natural language processing (NLP) makes this even more striking. One of the most exciting areas of AI is machine learning (ML): machines can retain knowledge based on the data collected, in contrast to humans, who retain knowledge but respond differently.
Types of AI
There are three types of AI: weak, strong, and superintelligence (Goertzel 2014). Weak AI focuses on narrow tasks. Strong AI can apply intelligence to solve problems generally rather than focusing on one specific problem; its responses can be compared to those of a typical human. Superintelligent AI is supposed to possess intelligence that surpasses the brightest and most gifted human minds (Muehlhauser 2014). Through recursive self-improvement, an artificial general intelligence could rapidly advance to this level.
AI and Capabilities
The field of AI is vast and yet recognizes many unknowns. Some known AI capabilities are listed as follows (Brownlee 2013):
- NLP enables machines to communicate naturally in human languages.
- Knowledge representation enables machines to represent known knowledge.
- Automated reasoning enables machines to determine appropriate reasoning like a human.
- ML entails teaching the machines.
- Computer vision enables machines to see and detect.
- Robotics helps in automating movable use cases such as manufacturing units.
- Internet of Things (IoT) devices help in data collection and controlling machines through sensors.
- Virtual reality helps simulate human senses close to reality, convincing the mind that the simulation is real.
What Is the Purpose of AI?
- To better humankind.
- To help humans extend their capabilities on repeated tasks through automation.
- To avoid human and manual errors on repeated tasks.
- To improve programmatic approaches so they align with ever-changing business demands and requirements. Moving from programmatically maintained business rules to data-driven decision processes lets systems accommodate business demands immediately.
- To handle the large volume of data generated every day, approximately 2.5 billion gigabytes.
- To handle a variety of data in an automated way.
- To handle the velocity of data.
- To detect patterns in data.
- No human or group of humans can handle the huge volume of data and the variety of data at its present velocity of accumulation.
The five human senses are sight, sound, touch, smell, and taste. Currently, machines replicate sight through cameras and present images through monitors and projectors. They replicate hearing through microphones and speech through speakers, and users interact through touch screens, keyboards, and mice. Machines with a sense of smell or taste have yet to be developed, although inventions such as a digital nose and smell maker, and an electronic tongue to mimic taste, are in progress. People are working to answer the challenges associated with these new machine and device capabilities. The artifacts are still evolving; their reliability and usability have not yet been tested in real-world applications, and acceptance among users is still unclear. Applying machines with these digital senses to real-world applications would be an enormous accomplishment, but they remain in the research and development (R&D) experimental phase. The risks are very high, and no single corporation can handle these challenges alone, which is why many developers rely on open source environments and university-based projects. It also remains unclear how reliable and secure these evolving capabilities will be.
The AI landscape can be divided into Horizontal and Vertical AI. Horizontal AI focuses on general questions and fundamental problems across industries; large corporations such as Google, Facebook, Microsoft, Amazon, and IBM, as well as universities, are investing in it. Vertical AI focuses on a specific industry problem, business case, and use cases; many startups and medium-sized companies are investing in and exploring Vertical AI.
Some major Horizontal AI projects are Watson (IBM), AlphaGo (Google), Google Brain, Blue Brain (IBM), M (Facebook), Siri (Apple), Google Now, Cortana (Microsoft), Wolfram Alpha (Wolfram Research), Echo (Amazon), and Google home.
One example of a Vertical AI project is www.BizStats.AI, covering retail e-commerce and event ticketing.
Here is the list of supported industry-specific verticals:
- Retail e-commerce Analytics AI: https://bizstats.ai/solutions/by_industry/retail_e-commerce.html
- Automotive Analytics AI: https://bizstats.ai/solutions/by_industry/automotive.html
- Banking and Financing Analytics AI: https://bizstats.ai/solutions/by_industry/banking_finance.html
- Consumer Products Analytics AI: https://bizstats.ai/solutions/by_industry/consumer_products.html
Define AI Projects
An AI project is similar to the PMBoK (PMI 2017) project definition, except that AI projects depend more heavily on data and algorithms. This includes the availability of initial data for training, a continuing data-collection strategy, cleaning the collected data, determining the useful features of the data, transforming data to fit a model, selecting appropriate algorithms, evaluating multiple algorithms for accuracy against one another, and determining the learning rate of the model. Can the model function autonomously, or does it need human intelligence to speed up the learning process?
The first phase of an AI project is the most important. It defines and identifies business cases and use cases like a regular project, but carries more risk, because AI projects are still in discovery mode and their processes are still evolving; companies are trying to learn from each other's mistakes, challenges, and new knowledge while determining how to monetize. Again, this falls under the business value proposition, as in a regular project. See Figure 2.1.
Figure 2.1 AI project value proposition
AI project value propositions promise great business value: a rapid increase in revenue in the shortest time possible by gaining more customers and market share, in the mode of start-up companies. AI projects are continuous, collecting new sets of data and applying predefined or preselected algorithms and pretrained models that have already gone through initial training. The goal is to reach high accuracy, near 80 percent or more; the required accuracy depends on the business case, the goal of the project, and the problem. For example, an AI self-driving-car project needs accuracy near 100 percent and zero fault tolerance, because human safety is directly involved. Other AI projects, such as the assistance provided by Apple's Siri, may not need 100 percent accuracy. In general, more accuracy with less fault tolerance is better.
AI projects need the latest, most in-demand roles and skill sets, such as data scientists, data architects, data designers, data engineers, ML engineers, AI engineers, cloud engineers, and subject matter experts in their respective fields. In addition to human resources, machines are part of the resources needed, such as IoT devices, virtual reality devices, and robots.
See Figure 2.2 for AI-based projects and AI products that enable business value creation. Using AI provides more value that is data driven and automated.
Figure 2.2 Business value versus time
Business value is the net quantifiable benefit that may be tangible, intangible, or both. Business value benefits can include time, money, goods, or intangibles.
Most corporations strive for the following business value:
- How to safeguard and increase monetary assets.
- How to increase market share and revenue share.
- How to increase the customer base by designing innovative useful products.
- How to increase the good will of the organization.
- How to improve brand recognition, brand value, and corporate reputation.
- How to improve customer experience.
Big Data Ecosystem
The big data ecosystem is growing exponentially, meaning more data are being generated every minute and usage of data devices, data collectors, data aggregators, and data users or buyers is increasing. In the big data ecosystem, a need exists for AI solutions for risk. See Figure 2.3. The risk associated with AI projects is enormous. AI projects have positive risks and negative risks. The next section details some positive and negative risks of AI projects.
Figure 2.3 Big data ecosystem
Positive and Negative Risks of AI
One major challenge, and a negative risk of AI, is aligning it with human emotions and safety. The presumption is that AI is programmed to do something, and if it gets into the wrong hands, it can be used to create harm, such as serving as a weapon. An AI arms race can lead to AI war, which may cause mass casualties. Weapons could be designed to be extremely difficult to turn off, causing humans to lose control of the situation (Yudkowsky 2008).
Humans may have only good intentions when developing AI systems, but the system itself may develop a destructive method to achieve its intended goal. In such a situation, much havoc can ensue. For example, requesting a vehicle to take a person from Point A to Point B very fast might create problems because an AI machine may travel too fast. Many other examples can be added to this scenario. The point here is that AI systems must be made with considerations for human safety.
Well-known people in science and technology, including Stephen Hawking, Elon Musk, Steve Wozniak, and Bill Gates, have expressed concerns about AI. Strong AI may take a long time to develop, but recent accomplishments have raised concern that AI development could accelerate at a surprising pace if humans do not take the necessary precautions. AI could become more intelligent than any human, and ultimately humans could not predict how it would behave. Presently, humans control the world, and having something else in control is a scary thought. Thus, the idea is to support AI safety and proceed with great caution.
Negative Risks of AI
- No standardized terminology exists, and AI can loosely be viewed as a machine that chooses whatever action appears to best achieve its goals. This means AI can choose whatever function it assesses as best, depending on the mathematical algorithm.
- An early AI system may not be intelligent enough to think of resisting programmer attempts to modify it, but a sufficiently advanced system could react by resisting any changes to its goal structure.
- A superintelligent program created by humans may be obedient to humans. However, being more intelligent than a human, it may understand moral truth better than humans do. This could create a problem in slowing it down, simply because it knows more than the human and may think it knows the best approach.
- Using AI to do things for humans may lead to losing our skills.
- Humans will end up blaming machines for a mistake made by humans.
- A smart machine may decide that humans are not needed.
- AI may use human weaknesses against us.
- AI may use human intentions against humans.
Positive Risks of AI
- AI can now automate everyday tasks.
- AI can help with management decisions and put the most effective teams together.
- AI can process and analyze a huge amount of data.
- AI can converse with customers to resolve customer issues.
- AI can create algorithms to forecast growth.
- AI can help doctors diagnose patients.
- Incorporating AI into an organization is like having a private robotic assistant that can streamline the work that needs to be undertaken in the office space.
- AI can be quite productive.
- AI can carry out menial tasks in the office such as managing referrals.
- AI can update and coordinate schedules.
- Some predictors speculate that AI is not yet mature enough to provide careful assessment of its value to the organization.
- Planning for the inclusion of AI into organizational capabilities can be handled in an orderly and beneficial fashion with a little planning effort.
- AI technology is changing at a fast rate. Organizations are now in various stages of managing their AI interests.
Challenges of AI and Adoption
Distraction of AI
Many organizations are distracted by the issues discussed in the following sections.
Organizations are concerned that not enough attention is being focused on the dangers of AI. Because so many devices are being connected to the Internet, there is fear that the data they generate may cause cybersecurity problems; moreover, there are not enough skilled cybersecurity workers. It is predicted that AI and ML could automate threat detection and response, and that this approach could be a more efficient response to potential threats (Ford 2015).
Many organizations are wary of data risk issues, and ML algorithms themselves can create a false sense of security. Researchers have recently trained supervised ML software for threat detection. The algorithms used to train such a machine learner must be well defined, and rolled-out software requires thorough scrubbing of anomalous data points; even so, the algorithm may miss some attacks. Attackers who gain access to corporate systems could corrupt training data by switching labels so that some malware is tagged as clean code. A compromised algorithm that fails to flag a problem creates even bigger risk.
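The label-switching attack described above can be sketched in a few lines. The data and the nearest-centroid "detector" below are hypothetical stand-ins for a real trained classifier; the point is only to show how flipped training labels corrupt predictions.

```python
# Sketch (hypothetical data): a nearest-centroid "malware detector" whose
# training labels are switched by an attacker.

def centroid(points):
    """Mean of a list of feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def train(samples):
    """samples: list of (features, label); returns label -> centroid."""
    by_label = {}
    for feats, label in samples:
        by_label.setdefault(label, []).append(feats)
    return {label: centroid(pts) for label, pts in by_label.items()}

def predict(model, feats):
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(feats, c))
    return min(model, key=lambda label: dist(model[label]))

# Clean training data: "malware" has high feature values, "clean" has low.
clean_data = [([9.0, 8.5], "malware"), ([8.0, 9.0], "malware"),
              ([1.0, 0.5], "clean"), ([0.5, 1.0], "clean")]

model = train(clean_data)
print(predict(model, [8.5, 8.0]))   # malware correctly flagged

# Poisoned data: the attacker switches the labels.
poisoned = [(f, "clean" if l == "malware" else "malware") for f, l in clean_data]
bad_model = train(poisoned)
print(predict(bad_model, [8.5, 8.0]))  # the same sample is now tagged "clean"
```

The poisoned model is structurally identical to the clean one; only the labels changed, which is exactly why such corruption is hard to detect after deployment.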
Given all the risk issues discussed, AI and ML should not be the sole line of risk defense. An appropriate risk process must be in place to monitor and minimize the risk associated with adopting algorithms and ML. Researchers have shown that finding people with knowledge and experience in both cybersecurity and data remains a challenge.
Mass Unemployment Due to AI Adoption
In the United States, people typically stop working at age 65 and spend their time mentoring other workers or volunteering. In the manufacturing sector, AI may not have a large negative impact on jobs; job losses may come instead in service sectors such as construction, health care, and business. Whether a job is lost will mostly depend on how jobs are transformed by adding new tasks while being supported by computers and robots.
AI algorithms are replacing jobs that are routine, repetitive, and time consuming and thus are more easily and effectively performed by machines and robots. This leaves humans to tackle interpersonal, social, and emotional work (Furman 2016). A typical example is the bank teller: the job may change so that tellers concentrate on dispensing money and helping clients.
Areas in which AI and ML can be greatly helpful are agriculture, weather forecasting, and determining the latest market prices. A typical requirement for an online customer is requesting help to purchase products. AI can add services to improve customer experiences, allowing companies to retain those customers.
It is possible that human labor may remain less expensive than machines, or that required skills are lacking, along with energy infrastructure, broadband, and transport networks. Other areas, such as legal and regulatory work, could use AI quite well, but deployment raises new questions; for example, when AI assists a doctor, it must be determined whether the AI bears responsibility in claims of medical malpractice.
The Impossibility of Total Human Control
AI is currently popular and can be heard about everywhere. Many business sectors use AI, including insurance, health care, genetics, agriculture, road traffic management, and other data-driven areas. Some people believe companies are trying to remove human resources from routine work and replace them with AI and ML. Many organizations like what AI can do for them; others focus on negative aspects such as data risk issues, data privacy concerns, mass unemployment due to AI integration, bias in AI, the impossibility of total human control, and the notion that AI-based solutions are still too expensive for most organizations. The question is whether real problems exist in using AI, or whether this is simply prejudice against the technology and the ideas it brings to the business market.
AI has been used and recommended by many experts in business and computing fields.
AI and ML use large amounts of data, most of it personal. Recently, data have leaked from organizations such as Facebook and Apple, which use ML for personal data processing. Will AI and ML increase the probability of data leakage? So far, no cases of an AI-caused data leak have occurred; in fact, organizations build AI to solve data leak problems.
Usually, top-notch designers design AI software, making the application safe and difficult to hack; even if the software were hacked, it would be difficult to understand and modify. The data inside an AI system or neural network cannot easily be extracted because of the way such systems are built. AI and neural network systems are typically developed using open source frameworks and libraries such as Microsoft CNTK, Theano, TensorFlow, Caffe, Keras, and Torch. Most of these open source frameworks are supported by large organizations such as Google, Facebook, and Microsoft, whose policies help ensure privacy through penetration tests. Of the five causes of data breach, four stem from human error such as password mistakes. This suggests that these AI-related problems are not legitimate concerns and can be considered myths; data leaks will diminish drastically if organizations pay more attention to training their staff in the basic rules of data management.
People do not like their data to be analyzed for fear of being targeted. Personal data analyses have long been performed in insurance, finance, and other industries, and people fear that other people or machines will know their private details. AI alone does not violate any personal data policies, although data insights can reveal an organization's status, such as profit or loss. Organizations developing AI have processes to regulate documentation and personal data. Further, people fear that AI and neural networks will replace managers in making decisions. Every organization is interested in skillful and loyal personnel; the existence of AI and neural networks does not mean companies plan to substitute computers for humans, even though computers can do some human work faster and probably better. Gartner predicted that by 2020 AI would create 2.3 million jobs while eliminating 1.8 million.
AI programs and robots are liable to make mistakes. A good example is a robot failing exam questions that would be obvious to young children. Police departments have used AI systems, and those systems can also make mistakes; this can be concerning in cases such as distinguishing between a toy gun and a real gun.
People fear that a smart computer with AI will control humans instead of the other way around. This belief stems mostly from customers who, after watching sci-fi movies, imagine dealing with computers rather than humans. The following points counter that belief:
- Complex AI systems comprise a few subsystems such as speech recognition, decision making, and data analysis. All the subsystems are hard coded; thus, it is not possible for the subsystems to add new features by themselves.
- AI systems have limitations based on how they are developed.
- Developing AI that is close to a human can be quite expensive to have and maintain.
Developers can create very complex neural networks and ML algorithms for various industries. However, the cost of such systems is high, they are difficult to build and maintain, and complexity grows with new product ideas.
Data to AI
Let us illustrate how AI operates: AI starts with data and progresses to data science, then to ML, deep learning, and finally to AI. Figure 2.4 shows the relationship between data, data science, ML, deep learning, and AI.
The distinction between data, data science, ML, deep learning, and AI is defined next.
Figure 2.4 Data to AI
What Are Data?
Data are information, usually used for analysis, calculation, or planning, and are often produced or stored by a computer. Data have value and relate to subjects that may be qualitative or quantitative.
Examples of data in the corporate world are revenue, sales data, profits, and stock prices. In the government sector, examples are rates such as crime rates and unemployment rates. Examples in the nongovernmental sector are the number of homeless people or the top location of homeless people.
What Is Data Science?
Data science is the field that uses scientific methods, processes, algorithms, and systems to derive knowledge and insights from acquired data in an environment. Data can be structured or unstructured. The idea of data science is to unify statistics, data analysis, ML, and related methods to provide insights.
What Is ML?
The term ML was coined by Arthur Samuel in 1959. ML explores the study and construction of algorithms that can learn from data and make decisions and predictions by building a model. ML is a subset of AI in the fields of computer science and data science, and it is used in computing tasks where designing and programming explicit algorithms is infeasible. ML simply means teaching machines to accomplish expected tasks.
Major Problems of Teaching Machines
Major problems in teaching machines are:
- Identifying methods to teach the machines.
- Identifying, collecting, and preparing training data.
- Managing the long time it takes to train machines.
- Providing the many computational resources training requires.
- Improving model accuracy in efficient time with optimal resources.
Types of ML Systems
Choosing the type of ML system is very important and depends on the use case, the available algorithms, the type of data, and the problem one is trying to solve. The most commonly used types of ML systems are listed as follows:
- Supervised ML
- Unsupervised ML
- Semisupervised ML
- Transfer ML
- Reinforcement ML
- Ensemble learning
Additionally, types of ML can be categorized by how the machine is trained: offline learning or online (real-time) learning, in a predefined batch mode or a stream mode at regular intervals.
- Offline learning
- Online learning
- Batch mode learning
- Stream mode learning
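The difference between batch and stream training can be illustrated with the simplest possible "model," a running mean. The data below are hypothetical; the same idea applies to real model parameters updated incrementally.

```python
# Sketch: the same statistic learned in batch mode versus stream mode.
# Batch (offline) learning sees all data at once; online (stream) learning
# updates the model one observation at a time, as a stream would deliver it.

data = [4.0, 7.0, 1.0, 8.0]

# Batch mode: compute the model parameter (here, a simple mean) in one pass.
batch_mean = sum(data) / len(data)

# Stream mode: incremental update, never storing the whole dataset.
online_mean, n = 0.0, 0
for x in data:
    n += 1
    online_mean += (x - online_mean) / n   # running-mean update rule

print(batch_mean, online_mean)  # both converge to the same value: 5.0
```

The stream version never needs the full dataset in memory, which is exactly why online learning suits high-velocity data.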
Another way to categorize ML systems is based on similarity or on derived mathematical models.
- Instance-based learning compares incoming data to stored instances by similarity, detecting new patterns in the data and enabling continuous learning.
- Model-based learning uses mathematical formulas to construct a mathematical model of the data.
What Is Supervised ML?
Supervised ML teaches machines through training data consisting of example inputs and outputs, drawn from historically collected data or valid data labeled by humans. The name reflects the concept of human supervision used to achieve high accuracy on classification or categorization problems. A further goal is to predict the future value of continuous variables, such as housing prices, based on available historical data, by applying appropriate algorithms that extend mathematically and geometrically from the historical data points, linearly or nonlinearly.
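The housing-price prediction mentioned above can be sketched with a least-squares linear fit. The sizes and prices below are hypothetical labeled examples, not real market data.

```python
# Sketch (hypothetical data): supervised learning of housing prices from
# labeled examples (size in square feet -> price), via a least-squares fit.

sizes  = [1000.0, 1500.0, 2000.0, 2500.0]   # inputs (features)
prices = [200.0, 300.0, 400.0, 500.0]       # labeled outputs, in $1,000s

n = len(sizes)
mean_x = sum(sizes) / n
mean_y = sum(prices) / n

# Closed-form least-squares estimates for slope and intercept.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices)) \
        / sum((x - mean_x) ** 2 for x in sizes)
intercept = mean_y - slope * mean_x

predicted = intercept + slope * 1800.0      # price for an unseen 1,800 sq ft house
print(round(predicted))  # 360
```

The model "extends geometrically" from the historical points exactly as the text describes: the fitted line is used to value houses the training set never saw.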
Linear Versus Nonlinear
Here is an explanation of linear and nonlinear data points; this is the starting point for determining whether the pattern in the data is linear or nonlinear. A linear pattern is one whose data points fall on a straight line, such as (x,y) = (0,0), (1,1), (2,2), (3,3). See Figure 2.5.
Figure 2.5 Linear and nonlinear graphs
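The linear pattern above can also be checked programmatically. This sketch tests whether consecutive points share the same slope; the nonlinear points are an added illustrative contrast, not from the text.

```python
# Sketch: testing whether data points follow a linear pattern by checking
# that every consecutive pair of points has the same slope.

def is_linear(points, tol=1e-9):
    (x0, y0), (x1, y1) = points[0], points[1]
    slope = (y1 - y0) / (x1 - x0)
    return all(abs((yb - ya) / (xb - xa) - slope) <= tol
               for (xa, ya), (xb, yb) in zip(points, points[1:]))

linear_pts    = [(0, 0), (1, 1), (2, 2), (3, 3)]   # the pattern from the text
nonlinear_pts = [(0, 0), (1, 1), (2, 4), (3, 9)]   # y = x^2, for contrast

print(is_linear(linear_pts))     # True
print(is_linear(nonlinear_pts))  # False
```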
Some of the important and famous supervised ML algorithms are:
- k-nearest neighbors, a nonparametric method used to classify and regress.
- Linear regression is a linear approach to modeling the relationship between a dependent variable and one or more explanatory independent variables.
- Logistic regression is a method of analyzing a dataset with one or more independent variables that determine an outcome.
- Support Vector Machines are supervised ML algorithms used for classification or regression challenges.
- Decision trees are tree-structured predictive models; random forests are collections of decision trees whose results are aggregated into one result.
- Neural networks are a series of algorithms that recognize underlying relationships in a set of data through a process that imitates the way the human brain operates.
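As a concrete example of the first algorithm in the list, here is a minimal k-nearest-neighbors classifier. The two-feature customer data are hypothetical.

```python
# Sketch (hypothetical data): a minimal k-nearest-neighbors classifier.
# A query point takes the majority label of its k closest training points.

from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (features, label); returns majority label of k nearest."""
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

training = [([1.0, 1.0], "low"), ([1.2, 0.8], "low"), ([0.9, 1.1], "low"),
            ([5.0, 5.0], "high"), ([5.2, 4.8], "high"), ([4.9, 5.1], "high")]

print(knn_predict(training, [1.1, 0.9]))  # low
print(knn_predict(training, [5.1, 5.0]))  # high
```

Note that k-NN is nonparametric, as the list states: it builds no model at all and simply consults the stored training instances at prediction time.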
Some typical use cases for supervised ML algorithms follow:
- Classification of customers based on purchasing patterns, behavior patterns, frequency patterns, income-based patterns, and so on.
- Grouping of customers by lifetime value usage patterns.
- Predicting customer churn based on usage pattern, value versus price, competitor pricing strategy on campaigns, introducing new products and services, and much more.
- Fraud detection by analyzing data to find anomalies, unique patterns, and extreme cases.
- Sales forecasts and predictions.
- Risk identification and risk categorization.
- Building and categorizing threat models.
What Is Unsupervised ML?
Unsupervised ML teaches machines through training data autonomously, without labeled data. The name reflects the absence of human supervision: algorithms approximate groupings, identify associations between data, and detect anomalies.
Some important and famous unsupervised ML algorithms are:
- Clustering
- Association rule
- Anomaly detection
- Dimensionality reduction
Some typical use cases for unsupervised ML algorithms are:
- Clustering or grouping of any data.
- Anomaly detection of any set of data.
- Similar product and associated product recommendation.
- Feature reduction on high dimensional data.
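The grouping use case above can be illustrated with a minimal k-means implementation. The points, the value of k, and the naive initialization are all illustrative choices; no labels are ever provided.

```python
# Sketch (hypothetical data): k-means clustering, a standard unsupervised
# algorithm. The algorithm discovers the two groups on its own.

def kmeans(points, k=2, iterations=10):
    centers = points[:k]                       # naive initialization
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center's cluster.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: (p[0] - centers[i][0]) ** 2
                                        + (p[1] - centers[i][1]) ** 2)
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster.
        centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

data = [(1.0, 1.0), (1.5, 1.2), (0.8, 0.9),     # one natural group
        (8.0, 8.0), (8.3, 7.8), (7.9, 8.2)]     # another natural group
centers, clusters = kmeans(data)
print(sorted(len(c) for c in clusters))  # [3, 3]
```

Production implementations add smarter initialization and convergence checks, but the assign-then-update loop is the whole algorithm.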
What Is Semisupervised ML?
Semisupervised ML combines a supervised learning method with an unsupervised learning method over any mix of labeled and unlabeled data. Typically, semisupervised ML first applies an unsupervised algorithm to the unlabeled data to derive labels, then applies supervised learning algorithms to improve accuracy. Sometimes annotation tools are also used for human labeling through web-based applications such as http://bizstats.ai/product/urAI.html
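One simple semisupervised recipe, self-training, can be sketched as follows. The 1-D data and the nearest-centroid model are hypothetical stand-ins for a real classifier: fit on the few labeled points, pseudo-label the unlabeled pool, then refit on everything.

```python
# Sketch (hypothetical data): self-training, a simple semisupervised recipe.

def centroids(samples):
    """samples: list of (value, label); returns label -> mean value."""
    sums = {}
    for value, label in samples:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + value, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}

def predict(model, x):
    return min(model, key=lambda label: abs(x - model[label]))

labeled = [(1.0, "small"), (9.0, "large")]        # the scarce labeled data
unlabeled = [1.2, 0.8, 8.7, 9.3, 8.9]             # the plentiful unlabeled data

model = centroids(labeled)
# Pseudo-label the unlabeled pool with the initial model, then refit on both.
pseudo = [(x, predict(model, x)) for x in unlabeled]
model = centroids(labeled + pseudo)

print(predict(model, 1.1), predict(model, 9.1))  # small large
```

The refit model has seen seven points instead of two, which is the payoff of semisupervised learning: the unlabeled majority still shapes the decision boundary.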
What Is Deep Learning?
Deep learning is a family of ML methods based on learning useful representations or features from raw data. Data representation here means understanding the structure of the data and identifying, extracting, and evolving underlying features from the raw data. The feature learning can be supervised, semisupervised, unsupervised, or reinforcement learning. Deep learning architectures based on learning data representations include artificial neural networks, deep neural networks, deep belief networks, and recurrent neural networks.
A typical distinction between traditional ML approaches and deep learning approaches is that deep learning automatically learns features from the raw data, using models formed from different types of layers.
Fully connected neural network layers connect every input to every output (see Figure 2.6). There are three basic types of layers: (1) the input layer, (2) the output layer, and (3) the hidden layer. Some functionality-based layers are the convolution layer, max/average pooling layer, dropout layer, nonlinearity layer, and loss function layer.
Figure 2.6 Fully connected neural network layer
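The fully connected layer just described can be expressed in a few lines: every output is a weighted sum of every input plus a bias, passed through a nonlinearity. The weights below are illustrative, not trained.

```python
# Sketch: the computation a fully connected (dense) layer performs.

def dense(inputs, weights, biases):
    """weights[j][i] connects input i to output j."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(max(0.0, z))            # ReLU nonlinearity
    return outputs

x = [1.0, 2.0, 3.0]                            # 3-unit input layer
W = [[0.5, -0.5, 1.0],                         # 2-unit output layer
     [-1.0, 0.25, 0.5]]
b = [0.1, -0.2]

print(dense(x, W, b))  # approximately [2.6, 0.8]
```

Training a network amounts to adjusting W and b so that these outputs match the labeled targets; the forward computation itself never changes.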
Convolution Neural Networks
A Convolution Neural Network (CNN) contains a convolution layer that filters the input section by section for useful features, flowing across all sections to automatically extract the features important to the given problem. CNNs work best in image recognition use cases. The following notation illustrates common neural network architectures, beginning with the perceptron:
Perceptron (P): 2 input layer → 1 output layer.
The perceptron is the basic architecture of artificial neural networks.
Let us consider x1, x2, x3, …, xn as inputs and w1, w2, w3, …, wn as the corresponding weights, so that each input xi is multiplied by its weight wi. The weighted sum is

z = w1x1 + w2x2 + w3x3 + … + wnxn = wT.x

and the perceptron's output is

h(x) = step(wT.x) = step(z)
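The perceptron equations above translate directly into code. This sketch trains the weights with the classic perceptron learning rule on the AND function (an illustrative choice; any linearly separable target would do).

```python
# Sketch: a perceptron with the step activation h(x) = step(w.x) from the
# equations above, trained with the perceptron learning rule on logical AND.

def step(z):
    return 1 if z >= 0 else 0

def predict(weights, inputs):
    # inputs[0] is a constant 1, so weights[0] acts as the bias term.
    return step(sum(w * x for w, x in zip(weights, inputs)))

# Training data for AND: ([bias, x1, x2], target).
data = [([1, 0, 0], 0), ([1, 0, 1], 0), ([1, 1, 0], 0), ([1, 1, 1], 1)]

weights = [0.0, 0.0, 0.0]
rate = 0.1
for _ in range(20):                             # a few epochs suffice here
    for inputs, target in data:
        error = target - predict(weights, inputs)
        # Perceptron rule: nudge each weight toward reducing the error.
        weights = [w + rate * error * x for w, x in zip(weights, inputs)]

print([predict(weights, inputs) for inputs, _ in data])  # [0, 0, 0, 1]
```

Every architecture in the list that follows is, at bottom, many of these units wired into layers.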
Feed forward (FF): 2 input layer → 2 hidden layer → 1 output layer.
Radial basis network (RBF): 2 input layer → 2 hidden layer → output layer.
Deep feed forward (DFF): 3 input layer → 2 hidden layer → 1 output layer.
Recurrent Neural Network (RNN): 3 input layer → 2 hidden layer → 3 output layer.
Long/Short Term Memory (LSTM): 3 input layer → 2 hidden layers with memory cell → 3 output layer.
Gated Recurrent Unit (GRU): 3 input layer → 2 hidden layers with different memory cell → 3 output layer.
Auto encoder (AE): 4 input layer → 1 hidden layer → 4 output layer with a matched number of input cells to output cells.
Variational AE (VAE): 4 input layer → 1 hidden layer with probabilistic hidden cell → 4 output layer with a matched number of inputs to output cells.
De-noising AE (DAE): 4 input layers with noisy input cell → 1 hidden layer → 4 output layer with a matched number of inputs to the output cells.
Sparse AE (SAE): 2 input layer → 1 hidden layer
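The convolution layer described earlier slides a small filter across the input, section by section, computing a weighted sum at each position. A minimal one-dimensional sketch in plain Python (the signal and filter values are illustrative):

```python
def convolve1d(signal, kernel):
    """Slide a filter across the input, computing a weighted sum per section."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

# An edge-detecting filter [-1, 1] highlights where the signal jumps
signal = [0, 0, 1, 1, 0]
print(convolve1d(signal, [-1, 1]))  # → [0, 1, 0, -1]
```

Image-recognition CNNs apply the same idea in two dimensions, with many filters whose values are learned rather than hand-picked.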
Deep learning has been applied to image recognition, computer vision, speech recognition, natural language processing, machine translation, and much more. Most recently, deep learning has become very popular in the technology industry in areas of computer assistance such as human language translation, customer support, and bots, which we explored while writing this book.
For AI applied to risk, we used most of these deep learning architectures to automatically detect features from raw data for the respective risk use cases. These are covered in the AI solutions for risk chapters.
What Is Transfer Learning?
Transfer learning focuses on storing the knowledge gained from solving one problem and applying it to a different but related problem. A typical example is taking the knowledge gained from recognizing one kind of object and applying it to recognize a similar one.
Transfer learning reuses a previously trained model and applies additional training to answer a specific problem as an add-on. The goal is to start from a model trained on a similar problem and extend it to solve new ones.
A typical example is in natural language processing applications. A model is first trained on an English-language corpus; the knowledge gained from recognizing English words and grammar is then reused, with additional training on use-case-specific problems, to build a chatbot.
What Is Reinforcement Learning?
Reinforcement learning is the part of ML concerned with how software agents should act in an environment to maximize reward. Because of its generality, reinforcement learning is studied in many disciplines, including game theory, control theory, operations research, and statistics. It differs from standard supervised learning in that correct input/output pairs are never presented.
Reinforcement learning consists of a learning agent that continuously learns by observing and trying out the next action according to the defined policy. The machine earns rewards or penalties and, based on these real-time examples or tryouts, automatically updates its policy. These steps continue until the optimal policy is constructed.
Markov Decision Process
How does one apply Markov Decision Process (MDP) to automate the decision-making process?
MDPs take their name from the Russian mathematician Andrey Markov and date back to the 1950s.
MDP provides a mathematical approach for making decisions in situations where outcomes are partly random and partly under the control of a decision maker. MDPs have been used to study optimization problems via dynamic programming and reinforcement learning. People in many disciplines use MDPs, such as robotics, automatic control, manufacturing, economics, gaming, and driverless cars, and they can be extended to risk use cases.
MDP can be explained using a scenario where the thought process is applied by an agent; in the case of an ML model, that thought process is a calculation refined through trial and error. In summary, an MDP is a sequential decision process for a fully observable, random environment. This environment consists of a set of states and a set of actions and rewards, including positive and negative rewards. The policy captures all possible states (s0, s1, s2, …, sn), all possible actions from the current state to the next state [A(s0), A(s1), A(s2), …, A(sn)], and the respective rewards R(s). The policy is represented as π(s).
MDP is explored in detail in the coming chapters and is illustrated as follows:
- A probability of moving to different states.
- A way to evaluate the reward for being in each state.
s ∈ S — a set of states
a ∈ A — a set of actions
T(s, a, s′) — a transition function/model/dynamics: the probability that a from s leads to s′, that is, P(s′ | a, s)
R(s, a, s′) — a reward (cost) function, aka R(s′) or R(s), used to maximize the reward/cost
α — a start state
γ — a discount factor
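The components above can be sketched as a tiny MDP in plain Python, together with a value-iteration loop that backs up expected discounted rewards. The two-state environment, its probabilities, and its rewards are all made up for illustration:

```python
# A toy MDP: states, actions, transition probabilities T(s, a, s'), rewards R(s')
states = ["s0", "s1"]
actions = ["stay", "move"]
# T[(s, a)] maps each successor state s' to its probability P(s' | a, s)
T = {
    ("s0", "stay"): {"s0": 1.0},
    ("s0", "move"): {"s0": 0.2, "s1": 0.8},
    ("s1", "stay"): {"s1": 1.0},
    ("s1", "move"): {"s0": 0.9, "s1": 0.1},
}
R = {"s0": 0.0, "s1": 1.0}  # reward for landing in each state
gamma = 0.9                 # discount factor

def value_iteration(n_sweeps=50):
    """Repeatedly back up expected discounted rewards to estimate state values."""
    V = {s: 0.0 for s in states}
    for _ in range(n_sweeps):
        V = {
            s: max(
                sum(p * (R[s2] + gamma * V[s2]) for s2, p in T[(s, a)].items())
                for a in actions
            )
            for s in states
        }
    return V

print(value_iteration())
```

Here staying in s1 keeps collecting the reward of 1, so its value converges toward 1/(1 − γ) = 10, and s0's best move is toward s1.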
MDPs are nondeterministic/stochastic search problems (Haskell May 10, 2019). Nondeterministic means that the next action could be anything, in any direction, rather than a predefined sequence of steps; each transition may lead to a different state and need not repeat.
A state transition is represented as an equation: for a Markov state s and successor state s′, the state-transition probability is defined by Pss′ = ℙ[St+1 = s′ | St = s].
Two approaches to agent learning are passive learning and active learning. Passive learning mainly focuses on learning about the environment through exploration under a fixed policy, whereas active learning builds the policy by acting.
Decision Toward Next Action
MDP process use cases mainly rest on the decision to determine the next action for the use cases listed as follows:
- Robot path planning
- Route planning
- Aircraft navigation
- Driverless car navigation
- Manufacturing process
- Network switching and routing
The Monte Carlo (MC) algorithm is a randomized algorithm based on small probabilities: it applies randomness and the statistics of the standard normal distribution, using repeated random sampling to obtain approximate solutions. The method is used in cases with no analytical or numerical solution.
Steps to implement MC methods follow:
- Determine the properties of statistics of input data.
- Generate all possible inputs based on the identified properties of statistics in Step 1.
- Perform a deterministic calculation.
- Analyze statistical results.
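The four steps above can be sketched with a classic example: estimating π by sampling uniform random points in the unit square and counting how many fall inside the quarter circle:

```python
import random

def estimate_pi(n_samples=100_000, seed=0):
    """Monte Carlo estimate of pi via repeated random sampling."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        # Step 2: generate inputs from the identified distribution (uniform)
        x, y = rng.random(), rng.random()
        # Step 3: deterministic calculation -- is the point inside the circle?
        if x * x + y * y <= 1.0:
            inside += 1
    # Step 4: analyze the statistical result
    return 4.0 * inside / n_samples

print(estimate_pi())  # approaches 3.14159... as n_samples grows
```

The approximation improves as the number of samples grows, which is the defining trade-off of MC methods: accuracy for computation.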
MC simulation is a computerized mathematical technique to account for risk in the quantitative analysis and decision-making process. An MC simulation is a useful tool to predict future results by calculating a formula multiple times with different random inputs.
This method can solve many optimization and numerical problems by generating samples from the statistical distribution of the input data; applications range from simulating working systems and predicting financial investments with risk analysis to theoretical physics problems.
Because the integral ranges from 0 to infinity, a numerical approximation is needed. The crude MC method approximates an integral I = ∫ f(x) dx by sampling N random points x1, …, xN from the domain and averaging:
Î = (1/N) × [f(x1) + f(x2) + … + f(xN)]
The variance of the estimation is:
Var(Î) = Var(f)/N
which, expanded with the sample values, is:
Var(Î) ≈ (1/N) × {(1/N) Σ f(xi)² − [(1/N) Σ f(xi)]²}
Common probability distributions are:
- Normal/bell curve
Most business activities, plans, and processes are too complex for an analytical solution. Many business situations involve uncertainty in many dimensions: for example, variable market demand, unknown plans of competitors, and uncertainty in costs, among many others.
What Is MC and How It Is Used?
MC simulation is named after the Monte Carlo casino in Monaco. The technique was originally developed by Stanislaw Ulam, a mathematician who worked on the Manhattan Project, while he was recovering from brain surgery, and was developed further in collaboration with John von Neumann. Developers use MC simulation to model the probability of different outcomes, such as the occurrence of an identified risk. Usually, developers turn to MC when the risk cannot be easily predicted, often because of intervening random variables.
Developers use the MC simulation technique to understand the impact of risk and uncertainty in a project risk prediction and forecasting model. The technique has been used in projects, science, engineering, and supply chains.
In a project risk analysis with significant uncertainties, MC can be effective. Usually, in organizational projects, random variables may interfere with the risks, requiring the use of MC. MC accommodates an enormous array of variables, which lends itself to many applications. It can be used, for example, to assess the probability of cost overruns in projects. The telecommunication industry has used MC to determine network performance and help optimize the network. The insurance industry and various other industry silos use MC when necessary.
Let’s demonstrate how the MC technique can be used by projecting a price. Let’s use historical price data from a historic asset:
periodic daily return = ln (day’s price ÷ previous day’s price)
Subsequently, we may use the AVERAGE, STDEV.P, and VAR.P functions on the whole resulting series to get the average daily return, standard deviation, and variance inputs, in this order. The next step is
drift = average daily return - (variance ÷ 2)
Alternatively, drift can be set to 0; this choice reflects a particular theoretical orientation, but the two results are not expected to differ much, at least over shorter time frames.
Going forward, get a random input:
random value = standard deviation * NORMSINV(RAND())
The equation for the following day’s price is:
next day’s price = today’s price * e ^ (drift + random value)
To raise e to a given power x, use the EXP function: EXP(x). The calculation can be repeated the desired number of times (each repetition representing one day) to obtain a simulation of future price movement. By generating a large number of simulations, one can assess the probability that a price will follow a given trajectory.
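The spreadsheet steps above can be sketched in plain Python, mirroring the same formulas: population variance and standard deviation stand in for VAR.P and STDEV.P, and a standard normal draw stands in for NORMSINV(RAND()). The price history below is a made-up input:

```python
import math
import random

def simulate_prices(start_price, daily_returns, n_days, seed=0):
    """Simulate prices: next day's price = today's price * e^(drift + random value)."""
    n = len(daily_returns)
    mean = sum(daily_returns) / n                          # average daily return
    var = sum((r - mean) ** 2 for r in daily_returns) / n  # population variance (VAR.P)
    std = math.sqrt(var)                                   # population std dev (STDEV.P)
    drift = mean - var / 2
    rng = random.Random(seed)
    prices = [start_price]
    for _ in range(n_days):
        random_value = std * rng.gauss(0, 1)  # std dev * NORMSINV(RAND())
        prices.append(prices[-1] * math.exp(drift + random_value))
    return prices

# Periodic daily returns computed as ln(day's price / previous day's price)
history = [100.0, 101.0, 99.5, 100.5, 102.0]
returns = [math.log(b / a) for a, b in zip(history, history[1:])]
path = simulate_prices(history[-1], returns, n_days=10)
print(path)
```

Running this function many times with different seeds produces the ensemble of trajectories from which outcome probabilities can be assessed.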
Q-learning is model-free reinforcement learning whose goal is to build the policy of the environment. Q gives the quality of an action (A) for a given state (S).
E refers to expectation and γ refers to the discount factor; thus, the Q-value equation is:
Q(s, a) = E[r + γ Q(s′, a′) | S = s, A = a]
The optimum Q-value equation is:
Q*(s, a) = E[r + γ max over a′ of Q*(s′, a′) | S = s, A = a]
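The optimum Q-value update can be sketched as tabular Q-learning on a tiny chain environment: states 0 through 3, with a reward only for reaching the far end. The environment, learning rate, and episode count are all made up for illustration:

```python
import random

def q_learning(n_states=4, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning: Q(s,a) moves toward r + gamma * max_a' Q(s',a')."""
    rng = random.Random(seed)
    actions = [-1, +1]  # move left or right along the chain
    Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            a = rng.choice(actions)                # explore randomly
            s2 = min(max(s + a, 0), n_states - 1)  # next state, clipped to the chain
            r = 1.0 if s2 == n_states - 1 else 0.0 # reward only at the goal
            best_next = max(Q[(s2, a2)] for a2 in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q

Q = q_learning()
# After training, moving right should look better than moving left in state 0
print(Q[(0, +1)], Q[(0, -1)])
```

The learned values encode the optimal policy implicitly: in every state, acting greedily with respect to Q means moving right toward the reward.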
Deep Q Network
The deep Q network extends the Q-learning function that updates the Q table. With very many combinations of states and actions, managing the rewards policy becomes complex, and it is impossible to maintain all the combinations the Q table would need. The idea of the deep Q network is therefore to create a neural network that approximates, for each state, the Q value of each action. See Figure 2.7. The logical sequence follows:
State -> Deep Q Neural Network -> Q value Action 1, Q value Action 2 .. Q value Action n.
Deep reinforcement learning applications are used in the following areas:
- Games—Go, poker
- Robotics—Robot Controller
- Computer vision—recognition, detection
- NLP—language translation, conversational
- Finance—pricing, trading, risk management
- Systems—performance optimization
What Is Classification?
Classification is the problem of identifying to which of a set of categories a new observation belongs. It is one of the most often used families of ML algorithms; its use cases differentiate categories by classifying their distinguishing features.
Classification is the process of predicting the class of given data points. These classes are called targets or labels or categories. Classification modeling is the task of approximating a mapping function (f) from input variables (X) to discrete output variables (y).
So, y = f(X)
A classification problem is when the output variable is a category, such as a “number” or “letter” or “disease” and “no disease.” A classification model attempts to draw some conclusion from observed values. Given one or more inputs, a classification model will try to predict the value of one or more outcomes. The several classification models include logistic regression, decision tree, random forest, gradient-boosted tree, multilayer perceptron, one-vs-rest, and Naive Bayes.
Two types of learners in classification are lazy learners and eager learners. Lazy learners take less time to train and more time to predict a classification. Examples of lazy learners are k-nearest neighbor and case-based reasoning.
Eager learners construct a classification model from the given training data, with coverage of the entire instance space. Eager learners take more time on training and less time on classifying. Examples of eager learners are decision tree, Naive Bayes, and artificial neural networks.
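A lazy learner such as k-nearest neighbor can be sketched in a few lines: training is just storing the labeled data, and all the work happens at prediction time. The sample points and labels below are made up:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """k-nearest neighbor: no training step; vote among the k closest points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# "Training" a lazy learner is just keeping the labeled points around
train = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
         ((8, 8), "B"), ((8, 9), "B"), ((9, 8), "B")]
print(knn_predict(train, (2, 2)))  # → A
print(knn_predict(train, (8, 8)))  # → B
```

This illustrates the trade-off stated above: there is nothing to fit, but every prediction must scan the stored data.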
Classification: Decision Tree Algorithm
The decision tree algorithm builds classification or regression models in the form of a tree structure of mutually exclusive if-then rules, and these rules are learned sequentially from the training data. The process continues until a termination condition is met. The problem with decision trees is that they easily run into overfitting, meaning too many rules are constructed, limiting generalization.
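The rule-learning idea can be sketched at its smallest scale: a one-level decision tree (a "stump") that searches the training data for the single if-then threshold rule with the fewest errors. The feature values and class labels below are made up:

```python
def fit_stump(xs, ys):
    """Find the if-then rule 'if x <= t then class 0 else class 1' with fewest errors."""
    best_t, best_errors = None, len(ys)
    for t in sorted(set(xs)):               # candidate thresholds from the data
        preds = [0 if x <= t else 1 for x in xs]
        errors = sum(p != y for p, y in zip(preds, ys))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t

# One feature, two classes that separate cleanly around x = 5
xs = [1, 2, 3, 4, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
t = fit_stump(xs, ys)
print(t)  # → 4 (rule: if x <= 4 then class 0 else class 1)
```

A full decision tree repeats this search recursively on each side of the split, which is exactly how rule counts grow and why unchecked growth leads to overfitting.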