Expert System Architecture Part 2

Production systems have long been a go-to approach for designing intelligent computer programs in artificial intelligence. These systems, also known as production rule systems, are built on rules that govern the program’s behaviour. Whether in automated planning, expert systems, or action selection, production systems have proven to be invaluable tools for achieving desired goals.

But what if there was another way? What if we could explore non-production system architectures that offer alternative solutions to the challenges faced in AI? In this blog post, we will delve into non-production system architectures and examine their potential to provide innovative approaches to artificial intelligence.

Before we dive into the specifics, let’s first understand the fundamental components of a production system. Productions are the heart of a production system, consisting of two parts: a sensory precondition, often represented as an “IF” statement, and an action, denoted as “THEN”. These rules are the building blocks for the program’s behaviour, allowing it to make decisions and take steps based on the given conditions.

While production systems have undoubtedly been successful in various applications, there are limitations to their approach. The rigid structure of IF-THEN rules can sometimes hinder flexibility and adaptability, especially in complex and dynamic environments. This is where non-production system architectures come into play.

Non-production system architectures offer an alternative framework for designing intelligent computer programs. These architectures prioritize flexibility and adaptability, allowing for more dynamic decision-making and action selection. Rather than relying solely on predefined rules, non-production systems employ various techniques such as machine learning, neural networks, and genetic algorithms to adapt and learn from their environment.

By embracing non-production system architectures, we open up new possibilities in artificial intelligence. These architectures can enable programs to learn from experience, adapt to changing circumstances, and make decisions that go beyond the constraints of predefined rules. This flexibility is especially valuable in domains where the environment is uncertain or constantly evolving.

In the following sections of this blog post, we will explore some of the prominent non-production system architectures and delve into their potential applications. From reinforcement learning to deep neural networks, we will examine how these architectures can revolutionize how we approach artificial intelligence.

As we embark on this journey into non-production system architectures, we must keep an open mind and embrace the possibilities. While production systems have served us well in the past, it’s crucial to explore alternative approaches that can push the boundaries of what is possible in artificial intelligence.

Join us in this exploration of non-production system architectures and discover the exciting potential they hold for the future of AI. Let’s challenge the status quo and pave the way for more flexible, adaptable, and intelligent computer programs.

13.3 Non-production System Architectures

In short, a production system is a computer program that uses a set of rules to achieve a specific goal. These rules, called productions, are used in automated planning, expert systems, and action selection. A production has two components: a sensory precondition and an action. When a production’s precondition matches the current state of the world, the production is said to be triggered. When its action is carried out, it is said to have fired. A production system also includes a database that stores current knowledge, known as working memory, and a rule interpreter.

Caution: Here are the marker symbols used in the program:

  • P1: $$ replaced with *
  • P2: *$ replaced with *
  • P3: *x replaced with x*
  • P4: * replaced with null & halt
  • P5: $xy replaced with y$x
  • P6

13.3.1 Associative or Semantic Networks

Example: This example demonstrates production rules for reversing a string from an alphabet without “$” and “*” marker symbols.

[Image: production rules for reversing a string]

This production system selects production rules according to their order in the production list. For each rule, the input string is scanned from left to right with a moving window to find a match with the LHS of the rule. Once a match is found, the matched substring in the input string is replaced with the RHS of the rule. The variables x and y match any character of the input string alphabet, and matching resumes with P1 after each replacement.

The string “ABC” undergoes a sequence of transformations according to these production rules:

[Image: sequence of transformations applied to “ABC”]
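The rule-selection procedure described above can be sketched in a few lines of Python. This is only an illustration, not the original program: the names `RULES` and `run` are hypothetical, and because rule P6 is cut off in the text, the sketch assumes it is "empty string replaced with $", which is what lets the process get started on a string with no markers.

```python
import re

# x and y stand for any single character of the input alphabet,
# i.e. anything except the marker symbols "$" and "*".
X = r"([^$*])"

# (name, LHS pattern, RHS builder, halts?) in priority order.
# ASSUMPTION: P6 is truncated in the text; "empty string -> $" is assumed here.
RULES = [
    ("P1", re.compile(r"\$\$"), lambda m: "*", False),
    ("P2", re.compile(r"\*\$"), lambda m: "*", False),
    ("P3", re.compile(r"\*" + X), lambda m: m.group(1) + "*", False),
    ("P4", re.compile(r"\*"), lambda m: "", True),
    ("P5", re.compile(r"\$" + X + X), lambda m: m.group(2) + "$" + m.group(1), False),
    ("P6", re.compile(r""), lambda m: "$", False),
]


def run(text, max_steps=10_000):
    """Apply the first rule (in list order) whose LHS matches, then restart at P1."""
    trace = [("start", text)]
    for _ in range(max_steps):
        for name, lhs, rhs, halts in RULES:
            match = lhs.search(text)  # leftmost match of this rule's LHS
            if match is None:
                continue
            text = text[:match.start()] + rhs(match) + text[match.end():]
            trace.append((name, text))
            if halts:
                return text, trace
            break  # a rule fired: resume matching from P1
    raise RuntimeError("step limit reached without halting")


result, trace = run("ABC")
for name, state in trace:
    print(f"{name:>5}  {state}")
```

Under these assumptions the trace starts with P6 producing `$ABC` and ends with P4 removing the final `*`, leaving `CBA`. Reordering the entries in `RULES` changes the behaviour, which is why rule order matters so much in a system this simple.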

In a simple system, the order of production rules is crucial. The lack of control structure makes production systems challenging to design. Control structures can be added to the inference engine or working memory model.

13.3.2 Frame Architecture

In a simulated world, a monkey can grab objects and climb. An example rule to hold a suspended thing is:

[Image: example rule]

In this example, the data in working memory is structured, and variables appear between angle brackets. The first literal in each condition is the name of the data structure, such as “goal” and “physical object”. Fields are marked with a “^” prefix. A negative condition is indicated by “-”.

Self Assessment

Please indicate if the following statements are accurate or inaccurate:

  • Within a production system, there exists a database known as working memory.
  • The symbol “-” denotes a positive state.
  • Designing production systems becomes challenging without proper control structures in place.

13.4 Semantic Memory

Semantic memory is the memory of concepts, meanings, and understandings. It supports the conscious recollection of factual information and general knowledge about the world. Together with episodic memory, it makes up declarative memory, whose counterpart is procedural memory. Semantic memory enables us to understand the meaning of words and sentences that would otherwise be meaningless, and it lets us learn new concepts by applying past knowledge. Semantic memory is often modelled with semantic networks, which are composed of nodes representing concepts, words, or features, with links between them denoting specific relationships. Processing in a semantic network takes the form of spreading activation.
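To make spreading activation concrete, here is a minimal sketch. The toy network, the `decay` factor, and the `threshold` cut-off are all illustrative assumptions, not values taken from any particular model: activation starts at one node and a fraction of it is passed along each link to neighbouring nodes.

```python
from collections import defaultdict

# Hypothetical toy network: node -> {neighbour: link strength}
NETWORK = {
    "canary": {"bird": 0.9, "yellow": 0.6},
    "bird": {"animal": 0.8, "wings": 0.7},
    "animal": {},
    "wings": {},
    "yellow": {},
}


def spread(network, source, decay=0.5, threshold=0.05):
    """Propagate activation outward from `source`, weakening at every hop."""
    activation = defaultdict(float)
    activation[source] = 1.0
    frontier = [(source, 1.0)]
    while frontier:
        node, energy = frontier.pop()
        for neighbour, strength in network.get(node, {}).items():
            passed = energy * strength * decay
            if passed < threshold:
                continue  # too weak to spread any further
            activation[neighbour] += passed
            frontier.append((neighbour, passed))
    return dict(activation)


for node, level in sorted(spread(NETWORK, "canary").items(), key=lambda kv: -kv[1]):
    print(f"{node:<8} {level:.3f}")
```

Nodes close to the source end up more active than distant ones, which is the basic intuition behind using such networks to model how related concepts come to mind.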

Note: Semantic networks are often used in models of discourse and logical comprehension, as well as in artificial intelligence. In these models, the nodes represent words or word stems, while the links represent the syntactic relations between them.
  1. Feature Models

The semantic feature comparison model proposed by Smith, Shoben, and Rips (1974) describes memory as composed of feature lists for different concepts. Categories are viewed as sets of features, and relations between categories are computed indirectly from those features. Computational feature-comparison models include those proposed by Meyer (1970), Rips (1975), and Smith et al. (1974).

Early theories of categorization assumed that categories were defined by critical features and that membership was determined by logical rules. More recent theories accept that categories may have a fuzzy structure and propose probabilistic or global similarity models for verifying category membership.

  2. Associative Models

The concept of association, crucial to memory and cognition models, is represented by links between nodes in a network. Neural and semantic networks are examples of associative cognitive models. Associations between items in memory can be defined as an N×N matrix, with each cell indicating the strength of the association between the row and column items. The process of learning associations is believed to be Hebbian, meaning the association between two things in memory grows stronger whenever they are simultaneously active.
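The matrix view of association and the Hebbian update can be sketched as follows. The item names, the learning rate, and the helper names `make_matrix` and `hebbian_step` are assumptions chosen for illustration.

```python
def make_matrix(n):
    """Create an N x N association matrix with all strengths at zero."""
    return [[0.0] * n for _ in range(n)]


def hebbian_step(weights, active, rate=0.1):
    """Strengthen the association between every pair of co-active items."""
    for i in active:
        for j in active:
            if i != j:
                weights[i][j] += rate
    return weights


items = ["bread", "butter", "car"]  # hypothetical items in memory
W = make_matrix(len(items))

hebbian_step(W, active=[0, 1])  # bread and butter active together
hebbian_step(W, active=[0, 1])  # ... and again
hebbian_step(W, active=[1, 2])  # butter and car active together once

for name, row in zip(items, W):
    print(f"{name:<7}", [round(v, 2) for v in row])
```

After these three steps the bread/butter cell is the strongest, the butter/car cell is weaker, and bread/car stays at zero because those two items were never active at the same time.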

Self Assessment

Please indicate whether the following statements are accurate or erroneous:

  1. Semantic and episodic memory are the two major divisions of declarative memory.
  2. The process of spreading activation is commonly used in semantic networks.
  3. The links between nodes in a network are equivalent to the associations among a collection of items in memory.

13.5 Knowledge Acquisition and Validation

A frame contains information on how to use it, what to expect next, and what to do if those expectations are not met. Some information in the frame generally remains constant, while other information, stored in “terminals”, can vary. Multiple frames may share the same terminals.

Each piece of information about a frame is held in a slot. The information can contain:

  • Facts or Data
      • Values (called facets)
  • Procedures (also called procedural attachments)
      • IF-NEEDED: deferred evaluation
      • IF-ADDED: updates linked information
  • Default Values
      • For Data
      • For Procedures
  • Other Frames or Subframes
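The slot contents listed above can be illustrated with a small sketch. The `Frame` class, its method names, and the elephant example are hypothetical and are not taken from any particular frame language. The sketch shows stored values, default values inherited from a parent frame, an IF-NEEDED procedure that is evaluated only on lookup, and an IF-ADDED procedure that updates linked information when a slot is filled.

```python
class Frame:
    """A minimal frame: slots with values, defaults, and attached procedures."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent   # another frame to inherit from
        self.values = {}       # facts or data
        self.defaults = {}     # default values
        self.if_needed = {}    # slot -> procedure run on lookup (deferred evaluation)
        self.if_added = {}     # slot -> procedure run when the slot is filled

    def get(self, slot):
        frame = self
        while frame is not None:
            if slot in frame.values:
                return frame.values[slot]
            if slot in frame.if_needed:
                return frame.if_needed[slot](self)
            if slot in frame.defaults:
                return frame.defaults[slot]
            frame = frame.parent
        raise KeyError(slot)

    def set(self, slot, value):
        self.values[slot] = value
        frame = self
        while frame is not None:
            if slot in frame.if_added:
                frame.if_added[slot](self, value)
                break
            frame = frame.parent


def mark_typical(frame, colour):
    # IF-ADDED: keep a linked slot up to date whenever a colour is filled in
    frame.values["typical"] = (colour == "grey")


elephant = Frame("elephant")
elephant.defaults["colour"] = "grey"
elephant.if_needed["description"] = lambda f: f"{f.name} is {f.get('colour')}"
elephant.if_added["colour"] = mark_typical

nellie = Frame("nellie", parent=elephant)
clyde = Frame("clyde", parent=elephant)
clyde.set("colour", "pink")

print(nellie.get("description"))  # uses the inherited default colour
print(clyde.get("description"))   # uses clyde's own colour
print(clyde.get("typical"))
```

Here `nellie` never stores a colour, so the lookup falls back to the default on the parent frame, while `clyde` overrides it with his own value.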

Neural Network

In the past, the phrase “neural network” was used to describe a biological circuit or network of neurons. Nowadays, it commonly refers to artificial neural networks that consist of nodes or artificial neurons. Hence, the term can indicate either biological neural networks, composed of actual biological neurons, or artificial neural networks, which are utilized to address problems related to artificial intelligence.

Artificial neural networks differ from the von Neumann model of computation in that they do not separate memory and processing. Instead, they operate by passing signals through the network connections, much as biological networks do.

Neural networks can be used in a variety of applications, such as predictive modelling and adaptive control, and they can be trained on data sets.

In an artificial neural network, the word “network” refers to the connections between the neurons in the different layers of the system. A typical system has three layers: a first layer of input neurons, which send data through synapses to a second layer of neurons, and then through more synapses to a third layer of output neurons. More complex systems have additional layers of neurons, and some have more input and output neurons. The synapses store parameters called “weights”.

An artificial neural network is typically defined by three types of parameters:

  1. The pattern of connections between the different layers of neurons.
  2. The learning process for updating the weights of the interconnections.
  3. The activation function that converts a neuron’s weighted input into its output activation.

In mathematical terms, a neuron’s network function $f(x)$ is defined as a composition of other functions $g_i(x)$, which can themselves be defined as compositions of further functions. This structure can be represented as a network, where arrows indicate the dependencies between variables. One common type of composition is the nonlinear weighted sum, $f(x) = K\left(\sum_i w_i g_i(x)\right)$, where $K$ (the activation function) is a predefined function such as the hyperbolic tangent. For simplicity, we refer to the collection of functions $g_i$ as a vector $g = (g_1, g_2, \ldots, g_n)$.

IMG 13.5: A decomposition of f, with the dependencies between variables shown by arrows. It can be interpreted in two different ways.

IMG 13.4: Architecture

In the functional view, the input x is transformed into a 3-dimensional vector h, which is then transformed into a 2-dimensional vector g, which is finally transformed into f. This view is most often used in the context of optimization.

In the probabilistic view, which is common in graphical models, the random variable F depends on the random variable G, which depends on H, which in turn depends on the random variable X.

The two views are largely equivalent. In both cases, for this particular network structure, the components of each layer are independent of one another given their input (for instance, the components of g are independent of each other given h). This allows some degree of parallelism in the implementation.
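The functional view can be sketched as a chain of three transformations. The weights below are made up, the input x is assumed to be a single number, and tanh is assumed as the activation at every layer. Only the shapes (3 components for h, 2 for g, 1 for f) follow the description above.

```python
import math


def layer(inputs, weight_rows):
    # Each output component depends only on the layer's inputs, never on its
    # sibling components, so the components could be computed in parallel.
    return [math.tanh(sum(w * v for w, v in zip(row, inputs)))
            for row in weight_rows]


# Hypothetical weights, one row per output component
W_h = [[0.5], [-1.0], [0.25]]                # x (1-d) -> h (3-d)
W_g = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]    # h (3-d) -> g (2-d)
W_f = [[1.0, -1.0]]                          # g (2-d) -> f (1-d)


def network(x):
    h = layer([x], W_h)
    g = layer(h, W_g)
    f = layer(g, W_f)[0]
    return f, h, g


f_value, h, g = network(1.0)
print(len(h), len(g), f_value)
```

Because no component of `g` reads another component of `g`, the list comprehension inside `layer` could be replaced by a parallel map without changing the result.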

Dealing with Uncertainty and Change

Fancy Logics

Addressing uncertainty and change is crucial in AI because:

  • The world can change unexpectedly due to actions outside our control.
  • Our beliefs about the world can change as new information arrives, even when the world itself has not changed. New evidence can have a ripple effect, altering other beliefs that depend on it.
  • We may be uncertain about our beliefs. We may not be sure that we observed something accurately, or we may draw plausible but not certain conclusions from our current beliefs, which themselves have varying degrees of certainty.

Some of these issues can be handled fairly easily on an ad hoc basis. We can write rules that delete items from working memory when things change, and we can attach numerical certainty factors to rules and facts. The challenge lies in dealing with uncertainty systematically. First-order predicate logic is insufficient because it is intended for complete, consistent, and monotonic knowledge (monotonic meaning that facts are only ever added to the pool of known information, never deleted).

There is no direct method to use it for incomplete, variably certain, inconsistent and non-monotonic inferences. (Furthermore, using it to explicitly represent beliefs about the world is challenging, as it will become evident later.) We need a formal, systematic and preferably simple approach to handle confidence, uncertainty, and change.

There are two main strategies for handling this. The first is to use more advanced forms of logic. First-order predicate logic is the most commonly used, but many other logics exist, each with well-defined semantics and theoretical principles. Some of these more complex logics are better equipped to handle uncertainty, belief, and change, but no single logic addresses all our problems. Practitioners therefore tend to choose the logic best suited to their specific needs.

The second strategy is to use probability theory instead of logic. Probability theory is well established and gives precise results about uncertainty. However, it makes assumptions about the availability and reliability of evidence, so it must be applied with care. Even so, it provides a sound basis for evaluating more informal methods of dealing with uncertainty: we can check whether, under specific assumptions, they agree with classical probability theory.

There are different types of logic, and one of them is default logic. This type of logic enables us to make non-monotonic inferences, which means that specific facts may no longer be considered valid when new information is introduced.

Default Logics

Default logic is useful when a typical case holds most of the time but has exceptions. We encountered this idea earlier when looking at frame systems and inheritance.

Typically, elephants are grey, but an exception named Clyde is pink. Expressing such exceptions in first-order predicate logic is cumbersome, because every exception must be listed in the rule itself. Special logics have been developed to simplify this. Several such logics exist, but they are all based on the same idea.

Default logic lets us write rules that depend on whether it is consistent to believe something. For example, if X is an elephant and it is consistent to believe that X is grey, we can conclude that X is grey. Default logics come in several variants, but one common approach uses a special operator “M”, where M p means that p is consistent with everything else that is known. Here are some rules and facts to consider:

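$$\forall X\; \mathrm{elephant}(X) \land M\,\mathrm{grey}(X) \rightarrow \mathrm{grey}(X)$$

$$\lnot\mathrm{grey}(\mathrm{clyde}) \qquad \mathrm{elephant}(\mathrm{nellie}) \qquad \mathrm{elephant}(\mathrm{clyde})$$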

We can conclude that Nellie is grey, but we cannot draw the same conclusion for Clyde, because believing that Clyde is grey is not consistent with the other information provided, which states that Clyde is not grey.
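The Nellie and Clyde example can be mimicked with a very small sketch. It is a deliberate simplification: the consistency check behind the M operator is reduced to "there is no known fact that X is not grey", and the function name `grey_by_default` is hypothetical.

```python
def grey_by_default(elephants, known_not_grey):
    """Apply the default 'elephants are grey' wherever it is consistent to do so."""
    return {x for x in elephants if x not in known_not_grey}


elephants = {"nellie", "clyde"}
known_not_grey = {"clyde"}  # the fact that Clyde is not grey

print(grey_by_default(elephants, known_not_grey))  # only nellie

# Non-monotonic behaviour: learning a new fact withdraws an earlier conclusion.
known_not_grey.add("nellie")
print(grey_by_default(elephants, known_not_grey))  # empty set
```

The second call shows what makes the inference non-monotonic: adding information shrinks the set of conclusions, which cannot happen in ordinary first-order predicate logic.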

Suppose we had a hundred elephants, of which around ten are unusual colours such as pink or blue. With default logic we can leave the default rule as it is and simply add facts such as ¬grey(albert) and ¬grey(edward) to record the exceptions, adding a new fact whenever a new exceptional elephant is acquired. In ordinary predicate logic, by contrast, we would have to alter the general rule itself, which might end up looking like this:

$$\forall X\; \mathrm{elephant}(X) \land \lnot\mathrm{name}(X, \mathrm{clyde}) \land \lnot\mathrm{name}(X, \mathrm{albert}) \land \lnot\mathrm{name}(X, \mathrm{edward}) \land \ldots \rightarrow \mathrm{grey}(X)$$

The more complex our defaults become, the greater the advantage of default logic. For instance, we may wish to state that elephants are typically grey, that circus elephants are typically pink, but that Nellie is grey. Handling all these nested defaults and exceptions explicitly in predicate logic quickly becomes intricate.

Default logic can also be used to provide semantics for frame systems. Recall that in a frame system we may have to choose which parent class to inherit from. For example, if Clyde is both a circus animal and an elephant, it is hard to decide whether he is likely to be tame.

The same issue arises in default logic: different defaults may support conflicting conclusions, with no clear indication of which is correct. If one default rule states that circus animals are generally not grey whenever it is consistent to believe so, while another states that elephants are generally grey, we cannot determine whether our circus elephant is grey. The different consistent sets of conclusions can be regarded as different extensions of the knowledge base.
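One way to see where multiple extensions come from is to apply two conflicting defaults in every possible order and collect the resulting belief sets. The sketch below is a simplified illustration rather than a full default-logic reasoner, and the names `FACTS`, `DEFAULTS`, and `extension` are hypothetical.

```python
from itertools import permutations

FACTS = {("elephant", "clyde"), ("circus_animal", "clyde")}

# (prerequisite, (property, default value)) -- hypothetical encoding of two defaults
DEFAULTS = [
    ("elephant", ("grey", True)),         # elephants are typically grey
    ("circus_animal", ("grey", False)),   # circus animals are typically not grey
]


def extension(order, individual):
    """Apply defaults in the given order, skipping any that contradict a belief."""
    beliefs = {}
    for prerequisite, (prop, value) in order:
        applies = (prerequisite, individual) in FACTS
        consistent = beliefs.get(prop, value) == value
        if applies and consistent:
            beliefs[prop] = value
    return beliefs


extensions = {
    tuple(sorted(extension(order, "clyde").items()))
    for order in permutations(DEFAULTS)
}
for ext in extensions:
    print(dict(ext))
```

Whichever default is applied first blocks the other, so the two orderings give two different belief sets, one in which Clyde is grey and one in which he is not. Nothing in the rules themselves says which one to prefer.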

13.5.1 Methods for Knowledge Acquisition

Probabilistic logic, also known as probability logic and probabilistic reasoning, aims to merge the ability of probability theory to handle uncertainty with the power of deductive logic to utilize structure. This leads to a more comprehensive and versatile formalism that can be applied in various areas.

Did you know? Probabilistic logics attempt to provide a natural extension of traditional logic truth tables: the results they define are derived through probabilistic expressions instead.
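As a small worked illustration of combining probability with deduction, consider a probabilistic version of modus ponens. Assuming "A implies B" is read as the material implication (not A, or B), knowing P(A) and P(A implies B) does not fix P(B) exactly but does bound it: P(A) + P(A implies B) - 1 ≤ P(B) ≤ P(A implies B). The function name below is hypothetical.

```python
def modus_ponens_bounds(p_a, p_a_implies_b):
    """Bounds on P(B) given P(A) and P(A -> B), with -> as material implication."""
    # P(A -> B) = P(not A) + P(A and B), so it can never be below 1 - P(A).
    if p_a_implies_b < 1.0 - p_a:
        raise ValueError("inconsistent inputs: P(A -> B) must be at least 1 - P(A)")
    lower = p_a + p_a_implies_b - 1.0   # this is exactly P(A and B)
    upper = p_a_implies_b               # reached when B holds wherever A fails
    return lower, upper


print(modus_ponens_bounds(1.0, 1.0))   # the classical case: B is certain
print(modus_ponens_bounds(0.9, 0.8))   # uncertain premises give an interval
```

When both premises are certain, the interval collapses to a single point and the rule behaves like ordinary modus ponens. With uncertain premises the conclusion is only known to lie within a range.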

One difficulty with probabilistic logics is that they tend to multiply the computational complexities of their probabilistic and logical components. Another is the possibility of counterintuitive results, such as those found in Dempster-Shafer theory. The wide range of contexts and issues involved has led to many different proposals.

13.5.2 Validation of Knowledge

Knowledge engineering is widely employed in various computer science domains, such as artificial intelligence, databases, data mining, expert systems, decision support systems, and geographic information systems. It’s also associated with mathematical logic and is closely linked to cognitive science and socio-cognitive engineering. This is because knowledge is created by socio-cognitive groups, primarily made up of humans, and is organized based on our comprehension of how human reasoning and logic function.

Knowledge engineering (KE) involves several activities in developing a knowledge-based system:

  • Assessment of the problem
  • Development of a knowledge-based system shell/structure
  • Acquisition and structuring of the relevant information, expertise, and particular preferences using the IPK model
  • Implementation of the structured knowledge into knowledge bases
  • Validation of the inserted knowledge
  • Integration and maintenance of the system
  • Evaluation of the system

In practice, KE is more of an art than an engineering process. It is not as straightforward as the above list suggests. The phases tend to overlap, the process may be iterative, and various challenges might arise.

Hoa