This specification defines the plausible knowledge notation (PKN) as a lightweight syntax for expressing imperfect knowledge, i.e. knowledge that is uncertain, imprecise, incomplete and inconsistent. PKN was inspired by the work of Alan Collins and co-workers in the 1980s, see [[COLLINS]]. PKN consists of queries, properties, relations, implications and analogies. PKN embraces fuzzy scalars, fuzzy modifiers and fuzzy quantifiers. PKN statements can be associated with metadata that models sub-symbolic knowledge corresponding to intuitive notations such as the strength of a statement, and the typicality of a sub-class in respect to its parent class.
This document is at early stages of development. Feedback is welcome through GitHub issues or on the email@example.com mailing-list (with public archives).
This specification defines a notation and model for imperfect knowledge, such as everyday knowledge that is uncertain, imprecise, incomplete and inconsistent. The approach is modelled on human argumentation, something that a long line of philosophers have investigated since the days of Ancient Greece. Traditional logic assumes perfect knowledge and provides mathematical proof for entailments. Unfortunately, knowledge is rarely perfect, but is nonetheless amenable to reasoning using guidelines for effective arguments. As an example consider the statement: if it is raining then it is cloudy. This is generally true, but you can also infer that it is somewhat likely to be raining if it is cloudy. This is plausible based upon your rough knowledge of weather patterns. In place of logical proof, we have multiple lines of argument for and against the premise in question just like in courtrooms and everyday reasoning.
The above figure shows how properties and relations involving a class may be likely to apply to a sub-class as a specialisation of the parent class. Likewise, properties and relations holding for a sub-class may be likely to apply to the parent class as a generalisation. The likelihood of such inferences is influenced by the available metadata, see Section [[[#metadata]]]. Inferences can also be based on implication rules, and analogies between concepts with matching structural relationships. PKN further supports imprecise concepts:
For a web-based demonstrator, see [[PKN-demo]].
PKN represents an evolution from graph databases to cognitive databases, that can more flexibly support reasoning over everyday knowledge, embracing uncertainty, imprecision, incompleteness and inconsistencies. Deductive proof is replaced by defeasible reasoning with arguments for and against suppositions.
The Stanford Encyclopedia of Philosophy entry on argument and argumentation [[ARGUMENTATION]] lists five types of arguments: deduction, induction, abduction, analogy and fallacies. Argumentation can be adversarial where one person tries to beat down another, or cooperative where people collaborate on seeking a better joint understanding by exploring arguments for and against a given supposition. The latter may further choose to focus on developing a consensus view, with the risk that argumentation may result in group polarisation when people's views become further entrenched.
Studies of argumentation have been made by a long line of philosophers dating back to Ancient Greece, e.g. Carneades and Aristotle. More recently, logicians such as Frege, Hilbert and Russell were primarily interested in mathematical reasoning and argumentation. Stephen Toulmin subsequently criticised the presumption that arguments should be formulated in purely formal deductive terms [[TOULMIN]]. Douglas Walton extended tools from formal logic to cover a wider range of arguments [[WALTON]]. Ulrike Hahn, Mike Oaksford and others applied Bayesian techniques to reasoning and argumentation [[HAHN]], whilst Alan Collins applied a more intuitive approach to plausible reasoning [[COLLINS]].
Formal approaches to argumentation such as [[ASPIC+]] build arguments from axioms and premises as well as strict and defeasible rules. Strict rules logically entail their conclusions, whilst defeasible rules create a presumption in favour of their conclusions, which may need to be withdrawn in the light of new information. Arguments in support of, or counter to, some supposition, build upon the facts in the knowledge graph or the conclusions of previous arguments. Preferences between arguments are derived from preferences between rules with additional considerations in respect to consistency. Counter arguments can be classified into three groups. An argument can:
[[AIF]] is an ontology intended to serve as the basis for an interlingua between different argumentation formats. It covers information (such as propositions and sentences) and schemes (general patterns of reasoning). The latter can be used to model lines of reasoning as argument graphs that reference information as justfication. The ontology provides constraints on valid argument graphs, for example:
Conflict schemes model how one argument conflicts with another, e.g. if an expert is deemed unreliable, then we cannot rely on that expert's opinions. Preference schemes define preferences between one argument and another, e.g. that expert opinions are preferred over popular opinions. The AIF Core ontology is available in a number of standard ontology formats (RDF/XML, OWL/XML, Manchester OWL Syntax).
PKN defines a simple notation and model for imperfect knowledge. Arguments for and against a supposition are constructed as chains of plausible inferences that are used to generate explanations. PKN draws upon Alan Collins core theory of plausible reasoning [[COLLINS]] in respect to statement metadata corresponding to intuitions and gut feelings based upon prior experience. This is in contrast to Bayesian techniques that rely on the availability of rich statistics, which are unavailable in many everyday situations.
Recent work on large language models (LLMs), such as [[GPT-4]], have shown impressive capabilities in respect to reasoning and explanations. However, there is a risk of hallucinations, where the system presents convincing yet imaginary results. Symbolic approaches like PKN are expected to play an important and continuing role in supporting semantic interoperability between systems and knowledge graphs. LLMs are trained on very large datasets, and in principle, could be exploited to generate symbolic models in a way that complements traditional approaches to knowledge engineering.
The grammatical rules in this document are to be interpreted as described in [[[RFC5234]]] [[RFC5234]].
Conformance to this specification is defined for four conformance classes:
PKN documents use the following restricted set of data types. See for a formal definition of their serialization.
A number represents a double-precision 64-bit format value as specified in the IEEE Standard for Binary Floating-Point Arithmeticis [[IEEE-754-2019]]. It is serialized in base 10 using decimal digits, following the same grammar as numbers in JSON [[RFC8259]].
A name is a string that can include letters, digits, period, hyphen, underscore and slash characters, and that cannot be interpreted as a [=number=].
Should this be expanded to cover additional data types such as ISO8601 dates and string literals? That might be convenient in some respects, but it would at the same time introduce complications in respect to how to operate on such data types. It might be simpler to handle such data types at the periphery of the system, e.g. mapping dates to a set of properties, and likewise handling string literals via a lexical dictionary.
PKN supports several kinds of statements as defined below.
Property, relation and implication statements optionally have a scope and a set of parameters. The scope is one or more names that indicate the context in which the statement applies, e.g. ducks are similar to geese in that they are birds with relatively long necks when compared to other bird species. Each parameter consists of a name and a value. Parameters represent prior knowledge as an informal qualitative feeling. Predefined parameters include:
See section [[[#reasoning]]] for the role of statement parameters in argumentation.
Where should we describe how names can be used for graphs and statements?
Need to explain the relation to existing argumentation systems, along with formalisms such as AIF and ASPIC, together with questions around correctness, complexity and semantics. Note the need to explain why formal semantics is challenging for imperfect knowledge.
These are used either to declare properties of things, or as conditions and actions in implication and query statements. A property statement has four parts: the descriptor, an argument, an operator and a referent. Here is an example:
The following operators are predefined:
Note: variables are only permitted within properties that appear as conditions or actions in implication statements or query statements.
Should the the predefined operators include "
Relation statements have three parts: the subject, the relationship, and the object. Relations can be used on their own, or as part of implication and query statements. Here are some examples:
Use metadata to indicate the strength of a relation, see [=strength=] and [=inverse=]. Use prefixes on the relationship as modifiers, i.e. similar to adjectives when applied to a noun, as in:
Is there a need to a separate means to model negation or are prefixes sufficient?
Note: variables are only permitted within relations that appear as conditions or actions in implication statements or query statements.
An implication statement defines one or more consequents, and one or more antecedents, which are properties or relations, and may include variables which are scoped to the implication statement. The consequents are evaluated as a conjunction of conditions. See section [[[#reasoning]]] for details. Here are some examples of implication statements:
These relate two pairs of concepts where each pair stands as an analogy for the other pair.
These have a quantifier, a variable, an optional where clause, and a required from clause. Variables are scoped to the query statement. The where and from clauses involve a conjunction of one or more conditions, which are properties or relations. The conditions for the from clause are evaluated against the PKN statement graph. The conditions for the where clause are evaluated against the matches found with the from clause. The tests relate to the variable bindings.
Queries can use the following quantifiers:
Note that few, many and most are intentionally imprecise, see Section [[[#fuzzy-quantifiers]]].
What are the implications of allowing a comma separated list of quantifier variables?
Plausible reasoning subsumes fuzzy logic as expounded by Lotfi Zadeh in his 1965 paper on fuzzy logic, see [[FUZZY SETS]]. Fuzzy logic includes four parts: fuzzification, fuzzy rules, fuzzy inference and defuzzification.
Fuzzification maps a numerical value, e.g. a temperature reading, into a fuzzy set, where a given temperature could be modelled as 0% cold, 20% warm and 80% hot. This involves transfer functions for each term, and may use a linear ramp or some kind of smooth function for the upper and lower part of the term’s range.
Fuzzy rules relate terms from different ranges, e.g., if it is hot, set the fan speed to fast, if it is warm, set the fan speed to slow. The rules can be applied to determine the desired fan speed as a fuzzy set, e.g., 0% stop, 20% slow and 80% fast. Defuzzification maps this back to a numeric value.
Fuzzy logic works with fuzzy sets in a way that mimics Boolean logic in respect to the values associated with the terms in the fuzzy sets. Logical AND is mapped to selecting the minimum value, logical OR is mapped to selecting the maximum value, and logical NOT to one minus the value, assuming values are between zero and one.
Plausible reasoning expands on fuzzy logic to support a much broader range of inferences, including context dependent concepts, and the means to express fuzzy modifiers and fuzzy quantifiers.
Some concepts are characterised by a value that lies in a range. Examples include temperature, age and climate. The temperature of a room may be described as cold, warm or hot. Such terms are imprecise and lack a formally agreed definition. Moreover, the meaning may depend on the context. Someone who is 60 years old will have a very different notion of what "old" means compared to that of an eight year old child.
The choice of scales for a range may depend on the application. Here are some examples of scales for different types of climate:
Each of these terms is associated with typical weather patterns for that climate, e.g. in cities like Shanghai, Buenos Aires, Sydney, and Hong Kong there is a so called Chinese climate with mild winters and humid summers with tropical rain.
This can be modelled using the range property along with a scope. The value of this property is a list of climates. These climates can be treated as arguments for properties that characterise that climate, e.g. in terms of their span on a numerical scale.
Note that descriptions may involve terms describing different aspects, e.g. temperature and humidity, that belong to separate ranges, e.g. cold and wet in contrast to hot and dry.
Here is an example for age that lists terms for describing someone's age, along with the associate span in years:
PKN allows terms to be combined with one or more fuzzy modifiers, e.g.
very:old where very acts like an adjective when applied to a noun. The meaning of modifiers can be expressed using PKN statements for relations and implications, together with scopes for context sensitivity. In respect to old, "very" could either be defined by reference to a term such as "geriatric" as part of a range for "age", or with respect to the numerical value, e.g. greater than 75 years old.
These are quantifiers, such as few, many and most, with an imprecise meaning. See Section [[[#queries]]]. Their meaning can be defined in terms of the length of the list of query variable bindings that satisfy the conditions.
few signifies a small number,
many signifies a large number, and
most signifies that the number of bindings for the where clause is a large proportion of the number of bindings for the from clause.
Add some examples for how to express the meaning of fuzzy modifiers and fuzzy quantifiers.
Is it necessary to predefine the operational meaning of fuzzy quantifiers and fuzzy modifiers? It may be preferable to require applications to define the meaning according to their needs as the meaning is likely to be context sensitive.
A PKN document is a sequence of PKN statements.
A PKN Graph is an abstract representation of a PKN document as a graph in which statement are the vertices and names are the edges.
A [=PKN document=] MUST follow the grammar defined below.
Comments start with '#' and continue to the end of the current line or the end of the file, whichever comes first. White space is permitted between tokens. One exception is between '?' and the name token for a variable. Another exception is between the sign ('+' and '-') and the digits that form a number.
Should the grammar notation be switched from EBNF to ABNF for consistency with W3C practice, noting that EBNF is easier to understand?
This section presents a visualisation of the grammar in the form of railroad diagrams. These diagrams are provided solely to make it easier to get an intuitive grasp of the syntax of each token.
The Resource Description Framework (RDF), see [[rdf-concepts]], defines a data model for labelled directed graphs, along with serialisation formats such as Turtle, query expressions with SPARQL, and schemas with RDF-S, OWL and SHACL. RDF identifiers are either globally scoped (IRIs) or locally scoped (blank nodes). RDF literals include numbers, booleans, dates and strings. String literals can be tagged with a language code or a data type IRI.
The semantics of RDF graphs is based upon Description Logics, see [[RDF Semantics]] and [[owl2-rdf-based-semantics]]. RDF assumes that everything that is not known to be true should treated as unknown. This can be contrasted with closed contexts where the absence of some statement implies that it is not true.
Description Logics are based upon deductive proof, whereas, PKN is based upon defeasible reasoning which involves presumptions in favour of plausible inferences, and estimating the degree to which the conclusions hold true. As such, when PKN graphs are translated into RDF, defeasible semantics are implicit and dependent on how the resulting graphs are interpreted. Existing tools such as SPARQL don't support defeasible reasoning.
Consider PKN property statements. The descriptor, argument operator and referent, along with any statement metadata can be mapped to a set of RDF triples where the subject of the triples is a generated blank node corresponding to the property statement. Comma separated lists for referents and scopes can be mapped to RDF collections.
PKN relations statements can be handled in a similar manner. It might be tempting to translate the relation's subject, relationship and object into a single RDF triple, but this won't work when the PKN relation is constrained to a scope, or is associated with statement metadata.
PKN implication statements are more complex to handle as they involve a sequence of antecedents and a sequence of consequents, as well as locally scoped variables. One possible approach is to first generate a blank node for the statement, and use it as the subject for RDF collections for the variables, antecedents and consequents.
PKN analogy statements are simpler, although there is a need to be able to distinguish variables from named concepts, e.g. as in
Need to explain how PKN relates to RDF-star and N3.
Plausible reasoning is about generating arguments that support or counter a premise, gathering information along the way. The reasoning process is driven by a queue of premises awaiting consideration, plus a record of reasoning so far, which can be modelled as a reasoning graph. The reasoner applies heuristics (strategies and tactics) to determine which potential inferences to apply and in what order to develop the arguments. As reasoning proceeds, the consideration of additional evidence may either strengthen or weaken earlier conclusions.
The inference engine can start with the premise in question and look for direct evidence, and after that, indirect evidence involving inferences using different kinds of PKN statements that pose intermediate premises for consideration. The reasoning process generates a graph that starts from the premises and works back to direct evidence in the knowledge graph. The reasoning graph can then be scanned in the reverse direction to generate explanations that start with the facts and progressively justify the premise via a series of inferences.
The reasoning process can try to support an argument via finding additional evidence, or it can try to generate counter arguments that undermine, undercut or rebut other arguments. The explanation then needs to put these arguments into context using the appropriate templates.
The next few sections describe how PKN statements can be used to compute inferences along with their estimated certainty, as the basis for constructing arguments both for and against a given premise, in terms of chains of inferences.
In probability theory the conditional probability of event A occurring given an occurrence of event B can be expressed using Bayes theorem:
When applied to inferences, this requires the gathering of lots of statistics which is very challenging from a practical perspective. In the absence of comprehensive statistics, the certainty to which the conclusions hold true can be informally modelled as a number in the range 0 to 1. Qualitative terms such as low, medium and high can be mapped to such numbers for use in estimation algorithms, and later when needed, mapped back into qualitative terms.
One such algorithm is where a premise directly matches multiple properties with non-exclusive values, which can then be combined as a set. The more matches, the greater the certainty for establishing the goal.
If c is the average certainty and n is the number of matches, the combined certainty is 1.0 - (1.0 - c)/n. If c = 0.5 and n = 1 we get 0.5. If n = 2, we get 0.75. If n = 4, we get 0.85, If n = 256, we get 0.998.
Collins et al. suggest a simplified approach to modelling the effects of the various metadata parameters in which each parameter strengthens, weakens or has no effect on the estimated certainty, and using the same algorithm for all such parameters.
If the original certainty is zero, the parameters should have no effect, and likewise, the effect of a parameter should be smaller as the parameter’s value tends to zero.
Treating the effect of a parameter as a multiplier m on the certainty, the number should be in the range 0 to 1/c, where c > 0. If we want to boost c by 25% when c = 0.5, m = 1.25, but this should shrink to 1 when c = 1.
How much should m increase as c tends to 0? One idea is to use linear interpolation, i.e., 1.5 when c = 0, 1.25 when c = 0.5 and 1 when c = 1. This multiplier shrinks to 1 as the value of the parameter tends to 0 and when c tends to 0. Thus m = 1 + p/2 - p*c/2, where m is the multiplier, p is the parameter value (0 to 1) and c is the certainty (0 to 1).
We also need to deal with multiple lines of argument for and against the premise in question. If the arguments agree, we can aggregate their certainties using the first algorithm above. We are then left with a fuzzy set for the different conclusions, e.g. (true 0.8, false 0.2). Note that arguments may present multiple conclusions rather than true or false, as depending on the query posed to the reasoner.
The reasoner can be given a property or relationship as a premise that we want to find evidence for and against, for instance, here is a premise expressed as a property:
Its inverse would be:
The reasoner first checks if the premise is a known fact, and if not looks for other ways to gather evidence. One tactic is to generalize the property referent, e.g., by replacing it with a variable as in the following:
The knowledge graph provides a matching property statement:
We then look for ways to relate daffodils to temperate flowers, finding a matching relation statement:
So, we have inferred that daffodils grow in England. Another tactic is to generalize the property argument as in the following:
We can then look for ways to relate England to similar countries, for example:
We find then a match, for example:
Thus, providing us with a second way to infer that daffodils grow in England. The certainty depends on the parameters, and the [=similarity=] parameter in respect to the similar-to relation.
If the premise is a property whose referent is a variable, then its binding may combine values from lines of evidence, e.g. roses from one line of reasoning, and daffodils and tulips from another. This depends on the property operator, e.g. [=includes=] supports such combination, whereas [=excludes=] precludes combination.
Whilst the above uses a property as a premise, you can also use relations as premises, for instance:
The reasoner will then look for relevant knowledge on whether Peter is, or is not, young, e.g., Peter’s age, and whether that implies he is a child or an adult, based upon pertinent range statements and associated definitions for their values.
In a small knowledge graph, it is practical to exhaustively consider all potentially relevant inferences. This becomes increasingly costly as the size of the knowledge graph increases. One way to address this challenge is to prioritize inferences that seem more certain, and to ignore those that are deemed too weak.
See [[[#inferences]]] for an illustration of how properties and relations can be plausibly inferred across relation statements such as kind-of depending on the associated metadata. An instance or subclass of a class can plausibly share properties of that class as a specialisation, especially if that instance or subclass is annotated as being typical of that class, see [=typicality=]. Likewise, a class can plausibly share properties of a subclass as a generalisation.
If two concept are similar, then they are likely to share the same properties according to their degree of [=similarity=]. You can restrict this to a named context, e.g. as in:
Here are some examples of PKN relationships with their meaning, and whether the meaning is symmetric or asymmetric.
|kind-of||instance of class||asymmetric|
|is-a||instance of class||asymmetric|
|depends-on||instance of class||asymmetric|
The reasoner can exploit built-in tactics for reasoning, but it would also be interesting to provide a declarative means to describe such tactics, for instance, as an enabler for machine learning.
Implication rules are conditioned on conjunctions of antecedents, which can be properties or relations, see the folowing example.
Disjunctions can be modelled using multiple implications. Negatives can expressed using antonyms. The [=strength=] parameter describes the conditional likelihood of the consequents given the antecedents, whilst [=inverse=] describes the conditional likelihood of the antecedents given the consequents.
An implication can be chosen for consideration when one of its consequents matches a premise in question, the reasoner then needs to seek evidence to establish all of the antecedents before binding any variables in that premise and updating its estimated certainty. A similar process applies when using the implication in reverse.
Analogies involve comparisons between things or objects where similarities in some respects can suggest the likelihood of similarities in other respects, for instance, in their respective properties and relationships, or in ways to solve related problems. Gentner and Markman proposed a basis for modelling analogies in terms of establishing a structural alignment based on common relational structure for two representations, where the stronger the match, the better the analogy, see [[[GENTNER]]].
In some cases, the objects may have the same properties, in other cases, there is a systematic mapping between different properties, e.g., relating electric current to the flow of a liquid in a pipe, and correspondingly, voltage to pressure. If you know that flow increases with pressure, you can infer that current increases with voltage.
Analogical reasoning tests often take the form of A is to B as Y is to —, where students are asked to supply the missing term. This can be readily modelled using PKN, for example consider the query:
when applied to the knowledge:
The reasoner finds that leaf is related to tree via the part-of relationship, and can then use that to look for a part-of relationship involving petal, yielding the result flower. Now consider:
when applied to the knowledge:
This gives the answer hate by matching the object of the relationship rather than the subject. Here is a more complex example showing the output from the reasoner:
PKN also allows you to reason from known analogies, e.g. given:
The reasoner can then infer the following:
Analogies involve a form of inductive reasoning, and is related to learning from examples, and being able to apply past experience to a new situation based upon noticing the similarities and reasoning about the differences. Such reasoning can be simple as in the above examples or more complicated, for instance, when involving causal modelling.
Query statements are loosely analogous to SPARQL for RDF graphs. The reasoner seeks evidence to bind the quantified variables and applies the quantifier to generate the query results. Consider the example:
when applied to the following facts:
?x binds to 3 kinds of yellow roses out of a total of 10 known varieties of roses, so for this dataset at least, the query evaluates to true as 3 is small compared to 10. For more details, see Section[[[#queries]]].
Consider the statement: "Mary thinks John loves Joan". We can model this using a graph for Mary's beliefs as the object of a relation:
Here is a more complicated example "Mary thinks John is lying when he says he loves Joan"
Note that you can use a relation or property as a means to refer to a graph from a query or implication statement.
An open challenge is how to declaratively model strategies and tactics for reasoning rather than needing to hard code them as part of the reasoner's implementation. Further work is needed to clarify the requirements and to evaluate different ways to fulfil those requirements using an intuitively understandable syntax. The [[AIF]] ontology would be a useful source of inspiration.
This raises the question of how enable machine learning for both knowledge graphs and reasoning strategies and tactics.
Perhaps large scale machine learning is much more tractable using vector space representations instead of symbolic graphs? Large language models have shown impressive reasoning capabilities and can readily handle complex higher order statements, such as above.
PKN reasoners can generate arguments for and against a given premise, where each argument is formed from a chain of plausible inferences. The reasoning process estimates the certainty of each argument. Arguments can be grouped for the purposes of generating explanations from the argument graph. If the certainties for and against the premise are roughly similar, then the users are left to draw their own conclusions. The reasoner should provide further details when asked.