A General Specification of an Efficient Java Implementation
for Representing and Interpreting OSF Constraints

Hassan Aït-Kaci

HAK's Language Technology
All contents Copyright © by Hassan Aït-Kaci—All Rights Reserved
Last modified on Mon Dec 09 13:10:58 2019 by hak

Table of Contents


This is a high-level specification for implementing an operational system in Java for Order-Sorted Feature (OSF) constraints. It describes general syntactic aspects for defining a sort taxonomy and OSF terms (also known as ψ-terms) using such a taxonomy. It discusses considerations for an optimized operational semantics for the efficient representation and interpretation of expressions using ψ-terms. [N.B.: the expression "ψ-term" was introduced originally by the author in his PhD thesis (Penn, 1984) as a shorthand for "property and sort inheritance term," as a formalization of some parts of Ron Brachman's "SI-Nets," or "Structured Inheritance Networks" defined informally in the latter's PhD thesis (Harvard, 1977).]

In particular, we cover implementation details for encoding partially-ordered sorts using binary codes. We also describe the basic building blocks of a general execution architecture using this set-up to implement ψ-terms unification and generalisation. For general user convenience, we also suggest useful pragmas than can come handy in a practical language implementation for experimenting with OSF constraints.

Note that this is just an initial specification for guiding the Java implementation of such a language whose workings are based on an underlying concrete architecture such as the one we describe. One can peruse the initial set of source files. An actual first language prototype testing some of the issues described in this specification is realized using the HLT Jacc grammar meta-compiler and HLT language tools implemented by the author. It is called "OSF V3" (for "Version 3")—see the source files for what is in V3 that was not in V2, V1, or V0.

To make some aspects more tangible, this specification will make informal reference to OSF V0's specific syntax when discussing general structural and operational issues. However, this specific syntax is not binding for future more complete versions of OSF technology, and neither are all the points discussed, as it is likely that variations and departures from some details are bound to occur in future implementations. Nevertheless, the overall structure is expected to stand as described.

Syntactic Forms

Sort Encoding

In this section, we cover in some detail practical aspects for the efficient representation and use of multiple-inheritance in a large sort taxonomy, and the machinery for encoding partially ordered sorts as binary codes to optimize lattice operations on them. We first give the details for implementing a basic technique based on transitive closure. Then, we develop data structures and methods for implementing a more elaborate technique capable of handling very large taxonomies.

Transitive Closure Encoding

Note that no operation (either on sorts or ψ-terms) may be performed before the complete set of sorts has been encoded. Sorts are recorded from their appearances in declarations. However, other undeclared sorts appearing in ψ-terms definitions must also be so recorded. Therefore, any attempt to perform any operation needing information on sorts will prompt the user on whether compilation should proceed. (See also the %encode and %automatic pragmas below.)

Once the sort taxonomy has been encoded, any new sort declaration extending the taxonomy, or any definition involving a new sort symbol will prompt the user for the need to reencode the sort taxonomy before evaluating any expression—unless the taxonomy is explicitly "de-encoded"—i.e. all sort codes are erased. (See also the %clear and %extend pragmas below.)

In such a language, the sort taxonomy consisting of all the sorts is encoded into bit vectors. This does not apply to literal values (such as numbers, strings, etc.) but only to symbolic sorts. All the symbolic sorts are numbered in the order in which they appear in a program and stored in an array called (say) TAXONOMY of objects of class Sort at the index corresponding to their order of appearance. [N.B.: In fact, to allow for dynamic growth if needed, it will be an ArrayList rather than, strictly speaking, an array. But this is just a detail and we use "array" in this specification for simplicity.] In this TAXONOMY array, the built-in symbolic sorts are initialized as follows:

The bottom and top sorts ({} and @) are not stored in TAXONOMY. They are instead defined as constant sorts.

[N.B.: In this specification, we limit the built-ins to numbers, but actual implementations may (and hopefully will) include other built-in sorts (e.g., String, Boolean, List, etc., ...). As well, interpeted operations on literals such as arithmetic, string manipulation, etc., and related issues will be discussed later.]

All other non-built-in sorts can then be stored from index 3 and upwards in the order they come. In order to allow for more built-in sorts in future versions (say, List, Nil, String, Boolean, True, False, etc., ...) rather than referring to index 3 explicitly, we shall define a int constant called MIN_INDEX denoting the least index available for non-built-in sorts in the array TAXONOMY. Thus, in this specification, MIN_INDEX = 3.

The class Sort has the following fields:

We shall ensure that the built-in sorts are encoded once and for all as follows: Before the complete sort encoding may be performed, a Sort object is created for each sort in the order it appears and stored in the TAXONOMY array at the next available index in TAXONOMY, with its name and this index. As it is created, a Sort object of index n greater or equal to MIN_INDEX, is initialized to have the nth bit of its code set to true.

When all Sort objects are thus stored in the array TAXONOMY, a transitive closure procedure on all sorts of index greater or equal to MIN_INDEX must be performed to reflect the declared ordering for non-built-in sorts. In order to do so, whenever TAXONOMY[i].name has been declared to be a subsort of TAXONOMY[j].name, then the jth bit of TAXONOMY[i].code must be set to true.

After that, a standard transitive closure is performed on the codes of all elements of index higher than MIN_INDEX to obtain the final set of codes for all the sorts. Using Warshall's algorithm (see also here), this gives:

int n = TAXONOMY.size();
for (int k = MIN_INDEX; k < n; k++)
  for (int i = MIN_INDEX; i < n; i++)
    for (int j = MIN_INDEX; j < n; j++)
      if (!TAXONOMY[i].code.get(j)))
        TAXONOMY[i].code.set(j,TAXONOMY[i].code.get(k) && TAXONOMY[k].code.get(j));
N.B.: The java.util.BitSet ensures that the size of bit sets is increased as needed. However, extending the size of a BitSet is done so the new bits added to it are higher-end bits and are initialized to false. This may be fine for all the sort codes except for that of @ since it must always have all its bits set to true. It is also problematic when using Boolean negation on a code as the upper bits will then be flipped to true—thus giving way to nonsensical interpretations.

Therefore, it is necessary to limit the size of relevant bits in a BitCode so as to manipulate only the lower bits that make sense. In particular, we must ensure that @ has all and only those bits within this size set to true. This size is the number of defined sorts. Hence, we must set the code of @ only after all other codes have been computed so that we know the maximum BitSet size (i.e., TAXONOMY.size()) used to allow setting @'s code correctly to BitCode.allSet(TAXONOMY.size()).

We must be careful to make Boolean operations (especially the not operation) on codes operate only on the bits of index lower than TAXONOMY.size(). This is another reason why we need to define the BitCode subclass of java.util.BitSet to provide for dyadic and, or, and not Boolean methods that are aware of the size of the codes to maintain always all others to false .

For large or sparse taxonomies, this encoding method can be further optimized (see the following section on modulated encoding).

Modulated encoding for large taxonomies

This section discusses implementation issues for sort encodings in large taxonomies using a technique of code modulation taking advantage of sparse or barely disconnected sort hierarchies which happen frequently in practice.

Modulated encoding is technique taking advantage of the topology of the taxonomy. The basic idea of modulated encoding that a taxonomy can be partitioned into modules of tightly related elements such that only a sparse number of "is-a" links cross over from one module to another. Then, the set of modules can be viewed itself as a "meta"-taxonomy of module elements that can be encoded using the method above. The taxonomy inside each module can then be encoded independently of the others modules. This ends up in providing the sorts with a lexicographic encoding. This reduces the size of codes at the expense of slightly less efficient boolean lattice operations. Comparing two elements first compares the codes of their modules, then if that result in consistent module, proceeding with the comparison in the resulting module. The efficiency in code size and code operations depends of the "quality" of the modulation of the original ordering. This method is worth using for very large sparsely connected taxonomies (which are very common). This modulation can be reiterated to have modules of modules, etc., ...

A summary of a modulation algorithm and operations on modulated codes together with considerations for implementation in Java is given here.

Interpreting Sort Expressions

In essence, once the set of sorts has been encoded as explained above, it is possible to proceed to interpreting sort expressions using bit vector operations on the underlying sort codes. However, there are some practical issues to consider.

Identifying sorts by name

As they are being declared, sorts are parsed as identifiers which are then encapsulated as Sort objects stored in the TAXONOMY array at their first appearance. Each sort name must correspond to a unique Sort object. Therefore, a later appearance of a sort name must be able to retrieve its corresponding Sort object stored in TAXONOMY. Rather than sweeping this array each time with a linear search, it is more efficient to use a hash table called NAMEDSORTS associating String sort names to the Sort objects they correspond to. Thus, at the same time a Sort object is stored in the array TAXONOMY, it is also recorded in the hashtable NAMEDSORTS as the value of the its unique (i.e., interned) String name. It is important to have the equals method and the hash code of a Sort object depend only on the sort's name, and neither on its index nor on its code fields as these are both subject to change during the encoding and reordering of the array TAXONOMY (see below). For this reason, Sort.name will always be set to its intern() unique version so that they may compared with == rather than equals.

Cyclic order declarations

It is possible (if only as a mistake) that some is-a declarations introduce cycles in the ordering. Detecting such cycles is simple once the sort encoding has been performed by transitive closure. This is explained next.

Sort retrieval

Sort codes allow fast Boolean lattice operations on the sort taxonomy so that intersection, union, and complement are optimized. Note however that such operations may result in a code for which there is no explicit declared sort. This is fine as long as it is not necessary to access explicitly declared sorts that appear in the sort ordering. However, when this is eventually necessary—if only for printing results—we will need to retrieve which explicit sorts a code corresponds to. When this code corresponds to an actual sort in TAXONOMY, then it is returned. Otherwise (i.e., when there is no explicit sort with such a code), the answer for this code is the disjunctive set of sorts possessing a code immediately lower than the given code. In other words, the sort of a non-existing sort's code is retrieved as the sort expression that is the union of its maximal lower bounds.

In order to allow for such retrieval, we shall use a hash table called CODECACHE, as a cache associating sort codes to the set of all Sort objects that are its maximal lower bounds of explicitly declared sorts existing in the encoded taxonomy. If not stored already, such a set is computed from the TAXONOMY array.

To compute such a set of maximal lower bounds of a given code, we sweep the TAXONOMY array upwards starting from index MIN_INDEX. As we proceed upwards, we collect all lower bounds of the given code in a set, pruning this set anytime a code higher than one of its elements is found, which is then added to the computed set. Note that there is no need to go higher than the first sort whose code is the given code or that of a super sort of the given code (because the order of the sorted array ensures that all sorts of higher index are greater or unrelated). Thus, the final set of Sort objects we end up with is that of the code's maximal lower bounds, which we can then store in the hash table cache CODECACHE as an image of the given code for future potential retrieval of this code's maximal lower bounds without having to recompute it.

One important note is worth making regarding how to represent sets of sorts stored in CODECACHE as potentially non-unique sorts associated to a bit code. Such sets of sorts are also needed for printing out results when decoding bit codes. It is possible of course to use aggregative objects of Java type Sort[], ArrayList<Sort>, or HashSet<Sort>. However, even though any of those would work for our purpose, they would be unnecessarily wasteful in terms of memory space and processing time. For example, they would lead to awkward set operations, which are needed in many of the taxonomy pragmas used to explore a sort taxonomy (e.g., computing a sort's children/parents, or its siblings, mates, similars, etc.,...).

So, rather than using such space-bulky and time-inefficient aggregate Java types, it is simpler and more efficient to use a BitCode. Indeed, all we need is to represent a set of sorts in canonical form; i.e., a maximal disjunctive set of incomparable sorts of the form {s1; ...; sn}. Hence, a bit code will precisely represent such a disjunctive sort when its jth bit is set to true iff j == si.index, for all n = 1,...,n. Retrieving any sort member si of such a set is then given simply by TAXONOMY[RANK[j]], where j is the bit set to true for si. If such a bit code c is such that c.cardinality() == 1 then it corresponds the unique defined sort given by TAXONOMY[RANK[c.nextSetBit(0)]].

Evaluating sort expressions

Sort expressions as defined above are essentially Boolean expressions. Evaluating such expressions can be quite efficient once the sort taxonomy has been encoded as we can rely on the underlying Boolean lattice of BitCodes. However, there are a few subtle issues to consider in order to keep these operations as efficient as they should be. Following is an overview of a setup for evaluation of sort expressions that will have been parsed as expressions involving sort names.


Before being interpreted, a ψ-term is represented as a data structure, which is described next. Such a structure is built by parsing its syntax, ensuring that all such a structure's properties are kept consistent with its definition prior to being interpreted.

Ψ-term structure representation

A ψ-term structure is built out of the syntactic expression described above. Such a syntactic structure is represented as an instance of the PsiTerm Java class. The basic structure of this class consists of:

A tag is essentially a logical variable. In other words, it may be unbound, or bound to another tag. Its essential structure consists of:

Thus, a tag uses a Tag deref() method that follows the chain of ref links until reaching the ultimate unbound (i.e., self-referential) tag, which it returns. Therefore, anytime accessing a tag, it must be dereferenced to get to its actual identity.

A feature is just a name—and could thus be represented as a String. However, for reasons that will become apparent when we introduce typing constraints for features, we represent them as instances of the class Feature, which (for now) has only one field name, of type String.

Ψ-term structure construction

Building a ψ-term structure by a parser from its written syntax must essentially ensure that (1) features and positions are functional (i.e., that a given feature or position appears at most once as argument of the ψ-term); and, (2) tagging is consistent (i.e., that a given tag refers to the same ψ-term (sub-)structure no matter where it occurs). Since this must be done prior to sort interpretation, we proceed as follows.

If a feature (or position) that has already been built as a ψ-term's argument (say, as f ⇒ t) occurs again as an argument to the same ψ-term (say, as f ⇒ t'), then the structure of t' is merged into that of t by:

The same subterm-merging procedure is applied for multiply occurring tags. In that case, a subterm is first built using a new tag, then this newly built subterm is merged with the reoccurring tag's term. This ensures consistent term sharing, including circular coreferences.

For example, parsing the following ψ-term syntax:

#P : person
     ( id => @
             ( first => "John"
     , id => name
             ( last => #S
             , first => string
     , spouse => married_person
                 ( address => #A : location
     , spouse => @
                 ( id => name
                         ( first => "Jane"
                         , last => #S : "Doe"
                 , id => name
                         ( first => string
                 , spouse => #P : married_person
                                  ( address => #A
will yield the following uninterpreted ψ-term structure:
#P : person & married_person
     ( id => name
             ( first => "John" & string
             , last => #S : "Doe"
     , address => #A : location
     , spouse => married_person
                 ( id => name & name
                         ( first => "Jane" & string
                         , last => #S
                 , spouse => #P
                 , address => #A

Note that, because of term merging and the fact that a ψ-term is uniquely identified by its tag, the "true" identity of a ψ-term must be retrieved through its dereferenced tag. This is especially relevant when accessing subterms of a ψ-term. Thus, just as for tags, a PsiTerm deref() method is provided for this purpose. For a ψ-term t, invoking t.deref() is equivalent to invoking t.tag().deref().term(). Therefore, anytime accessing the actual structure of a ψ-term, it must first be dereferenced to get to its actual identity.

Ψ-term structure interpretation

Once built, an uninterpreted ψ-term structure may be interpreted by evaluating the sort expressions at each node. In order to do that, the sort taxonomy must first be encoded. If the intepretation process yields {} for any node, then the whole ψ-term's value is {}.

Once interpreted, the pretty-printer must be adapted to decode the sort expression's values at each node resulting from their evaluation.

[To be completed later...]

Sort Name Definitions

Local sort names definitions

This is a facility only meant as a pragmatic convenience for a user to avoid having to type in a large ψ-term several times if needed. Each occurrence of a local sort name in an expression stands for a new ψ-term: the one it defines up to renaming of its tag variables. In particular several occurrence of the same name will not share variables before interpretation. After interpretation, some of these tag variables may get unified. Indeed, the scope of tag variables appearing in the expression defining a local sort name are local to that expression and are renamed each time this local sort name is expanded to its definition. In order to avoid infinite recursive expansion issues, such a local sort name may not be allowed to appear inside an expression defining this sort name.

Global sort name definitions

This defines a sort name standing for a global ψ-term structure. Wherever it appears in a ψ-term (whether local or global, including itself), it is the same. Its tags have also global scope and are always the class rep of any tag bound to it. For this reason, a global tag may only be dereferenced to another global variable. [To be completed later...]

User Interaction Pragmas

Next, we elaborate on operations meant as facilities for a user to explore or set parameters of an OSF sort definition and evaluation environment. These are dubbed pragmas because they are, strictly speaking, not components of an OSF language, but pragmatic tools related to an OSF environment. They take the form of extra-linguistic operators whose effect is to set parameters of, or return information about, the state an OSF environment. Such a pragma is a built-in construct of the form %pragma where "pragma" is the name of the pragma, possibly followed by arguments.

Taxonomy pragmas

Taxonomy pragmas are operators meant to explore a declared sort taxonomy. Useful pragmas for exploring taxonomic properties are listed below. This is neither a complete list, nor are all of these guaranteed to be provided due to non-trivial computational complexity required by some (e.g., %width).

Computing the children and parents of a sort
The children of a sort are its maximal lower bounds—except for bottom, which is always childless: its children are just itself ({}).

In order to compute the children of a sort, given its code c, we need to sweep the TAXONOMY array starting from {} up to, but not past, the first sort whose code is equal to or greater than c. As we do so, we collect any defined sort whose code is a strict lower bound of c, removing from the collected set any sort subsumed by a previously so-collected sort. The reason why we need not proceed past the first sort whose code is equal to or greater than c is thanks to the post-encoding reordering of the TAXONOMY array, which guarantees that any sort of higher rank will have a code that is either incomparable to or contains the given code.

The children of a sort expression are defined as follows.

The parents of a sort are its minimal upper bounds—except for top, which is always parentless: its parents are just itself (@).

Computing the parents of a sort is done similarly as for the children, only starting from @ and proceeding downwards down to, but not past, the first sort whose code is equal to or less than c. As we do so, we collect any defined sort whose code is a strict upper bound of c, removing from the collected set any sort subsuming a previously so-collected sort.

The parents of a sort expression are defined similarly as for the children of a sort expression, only dually.

Computing the siblings of a sort

[To be completed later...]

Computing the mates of a sort

[To be completed later...]

Computing the similars of a sort

[To be completed later...]

Computing the heirs and founders of a sort
The heirs (resp., founders) of a sort s are easily obtained from the minimals (resp., maximals), which are given by %parents {} (resp. %children @).

For %heirs s, this is done by keeping only those sorts in the set of minimals whose index is set to true in s.code.

For %founders s, this is done by keeping only those sorts in the set of maximals whose code has the bit corresponding to s.index set to true.

Computing the descendants of a sort
Computing the set of descendants of a sort in a fully encoded taxonomy is quite easy since they correspond to the true bits in its code, except for that sort itself. Therefore, it is given by the following formula:

Computing the ancestors of a sort
Contrary to the set of descendants, computing ancestors of a sort s in a fully encoded taxonomy is not so obvious. Indeed, the method described above relies on the fact that s.code was obtained by transitive closure. So the positions of the true bits in s.code give the ranks of all and only the descendants of s.code.

For ancestors, however, it is not possible to proceed similarly by relying on the false bits of s.code. Indeed, these bits only say that the sorts positioned at these ranks are not subsorts of s. In other words, they would in general give a superset of the set of the ancestors of s: the actual ancestors as well as incomparable sorts (remember this is a partial-order).

Let us take an example. Consider the following given taxonomy:


Using the transitive closure of the "is-a" ordering, this taxonomy is encoded as the following TAXONOMY array, each entry bearing information about a sort in the taxonomy:

code index: name

111111111111 12: @ (all)
010011110011 11: animal
001000110011 10: winged-thing
000100011000 09: vehicle-driver
000010010000 08: mammal
000001000000 07: fish
000000100011 06: bird
000000010000 05: plane
000000001000 04: car-driver
000000000100 03: human
000000000010 02: ostrich
000000000001 01: canary

So computing the set of ancestors of a sort s works as follows:

This is because all ancestors of s must have a code containing a 1 wherever there is a 1 in the code of s (because all descendants of s are also descendants of any ancestor of s, by definition of the transitive-closure code).

Computing the height of a sort
Computing the height of a sort in a fully encoded taxonomy can by done by dynamic programming as follows. Then, the height of the whole taxonomy is given by height(@).

In order to make this computation more efficient, we can prevent multiple recomputation of the height of each sort in the above recursive formula by setting a new field in the class Sort called height to a sort's height as soon as it is computed.

Computing the width of the sort taxonomy
Computing the width of a poset is known to be a difficult problem.

[To be completed later...]

Miscellaneous pragmas

We also provide for other miscellaneous useful pragmas. There could certainly be other useful pragmas that may be added to this basic set.

Built-in Literals and Operations

Dealing with literal constants of built-ins sorts (such as numbers, strings, etc.), requires special care.

In order to allow elements of large or infinite sets of values such as integers, floats, strings, etc, we need to provide a means to account for such values without the need of treating each element as a bona fide sort. Indeed, this would be unfeasible to encode elements of infinite sets (e.g., integers), or grossly inefficient for large finite sets as it would require a large number of codes of large size, each bearing only one single true bit.

Another issue regarding elements is that they may be aggregated using a monoid composition law—i.e., a dyadic associative operation with a neutral element. For example, for number elements: (+,0), (×,1), (max,-∞), (min,+∞), etc., ...

To deal with both issues, a simple data structure representing an element sort can be used as a subclass of Sort bearing, in addition to the sort's bit code, a value and a monoidal composition law. The former indicates the sort of the element, and the latter how to compose two elements of compatible sorts if ever unified. By default, this composition law is equality—i.e., whenever the elements have a value, it must be the same. If another law is specified and both elements have compatible sorts and each has a value, then the two elements are composed using it and the sorts intersected as normal sort conjunction.

[To be completed later...]

Monoidal Aggregation

[To be completed later...]

Polymorphic sorts

A polymorphic sort is a sort that takes another sort as argument—e.g., set(α), list(α), etc., ...—where α is a sort variable. Such a sort may have its sort variable be instantiated by any existing sort—e.g., set(person), list(set(integer)), etc., ...

[To be completed later...]

OSF Constraints and Theories

A constraint attached to a the sort in the taxonomy specifying how features are sorted and equations among feature path is called an OSF theory (see also slides 37–48 of these presentation slides). It first puts ψ-terms into a so-called dissolved form. Dissolving ψ-term transforms its graph structure into a set of labelled triples representing its edges. This is pretty much similar to the way RDF triples are obtained from their XML representation syntax that refer to other XML structures using tag attributes. In fact, this can be a standard internal format for ψ-terms as well as closely related graph-data formats such as semi-structured data representations, LinkedData, JSON, JSON-LD, etc., ...

[To be completed later...]

WAM-Style Compilation

This addresses a method of compiling dissolved ψ-term (especially for unification) using a WAM-style abstract-machine. A basic method is described in this technical report. An implementation of the method done in LIFE is described here. This method can be extended to compile OSF theories. However, these methods have to be spelled out for an efficient Java implementation.

[To be completed later...]

Logic Programming over OSF Constraints

[To be completed later...]

Functional Programming over OSF Constraints, Residuation

[To be completed later...]