Understanding Interpreter pattern

I am trying to understand how interpreter pattern can actually be implemented.

enter image description here

As per the diagram; an expression has 2 nodes: terminal & non-terminal. Can it have multiple type of nodes as well? Because I believe it is drawn considering the math expression transformed in binary tree DS where leaf nodes are number and non-leaf nodes are operations: +,-,/ etc.

Somewhere I read that java.util.Pattern is an example of interpreter pattern.

Pattern pattern = Pattern.compile("^[abc](ab|c?d)?ef$");
Matcher matcher = pattern.matcher("some RE");
if(matcher.matches()){ .. }

So I was trying to relate if my implementation of RE engine: BooleanSequence is also an example of interpreter pattern.

BooleanSequence seq = new BooleanSequence("[abc](ab|c?d)?ef");
seq.compile();seq.minimize();
Matcher matcher = seq.getCoreMatcher();
matcher.match("some RE");

Implementation note:

Main class BooleanSequence(BS) takes RE (can be considered as context) and build some kind of in-memory DS, where each char of RE is a node. There are many types of node, like: normal, range, lazy, any etc. I believe it can be considered as expression as given in diagram.

Node class also has match(). BS gives a matcher or it can be created separately. It is completely isolated from BS class. And there are many types of matchers (currently 3). Matcher calls match() of all the nodes until whole expression is evaluated.

This class diagram means that an AbstractExpression is either a TerminalExpression or a NonTerminalExpression. If its a NonTerminalExpression, it is itself an aggregation of one or several AbstractExpression.

In fact this structure is a tree. Typically the terminals would be further derived into vriables and litterals, and the non terminals would be further derived at lest in unary and binary operators, but may be even more.

An example of instantiation could be:

enter image description here

In your java.util.Pattern example:

  • the calling code is the client, who first build the abstract syntax tree when compiling a regex pattern (i.e. building an internal representation of the regex pattern).
  • the Pattern is the interpreter. It would in principle correspond to the top level AbstractExpression. The only particularity is that the structure of the expression is encapsulated in the interpreter and not accessible.
  • the Pattern.matcher() is the equivalent of a call to interpret(), the string to parse being the context.
  • the Matcher object is the result of the interpretation with on a particular string.

Can it have multiple type of nodes as well?

This diagram you are showing is a class diagram, so more-or-less by definition, you can continue to subclass. For example you might subclass NonTerminalExpression with BinaryExpression.

If we take the diagram suggestively instead of literally, you might have BinaryExpression directly subclass from AbstractExpression, since you know you’ll have exactly a left and a right instead of an arbitrary sized collection of children. You can see that in the Wikipedia Interpreter Pattern article, they have done this for the Java version.


I don’t see java.util.Pattern exhibiting that specific interpreter pattern, externally, at least, because it always involves just one object, a matcher, rather than the arbitrary number of objects that I would expect to be involved in the interpreter pattern for any given text of arbitrary length. By contrast, the interpreter pattern would produce a tree of objects for a given expression, and there is nothing in the API that necessarily suggests such a tree is present.

java.util.Pattern may very well do something like that internally, however, it is not visible (to me without some further investigation) because it is providing encapsulated behavior. So, the API isn’t demonstrating the interpreter pattern. Still, a regular expressions matching mechanism following that API could be implemented in a number of different ways either using or not using an interpreter pattern.

I would say that what both java.util.Pattern and your BooleanSequence follow a linguistic pattern where the first step creates a parser for a language, which is defined by specifying a grammar in some other language, and both have an additional step of applying a parser to a specific text.


See also Abstract Syntax Tree. If we were to apply behaviors (methods) to the class hierarchy of the abstract syntax tree we could get to the interpreter pattern, and we could also get to a compiler (pattern).

Trả lời

Email của bạn sẽ không được hiển thị công khai. Các trường bắt buộc được đánh dấu *