Types of Annotation Languages

What do annotations look like? What specification languages are there?

If annotations are to be machine-readable, they must have a well-defined syntax, and if a tool’s checks are to be meaningful, the annotations also need a well-defined semantics. Quite a few specification languages have been defined over the past several decades; they are categorized and explained below.

More detail about some of the specific languages, including examples and supporting tools, is provided on linked pages. An ever-changing list of tools supporting these languages is provided here.

The list below covers many different kinds of specification languages. From the perspective of this project, the most interesting types are those languages that have tools that directly check source code against annotations. These mostly (although not exclusively) exist in three categories:

  • Behavioral Interface Specification Languages
  • Pluggable type checkers
  • Model-checking languages with tools to extract checkable models from source code

Natural Language Comments and Requirements

The original annotations are natural language comments. For this project, they're out of scope because they are not machine-checkable. Note that tools such as javadoc for Java and Doxygen for C++ use structured comments. Some aspects of the comments can be checked, such as that all procedure parameters are mentioned. But this type of annotation is not useful for automated checking of program semantics.

However, all requirements for a project begin as informal expressions of the project concept, and become increasingly more detailed as the project is designed. Some projects only ever have informal requirements that must be checked against the implementation by human review. For other projects, at some point informal requirements are translated into tool-supported formal requirements (and human review is needed to be sure that translation is correct). With the right language to express the requirements and the right tools, some (or even many) aspects of the requirements can be checked against the implementation with automated tools. Software development will always require human input and creativity in most of its phases, even with the best tool support serving as an automated, pedantic assistant. The tools take care of routine or subtle details, raising the conceptual level at which human review is useful.

There are commercial and open source tools that support requirements management and traceability. A long list, created in early 2013, can be found here. Note that these tools arise from the project management domain; most of the other specification language and verification tools stem from research in software verification, model checking and flaw-finding. Consequently, we are not aware of any tools that manage the relationships and traceability of requirements (in the project management sense) to formal specifications.

Keyword-based annotation languages

These languages define a number of keywords and locations in the code where those keywords may be placed. The corresponding static checking tool recognizes the keywords and checks that whatever the keyword requires actually holds. The static checking tool may be independent of the compiler or built into the compiler.

Languages of this type do not typically include any (or at most very rudimentary) expression syntax. However, they might do significant inter-procedural reasoning about the global implications of a particular annotation. For example, declaring a variable immutable or non-null may have wide-ranging effects. This is particularly the case if the semantics is such that where annotations are absent, appropriate values are inferred by the context.

For example, Microsoft’s SAL language defines parameterized keywords such as _In_reads_bytes_(count). This keyword modifies the parameter to a procedure and states that the parameter is an input buffer from which ‘count’ bytes of data will be read. The requirement then is that any actual parameter must point to a memory area that is readable and contains at least the stated number of bytes. A tool can then check that that requirement holds in each location that the procedure is called.

A more flexible example of this language category is named Pluggable Types (cf. http://javacop.sourceforge.net and http://types.cs.washington.edu/checker-framework). Pluggable types are annotations on programming language types. For example, one can create a subtype of Object which is a non-null Object (an Object reference that is never null). The Pluggable type system allows users to define new such types and provides infrastructure for checking that the types are used properly; for example, ensuring that a possibly-null Object is never assigned to a non-null Object without confirming the value is indeed not null.

Other example subtypes that have been created express that objects or variables are readonly, interned, immutable, or define distinctions among kinds of strings. Sometimes, the characteristic to be checked is clearly one in the application domain, and not something one could build into a library or compiler; for example, one might characterize Strings as Encrypted or PlainText. The burden of writing annotations throughout an existing program can be minimized by defining appropriate defaults and by including tools that infer (and possibly automatically annotate) the correct subtype annotations for existing code. The most extensively developed and tested (with documented use cases) pluggable type system is the Checker framework for Java, developed by Ernst et al. at the University of Washington.

Example languages:

FindBugs – uses Java annotation syntax for some of its checks
Pluggable Types – for example

Tools that don't appear to use annotation languages

A number of tools do automated static checking of source code and don't require specifications from the user. At the low end, such tools check for compliance with coding standards or style guidelines. At the high end, these tools find violations of programming rules (e.g., buffer overruns) and inconsistent usage (e.g., a variable is usually but not always guarded in a critical section), but cannot otherwise check against programmer intent. Because these tools don't require formal specifications, the tools are easy to apply, and since they do find real bugs, they are widely used.

Although technically no user annotations are required for these tools, annotations have crept in various ways, including:


In some cases, type annotations are needed. For example, FindBugs normally operates without annotations, but for finding null pointer errors, FindBugs uses @NonNull, @Nullable, and @CheckForNull annotations. These are effectively keyword annotations used for just one of FindBugs’ analyses.


Tools that reason about inter-procedural control flows using source code need descriptions of the behavior of library functions for which source code is not available. For example, GrammaTech's CodeSonar doesn't require user annotations but internally uses its own language to express library models. The models summarize aspects of the behavior of a library function relevant to the tools’ analyses.


All of these tools produce false positives. That is, they sometimes warn about code that is not in error. Many tools provide a mechanism to suppress those warnings, using some tool-specific annotation (e.g., @SuppressWarning).

Because these tools do not check programs against user intent, they are of less interest to this project. However, they are useful nonetheless and the static analyses used by the higher-end tools are similar to the analyses used for verification tasks. Furthermore, the tools may be enhanced in the future to read and check annotations along with the processing they currently perform.

Many tools exist within this category, Wikipedia's (partial) list can be found here.

A subset of this category is model checkers that operate directly on source code, using SAT, SMT, or built-in solvers to find feasible paths to error conditions.

Mathematical, programming-language-independent specification languages

The keyword languages add useful additional static checking that prevents some kinds of errors. However, they are not expressive enough to use for full functional verification of programs – establishing that a program does what it is intended to do.

For that purpose, the specification language must include full logical expressions, including at least arithmetic and arrays. One approach is to use basic, programming-language-independent mathematics. The specification language provides an easy way to express the kinds of mathematical models typically used by software. Supporting tools can check that the specification is consistent.

These languages aren't usually connected to the code itself, though, and for that reason have been largely superseded by BISLs or top-down correct-by-construction approaches.

Example languages:

Larch (Shared Language)

Correct-by-construction/refinement techniques

One of the faults of languages that are purely mathematical is that they represent a design but don’t have tools to check that the implementation (the code) matches that design. Refinement techniques address this problem. In correct-by-construction development, an abstract model of the desired system is created, at a level of abstraction where it is ‘clear’ that the model meets the human intent for the software.

Next, the model is refined, perhaps in multiple steps. Each refinement step adds more detail, until eventually executable code is created. Some of these refinement steps might be performed semi-automatically, such as with code generation tools. The ‘correctness’ of the constructed code, with respect to the original abstract design, is ensured by validating that each refinement step maintains correctness.

Correct-by-construction is more of a design and implementation technique than a language per se, though some formal notation must be used to express the abstract models. Thus, the ‘specification language’ is essentially inextricable with the tool that performs the refinement.

Example systems/languages:

Perfect Developer
Event-B (Rodin tool)

Behavioral Interface Specification Languages (BISLs)

Languages in this class have two differentiating characteristics:

1. Their syntax and semantics are related to the programming language they describe
2. They are meant to describe the implementation itself, rather than an abstraction of it

Most of these languages descend from the Larch family, which had a language-independent shared language and companion programming-language-specific aspects for each supported programming language. Nearly all of the example languages listed below are quite similar to each other, with adjustments for the host programming language; for example, ACSL (for C) was explicitly derived from JML (for Java), adding and adjusting features to make it more appropriate for C. The research for each new specification language and its accompanying tools brought new ideas about how to specify languages, with significant cross-fertilization among different specification languages.

By borrowing syntax and semantics from the programming language, the specification language has the benefit that the initial learning cost is reduced and less mental translation between the specification and programming languages is needed when humans are comparing the code and its specification. When working with multiple programming languages, there is some advantage to having to learn just one programming-language-independent specification language. However, most of the various specification languages in this category are similar enough that that advantage is small.

The bulk of recent research has certainly focused on programming-language-related specification languages, also called Behavioral Interface Specification Languages (BISLs). There are numerous examples:

Example languages:

Java: JML
Concurrent C: VCC
C#: Spec#
C#: Sing#
.NET: CodeContracts
Dafny: Dafny
Eiffel: Specification language built-in to Eiffel
Scala: Leon

Some of these languages have multiple tools that use them. For example, JML is supported by ESC/Java2, the JML2 tools, OpenJML, KeY, the LOOP tool, and others.

JML, ACSL, Spec#, and VCC are quite similar, though the different languages use different syntax and stress different features, depending on the research directions that led to their design; they translate the code and specifications into a weakest precondition logical form and use SMT constraint solvers to establish the resulting verification conditions. SPARK is a dialect of Ada that has specification constructs built-in to the language. Eiffel was historically oriented toward runtime checking, but has recently added static checking as well. CodeContracts is also run-time checkable and is designed for use by an abstract interpreter rather than an SMT constraint solver or theorem prover. Leon uses a recursive functional form logic and reasoning engine.

Languages for modeling systems

Another approach is to first model the system accurately, then validate that the system satisfies relevant correctness and safety properties (using a model checking tool), and finally, implement the system in a programming language. There are many varieties of systems that might be modeled and corresponding varieties of modeling languages and model checking tools. For example, some model hardware while others model reactive systems. Some examples are given below, but for the most part, these systems are used prior to implementation and do not have tools that validate the implementations against the models. However, SPIN and Bandera, for example, are able to extract a model from source code (C and Java, respectively) and then apply its model checker to that model.

There are many tools and languages. See wikipedia's list, for instance. In this project, we are mostly interested in those that have tools that extract models from source code or analyze it directly.

Examples in this category:

SPIN (mostly on models written in PROMELA, but can extract a PROMELA model from C)

Model Programs

Model programs express specifications as snippets of code in the host programming language. In practice, in order to express what is done (the abstract behavior) rather than how it is done (the implementation), extensions to the core programming language are needed.

The advantage of this syntax is that it is largely parseable by the compiler itself. The only examples of this form that we are aware of are proprietary syntax for expressing library summaries for commercial tools. As such, aspects of the semantics are tailored to the companion tool and not designed for widespread reading or writing.