How Protovalidate uses CEL

This page explains the relationship between CEL and Protovalidate: why Protovalidate chose CEL as the foundation for all of its validation rules, how rules are defined as CEL expressions in validate.proto, and what happens when your application validates a message.

For a practical introduction to writing CEL validation rules, see the CEL overview.

Why Protovalidate uses CEL

Protovalidate is the spiritual successor to protoc-gen-validate (PGV), a protoc plugin that generates polyglot message validation functions. When developers use their Protobuf files and PGV to generate code, PGV creates idiomatic Validate methods for the generated types.

p := new(Person)
err := p.Validate() // err: First name is required

Because it relies on code generation, PGV’s rules have to be implemented in each supported language. When UUID was added as a well-known string rule, the code change had to consistently implement the definition of a UUID string in Go, Java, and C++.

CEL solves this problem. CEL expressions evaluate consistently across multiple languages, so instead of defining each rule in each language, Protovalidate defines a library of CEL expressions for common rules that work across all of its supported languages.

How rules are defined

Unlike protoc-gen-validate, Protovalidate isn’t a protoc plugin. It doesn’t rely on any code generation. The core of Protovalidate is simply one Protobuf file using the proto2 syntax to define options (annotations).

In validate.proto, you can see the definition for every standard Protovalidate rule. For example, Protovalidate doesn’t have to define validation for a UUID string in Go, Java, and C++. Instead, it stores the rule once as a CEL expression:

bool uuid = 22 [
  (predefined).cel = {
    id: "string.uuid"
    message: "value must be a valid UUID"
    expression: "!rules.uuid || this == '' || this.matches('^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$')"
  }
];

Because uuid is defined as a bool field in the StringRules message, you can annotate any string field that should be a UUID without worrying about inconsistent UUID checking across Go, Java, or other systems:

message Contact {
    string name = 1 [(buf.validate.field).string.uuid = true];
}

Since the definition of uuid is part of the StringRules message, its backing CEL expression is compiled into the Protobuf descriptor for Contact. In other words, the compiled schema carries all of its own validation rules within its own metadata.
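To make the string.uuid expression concrete, here's a minimal sketch in plain Go of the logic it encodes. It isn't how Protovalidate evaluates the rule (that's done by a CEL runtime); the function name uuidRuleHolds and its parameters are hypothetical stand-ins for rules.uuid and this, and the sketch assumes Go's regexp package is close enough to CEL's matches() for this pattern.

```go
package main

import (
	"fmt"
	"regexp"
)

// uuidPattern is the regular expression copied from the string.uuid rule.
var uuidPattern = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// uuidRuleHolds mirrors the CEL expression
//
//	!rules.uuid || this == '' || this.matches('...')
//
// ruleEnabled stands in for rules.uuid and value stands in for this.
// Both names are illustrative and not part of any Protovalidate API.
func uuidRuleHolds(ruleEnabled bool, value string) bool {
	return !ruleEnabled || value == "" || uuidPattern.MatchString(value)
}

func main() {
	fmt.Println(uuidRuleHolds(true, "123e4567-e89b-12d3-a456-426614174000"))
	fmt.Println(uuidRuleHolds(true, "not-a-uuid"))
	fmt.Println(uuidRuleHolds(true, ""))            // empty strings pass this expression
	fmt.Println(uuidRuleHolds(false, "not-a-uuid")) // rule not enabled, so nothing is checked
}
```

Note how the expression short-circuits: if the rule isn't enabled or the string is empty, the pattern is never consulted.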

How validation works at runtime

Under the hood, CEL expressions are compiled into programs ahead of time. An application using CEL employs the CEL compiler to parse and type-check expressions, producing an abstract syntax tree (AST) that is then turned into an executable program. This makes it straightforward to cache compiled programs so that validation is fast at runtime.

Here’s what happens when your application validates a Protobuf message:

Your application depends on the Protovalidate library for its language.
It creates a Protovalidate Validator (a class or type provided by the Protovalidate library).
Optionally, it warms up the Validator’s compiled CEL program cache.
When the Validator is asked to validate a Protobuf message, it:
   1. Uses its cache to look up all CEL programs that should be run for the message.
   2. Runs each program, binding either the message or each field’s value as a variable named this.
   3. Collects the results of each program.
   4. Transforms those results into the Violation and Violations messages defined by Protovalidate.
Your application handles the idiomatic response from the Validator: Go uses an error, Java uses a ValidationResult class, etc.

If that sounds like a lot, and you’re just interested in using Protovalidate in RPCs, don’t fret — Buf provides quickstarts with either open-source or example interceptors that do all of this for you. They’re available for Connect and Go, gRPC and Go, gRPC and Java, and gRPC and Python.
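The runtime steps listed above can be sketched with nothing but Go's standard library. This is a toy model, not protovalidate-go's implementation: toyValidator, program, Violation, and the hard-coded rule are all hypothetical, and a plain Go closure stands in for a compiled CEL program.

```go
package main

import (
	"fmt"
	"sync"
)

// Violation is a simplified, hypothetical stand-in for Protovalidate's
// Violation message.
type Violation struct {
	RuleID  string
	Message string
}

// program stands in for a compiled CEL program. eval receives the value
// bound to "this" and reports whether the rule holds.
type program struct {
	id      string
	message string
	eval    func(this any) bool
}

// toyValidator caches programs per message type so that each type's rules
// are "compiled" only once.
type toyValidator struct {
	mu       sync.Mutex
	cache    map[string][]program
	compiles int // how many times rules were compiled
}

func newToyValidator() *toyValidator {
	return &toyValidator{cache: make(map[string][]program)}
}

// compile pretends to read rules from a descriptor and compile them.
// The single hard-coded rule is purely illustrative.
func (v *toyValidator) compile(typeName string) []program {
	v.compiles++
	return []program{{
		id:      "first_name.required",
		message: "First name is required",
		eval: func(this any) bool {
			s, ok := this.(string)
			return ok && s != ""
		},
	}}
}

// programsFor is step 1: look up cached programs, compiling on a miss.
func (v *toyValidator) programsFor(typeName string) []program {
	v.mu.Lock()
	defer v.mu.Unlock()
	progs, ok := v.cache[typeName]
	if !ok {
		progs = v.compile(typeName)
		v.cache[typeName] = progs
	}
	return progs
}

// Validate covers steps 2 through 4: run each program with the value bound
// to "this", collect the results, and turn failures into violations.
func (v *toyValidator) Validate(typeName string, this any) []Violation {
	var violations []Violation
	for _, p := range v.programsFor(typeName) {
		if !p.eval(this) {
			violations = append(violations, Violation{RuleID: p.id, Message: p.message})
		}
	}
	return violations
}

func main() {
	v := newToyValidator()
	fmt.Println(v.Validate("Person", ""))    // one violation
	fmt.Println(v.Validate("Person", "Ada")) // no violations
	fmt.Println("compilations:", v.compiles)
}
```

The second call to Validate reuses the cached programs, which is why the compilation counter stays at one. That caching is what keeps repeated validation cheap.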

What CEL unlocks

Because Protovalidate relies on CEL expressions that are compiled into schema metadata, it’s not limited to using only its standard library of CEL-based validation expressions. CEL allows Protovalidate to do what no other Protobuf validation library has ever done — it lets you write your own validation expressions.

Custom CEL expressions

With Protovalidate, you can write your own validation rules once in your Protobuf files, and then immediately use them across any supported language.

Protovalidate calls these custom rules. Simple to implement, they’re nothing more than an association of a CEL expression with a given field or message:

message SampleMessage {
  string must_be_five = 1 [(buf.validate.field).cel = {
    id: "must.be.five"
    message: "this must be five letters long"

    // A CEL expression defines the rule.
    expression: "this.size() == 5"
  }];
}
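One detail worth knowing when you write a length rule like this: CEL's size() on a string counts Unicode code points, not bytes. The Go sketch below illustrates the difference; celStringSize is a hypothetical helper written for this example, not a function from any CEL or Protovalidate library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// celStringSize approximates what CEL's size() reports for a string:
// the number of Unicode code points rather than the number of bytes.
// The name is illustrative only.
func celStringSize(s string) int {
	return utf8.RuneCountInString(s)
}

func main() {
	// "h\u00e9llo" is "hello" with an e-acute, which takes two bytes in UTF-8.
	for _, s := range []string{"hello", "h\u00e9llo"} {
		fmt.Printf("%q: bytes=%d, code points=%d\n", s, len(s), celStringSize(s))
	}
}
```

A check based on Go's len() would treat the accented word as six units long, while a code-point count treats both words as five.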

Reusable rule libraries

Protovalidate defines its standard rules in a Protobuf file. By extending its messages, you can do the same thing. This means you can develop organization-specific libraries of your own rules, publish them to the Buf Schema Registry, and then reuse them across your enterprise.

Creating these predefined rules is similar to creating custom rules, using proto2 syntax and extending Protovalidate’s rule messages:

extend buf.validate.StringRules {
  optional bool must_be_five = 80048952 [
    (buf.validate.predefined).cel = {
      id: "must.be.five"
      message: "this must be five letters long"

      // A CEL expression defines the rule.
      expression: "this.size() == 5"
    }
  ];
}

CEL extensions Protovalidate adds

You’ve already seen that CEL allows variable values to be bound at runtime. Protovalidate takes advantage of this, providing variables like this, rule, and rules to your CEL expressions.

CEL doesn’t stop with variables, however: the host application can also add brand-new functions and overloads to CEL itself. Each one is declared with a name and CEL types, and CEL programs delegate its execution to an implementation provided by the host language.

Protovalidate leverages this to provide common validation functions that aren’t built into CEL. For example, every language-specific Protovalidate implementation consistently implements isNan() to provide a function that you can use to check for NaN values. In protovalidate-go’s source code, you can see this function’s declaration, naming, binding to the CEL double type, and delegation to math.IsNaN():

cel.Function("isNan",
   cel.MemberOverload(
       "double_is_nan_bool",
       []*cel.Type{cel.DoubleType},
       cel.BoolType,
       cel.UnaryBinding(func(value ref.Val) ref.Val {
           num, ok := value.Value().(float64)
           if !ok {
               return types.UnsupportedRefValConversionErr(value)
           }
           return types.Bool(math.IsNaN(num))
       }),
   ),
)

This introduces cross-platform concerns: if Go’s math.IsNaN() follows different semantics than the type-specific isNaN() functions for Java’s Double and Float types, consistency could suffer. Protovalidate addresses this through a suite of conformance tests that all supported implementations must pass.

All of Protovalidate’s CEL extensions are documented in the extensions reference.

Learning more

CEL by Example teaches CEL through interactive, hands-on examples covering types, operators, macros, and more.
Learn more about using CEL with Protovalidate to write custom and predefined rules.
Find out how to use CEL in your own Go, Java, or C++ applications with a CEL code lab.
Take a deep dive into the CEL language reference.