Building a runtime reflection system for Rust 🦀️ (Part 2)

Sam Scott

Part 2: dyn Attribute

Introduction

This is Part 2 in our blog series on our experience building a runtime reflection system in Rust for our policy engine, oso.

In Part 1, we built a simple class system, with a metaclass struct called Class to track type information. We also introduced the Instance struct to wrap any object and erase the type using dyn Any. You can find Part 1: dyn Class here.

In this post, we'll go through how we support accessing arbitrary attributes on our instances at runtime. There are multiple approaches to solving this problem, including the use of traits from the standard library and commonly used crates like serde. We'll talk through the pros and cons of each, and the version we landed on: adding attribute getters.

Accessing attributes on a struct

Suppose we have a struct:

struct Foo {
    x: u32,
}

We would like to write the following code in Polar, the oso policy language:

x_is_one(foo: Foo) if
    foo.x = 1;

and have it check that foo has an attribute called x and that its value is 1.

What makes this so hard?

To start with, there is no getattr(foo, "x") equivalent in Rust. And what makes this hard to implement ourselves is that Rust (by default) provides no guarantees about things like the memory layout of a struct. In fact, in some cases the compiler will simply optimise away a field entirely. Even in debug builds, a debugger might be unable to look up the value of variables.

In other words, if you haven't explicitly told Rust that you want it to let you access an attribute, you're unlikely to get it.
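
To make the layout point concrete, here is a small sketch (the struct names are ours, purely for illustration). It compares a struct using the default representation, where the compiler may reorder fields, with the same struct marked repr(C), which opts in to declaration-order layout. The exact size of the default version is deliberately not relied on, since Rust doesn't promise one.

```rust
use std::mem::size_of;

// Default representation: the compiler is free to reorder the fields.
#[allow(dead_code)]
struct Reorderable {
    a: u8,
    b: u32,
    c: u8,
}

// `repr(C)` opts in to a fixed layout that follows declaration order.
#[allow(dead_code)]
#[repr(C)]
struct Fixed {
    a: u8,
    b: u32,
    c: u8,
}

// Returns (size with default repr, size with repr(C)).
fn layout_sizes() -> (usize, usize) {
    (size_of::<Reorderable>(), size_of::<Fixed>())
}

fn main() {
    let (default_size, c_size) = layout_sizes();
    println!("default repr: {} bytes, repr(C): {} bytes", default_size, c_size);
}
```

On common platforms the repr(C) version needs padding around the u32, while the default version is usually smaller because the fields get rearranged. Either way, field offsets are not something we can rely on without asking for them.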

Our approach

Let's start by thinking about the method signature for dynamic access. We would like to take in an Instance and an attribute name (&str), and return a value to the policy engine.

In other words:

Fn(&Instance, &str) -> PolarValue;

There are actually two distinct places we could conceivably implement this method: on Instance, and in Class.

On one hand, if we implement it directly on Instance, then we don't need to do any more work to look up the class in order to get attributes.

On the other hand, we need to store the attributes somewhere so that we can add them to the instance on creation. So we would probably need to store these on the Class anyway, and then copy them onto the Instance.

It's a subtle distinction, but for us it made more sense to keep the instance lightweight, and to use the Class to store the attribute lookups. Although in Python it's perfectly valid to add attributes to a class on the fly, we have that requirement in neither the host language nor in Polar.

To make the above work, we simply add a method on Instance to retrieve its class, and we can get attributes from there:

impl Instance {
    fn class<'a>(&self, host: &'a Host) -> crate::Result<&'a Class> { ... }

    pub fn get_attribute(&self, name: &str, host: &mut Host) -> crate::Result<Term> {
        // look up the class for this instance, then delegate to it
        let class = self.class(host)?;
        class.get_attribute(self, name)
    }
}

Implementing the getter

To implement the get_attribute method, we can either implement (a) a method that can fetch all attributes, or (b) a method for each attribute.

We opted for (b), which gives us the most flexibility to decide on a per-attribute basis how the lookup works. Class stores a map from attribute names to getters:

/// Class definitions
struct Class {
    ...

    /// Map from attribute name to the attribute lookup
    attributes: HashMap<&'static str, AttributeGetter>
}

struct AttributeGetter(Arc<dyn Fn(&Instance) -> PolarValue>);

We've now split the attribute lookup into two steps:

  1. Look up the getter for the named attribute: (instance, name) → (instance, attributes.get(name)) → (instance, f)
  2. Apply the getter method: (instance, f) → f(instance)
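
The two steps above can be sketched in a self-contained way. This is a simplified model rather than the real oso code: PolarValue is reduced to a single variant, Instance is a bare Box<dyn Any>, and the foo_class helper is hypothetical.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::sync::Arc;

// Stand-in for oso's PolarValue (assumption: the real type has more variants).
#[derive(Debug, PartialEq)]
enum PolarValue {
    Integer(i64),
}

// Type-erased wrapper around any object, as in Part 1.
struct Instance {
    inner: Box<dyn Any>,
}

type Getter = Arc<dyn Fn(&Instance) -> PolarValue>;

struct Class {
    attributes: HashMap<&'static str, Getter>,
}

impl Class {
    fn get_attribute(&self, instance: &Instance, name: &str) -> Option<PolarValue> {
        // Step 1: look up the getter for the named attribute.
        let getter = self.attributes.get(name)?;
        // Step 2: apply the getter to the instance.
        Some(getter(instance))
    }
}

struct Foo {
    x: u32,
}

// Hypothetical helper that registers a single getter for `Foo::x` by hand.
fn foo_class() -> Class {
    let mut attributes: HashMap<&'static str, Getter> = HashMap::new();
    attributes.insert(
        "x",
        Arc::new(|instance: &Instance| {
            let foo = instance.inner.downcast_ref::<Foo>().expect("not a Foo");
            PolarValue::Integer(i64::from(foo.x))
        }),
    );
    Class { attributes }
}

fn main() {
    let class = foo_class();
    let instance = Instance { inner: Box::new(Foo { x: 1 }) };
    println!("{:?}", class.get_attribute(&instance, "x"));
    println!("{:?}", class.get_attribute(&instance, "y"));
}
```

Here the getter does the downcast and the conversion by hand, which is tedious to write for every attribute.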

A simple example of an attribute getter is the closure |foo: &Foo| foo.x, which has a unique anonymous type (written [closure@<source code location>]) and implements the trait Fn(&Foo) -> R.

To store this on the class we need to do some dynamic type mapping again. First, we have in our hands an Instance, not a Foo. We saw in the last post that downcast would allow us to get back to a concrete type. Furthermore, we cannot store a function that returns foo.x as is, because every getter stored in the map has to share a single return type. Instead, we make sure to do the conversion back to a PolarValue as part of looking up the attribute.

To make it a bit easier to create such type-erased functions, and to make it easier for users of oso, the AttributeGetter provides a new method which handles this for them:

impl AttributeGetter {
    pub fn new<T, F, R>(f: F) -> Self
    where
        T: 'static,
        F: Fn(&T) -> R,
        R: crate::ToPolar,
    {
        Self(Arc::new(move |receiver: &Instance, host: &mut Host| {
            // get back the original type of the receiver
            let receiver = receiver.downcast().expect("type error");
            // call the function; convert the result
            f(receiver).to_polar(host)
        }))
    }
}
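
For readers who want to try the type-erasure trick in isolation, here is a standalone sketch of the same idea. It makes several simplifying assumptions that differ from oso itself: there is no Host, ToPolar is a hypothetical one-method trait, and Instance::downcast returns an Option. The call helper is also ours.

```rust
use std::any::Any;
use std::sync::Arc;

#[derive(Debug, PartialEq)]
enum PolarValue {
    Integer(i64),
    String(String),
}

// Hypothetical stand-in for oso's ToPolar trait (no host involved here).
trait ToPolar {
    fn to_polar(&self) -> PolarValue;
}

impl ToPolar for u32 {
    fn to_polar(&self) -> PolarValue {
        PolarValue::Integer(i64::from(*self))
    }
}

impl ToPolar for String {
    fn to_polar(&self) -> PolarValue {
        PolarValue::String(self.clone())
    }
}

struct Instance {
    inner: Box<dyn Any>,
}

impl Instance {
    fn new<T: 'static>(value: T) -> Self {
        Instance { inner: Box::new(value) }
    }

    fn downcast<T: 'static>(&self) -> Option<&T> {
        self.inner.downcast_ref::<T>()
    }
}

struct AttributeGetter(Arc<dyn Fn(&Instance) -> PolarValue>);

impl AttributeGetter {
    fn new<T, F, R>(f: F) -> Self
    where
        T: 'static,
        F: Fn(&T) -> R + 'static,
        R: ToPolar,
    {
        Self(Arc::new(move |receiver: &Instance| {
            // Recover the concrete type, then convert the typed result.
            let receiver: &T = receiver.downcast::<T>().expect("type error");
            f(receiver).to_polar()
        }))
    }

    fn call(&self, instance: &Instance) -> PolarValue {
        (self.0)(instance)
    }
}

struct Foo {
    x: u32,
    name: String,
}

fn main() {
    let foo = Instance::new(Foo { x: 1, name: "foo".to_string() });
    let get_x = AttributeGetter::new(|foo: &Foo| foo.x);
    let get_name = AttributeGetter::new(|foo: &Foo| foo.name.clone());
    println!("{:?} {:?}", get_x.call(&foo), get_name.call(&foo));
}
```

Both getters end up with the same type-erased signature even though the closures return a u32 and a String respectively.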

We use this in the builder method:

impl<T: 'static> ClassBuilder<T> {
    pub fn add_attribute_getter<F, R>(mut self, name: &'static str, f: F) -> Self
    where
        F: Fn(&T) -> R,
        R: crate::ToPolar,
    {
        self.attributes.insert(name, AttributeGetter::new(f));
        self
    }
}

One nice property of the above is that it's a compile-time error to define an attribute getter for the wrong class! So despite erasing the type when constructing an Instance and later recovering the type dynamically, it's impossible to provide the wrong getter:

error[E0631]: type mismatch in closure arguments
  --> languages/rust/oso/examples/blog.rs:53:14
   |
53 |             .add_attribute_getter("y", |foo: &Bar| foo.y)
   |              ^^^^^^^^^^^^^^^^^^^^      ----------------- found signature of `for<'r> fn(&'r example_two::Bar) -> _`
   |              |
   |              expected signature of `for<'r> fn(&'r example_two::Foo) -> _`

Trying it out

Okay, let's see it in action! We'll start with some test setup:

struct Foo { x: u32, }

#[test]
fn test_get_attribute() {
    // Set up the host environment
    let polar = crate::Polar::new();
    let mut host = Host::new(Arc::new(polar));

    ...
}

Now that we have a runtime class system, we need something to manage these classes at runtime. This is our Host object. Host will be responsible for things like storing class definitions and caching instances.

Next, we define the Foo class:

// builder for defining the Foo class
let foo_class: Class = Class::builder::<Foo>()
    .add_attribute_getter("x", |foo| foo.x)
    .build();
// register the Foo class on the host
host.cache_class(foo_class, "Foo".into()).unwrap();

We added a simple builder pattern to make it easier to register attributes in bulk, and the built Class is cached on the host.

With the setup done, we can try out accessing attributes dynamically:

let foo_instance: Instance = Instance::new(Foo { x: 1 });
// get the value of "x" from the instance
let get_x = foo_instance.get_attribute("x", &mut host).unwrap();

// check the result is the PolarValue for 1
assert_eq!(get_x, polar_core::term!(1));

The test looks up the value of "x" on the instance of foo, and compares it to the expected result. Dynamic attributes in action! The returned value is now a Polar term, since in normal operation the return value would be sent to the Polar VM.

The benefit of this approach is that it gives the user a lot of flexibility: when an attribute's type doesn't implement ToPolar, the conversion can be handled right in the getter closure. And the implementation itself is super simple.
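
As an illustration of that flexibility, here is a sketch in which a field's type has no conversion into Polar, so the getter converts it on the way out. Everything here is a simplified stand-in under our own assumptions: the Session struct, the ttl_secs helper and the trimmed-down ClassBuilder are hypothetical, and ToPolar is a one-method trait with no host.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::marker::PhantomData;
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum PolarValue {
    Integer(i64),
}

trait ToPolar {
    fn to_polar(&self) -> PolarValue;
}

impl ToPolar for u64 {
    fn to_polar(&self) -> PolarValue {
        PolarValue::Integer(*self as i64)
    }
}

struct Instance {
    inner: Box<dyn Any>,
}

type Getter = Box<dyn Fn(&Instance) -> PolarValue>;

struct Class {
    attributes: HashMap<&'static str, Getter>,
}

struct ClassBuilder<T> {
    attributes: HashMap<&'static str, Getter>,
    _marker: PhantomData<T>,
}

impl<T: 'static> ClassBuilder<T> {
    fn new() -> Self {
        ClassBuilder { attributes: HashMap::new(), _marker: PhantomData }
    }

    fn add_attribute_getter<F, R>(mut self, name: &'static str, f: F) -> Self
    where
        F: Fn(&T) -> R + 'static,
        R: ToPolar,
    {
        let getter: Getter = Box::new(move |receiver: &Instance| {
            let receiver = receiver.inner.downcast_ref::<T>().expect("type error");
            f(receiver).to_polar()
        });
        self.attributes.insert(name, getter);
        self
    }

    fn build(self) -> Class {
        Class { attributes: self.attributes }
    }
}

// `Duration` has no `ToPolar` impl in this sketch.
struct Session {
    ttl: Duration,
}

fn session_class() -> Class {
    ClassBuilder::<Session>::new()
        // Convert to a supported type inside the getter itself.
        .add_attribute_getter("ttl_secs", |session: &Session| session.ttl.as_secs())
        .build()
}

fn ttl_secs(class: &Class, instance: &Instance) -> PolarValue {
    (class.attributes["ttl_secs"])(instance)
}

fn main() {
    let class = session_class();
    let instance = Instance { inner: Box::new(Session { ttl: Duration::from_secs(90) }) };
    println!("{:?}", ttl_secs(&class, &instance));
}
```

The attribute exposed to the policy ("ttl_secs") doesn't even need to correspond to a field of the same name, which is part of what makes per-attribute getters flexible.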

The main downside of the approach is that it is very manual. For every attribute a user wants to use, they need to add the add_attribute_getter line.

Fortunately, in the oso crate, we also provide a derive macro which can handle this. We still make getting attributes an opt-in process, but the code from above becomes:

#[derive(PolarClass)]
struct Foo {
    #[polar(attribute)]
    x: u32,
}

#[test]
fn test_get_attribute() {
    ...

    // the derive macro generates the Foo class definition
    let foo_class: Class = Foo::get_polar_class();

    ...
}

Alternate Approaches

The approach we outline above suits our needs, but leverages virtually nothing from the Rust standard library or the wider Rust ecosystem. Is there really nothing out there to address this problem?

We'll take you through a few of the alternatives we tried. Some of them are complementary to the approach we've taken and would be reasonable future additions, others just didn't quite seem to be what we needed.

In no particular order:

Implement Index<String>

In some ways, the most direct route we could take would be to use the Index<String> trait. This is normally used for data structures like Vec, which lets you index into it with vec[i], or HashMap, which lets you look up the value for a key with map[&key].

The benefit of using a trait from the standard library is that users may get implementations of the trait for "free", e.g., if they are already using a HashMap, then they don't need to do any additional work to define attribute getters that leverage the trait.

impl<T> ClassBuilder<T> {
    fn supports_indexing(mut self) -> Self
    where
        for<'a> T: Index<&'a str>,
    {
        // delegate every attribute lookup to the type's Index implementation
        self.set_attribute_lookup(|receiver, attr| {
            receiver.index(&attr)
        });
        self
    }
}

The problem is: what is the output type for Index? We need it to be any type that can be converted back into the Polar core – perhaps Box<dyn ToPolar> or similar. To make that happen, we would be requiring the user to implement a version of Index for their classes, which is unlikely to be something they will want in the rest of their code. Given this, it doesn't seem like there is much benefit to using Index.
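
To see what that would ask of a user, here is a rough sketch of an Index implementation for a struct with two differently typed fields. ToPolar and PolarValue are simplified stand-ins (assumptions, not the oso definitions). Because Index::index has to return a reference to a single Output type, the only way to cover both fields is a trait object.

```rust
use std::ops::Index;

#[derive(Debug, PartialEq)]
enum PolarValue {
    Integer(i64),
    String(String),
}

// Hypothetical stand-in for oso's ToPolar trait.
trait ToPolar {
    fn to_polar(&self) -> PolarValue;
}

impl ToPolar for u32 {
    fn to_polar(&self) -> PolarValue {
        PolarValue::Integer(i64::from(*self))
    }
}

impl ToPolar for String {
    fn to_polar(&self) -> PolarValue {
        PolarValue::String(self.clone())
    }
}

struct Foo {
    x: u32,
    name: String,
}

impl Index<&str> for Foo {
    // One output type for all attributes, so it has to be a trait object.
    type Output = dyn ToPolar;

    fn index(&self, attr: &str) -> &Self::Output {
        match attr {
            "x" => &self.x,
            "name" => &self.name,
            // Index has no way to signal a missing key other than panicking.
            other => panic!("no attribute named {}", other),
        }
    }
}

fn main() {
    let foo = Foo { x: 1, name: "foo".to_string() };
    println!("{:?} {:?}", foo["x"].to_polar(), foo["name"].to_polar());
}
```

This works, but the implementation exists only to serve the policy engine, and a missing attribute can only be reported by panicking.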

serde::Serialize

Whenever we encounter a problem that requires some form of translation, serde comes to mind.

We could require types to implement serde::Serialize. By writing our own custom serializer, this would give us a way to leverage the existing machinery for walking over all the fields of a struct. And the ubiquity of serde means many types will already have an implementation or are just one #[derive(Serialize)] away from having one.

// Structs are like maps in which the keys are constrained to be compile-time
// constant strings.
impl<'a> ser::SerializeStruct for &'a mut AttributeSerializer {
    type Ok = PolarValue;
    type Error = Error;

    fn serialize_field<T>(&mut self, key: &'static str, value: &T) -> Result<()>
    where
        T: ?Sized + Serialize,
    {
        // only serialize the field we were asked for; skip the rest
        if key == self.attribute_name {
            self.attribute_value = Some(value.serialize(&mut **self)?);
        }
        Ok(())
    }

    fn end(self) -> Result<PolarValue> {
        self.attribute_value.take().ok_or(Error::AttributeNotFound)
    }
}

But this approach still doesn't feel quite right. It is conflating application types with data formats. The way a type is serialized/deserialized doesn't necessarily correspond to how the object should behave in the application.

However, it would still be possible to extend the current approach by using serde as a shortcut to deriving attribute getters.

Require "getter" methods

Instead of providing a special mechanism to set attribute getters, we could follow the lead of other scripting languages built in Rust, like Rhai, and require that the user tags methods on the struct with #[polar(getter = "get_x")] or similar. This would be a reasonable extension of the existing proc macro.

Self-referential pinned structs

With the introduction of Pin, it is now possible to write self-referential structs (safely) in Rust. Indeed, this is one of the main examples given in the pin module docs.

How would this help us with dynamically accessing attributes? The basic idea would be to extend Instance to have a map of attributes: pub attributes: HashMap<&'static str, NonNull<dyn Any>>

Since this uses pointers (via NonNull), we would need to use unsafe blocks when accessing the attributes. The invariant that we are upholding is: it is safe to do this because we are fixing the Instance to a specific memory location, and thus referencing attributes on the instance is safe.

This might look as follows:

impl Instance {
    pub fn add_attribute_ref<F, T, R>(self: Pin<&mut Self>, name: &'static str, getter: F)
    where
        F: Fn(&T) -> &R,
        T: 'static,
        R: Any,
    {
        // store a type-erased pointer to the attribute on the instance
        let attr: NonNull<dyn Any> = NonNull::from(getter(
            self.downcast()
                .expect("wrong instance type"),
        ) as &dyn Any);
        self.attributes.insert(name, attr);
    }
}

We have the same basic setup: we are asking the user to provide a getter, but this time it returns a reference to the attribute. We then cast this reference into a pointer and store it on the Instance. The entire instance is pinned: this method takes Pin<&mut Self>.

Now, instead of invoking the attribute getter, retrieving the attribute from the instance is as simple as instance.attributes.get(&name), but with an unsafe block if you want to dereference the pointer.

What's the payoff for this additional complexity and unsafety? Instead of creating and storing a simple closure accessor, we are creating and storing a pointer to the value itself. However, in coarse benchmarks, the pointer-based lookup didn't seem to be much of a win over the closure, relative to some of the other, more expensive operations we already have.
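
For the curious, here is a minimal standalone sketch of the pinned variant, following the same pattern as the self-referential example in the std pin docs. It is not the code we benchmarked: PinnedInstance is a hypothetical type that wraps a concrete Foo rather than a type-erased object, and it registers a single attribute in its constructor.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::ptr::NonNull;

struct Foo {
    x: u32,
}

// Hypothetical pinned wrapper: `attributes` points into `inner`.
struct PinnedInstance {
    inner: Foo,
    attributes: HashMap<&'static str, NonNull<dyn Any>>,
    _pin: PhantomPinned,
}

impl PinnedInstance {
    fn new(inner: Foo) -> Pin<Box<Self>> {
        let mut boxed = Box::pin(PinnedInstance {
            inner,
            attributes: HashMap::new(),
            _pin: PhantomPinned,
        });
        // Take a type-erased pointer to the field inside the pinned allocation.
        let x_ptr: NonNull<dyn Any> = NonNull::from(&boxed.inner.x as &dyn Any);
        // SAFETY: we only modify the map and never move the value out of the pin.
        unsafe {
            boxed.as_mut().get_unchecked_mut().attributes.insert("x", x_ptr);
        }
        boxed
    }

    fn get_attribute(&self, name: &str) -> Option<&dyn Any> {
        // SAFETY: the pointers refer to fields of `self.inner`, which cannot
        // move while the instance is pinned and lives as long as `self`.
        self.attributes.get(name).map(|ptr| unsafe { ptr.as_ref() })
    }
}

fn main() {
    let instance = PinnedInstance::new(Foo { x: 1 });
    let x = instance
        .get_attribute("x")
        .and_then(|value| value.downcast_ref::<u32>());
    println!("{:?}", x);
}
```

Note that the lookup still hands back a dyn Any, so a conversion into a Polar value would be needed on top of this.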

Compile-time policies

At this point in a project, when evaluating other options and thinking about how they would work in Rust, you start asking yourself questions like: "Why are we doing this?", and "What if Rust doesn't want us to do runtime dynamic classes?".

oso policies are meant to be treated as code. They live in your source code alongside your application code. So in theory, there's no need for us to defer type-checking and attribute lookups until runtime.

We could include a build step, or proc-macro, that takes in a Polar policy and outputs the needed Rust code. For example, given the following policy code:

x_is_one(foo: Foo) if foo.x = 1;

a proc-macro could parse the Polar code and output the necessary Rust code:

fn isa(self, instance: &Instance, class_tag: &str) -> bool {
    match class_tag {
        "Foo" => instance.is::<Foo>(),
        ...
    }
}

fn get_attribute(self, instance: &Instance, attribute: &str) {
    match self.instance_to_class[instance_id] {
        "Foo" => match attribute {
            "x" => instance.downcast::<Foo>().x,
            ...
        }
    }
}

It's not entirely clear just how much can be done at compile time without restricting Polar to exactly how Rust wants the world to behave. But it's an intriguing idea! And with crates out there like reflect for doing a sort of "compile-time reflection," perhaps the implementation isn't too far away either.
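
To give a feel for it, here is a hand-written, runnable approximation of what such generated code might look like for the one-rule policy above. This is speculative: no such build step exists in this post, the function signatures are our own, and Instance and PolarValue are simplified stand-ins.

```rust
use std::any::Any;

#[derive(Debug, PartialEq)]
enum PolarValue {
    Integer(i64),
}

struct Instance {
    inner: Box<dyn Any>,
}

struct Foo {
    x: u32,
}

// Hypothetical output of a build step that saw `foo: Foo` in the policy.
fn isa(instance: &Instance, class_tag: &str) -> bool {
    match class_tag {
        "Foo" => instance.inner.is::<Foo>(),
        _ => false,
    }
}

// Hypothetical output for the `foo.x` lookup: every (class, attribute) pair
// used by the policy becomes one statically typed match arm.
fn get_attribute(instance: &Instance, class_tag: &str, attribute: &str) -> Option<PolarValue> {
    match (class_tag, attribute) {
        ("Foo", "x") => instance
            .inner
            .downcast_ref::<Foo>()
            .map(|foo| PolarValue::Integer(i64::from(foo.x))),
        _ => None,
    }
}

fn main() {
    let instance = Instance { inner: Box::new(Foo { x: 1 }) };
    println!("{}", isa(&instance, "Foo"));
    println!("{:?}", get_attribute(&instance, "Foo", "x"));
}
```

A typo like foo.y in the policy could then surface as a compile error in the generated match arm, rather than as a failed lookup at runtime.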

Summary

After sailing through dynamic classes in Part 1, accessing attributes put up a bit more of a fight. However, with our attribute getter method in hand, we have something simple and flexible enough to handle most cases.

In Part 3 we'll look into dynamically calling methods! At first glance, these are similar to getting attributes, but we'll see how Rust's Fn traits don't quite make this as simple as we would hope.

  • Subscribe to our newsletter to get the next installment of this series.
  • Interested in learning more about oso and why we need the Polar language? Check out our docs.
  • If you have any feedback, or want to chat about Rust, come join us in Slack.