Ditto - Blog - Introducing Safer FFI

When we started to build Ditto as a cross-platform SDK, we understood that it was untenable to create a specific port for each popular programming language. Instead, we opted to build the vast majority of shared code in Rust. Rust bought us a lot: code that is easier to read, high performance, and a modern build system and package manager. After the common code was built in Rust, we would then expose the primary APIs over a foreign function interface (FFI). For higher-level programming languages like Swift, Java, and C#, we would create ergonomic bindings to Rust by calling through the FFI's C headers.

The Ugly of Rust FFI

While this FFI strategy sounded straightforward on paper, in practice it didn't turn out so pretty. Our initial efforts primarily revolved around editing a single C-FFI layer. We knew that C was the lingua franca of cross-language communication and initially assumed this single layer would be the long-term solution to writing FFI code. However, the FFI layer we created began to grow unnaturally beyond its initial purpose. Because we were a rapidly iterating startup, non-trivial logic began to creep inside this FFI layer to compensate for the rising idiosyncrasies and specific needs of each platform. As more logic crept in, we began to write more and more unit tests to guarantee the stability of a layer that was only meant for passing functions and data around. As a side effect, we added more unit tests to the `#[no_mangle]` functions, which frequently juggled several C pointers.

You can probably see where this is going. As soon as we started aggressively using C pointers everywhere, the FFI layer began to drown under `unsafe` code that managed raw pointers, confusing `out` parameters, and uninitialized memory. This made the FFI layer increasingly challenging, confusing, and unpleasant to work with on a day-to-day basis. Most importantly, the plentiful usage of `unsafe` code fundamentally defeated one of the main reasons we chose Rust in the first place: its promise of safe memory management. Our growing `unsafe` code directly opened up opportunities for memory-related bugs to sneak in.

After noticing these major problems, we decided to fully invest our efforts in an FFI framework that maintained Rust's memory safety features while making the FFI far more pleasant to work with. `safer_ffi` is a framework that was built in-house specifically to tackle our challenges with building a cross-platform SDK with a Rust core. However, we're open sourcing it because we believe that Rust will play a major role in cross-platform development for many projects, and that `safer_ffi` will keep developers from repeating our mistakes and make the FFI process a lot easier.

How `::safer_ffi` improved our code at Ditto

Before diving into the explanation, here is a snippet of code that was transformed with `safer_ffi`.

Before:

#[no_mangle]
pub extern "C" fn ditto_collections_next(collections: *mut CCollectionNames) -> c_int {
    let iterator: &mut CollectionNamesIterator =
        unsafe { &mut *((*collections).iterator as *mut _) };
    if unsafe { (*collections).name } != std::ptr::null_mut() {
        drop(unsafe { CString::from_raw((*collections).name) });
    }
    let result = match iterator.next() {
        Some(s) => {
            unsafe {
                (*collections).name = CString::new(s).unwrap().into_raw();
            }
            1
        }
        None => {
            unsafe {
                (*collections).name = std::ptr::null_mut();
                drop(Box::from_raw(iterator));
                drop(Box::from_raw(collections));
            }
            0
        }
    };
    result
}

After:

#[ffi_export]
pub fn ditto_collections_next<'__, 'txn : '__>(
    collections: &'__ mut CCollectionNames<'txn>,
) -> c_int {
    if let Some(s) = collections.iterator.as_mut().unwrap().it.next() {
        collections.name = Some(s.try_into().unwrap());
        1
    } else {
        collections.name = None;
        drop(collections.iterator.take());
        0
    }
}

As you can see, the after version is dramatically more readable. Most importantly, it has also removed the need for `unsafe` code blocks and relies on `&mut` instead of the leakier `*mut` parameter. This is only one snippet of a very large FFI effort at Ditto.
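To make the shape of the safe version concrete, here is a minimal, std-only sketch of the same "advance the iterator, keep the current name" logic. `CollectionNames` and `collections_next` are hypothetical stand-ins rather than Ditto's actual types, and the sketch leaves out `#[ffi_export]` so that it compiles without the `safer_ffi` crate. Unlike the snippet above, it returns `0` instead of panicking when the iterator has already been released.

```rust
use std::ffi::CString;

/// Hypothetical stand-in for `CCollectionNames`. Both fields are `Option`s,
/// so "no iterator" and "no current name" are ordinary values instead of
/// null pointers that have to be checked by hand.
pub struct CollectionNames {
    pub iterator: Option<Box<dyn Iterator<Item = String>>>,
    pub name: Option<CString>,
}

/// Advances the iterator. Returns 1 and stores the next name, or returns 0
/// and releases the iterator once it is exhausted (or already released).
pub fn collections_next(collections: &mut CollectionNames) -> i32 {
    let next = collections.iterator.as_mut().and_then(|it| it.next());
    match next {
        Some(s) => {
            // Assigning a new value drops the previous name automatically.
            collections.name = Some(CString::new(s).expect("name contains a NUL byte"));
            1
        }
        None => {
            collections.name = None;
            // Dropping the `Option` contents frees the boxed iterator.
            collections.iterator = None;
            0
        }
    }
}
```

Every free in the raw-pointer version (`CString::from_raw`, `Box::from_raw`) turns into an ordinary assignment here, which is why no `unsafe` block is left.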

Over our entire codebase, `safer_ffi` has reduced our `unsafe` code blocks from 282 instances to 48 instances. That is a whopping 83% decrease. You may be asking about the remaining `unsafe` code. At the moment, `safer_ffi` doesn't support Rust lifetimes in callback signatures. However, that feature is top priority for us and once it's introduced, we should have no more `unsafe` blocks.

Returning a `Vec` from Rust to C

One of the most unintuitive parts of writing FFI code in Rust is returning a `Vec` of data. With `safer_ffi`, all you need to do is use a type with a well-defined C layout: `repr_c::Vec`, which is like `Vec`, or `c_slice::Box`, which is like `Box<[T]>`. Here's an example straight out of Ditto's code base. This function, `ditto_document_cbor`, needed to serialize some data into CBOR and return a heap-allocated slice of bytes.

Before:

#[require_unsafe_in_body]
#[no_mangle]
pub
unsafe extern "C"
fn ditto_document_cbor (
    document: *const Document,
    out_cbor_len: *mut usize,
) -> *const c_uchar
{
    let value =
        unsafe { &*document }
            .to_value()
    ;
    let cbor_bytes: Box<[u8]> =
        ::serde_cbor::to_vec(&value)
            .unwrap()
            .into_boxed_slice()
    ;
    unsafe {
        *out_cbor_len = cbor_bytes.len();
    }
    Box::into_raw(cbor_bytes)
        as *const _
}

After:

// Bring types with a C layout (such as `c_slice::Box`) in scope
use ::safer_ffi::prelude::*;

#[ffi_export]
pub
fn ditto_document_cbor (
    document: &'_ Document,
) -> c_slice::Box<u8>
{
    let value = document.to_value();
    let cbor_bytes: Box<[u8]> =
        ::serde_cbor::to_vec(&value)
            .unwrap()
            .into_boxed_slice()
    ;
    cbor_bytes
        .into()
}

Just like before, we see major improvements:

No more `unsafe fn`
No more `unsafe { ... }` blocks of code
No more casting of an unannotated pointer

Furthermore, the ever-so-confusing `out` param, `out_cbor_len: *mut usize`, is also gone. We've noticed that functions that have both a return value and an out parameter are extremely prone to errors. For example, we may forget to write to `out_cbor_len` even though the return value already holds what we wanted. With `c_slice::Box` (and likewise `repr_c::Vec`), the pointer and its length travel together in a single return value, so there are no longer separate variables to keep in sync.
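The idea of returning the pointer and its length as one value can be sketched with nothing but the standard library. `OwnedBytes` below is a hypothetical type written for illustration only; it is not how `safer_ffi` defines its own slice types, and it assumes the buffer is always handed back to Rust to be freed.

```rust
/// Hypothetical `#[repr(C)]` pair of pointer and length. NOT the actual
/// definition used by `safer_ffi`; it only illustrates the idea.
#[repr(C)]
pub struct OwnedBytes {
    pub ptr: *mut u8,
    pub len: usize,
}

impl From<Box<[u8]>> for OwnedBytes {
    fn from(bytes: Box<[u8]>) -> Self {
        let len = bytes.len();
        // `Box::into_raw` gives up ownership and yields a fat pointer;
        // casting to `*mut u8` keeps only the data address.
        let ptr = Box::into_raw(bytes) as *mut u8;
        OwnedBytes { ptr, len }
    }
}

impl OwnedBytes {
    /// Rebuilds the `Box<[u8]>` so that Rust's allocator can free it.
    ///
    /// # Safety
    /// `self` must have been created by `OwnedBytes::from`, with `ptr` and
    /// `len` left unmodified, and must not be converted back more than once.
    pub unsafe fn into_boxed_slice(self) -> Box<[u8]> {
        unsafe { Box::from_raw(std::ptr::slice_from_raw_parts_mut(self.ptr, self.len)) }
    }
}
```

Because the length is set in the same place as the pointer, there is no code path that returns one without the other, which is exactly the mistake a separate out parameter invites.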

Note: You might have noticed that there is a slight optimization in the `After` version of `ditto_document_cbor`. We converted a `Vec<u8>` into a `Box<[u8]>`, which drops any unused extra capacity, so the returned value does not need a `capacity: usize` field. We can make the code even simpler by returning a `repr_c::Vec` directly. We just need to use it in the function signature and convert any standard `Vec` to the `repr_c::Vec` with `.into()`. For example:

// Bring types with a C layout (such as `repr_c::Vec`) in scope
use ::safer_ffi::prelude::*;

#[ffi_export]
fn ditto_document_cbor (
    document: &'_ Document,
) -> repr_c::Vec<u8>
{
    let vec: Vec<u8> =
        ::serde_cbor::to_vec(&document.to_value())
            .unwrap()
    ;
    vec.into()
}

#[ffi_export]
fn ditto_free_cbor (vec: repr_c::Vec<u8>)
{
    drop(vec);
}

This generates the C header:

/** \brief
 *  Same as [`Vec<T>`][`rust::Vec`], but with guaranteed `#[repr(C)]` layout
 */
typedef struct {
    uint8_t * ptr;

    size_t len;

    size_t cap;
} Vec_uint8_t;

Vec_uint8_t ditto_document_cbor (
    Document_t const * document);

void ditto_free_cbor (
    Vec_uint8_t vec);
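To see why the struct carries `cap` alongside `ptr` and `len`, here is a hedged, std-only sketch of the round trip that a `Vec` makes across the boundary. `RawVecU8`, `into_raw_vec`, and `from_raw_vec` are hypothetical names mirroring the header above; they are not `safer_ffi`'s implementation.

```rust
use std::mem::ManuallyDrop;

/// Hypothetical mirror of the `Vec_uint8_t` struct from the generated header.
#[repr(C)]
pub struct RawVecU8 {
    pub ptr: *mut u8,
    pub len: usize,
    pub cap: usize,
}

/// Hands the buffer over to the caller without freeing it.
pub fn into_raw_vec(vec: Vec<u8>) -> RawVecU8 {
    // `ManuallyDrop` stops Rust from freeing the buffer at the end of scope.
    let mut vec = ManuallyDrop::new(vec);
    RawVecU8 {
        ptr: vec.as_mut_ptr(),
        len: vec.len(),
        cap: vec.capacity(),
    }
}

/// Takes the buffer back so that it is freed when the returned `Vec` drops.
///
/// # Safety
/// `raw` must come from `into_raw_vec`, unmodified, and must be passed here
/// at most once.
pub unsafe fn from_raw_vec(raw: RawVecU8) -> Vec<u8> {
    // The allocator needs the original capacity, which is why `cap` must
    // travel with the pointer and the length.
    unsafe { Vec::from_raw_parts(raw.ptr, raw.len, raw.cap) }
}
```

A function like `ditto_free_cbor` plays the role of `from_raw_vec` followed by a drop: the C side never frees the buffer itself, it hands the whole struct back to Rust.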

How safer_ffi made our function signatures readable

Traditional Rust FFI often led us to abusing flat pointers all over our codebase. In turn, all the ownership, borrowing, and nullability semantics are lost. This made functions incredibly difficult for us to read unless we follow the code inside the function body. Without fully understanding the insides of the function, we were never sure how pointers were used.

With `safer_ffi`, we've made dramatic improvements to readability without requiring our team to worry about how pointers were used. For example:

Before:

/// Creates a new document from CBOR
///
/// It will allocate a new document and set `document` pointer to it. It will
/// later need to be released with `::ditto_document_free`.
///
/// The input `cbor` must be a valid CBOR.
///
/// Return codes:
///
/// * `0` -- success
/// * `1` -- invalid CBOR
/// * `2` -- cbor is not an object
/// * `3` -- ID string is empty
#[no_mangle] pub unsafe extern "C"
fn ditto_document_new_cbor (
    cbor_ptr: *const u8,
    cbor_len: usize,
    id: *const c_char,
    site_id: c_uint,
    document: *mut *mut Document,
) -> c_int

After:

/// Creates a new document from CBOR
///
/// It will allocate a new document and set `document` pointer to it. It will
/// later need to be released with `::ditto_document_free`.
///
/// The input `cbor` must be a valid CBOR.
///
/// Return codes:
///
/// * `0` -- success
/// * `1` -- invalid CBOR
/// * `2` -- cbor is not an object
/// * `3` -- ID string is empty
#[ffi_export]
fn ditto_document_new_cbor (
    cbor: c_slice::Ref<'_, u8>,
    id: Option<char_p::Ref<'_>>,
    site_id: c_uint,
    document: Out<'_, Option<repr_c::Box<Document>>>,
) -> c_int

Here is a summary of the improvements:

Now, developers can infer that `id` can be `NULL`;
`document` is now an `Out<T>` parameter, which is write-only since it is used to return an extra value. If a non-NULL pointer gets written to it, that pointer carries ownership: the caller is responsible for freeing it. This preserves Rust's ownership model while still working with generated C headers.
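The reason an `Option` around a reference or a `Box` can stand in for a nullable C pointer is that Rust guarantees `None` is represented as the null pointer for these types, so no extra tag is stored. The following std-only sketch checks that property; `option_pointers_are_thin` and `describe_id` are hypothetical helpers written for this illustration and are not part of `safer_ffi`.

```rust
use std::mem::size_of;

/// Returns true when `Option<&T>` and `Option<Box<T>>` are exactly one
/// pointer wide, i.e. `None` occupies the null-pointer slot.
pub fn option_pointers_are_thin<T>() -> bool {
    size_of::<Option<&T>>() == size_of::<*const T>()
        && size_of::<Option<Box<T>>>() == size_of::<*mut T>()
}

/// Hypothetical callee: the signature alone tells the reader that the id
/// may be absent, and the compiler forces both cases to be handled.
pub fn describe_id(id: Option<&u32>) -> String {
    match id {
        Some(value) => format!("id = {}", value),
        None => String::from("no id (NULL)"),
    }
}
```

With a bare `*const u32`, nothing in the signature says whether NULL is allowed and nothing forces the check; with `Option<&u32>`, both facts are visible in the type.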

You might have noticed that the `After` signature of `ditto_document_new_cbor` could be simplified even further. At the moment we are still working to support `Result` and other complex types. Once we do, the signature should look like the following:

#[ffi_export]
fn ditto_document_new_cbor (
    cbor: c_slice::Ref<'_, u8>,
    id: Option<char_p::Ref<'_>>,
    site_id: c_uint,
) -> repr_c::Result<repr_c::Box<Document>>

As we continue to develop and improve `safer_ffi` we are equally excited to see what the open-source community does with it. Rust has been a fantastic core component of our SDK and is running on multiple architectures, operating systems, and programming languages. We hope that your projects that require Rust interop see the major improvements to readability and code safety with `safer_ffi`.

safer_ffi Github Repo
