BonsaiDb November update: Working towards alpha 1

This is the first devlog regarding BonsaiDb since September 24 – we have not been idle! While I wrote about a new example in October, it has been too long since I said much about what’s been going on with BonsaiDb.

What does “alpha” mean for BonsaiDb?

I began writing BonsaiDb in March of this year, and it has been a series of experiments along the way. With each step, I’ve grown more confident in what works well and what might need to change. I don’t want to begin advertising a stable version until I am ready to use it myself in “production.” (We have no customers, so production really doesn’t mean much for me except being always available.)

To me, that’s what the alpha/beta/release candidate cycle is all about. For BonsaiDb, alpha simply means it will no longer be considered experimental – I hope people will start toying with it in projects. But, being alpha software, I hope no one will be as stupid as me and try to run it in a production environment.

The alpha phase will be about defining the features and functionality. I have a milestone on GitHub where I’m gathering my hit list for 0.1.0. Each release target will have its own milestone – you can see the remaining issues for 0.1.0-alpha.1 here.

The beta phase will focus on testing, fixing/refactoring existing functionality, and rounding out rough edges. Finally, there will be a (hopefully short) release candidate cycle that focuses purely on bug fixes.

Is that too much work for a 0.1.0 release of a crate? Maybe. But, this approach has allowed me to be more free in breaking BonsaiDb often in an effort to refine its APIs.

Updates to BonsaiDb

Automatic TLS Certificates via Let’s Encrypt/ACME

We now have automated TLS certificate configuration through the ACME protocol. I chose the async-acme crate, which is one of the few crates that supports rustls bindings. We want BonsaiDb to have as few non-Rust dependencies as possible, and OpenSSL is a heavy non-Rust dependency.

Using it is as simple as enabling the server-acme feature if you’re using the bonsaidb crate, or the acme feature if you’re using bonsaidb-server. Doing this enables the AcmeConfiguration. This is the minimum setup required to have a WebSocket-powered server with TLS on port 443:

let server = Server::open(
    Path::new("acme-server-data.bonsaidb"),
    Configuration {
        server_name: String::from("khonsulabs.com"),
        acme: AcmeConfiguration {
            contact_email: Some(String::from("mailto:[email protected]")),
            ..Default::default()
        },
        ..Default::default()
    },
)
.await?;
server.listen_for_https_on("0.0.0.0:443").await

We use the TLS-ALPN-01 challenge type, which is performed on port 443 alongside the normal TLS traffic on that port. Additionally, the preferred QUIC-based connection will automatically utilize the same TLS certificate.
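To illustrate how a challenge can share port 443 with regular traffic: in TLS-ALPN-01 (RFC 8737), the validating ACME server connects offering the single ALPN protocol name acme-tls/1, so a listener can tell the two kinds of handshake apart before serving anything. The following is a minimal, standard-library-only sketch of that decision. The Route enum and the route_handshake function are hypothetical names made up for illustration; this is not how BonsaiDb or async-acme are implemented internally.

```rust
/// ALPN protocol name reserved for TLS-ALPN-01 validation (RFC 8737).
const ACME_TLS_ALPN: &[u8] = b"acme-tls/1";

/// Hypothetical routing decision made once the client's ALPN offer is known.
#[derive(Debug, PartialEq)]
enum Route {
    /// Answer with the special self-signed challenge certificate.
    AcmeChallenge,
    /// Continue with the normal certificate and application traffic.
    Application,
}

/// Decides how to treat an incoming handshake based on the ALPN protocol
/// names the client offered. A validation connection offers exactly one
/// protocol, `acme-tls/1`; anything else is treated as regular traffic.
fn route_handshake(offered_alpn: &[&[u8]]) -> Route {
    if offered_alpn.len() == 1 && offered_alpn[0] == ACME_TLS_ALPN {
        Route::AcmeChallenge
    } else {
        Route::Application
    }
}
```

A browser offering h2 and http/1.1 would be routed to Route::Application, while a validation request from the certificate authority would be routed to Route::AcmeChallenge, all on the same port.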

HTTP connection delegation

What if you could extend the same HTTP port that WebSockets and ACME utilize to serve your application’s website or REST API? With HTTP connection delegation through the Backend trait, you can (in theory) integrate any hyper-compatible HTTP framework.

I wrote an example showing how to use the Axum crate, which also shows how upgrading the WebSocket connection works:

#[async_trait]
impl Backend for AxumBackend {
    async fn handle_http_connection<
        S: tokio::io::AsyncRead + tokio::io::AsyncWrite + Unpin + Send + 'static,
    >(
        connection: S,
        peer_address: std::net::SocketAddr,
        server: &CustomServer<Self>,
    ) -> Result<(), S> {
        let server = server.clone();
        let app = Router::new()
            .route("/", get(uptime_handler))
            .route("/ws", get(upgrade_websocket))
            // Attach the server and the remote address as extractable data for the /ws route
            .layer(AddExtensionLayer::new(server))
            .layer(AddExtensionLayer::new(peer_address));

        if let Err(err) = Http::new()
            .serve_connection(connection, app)
            .with_upgrades()
            .await
        {
            eprintln!("[http] error serving {}: {:?}", peer_address, err);
        }

        Ok(())
    }
    // ...
}

Store associated data on a per-connection basis

When writing a server with a CustomApi, you might find a situation where you want to store some state for a ConnectedClient. Now, you can use client_data()/set_client_data() to retrieve and store associated data. The type is driven by the ClientData associated type on the Backend trait.
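The pattern can be sketched with only the standard library. Everything below is a hypothetical stand-in rather than BonsaiDb’s actual types: only the method names client_data() and set_client_data() come from the description above, and their real signatures may well differ. The Session type and handle_request function are invented for the example.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical per-connection state an application might track.
#[derive(Debug, Clone, PartialEq)]
struct Session {
    username: String,
    requests_served: u32,
}

/// Stand-in for a connected client handle. The type parameter `D` plays
/// the role of the `ClientData` associated type.
#[derive(Clone)]
struct ConnectedClient<D> {
    data: Arc<Mutex<Option<D>>>,
}

impl<D: Clone> ConnectedClient<D> {
    fn new() -> Self {
        Self {
            data: Arc::new(Mutex::new(None)),
        }
    }

    /// Returns a copy of the data currently associated with this connection.
    fn client_data(&self) -> Option<D> {
        self.data.lock().unwrap().clone()
    }

    /// Replaces the data associated with this connection.
    fn set_client_data(&self, data: D) {
        *self.data.lock().unwrap() = Some(data);
    }
}

/// Simulates handling one request: creates the session on first use,
/// increments its counter, and stores it back on the connection.
/// Note: this read-then-write sequence is not atomic across threads.
fn handle_request(client: &ConnectedClient<Session>, username: &str) -> u32 {
    let mut session = client.client_data().unwrap_or_else(|| Session {
        username: username.to_string(),
        requests_served: 0,
    });
    session.requests_served += 1;
    let served = session.requests_served;
    client.set_client_data(session);
    served
}
```

Because the handle is cheap to clone and shares its state, every clone of the same ConnectedClient in this sketch observes the same Session.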

Full at-rest encryption

When I originally announced at-rest encryption… What’s that? I never announced at-rest encryption? Oh, so no one knows that it had some limitations due to not being able to support encryption in the storage layer? So… umm… I implemented at-rest encryption. It doesn’t have any weird limitations.

More seriously though, since I didn’t announce it before, I want to talk about it briefly. The design hasn’t changed significantly from the original thread.

BonsaiDb maintains a current “master” key as well as a list of all previous master keys. Each Collection can define whether it should be encrypted. Additionally, when opening BonsaiDb a default_encryption_key can be specified to force everything to be stored encrypted.

Encryption is only useful if the keys can be stored securely. I’ve defined a VaultKeyStorage trait that allows implementing storage mechanisms for a key that is used to encrypt the master keys – known as the “vault key”. I’ve also provided an S3-compatible implementation allowing for a diverse range of inexpensive, secure, and reliable options for storing vault keys.

Keeping the vault key remote ensures that you remain in full control of your encrypted data, even if a hard drive containing your data is stolen (or purchased on eBay after old hardware was decommissioned).
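The key hierarchy described above can be sketched as follows. This is an illustration under stated assumptions, not BonsaiDb’s implementation: the Vault and EncryptedPayload types are hypothetical, tagging each payload with the ID of the master key that encrypted it is an assumption about how old data stays readable, and toy_cipher is a deliberately insecure XOR placeholder standing in for a real cipher so the example needs no dependencies.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a real cipher: XORs the data with a repeating key.
/// NOT secure. Applying it twice with the same (non-empty) key restores
/// the input, which is all this sketch needs.
fn toy_cipher(key: &[u8], data: &[u8]) -> Vec<u8> {
    data.iter()
        .zip(key.iter().cycle())
        .map(|(byte, key_byte)| byte ^ key_byte)
        .collect()
}

/// A stored payload remembers which master key encrypted it (assumption).
struct EncryptedPayload {
    key_id: u32,
    ciphertext: Vec<u8>,
}

/// Holds every master key, each stored encrypted with the vault key.
/// The vault key itself is never stored here; it lives somewhere remote.
struct Vault {
    wrapped_master_keys: BTreeMap<u32, Vec<u8>>,
}

impl Vault {
    fn new() -> Self {
        Self {
            wrapped_master_keys: BTreeMap::new(),
        }
    }

    /// Adds a new master key and makes it the current one. Older keys are
    /// kept so that existing data can still be decrypted.
    fn rotate(&mut self, vault_key: &[u8], new_master_key: &[u8]) -> u32 {
        let next_id = self
            .wrapped_master_keys
            .keys()
            .next_back()
            .map_or(1, |id| id + 1);
        self.wrapped_master_keys
            .insert(next_id, toy_cipher(vault_key, new_master_key));
        next_id
    }

    /// Encrypts with the current (highest-numbered) master key.
    fn encrypt(&self, vault_key: &[u8], plaintext: &[u8]) -> Option<EncryptedPayload> {
        let (key_id, wrapped) = self.wrapped_master_keys.iter().next_back()?;
        let master_key = toy_cipher(vault_key, wrapped);
        Some(EncryptedPayload {
            key_id: *key_id,
            ciphertext: toy_cipher(&master_key, plaintext),
        })
    }

    /// Decrypts with whichever master key the payload was written with.
    fn decrypt(&self, vault_key: &[u8], payload: &EncryptedPayload) -> Option<Vec<u8>> {
        let wrapped = self.wrapped_master_keys.get(&payload.key_id)?;
        let master_key = toy_cipher(vault_key, wrapped);
        Some(toy_cipher(&master_key, &payload.ciphertext))
    }
}
```

In this sketch, someone who obtains the disk contents holds only wrapped master keys and ciphertext; without the remotely stored vault key, neither can be unwrapped. A real cipher would also reject a wrong key outright instead of producing garbage, which the XOR placeholder cannot do.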

Authenticated connection permissions configuration

A new configuration option for the server is available: authenticated_permissions. This set of permissions will be applied automatically once a user has authenticated as any known user in the database.

Multiple built-in serialization formats

As part of adopting our own serialization format, Pot, we also made it easy to set which serialization format to use on a per-collection basis. Remember, if you ever wish to skip the built-in serialization, you can always interact with the raw bytes on the Document directly.

Currently, the three additional formats available through optional features are: Bincode, CBOR, and JSON.

The road to alpha 1

Much of what I’ve written above is what I’ve been busy working on. @dAxpeDDa has also been busy working on custodian-password (and its upstream dependencies), and our transport layer fabruic. The main “blocker” is wanting to wait until we have all dependencies updated to rustls 0.20. With the recent update to quinn, this is now possible.

The other “blocker” is project management. I’ve been spending a bit more time each day trying to flesh out existing issues and write new issues. I want to announce a little bit of a roadmap with the first alpha, and hopefully attract an additional contributor or two. If you are interested in contributing, we’d love to hear from you on GitHub, our Discord, or on these forums.

I’m getting really excited at what BonsaiDb is able to do. I hope you are too!