# Parlens (working title)
Parlens is an exploration of a protocol for communicating data and data transformations (lenses) with other parties. It can be thought of as an alternative to HTTP: one that aims to be more concise and flexible, flattening many of the layers we've added on top of it over the past decades.
To support this, it'll offer:
- A framework for describing schemas, lenses and evolutions;
- A data serialization format;
- Infrastructure for sharing schemas and lenses.
The main objective is to allow developers to connect services without tight coupling of schemas: service A should be able to send information needed by service B even if it's not in the schema expected by B, as long as there's a path to transform that information (through lenses) into the schema. Schemas and lenses don't need to be known ahead of time: the protocol lets services discover new items during communication, either from the other party or using a distributed system similar to DNS. Each item carries signatures, so services can filter which items can be discovered based on their trusted sources.
Lenses are pure WebAssembly programs that receive data as input and output the same or similar data in another schema.
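To make the shape of a lens concrete, here's a minimal sketch of what one could look like if written in Rust and compiled to `wasm32-unknown-unknown`. This is an illustrative stub, not one of the actual lenses (those are written directly in WAT), and the calling convention is detailed later in this README:

```rust
// A lens is a pure function over WebAssembly linear memory: it receives the
// address of the input structure and returns the address of the output
// structure. Illustrative stub only -- a real lens would read the input
// struct, write the transformed struct, and return the result's address.
#[no_mangle]
pub extern "C" fn lens(input: *const u8) -> *const u8 {
    input // stub: a real lens returns the address of the output struct
}
```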
## The proof of concept
This repository contains a very rough proof of concept to illustrate how Parlens could work. Compared to the vision set out above, a lot in this proof of concept is hardcoded, makes simplifying assumptions, and/or isn't implemented, but it demonstrates two important things:
- The full path of data transformation works, even with indirections (e.g. C -> A -> B).
- It's still ergonomic for clients and servers.
The code is heavily commented to help readers understand what is hardcoded, what was simplified, what wasn't implemented (and where in the flow it would go), and how the whole flow works. The next subsection also gives a brief overview of what's implemented and what isn't.
For this proof of concept, I took a few simplified schemas from a service related to board games that I'm building.
As brief context: imagine we're building a service where players connect to play board games together. The service has a server that receives messages from players who want to connect to it, plus a backend worker that simulates an AI player by reading data from the database and connecting to the server. Both the user and the backend worker run a client that sends messages to the server.
- The client always sends data in SchemaA.
- The server always processes messages in SchemaB.
- The backend worker always sends data in SchemaC.
There are lenses to convert from SchemaA to SchemaB, and from SchemaC to SchemaA, which means that we need two transformations to go from SchemaC to SchemaB (SchemaC -> SchemaA -> SchemaB).
### Schemas
The schemas are written in some Rust-style pseudocode, but they should hopefully be understandable by anyone.
SchemaA:

```
{
    pending_room_id: String,
    game_server_room_id: String,
    join_token: String,
    boardgame_id: String,
}
```

SchemaB:

```
{
    game_server_room_id: String,
    join_token: String,
}
```

SchemaC:

```
{
    pending_room_id: String,
    assigned_game_server_id: u32,
    user_id: String,
    user_secret: String,
    boardgame_id: String,
}
```
The code that transforms between schemas is in the `lenses` directory.

The transformation from SchemaA to SchemaB is straightforward (just selecting some fields), but the transformation from SchemaC to SchemaA involves some processing: selecting a server id through its index in a list of existing server ids, and generating a join token (which in this proof of concept is just the user's id and the user's secret joined by a '_' character).
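As a rough sketch, here's that logic written as plain Rust functions. The real lenses are WebAssembly programs operating on the C representation described below, and the hardcoded server-id list and exact field mapping here are my reading of the description, so treat them as illustrative:

```rust
struct SchemaA {
    pending_room_id: String,
    game_server_room_id: String,
    join_token: String,
    boardgame_id: String,
}

struct SchemaB {
    game_server_room_id: String,
    join_token: String,
}

struct SchemaC {
    pending_room_id: String,
    assigned_game_server_id: u32,
    user_id: String,
    user_secret: String,
    boardgame_id: String,
}

// SchemaA -> SchemaB: just select the two fields SchemaB cares about.
fn a_to_b(a: SchemaA) -> SchemaB {
    SchemaB {
        game_server_room_id: a.game_server_room_id,
        join_token: a.join_token,
    }
}

// SchemaC -> SchemaA: resolve the assigned server through its index in a
// list of existing server ids, and build the join token by joining the
// user's id and secret with '_'.
fn c_to_a(c: SchemaC, server_ids: &[String]) -> SchemaA {
    SchemaA {
        pending_room_id: c.pending_room_id,
        // Assumption: the looked-up server id becomes the room id field.
        game_server_room_id: server_ids[c.assigned_game_server_id as usize].clone(),
        join_token: format!("{}_{}", c.user_id, c.user_secret),
        boardgame_id: c.boardgame_id,
    }
}
```

Going from SchemaC to SchemaB is then just the composition `a_to_b(c_to_a(c, &server_ids))`.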
### What's implemented
Based on the vision, here's what the proof of concept implements (and how):
- A framework for describing schemas, lenses and evolutions
  - Schemas are "described" directly in Rust, since this proof of concept is written purely in Rust. This is nowhere close to the vision, which wants a "universal" way of describing schemas (similar to e.g. Protobuf's definitions, but very likely not in a textual format).
  - Evolutions are not implemented at all. This proof of concept isn't concerned with schema evolution or how clients and servers would deal with data in different schema versions.
  - Lenses are described directly in WAT (WebAssembly Text format), but the information about which schemas a lens uses as input and output is hardcoded. The vision is a way of describing lenses that is "self-contained", meaning the input and output schemas (and any other relevant information) are part of the lens itself.
- A data serialization format
  - This proof of concept uses a data structure's C representation as the serialization format. The vision is something similar to Protobuf, FlatBuffers, and others. That's a huge amount of work, though, so the C representation is a placeholder for all of it.
- Infrastructure for sharing schemas and lenses
  - There is no DNS-like system, as the vision intends. However, the proof of concept does show the "peer to peer" part of the infrastructure: the server learns about a new lens (SchemaC to SchemaA) from a client (the backend worker) that wants to send data in a schema unknown to the server (SchemaC). Even this demonstration is very rough: there are no signatures or anything like that, so the server just blindly trusts that the lens does what the client says it does. A hypothetical sketch of this exchange follows this list.
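To make that peer-to-peer exchange concrete, here's a hypothetical sketch of the messages involved. The names and shapes are illustrative, not the actual protocol:

```rust
// Illustrative only: what client and server could exchange when the server
// receives data in a schema it doesn't know.
enum Message {
    // Client sends a payload, tagged with the id of the schema it's in.
    Data { schema_id: u64, bytes: Vec<u8> },
    // Server replies that it doesn't know that schema (or has no lens path
    // from it to its own schema).
    UnknownSchema { schema_id: u64 },
    // Client shares a lens: a WebAssembly program converting `from` -> `to`.
    // In the full vision this would carry signatures; in the proof of
    // concept the server blindly trusts it.
    Lens { from: u64, to: u64, wasm: Vec<u8> },
}
```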
### Reading the code
Things are heavily commented, so you should still be able to follow and roughly understand what's going on even if you don't know Rust.

Start with `src/main.rs`, which shows how ergonomic it is for clients to send data to the server. From there, the code points to other places you can continue reading. The code for each lens is in the `lenses` directory, and you can read those as well; the `lib.rs` files are very short. If you're curious, the WAT file for each lens is also committed in this repository, so you don't need to run the code to see what they look like.
## The relevant bits of WebAssembly for this proof of concept
WebAssembly is the "portable code" layer that lets clients and servers share lenses and run them to convert data. Since WebAssembly is tightly sandboxed by design, it fits this use case very well: lenses can't affect the system they're running on, change things on disk, talk to the Internet, or do any other undesirable things. They're pretty much pure functions by default.
The way this works is that the host code hands the WebAssembly engine the programs it wants to run. These are similar to system binaries, except that:

- They use WebAssembly's instruction set (not x64, ARM, RISC-V, or any other hardware architecture).
- They use WebAssembly's memory model, which is a contiguous array of bytes addressed starting at 0. This means that memory address 0 is valid and just like any other.
- They can export values to whoever is running the program, and WebAssembly engines usually only let the host read those exported values. An example is the heap base pointer, as you'll see below.
Because of this memory model, programs usually decide at compile time how much stack space they want (Rust appears to default to a 1MiB stack), and they reserve the first bytes of memory for the stack. After that, they place whatever compile-time data they need, and the rest is left for the heap.

This means that in these programs, the stack grows downward from e.g. 1048576 (1MiB) toward 0, and the heap grows upward from X, where X is the stack size plus the size of all compile-time data the program uses. For example, the lens that converts from SchemaA to SchemaB is very simple and needs no compile-time data, so the stack starts at 1MiB (and grows down toward 0), and the heap also starts at 1MiB (and grows upward). The lens for SchemaC to SchemaA is more complex: the stack still starts at 1MiB, but the heap starts at 1050160 (~1.00151MiB). There's about 1.5KiB of data included in the program, sitting between the base of the stack and the base of the heap.
```
          stack        heap
          base         base
            |            |
            v            v
0         1MiB        (1+X)MiB
 ________________________________________
|   stack    | compile-   |   heap
|  <---------| time data  |--------> ...
|____________|____________|______________
             |<-- X MiB ->|
```
When calling WebAssembly code that takes input (i.e. any lens), that input needs to live somewhere in this memory. The stack is an area reserved for the WebAssembly program to do with as it pleases, so programs usually don't even export the base of the stack to their users. They do export the base of the heap, though, so this proof of concept simply writes the input data to the heap.
The WebAssembly programs created in this proof of concept all follow the "C ABI", meaning they expect inputs and outputs in the same format that equivalent C code would use after being compiled (to be honest, it's more like an architecture-specific ABI). In the case of structured data (which is what this proof of concept uses), the input is a single pointer to the memory address where the structure is laid out, and the output is the memory address where the lens wrote the resulting structure.

The structure itself is also represented as C would represent it. String slices (which the proof of concept uses) are a pointer to where the text is in memory followed by the length of the text, so the proof of concept code always has to write the strings to the heap first, and then write the structure, with a pointer to each string followed by its length.

The output is likewise a single pointer to a structure; the proof of concept reads that part of memory to find the pointers and lengths of the actual strings, and reconstructs all of it in Rust.
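In Rust terms, the C representation of SchemaB that gets written to (and read from) wasm memory looks roughly like this. This is a sketch: the field names are mine, and pointers are 32 bits wide on wasm32:

```rust
// How SchemaB is laid out in WebAssembly linear memory: each string slice
// is a (pointer, length) pair, and the struct is just those pairs in order.
#[repr(C)]
struct SchemaBWire {
    game_server_room_id_ptr: u32, // address of the string bytes in wasm memory
    game_server_room_id_len: u32, // length of that string in bytes
    join_token_ptr: u32,
    join_token_len: u32,
}
```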
As an example, given this structure:
```
{
    game_server_room_id: "room_id_1",
    join_token: "join_token_1"
}
```
This is how the proof of concept writes it to WebAssembly memory:
```
heap                   struct
base                   address
 |                      |
 v                      v
 ________________________________________
|                     |   |  |   |  |
|room_id_1join_token_1|PTR| 9|PTR|12| ...
|_____________________|___|__|___|__|____
 ^        ^             |      |
 └--------|-------------┘      |
          └--------------------┘
```
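As a minimal sketch of how a host can do this write, assuming the `wasmtime` crate and a lens module that exports its memory and a `__heap_base` global (the file path and export names are assumptions, and this isn't the exact code in `src/main.rs`):

```rust
use wasmtime::{Engine, Instance, Module, Store};

fn main() -> wasmtime::Result<()> {
    let engine = Engine::default();
    // Hypothetical path; any lens module with the right exports would do.
    let module = Module::from_file(&engine, "lenses/schema_a_to_b.wat")?;
    let mut store = Store::new(&engine, ());
    let instance = Instance::new(&mut store, &module, &[])?;

    let memory = instance.get_memory(&mut store, "memory").unwrap();
    // Read the heap base the program exports (assumed global name).
    let heap_base = instance
        .get_global(&mut store, "__heap_base")
        .expect("lens should export its heap base")
        .get(&mut store)
        .i32()
        .unwrap() as usize;

    // Write the string bytes first, back to back, starting at the heap base...
    let (s1, s2) = (b"room_id_1".as_slice(), b"join_token_1".as_slice());
    memory.write(&mut store, heap_base, s1)?;
    memory.write(&mut store, heap_base + s1.len(), s2)?;

    // ...then the struct itself: a (pointer, length) pair per string slice.
    // Pointers are 32 bits wide on wasm32, hence u32 little-endian.
    let struct_addr = heap_base + s1.len() + s2.len();
    let mut wire = Vec::new();
    wire.extend((heap_base as u32).to_le_bytes()); // ptr to "room_id_1"
    wire.extend((s1.len() as u32).to_le_bytes()); // 9
    wire.extend(((heap_base + s1.len()) as u32).to_le_bytes()); // ptr to "join_token_1"
    wire.extend((s2.len() as u32).to_le_bytes()); // 12
    memory.write(&mut store, struct_addr, &wire)?;

    // `struct_addr` is what would be passed to the lens as its input pointer.
    Ok(())
}
```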
With that, you should be able to follow what the code is doing and understand why.
## Running the code
You'll need Nix with flakes enabled. Then, enter a development shell for this flake (or use direnv, whatever you prefer).
Then run:
```sh
just build_lenses
cargo run
```
## Why?
Two main motivations:
- I needed a memory model for the hardware (and OS) I'm creating, and I decided to flatten a lot of layers and show that it is possible for machines to communicate in a much more "raw" form, while retaining the flexibility that our current abstraction layers provide, and even improving on it!
- I think that, as software engineers, we rely WAY too much on (plain)text, and we waste so much effort parsing things everywhere. We shouldn't be constraining how machines communicate because of our incompetence in creating better tools. This is my contribution to show that we can do much better than the status quo.
## But what about X that is supported in HTTP? How will this do it?
Feel free to directly ask me these questions, but the general answer for anything like this is the following.
HTTP transmits A LOT of information in plaintext (or a packed version of it, but still closer to plaintext than anything else), but it's just that: information. Examples are the method used (GET, POST, etc.), the path, and the headers.
One of the issues is that this forces the sender to serialize structured data down to plaintext, and the receiver to parse the plaintext back into structured data. Wouldn't it be easier if we could transmit the structured data as-is and skip the parsing work (while still doing at least some validation)? The same is true of request bodies, which are commonly JSON (yet another serialization of structured data into plaintext).
### So how will it do auth? What about encryption? What about forms? What about X?
The general answer is that these all just become fields in the data structure the client sends to the server. Note that data structures can be hierarchical (i.e. a data structure can wrap another), but they can also be composed (like HTTP headers are): you can do whatever you want with them, including wrapping and unwrapping them in separate layers.
Parlens shows how this can scale: you don't need to add support for specific data structures in your code. You just define the schema of the data structure you want, and Parlens converts whatever the client is sending into that schema (provided there are lenses to do so).
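As a tiny hypothetical example of what "auth as a field" could look like (the schema names and fields are made up for illustration):

```rust
// Instead of an Authorization header plus a JSON body, the client sends one
// structured value whose schema the server declared.
struct AuthenticatedRequest {
    auth_token: String,  // what an Authorization header would carry
    payload: CreateRoom, // the actual request, itself structured data
}

struct CreateRoom {
    boardgame_id: String,
    max_players: u32,
}
```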