Normalised Caching Docs (#1895)

* wip

* more docs

* docs

* more docs
This commit is contained in:
Oscar Beaumont 2024-04-17 17:20:46 +08:00 committed by GitHub
parent 4fc7ba6275
commit 533d91a215
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
2 changed files with 234 additions and 0 deletions

View file

@ -0,0 +1,143 @@
---
title: Normalised Cache
index: 12
---
# Normalised Cache
We use a normalised cache for our frontend to ensure the UI will always contain consistent data.
## What this system does?
By normalising the data it's impossible to get "state tearing".
Each time `useNodes` or `cache.withNodes` is called all `useCache` hooks will reexecute if they depend on a node that has changed.
This means the queries will always render the newest version of the model.
## Terminology
- `CacheNode`: A node in the cache - this contains the data and can be identified by the model's name and unique ID within the data (eg. database primary key).
- `Reference<T>`: A reference to a node in the cache - This contains the model's name and unique ID.
## High level overview
We turn the data on the backend into a list of `CacheNode`'s and a list of `Reference<T>`'s and then return it to the frontend.
We insert the `CacheNode`'s into a global cache on the frontend and then use the `Reference<T>`'s to reconstruct the data by looking up the `CacheNode`'s.
When the cache changes (from another query, invalidation, etc), we can reconstruct *all* queries using their `Reference<T>`'s to reflect the updated data.
## Rust usage
The Rust helpers are defined [here](https://github.com/spacedriveapp/spacedrive/blob/main/crates/cache/src/lib.rs) and can be used like the following:
```rust
pub struct Demo {
id: String,
}
impl sd_cache::Model for Demo {
// The name + the ID *must* refer to a unique node.
// If your using an enum, the variant should show up in the ID (although this isn't possible right now)
fn name() -> &'static str {
"Demo"
}
}
let data: Vec<Demo> = vec![];
// We normalised the data but splitting it into a group of reference and a group of `CacheNode`'s.
let (nodes, items) = libraries.normalise(|i| i.id);
// `NormalisedResults` or `NormalisedResult` are optional wrapper types to hold a one or multiple items and their cache nodes.
// You don't have to use them, but they save declaring a bunch of identical structs.
//
// Alternatively add `nodes: Vec<CacheNode>` and `items: Vec<Reference<T>>` to your existing return type.
//
return sd_cache::NormalisedResults { nodes, items };
```
## Typescript usage
The Typescript helpers are defined [here](https://github.com/spacedriveapp/spacedrive/blob/main/packages/client/src/cache.tsx).
### Usage with React
We have helpers designed for easy usage within React's lifecycle.
```ts
const query = useLibraryQuery([...]);
// This will inject all the models into the cache
useNodes(query.data?.nodes);
// This will reconstruct the data from the cache
const data = useCache(query.data?.item);
console.log(data);
```
### Vanilla JS
These API's are really useful for special cases. In general aim to use the React API's unless you have a good reason for these.
```ts
const cache = useNormalisedCache(); // Get the cache within the react context
// Pass `cache` outside React (Eg. `useEffect`, `onSuccess`, etc)
const data = ...;
// This will inject all the models into the cache
cache.withNodes(data.nodes)
// This will reconstruct the data from the cache
//
// *WARNING* This is not reactive. So any changes to the nodes will not be reflected.
// Using this is fine if you need to quickly check the data but don't hold onto it.
const data = useCache(query.data?.item);
console.log(data);
```
## Design decisions
### Why `useNodes` and `useCache`?
This was done to make the system super flexible with what data you can return from your backend.
For example the backend doesn't just have to return `NormalisedResults` or `NormalisedResult`, it could return:
```rust
pub struct AllTheData {
file_paths: Vec<Reference<FilePath>>,
locations: Vec<Reference<Location>>,
nodes: Vec<CacheNode>
}
```
and then on the frontend you could do the following:
```ts
const query = useQuery([...]);
useNodes(query.data?.nodes);
const locations = useCache(query.data?.locations);
const filePaths = useCache(query.data?.file_paths);
```
This is only possible because `useNodes` and `useCache` take in a specific key, instead of the whole `data` object, so you can tell it where to look.
## Known issues
### Specta support
Expressing `Reference<T>` in Specta is really hard so we [surgically update](https://github.com/spacedriveapp/spacedrive/blob/a315dd632da8175b47f9e0713d3c7fc470329352/core/src/api/mod.rs#L219) it's [type definition](https://github.com/spacedriveapp/spacedrive/blob/a315dd632da8175b47f9e0713d3c7fc470329352/crates/cache/src/lib.rs#L215).
This is done using `rspc::Router::sd_patch_types_dangerously` which is a method specific to our fork [spacedrive/rspc](https://github.com/spacedriveapp/rspc).
### Invalidation system integration
The initial implementation of this idea with an MVP. It works with the existing invalidation system like regular queries, but the invalidation system isn't aware of the normalised cache like a better implementation would be.

View file

@ -0,0 +1,91 @@
---
title: rspc
index: 13
---
We use a fork based on [rspc 0.1.4](https://docs.rs/rspc) which contains heavy modifications from the upstream.
## What's different?
- A super pre-release version of rspc v1's procedure syntax.
- Upgrade to Specta v2 prelease
- Add `Router::sd_patch_types_dangerously`
- Expose internal type maps for the invalidation system.
- All procedures must return a result
- `Procedure::with2` which is a hack to properly support the middleware mapper API
- Legacy executor system - Will require major changes to the React Native link.
Removed features relied on by Spacedrive:
- Argument middleware mapper API has been removed upstream
## Basic usage
```rust
use rspc::{alpha::AlphaRouter};
use super::{Ctx, R};
pub(crate) fn mount() -> AlphaRouter<Ctx> {
R.router()
// Define a library query
.procedure("todo", {
R.with2(library())
.query(|(node, library), input: ()| async move {
Ok(todo!())
})
})
// Define a node query
.procedure("todo2", {
R.query(|node, input: ()| async move {
Ok(todo!())
})
})
// You can copy the above examples but use `.mutation` instead of `.query` for mutations
// Define a node subscription
.procedure("todo3", {
// You can make this a library subscription by using `R.with2(library())`
R.subscription(|node, _: ()| async move {
// You can return any `impl Stream`
Ok(async_stream::stream! {
yield "hello";
})
})
})
}
/// You merge this router into the main router defined in `core/src/api.rs`.
```
## Known bugs
### Returning an error from a procedure can cause a panic
This is due to a bug in the future that resolves requests. This was fixed [upstream in July 2023](https://github.com/oscartbeaumont/rspc/commit/f115ab22e04d59b0c9056989392215df2b7bb531).
A panic will also take out the entire request which could contain a batch of multiple procedures.
### Blocking
### Batching
This applies to both Tauri and Axum when using the batch link (**which Spacedrive uses**). Each request within a batch is effectively run in serial. This may or may not make a major difference as the response can't be send until all items are ready but having to run them in parallel would be faster regardless.
### Tauri
Minus batching everything is run in parallel.
### Axum
All queries and mutations run within a single websocket connection (**which Spacedrive uses**) are run in serial.
Minus batching HTTP requests are run in parallel.
### Websocket reconnect
If the websocket connection is dropped (due to network disruption) all subscriptions *will not* restart upon reconnecting.
This will cause the invalidation system to break and potentially other parts of the app that rely on subscriptions.
Queries and mutations done during the network disruption will hang indefinitely.