===============
Persistent data
===============

Introduction
============

The more-sophisticated device-mapper targets require complex metadata
that is managed in the kernel. In late 2010 we were seeing that various
different targets were rolling their own data structures, for example:

- Mikulas Patocka's multisnap implementation
- Heinz Mauelshagen's thin provisioning target
- Another btree-based caching target posted to dm-devel
- Another multi-snapshot target based on a design by Daniel Phillips

Maintaining these data structures takes a lot of work, so if possible
we'd like to reduce the number.

The persistent-data library is an attempt to provide a re-usable
framework for people who want to store metadata in device-mapper
targets. It's currently used by the thin-provisioning target and an
upcoming hierarchical storage target.

Overview
========

The main documentation is in the header files which can all be found
under drivers/md/persistent-data.

The block manager
-----------------

dm-block-manager.[hc]

This provides access to the data on disk in fixed-size blocks. There
is a read/write locking interface to prevent concurrent accesses and
to keep data that is being used in the cache.

Clients of persistent-data are unlikely to use this directly.

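To make the locking rule concrete, here is a minimal userspace sketch in
Python. It is a toy model and not the kernel interface: ``ToyBlockManager``,
its method names and the 4096-byte block size are all assumptions made for
illustration. A conflicting lock raises an error here, where a real
implementation would more likely make the caller wait.

```python
BLOCK_SIZE = 4096  # assumed block size for this sketch


class ToyBlockManager:
    """Userspace stand-in for a block manager; not the kernel API."""

    def __init__(self, nr_blocks):
        self.blocks = [bytearray(BLOCK_SIZE) for _ in range(nr_blocks)]
        self.readers = [0] * nr_blocks      # active read locks per block
        self.writer = [False] * nr_blocks   # True while a write lock is held

    def read_lock(self, b):
        # Any number of readers may share a block, but not with a writer.
        if self.writer[b]:
            raise RuntimeError("block %d is write-locked" % b)
        self.readers[b] += 1
        return bytes(self.blocks[b])        # read-only snapshot of the data

    def write_lock(self, b):
        # A writer needs exclusive access to the block.
        if self.writer[b] or self.readers[b]:
            raise RuntimeError("block %d is already locked" % b)
        self.writer[b] = True
        return self.blocks[b]               # mutable buffer

    def unlock(self, b):
        # Drop the write lock if held, otherwise drop one read lock.
        if self.writer[b]:
            self.writer[b] = False
        elif self.readers[b]:
            self.readers[b] -= 1
        else:
            raise RuntimeError("block %d is not locked" % b)
```

In this model a block can be shared by several readers or held by exactly
one writer, which is the usual meaning of a read/write lock.
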
The transaction manager
-----------------------

dm-transaction-manager.[hc]

This restricts access to blocks and enforces copy-on-write semantics.
The only way you can get hold of a writable block through the
transaction manager is by shadowing an existing block (i.e. doing
copy-on-write) or allocating a fresh one. Shadowing is elided within
the same transaction so performance is reasonable. The commit method
ensures that all data is flushed before it writes the superblock.
On power failure your metadata will be as it was when last committed.

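The shadowing and commit rules can be sketched with a small userspace
model in Python. This is a toy, not the kernel interface:
``ToyTransactionManager`` and all of its methods are hypothetical names, the
"disk" is a dictionary, and the allocator simply hands out increasing
block numbers.

```python
class ToyTransactionManager:
    """Userspace model of copy-on-write shadowing; not the kernel API."""

    def __init__(self):
        self.disk = {}           # block number -> committed bytes
        self.pending = {}        # blocks written in the open transaction
        self.fresh = set()       # blocks allocated or shadowed in this transaction
        self.next_free = 0       # toy allocator: never reuses block numbers
        self.superblock = None   # root recorded by the last commit

    def new_block(self):
        # A fresh block is writable because nothing committed refers to it.
        b = self.next_free
        self.next_free += 1
        self.pending[b] = bytearray()
        self.fresh.add(b)
        return b

    def shadow(self, b):
        # Shadowing a block created in this transaction is elided: reuse it.
        if b in self.fresh:
            return b
        copy = self.new_block()
        self.pending[copy] = bytearray(self.disk[b])
        return copy

    def write(self, b, data):
        # Committed blocks are never modified in place.
        if b not in self.fresh:
            raise PermissionError("block must be shadowed or newly allocated")
        self.pending[b] = bytearray(data)

    def commit(self, root):
        # Flush all data first, then publish the new root in the superblock.
        for b, data in self.pending.items():
            self.disk[b] = bytes(data)
        self.pending.clear()
        self.fresh.clear()
        self.superblock = root

    def crash(self):
        # Power failure: uncommitted blocks are lost, the superblock is unchanged.
        self.pending.clear()
        self.fresh.clear()
```

Because the old block is left untouched until a later commit points the
superblock at its replacement, calling ``crash()`` at any point in this model
leaves the state exactly as it was at the last ``commit()``.
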
The Space Maps
--------------

dm-space-map.h
dm-space-map-metadata.[hc]
dm-space-map-disk.[hc]

These are on-disk data structures that keep track of the reference
counts of blocks. They also act as the allocator of new blocks. There
are currently two implementations: a simpler one for managing blocks on
a different device (e.g. thinly-provisioned data blocks); and one for
managing the metadata space. The latter is complicated by the need to
store its own data within the space it's managing.

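As an illustration of how reference counting and allocation fit together,
here is a minimal in-memory sketch in Python. ``ToySpaceMap`` and its method
names are assumptions made for this example, not the kernel interface, and
the sketch ignores persistence and transactions entirely.

```python
class ToySpaceMap:
    """Userspace model of a reference-counting allocator; not the kernel API."""

    def __init__(self, nr_blocks):
        self.counts = [0] * nr_blocks   # reference count per block

    def new_block(self):
        # Allocate the first block whose reference count is zero.
        for b, count in enumerate(self.counts):
            if count == 0:
                self.counts[b] = 1
                return b
        raise MemoryError("no free blocks")

    def inc(self, b):
        # Another structure now shares this block.
        self.counts[b] += 1

    def dec(self, b):
        # Drop one reference; at zero the block becomes free for reuse.
        if self.counts[b] == 0:
            raise ValueError("block %d is not allocated" % b)
        self.counts[b] -= 1

    def get_count(self, b):
        return self.counts[b]
```

A block whose count is greater than one is shared, which is the case a
copy-on-write scheme has to detect before modifying it.
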
The data structures
-------------------

dm-btree.[hc]
dm-btree-remove.c
dm-btree-spine.c
dm-btree-internal.h

Currently there is only one data structure, a hierarchical btree.
There are plans to add more. For example, something with an
array-like interface would see a lot of use.

The btree is 'hierarchical' in that you can define it to be composed
of nested btrees, and take multiple keys. For example, the
thin-provisioning target uses a btree with two levels of nesting.
The first maps a device id to a mapping tree, and that in turn maps a
virtual block to a physical block.

Values stored in the btrees can have arbitrary size. Keys are always
64 bits, although nesting allows you to use multiple keys.

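The nesting can be pictured with a small Python sketch in which plain
dictionaries stand in for btrees. ``ToyNestedTree``, ``check_key`` and the
example numbers are hypothetical; only the shape (one 64-bit key per level,
a value at the last level) follows the description above.

```python
MAX_KEY = 2**64 - 1  # keys are 64-bit unsigned integers


def check_key(key):
    # Reject anything that would not fit in a 64-bit key.
    if not (isinstance(key, int) and 0 <= key <= MAX_KEY):
        raise ValueError("key must fit in 64 bits")
    return key


class ToyNestedTree:
    """Dict-based stand-in for a btree with a fixed number of nested levels."""

    def __init__(self, levels):
        self.levels = levels
        self.root = {}

    def insert(self, keys, value):
        # One 64-bit key per level; the last level holds the value.
        if len(keys) != self.levels:
            raise ValueError("expected %d keys" % self.levels)
        node = self.root
        for key in keys[:-1]:
            node = node.setdefault(check_key(key), {})
        node[check_key(keys[-1])] = value

    def lookup(self, keys):
        if len(keys) != self.levels:
            raise ValueError("expected %d keys" % self.levels)
        node = self.root
        for key in keys:
            node = node[check_key(key)]   # KeyError if the mapping is absent
        return node
```

With two levels, ``insert((dev_id, virtual_block), physical_block)`` mirrors
the thin-provisioning layout: the first key selects a device's mapping
tree and the second key selects a block within it.
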