docs: tech overview: fix line length

Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht 2021-02-04 12:05:27 +01:00
parent 1531185dd0
commit 3253d8a2e4


@@ -6,35 +6,34 @@ Technical Overview
Datastores
----------
A Datastore is the logical place where :ref:`Backup Snapshots
<backup_snapshot>` and their chunks are stored. Snapshots consist of a
manifest, blobs, dynamic- and fixed-indexes (see :ref:`terminology`), and are
stored in the following directory structure:

  <datastore-root>/<type>/<id>/<time>/
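For example, a backup snapshot of a virtual machine with the (hypothetical)
ID 100 might then live under a path like:

  <datastore-root>/vm/100/2020-01-01T00:00:00Z/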
The deduplication of datastores is based on reusing chunks, which are
referenced by the indexes in a backup snapshot. This means that multiple
indexes can reference the same chunks, reducing the amount of space needed to
contain the data (even across backup snapshots).
Chunks
------
A chunk is some (possibly encrypted) data with a CRC-32 checksum at the end and
a type marker at the beginning. It is identified by the SHA-256 checksum of its
content.
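As a sketch, that layout could be modeled like this (the marker width is an
illustrative assumption, not the actual on-disk format):

.. code-block:: rust

    // Sketch of the chunk layout described above; the marker width is an
    // illustrative assumption, not the actual on-disk format.
    struct Chunk {
        type_marker: [u8; 8], // type marker at the beginning
        data: Vec<u8>,        // the (possibly encrypted) payload
        crc32: u32,           // CRC-32 checksum stored at the end
    }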
To generate such chunks, backup data is split either into fixed-size or
dynamically sized chunks. The same content will be hashed to the same checksum.
The chunks of a datastore are found in

  <datastore-root>/.chunks/
This chunk directory is further subdivided by the first four hexadecimal
digits of the chunk's checksum, so the chunk with the checksum

  a342e8151cbf439ce65f3df696b54c67a114982cc0aa751f2852c2f7acc19a8b
@@ -42,11 +41,11 @@ lives in

  <datastore-root>/.chunks/a342/
This is done to reduce the number of files per directory, as having many files
per directory can be bad for file system performance.
These chunk directories ('0000'-'ffff') will be preallocated when a datastore
is created.
Fixed-sized Chunks
^^^^^^^^^^^^^^^^^^
@@ -54,20 +53,20 @@ Fixed-sized Chunks
For block-based backups (like VMs), fixed-sized chunks are used. The content
(disk image) is split into chunks of the same length (typically 4 MiB).
This works very well for VM images, since the file system on the guest most
often tries to allocate files in contiguous pieces, so new files get new
blocks, and changing existing files changes only their own blocks.
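A minimal sketch of this splitting step (assuming the sha2 crate; a real
implementation streams the image instead of holding it in memory):

.. code-block:: rust

    use sha2::{Digest, Sha256};

    const CHUNK_SIZE: usize = 4 * 1024 * 1024; // typical fixed chunk size: 4 MiB

    // Split an image into fixed-size chunks and compute each chunk's
    // identifying SHA-256 digest. The last chunk may be shorter.
    fn fixed_chunk_digests(image: &[u8]) -> Vec<[u8; 32]> {
        image
            .chunks(CHUNK_SIZE)
            .map(|chunk| Sha256::digest(chunk).into())
            .collect()
    }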
As an optimization, VMs in `Proxmox VE`_ can make use of 'dirty bitmaps', which
can track the changed blocks of an image. Since these bitmaps are also a
representation of the image split into chunks, we have a direct relation
between dirty blocks of the image and chunks we have to upload, so only
modified chunks of the disk have to be uploaded for a backup.
Since we always split the image into chunks of the same size, unchanged blocks
will result in identical checksums for those chunks, so such chunks do not need
to be backed up again. This way storage snapshots are not needed to find the
changed blocks.
For consistency, `Proxmox VE`_ uses a QEMU internal snapshot mechanism that
does not rely on storage snapshots either.
@@ -75,40 +74,40 @@ does not rely on storage snapshots either.
Dynamically sized Chunks
^^^^^^^^^^^^^^^^^^^^^^^^
If one does not want to back up block-based systems but rather file-based
systems, using fixed-sized chunks is not a good idea, since every time a file
would change in size, the remaining data gets shifted around and this would
result in many chunks changing, reducing the amount of deduplication.
To improve this, `Proxmox Backup`_ Server uses dynamically sized chunks
instead. Instead of splitting an image into fixed sizes, it first generates a
consistent file archive (:ref:`pxar <pxar-format>`) and uses a rolling hash
over this on-the-fly generated archive to calculate chunk boundaries.
We use a variant of Buzhash, which is a cyclic polynomial algorithm. It works
by continuously calculating a checksum while iterating over the data, and
under certain conditions, it triggers a hash boundary.
Assuming that most files of the system that is to be backed up have not
changed, eventually the algorithm triggers the boundary on the same data as a
previous backup, resulting in chunks that can be reused.
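A toy version of such a chunker could look like the following. This is only a
sketch: the substitution table, window size, and boundary mask are made-up
values, and the minimum/maximum chunk size clamping a real chunker needs is
omitted.

.. code-block:: rust

    const WINDOW: usize = 64;                 // rolling window in bytes
    const BOUNDARY_MASK: u32 = (1 << 21) - 1; // ~2 MiB average chunk size

    // Illustrative substitution table: maps each byte to a pseudo-random
    // 32-bit value (a real implementation uses a fixed random table).
    fn table(b: u8) -> u32 {
        (b as u32 + 1).wrapping_mul(2654435761)
    }

    // Return the positions where the rolling hash triggers a boundary.
    fn chunk_boundaries(data: &[u8]) -> Vec<usize> {
        let mut boundaries = Vec::new();
        let mut hash: u32 = 0;
        for i in 0..data.len() {
            // Rotate the running hash and mix in the incoming byte ...
            hash = hash.rotate_left(1) ^ table(data[i]);
            // ... and cancel the contribution of the outgoing byte.
            if i >= WINDOW {
                hash ^= table(data[i - WINDOW]).rotate_left((WINDOW % 32) as u32);
            }
            // A boundary triggers when the low bits of the hash are zero.
            if i >= WINDOW && hash & BOUNDARY_MASK == 0 {
                boundaries.push(i + 1);
            }
        }
        boundaries
    }

Because the boundary decision depends only on the bytes inside the current
window, inserting data early in the archive shifts chunk boundaries only
locally; after the next boundary, the chunk stream realigns with that of the
previous backup.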
Encrypted Chunks
^^^^^^^^^^^^^^^^
Encrypted chunks are a special case. Both fixed- and dynamically sized chunks
can be encrypted, and they are handled in a slightly different manner than
normal chunks.
The hashes of encrypted chunks are calculated not with the actual (encrypted)
chunk content, but with the plaintext content concatenated with the encryption
key. This way, two chunks of the same data encrypted with different keys
generate two different checksums and no collisions occur for multiple
encryption keys.
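As a sketch of that idea (a simplified illustration, not necessarily the
server's exact construction):

.. code-block:: rust

    use sha2::{Digest, Sha256};

    // Digest over plaintext || key: equal plaintexts encrypted with
    // different keys get different chunk identifiers.
    fn encrypted_chunk_digest(plaintext: &[u8], key: &[u8]) -> [u8; 32] {
        let mut hasher = Sha256::new();
        hasher.update(plaintext);
        hasher.update(key);
        hasher.finalize().into()
    }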
This is done to speed up the client part of the backup, since it only needs to
encrypt chunks that are actually getting uploaded. Chunks that already exist in
the previous backup do not need to be encrypted and uploaded.
Caveats and Limitations
-----------------------
@@ -116,19 +115,20 @@ Caveats and Limitations
Notes on hash collisions
^^^^^^^^^^^^^^^^^^^^^^^^
Every hashing algorithm has a chance to produce collisions, meaning two (or
more) inputs generate the same checksum. For SHA-256, this chance is
negligible. To calculate the probability of such a collision, one can use the
ideas of the 'birthday problem' from probability theory. For big numbers, the
exact probability is infeasible to calculate with regular computers, but there
is a good approximation:
.. math::

    p(n, d) = 1 - e^{-n^2/(2d)}
Where `n` is the number of tries, and `d` is the number of possibilities. So
for example, if we assume a large datastore of 1 PiB, and an average chunk size
of 4 MiB, we have :math:`n = 268435456` tries, and :math:`d = 2^{256}`
possibilities. Using the above formula we get that the probability of a
collision in that scenario is:
@@ -136,31 +136,29 @@ collision in that scenario is:
3.1115 * 10^{-61}
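To see where this number comes from: with :math:`n = 2^{28}`, the exponent is
:math:`n^2/(2d) = 2^{56}/2^{257} = 2^{-201}`, and since
:math:`1 - e^{-x} \approx x` for small :math:`x`, the probability is roughly
:math:`2^{-201} \approx 3.1115 \cdot 10^{-61}`.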
For context, in a lottery game of 6 of 45, the chance to correctly guess all 6
numbers is only :math:`1.2277 * 10^{-7}`.
So it is extremely unlikely that such a collision would occur by accident in a
normal datastore.
Additionally, SHA-256 is prone to length extension attacks, but since there is
an upper limit on how big a chunk can be, this is not a problem: a potential
attacker cannot arbitrarily add content to the data beyond that limit.
File-based Backup
^^^^^^^^^^^^^^^^^
Since dynamically sized chunks (for file-based backups) are created on a custom
archive format (pxar) and not over the files directly, there is no relation
between files and chunks. This means we have to read all files again for every
backup; otherwise, it would not be possible to generate a consistent pxar
archive where the original chunks can be reused.
Verification of encrypted chunks
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For encrypted chunks, only the checksum of the original (plaintext) data is
available, making it impossible for the server (without the encryption key) to
verify its content against it. Instead, only the CRC-32 checksum gets checked.
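Such a check might be sketched as follows (using the crc32fast crate; how the
CRC-32 is extracted from the chunk file is left out here):

.. code-block:: rust

    // Verify only the CRC-32 of an encrypted chunk's payload; without the
    // encryption key, the plaintext SHA-256 cannot be recomputed.
    fn crc_matches(chunk_payload: &[u8], stored_crc: u32) -> bool {
        let mut hasher = crc32fast::Hasher::new();
        hasher.update(chunk_payload);
        hasher.finalize() == stored_crc
    }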