docs: add more thoughts about chunk size
parent 60e6ee46de
commit 37f1b7dd8d

README.rst | 21

@@ -112,3 +112,24 @@ Modern SSDs are much faster, let's assume the following::

   MAX(64KB) = 354 MB/s;
   MAX(4KB) = 67 MB/s;
   MAX(1KB) = 18 MB/s;
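
Read another way, each ceiling above implies a chunks-per-second budget. A
minimal sketch (plain Python, using only the three figures quoted above)
makes this visible::

   # Convert the throughput ceilings above into chunks/second.
   ceilings = {64 * 1024: 354, 4 * 1024: 67, 1 * 1024: 18}  # bytes -> MB/s

   for size, mb_per_s in sorted(ceilings.items(), reverse=True):
       chunks_per_s = mb_per_s * 1024 * 1024 / size
       print(f"{size // 1024:>2} KB chunks: ~{chunks_per_s:.0f} chunks/s")

   # 64 KB chunks: ~5664 chunks/s
   #  4 KB chunks: ~17152 chunks/s
   #  1 KB chunks: ~18432 chunks/s

Below about 4KB the chunk rate barely grows, which suggests per-chunk
overhead, not raw bandwidth, becomes the limit.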

Also, the average chunk size (ACS) directly relates to the number of chunks
produced by a backup::

   CHUNK_COUNT = BACKUP_SIZE / ACS
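
As an illustration, the same formula in plain Python (``chunk_count`` is a
hypothetical helper, not part of the code base)::

   def chunk_count(backup_size: int, acs: int) -> int:
       """Approximate CHUNK_COUNT = BACKUP_SIZE / ACS (both in bytes)."""
       return backup_size // acs

   GB, MB = 1024 ** 3, 1024 ** 2
   print(chunk_count(65 * GB, 4 * MB))  # 16640, in line with the data below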

Here are some statistics from my developer workstation::

   Disk Usage: 65 GB
   Directories: 58971
   Files: 726314
   Files < 64KB: 617541
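
Numbers like these can be gathered with a few lines of Python; this is only
a sketch (``fs_stats`` is illustrative, not part of the tool)::

   import os

   def fs_stats(root, small=64 * 1024):
       """Count directories, files and files smaller than `small` bytes."""
       dirs = files = small_files = used = 0
       for path, dirnames, filenames in os.walk(root):
           dirs += len(dirnames)
           for name in filenames:
               try:
                   size = os.path.getsize(os.path.join(path, name))
               except OSError:
                   continue  # dangling symlinks, permission errors, races
               files += 1
               used += size
               if size < small:
                   small_files += 1
       return dirs, files, small_files, used

   dirs, files, small_files, used = fs_stats(os.path.expanduser("~"))
   print(f"Disk Usage: {used / 1024 ** 3:.0f} GB")
   print(f"Directories: {dirs}, Files: {files}, Files < 64KB: {small_files}")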

As you can see, there are a great many small files. If we did file-level
deduplication, i.e. generated one chunk per file, we would end up with more
than 700000 chunks.

Instead, our current algorithm produces only large chunks, with an average
chunk size of 4MB. With the above data, this yields about 15000 chunks (a
factor of ~50 fewer chunks).
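
A quick sanity check of that factor, again in plain Python::

   GB, MB = 1024 ** 3, 1024 ** 2

   per_file_chunks = 726314            # one chunk per file, from the stats
   large_chunks = 65 * GB // (4 * MB)  # CHUNK_COUNT = BACKUP_SIZE / ACS

   print(per_file_chunks / large_chunks)  # ~43.7; with the ~15000 chunks
                                          # actually produced it is ~48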