src/backup.rs: improve GC problem description

Dietmar Maurer 2019-03-31 09:44:35 +02:00
parent c8ec450e37
commit c374f05499
1 changed file with 13 additions and 7 deletions


@ -65,8 +65,8 @@
//! //!
//! To free up some storage, we run a garbage collection process at //! To free up some storage, we run a garbage collection process at
//! regular intervals. The collector uses a mark and sweep //! regular intervals. The collector uses a mark and sweep
//! approach. In the first run, it scans all .idx files to mark used //! approach. In the first phase, it scans all .idx files to mark used
//! chunks. The second run then removes all unmarked chunks from the //! chunks. The second phase then removes all unmarked chunks from the
//! store. //! store.
//! //!
//! The above locking mechanism makes sure that we are the only //! The above locking mechanism makes sure that we are the only
@ -79,18 +79,24 @@
//! //!
//! The idea here is to mark chunks by updating the `atime` (access //! The idea here is to mark chunks by updating the `atime` (access
//! timestamp) on the chunk file. This is quite simple and does not //! timestamp) on the chunk file. This is quite simple and does not
//! need RAM. //! need additional RAM.
//! //!
//! One minor problem is that recent Linux versions use the `relatime` //! One minor problem is that recent Linux versions use the `relatime`
//! mount flag by default for performance reasons (yes, we want //! mount flag by default for performance reasons (yes, we want
//! that). When enabled, `atime` data is written to the disk only if //! that). When enabled, `atime` data is written to the disk only if
//! the file has been modified since the `atime` data was last updated //! the file has been modified since the `atime` data was last updated
//! (`mtime`), or if the file was last accessed more than a certain //! (`mtime`), or if the file was last accessed more than a certain
//! amount of time ago (by default 24h). //! amount of time ago (by default 24h). So we may only delete chunks
//! with `atime` older than 24 hours.
//!
//! Another problem arises from running backups. The mark phase does
//! not find any chunks from those backups, because there is no .idx
//! file for them (created after the backup). Chunks created or
//! touched by those backups may have an `atime` as old as the start
//! time of those backups. Please note that the backup start time may
//! predate the GC start time. So we may only delete chunks older than
//! the start time of those running backup jobs.
//! //!
//! Another problem arise when running backups references old
//! chunks. We need to make sure that the sweep does not remove such
//! chunks. Not sure how to implement that.
//! //!
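The two deletion constraints described in the new text (the 24h `relatime` window and the start time of running backups) can be combined into a single sweep cutoff. The following is only a minimal sketch under stated assumptions: the names `RELATIME_WINDOW`, `sweep_cutoff` and `may_delete` are hypothetical and do not come from `src/backup.rs`, and taking the older of the two limits is one possible reading of the comment, not necessarily what the project implements.

```rust
use std::time::{Duration, SystemTime};

/// Assumed `relatime` window: `atime` updates may be skipped for up to 24h.
const RELATIME_WINDOW: Duration = Duration::from_secs(24 * 60 * 60);

/// Hypothetical helper: compute the point in time before which a chunk's
/// `atime` must lie for the sweep phase to be allowed to delete it.
fn sweep_cutoff(gc_start: SystemTime, running_backup_starts: &[SystemTime]) -> SystemTime {
    // Constraint 1: because of `relatime`, only chunks with an `atime`
    // older than 24 hours (relative to the GC start) may be deleted.
    let relatime_cutoff = gc_start
        .checked_sub(RELATIME_WINDOW)
        .unwrap_or(SystemTime::UNIX_EPOCH);

    // Constraint 2: chunks touched by running backups may have an `atime`
    // as old as the start of the oldest running backup, which may predate
    // the GC start time.
    match running_backup_starts.iter().copied().min() {
        // Assumption: use whichever limit is older, so both constraints hold.
        Some(oldest_start) => relatime_cutoff.min(oldest_start),
        None => relatime_cutoff,
    }
}

/// Hypothetical helper: an unmarked chunk may be removed only if its
/// `atime` is strictly older than the cutoff.
fn may_delete(chunk_atime: SystemTime, cutoff: SystemTime) -> bool {
    chunk_atime < cutoff
}
```

A sweep loop would compute `sweep_cutoff` once, before the mark phase starts, and then call `may_delete` with each chunk file's `atime`. How the start times of running backups are obtained is left open here.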
//! ## Store `marks` in RAM using a HASH //! ## Store `marks` in RAM using a HASH
//! //!