src/backup.rs: start explaining different GC algorithm
		| @ -17,7 +17,7 @@ | ||||
| //! so that we can update the software without rebooting the host. But | ||||
| //! such restarts must not abort running backup jobs, so we need to | ||||
| //! keep the old service running until those jobs are finished. This | ||||
| //! implies that we need some kink of locking for the | ||||
| //! implies that we need some kind of locking for the | ||||
| //! ChunkStore. Please note that it is perfectly valid to have | ||||
| //! multiple parallel ChunkStore writers, even when they write the | ||||
| //! same chunk (because the chunk would have the same name and the | ||||
| @ -39,7 +39,8 @@ | ||||
| //! | ||||
| //!   Acquire shared lock for ChunkStore (process wide). | ||||
| //! | ||||
| //!   Note: We create temporary (.tmp) file, then do an atomic rename ... | ||||
| //!   Note: When creating .idx files, we create a temporary (.tmp) file, | ||||
| //!   then do an atomic rename ... | ||||
| //! | ||||
| //! | ||||
| //! * Garbage Collect: | ||||
| @ -56,7 +57,7 @@ | ||||
| //!   socket. | ||||
| //! | ||||
| //! | ||||
| //! # Garbage Collection | ||||
| //! # Garbage Collection (GC) | ||||
| //! | ||||
| //! Deleting backups is as easy as deleting the corresponding .idx | ||||
| //! files. Unfortunately, this does not free up any storage, because | ||||
| @ -69,10 +70,31 @@ | ||||
| //! store. | ||||
| //! | ||||
| //! The above locking mechanism makes sure that we are the only | ||||
| //! process running GC. | ||||
| //! process running GC. But we still want to be able to create backups | ||||
| //! during GC, so there may be multiple backup threads/tasks running, | ||||
| //! either started before GC began or started while GC is | ||||
| //! running. | ||||
| //! | ||||
| //! ## `atime` based GC | ||||
| //! | ||||
|  | ||||
| //! The idea here is to mark chunks by updating the `atime` (access | ||||
| //! timestamp) on the chunk file. This is quite simple and does not | ||||
| //! need extra RAM to hold the marks. | ||||
| //! | ||||
| //! One minor problem is that recent Linux versions use the `relatime` | ||||
| //! mount flag by default for performance reasons (yes, we want | ||||
| //! that). When enabled, `atime` data is written to the disk only if | ||||
| //! the file has been modified since the `atime` data was last updated | ||||
| //! (`mtime`), or if the file was last accessed more than a certain | ||||
| //! amount of time ago (by default 24h). | ||||
| //! | ||||
| //! Another problem arises when running backups reference old | ||||
| //! chunks. We need to make sure that the sweep does not remove such | ||||
| //! chunks. Not sure how to implement that. | ||||
| //! | ||||
| //! ## Store `marks` in RAM using a HASH | ||||
| //! | ||||
| //! Not sure if this is better. TODO | ||||
|  | ||||
| mod chunk_stat; | ||||
| pub use chunk_stat::*; | ||||
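A minimal sketch of the `atime` based mark and sweep described in the doc comment, using only the Rust standard library (`File::set_times` needs Rust 1.75 or later). The function names `mark_chunk` and `sweep`, the flat chunk directory, and the `cutoff` parameter are assumptions for illustration and are not part of this commit. The mark sets `atime` explicitly, and an explicit timestamp update is not throttled by `relatime`, which only affects the implicit updates caused by reads. How to choose `cutoff` so that chunks referenced by running backups survive is exactly the open question in the text; the sketch leaves that decision to the caller.

```rust
use std::fs::{self, File, FileTimes};
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Mark phase (hypothetical helper): explicitly set the chunk's `atime`.
/// Only the access time is changed; the modification time is left alone.
fn mark_chunk(path: &Path, when: SystemTime) -> io::Result<()> {
    let file = File::options().write(true).open(path)?;
    file.set_times(FileTimes::new().set_accessed(when))
}

/// Sweep phase (hypothetical helper): remove every regular file in `dir`
/// whose `atime` is older than `cutoff`. Returns the number of removed files.
fn sweep(dir: &Path, cutoff: SystemTime) -> io::Result<usize> {
    let mut removed = 0;
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let meta = entry.metadata()?;
        if meta.is_file() && meta.accessed()? < cutoff {
            fs::remove_file(entry.path())?;
            removed += 1;
        }
    }
    Ok(removed)
}
```

A GC run would call `mark_chunk` for every chunk referenced by an existing .idx file and then call `sweep` with a cutoff no later than the start of the mark phase.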
|  | ||||
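The alternative the doc comment leaves as a TODO, keeping the marks in RAM in a hash, could look roughly like the sketch below. Everything here is an assumption for illustration: the 32-byte `Digest` type, the representation of an index as a list of digests, and the function names `mark` and `garbage`. Unlike the `atime` variant, this one needs memory proportional to the number of referenced chunks (at least 32 bytes per mark under the assumed digest size).

```rust
use std::collections::HashSet;

/// Assumed chunk digest type (32 bytes); the commit does not specify one.
type Digest = [u8; 32];

/// Mark phase: collect every digest referenced by any index into a set.
fn mark(indexes: &[Vec<Digest>]) -> HashSet<Digest> {
    let mut marks = HashSet::new();
    for index in indexes {
        marks.extend(index.iter().copied());
    }
    marks
}

/// Sweep phase: return the stored digests that no index references.
/// Input order is preserved.
fn garbage(stored: &[Digest], marks: &HashSet<Digest>) -> Vec<Digest> {
    stored
        .iter()
        .copied()
        .filter(|digest| !marks.contains(digest))
        .collect()
}
```

This variant has the same concurrency problem as the `atime` one: an index written by a backup that is still running is not visible to `mark`, so its chunks would need separate protection from the sweep.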