While the user chosen description is not allowed to be
empty, we do leave it empty for recovery keys, as a "dummy
description" makes little sense...
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
otherwise the user is confronted with a generic error like "permission
check failed" with no indication that it refers to a request made to the
remote PBS instance..
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Currently there's not yet a node config and the WA config is
somewhat "tightly coupled" to the user entries in that
changing it can lock them all out, so for now I opted for
fewer reorganization and just use a digest of the
canonicalized config here, and keep it all in the tfa.json
file.
Experimentally using the flatten feature on the methods with
an`Updater` struct similar to what the api macro is supposed
to be able to derive on its own in the future.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
modifying @pam users credentials should be only possible for root@pam,
otherwise it can have unintended consequences.
also enforce the same limit on user creation (except self_service check,
since it makes no sense during user creation)
Signed-off-by: Oguz Bektas <o.bektas@proxmox.com>
Signal does not yet re-implement Stream (and is not yet wrapped in
tokio-stream either).
see https://github.com/tokio-rs/tokio/pull/3383
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
to wrap a Receiver in a Stream. this will likely move back into tokio
proper once we have a std Stream..
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Child itself is no longer a Future, but it has a new wait() async fn
that does the same thing
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
enter() now returns a guard, and the builder got revamped to make the
choice between MT and current thread explicit.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
We always automatically unload tapes to free library slots,
so it should not happen that an ejected tape resides inside the drive.
This is just a safe guard to handle the situation in case it happens ...
You can manually produce the situation by ejecting a tape without unloading:
mt -f /dev/nst0 eject
Note: Our "proxmox-tape eject" does automatic unload
Try to provide generic implementation for complex operations:
- unload_to_free_slot
- load_media
- export media
- clean drive
- online_media_changer_ids
the old variant attempted to parse a tokenid as userid and returned the
cryptic parsing error to the client, which is rather confusing.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Use timeout futures for sections that might hang in certain error
conditions. This is mostly intended to be used as a safeguard, not a
first line of defense - i.e. best-effort avoidance of total hangs.
Not every future used for the HttpClient/H2Client is changed, only those
where a quick response is to be expected. For example, the response
reading futures are left alone, so data transfer is never capped with
timeout, only the initial server connect.
It is also used for upgrading to H2 connections, as that can take a long
time on overloaded servers.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
it seems that sometimes, the child process signal gets handled
before the parent process signal. Systemd then ignores the
childs signal (finished reloading) and only after going into
reloading state because of the parent. this will never finish.
Instead, wait for the state to change to 'reloading' after sending
that signal in the parent, an only fork afterwards. This way
we ensure that systemd knows about the reloading before actually trying
to do it.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-By: Fabian Ebner <f.ebner@proxmox.com>
of ProcessLockSharedGuard.
We use a counter to determine if we can unlock the file again, but
we never actually decremented the writer count, so we held the
lock forever.
This fixes the issue that we could not start a garbage collect after
a reload, as long as the old process is still running, even when that
process has no active backup anymore but another long running task
(e.g. file download, terminal, etc.).
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
document all public things, add some doc links and make some
previously-public things only available for test cases or within the
crate:
previously public, now private:
- AclTreeNode::extract_user_roles (we have extract_roles())
- AclTreeNode::extract_group_roles (same)
- AclTreeNode::delete_group_role (exists on AclTree)
- AclTreeNode::delete_user_role (same)
- AclTreeNode::insert_group_role (same)
- AclTreeNode::insert_user_role (same)
- AclTree::write_config (we have save_config())
- AclTree::load (we have config()/cached_config())
previously public, now crate-internal:
- AclTree::from_raw (only used by tests)
- split_acl_path (used by some test binaries)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
instead of just logging the error. this should never happen in practice
unless someone is messing with the keyfile, in which case, it's better
to abort.
update tests accordingly (wrong fingerprint should fail, no fingerprint
should get the expected one).
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
the RSA key and the encryption key itself are hard-coded to avoid
stalling the test runs because of lack of entropy, they have no special
significance otherwise.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
when restoring an encrypted key, the original one is obviously not
available to check the fingerprint with.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
this fixes the issue that on some filesystems, you cannot recursively
remove a directory when you hold a lock on a file inside (e.g. nfs/cifs)
it is not really backwards compatible (so during an upgrade, there
could be two daemons have the lock), but since the locking was
broken before (see previous patch) it should not really matter
(also it seems very unlikely that someone will trigger this)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
'lock_manifest' returns a Result<File, Error> so we always got the result,
even when we did not get the lock, but we acted like we had.
bubble the locking error up
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
if no groups were found, the task log was very confusing as it
contained no real information why nothing was synced, e.g.:
Starting datastore sync job 'remote:datastore:local-datastore:s-79412799-e6ee'
Sync datastore 'local-datastore' from 'remote/datastore'
sync job 'remote:datastore:local-datastore:s-79412799-e6ee' end
TASK OK
this patch simply logs how many groups were found and are about to be synced:
Starting datastore sync job 'remote:datastore:local-datastore:s-79412799-e6ee'
Sync datastore 'local-datastore' from 'remote/datastore'
found 0 groups to sync
sync job 'remote:datastore:local-datastore:s-79412799-e6ee' end
TASK OK
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
percentage of verified groups, interpolating based on snapshot count
within the group. in most cases, this will also be closer to 'real'
progress since added snapshots (those which will be verified) in active
backup groups will be roughly evenly distributed, while number of total
snapshots per group will be heavily skewed towards those groups which
have existed the longest, even though most of those old snapshots will
only be re-verified very infrequently.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
BackupInfo::list_backup_groups is identical code-wise, and makes more
sense as entry point for listing groups.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
WalkDir does not follow symlinks by default anyway, and this behaviour
is not documented anywhere. e.g., if a sysadmin mounts 'extra storage'
for some backup group or type (not knowing that only metadata is stored
in those directories), GC will ignore all the indices contained within
and happily garbage collect their chunks..
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
for safety reason, GC finds and marks all index files below the
datastore base path. as a result of regular operations, only index files
within the expected scheme of <TYPE>/<ID>/<TIMESTAMP> should exist.
add a small check + warning if the index list contains index files out
side of this expected scheme, so that an admin with shell access can
investigate.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
we have messages starting the phases anyway, and limit the number of
progress updates so that context remains available at all times.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
before adding more fields to the tuple, let's just create the struct
inside the match arms to improve readability.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
and use this information to add more information to client backup log
and guide the download manifest decision.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
the errors Vec can contain failed groups as well (e.g., if a group has
no or an invalid owner).
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
else users have to manually search through a potentially very long task
log to find the entries that are different.. this is the same summary
printed at the end of a manual verify task.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
from formatting functions to main function, and pass along the key data
lines instead of the full string.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
this is stricter than the check that happened on manifest load, as it
also fails if the manifest is signed but we don't have a key available.
add some additional output at the start of a backup to indicate whether
a previous manifest is available to base the backup on.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
otherwise loading will run into the signature mismatch which is
technically true, but not the complete picture in this case.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
if the manifest is signed/the contained archives/blobs are encrypted.
stored in 'unprotected' area, since there is already a strong binding
between key and manifest via the signature, and this avoids breaking
backwards compatibility for a simple usability improvement.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
and set/generate it on
- key creation
- key passphrase change
- key decryption if not already set
- key encryption with master key
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
since we systemd-encode parts of the upid string, and those can contain
characters that are invalid in urls (e.g. '\'), we have to percent encode
those
add a 'percent_encode_component' helper, so that we can maybe change
the AsciiSet for all uses at the same time
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Simplify the phase 2 code by treating .bad files just like regular
chunks, with the exception of stat logging.
To facilitate, we need to touch .bad files in phase 1. We only do this
under the condition that 1) the original chunk is missing (as before),
and 2) the original chunk is still referenced somewhere (since the code
lives in the error handler for a failed chunk touch, it only gets called
for chunks we expect to be there, i.e. ones that are referenced).
Untouched they will then be cleaned up after 24 hours (or after the last
longer-running task finishes).
Reason 2) is also a fix for .bad files not being cleaned up at all if
the original is no longer referenced anywhere (e.g. a user deleting all
snapshots after seeing some corrupt chunks appear).
cond_touch_path is introduced to touch arbitrary paths in the chunk
store with the same logic as touching chunks.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
unprivileged users should only see the counts related to their part of
the datastore.
while we're at it, switch to a list groups, filter groups, count
snapshots approach (like list_snapshots) to speedup calls to this
endpoint when many unprivileged users share a datastore.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
used in the PBS GUI, but also for PVE usage queries which don't need all
the extra expensive information..
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
by listing groups first, then filtering, then listing group snapshots.
this cuts down the number of openat/getdirents calls for users that just
have a partial view of the datastore.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Useful to avoid the need for a long (and possibly changing) list of include-dev
options in certain situations, e.g. nested ZFS file systems. The option is
already implemented and seems to work as expected. The checks for virtual
filesystems are not affected by this option.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
The patterns from the archive root's .pxarexclude file are already present in
self.patterns when encode_pxarexclude_cli is called. Pass along the number of
CLI patterns and slice accordingly.
Suggested-By: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
previously a .pxarexclude entry in the root of the archive caused the file to
be generated as well, because the patterns are read before calling
generate_directory_file_list and within the function it wasn't possible to
distinguish between a pattern coming from the CLI and a pattern coming from
archive/root/.pxarexclude
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
The documentation states:
.pxarexclude files are treated as regular files and will be included in the
backup archive.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
There is no leading slash in an entry's full_path, causing an anchored
exclude at the root level to fail, e.g. having "/name" as the content of the
file archive/root/.pxarexclude didn't match the file archive/root/name
Fix this by prepending a leading slash before matching.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Add the versions command to proxmox-backup-manager with a similar output
to pveversion [-v]. It prints the packages line by line with only the
package name, followed by the version and, for proxmox-backup and
proxmox-backup-server, some additional information (running kernel,
running version).
In addition it supports the optional output-format parameter which can
be used to print the complete data in either json, json-pretty or text
format. If output-format is specified, the --verbose parameter is
ignored and the detailed list of packages is printed.
With the addition of the versions command, the report is extended as
well.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
Add an optional string field to APTUpdateInfo which can be used for
extra information.
This is used for passing running kernel and running version information
in the versions API call together with proxmox-backup and
proxmox-backup-server.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
for now this only does the 'postfix' -> 'postfix@-' conversion,
fixes the issue that we only showed the 'postfix' service syslog
(which is rather empty in a default setup) instead of the instance one
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
This patch prints the source of the encryption key when running
operations with proxmox-backup-client.
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Currently if you generate a default encryption key:
`proxmox-backup-client key create --kdf none`
all backup operations which don't explicitly disable encryption will be
encrypted with this key.
I found it quite surprising, that my backups were all encrypted without
me explicitly specfying neither key nor encryption mode
This patch informs the user when the default key is used (and no
crypt-mode is provided explicitly)
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
when authenticating a token, and not just when authenticating a
user/ticket.
Reported-By: Dominik Jäger <d.jaeger@proxmox.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
sd_notify is not synchronous, iow. it only waits until the message
reaches the queue not until it is processed by systemd
when the process that sent such a message exits before systemd could
process it, it cannot be associated to the correct pid
so in case of reloading, we send a message with 'MAINPID=<newpid>'
to signal that it will change. if now the old process exits before
systemd knows this, it will not accept the 'READY=1' message from the
child, since it rejects the MAINPID change
since there is no (AFAICS) library interface to check the unit status,
we use 'systemctl is-active <SERVICE_NAME>' to check the state until
it is not 'reloading' anymore.
on newer systemd versions, there is 'sd_notify_barrier' which would
allow us to wait for systemd to have all messages from the current
pid to be processed before acknowledging to the child, but on buster
the systemd version is to old...
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
log invalid owners to system log, and continue with next group just as
if permission checks fail for the following operations:
- verify store with limited permissions
- list store groups
- list store snapshots
all other call sites either handle it correctly already (sync/pull), or
operate on a single group/snapshot and can bubble up the error.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
for more useful log output
old:
Nov 10 11:50:51 foo pvestatd[3378]: proxmox-backup-client failed: Error: error trying to connect: tcp connect error: No route to host (os error 113)
new:
Nov 10 11:55:21 foo pvestatd[3378]: proxmox-backup-client failed: Error: error trying to connect: error connecting to https://thebackuphost:8007/ - tcp connect error: No route to host (os error 113)
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
similar to what we do for zfs. By bailing before partitioning, the disk is
still considered unused after a failed attempt.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
If a package is or will be installed from the enterprise repo, retrieve
the changelog from there as well (securely via HTTPS and authenticated
with the subcription key).
Extends the get_string method to take additional headers, in this case
used for 'Authorization'. Hyper does not have built-in basic auth
support AFAICT but it's simple enough to just build the header manually.
Take the opportunity and also set the User-Agent sensibly for GET
requests, just like for POST.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
to easily check the store of a worker_id
this fixes the issue that one could not filter by type 'syncjob' and
datastore simultaneously
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
very basic, based on API/concepts of PVE one.
Still missing, addint an extra_info string option to APTUpdateInfo
and pass along running kernel/PBS version there.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Clippy complains about the number of paramters we have for
create_archive and it really does need to be made somewhat
less awkward and more usable. For now we just log to stderr
as we previously did. Added todo-comments for this.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
the basedir is already /usr/share/javascript/proxmox-backup/
so adding a subdir of that as alias is not needed
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
with remote Authids, not local Userids.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
if the user/token could have either configured/manually executed the
task, but it was either executed via the schedule (root@pam) or
another user/token.
without this change, semi-privileged users (that cannot read all tasks
globally, but are DatastoreAdmin) could schedule jobs, but not read
their logs once the schedule executes them. it also makes sense for
multiple such users to see eachothers manually executed jobs, as long as
the privilege level on the datastore (or remote/remote_store/local
store) itself is sufficient.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
since the store is not a path parameter, we need to do manual instead of
schema checks. also dropping Datastore.Backup here
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
to allow on-demand scanning of remote datastores accessible for the
configured remote user.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
instead of await'ing the result of 'create_service' directly,
poll it together with the shutdown_future
if we reached that, fork_restart the new daemon, and await
the open future from 'create_service'
this way the old process still handles open connections until they finish,
while we already start a new process that handles new incoming connections
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
they are not an error and we should retry the read
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
when a file shrunk during backup, we endlessly looped, reading/copying 0 bytes
we already have code that handles shrunk files, but we forgot to
break from the read loop
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
we have information here not available in the access log, especially
if the /api2/extjs formatter is used, which encapsulates errors in a
200 response.
So keep the auth log for now, but extend it use from create ticket
calls to all authentication failures for API calls, this ensures one
can also fail2ban tokens.
Do that logging in a central place, which makes it simple but means
that we do not have the user ID information available to include in
the log.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
add all of our configuration files in /etc/proxmox-backup/ further,
call some ZFS tool to get that status.
Also, use the subscription command form manager, as we often require
more info than the status. Also, adapt formatting a bit.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
those are not in a hot code path, and it is not really much work to
build them on the go..
It may not matther much, but it is unnecessary. Rust will probably
inline most of it anyway..
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
and change all users of the /status/tasks api call to this
with this change we can now delete the /status/tasks api call
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
instead of returning 0 elements (which does not really make sense anyway),
change it so that there is no limit anymore (besides usize::MAX)
this is technically a breaking change for the api, but i guess
no one is using limit=0 for anything sensible anyway
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
in the api we use PROXMOX_SAFE_ID_REGEX for backup ids, but here
(where we use it to list them) we use a local regex
since the first is a superset of the one used here, simply extend
the local one
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
should cover all the current scenarios. remote server-side checks can't
be meaningfully unit-tested, but they are simple enough so should
hopefully never break.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
in case the garbage_collection errors out, we never set the in-memory
state, so if it failed, the last 'good' starttime was considered
for the schedule
this could lead to the job running every minute instead of the
correct schedule
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
instead of manually, this has the advantage that we now set
the jobstate correctly and can return with an error if it is
currently running (instead of failing in the task)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
index files that were smaller than their respective header size,
would fail with
"failed to fill whole buffer"
instead now check explicitely for the size and fail with
"index too small (size)"
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
try do reduce some unecessary lines, make match arms more precise so
one can faster see what's actually happening.
Also, avoid
> return Err(format_err!(...))
stuff, just use bail!()
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
re-use the future we already have for task log rotation to trigger
it.
Move the FileLogger in ApiConfig into an Arc, so that we can actually
update it and REST using the new one.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
so that we can easily get the main PID of the last recently launched
daemon. Will be used to get the control socket of that one for access
lgo rotate in a future patch
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
this is internal for now, use the comanndo socket struct
implementation, and ideally not a new one but the existing ones
created in the proxy and api daemons.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Allows to extend the use of that socket in the future, e.g., for log
rotate re-open signaling.
To reflect this we use a more general name, and change the commandos
to a more clear namespace.
Both are actually somewhat a breaking change, but the single real
world issue it should be able to cause is, that one won't be able to
stop task from older daemons, which still use the older abstract
socket name format.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This is a preparatory step to replace the task control socket with it
and provide a "reopen log file" command for the rest server.
Kept it simple by disallowing to register new commands after the
socket gets spawned, this avoids the need for locking.
If we really need that we can always wrap it in a Arc<RWLock<..>> or
something like that, or even nicer, register at compile time.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
writing to a file can explode quite easily.
time formatting to rfc3339 should be more robust, but it has a few
conditions where it could fail, so catch that too (and only really
do it if required).
The writes to stdout are left as is, it normally is redirected to
journal which is in memory, and thus breaks later than most stuff,
and at that point we probably do not care anymore anyway.
It could make sense to actually return a result here..
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
We renamed the last one always to a file without compression
extension, even if it was .zst previously. So always add the correct
ending to the new last one, if compress was true.
Further, we cannot detect if there'd be a compression required if we
rotated (renamed) it already to the file with .zst included.
So check on rotation itself if it would be a "no .zst" -> ",zst"
transition, and call compress there.
it really should be OK now *knocking wood*
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
by requiring
- Datastore.Backup permission for target datastore
- Remote.Read permission for source remote/datastore
- Datastore.Prune if vanished snapshots should be removed
- Datastore.Modify if another user should own the freshly synced
snapshots
reading a sync job entry only requires knowing about both the source
remote and the target datastore.
note that this does not affect the Authid used to authenticate with the
remote, which of course also needs permissions to access the source
datastore.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
instead of hard-coding 'backup@pam'. this allows a bit more flexibility
(e.g., syncing to a datastore that can directly be used as restore
source) without overly complicating things.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
no idea why I added it as "delete", for all other such operations we
use the "remove" sub-command...
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
again, base idea copied off PVE, but, we safe the information about
which pending version we send a mail out already in a separate
object, to keep the api return type APTUpdateInfo clean.
This also makes a few things a bit easier, as we can update the
package status without saving/restoring the notify information.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
apt changes some of its state/cache also if it errors out, most of
the time, so we actually want to print both, stderr and stdout.
Further, only warn if its exit code is non-zero, for the same
rationale, it may bring updates available even if it errors (e.g.,
because a future pbs-enterprise repo is additionally configured but
not accessible).
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
it's not used anywhere, and not needed either until the day we might
implement push syncs.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
equivalent to verifying a whole datastore, except for reading job
(entries), which is accessible to regular Datastore.Audit/Backup users
as well.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
for verifying a whole datastore. Datastore.Backup now allows verifying
only backups owned by the triggering user.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
they are returned when reading the manifest, which just requires
Datastore.Audit as well. Datastore.Read is for reading backup contents,
not metadata.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
even for otherwise unprivileged users.
since effective privileges of an API token are always intersected with
those of their owning user, this does not allow an unprivileged user to
elevate their privileges in practice, but avoids the need to involve a
privileged user to deploy API tokens.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
a user should be allowed to read/list/overwrite backups owned by their
own tokens, but a token should not be able to read/list/overwrite
backups owned by their owning user.
when changing ownership of a backup group, a user should be able to
transfer ownership to/from their own tokens if the backup is owned by
them (or one of their tokens).
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
since it's not possible to extend existing structs, UserWithTokens
duplicates most of user::User.. to avoid duplicating user::ApiToken as
well, this returns full API token IDs, not just the token name part.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
in most generic places. this is accompanied by a change in
RpcEnvironment to purposefully break existing call sites.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
with an optional Tokenname, appended with '!' as delimiter in the string
representation like for PVE.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Make it more clear that removed files are chunks (not indexes or
something like that, user cannot know that we do not touch them here)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
fixes commit b4fb262335, which copied
over the "Removed bad files:" block, but only adapted the log text,
not the actual variable.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
instead of prerotating 1000 tasks
(which resulted in 2 writes each time an active worker was finished)
simply append finished tasks to the archive (which will be rotated)
page cache should be good enough so that we can get the task logs fast
since existing installations might have an 'index' file, we
still have to read tasks from there, but only if it exists
this simplifies the TaskListInfoIterator a good amount
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
by moving the properties of the storage status out again to the top
level object
also introduce proper structs for the types used, to get type-safety
and better documentation for the api calls
this changes the backup counts from an array of [groups,snapshots] to
an object/struct with { groups, snapshots } and include 'other' types
(though we do not have any at this moment)
this way it is better documented
this also adapts the ui code to cope with the api changes
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
saves files mtime as i64 instead of u64 which enables backup of
files with negative mtime
the catalog_decode_i64 is compatible to encoded u64 values (if < 2^63)
but not reverse, so all "old" catalogs can be read with the new
decoder, but catalogs that contain negative mtimes will decode wrongly
on older clients
also remove the arbitrary maximum value of 2^63 - 1 for
encode_u64 (we just use up to 10 bytes now) and correctly
decode them and update the comments accordingly
adds also test for i64 encode/decode and for compatibility between
u64 encode and i64 decode
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
fixes commit 16f9f244cf which extended
the return schema of the status API but did not adapted the client
status command to that.
Simply define our own tiny return schema and use that.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
we never actually compressed any files, since we only looked at
the extension:
* if it was 'zst' (which was always true for newly rotated files), we
would not compress it
* even if it was not 'zst', we compressed it inplace, never adding '.zst'
(possibly compressing them multiple times as zstd)
now we add new rotated files simply as '.X' and add a 'target' to the
compress fn, where we rename it to (but now we have to unlink the source
path)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
commit a4915dfc2b made a wrong fix, as
it did not observed that the last expressions was done under the
invariant that we had a last verification result, because if none
could be loaded we already returned true (include).
It thus broke the case for "never re-verify", which is important when
using multiple schedules, a more high frequent one for new,
unverified snapshots, and a low frequency to re-verify older snapshots,
e.g., monthly.
Fix this case again, rework the code to avoid this easy to oversee
invariant. Use a nested match to better express the implication of
each setting, and add some comments.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
and load it again when opening it
this way we can persist the status of the last garbage collect across
daemon reloads and reboots
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
This is intended for when the server needs to do requests on
arbitrary, non PBS, external HTTP resources.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
we're a bit strict here what we accept, rather than changing that
lets do it like PVE/PMG and uppercase.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
This is indented to be used for the PVE storage library, replacing
the missuse of the much more expensive status API call.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
For proxmox packages it works the same way as PVE, by retrieving the
changelog URL and issuing a HTTP GET to it, forwarding the output to the
client. As this is only supposed to be a workaround removed in the
future, a simple block_on is used to avoid async.
For debian packages we can simply call 'apt-get changelog' and forward
it's output.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
As always, libapt is mocking us with complexity, but we can get the
approximate result we want by retrieving dependencies of all
to-be-updated packages and then seeing if they are missing.
If they are, we assume they will be installed.
For this, query_detailed_info is extended to allow reading details for
non-installed packages, and this is also exposed in
list_installed_apt_packages via 'all_versions_for'. This is necessary so
we can retrieve changelogs for such packages.
Note that we cannot retrieve all that information all the time, as
querying details for packages that aren't installed takes a rather long
time.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Avoids custom hardcoded logic, but can only be used for debian packages
as of now. Adds a FIXME to switch over to use --print-uris only once our
package repos support that changelog format.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
To get package details for a specific version instead of only the
candidate.
Also cleanup filter function with extra struct instead of unnamed &str
parameters.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
...to avoid having the tools:: module depend on api2.
The get_string function is based directly on hyper and thus relatively
simple, not supporting redirects for example.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
gets rid of the return value and moving around of the zip
and decoder data
avoids cloning the path prefix on every recursion
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
by using the new ZipEncoder and recursively add files to it
the zip only contains directories, normal files and hardlinks (by simply
copying the content), no symlinks, etc.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
similar to StdChannelWriter, but implements AsyncWrite and sends
to a tokio::sync::mpsc::Sender
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
This modules contains the 'ZipEncoder' struct, which wraps an async writer,
to create a ZIP archive on the fly
To create a ZIP file, have a target that implements AsyncWrite,
give it to ZipEncoder::new, add entries via 'add_entry' and
at the end, call 'finish'
for now, this does not implement compression (uses ZIPs STORE mode), and
does not support empty directories or hardlinks (or any other special
files)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Fixes a bug in which the userid of the ticket cache is updated,
when a user connects, but the ticket itself is not.
This means a newly connected user has a previously connected
user's ticket and thus, cannot do anything, as the client will
attempt to use the invalid ticket.
e.g. if john@pbs connected to the server first, followed by
mike@pbs, the following would be stored in the ticket cache.
{
"localhost": {
"mike@pbs": {
"ticket": "PBS:john@pbs:AAAA",
"timestamp": 1601039326,
"token": "BBBB"
}
}
}
Signed-off-by: Dylan Whyte <d.whyte@proxmox.com>
The first rotation is normally the one still opened by one or more
processes for writing, so it must NOT be replaced, removed, ..., as
this then makes the remaining logging, until those processes are
noticed that they should reopen the logfile due to rotation, goes
into nirvana, which is far from ideal for a log.
Only rotating (renaming) is OK for this active file, as this does not
invalidates the file and keeps open FDs intact.
So start compressing with the second rotation, which should be clear
to use, as all writers must have been told to reopen the log during
the last rotation, reopen is a fast operation and normally triggered
at least day ago (at least if one did not dropped the state file
manually), so we are fine to archive that one for real.
If we plan to allow faster rotation the whole rotation+reopen should
be locked, so that we can guarantee that all writers switched over,
but this is unlikely to be needed.
Again, this is was logrotate sanely does by default since forever.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
this is not the job of logrotate, and the real 20+ years battle
tested log rotate binary does not do so either as it's actually
pretty dangerous.
If we "replace" the file we break any logger which already opened a
new one here, e.g., a dameon starting up, and thus that writer would
log to nirvana.
It's the job of a logger to create a file if not existing, it makes
no sense to do it here.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
To cater to the paranoid, a new datastore-wide setting "verify-new" is
introduced. When set, a verify job will be spawned right after a new
backup is added to the store (only verifying the added snapshot).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Force consumers to use the lookup_datastore method instead of
potentially opening a datastore twice, and pass the config we have
already loaded into open_with_path, removing the need for open(1).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Commit 9070d11f4c introduced this change for other call sites,
assuming it is correct, this one was missed.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
and use that in ApiConfig to avoid that it is owned by root if the
proxmox-backup-api process creates it first.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
for now log auth errors also to the syslog, on a protected (LAN
and/or firewalled) setup this should normally happen due to
missconfiguration, not tries to break in.
This reduces syslog noise *a lot*. A current full journal output from
the current boot here has 72066 lines, of which 71444 (>99% !!) are
"successful auth for user ..." messages
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
reuse the FileLogger module in append mode.
As it implements write, which is not thread safe (mutable self) and
we use it in a async context we need to serialize access using a
mutex.
Try to use the same format we do in pveproxy, namely the one which is
also used in apache or nginx by default.
Use the response extensions to pass up the userid, if we extract it
from a ticket.
The privileged and unprivileged dameons log both to the same file, to
have a unified view, and avoiding the need to handle more log files.
We avoid extra intra-process locking by reusing the fact that a write
smaller than PIPE_BUF (4k on linux) is atomic for files opened with
the 'O_APPEND' flag. For now the logged request path is not yet
guaranteed to be smaller than that, this will be improved in a future
patch.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Add a generous limit now and return the correct error (414 URI Too
Long). Otherwise we could to pretty larger GET requests, 64 KiB and
possible bigger (at 64 KiB my simple curl test failed due to
shell/curl limitations).
For now allow a 3072 characters as combined length of URI path and
query.
This is conform with the HTTP/1.1 RFCs (e.g., RFC 7231, 6.5.12 and
RFC 2616, 3.2.1) which do not specify any limits, upper or lower, but
require that all server accessible resources mus be reachable without
getting 414, which is normally fulfilled as we have various length
limits for stuff which could be in an URI, in place, e.g.:
* user id: max. 64 chars
* datastore: max. 32 chars
The only known problematic API endpoint is the catalog one, used in
the GUI's pxar file browser:
GET /api2/json/admin/datastore/<id>/catalog?..&filepath=<path>
The <path> is the encoded archive path, and can be arbitrary long.
But, this is a flawed design, as even without this new limit one can
easily generate archives which cannot be browsed anymore, as hyper
only accepts requests with max. 64 KiB in the URI.
So rather, we should move that to a GET-as-POST call, which has no
such limitations (and would not need to base32 encode the path).
Note: This change was inspired by adding a request access log, which
profits from such limits as we can then rely on certain atomicity
guarantees when writing requests to the log.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
needs new proxmox dependency to get the RpcEnvironment changes,
adding client_ip getter and setter.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Rewrite most of the documentation to be more readable and correct
(according to the current implementations).
Add a table visualizing all different locks used to synchronize
concurrent operations.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Contains a link to the 'backup' module's doc, as that explains a lot
about the inner workings of PBS and probably marks a good entry point
for new readers.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Avoid races when updating manifest data by flocking a lock file.
update_manifest is used to ensure updates always happen with the lock
held.
Snapshot deletion also acquires the lock, so it cannot interfere with an
outstanding manifest write.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
The 'Ok::<_, Self::Error>(res)' type annotation was from a time where
we could not use async, and had a combinator here which needed
explicity type information. We switched over to async in commit
91e4587343 and, as the type annotation
is already included in the Future type, we can safely drop it.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Given the .pxarexclude file
foo
/bar
The following happens:
exclude: /foo
exclude: /bar
exclude: /subdir/foo
include: /subdir/bar
since the `/bar` line is an absolute path
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Removing a snapshot has some more safety checks which we don't want to
ignore when removing an entire group (i.e. locking the manifest and
notifying GC).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
There's no point in having that as a seperate method, just parse the
thing into a struct and write it back out correctly.
Also makes further changes to the method simpler.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
If we can't acquire a lock (either because the snapshot disappeared, it
is about to be forgotten/pruned, or it is currently still running) skip
the snapshot. Hold the lock during verification, so that it cannot be
deleted while we are still verifying.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
...to avoid it being forgotten or pruned while in use.
Update lock error message for deletions to be consistent.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
To allow other reading operations on the base snapshot as well. No
semantic changes with this patch alone, as all other locks on snapshots
are exclusive.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
A removal can fail if the snapshot is already gone (this is fine, our
job is done either way) or we couldn't get a lock (also fine, it can't
be removed then, just warn the user so he knows what happened and why it
wasn't removed) - keep going either way.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
A snapshot that's currently being read can still appear in the prune
list, but should not be removed.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
This adds a change-owner command to proxmox-backup-client,
that allows a caller with datastore modify privileges
to change the owner of a backup-group.
Signed-off-by: Dylan Whyte <d.whyte@proxmox.com>
To untangle the server code from the actual backup
implementation.
It would be ideal if the whole backup/ dir could become its
own crate with minimal dependencies, certainly without
depending on the actual api server. That would then also be
used more easily to create forensic tools for all the data
file types we have in the backup repositories.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Used to not require access to the WorkerTask struct outside
the `server` and `api2` module, so it'll be easier to
separate those backup/server/client parts into separate
crates.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
This is only acquired in those two methods, both as shared. So it has
no use.
It seems, that it was planned in the past that the index deletion
should take the exclusive, while read and write takes the shared
flock on the index, as one can guess from the lock comments in commit
0465218953
But then later, in commit c8ec450e37)
the documented semantics where changed to use a temp file and do an
atomic rename instead for atomicity.
The reader shared flock on the index file was done inbetween,
probably as preparatory step, but was not removed again when strategy
was changed to using the file rename instead.
Do so now, to avoid confusion of readers and a useless flock.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
via HTTP2/backup reader protocol. they already could do so via the plain
HTTP download-file/.. API calls that the GUI uses, but the reader
environment required READ permission on the whole datastore instead of
just BACKUP on the backup group itself.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
a reader connection should not be allowed to read arbitrary chunks in
the datastore, but only those that were previously registered by opening
the corresponding index files.
this mechanism is needed to allow unprivileged users (that don't have
full READ permissions on the whole datastore) access to their own
backups via a reader environment.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
not triggered by any current code, but this would lead to a stack
exhaustion since borrow would call deref which would call borrow again..
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Previously only Datastore.Modify was required for creating a new
datastore.
But, that endpoint allows one to pass an arbitrary path, of which all
parent directories will be created, this can allow any user with the
"Datastore Admin" role on "/datastores" to do some damage to the
system. Further, it is effectively a side channel for revealing the
systems directory structure through educated guessing and error
handling.
Add a new privilege "Datastore.Allocate" which, for now, is used
specifically for the create datastore API endpoint.
Add it only to the "Admin" role.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
avoiding the need for reshuffling all bits when a new privilege is
added at the start or in the middle of this definition.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
If a fuse_loop instance dies suddenly (e.g. SIGKILL), the FUSE mount and
loop device assignment are left behind. We can determine this scenario
on specific unmap, when the PID file is either missing or contains a PID
of a non-running process, but the backing file and potentially loop
device are still there.
If that's the case, do an "emergency cleanup", by unassigning the
loopdev, calling 'fusermount -u' and then cleaning any leftover files
manually.
With this in place, pretty much any situation is now recoverable via
only the 'proxmox-backup-client' binary, by either calling 'unmap' with
or without parameters.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
On unmap, only report success if the instance we are killing actually
terminates. This is especially important so that cleanup routines can be
assured that /run files are actually cleaned up after calling
cleanup_unused_run_files.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
A 'map' call will only clean up what it needs, that is only leftover
files or dangling instances of it's own name.
For a full cleanup the user can call 'unmap' without any arguments.
The 'cleanup on error' behaviour of map_loop is removed. It is no longer
needed (since the next call will clean up anyway), and in fact fixes a
bug where trying to map an image twice would result in an error, but
also cleanup the .pid file of the running instance, causing 'unmap' to
fail afterwards.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
So user doesn't need to remember which loop devices he has mapped to
what.
systemd unit encoding is used to transform a unique identifier for the
mapped image into a suitable name. The files created in /run/pbs-loopdev
will be named accordingly.
The encoding all happens outside fuse_loop.rs, so the fuse_loop module
does not need to care about encodings - it can always assume a name is a
valid filename.
'unmap' without parameter displays all current mappings. It's
autocompletion handler will list the names of all currently mapped
images for easy selection. Unmap by /dev/loopX or loopdev number is
maintained, as those can be distinguished from mapping names.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
the same as the regular TaskState, but without its fields, so that
we can use the api macro and use it as api call parameter
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Allows mapping fixed-index .img files (usually from VM backups) to be
mapped to a local loopback device.
The architecture uses a FUSE-backed temp file mapped to a loopdev:
/dev/loopX -> FUSE /run/pbs-loopdev/xxx -> backup client -> PBS
Since unmapping requires some cleanup (unmap the loopdev, stop FUSE,
remove the temp files) a special 'unmap' command is added, which uses a
PID file to send SIGINT to the backup-client instance started with
'map', which will handle the cleanup itself.
The polling with select! in mount.rs needs to be split in two, since we
have a chicken and egg problem between running FUSE and setting up the
loop device - so we need to do them concurrently, until the loopdev is
assigned, at which point we can report success and daemonize, and then
continue polling the FUSE loop future.
A loopdev module is added to tools containing all required functions for
mapping a loop device to the FUSE file, with the ioctls moved into an
inline module to avoid exposing them directly.
The client code is placed in the 'mount' module, which, while
admittedly a loose fit, allows reuse of the daemonizing code.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
if the archive file does not exist yet, we cannot rotate it, but it's not
actually an error, so just return Ok(false) to indicate no rotation took
place
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
while we probably do not add much more to them, it still looks ugly.
If this was made so that adding a World readable API call is "hard"
and not done by accident, it rather should be done as a test on build
time. But, IMO, the API permission schema definitions are easy to
review, and not often changed/added - so any wrong World readable API
call will normally still caught.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
In theory, one can do std::mem::forget, and ignore the drop handler. With
the lifetime hack, this could result in a crash.
So we simply require 'static lifetime now (futures also needs that).
Causes a panic if last_update is smaller than RRD_DATA_ENTRIES*reso,
which (I believe) can happen when inserting the first value for a DB.
Clamp the value to 0 in that case.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Fix a potential bug where errors that happen after the SendHandle has
been dropped while doing the thread join might have been ignored.
Requires internal check_abort to be moved out of 'impl SendHandle' since
we only have the Mutex left, not the SendHandle.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
This can slow things down by a lot on setups with (relatively) high
seek time, in the order of doubling the backup times if cache isn't
populated with the last backups chunk inode info.
Effectively there's nothing known this protects us from in the
codebase. The only thing which was theorized about was the case
where a really long running backup job (over 24 hours) is still
running and writing new chunks, not indexed yet anywhere, then an
update (or manual action) triggers a reload of the proxy. There was
some theory that then a GC in the new daemon would not know about the
oldest writer in the old one, and thus use a less strict atime limit
for chunk sweeping - opening up a window for deleting chunks from the
long running backup.
But, this simply cannot happen as we have a per datastore process
wide flock, which is acquired shared by backup jobs and exclusive by
GC. In the same process GC and backup can both get it, as it has a
process locking granularity. If there's an old daemon with a writer,
that also has the lock open shared, and so no GC in the new process
can get exclusive access to it.
So, with that confirmed we have no need for a "half-assed"
verification in the backup finish step. Rather, we plan to add an
opt-in "full verify each backup on finish" option (see #2988)
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
We forgot to put braces around the DNS_NAME regex, and in
DNS_NAME_OR_IP_REGEX
this is wrong because the regex
^foo|bar$
matches 'foo' at the beginning and 'bar' at the end, so either
foobaz
bazbar
would match. only
^(foo|bar)$
matches only 'foo' and 'bar'
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
* add square brackets to ipv6 adresses in BackupRepository if they not
already have some (we save them without in the remote config)
* in get_pull_parameters, we now create a BackupRepository first and use
those values (which does the [] mapping), this also has the advantage
that we have one place less were we hardcode 8007 as port
* in the ui, add square brackets for ipv6 adresses for remotes
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
when upgrading from a version where we stored all tasks in the 'active' file,
we did not completly account for finished tasks still there
we should update the file when encountering any finished task in
'active' as well as filter them out on the api call (if they get through)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
this adds the ability to add port numbers in the backup repo spec
as well as remotes, so that user that are behind a
NAT/Firewall/Reverse proxy can still use it
also adds some explanation and examples to the docs to make it clearer
for h2 client i left the localhost:8007 part, since it is not
configurable where we bind to
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
When creating a new zpool for a datastore, also instantiate an
import-unit for it. This helps in cases where '/etc/zfs/zool.cache'
get corrupted and thus the pool is not imported upon boot.
This patch needs the corresponding addition of 'zfs-import@.service' in
the zfsonlinux repository.
Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
we need this, because we append the port to this to get a target url
e.g. we print
format!("https://{}:8007/", address)
if address is now an ipv6 (e.g. fe80::1) it would become
https://fe80::1:8007/ which is a valid ipv6 on its own
by using square brackets we get:
https://[fe80::1]:8007/ which now connects to the correct ip/port
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
since len() and MAX_INDEX_TASKS are both usize, they underflow
instead of getting negative values
instead check the sizes and set them accordingly
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
this starts a task once a day at "00:00" that rotates the task log
archive if it is bigger than 500k
if we want, we can make the schedule/size limit/etc. configurable,
but for now it's ok to set fixed values for that
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
since there are no users of this anymore and we now have a nicer
TaskListInfoIterator to use, we can drop this function
this also means that 'update_active_workers' does not need to return
a list anymore since we never used that result besides in
read_task_list
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
this means that limiting with epoch now works correctly
also change the api type to i64, since that is what the starttime is
saved as
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
this makes the filtering/limiting much nicer and readable
since we now have potentially an 'infinite' amount of tasks we iterate over,
and cannot now beforehand how many there are, we return the total count
as always 1 higher then requested iff we are not at the end (this is
the case when the amount of entries is smaller than the requested limit)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
this is an iterator that reads/parses/updates the task list as
necessary and returns the tasks in descending order (newest first)
it does this by using our logrotate iterator and using a vecdeque
we can use this to iterate over all tasks, even if they are in the
archive and even if the archive is logrotated but only read
as much as we need
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
instead of removing tasks beyond the 1000 that are in the index
write them into an archive file by appending them at the end
this way we can later still read them
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
one for only the active tasks and one for up to 1000 finished tasks
factor out the parsing of a task file (we will later need this again)
and use iterator combinators for easier code
we now sort the tasks ascending (this will become important in a later patch)
but reverse (for now) it to keep compatibility
this code also omits the converting into an intermittent hash
since it cannot really happen that we have duplicate tasks in this list
(since the call is locked by an flock, and it is the only place where we
write into the lists)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
this is a helper to rotate and iterate over log files
there is an iterator for open filehandles as well as
only the filename
also it has the possibilty to rotate them
for compression, zstd is used
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
also changes:
* correct comment about reset (replace 'sync' with 'action')
* check schedule change correctly (only when it is actually changed)
with this changes, we can drop the 'lookup_last_worker' method
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
we rely on the jobstate handling to write the error of the worker
into its state file, but we used '?' here in a block which does not
return the error to the block, but to the function/closure instead
so if a prune job failed because of such an '?', we did not write
into the statefile and got a wrong state there
instead use our try_block! macro that wraps the code in a closure
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
like the sync jobs, so that if an admin configures a schedule it
really starts the next time that time is reached not immediately
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
listing, updating or deleting a user is now possible for the user
itself, in addition to higher-privileged users that have appropriate
privileges on '/access/users'.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
filtered by those they are privileged enough to read individually. this
allows such users to configure prune/GC schedules via the GUI (the API
already allowed it previously).
permission-wise, a user with this privilege can already:
- list all stores they have access to (returns just name/comment)
- read the config of each store they have access to individually
(returns full config of that datastore + digest of whole config)
but combines them to
- read configs of all datastores they have access to (returns full
config of those datastores + digest of whole config)
user that have AUDIT on just /datastore without propagate can now no
longer read all configurations (but this could be added it back, it just
seems to make little sense to me).
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
by packing the auth into a RwLock and starting a background
future that renews the ticket every 15 minutes
we still use the BroadcastFuture for the first ticket and only
if that is finished we start the scheduled future
we have to store an abort handle for the renewal future and abort it when
the http client is dropped, so we do not request new tickets forever
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
like we do for PVE. this is visible on the dashboard, and caused 403 on
each update which bothers me when looking at the dev console.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
A client can omit uploading chunks in the "known_chunks" list, those
then also won't be written on the server side. Check all those chunks
mentioned in the index but not uploaded for existance and report an
error if they don't exist instead of marking a potentially broken backup
as "successful".
This is only important if the base snapshot references corrupted chunks,
but has not been negatively verified. Also, it is important to only
verify this at the end, *after* all index writers are closed, since only
then can it be guaranteed that no GC will sweep referenced chunks away.
If a chunk is found missing, also mark the previous backup with a
verification failure, since we know the missing chunk has to referenced
in it (only way it could have been inserted to known_chunks with
checked=false). This has the benefit of automatically doing a
full-upload backup if the user attempts to retry after seeing the new
error, instead of requiring a manual verify or forget.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Do not allow clients to reuse chunks from the previous backup if it has
a failed validation result. This would result in a new "successful"
backup that potentially references broken chunks.
If the previous backup has not been verified, assume it is fine and
continue on.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
- remove chrono dependency
- depend on proxmox 0.3.8
- remove epoch_now, epoch_now_u64 and epoch_now_f64
- remove tm_editor (moved to proxmox crate)
- use new helpers from proxmox 0.3.8
* epoch_i64 and epoch_f64
* parse_rfc3339
* epoch_to_rfc3339_utc
* strftime_local
- BackupDir changes:
* store epoch and rfc3339 string instead of DateTime
* backup_time_to_string now return a Result
* remove unnecessary TryFrom<(BackupGroup, i64)> for BackupDir
- DynamicIndexHeader: change ctime to i64
- FixedIndexHeader: change ctime to i64
since converting from i64 epoch timestamp to DateTime is not always
possible. previously, passing invalid backup-time from client to server
(or vice-versa) panicked the corresponding tokio task. now we get proper
error messages including the invalid timestamp.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
otherwise operations like catalog shell panic when viewing pxar archives
containing such entries, e.g. with mtime very far ahead into the future.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
by either printing the original, out-of-range timestamp as-is, or
bailing with a proper error message instead of panicking.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
even if it can't be handled by chrono. silently replacing it with epoch
0 is confusing..
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
else we get the default of 16k, which is quite low for our use case.
this improves the TLS upload benchmark speed by about 30-40% for me.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
fixes the error, "manifest does not contain
file 'X.pxar'", that occurs when trying to mount
a pxar archive with 'proxmox-backup-client mount':
Signed-off-by: Dylan Whyte <d.whyte@proxmox.com>
similar to the other fix, if we do not set the buffer size manually,
we get better performance for high latency connections
restore benchmark from f.gruenbicher:
no delay, without patch: ~50MB/s
no delay, with patch: ~50MB/s
25ms delay, without patch: ~11MB/s
25ms delay, with path: ~50MB/s
my own restore benchmark:
no delay, without patch: ~1.5GiB/s
no delay, with patch: ~1.5GiB/s
25ms delay, without patch: 30MiB/s
25ms delay, with patch: ~950MiB/s
for some more details about those benchmarks see
https://lists.proxmox.com/pipermail/pbs-devel/2020-September/000600.html
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
by leaving the buffer sizes on default, we get much better tcp performance
for high latency links
throughput is still impacted by latency, but much less so when
leaving the sizes at default.
the disadvantage is slightly higher memory usage of the server
(details below)
my local benchmarks (proxmox-backup-client benchmark):
pbs client:
PVE Host
Epyc 7351P (16core/32thread)
64GB Memory
pbs server:
VM on Host
1 Socket, 4 Cores (Host CPU type)
4GB Memory
average of 3 runs, rounded to MB/s
| no delay | 1ms | 5ms | 10ms | 25ms |
without this patch | 230MB/s | 55MB/s | 13MB/s | 7MB/s | 3MB/s |
with this patch | 293MB/s | 293MB/s | 249MB/s | 241MB/s | 104MB/s |
memory usage (resident memory) of proxmox-backup-proxy:
| peak during benchmarks | after benchmarks |
without this patch | 144MB | 100MB |
with this patch | 145MB | 130MB |
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
We need to update the atime of chunk files if they already exist,
otherwise a concurrently running GC could sweep them away.
This is protected with ChunkStore.mutex, so the fstat/unlink does not
race with touching.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
The iterator of get_chunk_iterator is extended with a third parameter
indicating whether the current file is a chunk (false) or a .bad file
(true).
Count their sizes to the total of removed bytes, since it also frees
disk space.
.bad files are only deleted if the corresponding chunk exists, i.e. has
been rewritten. Otherwise we might delete data only marked bad because
of transient errors.
While at it, also clean up and use nix::unistd::unlinkat instead of
unsafe libc calls.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
This ensures that following backups will always upload the chunk,
thereby replacing it with a correct version again.
Format for renaming is <digest>.<counter>.bad where <counter> is used if
a chunk is found to be bad again before a GC cleans it up.
Care has been taken to deliberately only rename a chunk in conditions
where it is guaranteed to be an error in the chunk itself. Otherwise a
broken index file could lead to an unwanted mass-rename of chunks.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
a range from high to low in rust results in an empty range
(see std::ops::Range documentation)
so we need to generate the range from 0..data.len() and then reverse it
also, the task log contains a newline at the end, so we have to remove
that (should it exist)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
this implements parsing and calculating calendarevents that have a
basic date component (year-mon-day) with the usual syntax options
(*, ranges, lists)
and some special events:
monthly
yearly/annually (like systemd)
quarterly
semiannually,semi-annually (like systemd)
includes some regression tests
the ~ syntax for days (the last x days of the month) is not yet
implemented
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
instead of using 'as' and silently converting wrong,
use the TryInto trait and raise an error if we cannot convert
this should only happen if we have a negative year,
but this is expected (we do not want schedules from before the year 0)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
add_* are modeled after add_days
subtract one for set_mon to have a consistent interface for all fields
(i.e. getter/setter return/expect the 'real' number, not the ones
in the tm struct)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
the tm struct contains the year - 1900 but we added that
if we want to use the libc normalization correctly, the tm struct
must have the correct year in it, else the computations for timezones,
etc. fail
instead add a getter that adds the years and a setter that subtracts it again
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
if we give multiple options/ranges for a value, e.g.
2,4,8
we always choose the biggest, instead of the smallest that is next
this happens because in DateTimeValue::find_next(value)
'next' can be set multiple times and we set it when the new
value was *bigger* than the last found 'next' value, when in reality
we have to choose the *smallest* next we can find
reverse the comparison operator to fix this
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
we never passed 'false' to it anyway so remove it
(we can add it again if we should ever need it)
also remove the adding of wday (gets normalized anyway)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
we want to use dates for the calendarspec, and with that there are some
impossible combinations that cannot be detected during parsing
(e.g. some datetimes do not exist in some timezones, and the timezone
can change after setting the schedule)
so finding no timestamp is not an error anymore but a valid result
we omit logging in that case (since it is not an error anymore)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
mktime/gmtime can normalize time and even can handle special timezone
cases like the fact that the time 2:30 on specific day/timezone combos
do not exists
we have to convert the signature of all functions that use
normalize_time since mktime/gmtime can return an EOVERFLOW
but if this happens there is no way we can find a good time anyway
since normalize_time will always set wday according to the rest of the
time, remove set_wday
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
while it was correct, there was no measurable speed gain
(a benchmark yielded 2.8 ms for a spec that did not find a timestamp either way)
so remove it for simpler code
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
when trying to parse the task status, we seek 8k from the end
which may be into the middle of a line, so the datetime parsing
can fail (when the log message contains ': ')
This patch does a fast search for the last line, and avoid the
'lines' iterator.
for datastores where the requesting user has read or write permissions,
since the API method itself filters by that already. this is the same
permission setting and filtering that the datastore list API endpoint
does.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Because if not, the backups it creates have bogus permissions and may
seem like they got broken once the daemon is started again with the
correct user/group.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Save the state ("ok" or "failed") and the UPID of the respective
verify task. With this we can easily allow to open the relevant task
log and show when the last verify happened.
As we already load the manifest when listing the snapshots, just add
it there directly.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
It's a string-type.
Implement Serialize via Display, Deserialize via FromStr and
add an API_SCHEMA so that it can be used as a type within
the #[api] macro.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
to avoid `map_struct` which is actually unsafe because it
does not verify alignment constraints at all
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
There is no requirement to have at least
a blank line, attribute or comment in between two
interface definitions, e.g.
iface lo inet loopback
iface lo inet6 loopback
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
it really is not necessary, since the only time we are interested in
loading the state from the file is when we list it, and there
we use JobState::load directly to avoid the lock
we still need to create the file on syncjob creation though, so
that we have the correct time for the schedule
to do this we add a new create_state_file that overwrites it on creation
of a syncjob
for safety, we subtract 30 seconds from the in-memory state in case
the statefile is missing
since we call create_state_file from proxmox-backup-api,
we have to chown the lock file after creating to the backup user,
else the sync job scheduling cannot aquire the lock
also we remove the lock file on statefile removal
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
so that we can log if triggered by a schedule, and writing to a jobstatefile
also correctly polls now the abort_future of the worker, so that
users can stop a sync
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
and move the pull parameters into the worker, so that the task log
contains the error if there is one
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
this is intended to be a generic helper to (de)serialize job states
(e.g., sync, verify, and so on)
writes a json file into '/var/lib/proxmox-backup/jobstates/TYPE-ID.json'
the api creates the directory with the correct permissions, like
the rrd directory
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
the endtime should be the timestamp of the last log line
or if there is no log at all, the starttime
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
representing a state via an enum makes more sense in this case
we also implement FromStr and Display to make it easy to convet from/to
a string
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
To prevent forgetting the base snapshot of a running backup, and catch
the case when it still happens (e.g. via manual rm) to at least error
out instead of storing a potentially invalid backup.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
This reverts commit d53fbe2474.
The HashSet and "register" function are unnecessary, as we already know
which backup is the one we need to check: the last one, stored as
'last_backup'.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
An flock on the snapshot dir itself is used in addition to the group dir
lock. The lock is used to avoid races with forget and prune, while
having more granularity than the group lock (i.e. the group lock is
necessary to prevent more than one backup per group, but the snapshot
lock still allows backups unrelated to the currently running to be
forgotten/pruned).
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Attempt to lock the backup directory to be deleted, if it works keep the
lock until the deletion is complete. This way we ensure that no other
locking operation (e.g. using a snapshot as base for another backup) can
happen concurrently.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
an encrypted Index should never reference a plain-text chunk, and an
unencrypted Index should never reference an encrypted chunk.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
these checks were already in place for regular downloading of backed up
files, also do them when attempting to decode a catalog, or when
downloading decoded files referenced by a pxar index.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
If the datastore holds broken backups for some reason, do not attempt to
base following snapshots on those. This would lead to an error on
/previous, leaving the client no choice but to upload all chunks, even
though there might be potential for incremental savings.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
When uploading an RSA encoded key alongside the backup,
the backup would fail with the error message: "wrong blob
file extension".
Adding the '.blob' extension to rsa-encrypted.key before the
the call to upload_blob_from_data(), rather than after, fixes
the issue.
Signed-off-by: Dylan Whyte <d.whyte@proxmox.com>
Commit 9fa55e09 "finish_backup: test/verify manifest at server side"
moved the finished-marking above some checks, which means if those fail
the backup would still be marked as successful on the server.
Revert that part and comment the line for the future.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
instead of bailing and stopping the entire GC process, warn about the
missing chunks and continue.
this results in "TASK WARNINGS: X" as the status.
Signed-off-by: Oguz Bektas <o.bektas@proxmox.com>
Used chunks are marked in phase1 of the garbage collection process by
using the atime property. Each used chunk gets touched so that the atime
gets updated (if older than 24h, see relatime).
Should there ever be a situation in which the phase1 in the GC run needs
a very long time to finish, it could happen that the grace period
calculated in phase2 is not long enough and thus the marking of the
chunks (atime) becomes invalid. This would result in the removal of
needed chunks.
Even though the likelyhood of this happening is very low, using the
timestamp from right before phase1 is started, to calculate the grace
period in phase2 should avoid this situation.
Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
and not just of previously synced ones.
we can't use BackupManifest::verify_file as the archive is still stored
under the tmp path at this point.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
for encrypted chunks this is currently not possible, as we need the key
to decode the chunk.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
regular chunks are only decoded when their contents are accessed, in
which case we need to have the key anyway and want to verify the digest.
for blobs we need to verify beforehand, since their checksums are always
calculated based on their raw content, and stored in the manifest.
manifests are also stored as blobs, but don't have a digest in the
traditional sense (they might have a signature covering parts of their
contents, but that is verified already when loading the manifest).
this commit does not cover pull/sync code which copies blobs and chunks
as-is without decoding them.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Errors while applying metadata will not be considered fatal
by default using `pxar extract` unless `--strict` was passed
in which case it'll bail out immediately.
It'll still return an error exit status if something had
failed along the way.
Note that most other errors will still cause it to bail out
(eg. errors creating files, or I/O errors while writing
the contents).
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
'username' here is without realm, but we really want to use user@realm
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
tokios kill_on_drop sometimes leaves zombies around, especially
when there is not another tokio::process::Command spawned after
so instead of relying on the 'kill_on_drop' feature, we explicitly
kill the child on a worker abort. to be able to do this
we have to use 'tokio::select' instead of 'futures::select' since
the latter requires the future to be fused, which consumes the
child handle, leaving us no possibility to kill it after fusing.
(tokio::select does not need the futures to be fused, so we
can reuse the child future after the select again)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
so that we can print a list at the end of the worker which backups
are corrupt.
this is useful if there are many snapshots and some in between had an
error. Before this patch, the task log simply says to 'look in the logs'
but if the log is very long it makes it hard to see what exactly failed.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
this makes it easier to see which chunks are corrupt
(and enables us in the future to build a 'complete' list of
corrupt chunks)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
The extraction algorithm has a state (bool) indicating
whether we're currently in a positive or negative match
which has always been initialized to true at the beginning,
but when the user provides a `--pattern` argument we need to
start out with a negative match.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
This should never trigger if everything else works correctly, but it is
still a very cheap check to avoid wrongly marking a backup as "OK" when
in fact some chunks might be missing.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Multiple backups within one backup group don't really make sense, but
break all sorts of guarantees (e.g. a second backup started after a
first would use a "known-chunks" list from the previous unfinished one,
which would be empty - but using the list from the last finished one is
not a fix either, as that one could be deleted or pruned once the first
simultaneous backup is finished).
Fix it by only allowing one backup per backup group at one time. This is
done via a flock on the backup group directory, thus remaining intact
even after a reload.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
To prevent a race with a background GC operation, do not allow deletion
of backups who's index might currently be referenced as the "known chunk
list" for successive backups. Otherwise the GC could delete chunks it
thinks are no longer referenced, while at the same time telling the
client that it doesn't need to upload said chunks because they already
exist.
Additionally, prevent deletion of whole backup groups, if there are
snapshots contained that appear to be currently in-progress. This is
currently unlikely to trigger, as that function is only used for sync
jobs, but it's a useful safeguard either way.
Deleting a single snapshot has a 'force' parameter, which is necessary
to allow deleting incomplete snapshots on an aborted backup. Pruning
also sets force=true to avoid the check, since it calculates which
snapshots to keep on its own.
To avoid code duplication, the is_finished method is factored out.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
Also swap the order of a couple of `.map_err().await` to
`.await.map_err()` since that's generally more efficient.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
a blob can be empty (e.g. an empty pct fw conf), so we
have to set the minimum size to the header size
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
And make verify_crc private for now. We always call load_from_reader() to
verify the CRC.
Also add load_chunk() to datastore.rs (from chunk_store::read_chunk())