Use timeout futures for sections that might hang in certain error
conditions. This is mostly intended to be used as a safeguard, not a
first line of defense - i.e. best-effort avoidance of total hangs.
Not every future used for the HttpClient/H2Client is changed, only those
where a quick response is to be expected. For example, the response
reading futures are left alone, so data transfer is never capped with
timeout, only the initial server connect.
It is also used for upgrading to H2 connections, as that can take a long
time on overloaded servers.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
it seems that sometimes, the child process signal gets handled
before the parent process signal. Systemd then ignores the
childs signal (finished reloading) and only after going into
reloading state because of the parent. this will never finish.
Instead, wait for the state to change to 'reloading' after sending
that signal in the parent, an only fork afterwards. This way
we ensure that systemd knows about the reloading before actually trying
to do it.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Tested-By: Fabian Ebner <f.ebner@proxmox.com>
of ProcessLockSharedGuard.
We use a counter to determine if we can unlock the file again, but
we never actually decremented the writer count, so we held the
lock forever.
This fixes the issue that we could not start a garbage collect after
a reload, as long as the old process is still running, even when that
process has no active backup anymore but another long running task
(e.g. file download, terminal, etc.).
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
document all public things, add some doc links and make some
previously-public things only available for test cases or within the
crate:
previously public, now private:
- AclTreeNode::extract_user_roles (we have extract_roles())
- AclTreeNode::extract_group_roles (same)
- AclTreeNode::delete_group_role (exists on AclTree)
- AclTreeNode::delete_user_role (same)
- AclTreeNode::insert_group_role (same)
- AclTreeNode::insert_user_role (same)
- AclTree::write_config (we have save_config())
- AclTree::load (we have config()/cached_config())
previously public, now crate-internal:
- AclTree::from_raw (only used by tests)
- split_acl_path (used by some test binaries)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
instead of just logging the error. this should never happen in practice
unless someone is messing with the keyfile, in which case, it's better
to abort.
update tests accordingly (wrong fingerprint should fail, no fingerprint
should get the expected one).
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
the RSA key and the encryption key itself are hard-coded to avoid
stalling the test runs because of lack of entropy, they have no special
significance otherwise.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
when restoring an encrypted key, the original one is obviously not
available to check the fingerprint with.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
this fixes the issue that on some filesystems, you cannot recursively
remove a directory when you hold a lock on a file inside (e.g. nfs/cifs)
it is not really backwards compatible (so during an upgrade, there
could be two daemons have the lock), but since the locking was
broken before (see previous patch) it should not really matter
(also it seems very unlikely that someone will trigger this)
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
'lock_manifest' returns a Result<File, Error> so we always got the result,
even when we did not get the lock, but we acted like we had.
bubble the locking error up
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
if no groups were found, the task log was very confusing as it
contained no real information why nothing was synced, e.g.:
Starting datastore sync job 'remote:datastore:local-datastore:s-79412799-e6ee'
Sync datastore 'local-datastore' from 'remote/datastore'
sync job 'remote:datastore:local-datastore:s-79412799-e6ee' end
TASK OK
this patch simply logs how many groups were found and are about to be synced:
Starting datastore sync job 'remote:datastore:local-datastore:s-79412799-e6ee'
Sync datastore 'local-datastore' from 'remote/datastore'
found 0 groups to sync
sync job 'remote:datastore:local-datastore:s-79412799-e6ee' end
TASK OK
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
percentage of verified groups, interpolating based on snapshot count
within the group. in most cases, this will also be closer to 'real'
progress since added snapshots (those which will be verified) in active
backup groups will be roughly evenly distributed, while number of total
snapshots per group will be heavily skewed towards those groups which
have existed the longest, even though most of those old snapshots will
only be re-verified very infrequently.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
BackupInfo::list_backup_groups is identical code-wise, and makes more
sense as entry point for listing groups.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
WalkDir does not follow symlinks by default anyway, and this behaviour
is not documented anywhere. e.g., if a sysadmin mounts 'extra storage'
for some backup group or type (not knowing that only metadata is stored
in those directories), GC will ignore all the indices contained within
and happily garbage collect their chunks..
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
for safety reason, GC finds and marks all index files below the
datastore base path. as a result of regular operations, only index files
within the expected scheme of <TYPE>/<ID>/<TIMESTAMP> should exist.
add a small check + warning if the index list contains index files out
side of this expected scheme, so that an admin with shell access can
investigate.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
we have messages starting the phases anyway, and limit the number of
progress updates so that context remains available at all times.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
before adding more fields to the tuple, let's just create the struct
inside the match arms to improve readability.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
and use this information to add more information to client backup log
and guide the download manifest decision.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
the errors Vec can contain failed groups as well (e.g., if a group has
no or an invalid owner).
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
else users have to manually search through a potentially very long task
log to find the entries that are different.. this is the same summary
printed at the end of a manual verify task.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
from formatting functions to main function, and pass along the key data
lines instead of the full string.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
this is stricter than the check that happened on manifest load, as it
also fails if the manifest is signed but we don't have a key available.
add some additional output at the start of a backup to indicate whether
a previous manifest is available to base the backup on.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
otherwise loading will run into the signature mismatch which is
technically true, but not the complete picture in this case.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
if the manifest is signed/the contained archives/blobs are encrypted.
stored in 'unprotected' area, since there is already a strong binding
between key and manifest via the signature, and this avoids breaking
backwards compatibility for a simple usability improvement.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
and set/generate it on
- key creation
- key passphrase change
- key decryption if not already set
- key encryption with master key
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
since we systemd-encode parts of the upid string, and those can contain
characters that are invalid in urls (e.g. '\'), we have to percent encode
those
add a 'percent_encode_component' helper, so that we can maybe change
the AsciiSet for all uses at the same time
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Simplify the phase 2 code by treating .bad files just like regular
chunks, with the exception of stat logging.
To facilitate, we need to touch .bad files in phase 1. We only do this
under the condition that 1) the original chunk is missing (as before),
and 2) the original chunk is still referenced somewhere (since the code
lives in the error handler for a failed chunk touch, it only gets called
for chunks we expect to be there, i.e. ones that are referenced).
Untouched they will then be cleaned up after 24 hours (or after the last
longer-running task finishes).
Reason 2) is also a fix for .bad files not being cleaned up at all if
the original is no longer referenced anywhere (e.g. a user deleting all
snapshots after seeing some corrupt chunks appear).
cond_touch_path is introduced to touch arbitrary paths in the chunk
store with the same logic as touching chunks.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
unprivileged users should only see the counts related to their part of
the datastore.
while we're at it, switch to a list groups, filter groups, count
snapshots approach (like list_snapshots) to speedup calls to this
endpoint when many unprivileged users share a datastore.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
used in the PBS GUI, but also for PVE usage queries which don't need all
the extra expensive information..
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
by listing groups first, then filtering, then listing group snapshots.
this cuts down the number of openat/getdirents calls for users that just
have a partial view of the datastore.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Useful to avoid the need for a long (and possibly changing) list of include-dev
options in certain situations, e.g. nested ZFS file systems. The option is
already implemented and seems to work as expected. The checks for virtual
filesystems are not affected by this option.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
The patterns from the archive root's .pxarexclude file are already present in
self.patterns when encode_pxarexclude_cli is called. Pass along the number of
CLI patterns and slice accordingly.
Suggested-By: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
previously a .pxarexclude entry in the root of the archive caused the file to
be generated as well, because the patterns are read before calling
generate_directory_file_list and within the function it wasn't possible to
distinguish between a pattern coming from the CLI and a pattern coming from
archive/root/.pxarexclude
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
The documentation states:
.pxarexclude files are treated as regular files and will be included in the
backup archive.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
There is no leading slash in an entry's full_path, causing an anchored
exclude at the root level to fail, e.g. having "/name" as the content of the
file archive/root/.pxarexclude didn't match the file archive/root/name
Fix this by prepending a leading slash before matching.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Add the versions command to proxmox-backup-manager with a similar output
to pveversion [-v]. It prints the packages line by line with only the
package name, followed by the version and, for proxmox-backup and
proxmox-backup-server, some additional information (running kernel,
running version).
In addition it supports the optional output-format parameter which can
be used to print the complete data in either json, json-pretty or text
format. If output-format is specified, the --verbose parameter is
ignored and the detailed list of packages is printed.
With the addition of the versions command, the report is extended as
well.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
Add an optional string field to APTUpdateInfo which can be used for
extra information.
This is used for passing running kernel and running version information
in the versions API call together with proxmox-backup and
proxmox-backup-server.
Signed-off-by: Mira Limbeck <m.limbeck@proxmox.com>
for now this only does the 'postfix' -> 'postfix@-' conversion,
fixes the issue that we only showed the 'postfix' service syslog
(which is rather empty in a default setup) instead of the instance one
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
This patch prints the source of the encryption key when running
operations with proxmox-backup-client.
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
Currently if you generate a default encryption key:
`proxmox-backup-client key create --kdf none`
all backup operations which don't explicitly disable encryption will be
encrypted with this key.
I found it quite surprising, that my backups were all encrypted without
me explicitly specfying neither key nor encryption mode
This patch informs the user when the default key is used (and no
crypt-mode is provided explicitly)
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
when authenticating a token, and not just when authenticating a
user/ticket.
Reported-By: Dominik Jäger <d.jaeger@proxmox.com>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
sd_notify is not synchronous, iow. it only waits until the message
reaches the queue not until it is processed by systemd
when the process that sent such a message exits before systemd could
process it, it cannot be associated to the correct pid
so in case of reloading, we send a message with 'MAINPID=<newpid>'
to signal that it will change. if now the old process exits before
systemd knows this, it will not accept the 'READY=1' message from the
child, since it rejects the MAINPID change
since there is no (AFAICS) library interface to check the unit status,
we use 'systemctl is-active <SERVICE_NAME>' to check the state until
it is not 'reloading' anymore.
on newer systemd versions, there is 'sd_notify_barrier' which would
allow us to wait for systemd to have all messages from the current
pid to be processed before acknowledging to the child, but on buster
the systemd version is to old...
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
log invalid owners to system log, and continue with next group just as
if permission checks fail for the following operations:
- verify store with limited permissions
- list store groups
- list store snapshots
all other call sites either handle it correctly already (sync/pull), or
operate on a single group/snapshot and can bubble up the error.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
for more useful log output
old:
Nov 10 11:50:51 foo pvestatd[3378]: proxmox-backup-client failed: Error: error trying to connect: tcp connect error: No route to host (os error 113)
new:
Nov 10 11:55:21 foo pvestatd[3378]: proxmox-backup-client failed: Error: error trying to connect: error connecting to https://thebackuphost:8007/ - tcp connect error: No route to host (os error 113)
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
similar to what we do for zfs. By bailing before partitioning, the disk is
still considered unused after a failed attempt.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
If a package is or will be installed from the enterprise repo, retrieve
the changelog from there as well (securely via HTTPS and authenticated
with the subcription key).
Extends the get_string method to take additional headers, in this case
used for 'Authorization'. Hyper does not have built-in basic auth
support AFAICT but it's simple enough to just build the header manually.
Take the opportunity and also set the User-Agent sensibly for GET
requests, just like for POST.
Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
to easily check the store of a worker_id
this fixes the issue that one could not filter by type 'syncjob' and
datastore simultaneously
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
very basic, based on API/concepts of PVE one.
Still missing, addint an extra_info string option to APTUpdateInfo
and pass along running kernel/PBS version there.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Clippy complains about the number of paramters we have for
create_archive and it really does need to be made somewhat
less awkward and more usable. For now we just log to stderr
as we previously did. Added todo-comments for this.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
the basedir is already /usr/share/javascript/proxmox-backup/
so adding a subdir of that as alias is not needed
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
with remote Authids, not local Userids.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
if the user/token could have either configured/manually executed the
task, but it was either executed via the schedule (root@pam) or
another user/token.
without this change, semi-privileged users (that cannot read all tasks
globally, but are DatastoreAdmin) could schedule jobs, but not read
their logs once the schedule executes them. it also makes sense for
multiple such users to see eachothers manually executed jobs, as long as
the privilege level on the datastore (or remote/remote_store/local
store) itself is sufficient.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
since the store is not a path parameter, we need to do manual instead of
schema checks. also dropping Datastore.Backup here
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
to allow on-demand scanning of remote datastores accessible for the
configured remote user.
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
instead of await'ing the result of 'create_service' directly,
poll it together with the shutdown_future
if we reached that, fork_restart the new daemon, and await
the open future from 'create_service'
this way the old process still handles open connections until they finish,
while we already start a new process that handles new incoming connections
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
they are not an error and we should retry the read
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
when a file shrunk during backup, we endlessly looped, reading/copying 0 bytes
we already have code that handles shrunk files, but we forgot to
break from the read loop
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
we have information here not available in the access log, especially
if the /api2/extjs formatter is used, which encapsulates errors in a
200 response.
So keep the auth log for now, but extend it use from create ticket
calls to all authentication failures for API calls, this ensures one
can also fail2ban tokens.
Do that logging in a central place, which makes it simple but means
that we do not have the user ID information available to include in
the log.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
add all of our configuration files in /etc/proxmox-backup/ further,
call some ZFS tool to get that status.
Also, use the subscription command form manager, as we often require
more info than the status. Also, adapt formatting a bit.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
those are not in a hot code path, and it is not really much work to
build them on the go..
It may not matther much, but it is unnecessary. Rust will probably
inline most of it anyway..
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
and change all users of the /status/tasks api call to this
with this change we can now delete the /status/tasks api call
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>