proxmox-backup-proxy: stop accept() loop on daemon shutdown
On reload the old process hands over to the new process but needs to keep running until all its worker tasks are finished, to avoid breaking an in-progress action like an xterm.js web shell or a backup creation/restore.

During that wait time the receiving channel was already closed, but the TCP socket accept listener was left active by mistake. Paired with `SO_REUSEPORT` being set on the underlying socket, this made the kernel choose either the old or the new process for each new incoming connection; both were still listening, after all, and reuse-port with multiple processes is often used as a load-balancing mechanism.

As the old proxy accepted connections but no longer processed them, one could observe sporadic connection failures on any API call (or rather, on any new connection to the proxy), depending on which process got the connection assigned.

The fix is to stop accepting new connections once we shut down: poll the shutdown_future too during accept, and exit the accept loop on shutdown.

Note: Neither this part of the code nor other parts that could influence it were changed recently, so it is still unclear why this pops up only now.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Co-authored-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[ T: add more (root cause) info and reword a bit ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
parent 8550de7403
commit 20814a3986
@@ -394,14 +394,18 @@ async fn accept_connection(
     sender: tokio::sync::mpsc::Sender<ClientStreamResult>,
 ) {
     let accept_counter = Arc::new(());
+    let mut shutdown_future = proxmox_rest_server::shutdown_future().fuse();
 
     loop {
-        let (sock, peer) = match listener.accept().await {
-            Ok(conn) => conn,
-            Err(err) => {
-                eprintln!("error accepting tcp connection: {}", err);
-                continue;
-            }
+        let (sock, peer) = select! {
+            res = listener.accept().fuse() => match res {
+                Ok(conn) => conn,
+                Err(err) => {
+                    eprintln!("error accepting tcp connection: {}", err);
+                    continue;
+                }
+            },
+            _ = shutdown_future => break,
         };
 
         sock.set_nodelay(true).unwrap();
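To illustrate the idea behind the fix outside of the async proxy code, here is a minimal, std-only sketch of an accept loop that stops accepting once shutdown is requested. It is a synchronous analog, not the code from this commit: the function name `accept_until_shutdown`, the `AtomicBool` flag (standing in for `shutdown_future`), and the 5 ms polling interval are all assumptions made for this example.

```rust
use std::io;
use std::net::TcpListener;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: accept connections until `shutdown` is set, then
/// return how many were accepted. Returning drops `listener`, which closes
/// the listening socket so the kernel stops handing connections to it.
fn accept_until_shutdown(listener: TcpListener, shutdown: Arc<AtomicBool>) -> io::Result<usize> {
    // Non-blocking mode lets the loop re-check the flag between accept calls.
    listener.set_nonblocking(true)?;
    let mut accepted = 0;

    loop {
        // Rough equivalent of the `_ = shutdown_future => break` select arm.
        if shutdown.load(Ordering::SeqCst) {
            break;
        }
        match listener.accept() {
            // A real server would hand the socket to a worker here.
            Ok((_sock, _peer)) => accepted += 1,
            // Nothing pending: wait briefly (assumed interval), then poll again.
            Err(err) if err.kind() == io::ErrorKind::WouldBlock => {
                thread::sleep(Duration::from_millis(5));
            }
            // Same handling as in the diff: log the error and keep looping.
            Err(err) => eprintln!("error accepting tcp connection: {}", err),
        }
    }

    Ok(accepted)
}
```

One way to use it is to run the function on its own thread with a listener bound to `127.0.0.1:0`, then set the flag to make it return. The async version in the diff avoids the polling interval entirely, because `select!` wakes up as soon as either the accept or the shutdown future is ready.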