Packet loss is now making my Hetzner box almost unusable. Does anyone know much about mitigating this?
About a month ago I started getting massive packet loss between my residence in Texas and my Hetzner box in Falkenstein. Wireshark shows lots of TCP Out-Of-Order and TCP Dup ACK, and most of the time I get less than 1Mbit throughput from Hetzner to Texas. The traceroute from Hetzner goes via telia.net and then Comcast. Outbound from Texas goes through core-backbone.
I tried playing with the TCP window memory size and it didn’t help, and I don’t think that’s the issue anyway, having looked at other long-distance connections that work without problems.
To confuse things further, I get 100Mbit from Hetzner Helsinki without problems. The IPv4 traceroute to there is via telia.net and shows almost no packet loss. IPv6 goes via core-backbone and shows lots of packet loss, but evidently not enough to be noticeable, since throughput is still 100Mbit+.
So I don’t know what else to try and I’d hate to give up my Hetzner box in Falkenstein because it’s been great and very cheap for over a year now. And it used to work great for streaming and even a slightly laggy desktop VM or two.
Anyone got any ideas to resolve this? Or is it just “Well, that’s the internet. Tough luck.”
I did that only for the TCP Window memory up to 6MB max.
Not sure how to manage the SYN and SYNACK values; I’ll research that.
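On the TCP window memory point, a rough bandwidth-delay product check can show whether a 6MB buffer cap is even plausible as the bottleneck. This is only a sketch: the 130 ms round-trip time is an assumed figure for Texas to Falkenstein (nobody in the thread posted an RTT), and the function names are made up for illustration.

```python
def bdp_bytes(bandwidth_mbit: float, rtt_ms: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the path."""
    return bandwidth_mbit * 1_000_000 / 8 * (rtt_ms / 1000)


def max_throughput_mbit(window_bytes: float, rtt_ms: float) -> float:
    """Upper bound on single-flow throughput for a given window, ignoring loss."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1_000_000


RTT_MS = 130.0                    # ASSUMPTION: not measured in this thread
WINDOW_BYTES = 6 * 1024 * 1024    # the 6MB max mentioned above

# Window needed to sustain 100Mbit at the assumed RTT (about 1.6 MB).
print(f"needed for 100Mbit: {bdp_bytes(100, RTT_MS) / 1e6:.1f} MB")

# Ceiling a 6MB window would impose at the assumed RTT (about 387 Mbit/s).
print(f"6MB window ceiling: {max_throughput_mbit(WINDOW_BYTES, RTT_MS):.0f} Mbit/s")
```

If the real RTT is anywhere near that, a 6MB window is well above what 100Mbit needs, which fits with the window tuning making no difference and points back at loss on the path.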
I think it’s from all of Hetzner Germany. I’ve tried pulling the 100MB test file from Hetzner and I get the same problems. I’ve also tried cloud servers in Nuremberg and Falkenstein and had the same issues.
Best you can do is take some mtr tests and put in a ticket with Hetzner about the issues you’re having. Maybe they can alter something in the peering or take a look at what might be causing it. I know they did a lot of network twiddling to get things right when they launched Helsinki and were responsive to weird routing reports while that was being sorted, etc.
The reason I bring this up is that I want you to be sure it isn’t just your ISP or an upstream provider they’re using. This could be a call to your ISP before a ticket to Hetzner. I don’t know if you’ve ruled that out, but it’s worth treating as a possibility before you open the ticket.
You see that high packet loss at point 3. In this case it’s okay because it doesn’t carry forward and the end point still has 0% loss. So this means the level3 routers are not responding to ICMP (go figure, level3). If it carried forward from that point all the way to the end, that’s how you’d know it was most likely the point of blame.
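The "does it carry forward" rule above can be written down as a small check over the per-hop Loss% column of an mtr report. This is a minimal sketch, not a parser for mtr output: the function name, the 1% threshold, and the sample numbers are all hypothetical.

```python
from typing import Optional, Sequence


def first_persistent_loss_hop(loss_pct: Sequence[float],
                              threshold: float = 1.0) -> Optional[int]:
    """Return the 1-based hop where loss begins and persists through the
    final hop, or None if the final hop shows no loss."""
    if not loss_pct or loss_pct[-1] < threshold:
        # The end point is clean, so loss at earlier hops is most likely
        # routers rate limiting or ignoring ICMP.
        return None
    hop = len(loss_pct)
    # Walk backwards for as long as the loss is still present.
    while hop > 1 and loss_pct[hop - 2] >= threshold:
        hop -= 1
    return hop


# Hypothetical Loss% columns, one value per hop.
icmp_limited = [0.0, 0.0, 60.0, 0.0, 0.0, 0.0]    # loss only at hop 3
carried_loss = [0.0, 0.0, 0.0, 12.0, 11.0, 13.0]  # loss from hop 4 to the end

print(first_persistent_loss_hop(icmp_limited))   # None
print(first_persistent_loss_hop(carried_loss))   # 4
```

The hop it returns is only the most likely place to start looking. Since the forward and return paths differ here, it is worth running the same check on an mtr from each end.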
Things you probably already know but I like to leave around for readers.
(all with the same second hop btw) It seems ok to me
As Jarland pointed out, if it isn’t carried forward it’s ok. It’s not uncommon at all for routers to show some packet loss in traceroutes; it’s usually ICMP rate limiting.
No issues here, I’m on AT&T Fiber in DFW. I did notice SSH lagging more than usual recently, but I didn’t check for packet loss at the time, so I don’t know if it was loss or not (maybe it was just high load on the systems combined with the high latency, idk). Right now there’s no loss after a few minutes of ICMP/TCP tests.
Thank you for running tests/doing some more investigation first. One of the things our networking team often asks for first from customers with these kinds of questions is for mtr in both directions. --Katie
I am not sure myself what they will say here, and if they can help, so I’d suggest writing a support ticket. They focus on answering tickets in their queues and are not active on social media, so please give them the information you’ve already shared here.