fatal error: unexpected signal during runtime execution #1481
Comments
The log reports that this is a fatal error, but it looks like cAdvisor still starts up. Are you actually seeing any issues from this, other than the scary log message? It looks like a segfault from deep in the network package (probably something wrong with the glibc version we patch into alpine), but if the panic is recovered somewhere, it should only affect rkt, which it doesn't look like you're running with. Also, what OS / distro are you running on? I couldn't reproduce on my local workstation. |
I assumed this was because we have the Docker restart policy set to
Ubuntu 14.04 LTS w/ Kernel at 3.13.0-24-generic. |
@timstclair I confirmed that the retries are from cAdvisor and not due to our Docker restart policy. So it definitely recovers, but these retries account for almost 10 seconds added to startup:
This came to our attention because we have health checks on the cAdvisor container after we start it that were failing (We hit /api/ looking for 'Supported API versions', 3 retries, exponential backoff starting at 1 second). 10 seconds is pretty significant in terms of container startup, so #1483 (comment) would definitely be a welcome addition :) |
I'd be surprised if that error was contributing more than a couple seconds (though I could be wrong). Do you have a lot of containers running? If there are a lot of containers running on the system, cAdvisor can take a little while to load them before it starts. |
@timstclair I don't see the same 10 second delay when using cAdvisor 0.23.8. Using cAdvisor 0.23.8, the logs after a
Only 3 containers on this machine. Another update: I can't cause the delay/fault behavior in cAdvisor 0.24.0 consistently. It happens on some restarts, but not on others :/ |
Hi, I see the same when running the binary directly on CoreOS 1122.2.0, and cAdvisor does not start
|
Version 23.02 does work |
hm this still happens, any idea? |
This also happens while running 0.24.1 standalone on CentOS 7.1.1503, kernel 3.10.0-229.11.1.el7.x86_64. Works with 0.23.8. |
Also 0.24.0-alpha1 and 0.24.0 have the same issue. |
Interestingly, it works on 0.24.1 if I run |
Ok, correction - it seems pretty random if it works or not. 30% of our (identical) machines it works, the rest fail with |
@carlpett I think you're onto something, because we don't consistently see this either. |
Looks like the same issue as prometheus/alertmanager#267, which includes a possible fix. |
@timstclair You mean building with netgo? |
@timstclair @carlpett @amcrn building with -tags netgo fixes the problem |
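For context on why the tag helps: the `netgo` build tag makes the `net` package use Go's own DNS resolver instead of calling into the C library through cgo, which is where the segfault in this issue appears to come from. The sketch below is not the cAdvisor fix itself; it only illustrates the same choice made in code for a single resolver via `net.Resolver.PreferGo`. The helper name `newGoResolver` and the `localhost` lookup are made up for the example.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// newGoResolver returns a resolver that prefers Go's built-in DNS client
// over the cgo (libc) resolver, on platforms where the Go one is available.
// Hypothetical helper, for illustration only.
func newGoResolver() *net.Resolver {
	return &net.Resolver{PreferGo: true}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	// Resolve a name without going through the C library's resolver.
	addrs, err := newGoResolver().LookupHost(ctx, "localhost")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("localhost resolves to", addrs)
}
```

Building with `-tags netgo` applies this to the whole binary at compile time, so it is the lower-effort option when you control the build.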
Given the above, shouldn't the official cAdvisor Docker images now be built with the netgo tag? |
any news here? |
We are seeing this issue occur for random users of cAdvisor out in the wild, even on very common Linux distros like Ubuntu 14.04 LTS. I can also confirm that adding |
I've been having a similar issue for 2 days now. cAdvisor worked at first, but after the second docker-compose down followed by docker-compose up it stopped working. Below is some info about the environment
Below the docker-compose file
Below logs being generated
Below the output of docker inspect
|
same cadvisor issue happening on CentOS Linux release 7.3.1611 vagrant host. |
I hit the same problem on Red Hat Enterprise Linux Server release 7.3. I have tried the pre-compiled 0.24.1 binary and also built version 0.24.2 myself (on Arch Linux with go1.8 amd64). Interestingly, when running cadvisor inside gdb, it does not crash. Also, when running the official docker image it works too.
|
The same error happens with version 0.25.0. But, I have recompiled cadvisor dynamically on the host system and now it works. 🎉
|
Using the official image and installing findutils lets it run normally. Dockerfile:
|
This is to work around google#1481
+1 fixed by adding "-tags netgo" on ubuntu 16.04 with go1.8.1 |
Confirmed. Setting
|
I have the same issue on Fedora 25 when running:
$ sudo docker run --volume=/:/rootfs:ro --volume=/var/run:/var/run:rw --volume=/sys:/sys:ro --volume=/var/lib/docker/:/var/lib/docker:ro --publish=8080:8080 --name=cadvisor --rm --privileged=true google/cadvisor:latest
The container is able to start without |
Same problem on "CentOS Linux release 7.3.1611 (Core) 3.10.0-514.21.2.el7.x86_64" with the cAdvisor v0.26.1 binary. Confirmed: with "GODEBUG=netdns=go ./cadvisor" it runs without any issue. |
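If you cannot rebuild the binary, the `GODEBUG=netdns=go` workaround can also be applied from a small wrapper that starts cAdvisor with that variable set. A rough sketch, assuming the binary path is passed as the first argument; the `withGoResolver` helper is made up for this example, and it deliberately discards any GODEBUG value already in the environment to keep the logic short.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// withGoResolver returns a copy of env with GODEBUG set to netdns=go.
// Assumption for this sketch: any pre-existing GODEBUG entry is dropped
// rather than merged.
func withGoResolver(env []string) []string {
	out := make([]string, 0, len(env)+1)
	for _, kv := range env {
		if !strings.HasPrefix(kv, "GODEBUG=") {
			out = append(out, kv)
		}
	}
	return append(out, "GODEBUG=netdns=go")
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: launch /path/to/cadvisor [args...]")
		return
	}

	// Run the given binary with the adjusted environment and pass output through.
	cmd := exec.Command(os.Args[1], os.Args[2:]...)
	cmd.Env = withGoResolver(os.Environ())
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "cadvisor exited:", err)
		os.Exit(1)
	}
}
```

This has the same effect as the shell form quoted above; the wrapper is only useful where the launch command itself is hard to change.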
Follow-up: I have just upgraded to cAdvisor 0.26.1 and am now also seeing this issue when running cAdvisor on the host machine from systemd.
Mitigated by using |
I have the same issue on Ubuntu 18.0 |
Docker:
Docker version 1.11.2, build b9f10c9
cAdvisor:
0.24.0
Every so often when creating a cAdvisor container, the following panic occurs:
Output of docker inspect cadvisor:
This seems to be unique to cAdvisor 0.24.0, because we had not seen this in the 0.23.x series.