Discussion:
[fedora-arm] armhfp builder instability
Florian Weimer
2018-05-15 11:02:46 UTC
Lately, I've seen quite a few spurious build failures. Random SIGBUS is
particularly common, and gcc reports that it cannot reproduce the SIGBUS
in a second compilation, which usually points to a kernel/hardware issue.

The latest problem was a hang during a build (on
buildvm-armv7-07.arm.fedoraproject.org), with this kernel:

Linux buildvm-armv7-07.arm.fedoraproject.org
4.16.6-302.fc28.armv7hl+lpae #1 SMP Tue May 1 23:15:35 UTC 2018 armv7l
armv7l armv7l GNU/Linux

This affects multiple builders, so I suspect a kernel issue rather than
dying hardware because, AFAIK, the machines are independent.

The issue also affects copying out the log files for Koji, so they
probably do not show the actual place of the hang.

Thanks,
Florian
_______________________________________________
arm mailing list -- ***@lists.fedoraproject.org
To unsubscribe send an email to arm-***@lists.fedoraproje
Peter Robinson
2018-05-15 11:06:42 UTC
There's a stability issue post upgrade: the upgrades moved the
underlying hypervisors to RHEL 7.5 and the build VMs to Fedora 28 at
the same time. The issue is known and is being investigated/worked
on.

Peter
Florian Weimer
2018-05-15 16:04:23 UTC
> As I'm facing what seems to be the very same issue
> (https://koji.fedoraproject.org/koji/taskinfo?taskID=26957996), I'd
> like to ask whether there's a ticket/bug/issue opened for this that I
> could follow and get some notification when it's solved.

It turns out there already was a kernel bug, which I just made public:

https://bugzilla.redhat.com/show_bug.cgi?id=1576593

Thanks,
Florian