x86_64 kernel does not start under qemu

Philippe Gerum rpm at xenomai.org
Sat Mar 23 11:16:32 CET 2019


On 3/23/19 11:04 AM, Philippe Gerum via Xenomai wrote:
> On 3/22/19 9:50 PM, Jan Kiszka wrote:
>> On 21.03.19 17:07, Jan Kiszka wrote:
>>> On 21.03.19 12:57, Richard Weinberger wrote:
>>>> Am Donnerstag, 21. März 2019, 12:02:45 CET schrieb Jan Kiszka:
>>>>> FWIW, I've just seen this issue as well, with QEMU in KVM mode: I ran into that
>>>>> lockup when my host was under full load while Xenomai booted in the VM. And it
>>>>> seems reproducible. Debugging...
>>>>
>>>> Oh, good to hear that!
>>>> I played a little with your config but got badly interrupted by other stuff.
>>>> Your config seems to work, but mostly because things are slower due to the
>>>> debugging options you've enabled. Maybe this info helps.
>>>>
>>>
>>> It's a race, so everything that changes timing also changes
>>> probabilities. I'm starting to nail it down:
>>>
>>> (gdb) info threads
>>>    Id   Target Id         Frame
>>> * 4    Thread 4 (CPU#3 [halted ]) __ipipe_halt_root (use_mwait=-2147483634) at ../arch/x86/kernel/ipipe.c:317
>>>    3    Thread 3 (CPU#2 [running]) rep_nop () at ../arch/x86/include/asm/processor.h:655
>>>    2    Thread 2 (CPU#1 [halted ]) __ipipe_halt_root (use_mwait=-2147483634) at ../arch/x86/kernel/ipipe.c:317
>>>    1    Thread 1 (CPU#0 [halted ]) __ipipe_halt_root (use_mwait=-2147483634) at ../arch/x86/kernel/ipipe.c:317
>>> (gdb) monitor info lapic
>>> dumping local APIC state for CPU 3
>>>
>>> LVT0     0x00010700 active-hi edge  masked                      ExtINT (vec 0)
>>> LVT1     0x00010400 active-hi edge  masked                      NMI
>>> LVTPC    0x00010400 active-hi edge  masked                      NMI
>>> LVTERR   0x000000fe active-hi edge                              Fixed  (vec 254)
>>> LVTTHMR  0x00010000 active-hi edge  masked                      Fixed  (vec 0)
>>> LVTT     0x000400ef active-hi edge                 tsc-deadline Fixed  (vec 239)
>>> Timer    DCR=0x0 (divide by 2) initial_count = 0
>>> SPIV     0x000001ff APIC enabled, focus=off, spurious vec 255
>>> ICR      0x000008fd logical edge de-assert no-shorthand
>>> ICR2     0x02000000 mask 00000010 (APIC ID)
>>> ESR      0x00000000
>>> ISR      239
>>> IRR      236 237 238 239
>>>
>>> APR 0x00 TPR 0x00 DFR 0x0f LDR 0x08 PPR 0xe0
>>>
>>>
>>> So we are halting while vector 239 (the timer) is still in service,
>>> which means we re-enabled interrupts while the timer interrupt was
>>> being processed - a bug in I-pipe.
>>>
>>> Meanwhile, another CPU is trying to run ipipe_critical_enter, whose
>>> IPI (IPI_CRITICAL_VECTOR = 236) can never reach the halted CPU 3.
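A minimal timeline sketch of the lockup described above; the CPU and
vector numbers follow the APIC dump, while the exact interleaving is an
assumption for illustration:

/*
 * CPU 3                                 CPU 0
 * -----                                 -----
 * timer IRQ, vector 239 enters
 * service (ISR=239), no EOI sent
 * interrupts re-enabled too early       ipipe_critical_enter()
 * __ipipe_halt_root() -> hlt            sends IPI_CRITICAL_VECTOR (236)
 *                                       spins, waiting for CPU 3
 *
 * With vector 239 still in service, PPR is 0xe0, so the pending
 * vectors 236..238 (same 0xeX priority class) are never delivered:
 * CPU 3 stays halted and CPU 0 spins forever.
 */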
>>>
>>> Jan
>>>
>>
>> This might be the fix, but I need to sleep over it. Will send a PR next 
>> week.
>>
>> ---8<---
>>
>> ipipe: Call present timer ack handlers unconditionally
>>
>> This plugs a race for timers that are per-CPU but share the same
>> interrupt number. When setting them up, there is a window where the
>> first CPU has already called ipipe_request_irq, but some other CPU has
>> not yet run through grab_timer and thus still has ipipe_stolen = 0.
>>
>> Moreover, it is questionable whether non-stolen timers should really
>> skip their ack functions.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka at siemens.com>
>> ---
>>  kernel/ipipe/timer.c | 11 ++++-------
>>  1 file changed, 4 insertions(+), 7 deletions(-)
>>
>> diff --git a/kernel/ipipe/timer.c b/kernel/ipipe/timer.c
>> index 98d1192a2727..2d5f468ce7fb 100644
>> --- a/kernel/ipipe/timer.c
>> +++ b/kernel/ipipe/timer.c
>> @@ -369,13 +369,10 @@ static void __ipipe_ack_hrtimer_irq(struct irq_desc *desc)
>>  
>>  	if (desc)
>>  		desc->ipipe_ack(desc);
>> -
>> -	if (timer->host_timer->ipipe_stolen) {
>> -		if (timer->ack)
>> -			timer->ack();
>> -		if (desc)
>> -			desc->ipipe_end(desc);
>> -	}
>> +	if (timer->ack)
>> +		timer->ack();
>> +	if (desc && timer->host_timer->ipipe_stolen)
>> +		desc->ipipe_end(desc);
>>  }
>>  
>>  static int do_set_oneshot(struct clock_event_device *cdev)
>>
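A sketch of the race window the patch addresses; the interleaving shown
is an assumption for illustration, with function names taken from the
I-pipe timer code:

/*
 * CPU 0                                 CPU 1
 * -----                                 -----
 * grab_timer()
 *   ipipe_request_irq(timer IRQ)
 *   -> __ipipe_ack_hrtimer_irq() now
 *      intercepts this IRQ on all CPUs
 *   host_timer->ipipe_stolen = 1
 *                                       timer IRQ fires before this CPU
 *                                       has run grab_timer():
 *                                       ipipe_stolen is still 0, so the
 *                                       old code skipped timer->ack()
 *                                       and ipipe_end() - the event is
 *                                       never completed
 */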
> 
> This is a regression I introduced in 4.14. Bottom line is that testing
> for ipipe_stolen in this context is pointless: if
> __ipipe_ack_hrtimer_irq() is called, this means that ipipe_request_irq()
> is in effect for the tick event, which requires this front handler to
> acknowledge the event, no matter what.
> 
> The reason is that we may not assume that the original tick handler
> (i.e. in the clockevent layer) would run next in that case, so the only
> safe place to ack the timer event is from __ipipe_ack_hrtimer_irq() if
> the timer is grabbed for the current CPU.
> 

That reasoning also applies to calling ipipe_end(): this must be done
unconditionally too, because if __ipipe_ack_hrtimer_irq() is called, the
tick event will be delivered to Xenomai next, which will neither call
ipipe_end() for a tick event, nor propagate such an event to the root
stage (at least not over the same IRQ line, but through the host
emulation tick vector instead).
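For illustration, here is how __ipipe_ack_hrtimer_irq() would read with
Jan's patch plus the unconditional ipipe_end() suggested above; the
percpu_timer lookup line is assumed from the surrounding timer.c
context, since the patch hunk does not show it:

static void __ipipe_ack_hrtimer_irq(struct irq_desc *desc)
{
	/* Assumed declaration; not visible in the patch context. */
	struct ipipe_timer *timer = __ipipe_raw_cpu_read(percpu_timer);

	/* Chip-level acknowledge of the interrupt line. */
	if (desc)
		desc->ipipe_ack(desc);

	/*
	 * Ack the timer hardware unconditionally: once this handler
	 * runs, ipipe_request_irq() is in effect, and no later handler
	 * (e.g. the clockevent layer) is guaranteed to ack the event.
	 */
	if (timer->ack)
		timer->ack();

	/*
	 * End the IRQ unconditionally as well: Xenomai will not call
	 * ipipe_end() for a tick event, and the root stage gets its
	 * tick through the host emulation vector, not this IRQ line.
	 */
	if (desc)
		desc->ipipe_end(desc);
}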

-- 
Philippe.


