
XenServer 7.5 crash: kernel panic


Derick Fontes

Question

Dear community,

After traffic on the VMs increased, I started seeing the problems below, and the server reboots.

 

Any ideas or suggestions?

 

Version: XenServer release 7.5.0 (xenenterprise)

Kernel: Linux br-pr-cwb1-xs1 4.4.0+10 #1 SMP Thu Aug 9 14:42:20 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux


Ethernet controller: Solarflare Communications SFC9220 10/40G Ethernet Controller (rev 02)

driver: sfc
version: 4.10.1.1000-xen
firmware-version: 6.4.2.1020 rx1 tx1
bus-info: 0000:21:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: yes
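For reference, the adapter details above are what ethtool and modinfo report; a minimal sketch of collecting them (the interface name "eth0" is an assumption — substitute whichever interface is bound to the sfc driver):

```shell
# Gather NIC driver/firmware details like the listing above (run in dom0).
# "eth0" is an assumption; substitute your Solarflare interface.
IF="${IF:-eth0}"
ethtool -i "$IF" 2>/dev/null || echo "could not query $IF"
# Module details for the loaded sfc driver, if present:
modinfo sfc 2>/dev/null | grep -E '^(filename|version):' || true
```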

 

dmesg output for the kernel panic:
[   3615.442305]  EMERG: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [swapper/2:0]
[   3615.442319]   WARN: Modules linked in: tun nfsv3 nfs fscache 8021q garp mrp stp llc openvswitch nf_defrag_ipv6 libcrc32c ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_multiport xt_conntrack nf_conntrack iptable_filter dm_multipath nls_iso8859_1 nls_cp437 vfat fat ipmi_devintf dm_mod sg crc32_pclmul aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper shpchp i2c_piix4 ipmi_si ipmi_msghandler tpm_tis tpm nls_utf8 isofs nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc ip_tables x_tables sd_mod hid_generic usbhid hid mpt3sas(O) raid_class scsi_transport_sas sfc(O) mdio ahci libahci libata igb(O) xhci_pci ptp pps_core xhci_hcd scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_mod xen_wdt efivarfs ipv6
[   3615.442366]   WARN: CPU: 2 PID: 0 Comm: swapper/2 Tainted: G           O    4.4.0+10 #1
[   3615.442368]   WARN: Hardware name: Supermicro Super Server/H11SSL-C, BIOS 1.0b 04/27/2018
[   3615.442371]   WARN: task: ffff88022be25400 ti: ffff88022be50000 task.ti: ffff88022be50000
[   3615.442372]   WARN: RIP: e030:[<ffffffffa02703c0>]  [<ffffffffa02703c0>] efx_ef10_tx_limit_len+0x0/0x30 [sfc]
[   3615.442384]   WARN: RSP: e02b:ffff880234e43688  EFLAGS: 00000206
[   3615.442386]   WARN: RAX: ffff880002638000 RBX: ffff88022ac63300 RCX: 00000000b25cda62
[   3615.442387]   WARN: RDX: 0000000000000000 RSI: 000000110f255680 RDI: ffff88022ac63300
[   3615.442388]   WARN: RBP: ffff880234e436b8 R08: 000000100f25563c R09: 0000000000000002
[   3615.442389]   WARN: R10: 0000000000000000 R11: ffffffff81a179a0 R12: ffff88000263b948
[   3615.442390]   WARN: R13: ffffffff00000000 R14: 000000110f255680 R15: ffffffffa0293e40
[   3615.442398]   WARN: FS:  00007f54d2e7e700(0000) GS:ffff880234e40000(0000) knlGS:0000000000000000
[   3615.442399]   WARN: CS:  e033 DS: 002b ES: 002b CR0: 0000000080050033
[   3615.442400]   WARN: CR2: 00007fda26b3a000 CR3: 00000002040aa000 CR4: 0000000000040660
[   3615.442403]   WARN: Stack:
[   3615.442404]   WARN:  ffffffffa027b897 0000000000000044 0000000000000001 fffffffffffffffe
[   3615.442406]   WARN:  ffff8800b0c60600 ffff88022b24a000 ffff880234e43798 ffffffffa027c497
[   3615.442408]   WARN:  ffff88022ac63000 000d505400000002 0000000000000000 0000880200000b96
[   3615.442410]   WARN: Call Trace:
[   3615.442411]   WARN:  <IRQ>
[   3615.442419]   WARN:  [<ffffffffa027b897>] ? efx_tx_map_chunk+0x47/0x90 [sfc]
[   3615.442427]   WARN:  [<ffffffffa027c497>] efx_enqueue_skb+0x7c7/0xcc0 [sfc]
[   3615.442434]   WARN:  [<ffffffff81097f32>] ? default_wake_function+0x12/0x20
[   3615.442438]   WARN:  [<ffffffff810aafb2>] ? autoremove_wake_function+0x12/0x40
[   3615.442440]   WARN:  [<ffffffff810b1c81>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[   3615.442447]   WARN:  [<ffffffffa027ca29>] efx_hard_start_xmit+0x99/0xb0 [sfc]
[   3615.442452]   WARN:  [<ffffffff814e8ee0>] dev_hard_start_xmit+0x2b0/0x3f0
[   3615.442455]   WARN:  [<ffffffff8150ac97>] sch_direct_xmit+0x97/0x1e0
[   3615.442457]   WARN:  [<ffffffff814e937e>] __dev_queue_xmit+0x26e/0x4c0
[   3615.442459]   WARN:  [<ffffffff814e95e0>] dev_queue_xmit+0x10/0x20
[   3615.442464]   WARN:  [<ffffffffa04b7a35>] ovs_vport_send+0xb5/0xc0 [openvswitch]
[   3615.442467]   WARN:  [<ffffffffa04ab257>] do_output.isra.28+0x57/0x170 [openvswitch]
[   3615.442470]   WARN:  [<ffffffffa04ac582>] do_execute_actions+0x10a2/0x1110 [openvswitch]
[   3615.442473]   WARN:  [<ffffffffa04ac622>] ovs_execute_actions+0x32/0xc0 [openvswitch]
[   3615.442476]   WARN:  [<ffffffffa04afa43>] ovs_dp_process_packet+0xd3/0xf0 [openvswitch]
[   3615.442480]   WARN:  [<ffffffffa04b7340>] ovs_vport_receive+0x90/0xa0 [openvswitch]
[   3615.442483]   WARN:  [<ffffffffa04b7340>] ? ovs_vport_receive+0x90/0xa0 [openvswitch]
[   3615.442487]   WARN:  [<ffffffff81076aa5>] ? irq_exit+0x85/0x90
[   3615.442490]   WARN:  [<ffffffff814d4282>] ? __alloc_skb+0x72/0x230
[   3615.442494]   WARN:  [<ffffffff811b2ceb>] ? __slab_alloc.constprop.60+0x44/0x52
[   3615.442497]   WARN:  [<ffffffff811a6bbd>] ? __kmalloc_track_caller+0x4d/0x170
[   3615.442499]   WARN:  [<ffffffff814d4282>] ? __alloc_skb+0x72/0x230
[   3615.442501]   WARN:  [<ffffffff814d351d>] ? __kmalloc_reserve.isra.30+0x2d/0x70
[   3615.442505]   WARN:  [<ffffffffa04b8560>] netdev_frame_hook+0x140/0x180 [openvswitch]
[   3615.442507]   WARN:  [<ffffffff814e6ca7>] __netif_receive_skb_core+0x577/0x8d0
[   3615.442511]   WARN:  [<ffffffff8100e357>] ? set_phys_to_machine+0x17/0x50
[   3615.442513]   WARN:  [<ffffffff8100e694>] ? set_foreign_p2m_mapping+0x304/0x330
[   3615.442517]   WARN:  [<ffffffff815a521a>] ? _raw_spin_unlock_irqrestore+0x1a/0x20
[   3615.442519]   WARN:  [<ffffffff814e704e>] __netif_receive_skb+0x4e/0x60
[   3615.442521]   WARN:  [<ffffffff814e70ad>] netif_receive_skb_internal+0x4d/0x90
[   3615.442523]   WARN:  [<ffffffff814d63df>] ? skb_checksum_setup+0x2bf/0x2f0
[   3615.442525]   WARN:  [<ffffffff814e7150>] netif_receive_skb+0x60/0x70
[   3615.442530]   WARN:  [<ffffffff8146aa1c>] xenvif_tx_action+0x86c/0x950
[   3615.442533]   WARN:  [<ffffffff810df152>] ? tick_program_event+0x62/0x70
[   3615.442535]   WARN:  [<ffffffff8146d2e9>] xenvif_poll+0x39/0x70
[   3615.442537]   WARN:  [<ffffffff814e745f>] net_rx_action+0x12f/0x320
[   3615.442540]   WARN:  [<ffffffff81076729>] __do_softirq+0x129/0x290
[   3615.442542]   WARN:  [<ffffffff81076a62>] irq_exit+0x42/0x90
[   3615.442546]   WARN:  [<ffffffff813c8bb5>] xen_evtchn_do_upcall+0x35/0x50
[   3615.442548]   WARN:  [<ffffffff815a74ee>] xen_do_hypervisor_callback+0x1e/0x40
[   3615.442549]   WARN:  <EOI>
[   3615.442552]   WARN:  [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[   3615.442554]   WARN:  [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[   3615.442556]   WARN:  [<ffffffff8100c570>] ? xen_safe_halt+0x10/0x20
[   3615.442559]   WARN:  [<ffffffff81020d67>] ? default_idle+0x57/0xf0
[   3615.442561]   WARN:  [<ffffffff8102149f>] ? arch_cpu_idle+0xf/0x20
[   3615.442563]   WARN:  [<ffffffff810ab322>] ? default_idle_call+0x32/0x40
[   3615.442565]   WARN:  [<ffffffff810ab57c>] ? cpu_startup_entry+0x1ec/0x330
[   3615.442568]   WARN:  [<ffffffff81013dd8>] ? cpu_bringup_and_idle+0x18/0x20
[   3615.442569]   WARN: Code: 48 89 e5 c1 e0 0d 05 18 0a 00 00 21 d1 48 03 86 80 00 00 00 89 08 89 97 30 01 00 00 5d c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 <0f> 1f 44 00 00 55 81 fa ff 3f 00 00 89 d0 48 89 e5 76 0e 48 8d
[   3615.442589]  EMERG: Kernel panic - not syncing: softlockup: hung tasks
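The final line shows why the host reboots: soft lockups are configured to be fatal on this host (the "Kernel panic - not syncing: softlockup" message means kernel.softlockup_panic was 1). As a hedged diagnostic step while troubleshooting — not a fix for the underlying sfc driver lockup — the lockup can be made log-only:

```shell
# Inspect the soft-lockup watchdog settings in dom0.
cat /proc/sys/kernel/softlockup_panic 2>/dev/null || true  # 1 = panic on soft lockup
cat /proc/sys/kernel/watchdog_thresh 2>/dev/null || true   # detection threshold, seconds
# To log soft lockups instead of panicking while you debug (run as root):
#   sysctl -w kernel.softlockup_panic=0
```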

 

Thanks for any help.

 


9 answers to this question


From dom0.log:

 

Call Trace:
     [ffffffff810014aa] xen_hypercall_kexec_op+0xa/0x20
      ffffffff81156fd4  panic+0xfa/0x241
      ffffffff8110c5b4  watchdog_timer_fn+0x1a4/0x1d0
      ffffffff8110c410  watchdog_timer_fn+0/0x1d0
      ffffffff810d1f14  __hrtimer_run_queues+0x134/0x250
      ffffffff810d2346  hrtimer_interrupt+0xa6/0x180
      ffffffff8100c82e  xen_timer_interrupt+0x2e/0x130
      ffffffff8140c00d  add_interrupt_randomness+0x18d/0x1a0
      ffffffff810c053f  handle_irq_event_percpu+0x7f/0x1e0
      ffffffff810c3a8a  handle_percpu_irq+0x3a/0x50
      ffffffff810bfd42  generic_handle_irq+0x22/0x30
      ffffffff813c9d8b  __evtchn_fifo_handle_events+0x14b/0x170
      ffffffff813c9dc0  evtchn_fifo_handle_events+0x10/0x20
      ffffffff813c6dda  __xen_evtchn_do_upcall+0x4a/0x80
      ffffffff813c8bb0  xen_evtchn_do_upcall+0x30/0x50
      ffffffff815a74ee  xen_do_hypervisor_callback+0x1e/0x40
      ffffffff81097f32  default_wake_function+0x12/0x20
      ffffffff810aafb2  autoremove_wake_function+0x12/0x40
      ffffffff810b1c81  __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
      ffffffff814e8ee0  dev_hard_start_xmit+0x2b0/0x3f0
      ffffffff8150ac97  sch_direct_xmit+0x97/0x1e0
      ffffffff814e937e  __dev_queue_xmit+0x26e/0x4c0
      ffffffff814e95e0  dev_queue_xmit+0x10/0x20
      ffffffff81076aa5  irq_exit+0x85/0x90
      ffffffff814d4282  __alloc_skb+0x72/0x230
      ffffffff811b2ceb  __slab_alloc.constprop.60+0x44/0x52
      ffffffff811a6bbd  __kmalloc_track_caller+0x4d/0x170
      ffffffff814d4282  __alloc_skb+0x72/0x230
      ffffffff814d351d  __kmalloc_reserve.isra.30+0x2d/0x70
      ffffffff814e6ca7  __netif_receive_skb_core+0x577/0x8d0
      ffffffff8100e357  set_phys_to_machine+0x17/0x50
      ffffffff8100e694  set_foreign_p2m_mapping+0x304/0x330
      ffffffff815a521a  _raw_spin_unlock_irqrestore+0x1a/0x20
      ffffffff814e704e  __netif_receive_skb+0x4e/0x60
      ffffffff814e70ad  netif_receive_skb_internal+0x4d/0x90
      ffffffff814d63df  skb_checksum_setup+0x2bf/0x2f0
      ffffffff814e7150  netif_receive_skb+0x60/0x70
      ffffffff8146aa1c  xenvif_tx_action+0x86c/0x950
      ffffffff810df152  tick_program_event+0x62/0x70
      ffffffff8146d2e9  xenvif_poll+0x39/0x70
      ffffffff814e745f  net_rx_action+0x12f/0x320
      ffffffff81076729  __do_softirq+0x129/0x290
      ffffffff81076a62  irq_exit+0x42/0x90
      ffffffff813c8bb5  xen_evtchn_do_upcall+0x35/0x50
      ffffffff815a74ee  xen_do_hypervisor_callback+0x1e/0x40

On 3/4/2019 at 7:31 PM, Tobias Kreidl said:

How much memory is allocated to dom0 on your hosts? Run "top" and check whether your resources are being exhausted (swap usage should be zero or very low).

 

-=Tobias
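Along the lines of the suggestion above, a quick way to check dom0 memory and swap from the console (a sketch; `free` output formatting varies by version):

```shell
# Check dom0 memory and swap; swap usage should stay at or near zero.
grep -E '^(MemTotal|MemFree|SwapTotal|SwapFree):' /proc/meminfo
free -m 2>/dev/null || true
```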


#multiboot2 /boot/xen.gz dom0_mem=4096M,max:4096M watchdog ucode=scan dom0_max_vcpus=1-16 crashkernel=192M,below=4G console=vga vga=mode-0x0311
 multiboot2 /boot/xen.gz dom0_mem=8192M,max:8192M watchdog ucode=scan dom0_max_vcpus=1-16 crashkernel=256M,below=4G console=vga vga=mode-0x0311
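The edit above doubles dom0 memory from 4 GiB to 8 GiB and enlarges the crash-kernel region (the old line is kept commented out). A small sanity check of the two quoted boot lines, plus the XenServer helper normally used for this change (the `xen-cmdline` path is an assumption based on stock installs):

```shell
# Extract dom0_mem from the old (commented) and new boot lines to confirm the change.
old='multiboot2 /boot/xen.gz dom0_mem=4096M,max:4096M watchdog ucode=scan dom0_max_vcpus=1-16 crashkernel=192M,below=4G'
new='multiboot2 /boot/xen.gz dom0_mem=8192M,max:8192M watchdog ucode=scan dom0_max_vcpus=1-16 crashkernel=256M,below=4G'
printf '%s\n' "$old" "$new" | grep -o 'dom0_mem=[^ ]*'
# On a live XenServer host, the supported equivalent would be (assumption):
#   /opt/xensource/libexec/xen-cmdline --set-xen dom0_mem=8192M,max:8192M
# After a reboot, verify the active options with: xl info | grep xen_commandline
```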
 

 

First the VM freezes, and soon afterwards the hypervisor does too. The VM's own error logs show no sign of the problem, though.


This topic is now archived and is closed to further replies.