
/var/log/* files with zero size


Andre Ferreira

Question

Hi, I have a XenServer 7.1 Pool with two physical servers.

 

Last week, one of them crashed. When it came back, I went to /var/log to start a diagnosis and found that almost every file in /var/log has size 0 (zero), and every file carries a timestamp equal to the moment I ran 'ls -l'. The only files with a non-zero size are /var/log/audit.log and /var/log/xensource.log. Running 'cat' on the latter yielded the following content:

Apr 22 11:15:08 Vanmaanen xapi: [debug|Vanmaanen|1029 ha_monitor|HA monitor D:521f2fff1d8a|xapi_ha] Liveset: online 7c8b1088-a20f-4cb7-9114-6f17ff2418eb [*L  A ]; b9ac4bd6-9d77-4ba3-a3b6-f2ef13ce3bfb [ LM A ];
Apr 22 11:15:08 Vanmaanen xapi: [debug|Vanmaanen|1029 ha_monitor|HA monitor D:521f2fff1d8a|xapi_ha] Processing warnings
Apr 22 11:15:08 Vanmaanen xapi: [debug|Vanmaanen|1029 ha_monitor|HA monitor D:521f2fff1d8a|xapi_ha] Done with warnings
Apr 22 11:15:08 Vanmaanen xapi: [debug|Vanmaanen|1029 ha_monitor|HA monitor D:521f2fff1d8a|xapi_ha] The node we think is the master is still alive and marked as master; this is OK

 

A 'cat' on audit.log returned nothing (although 'ls' reported 262 bytes in it).
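To see the extent of the symptom described above at a glance, the empty files can be listed together with their sizes and timestamps. This is only a minimal sketch: the function name `list_empty_logs` is hypothetical (not part of XenServer), and it assumes a `find` that supports `-maxdepth`, as the one in dom0 does.

```shell
# list_empty_logs DIR
# Print an 'ls -l' line for every zero-byte regular file directly under DIR.
# DIR defaults to /var/log. The function name is illustrative only.
list_empty_logs() {
    dir="${1:-/var/log}"
    # -size 0 matches only files that are completely empty;
    # -exec ... + is skipped entirely when nothing matches.
    find "$dir" -maxdepth 1 -type f -size 0 -exec ls -l {} +
}
```

Running `list_empty_logs /var/log` twice, a minute or so apart, also shows whether the timestamps keep moving, which might suggest that something is still truncating or recreating the files.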

 

After seeing this, I used XenCenter to install the latest cumulative update (XS7.1CU2; the server previously had XS7.1CU1), which also auto-installed XS71ECU2001, XS71ECU2002, XS71ECU2003 and XS71ECU2004. The updates (and reboots) didn't help with the problem: /var/log still shows all files with zero size. 'df -h' shows there is plenty of space in the partition (8.3MB of 3.9GB used).

 

Lastly, I've added a remote log server in the server's properties, and the server is sending syslog messages to it properly, meaning that syslog itself is working. Unfortunately, no error messages are showing up there.

 

The Pool has two identical servers, and the master is not showing this problem.

 

Could somebody help with this? I would prefer to avoid a full re-install, as the server is in a remote location.

 

My setup (the two servers in the pool are identical)

Hardware: HP Proliant DL-360G8, 272GB RAM, with four 900GB HDDs in a RAID6 setup.

All VMs run on NetApp storage via NFS, using dedicated network ports.

Both servers are running XenServer7.1CU2 with the latest patches.

 


11 answers to this question


I'd first make sure you have space for logs in that area; empty files can be a sign that there is no storage left to write logs to. The message about the master is a bit disconcerting; who is currently really the pool master? Are the servers all properly synchronized to NTP (check each with "ntpstat -s")?

 

-=Tobias


I agree about space as well; your files shouldn't be 0 bytes in size. In the 'df -h' output, what does /dev/sda1 look like? You should see something like 18GB size, 2GB used and a usage of about 15%, give or take. Also, does 'service xapi status' show active (running) in green?

 

--Alan--

 


Tobias,

I do have enough space, as 'df -h' shows:

Filesystem                                                      Size  Used Avail Use% Mounted on
devtmpfs                                                        2.0G  8.0K  2.0G   1% /dev
tmpfs                                                           2.0G  104K  2.0G   1% /dev/shm
tmpfs                                                           2.0G  1.3M  2.0G   1% /run
tmpfs                                                           2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/sda1                                                        18G  1.7G   16G  11% /
xenstore                                                        2.0G     0  2.0G   0% /var/lib/xenstored
/dev/loop0                                                       44M   44M     0 100% /var/xen/xc-install
/dev/sda5                                                       3.9G  8.3M  3.7G   1% /var/log
<ISO Server>:/var/samba/nead/ISOs                    5.0T  3.4T  1.7T  68% /run/sr-mount/65054e55-bb7b-a85f-fddc-b422702718a3
<nfs address>:/vol_Flanders2/7a80c91e-f512-1b66-50f4-f69c6d4e9976  285G  126G  160G  44% /run/sr-mount/7a80c91e-f512-1b66-50f4-f69c6d4e9976
<nfs address>:/vol_swaps/7e5422b8-e990-625c-3897-5f15ed852d5a      256G   43G  214G  17% /run/sr-mount/7e5422b8-e990-625c-3897-5f15ed852d5a
<nfs address>:/vol_NFS_VMs/a24ebb78-4ac7-0f43-400d-fd7e77ebf77d    2.9T  1.9T  989G  67% /run/sr-mount/a24ebb78-4ac7-0f43-400d-fd7e77ebf77d
tmpfs                                                           393M     0  393M   0% /run/user/0

 

The pool master is the server that is NOT showing this anomalous behavior. Also, since I've lost some confidence in this server, all my VMs are currently running on the pool master.

 

Lastly, NTP is fully synchronized, with my server showing as stratum 4.

 

Alan,

as shown by 'df -h' above, /dev/sda1 does have 18GB and is using 1.7GB of that. 'service xapi status' shows that the service is active and running. Just out of curiosity, I also checked 'df -i' (inodes), and there is no problem there either. I've had a couple of servers (not Xen) die on me because of inode exhaustion in the past, so I monitor this continuously via Nagios/SNMP.
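For anyone wanting to repeat both checks (blocks and inodes) in one go, here is a minimal sketch. The helper name `over_threshold` and the 90% limit are assumptions for illustration; it simply filters the POSIX-format output of `df -P` or `df -iP`, where the fifth column is the usage percentage and the sixth is the mount point, and it will not handle mount points containing spaces.

```shell
# over_threshold LIMIT
# Read 'df -P' (or 'df -iP') output on stdin and print "mountpoint usage%"
# for every filesystem whose usage is at or above LIMIT percent (default 90).
# The function name and default limit are illustrative only.
over_threshold() {
    limit="${1:-90}"
    awk -v limit="$limit" '
        NR > 1 {
            pct = $5
            sub(/%/, "", pct)          # strip the trailing percent sign
            if (pct + 0 >= limit + 0)  # force a numeric comparison
                print $6, $5
        }'
}
```

Typical use would be `df -P | over_threshold 90` for space and `df -iP | over_threshold 90` for inodes. On the output posted above, only the read-only /dev/loop0 install image (100%) would be reported.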

 

Roberto


How many versions are there of each of the logs that are zero in length? Are there logs that do have a positive size, and are they rotating properly? Did you try manually deleting some to see if they rotate properly the next time rotation runs?

 

That said, I'm wondering if your server suffered some sort of major corruption. That is very strange behavior you are experiencing with it.
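One way to answer these questions for a given log is to print the size of the live file and of each rotated copy side by side. The following is a rough sketch; `rotation_report` is a hypothetical helper name, and it assumes the rotated copies sit next to the live file with names such as messages.1 or messages.2.gz.

```shell
# rotation_report DIR BASE
# Print "size-in-bytes<TAB>filename" for BASE and every rotated copy of it
# (BASE.1, BASE.2.gz, ...) found in DIR. The function name is illustrative only.
rotation_report() {
    dir="$1"
    base="$2"
    for f in "$dir/$base" "$dir/$base".*; do
        [ -f "$f" ] || continue   # skip an unmatched glob or a missing live file
        printf '%s\t%s\n' "$(wc -c < "$f" | tr -d ' ')" "${f##*/}"
    done
}
```

For example, `rotation_report /var/log messages` lists every generation of the messages log, and piping the result through `wc -l` gives the number of versions.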

 

-=Tobias


The messages.nn.gz files went up to 30.

 

I've tried deleting the /var/log/messages files and then restarting rsyslog. No /var/log/messages file was created, but syslog messages are still being sent to the central syslog server.
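The same test can be made repeatable by sending a marker message and checking whether the local file actually grew. This is a sketch under assumptions: `size_of` and `grew_after` are hypothetical helper names, the two-second default wait is a guess at how long rsyslog needs to flush, and `logger -t` is used only to tag the test message.

```shell
# size_of FILE: print FILE's size in bytes, or 0 if it does not exist.
size_of() {
    if [ -f "$1" ]; then wc -c < "$1" | tr -d ' '; else echo 0; fi
}

# grew_after FILE CMD [ARGS...]
# Run CMD, wait WAIT seconds (default 2), and return success only if FILE
# is larger afterwards. Both function names are illustrative only.
grew_after() {
    file="$1"
    shift
    before=$(size_of "$file")
    "$@"
    sleep "${WAIT:-2}"
    after=$(size_of "$file")
    [ "$after" -gt "$before" ]
}
```

For example, `grew_after /var/log/messages logger -t logtest "local write test" && echo written || echo "not written"` reports whether the marker reached the local file, independently of what arrives at the remote syslog server.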

 

I'm starting to think that something really broke in this server and there will be no option but to fully reinstall it... :-(

 

Roberto


Many logs will go just to the master. As for re-installing to play it safe, a clean install doesn't take that long, and if you have adequate spare server capacity, you can do it all with zero downtime.

 

I've seen server corruption before that could not be readily figured out or solved, in which case a clean rebuild ended up being a faster and more reliable way of dealing with the issue.

 

One way of testing integrity would be to switch the master to a different host to see if the behavior changes or not.

 

-=Tobias
