Bit Rot: Always read your log files

Whenever there is a new Ubuntu release, one part of the world thinks it is the worst upgrade ever, and the other part thinks it's the best thing that ever happened. It's the same with every distro, I guess.

After upgrading to Hardy, I thought it was the worst upgrade ever. The system was unstable and buggy, and programs would crash randomly. That lasted until I read my log files after an X server crash.


From /var/log/syslog:
May  8 12:34:08 ranjandesk -- MARK --
May  8 12:46:35 ranjandesk gdm[5796]: WARNING: gdm_slave_xioerror_handler: Fatal X error - Restarting :0
May  8 12:46:51 ranjandesk kernel: [ 6802.873564] Bad page state in process 'kdesktop'
May  8 12:46:51 ranjandesk kernel: [ 6802.873565] page:c1a04120 flags:0x80000000 mapping:00000000 mapcount:-285212672 count:0
May  8 12:46:51 ranjandesk kernel: [ 6802.873566] Trying to fix it up, but a reboot is needed
May  8 12:46:51 ranjandesk kernel: [ 6802.873567] Backtrace:
May  8 12:46:51 ranjandesk kernel: [ 6802.873591]  [bad_page+0x63/0xa0] bad_page+0x63/0xa0
May  8 12:46:51 ranjandesk kernel: [ 6802.873620]  [get_page_from_freelist+0x343/0x3a0] get_page_from_freelist+0x343/0x3a0
May  8 12:46:51 ranjandesk kernel: [ 6802.873662]  [agpgart:__alloc_pages+0x4f/0x340] __alloc_pages+0x4f/0x340
May  8 12:46:51 ranjandesk kernel: [ 6802.873684]  [anon_vma_prepare+0x1d/0xe0] anon_vma_prepare+0x1d/0xe0
May  8 12:46:51 ranjandesk kernel: [ 6802.873687]  [loop:kunmap_atomic+0x6a/0xa0] kunmap_atomic+0x6a/0xa0
May  8 12:46:51 ranjandesk kernel: [ 6802.873699]  [__handle_mm_fault+0x8c7/0xb00] __handle_mm_fault+0x8c7/0xb00
May  8 12:46:51 ranjandesk kernel: [ 6802.873713]  [snd_pcm:getnstimeofday+0x36/0x96b0] getnstimeofday+0x36/0xd0
May  8 12:46:51 ranjandesk kernel: [ 6802.873732]  [sched_clock+0x1a/0x70] sched_clock+0x1a/0x70
May  8 12:46:51 ranjandesk kernel: [ 6802.873755]  [do_page_fault+0x126/0x690] do_page_fault+0x126/0x690
May  8 12:46:51 ranjandesk kernel: [ 6802.873792]  [do_page_fault+0x0/0x690] do_page_fault+0x0/0x690
May  8 12:46:51 ranjandesk kernel: [ 6802.873798]  [error_code+0x72/0x80] error_code+0x72/0x80
May  8 12:46:51 ranjandesk kernel: [ 6802.873818]  [clip_ioctl+0x3a0/0x510] clip_ioctl+0x3a0/0x510
May  8 12:46:51 ranjandesk kernel: [ 6802.873836]  =======================

Bad page? I know the next thing to do. Check the RAM! I ran the memtest86+ thingy that comes with Ubuntu. And sure enough, it showed errors in Test 5. The RAM chip works fine in the first slot, so it would seem that the problem is with the second slot. Since the mobo and RAM are still under warranty, I can get this fixed before it's too late.
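If you would rather not wait for a crash before looking, a few lines of Python can pull these messages out of the log. This is only a rough sketch: the regular expression is written against the "Bad page state" line in the excerpt above and may need adjusting for other kernels, and the names find_bad_pages and scan_file are made up for this example.

```python
import re

# Pattern modelled on the kernel line in the syslog excerpt, e.g.
# "... kernel: [ 6802.873564] Bad page state in process 'kdesktop'"
# Other kernel versions may word this differently (assumption).
BAD_PAGE = re.compile(r"kernel: \[\s*[\d.]+\] Bad page state in process '([^']+)'")


def find_bad_pages(lines):
    """Return the process names mentioned in 'Bad page state' messages."""
    found = []
    for line in lines:
        match = BAD_PAGE.search(line)
        if match:
            found.append(match.group(1))
    return found


def scan_file(path="/var/log/syslog"):
    """Scan a log file on disk; reading it may require root or adm group access."""
    with open(path, errors="replace") as log:
        return find_bad_pages(log)
```

Calling scan_file() returns a list of process names, one per matching line. An empty list only means that no such message was logged, not that the RAM is healthy, so memtest86+ is still the real test.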

I wouldn't even have known that there was something wrong with my RAM slot if I hadn't read my log files. On my old machine, I remember, the graphics card fan was bust, and the card would start running in "modulated clock mode". The log files told me.

A few days ago, I was getting random telnet attempts from IPs in Korea and China. Again, it was in the log files. Most probably botnets. I decided to remove telnetd and just stick to ssh instead.
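To get a feel for who is knocking, you can count the source addresses in the log. The sketch below is a guess at how that might look: it assumes the relevant lines contain the word "telnetd" and an IPv4 address, which depends on your telnet daemon and syslog setup, and count_telnet_sources is a name invented for this example.

```python
import re
from collections import Counter

# Loose IPv4 matcher; it does not check that each octet is <= 255.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def count_telnet_sources(lines):
    """Count IPv4 addresses on log lines that mention telnetd.

    Assumption: connection attempts are logged on lines containing
    the string 'telnetd'; check your own logs for the actual format.
    """
    counts = Counter()
    for line in lines:
        if "telnetd" in line:
            counts.update(IPV4.findall(line))
    return counts
```

Feed it the open log file, for example count_telnet_sources(open("/var/log/syslog")), and most_common(10) on the result lists the busiest addresses.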

Even if you don't know what the log messages mean, Googling them will often turn up bug reports or solutions to the problem.

Saturday, May 17, 2008
