Support Forum
The Forums are a place to find answers on a range of Fortinet products from peers and product experts.
Matt_Garrett
New Contributor

FortiWiFi 60D units locking up

Over the past few months we upgraded a large number of FortiWiFi 60D units to 5.2.4 and started seeing units randomly lock up and stop responding.  The only way to recover is to unplug the power and reboot.

 

We are seeing this on a number of units.  We send logs to FortiAnalyzer, and we found that after the hard reboot, logging to memory is enabled again.  We contacted Fortinet Support, and this is a known bug to be fixed in 5.2.7.  I am not entirely convinced that this setting is causing the lockups.  The logs indicate nothing, and in fact some units have few to no logs prior to the lockup.  It seems to be very random in nature, but it also appears to occur only during normal business hours.

 

Anyone else having any similar issues or thoughts on this?

 

-M

bartman10
Contributor

I know it's an entirely different unit.. but try disabling the WiFi and see what happens. See my rant about the FWF-50E.

 

My issue is related to new power management I was told is only used in the new 50E WiFi chip.. but hey who knows.. 

300E x3, 200D, 140D, 94D, 90D x2, 80D, 40C, handful of 60E's.. starting to lose track.

Over 100 WiFi AP's and growing.

FAZ-200D

FAC-VM 2 node cluster

Friends don't let friends FWF!

Matt_Garrett

Thank you for the response.  Unfortunately we are unable to disable the wifi at this juncture.

 

 

Chris_Carson
New Contributor

We have had the same issue happen at two clients.  We went through two different FWF60D units with random lockups and no errors to report.  The LEDs were illuminated, but nobody was home.  We had a dedicated USB cable with FortiExplorer attached, and the unit would not be detected (it would disappear) until a power cycle.  After about a month we ripped them out and put in a FGT-100D and 3 FortiAPs.  At the time, support (TAC) was not of much help.  The physical units are back in our stock, but we are scared to deploy them to another client.  We cannot be deploying FGT 92Ds or 100Ds at clients with 5 computers.

ede_pfau

So if I understand OP and @Chris, v5.2.3 or v5.2.7 should fix the problem. I mean, the forum would be flooded with complaints if the FWF60D (being a volume model) locked up all the time with all previous firmware versions.

 

@Chris: any chance you'd put one FWF60D on v5.2.3 and let it run in the office for a week, and report back?


Ede

"Kernel panic: Aiee, killing interrupt handler!"
Chris_Carson

We were running v5.2.3 and still had issues.  Again we don't have the equipment currently in production anymore since we replaced it with a bigger unit.  We can perform some testing next week.

 

 

--

The only thing we ever got out of our two units over a serial console cable when they crashed was:

 

"FWF60D login: Unable to handle kernel NULL pointer dereference at virtual address 00000030 mm = 80228500 pgd = e3a01e1f *pgd =

 

and after a reboot..

 

FWF60D# diag debug crashlog read
1: 2015-09-24 13:41:10 the killed daemon is /bin/pyfcgid: status=0x0
2: 2015-09-29 09:04:11 the killed daemon is /bin/dhcpd: status=0x0
3: 2015-09-29 09:04:54 the killed daemon is /bin/dhcpd: status=0x0
4: 2015-09-29 09:15:12 the killed daemon is /bin/dhcpd: status=0x0
5: 2015-10-01 09:16:57 the killed daemon is /bin/dhcpcd: status=0x0
6: 2015-10-01 09:16:58 the killed daemon is /bin/dhcpcd: status=0x0
7: 2015-10-01 09:20:52 Interface wan2 is brought down. process_id=33, process_name="cmdbsvr"
8: 2015-10-01 09:21:22 the killed daemon is /bin/dhcpcd: status=0x0
9: 2015-10-01 09:21:22 the killed daemon is /bin/dhcpcd: status=0x0
10: 2015-10-01 09:23:19 the killed daemon is /bin/dhcpcd: status=0x0
11: 2015-10-01 09:23:19 the killed daemon is /bin/dhcpcd: status=0x0
12: 2015-10-01 09:29:35 the killed daemon is /bin/pyfcgid: status=0x0
13: 2015-10-01 10:53:58 the killed daemon is /bin/pyfcgid: status=0x0
14: 2015-10-01 16:06:01 Interface dmz is brought down. process_id=123, process_name="httpsd"
15: 2015-10-01 16:06:01 Interface wan1 is brought down. process_id=123, process_name="httpsd"
16: 2015-10-01 16:06:01 Interface wan2 is brought down. process_id=123, process_name="httpsd"
17: 2015-10-01 16:06:01 Interface internal1 is brought down. process_id=123, process_name="httpsd"
18: 2015-10-01 16:06:01 Interface internal2 is brought down. process_id=123, process_name="httpsd"
19: 2015-10-01 16:06:01 Interface internal3 is brought down. process_id=123, process_name="httpsd"
20: 2015-10-01 16:06:02 Interface internal4 is brought down. process_id=123, process_name="httpsd"
21: 2015-10-01 16:06:02 Interface internal5 is brought down. process_id=123, process_name="httpsd"
22: 2015-10-01 16:06:02 Interface internal6 is brought down. process_id=123, process_name="httpsd"
23: 2015-10-01 16:06:02 Interface internal7 is brought down. process_id=123, process_name="httpsd"
24: 2015-10-07 10:11:31 the killed daemon is /bin/pyfcgid: status=0x0
25: 2015-10-07 11:44:06 the killed daemon is /bin/pyfcgid: status=0x0
26: 2015-10-07 11:58:27 the killed daemon is /bin/dhcpd: status=0x0
27: 2015-10-07 11:59:29 the killed daemon is /bin/dhcpd: status=0x0
28: 2015-10-07 12:03:22 the killed daemon is /bin/dhcpd: status=0x0
29: 2015-10-07 14:39:18 the killed daemon is /bin/pyfcgid: status=0x0
30: 2015-10-07 14:56:07 the killed daemon is /bin/pyfcgid: status=0x0
31: 2015-10-07 15:04:58 the killed daemon is /bin/pyfcgid: status=0x0
32: 2015-10-07 19:03:36 the killed daemon is /bin/pyfcgid: status=0x0
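
For anyone who wants to compare crash logs across several units, here is a minimal Python sketch that tallies the "killed daemon" entries per daemon and per day. The regular expression is an assumption based only on the line format visible in the output above, and `tally_killed_daemons` is a hypothetical helper name, not a Fortinet tool. Only a handful of entries are pasted in as sample data.

```python
import re
from collections import Counter

# A few entries copied from the crash log in this post; replace with the
# full "diag debug crashlog read" output from your own unit.
CRASHLOG = """\
1: 2015-09-24 13:41:10 the killed daemon is /bin/pyfcgid: status=0x0
2: 2015-09-29 09:04:11 the killed daemon is /bin/dhcpd: status=0x0
5: 2015-10-01 09:16:57 the killed daemon is /bin/dhcpcd: status=0x0
7: 2015-10-01 09:20:52 Interface wan2 is brought down. process_id=33, process_name="cmdbsvr"
12: 2015-10-01 09:29:35 the killed daemon is /bin/pyfcgid: status=0x0
"""

# Assumed entry format: "<date> <time> the killed daemon is <path>: status=..."
KILLED_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2}) \d{2}:\d{2}:\d{2} the killed daemon is (\S+): status="
)


def tally_killed_daemons(text):
    """Count 'killed daemon' entries per daemon path and per calendar day."""
    per_daemon = Counter()
    per_day = Counter()
    for day, daemon in KILLED_RE.findall(text):
        per_daemon[daemon] += 1
        per_day[day] += 1
    return per_daemon, per_day


per_daemon, per_day = tally_killed_daemons(CRASHLOG)
for daemon, count in per_daemon.most_common():
    print(f"{daemon}: {count}")
```

Because the pattern is searched across the whole text rather than line by line, it should also work if the output was pasted as one long run-on line. Entries such as "Interface ... is brought down" are simply ignored.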

Our Fortinet Ticket Number was: 1502710

 

Hope this helps someone...

bartman10

I'm no Linux "pro", but as far as I know an "unable to handle kernel NULL pointer" error like that is a kernel oops, which does not always hang the kernel but can.

 

I'm sure after 2-3 months of asking you to collect logs and playing around with level 1-2 support this may actually be looked at by someone who may actually understand and care to look into it. Or at least that's been my exp with Fortinet support when hard crashing their devices. 

They handle it like some minor thing.. meeh.. upload the device config so we can look at it.. @#$%%^ There is NO config, setting or anything that should crash and burn your device!! NONE! Except maybe for a network loop.. but christ it's like you are telling them Word crashed or something!! 

THE APPLIANCE HARD LOCKED! this should NEVER happen.

 

Ref my exact same joy with FWF-50E hard locking when my kids watch Youtube on their iPhones.. and or use WiFi.

300E x3, 200D, 140D, 94D, 90D x2, 80D, 40C, handful of 60E's.. starting to lose track.

Over 100 WiFi AP's and growing.

FAZ-200D

FAC-VM 2 node cluster

Friends don't let friends FWF!

Matt_Garrett

Ahh it's nice to know that I am not going crazy...

 

We have a few hundred FWF60Ds out in the world and we have probably 5-10% of them randomly locking up like this.  We don't have the ability to physically get to them so I cannot confirm what is going on via the console port.

 

We have since put a couple of the "defective" units in much smaller environments (2-3 users) without any issue.  Still not sure where to go.  I will skip TAC for now and send over to our channel managers to see what can be done.

 

Thanks everyone!

Chris_Carson

Please post your TAC Ticket #s..  I'm kicking this to some Fortinet employees to see if we can make some progress.

 

My Fortinet Ticket Number was: 1502710

Chris_Carson

Just heard back from someone...

 

Now I'm asking how I can figure out my flash wear and whether I need an RMA.

 

Mechanical disks didn't wear out this fast ;)

 

-----Original Message-----
From: J <[]@fortinet.com>
To: Chris Carson <chris>
Subject: RE: [Fwd: Fortinet Forums Miscellaneous -- FortiOS and FortiGate: FortiWiFi 60D units locking up]
Date: Wed, 23 Mar 2016 19:07:56 +0000

I love stirring up a hornet’s nest. It looks like this issue, which was discovered in the 80 model, was affecting the smaller ones as well, per CSB-151124-1. So from what I’ve been researching, 5.2.4 and above should resolve this problem. I believe this is caused by logging to the SSD on the box. Put the latest 5.2.5 or 5.4 on a unit and let’s see if this fixes it.

 

I was originally entering this to Rob’s last email but then this came in.

 

J

 

 

-------------------------------

 

https://forum.fortinet.com/tm.aspx?m=130694

 

CSB-151124-1 Fortinet Technical Support
Customer Support Bulletin

Number: CSB-151124-1
Released: 27th November 2015
Modified:
Subject: FortiGate flash disk errors
Product: FortiGate low end devices

Description:
FortiGate devices with internal storage running FortiOS 5.0 or 5.2 may experience flash disk errors in cases where the flash disk has reached a finite number of program–erase cycles (typically written as P/E cycles). While Fortinet has designed all flash-based units with this limitation in mind under expected usage, experience with a very low proportion of users shows that an issue can be caused by excessive writing, updating, and modification of files on the flash disks.

Features in FortiOS that may cause heavy disk usage are:
1. Disk logging
2. WanOpt & WebCache
3. Local-in policies
4. Device identification
5. DHCP and/or PPPoE
6. Excessive reboots or power cycles

Typical symptoms experienced once this condition is met can be (but not limited to) as follows:
- Problems accessing web GUI
- Failure to execute CLI commands
- 99% CPU usage by system
- Connectivity issues
- Partial or total functionality failure of device (usually DHCP stops working)
- Alerts and error messages found in the event log as below:

EXT3-fs: group descriptors corrupted !
EXT3-fs error (device sd(8,3)): ext3_check_descriptors: Block bitmap for group 17 not in group (block 17334272)!

OR

The following critical firewall event was detected: Kernel error.
date=2015-10-19 time=08:49:12 devname=FortiGate devid=FGT60D3912621349 logid=0100020010 type=event subtype=system level=critical vd="root" logdesc="Kernel error" msg="EXT3-fs error (device sd(8,3)): ext3_get_inode_loc: unable to read inode block - inode=132, block=8"

OR

EXT2-fs error (device sd(8,3)): ext2_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1

- Boot failures and error messages during boot up:

Initializing firewall...
System is starting...
Starting system maintenance...
Scanning /dev/mtd1... (100%)
Formatting shared data partition ... done!
EXT3-fs: error loading journal.
EXT3-fs: error loading journal.

Potentially Affected Products:
Low end FortiGate/FortiWifi models with flash storage
20C, 40C, 60C, 80C, 60D, 90D, 100D

Potentially Affected OS:
FortiOS 4.x
FortiOS 5.0
FortiOS 5.2

Remedy:
The issue may be temporarily addressed by formatting boot disk and log disk. Should the issue occur, which would suggest the flash disk has reached its lifetime, you should create an RMA case and attach your current/backup configuration file and self test HQIP logs. Improvements to minimize this issue will be included in FortiOS 5.2.5 patch release and 5.4.0 minor release, the current ETA for release of both versions is December 2015. Fortinet recommend customers to upgrade to FortiOS 5.2.5 or later as soon as it is available in order to minimize flash wear. Not doing so may result in a reduced life time of the device and cause high RMA return rates.

 

Special notes
1. Disk logging
a. The 5.0.2 release notes advises against enabling this feature. Starting from 5.0.6, this feature is disabled by default on units with flash disk.
b. It is possible to enable it from the CLI and when one does so a notification message is displayed: “enabling disk logging impacts overall performance and reduces the lifetime of the unit.”
c. You should avoid usage of disk logging on all units and use remote logging storage such as FortiAnalyzer or FortiCloud.

 

current End User License Agreement. The information in this Customer Support Bulletin is provided for remedial purposes and is designed to assist customers in corrective action that may be helpful to the customer.  
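
Since many of us already ship logs to FortiAnalyzer or a syslog box, the sample event-log entry quoted in the bulletin can be used as a template for spotting affected units. Below is a rough Python sketch that splits such a key=value line into fields and flags it when it is a critical event whose message mentions EXT2-fs or EXT3-fs. The function names are hypothetical, and the matching rule is only a heuristic derived from the bulletin's examples, not an official detection method.

```python
import re

# Sample event-log entry copied from CSB-151124-1.
SAMPLE = (
    'date=2015-10-19 time=08:49:12 devname=FortiGate devid=FGT60D3912621349 '
    'logid=0100020010 type=event subtype=system level=critical vd="root" '
    'logdesc="Kernel error" msg="EXT3-fs error (device sd(8,3)): '
    'ext3_get_inode_loc: unable to read inode block - inode=132, block=8"'
)

# key=value pairs where the value is either double-quoted or a bare token.
FIELD_RE = re.compile(r'(\w+)=("([^"]*)"|\S+)')


def parse_log_line(line):
    """Split one key=value log line into a dict, stripping the quotes."""
    fields = {}
    for match in FIELD_RE.finditer(line):
        key = match.group(1)
        quoted = match.group(3)
        fields[key] = quoted if quoted is not None else match.group(2)
    return fields


def looks_like_flash_error(fields):
    """Heuristic (assumption): critical event whose msg names EXT2-fs/EXT3-fs."""
    return (
        fields.get("level") == "critical"
        and re.search(r"EXT[23]-fs", fields.get("msg", "")) is not None
    )


fields = parse_log_line(SAMPLE)
if looks_like_flash_error(fields):
    print(f'{fields["devid"]}: possible flash disk error: {fields["msg"]}')
```

Run over an exported log file line by line, this would give a list of device IDs worth checking before they lock up. It says nothing about units that hang without logging anything, which matches what some of us are seeing.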
