FAP320Bs dropping
I have four FAP320B APs controlled by a cluster of FGT90 firewalls.
Periodically I will see individual APs "leave" the network. The APs will rejoin after some period of time (minutes). Less frequently, I will see all four APs leave the network more or less simultaneously and rejoin after some period of time (the same minutes timespan). This happens on the order of 5 to 40 individual events per day.
When one AP leaves, its clients usually join another nearby AP and all is well. However, when all four leave simultaneously, no AP is available to service connection requests and the users notice. This happens on the order of 1 to 3 times per day.
The FortiGate is not logging any useful information.
This is an example of a failure: clients notice a problem, then approximately two minutes later the FortiGate logs this:
2015-02-26T09:01:14.128012-05:00 wheel date=2015-02-26 time=09:01:14 devname=fw-ottawa-A devid=FGT90D3Z-------- logid=0104043522 type=event subtype=wireless level=notice vd="root" logdesc="physical AP activity" sn="FP320B3X--------" ap="elpfap1" profile="FAP320B-default" ip=10.8.0.31 meshmode="mesh root ap" snmeshparent="N/A" action="ap-fail" reason="Control message maximal retransmission limit reached" msg="AP elpfap1 failed."
2015-02-26T09:01:14.128646-05:00 wheel date=2015-02-26 time=09:01:14 devname=fw-ottawa-A devid=FGT90D3Z-------- logid=0104043522 type=event subtype=wireless level=notice vd="root" logdesc="physical AP activity" sn="FP320B3X--------" ap="elpfap1" profile="FAP320B-default" ip=10.8.0.31 meshmode="mesh root ap" snmeshparent="N/A" action="ap-leave" reason="Control message maximal retransmission limit reached" msg="AP elpfap1 left."
2015-02-26T09:01:45.390396-05:00 wheel date=2015-02-26 time=09:01:45 devname=fw-ottawa-A devid=FGT90D3Z-------- logid=0104043522 type=event subtype=wireless level=notice vd="root" logdesc="physical AP activity" sn="FP320B3X--------" ap="elpfap1" profile="FAP320B-default" ip=10.8.0.31 meshmode="mesh root ap" snmeshparent="N/A" action="ap-join" reason="N/A" msg="AP elpfap1 joined."
2015-02-26T09:01:45.393459-05:00 wheel date=2015-02-26 time=09:01:45 devname=fw-ottawa-A devid=FGT90D3Z-------- logid=0104043526 type=event subtype=wireless level=notice vd="root" logdesc="physical AP radio activity" sn="FP320B3X--------" ap="elpfap1" ip="10.8.0.31" radioid=1 configcountry="US " opercountry="US " cfgtxpower=23 opertxpower=21 action="config-txpower" msg="AP elpfap1 radio 1 cfg txpower is changed to 23 dBm."
2015-02-26T09:01:45.394045-05:00 wheel date=2015-02-26 time=09:01:45 devname=fw-ottawa-A devid=FGT90D3Z-------- logid=0104043526 type=event subtype=wireless level=notice vd="root" logdesc="physical AP radio activity" sn="FP320B3X--------" ap="elpfap1" ip="10.8.0.31" radioid=2 configcountry="US " opercountry="US " cfgtxpower=27 opertxpower=22 action="config-txpower" msg="AP elpfap1 radio 2 cfg txpower is changed to 27 dBm."
2015-02-26T09:01:45.543350-05:00 wheel date=2015-02-26 time=09:01:45 devname=fw-ottawa-A devid=FGT90D3Z-------- logid=0104043526 type=event subtype=wireless level=notice vd="root" logdesc="physical AP radio activity" sn="FP320B3X--------" ap="elpfap1" ip="10.8.0.31" radioid=1 configcountry="US " opercountry="US " cfgtxpower=23 opertxpower=21 action="country-config-success" msg="AP elpfap1 radio 1 country US (841) set success."
2015-02-26T09:01:45.547094-05:00 wheel date=2015-02-26 time=09:01:45 devname=fw-ottawa-A devid=FGT90D3Z-------- logid=0104043526 type=event subtype=wireless level=notice vd="root" logdesc="physical AP radio activity" sn="FP320B3X--------" ap="elpfap1" ip="10.8.0.31" radioid=1 configcountry="US " opercountry="US " cfgtxpower=23 opertxpower=21 action="oper-txpower" msg="AP elpfap1 radio 1 oper txpower is changed to 21 dBm."
2015-02-26T09:01:45.547569-05:00 wheel date=2015-02-26 time=09:01:45 devname=fw-ottawa-A devid=FGT90D3Z-------- logid=0104043526 type=event subtype=wireless level=notice vd="root" logdesc="physical AP radio activity" sn="FP320B3X--------" ap="elpfap1" ip="10.8.0.31" radioid=1 configcountry="US " opercountry="US " cfgtxpower=23 opertxpower=21 action="country-config-success" msg="AP elpfap1 radio 1 country US (841) set success."
2015-02-26T09:01:45.550162-05:00 wheel date=2015-02-26 time=09:01:45 devname=fw-ottawa-A devid=FGT90D3Z-------- logid=0104043526 type=event subtype=wireless level=notice vd="root" logdesc="physical AP radio activity" sn="FP320B3X--------" ap="elpfap1" ip="10.8.0.31" radioid=1 configcountry="US " opercountry="US " cfgtxpower=23 opertxpower=21 action="oper-channel" msg="AP elpfap1 radio 1 operating channel 0 ==> 149."
2015-02-26T09:01:45.658946-05:00 wheel date=2015-02-26 time=09:01:45 devname=fw-ottawa-A devid=FGT90D3Z-------- logid=0104043526 type=event subtype=wireless level=notice vd="root" logdesc="physical AP radio activity" sn="FP320B3X--------" ap="elpfap1" ip="10.8.0.31" radioid=2 configcountry="US " opercountry="US " cfgtxpower=27 opertxpower=22 action="country-config-success" msg="AP elpfap1 radio 2 country US (841) set success."
2015-02-26T09:01:45.712637-05:00 wheel date=2015-02-26 time=09:01:45 devname=fw-ottawa-A devid=FGT90D3Z-------- logid=0104043526 type=event subtype=wireless level=notice vd="root" logdesc="physical AP radio activity" sn="FP320B3X--------" ap="elpfap1" ip="10.8.0.31" radioid=2 configcountry="US " opercountry="US " cfgtxpower=27 opertxpower=22 action="oper-txpower" msg="AP elpfap1 radio 2 oper txpower is changed to 22 dBm."
2015-02-26T09:01:45.713336-05:00 wheel date=2015-02-26 time=09:01:45 devname=fw-ottawa-A devid=FGT90D3Z-------- logid=0104043526 type=event subtype=wireless level=notice vd="root" logdesc="physical AP radio activity" sn="FP320B3X--------" ap="elpfap1" ip="10.8.0.31" radioid=2 configcountry="US " opercountry="US " cfgtxpower=27 opertxpower=22 action="country-config-success" msg="AP elpfap1 radio 2 country US (841) set success."
2015-02-26T09:01:45.766208-05:00 wheel date=2015-02-26 time=09:01:45 devname=fw-ottawa-A devid=FGT90D3Z-------- logid=0104043526 type=event subtype=wireless level=notice vd="root" logdesc="physical AP radio activity" sn="FP320B3X--------" ap="elpfap1" ip="10.8.0.31" radioid=2 configcountry="US " opercountry="US " cfgtxpower=27 opertxpower=22 action="oper-channel" msg="AP elpfap1 radio 2 operating channel 0 ==> 11."
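For anyone who wants to pull numbers out of a log like this, here is a minimal Python sketch that splits each line into its key=value fields and pairs every ap-leave with the next ap-join for the same AP. It assumes one entry per line with date=, time=, ap= and action= fields as in the sample above; the function names (parse_line, outages) are made up for illustration and are not part of any Fortinet tool.

```python
import re
from datetime import datetime

# Matches key=value or key="quoted value" pairs as they appear in the log.
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')

def parse_line(line):
    """Return the key=value fields of one log line as a dict (quotes stripped)."""
    return {key: value.strip('"') for key, value in FIELD_RE.findall(line)}

def outages(lines):
    """Yield (ap, left_at, joined_at, seconds) for each leave/join pair."""
    left = {}  # ap name -> time of the most recent unmatched ap-leave
    for line in lines:
        fields = parse_line(line)
        action = fields.get("action")
        if action not in ("ap-leave", "ap-join"):
            continue
        when = datetime.strptime(
            fields["date"] + " " + fields["time"], "%Y-%m-%d %H:%M:%S"
        )
        ap = fields["ap"]
        if action == "ap-leave":
            left[ap] = when
        elif ap in left:
            start = left.pop(ap)
            yield ap, start, when, (when - start).total_seconds()
```

Run against the sample above, this pairs the 09:01:14 leave with the 09:01:45 join (31 seconds). Note that the logged gap understates the real outage: the leave is only logged once the controller has given up retransmitting, which in this case is roughly two minutes after clients first notice.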
The APs are using TrendNET TPE-113GI POE injectors for power. These are connected back to Dell PowerConnect 5548 switches, which then connect to Juniper EX2200 switches, and finally into the firewall cluster.
When examined through the Managed Devices pane, the devices do not show as having rebooted during these outages.
The APs remain pingable during these outages.
Things that support has had me try:
- downgrade the APs from 5.2.2 to 5.0.9 (the firewall is still running 5.2.2) -- this reduced my individual ap-fail frequency from hundreds per day down to what you see now, easily an order-of-magnitude drop
- stop using the APs' passthrough -- some of the APs had computers daisy-chained off their second Ethernet ports; interestingly, when the APs "left", the computers could not pass traffic through to the rest of the network; changing this had no effect on the events (although it has made those previously daisy-chained computers much more reliable)
- change which ethernet device is connected to the injector -- this had no effect
- check the FortiGate management page during an outage to see if the device was present or not -- because these events are unpredictable and brief, I have not been able to "catch" the device in the act so to speak.
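Since the events are too brief to catch by hand, one option is to let the syslog host do the bookkeeping. The sketch below is an illustration only, in Python: it collects the ap-fail entries and groups those that occur close together, so that single-AP drops can be told apart from the all-APs-at-once outages and their timestamps compared against switch or injector logs. The 30-second grouping window and the function names are assumptions, not anything FortiOS defines.

```python
import re
from datetime import datetime, timedelta

FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')

def fail_events(lines):
    """Return a time-sorted list of (timestamp, ap) for every ap-fail entry."""
    events = []
    for line in lines:
        fields = {k: v.strip('"') for k, v in FIELD_RE.findall(line)}
        if fields.get("action") == "ap-fail":
            when = datetime.strptime(
                fields["date"] + " " + fields["time"], "%Y-%m-%d %H:%M:%S"
            )
            events.append((when, fields["ap"]))
    return sorted(events)

def bursts(events, window=timedelta(seconds=30)):
    """Group failures that follow the previous one within `window` (assumed value)."""
    groups = []
    for when, ap in events:
        if groups and when - groups[-1][-1][0] <= window:
            groups[-1].append((when, ap))
        else:
            groups.append([(when, ap)])
    return groups

def summarize(groups):
    """Return (first_failure_time, sorted AP names) for each burst."""
    return [(group[0][0], sorted({ap for _, ap in group})) for group in groups]
```

A burst that lists every AP points at something shared (the controller, or a switch in the path), while bursts with a single AP point at that AP's own link or power.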
Does anyone have any ideas about what I can do to diagnose and correct this issue? I have tickets in to support, but as you can see above, the results have been less than ideal.
