Ranching technology post #9 - some additional code for the surveillance system

Frustrating syslog entries showing a surveillance station dropping off the Wi-Fi network and ceasing attempts to reconnect. 
As you may have read in previous ranching technology posts on this blog, our ranch has an off-grid surveillance system built from relatively low-cost components (e.g. Raspberry Pis, USB webcams, etc.), open-source software, and some basic scripting.
A diagram of the internal components of the central surveillance station.
There is one central surveillance station with cameras, a WiFi router, and a mobile broadband Internet connection. 
A diagram of the internal components of an independent surveillance station.

Linked to this central station via WiFi are two separate or "independent" surveillance stations located 100 or more meters away.  The above graphics are intended to give the reader a general overview of the internal hardware for both types of surveillance stations, but a more detailed description and photographs of the central surveillance station design can be found here and a description of the independent surveillance stations can be found here.

A diagram of the surveillance system when it is working properly.  Internet access is supplied via a mobile broadband device linked to a Wi-Fi router.  The Wi-Fi router relays the Internet connection to the central station's main computer and to any external device able to connect to the router.  In this diagram one independent surveillance station and one personal computer are connected to the Wi-Fi router. 
Though this system has largely worked very well, it has occasionally run into problems (often related to weather extremes or insect life) for which I've needed to find solutions.  This specific post describes software-based solutions I've developed to automatically remedy three different disruptions to surveillance system communications.

Before discussing the solutions I will discuss the three types of disruption.

Problem 1: The first type of disruption, depicted in the above graphic, is a breakdown in the communications link between the Wi-Fi router and the cellular tower five miles away.  In most cases this breakdown is caused by a malfunction in the network bridge (built from a Raspberry Pi) connecting the router to the mobile broadband device.  The specific malfunction occurring within the network bridge has not been identified, but there is often a correlation between this malfunction and the sudden onset of severe weather with a rapid change in humidity and temperature.  This form of disruption occurs least often and the only known remedy is to reset the Raspberry Pi network bridge.
Problem 2: The second and most common type of communications disruption is a breakdown in communications between an independent station and the Wi-Fi router.  Based upon personal observations and an examination of syslog entries in the independent stations, this type of disruption starts out as what would normally be considered a temporary loss of Wi-Fi network availability.  This temporary attenuation of the Wi-Fi signal is often caused by heavy rain or evening dew and usually subsides after a period of hours.  Initially the Raspberry Pi computer in an independent station will make attempts to reacquire the Wi-Fi connection.  However, for reasons I am still unable to determine, at some point during the outage the Raspberry Pi will essentially give up trying to connect and never try again.  Once an independent surveillance station has reached this point, you can fix the problem either by physically removing and then reinserting the Pi's Wi-Fi dongle or by resetting the Pi.

Problem 3: The final and least common communications disruption is a malfunction of the Ethernet connection between the main computer and Wi-Fi router within the central station.  In the darkest and coldest days of winter the system can temporarily lose power at night and then start up midway through the next day.  For reasons unknown to me, on some of these occasions the main computer's Ethernet connection will never initiate.  In addition to cold weather power outages, the occasional pitter patter of tiny insect feet across the circuit boards inside the surveillance system during warm weather can cause a loss of Ethernet connection (that's my theory at least).

And now for the software solutions to these annoying communications disruptions.  As with some of my previous ranching technology posts I'm going to have to go "full-nerd" from here on.
The cron job for rebooting the network bridge every morning at 2AM.
After thinking it over I decided that the best solution to Problem 1 was to create a cron job that reboots the network bridge's Raspberry Pi operating system every day at 2AM.  While not the well-tailored solution I would have liked to produce, this solution resolves Problem 1 with the least amount of time and effort on my part.
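For readers who cannot see the image, a daily 2AM reboot can be expressed as a single line in root's crontab. The snippet below is only a sketch: the use of `/sbin/shutdown -r now` as the reboot command and the variable name `CRON_LINE` are my assumptions here, not necessarily what runs on the bridge.

```shell
# Crontab fields: minute hour day-of-month month day-of-week command
# "0 2 * * *" means minute 0 of hour 2, every day.
# The reboot command is an assumption; adjust it to your system.
CRON_LINE='0 2 * * * /sbin/shutdown -r now'

# Print the line so it can be pasted into root's crontab (sudo crontab -e).
echo "$CRON_LINE"
```

The entry belongs in root's crontab because an ordinary user is normally not allowed to reboot the machine.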
The script I ultimately ended up using to resolve Problems 2 and 3.  The independent surveillance stations and main computer ping the mobile broadband device's local IP address every five minutes.  If no reply is received, the script waits three hours and pings again.  If no reply is received the second time, the system reboots.  The three-hour wait helps prevent excessive rebooting in the event that the broadband device is down, and is also about as long as I'm willing to wait for system operations to resume.
My solution to Problems 2 & 3, shown and explained in the above image, is somewhat more complex than the previously discussed solution, but still fairly simple. This script, named fixhardwareerrors.sh, is based on a solution I found posted here by Arne Schmitz.  The specific solution Arne posted did not work for me, but the modified version of Arne's script, shown in the above image, has worked well for me now for several months.  It has resolved several independent station disruptions and one main computer disruption.
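The behaviour described above can be sketched roughly as follows. This is not a copy of fixhardwareerrors.sh, only a minimal illustration of the ping, wait, ping again, reboot logic. The IP address, the function and variable names, the reboot command, and the `ping -W` timeout (Linux iputils syntax) are all assumptions made for the example.

```shell
#!/bin/sh
# Sketch of a connectivity watchdog. Every name and value here is an assumption.
TARGET_IP="${TARGET_IP:-192.168.1.1}"     # hypothetical local IP of the broadband device
CHECK_INTERVAL="${CHECK_INTERVAL:-300}"   # seconds between routine checks (five minutes)
RETRY_WAIT="${RETRY_WAIT:-10800}"         # seconds before the second ping (three hours)
REBOOT_CMD="${REBOOT_CMD:-/sbin/shutdown -r now}"

# Succeeds if the target answers one ping within 10 seconds.
link_up() {
    ping -c 1 -W 10 "$TARGET_IP" >/dev/null 2>&1
}

# One watchdog pass: ping; on failure wait and ping again;
# reboot only if both pings go unanswered.
check_once() {
    link_up && return 0
    sleep "$RETRY_WAIT"
    link_up && return 0
    $REBOOT_CMD
}

# Endless loop: one pass, then sleep until the next routine check.
watch_forever() {
    while true; do
        check_once
        sleep "$CHECK_INTERVAL"
    done
}

# Start the loop only when this file is executed under its own name,
# so the functions above can also be sourced and tried out by hand.
case "${0##*/}" in
    fixhardwareerrors.sh) watch_forever ;;
esac
```

To try the logic without waiting three hours or rebooting anything, the values can be overridden after sourcing the file, for example `RETRY_WAIT=5 REBOOT_CMD="echo would reboot"` followed by a call to `check_once`.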
The script is initiated upon system startup by an entry in the rc.local file.

Unlike Arne's script, which is executed periodically by cron, my script runs as an infinite loop initiated at startup by the above command placed in the rc.local file.
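A startup entry of this kind might look like the sketch below. The install path `/home/pi/fixhardwareerrors.sh` and the variable name `WATCHDOG` are assumptions for illustration only; use wherever the script actually lives on your system.

```shell
# Excerpt for /etc/rc.local; the install path below is an assumption.
WATCHDOG=/home/pi/fixhardwareerrors.sh

# Start the watchdog in the background so its endless loop
# does not block the rest of the boot sequence.
if [ -x "$WATCHDOG" ]; then
    "$WATCHDOG" &
fi
```

The trailing `&` matters: without it rc.local would never finish, because the script never returns. These lines need to sit above the `exit 0` that normally ends rc.local.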

Hopefully, with these three communications annoyances mitigated to my satisfaction, I will now have more time available to focus my future efforts on expanding the capabilities of my surveillance system.
