All rPis went belly up

Posted on
Thu Aug 17, 2017 6:37 am
johnpolasek offline
Posts: 911
Joined: Aug 05, 2011
Location: Aggieland, Texas

All rPis went belly up

They went down around 2 pm yesterday. The system has been working so well that I didn't notice till this morning, but one of the rPis stopped updating its beacon at about 13:38 and the other three at about 14:41. Since then I'm not getting any beacon or GPIO data from any of them, and with the .135 version's minimal debug info I'm not getting any failed update messages in the Indigo log either. I'm at work, so all I can see through the VPN is that none of the Pis are responding to pings. I still have ApplianceLincs on two of them, but cycling power on those two didn't help. Once I get home, I can put a screen and keyboard on them to see if they are still alive locally, but what logs should I look at to see what went wrong (assuming they haven't totally locked up somehow)?

Posted on
Thu Aug 17, 2017 8:56 am
kw123 offline
Posts: 8333
Joined: May 12, 2013
Location: Dallas, TX

Re: All rPis went belly up

If they do not respond to ping, it's likely they are shut down.
But a power cycle should restart them.



Posted on
Thu Aug 17, 2017 9:49 am
johnpolasek offline
Posts: 911
Joined: Aug 05, 2011
Location: Aggieland, Texas

Re: All rPis went belly up

kw123 wrote:
If they do not respond to ping, it's likely they are shut down.
But a power cycle should restart them.



The fact that neither of the two that I could power cycle came back up means that it is something else. I'm currently hoping that something just trashed their IPs and/or Wi-Fi passwords, but I'm downloading your newest Jessie just in case something did a cd /;rm * -r from root...

Posted on
Thu Aug 17, 2017 12:40 pm
kw123 offline
Posts: 8333
Joined: May 12, 2013
Location: Dallas, TX

Re: All rPis went belly up

Are you sure they have power, and that the power actually went off? If they continue to have power, they will not restart after a shutdown.
Both of them not rebooting is very unlikely.
Which shutdown command did you issue?



Posted on
Thu Aug 17, 2017 12:45 pm
johnpolasek offline
Posts: 911
Joined: Aug 05, 2011
Location: Aggieland, Texas

Re: All rPis went belly up

Not being there to watch them, I can't be SURE the ApplianceLincs powered down, but they did acknowledge the off and on commands.

Posted on
Fri Aug 18, 2017 4:58 am
johnpolasek offline
Posts: 911
Joined: Aug 05, 2011
Location: Aggieland, Texas

Re: All rPis went belly up

The problem was in the wireless router. I have no idea what it was doing, but cycling power on a Pi when I got home did nothing, whereas cycling power on the router brought 2 of them back on their own within a couple of minutes, and cycling power on the remaining 2 brought them back up as well.

Posted on
Fri Aug 18, 2017 5:31 am
kw123 offline
Posts: 8333
Joined: May 12, 2013
Location: Dallas, TX

Re: All rPis went belly up

When they lose connection to the Indigo server, they will reboot themselves every 10 minutes or so. I believe the other two would also have come back, just a little later.



Posted on
Fri Aug 18, 2017 8:29 am
johnpolasek offline
Posts: 911
Joined: Aug 05, 2011
Location: Aggieland, Texas

Re: All rPis went belly up

kw123 wrote:
When they lose connection to the Indigo server, they will reboot themselves every 10 minutes or so. I believe the other two would also have come back, just a little later.



So the next question is whether there is any way to detect when an rPi loses its connection, even if its Bluetooth beacon is still seen by the other Pis in the house. Does the "online" property indicate that the device is communicating, as opposed to the "up" status, which indicates that its beacon is working? I'd like to add an indicator to my control page and maybe a plot to see when the various servers are not working...

Posted on
Fri Aug 18, 2017 8:37 am
kw123 offline
Posts: 8333
Joined: May 12, 2013
Location: Dallas, TX

Re: All rPis went belly up

Good suggestion. I will think about it.
Naturally it will not work if you have just one RPi or if the RPis' Bluetooth radios don't see each other.



Posted on
Sat Aug 19, 2017 5:21 am
johnpolasek offline
Posts: 911
Joined: Aug 05, 2011
Location: Aggieland, Texas

Re: All rPis went belly up

Playing with implementing this last night, I came up with a better way. I created PixOnline variables for each Pi and a trigger on each Pi_IN_x variable change. The trigger has one action that sets PixOnline to true and a second action, delayed by 1 minute and 5 seconds with "override prior delay" set, that sets PixOnline to false. So as long as a Pi is updating its information at least once a minute, the Online variable remains true, but if it doesn't, Online goes false until the next message comes in. The only thing I'm not sure of is whether it'll still work when all the beacons leave, but since the rPis are cross communicating, that should keep the messages coming, correct?
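The variable-and-trigger arrangement described here is essentially a watchdog timer: every incoming message restarts a countdown, and the Pi counts as offline once the countdown runs out. Outside Indigo, the same logic could be sketched in Python roughly as follows. The class and method names are made up for illustration, and the 65-second timeout simply mirrors the 1 minute 5 second delay; none of this is Indigo or piBeacon code.

```python
import time


class HeartbeatMonitor:
    """Tracks the last message time per Pi (hypothetical helper, not an Indigo API)."""

    def __init__(self, timeout_seconds=65.0):
        # 65 s mirrors the "1 minute and 5 second" delay used in the trigger.
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}

    def message_received(self, pi_name, now=None):
        # Called whenever a Pi_IN_x style update arrives; restarts the countdown.
        self.last_seen[pi_name] = time.time() if now is None else now

    def is_online(self, pi_name, now=None):
        # Online only if a message arrived within the timeout window.
        now = time.time() if now is None else now
        last = self.last_seen.get(pi_name)
        return last is not None and (now - last) <= self.timeout_seconds
```

The optional `now` argument exists only so the timing can be checked without waiting; in normal use it would be left out so that the current clock time is used.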

Posted on
Sat Aug 19, 2017 6:31 am
kw123 offline
Posts: 8333
Joined: May 12, 2013
Location: Dallas, TX

Re: All rPis went belly up

As long as they see each other, yes.

I will add another status:
State-ip, which will have the values up/down/expired.
It will only be set to "up" if the RPi sends any message (through the IP network). It will be set to "down" after 90 secs without a message and to "expired" after another 90 secs. Planned for the end of August.
That will be independent of the Bluetooth status.
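As a rough illustration of the planned up/down/expired logic, here is a minimal Python sketch. It is not the plugin's code: the function name and the exact boundary handling (whether 90 seconds exactly counts as "up" or "down") are assumptions; only the two 90-second steps come from the description above.

```python
def ip_state(seconds_since_last_message, down_after=90.0, expire_after=90.0):
    """Map the time since the last IP message to up/down/expired (illustrative only)."""
    if seconds_since_last_message < down_after:
        return "up"
    # "expired" follows another expire_after seconds after the state went "down".
    if seconds_since_last_message < down_after + expire_after:
        return "down"
    return "expired"
```

Because the state depends only on messages received over the IP network, it stays meaningful even when the other Pis still see the Bluetooth beacon.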




Posted on
Thu Aug 31, 2017 4:46 pm
johnpolasek offline
Posts: 911
Joined: Aug 05, 2011
Location: Aggieland, Texas

Re: All rPis went belly up

I played with using the online state to set a true/false variable and got this... Pis 2 and 3 are the ones that have the 32 GB cards. Which log should I be looking at?
Attachment: Pis.png (77.41 KiB)

Posted on
Thu Aug 31, 2017 7:30 pm
kw123 offline
Posts: 8333
Joined: May 12, 2013
Location: Dallas, TX

Re: All rPis went belly up

To confirm this, I would run an endless ping overnight:

Code: Select all
ping -i 5 192.168.1.1 | while read xx; do echo "$(date): $xx"; done > out

Replace 192.168.1.1 with the IP address of the RPi (you can do this in 2 windows for 2 RPis). The -i option sets the interval between pings, so each line is stamped with the time it actually arrived.

With a 2-second interval (-i 2) it will produce:
Code: Select all
Thu Aug 31 20:44:35 CDT 2017: PING 192.168.1.1 (192.168.1.1): 56 data bytes
Thu Aug 31 20:44:37 CDT 2017: 64 bytes from 192.168.1.1: icmp_seq=0 ttl=64 time=0.496 ms
Thu Aug 31 20:44:39 CDT 2017: 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.484 ms
Thu Aug 31 20:44:41 CDT 2017: 64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.534 ms

Then look at the file "out" the next day.
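Scanning "out" by eye gets tedious after a night of pings. Below is a minimal Python sketch for finding the gaps, assuming the line format shown above (weekday, month, day, time, timezone, year, a colon, then the ping output). The function names and the 15-second threshold are illustrative choices, not part of piBeacon or Indigo.

```python
from datetime import datetime


def parse_line(line):
    """Split one 'date: ping output' line into (timestamp, text); None if malformed."""
    tokens = line.split()
    if len(tokens) < 7:
        return None
    # tokens: Thu Aug 31 20:44:35 CDT 2017: <ping output ...>
    # The timezone token is dropped because %Z parsing is platform dependent.
    stamp = " ".join(tokens[0:4] + [tokens[5].rstrip(":")])
    try:
        when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    except ValueError:
        return None
    return when, " ".join(tokens[6:])


def find_gaps(lines, max_gap_seconds=15):
    """Return (last_reply, next_reply) pairs that are more than max_gap_seconds apart."""
    gaps = []
    previous = None
    for line in lines:
        parsed = parse_line(line)
        # Only successful replies count; header and timeout lines are skipped.
        if parsed is None or "bytes from" not in parsed[1]:
            continue
        when = parsed[0]
        if previous is not None and (when - previous).total_seconds() > max_gap_seconds:
            gaps.append((previous, when))
        previous = when
    return gaps
```

Something like `find_gaps(open("out"))` would then list the periods during which the RPi did not answer. The threshold should be comfortably larger than the ping interval.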


Also, could you run an SQL report on the state with Indigo utilities?


Log file to look at:
Code: Select all
cat /var/log/syslog


Karl

Posted on
Fri Sep 01, 2017 7:32 am
johnpolasek offline
Posts: 911
Joined: Aug 05, 2011
Location: Aggieland, Texas

Re: All rPis went belly up

We're heading home for the holiday, but I'll try that once I get back. I'm wondering if it has something to do with the 32 GB cards: Pi0 and Pi1 have 16 GB cards, and even though the big ones are supposed to be fast, I'm wondering if there's some kind of latency involved. Also, I assume the "blips" I'm seeing just after midnight every night on all the Pis are the nightly reboot? Did you ever consider deliberately staggering that task (say by 5 or 10 minutes per Pi#) so they aren't all rebooting at once?

Posted on
Fri Sep 01, 2017 7:55 am
kw123 offline
User avatar
Posts: 8333
Joined: May 12, 2013
Location: Dallas, TX

Re: All rPis went belly up

Staggering the reboot is a good idea. I will do hour + minute*2*pi#.
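One way to read that formula is "the reboot hour plus 2 minutes times the Pi number". A minimal Python sketch under that reading follows; the function name is hypothetical, and the default base hour of 0 is only an assumption based on the blips appearing just after midnight.

```python
def staggered_reboot_time(pi_number, base_hour=0, minutes_per_pi=2):
    """Return (hour, minute) of the nightly reboot for a given Pi number (illustrative)."""
    total_minutes = base_hour * 60 + minutes_per_pi * pi_number
    # Wrap around midnight so large offsets still give a valid time of day.
    return divmod(total_minutes % (24 * 60), 60)
```

With these defaults, Pi 0 would reboot at 00:00, Pi 1 at 00:02, Pi 2 at 00:04, and so on, so no two Pis are down at the same moment.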


