Glen Pitt-Pladdy :: Blog

PICing up 433MHz Signals for OSS Home Automation - Part 7

Things are still progressing gradually; however, there are some annoying hard-to-catch bugs that are finally getting caught.

Buffer overruns

After adding a regular WDT ping to the system, comms failures were happening every few days. Their irregularity made them difficult to catch, and after adding a lot of debug code I came to the conclusion that the problem was likely buffer overruns.

What appears to be happening is this: the regular received-data communications are 4 bytes each, but the WDT ping acknowledge is only 1 byte. When the buffer is full, new data gets thrown away, but because the buffer size is a power of 2 the 4-byte communications fit neatly: we lose complete transmissions, so the framing remains consistent even though data is lost. Introduce a single extra byte during a buffer-full condition, however, and suddenly we end up with mismatched framing after an overrun.
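The effect can be sketched in a few lines of Python (purely illustrative — the buffer size, frame values and ack byte here are made up, not taken from the real firmware or host code):

```python
# Hypothetical sketch of the framing failure - not the real firmware/host code.
BUF_SIZE = 16      # power of 2, and so also a multiple of the 4-byte frame size
ACK = 0xAA         # stand-in value for the 1-byte WDT ping acknowledge

def feed(buf, data):
    """PIC side: store each byte if there is room in the buffer, else drop it."""
    for b in data:
        if len(buf) < BUF_SIZE:
            buf.append(b)

def parse(stream, frame_len=4):
    """Host side: strip leading ping acks, then split into whole 4-byte frames."""
    i = 0
    while i < len(stream) and stream[i] == ACK:
        i += 1
    rest = stream[i:]
    if len(rest) % frame_len:
        return None                      # mismatched framing - stream unparsable
    return [rest[j:j + frame_len] for j in range(0, len(rest), frame_len)]

# Only 4-byte frames: the overrun drops a *whole* frame and framing survives
buf1 = []
for n in range(5):                       # 5 frames = 20 bytes into a 16-byte buffer
    feed(buf1, [n] * 4)
print(parse(buf1))                       # frames 0-3 intact; frame 4 lost entirely

# Same traffic, but with a 1-byte ping ack already queued the last frame is
# dropped *partially* and everything after the overrun is misframed
buf2 = [ACK]
for n in range(5):
    feed(buf2, [n] * 4)
print(parse(buf2))                       # None - framing mismatch
```

The key point the sketch demonstrates: the overrun itself is harmless as long as whole frames are lost; it is the odd-sized ack byte that breaks the alignment.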

I was considering handshaking, but that would actually be no better - there isn't anywhere to buffer significant data on the PIC, so data would still get lost, just at a different point in the chain.

The only solution, then, is to find a way of reducing the risk of buffer overruns and recovering from them. They are rare, so the loss of functionality will not be significant. Remember, this 433MHz stuff is open-loop with no anti-clash mechanism, so there are likely plenty of losses anyway. From the infrequency of these problems I already know the losses are insignificant compared to what I see from transmission clashes, so the priority is mostly recovery, but we will look at ways to improve things anyway.

Software Improvements

Since we currently have a 10ms timeout on reads, there should be very little chance of buffer overruns with normal transmission volumes. Since the system is not threaded, the question becomes more one of how we minimise the time between reads. In the longer term a read thread is an option, but for now the main priority is making things work better.
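As a rough illustration of what a short read timeout looks like, here is a Python sketch using a pipe as a stand-in for the serial device (the real host code is Perl, and the function name is an invention for illustration):

```python
import os
import select

# Hypothetical sketch of a read with a 10ms timeout - illustrative only.
def read_available(fd, timeout=0.010, chunk=1024):
    """Return whatever bytes are ready on fd within the timeout, else b''."""
    ready, _, _ = select.select([fd], [], [], timeout)
    if not ready:
        return b''                # timeout expired with nothing waiting
    return os.read(fd, chunk)

r, w = os.pipe()                  # pipe standing in for the serial port
os.write(w, b'\x01\x02\x03\x04')
print(read_available(r))          # data was already waiting, returned at once
print(read_available(r))          # nothing arrives within 10ms, so b''
```

The important property is that the call returns promptly either way, so the loop gets back to reading quickly and the device-side buffer has less chance to fill.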

As with many pieces of software, the way it evolves as it gets written means that quick fixes, or approaches to processing data that were necessary at the time, become unnecessary inefficiencies after a few cycles of evolution. This was exactly the case with the host-side code. After removing swathes of old commented-out code from the read loop, it turned out I had this:

for ( my $i = 0; $i < length ( $newdata ); ++$i ) {
    push ( @{$self->{rxbuffer}}, ord ( substr ( $newdata, $i, 1 ) ) );
}

That's exceedingly inefficient code, and it matches a classic C interview question for programmers! Not only is the equivalent O(n²) in C (Perl may do caching or keep string-length metadata that avoids this), but it processes characters one at a time from the input buffer ($newdata), which is asking for trouble in a high-level scripting language, and who knows what other overheads it creates in the Perl runtime engine.

This was replaced with the much more efficient, scalable and tidier single line:

push ( @{$self->{rxbuffer}}, unpack ( 'C1024', $newdata ) );

.... and it does seem to have reduced overruns, though given how infrequent they were anyway it's not conclusive.
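For anyone more comfortable in Python, the same before/after pattern looks like this (illustrative only — a translation of the shape of the Perl code, not taken from the host code itself):

```python
# Python analogue of the Perl fix - illustrative only.
newdata = bytes(range(256)) * 4          # pretend this arrived from the serial port

# Before: one runtime round-trip per character (the shape of the original loop)
rxbuffer_slow = []
for i in range(len(newdata)):
    rxbuffer_slow.append(newdata[i])

# After: one bulk call, letting the runtime do the loop internally
rxbuffer_fast = []
rxbuffer_fast.extend(newdata)            # bytes iterate as ints, like unpack 'C*'

print(rxbuffer_slow == rxbuffer_fast)    # same result, far fewer interpreter steps
```

Either way the result is identical; the difference is how much work happens per byte inside the interpreter rather than in one native call.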

Hardware

I've also previously considered improving the rather crude hysteresis and filter setup currently in use. This should avoid a lot of the noise-induced receive events that get sent to the host.

Previously I adapted the existing circuit with a filter while maintaining crude hysteresis.

XD-RF-5V 433MHz RX Module After Modification

Now I'm introducing another resistor to filter the signal, with a higher-value cap, so that overall the impedance driving the hysteresis is low relative to the input impedance of the hysteresis circuit.

XD-RF-5V 433MHz RX Module After Modification 2

This should avoid filtering the hysteresis as well, which risked susceptibility to bursts of noise on transitions and during idle times.

While I had the soldering iron out and the cover off, I also changed the resistor on the green (RX) LED from 680R to 1K2 to bring its brightness into line with the others... it could even go further.

Initial indications are promising - the data LED is spending a lot more time off in high-interference situations, which suggests far fewer transitions are being sent back to the host. This should not only reduce the chance of buffer overruns and the associated processing on the host, but also bring other slight improvements.

Software Recovery

Once a failure occurs the input stream becomes unparsable and the server quits, shortly followed by the PIC being reset by its WDT, which is no longer getting pings. What is required is a smart error handler that works through each layer until it finds one that allows it to recover the situation.

My initial list of layers of recovery to look at:

  • Can the corrupted data/frames be identified and removed from the buffer? This will rely on working back from the end of the buffer matching the frames until the bad frame (last one at the front of the buffer) can be identified and cleared.
  • Does stopping reception and re-syncing communications fix the problem? This will use the Reset and Ping commands to verify sync and re-sync it.
  • Last resort is to close the device, wait long enough for the WDT to reset the PIC, and then re-open the device.
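A rough sketch of how such a layered recovery handler might be structured (hypothetical Python — the method names and the fake link object are inventions for illustration, not the actual implementation):

```python
# Hypothetical layered recovery - illustrative only, not the real host code.
def recover(link):
    """Try each recovery layer in turn; return the name of the one that worked."""
    layers = [
        link.trim_bad_frames,   # 1: work back through the buffer, clear bad frame
        link.resync,            # 2: Reset + Ping commands to verify/re-sync comms
        link.hard_restart,      # 3: close device, wait out the WDT, re-open
    ]
    for layer in layers:
        if layer():
            return layer.__name__
    raise RuntimeError('unrecoverable')

# A fake link object to exercise the handler: each layer reports success or not
class FakeLink:
    def __init__(self, works_at):
        self.works_at = works_at
        self.tried = []
    def _try(self, name):
        self.tried.append(name)
        return name == self.works_at
    def trim_bad_frames(self): return self._try('trim_bad_frames')
    def resync(self): return self._try('resync')
    def hard_restart(self): return self._try('hard_restart')

link = FakeLink(works_at='resync')
print(recover(link))            # buffer trim fails, re-sync fixes it
```

The cheap, least disruptive layers go first; the WDT-assisted restart only happens when nothing else works.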

The thing that will need handling whenever any of these occur is what happens if the error strikes during a transmission. We then need to re-try the transmission after the recovery, and to do that a retrying wrapper will be needed around the transmission-handling parts.
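A retrying wrapper along those lines might look something like this (again a hypothetical Python sketch; the real code would wrap the Perl transmission routines, and the flaky-transmit stub here is purely for demonstration):

```python
# Hypothetical retry wrapper for transmissions - illustrative only.
def with_retry(transmit, recover, attempts=3):
    """Run transmit(); on failure run recover() and try again, up to a limit."""
    for attempt in range(attempts):
        try:
            return transmit()
        except IOError:
            if attempt == attempts - 1:
                raise               # recovery exhausted - propagate the failure
            recover()

# Stub transmission that fails twice before succeeding, to exercise the wrapper
state = {'fails': 2, 'recoveries': 0}

def flaky_transmit():
    if state['fails'] > 0:
        state['fails'] -= 1
        raise IOError('comms failure')
    return 'sent'

def fake_recover():
    state['recoveries'] += 1

print(with_retry(flaky_transmit, fake_recover))   # succeeds after two recoveries
```

The wrapper keeps the recovery logic in one place, so the transmission code itself stays unaware of how the link was repaired.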

This is something I haven't yet had time to look at, as it's going to be a fair amount of work to implement and very difficult to test given the infrequency of failures (especially now, after the mods).