Glen Pitt-Pladdy :: Blog
PICing up 433MHz Signals for OSS Home Automation - Part 7
Things are still progressing gradually, and some annoying hard-to-catch bugs are finally getting caught.

Buffer overruns

After adding a regular WDT ping to the system, comms failures were happening every few days. Their irregularity made them difficult to catch, and after adding a lot of debug code I came to the conclusion that the problem was likely buffer overruns.

What appears to be happening is that the regular received data communications are 4 bytes, but the WDT ping acknowledge is only 1 byte. When the buffer is full, new data gets thrown away, but because the buffer size is a power of 2 the 4-byte data communications fit neatly: we miss complete transmissions, so the framing remains consistent even if data is lost. Introduce an extra byte and the buffer no longer fills on a 4-byte boundary, so an overrun cuts a transmission part-way through and suddenly we end up with mismatched framing.

I was considering handshaking but that will actually be no better - there isn't anywhere to buffer significant data on the PIC, so data will still get lost, just at a different point in the chain. The only solution is then to find a way of reducing the risk of buffer overruns and recovering from them. They are rare, so the loss of functionality will not be significant. Remember this 433MHz stuff is open-loop with no anti-clash mechanism, so there are likely plenty of losses anyway. From the infrequency of these problems I already know the losses are insignificant compared to what I see from transmission clashes, so the priority is mostly recovery, but we will look at ways that we can improve things anyway.

Software Improvements

Since we currently have a 10ms timeout on reads, there should be very little chance of buffer overruns with normal transmission volumes. Since the system is not threaded, the question becomes more one of how we minimize the time between reads. In the longer term having a read thread is an option, but for now making things work better is the main priority.
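To make the framing problem from the buffer overruns discussion concrete, here is a minimal simulation sketch. It is not the PIC firmware or the host code: the 16-byte buffer size, the byte values and the `fill_buffer` helper are all assumptions for illustration. The post only says that the buffer is a power of 2, that data communications are 4 bytes and that the WDT ping acknowledge is 1 byte.

```perl
use strict;
use warnings;

# Hypothetical size: the post only says the buffer is a power of 2.
my $BUFFER_SIZE = 16;

# Store each message (an array ref of bytes) in a fixed-size buffer,
# discarding whatever does not fit. Returns the number of bytes stored
# and how many messages were cut part-way through.
sub fill_buffer {
    my (@messages) = @_;
    my @buffer;
    my $truncated = 0;
    for my $message (@messages) {
        my $length = scalar @$message;
        my $room   = $BUFFER_SIZE - scalar @buffer;
        my $stored = $length < $room ? $length : $room;
        push @buffer, @{$message}[ 0 .. $stored - 1 ];
        $truncated++ if $stored > 0 && $stored < $length;
    }
    return ( scalar @buffer, $truncated );
}

my @frame = ( 0x10, 0x20, 0x30, 0x40 );    # made-up 4-byte data communication
my @ack   = ( 0xAA );                      # made-up 1-byte WDT ping acknowledge

# Six data frames only: the overflow drops whole frames.
printf "data only: %d bytes stored, %d frames truncated\n",
    fill_buffer( ( [@frame] ) x 6 );

# One acknowledge byte early on shifts everything off the 4-byte boundary.
printf "with ack:  %d bytes stored, %d frames truncated\n",
    fill_buffer( [@frame], [@ack], ( [@frame] ) x 5 );
```

With only 4-byte frames the buffer fills exactly and nothing is cut in half. With the single extra byte one frame ends up stored as 3 bytes, which is the kind of mismatch that leaves the parser out of step.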
As with many pieces of software, the way it evolves as it gets written means that some quick fixes, or approaches to processing data that were necessary at the time, become unnecessary inefficiencies after a few cycles of evolution. This was exactly the case with the host side code. In the read loop it turned out, after removing swathes of old commented code, that I had this:

    for ( my $i = 0; $i < length ( $newdata ); ++$i ) {

That's exceedingly inefficient code, and matches a classic C interview question for programmers! Not only is it O(n²) in C, where the string length gets recomputed on every pass of the loop (Perl may do caching or have string metadata that avoids this), but it processes characters one at a time from the input buffer ($newdata), which is asking for trouble in a high level scripting language, and who knows what other overheads it may create in the Perl runtime engine. This was replaced with the much more efficient, scalable and tidier single line:

    push ( @{$self->{rxbuffer}}, unpack ( 'C1024', $newdata ) );

... and it does seem to have reduced overruns, though given how infrequent they were anyway it's not conclusive.

Hardware

I've also previously considered improving the rather crude hysteresis and filter setup used currently. This should avoid a lot of noise induced receive events that get sent to the host. Previously I adapted the existing circuit with a filter while maintaining crude hysteresis.
Now I'm introducing another resistor to filter the signal with a higher value cap so overall the impedance driving hysteresis is low relative to the input impedance of the hysteresis circuit.
This should avoid filtering the hysteresis as well, which risked susceptibility to bursts of noise on transitions and during idle times. While I had the soldering iron out and the cover off I also changed the resistor on the green (RX) LED to 1K2 from 680R to bring its brightness into line with the others... could go further even.

Initial indications are promising - the data LED is spending a lot more time off in high-interference situations, which suggests far fewer transitions are being sent back to the host. This should not only reduce the chance of buffer overruns and associated processing on the host, but also slightly improve the ...

Software Recovery

Once a failure occurs the input stream becomes unparseable and the server quits, shortly followed by the PIC being reset by its WDT, which by then isn't getting pings. What is required is a smart error handler that looks at each layer until it finds one which allows it to recover the situation. My initial list of layers of recovery to look at:
What will need handling when any of these occur is an error that happens during a transmission. We then need to retry the transmission after the recovery, and to do that a retrying wrapper will be needed for the transmission handling parts. This is something I haven't yet had the time to look at, as it's going to be a fair amount of work to implement and very difficult to test given the infrequency of failures (especially now after the mods).
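As a rough idea of the shape such a retrying wrapper could take, here is a minimal sketch. It is not code from the project: `with_recovery`, the layer structure and the two layer names are hypothetical placeholders, and the real recovery layers would have to come from the list above. Each layer is tried in turn, cheapest first, and the transmission is retried after every layer that reports success.

```perl
use strict;
use warnings;

# Run $action; if it dies, work through the recovery layers in order and
# retry the action after each layer that reports success.
# Each layer is a hash ref: { name => ..., recover => sub returning true }.
sub with_recovery {
    my ( $action, @layers ) = @_;
    my $result;
    return $result if eval { $result = $action->(); 1 };
    my $error = $@;
    for my $layer (@layers) {
        # Skip to the next (more drastic) layer if this one fails or dies.
        next unless eval { $layer->{recover}->() };
        return $result if eval { $result = $action->(); 1 };
        $error = $@;
    }
    die "unrecoverable after all layers: $error";
}

# Simulated transmission that fails twice before working.
my $failures_left = 2;
my @log;
my $send = sub {
    die "framing error\n" if $failures_left-- > 0;
    return 'sent';
};

# Placeholder layers only, named for illustration.
my @layers = (
    { name => 'flush receive buffer', recover => sub { push @log, 'flush';  1 } },
    { name => 'reopen serial port',   recover => sub { push @log, 'reopen'; 1 } },
);

my $outcome = with_recovery( $send, @layers );
print "$outcome after: @log\n";
```

Keeping the layers as plain data means new recovery steps can be added or reordered without touching the transmission code, and a simulated failing action like `$send` gives a way to exercise the wrapper without waiting days for a real failure.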
This is a bunch of random thoughts, ideas and other nonsense, and is not intended to be taken seriously. I'm experimenting and mostly have no idea what I am doing with most of this, so it should be taken with caution and at your own risk. Intrusive technologies are minimised where possible. For the purposes of reducing abuse and other risks hCaptcha is used and has its own policies linked from the widget.
Copyright Glen Pitt-Pladdy 2008-2023