Back in September, Zevv titled a post "Isolating semi-periodic
waveforms in a signal" in which he gave a link to
http://tinyurl.com/pmgraph (a plot of household power usage). He
was kind enough to send me a week's worth of raw data (sampled
every 2 seconds).
As I am visually oriented, the first thing I did was to plot the
data. I saw at least two distinct types of noise:
1. apparently random small fluctuations
2. large spikes associated with state changes
As these spikes were very large, independent of the size of the
following level shift, and *EXACTLY* 1 sample in duration, I
arbitrarily replaced each with the value of the following data point.
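A minimal Python sketch of that spike replacement, for anyone who
wants to try it. The detection rule (a sample that sticks out from
both neighbours in the same direction by more than a threshold) and
the names remove_single_sample_spikes and threshold are assumptions
made for illustration, not the exact procedure used on Zevv's data.

```python
def remove_single_sample_spikes(samples, threshold):
    """Replace isolated one-sample spikes with the following sample's value.

    `threshold` is an assumed tuning value: how far a sample must stand
    out from BOTH neighbours before it is treated as a spike.
    """
    cleaned = list(samples)
    for i in range(1, len(samples) - 1):
        prev_diff = samples[i] - samples[i - 1]
        next_diff = samples[i] - samples[i + 1]
        # A spike deviates from both neighbours in the same direction;
        # a plain level shift deviates from only one of them.
        if (abs(prev_diff) > threshold and abs(next_diff) > threshold
                and prev_diff * next_diff > 0):
            cleaned[i] = samples[i + 1]
    return cleaned


raw = [100, 101, 99, 900, 300, 301, 299]  # made-up readings
print(remove_single_sample_spikes(raw, 150))
```

Comparisons are made against the original list rather than the
cleaned one, so a replaced spike cannot affect how its neighbours
are classified.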
My next iteration was to replace all samples between a pair of
state changes with the average during that period. That was
useful to point out what would have to be taken into account for
a better approximation.
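That per-state averaging could look roughly like the sketch below.
It assumes a state change is any jump between consecutive samples
larger than step_threshold; that rule and the function name
flatten_states are illustrative assumptions.

```python
def flatten_states(samples, step_threshold):
    """Replace each run between state changes with the mean of that run."""
    if not samples:
        return []
    # A new state starts wherever the jump from the previous sample
    # exceeds the (assumed) threshold.
    starts = [0] + [i for i in range(1, len(samples))
                    if abs(samples[i] - samples[i - 1]) > step_threshold]
    ends = starts[1:] + [len(samples)]
    out = []
    for start, end in zip(starts, ends):
        run = samples[start:end]
        mean = sum(run) / len(run)
        out.extend([mean] * len(run))
    return out


print(flatten_states([100, 102, 98, 300, 304, 296], 50))
```

Anything that drifts or decays within a state is flattened away by
this, which is exactly the information a better approximation has
to keep.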
The next thing was to consider doing a running average over a set
of n samples between state changes. Two problems:
1. how to choose n
2. what to do within n samples of start/end of current state
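One possible answer to problem 2, sketched in Python: use a centred
window of about n samples and simply clip it at the boundaries of
the current state, so it never averages across a state change. The
clipping rule, the jump-based state detection and all names here
are assumptions; other edge treatments are equally defensible.

```python
def running_average_within_states(samples, step_threshold, n):
    """Centred running average that never crosses a state change."""
    if not samples:
        return []
    half = n // 2
    # Same assumed state detection as before: a large jump starts a new state.
    starts = [0] + [i for i in range(1, len(samples))
                    if abs(samples[i] - samples[i - 1]) > step_threshold]
    ends = starts[1:] + [len(samples)]
    out = []
    for start, end in zip(starts, ends):
        for i in range(start, end):
            # Clip the window to the current state; near the edges it
            # becomes shorter and one-sided.
            lo = max(start, i - half)
            hi = min(end, i + half + 1)
            window = samples[lo:hi]
            out.append(sum(window) / len(window))
    return out


print(running_average_within_states([1, 2, 3, 4, 100, 102, 104], 50, 3))
```

The clipped windows are lopsided, so the first and last few points
of each state are smoothed less and pulled toward the interior.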
If this were "the good old days", I would grab graph paper,
french curve and a straight edge to do some calibrated eyeball
curve fitting.
A later post to another group showed up, but this one did not,
so "if at first you don't succeed ..."
But it's not "the good old days" and I want a less tedious and
more reproducible method. I looked at tools available in Scilab
and came across "lsq_splin" which, given m data points and n
breakpoints (m > n, with m >> n implied), generates *a* curve of
m points which is a least squares fit.
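To make the idea of a least squares fit over breakpoints concrete,
here is a small pure-Python sketch. It is NOT lsq_splin (which, as
far as I know, fits cubic splines); it is a simpler degree-1
analogue that fits a continuous piecewise-linear curve whose corners
sit at the breakpoints. The function names are made up, and it
assumes x is sorted, lies within the breakpoint range, and that
every interval between breakpoints contains data.

```python
from bisect import bisect_right


def _locate(knots, x):
    """Index of the knot interval containing x, and the position w in [0, 1]."""
    i = min(max(bisect_right(knots, x) - 1, 0), len(knots) - 2)
    w = (x - knots[i]) / (knots[i + 1] - knots[i])
    return i, w


def lsq_piecewise_linear(x, y, knots):
    """Least squares value at each knot of a continuous piecewise-linear fit."""
    k = len(knots)
    diag = [0.0] * k        # diagonal of the normal equations
    off = [0.0] * (k - 1)   # off[i] couples knot i and knot i + 1
    rhs = [0.0] * k
    for xj, yj in zip(x, y):
        i, w = _locate(knots, xj)
        # Each sample only touches the two knots that bracket it.
        diag[i] += (1 - w) ** 2
        diag[i + 1] += w ** 2
        off[i] += w * (1 - w)
        rhs[i] += (1 - w) * yj
        rhs[i + 1] += w * yj
    # Solve the symmetric tridiagonal system (Thomas algorithm).
    for i in range(1, k):
        m = off[i - 1] / diag[i - 1]
        diag[i] -= m * off[i - 1]
        rhs[i] -= m * rhs[i - 1]
    coef = [0.0] * k
    coef[-1] = rhs[-1] / diag[-1]
    for i in range(k - 2, -1, -1):
        coef[i] = (rhs[i] - off[i] * coef[i + 1]) / diag[i]
    return coef


def evaluate(knots, coef, xq):
    """Value of the fitted curve at xq."""
    i, w = _locate(knots, xq)
    return (1 - w) * coef[i] + w * coef[i + 1]


xs = list(range(11))
ys = [abs(v - 5) for v in xs]   # made-up data with a corner at x = 5
knots = [0, 5, 10]
coef = lsq_piecewise_linear(xs, ys, knots)
print(coef, evaluate(knots, coef, 2.5))
```

With a breakpoint at the true corner the fit is essentially exact;
move that breakpoint or add extra ones and the fit changes, which is
a cheap way to experiment with breakpoint placement.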
I've done some playing/experimenting and demonstrated that too many
breakpoints is as poor a solution as too few (surprise surprise).
What guidelines are there for choosing number and location of
breakpoints?
What are good search terms to use so that Google would show
informative pages?
I'm explicitly looking to do _piecewise_ approximations as
creating an analytical function to represent discontinuous data
is a fool's errand. I tilt at windmills enough already ;/
[P.S. This student hasn't been in math class for ~50 yrs]
TIA