By Kardi Teknomo, PhD.


Data Revival from the Statistics

In many cases, we do not want to store all the data. Since a measurement can be taken every second or even every millisecond, storing both the data and the statistics requires a huge amount of storage. Instead, we keep updating the statistics of the system. At some later time, however, we may want to recover some of the earlier data that we never stored, for example to chart it. How is it possible to revive the previous data without actually storing it? The task is not impossible if we know certain statistics, such as the time average and/or the time variance. Let us consider it case by case.

Case-1: Store only the sampling time and time average

In the previous example, we have the following results:

| Time ($t$) | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Time-Average ($\bar{x}_t$) | 4 | 5 | 7.333 | 7.75 |

Suppose we stored only the time-average data. Can we get back the real measurement data from the stored time averages alone? Yes, we can, provided we know the sequence number $t$ of each data point.

Remember that we have the recursive formula (3) to compute the time average. I write it again here for your convenience:

$$\bar{x}_t = \frac{t-1}{t}\,\bar{x}_{t-1} + \frac{1}{t}\,x_t \qquad (3)$$

Rearranging equation (3) for the measurement $x_t$, we have

$$x_t = t\,\bar{x}_t - (t-1)\,\bar{x}_{t-1} \qquad (5)$$

As before, the subscript $t$ starts at 1; therefore $\bar{x}_0$ is undefined, and we can put any number for it because it is multiplied by $t - 1 = 0$. Using equation (5), we can recover each measurement from only two consecutive time averages.

| Time ($t$) | Average ($\bar{x}_t$) | Revival Measurement Data ($x_t$) |
|---|---|---|
| 1 | 4 | $1(4) - 0 = 4$ |
| 2 | 5 | $2(5) - 1(4) = 6$ |
| 3 | 7.333 | $3(7.333) - 2(5) = 12$ |
| 4 | 7.75 | $4(7.75) - 3(7.333) = 9$ |

The revived measurements are the same as the real measurement data.
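The recovery step of Case-1 can be sketched in Python as follows. This is a minimal illustration rather than the author's own program: the function name `revive_from_averages` is hypothetical, and the sample averages are assumed to be those of the measurements 4, 6, 12, 9, which is consistent with the revived values shown in this tutorial.

```python
def revive_from_averages(averages):
    """Recover each measurement as x_t = t*avg_t - (t-1)*avg_{t-1}, t = 1..n."""
    revived = []
    previous = 0.0  # stands in for the undefined average at t = 0 (multiplied by 0)
    for t, current in enumerate(averages, start=1):
        revived.append(t * current - (t - 1) * previous)
        previous = current
    return revived


# Assumed example: running averages of the measurements 4, 6, 12, 9.
averages = [4.0, 5.0, 22.0 / 3.0, 7.75]
print([round(x, 6) for x in revive_from_averages(averages)])
```

Only the stored averages and their positions in the sequence are needed; the placeholder for the average at time 0 never affects the result.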

Case-2: Store only two consecutive time averages and time variances

To restore data from the statistics using equation (5) above, we need to know the sequence number $t$ of the data. Suppose we do not know $t$ but know only the time average and time variance of the data. Can we still revive the real measurement?

Yes, we can revive the real measurement data from two consecutive time averages and time variances using the following formula

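$$x_t = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \qquad (6)$$

where, with $\bar{x}_t$ the time average and $\sigma_t^2$ the time variance at time $t$,

$$a = \bar{x}_{t-1} - \bar{x}_t, \qquad b = \bar{x}_t^2 - \bar{x}_{t-1}^2 + \sigma_t^2 - \sigma_{t-1}^2, \qquad c = \bar{x}_t\,\sigma_{t-1}^2 - \bar{x}_{t-1}\,\sigma_t^2 - \bar{x}_{t-1}\,\bar{x}_t\,(\bar{x}_t - \bar{x}_{t-1})$$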

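A short derivation sketch may help show where a quadratic in the last measurement comes from. It is offered under stated assumptions, not as the author's proof: the time variance is taken to be the population variance (sum of squared deviations divided by $t$), the variance update in the second line is the standard recursive identity for that definition, and the symbols $\bar{x}_t$ (time average), $\sigma_t^2$ (time variance) and $x_t$ (measurement) are notation chosen here. The coefficients obtained this way reproduce the values of a, b and c listed in the table of this section. The steps require $\bar{x}_t \neq \bar{x}_{t-1}$.

```latex
\begin{align*}
t &= \frac{x_t - \bar{x}_{t-1}}{\bar{x}_t - \bar{x}_{t-1}}, \qquad
t - 1 = \frac{x_t - \bar{x}_t}{\bar{x}_t - \bar{x}_{t-1}}
  && \text{(from } x_t = t\,\bar{x}_t - (t-1)\,\bar{x}_{t-1}\text{)} \\
t\,\sigma_t^2 &= (t-1)\,\sigma_{t-1}^2 + (x_t - \bar{x}_{t-1})(x_t - \bar{x}_t)
  && \text{(assumed variance update)} \\
(x_t - \bar{x}_{t-1})\,\sigma_t^2 &= (x_t - \bar{x}_t)\,\sigma_{t-1}^2
  + (\bar{x}_t - \bar{x}_{t-1})(x_t - \bar{x}_{t-1})(x_t - \bar{x}_t)
  && \text{(eliminate } t\text{)} \\
0 &= a\,x_t^2 + b\,x_t + c, \\
a &= \bar{x}_{t-1} - \bar{x}_t, \\
b &= \bar{x}_t^2 - \bar{x}_{t-1}^2 + \sigma_t^2 - \sigma_{t-1}^2, \\
c &= \bar{x}_t\,\sigma_{t-1}^2 - \bar{x}_{t-1}\,\sigma_t^2
     - \bar{x}_{t-1}\,\bar{x}_t\,(\bar{x}_t - \bar{x}_{t-1}).
\end{align*}
```

Solving the last quadratic with the usual quadratic formula gives the two candidate values for $x_t$.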

Using the previous example, we have the statistics (time average and time variance) and we can revive the data. Because we use the quadratic formula to get the data, each step yields two possible values.

| Time ($t$) | Average ($\bar{x}_t$) | Variance ($\sigma_t^2$) | a | b | c | Square root of discriminant ($\sqrt{b^2 - 4ac}$) | Revival Measurement Data ($x_t^{+}$) | Revival Measurement Data ($x_t^{-}$) |
|---|---|---|---|---|---|---|---|---|
| 1 | 4 | 0 | - | - | - | - | - | - |
| 2 | 5 | 1 | -1 | 10 | -24 | 2 | 4 | 6 |
| 3 | 7.333 | 11.556 | -2.33 | 39.33 | -136 | 16.67 | 4.857 | 12 |
| 4 | 7.75 | 9.1875 | -0.4167 | 3.9167 | -1.5 | 3.583 | 0.4 | 9 |

Note that only the first measurement is revived using the positive sign of equation (6), while all the other revived measurements are obtained using the negative sign. This rule is true for any measurement data.
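The table rows for Case-2 can be reproduced with a short Python sketch. Several things here are assumptions rather than statements from the tutorial: the function name `quadratic_revival` is hypothetical, the variance is taken to be the population variance (dividing by the number of data), the sample statistics are those of the measurements 4, 6, 12, 9, and the coefficient expressions are one form that matches the a, b and c values in the table.

```python
import math


def quadratic_revival(prev_mean, prev_var, cur_mean, cur_var):
    """Return (a, b, c, sqrt_disc, x_plus, x_minus) for one step.

    Assumes population variances and unequal consecutive means (a != 0).
    """
    a = prev_mean - cur_mean
    b = cur_mean**2 - prev_mean**2 + cur_var - prev_var
    c = (cur_mean * prev_var - prev_mean * cur_var
         - prev_mean * cur_mean * (cur_mean - prev_mean))
    sqrt_disc = math.sqrt(max(b * b - 4 * a * c, 0.0))  # guard tiny negatives
    x_plus = (-b + sqrt_disc) / (2 * a)
    x_minus = (-b - sqrt_disc) / (2 * a)
    return a, b, c, sqrt_disc, x_plus, x_minus


# Assumed (mean, population variance) pairs after 1, 2, 3 and 4 measurements.
stats = [(4.0, 0.0), (5.0, 1.0), (22.0 / 3.0, 104.0 / 9.0), (7.75, 9.1875)]

for t in range(2, len(stats) + 1):
    (m0, v0), (m1, v1) = stats[t - 2], stats[t - 1]
    row = quadratic_revival(m0, v0, m1, v1)
    print(t, [round(value, 4) for value in row])
```

With these inputs the negative-sign root comes out as 6, 12 and 9 for times 2, 3 and 4, and the positive-sign root at time 2 is 4, the first measurement.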


Suppose you have two consecutive means and variances of a series of measurements, but you do not know how many data points you have entered, and you want to recover the last measurement value. To give you a better understanding of what I mean by Data Revival, I created an interactive program that revives the last measurement value from the two consecutive arithmetic means and variances.

Inputs: previous mean, previous variance, current mean, current variance


You can experiment with your own means and variances using the program above. For example, using the interactive program, you may be able to answer the following questions:
  • If you keep the variance at 1.0, the previous mean was 0, and the current mean is 1, what was the last data value that changed the mean?
  • Is it possible to keep the means (current and previous) constant at 1.0 while the variance changes from 1.0 (previous variance) to 2.0 (current variance)? What is the last data value that makes that change?
  • Is it possible to keep the means (current and previous) constant at 1.0 while the variance changes from 2.0 (previous variance) to 1.0 (current variance)? What is the last data value that makes that change?
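For readers without access to the interactive program, the following Python sketch offers one way to experiment with such inputs. It is not the author's program. The function name `revive_last_value` is hypothetical, the variances are assumed to be population variances (dividing by the number of data), and the handling of equal means is an added assumption: in that case the quadratic coefficient a is zero, so the sketch falls back on the fact that an unchanged mean forces the new value to equal the mean.

```python
import math


def revive_last_value(prev_mean, prev_var, cur_mean, cur_var):
    """Return (x, t): the revived last value and the implied data count.

    t is None when no count is consistent with the inputs.
    Assumes population variances (divide by t).
    """
    a = prev_mean - cur_mean
    if a == 0:
        # Unchanged mean: the new value must equal the mean, and
        # t * cur_var = (t - 1) * prev_var only works if the variance fell.
        x = cur_mean
        t = prev_var / (prev_var - cur_var) if prev_var > cur_var else None
        return x, t
    b = cur_mean**2 - prev_mean**2 + cur_var - prev_var
    c = (cur_mean * prev_var - prev_mean * cur_var
         - prev_mean * cur_mean * (cur_mean - prev_mean))
    sqrt_disc = math.sqrt(max(b * b - 4 * a * c, 0.0))
    x = (-b - sqrt_disc) / (2 * a)                 # negative-sign root
    t = (x - prev_mean) / (cur_mean - prev_mean)   # from x = t*m_t - (t-1)*m_{t-1}
    return x, t


# Example inputs taken from the assumed data 4, 6, 12 (third value revived).
print(revive_last_value(5.0, 1.0, 22.0 / 3.0, 104.0 / 9.0))
```

The implied count t is returned alongside the value because a non-integer or missing t suggests that the supplied means and variances do not correspond to any real sequence of measurements.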

I understand that some bad people may use the Data Revival formula above to manipulate their measurement data to get the statistical results they want. However, as with any other great tool, how it is used depends on the morals of the person using it.


This tutorial is copyrighted.

The preferred reference for this tutorial is

Teknomo, Kardi. (2006) Recursive Average and Variance.
http://people.revoledu.com/kardi/tutorial/RecursiveStatistic/index.html