Kardi Teknomo
Data Revival from the Statistics

By Kardi Teknomo, PhD.


In many cases, we do not want to store all the data. Since measurements can be taken every second or millisecond, storing both the data and the statistics requires a huge amount of storage. Thus, we only keep updating the statistics of the system. At some later time, however, we may want to recover some of the earlier data that we never stored, for example to make a chart of it. How is it possible to revive the previous data without actually storing it? The task is not impossible if we know certain statistics, such as the time average and/or the time variance. Let us consider the cases one by one.

Case-1: Store only the sampling time and time average

In the previous example, we have the following results:

| Time ($t$) | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Time-average ($\bar{x}_t$) | 4 | 5 | 7.333 | 7.75 |

Suppose we stored only the time-average data. Can we get back the real measurement data based only on the stored time-averages? Yes, we can revive the measurement data from the time-averages alone, provided we know the sequence number $t$ of each data point.

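Remember that we have the recursive formula (3) to compute the time-average. I write it again here for your convenience:

$$\bar{x}_t = \frac{t-1}{t}\,\bar{x}_{t-1} + \frac{1}{t}\,x_t \qquad (3)$$

where $x_t$ is the measurement at time $t$ and $\bar{x}_t$ is the time-average of the first $t$ measurements. Rearranging equation (3) for $x_t$, we have

$$x_t = t\,\bar{x}_t - (t-1)\,\bar{x}_{t-1} \qquad (5)$$

As before, the subscript $t$ starts at 1. Therefore $\bar{x}_0$ is undefined, and we can put any number for it because it is multiplied by $t - 1 = 0$. Using equation (5), we can compute back the measurement data based only on two consecutive time-averages.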

| Time ($t$) | Average ($\bar{x}_t$) | Revival measurement data ($x_t$) |
|---|---|---|
| 1 | 4 | 4 |
| 2 | 5 | 6 |
| 3 | 7.333 | 12 |
| 4 | 7.75 | 9 |

The results of the revival measurement are the same as the real measurement data.
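Case-1 can be sketched in a few lines of Python. This is only an illustration, not the author's program: the function names `time_averages` and `revive_from_averages` are made up for this example, and the series 4, 6, 12, 9 is assumed to be the measurement data of the previous example because it is the series implied by the revival results in this section.

```python
def time_averages(data):
    """Recursive time-average: avg_t = ((t - 1) * avg_{t-1} + x_t) / t."""
    averages = []
    avg = 0.0  # the value at t = 0 is arbitrary; it is multiplied by zero
    for t, x in enumerate(data, start=1):
        avg = ((t - 1) * avg + x) / t
        averages.append(avg)
    return averages


def revive_from_averages(averages):
    """Revive the data: x_t = t * avg_t - (t - 1) * avg_{t-1}."""
    revived = []
    prev_avg = 0.0  # again arbitrary for t = 1
    for t, avg in enumerate(averages, start=1):
        revived.append(t * avg - (t - 1) * prev_avg)
        prev_avg = avg
    return revived


data = [4, 6, 12, 9]                # assumed sample measurements
averages = time_averages(data)      # what we would actually store
revived = revive_from_averages(averages)
print([round(a, 3) for a in averages])
print([round(x, 6) for x in revived])
```

Only the list of averages and the position of each average in that list are needed; the original data never has to be kept.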

Case-2: Store only two consecutive time averages and time variances

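To restore data from the statistics using equation (5) above, we need to know the sequence number $t$ of the data. Suppose we do not know the sequence number $t$ and know only the time average and time variance of the data. Can we still revive the real measurement?

Yes, we can revive the real measurement data from two consecutive time averages $\bar{x}_{t-1}, \bar{x}_t$ and two consecutive time variances $s_{t-1}^2, s_t^2$ using the following formula

$$x_t = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \qquad (6)$$

where,

$$a = \bar{x}_{t-1} - \bar{x}_t, \qquad b = \bar{x}_t^2 - \bar{x}_{t-1}^2 + s_t^2 - s_{t-1}^2, \qquad c = \bar{x}_t\,\bar{x}_{t-1}\,(\bar{x}_{t-1} - \bar{x}_t) + \bar{x}_t\,s_{t-1}^2 - \bar{x}_{t-1}\,s_t^2$$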

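A short sketch of where the quadratic comes from may help. It assumes that the time variance is the population variance (sum of squared deviations divided by $t$), which is the definition that reproduces the numbers in the example of this section. The symbols $\bar{x}_t$ (time average) and $s_t^2$ (time variance) are notation chosen for this sketch.

```latex
% Recursive update of the population variance:
t\,s_t^2 = (t-1)\,s_{t-1}^2 + (x_t - \bar{x}_{t-1})(x_t - \bar{x}_t)

% From the time-average recursion, the unknown sequence number is
t = \frac{x_t - \bar{x}_{t-1}}{\bar{x}_t - \bar{x}_{t-1}}

% Substitute t and multiply by (\bar{x}_t - \bar{x}_{t-1}):
(x_t - \bar{x}_{t-1})\,s_t^2
  = (x_t - \bar{x}_t)\,s_{t-1}^2
  + (\bar{x}_t - \bar{x}_{t-1})(x_t - \bar{x}_{t-1})(x_t - \bar{x}_t)

% Collect powers of x_t to obtain a x_t^2 + b x_t + c = 0 with
a = \bar{x}_{t-1} - \bar{x}_t
b = \bar{x}_t^2 - \bar{x}_{t-1}^2 + s_t^2 - s_{t-1}^2
c = \bar{x}_t\,\bar{x}_{t-1}(\bar{x}_{t-1} - \bar{x}_t)
    + \bar{x}_t\,s_{t-1}^2 - \bar{x}_{t-1}\,s_t^2
```

Of the two roots, only one gives a positive sequence number $t$, so only that root can be a real measurement.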

Using the previous example, we have the statistics (time average and time variance) and can revive the data. Because equation (6) is the quadratic formula, it yields two candidate values for each time step.

| Time ($t$) | Average ($\bar{x}_t$) | Variance ($s_t^2$) | a | b | c | Square root of discriminant ($\sqrt{b^2 - 4ac}$) | Revival measurement data ($x_t$, + sign) | Revival measurement data ($x_t$, - sign) |
|---|---|---|---|---|---|---|---|---|
| 1 | 4 | 0 | - | - | - | - | - | - |
| 2 | 5 | 1 | -1 | 10 | -24 | 2 | 4 | 6 |
| 3 | 7.333 | 11.556 | -2.33 | 39.33 | -136 | 16.67 | 4.857 | 12 |
| 4 | 7.75 | 9.1875 | -0.4167 | 3.9167 | -1.5 | 3.583 | 0.4 | 9 |

Note that only the first measurement is revived using the positive sign of equation (6): at time 2 the positive sign returns 4, the first data value. All other revived measurements are obtained using the negative sign. This rule is true for any measurement data.
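The Case-2 calculation can be checked with a minimal Python sketch. The function name `revival_candidates` and the series 4, 6, 12, 9 are assumptions made for this illustration, and the coefficients are written for the population variance (divide by the number of data points), which is the definition consistent with the values of a, b and c listed above. The sketch also assumes that the two means differ, since otherwise a = 0.

```python
import math
import statistics


def revival_candidates(prev_mean, prev_var, cur_mean, cur_var):
    """Return (a, b, c, sqrt_disc, x_plus, x_minus) of the revival quadratic.

    Assumes population variances and cur_mean != prev_mean (so a != 0).
    """
    a = prev_mean - cur_mean
    b = cur_mean**2 - prev_mean**2 + cur_var - prev_var
    c = (cur_mean * prev_mean * (prev_mean - cur_mean)
         + cur_mean * prev_var - prev_mean * cur_var)
    sqrt_disc = math.sqrt(b * b - 4 * a * c)
    x_plus = (-b + sqrt_disc) / (2 * a)   # positive sign
    x_minus = (-b - sqrt_disc) / (2 * a)  # negative sign
    return a, b, c, sqrt_disc, x_plus, x_minus


data = [4.0, 6.0, 12.0, 9.0]  # assumed sample measurements
n = len(data)
means = [statistics.fmean(data[:t]) for t in range(1, n + 1)]
variances = [statistics.pvariance(data[:t]) for t in range(1, n + 1)]

# One row per time step t = 2, 3, 4, built from consecutive statistics only.
rows = [
    revival_candidates(means[t - 1], variances[t - 1], means[t], variances[t])
    for t in range(1, n)
]
for t, row in enumerate(rows, start=2):
    print(t, [round(v, 4) for v in row])
```

The last entry of each printed row (negative sign) is the measurement at that time step, while the positive-sign entry is the root that does not correspond to a positive sequence number.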


Suppose you have two consecutive means and variances of a series of measurements, but you do not know how many data points have been entered, and you want to recover the last measurement value. To give you a better understanding of what I mean by Data Revival, I created an interactive program that revives the last measurement value from the two consecutive arithmetic means and variances.

Inputs: previous mean, previous variance, current mean, current variance.


You can experiment with your own means and variances using the program above. For example, using the interactive program, you may be able to answer the following questions:
  • If you keep the variance at 1.0, the previous mean was 0 and the current mean is 1, what was the last data value that changed the mean?
  • Is it possible to keep the means (current and previous) constant at 1.0 while the variance changes from a previous variance of 1.0 to a current variance of 2.0? What is the last data value that makes that change?
  • Is it possible to keep the means (current and previous) constant at 1.0 while the variance changes from a previous variance of 2.0 to a current variance of 1.0? What is the last data value that makes that change?
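For readers without access to the interactive program, the questions above can be explored with the following Python sketch. It is not the original program: the function name `revive_last_value` is hypothetical, the variance is assumed to be the population variance, and the handling of the case where the mean does not change is this sketch's own choice. Besides the last value, the function returns the sequence number implied by the statistics, which need not be a whole number when the inputs do not come from a real series.

```python
import math


def revive_last_value(prev_mean, prev_var, cur_mean, cur_var):
    """Return (last_value, implied_t), or None if no positive t exists.

    Assumes population variances (divide by the number of data points).
    """
    d = cur_mean - prev_mean
    if d == 0:
        # Mean unchanged: the last value equals the mean, which can only
        # shrink the variance: t * cur_var = (t - 1) * prev_var.
        if not (0 <= cur_var < prev_var):
            return None
        return prev_mean, prev_var / (prev_var - cur_var)
    k = d * d + cur_var - prev_var
    # u = last_value - prev_mean solves d*u^2 - k*u - d*prev_var = 0.
    # This root has the same sign as d, so implied_t = u / d is not negative.
    u = (k + math.sqrt(k * k + 4 * d * d * prev_var)) / (2 * d)
    t = u / d
    if t <= 0:
        return None
    return prev_mean + u, t


q1 = revive_last_value(0.0, 1.0, 1.0, 1.0)  # variance kept at 1, mean 0 -> 1
q2 = revive_last_value(1.0, 1.0, 1.0, 2.0)  # mean kept at 1, variance 1 -> 2
q3 = revive_last_value(1.0, 2.0, 1.0, 1.0)  # mean kept at 1, variance 2 -> 1
print(q1)  # about (1.618, 1.618)
print(q2)  # None: adding a value equal to the mean cannot raise the variance
print(q3)  # (1.0, 2.0)
```

Under these assumptions, the second scenario has no solution, while the third is satisfied by a last value equal to the mean, entered as the second data point.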

I recognize that some people may misuse the Data Revival formula above to fabricate measurement data that produce the statistical results they want. As with any other powerful tool, how it is used depends on the morals of the person using it.


This tutorial is copyrighted.

The preferred reference for this tutorial is

Teknomo, Kardi. (2006) Recursive Average and Variance.
http://people.revoledu.com/kardi/tutorial/RecursiveStatistic/index.html

   
 
© 2007 Kardi Teknomo. All Rights Reserved.