The Cumulative Effect of Adding Many Random Numbers

The question must have crossed many people’s minds, of what the cumulative effect is if they take the same calculated risk many times, i.e., if they add a series of numbers, each of which is random and, for the sake of argument, each of which has the same standard deviation.

The formal answer to that question is explained in this Wikipedia article. What the article states is that ‘If two independently-random numbers are added, their expected values are added, as well as their variance, to give the expected value and the variance of the sum.’

But what I already know is that standard deviation is the square root of variance and, conversely, that variance is standard deviation squared. Therefore, the problem could be such that the standard deviation of the individual numbers is known in advance, but that (n) independent random numbers are to be added. The variance of the sum is then (n) times the variance of any one number, and because standard deviation is the square root of variance, the standard deviation of the sum will be the square root of (n), times whatever the standard deviation of any one number in the series was.
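The square-root rule above can be checked numerically. What follows is only a minimal sketch, assuming Gaussian random numbers with an expected value of zero; the function name `std_of_sum`, the number of trials and the seed are choices made for this illustration and do not come from the original posting.

```python
import math
import random
import statistics

def std_of_sum(n, sigma=1.0, trials=5000, seed=0):
    """Estimate the standard deviation of a sum of n independent
    Gaussian numbers, each with mean 0 and standard deviation sigma."""
    rng = random.Random(seed)
    sums = [sum(rng.gauss(0.0, sigma) for _ in range(n)) for _ in range(trials)]
    return statistics.pstdev(sums)

# Compare the simulated spread of the sum with sqrt(n) * sigma.
for n in (1, 4, 16, 64):
    print(n, round(std_of_sum(n), 3), "predicted:", math.sqrt(n))
```

The simulated figures should land close to 1, 2, 4 and 8, but because they are estimates from a finite number of trials, they will not match exactly.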

This realization should be important to any people who have a gambling problem, because people may have a tendency to think that if they had ‘bad luck’ at a gambling table, ‘future good luck’ will come, to cancel out the bad luck they’ve already experienced. This is generally untrue, because as (n) increases, the typical distance between the sum – of individual bets if the reader wishes – and its expected value keeps growing, in proportion to the square root of (n). On average!

But if we are to consider the case of gambling, then we must also take into account the expected value, which is just the average return of one bet. In the real-world case of gambling, this value is biased against the player, and earns the gambling establishment its profit. Well, according to what I wrote above, expected values simply add, so that the expected loss after (n) bets is (n) times the expected loss of one bet. This will continue to increase linearly.
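To put rough numbers on the linear drift versus the square-root spread, here is a small sketch. It assumes an even-money bet of one unit that wins with probability 18/37, as on a single-zero roulette wheel; that game, the function name `drift_and_spread` and its parameters are assumptions made for this illustration, not something the posting specifies.

```python
import math

def drift_and_spread(n, p_win=18/37, stake=1.0):
    """Mean and standard deviation of the net result after n even-money bets."""
    mean_one = stake * (2 * p_win - 1)    # expected value of one bet (negative)
    var_one = stake ** 2 - mean_one ** 2  # E[X^2] - E[X]^2 for X = +/- stake
    return n * mean_one, math.sqrt(n * var_one)

# The expected loss grows with n, the spread only with sqrt(n).
for n in (10, 100, 1000, 10000):
    mean, spread = drift_and_spread(n)
    print(f"n={n:6d}  expected={mean:9.2f}  std={spread:8.2f}")
```

Under these assumptions, the expected loss equals one standard deviation of the result after 1368 bets, and it exceeds the standard deviation for every larger (n).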

Now, the question which may come to mind next would be, what effect such a summation of data has on averages. And the answer lies in the fact that the square root of (n) is a half-power of (n), meaning (n) raised to the power of one-half. A full power of (n) would grow linearly with (n), while the zero-power of (n) would just stay constant.

And so the effect of summing many random numbers will first of all be, that the maximum and the minimum result theoretically possible, will be (n) times as far apart as they were for any one random number. This reflects the possibility, that ‘if (n) dice were rolled’, they could theoretically all come up as the maximum value possible, or all come up as the minimum value possible. And what this does to the graph of the distribution, is it initially makes the domain of the distribution curve linearly wider, along the x-axis, as a function of (n) – as the first power of (n).
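The dice example can be worked out exactly, without any simulation. The sketch below builds the distribution of the total of (n) fair six-sided dice by repeated convolution, then reports how far apart the extreme totals are, next to the standard deviation; the function names are made up for this illustration.

```python
import math

def dice_sum_distribution(n, sides=6):
    """Exact probabilities of each total when n fair dice are rolled.
    Returns a dict mapping total -> probability, built by repeated convolution."""
    dist = {0: 1.0}
    for _ in range(n):
        new = {}
        for total, p in dist.items():
            for face in range(1, sides + 1):
                new[total + face] = new.get(total + face, 0.0) + p / sides
        dist = new
    return dist

def width_and_std(n, sides=6):
    """Distance between the largest and smallest possible total, and the
    standard deviation of the total."""
    dist = dice_sum_distribution(n, sides)
    mean = sum(t * p for t, p in dist.items())
    var = sum((t - mean) ** 2 * p for t, p in dist.items())
    return max(dist) - min(dist), math.sqrt(var)

for n in (1, 4, 16):
    width, std = width_and_std(n)
    print(n, width, round(std, 3))
```

The width comes out as 5 times (n), while the standard deviation is the square root of (n), times the square root of 35/12, which is the standard deviation of one die.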

(Updated 05/16/2018 … )

(As of 05/14/2018 : )

But if we were next to shrink that graph horizontally by a factor of (n), to make it appear as wide as the graph for a single random number, we’d also be shrinking the distribution curve horizontally. And the net effect is that the distribution curve would become narrower. This rescaled curve also represents what would happen to the average, since the average is just the sum divided by (n), and so the standard deviation of the average will become the reciprocal of the square root of (n), times the standard deviation of any one random number.

So the importance of this half-power of (n) needs to be emphasized. If we divide it by (n), we get the negative half-power of (n)! This is the reciprocal of the square root of (n)! And, in Calculus, if we take the ratio between two power-functions of the same parameter, then as that parameter becomes large, the greater power will dominate that ratio.
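The negative half-power can also be seen in a quick experiment on averages. This is a sketch under assumptions of its own: the random numbers are taken to be uniform on [-1, 1], which gives each of them a standard deviation of 1 divided by the square root of 3, and the name `std_of_mean`, the trial count and the seed are arbitrary.

```python
import math
import random
import statistics

def std_of_mean(n, trials=5000, seed=1):
    """Estimate the standard deviation of the average of n independent
    uniform numbers on [-1, 1]."""
    rng = random.Random(seed)
    means = [sum(rng.uniform(-1.0, 1.0) for _ in range(n)) / n
             for _ in range(trials)]
    return statistics.pstdev(means)

sigma = 1.0 / math.sqrt(3.0)  # standard deviation of one uniform number on [-1, 1]
for n in (1, 4, 16, 64):
    print(n, round(std_of_mean(n), 4), "predicted:", round(sigma / math.sqrt(n), 4))
```

Each fourfold increase in (n) should roughly halve the spread of the average, which is what the reciprocal of the square root of (n) predicts.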

This observation is also in line with what I wrote in this earlier posting, which was, ‘If we have standard deviation, which is homologous to voltage, hand-in-hand with variance, which is homologous to signal-energy, it’s the signal-energy, and the variance, between two inputs, which will remain accurate when simply added.’


As an aside, I may comment that I have been running Genetic Algorithms Evolution on one of my computers, using a paid-for Windows-based application from the past, which was named ‘Discipulus’. One of the problems it was given, as a machine-learning exercise, was to take 24 random input-variables, and to compute whether their sum is positive or negative.

Each line in the data-set has a different amplitude for its input-variables, which also means that their standard deviation changes from one line to the next. But to make the example solvable, their expected value is zero, meaning that they are as likely to be positive as they are to be negative. And the last number in each line of my data-set seems to be either a one or a zero, to indicate that the sum of the variables either did or did not exceed zero. Thus, 50% of the expected output-values should be 0, and 50% of them should be 1.
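A data-set of the kind just described could be generated along the following lines. This is a hypothetical generator, not the one which produced the supplied data: the uniform distribution, the range of amplitudes, the row counts and the names `make_row` and `make_dataset` are all assumptions made for the sketch.

```python
import random

def make_row(rng, n_inputs=24, max_amplitude=10.0):
    """One line of a data-set: n_inputs zero-mean random inputs, followed by
    a label that is 1 if their sum exceeds zero and 0 otherwise."""
    amplitude = rng.uniform(0.1, max_amplitude)  # assumed: differs from line to line
    inputs = [rng.uniform(-amplitude, amplitude) for _ in range(n_inputs)]
    label = 1 if sum(inputs) > 0 else 0
    return inputs + [label]

def make_dataset(n_rows, seed):
    """Build n_rows lines from one seeded random generator."""
    rng = random.Random(seed)
    return [make_row(rng) for _ in range(n_rows)]

training = make_dataset(1000, seed=1)
validation = make_dataset(1000, seed=2)

# Fraction of lines labelled 1; this should be near one half.
print(sum(row[-1] for row in training) / len(training))
```

Using two different seeds gives two data-sets which come from the same algorithm but share no lines, which is the property a validation data-set needs.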

The Genetic Algorithms I’ve been playing with, are supposed to predict whether a 0 or a 1 should result from these completely-random input values, based on the learning data-set. But one handicap which my GAs have, is the fact that their random evolution, only causes them to utilize a small subset of the 24 input-values each time. The GAs evolve to predict either a 0 or a 1, based on input-values which they’ve evolved to make use of in randomly-created, but then selected and mutated computations.

The best GA which I’ve been able to grow predicts this outcome with 87.6% accuracy. But this is achieved even though the GA was only utilizing maybe 1/4 of the input-variables.

In short, those GAs are able to look at just a subset of ‘their gambling experience’, and decide ‘whether to walk away from the table’, or ‘whether it’s worth continuing to make bets’.


 

According to my own understanding, given only 1/4 of the input-variables, it should only be possible to predict the outcome with 75% accuracy. But the fact that one of my GAs seems to exceed that goal, may be attributable to a flaw in how Genetic Algorithms are trained.

When trained to ‘learn’ one data-set, GAs will sometimes be better at predicting its outcome, than the outcome of equivalent data-sets. In effect, the GA may turn into more of an exercise in data-compression, than a proof that it recognized the concept which this data-set was meant to sample.

This is already a reason why in-vitro AI learning is split between two data-sets: a learning data-set, and a validation data-set, the latter of which the AI in question is never given a chance to learn from. The validation data-set is supposed to determine to what extent a given AI has truly mastered ‘the principle in the data-set’, and not just ‘one data-set’.
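An extreme case shows why the validation data-set matters. The sketch below is not a Genetic Algorithm at all; it is a deliberately naive ‘learner’ which only memorizes its training lines, used here as a stand-in for pure data-compression. The data generator, the row counts and all the names are assumptions made for the illustration.

```python
import random

def make_dataset(n_rows, seed, n_inputs=24):
    """Lines of (inputs, label), where label is 1 if the inputs sum above zero."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        inputs = tuple(rng.uniform(-1.0, 1.0) for _ in range(n_inputs))
        rows.append((inputs, 1 if sum(inputs) > 0 else 0))
    return rows

def train_memorizer(rows):
    """'Learn' by storing every training line verbatim in a lookup table."""
    return dict(rows)

def accuracy(model, rows, default=0):
    """Fraction of lines answered correctly; unseen lines get the default guess."""
    hits = sum(1 for inputs, label in rows if model.get(inputs, default) == label)
    return hits / len(rows)

training = make_dataset(2000, seed=1)
validation = make_dataset(2000, seed=2)
model = train_memorizer(training)
print("training accuracy:  ", accuracy(model, training))
print("validation accuracy:", accuracy(model, validation))
```

The memorizer scores perfectly on the lines it has stored, and no better than a coin toss on lines it has never seen, even though both data-sets come from the same generator.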

Well just as there can be discrepancies between how fit a GA is, according to the learning and according to the validation data-sets, there can actually be discrepancies between how well the same GA scores, given two validation data-sets…

(Update 05/15/2018 : )

Also, in my experiments with Genetic Algorithms, I tend to use data-sets which were supplied to me, as opposed to data-sets which I could have created myself – so far. Well, the supplied data-set I used for the example above might not have contained truly random numbers, only numbers which ‘look random’ to my casual inspection.

In that case, when ‘learning the data-set’, the GAs may have been able to exploit any lack of randomness, which by myself, I cannot even see.

(Update 05/16/2018 : )

I have now created a new project, for the Genetic Algorithms software ‘Discipulus’, for which I created my own two Data-Sets, according to the description I gave above.

The result was, that the most evolved GA was only able to guess the outcome of the Validation Data-Set, with 66% accuracy, based on using 1/4 of the 24 Input-variables, from the Training Data-Set. The following URL provides the text-files, which constituted this project, except for the actual software:

http://dirkmittler.homeip.net/Random24-Class/

At the same time, the best result which a GA was able to achieve, based entirely on the Learning / Training Data-Set, was 97%. But that GA, only achieved 55% on the Validation Data-Set, even though both Data-Sets were generated by the same algorithm!
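The 66% validation figure can be compared against a simple model. The sketch below assumes that the inputs are independent Gaussian numbers with an expected value of zero, and that the predictor just takes the sign of the sum of 6 of the 24 inputs; neither assumption is known to describe the evolved GA or the generated Data-Sets. For two jointly normal sums with correlation rho, the probability that they share a sign is 1/2 + arcsin(rho) / pi, and here rho is the square root of 6/24, which is 1/2.

```python
import math
import random

def partial_sum_accuracy(n_used=6, n_total=24, trials=20000, seed=0):
    """Fraction of lines on which the sign of the first n_used inputs' sum
    matches the sign of the sum of all n_total inputs."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        inputs = [rng.gauss(0.0, 1.0) for _ in range(n_total)]
        if (sum(inputs[:n_used]) > 0) == (sum(inputs) > 0):
            hits += 1
    return hits / trials

def arcsine_prediction(n_used=6, n_total=24):
    """Closed form for jointly normal sums: 1/2 + arcsin(rho) / pi, where
    rho = sqrt(n_used / n_total) is the correlation between the two sums."""
    rho = math.sqrt(n_used / n_total)
    return 0.5 + math.asin(rho) / math.pi

print(round(partial_sum_accuracy(), 3), round(arcsine_prediction(), 3))
```

Under these assumptions, the closed form gives 2/3, or about 66.7%, which sits close to the 66% reported above for the Validation Data-Set, and below the 75% estimated earlier.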

Dirk

 
