
Topic: Statistical analysis. (Read 424 times)

legendary
Activity: 1862
Merit: 1011
Reverse engineer from time to time
July 31, 2015, 04:14:49 PM
#9
Well, the sequences of numbers I have should be random. HOWEVER, there are rules to how the numbers are generated: within a given set of 20 numbers (each between 1 and 80), no number may repeat. This by itself, I suppose, already breaks the chain of truly random.

There is another caveat: the numbers are generated in a random fashion, but by the time I get them they are sorted in ascending order. I can't change this.
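One way to work with both constraints is to generate a fair reference data set under the same rules and run every later check on both the real and the reference sets. The sketch below assumes a fair generator is equivalent to picking 20 distinct numbers from 1-80 and then sorting them; `draw_set` and `simulate` are illustrative names, not part of any existing tool.

```python
import random

def draw_set(rng, k=80, m=20):
    """One fair draw: m distinct numbers from 1..k, returned in ascending order."""
    # random.sample picks without replacement, so no number can repeat.
    return sorted(rng.sample(range(1, k + 1), m))

def simulate(n_sets, seed=0):
    """Generate n_sets reference draws from a seeded generator (reproducible)."""
    rng = random.Random(seed)
    return [draw_set(rng) for _ in range(n_sets)]

reference = simulate(1000)
```

Sorting throws away the order in which the 20 numbers came out, so nothing can be tested about the order within a set. Per-number counts and comparisons between consecutive sets are unaffected by it.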
legendary
Activity: 1240
Merit: 1001
Thank God I'm an atheist
July 31, 2015, 04:13:20 PM
#8
I don't think calculating frequencies is enough.

A truly random sequence should not have repeated patterns or other ways to predict a number given its predecessors

Search Google for Chi-squared test.
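A rough sketch of a chi-squared test on the per-number counts, using only the standard library. Two things here are assumptions rather than anything stated in the thread: the statistic is scaled by (k-1)/(k-m) to account for each set being m distinct numbers out of k (with m = 1 it reduces to the ordinary Pearson statistic), and the p-value uses the Wilson-Hilferty normal approximation instead of an exact chi-squared distribution.

```python
import math

def chi_square_uniform(counts, per_set=20):
    """Chi-squared statistic for 'all numbers are equally likely'.

    counts: observed count for each number (length k).
    per_set: how many distinct numbers each set contains (m).
    Returns (statistic, degrees_of_freedom).
    """
    k = len(counts)
    expected = sum(counts) / k
    pearson = sum((c - expected) ** 2 / expected for c in counts)
    # Assumption: each set is m distinct numbers drawn without replacement.
    # That shrinks the variance of each count, so the plain Pearson value is
    # scaled by (k-1)/(k-m) before comparing it with chi-squared(k-1).
    adjusted = pearson * (k - 1) / (k - per_set)
    return adjusted, k - 1

def approx_p_value(stat, dof):
    """Upper-tail p-value via the Wilson-Hilferty normal approximation."""
    z = ((stat / dof) ** (1 / 3) - (1 - 2 / (9 * dof))) / math.sqrt(2 / (9 * dof))
    return 0.5 * math.erfc(z / math.sqrt(2))
```

A small p-value (say below 0.01) would suggest the counts are less uniform than chance allows. This only tests frequencies; it says nothing about predicting a number from its predecessors.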
legendary
Activity: 1862
Merit: 1011
Reverse engineer from time to time
July 31, 2015, 04:05:52 PM
#7
I did the first number crunching

1 => 2688
2 => 2671
3 => 2689
4 => 2685
5 => 2724
6 => 2680
7 => 2635
8 => 2628
9 => 2706
10 => 2605
11 => 2707
12 => 2731
13 => 2665
14 => 2716
15 => 2768
16 => 2699
17 => 2682
18 => 2676
19 => 2620
20 => 2700
21 => 2648
22 => 2682
23 => 2700
24 => 2774
25 => 2635
26 => 2706
27 => 2605
28 => 2650
29 => 2633
30 => 2716
31 => 2761
32 => 2719
33 => 2729
34 => 2700
35 => 2703
36 => 2676
37 => 2620
38 => 2721
39 => 2746
40 => 2681
41 => 2636
42 => 2671
43 => 2709
44 => 2689
45 => 2577
46 => 2633
47 => 2741
48 => 2706
49 => 2666
50 => 2611
51 => 2635
52 => 2593
53 => 2668
54 => 2761
55 => 2694
56 => 2699
57 => 2702
58 => 2704
59 => 2628
60 => 2676
61 => 2636
62 => 2671
63 => 2726
64 => 2633
65 => 2611
66 => 2635
67 => 2668
68 => 2650
69 => 2731
70 => 2735
71 => 2692
72 => 2678
73 => 2753
74 => 2761
75 => 2768
76 => 2694
77 => 2657
78 => 2721
79 => 2700
80 => 2672

The counts don't look far from uniform, i.e. every number seems to appear more or less equally often. That's a bit discouraging tbh, but this is just basic counting; there are bound to be various other algorithms I can run on this.
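To put a number on "more or less equally", the counts can be turned into z-scores. This is a sketch under the assumption that the sets are independent and that, for a fair generator, each number lands in a given set with probability 20/80; the function names and the list-of-lists input format are made up for illustration.

```python
from collections import Counter
import math

def count_numbers(sets, k=80):
    """Count how often each number 1..k appears across all sets."""
    counts = Counter(n for s in sets for n in s)
    return [counts.get(i, 0) for i in range(1, k + 1)]

def z_scores(counts, n_sets, per_set=20):
    """Distance of each count from its fair expectation, in standard deviations."""
    k = len(counts)
    p = per_set / k                       # chance a number is in one set
    mean = n_sets * p                     # expected count if fair
    sd = math.sqrt(n_sets * p * (1 - p))  # binomial standard deviation
    return [(c - mean) / sd for c in counts]
```

With 80 numbers, a handful of z-scores beyond ±2 are expected from chance alone, so a single outlier is not evidence of bias on its own.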
sr. member
Activity: 392
Merit: 268
Tips welcomed: 1CF4GhXX1RhCaGzWztgE1YZZUcSpoqTbsJ
July 31, 2015, 01:28:42 PM
#6
To be honest, I don't really know of any other algorithms that could be used on the data set beyond counting. I haven't taken a statistics course at all (went straight on to calculus).
legendary
Activity: 1862
Merit: 1011
Reverse engineer from time to time
July 31, 2015, 01:23:52 PM
#5
But do you know of any algorithms that can be used on the data set? Counting all the numbers to see which show up more often is the easy one, I can definitely think of that. But what else?
sr. member
Activity: 392
Merit: 268
Tips welcomed: 1CF4GhXX1RhCaGzWztgE1YZZUcSpoqTbsJ
July 30, 2015, 05:29:20 AM
#4
I could do it for free, as I do programming for fun anyway, but it would probably not happen immediately, as I am a bit busy with work.
legendary
Activity: 1862
Merit: 1011
Reverse engineer from time to time
July 30, 2015, 05:28:16 AM
#3
Should I go out on a limb and assume that you require payment for this?
sr. member
Activity: 392
Merit: 268
Tips welcomed: 1CF4GhXX1RhCaGzWztgE1YZZUcSpoqTbsJ
July 29, 2015, 06:25:34 PM
#2
This is something I could whip up in Java or something in a couple of hours. The graphs you would probably want would be frequency of each value 1-80 over the thousand or so sets, as well as the average of each set over time. This could be done as histograms in whatever output format (including printing to standard out).
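The two outputs described above, a per-number frequency histogram printed to standard out and the average of each set over time, could look roughly like this. It is written in Python rather than Java purely for brevity, and the function names are hypothetical.

```python
def text_histogram(counts, width=50):
    """One line per number: label, bar scaled to the largest count, raw count."""
    top = max(counts) or 1  # avoid dividing by zero when every count is 0
    lines = []
    for number, c in enumerate(counts, start=1):
        bar = "#" * round(width * c / top)
        lines.append(f"{number:2d} | {bar} {c}")
    return lines

def set_averages(sets):
    """Mean of each set, in the order the sets were supplied."""
    return [sum(s) / len(s) for s in sets]

if __name__ == "__main__":
    # Tiny made-up input, only to show the output format.
    for line in text_histogram([3, 5, 4]):
        print(line)
```

For a fair generator the set averages should hover around 40.5, the mean of 1 through 80. When all counts are close together, bars scaled from zero look nearly identical, so the raw count at the end of each line is the part worth reading.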
legendary
Activity: 1862
Merit: 1011
Reverse engineer from time to time
July 29, 2015, 07:26:07 AM
#1
I am not quite sure what I need, as I've never done anything remotely connected to statistical analysis, but here I go.

I have several thousand sets of numbers. Each set has 20 numbers ranging from 1 through 80, and they cannot repeat, e.g. if a set contains the number 80, you will not encounter 80 a second time in that same set.

I want to test the distribution of these numbers (I think it's called distribution, idk): whether they are randomly generated, whether some numbers show up more often than others, and whether there are time frames in which some numbers get generated more often than others. In short, I want to test whether there is a bias towards particular numbers, or some other pattern.

Can anyone help with this? Some guidance, hints, etc.?
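For the "time frames" part of the question, one simple approach is to cut the sets into consecutive windows and count each number per window. The sketch assumes the sets are stored in the order they were generated; the window size and the helper names are arbitrary choices for illustration.

```python
from collections import Counter

def windowed_counts(sets, window, k=80):
    """Count each number 1..k inside consecutive windows of `window` sets.

    Sets are assumed to be in chronological order. A trailing partial
    window is dropped so that all windows are comparable.
    """
    result = []
    for start in range(0, len(sets) - window + 1, window):
        c = Counter(n for s in sets[start:start + window] for n in s)
        result.append([c.get(i, 0) for i in range(1, k + 1)])
    return result

def max_drift(per_window):
    """For each number, the gap between its busiest and quietest window."""
    return [max(col) - min(col) for col in zip(*per_window)]
```

Some drift between windows is expected by chance, so it helps to run the same functions on a fairly generated data set of the same size and compare the two results.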