Box plots:
Notes:
- merit_full: Full dataset, from 24/1/2018 to 24/1/2019.
- merit_2daysdrop: the first two days dropped (24-25/1/2018, id = 1, 2)
- merit: the first 26 days dropped due to outliers (24/1/2018 to 18/2/2018, id = 1-26)
- mean: mean
- sd: standard deviation
- p50: median
- p25: Q1
- p75: Q3
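The sample sizes implied by the date ranges in these notes can be cross-checked with a few lines of date arithmetic. The following is only a sketch in Python (not part of the original Stata session); the variable names are illustrative, and the only inputs are the dates quoted above, with both endpoints counted.

```python
from datetime import date

# Date range of the full dataset, taken from the notes (day/month/year).
start = date(2018, 1, 24)
end = date(2019, 1, 24)

# Both endpoints are observed days, hence the +1.
n_full = (end - start).days + 1

# Last dropped day for the 26-day truncation, as stated in the notes.
last_dropped = date(2018, 2, 18)
n_dropped = (last_dropped - start).days + 1

print(n_full)              # days in merit_full
print(n_full - 2)          # days left after dropping id 1-2
print(n_dropped)           # days dropped to obtain merit
print(n_full - n_dropped)  # days left in merit
```

This prints 366, 364, 26 and 340, which agrees with the N column of the tabstat output.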
Raw statistics:
. tabstat merit_full merit_2daysdrop merit, s(n mean sd p50 p25 p75 min max) c(s)

    variable |         N      mean        sd       p50       p25       p75       min       max
-------------+--------------------------------------------------------------------------------
  merit_full |       366  842.9071  888.2252       640       521       843       312     13018
merit_2day~p |       364  793.2005  534.7651       639       521       839       312      4493
       merit |       340  684.0324   261.525     621.5       515     773.5       312      2463
----------------------------------------------------------------------------------------------
Because of the outliers, the mean falls from 843 (full dataset) to 793 (first two days dropped), and then to 684 (first 26 days dropped).
The medians follow the same downward pattern but change far less: 640, 639, and 622 (621.5 before rounding) for the full dataset, first two days dropped, and first 26 days dropped, respectively.
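To put numbers on how differently the mean, standard deviation, and median react to dropping the early days, the relative decreases can be computed directly from the tabstat output. This is a small Python sketch rather than part of the Stata session; the helper `pct_drop` and the dictionary layout are illustrative, and the figures are copied from the table.

```python
# Values copied from the tabstat output: (mean, sd, median) per variable.
stats = {
    "merit_full": (842.9071, 888.2252, 640.0),
    "merit_2daysdrop": (793.2005, 534.7651, 639.0),
    "merit": (684.0324, 261.525, 621.5),
}


def pct_drop(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


full_mean, full_sd, full_median = stats["merit_full"]
for name in ("merit_2daysdrop", "merit"):
    mean, sd, median = stats[name]
    print(
        f"{name}: "
        f"mean down {pct_drop(full_mean, mean):.1f}%, "
        f"sd down {pct_drop(full_sd, sd):.1f}%, "
        f"median down {pct_drop(full_median, median):.1f}%"
    )
```

For the 26-day truncation, the mean falls by about 19% and the standard deviation by about 71%, while the median falls by only about 3%.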
The statistics to report are those from the truncated dataset (first 26 days dropped):
- Mean +/- sd: 684 +/- 262
- Median (interquartile range): 622 (515 - 774)
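The reported figures are the merit row of the tabstat output rounded to whole numbers. A minimal Python sketch of that rounding follows. It assumes conventional round-half-up, and the helper `round_half_up` is illustrative; the values are passed as strings so that no binary floating-point error enters the rounding.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value):
    """Round a decimal string to the nearest integer, with .5 rounded up."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# Truncated-dataset (merit) row, copied from the tabstat output.
mean, sd = "684.0324", "261.525"
p50, p25, p75 = "621.5", "515", "773.5"

mean_sd = f"{round_half_up(mean)} +/- {round_half_up(sd)}"
median_iqr = f"{round_half_up(p50)} ({round_half_up(p25)} - {round_half_up(p75)})"

print("Mean +/- sd:", mean_sd)
print("Median (interquartile range):", median_iqr)
```

This prints `Mean +/- sd: 684 +/- 262` and `Median (interquartile range): 622 (515 - 774)`.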
The median of the truncated dataset is close to the true median of one year of intraday merits.
This means that:
- 50% of observed days have intraday merits above 622, whilst the other 50% have intraday merits below 622.
- 50% of observed days have intraday merits in the range from 515 to 774 (the interquartile range).