October 11, 2020

3 types of quartile positioning

REMARKS:
  • Assume a data set of 10 unique points, ranging from 0 to 9.
  • To view the data-point visualizations correctly, use a monospace font (unless something is wrong, this page should already be displayed that way).
  • L and H are virtual data points, one position before the first data point and one position after the last data point, respectively.
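  • The 1st, 2nd, and 3rd "," between two data points represent the fractional data-ordering positions 0.25, 0.5, and 0.75 between them, respectively.
  • The 1st, 2nd, 3rd, 4th, and 5th "|" represent the quartile ranks 0, 1, 2, 3, and 4 (Q0 to Q4), respectively.
  • In the formulas, N is the number of data points and q is the quartile rank as a fraction from 0 to 1 (the quartile rank divided by 4). Data-ordering positions are 1-based: the first data point is at position 1, the last at position N, L at position 0, and H at position N+1.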
  • This explanation comes only from my experience, so it may be inaccurate.

INCLUSIVE QUARTILE

L,,,0,,,1,,,2,,,3,,,4,,,5,,,6,,,7,,,8,,,9,,,H

    |12345678|12345678|12345678|12345678|    

Data-ordering Position Formula: q(N-1)+1

Q0 Data-ordering Position: 1 (the minimum data point, 0)

Q4 Data-ordering Position: N (the maximum data point, 9)

Minimum Determinable Quartile Rank Formula: 0

Maximum Determinable Quartile Rank Formula: 1

Description: The minimum and maximum quartiles (Q0 and Q4) are determinable: they are the minimum and maximum of the data, respectively. This method is easy for humans to understand.

When To Use This: Use this method for any data that should not have a limit-range (a limit-range means that each data point has a lower and an upper limit), or to compare ranks (such as competitive ranks) in a way that is easy for humans to understand. In data-science Python code, 'numpy.quantile' uses this method by default.
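The inclusive formula can be checked with a short sketch. The helper names (inclusive_position, value_at) are illustrative and not part of any library, and linear interpolation between neighboring data points is an assumption that matches the "," marks in the diagram. The result is compared against Python's standard-library statistics.quantiles with method="inclusive".

```python
import math
import statistics


def inclusive_position(q, n):
    """1-based data-ordering position for quantile fraction q (0..1)."""
    return q * (n - 1) + 1


def value_at(sorted_data, position):
    """Linearly interpolate the value at a 1-based, possibly fractional position.

    Only positions from 1 to len(sorted_data) are determinable.
    """
    n = len(sorted_data)
    if not 1 <= position <= n:
        raise ValueError("position is outside the data range")
    lower = math.floor(position)
    frac = position - lower
    if frac == 0:
        return float(sorted_data[lower - 1])
    left, right = sorted_data[lower - 1], sorted_data[lower]
    return left + frac * (right - left)


data = list(range(10))  # the 10 unique points 0..9 from the remarks
n = len(data)

# Q0..Q4 correspond to q = 0, 0.25, 0.5, 0.75, 1
inclusive_quartiles = [value_at(data, inclusive_position(k / 4, n)) for k in range(5)]
print(inclusive_quartiles)  # [0.0, 2.25, 4.5, 6.75, 9.0]

# Cross-check Q1..Q3 against the standard library
print(statistics.quantiles(data, n=4, method="inclusive"))  # [2.25, 4.5, 6.75]
```

Q0 and Q4 come out as the minimum (0) and maximum (9) of the data, which is what makes both ends determinable in this method.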

SEMI-EXCLUSIVE QUARTILE

(Note: The author does not know the official name of this method, so the name used here is the author's own.)

L,,,0,,,1,,,2,,,3,,,4,,,5,,,6,,,7,,,8,,,9,,,H

  |123456789|123456789|123456789|123456789|  

Data-ordering Position Formula: qN+0.5

Q0 Data-ordering Position: 0.5 (the midpoint between L and the minimum data point, 0)

Q4 Data-ordering Position: N+0.5 (the midpoint between the maximum data point, 9, and H)

Minimum Determinable Quartile Rank Formula: 0.5/N

Maximum Determinable Quartile Rank Formula: 1-0.5/N

Description: The minimum and maximum quartiles (Q0 and Q4) are not determinable, because they fall outside the data range. As the diagram shows, Q0 falls at the lower limit of the minimum data point and Q4 at the upper limit of the maximum data point. In this method, the limit-range of each data point extends half a position to either side, so the ranges of neighboring data points only meet at the midpoint between them. Put casually: compared with the inclusive method, the data-ordering positions of Q0 and Q4 are shifted by -0.5 and +0.5, respectively. This method is probably less popular than the others.

When To Use This: Use this method for any data that should have a limit-range (meaning that each data point has a lower and an upper limit) and that this method fits especially well. However, the exclusive method is recommended unless you have a specific reason to use this one (for instance, the lower and upper limits in this method match the data better than those of the common exclusive method).
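A small sketch of the semi-exclusive positions, using exact fractions so the half-position offsets are not blurred by floating-point rounding. The function names are hypothetical. As a side note, the formula qN+0.5 appears to coincide with what other sources call the Hazen plotting position (for example, method="hazen" in numpy.quantile); that identification is an assumption and is not confirmed by the text above.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def semi_exclusive_position(q, n):
    """1-based data-ordering position for quantile fraction q: q*N + 0.5."""
    return q * n + HALF


def determinable_range(n):
    """Smallest and largest q whose position lands on real data (1..N)."""
    return HALF / n, 1 - HALF / n


n = 10  # the 10 unique points 0..9 from the remarks

# Positions of Q0..Q4 (q = 0, 1/4, 1/2, 3/4, 1)
positions = [semi_exclusive_position(Fraction(k, 4), n) for k in range(5)]
print([float(p) for p in positions])  # [0.5, 3.0, 5.5, 8.0, 10.5]

q_min, q_max = determinable_range(n)
print(float(q_min), float(q_max))  # 0.05 0.95

# Q0 (0.5) and Q4 (10.5) fall outside positions 1..N, so they are not determinable
determinable = [1 <= p <= n for p in positions]
print(determinable)  # [False, True, True, True, False]
```

With the data 0..9, Q1 lands exactly on position 3 (value 2), Q2 on position 5.5 (halfway between 4 and 5), and Q3 on position 8 (value 7).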

EXCLUSIVE QUARTILE

L,,,0,,,1,,,2,,,3,,,4,,,5,,,6,,,7,,,8,,,9,,,H

|1234567890|1234567890|1234567890|1234567890|

Data-ordering Position Formula: q(N+1)

Q0 Data-ordering Position: 0 (the virtual point L)

Q4 Data-ordering Position: N+1 (the virtual point H)

Minimum Determinable Quartile Rank Formula: 1/(N+1)

Maximum Determinable Quartile Rank Formula: N/(N+1)

Description: The minimum and maximum quartiles (Q0 and Q4) are not determinable, because they fall outside the data range. As the diagram shows, Q0 falls at the lower limit of the minimum data point and Q4 at the upper limit of the maximum data point. In this method, the limit-range of each data point overlaps that of its neighboring data points across the entire range of the data. Put casually: compared with the inclusive method, the data-ordering positions of Q0 and Q4 are shifted by -1 and +1, respectively. Most common and complex statistical problems require this method, although it may seem to conflict with intuition at first. Schools (in my experience in Thailand) mainly refer to this method.

When To Use This: Common and complex statistics problems require this method. Use it for any data that should have a limit-range (meaning that each data point has a lower and an upper limit) and that this method fits especially well. It is recommended to use this method by default unless you have a specific reason to use another one.
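The exclusive formula can be sketched the same way. The helper names are illustrative, linear interpolation between neighboring data points is an assumption, and positions outside 1..N are reported as None to mark them as not determinable. Q1 to Q3 are compared against the standard-library statistics.quantiles with method="exclusive", which also spaces its cut points using N+1.

```python
import math
import statistics
from fractions import Fraction


def exclusive_position(q, n):
    """1-based data-ordering position for quantile fraction q: q*(N+1)."""
    return q * (n + 1)


def value_at(sorted_data, position):
    """Interpolated value at a 1-based position, or None if not determinable."""
    n = len(sorted_data)
    if not 1 <= position <= n:
        return None  # falls on or toward the virtual points L / H
    lower = math.floor(position)
    frac = position - lower
    if frac == 0:
        return float(sorted_data[lower - 1])
    left, right = sorted_data[lower - 1], sorted_data[lower]
    return left + frac * (right - left)


data = list(range(10))  # the 10 unique points 0..9 from the remarks
n = len(data)

# Q0..Q4 correspond to q = 0, 0.25, 0.5, 0.75, 1
exclusive_quartiles = [value_at(data, exclusive_position(k / 4, n)) for k in range(5)]
print(exclusive_quartiles)  # [None, 1.75, 4.5, 7.25, None]

# Cross-check Q1..Q3 against the standard library
print(statistics.quantiles(data, n=4, method="exclusive"))  # [1.75, 4.5, 7.25]

# Determinable range of q: 1/(N+1) .. N/(N+1)
q_min, q_max = Fraction(1, n + 1), Fraction(n, n + 1)
print(q_min, q_max)  # 1/11 10/11
```

Compared with the inclusive result for the same data (2.25, 4.5, 6.75), Q1 and Q3 sit further out toward the ends, while Q2 is unchanged.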