PAPA scoring schemes

Bullshit! :slightly_smiling:

You say that now that PAPA 18 is over, but I guarantee Lefkoff & Associates would be suing for emotional damages over the changed scoring system.

Just to prove that’s correct, I went into Steinman’s time machine, took a trip back to alternate universe PAPA 18 and captured this picture after you failed to qualify by 2 points.


You guys are hilarious! Truth is, PAPA qualifying has always been about the bombs OR the consistent ticket. You can get in with either one.

Actually, these scores by JonR and Ben sparked a discussion in the waiting lines at PAPA. Right now, any score ranked 88th or lower is worth 0 points. This means a score of 580M on GoT and a score of 2M on GoT are treated equally. Yet both of those scores could have been part of tickets where the players were nearly tied, even though one player clearly had the better GoT game.

The top-88 cutoff is historical: it dates from when this system began and we barely had more than 100 plays on a machine. But now, with the number of plays exceeding 150 for almost every machine, and in some cases in the 200s, is it time to revamp the points? Maybe the top 150 scores should get points? Suppose 1st place was 200 points; then we could have a scale that is still top heavy but avoids the current pattern of a big drop from 1st to 2nd and from 2nd to 3rd followed by only single-point drops after 3rd. E.g. 200, 190, 185, 180, 176, 172, 168, 164, 160, 157, etc. Once we get to about 20th, it would be single-point drops all the way to 150th or so.

How would that change who qualifies and who doesn’t? And assuming it does result in changes, is it for the better? Right now the player with three high scores and two zeros sees their ticket bubble up, whereas a player with five very good but no excellent scores will fall like a rock. I don’t know what the right answer is, but I do agree with JonR that maybe it’s time to give a score of 500M more points than a score of 2M.
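For anyone who wants to play with the numbers, here is a minimal Python sketch of the two scales. The current scale follows the description in this thread (100, 90, 85, then one point less per place, with 88th worth 0). The proposed scale is only partly specified above, so `proposed_points` is a hypothetical version: it assumes single-point drops start right after the ten quoted values, and the two example ranks are made up.

```python
def papa_points(rank):
    """Current scale as described in this thread: 100, 90, 85, 84, ...
    with 88th place and below worth 0."""
    if rank == 1:
        return 100
    if rank == 2:
        return 90
    return max(0, 88 - rank)


# Head of the proposed scale, copied from the post above.
PROPOSED_HEAD = (200, 190, 185, 180, 176, 172, 168, 164, 160, 157)


def proposed_points(rank):
    """Hypothetical version of the proposed scale.

    Assumption: single-point drops start right after the ten quoted values.
    The post says they would start around 20th but gives no values there.
    """
    if rank <= len(PROPOSED_HEAD):
        return PROPOSED_HEAD[rank - 1]
    return max(0, PROPOSED_HEAD[-1] - (rank - len(PROPOSED_HEAD)))


# Two illustrative finishes that both score 0 today (ranks are made up).
for rank in (95, 150):
    print(rank, papa_points(rank), proposed_points(rank))
```

Under these assumptions both finishes are worth 0 today, while the wider scale would separate them (72 vs. 17 points).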

The Kidforce league up in Cleveland has adjusted to this. I think when I was last involved, their point system might’ve been 145, 135, 130, 127, 125, 124, 123… There were maybe 120 players that season. Now the points are probably something different.

@jdelz Do you know how Joe matches up the game points with the number of players involved?

The effects of a zero score are pretty interesting.

In looking at the PAPA 19 qualifying scores, the 2 players on the one-point bubble this year would not have benefitted from points being awarded on a scale from, let’s say, 200 down to 0.

Cryss S. had no zero scores in his best run, but still just missed the cut. Recalibrating does not really help him. Escher L. was tied with Cryss, but had 2 zero scores. However, the 2 zeroes were not near the upper end of the scores that were awarded zero points, so it was not a case of “all that work for nothing.” Recalibrating the scale probably would not have helped him.

This to me is the interesting case: A little farther down the qualifying list is Johnny M. who missed the qualifying cut by 5 points. However, his GoT score of 509 MIL on his best entry ticket was one of the top 10 zero scores out of 84 zero scores.

Other qualifiers in A who made the cut despite a zero for GoT on their best A ticket were Andy R. and Jason W. Johnny’s GoT score was considerably better than Andy’s and Jason’s scores on their best A tickets: Johnny 95th, Andy 125th, Jason 150th.

In looking at the composite qualifying scores, Andy was at 251 and Jason finished at 227 (they both made the cut), while Johnny was at 204, and all three had zeros for GoT on their best tickets. If points were awarded for everything, and assuming 1-point differentials that far down the list, Johnny would gain 55 points on Jason when comparing the best tickets where the zero exists. Johnny would gain only 30 points on Andy. So just looking at those scores, Johnny would have qualified much higher than Jason.
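The arithmetic above can be sketched in a few lines of Python. The totals and GoT finishes are the ones quoted in this post; the helper names are made up for illustration, and the big simplifying assumption is that only the GoT entry changes while every other game on each ticket keeps its current points (a real rescale would shift those too).

```python
# Composite totals and GoT finishes quoted in the post above.
totals = {"Andy": 251, "Jason": 227, "Johnny": 204}
got_rank = {"Andy": 125, "Jason": 150, "Johnny": 95}


def got_gain(player, other):
    """Points `player` gains on `other` from GoT alone,
    assuming one point per place that far down the list."""
    return got_rank[other] - got_rank[player]


def margin_after(player, other):
    """Lead of `player` over `other` once the GoT gain is applied.
    Assumption: nothing else on either ticket changes."""
    return totals[player] + got_gain(player, other) - totals[other]


for other in ("Jason", "Andy"):
    print(other, got_gain("Johnny", other), margin_after("Johnny", other))
```

With these numbers Johnny turns a 23-point deficit to Jason into a 32-point lead, but still trails Andy by 17.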

Granted, I did not recalibrate every scenario and every ticket, but this particular example stuck out to me as a case of all zeroes are not equal.

I don’t agree. 500 million and 2 million on Game of Thrones are both below-average scores.

My opinion is that any below-average game should be worth zero. I don’t feel that the difference between a below-average game and a super below-average game should determine whether a player qualifies in the PAPA format.

If you want a comparison for a place where this is already happening, look at Classics. The cutoff for 87th is pretty close to the total number of entries on most machines, which gives below-average games some credit. As a result, in Classics you are hit very, very hard for a poor game.

I greatly prefer rewarding players for their good games, with the relative strength of those good games mattering more than the relative strength of below-average games.

An alternative would be to change the scoring system to be “exponential decay” (100-95-95% of that-etc), which I’ve seen in some Euro events. This gives a longer “tail” where you’ll still receive points, but not many – I find this fairer in noting that there is a difference between 60th and 80th but that difference is not as important as the difference between 5th and 25th.

We used to use exponential decay in the scoring at CA Extreme but I think we set the decay too steep. Lots of people ended up with low qualifying totals, and it was replaced with the current PAPA system.


Just to see what happens, I looked at the players down through Trent in 33rd place under various scenarios. Current is the scale used now, where 88th place goes to zero. 134 is a scale going 150-140-135-132-130-129-etc.; 184 is a similar scale that starts at 200. Sum Ranks is just adding up their positions on each game. Sum Percents is the sum of their percentiles.
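To make the last two columns concrete, here is a rough Python sketch of rank sums and percentile sums next to the current scale. The two sets of finishes are the Julio and Werdrick tickets quoted elsewhere in this thread. The percentile formula (100 for 1st, 0 for last) and the flat 258 plays per machine are assumptions for illustration, not necessarily how the table was computed.

```python
def papa_points(rank):
    """Current scale: 100, 90, 85, 84, ... with 88th and below worth 0."""
    if rank == 1:
        return 100
    if rank == 2:
        return 90
    return max(0, 88 - rank)


def sum_ranks(ranks):
    """Lower is better: just add up the finishing positions."""
    return sum(ranks)


def sum_percentiles(ranks, plays):
    """Higher is better. Assumed definition: 100 for 1st, 0 for last."""
    return sum(100 * (n - r) / (n - 1) for r, n in zip(ranks, plays))


julio = [49, 78, 50, 29, 27]      # finishes quoted in this thread
werdrick = [191, 150, 16, 1, 33]  # finishes quoted in this thread
plays = [258] * 5                 # assumption: average play count on every machine

for name, ranks in (("Julio", julio), ("Werdrick", werdrick)):
    current = sum(papa_points(r) for r in ranks)
    print(name, current, sum_ranks(ranks), round(sum_percentiles(ranks, plays), 1))
```

The current scale puts Werdrick ahead (227 vs. 207), while both the rank sum (233 vs. 391) and the percentile sum favor Julio.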

For those near the top, it matters little. What I do see, though, is a few entries where players had all 5 games in the top 88 (still in the top 1/3, not just the top 1/2, on the machines played) yet lost out to others due to the scale used. You can make a pretty good case that Cryss, Julio, Paul and Andy Jr. had tickets that were better than Johan’s or Jason’s.

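| Player | Current (88) | 134 Places | 184 Places | Sum Ranks | Sum Percents |
|---|---|---|---|---|---|
| Massenkoff | 1 | 1 | 1 | 1 | 1 |
| Elwin | 2 | 2 | 2 | 2 | 2 |
| George | 3 | 3 | 3 | 3 | 3 |
| Davidson | 4 | 7 | 7 | 14 | 9 |
| Z Sharpe | 5 | 4 | 4 | 4 | 4 |
| Gagno | 6 | 13 | 13 | 19 | 16 |
| Kerins | 6 | 8 | 8 | 7 | 11 |
| Hugosson | 8 | 9 | 9 | 8 | 7 |
| Rosa | 8 | 24 | 24 | 23 | 28 |
| Hansen | 10 | 14 | 14 | 26 | 22 |
| Henderson | 11 | 15 | 15 | 18 | 15 |
| J Sharpe | 11 | 5 | 5 | 5 | 5 |
| Belsito | 13 | 17 | 17 | 20 | 19 |
| Stix | 14 | 6 | 6 | 6 | 6 |
| Acciari | 15 | 20 | 20 | 22 | 21 |
| McKinnie | 16 | 19 | 19 | 24 | 29 |
| Genberg | 17 | 25 | 25 | 32 | 27 |
| Werdrick | 18 | 31 | 31 | 30 | 33 |
| Replogle | 19 | 22 | 22 | 16 | 18 |
| Runsten | 20 | 26 | 26 | 27 | 24 |
| Becker | 21 | 21 | 21 | 15 | 17 |
| Stewart | 22 | 23 | 23 | 16 | 12 |
| Sutter | 23 | 10 | 10 | 9 | 8 |
| Birrell | 24 | 32 | 32 | 28 | 26 |
| Stephens | 25 | 11 | 11 | 10 | 13 |
| Lefkoff | 25 | 33 | 33 | 33 | 32 |
| Vicario | 27 | 12 | 12 | 11 | 10 |
| Modica | 28 | 29 | 29 | 31 | 31 |
| Ojamies | 29 | 27 | 27 | 25 | 30 |
| Bowden | 30 | 28 | 28 | 29 | 25 |
| Jongma | 31 | 16 | 16 | 12 | 14 |
| Rosa II | 32 | 18 | 18 | 13 | 20 |
| Augenstein | 33 | 30 | 30 | 20 | 23 |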


For reference, the number of non-voided games on the A bank ranged from 171 on GoT to 328 on EBD with an average of 258; only GoT had under 220.

Thanks for doing all that data analysis, Bob. To my eye, given that data, there’s not much reason to change… changing would just move the zero line from one arbitrary position to another arbitrary position.

Bowen’s exponential decay proposal is kinda interesting but personally, I don’t think I want to deal with comparing my score of 214.3791 to someone else’s score of 214.9714… that feels unnecessarily pedantic.

I can’t tell if I went up or down based on Bob’s analysis, but like most pinball related things that I hear about WPPR, if Bob’s changes made me go up, I fucking love it!

If they make me go down, I’m pretty sure it’s the shittiest proposal I’ve heard in my entire life :slight_smile:

Josh,

In your case, you’d go up. But the real question is when to start the zeroes. As was mentioned above, the current scale was set, arbitrarily, when fewer games were played. Clearly a rescaling is appropriate at some point. Have we reached that point? Re Bowen’s comment about “any below average game should be worth zero,” well, we’ve passed that. With an average of 258 games per machine, below average would now be about 129th, not 88th. If we want “below average” to be the criterion, then my 150-140-135-132-130-129-128 … 1-0 scale is what is closest right now. But “below average” is an arbitrary decision, too. Still, when I look at the specific tickets I mentioned, I can’t help but feel that those four were better tickets on the whole than the other two that got in.

As for exponential decay, that would be pretty decent if you used 95% for 2nd and then 99% for succeeding increments. [Gets rid of CalEx’s problem of decaying too quickly.] I’ve redone my table with that one added, and it looks pretty good; those four do get in, but the top 5 are still the top 5, and 6-15 just reshuffle slightly. If the playoffs took just the top 16, I’d say don’t bother to change based on what I see. But since they now take 24, the differences in those second-tier tickets now matter.
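Here is a small Python sketch of the decay idea, for comparison. It reads “95% for 2nd and then 99% for succeeding increments” as: 2nd place gets 95% of 1st, and each later place gets 99% of the place before it. That reading, the 100-point starting value, and the function name are assumptions. Setting `tail_ratio=0.95` gives the plain 100-95-95%-of-that version.

```python
def decay_points(rank, first=100.0, second_ratio=0.95, tail_ratio=0.99):
    """Exponential-decay scale (a sketch, not anyone's official formula).

    1st place gets `first`, 2nd gets `first * second_ratio`, and every
    place after that gets `tail_ratio` times the place before it.
    """
    if rank == 1:
        return first
    return first * second_ratio * tail_ratio ** (rank - 2)


# Compare the gentle tail (99%) with a plain 95% decay at a few places.
for rank in (1, 2, 3, 25, 88, 150):
    gentle = decay_points(rank)
    steep = decay_points(rank, tail_ratio=0.95)
    print(rank, round(gentle, 2), round(steep, 2))
```

The steep version is nearly flat at zero well before 88th, which matches the complaint about low qualifying totals, while the 99% tail still leaves meaningful points deep into the field. Either way you end up with fractional totals unless you round.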

Lefkoff has run actual historical PAPA data through alternate 150- and 200-point scales.

The tldr of his data imo is that the larger the scale, the more accurate the qualifying results.

I moved the remainder of the PAPA scoring schemes discussion over from the “World’s Worst Scores” thread, but it appears that Discourse doesn’t retain the original posting order of moved messages. Brilliant. :confused: Sorry about the two clumps of messages being out of order.


If you want examples, here’s my table with the top 33 from this year, their finish on each machine [first 5 figures], and points awarded. Choose your comparisons; I like Julio’s all-top-80 and 4 top 50 scores [49 78 50 29 27] vs. Werdrick [191 150 16 1 33] or Genberg [105 13 319 7 12].

As I’ve indicated, I agree with Cayle’s take that all 5 scores should count positively rather than just ignoring below-average games. PAPA’s ticket format is less subject to “going for broke to get a top score” than Herb-style qualifying, where a zero-type score doesn’t matter because you’ll just try again. It still happens now when someone knows they have a weak game or two and needs a top 10 score on their last machine or two to get in with the current scale, but I think it’s less common than in Herb. That gives you less of an excuse, if you want to call it that, for having such a score.

I also note that the scale has widened over the years as posted above, at least mildly in line with player count increases.

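| Player | Finishes on the 5 machines played (in column order) | AVEN | BDS | CV | DH | EBD | GoT | IJ | PZ | Tommy | WW | Scale 1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Massenkoff | 25 13 18 8 46 | 0 | 63 | 75 | 70 | 0 | 0 | 0 | 0 | 80 | 42 | 330 |
| Elwin | 59 53 3 14 2 | 29 | 35 | 85 | 74 | 0 | 90 | 0 | 0 | 0 | 0 | 313 |
| George | 9 2 48 97 17 | 79 | 90 | 0 | 40 | 0 | 0 | 0 | 0 | 0 | 71 | 280 |
| Davidson | 16 22 31 181 4 | 0 | 72 | 0 | 66 | 0 | 0 | 57 | 0 | 84 | 0 | 279 |
| Z Sharpe | 40 20 8 10 109 | 0 | 48 | 0 | 68 | 0 | 80 | 78 | 0 | 0 | 0 | 274 |
| Gagno | 182 11 23 24 42 | 0 | 0 | 0 | 77 | 0 | 0 | 65 | 0 | 64 | 46 | 252 |
| Kerins | 35 27 35 126 3 | 0 | 53 | 0 | 0 | 0 | 61 | 53 | 0 | 0 | 85 | 252 |
| Hugosson | 15 47 31 8 126 | 73 | 0 | 41 | 57 | 80 | 0 | 0 | 0 | 0 | 0 | 251 |
| Rosa | 1 155 125 2 27 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 90 | 61 | 0 | 251 |
| Hansen | 8 62 228 6 26 | 80 | 26 | 0 | 0 | 0 | 82 | 0 | 62 | 0 | 0 | 250 |
| Henderson | 54 165 3 25 25 | 34 | 0 | 0 | 0 | 0 | 85 | 0 | 63 | 0 | 63 | 245 |
| J Sharpe | 27 10 66 70 22 | 61 | 0 | 0 | 78 | 0 | 0 | 22 | 0 | 18 | 66 | 245 |
| Belsito | 24 40 35 183 9 | 64 | 0 | 48 | 53 | 0 | 0 | 0 | 0 | 0 | 79 | 244 |
| Stix | 39 36 27 30 68 | 0 | 49 | 0 | 52 | 0 | 0 | 61 | 0 | 58 | 20 | 240 |
| Acciari | 29 189 30 33 25 | 59 | 0 | 58 | 0 | 0 | 0 | 55 | 0 | 63 | 0 | 235 |
| McKinnie | 186 1 28 67 35 | 0 | 100 | 0 | 60 | 0 | 21 | 0 | 0 | 0 | 53 | 234 |
| Genberg | 105 13 319 7 12 | 0 | 0 | 0 | 75 | 0 | 0 | 81 | 76 | 0 | 0 | 232 |
| Werdrick | 191 150 16 1 33 | 0 | 0 | 0 | 0 | 0 | 0 | 72 | 100 | 55 | 0 | 227 |
| Replogle | 75 137 16 5 31 | 13 | 0 | 0 | 72 | 0 | 83 | 0 | 0 | 0 | 57 | 225 |
| Runsten | 165 110 55 2 1 | 0 | 0 | 0 | 33 | 90 | 0 | 0 | 0 | 0 | 100 | 223 |
| Becker | 65 30 18 19 128 | 23 | 58 | 70 | 0 | 0 | 69 | 0 | 0 | 0 | 0 | 220 |
| Stewart | 38 41 43 131 11 | 50 | 0 | 47 | 0 | 0 | 0 | 45 | 0 | 77 | 0 | 219 |
| Sutter | 62 64 36 52 14 | 26 | 0 | 0 | 24 | 52 | 0 | 36 | 0 | 74 | 0 | 212 |
| Birrell | 42 4 9 156 139 | 46 | 0 | 0 | 0 | 84 | 79 | 0 | 0 | 0 | 0 | 209 |
| Stephens | 56 33 84 44 15 | 0 | 32 | 0 | 0 | 55 | 4 | 0 | 44 | 0 | 73 | 208 |
| Lefkoff | 21 208 24 11 239 | 67 | 0 | 0 | 64 | 0 | 77 | 0 | 0 | 0 | 0 | 208 |
| Vicario | 49 78 50 29 27 | 39 | 10 | 38 | 0 | 0 | 0 | 0 | 59 | 0 | 61 | 207 |
| Modica | 32 14 272 95 14 | 56 | 74 | 0 | 0 | 0 | 0 | 74 | 0 | 0 | 0 | 204 |
| Ojamies | 84 10 172 14 44 | 0 | 0 | 0 | 4 | 78 | 0 | 0 | 74 | 0 | 44 | 200 |
| Bowden | 55 22 62 14 210 | 33 | 0 | 66 | 26 | 0 | 74 | 0 | 0 | 0 | 0 | 199 |
| Jongma | 58 85 5 33 61 | 30 | 0 | 3 | 83 | 0 | 55 | 0 | 0 | 0 | 27 | 198 |
| Rosa II | 73 5 60 21 86 | 15 | 0 | 83 | 0 | 0 | 28 | 0 | 67 | 2 | 0 | 195 |
| Augenstein | 16 79 36 132 28 | 72 | 0 | 0 | 9 | 0 | 52 | 0 | 0 | 0 | 60 | 193 |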

I think Bowen’s comments about how PAPA Classics feels now is spot on. You play a shit game and you’re like FUCKKKKK! I’m so fucked!!!

The thing is, and maybe other people can share their experiences, PAPA A used to feel just like that to me. If you had a just flat out shit game, you were like FUCKKKKK! My entry is so fucked! Whereas if you had an average game or even a slightly below average game, you would get compensated enough that 2 ‘average’ games could save a run where you could put 3 other games together.

Now I have a shit game, and I’m not even fazed by it. There’s plenty of room to survive that these days because you’re likely to have some ‘good games’ that still net you 0 points in your run.

My personal mindset has definitely changed from it being about true consistency across 5 games, to just making sure I pound out a solid 3 out of 5. Not that that’s a bad or good thing.

It’s less of this:

(About 40 seconds in is about how I felt about my PAPA run once I had a shit game back in the day)


Don’t you mean “FARRTTTTSS!!!” ? :smile:


Definitely not back then pre-kids :slightly_smiling:

I don’t mind the PAPA scoring for PAPA, where you are pretty much guaranteed to have scores that give a player 0 points, but I absolutely loathe it for small events. With the rising popularity of the neverdrains software as well as Scott Danesi’s software where this is called “IFPA Scoring” for some reason, I see the 100-90-85-84 etc. scoring applied regularly to events with 20 or so players. Then you have situations where the difference between last and 3rd, and 3rd to first are about the same. I think that the scoring should be scaled in some way to match the participation in the tournament.
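A quick Python sketch of the small-event problem, using the 100-90-85-84 scale as described in this thread and a 20-player field. The `scaled_points` function is purely a hypothetical illustration of “scaled to match participation” (a straight line from 100 for 1st to 0 for last); it is not what either piece of software does.

```python
def papa_points(rank):
    """PAPA-style scale: 100, 90, 85, 84, ... with 88th and below worth 0."""
    if rank == 1:
        return 100
    if rank == 2:
        return 90
    return max(0, 88 - rank)


def scaled_points(rank, n_players, top=100):
    """Hypothetical field-size-aware scale: linear from `top` (1st) to 0 (last)."""
    if n_players < 2:
        return top
    return top * (n_players - rank) / (n_players - 1)


n_players = 20
for name, scale in (("papa", papa_points),
                    ("scaled", lambda r: scaled_points(r, n_players))):
    first, third, last = scale(1), scale(3), scale(n_players)
    # Gap from 1st to 3rd vs. gap from 3rd to last place.
    print(name, round(first - third, 1), round(third - last, 1))
```

With 20 players the PAPA-style scale gives last place 68 points, so 1st to 3rd is 15 points and 3rd to last is only 17. Any scale that ties the zero point to the field size stretches that second gap back out.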

A couple years ago I looked at the distributions of scores in one of the PAPAs and found that typically the scores on most games wound up being in a lognormal distribution, regardless of the game. Thinking of maybe trying to look at this year’s PAPA data and apply a scoring system that gives you a point for the “percentile” your score winds up in just to add to the data here.


While I realize we have a pretty lenient swearing policy here on TiltForums, I would appreciate it if people could keep it down to a swears : non-swears ratio of better than 1:3. Shop for ideas here!


Go fart yourself Dunlap :slight_smile:
