PAPA scoring schemes

BMU · April 24, 2016, 9:10pm

Josh,

In your case, you’d go up. But the real question is where to start the zeroes. As was mentioned above, the current scale was set, arbitrarily, when fewer games were played. Clearly a rescaling is appropriate at some point. Have we reached that point? Re Bowen’s comment about “any below average game should be worth zero,” well, we’ve passed that. Below average would now be about 129th, not 88th. If we want “below average” to be the criterion, then my 150-140-135-132-130-129-128 … 1-0 scale is the closest right now. But “below average” is an arbitrary decision, too. Still, when I look at the specific tickets I mentioned, I can’t help but feel that those four were better tickets on the whole than the other two that got in.

As for exponential decay, that would be pretty decent if you used 95% for 2nd and then 99% for succeeding increments. [Gets rid of CalEx’s problem of decaying too quickly.] I’ve redone my table with that one added, and it looks pretty good; those four do get in, but the top 5 are still the top 5, and 6-15 just reshuffle slightly. If the playoffs took just the top 16, I’d say don’t bother to change based on what I see. But since they now take 24, the differences in those second-tier tickets now matter.
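A minimal sketch of the decay scale described above, assuming a top value of 100 points (the post does not state one) and reading "95% for 2nd and then 99% for succeeding increments" as: 2nd place gets 95% of 1st, and every later place gets 99% of the place before it. The function name and parameters are illustrative only.

```python
def decay_points(rank, top=100.0, second_ratio=0.95, step_ratio=0.99):
    """Points for a finishing rank under the assumed exponential-decay scale.

    rank 1 -> top; rank 2 -> top * second_ratio;
    every later rank -> previous rank's points * step_ratio.
    """
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    if rank == 1:
        return top
    return top * second_ratio * step_ratio ** (rank - 2)


if __name__ == "__main__":
    # Print a few ranks to see how slowly the tail falls off.
    for rank in (1, 2, 3, 10, 50, 88, 129):
        print(rank, round(decay_points(rank), 2))
```

Under these assumptions the scale never reaches zero: rank 88, which scores nothing on the current scale, would still be worth about 40 points.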

cayle · April 25, 2016, 7:34am

Lefkoff has run actual historical PAPA data through alternate 150- and 200-point scales.

The tldr of his data imo is that the larger the scale, the more accurate the qualifying results.

joe · April 25, 2016, 6:22pm

I moved the remainder of the PAPA scoring schemes discussion over from the “World’s Worst Scores” thread, but it appears that Discourse doesn’t retain the original posting order of moved messages. Brilliant. Sorry about the two clumps of messages being out of order.

BMU · April 25, 2016, 7:08pm

If you want examples, here’s my table with the top 33 from this year, their finish on each machine [first 5 figures], and points awarded. Choose your comparisons; I like Julio’s all-top-80 and 4 top 50 scores [49 78 50 29 27] vs. Werdrick [191 150 16 1 33] or Genberg [105 13 319 7 12].

As I’ve indicated, I agree with Cayle’s take that all 5 scores should count positively rather than just ignoring below-average games. PAPA’s ticket format is less subject to “going for broke to get a top score” than Herb-style qualifying, where a zero-type score doesn’t matter, you’ll just try again. It still happens now when someone knows they have a weak game or two and needs a top 10 score on their last machine or two to get in with the current scale, but I think it’s less common than in Herb. That gives you less of an excuse, if you want to call it that, for having such a score.

I also note that the scale has widened over the years as posted above, at least mildly in line with player count increases.

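Columns: finishing position on each of the five machines played (in machine order), then points awarded per machine, then the ticket total (Scale 1).

| Player | Finishes (5 machines played) | AVEN | BDS | CV | DH | EBD | GoT | IJ | PZ | Tommy | WW | Scale 1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Massenkoff | 25 13 18 8 46 | 0 | 63 | 75 | 70 | 0 | 0 | 0 | 0 | 80 | 42 | 330 |
| Elwin | 59 53 3 14 2 | 29 | 35 | 85 | 74 | 0 | 90 | 0 | 0 | 0 | 0 | 313 |
| George | 9 2 48 97 17 | 79 | 90 | 0 | 40 | 0 | 0 | 0 | 0 | 0 | 71 | 280 |
| Davidson | 16 22 31 181 4 | 0 | 72 | 0 | 66 | 0 | 0 | 57 | 0 | 84 | 0 | 279 |
| Z Sharpe | 40 20 8 10 109 | 0 | 48 | 0 | 68 | 0 | 80 | 78 | 0 | 0 | 0 | 274 |
| Gagno | 182 11 23 24 42 | 0 | 0 | 0 | 77 | 0 | 0 | 65 | 0 | 64 | 46 | 252 |
| Kerins | 35 27 35 126 3 | 0 | 53 | 0 | 0 | 0 | 61 | 53 | 0 | 0 | 85 | 252 |
| Hugosson | 15 47 31 8 126 | 73 | 0 | 41 | 57 | 80 | 0 | 0 | 0 | 0 | 0 | 251 |
| Rosa | 1 155 125 2 27 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 90 | 61 | 0 | 251 |
| Hansen | 8 62 228 6 26 | 80 | 26 | 0 | 0 | 0 | 82 | 0 | 62 | 0 | 0 | 250 |
| Henderson | 54 165 3 25 25 | 34 | 0 | 0 | 0 | 0 | 85 | 0 | 63 | 0 | 63 | 245 |
| J Sharpe | 27 10 66 70 22 | 61 | 0 | 0 | 78 | 0 | 0 | 22 | 0 | 18 | 66 | 245 |
| Belsito | 24 40 35 183 9 | 64 | 0 | 48 | 53 | 0 | 0 | 0 | 0 | 0 | 79 | 244 |
| Stix | 39 36 27 30 68 | 0 | 49 | 0 | 52 | 0 | 0 | 61 | 0 | 58 | 20 | 240 |
| Acciari | 29 189 30 33 25 | 59 | 0 | 58 | 0 | 0 | 0 | 55 | 0 | 63 | 0 | 235 |
| McKinnie | 186 1 28 67 35 | 0 | 100 | 0 | 60 | 0 | 21 | 0 | 0 | 0 | 53 | 234 |
| Genberg | 105 13 319 7 12 | 0 | 0 | 0 | 75 | 0 | 0 | 81 | 76 | 0 | 0 | 232 |
| Werdrick | 191 150 16 1 33 | 0 | 0 | 0 | 0 | 0 | 0 | 72 | 100 | 55 | 0 | 227 |
| Replogle | 75 137 16 5 31 | 13 | 0 | 0 | 72 | 0 | 83 | 0 | 0 | 0 | 57 | 225 |
| Runsten | 165 110 55 2 1 | 0 | 0 | 0 | 33 | 90 | 0 | 0 | 0 | 0 | 100 | 223 |
| Becker | 65 30 18 19 128 | 23 | 58 | 70 | 0 | 0 | 69 | 0 | 0 | 0 | 0 | 220 |
| Stewart | 38 41 43 131 11 | 50 | 0 | 47 | 0 | 0 | 0 | 45 | 0 | 77 | 0 | 219 |
| Sutter | 62 64 36 52 14 | 26 | 0 | 0 | 24 | 52 | 0 | 36 | 0 | 74 | 0 | 212 |
| Birrell | 42 4 9 156 139 | 46 | 0 | 0 | 0 | 84 | 79 | 0 | 0 | 0 | 0 | 209 |
| Stephens | 56 33 84 44 15 | 0 | 32 | 0 | 0 | 55 | 4 | 0 | 44 | 0 | 73 | 208 |
| Lefkoff | 21 208 24 11 239 | 67 | 0 | 0 | 64 | 0 | 77 | 0 | 0 | 0 | 0 | 208 |
| Vicario | 49 78 50 29 27 | 39 | 10 | 38 | 0 | 0 | 0 | 0 | 59 | 0 | 61 | 207 |
| Modica | 32 14 272 95 14 | 56 | 74 | 0 | 0 | 0 | 0 | 74 | 0 | 0 | 0 | 204 |
| Ojamies | 84 10 172 14 44 | 0 | 0 | 0 | 4 | 78 | 0 | 0 | 74 | 0 | 44 | 200 |
| Bowden | 55 22 62 14 210 | 33 | 0 | 66 | 26 | 0 | 74 | 0 | 0 | 0 | 0 | 199 |
| Jongma | 58 85 5 33 61 | 30 | 0 | 3 | 83 | 0 | 55 | 0 | 0 | 0 | 27 | 198 |
| Rosa II | 73 5 60 21 86 | 15 | 0 | 83 | 0 | 0 | 28 | 0 | 67 | 2 | 0 | 195 |
| Augenstein | 16 79 36 132 28 | 72 | 0 | 0 | 9 | 0 | 52 | 0 | 0 | 0 | 60 | 193 |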
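For anyone who wants to rerun the suggested comparisons, here is a rough sketch that scores the three tickets named above under two scales. The current scale (100-90-85-84 … 1-0) is inferred from the points in the table. The 150-point version is only one reading of the "150-140-135-132-130-129-128 … 1-0" scale proposed earlier in the thread, assuming it keeps dropping one point per place after 5th. Function names are illustrative.

```python
def current_points(rank):
    """Current scale: 100, 90, 85, then 84 down to 1 at 87th, 0 after."""
    top = {1: 100, 2: 90, 3: 85}
    if rank in top:
        return top[rank]
    return max(0, 88 - rank)


def proposed_150_points(rank):
    """Assumed 150 scale: 150, 140, 135, 132, then 130 down to 1 at 134th."""
    top = {1: 150, 2: 140, 3: 135, 4: 132}
    if rank in top:
        return top[rank]
    return max(0, 135 - rank)


def ticket_total(ranks, scale):
    """Sum the points for every game on a ticket under the given scale."""
    return sum(scale(rank) for rank in ranks)


# Finishes copied from the table above.
tickets = {
    "Vicario": [49, 78, 50, 29, 27],
    "Werdrick": [191, 150, 16, 1, 33],
    "Genberg": [105, 13, 319, 7, 12],
}

if __name__ == "__main__":
    for name, ranks in tickets.items():
        print(name,
              ticket_total(ranks, current_points),
              ticket_total(ranks, proposed_150_points))
```

With these assumptions the current scale reproduces the table totals (207, 227, 232), while the 150 scale gives Vicario 442, Genberg 403 and Werdrick 371, so the all-top-80 ticket moves from last to first of the three.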

pinwizj · April 25, 2016, 7:23pm

I think Bowen’s comments about how PAPA Classics feels now is spot on. You play a shit game and you’re like FUCKKKKK! I’m so fucked!!!

The thing is, and maybe other people can share their experiences, PAPA A used to feel just like that to me. If you had a just flat out shit game, you were like FUCKKKKK! My entry is so fucked! Whereas if you had an average game or even slightly below average game, you would get compensated enough that 2 ‘average’ games could save a run where you could put 3 other games together.

Now I have a shit game, and I’m not even fazed by it. There’s plenty of room to survive that these days because you’re likely to have some ‘good games’ that still net you 0 points in your run.

My personal mindset has definitely changed from it being about true consistency across 5 games, to just making sure I pound out a solid 3 out of 5. Not that that’s a bad or good thing.

It’s less of this:

(About 40 seconds in is about how I felt about my PAPA run once I had a shit game back in the day)

jdelz · April 25, 2016, 7:32pm

Don’t you mean “FARRTTTTSS!!!” ?

pinwizj · April 25, 2016, 7:42pm

Definitely not back then pre-kids

timballs · April 25, 2016, 9:50pm

I don’t mind the PAPA scoring for PAPA, where you are pretty much guaranteed to have scores that give a player 0 points, but I absolutely loathe it for small events. With the rising popularity of the neverdrains software as well as Scott Danesi’s software where this is called “IFPA Scoring” for some reason, I see the 100-90-85-84 etc. scoring applied regularly to events with 20 or so players. Then you have situations where the difference between last and 3rd, and 3rd to first are about the same. I think that the scoring should be scaled in some way to match the participation in the tournament.

A couple years ago I looked at the distributions of scores in one of the PAPAs and found that typically the scores on most games wound up being in a lognormal distribution, regardless of the game. Thinking of maybe trying to look at this year’s PAPA data and apply a scoring system that gives you a point for the “percentile” your score winds up in just to add to the data here.
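A rough sketch of what percentile-based points could look like. The post does not define the percentile, so this assumes it means the percentage of the field a score strictly beats; the function name is made up for illustration.

```python
from bisect import bisect_left


def percentile_points(scores):
    """For each score, return the percentage of entries it strictly beats.

    Assumption: ties get the same value, and the best score in a field
    of n distinct scores gets 100 * (n - 1) / n rather than a full 100.
    """
    ordered = sorted(scores)
    n = len(ordered)
    # bisect_left counts how many entries are strictly lower than the score.
    return [100.0 * bisect_left(ordered, score) / n for score in scores]
```

Because this only looks at rank order, it scales with the number of entries automatically, and the shape of the score distribution (lognormal or otherwise) does not change the points.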

gdd · April 25, 2016, 11:14pm

While I realize we have a pretty lenient swearing policy here on TiltForums, I would appreciate it if people could keep it down to a swears : non-swears ratio of better than 1:3. Shop for ideas here!

pinwizj · April 25, 2016, 11:43pm

Go fart yourself Dunlap

BMU · April 25, 2016, 11:50pm

Got a point there, Josh. I know sometimes in Classics when game 1 is so-so and game 2 stinks or vice versa, I’ve opted to void the ticket right then to start another run and save time with the lines. Reducing the scale on Classics to 50 or 60 makes that less bad.

It really comes down to this: what are you trying to measure? What’s the relative value of great vs. good vs. so-so vs. stinkers? Cayle’s point that a one-point difference between games could represent 100M points or 100 points on the same machine shows another weakness, too. The trouble with trying to address point differentials directly, though, is that pinball scoring is very non-linear in practice. The scores collectively may be in a lognormal distribution, but individual position differences are very irregular. And the reward of getting one more shot vs. draining right after your first Sparky multiball means a lot less than one more shot or not when Crank It Up is lit [at least for some of us].

Right now, the point system values good-to-great games more highly relative to the cost of so-so or worse games than it used to. Which, if either, is right? While I think that if you’re going to use a ticket format rather than best-game, all games should count, I really hate voiding tickets after a clunker kills one. Higher player counts and games-played counts are slowly giving more advantage to top-heavy tickets. Hey, if it was an easy choice, we wouldn’t be debating it.

This post was created without any natural or artificial swearing ingredients.

raydaypinball · April 26, 2016, 1:29am

Holy $%^&* that ticket A is ridiculous to not qualify. Although to be fair, I don’t know what person in their right mind would choose to play games with such ridiculously bad value as TWD and Mustang, where you needed such huge scores for such little gain.

But yeah, that ticket should have qualified for sure, it’s 5 games that show skill in the game of pinball.

kdeangelo · April 26, 2016, 2:30am

Enter the land of what PAPA 19 could have been:

http://www.neverdrains.com/papa19scoring

BMU · April 26, 2016, 3:00am

Hey, Karl, no fair, you have the database to work with

kdeangelo · April 26, 2016, 3:07am

Just trying to make it easier for everyone to speculate

Fascinating to watch the Escher/CDS bubble going from 100 to 125 point scoring.

Snailman · April 26, 2016, 3:12am

Whoa! CDS goes from 25th up to nearly a bye-worthy qualifier

bkerins · April 26, 2016, 3:37am

One other option would be to stop scaling by 1 point per player sometime down the line (where?). Say, after the top 50, it became 1 point per two players that pass your score.

This would still give premium value to high scores, while allowing “bleeder” games to still potentially pick up a few points. It would extend the current top 87 another 37 places, scoring points to the top 124 scores per game.
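The half-rate idea can be sketched as below, assuming the current 100-90-85-84 … scale is kept through 50th place (38 points) and each point below that covers two places. The function name and the `switch_rank` parameter are illustrative only.

```python
def half_rate_points(rank, switch_rank=50):
    """Current scale through switch_rank, then 1 point per two places.

    Assumes 3 <= switch_rank <= 87 so the switch happens on the
    one-point-per-place part of the current scale.
    """
    top = {1: 100, 2: 90, 3: 85}
    if rank in top:
        return top[rank]
    if rank <= switch_rank:
        return 88 - rank
    base = 88 - switch_rank                 # 38 points at 50th place
    drop = (rank - switch_rank + 1) // 2    # one point lost per two places
    return max(0, base - drop)
```

With the default switch at 50th, places 51-52 score 37, places 123-124 score 1, and 125th is the first zero, which matches the "top 124 scores per game" figure.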

I still believe that a below average score should receive little or no credit, regardless of relative quality. In a best-game format, a 20 million Game of Thrones is identical to a 2 million Game of Thrones, so I feel it should be similar here. A 20 million Game of Thrones is 10 times better … and 10 times zero is zero!

BMU · April 26, 2016, 3:46am

Bowen,

Hey, I was just crunching that one now, except that I also looked at an extra option of going down by 3 players after a while, then 4, etc. That is, once the points get to 50 it goes 49-49-48-48, then later 40-40-39-39-39-38-38-38, then every four below 30, every 5 below 20, every 6 below 10.
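One possible reading of that stepped variant, sketched as a lookup table. The assumptions: the current scale runs unchanged until it reaches 50 points (38th place), then each value from 49 to 40 covers two places, 39 to 30 three places, 29 to 20 four, 19 to 10 five, and 9 to 1 six. The exact cut-over points are a guess from the examples given, and the function names are illustrative.

```python
def stepped_scale():
    """Return points by place (index 0 is 1st) under the assumed steps."""
    # Current scale down to 50 points: 100, 90, 85, then 84 ... 50.
    points = [100, 90, 85] + list(range(84, 49, -1))
    for value in range(49, 0, -1):
        if value >= 40:
            repeat = 2
        elif value >= 30:
            repeat = 3
        elif value >= 20:
            repeat = 4
        elif value >= 10:
            repeat = 5
        else:
            repeat = 6
        points.extend([value] * repeat)
    return points


def stepped_points(rank, scale=None):
    """Points for a 1-based rank; anything past the end of the scale is 0."""
    scale = stepped_scale() if scale is None else scale
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    return scale[rank - 1] if rank <= len(scale) else 0
```

Under this reading the scale pays out through 232nd place, so nearly every game on a ticket would be worth something.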

BMU · April 26, 2016, 3:49am

Q: “If you have two #1 scores, how crappy do your other three games have to be before you should not make the cut?”

pinballcorpse · April 26, 2016, 4:06am

Wow that is awesome!

When I was looking at Cryss and Escher in my earlier analysis, I was only focused on zero scores. I had no idea how much Cryss’ ticket would have changed with the scaling.

Edit:

Further, Johnny’s overall is not helped by the overall rescaling. Just goes to show ALL of the data needs to be analyzed.

A really cool “what if” program.

Amazing.

Thanks for sharing.