IFPA rating for team league salary cap?

I think this brings up an interesting question: Are there many other games/sports where competitors commonly compete in multiple events simultaneously?

Chess and SpeedCubing

Ah, thanks for the correction! I guess that makes eff% more stable. I’m still not sure how accurate a reflection of skill it is. It really depends on who else was participating in the tournaments. It’s probably better than the rating value though, which seems to be very sensitive to recent results and jumps around a lot.

It is entirely plausible for the ratings to change after a single event. But that’s not a bad thing. The change in an individual player’s rating is determined by who they beat and lose to in an event. If you beat players that are rated higher than you, your rating goes up. If you lose to players rated lower than you, your rating goes down. The magnitude of the change is determined by the difference in player ratings. If you play really poorly in a big event like Pinbrawl, your rating will change in a big way (especially if your rating deviation was large to begin with).

I’ve done several analyses with the IFPA ratings, and for the most part they are surprisingly good. Sure there are some funny cases, but for the most part, not bad.

@ryanwanger if you use matchplay for a lot of events, I can show you how to easily compute Glicko ratings from actual head-to-head data. Would probably be very useful for your purposes.
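
For anyone curious what that computation might look like, here is a minimal sketch of a single Glicko-1 rating-period update, assuming the standard formulas from Glickman's published description of the system. Whether IFPA's own implementation matches this in every detail is an assumption, and the function names and the `(opp_rating, opp_rd, score)` input shape are made up for illustration; you would still need to turn your exported match results into that shape yourself.

```python
import math

Q = math.log(10) / 400  # Glicko-1 scaling constant


def g(rd):
    """Discount factor: results against uncertain opponents count for less."""
    return 1 / math.sqrt(1 + 3 * Q**2 * rd**2 / math.pi**2)


def expected(r, r_opp, rd_opp):
    """Expected score (0..1) against a single opponent."""
    return 1 / (1 + 10 ** (-g(rd_opp) * (r - r_opp) / 400))


def glicko_update(r, rd, results):
    """Update one player's rating and deviation after a rating period.

    results: list of (opp_rating, opp_rd, score), score being 1, 0.5 or 0.
    Hypothetical input format, not something any site exports directly.
    """
    if not results:
        return r, rd
    info = 0.0      # sum of g^2 * E * (1 - E); 1/d^2 equals Q^2 * info
    surprise = 0.0  # sum of g * (actual score - expected score)
    for r_opp, rd_opp, score in results:
        e = expected(r, r_opp, rd_opp)
        info += g(rd_opp) ** 2 * e * (1 - e)
        surprise += g(rd_opp) * (score - e)
    denom = 1 / rd**2 + Q**2 * info
    return r + (Q / denom) * surprise, math.sqrt(1 / denom)


# A 1500 player (RD 200) beats a 1400 and loses to a 1550 and a 1700.
new_r, new_rd = glicko_update(
    1500, 200, [(1400, 30, 1), (1550, 100, 0), (1700, 300, 0)]
)
print(round(new_r), round(new_rd, 1))
```

With those inputs the rating should drop to roughly 1464 and the deviation shrink to roughly 151. The starting deviation sits in the denominator, which is why a player with a large rating deviation moves further after one bad event.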


I agree that the ratings are typically pretty good. One needs only to check the ratings of the people at the top of the ranking chart to see that rating correlates very well to ranking. I also don’t have a problem with the volatility of the rating. The truth is my true rating is probably somewhere in the middle. Before Pinbrawl I was almost 1600, but I had several good results in a row. After Pinbrawl I was at 1400, but I had clearly played poorly there.

The problem with using rating for a league like this is not that rating is a poor indicator of skill, but rather that it can be a poor indicator of skill variance. The difference between an 1800 player and a 1200 player is pretty clearly seen from tournament results. The difference between a 1400 and 1200? Or 1400 and 1600? A lot harder to see when ratings are so volatile. And especially harder to see when the same player might play with a true skill of 1400 one night but 1200 or 1600 the next.

I would argue that as players get better, much of their skill increase comes from becoming more consistent in their play. Consistency leads to less volatility in a player’s rating, which means that of course the best players are well represented by their ratings. But for less skilled players, more of their play comes down to randomness, and thus their ratings are much more volatile.

As you’ve been running 2 seasons already, why not use the data you’ve captured in the league?
There should be enough matches played to be able to seed everyone with the same accuracy as using Ranking, Rating or Eff %.

You say there are 6 teams of 10, so 60 players.
Rank/seed everyone from 1 to 60. The seeds add up to 1830, so a “perfect” spread would be for each team to total 305 seeding pts. Set a range that teams must fall within.

Or better still, batch rank them (1-6 are seeded in the top group, 7-12 in the 2nd group, etc.) and then each team has to pick 1 from each group to make up their team.
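
Both ideas are easy to check with a few lines of code. This is only a rough sketch: the function names are invented for illustration, and the `tolerance` around the target total is a placeholder the league would have to choose.

```python
def perfect_team_total(num_players, num_teams):
    """Seed total each team would have if seeds 1..N were split perfectly evenly."""
    return num_players * (num_players + 1) / 2 / num_teams


def within_range(team_seeds, target, tolerance):
    """True if a team's seed total is within +/- tolerance of the target."""
    return abs(sum(team_seeds) - target) <= tolerance


def batch_groups(seeded_players, num_teams):
    """Split a best-to-worst list into groups of num_teams players each."""
    return [
        seeded_players[i:i + num_teams]
        for i in range(0, len(seeded_players), num_teams)
    ]


target = perfect_team_total(60, 6)          # 305.0
groups = batch_groups(list(range(1, 61)), 6)
print(target, len(groups), groups[1])       # groups[1] holds seeds 7-12
```

With 60 players and 6 teams that gives 10 groups of 6, so a team built by taking one player from each group always ends up with exactly 10 players.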

I’ve actually been a captain in Ultimate Frisbee in a league (in Boston) that did a draft like this. Can you explain in more detail about the baggage thing?

This was 12 years ago, so my memory is fuzzy…but I think it’s possible that in a situation where for example a 13 baggaged with a 17 - I would pick the 17 in an early round and then when we got to a round where 13s were being picked, I wouldn’t get a pick that round (because I’d already have the 13 that came as baggage). Does that seem right? Or am I making stuff up?

My understanding is that when choosing either of a pair of baggaged players, you get both players at that point and skip your next pick. Also, the pick order every round is from the team with the fewest total points at that point to the one with the most points. So, picking a high-high pair may mean you get a later draw even after the skip, while a high-low pair might yield you first pick after the skipped round. There’s a limit on the number of players that can baggage together (2 or rarely 3), and they must be mutual (i.e. no chaining).

Boston might have a different system, but the way Pittsburgh traditionally has done it, you lose your next round’s pick regardless of rankings. So if you pick up a 17 and a 13, you don’t get a pick next round even if the players are still 17s. This means that sometimes baggaged pairs are picked in later rounds than their top skill would indicate.

The team totals for skill are figured at the end of each round and that affects the order the captains pick players in the next round (assuming they have a pick).
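
To make the pick-order rule concrete, here is a small sketch of how the next round's order could be worked out. It assumes only what is described above: teams pick from fewest total points to most, and a captain who took a baggaged pair sits out the next round. The function name is made up, and breaking ties alphabetically is an assumption (the real leagues may do it differently).

```python
def pick_order(team_points, skipping=()):
    """Order teams for the next round: fewest total points picks first.

    team_points: dict of team name -> total skill points drafted so far.
    skipping: teams that took a baggaged pair last round and lose this pick.
    Ties are broken alphabetically (an assumption, not a known league rule).
    """
    eligible = [team for team in team_points if team not in skipping]
    return sorted(eligible, key=lambda team: (team_points[team], team))


# Team A took a baggaged 17 + 13 in round 1, so it sits out round 2.
totals = {"A": 30, "B": 18, "C": 16}
print(pick_order(totals, skipping={"A"}))
```

Here C picks first and B second. Once A is eligible again, its 30 points would likely put it near the back of the order, which is the "later draw even after the skip" effect mentioned above.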

For ultimate rec league teams, it’s important to have the same number of people (and consistent numbers of women and men) on each team as well as to foster skill parity. I’m not sure number of players would matter in a pinball league, although each team should probably have a minimum.

For the Pittsburgh Ultimate draft system, we initially used self-rankings. Later we added oversight to make sure rankings were reasonable, and when the community grew enough that we didn’t all know each other, we started recruiting people to help revise rankings before big drafts.

I haven’t captained for a few years, but this is about how it worked when I did.

Thanks @ErinK, that does sound similar. I still think an A and a B division would be helpful…even if the team as a whole is equal, having a wide disparity of skills on a pinball team could mean that many matches are between two very unequal players. In past seasons, captains have often sacrificed weaker players against the strongest opponents. If their #1 is better than your #1, then don’t waste your best player on what will likely be a loss anyway.

True, but the glory when you’re the weaker player(s) and pull out a win in those circumstances…

Do players ever lose levels in this sort of system, over time? I wonder if that is upsetting for a player. If pinball had something similar I’d guess players would gain and lose levels frequently.

My feeling on team leagues, as a Seattle Monday Night Pinball player, is that the point of the team league is to have a good time with your friends. Our team league has grown massive, and a big part of that is people bringing friends who don’t normally play in tournaments to join their team and hang out. The idea of a draft for the league, or any tight restriction on who can be on what team, is a huge turn-off from any team league for me. This is why Monday Night Pinball has its restriction set up, and why we have the “grandfather” rule allowing teams to stay together as players improve.

Does this lead to some teams being better than others? Yes, of course, but you are never going to have a perfectly even distribution of skill across teams, and I find that home vs. away matches do an all right job of balancing things out. Even the best teams in our league still lose, and most often they lose away games. No matter how you restrict team selection, you will still end up with some players having to go up against better players and getting crushed. You will still end up with teams who aren’t as strong losing more than they win. That is pinball and competition. Might as well let people have fun with their friends.

In almost every match I see in Seattle’s league, pretty much everyone is having fun, win or lose. Also, the fact that some teams of friends who aren’t as skilled come in and get crushed really only seems to foster an environment where those people want to go out and play more to get better. Every single team in Monday Night Pinball has risen to the challenge and grown in playing skill massively over the last few years. I think you might find that imposing a draft with all these restrictions would drive people away from the league if they can’t play with their closer friends, and it would leave people less inspired to get better, because all the teams are already even and, if you do get better, you might be booted to another team.
I for one would quit MNP if my team was not allowed to stay together just because we are good.


I’m not excited about a draft anymore. But I will probably still do a salary cap.

We currently use your exact system, with the same restrictions.

This is not true in our league. Last season everyone played every other team both home and away, and every series was a 2-0 sweep.

So, it sounds like you’re on one of the best teams? I’m getting that in my league…players from the top two teams say: “it’s fine, don’t change anything”…meanwhile the other four teams don’t seem nearly as excited. We dropped from 7 teams to 6, and are in danger of losing 2 more for the upcoming season.

I want people to be able to play with friends, but, just as you described, I want something that is welcoming to new players. We’re not getting new teams or new players…just thinning down to the existing group of dedicated players (who already play in all of the other local events).

I’m hoping that a B Division, and smaller teams will help with that.


Lose levels over time in ultimate? Not usually (and not automatically). They review the rankings every season/league but it’s a pretty small scale (I think there are maybe four or five qualities and a 5-point scale, so your skills and experience might still be the same but your athleticism would go up and down.)

Oh, and the player rankings we use aren’t public. They’re only used for the draft, which is private. So you don’t get to obsessively compare yourself against the other players if you’re not a captain or league administrator. And it’s not public knowledge who gets picked in which round.


In addition to the previously mentioned arguments against a draft based system (not a chance in Hades I’d be willing to drive to Renton or Edmonds every other week), I think it would kill off a lot of the fun traditions that form when teams persist across multiple seasons.

That said, along with the traditions does come a subtle but definite cliquishness. I’d be interested in seeing a mechanism that kept team legacy while still encouraging teams to rotate through players each season, and encouraged players to have more interaction with other teams.

Just brainstorming, but maybe something to the effect of requiring a player to sit out, transfer, or sub one out of every n seasons could be interesting. Combined with a salary cap, the churn should keep teams fairly balanced while not killing off the chance to build a team legacy with your friends.


Our ultimate group also runs some leagues where you specify a core group and then draft about half the team. That wouldn’t work super well for a 5-person team.

There have been teams that stayed together for multiple seasons during the draft days, but it required a lot of finagling. I didn’t mention that trades are also allowed, as long as they happen immediately after the draft. So I remember one season where another team specifically drafted a player to mess with the team who wanted her, then her team tried to pick good players to trade for her. (I think they worked it out.)

I’d love to run some things past whoever runs the NYC team league. Anyone know how to get in touch?

Kris Medina runs the league. There’s a contact form at the bottom of the Pinball NYC site that I’d try. http://www.pinballnyc.com/contact/

Bumping this as I’m going to start a new season next month. (This league hasn’t been run since my original post).

Refreshing my stats, I think Rating currently fits this group of players more closely than Eff %.

Now, the goal is to figure out the math on how to use Rating for a salary cap. If anyone has thoughts on how one might translate Rating into a different scale that quantifies the difference in ability in a less abstract way, please do chime in.

(For example I do think Eff % was close in this regard…someone with a 20% would win about twice as often when playing someone with a 10%).
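
One possible translation, offered as a sketch rather than a recommendation: on an Elo/Glicko-style scale, a rating gap maps to head-to-head odds of 10^(gap/400), so you can turn each rating into a "strength" number whose ratios are those odds. This assumes IFPA ratings behave like that 400-point logistic scale, it ignores rating deviation entirely, and the 1500 baseline and function names are just illustrative choices.

```python
def strength(rating, baseline=1500):
    """Convert a rating to a ratio scale: twice the strength = 2:1 odds."""
    return 10 ** ((rating - baseline) / 400)


def win_probability(rating_a, rating_b):
    """Chance that A beats B head to head under the logistic model."""
    sa, sb = strength(rating_a), strength(rating_b)
    return sa / (sa + sb)


def team_salary(ratings):
    """Salary-cap total for a team: the sum of its players' strengths."""
    return sum(strength(r) for r in ratings)


print(round(strength(1620), 2))               # about 2x a 1500 player
print(round(win_probability(1620, 1500), 2))  # about 2 wins in 3
print(round(team_salary([1700, 1500, 1400, 1300]), 2))
```

Under that model a gap of about 120 rating points works out to roughly 2:1 odds, which is close to the "20% vs 10%" intuition for Eff %. The cap itself would then be a limit on `team_salary`, with the actual number left for the league to pick after looking at real rosters.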