Many chess players wonder how different rating systems map to each other. A common objection is that the ratings can't be mapped because they come from different pools of players and different time controls. You can also argue that playing OTB is much different than playing online. These are valid points, but we can still look at how the ratings compare to give chess players a guide as to where they stand in different rating pools. This post explains how we create our Rating Comparisons.
Gather Data Sources
First, we download all of the ratings for players in our database. This includes USCF, FIDE, Chess.com, and Lichess ratings. Jesse pulls API data from Lichess and Chess.com, and Matt downloads the latest rating supplements for USCF and FIDE.
We will use the Chess.com blitz vs. USCF ratings as an example for the remaining steps.
Subset The Data
Players In Both Rating Pools
Next, we find all players from our sources with both a Chess.com username and a USCF ID. This is our widest net: all possible players eligible for the comparison.
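The matching step can be sketched as an inner join between the two rating sources. This is a minimal illustration with made-up data and hypothetical column names (`player_id`, `chesscom_blitz`, `uscf_rating`), not our actual database schema:

```python
import pandas as pd

# Hypothetical sample data; in practice these come from the full database.
chesscom = pd.DataFrame({
    "player_id": [1, 2, 3, 4],
    "chesscom_blitz": [1550, 1200, 1810, 1400],
})
uscf = pd.DataFrame({
    "player_id": [2, 3, 5],
    "uscf_rating": [1150, 1700, 1300],
})

# An inner join keeps only players present in both rating pools.
both = chesscom.merge(uscf, on="player_id", how="inner")
print(both)
```

Players 2 and 3 survive the join; everyone with only one of the two ratings drops out.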
Recent Non-Provisional Players
We don’t want to include players who have played only a handful of games, and we’d also like to exclude anyone who last played long ago. For online ratings, we handle both by keeping only players with RD < 150. See the Chess Ratings post for more information on RD values and how ratings work.
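The RD cutoff itself is a one-line filter. Again, the data and column names here (`rd`, `blitz_rating`) are illustrative:

```python
import pandas as pd

# Hypothetical data; rd is the rating deviation reported by the online site.
players = pd.DataFrame({
    "username": ["a", "b", "c", "d"],
    "blitz_rating": [1500, 1620, 900, 2100],
    "rd": [60, 210, 45, 300],
})

# Keep only players whose rating deviation indicates recent, established play.
active = players[players["rd"] < 150].reset_index(drop=True)
print(active)
```

A high RD means the rating is uncertain, either because the player is new or hasn't played recently, so this single threshold screens out both groups.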
There will be some errors and abnormalities in the data that we need to check for. After hand-verifying some of the egregious values, we are left with a pretty clean set of players to analyze.
Rank The Data
Now that we have a pretty clean set of players in both rating pools, we rank each rating column individually to help remove the noise. Here’s an example of the input data and the ranked data. Notice how the 1100 and 1150 USCF values swap places so that both columns run low to high.
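A minimal sketch of the ranking step, assuming made-up ratings: each column is sorted independently, so the pairing is by rank rather than by player, which is what smooths out the noise:

```python
import pandas as pd

# Hypothetical paired ratings; note the 1150/1100 "swap" in the USCF column.
df = pd.DataFrame({
    "chesscom_blitz": [1000, 1050, 1200],
    "uscf": [1150, 1100, 1250],
})

# Sort each column independently so both run low to high.
ranked = pd.DataFrame({
    "chesscom_blitz": sorted(df["chesscom_blitz"]),
    "uscf": sorted(df["uscf"]),
})
print(ranked)
```

After ranking, the 1100 and 1150 USCF values have swapped places, and the nth-lowest Chess.com rating is paired with the nth-lowest USCF rating.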
This gives us a very smooth line to map the two rating systems. We also estimate each player’s rating based on the ranked data, which will later be used for the +/- values.
For the comparison table, we create Chess.com blitz values every 50-100 points and look up the corresponding USCF rating in the ranked data.
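One way to sketch that lookup, assuming the ranked columns from the previous step and interpolating between them (the exact lookup method is an assumption on our part, and the numbers are made up):

```python
import numpy as np

# Hypothetical ranked (sorted) ratings from the ranking step.
ranked_chesscom = np.array([1000, 1100, 1200, 1300, 1400])
ranked_uscf = np.array([950, 1080, 1190, 1310, 1405])

# Build the comparison table at fixed Chess.com values by
# interpolating into the ranked data.
table_points = [1050, 1150, 1250]
uscf_lookup = np.interp(table_points, ranked_chesscom, ranked_uscf)
print(dict(zip(table_points, uscf_lookup)))
```

Interpolation lets the table show round Chess.com values even when no player has exactly that rating.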
Residuals +/- Values
The final step is to determine how certain we are in these predictions. We take the Chess.com rating for each player and align it with the corresponding USCF rating in the ranked data.
Next, we take the difference between the predicted USCF rating and the actual USCF rating. Here’s an example of the distribution of predicted minus actual values. This histogram happens to be for blitz and bullet, but we can see that most values are centered at the predicted value, with roughly equal numbers of players falling into the bins on either side, decreasing as we move away from the center.
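The residual calculation is just a vectorized subtraction. The ratings below are made up for illustration:

```python
import numpy as np

# Hypothetical predicted USCF ratings (from the ranked mapping) vs. actual.
predicted = np.array([1540, 1300, 1700, 1450])
actual = np.array([1500, 1380, 1650, 1460])

# Each residual is predicted minus actual; positive means we over-predicted.
residuals = predicted - actual
print(residuals)
```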
We can see how each comparison distribution looks by taking the 1st and 3rd quartiles of the predicted-minus-actual values (residuals). We add a +/- value in the tables based on this spread. Here’s an example:
If your Chess.com blitz rating is 1550 and you’re wondering what an equivalent USCF rating would be, the best guess is 1540. Among the players in our database who have both ratings, we’d expect 50% to fall between 1540-260 (1280) and 1540+260 (1800). That’s a very big range, so keep it in mind when comparing these ratings. Across all the rating comparisons, 50% of players fall within 75 to 130 points of the predicted value.
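One way to turn quartiles into a single +/- number is to take half the interquartile range, so the middle 50% of players fall within +/- of the prediction. This is our reading of the method described above, sketched on made-up residuals:

```python
import numpy as np

# Hypothetical residuals (predicted minus actual USCF ratings).
residuals = np.array([-300, -120, -50, 0, 40, 90, 150, 310])

q1, q3 = np.percentile(residuals, [25, 75])
# Half the interquartile range gives a single +/- value covering
# roughly the middle 50% of players.
plus_minus = (q3 - q1) / 2
print(q1, q3, plus_minus)
```

For the example in the text, a +/- of 260 means half the players with a 1540 prediction land between 1280 and 1800 USCF.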