— This upcoming video may not be available to view yet.

In my previous post, I described an extension to Elo that could handle multiple players. The next restriction to overcome is that Elo assumed players of equal skill have an equal chance of winning.

In most games, players don’t actually have an equal chance of winning a single game. Chess overcomes this by having a match consist of several individual games with players switching who goes first. This is a good solution for two player games but gets awkward for multiplayer games. It is also inconvenient if having a multi-game match is undesirable for any reason.

Adjusted rating

Elo is fundamentally designed to handle players of different skill levels and produce a probability of them winning. The approach is therefore to determine an adjusted rating for someone such that the probability of winning is as desired.

The formula to work out an Elo expected score is:

$E=\frac{1}{\mathrm{1+{10}^{\frac{R}{400}}}}$

In our case we have an expected score (the win probability) and want to work out a rating difference, so we can just rearrange to get:

We then subtract this adjustment from the player’s actual rating to get their effective rating to use in the rest of the Elo calculations.

The reason we subtract it is that for players A and B, the rating difference is calculated as R_{B} - R_{A} which is the opposite way round to what we want.

Here are the rating adjustments for some sample win probabilities:

Win probability

Rating adjustment

0.01

-798

0.10

-382

0.25

-191

0.49

-7

0.50

0

0.51

+7

0.75

+191

0.90

+382

0.99

+798

If you try to calculate the adjustment for a win probability of 0 or 1, you get -∞ and +∞ respectively.

This means, for example, if you beat someone of an equal rating at a game you only have a 10% chance of winning, it’s the same as beating someone rated 382 points higher at a fair game.

Putting it all together

After explaining the theory of how to expand an Elo rating to support all the variations in Tic-tac-toe Collection, it’s time to put it into practice.

While writing this I’ve also been working on an implementation.

As well as adding it to Tic-tac-toe Collection, I’m planning on creating some kind of site and/or app just to work out Elo ratings for various results (and hopefully even to maintain ratings for players in any kind of competition you might want to run).

— This upcoming video may not be available to view yet.

One of the features I have planned for Tic-tac-toe Collection is a player rating system.

One such common rating system is the Elo rating system, most famously used in chess. There is a lot of discussion about how good it is, but for my purposes it seems fine - at least as a starting point.

There are two fundamental restrictions on Elo that I would have to overcome though:

It is limited to two players.

It assumes each player has an equal chance of winning if they have equal skill.

Elo

First an introduction into how Elo rating works. There are two parts:

Given two player ratings, calculate an expected result.

The expected result is essentially the chance of winning, given as a decimal between 0 and 1, in which 0 represents a guaranteed loss and 1 a guaranteed win.

Given a player rating, an expected result, and an actual result, calculate how much the rating should change.

If you do better than expected then your score will go up. If you do worse then your score will go down. With a bigger difference there will be a bigger change.

In SME, players are ordered by performance, and then a simulated match is played between pairs of consecutive players, with the players’ ratings updated using the normal Elo method for two players.

This is certainly acceptable, and Tom includes some tests to show it works well. One oddity is if you get a victory over someone with a much higher rating than you, but also beat them by more than one place, you essentially don’t get the benefit for it. For example, consider the following result:

Position

Player

Initial rating

vs player above

vs player below

Total

1st

A

1000

+24

+24

2nd

B

1200

−24

+27

+3

3rd

C

1500

−27

−27

Here, Player A with a rating of 1000 has beaten Player C with a rating of 1500, but essentially hasn’t gotten the credit for it.

My unnamed Elo

My solution is to conceptually run the algorithm on every pair of players (in this case there would be A v B, B v C and A v C). There is a straightforward optimization so that the estimating part of the operation does not directly depend on the other players, just the difference between a single player’s expected and actual scores. So the algorithm is actually:

Calculate the expected scores for each pair of players.

Sum the expected score for each player.

Divide each score by the number of players to normalize back to the range 0 to 1.

Calculate the rating change for each player using their actual score and the combined expected score.

The details

With the same data as above, the results are as follows:

Player

Initial rating

Expected score

A

1000

0.097

B

1200

0.304

C

1500

0.599

Which is made up of the following score pairs:

Pair

A

B

C

A v B

0.24

0.26

B v C

0.15

0.85

A v C

0.05

0.95

This results in rating changes of:

Position

Player

Initial rating

Change

1st

A

1000

+29

2nd

B

1200

−9

3rd

C

1500

−19

Player A has gained more which is good. That was basically the goal of the change.

Player B has now changed a small gain into a moderate loss. That’s a little odd and probably not desired, after all the victory against C should be more significant than the loss against A

Player C has changed a big loss into a slightly smaller (but still big) loss. After some thought, that is probably reasonable. Although C is expected to win just having more players should generally reduce your expectation of winning and therefore how much you are penalized for failing.

Improvements

So why did these results happen? It might be better to look at a table also including expected and actual results, which reveals a choice that has to be made that I have not yet mentioned:

Position

Player

Initial rating

Expected score

Actual score

Change

1st

A

1000

0.097

1

+29

2nd

B

1200

0.304

0

−9

3rd

C

1500

0.599

0

−19

Notice the actual scores used. Player A has a score of 1 indicating a win. Players B and C have a score of 0, which indicates a loss, but an equal loss.

If we instead use different values for the actual score we get a more sensible result

Position

Player

Initial rating

Actual score

Change

1st

A

1000

0.67

+18

2nd

B

1200

0.33

+1

3rd

C

1500

0

−19

In this case, the “actual score” for a player placed Nth out of C is:

$\frac{2N}{C\times \left(C-1\right)}$Explain this formula

This formula is conceptually simple, but a bit opaque when simplified.

Think of it as each player getting a number of shares of the total score. The person in last gets 0 shares, the person second from last gets 1 share, then next player gets 2 shares, and so on. The person in first place gets C−1 shares.

That means each player gets N shares and the total number of shares is equal to the C−1th Triangular number.

This simplifies to:
$T\left(C-1\right)=\frac{C\times \left(C-1\right)}{2}$

Therefore each player gets:

$N/\frac{C\times \left(C-1\right)}{2}$

which simplifies to:

$\frac{2N}{C\times \left(C-1\right)}$

Handling ties

To be as general as possible (and because it actually happens in some of my multiplayer Tic-tac-toe variants) we need to handle arbitrary ties.

The simplest way is to evenly distribute the score to all the players that are tied. So if the top two players in a three-player game tie, their scores of 0.67 and 0.33 are summed and split between them (0.5 each).

As a more complex example, consider:

Position

“Raw” score

Actual score

1st

0.286

0.262

1st

0.238

0.262

3rd

0.190

0.143

3rd

0.143

0.143

3rd

0.095

0.143

6th

0.048

0.048

7th

0

0

Final thoughts

I’ve addressed the first restriction of Elo, only supporting two players. As for unfair games, that will have to wait as this post is long enough as it is.

— This upcoming video may not be available to view yet.

This was a question that came up recently, and the answer is interesting enough I decided to step through the process of working it out.

A good year

There are seven days in the week. If the year had a number of days in it that was divisible by seven (say 364), then the answer would be easy: every year and every date would fall on the same day every year.

This would have some nice properties. Some of the more complicated holidays that need to happen on certain days would get simpler, and the poem about what kind of person you are based on the day of your birth would be more poignant since your birthday would be the same each year.

A normal year

But alas, the number of days in the year is not divisible by seven, and we have to suffer with a different calendar each year. But there is good news, there are only seven possible calendars - one starting on each day of the week.

With 365 days in the year, we would have the following possible calendars:

Monday 1st January, Tuesday 2nd January … Monday 31st December

Tuesday 1st January, Wednesday 2nd January … Tuesday 31st December

Wednesday 1st January, Thursday 2nd January … Wednesday 31st December

Thursday 1st January, Friday 2nd January … Thursday 31st December

Friday 1st January, Saturday 2nd January … Friday 31st December

Saturday 1st January, Sunday 2nd January … Saturday 31st December

Sunday 1st January, Monday 2nd January … Sunday 31st December

After we have cycled through those calendars, the pattern starts again. Hooray.

Leap years

Of course it is not that simple. Every four years we have a year with an extra day in it, throwing everything off. And to fix it we will need Maths.

We have a cycle of seven starting days, overlaid with a four year cycle of leap years. To determine when the large cycle repeats we need to calculate the least common multiple of four and seven. Since four and seven are coprime (they have no common divisors), their least common multiple is their product.

4 × 7 = 28

So you will be able to reuse your calendar every 28 years.

Knowing when the whole cycle repeats give you an answer to when you can reuse your calendar, but not necessarily the best answer. Applying a bit of logical reasoning, you can work out that there are only 14 possible calendars: one for each day of the week in a leap year, and one for each day of the week in a common year.

A non-leap year is also called a common year.

Sadly, at this point clever maths is probably less illustrative than a brute force approach. Since we know the cycle is only 28 items long, the force is not that brutish…

I shall name the possible calendars with a number indicating the day of the week they start on (1 for Monday), and a letter L or C, indicating a leap year or common year respectively.

Lets start with a common year:

C1

As you may have noticed from earlier, the year after a common year begins with the next day.

C1 C2

We can continue the pattern up to the first leap year, since leapyness does not affect how a year starts.

C1 C2 C3 L4

But what next? A leap year has an extra day, so instead of going up by one, we go up by two.

C1 C2 C3 L4 C6

We are now ready to complete the pattern (remembering after 7, we go back to 1)

The cycle is 28 years long, and there are 14 possible calendars so you might naïvely think each one appears twice, but that isn’t the case. The leap years appear once each, and the common years appears three times each.

This means for leap years, our guess at waiting 28 years to reuse a calendar is correct. For common years, you can reuse your calendar, on average every 9⅓ years, but in a seemingly irregular pattern.

Making it useful

For any given year, you only need to know when you are relative to a leap year to work out when a calendar can be reused:

Year

Years until next reuse

Leap year

28

Year after leap year

6

Two years after leap year

11

Year before leap year

11

So if it is two years after a leap year, for example 2018, you can reuse your calendar 11 years later, in 2029.

If you want to calculate the next reuse after that, then the method is straightforward, but hard to explain clearly. First, I’ll provide a variation of the above table with just common years.

Year

Years until next reuse

Year after leap year

6

Two years after leap year

11

Year before leap year

11

Year after leap year

6

Two years after leap year

11

Year before leap year

11

Year after leap year

6

Two years after leap year

11

Year before leap year

11

Year after leap year

6

Two years after leap year

11

Year before leap year

11

To work out the reuse after the next reuse, step backwards through the table (wrapping round when you get to the beginning).

So, if it is 2019 (a year before a leap year), the next reuse is 11 years later in 2030. This is two years after a leap year, so the next reuse is found by reading the row above - 11 years later again in 2041. Now we are a year after a leap - repeating the process and reading the row above we find the next reuse is 6 years later in 2047.

This can be summarised in the following decision tree:

Taking it further

What we have so far will work for the vast majority of people, at least for working out calendars within your own lifetime. But if you want to work out years close to 2100, it will fail.

The reason is that the idea that a leap year is every four years is not the whole story. If the year is also divisible by 100, then it is not a leap year. So 2100 is not a leap year. And nether was 1900.

But… what about 2000? If you check a 2000 calendar you’ll find February 29th. Turns out there is another rule to leap years - if the year is also divisible by 400, it is a leap year. So 2000 was, and 2400 will be.

The reason these are added are to make up for the fact that the length of the year is not nicely divisible by the length of the day. There are currently no further rules but the day will drift again at some point.

There is a proposal to make every year divisible by 4000 into a common year but based on historical precedent we will probably completely change the calendar before that happens.

At first glance that table looks unhelpful - the numbers in the last column are all over the place. But, with a bit of careful thinking it’s only a small extension to the previous method.

The final algorithm

Any more issues

I have assumed two calendars are the same if they start on the same day and have the same number of days. But there are other differences, like when certain holidays occur.

Some of the holidays will change in the same way and so won’t matter. For example in the UK, Christmas Day is a bank holiday. If it happens to fall on a weekend, then the next weekday is a bank holiday in lieu.

Many holidays however do change, and it is mostly to do with the moon. For instance the definition of when Easter occurs in western Christianity is:

the first Sunday after the first full moon that falls on or after the vernal equinox

To make things a bit weirder, it’s not really the vernal equinox, but March 21 (the vernal equinox is often on March 21, but in reality happens some time between March 19 and March 22 and can vary with timezone). It’s also not the full moon in the astronomical sense, but the Paschal full moon, which is an approximation of the astronomical full moon based on the 19-year Metonic cycle. Full details here.