- **Chapter 1: A Friendly Rivalry**
collapsed:: true
- In a quaint corner of Boston, two friends, Sid and Ercan, engage in their weekly ritual: a game of chess. Sid, with his seasoned experience and a rating of 1300, often imparts wisdom from his years of playing. Ercan, the young and vibrant 21-year-old with a budding rating around 900, brings fresh strategies to the board. Their games, intense yet filled with camaraderie, are a highlight of their week.
- **Chapter 2: The Spark of an Idea**
collapsed:: true
- One crisp Boston evening, as the game reaches a thrilling crescendo, Ercan, with his characteristic enthusiasm, expresses, "Sid, every game I play with you feels like a new lesson. But I wonder, how do I measure my progress? How do I know if I'm truly improving or if I'm just having a lucky day?"
Sid ponders this, the weight of Ercan's words sinking in. "You're right," Sid muses. "We need a system, something that not only tracks wins and losses but truly captures a player's skill and progress."
The idea takes root. The duo decides that their Friday games will be more than just friendly matches; they will be the crucible in which a new, revolutionary player rating system is forged.
-
- **Chapter 3: The Win Ratio Experiment**
collapsed:: true
- The first approach Sid and Ercan consider is straightforward: the Win Ratio. Simply put, it's the number of games a player has won divided by the total games they've played.
\[ \text{Win Ratio} = \frac{\text{Number of Wins}}{\text{Total Games Played}} \]
This method seems intuitive. After a few weeks of tracking, however, both realize it doesn't tell the full story. Beating a novice or a grandmaster counts the same, a flaw that can't be ignored.
- **Python Example: Simple Win Ratio**
- ```python
def win_ratio(wins, total_games):
    """
    Calculate the win ratio for a player.

    Args:
        wins (int): Number of games the player has won
        total_games (int): Total number of games the player has played

    Returns:
        float: Win ratio
    """
    return wins / total_games if total_games != 0 else 0

# Example: After 10 games, Ercan has won 3 against Sid
ercan_wins = 3
total_games_played = 10
ercan_win_ratio = win_ratio(ercan_wins, total_games_played)
ercan_win_ratio
```
- After 10 intense games, Ercan emerges with a win ratio of 0.3, or 30%. While this provides a snapshot of his performance, it's too simplistic. It doesn't account for the skill disparity between the players or the nature of each match.
- **Chapter 4: Weighted Wins - The Next Evolution**
collapsed:: true
- The duo realizes that not all wins are equal. To address this, they decide to weight each win based on the opponent's strength. If Ercan defeats a player of Sid's caliber, it should count for more than beating a complete novice.
But how to determine this weight? They need a metric to gauge the relative strengths of players. This leads them to the idea of a rating system. The difference in ratings between two players could potentially serve as this weight.
- **Python Example: Weighted Wins**
collapsed:: true
- ```python
def weighted_win_ratio(player_rating, opponent_rating, wins, total_games):
    """
    Calculate the weighted win ratio based on the opponent's strength.

    Args:
        player_rating (int): Rating of the player
        opponent_rating (int): Average rating of the opponents faced
        wins (int): Number of games the player has won
        total_games (int): Total number of games the player has played

    Returns:
        float: Weighted win ratio
    """
    weight = opponent_rating / player_rating
    return (wins * weight) / total_games if total_games != 0 else 0

# Example: After 10 games against Sid (rating 1300), Ercan (rating taken
# as a round 1000 here) has won 3
ercan_rating = 1000
sid_rating = 1300
ercan_wins = 3
total_games_played = 10
ercan_weighted_win_ratio = weighted_win_ratio(ercan_rating, sid_rating, ercan_wins, total_games_played)
ercan_weighted_win_ratio
```
- RESULT
- ```
0.39
```
- With the weighted win ratio in place, Ercan's performance against Sid, who has a higher rating, is reflected more accurately. His win ratio, when considering the strength of his opponent, rises to 0.39 or 39%. This provides a clearer picture, as defeating a stronger opponent is rightly given more significance.
- Side Note: **Understanding the Weighted Win Ratio:**
collapsed:: true
- When we talk about a "weighted" win ratio, we're trying to give more significance to wins against stronger opponents. In the world of chess, beating a much higher-rated opponent is a more significant achievement than beating someone of a lower or equal rating.
- To make this concept concrete, imagine two scenarios:
- 1. Ercan plays 10 games against beginners, winning all of them. His traditional win ratio would be 100%. However, given the skill difference, these wins might not be particularly indicative of Ercan's growth as a player.
2. Ercan plays 10 games against Sid, who's more experienced. If Ercan manages to win even 3 of those games, it's a significant accomplishment.
- In our formula:
- \[ \text{Weighted Win Ratio} = \left( \frac{\text{opponent's rating}}{\text{player's rating}} \right) \times \text{Win Ratio} \]
- The term \(\frac{\text{opponent's rating}}{\text{player's rating}}\) acts as a multiplier. When Ercan plays against Sid, this multiplier is greater than 1 (since 1300/1000 = 1.3). This means his wins are "amplified" when calculating the weighted win ratio, leading to a value higher than the traditional win ratio.
- In this scenario, the 39% suggests that when the strength of the opponent (Sid, in this case) is considered, Ercan's performance is equivalent to winning 3.9 out of 10 games against an average player of his own rating.
- **Connecting to Ratings:**
While the weighted win ratio gives a more nuanced picture of performance against varied opponents, it doesn't directly adjust a player's rating. For that, we need a system that:
1. Predicts the outcome based on current ratings.
2. Updates the ratings post-match based on the actual outcome vs. the predicted outcome.
This is where the concept of the rating differential and the logistic function will come into play, allowing us to predict match outcomes and adjust ratings accordingly. The weighted win ratio was a stepping stone to this idea, emphasizing the importance of considering the relative strengths of opponents.
-
- **Chapter 5: The Rating Differential - The Heartbeat of Prediction**
collapsed:: true
- In the quiet aftermath of another friendly chess duel, Ercan, leaning back, asks, "Sid, if our ratings differ, can we predict the outcome of our matches?"
Sid, always up for a challenge, ponders deeply. "Ratings should, in essence, reflect our skills. The bigger the gap, the more predictable the match outcome. But how do we mathematically capture this intuition?"
- **Step 1: A Linear Approach**
Sid sketches a straight line on a graph. "Imagine if we plotted the difference in our ratings against the probability of me winning. If our ratings are equal, it's a 50-50 game. As the rating difference grows in my favor, my winning chances increase linearly."
However, this linear model has a flaw. It doesn't capture the idea that, beyond a certain rating difference, additional points don't significantly change the outcome. The match becomes increasingly predictable.
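- Sid's straight line is easy to put in code. This is an illustration of the flawed model, not part of the final system; the slope of 1/800 is an arbitrary choice made only so the numbers are easy to read.
- ```python
def linear_win_probability(rating_diff, slope=1 / 800):
    """Naive linear estimate of Player 1's win probability,
    clamped to [0, 1]. rating_diff = R1 - R2."""
    p = 0.5 + slope * rating_diff
    return max(0.0, min(1.0, p))

print(linear_win_probability(0))     # 0.5: equal ratings, a coin flip
print(linear_win_probability(480))   # 1.0: already clipped to certainty
print(linear_win_probability(1000))  # 1.0: indistinguishable from a 480-point gap
```
- This is exactly the flaw Sid identifies: past the clip point, additional rating points are invisible to the model, and the hard corner at the clip is nothing like the smooth taper a real skill curve should have.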
- **Step 2: Enter the Sigmoid**
Sid then sketches a new curve, an S-shape known as the sigmoid or logistic function. "This," he explains, "better captures our intuition. When the rating difference is zero, the outcome is unpredictable. But as the difference grows, the curve flattens, indicating that the outcome becomes more certain."
Ercan, always eager to grasp the core of an idea, asks, "But how do we translate this curve into a formula we can use?"
- **Step 3: The Logistic Function's Rich Heritage**
collapsed:: true
- The night deepens, and Sid, with an air of nostalgia, begins, "Ercan, the sigmoid or logistic function isn't new. It has deep historical roots. Pierre François Verhulst, a 19th-century Belgian mathematician, first developed it while studying population growth. He was fascinated by how populations grow rapidly initially but taper off as resources become limited."
- Sid continues, "In the 20th century, the logistic function found applications in various fields, including biology, chemistry, and yes, chess! The shape of the curve, with its steady middle and tapering ends, mirrors many natural processes."
- He sketches the curve, marking the areas where the function is most sensitive and where it starts to plateau. "See, Ercan, when the rating difference is minimal, even a slight change can significantly affect the outcome. But as the difference grows, the outcome becomes more certain, and the curve flattens."
- Ercan, intrigued, asks, "But how did you think of applying it here?"
- Sid smiles, "Years of diving deep into various fields gives one the ability to connect seemingly unrelated dots. The logistic function, with its predictive nature, seemed a natural fit for our problem."
- **Python Visualization: The Logistic Curve**
collapsed:: true
- ```python
import numpy as np
import matplotlib.pyplot as plt
# Generating values for the rating difference
rating_diff = np.linspace(-1000, 1000, 400)
# Logistic function calculation
E = 1 / (1 + 10**(-rating_diff/400))
# Plotting the curve
plt.figure(figsize=(10, 6))
plt.plot(rating_diff, E, '-r', label='Logistic Curve')
plt.title('Logistic Curve representing Expected Score')
plt.xlabel('Rating Difference (R1 - R2)')
plt.ylabel('Expected Score for Player 1')
plt.axhline(0.5, color='grey', linewidth=0.5)
plt.axvline(0, color='grey', linewidth=0.5)
plt.grid(True, which='both', linestyle='--', linewidth=0.5)
plt.legend()
plt.show()
```
- The logistic curve visualizes the expected score for Player 1 based on the rating difference between the two players. The y-axis represents the probability of Player 1 winning. As we move along the x-axis, left or right from the center, the predicted outcome becomes more decisive.
- Notice how, around a zero difference, the curve is most sensitive. Even a small shift can drastically change the expected outcome. As we move further away from the center, the curve starts to plateau, indicating increased predictability in match outcomes.
- Sid delves into the mathematics. "The logistic function can be expressed as:
\[ E = \frac{1}{1 + 10^{\frac{(R2 - R1)}{400}}} \]
- Where:
- \( E \) is the expected score or probability of Player 1 winning.
- \( R1 \) and \( R2 \) are the ratings of Player 1 and Player 2.
- This function gives us a value between 0 and 1, representing the probability of Player 1 winning."
- Ercan, squinting at the formula, remarks, "But why the base 10? And why divide by 400?"
- **Step 4: Calibration and Intuition**
- Sid, reflecting on Ercan's query, explains, "Ercan, the choices aren't arbitrary. The base and the divisor together set the scale of the ratings. With base 10 and a divisor of 400, every 400-point gap multiplies the odds by ten: a player rated 400 points above their opponent is expected to score about 10 out of every 11 games, roughly 0.91. Pick a smaller divisor and the curve becomes twitchy, overreacting to small differences; pick a larger one and it barely distinguishes players at all. Four hundred sits at a comfortable sweet spot."
- Ercan nods, understanding dawning, "So, it's like tuning a guitar. You tweak until it sounds just right."
- Sid chuckles, "Exactly! And remember, math provides the tools, but it's the art of application that makes all the difference."
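- Sid's calibration claim can be checked numerically. The helper below is ours in name only; the formula is the one from this chapter, written with rating_diff = R1 - R2:
- ```python
def expected_score(rating_diff):
    """Expected score for Player 1, given rating_diff = R1 - R2."""
    return 1 / (1 + 10 ** (-rating_diff / 400))

print(expected_score(0))              # 0.5 -- equal ratings, a coin flip
print(round(expected_score(400), 3))  # 0.909 -- 10-to-1 odds
print(round(expected_score(800), 3))  # 0.99 -- 100-to-1 odds
```
- Each additional 400 points of advantage multiplies the odds by another factor of ten, which is precisely what choosing base 10 with divisor 400 encodes.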
- **Step 5: The Prediction Engine**
- With the logistic function in hand, they now have a tool to predict match outcomes based on ratings. But predictions are just one side of the coin. After each match, they need a mechanism to adjust the ratings based on the actual result compared to the prediction.
This feedback loop, they realize, will be the core of their new rating system, ensuring it remains dynamic and responsive.
As Sid and Ercan sit, their game board between them, they realize they're on the brink of creating something revolutionary. The foundation is laid; now, it's time to build upon it.
- **Chapter 6: Feedback Loops and Dynamic Adjustments**
collapsed:: true
- As the Boston sun cast its golden hue over the chessboard, Sid and Ercan realized that prediction was just one piece of the puzzle. They needed to complete the loop: adjust ratings after each game based on performance.
---
- **Step 1: Defining the Adjustment**
collapsed:: true
- Sid began, "Our system predicts an outcome based on our ratings. After our game, we have the actual result. The difference between the predicted and actual result will guide our rating adjustment."
Ercan, eyebrows furrowed, thought aloud, "So, if I win against you, despite the system predicting a loss for me due to our rating difference, I should gain more points. Conversely, if I lose a game the system expects me to win, my rating should drop more significantly."
"Exactly," Sid affirmed. "The magnitude of this adjustment depends on two things: the difference between expected and actual outcomes, and a predefined 'K-factor', which determines how drastically ratings change."
---
- **Step 2: Introducing the K-factor**
collapsed:: true
- Sid explained, "The K-factor is crucial. It's like the sensitivity knob on a device. A high K-factor means ratings can swing dramatically after a single game, making it suitable for players just starting out. A lower K-factor provides stability, ideal for established players."
Ercan, ever inquisitive, asked, "How do we determine this K-factor?"
Sid responded, "It's often set based on the player's experience. Beginners might have a K-factor of 40, intermediates 20, and seasoned players around 10. But remember, there's an element of art to this. Over time, we'll adjust based on what feels right."
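- Sid's rule of thumb can be written down as a small lookup. The 40/20/10 values come straight from his description; the game-count thresholds (30 and 100) are illustrative assumptions, since the conversation leaves "experience" loosely defined:
- ```python
def k_factor(games_played):
    """Pick a K-factor from experience level.

    The 40/20/10 tiers follow Sid's rule of thumb; the 30- and
    100-game thresholds are illustrative placeholders.
    """
    if games_played < 30:
        return 40   # beginners: let ratings move quickly
    elif games_played < 100:
        return 20   # intermediates: moderate adjustment
    else:
        return 10   # seasoned players: stability

print(k_factor(5), k_factor(50), k_factor(500))  # 40 20 10
```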
---
- **Step 3: Putting It All Together - The Rating Update Formula**
collapsed:: true
- Sid jotted down the formula:
\[ R' = R + K \times (S - E) \]
Where:
- \( R' \) is the new rating.
- \( R \) is the old rating.
- \( K \) is the K-factor.
- \( S \) is the actual score (1 for a win, 0.5 for a draw, 0 for a loss).
- \( E \) is the expected score from our logistic function.
- Ercan, looking at the formula, remarked, "So after each game, we plug in our ratings, the game's result, and this formula spits out our new ratings?"
Sid nodded, "Precisely. Over time, as we play more games and adjust our ratings, they'll become an ever more accurate reflection of our skills."
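- Plugging the chapter's numbers into the formula shows the asymmetry at work. Suppose Ercan (rated 900) upsets Sid (rated 1300) with K = 40; this sketch uses only the two equations above:
- ```python
def expected(r1, r2):
    """Expected score for Player 1 via the logistic function."""
    return 1 / (1 + 10 ** ((r2 - r1) / 400))

k = 40
r_ercan, r_sid = 900, 1300

e_ercan = expected(r_ercan, r_sid)         # ~0.09: the system expects Ercan to lose
new_ercan = r_ercan + k * (1 - e_ercan)    # Ercan wins anyway: S = 1
new_sid = r_sid + k * (0 - (1 - e_ercan))  # Sid scores S = 0 against a high expectation

print(round(new_ercan, 1))  # 936.4 -- a big reward for the upset
print(round(new_sid, 1))    # 1263.6 -- a matching drop; the exchange is zero-sum
```
- Had the favored Sid won instead, the same formula would have nudged each rating by only about 3.6 points, since the result would have matched the prediction almost exactly.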
- **Step 4: The First Test**
collapsed:: true
- Excited to put their new system into practice, Sid and Ercan decided their next game would be the first test. With every move and strategy, the underlying pulse of their new rating system added an extra layer of thrill to their match.
- ---
- The room was filled with a palpable tension and excitement, not just from the ongoing game but from the realization that they were onto something groundbreaking. They had taken their first steps towards creating a dynamic, responsive, and fair rating system.
- **Chapter 7: Trials, Refinements, and Revelations**
collapsed:: true
- The Dynamo Rating System, though theoretically sound, was yet to be tested in the real world. Eager to see their creation in action, Sid and Ercan decided to simulate their matches, tracking rating changes over a series of games.
- **Step 1: Laying the Foundation**
- With a sparkle in his eyes, Ercan started laying out the parameters for the simulation on his laptop. "Alright, Sid. Let's start with our current ratings. You're at 1300, and I'm hovering around 900. We'll simulate 50 games and inject some unpredictability into the outcomes, making it more realistic."
- **Step 2: The K-factor's Dynamics**
- Sid, glancing at the screen, inquired, "Ercan, how are you managing the K-factor's adjustments?"
- Ercan, typing away, responded, "I'm starting with a K-factor of 40 for myself, reflecting my newcomer status. But here's the twist: as the games progress and I get more experience, I'll gradually reduce the K-factor. This will give me a more stable rating as I play more games."
- **Step 3: Code in Action**
- Ercan began to code the simulation, explaining each part to Sid as he crafted the functions.
- Update Rating Code
- ```python
def update_ratings(r1, r2, score1, k_factor):
    """
    Update player ratings after a match.

    Args:
        r1 (float): Rating of Player 1
        r2 (float): Rating of Player 2
        score1 (float): Actual score of Player 1 (1 for win, 0.5 for draw, 0 for loss)
        k_factor (int): Sensitivity of rating adjustment

    Returns:
        tuple: New ratings for Player 1 and Player 2
    """
    expected_score1 = 1 / (1 + 10**((r2 - r1) / 400))
    expected_score2 = 1 - expected_score1
    new_r1 = r1 + k_factor * (score1 - expected_score1)
    new_r2 = r2 + k_factor * ((1 - score1) - expected_score2)
    return new_r1, new_r2

# Test the function with a hypothetical game where Sid (Player 1) wins (score1 = 1)
update_ratings(1300, 900, 1, 40)
```
- With the update function defined and tested, Ercan was ready to script a full series of games. "Let's see how our ratings evolve, game by game," he said, anticipating that his own rating would swing significantly at first while Sid's stayed relatively steady.
- **Step 4: Crafting the Series of Games**
- Ercan began typing away, crafting the code that would simulate a series of games between the two of them. "Sid, I'm going to simulate 50 games. We'll observe how our ratings change over time, adjusting based on actual outcomes and our predicted scores."
- Simulation Code
collapsed:: true
- ```python
import random
import matplotlib.pyplot as plt

def simulate_games(num_games, r1, r2, initial_k_factor):
    """
    Simulate a series of games and track rating changes.

    Args:
        num_games (int): Number of games to simulate
        r1 (float): Initial rating of Player 1
        r2 (float): Initial rating of Player 2
        initial_k_factor (int): Initial K-factor value

    Returns:
        tuple: Ratings of Player 1 and Player 2 after each game
    """
    ratings_r1 = [r1]
    ratings_r2 = [r2]
    k_factor = initial_k_factor
    for _ in range(num_games):
        expected_score1 = 1 / (1 + 10**((r2 - r1) / 400))
        # Simulating the game's outcome based on expected scores,
        # with half of Player 1's losing probability allocated to draws
        outcome = random.random()
        if outcome < expected_score1:
            score1 = 1      # Player 1 wins
        elif outcome < expected_score1 + (1 - expected_score1) / 2:
            score1 = 0.5    # Draw
        else:
            score1 = 0      # Player 2 wins
        r1, r2 = update_ratings(r1, r2, score1, k_factor)
        ratings_r1.append(r1)
        ratings_r2.append(r2)
        # Gradually lowering the shared K-factor as the players gain experience
        k_factor = max(10, k_factor - 0.5)
    return ratings_r1, ratings_r2

# Simulating 50 games
num_games = 50
initial_r1 = 1300
initial_r2 = 900
initial_k_factor = 40
ratings_r1, ratings_r2 = simulate_games(num_games, initial_r1, initial_r2, initial_k_factor)

# Plotting the results
plt.figure(figsize=(12, 7))
plt.plot(ratings_r1, label="Sid's Rating", color='blue')
plt.plot(ratings_r2, label="Ercan's Rating", color='red')
plt.title('Evolution of Ratings over 50 Games')
plt.xlabel('Number of Games')
plt.ylabel('Rating')
plt.legend()
plt.grid(True)
plt.show()
```
- **Chapter 8: Insightful Iterations**
collapsed:: true
- Having built and tested their system, Sid and Ercan took a moment to reflect on their journey and the insights they had garnered.
- **Step 1: The Beauty of Dynamic Systems**
- Ercan mused, "You know, Sid, the most fascinating aspect of the Dynamo System is its ability to self-correct. As we play more, the system gets a clearer picture of our skills and adjusts accordingly."
- Sid nodded, "It's a testament to the power of feedback loops in learning and adaptation. We've essentially built a system that learns from every match."
- **Step 2: The Challenge of Calibration**
- Ercan highlighted another point, "Calibrating the K-factor was challenging. Making it dynamic, adjusting to the player's experience, was key. It ensures the system is sensitive for beginners and stable for seasoned players."
- Sid replied, "It's a balance, isn't it? Too much sensitivity and the ratings become erratic. Too little, and they don't reflect recent performances."
- **Step 3: The Journey Ahead**
- As they looked forward, Sid remarked, "There's so much more we can do. Perhaps introducing other factors like game duration, or even specific strategies employed."
- Ercan grinned, "The possibilities are endless. The Dynamo System is just the beginning."
- **Chapter 9: Chess Ratings and the AI Paradigm**
collapsed:: true
- As the sun began its descent, casting a warm hue in the room, Sid and Ercan found themselves delving into a profound conversation. With Sid's extensive background in artificial intelligence infrastructure, they began drawing parallels between their Dynamo Rating System and the intricate world of AI.
- ---
- **Step 1: Foundations of Learning**
- "Consider this, Ercan," Sid began, "Our rating system, at its core, is about learning. After every match, it 'learns' from the outcome and updates our ratings. This continuous feedback and adjustment is akin to how AI models train."
- Ercan, intrigued, responded, "So you're saying our matches are like training data, and each game outcome helps refine the model – in this case, our ratings?"
- "Exactly," Sid affirmed.
- ---
- **Step 2: The Gradient Descent Analogy**
- Sid continued, "In deep learning, there's this concept of gradient descent. It's a method to minimize the error of a model's predictions. After each epoch, or training cycle, the model tweaks its parameters to get closer to the true outcomes. Our Dynamo System's adjustment of ratings based on the difference between expected and actual outcomes is eerily similar."
- Ercan, processing this, remarked, "It's like our system is trying to minimize its 'error' – the difference between our expected and real-world performance."
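- The analogy Sid draws can be made concrete with a tiny sketch (illustrative only, not part of the Dynamo code). Both update rules nudge a number in proportion to (target − prediction), with the K-factor playing the role of the learning rate. For a model whose prediction is the parameter itself, one gradient-descent step on squared error reduces to exactly the same arithmetic as the Elo update:
- ```python
def gd_step(w, prediction, target, lr):
    """One gradient-descent step on 0.5 * (prediction - target)**2,
    where the prediction is the parameter w itself."""
    return w + lr * (target - prediction)

def elo_step(r, expected, actual, k):
    """The Dynamo update: R' = R + K * (S - E)."""
    return r + k * (actual - expected)

# Same inputs, same arithmetic, different vocabulary:
# Ercan (rated 900) wins a game he was expected to lose (E ~ 0.09), K = 40.
print(round(gd_step(900, 0.09, 1.0, 40), 1))   # 936.4
print(round(elo_step(900, 0.09, 1.0, 40), 1))  # 936.4
```
- In fact, up to a constant factor, the Elo update is the gradient step of a logistic-regression-style loss with respect to a single rating parameter, which is why the parallel feels so natural.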
- ---
- **Step 3: Overfitting and the K-Factor**
- "You know," Sid mused, "There's another parallel. In AI, if a model trains too closely on its training data, it might perform poorly on unseen data. This is called overfitting. Our K-factor adjustment is a safeguard against that. By reducing the K-factor over time, we ensure our system doesn't over-adjust based on recent performances and maintains a holistic view."
- Ercan nodded, "It ensures our system remains general and adaptable, not just fixated on recent matches."
- ---
- **Step 4: The Continuous Evolution**
- Sid leaned forward, "What excites me the most, Ercan, is the continuous evolution. Just as AI models evolve with more data, our system will keep refining itself as we play more games. It's a living, breathing entity, always learning, always adapting."
- Ercan, with a smile, concluded, "Chess, ratings, AI – different domains, yet bound by the same principles of learning and adaptation."
- ---
- By drawing this deep connection between the Dynamo Rating System and AI, Sid and Ercan realized that the principles of learning and adaptation are universal, transcending domains and applications.