š® Do Junglers Really Deserve the Blame?
This project investigates whether Junglers are statistically responsible for a teamās loss using real match data from League of Legends. Scroll down for each section of the analysis.
š§ Table of Contents
- 1. Introduction & Research Question
- 2. Data Cleaning
- 3. Feature Exploration
- 4. Baseline Model
- 5. Final Model + Evaluation
- 6. Conclusion
š§© 1. Introduction & Research Question
Stating the Question
Looking at the data, it seems like there are enough statistics for me to finally explore a long-standing question: āAre Junglers really the ones to blame?ā As someone who only knows League of Legends through my friends and some media outlets, it seems like players often hate on Junglers simply because they need someone to blame. In other words, the reason why Junglers are perceived as trolls could be psychological, and not actually backed by evidence. So, I will take this opportunity to examine whether there is any correlation between a Junglerās performanceārelative to their teammatesāand the outcome of the game. Or in other words, figure out whether Junglers performing poorly have a significant impact on the outcome of the game, compared to other roles.
Why You Should Care
I will explore this questions in hopes to contribute to broader discussions in esports analytics about how performance should be measured fairly. However, at the core, if you are like one of my friends and blames the Junglers without a valid statistical reasoning, man up for the poor guy and give him a chance by reading this report.
Data Set Summary
The dataset includes professional League of Legends match statistics from the 2024 season:
- Shape: (117636, 161)
- Relevant Columns:
earnedgoldshare
: The proportion of team gold earneddamageshare
: The proportion of team damage dealtkills
,assists
,teamkills
: Used to calculate kill participation [(kills + assists) / teamkills]total cs
: Used to calculate share of teamās creep scoredeaths
,teamdeaths
: Used to calculate share of teamās deathsteam_dragons
,team_barons
,team_towers
,team_inhibitors
: Team objectives securedopp_team_dragons
,opp_team_barons
,opp_team_towers
,opp_team_inhibitors
: Objectives lost to the opposing team
š§¹ 2. Data Cleaning
Cleaning the Data
I filtered the dataset to only include rows where the role was ājngā, then selected a subset of columns related to Jungler performance. I engineered three new features:
kill_participation
= (kills + assists) / teamkillscs_participation
= total cs / team csdeathshare
= deaths / teamdeaths
There were no missing values in the final subset, so no imputation was needed.
Glancing at the Cleaned Data
Head of Relevant Columns (besides meta data):
result | earnedgoldshare | damageshare | kill_participation | cs_participation | deathshare | team_dragons | team_barons | team_towers | team_inhibitors | opp_team_dragons | opp_team_barons | opp_team_towers | opp_team_inhibitors |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.154069 | 0.176101 | 1 | 0.146692 | 0.25 | 2 | 0 | 2 | 0 | 3 | 2 | 9 | 1 |
1 | 0.157264 | 0.0692822 | 0.8125 | 0.153636 | 0 | 3 | 2 | 9 | 1 | 2 | 0 | 2 | 0 |
0 | 0.186598 | 0.056958 | 0.666667 | 0.189331 | 0.294118 | 0 | 0 | 2 | 0 | 4 | 1 | 9 | 1 |
1 | 0.177672 | 0.181341 | 0.882353 | 0.170557 | 0.333333 | 4 | 1 | 9 | 1 | 0 | 0 | 2 | 0 |
1 | 0.21582 | 0.139214 | 0.619048 | 0.206107 | 0.333333 | 2 | 1 | 10 | 2 | 1 | 0 | 0 | 0 |
Univariate Analysis
Overall, features were either already well normalized or skewed.
- Already Normalized Data:
Interpretation: Distribution is not skewed. Most Junglers contribute between 15% and 25% of their teamās total creep score, which makes sense since there are 5 players and lane players typically get more minions. This supports its use in raw form without transformation.
- Skewed Data:
Interpretation: Right-skewed distribution. A small number of Junglers contribute disproportionately high damage, likely due to champion pick (e.g., AP assassins or carry builds). Most hover below 20%. This skew justifies transforming damageshare
using a quantile or log-based method during modeling to reduce outlier impact.
Bivariate Analysis
For bivariate analysis, I looked at how objective control counts relate to win rate. Overall, there were positive correlations between team objective counts and the likelihood of winning. However, some relationships had unexpected shapes.
- Team Dragons vs. Win Rate
Interpretation: There is a clear positive relationship ā teams that secure more dragons win more often, which reinforces the strategic value of dragon control.
- Team Barons vs. Win Rate
Interpretation: The shape is bimodal; teams often either win without any barons or dominate after securing one or more, suggesting that Baron may act more as a win accelerator than a deciding factor.
- Opp. Team Dragons vs. Win Rate
Interesting Aggregations
An interesting aggregation is by patch and champion. This lets us observe how Jungler performance varies across different champions and across changing environments (patch versions).
Significance: This aggregation helps contextualize Jungler performance. If certain champions or patches lead to systematically better or worse stats, it suggests that win/loss outcomes may be influenced as much by the meta as by the individual player. In other words, Junglers might be blamed for poor outcomes that were more about champion viability or patch balance than their own decisions.
Imputation
Zero imputations were necessary!!
š 3. Feature Exploration
Stating the Problem
As mentioned in the introduction, the goal of this project is to determine whether Junglers are truly to blame for their teamās loss. To examine this, I framed the problem as a binary classification task: given a Junglerās performance stats in a match, can we predict whether their team won or lost?
Target Variable
The target is result
, where:
1
= win0
= loss
This makes it a clean 0/1 classification setup.
Feature Selection
To avoid confounding the model with non-Jungler factors, I used only Jungler-centric or Jungler-related team objective features:
earnedgoldshare
,damageshare
,kill_participation
,cs_participation
,deathshare
team_dragons
,team_barons
,team_towers
,team_inhibitors
opp_team_dragons
,opp_team_barons
,opp_team_towers
,opp_team_inhibitors
These features capture both individual contribution and team-level performance signals often associated with Jungler impact (e.g., objective control, map pressure, survivability).
Why This Problem Framing Works
If a model trained only on Jungler-related stats can predict game outcomes with reasonable accuracy, then it suggests that Junglers do have a measurable and consistent effect on whether a team wins or loses. If the model performs poorly, it would suggest that outcomes are either:
- Driven by other roles or external factors (draft, coordination, etc.), or
- Too noisy to reliably attribute to one role.
This approach allows for a statistical investigation of the blame narrative, grounded in real match data.
š¤ 4. Baseline Model
Feature Selection
For the baseline model, I used four core Jungler-related features:
kill_participation
deathshare
earnedgoldshare
damageshare
These are metrics that are most relative to the other roles.
Model Setup
For baseline, I selected Logistic Regression
(because why not) and StandardScaler
for basic normalization. The model was evaluated on a held-out test set (20% split from the full dataset).
Results
- Accuracy: 0.531
- Precision (loss): 0.53
- Precision (win): 0.53
- Recall (loss): 0.50
- Recall (win): 0.57
Interpretation
The baseline model performs only slightly better than random guessing. While Jungler stats like kill participation and gold share show some signal, they are not strong enough on their own to reliably predict game outcomes. This suggests that Junglers may influence wins or losses, but these features alone are not sufficient to āblameā them ā or at least not to predict results with high confidence. This gives a good foundation to build a more sophisticated model using additional features and transformations in the next step.
š§ 5. Final Model
Feature Engineering
To improve on the baseline model, I added team-level objective features in addition to the original Jungler metrics. I then applied the following transformations using ColumnTransformer
:
- Standard-scaled features:
earnedgoldshare
,damageshare
,kill_participation
,cs_participation
- Quantile-transformed features (to normalize skew):
deathshare
,team_dragons
,team_barons
,team_towers
,team_inhibitors
opp_team_dragons
,opp_team_barons
,opp_team_towers
,opp_team_inhibitors
These transformations helped ensure that both normally distributed and skewed features were standardized for effective model training.
Model Tuning & Selection
I evaluated three classifiers using GridSearchCV
with 5-fold cross-validation:
- LogisticRegression:
C = [0.1, 1, 10]
- SVC:
C = [0.1, 1, 10]
, kernels =['linear', 'rbf']
- RandomForestClassifier:
n_estimators = [50, 100]
,max_depth = [5, 10, None]
All models used the same preprocessing pipeline and training/testing split.
Best Model & Results
- Best Model:
SVC
withC = 10
,kernel = 'rbf'
- Test Accuracy: 0.9828
SVC Classification Report:
- Precision (loss): 0.99
- Recall (loss): 0.98
- Precision (win): 0.98
- Recall (win): 0.99
- F1-score (macro avg): 0.98
RandomForest
- Accuracy: 0.9815
- Best Parameters:
n_estimators=100
,max_depth=10
- F1-score (macro avg): 0.98
Logistic Regression
- Accuracy: 0.9760
- Best Parameters:
C=10
- F1-score (macro avg): 0.98
Confusion Matrix
Below are the confusion matrices for each final model, showing how often predictions matched the actual outcomes.
- Logistic Regression
- Random Forest
- Support Vector Classifier (SVC)
Interpretation
All three models performed significantly better than the original baseline, with SVC achieving the highest accuracy. These results show that Jungler performance, when properly contextualized with team objectives and transformed for skew, is highly predictive of match outcome. In this dataset, Jungler-related stats were enough to predict wins and losses with nearly 99% accuracy ā suggesting that the roleās impact is not just perception, but measurable.
While this doesnāt mean Junglers are always to blame, it strongly supports the idea that their performance is deeply tied to team success or failure.