šŸŽ® Do Junglers Really Deserve the Blame?

This project investigates whether Junglers are statistically responsible for a team’s loss using real match data from League of Legends. Scroll down for each section of the analysis.


🧭 Table of Contents

1. Introduction & Research Question
2. Data Cleaning
3. Feature Exploration
4. Baseline Model
5. Final Model


🧩 1. Introduction & Research Question

Stating the Question

Looking at the data, there are finally enough statistics for me to explore a long-standing question: ā€˜Are Junglers really the ones to blame?’ As someone who only knows League of Legends through my friends and some media outlets, it seems like players often hate on Junglers simply because they need someone to blame. In other words, the perception of Junglers as trolls could be psychological rather than backed by evidence. So I will take this opportunity to examine whether there is any correlation between a Jungler’s performance, relative to their teammates, and the outcome of the game; in short, whether a poorly performing Jungler affects the result more than a poorly performing player in any other role does.

Why You Should Care

I will explore this question in hopes of contributing to broader discussions in esports analytics about how performance should be measured fairly. At its core, though: if you are like one of my friends and blame Junglers without any valid statistical reasoning, do the poor guy a favor and give him a chance by reading this report.

Data Set Summary

The dataset includes professional League of Legends match statistics from the 2024 season.


🧹 2. Data Cleaning

Cleaning the Data

I filtered the dataset to only include rows where the role was ā€˜jng’, then selected a subset of columns related to Jungler performance. I engineered three new features, each measuring the Jungler’s contribution relative to their own team:

- kill_participation: the share of the team’s kills the Jungler took part in (kills plus assists over team kills)
- cs_participation: the Jungler’s share of the team’s total creep score
- deathshare: the Jungler’s share of the team’s total deaths

There were no missing values in the final subset, so no imputation was needed.
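
For concreteness, here is a minimal sketch of that cleaning step, assuming a pandas DataFrame of per-player match rows; the file name and raw column names (position, kills, assists, deaths, teamkills, teamdeaths, totalcs, teamcs) are my assumptions, not the project's actual identifiers:

```python
import pandas as pd

# Load raw per-player match data (hypothetical file name).
df = pd.read_csv("2024_lol_matches.csv")

# Keep only Jungler rows.
jng = df[df["position"] == "jng"].copy()

# Engineered features, each measured relative to the Jungler's own team.
# Raw column names here are assumed, not taken from the project.
jng["kill_participation"] = (jng["kills"] + jng["assists"]) / jng["teamkills"]
jng["cs_participation"] = jng["totalcs"] / jng["teamcs"]
jng["deathshare"] = jng["deaths"] / jng["teamdeaths"]
```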

Glancing at the Cleaned Data

Head of Relevant Columns (besides metadata):

| result | earnedgoldshare | damageshare | kill_participation | cs_participation | deathshare | team_dragons | team_barons | team_towers | team_inhibitors | opp_team_dragons | opp_team_barons | opp_team_towers | opp_team_inhibitors |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.154069 | 0.176101 | 1 | 0.146692 | 0.25 | 2 | 0 | 2 | 0 | 3 | 2 | 9 | 1 |
| 1 | 0.157264 | 0.0692822 | 0.8125 | 0.153636 | 0 | 3 | 2 | 9 | 1 | 2 | 0 | 2 | 0 |
| 0 | 0.186598 | 0.056958 | 0.666667 | 0.189331 | 0.294118 | 0 | 0 | 2 | 0 | 4 | 1 | 9 | 1 |
| 1 | 0.177672 | 0.181341 | 0.882353 | 0.170557 | 0.333333 | 4 | 1 | 9 | 1 | 0 | 0 | 2 | 0 |
| 1 | 0.21582 | 0.139214 | 0.619048 | 0.206107 | 0.333333 | 2 | 1 | 10 | 2 | 1 | 0 | 0 | 0 |

Univariate Analysis

Overall, each feature was either already roughly normal in shape or noticeably skewed; two representative examples are interpreted below.

Interpretation (cs_participation): The distribution is not skewed. Most Junglers contribute between 15% and 25% of their team’s total creep score, which makes sense: there are five players on a team, and laners typically farm more minions. This supports using the feature in raw form, without transformation.

Interpretation (damageshare): The distribution is right-skewed. A small number of Junglers contribute disproportionately high damage, likely due to champion pick (e.g., AP assassins or carry builds); most hover below 20%. This skew justifies transforming damageshare with a quantile or log-based method during modeling to reduce the impact of outliers.
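
To make that concrete, here is a sketch of the two transformation options mentioned above, applied to damageshare (neither is necessarily the project's exact code):

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

# Option 1: log1p compresses the long right tail while preserving order.
jng["damageshare_log"] = np.log1p(jng["damageshare"])

# Option 2: a quantile transform maps the feature onto a roughly normal
# distribution, which blunts the influence of extreme outliers.
qt = QuantileTransformer(output_distribution="normal", n_quantiles=100)
jng["damageshare_q"] = qt.fit_transform(jng[["damageshare"]]).ravel()
```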

Bivariate Analysis

For bivariate analysis, I looked at how objective control counts relate to win rate. Overall, there were positive correlations between team objective counts and the likelihood of winning. However, some relationships had unexpected shapes.

Interpretation (team_dragons): There is a clear positive relationship: teams that secure more dragons win more often, which reinforces the strategic value of dragon control.

Interpretation (team_barons): The shape is bimodal: teams often either win without any Barons or dominate after securing one or more, suggesting that Baron may act more as a win accelerator than a deciding factor.
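
This kind of check can be reproduced with a simple groupby on the objective counts (a sketch, continuing from the cleaned jng frame above):

```python
# Win rate (and sample size) as a function of each team objective count.
for col in ["team_dragons", "team_barons", "team_towers"]:
    print(f"\nWin rate by {col}:")
    print(jng.groupby(col)["result"].agg(["mean", "size"]).round(3))
```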

Interesting Aggregations

An interesting aggregation is by patch and champion. This lets us observe how Jungler performance varies across different champions and across changing environments (patch versions).

Significance: This aggregation helps contextualize Jungler performance. If certain champions or patches lead to systematically better or worse stats, it suggests that win/loss outcomes may be influenced as much by the meta as by the individual player. In other words, Junglers might be blamed for poor outcomes that were more about champion viability or patch balance than their own decisions.
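
A sketch of that aggregation, assuming the dataset exposes patch and champion columns under those names:

```python
# Average Jungler stats per (patch, champion) pair, with sample sizes so
# that low-volume picks can be read with appropriate caution.
meta = (
    jng.groupby(["patch", "champion"])
       .agg(
           games=("result", "size"),
           win_rate=("result", "mean"),
           avg_kill_participation=("kill_participation", "mean"),
           avg_damageshare=("damageshare", "mean"),
       )
       .query("games >= 10")  # ignore very rare picks
       .sort_values("win_rate", ascending=False)
)
print(meta.head(10))
```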

Imputation

No imputation was necessary: the final subset contained no missing values!


šŸ“Š 3. Feature Exploration

Stating the Problem

As mentioned in the introduction, the goal of this project is to determine whether Junglers are truly to blame for their team’s loss. To examine this, I framed the problem as a binary classification task: given a Jungler’s performance stats in a match, can we predict whether their team won or lost?

Target Variable

The target is result, where:

- 1 means the Jungler’s team won the match
- 0 means the Jungler’s team lost

This makes it a clean 0/1 classification setup.

Feature Selection

To avoid confounding the model with non-Jungler factors, I used only Jungler-centric or Jungler-related team objective features:

- Share and participation stats: earnedgoldshare, damageshare, kill_participation, cs_participation, deathshare
- Team objective counts: team_dragons, team_barons, team_towers, team_inhibitors
- Opponent objective counts: opp_team_dragons, opp_team_barons, opp_team_towers, opp_team_inhibitors

These features capture both individual contribution and team-level performance signals often associated with Jungler impact (e.g., objective control, map pressure, survivability).
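
Assembled in code, the design matrix and target could look like this (column names taken from the cleaned head in Section 2):

```python
# Features: share/participation stats plus team and opponent objectives.
feature_cols = [
    "earnedgoldshare", "damageshare", "kill_participation",
    "cs_participation", "deathshare",
    "team_dragons", "team_barons", "team_towers", "team_inhibitors",
    "opp_team_dragons", "opp_team_barons", "opp_team_towers",
    "opp_team_inhibitors",
]
X = jng[feature_cols]
y = jng["result"]  # 1 = win, 0 = loss
```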

Why This Problem Framing Works

If a model trained only on Jungler-related stats can predict game outcomes with reasonable accuracy, then it suggests that Junglers do have a measurable and consistent effect on whether a team wins or loses. If the model performs poorly, it would suggest that outcomes are either:

- driven mostly by factors outside the Jungler’s control (the other roles, draft, or team-wide play), or
- too noisy to pin on any single role’s performance

This approach allows for a statistical investigation of the blame narrative, grounded in real match data.


šŸ¤– 4. Baseline Model

Feature Selection

For the baseline model, I used four core Jungler-related features:

These are the metrics most directly comparable across roles, since each is measured as a share of, or participation in, the team’s totals.

Model Setup

For the baseline, I selected Logistic Regression (because why not) with StandardScaler for basic normalization. The model was evaluated on a held-out test set (a 20% split of the full dataset).
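
A minimal sketch of that baseline setup; the exact four-feature subset below is an assumption, not the project's confirmed choice:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical choice of the four core Jungler features.
baseline_cols = ["earnedgoldshare", "damageshare",
                 "kill_participation", "deathshare"]

X_train, X_test, y_train, y_test = train_test_split(
    jng[baseline_cols], y, test_size=0.2, random_state=42
)

baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
baseline.fit(X_train, y_train)
print(f"Baseline test accuracy: {baseline.score(X_test, y_test):.3f}")
```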

Results

Interpretation

The baseline model performs only slightly better than random guessing. While Jungler stats like kill participation and gold share show some signal, they are not strong enough on their own to reliably predict game outcomes. This suggests that Junglers may influence wins or losses, but these features alone are not sufficient to ā€œblameā€ them — or at least not to predict results with high confidence. This gives a good foundation to build a more sophisticated model using additional features and transformations in the next step.


šŸ”§ 5. Final Model

Feature Engineering

To improve on the baseline model, I added team-level objective features in addition to the original Jungler metrics. I then applied the following transformations using ColumnTransformer:

- StandardScaler on the features that were already roughly normal in shape
- A quantile-based transformation on the right-skewed features (such as damageshare), as motivated in the univariate analysis

These transformations helped ensure that both normally distributed and skewed features were standardized for effective model training.
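
One plausible rendering of that preprocessing; which columns fall into the ā€œnormalā€ versus ā€œskewedā€ branch is an assumption based on the univariate analysis:

```python
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import QuantileTransformer, StandardScaler

normal_cols = [
    "earnedgoldshare", "cs_participation", "kill_participation",
    "team_dragons", "team_barons", "team_towers", "team_inhibitors",
    "opp_team_dragons", "opp_team_barons", "opp_team_towers",
    "opp_team_inhibitors",
]
skewed_cols = ["damageshare", "deathshare"]

preprocess = ColumnTransformer([
    # Standardize the roughly normal features.
    ("scale", StandardScaler(), normal_cols),
    # Quantile-transform the right-skewed features to reduce outlier impact.
    ("quantile",
     QuantileTransformer(output_distribution="normal", n_quantiles=100),
     skewed_cols),
])
```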

Model Tuning & Selection

I evaluated three classifiers using GridSearchCV with 5-fold cross-validation:

- Logistic Regression
- Random Forest
- SVC (support vector classifier)

All models used the same preprocessing pipeline and training/testing split.
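
A sketch of the tuning loop; the hyperparameter grids here are illustrative, not the grids actually searched:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Same 80/20 split as the baseline, now with the full feature set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

candidates = {
    "logreg": (LogisticRegression(max_iter=1000), {"clf__C": [0.1, 1, 10]}),
    "random_forest": (
        RandomForestClassifier(random_state=42),
        {"clf__n_estimators": [100, 300], "clf__max_depth": [None, 10]},
    ),
    "svc": (SVC(), {"clf__C": [0.1, 1, 10], "clf__kernel": ["rbf", "linear"]}),
}

fitted = {}
for name, (clf, grid) in candidates.items():
    pipe = Pipeline([("prep", preprocess), ("clf", clf)])
    search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    fitted[name] = search
    print(f"{name}: best CV accuracy = {search.best_score_:.3f}")
```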

Best Model & Results

SVC Classification Report:

RandomForest Classification Report:

Logistic Regression Classification Report:

Confusion Matrix

Below are the confusion matrices for each final model, showing how often predictions matched the actual outcomes.
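
The matrices and per-class reports can be produced like this (a sketch continuing from the tuning loop above):

```python
from sklearn.metrics import classification_report, confusion_matrix

for name, search in fitted.items():
    y_pred = search.best_estimator_.predict(X_test)
    print(f"\n{name} confusion matrix (rows = actual, cols = predicted):")
    print(confusion_matrix(y_test, y_pred))
    print(classification_report(y_test, y_pred, target_names=["loss", "win"]))
```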

Interpretation

All three models performed significantly better than the original baseline, with SVC achieving the highest accuracy. These results show that Jungler performance, when properly contextualized with team objectives and transformed for skew, is highly predictive of match outcome. In this dataset, Jungler-related stats were enough to predict wins and losses with nearly 99% accuracy — suggesting that the role’s impact is not just perception, but measurable.

While this doesn’t mean Junglers are always to blame, it strongly supports the idea that their performance is deeply tied to team success or failure.