Winning Solutions of Chelsea FC player recruitment

Chelsea FC is an English professional football club based in Fulham, West London…

However, the team has been performing poorly recently.
Let’s find the player Chelsea FC needs most to strengthen the team.

source code
Python | FIFA23 OFFICIAL DATASET

Data, Modules Loading and Config

Data Loading

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sns
import warnings # ignoring warning messages
warnings.filterwarnings('ignore')
# load data (The FIFA23 data doesn't reflect 'Main Position' well, so we are using FIFA22 data)
data = pd.read_csv("/content/drive/MyDrive/Dataset/FIFA dataset/FIFA22_official_data.csv")

Basic Exploration

data.head()

output: image

Delete unnecessary columns

print(data.columns)
data = data.drop(columns=['Photo', 'Flag', 'Club Logo', 'Real Face'])

output: image

Exploratory Data Analysis of Chelsea FC

EDA, Data Visualization

1. Check the positions and ages of Chelsea players

Chelsea = data[data['Club']=='Chelsea']
plt.figure(figsize=(10,6))
sns.countplot(x = Chelsea['Age'])

output: image

plt.figure(figsize=(10,6))
sns.countplot(x = Chelsea['Best Position'])

output: image

1.1 Check Overall ratings by position with a box plot

plt.figure(figsize = (10,6))
sns.boxplot(data=Chelsea, x='Best Position', y='Overall')

output: image

2. Compare with other strong teams

I chose Real Madrid CF and Manchester City, which I think are strong teams, for comparison

# Put the data of Chelsea, Manchester City, and Real Madrid CF in df1
df1 = data[(data['Club']=='Chelsea')|(data['Club']=='Manchester City')|(data['Club']=='Real Madrid CF')]

# Keep only players valued at €1M or more (their Value string contains 'M')
df1 = df1[df1['Value'].str.contains('M')]

# Likewise, delete unnecessary columns
df1.info()
df1 = df1.drop(columns=['Marking'])

# Convert the Value column to float by stripping the leading '€' and the trailing 'M'
df1['Value'] = df1['Value'].str.slice(1,-1).astype(float)

output: image
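The slice-based conversion above relies on every remaining value ending in 'M'. As a hedged alternative, a small parser could handle both the 'M' and 'K' suffixes, which would make the `str.contains('M')` filter optional. The function name `parse_value` and the sample strings below are illustrative assumptions, not part of the original notebook.

```python
def parse_value(text):
    """Convert a value string such as '€1.5M' or '€500K' to millions of euros."""
    divisors = {'M': 1, 'K': 1000}  # suffix -> divisor that yields millions
    cleaned = str(text).strip().lstrip('€')
    if cleaned and cleaned[-1] in divisors:
        return float(cleaned[:-1]) / divisors[cleaned[-1]]
    # No suffix: treat the number as plain euros
    return float(cleaned) / 1_000_000


# Hypothetical sample strings in the same format as the Value column
samples = ['€107.5M', '€500K', '€0']
parsed = [parse_value(s) for s in samples]
print(parsed)
```

On the DataFrame, this could be applied with `df1['Value'].map(parse_value)`.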

2.1 Identify weak positions

cs = df1[df1['Club']=='Chelsea'].sort_values(by='Overall',ascending=False)
rm = df1[df1['Club']=='Real Madrid CF'].sort_values(by='Overall',ascending=False)
mc = df1[df1['Club']=='Manchester City'].sort_values(by='Overall',ascending=False)
data['Best Position'].unique()

output: image

To compare the main players, select a starting lineup for each team (based on Overall)

Based on a 4-4-2 formation, 1 GK, 4 defenders (labeled CB), 4 MF, and 2 ST are selected

gk_list = ['GK']
cb_list = ['CB','RB','LB','RWB','LWB']
mf_list = ['CAM','CM','CDM','LM','RM']
st_list = ['ST','LW','RW','CF']

def select_lineup(team):
  # team must already be sorted by Overall in descending order
  team = team.copy()
  # open slots for a 4-4-2: 1 GK, 4 defenders, 4 midfielders, 2 strikers
  slots = {'GK': 1, 'CB': 4, 'MF': 4, 'ST': 2}
  picked = []
  for index in team.index:
    position = team.loc[index, 'Best Position']
    if position in gk_list:
      group = 'GK'
    elif position in cb_list:
      group = 'CB'
    elif position in mf_list:
      group = 'MF'
    else:
      # every remaining position is treated as a striker, as before
      group = 'ST'
    if slots[group] > 0:
      # use .loc instead of chained indexing so the assignment reaches the DataFrame
      team.loc[index, 'Best Position'] = group
      picked.append(team.loc[index, 'ID'])
      slots[group] -= 1
  return team[team['ID'].isin(picked)]

cs = select_lineup(cs)
rm = select_lineup(rm)
mc = select_lineup(mc)
# Put them back together for comparison
df = pd.concat([cs,mc,rm], axis=0)
plt.figure(figsize=(10,6))
sns.boxplot(data=df, x='Best Position', y='Overall', hue='Club')

output: image

plt.figure(figsize=(10,6))
sns.boxplot(data=df, x='Best Position', y='Value', hue='Club')

output: image

Compared with the other teams, Chelsea FC is weaker overall, and their defense is particularly weak

Solution, Data exploration

1. Making own formula

Create a formula for a player’s score by weighting what I think matters most in a player (e.g., age, overall, potential)
Point = (Overall*2+Potential)/Age

cs['Point'] = (cs['Overall']*2+cs['Potential'])/cs['Age']
cs[cs['Best Position']=='CB'][['Name', 'Overall', 'Potential','Age','Point','Joined','Value']]

output: image

Thiago Silva is one of my favorite players, but his score is low because of his age
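To make the effect of age concrete, here is a minimal sketch of the Point formula as a plain function. The two sets of ratings are hypothetical numbers chosen only for illustration; they are not taken from the dataset.

```python
def point(overall, potential, age):
    """Score from the formula above: (Overall*2 + Potential) / Age."""
    return (overall * 2 + potential) / age


# Hypothetical ratings, chosen only to show how strongly age pulls the score down
veteran = point(overall=85, potential=85, age=37)
prospect = point(overall=80, potential=88, age=22)
print(round(veteran, 2), round(prospect, 2))
```

Because Age is the divisor, the younger player scores higher here despite the lower Overall rating.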

2. Find a replacement player

# Center-backs who are not at Chelsea FC and have an Overall of 83 or higher
market=data[(data['Best Position']=='CB')&(data['Club']!='Chelsea')&(data['Overall']>=83)].copy()
market['Point']=(market['Overall']*2+market['Potential'])/market['Age']
want = market[['Name','Club','Age','Overall','Potential','Value','Point','Joined']]
want

output: image

fig, ax = plt.subplots(nrows=2,ncols=2)

fig.set_size_inches((12,8))
plt.subplots_adjust(wspace = 0.4, hspace = 0.2)

sns.barplot(data=want, x='Age', y='Name', ax=ax[0,0])
sns.barplot(data=want, x='Overall', y='Name', ax=ax[0,1])
sns.barplot(data=want, x='Potential', y='Name', ax=ax[1,0])
sns.barplot(data=want, x='Point', y='Name', ax=ax[1,1])

output: image
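The bar charts show the candidates in table order, so the top scorer has to be read off by eye. As a rough sketch, the shortlist could instead be ranked by Point directly. The player names and ratings below are hypothetical placeholders, not rows from the dataset.

```python
# Hypothetical shortlist rows: (name, age, overall, potential)
shortlist = [
    ('Player A', 21, 85, 90),
    ('Player B', 29, 89, 89),
    ('Player C', 25, 84, 87),
]


def point(age, overall, potential):
    """Same formula as above: (Overall*2 + Potential) / Age."""
    return (overall * 2 + potential) / age


# Sort from highest to lowest Point so the best candidate comes first
ranked = sorted(shortlist, key=lambda row: point(*row[1:]), reverse=True)
for name, age, overall, potential in ranked:
    print(name, round(point(age, overall, potential), 2))
```

With the `want` DataFrame, the equivalent would be `want.sort_values('Point', ascending=False)`.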

Conclusion: Chelsea FC should offer a contract to M. de Ligt, who has the highest Point score