In [1]:
import pandas as pd 
import sqlite3
import sqlalchemy
import os

Dataset

In addition, here, we use FIFA 20 Complete Player Datasetet and European Soccer Database from Kaggle.com.

The datasets that we use from the above likes are:

  • Country.csv
  • League.csv
  • players_19.csv
  • players_20.csv
  • Team.csv
  • teams_and_leagues.csv

Acknowledgements

The datasets also publicly available website sofifa.com.

Creating a SQLite Database

In [2]:
Engine = sqlalchemy.create_engine('sqlite:///FIFA.db')
for file in os.listdir("Data"):
    df = pd.read_csv('Data/' + file, index_col=0)
    df.to_sql(file.replace('.csv',''), Engine, if_exists='replace')
del file, df, Engine
Querying Data From A Table

In this article, we provide a summary of basic SQL Server commands and syntaxes. For sake of demonstrations SQL Server syntaxes, we use SQLite3. Moreover, the following function can help us to mimic SQL Server Environment in this Jupyter notebook.

In [3]:
connection = sqlite3.connect('FIFA.db')
cursor = connection.cursor()

def mySQL(Syntax): return pd.read_sql_query(Syntax, connection)

Here we provide a summary of popular SQL Server command with examples. For full notes, please see the official SQL Server documents at docs.microsoft.com.

Query Data: columns of a table

SELECT column_list FROM Table [into New_Table];

To select all columns use *

SELECT * FROM Table;
In [4]:
mySQL("Select short_name from players_20")
Out[4]:
short_name
0 L. Messi
1 Cristiano Ronaldo
2 Neymar Jr
3 J. Oblak
4 E. Hazard
... ...
18273 Shao Shuai
18274 Xiao Mingjie
18275 Zhang Wei
18276 Wang Haijian
18277 Pan Ximing

18278 rows × 1 columns

Query Data: columns of a table with a filter

SELECT column_list FROM Table
    WHERE Condition;
In [5]:
mySQL("""
Select
    short_name,
    club,
    age
from
    players_20
where
    age <20
""")
Out[5]:
short_name club age
0 M. de Ligt Juventus 19
1 J. Sancho Borussia Dortmund 19
2 João Félix Atlético Madrid 19
3 Vinícius Jr. Real Madrid 18
4 P. Foden Manchester City 19
... ... ... ...
1887 J. Ryan UCD AFC 19
1888 M. Gallagher Finn Harps 18
1889 Huang Jiahui Dalian YiFang FC 18
1890 Zhang Wei Hebei China Fortune FC 19
1891 Wang Haijian Shanghai Greenland Shenhua FC 18

1892 rows × 3 columns

Query Data: distinct entries of some columns of a table

SELECT DISTINCT column_list FROM Table;
In [6]:
mySQL("""
Select
    distinct nationality
from
    players_20
where
    age <20
""")
Out[6]:
nationality
0 Netherlands
1 England
2 Portugal
3 Brazil
4 Italy
... ...
82 Lithuania
83 Latvia
84 Antigua & Barbuda
85 Cyprus
86 Azerbaijan

87 rows × 1 columns

Query Data: Sort the result set in ascending or descending order

SELECT column_list FROM Table
    ORDER BY column_list [ASC |DESC]

Ascending:

In [7]:
mySQL("""
Select
    short_name,
    age
from
    players_20
where
    age <25
order by
    age asc
""")
Out[7]:
short_name age
0 A. Hložek 16
1 Fábio Silva 16
2 E. Millot 16
3 S. Esposito 16
4 A. Velasco 16
... ... ...
8609 Chen Zhechao 24
8610 Ye Chugui 24
8611 M. Nelson 24
8612 D. Retamal 24
8613 H. Altıntaş 24

8614 rows × 2 columns

Descending:

In [8]:
mySQL("""
Select
    short_name,
    age
from
    players_20
where
    age <25
order by
    age desc
""")
Out[8]:
short_name age
0 R. Sterling 24
1 Bernardo Silva 24
2 M. Škriniar 24
3 J. Kimmich 24
4 S. Milinković-Savić 24
... ... ...
8609 D. Burns 16
8610 O. Stefánsson 16
8611 J. Starbuck 16
8612 R. Wikberg 16
8613 E. Sartorius 16

8614 rows × 2 columns

Moreover, orderby has some options Offse and Fetch

SELECT column_list FROM Table
    ORDER BY column_list [ASC |DESC]
        OFFSET offset_row_count {ROW | ROWS}
            FETCH {FIRST | NEXT} fetch_row_count {ROW | ROWS} ONLY

Query Data: Group by

SELECT
    column_list,
    aggregate(a Column)
FROM
    Table
GROUP BY
    column_list;
In [9]:
mySQL("""
Select
    count(sofifa_id),
    age,
    nationality
from
    players_20
Group by
    nationality,
    age
""")
Out[9]:
count(sofifa_id) age nationality
0 1 22 Afghanistan
1 1 27 Afghanistan
2 1 17 Albania
3 4 19 Albania
4 3 20 Albania
... ... ... ...
1731 1 25 Zimbabwe
1732 1 27 Zimbabwe
1733 1 28 Zimbabwe
1734 1 29 Zimbabwe
1735 2 33 Zimbabwe

1736 rows × 3 columns

Query Data: Having

SELECT
    column_list,
    aggregate(a Column)
FROM
    Table
GROUP BY
    column_list
HAVING
    Condition
In [10]:
mySQL("""
Select
    count(sofifa_id),
    age,
    nationality
from
    players_20
Group by
    nationality,
    age
Having
    nationality in ('Scotland')
""")
Out[10]:
count(sofifa_id) age nationality
0 4 17 Scotland
1 11 18 Scotland
2 14 19 Scotland
3 24 20 Scotland
4 20 21 Scotland
5 24 22 Scotland
6 23 23 Scotland
7 19 24 Scotland
8 17 25 Scotland
9 15 26 Scotland
10 21 27 Scotland
11 10 28 Scotland
12 13 29 Scotland
13 9 30 Scotland
14 12 31 Scotland
15 9 32 Scotland
16 13 33 Scotland
17 8 34 Scotland
18 4 35 Scotland
19 5 36 Scotland
20 2 37 Scotland