Part 1: data frames (extra powerful dictionaries)
This workbook does not require you to load any datasets.
We will need to import both pandas and numpy by running the cell below:
import pandas as pd
import numpy as np
Before we analyse any spreadsheet data, let us dig a bit deeper into Data Frames.
Let us create a simple dictionary first:
gradebook = {}
gradebook["Student id"] = ["UP123", "UP124", "UP125", "UP126"]
gradebook["Marks (out of 10)"] = ["10", "5", "7", "6"]
# Display the gradebook
gradebook
{'Student id': ['UP123', 'UP124', 'UP125', 'UP126'],
'Marks (out of 10)': ['10', '5', '7', '6']}
Now, let’s create a Data Frame that holds the same information but offers much more functionality than a dictionary:
df_gradebook = pd.DataFrame()
df_gradebook["Student id"] = ["UP123", "UP124", "UP125", "UP126"]
df_gradebook["Marks (out of 10)"] = ["10", "5", "7", "6"]
# Display the gradebook
df_gradebook
| | Student id | Marks (out of 10) |
|---|---|---|
| 0 | UP123 | 10 |
| 1 | UP124 | 5 |
| 2 | UP125 | 7 |
| 3 | UP126 | 6 |
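Since a Data Frame behaves much like a dictionary of columns, the dictionary built earlier can also be passed straight to `pd.DataFrame`. The sketch below is a minimal illustration of that shortcut; the name `df_from_dict` is introduced here only for the example.

```python
import pandas as pd

# The same gradebook dictionary as above
gradebook = {
    "Student id": ["UP123", "UP124", "UP125", "UP126"],
    "Marks (out of 10)": ["10", "5", "7", "6"],
}

# Each dictionary key becomes a column name and each list becomes that column.
# df_from_dict is a hypothetical name used only in this sketch.
df_from_dict = pd.DataFrame(gradebook)

print(df_from_dict.columns.tolist())  # the column names, in insertion order
print(df_from_dict.shape)             # (number of rows, number of columns)
```

Both routes should give the same table; building column by column, as above, is handy when the columns are produced at different points in a notebook.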
Although we built the Data Frame columns from plain lists, pandas converted each of them to a pandas Series. A Series builds on numpy arrays but adds extra functionality!
type(df_gradebook["Student id"])
pandas.core.series.Series
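To make the relationship between a Series and a numpy array a little more concrete, here is a small sketch using a numeric Series. The Series `marks` and its values are made up for illustration (they mirror the gradebook marks, but as integers), so treat it as an example rather than part of the workbook's data.

```python
import numpy as np
import pandas as pd

# A hypothetical numeric Series, used only for this illustration
marks = pd.Series([10, 5, 7, 6], name="Marks (out of 10)")

# to_numpy() exposes the underlying values as a plain numpy array
values = marks.to_numpy()
print(type(values))

# Extra functionality on top of the array: row labels (the index),
# a name, and convenience methods such as mean()
print(marks.index.tolist())
print(marks.name)
print(marks.mean())
```

The index and the name are what allow pandas to line up and label data automatically, which a bare numpy array does not do.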
To print a single column of the data frame we use:
print(df_gradebook["Marks (out of 10)"])
0 10
1 5
2 7
3 6
Name: Marks (out of 10), dtype: object
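Note the `dtype: object` in the output above: the marks were entered as strings ("10", "5", ...), so pandas treats them as text rather than numbers. If you want to do arithmetic on them, one option is to convert the column first. The sketch below uses `pd.to_numeric` for this; the variable name `numeric_marks` is introduced here just for the example.

```python
import pandas as pd

# Rebuild the gradebook exactly as above (marks stored as strings)
df_gradebook = pd.DataFrame()
df_gradebook["Student id"] = ["UP123", "UP124", "UP125", "UP126"]
df_gradebook["Marks (out of 10)"] = ["10", "5", "7", "6"]

# Convert the text marks to numbers; numeric_marks is a hypothetical name
numeric_marks = pd.to_numeric(df_gradebook["Marks (out of 10)"])

print(numeric_marks.dtype)   # a numeric dtype instead of text
print(numeric_marks.mean())  # the average mark: 7.0
```

Entering the marks as numbers in the first place (for example `[10, 5, 7, 6]`) would avoid the need for the conversion step.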
Exercise 1.1
Create a simple Data Frame that represents the following spreadsheet:

Print each of the columns of this Data Frame.