C3-4: Pandas


L1: Anaconda


L2: Jupyter Notebooks


L3: NumPy


L4: Pandas

Creating Pandas Series

import pandas as pd # Import Pandas

my_series = pd.Series(data = ['data1', 'data2', 'data3', 'data4'], index = ['row1', 'row2', 'row3', 'row4']) # Create Series

my_series # Display Series

my_series.shape # Show shape as tuple

my_series.ndim # Dimensions as int

my_series.size # Number of elements

my_series.index # Show index

my_series.values # Show values

x = 'bananas' in my_series # Check if 'bananas' is an index label, returns True or False

# We import Pandas as pd into Python
import pandas as pd

# We create a Pandas Series that stores a grocery list
groceries = pd.Series(data = [30, 6, 'Yes', 'No'], index = ['eggs', 'apples', 'milk', 'bread'])

# We display the Groceries Pandas Series
groceries

eggs 30
apples 6
milk Yes
bread No
dtype: object
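
The dtype: object line appears because the grocery list mixes integers and strings. A minimal sketch of the difference, assuming a second, hypothetical Series called quantities that holds integers only (it is not part of the original notes):

```python
import pandas as pd

# Mixed integers and strings: pandas falls back to the generic object dtype
groceries = pd.Series(data=[30, 6, 'Yes', 'No'],
                      index=['eggs', 'apples', 'milk', 'bread'])

# Integers only: pandas can use a numeric dtype instead
# (quantities is a hypothetical Series used only for this comparison)
quantities = pd.Series(data=[30, 6], index=['eggs', 'apples'])

print(groceries.dtype)
print(quantities.dtype)

# Check the dtype families without depending on the exact dtype name
mixed_is_object = pd.api.types.is_object_dtype(groceries.dtype)
ints_are_integer = pd.api.types.is_integer_dtype(quantities.dtype)
```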

# We print some information about Groceries
print('Groceries has shape:', groceries.shape)
print('Groceries has dimension:', groceries.ndim)
print('Groceries has a total of', groceries.size, 'elements')

Groceries has shape: (4,)
Groceries has dimension: 1
Groceries has a total of 4 elements

# We print the index and data of Groceries
print('The data in Groceries is:', groceries.values)
print('The index of Groceries is:', groceries.index)

The data in Groceries is: [30 6 'Yes' 'No']
The index of Groceries is: Index(['eggs', 'apples', 'milk', 'bread'], dtype='object')

# We check whether bananas is a food item (an index) in Groceries
x = 'bananas' in groceries

# We check whether bread is a food item (an index) in Groceries
y = 'bread' in groceries

# We print the results
print('Is bananas an index label in Groceries:', x)
print('Is bread an index label in Groceries:', y)

Is bananas an index label in Groceries: False
Is bread an index label in Groceries: True
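
One detail worth spelling out: the in operator on a Series checks the index labels, not the stored values. A small sketch of the difference, reusing the same grocery list (the variable names are only for illustration):

```python
import pandas as pd

# Same grocery list as in the notes above
groceries = pd.Series(data=[30, 6, 'Yes', 'No'],
                      index=['eggs', 'apples', 'milk', 'bread'])

# `in` on a Series tests the index labels, not the stored values
label_hit = 'bread' in groceries        # 'bread' is an index label
value_as_label = 'Yes' in groceries     # 'Yes' is a value, not a label

# To test the stored values, check against .values instead
value_hit = 'Yes' in groceries.values

print(label_hit, value_as_label, value_hit)
```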

Accessing and Deleting Elements in Pandas

my_series['apples'] # Access element using a single label

my_series[['apples', 'pears']] # Access elements using a list of two or more labels

.loc | .iloc # The attribute .loc stands for location and explicitly states that we are using an index label. Similarly, the attribute .iloc stands for integer location and explicitly states that we are using a numerical (positional) index. Let's see some examples:

# We access elements in Groceries using index labels:

# We use a single index label
print('How many eggs do we need to buy:', groceries['eggs'])
print()

# we can access multiple index labels
print('Do we need milk and bread:\n', groceries[['milk', 'bread']]) 
print()

# we use loc to access multiple index labels
print('How many eggs and apples do we need to buy:\n', groceries.loc[['eggs', 'apples']]) 
print()

# We access elements in Groceries using numerical indices:

# we use multiple numerical indices
print('How many eggs and apples do we need to buy:\n',  groceries[[0, 1]]) 
print()

# We use a negative numerical index
print('Do we need bread:\n', groceries[[-1]]) 
print()

# We use a single numerical index
print('How many eggs do we need to buy:', groceries[0]) 
print()

# We use iloc to access multiple numerical indices
print('Do we need milk and bread:\n', groceries.iloc[[2, 3]]) 

How many eggs do we need to buy: 30

Do we need milk and bread:
milk Yes
bread No
dtype: object

How many eggs and apples do we need to buy:
eggs 30
apples 6
dtype: object

How many eggs and apples do we need to buy:
eggs 30
apples 6
dtype: object

Do we need bread:
bread No
dtype: object

How many eggs do we need to buy: 30

Do we need milk and bread:
milk Yes
bread No
dtype: object
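
As a side-by-side sketch, here is the same kind of access written only with .loc and .iloc. To my knowledge, recent pandas releases (2.1 and later) warn when plain square brackets are given integer positions on a Series whose index is not made of integers, so treating .iloc as the safer spelling for positional access is an assumption worth checking against your installed version.

```python
import pandas as pd

groceries = pd.Series(data=[30, 6, 'Yes', 'No'],
                      index=['eggs', 'apples', 'milk', 'bread'])

# .loc always interprets its argument as index labels
eggs_by_label = groceries.loc['eggs']
milk_and_bread = groceries.loc[['milk', 'bread']]

# .iloc always interprets its argument as integer positions
eggs_by_position = groceries.iloc[0]
last_item = groceries.iloc[-1]
first_two = groceries.iloc[[0, 1]]

print(eggs_by_label, eggs_by_position, last_item)
print(milk_and_bread)
print(first_two)
```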

Like NumPy ndarrays, Pandas Series are mutable, which means we can change the elements of a Pandas Series after it has been created. For example, let's change the number of eggs we need to buy from our grocery list.

# We display the original grocery list
print('Original Grocery List:\n', groceries)

# We change the number of eggs to 2
groceries['eggs'] = 2

# We display the changed grocery list
print()
print('Modified Grocery List:\n', groceries)

Original Grocery List:
eggs 30
apples 6
milk Yes
bread No
dtype: object

Modified Grocery List:
eggs 2
apples 6
milk Yes
bread No
dtype: object
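
Assignment also works through .loc and .iloc. A short sketch, assuming we want to change one element by label and another by position (the new quantities are made up for illustration):

```python
import pandas as pd

groceries = pd.Series(data=[30, 6, 'Yes', 'No'],
                      index=['eggs', 'apples', 'milk', 'bread'])

# Change an element by its index label
groceries.loc['eggs'] = 2

# Change an element by its integer position (position 1 is 'apples')
groceries.iloc[1] = 12

print(groceries)
```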

We can also delete items from a Pandas Series with the .drop() method. Series.drop(label) removes the element with the given label. Note that it works out of place: it returns a new Series without that element and does not change the original Series. Let's see how this works:

# We display the original grocery list
print('Original Grocery List:\n', groceries)

# We remove apples from our grocery list. The drop function removes elements out of place
print()
print('We remove apples (out of place):\n', groceries.drop('apples'))

# When we remove elements out of place the original Series remains intact. To see this
# we display our grocery list again
print()
print('Grocery List after removing apples out of place:\n', groceries)

Original Grocery List:
eggs 30
apples 6
milk Yes
bread No
dtype: object

We remove apples (out of place):
eggs 30
milk Yes
bread No
dtype: object

Grocery List after removing apples out of place:
eggs 30
apples 6
milk Yes
bread No
dtype: object
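
Because .drop() returns a new Series, the usual pattern is to store the result in a variable. A minimal sketch, with remaining and staples as hypothetical names; it also passes a list to drop several labels at once:

```python
import pandas as pd

groceries = pd.Series(data=[30, 6, 'Yes', 'No'],
                      index=['eggs', 'apples', 'milk', 'bread'])

# Keep the result of the out-of-place drop in a new variable
remaining = groceries.drop('apples')

# A list of labels drops several elements in one call
staples = groceries.drop(['apples', 'eggs'])

print(remaining)
print(staples)
print(groceries)  # the original Series still has all four items
```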

We can delete items from a Pandas Series in place by setting the keyword inplace to True in the .drop() method. Let's see an example:

# We display the original grocery list
print('Original Grocery List:\n', groceries)

# We remove apples from our grocery list in place by setting the inplace keyword to True
groceries.drop('apples', inplace = True)

# When we remove elements in place the original Series is modified. To see this
# we display our grocery list again
print()
print('Grocery List after removing apples in place:\n', groceries)

Original Grocery List:
eggs 30
apples 6
milk Yes
bread No
dtype: object

Grocery List after removing apples in place:
eggs 30
milk Yes
bread No
dtype: object
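
A common pitfall with inplace = True is assigning the return value: the in-place call modifies the Series and returns None. A short sketch (result is a hypothetical variable name):

```python
import pandas as pd

groceries = pd.Series(data=[30, 6, 'Yes', 'No'],
                      index=['eggs', 'apples', 'milk', 'bread'])

# The in-place drop changes groceries itself and returns None
result = groceries.drop('apples', inplace=True)

print(result)     # None
print(groceries)  # apples is gone from the original Series
```

So writing groceries = groceries.drop('apples', inplace = True) would replace the Series with None; use either the assignment or the inplace keyword, not both.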

Arithmetic Operations on Pandas Series

