C3-4: Pandas
August 29, 2019•1,003 words
L1: Anaconda
L2: Jupyter Notebooks
L3: NumPy
L4: Pandas
Creating Pandas Series
import pandas as pd # Import Pandas
my_series = pd.Series(data = ['data1', 'data2', 'data3', 'data4'], index = ['row1', 'row2', 'row3', 'row4']) # Create a Series
my_series # Display Series
my_series.shape # Show shape as tuple
my_series.ndim # Dimensions as int
my_series.size # Number of elements
my_series.index # Show index
my_series.values # Show values
x = 'bananas' in my_series # Check if 'bananas' is an index label, returns True or False
# We import Pandas as pd into Python
import pandas as pd
# We create a Pandas Series that stores a grocery list
groceries = pd.Series(data = [30, 6, 'Yes', 'No'], index = ['eggs', 'apples', 'milk', 'bread'])
# We display the Groceries Pandas Series
groceries
eggs 30
apples 6
milk Yes
bread No
dtype: object
# We print some information about Groceries
print('Groceries has shape:', groceries.shape)
print('Groceries has dimension:', groceries.ndim)
print('Groceries has a total of', groceries.size, 'elements')
Groceries has shape: (4,)
Groceries has dimension: 1
Groceries has a total of 4 elements
# We print the index and data of Groceries
print('The data in Groceries is:', groceries.values)
print('The index of Groceries is:', groceries.index)
The data in Groceries is: [30 6 'Yes' 'No']
The index of Groceries is: Index(['eggs', 'apples', 'milk', 'bread'], dtype='object')
# We check whether bananas is a food item (an index) in Groceries
x = 'bananas' in groceries
# We check whether bread is a food item (an index) in Groceries
y = 'bread' in groceries
# We print the results
print('Is bananas an index label in Groceries:', x)
print('Is bread an index label in Groceries:', y)
Is bananas an index label in Groceries: False
Is bread an index label in Groceries: True
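As the example above shows, the in operator on a Series checks the index labels, not the stored data. The short sketch below makes that contrast explicit. It reuses the same grocery list; the variable names label_found and value_found are only illustrative.

```python
import pandas as pd

groceries = pd.Series(data=[30, 6, 'Yes', 'No'],
                      index=['eggs', 'apples', 'milk', 'bread'])

# `in` on a Series looks at the index labels only.
# 'Yes' is a stored value, not a label, so this is False.
label_found = 'Yes' in groceries

# To test the stored data instead, check against .values
value_found = 'Yes' in groceries.values

print('Is Yes an index label in Groceries:', label_found)   # False
print('Is Yes a value in Groceries:', value_found)          # True
```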
Accessing and Deleting Elements in Pandas
my_series['apples'] # Access element using a single label
my_series[['apples', 'pears']] # Access elements using a list of two or more labels
.loc | .iloc # The .loc attribute stands for location and states explicitly that we are indexing by label. Similarly, the .iloc attribute stands for integer location and states explicitly that we are indexing by numerical position. Let's see some examples:
# We access elements in Groceries using index labels:
# We use a single index label
print('How many eggs do we need to buy:', groceries['eggs'])
print()
# We can access multiple index labels
print('Do we need milk and bread:\n', groceries[['milk', 'bread']])
print()
# We use loc to access multiple index labels
print('How many eggs and apples do we need to buy:\n', groceries.loc[['eggs', 'apples']])
print()
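# We access elements in Groceries using numerical indices:
# (newer pandas versions deprecate positional access through plain [] on a labeled Series; .iloc is the explicit form)
# We use multiple numerical indices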
print('How many eggs and apples do we need to buy:\n', groceries[[0, 1]])
print()
# We use a negative numerical index
print('Do we need bread:\n', groceries[[-1]])
print()
# We use a single numerical index
print('How many eggs do we need to buy:', groceries[0])
print()
# We use iloc to access multiple numerical indices
print('Do we need milk and bread:\n', groceries.iloc[[2, 3]])
How many eggs do we need to buy: 30
Do we need milk and bread:
milk Yes
bread No
dtype: object
How many eggs and apples do we need to buy:
eggs 30
apples 6
dtype: object
How many eggs and apples do we need to buy:
eggs 30
apples 6
dtype: object
Do we need bread:
bread No
dtype: object
How many eggs do we need to buy: 30
Do we need milk and bread:
milk Yes
bread No
dtype: object
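Being explicit with .loc and .iloc matters most when the index labels are themselves integers, because a label and a position can then point at different elements. A minimal sketch, assuming a hypothetical Series named stock that is not part of the grocery example:

```python
import pandas as pd

# Hypothetical Series whose integer labels are not in positional order
stock = pd.Series(data=['tea', 'rice', 'salt'], index=[2, 0, 1])

# .loc looks up the element whose label is 0
by_label = stock.loc[0]

# .iloc looks up the element in the first position
by_position = stock.iloc[0]

print('Label 0 holds:', by_label)        # rice
print('Position 0 holds:', by_position)  # tea
```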
Pandas Series are also mutable like NumPy ndarrays, which means we can change the elements of a Pandas Series after it has been created. For example, let's change the number of eggs we need to buy from our grocery list.
# We display the original grocery list
print('Original Grocery List:\n', groceries)
# We change the number of eggs to 2
groceries['eggs'] = 2
# We display the changed grocery list
print()
print('Modified Grocery List:\n', groceries)
Original Grocery List:
eggs 30
apples 6
milk Yes
bread No
dtype: object
Modified Grocery List:
eggs 2
apples 6
milk Yes
bread No
dtype: object
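The same kind of assignment also works through .loc and .iloc, which keeps it clear whether we are addressing a label or a position. A small sketch on a fresh copy of the grocery list; the new values are made up for illustration:

```python
import pandas as pd

groceries = pd.Series(data=[30, 6, 'Yes', 'No'],
                      index=['eggs', 'apples', 'milk', 'bread'])

# Change one element by label
groceries.loc['apples'] = 12

# Change the last element by position
groceries.iloc[-1] = 'Yes'

# Change several elements at once by passing a list of labels
groceries.loc[['eggs', 'milk']] = [2, 'No']

print(groceries)
```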
We can also delete items from a Pandas Series with the .drop() method. Series.drop(label) removes the given label from the Series. Note that Series.drop(label) works out of place: it returns a new Series without that label and leaves the original Series unchanged. Let's see how this works:
# We display the original grocery list
print('Original Grocery List:\n', groceries)
# We remove apples from our grocery list. The drop function removes elements out of place
print()
print('We remove apples (out of place):\n', groceries.drop('apples'))
# When we remove elements out of place the original Series remains intact. To see this
# we display our grocery list again
print()
print('Grocery List after removing apples out of place:\n', groceries)
Original Grocery List:
eggs 30
apples 6
milk Yes
bread No
dtype: object
We remove apples (out of place):
eggs 30
milk Yes
bread No
dtype: object
Grocery List after removing apples out of place:
eggs 30
apples 6
milk Yes
bread No
dtype: object
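Because .drop() returns a new Series, we can keep the result by assigning it to a variable. The sketch below also passes a list of labels to drop more than one item at once. The name trimmed is only illustrative.

```python
import pandas as pd

groceries = pd.Series(data=[30, 6, 'Yes', 'No'],
                      index=['eggs', 'apples', 'milk', 'bread'])

# Drop two labels at once and keep the result in a new variable
trimmed = groceries.drop(['apples', 'milk'])

print('Labels in the new Series:', list(trimmed.index))   # ['eggs', 'bread']
print('Elements in the original Series:', groceries.size) # 4
```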
We can delete items from a Pandas Series in place by setting the keyword inplace to True in the .drop() method. Let's see an example:
# We display the original grocery list
print('Original Grocery List:\n', groceries)
# We remove apples from our grocery list in place by setting the inplace keyword to True
groceries.drop('apples', inplace = True)
# When we remove elements in place the original Series is modified. To see this
# we display our grocery list again
print()
print('Grocery List after removing apples in place:\n', groceries)
Original Grocery List:
eggs 30
apples 6
milk Yes
bread No
dtype: object
Grocery List after removing apples in place:
eggs 30
milk Yes
bread No
dtype: object
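One detail worth keeping in mind: with inplace = True the .drop() method returns None, so its result should not be assigned back to the Series. The sketch below shows this on a fresh copy of the grocery list, and also uses the errors keyword of Series.drop to skip a label that is not present ('pears' is a made-up label here).

```python
import pandas as pd

groceries = pd.Series(data=[30, 6, 'Yes', 'No'],
                      index=['eggs', 'apples', 'milk', 'bread'])

# With inplace=True the Series itself is changed and the call returns None
result = groceries.drop('apples', inplace=True)
print('Return value of in-place drop:', result)   # None

# Dropping a missing label raises KeyError by default;
# errors='ignore' skips labels that are not in the index
groceries.drop('pears', inplace=True, errors='ignore')

print('Remaining labels:', list(groceries.index))  # ['eggs', 'milk', 'bread']
```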