Week 4 Monday#
Announcements#
The midterm is a week from today. A sample midterm will be posted on Canvas.
The midterm is closed book, closed computer.
The best way to study is to go over the HW, quizzes, and the sample midterm. Next priority would be the lecture notes.
HW 4 distributed today.
HW 3 due tonight.
Plan for today#
list comprehension
f-strings
lambda functions
map, apply, and applymap functions
List comprehension and f-strings#
The expression
pd.to_datetime("10-23-2023").day_name()
produces the string "Monday". Using this idea, make the following length 7 list, and name the result day_list.
["Monday", "Tuesday", ..., "Sunday"]
We don’t need to use the dt accessor here, because we are working with a single value, not with a pandas Series.
import pandas as pd
pd.to_datetime("10-23-2023").day_name()
'Monday'
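For contrast, here is a small sketch of the Series case, where the dt accessor is needed. The variable names (dates, single, names) are made up for this illustration and do not appear elsewhere in the notes.

```python
import pandas as pd

# A pandas Series holding two dates (hypothetical example data)
dates = pd.to_datetime(pd.Series(["10-23-2023", "10-24-2023"]))

# Single value: a Timestamp has the day_name method directly
single = pd.to_datetime("10-23-2023").day_name()

# Series: reach the same method through the dt accessor
names = dates.dt.day_name()

print(single)
print(list(names))
```

Calling dates.day_name() without dt would raise an AttributeError, because the method lives on the accessor rather than on the Series itself.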
Let’s slowly build up to making this list. Here we get the day numbers we will use.
for i in range(23,30):
    print(i)
23
24
25
26
27
28
29
The following is not going to work… how would Python know that the i inside the string was a variable?
# attempt to generate strings like '10-23-2023' by changing the day
for i in range(23,30):
    print('10-i-2023')
10-i-2023
10-i-2023
10-i-2023
10-i-2023
10-i-2023
10-i-2023
10-i-2023
The code above runs without an error, but it prints the literal letter i every time. To fix this, we convert i to a string using str and concatenate the pieces. This works, but we will see a much more elegant way below.
for i in range(23,30):
    print('10-' + str(i) + '-2023')
10-23-2023
10-24-2023
10-25-2023
10-26-2023
10-27-2023
10-28-2023
10-29-2023
Here is the exact same thing, but using f-strings (a way to embed expressions inside a string). (These were added in Python 3.6, released in 2016.) Notice the two changes: (1) we put the letter f before the opening quotation mark, and (2) we put the variable inside curly brackets.
# f-string way
for i in range(23,30):
    print(f"10-{i}-2023")
10-23-2023
10-24-2023
10-25-2023
10-26-2023
10-27-2023
10-28-2023
10-29-2023
Use the to_datetime function and the day_name method to get the strings for the days of the week.
for i in range(23,30):
    print(pd.to_datetime(f"10-{i}-2023").day_name())
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday
Sunday
# start with an empty list and append one day name per iteration
day_list = []
for i in range(23,30):
    day_list.append(pd.to_datetime(f"10-{i}-2023").day_name())
day_list
['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
The list comprehension approach is much more elegant. The most basic example of list comprehension is [A for B in C], where A is what should go into the list, B is the variable name, and C is whatever we are iterating over. The best way to gain comfort with list comprehension is to do the same thing using a for loop, and compare. For example, compare the following to what we just did.
day_list2 = [pd.to_datetime(f"10-{i}-2023").day_name() for i in range(23,30)]
day_list2
['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
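As another side-by-side comparison, here is a minimal sketch with plain numbers instead of dates. The variable names are made up for this illustration. The last line uses an optional if condition at the end of the comprehension, which goes slightly beyond the basic [A for B in C] pattern described above.

```python
# for loop version: build the list one element at a time
squares_loop = []
for n in range(5):
    squares_loop.append(n**2)

# list comprehension version: A is n**2, B is n, C is range(5)
squares_comp = [n**2 for n in range(5)]

# optional filter: keep only the even values of n
even_squares = [n**2 for n in range(5) if n % 2 == 0]

print(squares_loop)
print(squares_comp)
print(even_squares)
```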
Use f-strings when you need to embed variables or expressions inside strings, especially when readability and concise syntax are priorities.
Use str primarily to convert a value of another data type into a string. If you’re working in environments with Python versions earlier than 3.6, you’ll need to avoid f-strings.
name = "Alice"
age = 30
print(f"{name} is {age} years old.")
Alice is 30 years old.
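The curly brackets can hold any expression, not only a variable name. The sketch below also shows a format specifier (the part after the colon), which is a standard f-string feature that was not covered above. The variable names are made up for this illustration.

```python
i = 5

# An expression inside the curly brackets is evaluated first
doubled = f"{i} doubled is {i * 2}"

# The format specifier 02d pads the integer with zeros to two digits
padded = f"10-{i:02d}-2023"

# The same first string built with str and +, for comparison
concat = str(i) + " doubled is " + str(i * 2)

print(doubled)
print(padded)
print(doubled == concat)
```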
Lambda Functions#
A lambda function is a small anonymous function. A lambda function can take any number of arguments, but can only have one expression.
f = lambda x: x + 10 # input:x, output: x+10
f(5)
15
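Here is a short sketch of two points from the description above: a lambda function can take more than one argument, and it does the same job as a one-line def. The last example passes a lambda function as the key argument of the built-in sorted function, a common place to see one. All names are made up for this illustration.

```python
# A lambda function with two arguments
add = lambda x, y: x + y

# The equivalent function written with def
def add_def(x, y):
    return x + y

# A lambda function used without ever being given a name:
# sort the words by their length
words = ["pandas", "map", "apply"]
by_length = sorted(words, key=lambda w: len(w))

print(add(2, 3), add_def(2, 3))
print(by_length)
```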
map, apply, and applymap functions#
map: Works element-wise on a Series.
apply: Works on a row/column basis of a DataFrame.
applymap: Works element-wise on a DataFrame.
import pandas as pd
# Sample DataFrame
data = {
'A': [1, 2, 3],
'B': [4, 5, 6],
'C': [7, 8, 9]
}
df = pd.DataFrame(data)
print(df)
A B C
0 1 4 7
1 2 5 8
2 3 6 9
df['A'].map(lambda x: x*2)
0 2
1 4
2 6
Name: A, dtype: int64
# Using apply with a DataFrame's column (a Series), where it also acts element-wise:
df['A'].apply(lambda x: x*2)
0 2
1 4
2 6
Name: A, dtype: int64
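For a function applied to a single column, map and apply give the same result, as the two outputs above suggest. One difference worth knowing is that map also accepts a dictionary. The sketch below checks both points. The Series s and the other names are made up for this illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3], name="A")

# With a function, map and apply agree on a Series
doubled_map = s.map(lambda x: x * 2)
doubled_apply = s.apply(lambda x: x * 2)
same = doubled_map.equals(doubled_apply)

# map also accepts a dictionary; values with no matching key become NaN
labels = s.map({1: "one", 2: "two"})

print(same)
print(labels)
```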
df
| | A | B | C |
|---|---|---|---|
| 0 | 1 | 4 | 7 |
| 1 | 2 | 5 | 8 |
| 2 | 3 | 6 | 9 |
Using apply with DataFrame’s row:
If you set axis=1, you can apply a function that acts on each row. The function will receive a row (as a Series), and you can access its columns.
With axis=0 (the default), the function receives one column at a time, so it works down the rows, like a “drop”. With axis=1, the function receives one row at a time, so it works across the columns, like a “slide”.
df["sum"] = df.apply(lambda row: row["A"] + row["B"] + row["C"], axis = 1)
df
| | A | B | C | sum |
|---|---|---|---|---|
| 0 | 1 | 4 | 7 | 12 |
| 1 | 2 | 5 | 8 | 15 |
| 2 | 3 | 6 | 9 | 18 |
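To see the two axis values side by side, here is a minimal sketch that sums the same data both ways. It builds a fresh DataFrame named df_demo (a made-up name) so that the sum column added above does not get in the way.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [4, 5, 6],
    "C": [7, 8, 9],
})

# axis=0 (the default): the function receives each column as a Series
col_sums = df_demo.apply(lambda col: col.sum(), axis=0)

# axis=1: the function receives each row as a Series
row_sums = df_demo.apply(lambda row: row.sum(), axis=1)

print(col_sums)  # one value per column: A, B, C
print(row_sums)  # one value per row: 0, 1, 2
```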
The applymap function is used to apply a function to each individual element in the entire DataFrame.
df.applymap(lambda x: x**2) #square
| | A | B | C | sum |
|---|---|---|---|---|
| 0 | 1 | 16 | 49 | 144 |
| 1 | 4 | 25 | 64 | 225 |
| 2 | 9 | 36 | 81 | 324 |
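A note on versions: starting with pandas 2.1, DataFrame.applymap is deprecated in favor of DataFrame.map, which does the same element-wise job. The sketch below picks whichever method is available, so it should run on both older and newer installations. The names df_demo and squared are made up for this illustration.

```python
import pandas as pd

df_demo = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# DataFrame.map exists in pandas 2.1 and later; fall back to applymap otherwise
if hasattr(pd.DataFrame, "map"):
    squared = df_demo.map(lambda x: x**2)
else:
    squared = df_demo.applymap(lambda x: x**2)

print(squared)
```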