Week 4 Monday

Announcements#

The midterm is a week from today. A sample midterm will be posted on Canvas.
The midterm is closed book, closed computer.
The best way to study is to go over the HW, quizzes, and the sample midterm. Next priority would be the lecture notes.
HW 4 distributed today.
HW 3 due tonight.

Plan for today#

list comprehension
f-strings
lambda functions
map, apply and applymap functions

List comprehension and f-strings#

The expression pd.to_datetime("10-23-2023").day_name() produces the string "Monday". Using this idea, make the following length 7 list, and name the result day_list.

["Monday", "Tuesday", ..., "Sunday"]

We don’t need to use the dt accessor here, because we are working with a single value, not with a pandas Series.

import pandas as pd
pd.to_datetime("10-23-2023").day_name()

'Monday'
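
To see the contrast with a pandas Series, here is a minimal sketch. The Series dates and its two sample values are made up for illustration, and the explicit format argument is an assumption added only to make the parsing unambiguous.

```python
import pandas as pd

# Hypothetical two-element Series of dates (not part of the lecture data)
dates = pd.Series(pd.to_datetime(["10-23-2023", "10-24-2023"], format="%m-%d-%Y"))

# A single Timestamp has the day_name method directly
single = pd.to_datetime("10-23-2023").day_name()

# A Series needs the dt accessor to reach the same method
names = dates.dt.day_name()

print(single)
print(list(names))
```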

Let’s slowly build up to making this list. Here we get the day numbers we will use.

for i in range(23,30):
    print(i)

The following is not going to work… how would Python know that the i inside the string was a variable?

# generate string '10-23-2023' by changing the date

for i in range(23,30):
    print('10-i-2023')

10-i-2023
10-i-2023
10-i-2023
10-i-2023
10-i-2023
10-i-2023
10-i-2023

The loop above runs without an error, but it prints the literal text 10-i-2023 every time. One fix is to convert i to a string using str and concatenate the pieces. This works, but we will see a much more elegant way below.

for i in range(23,30):
    print('10-' + str(i) + '-2023')

Here is the exact same thing, but using f-strings (a way to embed expressions inside a string). (These were added in Python 3.6, released in 2016.) Notice the two changes: (1) We put the letter f before the opening quotation mark, and (2) we put the variable inside curly brackets.

# f-string way
for i in range(23,30):
    print(f"10-{i}-2023")

Use the to_datetime function and the day_name method to get the strings for the days of the week.

for i in range(23,30):
    print(pd.to_datetime(f"10-{i}-2023").day_name())

Monday
Tuesday
Wednesday
Thursday
Friday
Saturday
Sunday

Instead of printing, we can collect the strings in a list: start with an empty list and append each day name.

day_list = []
for i in range(23,30):
    day_list.append(pd.to_datetime(f"10-{i}-2023").day_name())

day_list

['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

The list comprehension approach is much more elegant. The most basic example of list comprehension is [A for B in C], where A is what should go into the list, B is the variable name, and C is whatever we are iterating over. The best way to gain comfort with list comprehension is to do the same thing using a for loop, and compare. For example, compare the following to what we just did.

day_list2 = [pd.to_datetime(f"10-{i}-2023").day_name() for i in range(23,30)]
day_list2

['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
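
As a smaller side-by-side comparison of the [A for B in C] pattern, the following sketch builds the same list of squares twice. The names squares_loop and squares_comp are chosen here for illustration only.

```python
# for loop version: start with an empty list and append
squares_loop = []
for n in range(5):
    squares_loop.append(n**2)

# list comprehension version: A is n**2, B is n, C is range(5)
squares_comp = [n**2 for n in range(5)]

print(squares_loop == squares_comp)
```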

Use f-strings when you need to embed variables or expressions inside strings, especially when readability and concise syntax are priorities.

Use str when you only need to convert a single value of another type to a string. If you are working with a Python version earlier than 3.6, f-strings are not available, so you will need str with concatenation (or the format method) instead.

name = "Alice"
age = 30
print(f"{name} is {age} years old.")

Alice is 30 years old.
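
The curly brackets can hold any expression, not only a variable name, and an optional format specification can follow a colon. The sketch below illustrates this; the variable price and its value are hypothetical.

```python
name = "Alice"
age = 30
price = 3.14159  # hypothetical value, used only to show number formatting

# Expressions (method calls, arithmetic) are evaluated inside the brackets
s1 = f"{name.upper()} will be {age + 1} next year."

# A format specification after the colon: two digits after the decimal point
s2 = f"{price:.2f}"

# The same sentence built with str and concatenation, for comparison
s3 = name + " is " + str(age) + " years old."

print(s1)
print(s2)
print(s3 == f"{name} is {age} years old.")
```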

Lambda Functions#

A lambda function is a small anonymous function. A lambda function can take any number of arguments, but can only have one expression.

f = lambda x: x + 10  # input: x, output: x + 10
f(5)

15
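
For comparison, here is a sketch of the same function written with def, a lambda function with two arguments, and a lambda function passed directly to another function. The names add_ten, g, and words are illustrative and not from the lecture.

```python
# The lambda function from above
f = lambda x: x + 10

# The equivalent function written with def
def add_ten(x):
    return x + 10

# Two arguments, still a single expression
g = lambda x, y: x * y

# Lambda functions are often passed as arguments without ever being named
words = ["pear", "fig", "banana"]
by_length = sorted(words, key=lambda w: len(w))

print(f(5), add_ten(5), g(3, 4))
print(by_length)
```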

`map`, `apply` and `applymap` functions#

map: works element-wise on a Series.
apply: works along an axis of a DataFrame (column by column by default, or row by row with axis=1). On a Series it works element-wise, like map.
applymap: works element-wise on a DataFrame.

import pandas as pd

# Sample DataFrame
data = {
    'A': [1, 2, 3],
    'B': [4, 5, 6],
    'C': [7, 8, 9]
}

df = pd.DataFrame(data)
print(df)

df['A'].map(lambda x: x*2)

0    2
1    4
2    6
Name: A, dtype: int64

# Using apply with a single DataFrame column (a Series):
df['A'].apply(lambda x: x*2)

0    2
1    4
2    6
Name: A, dtype: int64
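
With a lambda function, map and apply give the same result on a Series, as the two outputs above show. One difference is that Series.map also accepts a dictionary, which is looked up value by value. A minimal sketch, using a standalone Series s and a made-up dictionary of labels:

```python
import pandas as pd

# Standalone copy of column A (an assumption, to keep this block self-contained)
s = pd.Series([1, 2, 3], name="A")

doubled_map = s.map(lambda x: x * 2)
doubled_apply = s.apply(lambda x: x * 2)

# map can also take a dictionary; each value is replaced by its dictionary entry
labels = s.map({1: "one", 2: "two", 3: "three"})

print(doubled_map.equals(doubled_apply))
print(labels.tolist())
```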

df

	A	B	C
0	1	4	7
1	2	5	8
2	3	6	9

Using apply with DataFrame’s row: If you set axis=1, you can apply a function that acts on each row. The function will receive a row (as a Series), and you can access its columns.

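With axis=0 (the default), the function receives each column in turn, so the computation runs down the rows, like a “drop”. With axis=1, the function receives each row in turn, so the computation runs across the columns, like a “slide”.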

df["sum"] = df.apply(lambda row: row["A"] + row["B"] + row["C"], axis=1)
df

	A	B	C	sum
0	1	4	7	12
1	2	5	8	15
2	3	6	9	18
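
To compare the two axis settings directly, the following sketch applies the same kind of function both ways to a fresh copy of the original three-column DataFrame. The name df_axis is hypothetical and is used so that df is left unchanged.

```python
import pandas as pd

# Fresh copy of the sample data, without the sum column
df_axis = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})

# axis=0: the function receives each column, giving one result per column
col_sums = df_axis.apply(lambda col: col.sum(), axis=0)

# axis=1: the function receives each row, giving one result per row
row_sums = df_axis.apply(lambda row: row.sum(), axis=1)

print(col_sums)
print(row_sums)
```

The result for axis=0 is indexed by the column names A, B, C, while the result for axis=1 is indexed by the row labels 0, 1, 2.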

The applymap function is used to apply a function to each individual element in the entire DataFrame. Because df now includes the sum column, that column is squared as well.

df.applymap(lambda x: x**2)  # square every entry

	A	B	C	sum
0	1	16	49	144
1	4	25	64	225
2	9	36	81	324
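
In pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map, which does the same element-wise job. The sketch below picks whichever method the installed version provides; the names df_sq and elementwise are illustrative, and the hasattr check is just one possible way to handle both versions.

```python
import pandas as pd

# Small standalone DataFrame (an assumption, to keep this block self-contained)
df_sq = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Use DataFrame.map when it exists (pandas 2.1+), otherwise fall back to applymap
elementwise = df_sq.map if hasattr(df_sq, "map") else df_sq.applymap

squared = elementwise(lambda x: x**2)
print(squared)
```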
