Tuesday, April 26, 2022

Python - Handle CSV file and pandas (Day 25)

This is a 100 Days challenge to learn a new language (Python). 100 Days of Code - The Complete Python Pro Bootcamp 

I will post some notes to motivate myself to finish this challenge.


Read a CSV file



Using readlines of file object

Ex:
# Open file
with open("./weather_data.csv", encoding="utf-8") as csv_file:
    # Using readlines function of file object
    # to read all the lines of a file in a list
    data = csv_file.readlines()
    print(data)

Result:
 [
'day,temp,condition\n',
'Monday,12,Sunny\n',
'Tuesday,14,Rain\n',
'Wednesday,15,Rain\n',
'Thursday,14,Cloudy\n',
'Friday,21,Sunny\n',
'Saturday,22,Sunny\n',
'Sunday,24,Sunny'
]

As the result shows, we would need to spend a lot of effort cleaning up the data (commas and newline characters) before analyzing it.
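
To give a feel for that cleanup, here is a minimal sketch that strips the newline and splits each line on commas by hand. The sample lines are copied from the result above and the variable names are only illustrative. This naive split is an assumption that works for this data; it would break on quoted fields that contain commas.

```python
# Sample of what readlines() returned above (first three lines only)
raw_lines = [
    "day,temp,condition\n",
    "Monday,12,Sunny\n",
    "Tuesday,14,Rain\n",
]

# Strip the trailing newline, then split each line on commas
rows = [line.strip().split(",") for line in raw_lines]

header, records = rows[0], rows[1:]
print(header)   # ['day', 'temp', 'condition']
print(records)  # [['Monday', '12', 'Sunny'], ['Tuesday', '14', 'Rain']]
```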


Utilize csv module



The csv module implements classes to read and write tabular data in CSV format.

Ex:
import csv

# Open a file
with open("./weather_data.csv", encoding="utf-8") as csv_file:
    # csv.reader() will return a reader object
    # which will iterate over lines
    data = csv.reader(csv_file)

    # Go through each line
    for row in data:
        print(row)

Result:
 ['day', 'temp', 'condition']
   ['Monday', '12', 'Sunny']
   ['Tuesday', '14', 'Rain']
   ['Wednesday', '15', 'Rain']
   ['Thursday', '14', 'Cloudy']
   ['Friday', '21', 'Sunny']
   ['Saturday', '22', 'Sunny']
   ['Sunday', '24', 'Sunny']

Now the result looks much better. Using the csv module saves us a lot of time.

But what if we want to get all the temperatures from the previous result and save them into a list? Is there a better way to handle it?

Ex:
import csv

# Open a file
with open("./weather_data.csv", encoding="utf-8") as csv_file:
    # csv.reader() will return a reader object
    # which will iterate over lines
    data = csv.reader(csv_file)

    # Define a list
    temperature = []

    # Go through each line
    for row in data:
        # Skip csv header row
        if row[1] != "temp":
            # 1 is a magic number
            temperature.append(row[1])

print(temperature)

Result:
 ['12', '14', '15', '14', '21', '22', '24']
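
Note that the temperatures above are still strings, and the header row had to be skipped by hand. As a possible alternative within the standard library, csv.DictReader uses the header row as dictionary keys. The sketch below reads from an in-memory sample (an assumption, so that it runs without weather_data.csv) and converts each value to int.

```python
import csv
import io

# Stand-in for weather_data.csv so the sketch runs without the file
sample = io.StringIO(
    "day,temp,condition\n"
    "Monday,12,Sunny\n"
    "Tuesday,14,Rain\n"
)

# DictReader uses the header row as keys, so no header check is needed
reader = csv.DictReader(sample)
temperature = [int(row["temp"]) for row in reader]

print(temperature)  # [12, 14]
```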


pandas Library



The code above looks long and tedious.

Some libraries can save us time here, and pandas is a popular one.

Ex:
import pandas

# Read a comma-separated values (csv) file into DataFrame.
data = pandas.read_csv("weather_data.csv")

# Read temp column
print(data["temp"])

Result:
  0    12
  1    14
  2    15
  3    14
  4    21
  5    22
  6    24
  Name: temp, dtype: int64

Now the code looks cleaner.

That is why many Python developers reach for pandas (or a similar library) to deal with CSV files, even for simple tasks.


Exp 1 - Get the average temperature



Ex:
import pandas

# Read a comma-separated values (csv) file into DataFrame.
data = pandas.read_csv("weather_data.csv")

# Transform Series to List first
temp_list = data["temp"].to_list()
# Then using built-in function to finish the calculation
average = sum(temp_list) / len(temp_list)
print(average)

# Or Using Series.mean()
print(data["temp"].mean())


Exp 2 - Get rows with conditions



Ex:
import pandas

# Read a comma-separated values (csv) file into DataFrame.
data = pandas.read_csv("weather_data.csv")

# To select rows based on a conditional expression,
# use a condition inside the selection brackets [].
# Condition: data["day"] == "Monday"
monday_data = data[data["day"] == "Monday"]
print(monday_data)


Project - US State Game






* Load the US map image into Turtle graphics
* Read all states and their x and y positions from the csv file
* Ask for user input (screen.textinput)
* Use the title of screen.textinput to keep track of the score
* Show each correct guess at its position (read from the csv)
* Use a loop to let users keep guessing
* Track the correct guesses and write the missing states to a csv file when the game is over

main.py
from turtle import Screen, Turtle
import pandas

# Constants
BG_IMAGE_PATH = "blank_states_img.gif"
INPUT_CSV_PATH = "50_states.csv"
OUTPUT_CSV_PATH = "states_to_learn.csv"

# Init Screen
screen = Screen()
screen.title("US State Game")
screen.bgpic(BG_IMAGE_PATH)

# Read state info from csv
state_data_frame = pandas.read_csv(INPUT_CSV_PATH)
states_list = state_data_frame["state"].to_list()
# Track the correct guesses
correct_guess_states_list = []

def write_state_to_screen(name, pos_x, pos_y):
    """Write state text in map with input position"""
    tim = Turtle()
    tim.penup()
    tim.color("black")
    tim.hideturtle()
    tim.setpos(pos_x, pos_y)
    tim.write(name)

def write_message_to_screen(text):
    """Show Message with input text"""
    tim = Turtle()
    tim.penup()
    tim.color("red")
    tim.hideturtle()
    tim.setpos(0, 0)
    tim.write(text, False, align="center", font=("Courier", 24, "normal"))

def get_user_input():
    """Generate a textinput and return the user_input to the caller"""
    textinput_title = "Guess the state"
    # Update textinput title if needed
    if len(correct_guess_states_list) > 0:
        textinput_title = f"{len(correct_guess_states_list)}/50 States Correct"

    # Ask user input
    user_input = screen.textinput(
        title=textinput_title, prompt="What's another state's name?"
    )

    # If user clicked 'cancel'
    if user_input is None:
        return None

    # Using title() to get the title case of user input
    return user_input.title()

def write_the_missing_states_to_csv():
    """Write missing states to csv"""
    # Define a dictionary
    output_disc = {"state": []}

    for state in states_list:
        if state not in correct_guess_states_list:
            output_disc["state"].append(state)

    # Use a dictionary to initialize a DataFrame Object
    output_data_frame = pandas.DataFrame(output_disc)
    # Use 'to_csv' function of the DataFrame Object to output a csv file
    output_data_frame.to_csv(OUTPUT_CSV_PATH)


while len(correct_guess_states_list) < 50:
    # Get User Input
    answer = get_user_input()

    # Exit this game
    # 1. if users clicked cancel button of the input box
    # 2. if users typed 'exit'
    if answer is None or answer == "Exit":
        break

    if answer in states_list and answer not in correct_guess_states_list:
        # Select the state info from csv source
        answer_state_series = state_data_frame[
            state_data_frame["state"] == answer
        ].iloc[0]

        # Show the state
        write_state_to_screen(
            answer_state_series["state"],
            answer_state_series["x"],
            answer_state_series["y"],
        )

        # Append the correct guess to the tracking list
        correct_guess_states_list.append(answer)

# Determine the game result
if len(correct_guess_states_list) == 50:
    write_message_to_screen("You Win")
else:
    write_message_to_screen("You Lose")

    # Generate a csv file for the missing states for player to learn
    write_the_missing_states_to_csv()

screen.exitonclick()
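
The "missing states" step does not strictly need pandas. As a rough sketch, the same idea can be written with a list comprehension and the csv module. The two short lists below are hypothetical stand-ins for states_list and correct_guess_states_list, and the output goes to an in-memory buffer instead of states_to_learn.csv.

```python
import csv
import io

# Hypothetical short lists standing in for states_list and
# correct_guess_states_list from main.py
all_states = ["Alabama", "Alaska", "Arizona", "Arkansas"]
guessed = ["Alaska", "Arkansas"]

# Keep the original order while dropping the guessed states
guessed_set = set(guessed)
missing = [state for state in all_states if state not in guessed_set]

# Write to an in-memory buffer; a real file opened with newline=""
# would be written the same way
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["state"])
writer.writerows([state] for state in missing)

print(missing)  # ['Alabama', 'Arizona']
```

Unlike DataFrame.to_csv with its default settings, this output has no extra index column.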

Friday, April 22, 2022

Python - Files, Directories and Paths (Day 24)

This is a 100 Days challenge to learn a new language (Python). 100 Days of Code - The Complete Python Pro Bootcamp 

I will post some notes to motivate myself to finish this challenge.


Open a File



Use the built-in function open()

Ex:
# Use open function to get the file object
f = open("my_file.txt", encoding="utf-8")

# By default the read() method returns the whole text
print(f.read())

# It is a good practice to always close the file
# when you are done with it.
f.close()


Open Mode



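r    read only (default)
w    write (overwrite: the file is truncated first)
a    write (append to the end of the file)
x    exclusive creation (fails if the file already exists)
t    text mode (default)
b    binary mode (returns contents as bytes objects without any decoding)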
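
To see how the writing modes differ, here is a small sketch that runs in a temporary folder so no real file is touched. The file name demo.txt and the variable names are only illustrative.

```python
import os
import tempfile

# Work in a throwaway folder so no real file is touched
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.txt")  # hypothetical file name

    # 'x' creates the file and fails if it already exists
    with open(path, mode="x", encoding="utf-8") as f:
        f.write("first\n")

    # 'a' keeps the existing content and adds to the end
    with open(path, mode="a", encoding="utf-8") as f:
        f.write("second\n")

    with open(path, encoding="utf-8") as f:
        after_append = f.read()

    # 'w' truncates the file before writing
    with open(path, mode="w", encoding="utf-8") as f:
        f.write("third\n")

    with open(path, encoding="utf-8") as f:
        after_write = f.read()

    # A second 'x' on the same path raises FileExistsError
    try:
        open(path, mode="x", encoding="utf-8")
        second_create_failed = False
    except FileExistsError:
        second_create_failed = True

print(repr(after_append))    # 'first\nsecond\n'
print(repr(after_write))     # 'third\n'
print(second_create_failed)  # True
```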


Read



Ex: 'r' is the default mode
# Use open function to get the file object
f = open("my_file.txt", encoding="utf-8")

# readline() will return a line of a file
# When reading input from the stream, if newline parameter is None
# (default), universal newlines mode is enabled.
# Lines in the input can end in '\n', '\r', or '\r\n',
# and these are translated into '\n' before being returned to the caller.
# Then the output for the following line will be 'Hello World!\n'
# We can pass end="" to the print function to
# avoid printing two newline characters
print(f.readline(), end="")

# read 2 characters
print(f.read(2))

# read the rest of the current line
print(f.readline(), end="")

# Use for-loop to go through all lines
for line in f:
    print(line, end="")

# Close the file
f.close()
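
To make the reading position easier to follow, the sketch below repeats the same calls on an in-memory text stream (io.StringIO) with made-up content, so the result of each call can be shown exactly. The content and variable names are assumptions for illustration.

```python
import io

# In-memory stand-in for my_file.txt (hypothetical content)
f = io.StringIO("Hello World!\nSecond line\nThird line\n")

first = f.readline()      # whole first line, including '\n'
two_chars = f.read(2)     # next two characters
rest = f.readline()       # remainder of the current line
remaining = [line for line in f]  # iteration continues from here

f.close()

print(repr(first))      # 'Hello World!\n'
print(repr(two_chars))  # 'Se'
print(repr(rest))       # 'cond line\n'
print(remaining)        # ['Third line\n']
```

Each call picks up where the previous one stopped, which is why the for-loop only sees the last line.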


Write



Ex: Use 'a' mode to append to the file
# Use open function to get the file object
f = open("my_file.txt", mode="a", encoding="utf-8")

f.write("I am here\n")

# Close the file
f.close()


Close a File



In the example above, we cannot guarantee that the file will be closed if an exception is raised during the file operations. To make sure the file gets closed, we can put those operations inside a try-finally block.

Ex: Use try-finally block
# Use open function to get the file object
file = open("my_file.txt", encoding="utf-8")

try:
    # Wrap the instructions inside try block
    print(file.read())
finally:
    # Close the file
    file.close()


Now the code looks complicated.

We can use the 'with' statement to make it cleaner.

It is good practice to use the with keyword when dealing with file objects. The advantage is that the file is properly closed after its suite finishes, even if an exception is raised at some point (Reference).

Ex: 
# Use with statement
with open("my_file.txt", encoding="utf-8") as file:
    print(file.read())
    # NOTE: We don't need to call close() function anymore
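
As a quick check of that claim, the sketch below raises an exception inside a with block and then inspects file.closed. It works on a file in a temporary folder; the simulated ValueError is only there for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "my_file.txt")
    with open(path, mode="w", encoding="utf-8") as file:
        file.write("Hello World!\n")

    try:
        with open(path, encoding="utf-8") as file:
            content = file.read()
            # Simulate a failure in the middle of the file operations
            raise ValueError("something went wrong")
    except ValueError:
        pass

    # The with statement closed the file despite the exception
    closed_after_error = file.closed

print(closed_after_error)  # True
```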


Absolute Path vs. Relative Path



File System: (directory tree image not reproduced here)

If we tried to access my_work.txt from main.py:

* Absolute path:
    /Work/my_work.txt

* Relative path:
    ../../my_work.txt

NOTE:
* / means root folder
* ./ means the current folder
* ../ means the parent folder
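
The relationship between the two kinds of path can be sketched with the standard posixpath module. The folder /Work/project/day24 is a hypothetical location for main.py, chosen only so that "../../" leads back to /Work. Keep in mind that Python resolves relative paths against the current working directory, which is not always the folder that contains the script.

```python
import posixpath

# Hypothetical location of main.py, two folders below /Work
current_folder = "/Work/project/day24"

# Join the relative path onto the current folder,
# then collapse the "../" parts
relative = "../../my_work.txt"
absolute = posixpath.normpath(posixpath.join(current_folder, relative))

print(absolute)                   # /Work/my_work.txt
print(posixpath.isabs(relative))  # False
print(posixpath.isabs(absolute))  # True
```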


Project - Refactoring our previous Snake Game



1. Add the highest score to scoreboard.py
2. Let the snake game keep going after a collision
3. Save the highest score to a file and load it when the app launches


scoreboard.py
from turtle import Turtle

class Scoreboard(Turtle):

    def __init__(self):
        """Constructor"""
        super().__init__()

        # attributes
        self.score = 0
        self.highest_score = self.load_highest_score()

        # init scoreboard
        self.hideturtle()
        self.color("white")
        self.penup()
        self.setpos(0, 280)
        self.refresh_score()

    def increase_score(self):
        """Increase score"""
        self.score += 1

        self.refresh_score()

    def refresh_score(self):
        """Show the latest score on screen"""
        # Delete the turtle’s drawings from the screen.
        self.clear()
        # Write text
        self.write(
            f"Score: {self.score}, Highest Score: {self.highest_score}",
            False,
            align="center",
        )

    def reset(self):
        """Reset scores and update screen"""
        # Update highest score if needed
        if self.score > self.highest_score:
            self.highest_score = self.score
            # Save highest score to our file system
            self.save_highest_score()

        # Reset the score
        self.score = 0

        # Update Screen
        self.refresh_score()

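    def load_highest_score(self):
        """Load Highest Score from file"""
        try:
            with open("data.txt", mode="r", encoding="utf-8") as file:
                return int(file.read())
        except (FileNotFoundError, ValueError):
            # No saved score yet (missing or empty file): start from 0
            return 0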

    def save_highest_score(self):
        """Save Highest Score to the file"""
        with open("data.txt", mode="w", encoding="utf-8") as file:
            # Make sure to transform it to a string before writing
            file.write(str(self.highest_score))

food.py
import random
from turtle import Turtle

class Food(Turtle):

    def __init__(self):
        """Constructor"""
        super().__init__()

        # Init food
        self.shape("circle")
        self.penup()
        self.color("red")
        self.shapesize(stretch_wid=0.5, stretch_len=0.5)
        self.speed("fastest")

        # refresh food with random position
        self.refresh()

    def refresh(self):
        """Refresh food with new random position"""
        x_position = random.randint(-280, 280)
        y_position = random.randint(-280, 280)
        self.setpos(x_position, y_position)

snake.py
from turtle import Turtle

# Constants
STARTING_POSITIONS = [(0, 0), (-20, 0), (-40, 0)]
OUT_POSITION = (1000, 1000)
MOVING_DISTANCE = 20
UP = 90
DOWN = 270
LEFT = 180
RIGHT = 0


class Snake:

    def __init__(self):
        """Constructor"""
        # Define an attribute to track snake body
        self.segments = []

        # Call func to init snake
        self.create_snake()

        # Create an attribute instead of using magic number 0
        self.head = self.segments[0]

    def create_snake(self):
        """Adding segment(turtle object) to segments list"""
        for position in STARTING_POSITIONS:
            # add segment with passing position one by one
            self.add_segment(position)

    def add_segment(self, position):
        """Add segment to the end of snake body"""
        segment = Turtle()
        segment.shape("square")
        segment.color("white")
        segment.penup()
        segment.setpos(position)

        self.segments.append(segment)

    def extend(self):
        """Extend the snake"""
        self.add_segment(self.segments[-1].pos())

    def move(self):
        """Make snake move with current heading"""
        # Starting from tail, make each segment move to the
        # position of the previous segment
        for index in range(len(self.segments) - 1, 0, -1):
            new_x = self.segments[index - 1].xcor()
            new_y = self.segments[index - 1].ycor()
            self.segments[index].setpos(new_x, new_y)

        # Using attribute instead of magic number
        self.head.forward(MOVING_DISTANCE)

    def move_up(self):
        """move_up"""
        if self.head.heading() != DOWN:
            self.head.setheading(UP)

    def move_right(self):
        """move_right"""
        if self.head.heading() != LEFT:
            self.head.setheading(RIGHT)

    def move_left(self):
        """move_left"""
        if self.head.heading() != RIGHT:
            self.head.setheading(LEFT)

    def move_down(self):
        """move_down"""
        if self.head.heading() != UP:
            self.head.setheading(DOWN)

    def reset(self):
        """Reset snake"""
        # Move all segments of snake to out of the screen area
        for s in self.segments:
            s.setpos(OUT_POSITION)

        # Clear the tracking attribute
        self.segments.clear()

        # Create a new snake
        self.create_snake()

        # Update snake's head
        self.head = self.segments[0]
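
The follow-the-leader idea behind move() can be tried without opening a turtle window. In the sketch below, positions are plain (x, y) tuples instead of Turtle objects, and move_right is a hypothetical helper that always steps to the right; it is not part of snake.py.

```python
# Pure-Python sketch: positions are (x, y) tuples instead of Turtle objects
MOVING_DISTANCE = 20

def move_right(positions):
    """Return the body positions after one step to the right."""
    new_positions = list(positions)
    # From the tail toward the head, copy the position of the segment ahead
    for index in range(len(new_positions) - 1, 0, -1):
        new_positions[index] = new_positions[index - 1]
    # Finally advance the head
    head_x, head_y = new_positions[0]
    new_positions[0] = (head_x + MOVING_DISTANCE, head_y)
    return new_positions

body = [(0, 0), (-20, 0), (-40, 0)]
print(move_right(body))  # [(20, 0), (0, 0), (-20, 0)]
```

Walking from the tail forward matters: going the other way would overwrite a position before the next segment had a chance to copy it.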

main.py
import time
from turtle import Screen
from snake import Snake
from food import Food
from scoreboard import Scoreboard

# Screen Setup
screen = Screen()
screen.setup(width=600, height=600)
screen.title("My Snake Game")
screen.bgcolor("black")
# Turn turtle animation off
screen.tracer(0)

# init
snake = Snake()
food = Food()
scoreboard = Scoreboard()

# Listen events
# Bind each function to the key-release event of a key
screen.onkey(snake.move_up, "Up")
screen.onkey(snake.move_down, "Down")
screen.onkey(snake.move_right, "Right")
screen.onkey(snake.move_left, "Left")
# Set focus on TurtleScreen (in order to collect key-events)
screen.listen()

game_is_on = True

while game_is_on:
    # Perform a TurtleScreen update
    screen.update()

    # Give some delay for this while loop
    time.sleep(0.1)

    # Keep moving
    snake.move()

    # Check if there is a collision for food
    if snake.head.distance(food) < 15:
        # Increase score
        scoreboard.increase_score()

        # Make snake longer
        snake.extend()

        # Update food position
        food.refresh()

    # Check if there is a collision for wall
    if (
        snake.head.xcor() >= 285
        or snake.head.xcor() <= -285
        or snake.head.ycor() >= 285
        or snake.head.ycor() <= -285
    ):
        # Reset scoreboard
        scoreboard.reset()

        # Reset snake
        snake.reset()

    for segment in snake.segments[1:]:
        if snake.head.distance(segment) <= 15:
            # Reset scoreboard
            scoreboard.reset()

            # Reset snake
            snake.reset()

# Bind bye() method to mouse clicks on the Screen.
screen.exitonclick()
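
The wall check in the main loop could also be factored into a small function, which makes it easy to try out without a turtle window. hit_wall and WALL_LIMIT are hypothetical names; the value 285 is the boundary used in main.py.

```python
WALL_LIMIT = 285  # same boundary value as in main.py

def hit_wall(x, y, limit=WALL_LIMIT):
    """Return True when (x, y) is on or beyond the boundary."""
    return abs(x) >= limit or abs(y) >= limit

print(hit_wall(0, 0))        # False
print(hit_wall(285, 0))      # True
print(hit_wall(-100, -300))  # True
```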


Challenge - Mail Merge



* Generate ready-to-send letters with the correct content and save them in the Output/ReadyToSend folder.
* The letter content comes from the template letter template_letter.txt, which contains a placeholder [name].
* Loop through the list of names in invited_names.txt and replace the placeholder in the letter content with each name.
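
Before touching any files, the core of these steps can be sketched in memory. The two strings below are hypothetical stand-ins for the input files, and the dictionary maps each output file name to its finished letter.

```python
# Hypothetical in-memory stand-ins for the two input files
template = "Dear [name],\nHow are you?\n"
names_text = "Sean\nJake\n"

# splitlines() drops the newline at the end of each name
names = names_text.splitlines()

# Map each output file name to its finished letter
letters = {
    f"letter_for_{name}.txt": template.replace("[name]", name)
    for name in names
}

print(letters["letter_for_Sean.txt"])
```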



template_letter.txt
Dear [name],
How are you? Frank

invited_names.txt
Sean
Jake
Joey
Nini

main.py
names = []
letter_with_placeholder = ""

# Read invited_names.txt and append each name to a list
with open("./Input/Names/invited_names.txt") as file:
    for line in file:
        # Use strip() to remove leading and trailing whitespace
        # (including the newline character)
        names.append(line.strip())

# Read template_letter.txt
with open("./Input/Letters/template_letter.txt") as file:
    letter_with_placeholder = file.read()

# Loop through all names
for name in names:
    # Use name to replace the placeholder
    letter_with_name = letter_with_placeholder.replace("[name]", name)
    # Use 'x' mode to create a new file with the correct content
    with open(
        f"./Output/ReadyToSend/letter_for_{name}.txt",
        mode="x"
    ) as file:
        file.write(letter_with_name)