3.6. Files in Python (Optional Section)#

3.6.1. Files Basics#

In Python, you can read and write files using various file handling methods. Here’s an overview of how to perform file reading and writing operations [Downey, 2015, Python Software Foundation, 2023]:

3.6.1.1. Reading a File#

To read data from a file, you can use the open() function with the mode set to 'r', which stands for read mode. The read() or readlines() methods can be used to retrieve the content from the file.

# Example: Reading from a file
# The file is available at the following URL:
# https://raw.githubusercontent.com/HatefDastour/ENGG_680/2a92eacf4c54319e234b3c1100f812015aed2d89/Files/Example.txt
file_path = 'Example.txt'

# Using the 'with' statement to ensure proper file handling and automatic closing
with open(file_path, 'r') as file:
    # Read the entire file content into a single string
    content = file.read()
    print(content)

# Alternatively, you can read the lines of the file into a list
with open(file_path, 'r') as file:
    lines = file.readlines()
    for line in lines:
        # Using strip() to remove trailing newline characters
        print(line.strip())  # This prints each line of the file without newline characters

Output:

This is an example txt file.
This is an example txt file.

3.6.1.2. Writing to a File#

To write data to a file, use the open() function with the mode set to 'w', which stands for write mode. This will create a new file if it doesn’t exist or overwrite the file’s content if it already exists.

# Example: Writing to a file
file_path = 'Output.txt'

# Writing a single line to the file
with open(file_path, 'w') as file:
    file.write("This is a line written to the file.")

# Writing multiple lines to the file
lines = ["Line 1\n", "Line 2\n", "Line 3\n"]
with open(file_path, 'w') as file:
    file.writelines(lines)

3.6.1.3. Appending to a File#

To add new data to an existing file without overwriting the existing content, use the 'a' mode (append mode) with the open() function.

# Example: Appending to a file
file_path = 'Output.txt'

# Appending a new line to the file
with open(file_path, 'a') as file:
    file.write("\nThis is a new line appended to the file.")

Remember to handle file I/O operations carefully, especially when dealing with sensitive or large files, by using proper error handling and closing the files after usage. The with statement used in the examples ensures the file is automatically closed after the block of code executes, even if there is an exception.
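The three modes can be combined into one short round trip. The following is a minimal sketch rather than part of the original examples: it assumes UTF-8 text, works inside a temporary directory so that Output.txt is not touched, and uses the hypothetical file names notes.txt and missing.txt.

```python
import os
import tempfile

# Work in a throwaway directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    notes_path = os.path.join(tmp_dir, "notes.txt")  # hypothetical file name

    # 'w' creates (or truncates) the file; an explicit encoding avoids
    # depending on the platform default.
    with open(notes_path, "w", encoding="utf-8") as file:
        file.write("first line\n")

    # 'a' adds to the end without touching the existing content.
    with open(notes_path, "a", encoding="utf-8") as file:
        file.write("second line\n")

    # Iterating over the file object reads one line at a time, which
    # keeps memory use low for large files.
    with open(notes_path, "r", encoding="utf-8") as file:
        lines_read = [line.rstrip("\n") for line in file]

    # Opening a file that does not exist in 'r' mode raises FileNotFoundError.
    missing_path = os.path.join(tmp_dir, "missing.txt")  # hypothetical file name
    try:
        with open(missing_path, "r", encoding="utf-8") as file:
            file.read()
        missing_file_error = None
    except FileNotFoundError as error:
        missing_file_error = error

print(lines_read)
```

Iterating over the file object is usually preferable to readlines() when the file may be large, because only one line is held in memory at a time.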

3.6.2. Filenames and paths#

In Python, dealing with filenames and paths is essential for file manipulation, reading, and writing. Python provides several modules to work with filenames and paths conveniently [Downey, 2015, Python Software Foundation, 2023].

3.6.2.1. os module#

The os module contains functions related to file and path operations.

  • os.getcwd(): Returns the current working directory.

  • os.chdir(): Changes the current working directory.

Example:

import os

# Getting the current working directory
print(os.getcwd())  # Output: /path/to/current/directory

# Changing the current working directory
os.chdir('/new/directory/path')
print(os.getcwd())  # Output: /new/directory/path
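Because os.chdir() changes the working directory for the whole process, it is often worth restoring the original directory afterwards. The sketch below is one possible pattern, not a prescribed one: it assumes a temporary directory as the target, and the names original_dir, changed_ok, and restored_ok are illustrative.

```python
import os
import tempfile

original_dir = os.getcwd()

with tempfile.TemporaryDirectory() as tmp_dir:
    try:
        os.chdir(tmp_dir)
        # samefile() compares the underlying directories, so it still works
        # when one of the two paths goes through a symbolic link.
        changed_ok = os.path.samefile(os.getcwd(), tmp_dir)
    finally:
        # Always return to the starting directory, even if an error occurred.
        os.chdir(original_dir)

restored_ok = os.getcwd() == original_dir
print(changed_ok, restored_ok)
```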

3.6.2.2. os.path module#

This module provides various functions to work with file paths across different platforms (Windows, macOS, Linux).

  • os.path.join(): Joins two or more pathname components into a complete path.

  • os.path.abspath(): Returns the absolute version of a path.

  • os.path.basename(): Returns the base name of a file path.

  • os.path.dirname(): Returns the directory name of a file path.

  • os.path.exists(): Checks if a path exists.

  • os.path.isfile(): Checks if a path points to a regular file.

  • os.path.isdir(): Checks if a path points to a directory.

Example:

import os

# Joining paths
path1 = "/path/to/directory"
path2 = "file.txt"

# Using os.path.join to combine path1 and path2 into a full path
full_path = os.path.join(path1, path2)
print(full_path)  # Output: /path/to/directory/file.txt (on Windows: /path/to/directory\file.txt)

# Getting the basename and dirname
file_path = "/path/to/some/file.txt"

# Using os.path.basename to extract the filename from the path
print(os.path.basename(file_path))  # Output: file.txt

# Using os.path.dirname to extract the directory path from the full path
print(os.path.dirname(file_path))   # Output: /path/to/some

# Checking if a path exists
# os.path.exists returns True if the specified path exists, otherwise False
print(os.path.exists(file_path))    # Output: True only if this path exists on your system

# Checking if it's a file or directory
# os.path.isfile returns True if the path points to an existing file, otherwise False
# os.path.isdir returns True if the path points to an existing directory, otherwise False
print(os.path.isfile(file_path))    # Output: True only if the path is an existing file
print(os.path.isdir(file_path))     # Output: True only if the path is an existing directory

Output (from a run on Windows, where the placeholder paths do not exist):

/path/to/directory\file.txt
file.txt
/path/to/some
False
False
False
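The placeholder paths above do not exist on most machines, so the existence checks return False. A small sketch with a real temporary file makes the results predictable; the directory name reports and the file name summary.txt are hypothetical, and os.path.splitext() is an additional os.path function not listed above.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Build the path from components instead of hard-coding a separator.
    report_path = os.path.join(tmp_dir, "reports", "summary.txt")

    # Create the parent directory and the file so the checks have something to find.
    os.makedirs(os.path.dirname(report_path), exist_ok=True)
    with open(report_path, "w", encoding="utf-8") as file:
        file.write("placeholder\n")

    name = os.path.basename(report_path)
    stem, extension = os.path.splitext(name)  # split off the file extension
    checks = {
        "exists": os.path.exists(report_path),
        "isfile": os.path.isfile(report_path),
        "isdir": os.path.isdir(report_path),
        "parent_isdir": os.path.isdir(os.path.dirname(report_path)),
    }

print(name, stem, extension)
print(checks)
```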

3.6.2.3. pathlib module#

The pathlib module provides an object-oriented approach to deal with paths and filenames. It is available in Python 3.4 and later versions.

Example:

from pathlib import Path

# Creating a Path object for the specified file path
file_path = Path("/path/to/file.txt")

# Getting the basename and dirname using attributes of the Path object
print(file_path.name)     # Output: file.txt
print(file_path.parent)   # Output: /path/to (on Windows: \path\to)

# Checking if a path exists using the exists() method of the Path object
print(file_path.exists()) # Output: True only if this path exists on your system

# Checking if it's a file or directory using is_file() and is_dir() methods
print(file_path.is_file()) # Output: True only if the path is an existing file
print(file_path.is_dir())  # Output: True only if the path is an existing directory

Output (from a run on Windows, where the placeholder path does not exist):

file.txt
\path\to
False
False
False

When working with files, it’s essential to handle exceptions and ensure that the paths are correctly formatted, especially when dealing with user inputs or dynamic paths. Using the os.path or pathlib modules helps ensure portability and compatibility across different platforms.
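As an illustration of the object-oriented style, the following sketch builds a path with the / operator and reads and writes text directly through the Path object. It is only a sketch under stated assumptions: the directory name data and the file name values.csv are hypothetical, and the file is created in a temporary directory.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    base_dir = Path(tmp_dir)

    # The / operator joins path components in a platform-independent way.
    data_file = base_dir / "data" / "values.csv"

    # Create the parent directory, then write and read text in one call each.
    data_file.parent.mkdir(parents=True, exist_ok=True)
    data_file.write_text("a,b\n1,2\n", encoding="utf-8")
    text = data_file.read_text(encoding="utf-8")

    file_name = data_file.name      # full file name
    file_stem = data_file.stem      # name without the extension
    file_suffix = data_file.suffix  # extension, including the dot
    is_file = data_file.is_file()
    is_dir = data_file.is_dir()

    # glob() finds entries in a directory that match a pattern.
    csv_names = sorted(p.name for p in data_file.parent.glob("*.csv"))

print(file_name, file_stem, file_suffix, is_file, is_dir, csv_names)
```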

3.6.3. Catching exceptions#

In Python, catching exceptions is a crucial aspect of error handling. It allows you to gracefully handle errors that might occur during the execution of your code and prevents the program from crashing abruptly. Python provides a try-except block to catch and handle exceptions. The general syntax is as follows [Downey, 2015, Python Software Foundation, 2023]:

try:
    # code that may raise an exception
    # ...
except SomeException as e:
    # code to handle the exception
    # ...
else:
    # code to be executed if no exception occurs (optional)
    # ...
finally:
    # code that will be executed regardless of whether an exception occurred or not (optional)
    # ...

Here’s a breakdown of the try-except block:

  • The try block contains the code that might raise an exception.

  • The except block catches the specified exception and executes the code within it if that exception occurs. To catch several exception types in one block, list them in a parenthesized tuple, e.g., except (ValueError, TypeError) as e. You can assign the caught exception to a variable (e.g., as e) for further analysis.

  • The else block is optional and will be executed only if no exception occurs in the try block.

  • The finally block is also optional and will be executed regardless of whether an exception occurred or not. It is typically used for cleanup tasks like closing files, releasing resources, etc.

Example:

try:
    num1 = int(input("Enter a number: "))
    num2 = int(input("Enter another number: "))
    result = num1 / num2
except ValueError as ve:
    print("Error: Invalid input. Please enter valid integers.")
except ZeroDivisionError as zde:
    print("Error: Division by zero is not allowed.")
else:
    print(f"The result of the division is: {result}")
finally:
    print("Execution completed.")

In this example, if the user enters non-integer inputs or tries to divide by zero, the corresponding exceptions will be caught, and an appropriate error message will be displayed. If the division is successful, the result will be printed. The “Execution completed” message will always be displayed, regardless of whether an exception occurred or not.

It’s essential to handle specific exceptions rather than using a broad except block that catches all exceptions. This ensures that you handle errors properly and can provide more informative error messages to users or log files.

Keep in mind that exception handling is meant for exceptional cases rather than routine control flow. Raising and catching exceptions is comparatively slow, and overusing them makes the logic harder to follow. Use them judiciously and thoughtfully to improve the reliability and robustness of your Python programs.
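The interactive example relies on input(), which makes it awkward to test. The sketch below wraps the same logic in a function that takes strings instead, and adds a second function that catches two exception types with a tuple. The function names divide_text and to_int_or_default are hypothetical and chosen only for illustration.

```python
def divide_text(numerator_text, denominator_text):
    """Return (result, message); result is None when the division fails."""
    try:
        numerator = int(numerator_text)
        denominator = int(denominator_text)
        result = numerator / denominator
    except ValueError:
        # Raised by int() when the text is not a valid integer.
        return None, "invalid integer"
    except ZeroDivisionError:
        # Raised by the division when the denominator is 0.
        return None, "division by zero"
    else:
        # Runs only when the try block finished without an exception.
        return result, "ok"


def to_int_or_default(value, default=0):
    """Convert value to int, falling back to default on bad input."""
    try:
        return int(value)
    except (TypeError, ValueError):
        # A tuple lets one except block handle several exception types.
        return default


print(divide_text("10", "4"))
print(divide_text("ten", "4"))
print(divide_text("1", "0"))
print(to_int_or_default(None, default=-1))
```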

3.6.4. Databases#

In Python, dbm stands for “database manager.” It is a standard-library package that provides a simple interface for creating and manipulating key-value databases, in which data is stored and retrieved using a key as an identifier. Several variants of dbm are available, and the appropriate variant is chosen based on which underlying database libraries are available on your system [Downey, 2015, Python Software Foundation, 2023]:

  • dbm.gnu: This variant uses the GNU DBM library.

  • dbm.ndbm: This variant uses the ndbm library, which is available on most Unix-based systems.

  • dbm.dumb: This variant provides a fallback implementation using simple file-based storage (it’s the slowest, but it doesn’t require an external library).

Here’s a general overview of how to use dbm in Python:

  1. Import the appropriate variant of the dbm module.

import dbm # import dbm.gnu or dbm.ndbm or dbm.dumb

  2. Open/create a database file.

# The 'c' mode creates the database if it doesn't exist, and opens it in read-write mode.
# The 'n' mode always creates a new empty database, overwriting any existing file.
# The 'r' mode opens an existing database in read-only mode.
# The 'w' mode opens an existing database in read-write mode; unlike 'c', it does not create a missing database.
db = dbm.open("my_database.db", "c")  # Replace "my_database.db" with your desired filename

  3. Manipulate the database using key-value pairs.

# Insert data
db[b"key1"] = b"value1"  # Note: Keys and values are stored as bytes; str objects are encoded automatically, but retrieved values are always bytes.

# Retrieve data
value = db[b"key1"]
print(value.decode())  # Output: value1

# Update data
db[b"key1"] = b"updated_value"

# Delete data
del db[b"key1"]

# Check if a key exists in the database
if b"key1" in db:
    print("Key exists.")
else:
    print("Key does not exist.")

Output:

value1
Key does not exist.

  4. Close the database when you’re done working with it.

db.close()

Remember to handle exceptions when working with the dbm module, as errors may occur while opening or writing to the database file.

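import dbm

try:
    # The with statement closes the database even if an operation fails.
    # If dbm.open() itself raises, there is nothing to close, which avoids
    # calling close() on a variable that was never assigned.
    with dbm.open("my_database.db", "c") as db:
        db[b"key1"] = b"value1"  # Perform database operations here
except dbm.error as e:
    print(f"An error occurred: {e}")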

Keep in mind that dbm is not as feature-rich as full-fledged database systems like SQLite or other database management systems. It is mainly suitable for simple key-value storage requirements, and for more complex applications, you may want to consider using more advanced database solutions.
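The steps above can be put together in one self-contained sketch. It makes a few assumptions that are not in the original text: it uses the dbm.dumb variant explicitly (because it needs no external library), it stores the database in a temporary directory, and the base name inventory and the keys are hypothetical.

```python
import dbm.dumb
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    db_path = os.path.join(tmp_dir, "inventory")  # hypothetical base name

    # 'c' opens the database for reading and writing, creating it if needed.
    with dbm.dumb.open(db_path, "c") as db:
        db[b"apples"] = b"3"
        db["pears"] = "5"      # str keys and values are encoded to bytes
        db[b"apples"] = b"4"   # assigning to an existing key overwrites it
        del db[b"pears"]

    # Reopen the same database read-only to confirm the data was persisted.
    with dbm.dumb.open(db_path, "r") as db:
        apples = int(db[b"apples"].decode())  # values come back as bytes
        has_pears = b"pears" in db
        plums = db.get(b"plums", b"0")        # default for a missing key
        stored_keys = sorted(db.keys())

print(apples, has_pears, plums, stored_keys)
```

Because every value comes back as bytes, numbers and other structured data have to be converted explicitly, as with int(...decode()) here.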

3.6.5. Pickling#

Pickling in Python refers to the process of serializing and deserializing objects, allowing you to convert complex Python objects into a byte stream and vice versa. This is useful for saving data to disk or transmitting it over networks while preserving its structure and state [Downey, 2015, Python Software Foundation, 2023].

The primary module for pickling in Python is the pickle module, which comes built-in with Python. Here’s a brief overview of how to use pickling:

3.6.5.1. Pickling (Serialization)#

  1. Import the pickle module:

import pickle

  2. Create the Python object you want to pickle:

data = {'Name': 'John',
        'Age': 35,
        'City': 'Calgary'}

  3. Open a file in binary write mode to save the pickled data:

with open('data.pickle', 'wb') as file:
    pickle.dump(data, file)

3.6.5.2. Unpickling (Deserialization)#

  1. Import the pickle module:

import pickle

  2. Open the pickled file in binary read mode:

with open('data.pickle', 'rb') as file:
    loaded_data = pickle.load(file)

The pickle.dump(obj, file) function is used to serialize the obj into a file. The pickle.load(file) function is used to deserialize the object from the file.

Remember that pickling and unpickling should be used with caution. Never unpickle data from an untrusted source, because loading a pickle can execute arbitrary code, and be aware that pickles may not load correctly across different Python versions. If the data needs to be human-readable or shared with programs written in other languages, consider alternative formats like JSON or a database.
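To show what pickling preserves, the sketch below round-trips an object in memory with pickle.dumps() and pickle.loads(), the bytes-based counterparts of dump() and load(). The record dictionary is made-up sample data, and the comparison with JSON is included only to illustrate the trade-off mentioned above.

```python
import json
import pickle

# Sample data containing types that JSON cannot represent directly.
record = {"name": "John", "tags": {"a", "b"}, "point": (1, 2)}

# dumps() returns the pickled object as bytes; loads() rebuilds it.
payload = pickle.dumps(record)
restored = pickle.loads(payload)  # only do this with data you trust

# The json module refuses the same object because it contains a set.
try:
    json.dumps(record)
    json_supported = True
except TypeError:
    json_supported = False

print(restored == record, type(restored["point"]).__name__, json_supported)
```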

3.6.6. Pipes#

In Python, pipes are a form of inter-process communication (IPC) that allow different processes to communicate by connecting the output of one process to the input of another. The subprocess module provides support for creating and managing subprocesses, including using pipes for communication between the parent and child processes [Downey, 2015, Python Software Foundation, 2023].

Here’s a basic example of using pipes in Python:

import subprocess

# Create a subprocess and specify that we want to capture its standard output
process = subprocess.Popen(['ls', '-l'], stdout=subprocess.PIPE)

# Read the output from the subprocess
output, _ = process.communicate()

# Decode the bytes output to a string (assuming it's in UTF-8 encoding)
output_str = output.decode('utf-8')

# Print the output of the 'ls -l' command
print(output_str)

In this example, we use subprocess.Popen to create a subprocess that runs the ls -l command (available on Unix-based systems, but not as a program on Windows). The stdout=subprocess.PIPE argument connects the standard output of the subprocess to a pipe, and process.communicate() reads everything from that pipe and waits for the subprocess to finish.

Pipes are commonly used to pass data between processes in more complex scenarios, such as in data processing or when chaining multiple commands together. They can also be used to establish communication between Python processes or between Python and other programs.

Keep in mind that this is a basic example. Real-world applications may need error handling and protection against deadlocks, which can occur, for example, when a pipe fills up because nothing is reading from it. For more advanced IPC or more complex pipelines, consider the additional options of subprocess.Popen for controlling the communication flow, or the multiprocessing module for communication between Python processes.
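Since ls is not available on every platform, a portable variant can launch the Python interpreter itself as the child process. The sketch below uses subprocess.run(), a higher-level wrapper around Popen, and assumes that sys.executable points to a working interpreter; the one-line child programs are purely illustrative.

```python
import subprocess
import sys

# Capture the child's standard output through a pipe.
# text=True decodes the bytes to str; check=True raises on a non-zero exit code.
completed = subprocess.run(
    [sys.executable, "-c", "print('hello from the child')"],
    capture_output=True,
    text=True,
    check=True,
)
child_output = completed.stdout.strip()

# Send data to the child's standard input and read back the transformed result.
upper = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"],
    input="piped text",
    capture_output=True,
    text=True,
    check=True,
)
upper_output = upper.stdout.strip()

print(child_output)
print(upper_output)
```

subprocess.run() waits for the child to exit and collects its output in one call, which avoids the manual communicate() step for simple cases.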

3.6.7. JSON Data#

In Python, you can work with JSON (JavaScript Object Notation) files to both write data to a JSON file and read data from a JSON file. JSON is a popular format for storing and exchanging data because of its simplicity and compatibility with various programming languages. Here’s how you can write and read JSON files:

3.6.7.1. Writing JSON Data to a File:#

To write JSON data to a file, you typically follow these steps:

  1. Create a Python dictionary or a data structure containing the data you want to store as JSON.

  2. Use the json.dump() method to serialize and write the data to a JSON file.

Example:

import json

# Create a dictionary with sample data
data = {"Name": "John",
        "Age": 35,
        "City": "Calgary"
        }

# Specify the file path where you want to save the JSON data
file_path = "data.json"

# Open the file in write mode and write the JSON data to it
with open(file_path, 'w') as json_file:
    json.dump(data, json_file)

This code will create a JSON file named “data.json” and write the content of the data dictionary into it.

3.6.7.2. Reading JSON Data from a File:#

To read JSON data from a file, you follow these steps:

  1. Open the JSON file in read mode using open().

  2. Use the json.load() method to deserialize the JSON data into a Python data structure (usually a dictionary or a list).

Example:

import json

# Specify the file path where the JSON data is stored
file_path = "data.json"

# Open the file in read mode and load the JSON data
with open(file_path, 'r') as json_file:
    loaded_data = json.load(json_file)

# Access and work with the loaded JSON data as a Python dictionary
print("Name:", loaded_data["Name"])
print("Age:", loaded_data["Age"])
print("City:", loaded_data["City"])

Output:

Name: John
Age: 35
City: Calgary

This code will read the JSON data from the “data.json” file and store it in the loaded_data variable, allowing you to access and manipulate it within your Python program.
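A JSON round trip does not always return exactly the object that was written, because JSON has fewer data types than Python. The sketch below uses json.dumps() and json.loads(), the string-based counterparts of dump() and load(), on made-up sample data to show two common conversions.

```python
import json

# Sample data: a tuple value and an integer key are not native JSON types.
original = {"Name": "John", "Scores": (90, 85), 1: "one", "Active": True, "Note": None}

# indent=2 produces a human-readable, multi-line string.
text = json.dumps(original, indent=2)
round_tripped = json.loads(text)

print(text)
print(round_tripped)
```

After the round trip, the tuple comes back as a list and the integer key comes back as the string "1", while True and None are written as true and null and read back unchanged.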

Remark

In the context of reading and writing JSON files:

  1. FileNotFoundError: This exception occurs when you attempt to open a file that doesn’t exist. It’s common when trying to read a file that hasn’t been created yet or has been moved or deleted.

    To handle FileNotFoundError, you can use a try...except block. Here’s how you can modify the code to handle it:

    import json
    
    file_path = "data.json"
    
    try:
        with open(file_path, 'r') as json_file:
            loaded_data = json.load(json_file)
    except FileNotFoundError:
        print(f"The file '{file_path}' does not exist.")
    

    By catching this exception, your program will print an informative message and continue running rather than crashing if the file is not found.

  2. PermissionError: This exception occurs when you try to open a file for which you don’t have the necessary permissions (e.g., trying to write to a read-only file).

    To handle PermissionError, you can include it in the try...except block like this:

    import json
    
    file_path = "data.json"
    data = {"Name": "John", "Age": 35, "City": "Calgary"}
    
    try:
        with open(file_path, 'w') as json_file:
            json.dump(data, json_file)
    except PermissionError:
        print(f"You do not have permission to write to the file '{file_path}'.")
    

    Handling this exception allows your program to provide a clear message to the user if they lack the necessary permissions to perform the file operation.

  3. Other Exceptions: FileNotFoundError and PermissionError are both subclasses of OSError (IOError is an alias of OSError in Python 3), which also covers problems such as disk errors, file locks, or invalid file paths. In addition, json.load() raises json.JSONDecodeError when the file does not contain valid JSON. You can handle these exceptions similarly within your code to ensure robustness.
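The cases in this remark can be combined into one helper. The following is a sketch of one possible arrangement, not a prescribed pattern: the function name load_json and its default parameter are hypothetical, and the demonstration files are created in a temporary directory.

```python
import json
import os
import tempfile


def load_json(file_path, default=None):
    """Return the parsed JSON content of file_path, or default on failure."""
    try:
        with open(file_path, "r", encoding="utf-8") as json_file:
            return json.load(json_file)
    except FileNotFoundError:
        print(f"The file '{file_path}' does not exist.")
    except json.JSONDecodeError as error:
        print(f"The file '{file_path}' is not valid JSON: {error}")
    except OSError as error:
        # Covers PermissionError and other operating-system level failures.
        print(f"Could not read '{file_path}': {error}")
    return default


with tempfile.TemporaryDirectory() as tmp_dir:
    good_path = os.path.join(tmp_dir, "good.json")
    bad_path = os.path.join(tmp_dir, "bad.json")
    with open(good_path, "w", encoding="utf-8") as json_file:
        json.dump({"City": "Calgary"}, json_file)
    with open(bad_path, "w", encoding="utf-8") as json_file:
        json_file.write("{not valid json")

    good = load_json(good_path)
    bad = load_json(bad_path, default={})
    missing = load_json(os.path.join(tmp_dir, "missing.json"), default={})

print(good, bad, missing)
```

The more specific FileNotFoundError is listed before OSError because Python uses the first except block that matches.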

3.6.8. Writing modules#

In Python, creating your own modules allows you to organize your code into separate files, making it easier to manage and reuse in different projects. A module in Python is simply a file with a .py extension containing Python code, and it can contain classes, functions, variables, and more [Downey, 2015, Python Software Foundation, 2023].

Here’s a step-by-step guide on how to create and use your own Python module:

  1. Create a new Python file with a .py extension. Let’s call it mymodule.py.

  2. Inside mymodule.py, you can define functions, classes, or any other code that you want to make available as part of the module.

# mymodule.py

def say_hello():
    print("Hello, this is my module!")
    
def add(a, b):
    return a + b

  3. Save the mymodule.py file in a directory where you’ll be working on your project.

  4. Now, you can use the module in another Python script or interactive session. To do this, make sure that the Python script or interactive session is in the same directory as mymodule.py, or you can also add the directory containing the module to the Python path.

# main.py (or any other Python script)

import mymodule

mymodule.say_hello()
result = mymodule.add(5, 3)
print(result)

Output:

Hello, this is my module!
8

  5. When you run main.py, it will import mymodule and call the functions defined in it.

Remember that you can have multiple functions, classes, or variables in your module, and they will be accessible through the mymodule object once you import it.

If you want to make your module more user-friendly and prevent certain parts from being executed when importing the module, you can use the if __name__ == "__main__": pattern. Code under this block will only run when you execute the module directly, not when you import it in another script.

# mymodule.py

def say_hello():
    print("Hello, this is my module!")
    
def add(a, b):
    return a + b

if __name__ == "__main__":
    # Code under this block will only run when the module is executed directly.
    # It won't run when the module is imported.
    print("This is the main part of my module.")
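The option of adding a directory to the Python path can be sketched as follows. This is only an illustration under stated assumptions: the module is generated on the fly in a temporary directory, its name scratch_demo_module is hypothetical, and importlib.import_module() is used in place of a plain import statement because the module does not exist until the script runs.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write a small module file (hypothetical name) into the temporary directory.
    module_file = Path(tmp_dir) / "scratch_demo_module.py"
    module_file.write_text(
        "def add(a, b):\n"
        "    return a + b\n"
        "\n"
        "if __name__ == '__main__':\n"
        "    print('run directly')\n",
        encoding="utf-8",
    )

    # Make the directory importable, import the module, then undo the change.
    sys.path.insert(0, tmp_dir)
    try:
        importlib.invalidate_caches()  # the file was created after startup
        demo_module = importlib.import_module("scratch_demo_module")
        total = demo_module.add(5, 3)
        module_name = demo_module.__name__
    finally:
        sys.path.remove(tmp_dir)

print(total, module_name)
```

When the module is imported, its __name__ is the module name rather than "__main__", so the guarded block at the bottom of the file does not run.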