An Introduction to the Art of Computer Programming Using Python in the Age of Generative AI

XI. File Handling and I/O

Introduction to File Handling

File handling is fundamental in most applications. It enables programs to read from and write to files on disk, preserving data between runs. Python provides a flexible set of tools for file operations, simplifying the storage of user inputs, configuration settings, logs, or any form of persistent data. Proper file handling is crucial for safe data exchange and preventing issues like data corruption, file lock conflicts, or memory overuse.

Reading and Writing Text Files

You can open text files with Python’s built-in open function. The mode parameter ('r' for reading, 'w' for writing, 'a' for appending, and so on) specifies how you interact with the file; note that 'w' truncates an existing file before writing. Files are opened in text mode by default. When you open a file in a with statement, Python closes it automatically when the block ends, even if an error occurs, which prevents resource leaks.


# Writing to a file
with open('example.txt', 'w') as file:
    file.write('Hello, World!')

# Reading from a file
with open('example.txt', 'r') as file:
    content = file.read()
    print(content)
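
Text mode also involves a character encoding, which depends on the platform unless you pass the encoding argument explicitly. The sketch below is one way to make that choice visible; the file name notes.txt and the temporary directory are assumptions made only to keep the example self-contained.

```python
import tempfile
from pathlib import Path

# Hypothetical file name; a temporary directory keeps the example self-contained.
with tempfile.TemporaryDirectory() as tmp_dir:
    notes_path = Path(tmp_dir) / 'notes.txt'

    # Write text with an explicit encoding instead of the platform default.
    with open(notes_path, 'w', encoding='utf-8') as file:
        file.write('café, naïve, 日本語\n')

    # Read it back with the same encoding so the characters round-trip.
    with open(notes_path, 'r', encoding='utf-8') as file:
        text = file.read()

    # On disk the file is larger than the character count, because
    # non-ASCII characters take more than one byte in UTF-8.
    raw_size = notes_path.stat().st_size

print(text.strip())
print(len(text), raw_size)
```

Passing the same encoding on both the write and the read side is usually the simplest way to avoid decoding errors when a file moves between machines.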
        

Sometimes you only need part of a file, or you want to process it line by line. file.readlines() returns every line as a list, which still loads the whole file into memory; iterating over the file object directly reads one line at a time:


# Reading a file line by line
with open('example.txt', 'r') as file:
    for line in file:
        print(line.strip())
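
If only the beginning of a file is needed, iteration can simply stop early. The following is a minimal sketch using itertools.islice; the helper name read_first_lines and the sample file are hypothetical and exist only for illustration.

```python
import tempfile
from itertools import islice
from pathlib import Path

def read_first_lines(path, count):
    """Return up to `count` lines from the start of a text file."""
    with open(path, 'r', encoding='utf-8') as file:
        # islice stops after `count` lines, so later lines are never processed.
        return [line.rstrip('\n') for line in islice(file, count)]

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / 'sample.txt'
    sample_path.write_text('one\ntwo\nthree\nfour\n', encoding='utf-8')
    first_two = read_first_lines(sample_path, 2)

print(first_two)  # ['one', 'two']
```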
        

Reading and Writing Binary Files

Not all data is text. Binary files (images, audio, executables) must be opened in binary mode by adding 'b' to the mode, as in 'rb' or 'wb'. In binary mode Python performs no decoding or newline translation, so the bytes are read and written exactly as stored on disk.


# Writing binary data to a file
with open('example.bin', 'wb') as file:
    file.write(b'\x00\x01\x02\x03')

# Reading binary data from a file
with open('example.bin', 'rb') as file:
    data = file.read()
    print(data)
        

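When handling large files, you can limit memory use by reading in fixed-size chunks: file.read(1024) returns at most 1024 bytes per call (or characters, in text mode) and an empty result at the end of the file, so a loop can process the data incrementally.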
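
As an illustration of chunked reading, the sketch below computes a SHA-256 checksum without loading the whole file at once. The function name sha256_of_file, the chunk size, and the generated test payload are assumptions chosen for the example, not part of any required API.

```python
import hashlib
import tempfile
from pathlib import Path

CHUNK_SIZE = 1024  # bytes per read, the same size used in the text above

def sha256_of_file(path, chunk_size=CHUNK_SIZE):
    """Hash a file incrementally so only one chunk is in memory at a time."""
    digest = hashlib.sha256()
    with open(path, 'rb') as file:
        while True:
            chunk = file.read(chunk_size)
            if not chunk:  # read() returns b'' at end of file
                break
            digest.update(chunk)
    return digest.hexdigest()

with tempfile.TemporaryDirectory() as tmp_dir:
    payload = bytes(range(256)) * 20  # 5120 bytes of sample data
    big_path = Path(tmp_dir) / 'big.bin'
    big_path.write_bytes(payload)
    chunked_digest = sha256_of_file(big_path)

# The incremental result matches hashing the data in one call.
print(chunked_digest == hashlib.sha256(payload).hexdigest())  # True
```

A larger chunk size (for example 64 KiB) typically means fewer read calls; the right value depends on the workload.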

Working with File Paths

The os.path and pathlib modules let you manipulate file paths in an OS-agnostic way, ensuring your code runs consistently across Windows, macOS, and Linux, where path structures differ.


import os
from pathlib import Path

# Using os.path
# Joining paths
path = os.path.join('folder', 'file.txt')
print(path)  # Output depends on the OS

# Getting the file extension
extension = os.path.splitext(path)[1]
print(extension)

# Using pathlib
path = Path('folder') / 'file.txt'
print(path)  # Output depends on the OS

# Getting the file extension
extension = path.suffix
print(extension)
        

pathlib.Path objects include methods for creating directories, checking whether a file exists, and more. They provide an object-oriented interface that is often clearer than string-based approaches.
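
A short sketch of a few of those Path methods follows. The directory and file names (data/reports, summary.txt) are hypothetical, and everything happens inside a temporary directory so the example leaves nothing behind.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create nested directories; exist_ok avoids an error if they already exist.
    reports_dir = Path(tmp_dir) / 'data' / 'reports'
    reports_dir.mkdir(parents=True, exist_ok=True)

    report_path = reports_dir / 'summary.txt'
    existed_before = report_path.exists()

    # write_text and read_text open, use, and close the file in one call.
    report_path.write_text('Total: 3\n', encoding='utf-8')
    existed_after = report_path.exists()
    report_text = report_path.read_text(encoding='utf-8')

    # glob finds entries in the directory that match a pattern.
    text_files = sorted(p.name for p in reports_dir.glob('*.txt'))

print(existed_before, existed_after)  # False True
print(report_text.strip())            # Total: 3
print(text_files)                     # ['summary.txt']
```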

Handling I/O Errors

Files can be missing, corrupted, or inaccessible. Python raises exceptions such as FileNotFoundError and PermissionError in these situations; both are subclasses of OSError. Catching them in a try-except block lets your program respond to a failed file operation instead of crashing.


try:
    with open('nonexistent.txt', 'r') as file:
        content = file.read()
except FileNotFoundError:
    content = 'File not found.'
except PermissionError:
    content = 'Permission denied.'
print(content)
        

Depending on the application, you can also log the error, retry the operation, or prompt the user for a different file path rather than substituting a default value.
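
One possible shape for logging and retrying is sketched below. The function name read_text_with_retry, its parameters, and the retry policy are all assumptions for illustration; a real application would choose its own limits and logging setup.

```python
import logging
import tempfile
import time
from pathlib import Path

logger = logging.getLogger(__name__)

def read_text_with_retry(path, attempts=3, delay=0.1, default=None):
    """Read a text file, retrying on OS errors and logging each failure."""
    for attempt in range(1, attempts + 1):
        try:
            with open(path, 'r', encoding='utf-8') as file:
                return file.read()
        except FileNotFoundError:
            # Retrying will not make a missing file appear, so stop here.
            logger.warning('File not found: %s', path)
            return default
        except OSError as error:
            # PermissionError and other I/O failures are subclasses of OSError.
            logger.warning('Attempt %d of %d failed for %s: %s',
                           attempt, attempts, path, error)
            if attempt < attempts:
                time.sleep(delay)
    return default

with tempfile.TemporaryDirectory() as tmp_dir:
    missing_path = Path(tmp_dir) / 'nonexistent.txt'
    fallback = read_text_with_retry(missing_path, default='File not found.')

    existing_path = Path(tmp_dir) / 'settings.txt'
    existing_path.write_text('debug=true', encoding='utf-8')
    loaded = read_text_with_retry(existing_path)

print(fallback)  # File not found.
print(loaded)    # debug=true
```

Catching FileNotFoundError before the broader OSError matters here, because Python uses the first matching except clause.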

Appending to Files

The 'a' mode in the open function appends data to an existing file or creates a new one if it doesn’t exist. This approach is useful for logs or any case where you want to preserve the existing file content.


# Appending to a file
with open('example.txt', 'a') as file:
    file.write('\nAppend this line.')

# Reading the appended file
with open('example.txt', 'r') as file:
    content = file.read()
    print(content)
        

Prompting Generative AI for Effective File Handling and I/O

Generative AI can help you design scripts for your specific I/O needs. By specifying file formats (e.g., CSV, JSON, or binary) and error-handling preferences, you can obtain examples that integrate reading, processing, and writing files seamlessly.

Example Prompt:
Generate a Python script that reads a CSV file, processes the data, and writes the results to a new CSV file, handling any file-related errors gracefully.

Resulting AI-generated code:


import csv

def process_csv(input_file, output_file):
    try:
        # The csv module recommends opening files with newline=''
        with open(input_file, 'r', newline='') as infile:
            reader = csv.reader(infile)
            data = list(reader)

        # Process data (example: convert all text to uppercase)
        processed_data = [[cell.upper() for cell in row] for row in data]

        with open(output_file, 'w', newline='') as outfile:
            writer = csv.writer(outfile)
            writer.writerows(processed_data)
        return "Processing complete."
    except FileNotFoundError:
        return "Input file not found."
    except PermissionError:
        return "Permission denied."
    except Exception as e:
        return f"An error occurred: {e}"

result = process_csv('input.csv', 'output.csv')
print(result)
        

Whether you require chunk-based reading for huge files, special file encodings, or structured data parsing, a well-crafted AI prompt can yield robust, targeted solutions. Always review and test any AI-generated code, particularly in production or security-sensitive environments.
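
For very large CSV files, a variant that streams rows instead of collecting them in a list may be worth requesting. The sketch below is one such variant under stated assumptions: the function name uppercase_csv_streaming and the sample data are hypothetical, and the uppercase step simply mirrors the processing in the example above.

```python
import csv
import tempfile
from pathlib import Path

def uppercase_csv_streaming(input_file, output_file):
    """Copy a CSV file row by row, uppercasing every cell."""
    rows_written = 0
    with open(input_file, 'r', newline='', encoding='utf-8') as infile, \
         open(output_file, 'w', newline='', encoding='utf-8') as outfile:
        writer = csv.writer(outfile)
        # Only one row is held in memory at a time.
        for row in csv.reader(infile):
            writer.writerow([cell.upper() for cell in row])
            rows_written += 1
    return rows_written

with tempfile.TemporaryDirectory() as tmp_dir:
    source_path = Path(tmp_dir) / 'input.csv'
    target_path = Path(tmp_dir) / 'output.csv'

    with open(source_path, 'w', newline='', encoding='utf-8') as file:
        csv.writer(file).writerows([['name', 'city'], ['ada', 'london']])

    row_count = uppercase_csv_streaming(source_path, target_path)

    with open(target_path, 'r', newline='', encoding='utf-8') as file:
        output_rows = list(csv.reader(file))

print(row_count)    # 2
print(output_rows)  # [['NAME', 'CITY'], ['ADA', 'LONDON']]
```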
