An Introduction to the Art of Computer Programming Using Python in the Age of Generative AI

XVII. Introduction to Algorithms and Efficiency

There should be one-- and preferably only one --obvious way to do it.
-- Tim Peters, The Zen of Python

An algorithm is a set of clearly defined, logical steps to solve a particular problem or accomplish a task. In Python, algorithms are often expressed as functions, taking specific inputs and returning results after performing a sequence of operations. However, it’s not enough to just have a correct solution—understanding how efficient it is, especially as input sizes grow, is essential. Evaluating efficiency involves time complexity (speed) and space complexity (memory), both of which can be summarized using Big-O notation.
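
For instance, here is a complete, if tiny, algorithm expressed as a Python function. It is a minimal sketch, a hand-rolled stand-in for the built-in max, shown purely to illustrate the shape of an algorithm: a defined input, a finite sequence of steps, and a result.

def find_maximum(values: list[int]) -> int:
    # Assumes at least one element; tracks the largest value seen so far
    largest = values[0]
    for value in values[1:]:
        if value > largest:
            largest = value
    return largest

print(f"Maximum: {find_maximum([3, 41, 12, 9, 74, 15])}")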

Big-O Notation

Big-O notation describes how an algorithm's runtime (or space requirement) scales as the input size grows. By ignoring constant factors and lower-order terms, it focuses on the dominant term that drives performance: for example, 3n^2 + 5n + 10 simplifies to O(n^2). By convention, Big-O describes the "Worst Case" scenario.

Notation     Name           Example               Speed
O(1)         Constant       Dict/List lookup      Instant
O(log n)     Logarithmic    Binary Search         Very Fast
O(n)         Linear         Simple Loop           Fast
O(n log n)   Linearithmic   Merge Sort, Timsort   Decent
O(n^2)       Quadratic      Nested Loops          Slow
O(2^n)       Exponential    Naive Recursion       Very Slow
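
To see these growth rates concretely, you can time the same workload at two input sizes. The following is an illustrative sketch (the helper names linear_work and quadratic_work are made up for the demo): doubling n roughly doubles the linear time but roughly quadruples the quadratic time.

import timeit

def linear_work(data: list[int]) -> int:
    # O(n): touches each element once
    return sum(data)

def quadratic_work(data: list[int]) -> int:
    # O(n^2): touches every pair of elements
    return sum(1 for x in data for y in data if x == y)

for n in (500, 1_000):
    data = list(range(n))
    t_linear = timeit.timeit(lambda: linear_work(data), number=10)
    t_quadratic = timeit.timeit(lambda: quadratic_work(data), number=10)
    print(f"n={n}: linear {t_linear:.4f}s, quadratic {t_quadratic:.4f}s")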

O(1) - Constant Time

An O(1) algorithm’s running time does not depend on the size of the input. Accessing an element in a list by index (data[0]) or looking up a key in a dictionary (data['key']) takes the same small amount of time whether the collection holds 10 items or 10 million (for dictionaries, this holds on average).


fruits: list = ["apple", "banana", "orange"]

def get_first_element(items: list):
    if items:  # an empty list is falsy, so this guards against IndexError
        return items[0]  # O(1): direct index access, no traversal
    return None

print(f"First fruit: {get_first_element(fruits)}")

O(log n) - Logarithmic Time

An O(log n) algorithm grows much more slowly than linear time. Typically, these algorithms shrink the problem by a constant fraction (usually half) at each step. Common examples include binary search and operations on balanced binary search trees, the structure behind many database indexes.


def binary_search(sorted_list: list[int], item: int) -> bool:
    first = 0
    last = len(sorted_list) - 1

    while first <= last:
        midpoint = (first + last) // 2
        if sorted_list[midpoint] == item:
            return True
        elif item < sorted_list[midpoint]:
            last = midpoint - 1   # Discard the right half
        else:
            first = midpoint + 1  # Discard the left half
    return False

numbers = [10, 20, 30, 40, 50, 60, 70, 80, 90]
print(f"Found 30? {binary_search(numbers, 30)}")
        
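
In day-to-day Python you rarely write binary search by hand: the standard library's bisect module implements it for you. A quick sketch of the same membership test using bisect_left:

from bisect import bisect_left

numbers = [10, 20, 30, 40, 50, 60, 70, 80, 90]

# bisect_left returns the index where the item would be inserted to keep order
index = bisect_left(numbers, 30)
print(f"Found 30? {index < len(numbers) and numbers[index] == 30}")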

O(n) - Linear Time

An O(n) algorithm must process each element in the input at least once. This is typical for simple loops, such as summing a list or searching for a value in unsorted data.


def sum_list(example_list: list[int]) -> int:
    total = 0
    for element in example_list:
        total += element
    return total

numbers = [1, 3, 5, 7]
print(f"Sum: {sum_list(numbers)}")
        

O(n log n) - Linearithmic Time

Algorithms with O(n log n) complexity are typically efficient sorting algorithms. They use a "divide and conquer" approach: the input is split roughly in half about log n times, and each level of splitting or merging does O(n) work, for roughly n * log n operations in total.

Note: Python's built-in list.sort() and sorted() functions use Timsort, a highly optimized O(n log n) algorithm derived from Merge Sort and Insertion Sort.


def quicksort(arr: list[int]) -> list[int]:
    if len(arr) <= 1:
        return arr

    pivot = arr[len(arr)//2]
    # List comprehensions iterate once: O(n)
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]

    # Recursive calls add the log(n) factor (on average; a consistently
    # bad pivot choice degrades quicksort to O(n^2))
    return quicksort(left) + middle + quicksort(right)

unsorted_list = [33, 10, 59, 26, -1, 2, 100]
sorted_list = quicksort(unsorted_list)
print(f"Sorted: {sorted_list}")
        

O(n^2) - Quadratic Time

In an O(n^2) algorithm, work is performed for every pair of elements in the input, which typically appears as one loop nested inside another. Sorting methods like bubble sort exhibit quadratic behavior, making them very slow for large datasets: 10,000 items require on the order of 100,000,000 comparisons.


def bubble_sort(arr: list[int]) -> list[int]:
    n = len(arr)
    # Outer loop runs n times
    for i in range(n):
        # Inner loop runs approx n times -> n * n = n^2
        for j in range(0, n-i-1):
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[j]
    return arr

unordered = [55, 2, -23, 6, 24]
print(f"Bubble Sort: {bubble_sort(unordered)}")
        
Generative AI Insight: The Cost of Context
The core mechanism of modern LLMs, Transformer self-attention, scales as roughly $O(n^2)$ in the length $n$ of the input text, because every token is scored against every other token. This is why the "Context Window" is a major bottleneck: doubling the length of your prompt requires roughly four times the compute and memory.
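
A minimal sketch of why this is quadratic (purely illustrative: real models use optimized tensor libraries, and the score function below is a stand-in, not real attention). Scoring every token against every other token produces an n x n matrix:

def attention_scores(embeddings: list[float]) -> list[list[float]]:
    # Every token is scored against every other token: n * n entries
    return [[query * key for key in embeddings] for query in embeddings]

tokens = [0.1, 0.5, 0.9, 0.3]
scores = attention_scores(tokens)
print(f"{len(tokens)} tokens -> {len(scores) * len(scores[0])} scores")
# Doubling the number of tokens quadruples the number of scores.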

O(2^n) - Exponential Time

O(2^n) algorithms have runtimes that roughly double with each additional element of input. Brute-force solutions to combinatorial problems often fall into this category: the naive recursive Fibonacci implementation is the classic example, and exhaustive search for the traveling salesperson problem is even worse, at O(n!).


def fibonacci(n: int) -> int:
    if n <= 1:
        return n
    else:
        # Two recursive calls per step = Exponential branching
        return fibonacci(n-1) + fibonacci(n-2)

print(f"Fibonacci(10): {fibonacci(10)}")
        

Optimization: Memoization

We can fix the performance of the exponential Fibonacci algorithm using memoization (caching). By storing results we have already calculated, each value is computed only once, which turns the $O(2^n)$ algorithm into an $O(n)$ one. Python makes this easy with the functools.lru_cache decorator.


from functools import lru_cache

@lru_cache(maxsize=None)
def fast_fibonacci(n: int) -> int:
    if n <= 1:
        return n
    return fast_fibonacci(n-1) + fast_fibonacci(n-2)

# The naive version would be infeasible here; the cached version is instant
print(f"Fibonacci(50): {fast_fibonacci(50)}")

Considering Space Complexity

Alongside time complexity, space complexity measures how much additional memory (RAM) an algorithm requires relative to the input size.

Understanding space requirements is crucial for large inputs. If your dataset is 10 GB, an algorithm that uses $O(n)$ auxiliary space may need another 10 GB of working memory on top of the input itself, potentially exhausting your system's RAM.
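
The memoized Fibonacci above illustrates the trade-off: the cache itself occupies $O(n)$ space. As a sketch of the alternative, an iterative version runs in $O(n)$ time while using only $O(1)$ extra space, because it keeps just two values at any moment:

def fibonacci_iterative(n: int) -> int:
    # O(n) time, O(1) space: only the last two values are retained
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(f"Fibonacci(50): {fibonacci_iterative(50)}")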

Prompting Generative AI for Efficient Code

When seeking help from a generative AI model, frame your questions to focus on the constraints of your algorithm. Instead of asking, “How do I sort a list?”, ask: "Write a memory-efficient sorting algorithm for a large list of integers."

Example Prompt:
Write a functional program for merge sort. Use type hints and explain the steps.

Resulting AI-generated code:


def merge_sort(lst: list[int]) -> list[int]:
    # Base case: list is already sorted
    if len(lst) <= 1:
        return lst

    # Divide: Find middle and split
    mid = len(lst) // 2
    left = merge_sort(lst[:mid])
    right = merge_sort(lst[mid:])

    # Conquer: Merge results
    return merge(left, right)

def merge(left: list[int], right: list[int]) -> list[int]:
    result = []
    i = j = 0
    # Compare the front of each half, taking the smaller element first
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # <= keeps equal elements in order (stable sort)
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # Add remaining elements
    result.extend(left[i:])
    result.extend(right[j:])
    return result

data = [55, 2, -23, 6, 24]
print(f"Merge Sorted: {merge_sort(data)}")
        
Back to Home