Computer scientist Edsger Dijkstra, a pioneer of structured programming, wrote in 1970: "Program testing can be used to show the presence of bugs, but never to show their absence!" Testing is essential because it helps confirm that your code behaves correctly under a variety of circumstances, while significantly reducing the likelihood of undiscovered defects. Although no amount of testing can prove that your program is 100% bug-free, maintaining a solid testing strategy is a cornerstone of professional software development.
Black-box testing involves validating a piece of software solely by examining its inputs and outputs, without direct knowledge of its internal code or structure. This approach focuses on functional requirements, ensuring the software produces correct results for a range of test cases and edge conditions. Because the tester doesn’t look inside the code, black-box testing is excellent for validating user-facing behavior and verifying compliance with specifications.
By contrast, white-box testing (sometimes called clear-box or glass-box testing) involves examining the software’s internal logic and code paths. Testers or developers use knowledge of the implementation details—such as specific functions, branches, and loops—to create tests that ensure each pathway in the code is exercised. White-box and black-box approaches often complement each other to provide broader coverage and assurance of software quality.
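To make the contrast concrete, here is a minimal sketch using a hypothetical grade() function (not from the source): the black-box test is derived purely from the specification's inputs and outputs, while the white-box test is written with the implementation in view so that every branch and boundary is exercised.
def grade(score):
    if score >= 90:
        return "A"
    elif score >= 60:
        return "Pass"
    else:
        return "Fail"

# Black-box: derived only from the specification (input -> expected output)
def test_grade_black_box():
    assert grade(95) == "A"
    assert grade(75) == "Pass"
    assert grade(30) == "Fail"

# White-box: written with the code in view, so every branch and boundary is hit
def test_grade_white_box():
    assert grade(90) == "A"      # first branch boundary
    assert grade(60) == "Pass"   # second branch boundary
    assert grade(59) == "Fail"   # else branch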
Debugging is the process of locating, diagnosing, and fixing errors within your code. Python’s built-in
debugger, pdb, is a powerful tool for stepping through your program line by line, inspecting
variables, and testing hypotheses about where the bug might be.
Modern Tip: In Python 3.7+, you can use the built-in function breakpoint(). By default it calls pdb.set_trace(), but it is cleaner to type and can be redirected to another debugger or disabled globally via the PYTHONBREAKPOINT environment variable.
import pdb

def add_numbers(a, b):
    # Execution will pause here, opening the interactive debugger
    pdb.set_trace()
    # Or simply: breakpoint()
    return a + b

result = add_numbers(1, 2)
print(f"Result: {result}")
When the program stops, you can use commands in the console: n (execute the next line), s (step into a function call), c (continue running), q (quit the debugger), and p <expression> to print a value (for example, p a).
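The same example can use the modern breakpoint() call instead; here is a minimal sketch (reusing the hypothetical add_numbers function), with the PYTHONBREAKPOINT environment variable providing the global off switch mentioned in the tip.
def add_numbers(a, b):
    # breakpoint() drops into pdb by default (Python 3.7+)
    breakpoint()
    return a + b

# Running the script with the environment variable PYTHONBREAKPOINT=0 set
# turns every breakpoint() call into a no-op, so debugging pauses can be
# disabled globally without editing the code.
print(f"Result: {add_numbers(1, 2)}")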
Unit tests validate the functionality of small, isolated parts (or “units”) of code.
Python’s built-in unittest module offers test case classes and assertion methods.
import unittest

def multiply(a, b):
    return a * b

class TestMultiplication(unittest.TestCase):
    def test_multiply(self):
        # Happy path
        self.assertEqual(multiply(2, 3), 6)
        # Edge cases
        self.assertEqual(multiply(-1, 3), -3)
        self.assertEqual(multiply(0, 3), 0)

if __name__ == '__main__':
    # 'exit=False' prevents the test runner from closing the environment
    unittest.main(argv=['first-arg-is-ignored'], exit=False)
While unittest is built in, the de facto standard in the Data Science and AI industry is pytest, which is less verbose and uses plain assert statements.
# pytest style (simpler and cleaner)
def test_multiply_pytest():
    assert multiply(2, 3) == 6
    assert multiply(-1, 3) == -3
    assert multiply(0, 3) == 0
When building AI applications, you cannot call the real API (like OpenAI) every time you run tests—it is slow, expensive, and non-deterministic. Instead, you use Mocking to simulate the API response.
from unittest.mock import MagicMock
# Simulate an AI client (like openai.OpenAI())
mock_ai_client = MagicMock()
# Define what the "fake" API should return when called
# This mimics the structure: client.chat.completions.create()
mock_ai_client.chat.completions.create.return_value = "Mocked AI Response"
# Run your code using the mock instead of the real client
# This verifies your logic without spending money or requiring internet
result = mock_ai_client.chat.completions.create(model="gpt-4", messages=[])
print(f"Result from mock: {result}")
Test-Driven Development (TDD) is a process where you write unit tests before writing the code that satisfies those tests. It commonly follows the "Red-Green-Refactor" cycle: Red (write a failing test for the behavior you want), Green (write just enough code to make the test pass), and Refactor (clean up the implementation while keeping the tests green).
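A minimal sketch of one pass through the cycle, using a hypothetical is_even() function (not from the source): the test is written first and fails because the function does not exist yet (Red), the simplest implementation makes it pass (Green), and any later cleanup happens with the test still guarding the behavior (Refactor).
# Red: the test comes first and fails, because is_even() does not exist yet
def test_is_even():
    assert is_even(4) is True
    assert is_even(7) is False

# Green: write just enough code to make the test pass
def is_even(n):
    return n % 2 == 0

# Refactor: rename, simplify, or optimize later; the test keeps the behavior safe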
Traditional unit tests assert exact, deterministic values (for example, assert result == 10). However, Generative AI outputs are non-deterministic text. You cannot simply assert that a summarized essay "is correct."
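One common workaround, shown here as an illustrative sketch rather than a prescribed method, is to assert on measurable properties of the generated text (type, length, required keywords) instead of its exact wording.
def test_summary_properties():
    # Hypothetical output from a summarization call; in a real test this would
    # come from the model (or from a mocked client, as shown earlier)
    summary = "Paris is the capital of France and a major cultural hub."

    # Assert on properties rather than exact wording
    assert isinstance(summary, str)
    assert 0 < len(summary) < 500       # plausible length
    assert "Paris" in summary           # required keyword is present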
Generative AI is excellent at writing boilerplate test code. You can paste a function into an LLM and ask it to generate comprehensive unit tests, including edge cases you might have missed.
Example Prompt: ask the LLM to generate comprehensive unit tests for the calculate_discount function below, including edge cases you might have missed.

Resulting AI-generated code:
import unittest

def calculate_discount(price, rate):
    if price < 0 or rate < 0:
        raise ValueError("Inputs must be non-negative")
    return price * (1 - rate)

class TestDiscount(unittest.TestCase):
    def test_standard(self):
        self.assertEqual(calculate_discount(100, 0.2), 80.0)

    def test_full_discount(self):
        self.assertEqual(calculate_discount(100, 1.0), 0.0)

    def test_no_discount(self):
        self.assertEqual(calculate_discount(100, 0.0), 100.0)

    def test_negative_input(self):
        with self.assertRaises(ValueError):
            calculate_discount(-50, 0.1)

if __name__ == '__main__':
    unittest.main(argv=['first-arg-is-ignored'], exit=False)