A humorous yet practical guide to AI-assisted development. DON'T PANIC.
Risk Level: 🟢 Essential
TESTING (n.): The practice of verifying that code works. With AI-generated code, testing is your primary defense against confident wrongness. The AI will never tell you something is broken. Tests will.
AI can generate tests. But AI-generated tests inherit the AI's blind spots:

- They may assert behavior you never specified
- They may import libraries that don't exist
- They may pass trivially without actually checking anything

Solution: Generate tests, but verify they test what matters.
Before generating code, write the spec:

```
I need a function that:
- Takes a list of numbers
- Returns the average
- Returns 0 for empty lists
- Ignores non-numeric values
```

Then ask for the tests before the implementation:

```
Write tests for that function before implementing it.
Use pytest.
Include edge cases.
```
Then review what comes back. Do the tests match your specification? Do the tests make sense?
```python
# Good test
def test_average_ignores_strings():
    assert average([1, "two", 3]) == 2.0

# Suspicious test - why would this be expected?
def test_average_returns_none_for_empty():
    assert average([]) is None
    # Wait, spec said return 0, not None...
```
"Now implement the function to pass these tests."
pytest
If tests fail, you learned something. Fix and repeat.
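For reference, here is one minimal sketch of an implementation that satisfies the spec above. The decision to exclude booleans (which Python treats as ints) is an assumption, not part of the spec:

```python
def average(values):
    """Average the numeric values; ignore non-numerics; return 0 if none remain."""
    nums = [v for v in values if isinstance(v, (int, float)) and not isinstance(v, bool)]
    if not nums:
        return 0
    return sum(nums) / len(nums)
```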
A fuller suite, covering the happy path and the edge cases:

```python
import pytest

def test_average_normal_case():
    assert average([1, 2, 3, 4, 5]) == 3.0

def test_average_empty_list():
    assert average([]) == 0

def test_average_single_element():
    assert average([42]) == 42

def test_average_with_invalid_input():
    assert average([1, "two", 3]) == 2.0

def test_average_very_large_numbers():
    # approx: float division is not exact at this magnitude
    assert average([10**100, 10**100]) == pytest.approx(10**100)
```
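The same coverage can be expressed more compactly with `pytest.mark.parametrize`; a sketch of one idiomatic option:

```python
import pytest

@pytest.mark.parametrize(
    "values, expected",
    [
        ([1, 2, 3, 4, 5], 3.0),  # happy path
        ([], 0),                 # empty list, per spec
        ([42], 42),              # single element
        ([1, "two", 3], 2.0),    # non-numerics ignored
    ],
)
def test_average(values, expected):
    assert average(values) == expected
```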
Some checks are specific to AI-generated code:

```python
def test_uses_real_library():
    # If this import fails, AI hallucinated the library
    from actual_library import actual_function

def test_function_only_does_what_asked():
    result = process_data(input_data)
    # Verify it didn't add extra fields
    assert set(result.keys()) == {"expected", "fields", "only"}

def test_works_with_existing_code():
    # Test that AI code integrates with your codebase
    existing_result = existing_function()
    ai_result = ai_generated_function(existing_result)
    assert ai_result is not None
```

(The names here are placeholders; substitute your own modules and functions.)
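To make the hallucination check concrete, here is a sketch using the standard library's `importlib.util.find_spec`. The package names are hypothetical; list your project's real dependencies:

```python
import importlib.util

def test_dependencies_exist():
    # Fails fast if the AI invented a package that isn't installed
    for package in ("requests", "dateutil"):  # hypothetical dependency list
        assert importlib.util.find_spec(package) is not None, f"missing: {package}"
```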
"Generate pytest tests for this function.
Include:
- 3 happy path tests
- 3 edge case tests
- 2 error case tests
Follow this format:
def test_descriptive_name():
'''What this tests.'''
# Arrange
input = ...
# Act
result = function(input)
# Assert
assert result == expected"
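For example, an error-case test in that format might look like this. It assumes the sketch implementation above, where iterating over `None` raises a `TypeError`:

```python
import pytest

def test_average_rejects_non_iterable():
    '''Error case: None is not a list of numbers.'''
    # Arrange
    bad_input = None
    # Act / Assert
    with pytest.raises(TypeError):
        average(bad_input)
```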
AI might generate:

- Tests that assert behavior you never specified (like returning `None` instead of `0`)
- Tests that import libraries that don't exist
- Tests that pass no matter what the code does

You catch these by reading the tests.
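That last failure mode is easy to miss because the test looks legitimate. A hypothetical example:

```python
# Looks like a test, tests nothing: it passes for any
# implementation that returns something other than None
def test_average_works():
    result = average([1, 2, 3])
    assert result is not None
```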
The levels of testing maturity:

| Level | Practice | Verdict |
| --- | --- | --- |
| 0 | No tests. AI says it works, you hope it works | 🔴 Dangerous |
| 1 | AI-generated tests, unreviewed. Better than nothing | 🟡 Risky |
| 2 | AI-generated tests, reviewed. You verified they test the right things | 🟢 Good |
| 3 | Spec-first tests, then implementation. Tests define correctness, code follows | 🟢 Better |
| 4 | Tests + review + CI/CD. Automated verification on every commit | 🟢 Best |
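Level 4 needs surprisingly little setup. A minimal sketch of a GitHub Actions workflow; the file path, Python version, and action versions are assumptions, so adapt to your CI of choice:

```yaml
# .github/workflows/tests.yml
name: tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install pytest
      - run: pytest
```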
> “The AI cannot test its own correctness. Tests are how you test the AI.”
For your next AI-generated function:

1. Write the spec
2. Ask for tests first
3. Review the tests against the spec
4. Generate the implementation
5. Run `pytest`; fix and repeat

Build the muscle memory of test-first AI development.