Why testing only use cases isn’t enough; here’s how to do it right
Traditional unit tests are setting you up for production failures. Here's a path to a better launch experience: use property-based testing to build unbreakable software.
Most teams are building on quicksand and calling it solid ground.
You’ve got 95% test coverage. Your CI pipeline is green. Your unit tests pass every time. You ship with confidence, and then — crash — production burns down because of an edge case nobody thought to test. The user entered a 37-character string instead of 35. Someone sent a negative timestamp. A list came back empty when your code expected at least one item.
Sound familiar? Here’s an uncomfortable truth: traditional unit testing is fundamentally inadequate for building reliable software.
Unit tests are like taking a single photograph of a mountain and claiming you understand the entire landscape. They capture one specific moment, one exact scenario, but they miss the infinite variations that real users will throw at your system. They’re missing the crevasses, caves, and sudden unexpected features. They’re necessary, but they’re not sufficient.
After decades of building and shipping software — from startup MVPs to mission-critical systems handling millions of transactions — I’ve learned that the teams who ship bulletproof software have one thing in common: They know that testing specific use cases is inadequate.
They don’t buy into the illusion of safety that unit testing offers. They use property-based testing.
Property-based testing doesn’t just find bugs. It finds entire categories of bugs you never knew existed. It’s the difference between checking that your bridge can hold one specific car versus proving it can fulfill its design parameters by handling a constant, unpredictable flow of traffic.
The fundamental flaw in how we think about testing
Let me paint you a picture. You’re building a user activity tracking system (let’s call it Tracky). Your traditional testing approach might look something like this:
test "creates activity with valid data" do
user = insert(:user)
sprint = insert(:sprint)
assert {:ok, activity} = Activities.new_activity(
sprint.id,
user.id,
~D[2024-01-15],
~T[02:30:00],
"Working on feature X"
)
assert activity.duration == ~T[02:30:00]
assert activity.description != nil
endThis test passes. You ship. Then production explodes because:
Someone enters a duration of 25 hours
A date falls outside the sprint boundaries
The description is 500 characters long and breaks your UI
Someone passes a nil sprint_id
The timestamp has microseconds that cause floating-point precision issues
Your unit test checked one path through the code. But software doesn’t operate in the controlled environment of your test suite. It operates in the chaos of production, where users do unexpected things, data comes in unexpected formats, and Murphy’s Law reigns supreme.
Instead of testing specific examples, we should be testing universal properties that must always hold true.
Property-based testing builds a foundation for quality by exercising all the behavioral boundaries of your code. This is exactly how we approach development using a “steel thread.” A steel thread creates an architecturally complete foundation by touching all the major components of a system.
The truth about test coverage
Here’s something that might challenge a few of your assumptions: high unit test coverage can actually make your software less reliable.
I’ve seen teams with 98% test coverage ship catastrophic bugs because they confused coverage with correctness. They tested their happy paths thoroughly but ignored the vast space of possible inputs that could break their system.
Property-based testing breaks through this limitation. Instead of measuring how much of your code is exercised, you’re measuring how much of your input space is validated. Instead of asking, “did we test this line of code?” you’re asking, “did we prove this business rule is always true?”
A single property test might only exercise 20% of your codebase but validate behavior across millions of potential inputs — including edge cases. Which approach gives you more confidence in production?
The takeaway here: coverage metrics optimize for the wrong thing. They optimize for testing effort rather than testing effectiveness.
What is a “property”?
A “property” is a universal law or attribute. Gravity is a property that dictates how objects are attracted to each other. Fluid dynamics define a set of properties about how fluids flow and interact. And, our Tracky application has certain properties, too:
Activities must fall inside the date boundary of a sprint.
An activity is limited in time (for our purposes, it’s a business rule that “activities must be less than 8 hours in duration”).
Activities must be owned by a user, and that user can view, update, and delete only their own activities (not someone else’s activities).
Activities can, optionally, have a description.
Of course there’s more, but this is sufficient for the examples we’ll look at in this article.
These properties define how Tracky behaves: what you are allowed to do with it, and what you are not allowed to do. They are universal laws — in all conditions, an activity must fulfill each of them — and, very importantly, the system must gracefully prevent their violation.
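To make these laws concrete, here is a minimal sketch of how Tracky might encode two of them as guard functions the rest of the system calls before accepting an activity. The module and function names here are my own invention for illustration, not Tracky’s actual code:

defmodule Tracky.ActivityRules do
  # Hypothetical guards enforcing Tracky's activity laws.

  @max_duration ~T[08:00:00]

  # Law: activities must be less than 8 hours in duration.
  def valid_duration?(%Time{} = duration) do
    Time.compare(duration, @max_duration) == :lt
  end

  # Law: an activity's date must fall inside the sprint's boundaries.
  def within_sprint?(%Date{} = date, %Date{} = sprint_start, %Date{} = sprint_end) do
    Date.compare(date, sprint_start) != :lt and Date.compare(date, sprint_end) != :gt
  end
end

A property test’s job is then to prove that no input, however strange, ever gets past guards like these.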
What property-based testing actually does
We’re all familiar with unit testing and testing very specific “success criteria,” as I described above. The problem with that approach is the inherent limitation of testing a single path at a time; it’s an impossible burden to cover every edge case that way.
Property-based testing expands our test horizon. Instead of writing specific test cases, you define properties — universal truths about your system — and let the testing framework try to break them using hundreds or thousands of generated test cases.
In short, you write one generalized test scenario, and the system uses it to create thousands of potential test paths for you.
Let’s start with a simple example.
In this example I use a “generator” to seed my test with test data — if you aren’t familiar with generators, don’t worry, we’ll come back to them. For now, just understand that a generator creates an infinite, essentially random stream of test data. Generators are the powerhouse behind property-based testing.
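As a minimal illustration, assuming Elixir’s StreamData library (a widely used property-testing package; your framework’s generators will look similar), a generator is just a lazy, composable stream of values:

import StreamData

# An endless stream of integers between 1 and 480; take five of them.
integer(1..480) |> Enum.take(5)
#=> e.g. [1, 212, 47, 389, 480]  (different values on every run)

# Generators compose: random alphanumeric strings up to 20 characters long.
string(:alphanumeric, max_length: 20) |> Enum.take(3)
#=> e.g. ["k3x", "", "A9fqL2w"]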
Let’s start by replacing my naive “creates activity with valid data” test with a property-based test. Here’s how it might look:
property "activities must have reasonable durations and valid dates within sprint bounds" do
check all sprint <- sprint_generator(),
user <- user_generator(),
duration <- valid_activity_duration_generator(),
date <- member_of(Date.range(sprint.start, sprint.end)),
description <- lorem_style_phrase_generator_or_empty() do
result = Activities.new_activity(
sprint.id,
user.id,
date,
duration,
description
)
# Property 1: Valid inputs should always succeed
assert {:ok, activity} = result
# Property 2: Output duration should match input duration
assert activity.duration == duration
# Property 3: Activity date must be within sprint bounds
assert Date.compare(activity.date, sprint.start_date) != :lt
assert Date.compare(activity.date, sprint.end_date) != :gt
# Property 4: Description, sprint, and user must all match
assert activity.description == description
assert activity.sprint == sprint
assert activity.user == user
end
endI’m using Elixir here, but that doesn’t matter. There are property-based test frameworks available for just about every language.1
At first glance, this might not seem to be much different. I’m still creating activities, and comparing the properties of those created activities with expected values.
But, this single property test does the work of thousands of individual unit tests. It generates valid sprints with random date ranges, creates users with different roles, generates durations up to 8 hours, picks dates within the sprint bounds, and creates descriptions of varying lengths. It does this many times, creating a myriad of combinations and edge cases that we would never think to code up by hand.
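The generators referenced in that property are where all of this variety comes from. Here is a hedged sketch of what they might look like with StreamData; the function names match the hypothetical Tracky code above, yours would reflect your own domain, and a user_generator would follow the same pattern:

defmodule TrackyGenerators do
  import StreamData

  # Durations under 8 hours, in whole minutes (the business rule is < 8 hours).
  def valid_activity_duration_generator do
    map(integer(1..479), fn minutes ->
      Time.add(~T[00:00:00], minutes * 60, :second)
    end)
  end

  # Sprints starting up to a year in the past or future, 1 to 4 weeks long.
  def sprint_generator do
    bind(integer(-365..365), fn offset ->
      start_date = Date.add(Date.utc_today(), offset)

      map(integer(7..28), fn length_in_days ->
        %{
          id: System.unique_integer([:positive]),
          start_date: start_date,
          end_date: Date.add(start_date, length_in_days)
        }
      end)
    end)
  end

  # Descriptions: nil, empty, or a printable string of varying length.
  def lorem_style_phrase_generator_or_empty do
    one_of([constant(nil), constant(""), string(:printable, max_length: 500)])
  end
end

Every run of the property draws fresh values from generators like these and pushes each combination through the assertions.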
This means you’ll see a lot of combinations:
An administrator creating an activity with a minute-long duration.
Another user creating a 5-hour activity, mid-sprint, with a long description filled with emoji and Unicode characters.
Activities with nil, empty, very short, and very long descriptions with a variety of punctuation.
Activities that belong to sprints far in the future, or that started (and ended) a month ago.
And on it goes, with many, many combinations of users, sprints, dates, durations and descriptions.
And, it gets more powerful: With property testing we also test the boundaries. What happens when the duration is exactly 8 hours? What about when it’s one second over? What about when the date is exactly on the sprint start date? Or the end date? It’s easy to write negative scenario tests as well, just by introducing a generator that creates bad inputs.
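Here is a hedged sketch of that negative side, again assuming StreamData and the hypothetical Tracky API from above: a property asserting that durations of 8 hours or more are always refused.

property "durations of 8 hours or more are always rejected" do
  check all sprint <- sprint_generator(),
            user <- user_generator(),
            # Bad input: exactly 8 hours up to 23 hours 59 minutes, in minutes
            minutes <- integer(480..1439),
            date <- member_of(Date.range(sprint.start_date, sprint.end_date)) do
    too_long = Time.add(~T[00:00:00], minutes * 60, :second)

    assert {:error, _reason} =
             Activities.new_activity(sprint.id, user.id, date, too_long, "too long")
  end
end

When this property fails, the framework typically shrinks the counterexample toward the smallest offending value, pointing you straight at the 8-hour boundary.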
The testing framework explores the entire input space systematically, looking for the edge cases and combinations of edge cases that will break your system in production.
The architecture of unbreakable software
Property-based testing forces you to think architecturally about your code. When you write properties, you’re not thinking about specific scenarios — you’re thinking about the fundamental invariants that must hold true regardless of input.
Thinking in terms of property-based testing makes architecture easier. We’re shifting away from, “let’s make this one specific test case succeed” and toward, “let’s define conditions that prove these fundamental truths always hold, no matter what we throw at the system.”
Here are some of those “fundamental truths” property-based testing helps to verify:
Data integrity properties: entities must always maintain appropriate relationships.
Temporal properties (time and dates) must fall within suitable boundaries.
Access and ownership must always be enforced: users can view and change only their own data.
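To illustrate that last property, here is a hedged sketch of an ownership check. Activities.update_activity/3, activity_generator/1, and the :not_authorized error are assumptions about Tracky’s API rather than confirmed code:

property "a user can never update someone else's activity" do
  check all owner <- user_generator(),
            intruder <- user_generator(),
            owner.id != intruder.id,
            activity <- activity_generator(owner),
            new_description <- string(:printable, max_length: 200) do
    # The owner may update their own activity...
    assert {:ok, _updated} =
             Activities.update_activity(activity.id, owner.id, %{description: new_description})

    # ...anyone else must be turned away.
    assert {:error, :not_authorized} =
             Activities.update_activity(activity.id, intruder.id, %{description: new_description})
  end
end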