Test Data Management
Effective test data management ensures tests are readable, maintainable, and use realistic data that mirrors production scenarios.
Overview
Test data management involves creating, maintaining, and cleaning up test data in a way that makes tests reliable, fast, and easy to understand. Poor test data practices lead to flaky tests (failures due to data issues, not code bugs), hard-to-maintain tests (scattered data creation logic), and hard-to-debug failures (unclear what data was being tested).
The test data problem: Tests need data - users, accounts, payments, transactions. Creating this data inline in every test leads to duplication, inconsistency, and brittleness. When a domain model changes (e.g., a new required field is added to the Payment entity), hundreds of tests break because each test constructs Payment objects directly. Hard-coding test data makes tests fragile - they break when you change implementation details like database IDs or timestamp formats, even though the business logic being tested hasn't changed.
The solution: Test data builders (Java) and factories (TypeScript/JavaScript) provide reusable, flexible data creation with sensible defaults. They centralize data creation logic in one place, making it easy to update when models change - adding a new required field means updating one builder, not hundreds of tests. They provide meaningful defaults so tests only specify what's relevant to the scenario being tested (e.g., only the amount for a payment test, not recipient, currency, timestamp, etc.). They use realistic data (via tools like Faker) to catch bugs that only surface with production-like inputs, such as long names breaking UI layouts or special characters causing encoding issues.
This applies to all test types. Unit tests need objects for testing business logic. Integration tests need database records. E2E tests need full user scenarios with multiple related entities. Consistent test data patterns work across all these contexts.
Applies to: Spring Boot · Angular · React · React Native · Android · iOS
Test data management principles apply universally. Use builders (Java/Kotlin) or factories (TypeScript/JavaScript/Swift) for clean test data creation.
Core Principles
- Realistic Data: Use data that resembles production scenarios
- Builder Pattern: Create reusable test data builders for complex objects
- Test Isolation: Each test creates and cleans up its own data
- Readable Tests: Test data creation should enhance test readability
- No Shared State: Avoid sharing test data across tests
Test Data Builders (Java)
Basic Builder Pattern
The builder pattern provides a fluent API for constructing test objects. Each builder method returns this, allowing method chaining. The static factory method aPayment() provides a readable entry point that reads naturally in tests: PaymentTestBuilder.aPayment().withAmount(...).build().
Why builders work: Builders separate object construction (complex, with many fields) from test logic (simple, focused on one scenario). They provide sensible defaults for all fields, so tests only customize what matters for that specific test case. This makes tests more readable - you see immediately what's special about the test data - and more maintainable - adding a new field to Payment means updating only the builder.
public class PaymentTestBuilder {
private UUID id = UUID.randomUUID();
private BigDecimal amount = new BigDecimal("100.00");
private String currency = "USD";
private String recipient = "John Doe";
private PaymentStatus status = PaymentStatus.PENDING;
private LocalDateTime createdAt = LocalDateTime.now();
private String reference = "Test payment";
public static PaymentTestBuilder aPayment() {
return new PaymentTestBuilder();
}
public PaymentTestBuilder withId(UUID id) {
this.id = id;
return this;
}
public PaymentTestBuilder withAmount(BigDecimal amount) {
this.amount = amount;
return this;
}
public PaymentTestBuilder withCurrency(String currency) {
this.currency = currency;
return this;
}
public PaymentTestBuilder withRecipient(String recipient) {
this.recipient = recipient;
return this;
}
public PaymentTestBuilder withStatus(PaymentStatus status) {
this.status = status;
return this;
}
public PaymentTestBuilder completed() {
this.status = PaymentStatus.COMPLETED;
return this;
}
public PaymentTestBuilder failed() {
this.status = PaymentStatus.FAILED;
return this;
}
public Payment build() {
Payment payment = new Payment();
payment.setId(id);
payment.setAmount(amount);
payment.setCurrency(currency);
payment.setRecipient(recipient);
payment.setStatus(status);
payment.setCreatedAt(createdAt);
payment.setReference(reference);
return payment;
}
public PaymentRequest buildRequest() {
PaymentRequest request = new PaymentRequest();
request.setAmount(amount);
request.setCurrency(currency);
request.setRecipient(recipient);
request.setReference(reference);
return request;
}
}
Usage in Tests
These tests demonstrate the builder pattern's value. The first test only specifies what's relevant to the scenario (high-value payment in EUR) - the builder provides reasonable defaults for recipient, status, and other fields. The second test uses even fewer customizations, relying almost entirely on defaults. This keeps tests focused on the scenario being tested, not on boilerplate data construction.
Notice how the test clearly communicates intent: when you see .withAmount(new BigDecimal("10000.00")), you immediately understand this test is about high-value payments. You don't need to wade through dozens of field assignments to find what's important.
@Test
void shouldProcessHighValuePayment() {
// Arrange: Build specific test data
Payment payment = PaymentTestBuilder.aPayment()
.withAmount(new BigDecimal("10000.00"))
.withCurrency("EUR")
.build();
// Act
PaymentResponse response = paymentService.process(payment);
// Assert
assertThat(response.getStatus()).isEqualTo(PaymentStatus.REQUIRES_APPROVAL);
}
@Test
void shouldCompleteSmallPayment() {
// Arrange: Use defaults with small customization
Payment payment = PaymentTestBuilder.aPayment()
.withAmount(new BigDecimal("50.00"))
.build();
// Act
PaymentResponse response = paymentService.process(payment);
// Assert
assertThat(response.getStatus()).isEqualTo(PaymentStatus.COMPLETED);
}
Object Mother Pattern
The Object Mother pattern extends builders by providing pre-configured instances for common scenarios. While PaymentTestBuilder offers fine-grained control, PaymentMother provides ready-made configurations for typical use cases. This reduces duplication when many tests need the same setup (e.g., "small payment," "large payment").
When to use: Use Object Mothers when you have recurring test scenarios that appear across multiple test classes. If you find yourself writing the same builder configuration repeatedly (.withAmount(...).withStatus(...) in 10 different tests), create an Object Mother method. This centralizes the scenario definition and makes tests more expressive - PaymentMother.largePayment() is clearer than scattered builder calls.
public class PaymentMother {
public static Payment smallPayment() {
return PaymentTestBuilder.aPayment()
.withAmount(new BigDecimal("10.00"))
.build();
}
public static Payment largePayment() {
return PaymentTestBuilder.aPayment()
.withAmount(new BigDecimal("50000.00"))
.build();
}
public static Payment completedPayment() {
return PaymentTestBuilder.aPayment()
.completed()
.build();
}
public static Payment failedPayment() {
return PaymentTestBuilder.aPayment()
.failed()
.build();
}
public static Payment internationalPayment() {
return PaymentTestBuilder.aPayment()
.withCurrency("EUR")
.withAmount(new BigDecimal("500.00"))
.build();
}
}
// Usage
@Test
void shouldRequireApprovalForLargePayments() {
Payment payment = PaymentMother.largePayment();
PaymentResponse response = paymentService.process(payment);
assertThat(response.getStatus()).isEqualTo(PaymentStatus.REQUIRES_APPROVAL);
}
Test Data Factories (TypeScript)
Factory Functions
In TypeScript and JavaScript, factory functions are a more common idiom than builder classes. A factory accepts a partial object of overrides and merges it with a set of defaults, which takes less code than a fluent builder.
How factories work: The factory defines default values for all properties. The overrides parameter allows tests to customize specific fields while accepting defaults for everything else. The spread operator (...overrides) merges the overrides with defaults, with overrides taking precedence. This achieves the same goals as Java builders - sensible defaults, minimal test customization, maintainability - with less boilerplate.
// tests/factories/paymentFactory.ts
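import { faker } from '@faker-js/faker';
// Payment, PaymentRequest, and PaymentStatus are the application's own types; import them from wherever your project defines them.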
export interface PaymentData {
id?: string;
amount?: number;
currency?: string;
recipient?: string;
status?: PaymentStatus;
createdAt?: Date;
reference?: string;
}
export function createPayment(overrides: PaymentData = {}): Payment {
return {
id: faker.string.uuid(),
amount: 100.00,
currency: 'USD',
recipient: 'John Doe',
status: 'PENDING',
createdAt: new Date(),
reference: 'Test payment',
...overrides
};
}
export function createPaymentRequest(overrides: Partial<PaymentRequest> = {}): PaymentRequest {
return {
amount: 100.00,
currency: 'USD',
recipient: 'John Doe',
reference: 'Test payment',
...overrides
};
}
// Convenience factories
export function createCompletedPayment(overrides: PaymentData = {}): Payment {
return createPayment({ status: 'COMPLETED', ...overrides });
}
export function createFailedPayment(overrides: PaymentData = {}): Payment {
return createPayment({ status: 'FAILED', ...overrides });
}
export function createLargePayment(overrides: PaymentData = {}): Payment {
return createPayment({ amount: 50000.00, ...overrides });
}
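The order of the spread matters in the convenience factories above: because `...overrides` comes after the preset field, a caller can still replace the preset. The sketch below illustrates this with a trimmed-down, self-contained version of the factory. The `Payment` type, the fixed ID, and the reduced field list are simplifying assumptions for illustration, not the project's real model. It also shows one caveat of spread-based merging: a key passed explicitly as `undefined` replaces the default.

```typescript
// Simplified stand-in for the real Payment model (assumption for illustration).
type PaymentStatus = 'PENDING' | 'COMPLETED' | 'FAILED';

interface Payment {
  id: string;
  amount: number;
  currency: string;
  recipient: string;
  status: PaymentStatus;
}

function createPayment(overrides: Partial<Payment> = {}): Payment {
  return {
    id: 'payment-1', // fixed ID keeps this sketch deterministic
    amount: 100.0,
    currency: 'USD',
    recipient: 'John Doe',
    status: 'PENDING',
    ...overrides, // spread last: caller-supplied fields win over defaults
  };
}

function createCompletedPayment(overrides: Partial<Payment> = {}): Payment {
  // Preset first, overrides second: the caller can still change the status.
  return createPayment({ status: 'COMPLETED', ...overrides });
}

const preset = createCompletedPayment(); // status is 'COMPLETED'
const overridden = createCompletedPayment({ status: 'FAILED' }); // status is 'FAILED'

// Caveat: an explicit `undefined` is copied by the spread and replaces the default.
const missingRecipient = createPayment({ recipient: undefined });
```

If a test needs to leave a field at its default, omit the key entirely rather than passing `undefined`.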
Usage in Tests
The factory pattern makes tests concise and focused. The first test only specifies the amount (250.00), accepting factory defaults for all other fields. The second test uses a pre-configured convenience factory createLargePayment() which encodes the "large payment" scenario - no need to remember what amount qualifies as "large" in every test that needs one.
This approach follows the Test Data Management principle of realistic data - factories can use Faker to generate realistic names, emails, and other values, catching bugs that only surface with production-like data patterns.
import { createPayment, createPaymentRequest, createLargePayment } from '../factories/paymentFactory';
describe('PaymentService', () => {
it('should process payment successfully', async () => {
// Arrange: Use factory with minimal customization
const request = createPaymentRequest({ amount: 250.00 });
// Act
const result = await paymentService.processPayment(request);
// Assert
expect(result.status).toBe('COMPLETED');
expect(result.amount).toBe(250.00);
});
it('should require approval for large payments', async () => {
// Arrange: Use pre-configured factory
const payment = createLargePayment();
// Act
const result = await paymentService.processPayment(payment);
// Assert
expect(result.status).toBe('REQUIRES_APPROVAL');
});
});
Using Faker for Realistic Data
Faker.js generates realistic random data - names, emails, addresses, phone numbers, financial amounts, dates. This catches bugs that only surface with production-like data: long names that break UI layouts, special characters in addresses that cause encoding issues, edge case dates that expose timezone bugs.
Why Faker matters: Hard-coded test data like "John Doe" and "test@example.com" doesn't exercise the same code paths as production data. Real users have names like "Mary O'Brien-Smith" (apostrophe, hyphen, long) or "José García" (accented characters). Real emails have subdomains and plus-signs. Faker generates diverse, realistic data that catches these edge cases.
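Use Faker for fields that don't affect test logic. For critical fields that determine test behavior (like payment amounts or status codes), use explicit values so the test clearly communicates what scenario is being tested. Because random values differ between runs, a failure caused by one particular generated value can be hard to reproduce; calling faker.seed() with a fixed number makes the generated sequence repeatable.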
import { faker } from '@faker-js/faker';
export function createUser(overrides: Partial<User> = {}): User {
return {
id: faker.string.uuid(),
username: faker.internet.userName(), // Faker v8 name; v9 renames this method to username()
email: faker.internet.email(),
firstName: faker.person.firstName(),
lastName: faker.person.lastName(),
phone: faker.phone.number(),
address: {
street: faker.location.streetAddress(),
city: faker.location.city(),
state: faker.location.state(),
zipCode: faker.location.zipCode(),
country: 'USA'
},
createdAt: faker.date.past(),
...overrides
};
}
export function createAccount(overrides: Partial<Account> = {}): Account {
return {
id: faker.string.uuid(),
accountNumber: faker.finance.accountNumber(10),
balance: parseFloat(faker.finance.amount({ min: 100, max: 10000, dec: 2 })),
currency: 'USD',
type: faker.helpers.arrayElement(['CHECKING', 'SAVINGS']),
status: 'ACTIVE',
...overrides
};
}
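One caveat with the `...overrides` approach is that the merge is shallow: overriding a nested object such as `address` replaces the whole object rather than a single field. One possible way to handle this is to merge the nested object explicitly. The sketch below is self-contained and uses fixed values instead of Faker so it runs without dependencies; the trimmed `User` and `Address` types and the `UserOverrides` name are assumptions for illustration.

```typescript
interface Address {
  street: string;
  city: string;
  country: string;
}

interface User {
  id: string;
  username: string;
  address: Address;
}

// Overrides may supply only some of the address fields (hypothetical helper type).
type UserOverrides = Partial<Omit<User, 'address'>> & { address?: Partial<Address> };

function createUser(overrides: UserOverrides = {}): User {
  const { address, ...rest } = overrides;
  return {
    id: 'user-1',
    username: 'mary.obrien',
    ...rest,
    // Merge the nested object separately so unspecified fields keep their defaults.
    address: {
      street: '12 Main Street',
      city: 'Springfield',
      country: 'USA',
      ...address,
    },
  };
}

// Only the city changes; street and country keep their defaults.
const user = createUser({ address: { city: 'Boston' } });
```

With a plain `...overrides` at the top level, the same call would produce an address containing only `city`.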
Database Test Data
Test Data Scripts (SQL)
For integration tests that need pre-populated database state, SQL scripts provide a declarative way to set up test data. The @Sql annotation executes SQL scripts before (and optionally after) test methods, ensuring tests start with known database state.
When to use SQL scripts vs builders: Use SQL scripts when you need complex relational data (multiple users, accounts, and payments with foreign key relationships) or when testing legacy code with existing schema. Use builders/repositories when you're testing new code and can create data through your application's domain model. SQL scripts are faster for bulk data setup but couple tests to database schema changes.
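Combining with Testcontainers: Testcontainers starts a real database in a disposable container for the test run. SQL scripts execute against this containerized database, so tests don't depend on a shared or externally managed database. Tests within the same run usually share that container, so per-test setup and cleanup are still needed to keep them from interfering with each other.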
-- src/test/resources/test-data/payments.sql
DELETE FROM audit_log;
DELETE FROM payment;
DELETE FROM account;
-- "user" is a reserved word in some databases (e.g. PostgreSQL); quote or rename the table there
DELETE FROM user;
-- Insert test users
INSERT INTO user (id, username, email, password_hash, created_at) VALUES
('11111111-1111-1111-1111-111111111111', 'testuser1', 'testuser1@example.com', '$2a$10$...', NOW()),
('22222222-2222-2222-2222-222222222222', 'testuser2', 'testuser2@example.com', '$2a$10$...', NOW());
-- Insert test accounts
INSERT INTO account (id, user_id, account_number, balance, currency, type, status) VALUES
('aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa', '11111111-1111-1111-1111-111111111111', '1234567890', 1000.00, 'USD', 'CHECKING', 'ACTIVE'),
('bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb', '22222222-2222-2222-2222-222222222222', '0987654321', 5000.00, 'USD', 'SAVINGS', 'ACTIVE');
-- Insert test payments
INSERT INTO payment (id, account_id, amount, currency, recipient, status, created_at) VALUES
('cccccccc-cccc-cccc-cccc-cccccccccccc', 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa', 100.00, 'USD', 'John Doe', 'COMPLETED', NOW()),
('dddddddd-dddd-dddd-dddd-dddddddddddd', 'bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb', 250.00, 'USD', 'Jane Smith', 'PENDING', NOW());
Loading Test Data
The @Sql annotation runs the specified SQL scripts at the designated execution phase. BEFORE_TEST_METHOD loads test data before each test method, ensuring fresh state. AFTER_TEST_METHOD cleans up, preventing data from affecting other tests. This maintains test isolation - each test runs against a known, clean database state.
Why this pattern works: Tests can rely on specific data existing (user with ID 11111111-..., account with balance 1000.00) because the SQL script guarantees it. This is more reliable than hoping previous tests left the database in a particular state. Cleanup after each test prevents test pollution where one test's data causes another test to pass or fail incorrectly.
For more on database testing patterns and Testcontainers setup, see Integration Testing.
@SpringBootTest
@Sql(scripts = "/test-data/payments.sql", executionPhase = Sql.ExecutionPhase.BEFORE_TEST_METHOD)
@Sql(scripts = "/test-data/cleanup.sql", executionPhase = Sql.ExecutionPhase.AFTER_TEST_METHOD)
class PaymentRepositoryTest {
@Autowired
private PaymentRepository paymentRepository;
@Test
void shouldFindPaymentsByUser() {
UUID userId = UUID.fromString("11111111-1111-1111-1111-111111111111");
List<Payment> payments = paymentRepository.findByUserId(userId);
assertThat(payments).hasSize(1);
assertThat(payments.get(0).getRecipient()).isEqualTo("John Doe");
}
}
Database Cleanup
A reusable cleanup helper ensures consistent test isolation. The cleanAll() method deletes data in reverse dependency order (audit logs first, then payments, accounts, users) to avoid foreign key constraint violations. The resetSequences() method restarts auto-increment sequences, ensuring predictable IDs across test runs.
Why cleanup matters: Without cleanup, tests accumulate data, leading to flaky failures when a test expects a specific number of records or when unique constraints get violated. Running cleanAll() in @BeforeEach ensures each test starts with an empty database, making tests deterministic and independent.
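Alternative: Transaction rollback: When a test class or method is annotated with @Transactional, Spring's test framework runs each test in a transaction and rolls it back afterwards by default. This is faster than explicit cleanup (no DELETE queries), but it doesn't work for tests that verify transaction boundaries or commit behavior, or when the code under test runs outside the test's transaction (for example, requests sent to a server started on a real port, or work done on another thread). Use rollback for most tests and explicit cleanup for the rest.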
// Helper class for database cleanup
// Registered as a Spring bean so JdbcTemplate can be injected
@Component
public class DatabaseCleaner {
@Autowired
private JdbcTemplate jdbcTemplate;
// Delete child tables first to avoid foreign key violations
public void cleanAll() {
jdbcTemplate.execute("DELETE FROM audit_log");
jdbcTemplate.execute("DELETE FROM payment");
jdbcTemplate.execute("DELETE FROM account");
jdbcTemplate.execute("DELETE FROM user");
}
// Restart sequences so generated IDs are predictable across test runs
public void resetSequences() {
jdbcTemplate.execute("ALTER SEQUENCE user_seq RESTART WITH 1");
jdbcTemplate.execute("ALTER SEQUENCE payment_seq RESTART WITH 1");
}
}
// Usage in tests
@Autowired
private DatabaseCleaner databaseCleaner;
@BeforeEach
void cleanDatabase() {
databaseCleaner.cleanAll();
}
Test Fixtures
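JUnit 5 Extensions
A JUnit 5 extension can load a named data set before each test, driven by a custom annotation on the test method. The extension below implements BeforeEachCallback, reads the @TestData annotation from the current test method, and loads the data set it names. The loadTestData method is left as a stub to fill in with your own loading logic (for example, running a SQL script).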
public class PaymentTestDataExtension implements BeforeEachCallback {
@Override
public void beforeEach(ExtensionContext context) {
TestData testData = context.getRequiredTestMethod()
.getAnnotation(TestData.class);
if (testData != null) {
// Load test data based on annotation
loadTestData(testData.value());
}
}
private void loadTestData(String dataSet) {
// Load specific test data set
}
}
// Custom annotation
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface TestData {
String value();
}
// Usage
@ExtendWith(PaymentTestDataExtension.class)
class PaymentServiceTest {
@Test
@TestData("small-payments")
void shouldProcessSmallPayments() {
// Test data loaded automatically
}
}
Best Practices
Use Builders Over Constructors
Constructors require all parameters in a specific order, making test code hard to read and brittle. When the Payment constructor adds a new parameter, every test that calls it breaks. Builders name each property explicitly (.withAmount(...)), making tests self-documenting and resilient to constructor changes - adding a field to Payment means updating the builder's defaults, not every test.
Why builders are better: Builders provide sensible defaults for all fields, so tests only specify what's relevant. The constructor approach requires every test to provide every parameter, even for fields that don't matter for that test scenario. This creates noise that obscures the test's intent.
// Bad: Hard to read, positional parameters
Payment payment = new Payment(
UUID.randomUUID(),
new BigDecimal("100.00"),
"USD",
"John Doe",
PaymentStatus.PENDING,
LocalDateTime.now(),
"Test payment"
);
// Good: Self-documenting, flexible
Payment payment = PaymentTestBuilder.aPayment()
.withAmount(new BigDecimal("100.00"))
.withRecipient("John Doe")
.build();
Provide Sensible Defaults
Defaults should be valid, realistic values that work for most tests. A default amount of 100.00 is reasonable for typical payment tests - not so large that it triggers special validation, not so small that it looks like a throwaway placeholder. Default currency of USD matches the most common scenario. Default status of PENDING represents the typical initial state.
Choosing good defaults: Pick values that represent the most common, normal scenario. Avoid edge cases (negative amounts, empty strings) or extreme values (999999) as defaults - those should be explicit in tests that need them. Good defaults mean most tests can use the builder with little or no customization, reserving customization for the specific aspect being tested.
// Builder with good defaults
public class PaymentTestBuilder {
// Defaults that work for most tests
private BigDecimal amount = new BigDecimal("100.00"); // Valid, common amount
private String currency = "USD"; // Default currency
private PaymentStatus status = PaymentStatus.PENDING; // Typical initial state
// Only override what's relevant to the test
}
One Builder Per Domain Entity
Separate builders for each entity keep test code modular and maintainable. When the Payment entity changes, you only update PaymentTestBuilder. With a monolithic builder, changes to any entity force updates to the giant builder class, increasing merge conflicts and making it harder to understand which tests are affected.
Why separate builders: Each builder focuses on one entity's construction logic. This follows the Single Responsibility Principle - PaymentTestBuilder knows how to build valid Payment objects, nothing else. It also makes builders reusable across test files - PaymentTestBuilder can be used in payment service tests, integration tests, and E2E tests.
// Good: Separate builders for separate entities
PaymentTestBuilder.aPayment()
UserTestBuilder.aUser()
AccountTestBuilder.anAccount()
// Bad: One giant builder for everything
TestDataBuilder.create()
.withPayment(...)
.withUser(...)
.withAccount(...)
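Keeping one builder or factory per entity does not prevent combining them when a scenario needs related entities. The TypeScript sketch below shows one possible way to compose two small factories so that an account always points at its owner. The `User` and `Account` types, the counter-based IDs, and the `createAccountWithOwner` helper are hypothetical and exist only for this illustration.

```typescript
interface User { id: string; username: string; }
interface Account { id: string; userId: string; balance: number; currency: string; }

// Simple counter so every generated entity gets a unique ID and username.
let sequence = 0;
function nextSequence(): number {
  sequence += 1;
  return sequence;
}

function createUser(overrides: Partial<User> = {}): User {
  const n = nextSequence();
  return { id: `user-${n}`, username: `testuser${n}`, ...overrides };
}

function createAccount(overrides: Partial<Account> = {}): Account {
  const n = nextSequence();
  return {
    id: `account-${n}`,
    userId: `user-${n}`, // placeholder owner; replaced when composed below
    balance: 1000.0,
    currency: 'USD',
    ...overrides,
  };
}

// Composes the two factories; each one still owns only its own entity's defaults.
function createAccountWithOwner(
  accountOverrides: Partial<Account> = {},
  userOverrides: Partial<User> = {},
): { user: User; account: Account } {
  const user = createUser(userOverrides);
  // userId is set last so the link to the owner always holds.
  const account = createAccount({ ...accountOverrides, userId: user.id });
  return { user, account };
}

const { user, account } = createAccountWithOwner({ balance: 50.0 });
```

The composing helper holds only the relationship between the entities; field defaults stay in the per-entity factories, so a change to `User` still touches one place.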
Clean Test Data Between Tests
Cleaning data before each test ensures test isolation - no test depends on data created by a previous test. This makes tests deterministic (same result every run) and independent (can run in any order). Without cleanup, tests become coupled and brittle, passing when run individually but failing when run together due to accumulated state.
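When to clean: Prefer @BeforeEach over @AfterEach. Cleaning before each test guarantees a known starting state regardless of what earlier tests or earlier runs left behind, and it leaves the data from a failed test in place so you can inspect it while debugging. An @AfterEach cleanup still runs when a test fails, but not when the run is aborted (for example, the JVM is killed), which can leave stale data behind for the next run.
For integration tests using Testcontainers, the containerized database isolates the test run from external databases, but tests within the run usually share the same container, so per-test cleanup is still needed.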
@BeforeEach
void cleanDatabase() {
// Clean all test data before each test
paymentRepository.deleteAll();
accountRepository.deleteAll();
userRepository.deleteAll();
}
Use Realistic Data
Realistic test data catches bugs that only surface with production-like inputs. A recipient name like "XXXXX" won't catch bugs related to name length, special characters (apostrophes, hyphens), or internationalization (accented characters). Realistic names like "Mary O'Brien-Smith" or "José García" exercise code paths that test data like "XXXXX" or "test" never will.
When to use realistic vs explicit data: Use realistic data (via Faker) for fields that don't affect test logic. Use explicit values for fields that determine test behavior - when testing high-value payment handling, use an explicit amount like 10000.00 so readers understand the threshold being tested.
This principle is particularly important for integration tests where data flows through multiple layers (validation, serialization, database storage, API responses), each potentially exposing edge cases.
// Bad: Meaningless test data
Payment payment = PaymentTestBuilder.aPayment()
.withAmount(new BigDecimal("999999.99"))
.withRecipient("XXXXX")
.build();
// Good: Realistic test data
Payment payment = PaymentTestBuilder.aPayment()
.withAmount(new BigDecimal("1250.00"))
.withRecipient("Acme Corp - Invoice #12345")
.build();
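The split between incidental and scenario-defining fields can be sketched in TypeScript as follows. The recipient is realistic but irrelevant to the rule, while the amounts on either side of the threshold are written out in the test itself. The `requiresApproval` function and the 10000.00 threshold are hypothetical stand-ins for real business logic, assumed here only for illustration.

```typescript
interface Payment { amount: number; currency: string; recipient: string; }

// Hypothetical rule under test: payments at or above the threshold need approval.
const APPROVAL_THRESHOLD = 10000.0;
function requiresApproval(payment: Payment): boolean {
  return payment.amount >= APPROVAL_THRESHOLD;
}

function createPayment(overrides: Partial<Payment> = {}): Payment {
  return {
    amount: 100.0,
    currency: 'USD',
    recipient: "Mary O'Brien-Smith", // realistic, but incidental to the rule
    ...overrides,
  };
}

// Explicit amounts make the boundary visible to the reader.
const justBelow = requiresApproval(createPayment({ amount: 9999.99 })); // false
const atThreshold = requiresApproval(createPayment({ amount: 10000.0 })); // true
```

A reader can see the boundary from the two amounts alone, without opening the factory to find out what the default is.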
Further Reading
- Unit Testing - Unit testing patterns
- Integration Testing - Integration test strategies
- Testing Strategy - Overall testing approach
External Resources:
- Faker.js - Generate realistic fake data
- Builder Pattern
Summary
Key Takeaways:
- Test Data Builders: Use builder pattern for flexible, readable test data creation
- Object Mother: Provide pre-configured instances for common scenarios
- Factory Functions: Use factory functions in TypeScript for test data
- Realistic Data: Use Faker for realistic test data
- Sensible Defaults: Provide good defaults, override only what's relevant
- Test Isolation: Each test creates and cleans up its own data
- Database Cleanup: Clean database state between tests
- One Builder Per Entity: Separate builders for separate domain entities