Integration tests often require a lot of related data to satisfy foreign key constraints. If your database enforces these constraints (as it should), setting up test data becomes verbose and repetitive. This typically results in large amounts of duplicated setup code scattered across your test suite—code that's hard to maintain and distracts from the actual purpose of the test.

In the previous post, Test Data Builders were introduced as a way to reduce the boilerplate of test data construction. This post builds on that by introducing two complementary strategies for handling related data in integration tests:

  • Preload with a Standard Test Data Set
  • Use a Test Data Repository for Fresh Relationships

Example

Let’s consider a test for cancelling or completing a Payment in a system with the following model:

classDiagram
    class User {
        +String username
    }
    class Payment {
        +String receipt
    }
    class Order {
        +BigDecimal amount
    }
    class OrderLine {
        +String item
        +BigDecimal amount
    }
    Payment "0..1" --> "1" Order
    User "1" <-- "0..*" Order
    Order "1" *-- "1..*" OrderLine

A Payment is registered on an Order, which is placed by a User and contains one or more OrderLines. You could imagine the model being even more complex—for instance, involving product categories, but this example is sufficient to illustrate the point.

To test the cancel method on a Payment, we need a Payment that hasn’t yet been processed. Here’s what the test might look like:

@Test
void testPaymentCancellationOfNotProcessedPayment() {
    // Arrange
    // NOTICE: Start of unrelated test data setup
    final var user = UserTestData.standard();
    final var userId = userRepository.add(user);

    final var orderLine1 = OrderLineTestData.builder()
        .setAmount(BigDecimal.valueOf(10))
        .build();
    final var orderLine2 = OrderLineTestData.builder()
        .setAmount(BigDecimal.valueOf(20))
        .build();
    final var order = OrderTestData.builder()
        .addOrderLine(orderLine1)
        .addOrderLine(orderLine2)
        .setUser(userId)
        .build();
    final var orderId = orderRepository.add(order);
    // NOTICE: End of unrelated test data setup

    final var payment = PaymentTestData.builder()
        .setState(PaymentState.NOT_PROCESSED)
        .setOrder(orderId)
        .build();
    final var paymentId = paymentRepository.add(payment);

    // Act
    final var result = paymentCanceller.cancel(paymentId);

    // Assert
    assertTrue(result.isSuccess());
}

Most of the setup code above is not directly related to the test itself; it’s just scaffolding to satisfy foreign key constraints. The next time someone needs to test something involving Payment, they’ll likely copy-paste this boilerplate. We want to avoid that.

Let’s explore two strategies to solve this.

Strategy 1: Preload with a Standard Test Data Set

The idea here is to preload the database with a default set of test data that all tests can rely on. This works well in environments where each test runs in isolation (e.g., in a fresh database).

There are a couple of ways to set up this data:

  • Using SQL
    Manually insert data at startup, or generate it by running the application and dumping the resulting state.

  • Using a Test Data Seeder
    Write code that seeds the database using your Test Data Builders. This has the advantage of reusing already-maintained builder logic.
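
A seeder along these lines could be sketched as follows. Everything here is an assumption made for illustration: the in-memory repositories stand in for real ones, the records (Java 16+) stand in for the domain model, and `TestDataSeeder` and `StandardTestDataSet` are hypothetical names. A real seeder would more likely call the Test Data Builders rather than constructors.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the domain model in this post.
record User(String username) {}
record OrderLine(String item, BigDecimal amount) {}
record Order(long userId, List<OrderLine> orderLines) {}

// Ids of the preloaded rows, so builders and tests can refer to them.
record StandardTestDataSet(long userId, long orderId) {}

// In-memory stand-ins for the real repositories; ids are insertion positions.
class InMemoryUserRepository {
    private final List<User> rows = new ArrayList<>();

    long add(User user) {
        rows.add(user);
        return rows.size();
    }
}

class InMemoryOrderRepository {
    private final List<Order> rows = new ArrayList<>();

    long add(Order order) {
        rows.add(order);
        return rows.size();
    }

    Order get(long id) {
        return rows.get((int) (id - 1));
    }
}

// Inserts one row per entity, parents first so foreign keys are satisfied.
class TestDataSeeder {
    private final InMemoryUserRepository users;
    private final InMemoryOrderRepository orders;

    TestDataSeeder(InMemoryUserRepository users, InMemoryOrderRepository orders) {
        this.users = users;
        this.orders = orders;
    }

    StandardTestDataSet seed() {
        long userId = users.add(new User("standard-user"));
        var line = new OrderLine("standard-item", BigDecimal.TEN);
        long orderId = orders.add(new Order(userId, List.of(line)));
        return new StandardTestDataSet(userId, orderId);
    }
}
```

The seeder would run once against the fresh database before the tests start, and the returned ids give the builders something to default to.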

You can then write tests without explicitly setting all foreign key relationships. For example:

@Test
void testPaymentCancellationOfNotProcessedPayment() {
    // Arrange
    final var payment = PaymentTestData.builder()
        .setState(PaymentState.NOT_PROCESSED)
        .build(); // Uses default order from the test data set
    final var paymentId = paymentRepository.add(payment);

    // Act
    final var result = paymentCanceller.cancel(paymentId);

    // Assert
    assertTrue(result.isSuccess());
}

This greatly reduces clutter in your tests.
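
The test above only works if the builder falls back to a preloaded order when none is set. One way this might look is sketched below. The constant `StandardTestData.ORDER_ID`, the shape of `Payment`, and the extra `PaymentState` values are assumptions for illustration, not taken from the original code base.

```java
// Hypothetical stand-ins for the domain types.
enum PaymentState { NOT_PROCESSED, PROCESSED, CANCELLED }

record Payment(long orderId, PaymentState state) {}

// Assumed: the id under which the standard data set stores its default order.
final class StandardTestData {
    static final long ORDER_ID = 1L;

    private StandardTestData() {}
}

class PaymentTestData {
    private Long orderId; // null means "use the preloaded standard order"
    private PaymentState state = PaymentState.NOT_PROCESSED;

    static PaymentTestData builder() {
        return new PaymentTestData();
    }

    PaymentTestData setOrder(long orderId) {
        this.orderId = orderId;
        return this;
    }

    PaymentTestData setState(PaymentState state) {
        this.state = state;
        return this;
    }

    Payment build() {
        // Fall back to the standard order only when the test did not pick one.
        long effectiveOrderId = orderId != null ? orderId : StandardTestData.ORDER_ID;
        return new Payment(effectiveOrderId, state);
    }
}
```

A test that cares about a specific order still calls `setOrder`; every other test gets a valid foreign key for free.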

Strategy 2: Use a Test Data Repository for Fresh Relationships

If your tests run against a shared database (e.g., in integration pipelines), a static dataset is fragile because tests can modify or collide with the same rows. In such cases, it’s better to create new data per test.

A Test Data Repository is a wrapper around your real repository. It uses Test Data Builders and ensures that required relationships (like Order or User) are populated automatically.

Example:

@Test
void testPaymentCancellationOfNotProcessedPayment() {
    // Arrange
    final var paymentTestData = PaymentTestData.builder()
        .setState(PaymentState.NOT_PROCESSED);
    final var paymentId = paymentTestRepository.add(paymentTestData);

    // Act
    final var result = paymentCanceller.cancel(paymentId);

    // Assert
    assertTrue(result.isSuccess());
}

The repository would look like this (Order and User would have a similar one):

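public class PaymentTestRepository {
    private final OrderTestRepository orderTestRepository;
    private final PaymentRepository paymentRepository;

    public PaymentTestRepository(OrderTestRepository orderTestRepository,
                                 PaymentRepository paymentRepository) {
        this.orderTestRepository = orderTestRepository;
        this.paymentRepository = paymentRepository;
    }

    // Adds a payment with default values and a freshly created order.
    public PaymentId addDefault() {
        final var paymentTestData = PaymentTestData.builder();
        return add(paymentTestData);
    }

    // Adds the payment, creating a new order only if the test did not set one.
    public PaymentId add(PaymentTestData paymentTestData) {
        if (paymentTestData.getOrder().isEmpty()) {
            final var orderId = orderTestRepository.addDefault();
            paymentTestData.setOrder(orderId);
        }
        final var payment = paymentTestData.build();
        return paymentRepository.add(payment);
    }
}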

This way, your test code stays clean while the repository ensures all required relationships are satisfied. If the test does need a specific Order, the builder can be configured accordingly.
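
To illustrate how the pattern chains through the model, here is a minimal sketch of what the corresponding repositories for Order and User might look like. The in-memory repositories, the simplified `Order` record (a single item instead of order lines), and the counter-based usernames are all assumptions made so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Optional;

// Hypothetical, simplified stand-ins for the domain model.
record UserId(long value) {}
record OrderId(long value) {}
record User(String username) {}
record Order(UserId userId, String item) {}

// In-memory stand-ins for the real repositories.
class UserRepository {
    private final List<User> rows = new ArrayList<>();

    UserId add(User user) {
        rows.add(user);
        return new UserId(rows.size());
    }

    int count() {
        return rows.size();
    }
}

class OrderRepository {
    private final List<Order> rows = new ArrayList<>();

    OrderId add(Order order) {
        rows.add(order);
        return new OrderId(rows.size());
    }

    Order get(OrderId id) {
        return rows.get((int) (id.value() - 1));
    }
}

class OrderTestData {
    private UserId userId; // unset until a test or the test repository supplies it
    private String item = "standard-item";

    static OrderTestData builder() {
        return new OrderTestData();
    }

    Optional<UserId> getUser() {
        return Optional.ofNullable(userId);
    }

    OrderTestData setUser(UserId userId) {
        this.userId = userId;
        return this;
    }

    Order build() {
        return new Order(Objects.requireNonNull(userId, "user must be set"), item);
    }
}

class UserTestRepository {
    private final UserRepository userRepository;
    private int counter = 0;

    UserTestRepository(UserRepository userRepository) {
        this.userRepository = userRepository;
    }

    // A distinct username per call avoids collisions within this repository.
    UserId addDefault() {
        counter++;
        return userRepository.add(new User("user-" + counter));
    }
}

class OrderTestRepository {
    private final UserTestRepository userTestRepository;
    private final OrderRepository orderRepository;

    OrderTestRepository(UserTestRepository userTestRepository, OrderRepository orderRepository) {
        this.userTestRepository = userTestRepository;
        this.orderRepository = orderRepository;
    }

    OrderId addDefault() {
        return add(OrderTestData.builder());
    }

    // Creates a fresh user only when the test did not provide one.
    OrderId add(OrderTestData orderTestData) {
        if (orderTestData.getUser().isEmpty()) {
            orderTestData.setUser(userTestRepository.addDefault());
        }
        return orderRepository.add(orderTestData.build());
    }
}
```

Each test repository only knows about its direct parent, so adding a Payment cascades to an Order and then to a User without the test mentioning either.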

Logical references (not recommended)

An alternative is to avoid foreign keys altogether and use logical identifiers instead. This makes the system easier to test but comes at a steep cost:

  • You lose database-level guarantees
  • You make data harder to work with (e.g., SQL joins)
  • You introduce risk into production for the sake of testing convenience

This trade-off is rarely worth it. Testability should not come at the expense of maintainability or data integrity.

Conclusion

You can combine Test Data Builders with strategies for handling related data in multiple ways:

| Strategy | Pros | Cons |
| --- | --- | --- |
| Standard Test Data Set | Fast to set up, simple, enables concise tests | Less flexible, requires test isolation, harder to customize |
| Test Data Repository | Fresh data per test, flexible, scales with complex models | Slightly more code upfront |

Pick one strategy and use it consistently across your test suite. Either approach results in tests that are easier to read, write, and maintain—and helps you avoid test fragility caused by redundant setup code.