CI Testing Integration
Overview
CI testing automates test execution for every code change in GitLab CI/CD pipelines. Tests run in isolated environments that mirror production, catching integration issues that might not appear locally. The pipeline blocks merges when tests fail, preventing defects from reaching production.
Manual testing before merging is unreliable. Developers skip tests when rushed, run incomplete test suites, or test in environments that differ from production. CI testing eliminates these inconsistencies by running the same comprehensive test suite for every change in a clean, reproducible environment.
Pipeline organization follows a fail-fast strategy: fastest tests run first to provide immediate feedback and conserve CI resources. A failed unit test (30-60 seconds) stops the pipeline before expensive integration tests (5-10 minutes) or E2E tests (15-30 minutes) consume resources. This ordering reduces mean time to feedback and CI costs.
Applies to: Spring Boot · Angular · React · React Native · Android · iOS
Pipeline configuration adapts to each platform's build tools and test frameworks.
Core Principles
- Fail Fast: Run fastest tests first to minimize feedback latency and wasted resources
- Parallel Execution: Execute independent tests concurrently to reduce total pipeline duration
- Quality Gates: Enforce coverage and mutation thresholds to maintain test suite effectiveness
- Comprehensive Reporting: Generate machine-readable reports (JUnit XML, Cobertura) for GitLab integration
- Resource Efficiency: Cache dependencies and reuse test infrastructure to minimize CI costs
Pipeline Architecture
Test Execution Flow
The pipeline progresses through test stages in sequence, and each stage defines conditions that block merge requests. A stage runs only if the previous stages succeeded, creating quality gates that prevent defective code from advancing.
Stage Timing and Resource Usage
Understanding the time and resource cost of each stage explains why fail-fast ordering matters. Unit tests consume minimal resources and complete quickly, while E2E tests require full application deployment, databases, and headless browsers.
When tests run in parallel, wall-clock time is determined by the longest concurrent job. Unit tests and linting run concurrently (45 seconds total), not sequentially (75 seconds). Integration and contract tests also run in parallel, reducing total pipeline time from 25 minutes to 15 minutes.
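The parallel schedule can be sketched as a timeline (same stages and figures as above, not to scale):

```text
Serial:   |-Unit-|-Lint-|--- Integration ---|--- Contract ---| ...  ~25 min
Parallel: |-Unit-|
          |-Lint-|
                 |--- Integration ---|
                 |--- Contract ------|                        ...  ~15 min
```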
GitLab CI Configuration
Complete Pipeline Example
This configuration demonstrates stage organization, dependency management with needs, caching for dependency reuse, and artifact collection for test reporting. Variables optimize build performance: GRADLE_OPTS disables the Gradle daemon (which provides no benefit in single-use CI containers), and NODE_OPTIONS increases heap size for TypeScript compilation of large codebases.
The needs keyword enables pipeline DAG (Directed Acyclic Graph) execution rather than strict stage ordering. A job can start as soon as its dependencies complete, not when all jobs in the previous stage finish. This parallelization significantly reduces total pipeline duration.
```yaml
# .gitlab-ci.yml

variables:
  # Gradle daemon wastes memory in ephemeral CI containers
  GRADLE_OPTS: "-Dorg.gradle.daemon=false"
  # TypeScript compilation requires additional heap for large codebases
  NODE_OPTIONS: "--max-old-space-size=4096"

stages:
  - build
  - test
  - integration
  - quality
  - e2e
  - performance

# Reusable cache configuration prevents re-downloading dependencies
# Key on branch to allow cache sharing across pipeline runs
.cache-template: &cache-template
  cache:
    key: "$CI_COMMIT_REF_SLUG"
    paths:
      - .gradle/wrapper
      - .gradle/caches
      - node_modules/

# ============================================
# Build Stage
# ============================================
build-backend:
  stage: build
  image: eclipse-temurin:21-jdk
  <<: *cache-template
  script:
    # Build without running tests (tests run in the parallel test stage)
    - ./gradlew clean build -x test
  artifacts:
    paths:
      - build/libs/*.jar
    expire_in: 1 day
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == "main"'

build-frontend:
  stage: build
  image: node:22
  <<: *cache-template
  script:
    # npm ci provides deterministic installs from package-lock.json
    - npm ci
    - npm run build
  artifacts:
    paths:
      - build/
      - dist/
    expire_in: 1 day
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == "main"'

# ============================================
# Test Stage (Fast Feedback)
# ============================================
unit-test-backend:
  stage: test
  image: eclipse-temurin:21-jdk
  <<: *cache-template
  # Empty needs allows parallel execution with the build stage
  needs: []
  script:
    - ./gradlew test
  # Regex extracts the coverage percentage from JaCoCo output
  coverage: '/Total.*?([0-9]{1,3})%/'
  artifacts:
    reports:
      # GitLab parses JUnit XML to display test results in the MR
      junit: build/test-results/test/**/TEST-*.xml
      # GitLab renders coverage trends and the diff in the MR
      coverage_report:
        coverage_format: cobertura
        path: build/reports/cobertura-coverage.xml
    paths:
      - build/reports/tests/
    # Collect artifacts even on failure for debugging
    when: always
    expire_in: 30 days
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == "main"'

unit-test-frontend:
  stage: test
  image: node:22
  <<: *cache-template
  needs: []
  script:
    - npm ci
    - npm run test:unit -- --coverage
  # Regex extracts coverage from the Jest summary output
  coverage: '/All files[^|]*\|[^|]*\s+([\d\.]+)/'
  artifacts:
    reports:
      junit: junit.xml
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml
    paths:
      - coverage/
    when: always
    expire_in: 30 days
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == "main"'

lint:
  stage: test
  image: node:22
  needs: []
  script:
    - npm ci
    - npm run lint
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'

# ============================================
# Mutation Testing
# ============================================
mutation-test-backend:
  stage: test
  image: eclipse-temurin:21-jdk
  # Mutation tests require unit tests to pass first
  needs: ["unit-test-backend"]
  script:
    # --with-history enables incremental mutation analysis
    - ./gradlew pitest --with-history
  coverage: '/Mutation coverage: (\d+)%/'
  artifacts:
    reports:
      coverage_report:
        coverage_format: cobertura
        path: build/reports/pitest/mutations.xml
    paths:
      - build/reports/pitest/
    when: always
    expire_in: 30 days
  # Separate cache for the PITest history file
  # (overrides the shared cache template for this job)
  cache:
    key: pitest-history-$CI_COMMIT_REF_SLUG
    paths:
      - build/pitHistory/
  # Block merge if mutation coverage is below threshold
  allow_failure: false
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == "main"'

mutation-test-frontend:
  stage: test
  image: node:22
  <<: *cache-template
  needs: ["unit-test-frontend"]
  script:
    - npm ci
    - npm run test:mutation
  coverage: '/Mutation score: (\d+\.\d+)%/'
  artifacts:
    paths:
      - reports/mutation/
    when: always
    expire_in: 30 days
  allow_failure: false
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == "main"'

# ============================================
# Integration Tests
# ============================================
integration-test-backend:
  stage: integration
  image: eclipse-temurin:21-jdk
  # GitLab services provide containers accessible via hostname
  services:
    - postgres:16
    - redis:7-alpine
  variables:
    # Service containers use these credentials
    POSTGRES_DB: testdb
    POSTGRES_USER: test
    POSTGRES_PASSWORD: test
    # The application connects to the 'postgres' hostname (service name)
    SPRING_DATASOURCE_URL: "jdbc:postgresql://postgres:5432/testdb"
    SPRING_REDIS_HOST: redis
  <<: *cache-template
  needs: ["build-backend"]
  script:
    - ./gradlew integrationTest
  artifacts:
    reports:
      junit: build/test-results/integrationTest/**/TEST-*.xml
    paths:
      - build/reports/tests/integrationTest/
    when: always
    expire_in: 30 days
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == "main"'

# ============================================
# Contract Tests
# ============================================
contract-test-consumer:
  stage: integration
  image: node:22
  <<: *cache-template
  needs: ["build-frontend"]
  script:
    - npm ci
    - npm run test:contract
    # Publish contracts to the broker for provider verification
    - npx pact-broker publish ./pacts
      --consumer-app-version=$CI_COMMIT_SHA
      --broker-base-url=$PACT_BROKER_URL
      --broker-token=$PACT_BROKER_TOKEN
  artifacts:
    paths:
      - pacts/
    expire_in: 7 days
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'

contract-test-provider:
  stage: integration
  image: eclipse-temurin:21-jdk
  services:
    - postgres:16
  <<: *cache-template
  needs: ["build-backend"]
  script:
    # Verify the provider satisfies consumer contracts from the broker
    - ./gradlew pactVerify -Ppact.verifier.publishResults=true
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == "main"'

# ============================================
# Quality Gates
# ============================================
quality-gate:
  stage: quality
  image: sonarsource/sonar-scanner-cli:latest
  needs: ["unit-test-backend", "unit-test-frontend", "integration-test-backend"]
  script:
    - sonar-scanner
      -Dsonar.projectKey=$CI_PROJECT_NAME
      -Dsonar.sources=.
      -Dsonar.host.url=$SONAR_HOST_URL
      -Dsonar.login=$SONAR_TOKEN
      -Dsonar.coverage.jacoco.xmlReportPaths=build/reports/jacoco/test/jacocoTestReport.xml
  allow_failure: false
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'

# ============================================
# E2E Tests
# ============================================
e2e-test-web:
  stage: e2e
  # Browser image includes Chrome for Cypress
  image: cypress/browsers:node-20.11.0-chrome-121.0.6167.85-1
  services:
    - postgres:16
  variables:
    CYPRESS_BASE_URL: "http://localhost:3000"
  needs: ["build-frontend", "build-backend"]
  script:
    - npm ci
    - npm run build
    # Start the app in the background
    - npm start &
    # Wait for the app to be ready before running tests
    - npx wait-on http://localhost:3000
    # Record results to the Cypress Dashboard for debugging
    - npx cypress run --browser chrome --record --key $CYPRESS_RECORD_KEY
  artifacts:
    when: always
    paths:
      - cypress/videos/
      - cypress/screenshots/
    expire_in: 7 days
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'

e2e-test-mobile:
  stage: e2e
  image: reactnativecommunity/react-native-android:latest
  # Assumes a build-mobile job (not shown in this example)
  needs: ["build-mobile"]
  script:
    - npm ci
    - detox build --configuration android.release
    - detox test --configuration android.release --headless
  artifacts:
    # Only collect artifacts on failure to save storage
    when: on_failure
    paths:
      - artifacts/
    expire_in: 7 days
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
    # Mobile E2E is expensive; run nightly only
    - if: '$CI_PIPELINE_SOURCE == "schedule"'

# ============================================
# Performance Tests
# ============================================
performance-test:
  stage: performance
  image: gradle:8-jdk21
  services:
    - postgres:16
  needs: ["build-backend"]
  script:
    # Start the application in the background
    - ./gradlew bootRun &
    # Allow the application to fully start
    - sleep 30
    # Run Gatling load tests
    - ./gradlew gatlingRun
  artifacts:
    paths:
      - build/reports/gatling/
    when: always
    expire_in: 30 days
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
    # Performance tests are expensive; run on a schedule
    - if: '$CI_PIPELINE_SOURCE == "schedule"'

lighthouse-ci:
  stage: performance
  image: node:22
  needs: ["build-frontend"]
  script:
    - npm ci
    - npm run build
    - npm install -g @lhci/cli
    - lhci autorun
  artifacts:
    paths:
      - .lighthouseci/
    when: always
    expire_in: 30 days
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
```
Parallel Test Execution
Why Parallelization Matters
Test execution time grows linearly with test count when running serially. A codebase with 1000 tests averaging 100ms each takes 100 seconds sequentially. Running with 8 parallel workers reduces execution to approximately 12.5 seconds (plus parallelization overhead). This 8x speedup directly reduces developer feedback latency and CI costs.
Parallelization works because most tests are independent: they don't share mutable state or depend on execution order. The test framework divides tests across workers, each worker executes its subset, and the results merge at the end.
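The divide-and-merge model can be sketched in a few lines. This is an illustration, not any framework's actual implementation: `partition`, `run_subset`, and `run_parallel` are hypothetical helpers, and real runners like Jest weigh test files by historical duration rather than splitting round-robin.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(tests, workers):
    """Round-robin test files across worker buckets."""
    buckets = [[] for _ in range(workers)]
    for i, test in enumerate(tests):
        buckets[i % workers].append(test)
    return buckets


def run_subset(subset):
    # Stand-in for "execute these test files"; returns per-test results
    return {test: "passed" for test in subset}


def run_parallel(tests, workers=4):
    merged = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker runs its bucket; reports merge into one result set
        for result in pool.map(run_subset, partition(tests, workers)):
            merged.update(result)
    return merged


results = run_parallel([f"test_{i}.py" for i in range(10)], workers=4)
print(len(results))  # all 10 results merged into one report
```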
Backend Parallel Configuration
JUnit 5's parallel execution mode runs test classes or methods concurrently using a thread pool. The dynamic strategy adjusts parallelism based on available CPU cores, maximizing throughput without oversubscribing resources.
maxParallelForks at the Gradle level runs separate JVM processes for test execution. This provides process isolation (preventing shared static state issues) at the cost of increased memory consumption. Setting it to half of available processors balances parallelism with memory usage - each fork requires its own heap.
```groovy
// build.gradle
test {
    // Run tests in separate JVM processes for isolation;
    // use half of the available processors to avoid memory exhaustion
    maxParallelForks = Runtime.runtime.availableProcessors().intdiv(2) ?: 1

    // JUnit 5 parallel execution within each fork
    systemProperty 'junit.jupiter.execution.parallel.enabled', 'true'
    // Methods within a class run serially...
    systemProperty 'junit.jupiter.execution.parallel.mode.default', 'same_thread'
    // ...but test classes run concurrently
    systemProperty 'junit.jupiter.execution.parallel.mode.classes.default', 'concurrent'
    // Dynamic strategy sizes the thread pool based on available cores
    systemProperty 'junit.jupiter.execution.parallel.config.strategy', 'dynamic'
    // Optional: set a fixed thread count instead
    // systemProperty 'junit.jupiter.execution.parallel.config.strategy', 'fixed'
    // systemProperty 'junit.jupiter.execution.parallel.config.fixed.parallelism', '4'
}
```
See Spring Boot Testing for TestContainers singleton pattern configuration that enables safe parallel execution.
Frontend Parallel Configuration
Jest's --maxWorkers controls how many worker processes run tests concurrently. Each worker receives a subset of test files to execute. Setting it to 50% uses half of available CPU cores, balancing speed with system responsiveness for local development.
The --ci flag optimizes Jest for CI environments: it disables watch mode, prints more detailed progress information, and fails when snapshots are missing (preventing accidental snapshot updates in CI).
```json
// package.json
{
  "scripts": {
    "test:unit": "jest --maxWorkers=50%",
    "test:ci": "jest --ci --maxWorkers=2 --coverage"
  }
}
```

```js
// jest.config.js
module.exports = {
  // Maximum workers for parallel execution
  maxWorkers: process.env.CI ? 2 : '50%',
  // Prevent tests from hanging by enforcing a timeout (ms)
  testTimeout: 10000,
  // For debugging, run tests serially:
  // maxWorkers: 1,
};
```
See Angular Testing and React Testing for framework-specific parallel testing configurations.
GitLab Parallel Jobs
GitLab's parallel keyword splits a single job into multiple parallel jobs. Use this for horizontal scaling when a single machine's parallel execution reaches diminishing returns. This approach requires test distribution logic to divide tests across job instances.
The CI_NODE_INDEX and CI_NODE_TOTAL environment variables identify each parallel job instance (e.g., job 1 of 4, job 2 of 4). Use these to partition tests across instances.
```yaml
unit-test-backend:
  stage: test
  # Split into 4 parallel jobs
  parallel: 4
  script:
    # Each job runs a different subset of tests (illustrative filter;
    # real suites usually need a deterministic test-splitting step)
    - ./gradlew test --tests "*Test${CI_NODE_INDEX}*"

# Alternative: use GitLab's parallel matrix feature
unit-test-by-module:
  stage: test
  parallel:
    matrix:
      - MODULE: [accounts, payments, transfers, loans]
  script:
    - ./gradlew :$MODULE:test
```
The matrix approach works better for modular monorepos where tests naturally partition by module. This eliminates complex test distribution logic and provides clearer job names in the pipeline view.
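When tests don't partition naturally by module, a deterministic split can be derived from CI_NODE_INDEX and CI_NODE_TOTAL. The sketch below assumes the job can list its test files; `select_shard` is a hypothetical helper, and a real job would feed the selected files to the test runner.

```python
import hashlib


def select_shard(test_files, node_index, node_total):
    """Assign each test file to exactly one parallel job, stably across runs."""
    def bucket(name):
        # SHA-256 is stable across machines and Python processes,
        # unlike the built-in hash(), which is salted per process
        return int(hashlib.sha256(name.encode()).hexdigest(), 16) % node_total

    # CI_NODE_INDEX is 1-based in GitLab
    return [t for t in test_files if bucket(t) == node_index - 1]


tests = [f"com/example/FooTest{i}.java" for i in range(20)]
shards = [select_shard(tests, i, 4) for i in range(1, 5)]
# Every test lands in exactly one shard
assert sorted(t for shard in shards for t in shard) == sorted(tests)
```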
Test Reporting
JUnit XML Integration
JUnit XML is a standardized format that GitLab parses to display test results in merge requests. GitLab shows pass/fail counts, individual test names, failure messages, and execution times directly in the MR interface without requiring developers to download artifacts.
```yaml
unit-test-backend:
  artifacts:
    reports:
      # GitLab automatically parses these files
      junit: build/test-results/test/**/TEST-*.xml
```
The ** wildcard in the pattern matches nested directories, supporting projects that organize test results by test class package structure. GitLab aggregates all matched files into a single test report.
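The matching semantics can be demonstrated with Python's pathlib, whose `**` convention is the same recursive "zero or more directory levels" used by the pattern above (the file layout here is illustrative):

```python
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
# Results nested by package, as Gradle produces them
(root / "com/example/api").mkdir(parents=True)
(root / "com/example/api/TEST-UserTest.xml").write_text("<testsuite/>")
# A result directly at the top level
(root / "TEST-TopLevelTest.xml").write_text("<testsuite/>")
# A non-matching file, ignored by the pattern
(root / "notes.txt").write_text("ignored")

# '**' matches zero or more directory levels, so both XML files match
matches = sorted(p.relative_to(root).as_posix()
                 for p in root.glob("**/TEST-*.xml"))
print(matches)
```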
Coverage Reporting
Coverage reports serve two purposes: they display coverage percentage trends over time and highlight which lines changed in the MR lack coverage. The coverage diff helps reviewers identify untested code paths introduced by the change.
```yaml
unit-test-backend:
  # Regex extracts the coverage percentage from build output;
  # GitLab stores this value and tracks trends
  coverage: '/Total.*?([0-9]{1,3})%/'
  artifacts:
    reports:
      # Cobertura XML includes line-by-line coverage data
      coverage_report:
        coverage_format: cobertura
        path: build/reports/cobertura-coverage.xml
```
The regex must match the exact output format of your coverage tool: JaCoCo summaries print a line like "Total ... 85%", while Jest (via Istanbul) prints a table row beginning with "All files". Verify the regex against your tool's actual output before relying on it.
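The two patterns from the jobs above can be checked against sample output lines (the lines here are illustrative; substitute your tool's real output):

```python
import re

# Illustrative summary lines from JaCoCo- and Jest-style output
jacoco_line = "Total coverage: 85%"
jest_line = "All files      |   85.4 |    72.1 |   90.0 |   85.4 |"

# Same patterns as the GitLab `coverage:` keys above
jacoco = re.search(r"Total.*?([0-9]{1,3})%", jacoco_line)
jest = re.search(r"All files[^|]*\|[^|]*\s+([\d\.]+)", jest_line)

print(jacoco.group(1))  # -> 85
print(jest.group(1))    # -> 85.4
```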
GitLab also supports jacoco coverage format natively:
```yaml
artifacts:
  reports:
    coverage_report:
      coverage_format: jacoco
      path: build/reports/jacoco/test/jacocoTestReport.xml
```
Test Summary in Merge Requests
GitLab's test report widget in merge requests displays:
- Test Results: Pass/fail count, newly failing tests, and newly passing tests (regressions fixed)
- Coverage: Current coverage percentage, coverage change from target branch (e.g., "+0.5%"), and coverage trend arrow
- Failed Tests: List of failing test names with failure messages and stack traces
- Flaky Tests: Tests that passed on retry (indicates test reliability issues)
This information surfaces in the MR without requiring developers to open CI logs or download artifacts, reducing friction in the review process.
Quality Gates
Coverage Thresholds
Quality gates automatically enforce minimum standards. Blocking merges when coverage decreases prevents gradual quality erosion. Without enforcement, coverage tends to decline over time as developers add features without corresponding tests.
This script extracts coverage percentage from Jacoco's HTML report and fails the job if below the threshold. The threshold should be set at or slightly above your current coverage to prevent regression while allowing improvement.
```yaml
quality-gate:
  stage: quality
  script:
    - |
      # Extract the coverage percentage from the JaCoCo HTML report
      COVERAGE=$(grep -oP 'Total.*?\K([0-9]{1,3})%' build/reports/jacoco/test/html/index.html | head -1 | tr -d '%')
      echo "Current coverage: $COVERAGE%"
      # Fail if coverage is below the threshold
      if [ "$COVERAGE" -lt "85" ]; then
        echo "Coverage $COVERAGE% is below threshold 85%"
        exit 1
      fi
  allow_failure: false
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
```
An alternative approach enforces coverage delta instead of absolute coverage:
```yaml
quality-gate:
  script:
    - |
      # Get coverage from the current branch
      CURRENT=$(grep -oP 'Total.*?\K([0-9]{1,3})%' build/reports/jacoco/test/html/index.html | head -1 | tr -d '%')
      # Fetch and rebuild the target branch to measure its coverage
      git fetch origin $CI_MERGE_REQUEST_TARGET_BRANCH_NAME
      git checkout origin/$CI_MERGE_REQUEST_TARGET_BRANCH_NAME
      ./gradlew test jacocoTestReport
      TARGET=$(grep -oP 'Total.*?\K([0-9]{1,3})%' build/reports/jacoco/test/html/index.html | head -1 | tr -d '%')
      # Fail if coverage decreased
      if [ "$CURRENT" -lt "$TARGET" ]; then
        echo "Coverage decreased from $TARGET% to $CURRENT%"
        exit 1
      fi
```
This delta approach allows coverage to vary by module (new modules start at 0%) while preventing any individual change from reducing coverage.
Mutation Coverage Thresholds
Mutation testing verifies test suite effectiveness by introducing bugs and checking if tests catch them. See Mutation Testing for detailed mutation testing strategies.
Configure thresholds in build configuration to fail builds when test quality drops below acceptable levels:
```groovy
// build.gradle
pitest {
    // Fail if fewer than 80% of mutations are killed
    mutationThreshold = 80
    // Fail if line coverage is below 85%
    coverageThreshold = 85
    // Enable incremental analysis for faster CI runs
    timestampedReports = false
    withHistory = true
}
```

```yaml
mutation-test-backend:
  # Prevents merge if the mutation threshold is not met
  allow_failure: false
  script:
    - ./gradlew pitest --with-history
```
Incremental mutation analysis only mutates changed code, reducing execution time from 10-20 minutes to 2-3 minutes for typical changes. This makes mutation testing practical in every MR pipeline.
Best Practices
Fail Fast Strategy
Stage ordering determines feedback latency. Linting catches simple errors (unused imports, formatting) in 30 seconds. Unit tests catch logic errors in 2-3 minutes. Integration tests catch database query issues in 5-10 minutes. E2E tests catch UI interaction issues in 15-30 minutes.
Running E2E tests before unit tests wastes 15-30 minutes when a simple unit test would have caught the same failure in 2 minutes. This delays feedback and consumes expensive CI resources (browsers, databases, application servers) unnecessarily.
```yaml
stages:
  - lint         # 30 seconds   - catches syntax and formatting errors
  - test         # 2-5 minutes  - catches logic errors, missing tests
  - integration  # 5-10 minutes - catches database and API integration errors
  - e2e          # 10-20 minutes - catches UI workflow errors
  - performance  # 20-40 minutes - catches performance regressions
```
Dependency Management with needs
GitLab's default behavior runs stages sequentially: all jobs in stage 1 complete before any job in stage 2 starts. This creates unnecessary wait time when jobs have no actual dependencies.
The needs keyword creates a directed acyclic graph (DAG) where jobs start as soon as their specific dependencies complete. This parallelizes independent work across stages.
```yaml
integration-test:
  stage: integration
  # Starts immediately when build-backend completes;
  # doesn't wait for frontend tests, linting, or mutation tests
  needs: ["build-backend"]
  script:
    - ./gradlew integrationTest
```
Without needs, integration tests wait for all test stage jobs (unit tests, linting, mutation tests) to complete. With needs, integration tests start as soon as the backend build finishes, running in parallel with other test stage jobs.
Visualizing this with a timeline:
```text
Without needs (sequential stages):
|-- Build (2 min) --|-- Unit+Lint+Mutation (5 min) --|-- Integration (8 min) --|
Total: 15 minutes

With needs (DAG execution):
|-- Build (2 min) --|-- Integration (8 min) ----------------------|
|-- Unit tests (3 min) --|
Total: 10 minutes
```
Dependency Caching
CI containers start fresh for each job - no dependencies are pre-installed. Without caching, each job downloads dependencies (npm packages, Gradle dependencies), wasting time and bandwidth.
Caching stores dependencies between pipeline runs, reducing download time from minutes to seconds. Cache key determines cache scope: branch-specific caches prevent main branch cache from being polluted by feature branch dependencies.
```yaml
.cache-template: &cache-template
  cache:
    # Branch-specific cache key
    key: "$CI_COMMIT_REF_SLUG"
    paths:
      # Gradle dependencies and wrapper
      - .gradle/caches
      - .gradle/wrapper
      # npm dependencies
      - node_modules/
    # Download the cache at job start, upload it at job end (the default)
    policy: pull-push
```
Cache policy determines behavior:
- `pull-push` (default): download the cache at job start, upload it at job end
- `pull`: only download (read-only cache)
- `push`: only upload (for cache-creator jobs)
Use read-only cache for jobs that don't modify dependencies:
```yaml
lint:
  cache:
    key: "$CI_COMMIT_REF_SLUG"
    paths:
      - node_modules/
    # Don't upload the cache, only download it
    policy: pull
```
Retry Flaky Tests
Infrastructure failures (network timeouts, container scheduling delays) cause non-deterministic failures. Retrying these failures reduces false negatives from infrastructure issues while still catching real test failures.
```yaml
e2e-test:
  retry:
    max: 2
    when:
      # Retry on GitLab runner issues
      - runner_system_failure
      # Retry on job timeout
      - stuck_or_timeout_failure
      # Don't retry on script failure (an actual test failure)
      # - script_failure
```
Be cautious with retry: retrying actual test failures (script_failure) masks flaky tests instead of fixing them. Only retry infrastructure failures.
Track retry metrics to identify when infrastructure issues become chronic:
```yaml
after_script:
  - |
    # Note: CI_JOB_RETRY is illustrative; GitLab does not provide a built-in
    # retry-count variable, so this value must be tracked by your own tooling
    if [ "$CI_JOB_STATUS" == "success" ] && [ "$CI_JOB_RETRY" -gt "0" ]; then
      echo "Job succeeded after $CI_JOB_RETRY retries - possible flaky test"
      # Send a metric to the monitoring system
    fi
```
Test Container Reuse
TestContainers creates Docker containers for integration tests. Starting a PostgreSQL container takes 5-10 seconds. Running 50 integration test classes serially creates 50 containers, wasting 4-8 minutes on container startup alone.
Singleton containers reuse a single container instance across all tests. The first test starts the container; subsequent tests reuse it. This reduces 50 container startups to 1, saving 4-8 minutes.
```java
import org.springframework.boot.testcontainers.service.connection.ServiceConnection;
import org.testcontainers.containers.PostgreSQLContainer;
import org.testcontainers.junit.jupiter.Container;

// Reusable container definition shared by all integration tests
@Container
@ServiceConnection // Spring Boot 3.1+: auto-configures the datasource
static PostgreSQLContainer<?> postgres =
    new PostgreSQLContainer<>("postgres:16-alpine")
        .withReuse(true); // Reuse across test runs
```
Container reuse must also be enabled on the host running the tests: set testcontainers.reuse.enable=true in ~/.testcontainers.properties. See Integration Testing for complete configuration.
Anti-Patterns
Running All Tests Serially
```yaml
# BAD: Sequential execution wastes time
test:
  script:
    - ./gradlew test
    - npm run test
    - ./gradlew integrationTest
```
Sequential execution runs each test suite after the previous completes. For independent test suites (backend vs frontend), this wastes time.
```yaml
# GOOD: Parallel jobs reduce total pipeline time
test-backend:
  script:
    - ./gradlew test

test-frontend:
  script:
    - npm run test
```
Not Using needs for Independent Jobs
```yaml
# BAD: Integration tests wait for all test stage jobs
integration-test:
  stage: integration
  script:
    - ./gradlew integrationTest
```
Without needs, integration tests wait for unrelated jobs (linting, frontend tests) to complete before starting.
```yaml
# GOOD: Explicit dependencies enable DAG execution
integration-test:
  stage: integration
  needs: ["build-backend", "unit-test-backend"]
  script:
    - ./gradlew integrationTest
```
Ignoring Test Failures
```yaml
# BAD: allow_failure masks actual problems
integration-test:
  allow_failure: true
  script:
    - ./gradlew integrationTest
```
Allowing failures creates "broken windows": developers ignore failing tests, and the test suite becomes unreliable. Once tests are routinely ignored, they provide no value.
```yaml
# GOOD: Fail fast on test failures
integration-test:
  allow_failure: false
  script:
    - ./gradlew integrationTest
```
Not Caching Dependencies
```yaml
# BAD: Downloads dependencies on every pipeline run
build:
  script:
    - npm ci
    - npm run build
```
Without caching, npm downloads all packages (hundreds of megabytes) for every pipeline run, wasting 1-3 minutes and bandwidth.
```yaml
# GOOD: Cache dependencies across pipeline runs
build:
  cache:
    key: "$CI_COMMIT_REF_SLUG"
    paths:
      - node_modules/
  script:
    - npm ci
    - npm run build
```
Expensive Tests on Every Commit
```yaml
# BAD: E2E and performance tests on every push
e2e-test:
  script:
    - npm run test:e2e
```
Running expensive tests (E2E, performance) on every commit wastes CI resources and delays feedback. A simple unit test change shouldn't trigger 30 minutes of E2E tests.
```yaml
# GOOD: Expensive tests on main and scheduled runs only
e2e-test:
  script:
    - npm run test:e2e
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
    - if: '$CI_PIPELINE_SOURCE == "schedule"'
```
Collecting Excessive Artifacts
```yaml
# BAD: Collecting artifacts from every job
test:
  artifacts:
    paths:
      - build/
      - node_modules/
    expire_in: 30 days
```
Artifacts consume storage and slow down jobs (upload time). Only collect artifacts needed for debugging or downstream jobs.
```yaml
# GOOD: Minimal artifacts with appropriate expiration
test:
  artifacts:
    paths:
      - build/reports/tests/
    when: on_failure  # Only on failure, for debugging
    expire_in: 7 days
```
Merge Request Rules
Require Pipeline Success
Configure repository settings to enforce quality gates at the merge level. These settings prevent merging until all CI jobs pass.
GitLab: Settings > Merge Requests > Merge Checks:
- Pipelines must succeed: Prevents merging when pipeline fails
- All threads must be resolved: Ensures review comments addressed
- Require approval from code owners: Enforces architecture review
These controls prevent defects from reaching main branch by enforcing both automated tests and human review.
Protected Branches
Branch protection prevents bypassing CI by pushing directly to main or force-pushing to overwrite history.
GitLab: Settings > Repository > Protected Branches:
- Branch: main
- Allowed to merge: Maintainers only
- Allowed to push: No one (all changes via MR)
- Allowed to force push: No (preserves history)
- Required approvals: 2 minimum
- Require passing pipeline: Yes
This configuration ensures every change to main goes through CI testing and code review, eliminating the possibility of untested code reaching production.
Monitoring and Alerts
Pipeline Failure Notifications
Silent failures delay issue detection. Immediate notifications to relevant channels (Slack, email) ensure failures receive prompt attention.
```yaml
# .gitlab-ci.yml
notify-failure:
  stage: .post  # Runs after all other stages
  script:
    - curl -X POST $SLACK_WEBHOOK
      -H 'Content-Type: application/json'
      -d "{\"text\":\"Pipeline failed for $CI_PROJECT_NAME on $CI_COMMIT_BRANCH\\n$CI_PIPELINE_URL\"}"
  rules:
    # Only notify on main branch failures
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: on_failure
```
Include relevant context in notifications:
- Project name and branch
- Commit message and author
- Failed job names
- Pipeline URL for investigation
Test Health Metrics
GitLab provides built-in test analytics accessible at CI/CD > Test Report and Analytics > CI/CD Analytics. Monitor these metrics regularly to identify trends:
- Test Pass Rate: Percentage of pipeline runs where all tests pass. A declining pass rate indicates increasing instability.
- Flaky Test Rate: Tests that fail intermittently. A high flaky rate (>5%) indicates test suite reliability problems requiring investigation.
- Pipeline Duration Trends: Average pipeline duration over time. Increasing duration indicates test suite growth without corresponding optimization.
- Coverage Trends: Coverage percentage over time. Declining coverage indicates insufficient testing of new features.
- Test Execution Time: Individual test execution times. Identify slow tests (>1 second for unit tests) as optimization candidates.
Set up alerts for anomalies:
```yaml
metrics-check:
  stage: .post
  script:
    - |
      # Compute elapsed time since the pipeline was created
      # (CI_PIPELINE_CREATED_AT is an ISO 8601 timestamp; convert to epoch seconds)
      DURATION=$(( $(date +%s) - $(date -d "$CI_PIPELINE_CREATED_AT" +%s) ))
      # Alert if the pipeline takes longer than 20 minutes
      if [ "$DURATION" -gt "1200" ]; then
        echo "Pipeline duration ${DURATION}s exceeds threshold"
        # Send an alert
      fi
```
Related Guidelines
- Testing Strategy - Overall testing approach and test pyramid
- Unit Testing - Unit testing patterns optimized for CI
- Integration Testing - Integration testing with TestContainers and database containers
- Contract Testing - Consumer-driven contract testing in CI/CD
- Mutation Testing - Mutation testing configuration and thresholds
- E2E Testing - End-to-end testing strategies for web and mobile
- Spring Boot Testing - Spring Boot testing optimization and parallel execution
- Performance Testing - Load testing and performance benchmarking in CI
- Security Testing - SAST, DAST, and dependency scanning integration
Further Reading
Official Documentation:
- GitLab CI/CD Documentation - Complete GitLab CI/CD reference
- GitLab Test Reports - JUnit XML and coverage report integration
- GitLab Pipeline Efficiency - DAG execution and optimization strategies
- TestContainers Documentation - Container-based integration testing
Tools:
- JUnit 5 Parallel Execution - Parallel test execution configuration
- Jest Parallel Testing - Jest parallel execution options
- PITest Documentation - Mutation testing for Java
- Stryker Mutator - Mutation testing for JavaScript/TypeScript