Design Offshore Development as an Extension Architecture for System Development: Integrating Contracts, Quality, and Security with CI/CD
System Development · January 11, 2026 · 20 min read

Author: Be A Racer Team

1. Executive Summary (Technical Overview)

Offshore development has shifted from “outsourcing to countries with low labor costs” to global sourcing that compensates for domestic IT talent shortages. What separates success from failure is not the country or culture but how system development is designed for distributed teams. This article presents an architecture that translates contractual boundaries into APIs, data contracts, and SLOs, then uses CI/CD to automate gates for quality, security, and change management. It also provides implementation and operational patterns, model-case benchmarks, a comparison table, and a roadmap for reducing dependence on bridge SEs (engineers who liaise between the onshore and offshore teams) and for minimizing ambiguity and rework. ⚙️

2. Technical Background and Challenges (Architecture diagram explanation, existing issues)

As the referenced articles indicate, offshore development in recent years has been driven more by “securing development capacity” than by “cost reduction.” Meanwhile, the traditional “throw-it-over-the-wall” model amplifies vague requirements, inconsistent quality, communication delays, and security concerns. The key is to treat offshore not as an external vendor but as a subsystem with explicit boundaries, and to control it technically. 🔧

(Diagram explanation: Standard reference architecture for distributed development)

  • Domestic: Product Owner / Architect / Security Owner, Operations (SRE)
  • Overseas: Domain-based Feature Teams (API implementation, frontend implementation, test automation)
  • Shared platform: GitHub Enterprise, GitHub Actions, Artifact Registry, IaC (Terraform), Observability (OpenTelemetry + Prometheus + Grafana), Vulnerability management (Trivy/Snyk)
  • Boundaries: API contracts (OpenAPI), event contracts (AsyncAPI), data contracts (JSON Schema/Avro)

Existing issues

  • Specs are primarily natural language, boundaries are unclear → misalignment surfaces during testing
  • Quality assurance depends on manual work and reviews → time-zone gaps slow the feedback loop
  • Security stays at the level of contracts and operational rules → not reflected in implementation
  • Bridge SEs tend to become a single point of failure (SPOF)

3. Technical Section

3-1. ⚙️ Fix “contract boundaries” as API/data contracts (OpenAPI/AsyncAPI)

Move boundaries from “documents” to “machine-readable”

The most expensive part of distributed development is interpretation drift discovered after implementation. To reduce this, fix deliverable boundaries using OpenAPI 3.1 and AsyncAPI 2.6, and make the review target not a “spec document” but a “contract.” Validate contracts in CI with linting and compatibility checks, and mechanically block breaking changes.

OpenAPI example (designed to detect breaking changes easily)

openapi: 3.1.0
info:
  title: Order Service API
  version: 1.4.2
paths:
  /v1/orders:
    post:
      operationId: createOrder
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/CreateOrderRequest"
      responses:
        "201":
          description: Created
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/Order"
components:
  schemas:
    CreateOrderRequest:
      type: object
      required: [customerId, items]
      properties:
        customerId: { type: string }
        items:
          type: array
          minItems: 1
          items:
            type: object
            required: [sku, qty]
            properties:
              sku: { type: string }
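              qty: { type: integer, minimum: 1 }
    # Minimal response schema so the $ref to Order above resolves
    # (fields are illustrative; extend to match the real response)
    Order:
      type: object
      required: [id]
      properties:
        id: { type: string }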

Security considerations

  • Assume OAuth2/OIDC authorization (e.g., Keycloak 24.0 / Auth0) and explicitly define scopes in the contract
  • For fields containing PII, assign data classification (e.g., Confidential) via extension attributes and codify “no logging” rules
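One way to codify the “no logging” rule is to drive redaction from the contract itself. The following is a minimal sketch, assuming a hypothetical extension attribute named `x-data-classification` and a hypothetical helper `redact_for_logging`; neither name comes from OpenAPI or from this article's contracts, and the schema fragment is invented for illustration.

```python
from typing import Any

# Hypothetical schema fragment. `x-data-classification` is an assumed
# extension attribute name, not something OpenAPI defines.
CUSTOMER_SCHEMA = {
    "type": "object",
    "properties": {
        "customerId": {"type": "string"},
        "email": {"type": "string", "x-data-classification": "Confidential"},
        "address": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "street": {"type": "string", "x-data-classification": "Confidential"},
            },
        },
    },
}

REDACTED = "[REDACTED]"


def redact_for_logging(payload: Any, schema: dict) -> Any:
    """Return a copy of payload with Confidential fields masked."""
    if schema.get("x-data-classification") == "Confidential":
        return REDACTED
    if isinstance(payload, dict):
        props = schema.get("properties", {})
        # Fields missing from the schema are passed through unchanged.
        return {
            key: redact_for_logging(value, props.get(key, {}))
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        item_schema = schema.get("items", {})
        return [redact_for_logging(item, item_schema) for item in payload]
    return payload
```

Calling `redact_for_logging(request_body, CUSTOMER_SCHEMA)` before every log statement keeps the rule in one place. A stricter variant could mask any field that is absent from the schema instead of passing it through.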

📊 Benchmark (Model case: impact of contract-driven development)

Metric | Traditional (natural-language specs) | Contract-driven (OpenAPI + CI) | Delta
Rework caused by requirement misunderstandings (per sprint) | 6.2 | 2.1 | -66%
Breaking API changes reaching production (per quarter) | 3 | 0–1 | Significantly reduced
Review queue time (average) | 2.4 days | 0.9 days | -62%

The point is not “people trying harder,” but treating contracts as code and stopping breaking changes in the pipeline.
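To make “stopping breaking changes in the pipeline” concrete, here is a minimal sketch of a compatibility check between two versions of a request schema. It is not a replacement for a dedicated OpenAPI diff tool. The function name `find_breaking_changes`, the rule set (removed property, changed type, newly required field), and the `currency` field in the new schema are assumptions made for illustration.

```python
def find_breaking_changes(old: dict, new: dict, path: str = "") -> list:
    """List breaking differences between two request object schemas."""
    problems = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})

    for name, old_sub in old_props.items():
        new_sub = new_props.get(name)
        if new_sub is None:
            # Treated as breaking here, which is a conservative assumption.
            problems.append(f"{path}{name}: property removed")
            continue
        old_type, new_type = old_sub.get("type"), new_sub.get("type")
        if old_type != new_type:
            problems.append(
                f"{path}{name}: type changed from {old_type} to {new_type}"
            )
        elif old_type == "object":
            problems.extend(find_breaking_changes(old_sub, new_sub, f"{path}{name}."))
        elif old_type == "array":
            # Only arrays of objects are inspected in this sketch.
            problems.extend(
                find_breaking_changes(
                    old_sub.get("items", {}), new_sub.get("items", {}), f"{path}{name}[]."
                )
            )

    # A newly required request field breaks clients that do not send it yet.
    for name in set(new.get("required", [])) - set(old.get("required", [])):
        problems.append(f"{path}{name}: became required")
    return sorted(problems)


def _item_schema(qty_type: str) -> dict:
    return {
        "type": "object",
        "required": ["sku", "qty"],
        "properties": {"sku": {"type": "string"}, "qty": {"type": qty_type}},
    }


# Old version mirrors CreateOrderRequest from the OpenAPI example.
OLD_SCHEMA = {
    "type": "object",
    "required": ["customerId", "items"],
    "properties": {
        "customerId": {"type": "string"},
        "items": {"type": "array", "items": _item_schema("integer")},
    },
}

# Hypothetical next version: adds a required field and changes qty's type.
NEW_SCHEMA = {
    "type": "object",
    "required": ["customerId", "items", "currency"],
    "properties": {
        "customerId": {"type": "string"},
        "currency": {"type": "string"},
        "items": {"type": "array", "items": _item_schema("string")},
    },
}

if __name__ == "__main__":
    for problem in find_breaking_changes(OLD_SCHEMA, NEW_SCHEMA):
        print(problem)
```

In CI, a non-empty result would fail the job unless the contract's major version was bumped in the same pull request.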


3-2. 🔧 Implement quality gates in CI/CD (GitHub Actions + SonarQube + Trivy)

Make quality a “property of the build,” not an “inspection phase”

The more time zones you have, the more feedback-loop delays degrade quality. The countermeasure is to embed static analysis, tests, dependency audits, and SBOM generation into CI, and create a Quality Gate that prevents merging unless it passes. SonarQube 10.6 and the Trivy 0.50 series are sufficient for baseline governance.

GitHub Actions (example)

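name: ci
on:
  pull_request:
# Least privilege for the workflow token (read-only repository access)
permissions:
  contents: read
jobs:
  build-test:
    runs-on: ubuntu-24.04
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history lets SonarQube attribute new code correctly
      - uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: '21'
      - name: Unit Test
        run: ./gradlew test
      - name: SonarQube Scan
        env:
          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
          SONAR_HOST_URL: ${{ secrets.SONAR_HOST_URL }}  # assumed secret name for the server URL
        # Wait for the Quality Gate result so a failed gate fails this step
        run: ./gradlew sonar -Dsonar.qualitygate.wait=true
      - name: Build image
        run: docker build -t app:${{ github.sha }} .
      # The next two steps assume trivy and syft are already installed on the runner
      # (for example via an earlier setup step or a custom runner image)
      - name: Trivy scan
        run: trivy image --severity HIGH,CRITICAL --exit-code 1 app:${{ github.sha }}
      - name: Generate SBOM
        run: syft app:${{ github.sha }} -o spdx-json > sbom.json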

Security considerations

  • Minimize CI execution privileges (GitHub Actions OIDC + cloud IAM integration; eliminate long-lived keys)
  • Separate secrets by environment (dev/stg/prod) and control access based on the outsourcing scope

📊 Benchmark (Impact of introducing CI gates)

Metric | Before | After | Notes
Findings per PR (review) | Avg. 9.1 | Avg. 4.0 | Pre-removed via static analysis
When CRITICAL vulnerabilities are detected | The night before release | At PR time | Trivy/SBOM
MTTR (minor defects) | 2.8 days | 1.2 days | Improved reproducibility

3-3. ⚙️ Make requirements “testable” (BDD + Contract Test)

Don’t let ambiguous requirements become ambiguous acceptance criteria

The process described in referenced articles 2/4 (requirements → design → implementation → testing) is correct, but in distributed environments, “requirements = documents” tends to break down. Instead, express acceptance criteria using BDD (Gherkin) and Contract Tests (Pact v4), so implementers can mechanically verify “pass conditions.”

Gherkin example (put acceptance criteria at the center of the spec)

Feature: Create order
  Scenario: valid order is created
    Given customer "c-100" exists
    When I POST /v1/orders with items [("sku-1",2)]
    Then response status is 201
    And response body has field "id"
    And stock of "sku-1" decreases by 2
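The scenario above becomes a pass condition once each step is bound to code. The following sketch maps the Given/When/Then steps onto a plain test function against a hypothetical in-memory stand-in, `InMemoryOrderService`. The class, its method names, and the 400/409 error statuses are assumptions for illustration; a real project would bind the steps through a BDD runner and call the actual API.

```python
class InMemoryOrderService:
    """Hypothetical stand-in for the Order Service, used only in this sketch."""

    def __init__(self, customers, stock):
        self.customers = set(customers)
        self.stock = dict(stock)
        self._next_id = 1

    def post_order(self, customer_id, items):
        """Mimic POST /v1/orders and return (status, body)."""
        if customer_id not in self.customers or not items:
            return 400, {"error": "invalid request"}
        if any(qty < 1 or self.stock.get(sku, 0) < qty for sku, qty in items):
            return 409, {"error": "insufficient stock"}
        for sku, qty in items:
            self.stock[sku] -= qty
        order_id = f"o-{self._next_id}"
        self._next_id += 1
        return 201, {"id": order_id, "customerId": customer_id}


def test_valid_order_is_created():
    # Given customer "c-100" exists
    service = InMemoryOrderService(customers=["c-100"], stock={"sku-1": 5})
    # When I POST /v1/orders with items [("sku-1",2)]
    status, body = service.post_order("c-100", [("sku-1", 2)])
    # Then response status is 201
    assert status == 201
    # And response body has field "id"
    assert "id" in body
    # And stock of "sku-1" decreases by 2
    assert service.stock["sku-1"] == 3
```

Because every assertion corresponds to one Gherkin line, a failing test points directly at the acceptance criterion that was missed.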

Pact (Consumer-Driven Contract) example

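import au.com.dius.pact.consumer.ConsumerPactBuilder;
import au.com.dius.pact.core.model.V4Pact;

// Consumer side: web-frontend declares what it expects from order-service
V4Pact pact = ConsumerPactBuilder
  .consumer("web-frontend")
  .hasPactWith("order-service")
  .uponReceiving("create order")
  .path("/v1/orders")
  .method("POST")
  .willRespondWith()
  .status(201)
  .toPact(V4Pact.class); // request a V4 pact; the no-argument toPact() returns a RequestResponsePact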

Scalability analysis

  • Contract tests become more effective as the number of teams grows (reducing combinatorial explosion in integration tests)
  • However, if breaking contract changes occur frequently (bloated APIs), operating costs increase—domain decomposition is a prerequisite
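A rough way to see the scaling argument: the number of service pairs that could need a joint integration test grows quadratically with the number of services, while the number of contracts grows only with the actual consumer-to-provider dependencies. The sketch below uses a hypothetical dependency map; the service names and edges are illustrative, not measured.

```python
def pairwise_combinations(n_services: int) -> int:
    """Upper bound on service pairs that could be tested together."""
    return n_services * (n_services - 1) // 2


def contract_count(dependencies: dict) -> int:
    """Number of consumer -> provider edges, i.e. contracts to maintain."""
    return sum(len(providers) for providers in dependencies.values())


# Hypothetical dependency map: consumer -> list of providers it calls.
DEPENDENCIES = {
    "web-frontend": ["order-service"],
    "order-service": ["inventory-service", "billing-service"],
    "inventory-service": [],
    "billing-service": [],
}

if __name__ == "__main__":
    print("pairs:", pairwise_combinations(len(DEPENDENCIES)))
    print("contracts:", contract_count(DEPENDENCIES))
```

The gap widens as services are added, provided the dependency graph stays sparse. A bloated API that everything depends on pushes the contract count back toward the pairwise bound.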

📊 Benchmark

Metric | Traditional | BDD + Contract | Delta
Spec mismatches found during integration testing | High | Low | Shifted left
Wait time due to unstable integration environments | High | Medium to low | Mocking possible

3-4. 🔧 Information-flow design that avoids making bridge SEs a SPOF (ADR/Decision Log)

Design “decision propagation,” not “translation”

Bridge SEs are useful, but they easily become a hub for tacit knowledge. The countermeasure is to record design decisions as ADRs (Architecture Decision Records) and make them accessible to everyone. Slack/Teams conversations disappear. ADRs remain. Confusing the two accelerates knowledge silos.

ADR template (example)

# ADR-0012: Use outbox pattern for order events
Date: 2026-01-20
Status: Accepted
Context:
- We publish OrderCreated events to Kafka
- We must avoid dual-write inconsistency
Decision:
- Implement transactional outbox in PostgreSQL 16
Consequences:
- Requires CDC (Debezium 2.6) or relay worker
- Adds operational components but improves correctness

Security considerations

  • Do not write sensitive information (keys, customer data) in ADRs—limit to decisions and rationale
  • Apply least privilege to repository permissions; separate ADRs outside the outsourcing scope into a different repo

📊 Benchmark (Example indicators for reducing knowledge silos)

Metric | No ADRs | With ADRs
Frequency of re-litigating the same topic (per month) | High | Low
Onboarding duration | Long | Short

3-5. ⚙️ Data isolation and secrecy: “design so they can’t touch it” at the outsourcing boundary (PostgreSQL 16 + Vault)

The strongest control is not handing over personal data

Information leakage risk is often cited as a downside of outsourcing, but technically, it’s stronger to shift from “protect it” to “don’t let them touch it.” Concretely, replace development-environment data with anonymized/synthetic data, and run production-like validation using masked datasets. Manage secrets with HashiCorp Vault 1.15 series and KMS integration to issue short-lived tokens.
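As one possible masking step, PII columns can be replaced with keyed hashes before a dataset leaves the domestic side. The sketch below is a minimal illustration with hypothetical helper names (`pseudonymize`, `mask_row`) and hypothetical column names. Keyed hashing is pseudonymization rather than full anonymization, so the key must never be shared with the outsourced team, and low-entropy fields may still need synthetic replacement.

```python
import hashlib
import hmac


def pseudonymize(value: str, key: bytes, prefix: str = "anon") -> str:
    """Map a value to a stable token using HMAC-SHA256."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # A shortened digest keeps test data readable; lengthen it if collisions matter.
    return f"{prefix}-{digest[:12]}"


def mask_row(row: dict, pii_columns: set, key: bytes) -> dict:
    """Return a copy of row with the PII columns replaced by tokens."""
    masked = {}
    for column, value in row.items():
        if column in pii_columns and value is not None:
            masked[column] = pseudonymize(str(value), key, prefix=column)
        else:
            masked[column] = value
    return masked
```

Because the mapping is deterministic for a given key, the same customer gets the same token in every table, so joins and foreign keys in the masked dataset still line up.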

PostgreSQL role separation (example)

-- Isolate PII tables in a separate schema
CREATE SCHEMA pii;
REVOKE ALL ON SCHEMA pii FROM PUBLIC;

-- Do not grant access to the outsourced development role
CREATE ROLE offshore_dev NOINHERIT;
GRANT USAGE ON SCHEMA public TO offshore_dev;
-- Do not GRANT anything on pii

Vault (Kubernetes auth) configuration sketch

vault auth enable kubernetes
vault write auth/kubernetes/config \
  kubernetes_host="https://$K8S_API" \
  token_reviewer_jwt=@/var/run/secrets/kubernetes.io/serviceaccount/token \
  kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt

Scalability analysis

  • Data isolation pairs well with microservices (you can define PII boundaries per service)
  • Retrofitting isolation into a monolith is expensive; a phased migration (Strangler Fig) is more realistic

📊 Benchmark (Model case)

Item | No masking | Anonymized/synthetic data | Notes
Production data brought into dev environments | Likely to occur | Near zero by policy | Easier governance
Impact in case of leakage | High | Low | No PII present

3-6. 🔧 Scalability: split teams by “dependency graph,” not “org chart”

Assume Conway’s Law: API boundaries should be team boundaries

If you want higher throughput with offshore capacity, you must reduce dependencies before adding headcount. The method is to split domains using DDD Bounded Contexts and have Feature Teams own each context. Teams connect via APIs/events and must not share databases. This enables parallel development even across time zones.

Technical flow description (example: order processing)

  • Web/App → API Gateway (Envoy 1.30) → Order Service
  • Order Service writes to PostgreSQL 16 (Outbox)
  • Debezium 2.6 publishes events to Kafka 3.7
  • Inventory/Billing subscribe and update with eventual consistency
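The outbox step in this flow can be sketched in a few lines. The example below uses an in-memory SQLite database as a stand-in for PostgreSQL 16 and a plain callback as a stand-in for the Debezium/Kafka relay. The table layout, function names, and payload fields are assumptions for illustration, not the article's schema.

```python
import json
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create the business table and the outbox table."""
    conn.executescript(
        """
        CREATE TABLE orders (id TEXT PRIMARY KEY, customer_id TEXT NOT NULL);
        CREATE TABLE outbox (
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            event_type TEXT NOT NULL,
            payload TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
        """
    )


def create_order(conn: sqlite3.Connection, order_id: str, customer_id: str) -> None:
    """Write the order and its event in one transaction (no dual write)."""
    with conn:  # commits on success, rolls back if either insert fails
        conn.execute(
            "INSERT INTO orders (id, customer_id) VALUES (?, ?)",
            (order_id, customer_id),
        )
        # IDs only: no PII goes into the event payload.
        payload = json.dumps({"orderId": order_id, "customerId": customer_id})
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("OrderCreated", payload),
        )


def relay_once(conn: sqlite3.Connection, publish) -> int:
    """Publish pending outbox rows in order and mark them as published."""
    rows = conn.execute(
        "SELECT seq, event_type, payload FROM outbox WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))
    return len(rows)
```

Since a row is marked only after `publish` returns, a crash between the two steps leads to a repeated delivery rather than a lost event. Subscribers such as Inventory and Billing therefore need to handle duplicates idempotently.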

Security considerations

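  • Service-to-service communication via mTLS (e.g., Istio 1.22) + authorization policies (e.g., Istio AuthorizationPolicy or OPA; Gatekeeper covers Kubernetes admission control rather than request-time authorization)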
  • Do not include PII in events (IDs only)

📊 Benchmark (Bottleneck comparison when scaling)

Architecture | Primary bottleneck | Scaling strategy
DB-shared monolith | Locks/joins/change coordination | Mainly vertical scaling
API-boundary modular monolith | Release coordination | Gradual decomposition
Event-driven microservices | Operational complexity | Horizontal scaling + SRE

3-7. 📊 Include performance and quality “observability” in the contract (OpenTelemetry)

SLOs are not legal clauses—they are operable metrics

In distributed development, accountability boundaries for “slow” or “unstable” systems easily become unclear. Define SLOs (e.g., p95 latency, error rate), collect traces/metrics with OpenTelemetry, and build dashboards. The key is to embed SLOs not as “monitoring items,” but into each team’s Definition of Done.
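To show what an “operable” SLO looks like inside a Definition of Done, the sketch below evaluates p95 latency and error rate for one measurement window. The thresholds (300 ms, 1%) and the function names are hypothetical; in practice the inputs would come from the metrics backend rather than from in-memory lists.

```python
def percentile(values, pct: int):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    # Integer ceiling of pct/100 * n avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def evaluate_slo(
    latencies_ms,
    error_count: int,
    request_count: int,
    p95_target_ms: float = 300.0,   # hypothetical target
    max_error_rate: float = 0.01,   # hypothetical target
) -> dict:
    """Summarize one window and report whether both objectives were met."""
    if request_count <= 0:
        raise ValueError("request_count must be positive")
    p95 = percentile(latencies_ms, 95)
    error_rate = error_count / request_count
    return {
        "p95_ms": p95,
        "error_rate": error_rate,
        "met": p95 <= p95_target_ms and error_rate <= max_error_rate,
    }
```

A release check could call `evaluate_slo` on the window right after deployment and block promotion when `met` is `False`, which ties the SLO to a concrete decision instead of a dashboard alone.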

OpenTelemetry Collector (example)

receivers:
  otlp:
    protocols:
      grpc:
exporters:
  prometheus:
    endpoint: 0.0.0.0:9464
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]

📊 Benchmark (Example impact of SLO management)

Metric | No SLOs | SLOs + Observability
Time to complete initial incident triage | Long | Short (identify boundaries via traces)
Detection of performance regressions | User reports | Detected right after deployment

Security considerations

  • Do not include sensitive data in traces (attribute filtering)
  • Separate access to the observability platform with RBAC based on outsourcing scope

4. Comparative Analysis Table

Option | When it fits | Benefits | Drawbacks / Risks | Recommended guardrails
① In-house (domestic, full-stack) | Strong hiring and training capability | Domain knowledge accumulates; fast decision-making | Talent acquisition becomes the bottleneck; rising unit costs | Modularization, SRE adoption, skill matrix development
② Nearshore (regional, domestic) | Japanese-first, governance-focused | Lower communication cost | Limited supply; cost gap is shrinking | Contract-driven approach, CI gates, privilege separation
③ Offshore (fixed-bid centered) | Stable requirements / few changes | Costs are easier to forecast | Weak against change; rework is expensive | Lock OpenAPI; automate acceptance criteria as tests
④ Offshore (staff augmentation × Agile) | Fast-changing web/product development | Flexible scaling; suited to continuous improvement | Scope creep; governance breakdown | SLO/DoD, ADRs, contract tests, FinOps

5. Best Practices and Anti-Patterns

Best practices ✅

  • ⚙️ Make API/event/data contracts machine-readable and validate compatibility in CI
  • 🔧 Make Quality Gates (static analysis, tests, vulnerabilities, SBOM) a merge requirement
  • 📊 Define SLOs, observe with OpenTelemetry, and use them for release decisions
  • Shift the bridge SE role from “translation” to “decision distribution (ADRs)”
  • Do not bring production data into development environments (anonymized/synthetic data)

Anti-patterns ❌

  • Proceeding on the assumption that “everyone read the spec,” without testable acceptance criteria
  • Reviews become person-dependent and CI becomes ceremonial (warnings can be ignored)
  • Adding teams while sharing a DB, increasing only coordination costs
  • Bridge SE becomes the single contact point and decisions get stuck
  • Sharing secrets via files/spreadsheets

6. Implementation Roadmap and Checklist

Phase 0 (2 weeks): Fix the boundaries

  • Adopt OpenAPI 3.1 / AsyncAPI 2.6
  • Contract versioning (SemVer) and breaking-change rules
  • Check: Are contracts reviewed in the repository and linted in CI?
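A breaking-change rule is easiest to enforce when it is expressed as a version check. The sketch below assumes plain `MAJOR.MINOR.PATCH` versions without pre-release tags and a simple policy (breaking change requires a major bump, additive change a minor bump). The function names are hypothetical.

```python
def parse_version(version: str) -> tuple:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def required_bump(has_breaking: bool, has_additive: bool) -> str:
    """Map the kind of contract change to the minimum SemVer bump."""
    if has_breaking:
        return "major"
    if has_additive:
        return "minor"
    return "patch"


def bump_is_sufficient(old: str, new: str, required: str) -> bool:
    """Check that the new contract version reflects the required bump."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        return False  # the version must always move forward
    if required == "major":
        return new_v[0] > old_v[0]
    if required == "minor":
        return new_v[:2] > old_v[:2]
    return True
```

For example, with the contract at 1.4.2, a pull request that removes a field would need `info.version` to move to 2.0.0 before `bump_is_sufficient` lets it through.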

Phase 1 (1–2 months): CI/CD and quality gates

  • Run tests/analysis/Trivy/SBOM in GitHub Actions (ubuntu-24.04)
  • Make SonarQube 10.6 Quality Gates mandatory
  • Check: Is merging blocked if even one CRITICAL vulnerability is found?

Phase 2 (2–3 months): Operational governance (SLOs, observability, privilege separation)

  • Introduce OpenTelemetry and standardize dashboards
  • Use Vault 1.15 + OIDC to issue short-lived secrets
  • Check: Are RBAC roles separated by environment, and is PII removed from logs/traces?

Phase 3 (ongoing): Organizational scaling

  • Operate ADRs and institutionalize Decision Logs
  • Split teams by DDD boundaries and phase out DB sharing
  • Check: Are cross-team dependencies limited to APIs/events?

7. Reference Resources and Next Steps

  • OpenAPI Specification 3.1.0
  • AsyncAPI Specification 2.6.0
  • Pact (Consumer-Driven Contract Testing) v4
  • OpenTelemetry (Metrics/Traces/Logs)
  • SonarQube 10.6 / Trivy 0.50 / Syft (SPDX)
  • PostgreSQL 16 / Debezium 2.6 / Kafka 3.7

Next step: Start with a small scope (one service / one API) and introduce contract-driven development plus CI gates, then measure rework volume and lead time. Once you see measurable impact, expand the approach one Bounded Context at a time; that is the fastest path. ⚙️🔧📊
