
Backend Services

Diagram of the platform focussing on a running form.

User Datastore

Services deployed using MOJ Forms/Formbuilder require a component that securely stores the data submitted by their users. A suitable implementation has been agreed with the Information Security team and is discussed in detail in the internal document Runner / User data store Threats & Mitigations.

Technology

  • PostgreSQL RDS
  • API workers, Ruby on Rails

Repositories

JWT authentication

Each request will be timestamped and signed using a per-service serviceToken (generated by Publisher and injected into the service’s Runner as an environment variable).
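As an illustration of timestamping and signing a request, here is a minimal sketch. It assumes HS256 (HMAC-SHA256) with the serviceToken as the shared secret and a 60-second tolerance on the iat timestamp; neither the algorithm nor the tolerance is stated above, and the helper names sign_request and verify_request are hypothetical. Real services would normally rely on a maintained JWT library rather than hand-rolled encoding.

```ruby
require 'openssl'
require 'base64'
require 'json'

# Hypothetical helper: builds an HS256 JWT whose claims carry an `iat` timestamp.
def sign_request(payload, service_token, now: Time.now)
  header = { 'alg' => 'HS256', 'typ' => 'JWT' }
  claims = payload.merge('iat' => now.to_i)
  signing_input = [header, claims]
                  .map { |part| Base64.urlsafe_encode64(JSON.generate(part), padding: false) }
                  .join('.')
  signature = OpenSSL::HMAC.digest('SHA256', service_token, signing_input)
  "#{signing_input}.#{Base64.urlsafe_encode64(signature, padding: false)}"
end

# Hypothetical helper: returns the claims when the signature matches and the
# timestamp is within `leeway` seconds (an assumed tolerance), otherwise nil.
def verify_request(token, service_token, leeway: 60, now: Time.now)
  parts = token.split('.')
  return nil unless parts.length == 3

  header_b64, claims_b64, signature_b64 = parts
  expected = OpenSSL::HMAC.digest('SHA256', service_token, "#{header_b64}.#{claims_b64}")
  actual = Base64.urlsafe_decode64(signature_b64)
  return nil unless OpenSSL.secure_compare(expected, actual)

  claims = JSON.parse(Base64.urlsafe_decode64(claims_b64))
  (now.to_i - claims['iat'].to_i).abs <= leeway ? claims : nil
end
```

A token signed with one service's token fails verification against any other token, and a token replayed outside the tolerance window is rejected on its timestamp alone.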

User Filestore

The User Filestore is a service for storing files uploaded by users for the lifetime of their application.

It comprises an API service and a storage service (Amazon S3).

The service:

  • is transient
    • Files are stored for the same length of time as the user’s other data held in the User Datastore
    • 28 days by default
  • is not for making files available either publicly or to final intended recipients
    • Files can only be retrieved from storage through the API
  • is secure
    • Files are stored encrypted so that files cannot be accessed if the collection is backed up/moved elsewhere
  • controls access
    • Files are stored using a key generated from a digest of the service, user id and file’s fingerprint encrypted with the user’s id/token digest
    • Files can only be retrieved when presented with all those pieces.

Signing requests

Requests should be signed with a JWT (see JWT authentication above and Service Token Cache below).

Checking additional requirements

Requests should be checked for the presence of encrypted_user_id_and_token (as a body property for POST, as an x-header for GET).

  • Error if encrypted_user_id_and_token property is not present
    • code: 403
    • name: forbidden.user-id-token-missing
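The presence check could be sketched as follows. The header name x-encrypted-user-id-and-token is an assumption (the text only says the value travels as an x-header on GET), and check_user_id_and_token is a hypothetical helper that returns an error description or nil.

```ruby
# Assumed header name; the source only states that GET requests use an x-header.
HEADER_NAME = 'x-encrypted-user-id-and-token'

# Hypothetical helper: returns nil when the value is present, otherwise the
# 403 error described above.
def check_user_id_and_token(http_method, params: {}, headers: {})
  value = if http_method == 'GET'
            headers[HEADER_NAME]
          else
            params['encrypted_user_id_and_token']
          end
  return nil unless value.nil? || value.to_s.empty?

  { code: 403, name: 'forbidden.user-id-token-missing' }
end
```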

Creating S3 key

  • Create digest from user id + file fingerprint (no service token)
  • Encrypt digest via AES-256 with the encrypted_user_id_and_token as key
  • Base64 encode the digest
  • Key is {user_id}{hashed_digest} (no / in between and no service_slug)

The difference is only for the name of the file that resides on S3.
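The key derivation steps above can be sketched as below. Several details are assumptions rather than facts from this page: SHA-256 for the digest, AES-256-CBC as the cipher mode, a 32-byte cipher key obtained by hashing encrypted_user_id_and_token, a fixed all-zero IV so that the same inputs always produce the same key, and URL-safe Base64 so the result contains no "/". The function name s3_key is hypothetical.

```ruby
require 'openssl'
require 'digest'
require 'base64'

# Hypothetical sketch of the S3 key derivation. Cipher mode, IV, digest
# algorithm and key derivation are all assumptions (see the note above).
def s3_key(user_id, file_fingerprint, encrypted_user_id_and_token)
  # Digest from user id + file fingerprint (no service token).
  digest = Digest::SHA256.hexdigest("#{user_id}#{file_fingerprint}")

  # Encrypt the digest with AES-256, keyed from encrypted_user_id_and_token.
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.encrypt
  cipher.key = Digest::SHA256.digest(encrypted_user_id_and_token)
  cipher.iv = "\0" * 16 # fixed IV keeps the derivation deterministic
  encrypted_digest = cipher.update(digest) + cipher.final

  # Base64 encode and prefix with the user id, with no separator.
  "#{user_id}#{Base64.urlsafe_encode64(encrypted_digest)}"
end
```

Because the derivation is deterministic, the same user, file and encrypted_user_id_and_token always resolve to the same object name, which is what lets a later GET locate the file.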

Store file

POST /service/{service_slug}/{user_id}

{
  "iat": "<integer> unix_timestamp",
  "encrypted_user_id_and_token": "<encrypted_string> userId+userToken encrypted via AES-256 with the serviceToken as the key",
  "file": "<binary>",
  "policy": {
    "allowed_types": [
      "<string> mediatype"
    ],
    "max_size": "<integer> bytes",
    "expires": "<integer> days"
  }
}

A. As per “Check request correctly signed and meets requirements”

B. Check file

  • Perform size check if policy.max_size is present
    • Error if file is too large
      • code: 400
      • name: invalid.too-large
      • max_size: {max_size}
      • size: {file_size}
  • Perform file type checks if policy.allowed_types is present
    • Error if file is wrong type
      • code: 400
      • name: invalid.type
      • type: {file_type}
  • Send to virus scanning service
    • Error if file contains virus
      • code: 400
      • name: invalid.virus
      • virus_name: {virus_name}
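A minimal sketch of these checks, run in the order listed. The function name check_file is hypothetical, the media type is assumed to have been detected already, and scanner is a stand-in callable for the virus scanning service that returns a virus name or nil.

```ruby
# Hypothetical checker. `scanner` is any callable returning a virus name
# (String) or nil; it stands in for the real virus scanning service.
def check_file(bytes, media_type, policy: {}, scanner: ->(_bytes) { nil })
  max_size = policy['max_size']
  if max_size && bytes.bytesize > max_size
    return { code: 400, name: 'invalid.too-large', max_size: max_size, size: bytes.bytesize }
  end

  allowed_types = policy['allowed_types']
  if allowed_types && !allowed_types.include?(media_type)
    return { code: 400, name: 'invalid.type', type: media_type }
  end

  virus_name = scanner.call(bytes)
  return { code: 400, name: 'invalid.virus', virus_name: virus_name } if virus_name

  nil # all checks passed
end
```

Running the cheap size and type checks first means a file is only sent for scanning once it is known to be acceptable under the policy.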

C. Store file

  • Fingerprint file
  • Create S3 key
  • Encrypt file using encrypted_user_id_and_token as key
  • Upload file to S3 key
    • Error if file cannot be stored
      • code: 503
      • name: unavailable.file-store-failed
      • [service_code]
      • [message] ‘any additional info from S3 request’
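The fingerprinting and encryption steps might look like the sketch below. It assumes a SHA-256 fingerprint, AES-256-CBC with a key derived by hashing encrypted_user_id_and_token, and a random IV prepended to the ciphertext; none of these choices is confirmed by this page, and the function names are hypothetical. The upload itself is omitted because it depends on the S3 client in use.

```ruby
require 'openssl'
require 'digest'

# Assumption: the file fingerprint is a SHA-256 hex digest of its contents.
def fingerprint(bytes)
  Digest::SHA256.hexdigest(bytes)
end

# Assumption: AES-256-CBC, key derived via SHA-256, random IV stored in the
# first 16 bytes of the returned blob.
def encrypt_file(bytes, encrypted_user_id_and_token)
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.encrypt
  cipher.key = Digest::SHA256.digest(encrypted_user_id_and_token)
  iv = cipher.random_iv
  iv + cipher.update(bytes) + cipher.final
end

# Reverses encrypt_file: reads the IV from the first 16 bytes, then decrypts.
def decrypt_file(blob, encrypted_user_id_and_token)
  cipher = OpenSSL::Cipher.new('aes-256-cbc')
  cipher.decrypt
  cipher.key = Digest::SHA256.digest(encrypted_user_id_and_token)
  cipher.iv = blob.byteslice(0, 16)
  cipher.update(blob.byteslice(16, blob.bytesize - 16)) + cipher.final
end
```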

D. Return file storage details

{
  "url": "/service/{service_slug}/{user_id}/{fingerprint}",
  "size": "<integer>(bytes)",
  "type": "<string>(mediatype)",
  "date": "<integer>(unix_timestamp)"
}
  • Status code if no file previously existed

    • 201 (Created)
  • Status code if a file previously existed

    • 204 (No Content)
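For illustration, the response could be assembled as below. The function name storage_response and its parameters are hypothetical; the field names and status codes follow the description above.

```ruby
# Hypothetical helper: returns [status_code, details] for a stored file.
def storage_response(service_slug, user_id, fingerprint, size, media_type,
                     previously_existed:, now: Time.now)
  details = {
    url: "/service/#{service_slug}/#{user_id}/#{fingerprint}",
    size: size,        # bytes
    type: media_type,  # media type string
    date: now.to_i     # unix timestamp
  }
  [previously_existed ? 204 : 201, details]
end
```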

This information is stored in the User Datastore and is sent to the Submitter to retrieve the file.

Retrieve file

GET /service/{service_slug}/{user_id}/{fingerprint}

encrypted_user_id_and_token must be sent as an x-header.

A. As per “Check request correctly signed and meets requirements”

B. Fetch file

  • Create S3 key
  • Fetch file
    • Error if file cannot be fetched
      • code: 503
      • name: unavailable.file-retrieval-failed
      • [service_code]
      • [message] ‘any additional info from S3 request’
    • Error if file does not exist
      • code: 404
      • name: not-found
  • Decrypt file using encrypted_user_id_and_token as key

C. Return file

Return file as body of response
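A sketch of how the fetch outcomes might map to responses. Here bucket is any object with a Hash-like fetch method standing in for S3, and decrypt is a callable standing in for the decryption step; both, along with the name retrieve_file, are assumptions for illustration. In this sketch any failure other than a missing key, including a failed decryption, is reported as 503.

```ruby
# Hypothetical retrieval step. `bucket.fetch(key)` is assumed to raise KeyError
# when the object does not exist and some other error when storage is unreachable.
def retrieve_file(bucket, key, decrypt: ->(blob) { blob })
  blob = bucket.fetch(key)
  { code: 200, body: decrypt.call(blob) }
rescue KeyError
  { code: 404, name: 'not-found' }
rescue StandardError => e
  { code: 503, name: 'unavailable.file-retrieval-failed', message: e.message }
end
```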

Technology

  • Ruby on Rails

Repositories

Service Token Cache

The services that make up the platform communicate with each other and verify that each request is permitted. All requests use a JSON Web Token (JWT) signed with the corresponding service’s service token.

The flow of communication between services in the platform.

Flow for using the Service Token cache

App A - for example the Runner

  1. Encrypts payload

  2. Encodes a JWT and signs with the private key

  3. Sends to App B (Datastore)

  4. App B decodes the token to get the service slug

  5. App B asks the Service Token Cache for the public key of App A, which is requested from k8s if needed

  6. App B verifies the signature of the JWT

  7. If all is okay, App B continues with the request

  8. On error, App B returns a 401 Unauthorised (hash is wrong) or 422 Unprocessable Entity (payload error)
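The signing and verification steps of this flow can be sketched with Ruby's standard OpenSSL bindings. The sketch assumes RS256, assumes the service slug travels in the iss claim, and uses a hypothetical PublicKeyCache whose block stands in for the Service Token Cache lookup; none of these names come from this page. Payload encryption (step 1) is left out.

```ruby
require 'openssl'
require 'base64'
require 'json'

def b64(data)
  Base64.urlsafe_encode64(data, padding: false)
end

# App A: encode a JWT and sign it with the private key (RS256 assumed).
def encode_jwt(claims, private_key)
  header = { 'alg' => 'RS256', 'typ' => 'JWT' }
  signing_input = "#{b64(JSON.generate(header))}.#{b64(JSON.generate(claims))}"
  signature = private_key.sign(OpenSSL::Digest.new('SHA256'), signing_input)
  "#{signing_input}.#{b64(signature)}"
end

# Hypothetical cache: calls the block (standing in for the Service Token Cache
# lookup) only on a miss.
class PublicKeyCache
  def initialize(&fetcher)
    @fetcher = fetcher
    @keys = {}
  end

  def public_key_for(service_slug)
    @keys[service_slug] ||= @fetcher.call(service_slug)
  end
end

# App B: returns [status, claims]. 401 for a bad signature, 422 for a token
# that cannot be parsed. The `iss` claim name is an assumption.
def verify_jwt(token, cache)
  parts = token.split('.')
  return [422, nil] unless parts.length == 3

  header_b64, claims_b64, signature_b64 = parts
  claims = JSON.parse(Base64.urlsafe_decode64(claims_b64))
  public_key = cache.public_key_for(claims.fetch('iss'))
  signature = Base64.urlsafe_decode64(signature_b64)
  valid = public_key.verify(OpenSSL::Digest.new('SHA256'), signature, "#{header_b64}.#{claims_b64}")
  valid ? [200, claims] : [401, nil]
rescue JSON::ParserError, KeyError, ArgumentError, TypeError
  [422, nil]
end
```

Caching the public key per service slug means the lookup happens once per calling service rather than on every request.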

Antivirus

Provides uploaded document scanning.

Technology

  • ClamAV

Repositories

This page was last reviewed on 1 June 2021. It needs to be reviewed again on 1 September 2021.