
Architecture

What does DMS extraction do?

The following diagram illustrates the logical architecture.

DMS Extraction Logical

  • Connect to the source database
  • Extract the full data set from the source database, then use change data capture (CDC) to extract ongoing changes. This moves data between systems efficiently and lets downstream applications track changes in the data
  • Extract the source database metadata to be used in the rest of the pipeline and update the metadata store
  • Validate the extracted data against the metadata
  • Upload the data to the Analytical Platform data lake
  • Expose the data and metadata for downstream processes to use
  • Apply Logging, Monitoring and Alerting (LMA) in accordance with good practice
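To make the full-load-then-CDC step concrete, here is a minimal sketch of how a downstream consumer might replay change records on top of a full-load snapshot to recover the latest state of a table. It assumes each change record carries an `Op` field with the value `I`, `U` or `D` (insert, update, delete), which is how AWS DMS marks CDC rows written to an S3 target. The function name `apply_cdc`, the `id` key column and the sample rows are hypothetical and only for illustration; this is not the pipeline's actual implementation.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def apply_cdc(full_load: Iterable[Row], changes: Iterable[Row], key: str) -> List[Row]:
    """Replay CDC records, in order, on top of a full-load snapshot."""
    # Index the snapshot by primary key so each change is a dictionary operation.
    current: Dict[Any, Row] = {row[key]: dict(row) for row in full_load}

    for change in changes:
        op = change["Op"]
        # Drop the operation marker so the stored row matches the snapshot shape.
        row = {k: v for k, v in change.items() if k != "Op"}
        if op in ("I", "U"):
            current[row[key]] = row
        elif op == "D":
            current.pop(row[key], None)
        else:
            raise ValueError(f"Unknown operation: {op!r}")

    return sorted(current.values(), key=lambda r: r[key])


# Hypothetical sample data.
snapshot = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
changes = [
    {"Op": "U", "id": 2, "name": "robert"},
    {"Op": "I", "id": 3, "name": "carol"},
    {"Op": "D", "id": 1, "name": "alice"},
]

latest = apply_cdc(snapshot, changes, key="id")
print(latest)
```

Because the changes are replayed in order, keeping the raw change records alongside the snapshot also lets a consumer rebuild the table as it stood at any earlier point, which is what makes change tracking possible downstream.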

How is DMS Extraction implemented?

Various serverless AWS data analytics services are used. This means AWS takes on the heavy lifting of:

  • Providing and managing scalable, resilient, secure, and cost-effective infrastructural components
  • Ensuring infrastructural components natively integrate with each other

The following AWS Services are used:

DMS Extraction also:

  • Uses create-a-derived-table to curate the data via Amazon Athena orchestrated using dbt
  • Uses different AWS accounts on the Analytical Platform to facilitate and isolate resource management
  • Provisions dev and preprod pipelines for testing deployment changes before deploying to production
  • Extracts metadata from the source database to be used in various places along the pipeline. Please refer to metadata for more details
  • Uses GitHub Actions to automate software workflows and run CI/CD pipelines. Please refer to deployment for more details
  • Uses Pulumi to define and deploy Infrastructure as Code (IaC). Please refer to using pulumi for more details
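One place the extracted metadata can be used is validating the extracted data. The sketch below shows the general idea under simplifying assumptions: the metadata is modelled as a plain dictionary mapping column names to an expected Python type and a nullability flag, and each row is checked for missing columns, unexpected columns, nulls and type mismatches. The `METADATA` layout and the `validate_row` helper are hypothetical; the real metadata format is described on the metadata page.

```python
from typing import Any, Dict, List

# Hypothetical metadata: column name -> expected type and nullability.
METADATA: Dict[str, Dict[str, Any]] = {
    "id": {"type": int, "nullable": False},
    "name": {"type": str, "nullable": True},
}


def validate_row(row: Dict[str, Any], metadata: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return a list of validation errors for one row (empty if the row is valid)."""
    errors: List[str] = []

    # Structural checks: the row's columns must match the metadata exactly.
    for col in sorted(set(metadata) - set(row)):
        errors.append(f"missing column: {col}")
    for col in sorted(set(row) - set(metadata)):
        errors.append(f"unexpected column: {col}")

    # Value checks: nullability first, then type.
    for col, spec in metadata.items():
        if col not in row:
            continue
        value = row[col]
        if value is None:
            if not spec["nullable"]:
                errors.append(f"null in non-nullable column: {col}")
        elif not isinstance(value, spec["type"]):
            errors.append(
                f"wrong type for {col}: expected {spec['type'].__name__}, "
                f"got {type(value).__name__}"
            )
    return errors


print(validate_row({"id": 1, "name": None}, METADATA))
print(validate_row({"id": "1"}, METADATA))
```

Returning a list of errors rather than raising on the first problem lets a pipeline report every issue in a batch at once, which tends to suit logging and alerting better.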

The following diagram summarises the physical architecture for a single database and environment:

DMS Extraction Physical

Please refer to components for a deeper dive into the individual components.


Last update: July 8, 2024
Created: July 8, 2024