...

Typically, there are few initial matches between a remote API and FDW.

Metadata must be set up in FDW so that the incoming metadata from the remote API is recognized correctly. This includes creating new metadata items as well as aliases for existing metadata items.
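
As a minimal sketch of how alias-based matching might work, assuming a hypothetical lookup table (the names below are illustrative, not actual FDW metadata):

    from typing import Optional

    # Hypothetical alias table mapping incoming remote names to canonical
    # FDW metadata items; the real FDW metadata model is richer than this.
    FDW_METADATA_ALIASES = {
        "maize (white)": "Maize Grain (White)",
        "white maize": "Maize Grain (White)",
    }

    def resolve_metadata(remote_name: str) -> Optional[str]:
        # Return the canonical FDW metadata item, or None if a new item
        # or alias still needs to be created in FDW.
        return FDW_METADATA_ALIASES.get(remote_name.strip().lower())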

...

  1. Research the API: Investigate the API, acquiring credentials if necessary, and document the required authentication, the available endpoints, and the supported filters and formats. This step may require interaction with the developers of the remote API, and some trial and error if support and/or documentation is not available. The output from this ticket is typically a Jupyter Notebook that demonstrates how to access the API and download the data (a minimal sketch of one such notebook cell follows this list).

  2. Develop a data ingestion pipeline for the API: Write a new Luigi pipeline within FDW that downloads the necessary data, performs the API-specific transformations required to prepare the data for ingestion into FDW, and then passes the transformed data as input to the generic data normalization, validation and ingestion tasks (see the Luigi sketch after this list). The output from this ticket is a complete pipeline, with associated unit tests, merged into the FDW software and released to the FDW production environment.

  3. Support for enabling the API in FDW production: Use API-specific guidance to determine the required metadata and Data Series in FDW. Typically, we have no control over the content of the remote API, so we must set up FDW to recognize the Data Series we want to capture. To help with this process, the ingestion pipeline produces an API metadata matches spreadsheet that reports the Data Series available from the remote API and how their metadata matches the metadata available in FDW, including the Data Series defined for the relevant Data Source Document(s) (see the final sketch after this list).
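
For step 1, a notebook cell might look like the following sketch, assuming a hypothetical endpoint, filter parameters and bearer-token authentication (none of these are taken from a real remote API):

    import requests

    BASE_URL = "https://api.example.org/v1"  # hypothetical remote API
    API_KEY = "my-api-key"                   # credential acquired during research

    # Discover the available series in a machine-readable format.
    response = requests.get(
        f"{BASE_URL}/series",
        params={"format": "json", "country": "SO"},  # assumed filters
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=30,
    )
    response.raise_for_status()
    series = response.json()
    print(f"{len(series)} series available")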
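
For step 2, the sketch below uses the standard Luigi task API to pair a download task with an API-specific transformation task; the endpoint, parameters and record fields are assumptions for illustration, and the generic normalization, validation and ingestion tasks that would consume the output are omitted:

    import csv
    import json

    import luigi
    import requests

    class DownloadRemoteData(luigi.Task):
        """Download raw data from the remote API (endpoint is hypothetical)."""
        country = luigi.Parameter()

        def output(self):
            return luigi.LocalTarget(f"raw_{self.country}.json")

        def run(self):
            response = requests.get(
                "https://api.example.org/v1/series",  # assumed endpoint
                params={"country": self.country},
                timeout=30,
            )
            response.raise_for_status()
            with self.output().open("w") as f:
                f.write(response.text)

    class TransformRemoteData(luigi.Task):
        """Apply API-specific transformations so the generic normalization,
        validation and ingestion tasks can consume the data."""
        country = luigi.Parameter()

        def requires(self):
            return DownloadRemoteData(country=self.country)

        def output(self):
            return luigi.LocalTarget(f"transformed_{self.country}.csv")

        def run(self):
            with self.input().open() as f:
                records = json.load(f)
            with self.output().open("w") as f:
                writer = csv.writer(f)
                writer.writerow(["series", "date", "value"])  # assumed fields
                for record in records:
                    writer.writerow([record["series"], record["date"], record["value"]])

In practice, splitting the download from the transformation keeps the raw API response available for debugging and lets Luigi rerun only the stages whose outputs are missing.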
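
For step 3, the metadata matches spreadsheet could be produced along the lines of this sketch, assuming hypothetical remote series names and FDW metadata (the real report contains more detail):

    import pandas as pd

    # Hypothetical inputs: series names reported by the remote API, and
    # metadata items already defined (or aliased) in FDW.
    remote_series = ["Maize (White), Mogadishu", "Rice (Imported), Baidoa"]
    fdw_items = {"maize (white), mogadishu": "Maize Grain (White), Mogadishu"}

    rows = []
    for name in remote_series:
        match = fdw_items.get(name.lower())
        rows.append({
            "remote_series": name,
            "fdw_match": match or "",
            "status": "matched" if match else "missing",
        })

    # Write the report so metadata gaps can be reviewed and fixed in FDW.
    pd.DataFrame(rows).to_excel("api_metadata_matches.xlsx", index=False)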