The following list of new features, updates and bug fixes is included in the Sprint 22.10 release of the FEWS NET Data Platform.
New Features
Introduce IPC HFA data domain (DATA-557)
...
Read more about these features.
Updates
Add filtering by keyword for News Article API (DATA-1651)
...
Read more about the updates to the FDW. These changes included many tickets from prior sprints and several in this sprint (DATA-2035, DATA-2036, DATA-2040, DATA-2042).
Bug Fixes
Trying to open the Semi Structured Data Point page results in 500 Internal Server Error (DATA-1656)
Trying to open https://fdw.fews.net/en/admin/semistructured/semistructureddatapoint/ was resulting in an error. This has been fixed and is now working correctly.
...
Unable to update datasourcedocument name
...
Description
...
...
(DATA-1929)
Users were unable to update a source document name when no license information was selected.
...
was selected. This error has been corrected and users can now update data source document names.
Investigate issues with refresh materialized views debouncing incorrectly.
...
Description
...
(DATA-1935)
Uploaded data collections were taking a long time to reflect updates in FDW. This is because the materialized view refreshes were not running correctly. These can be run manually by SSH-ing into the server and running the commands from psql, like the one below:
```
psql -c "REFRESH MATERIALIZED VIEW CONCURRENTLY price_marketpricefacts_materialized"
```
It looks as though at least the price_marketpricefacts_materialized view is being continually debounced and not actually running. We see a log message like the one below:
Task common.tasks.refresh_materialized_view[5d089b85-9811-48ed-a854-1938387de3b9] succeeded in 0.05116157094016671s: 'Request received for refresh_materialized_view_refresh_materialized_view-materialized_view=price_marketpricefacts_materialized while DEBOUNCING so re-scheduled, id 901fd5c6-04fc-4897-be17-7ad3cac3e369 eta 2022-09-30 11:49:53.046322+00:00.'
and tracing the new task that has been spawned, we see the log below:
Task common.tasks.refresh_materialized_view[901fd5c6-04fc-4897-be17-7ad3cac3e369] succeeded in 0.03576034493744373s: 'Task refresh_materialized_view_refresh_materialized_view-materialized_view=price_marketpricefacts_materialized triggered to run twice, so cancelling second call 901fd5c6-04fc-4897-be17-7ad3cac3e369.'
The only way the code can reach the last log statement is if the debounce method fails to delete the HERD:{task_key} entry from the cache. This could happen if the HERD:{task_key} is not set in the first place. Chris Preager: could this also happen if the tasks are on different servers?
The code has the following comment:
```python
# This can be caused by:
# 1. This cache delete occurs just before cache_set(HERD) in apply_async, deletes nothing, or
#    apply_async enqueues a debounce while this is running, and then this method enqueues a
#    duplicate below due to the renewed ETA.
# 2. This cache_delete(HERD) runs immediately after cache_set(HERD) in apply_async, so a
#    second call to apply_async goes through. In this case the latter call won't run after
#    debounce_wait seconds, but will be satisfied by this run anyway. A subsequent call will
#    run fine.
```
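The debounce pattern described above can be sketched roughly as follows. This is a minimal illustration, not the FDW implementation: the in-memory cache stands in for the shared cache backend, and the function names and messages only mirror the logs quoted above. It shows how a missing (or invisible) HERD:{task_key} entry causes a legitimate run to conclude it is a duplicate and cancel itself.

```python
import time

# In-memory stand-in for the shared cache (Redis/memcached in production).
# If workers on different servers used separate caches, the HERD key set by
# one worker would be invisible to another, producing the same failure mode.
cache = {}

def cache_set(key, value):
    cache[key] = value

def cache_delete(key):
    # Mirrors Django's cache.delete(): returns whether a key was removed.
    return cache.pop(key, None) is not None

def apply_async(task_key, debounce_wait=60):
    """Enqueue a debounced task. If a run is already pending (the HERD
    key exists), re-schedule instead of enqueueing a duplicate."""
    herd_key = f"HERD:{task_key}"
    if herd_key in cache:
        return "DEBOUNCING so re-scheduled"
    cache_set(herd_key, time.time() + debounce_wait)
    return "enqueued"

def run(task_key):
    """Execute the task. If the HERD key cannot be deleted, this run
    assumes it was triggered twice and cancels itself: the failure mode
    seen in the logs when the key was never set in the first place."""
    herd_key = f"HERD:{task_key}"
    if not cache_delete(herd_key):
        return "triggered to run twice, so cancelling second call"
    return "refreshed"

# First submission enqueues; a second within the window is debounced.
print(apply_async("refresh-price_marketpricefacts_materialized"))  # enqueued
print(apply_async("refresh-price_marketpricefacts_materialized"))  # DEBOUNCING so re-scheduled
print(run("refresh-price_marketpricefacts_materialized"))          # refreshed
print(run("refresh-price_marketpricefacts_materialized"))          # triggered to run twice, so cancelling second call
```

The last call demonstrates the bug path: once the HERD key is gone, every subsequent run believes it is the duplicate and cancels, so the view is never refreshed.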
The debounce handling was preventing the materialized view refreshes from running correctly. This has been corrected.
Fix Comtrade unit test errors (DATA-1938)
The way the Comtrade tests were testing the dates meant that on the first day of the month they would fail. This has been fixed.
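Date arithmetic of this kind commonly breaks at month boundaries. As a hedged illustration (the actual Comtrade test code is not shown here), computing "the previous month" by anchoring to the first of the month is safe on every day, including the 1st, where naive day-based arithmetic fails:

```python
from datetime import date, timedelta

def previous_month(today: date) -> date:
    """Return the first day of the month before `today`.

    Anchoring to day 1 before subtracting makes this correct on every
    day of the month, including the 1st, where naive `today.month - 1`
    or `today.day - 1` arithmetic breaks."""
    first_of_this_month = today.replace(day=1)
    last_day_of_previous = first_of_this_month - timedelta(days=1)
    return last_day_of_previous.replace(day=1)

# Works on the first of the month, where date-sensitive tests tend to fail.
print(previous_month(date(2022, 10, 1)))  # 2022-09-01
print(previous_month(date(2022, 1, 15)))  # 2021-12-01
```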
Update FAO pipeline based on changes in remote API (DATA-1943)
Description
The help ticket reported:
The dataseries API URL has changed from https://fpma.apps.fao.org/giews/food-prices/tool/api/v1/series/ to https://fpma.fao.org/giews/v4/price_module/api/v1/FpmaSerie/
The current version also requires a two-step process to fetch the datapoints.
First, https://fpma.fao.org/giews/v4/price_module/api/v1/FpmaSerie/ returns the list of dataseries. Each dataseries has a GUID-type identifier, as opposed to the previous API's series_id (e.g. 8_23_642_2_434_nominal).
Then the datapoints can be fetched from a URL like https://fpma.fao.org/giews/v4/price_module/api/v1/FpmaSeriePrice/e23b83a7-7cf7-40b0-87f7-6c03a2d3b2bc/ and there is no datapoint_url now included in each dataseries. In addition:
- the seriesId now has a different datatype (GUID) and the column is named uuid
- the fao Country Code value of XXX used to refer to international prices; it has now been replaced with IPS
- market is now an id and market_name contains the name; commodity_name contains the product name that commodity used to reference
- the dateRanges field is now periodicity
These changes have side effects on the pipeline, and we need to update it accordingly.
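The two-step fetch described above might look like the sketch below. The field names (uuid, the FpmaSeriePrice path) are taken from the description, but the exact response shape is an assumption, so treat this as illustrative rather than the FDW pipeline code:

```python
import json
from urllib.request import urlopen

SERIES_URL = "https://fpma.fao.org/giews/v4/price_module/api/v1/FpmaSerie/"
PRICES_URL_TEMPLATE = "https://fpma.fao.org/giews/v4/price_module/api/v1/FpmaSeriePrice/{uuid}/"

def datapoints_url(series: dict) -> str:
    """Build the step-2 URL from a series dict returned by step 1.

    Each series now identifies itself with a GUID in `uuid` rather than
    the old `series_id`, and no longer carries a `datapoint_url`."""
    return PRICES_URL_TEMPLATE.format(uuid=series["uuid"])

def fetch_json(url: str):
    """Fetch and decode a JSON payload (network call)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)

def fetch_all_prices() -> dict:
    """Two-step fetch: list the series, then pull datapoints per series."""
    return {s["uuid"]: fetch_json(datapoints_url(s)) for s in fetch_json(SERIES_URL)}
```

Splitting URL construction from the network call keeps the GUID-based addressing testable without hitting the remote API.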
Fix safety warning for protobuf (DATA-1954)
Description
The below safety error is causing issues in the CI pipelines.
Package | Installed | Affected | ID
---|---|---|---
protobuf | 3.20.1 | >=3.20.0rc0,<3.20.2 | 51167
A parsing vulnerability for the MessageSet type in the ProtocolBuffers versions prior to and including 3.16.1, 3.17.3, 3.18.2, 3.19.4, 3.20.1 and 3.21.5 for protobuf-cpp, and versions prior to and including 3.16.1, 3.17.3, 3.18.2, 3.19.4, 3.20.1 and 4.21.5 for protobuf-python can lead to out of memory failures. A specially crafted message with multiple key-value per elements creates parsing issues, and can lead to a Denial of Service against services receiving unsanitized input. We recommend upgrading to versions 3.18.3, 3.19.5, 3.20.2, 3.21.6 for protobuf-cpp and 3.18.3, 3.19.5, 3.20.2, 4.21.6 for protobuf-python. Versions for 3.16 and 3.17 are no longer updated.
https://gitlab.com/fewsnet/data/fdw/-/jobs/3126720223
...
Recent changes to the FAO remote API affected the FAO pipeline. Updates were made to the pipeline and it is now working as expected.
Fix safety warning for protobuf (DATA-1954)
A safety error was causing issues in the CI pipelines. This has been fixed and the pipelines now run correctly.
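A typical remediation for this class of safety finding (assuming the pin lives in a requirements file; the exact fix applied in FDW may differ) is to move the pin past the vulnerable range to the first patched release recommended by the advisory. The snippet below creates a sample requirements file purely for illustration:

```shell
# Illustrative only: create a sample requirements file with the vulnerable pin.
printf 'protobuf==3.20.1\n' > requirements.txt

# Bump the pin to 3.20.2, the first patched release in the 3.20 line.
sed -i 's/^protobuf==3\.20\.1$/protobuf==3.20.2/' requirements.txt

grep protobuf requirements.txt
```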
Bug when uploading XLSForm, both in old FDW KoBo and latest version
...
(DATA-2010)
When uploading XLSForms, sometimes no task was registered. This has been fixed and XLSForms can now be successfully uploaded to the deployments.
SaveNewArticles fails for articles from multiple sources
...
Description
SaveNewArticle relies on ReadExistingArticles to avoid inserting duplicates.
...
(DATA-2083)
An error was occurring when an article with the same URL had already been downloaded from a different source: the pipeline then attempted to save the article using batch_insert, which raised an IntegrityError.
This is happening at the moment for the UK FCDO source, which duplicates some (but not all) articles with the UK.gov source.
This was fixed by falling back to saving individual articles when an error is hit, ignoring the duplicates. Articles downloaded from two different sources are now being ingested without error.
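The fallback strategy can be sketched generically as follows. This uses sqlite3 for a self-contained illustration rather than the FDW's Django batch_insert, and the table and fields are invented for the example: batch-insert for speed, and on an integrity error retry row by row, skipping the duplicates.

```python
import sqlite3

def save_new_articles(conn, articles):
    """Try a fast batch insert; if any row violates the unique URL
    constraint, fall back to inserting one at a time, skipping rows
    whose URL was already saved from another source."""
    rows = [(a["url"], a["source"]) for a in articles]
    try:
        with conn:  # transaction: rolls back the whole batch on error
            conn.executemany("INSERT INTO article (url, source) VALUES (?, ?)", rows)
    except sqlite3.IntegrityError:
        # At least one duplicate URL: retry individually, ignoring duplicates.
        for row in rows:
            try:
                with conn:
                    conn.execute("INSERT INTO article (url, source) VALUES (?, ?)", row)
            except sqlite3.IntegrityError:
                continue  # already downloaded from another source

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (url TEXT PRIMARY KEY, source TEXT)")
save_new_articles(conn, [{"url": "https://example.org/a", "source": "UK.gov"}])
save_new_articles(conn, [
    {"url": "https://example.org/a", "source": "UK FCDO"},  # duplicate URL
    {"url": "https://example.org/b", "source": "UK FCDO"},
])
```

The second call hits the duplicate, rolls back the batch, and the fallback path saves only the genuinely new article, mirroring the UK FCDO / UK.gov overlap described above.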