Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  1. Retrieve a copy of the country’s spreadsheet template, usually named <CountryName>AgStats. You will enter the new source crop data in the tab named DATA. There may already be data in the country template, which should provide concrete examples for all the steps that follow.

  2. Most of the fields in the country template provide key data that instruct FDW how to document and archive the data. A few fields are intended to inform you, the data enterer, about how to best interpret, document, and enter the data. The template fields are described below.

  3. Review the source document(s) containing the new crop data to determine how you will extract, process, and add the source tabular data into the upload file format shown in the DATA tab in this Excel Crop Production Data workbook. Examine the structure and format of the tabular data in the source document.

    1. The fundamental variables that you want to extract are: reporting unit name, year, season, crop name, crop production system, area planted, area harvested, yield, and quantity.

    2. The fundamental units of measure should be “hectares” for area, “metric tons per hectare” for yield, and metric tons for quantity. If not, you should convert them in an editing process.

    3. If the structure of all source tables is the same, then you will need less error-checking than if the tables vary in structure or format. If the tables vary in structure, naming convention, composition of variables, or in other important manners, you will need to consider extra quality-control steps in your extraction and editing (see step 7).

  4. If the source document is in a .pdf format, you will need an application to convert the PDF tables into spreadsheet tables. The current preferred application is ABBYY FineReader 16. It contains a full-featured suite of PDF editing tools and uses optical-character recognition (OCR) to convert the PDF content.

    1. To convert the PDF tables, open the .pdf file in FineReader, select Save As, and choose a spreadsheet as the format.

      1. Tip: If the source PDF document contains a lot of text, and/or other topics in addition to the tabular data you want, consider deleting the unwanted pages.

  5. After extracting and converting the source document tabular data, Immediately check the quality of the saved spreadsheets. If the OCR conversion to spreadsheet format has made a number of recognition errors, or has not correctly identified or structured all the tables you want, you can then work with the OCR-editing application offered-up at the same time by FineReader to edit the output before then selecting Recognize (any edited pages) and, again, Save As to complete the conversion.

  6. If the source data are offered in any of a variety of other non-PDF online applications that allow you to define a CSV or spreadsheet output format, prioritize that choice for your download of source data.

  7. If the source document tables vary unpredictably in the way they are structured, or how they present the fundamental variables of crop production, it will be important to define a secondary process for error-checking and quality control steps immediately after the extraction and conversion. In many cases, this will consist of visually checking each extracted table to see if it accurately duplicates the source table.

...

The order of columns in a crop data upload spreadsheet is not necessarily fixed, except for DNL (Do Not Load) columns which must always be located to the right of the columns intended for upload. However, it is helpful to follow the most common pattern described below to ensure consistent and complete documentation of the crop data to be uploaded.

Note that some of the expected content of the fields described below may differ slightly between crop data and other domains.

Source_organization

The first column is typically Source_Organization. It correlates to the entity that gathers, organizes, and publishes originally offers access to the crop data. It is most often a Ministry of Agriculture, statistical bureau, or a multi-entity crop assessment mission; this latter term will most often indicate a UN-sponsored Crop and Food Supply Assessment Mission (CFSAM) organized by either, or both, the Food and Agriculture Organization (FAO) and the World Food Program (WFP).

...

The second column is Source_Document. It describes the nature or type of the activity or publication in which publishes the crop data are found (note: this differs from other FDW domains).

A common entry here would be “Official crop statistics”statistics, CountryName”. Others include “Agricultural Census,” “Annual crop assessment,” “Statistical Yearbook,” or “CFSAM.” As in the first column, this descriptor is followed by a comma and the name of the country. If possible, use a source_document name that is already found in the FDW Crop domain metadata for the country involved.

...

The third column is often referred to as Publication. It should provide enough specific details to be able to identify the actual document or location from which the crop data were found inextracted.

Common entries found in this column might include “2017/2018 Crop Assessment Report”, or “2018 Statistical Yearbook”, or “2017/2018 Crop and Food Supply Assessment Mission FSAM to Sudan”, or a URL address. Again, the name is followed by a comma and the country name.

...

FEWS NET practice is generally not to capture intermediate estimates during a season, but if a country normally publishes two estimates per harvest, and the only publication estimate that can be found is the first, FEWS would normally put an entry like “1st round estimate” in this field to document the status of the data entered.

...

In the Country column, enter the country's [two-letter code as it is identified on the FDW Country Codes tabintl code name] two-letter code.

Zone

The Zone column is only used to provide additional information about where the reporting unit is located in the country. It If it has utility, the zone field will likely already be filled-in in the country template. This field is not uploaded to the FDW.

...

The FNID column contains FNID codes that have already been assigned to every crop-reporting and/or administrative unit in the country. They change every year in which a country’s boundaries change. These ID codes should be selected from annual boundary sets which are found in the FDW Spatial domain. They change ID code every year in which a country’s boundaries change their geographic shape/location. If the newly-extracted data reveal a new administrative unit, you should notify FEWS NET Data specialists to have it added. Consult with Hub personnel to receive a list of FNIDs for the year(s) represented by the data.

Administration units

The Admin1 column contains the name of the first subnational crop reporting/administration unit, i.e., the first administrative unit after the national level.

After you have entered the names as they appear in the source files, the FEWS NET GIS Data Specialist uploaded the new data into the template, you, or other FEWS NET personnel will update the file with FNIDs and the official Admin names as defined in FEWS NET administration shapefile hierarchy.

The Admin2, Admin3, and Admin4 columns contain the name of the next subnational administration unit name names in the country’s reporting hierarchy. If the source data report only at a higher Admin level (e.g. Admin1), this field, and all others, will be empty. If you have names for Admin2, you will also have a parent an administrative unit “parent” name in the Admin1 field. Enter the names as they appear in the source file and continue similarly for Admin3 and Admin4 if there are such.

Info

For all 4 any of the administrative unit columns, in some countries these may be a hierarchical subset, or set of groupings of Admin units, expressly used for reporting have a different set of crop-reporting units that differ from the administrative units and are only used for crop statistics. If that is the case, what is required here in one of the columns are the crop reporting unit names, which will likely have been previously been coded and assigned for that purpose, and ; annual boundary sets for them should be found in the Spatial domain.

...

In the Data Upload column for Year, you will enter the agricultural year “year” of the country's new crop data. The field is restricted to four digits in the FDW, so you will enter a four-digit (e.g., 1999) year descriptor in this column.

It is extremely important that this year be consistent with FEWS NET’s practice for identifying crop years. For countries whose annual agricultural year crosses over a December 31 date, identifying which year to insert here can be complex. Looking at the SEASON tab should help you understand which year to enter here, or it may be clear from the data already in the DATA tab. Please consult the FEWS NET Database Manager if the year to enter is not crystal clear.

If the source document identifies the year in a different format (e.g. 2017/18), also enter the source document year as it is found in a DNL_Source_year column at the right of the spreadsheet.

Additional dates

You In the country template, you may next encounter several fields regarding start and end periods of the season and you may also find multiple columns describing the crop development stage dates (“phenology”), which are relative to the specific crop data being entered on each row. These are entered automatically upon upload and are not your responsibility.

...

Enter CPCv2 crop codes in the column/field labeled “CPCv2” or “Crop.” These have already been defined for each crop in the FDW, and existing crop names for this country can be found in the Crop Domain metadatacountry template. CPC codes can be found in FDW under Metadata management>Common>Classified products.

...

FEWS NET practice is to upload all crop data offered in the source document, except for some “fruit tree crops” (oranges, apples, pears, etc.). Coffee, cocoa, bananas, and plantains are “tree” crops whose data is always uploaded) and commercially-grown “flowers.” often uploaded. Commercially-grown “flower” estimates are not collected. If unsure about what to add, ask the field analyst or FEWS NET Database Manager.

If you find that existing FDW crop names and code metadata provide less information about the type of crop being reported now in the source document (e.g. existing crop data for this country includes a name/code of  “R01701AA” for common beans, while the new data describes the crop as “Red beans”, or “Vigna beans”), ask the field analyst or FEWS NET Database Manager which name/code should be entered.

All Crop crop names need to pre-exist and be coded in the FDW before being uploaded.

...

Info

Source documents generally distinguish between a zero value and “no data collected/no data available/no data reported”reported.Try to retain those differences as you move the data into the data upload sheet (“0” for a zero-value, “NA” for no crop grown, “NC” for no data collected/reported).

Enter the area planted in hectares (ha) in the Area planted column/field. The source document may refer to the area planted “planted” by other names, e.g., Area “Area under [crop]. If the source file provides this data in a different unit (e.g. “acres”), make the conversion, but add a DNL column (“DNL_Source_area planted”) for the source unit data for information purposes. If this is the only area data you have, determine or ask the field estimate does not mention “planted” or “harvested” in some manner, ask the Database Manager to assist in determining whether the area estimate is ”planted” or “harvested”. 

...

Enter the yield in MT/ha into the Yield column. Enter the actual yield figure given in the source document, unless: a)  you have to convert it first to MT/ha, b) there is an obvious error in the yield figure, or c) no yield data are given. In the latter two cases, compute yield as “quantity” divided by “area planted” (or “area harvested” if that alone is given.  

Enter the quantity produced in the Quantity column/field as Metric tons (MT) only, i.e., not bags, kgs, or sacs. If the source file quantity estimate is in a different weight or volume unit, make the conversion, but add a DNL column for the source quantity estimate for information purposes.

...