How to Manage Harvest Repositories

Release 9.3.1

The Manage Harvest Repositories page lets Geoportal publishers and administrators register repositories and manage them for harvesting. Harvesting occurs when metadata records are acquired from another repository via the Geoportal extension's Harvesting Tool.



For information on registering a repository, see How to Register a Repository.



The Manage Harvest Repositories page has three sections. The first defines search criteria for filtering the repositories. The second provides actions to perform on the repositories shown in the Repositories Table below it. The third is the Repositories Table itself.

  1. Repositories Search Criteria

    Because many repositories can be registered in the Geoportal, it is useful to be able to search them. Users can specify eight search criteria:
    • Repository ID
    • Repository UUID
    • Name
    • Host URL
    • Protocol Type (options: None, ESRI MS, Z39.50, OAI, WAF, and CSW)
    • Update date: the date that the repository was updated in the Geoportal
    • Last harvest date: the date of the last harvest
    • Harvest date due now only?: whether results should include only repositories that are due for harvesting



    After entering search criteria, click the "Search" button, and the repositories that match the search criteria will appear in the Repositories Table.
  2. Select Available Actions Drop down

    Below the search criteria section, there is a drop down list where users can select an action to perform on a repository listed in the table. The user selects a repository by checking the empty box next to it. Then, the user can pick an action from the drop down list, and carry out the action by clicking the "Execute" button. The available actions are:
    • Edit Repository: displays interface for editing repository information.
    • View Harvest History: displays list of harvests for selected repository.
    • Queue for Harvest: schedules repository for harvesting records that have been changed or added to it since the last harvest.
    • Queue for Full Harvest: schedules repository for harvesting all records in that repository.
    • Delete all selected: deletes selected repositories from the Geoportal.
  3. Repositories Table

    The table shows repositories that match criteria defined in the search. Each repository has a check box next to it that allows it to be selected for the actions described above. The table also shows information about the repository as defined by the publisher who registered it. The table can be sorted by clicking on a column heading.



    Action buttons appear next to each record, including Edit, View History, Queue for Harvest, and Queue for Full Harvest. The buttons' functions match those described in the Available Actions section above.
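
The difference between "Queue for Harvest" and "Queue for Full Harvest" can be illustrated with a minimal Python sketch. It is not the Harvesting Tool's actual implementation: the `Record` class, the `select_records` function, and the sample file names and dates are all hypothetical, and the sketch assumes each record carries a modification date that can be compared with the last harvest date.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Record:
    """Hypothetical metadata record held by a remote repository."""
    source_uri: str
    modified: datetime


def select_records(records: List[Record],
                   last_harvest: Optional[datetime],
                   full: bool) -> List[Record]:
    """Return the records a harvest would pick up (illustrative only).

    full=True  -> every record, as in "Queue for Full Harvest".
    full=False -> only records changed or added since the last harvest,
                  as in "Queue for Harvest". If there has been no
                  previous harvest, every record counts as new.
    """
    if full or last_harvest is None:
        return list(records)
    return [r for r in records if r.modified > last_harvest]


# Made-up sample data for the sketch.
records = [
    Record("a.xml", datetime(2009, 1, 10)),
    Record("b.xml", datetime(2009, 3, 5)),
]
last = datetime(2009, 2, 1)

incremental = select_records(records, last, full=False)
everything = select_records(records, last, full=True)
print([r.source_uri for r in incremental])
print([r.source_uri for r in everything])
```

In this sketch, only the record modified after the last harvest date is selected for the incremental harvest, while the full harvest selects both records regardless of date.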

Edit a Repository

After a repository is registered for harvesting, its information can be updated from the Edit Repository interface. The form is prepopulated with the information for that repository as currently registered in the Geoportal. Update the form and click "Submit" to save the new information to the Geoportal.

View Harvest History

The harvest history for a repository will show a list of all harvests that have taken place since that repository was registered and the results of those harvests. The history page has three sections:

  1. Search Harvest Reports

    Users can find a specific harvesting instance by searching for its Report UUID or the date that the harvest occurred. Fill out these criteria and click the "Search" button to find the matching reports.


  2. Available Actions

    Select one of the available actions from this drop down box, click "Execute", and the action will be carried out on the report that you have selected from the Reports Table. "View report" will load the report details for viewing. "Delete all selected" will delete the selected reports from the Geoportal entirely.


  3. Reports Table

    Reports are listed in the table and can be sorted according to the date they were harvested, the number of documents harvested, the number that validated, and the number that were published to the Geoportal. Click on one of those column headings to sort the reports. The "Actions" buttons allow users to view the report or delete it from the Geoportal; these functions correspond to the Available Actions drop down list in the section above.



    About Harvest Reports

    The harvest report has three sections: the Harvest Report identification section, the Summary section, and the Details section.
    • Harvest Report identification section

      This shows the Repository Name, UUID, URL, and Protocol of the repository this report describes. The report's UUID is also shown. A report with its own unique UUID will be generated every time a harvest is attempted.
    • Summary section

      This section provides essential information on the conditions for a specific harvest attempt. The parameters for the harvest are described below:
      Parameter             Details
      saveOutput            ON or OFF. "ON" indicates that the documents are to be saved to a folder in addition to or instead of being published to the Geoportal
      outputLocation        location of the folder from the "saveOutput" parameter
      validate              ON or OFF. "ON" indicates that validation is turned on for the Geoportal
      publish               ON or OFF. "ON" indicates that the documents are to be published to the Geoportal
      harvestEnd            date and time that the harvest was completed
      docsHarvested         number of documents harvested
      docsPassedValidation  if "validate" is set to "ON", this will show the number of documents harvested that passed the Geoportal's validation rules
      docsPublished         number of documents published to the Geoportal
      docsAdded             of the "docsPublished", this is the number of documents that were new when published
      docsUpdated           of the "docsPublished", this is the number of documents that replaced existing documents in the Geoportal because they had been updated
    • Details section

      Click on the "+" sign next to the Details text to expand this section. When expanded, a list of all records that were included in the harvest appears, along with the Validation Status and Publish Status for each record. The Source URI is the unique file identifier for the metadata record, and distinguishes one record from another in the Geoportal. Validation Status indicates if the record passed validation rules for the Geoportal. Publish Status indicates if the record was published to the Geoportal.
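
When reading a harvest report, it can help to check how the count parameters in the Summary section relate to one another. The following Python sketch is illustrative only: the `HarvestSummary` class and `check_summary` function are hypothetical and not part of the Geoportal, and the consistency rules are assumptions inferred from the parameter descriptions above (for example, that published documents are a subset of harvested documents).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HarvestSummary:
    """Hypothetical container mirroring the report's Summary parameters."""
    validate: bool               # "validate" ON/OFF
    publish: bool                # "publish" ON/OFF
    docs_harvested: int          # docsHarvested
    docs_passed_validation: int  # docsPassedValidation
    docs_published: int          # docsPublished
    docs_added: int              # docsAdded
    docs_updated: int            # docsUpdated


def check_summary(s: HarvestSummary) -> List[str]:
    """Return a list of inconsistencies (assumed rules, not official ones)."""
    problems: List[str] = []
    if s.validate and s.docs_passed_validation > s.docs_harvested:
        problems.append("docsPassedValidation exceeds docsHarvested")
    if s.docs_published > s.docs_harvested:
        problems.append("docsPublished exceeds docsHarvested")
    # docsAdded and docsUpdated are both described as parts of docsPublished.
    if s.docs_added + s.docs_updated > s.docs_published:
        problems.append("docsAdded + docsUpdated exceeds docsPublished")
    if not s.publish and s.docs_published > 0:
        problems.append("documents published although publish is OFF")
    return problems


# Made-up sample values for the sketch.
ok = HarvestSummary(validate=True, publish=True, docs_harvested=10,
                    docs_passed_validation=8, docs_published=8,
                    docs_added=5, docs_updated=3)
bad = HarvestSummary(validate=True, publish=False, docs_harvested=10,
                     docs_passed_validation=12, docs_published=2,
                     docs_added=2, docs_updated=1)

print(check_summary(ok))
print(check_summary(bad))
```

A summary that satisfies all of the assumed rules produces an empty list; each violated rule adds one message, which makes it easy to spot a report whose counts do not add up.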