  1. DSpace
  2. DS-2965

Re-indexing performance degrades when processing a large number of items

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 6.0
    • Fix Version/s: 6.0
    • Component/s: DSpace API
    • Labels: None
    • Attachments: 0
    • Comments: 5
    • Documentation Status: Not Required

      Description

      I am attempting to index a repository containing 250,000 items (metadata only).

      I first ran index-discovery -b. On subsequent runs, I ran index-discovery.

      The process runs quickly for the first several thousand items, then becomes very sluggish. In one test that I ran overnight, the process ran out of memory and filled my log directory.

      If I kill the process after about an hour and restart it, the process resumes and processes items quickly again.
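
To put a number on the slowdown described above, one option is to record a completion time for each indexed item and compare throughput across fixed-size groups of items. The following is a minimal sketch, not part of DSpace: the function name `throughput_per_window` and the idea of collecting per-item timestamps (for example, from log lines) are assumptions made for illustration.

```python
from typing import List


def throughput_per_window(timestamps: List[float], window: int) -> List[float]:
    """Return items/second for each consecutive group of `window` items.

    `timestamps` are per-item completion times in seconds, in ascending order.
    This helper is hypothetical; it is not part of any DSpace tooling.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    rates = []
    # Compare the timestamp at the start of each window with the one at its end.
    for start in range(0, len(timestamps) - window, window):
        elapsed = timestamps[start + window] - timestamps[start]
        rates.append(window / elapsed if elapsed > 0 else float("inf"))
    return rates
```

A steadily falling sequence of rates within a single run, followed by a recovery after a restart, would match the behavior reported here.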

      In my testing, I have included the following code: https://github.com/DSpace/DSpace/pull/1131

              People

              Assignee:
              Kevin Van de Velde (Atmire)
              Reporter:
              terrywbrady Terrence W Brady
              Votes:
              0
              Watchers:
              3
