  1. DSpace
  2. DS-1836

doi_seq in update-sequences.sql missing



    • Type: Bug
    • Status: More Details Needed
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 4.0
    • Fix Version/s: None
    • Component/s: DSpace API
    • Documentation Status:
      In Description


      The file [dspace-source]/dspace/etc/<rdbs>/update-sequences.sql should update all database sequences in case a database dump has to be restored (see: https://wiki.duraspace.org/display/DSDOC4x/Storage+Layer#StorageLayer-MaintenanceandBackup). DSpace 4.0 introduces a doi_seq sequence in the database, but update-sequences.sql does not mention it. A fix should be easy, but as I am not 100% sure how to fix this, I am opening this ticket to discuss two possible solutions.

      Every entry in update-sequences.sql sets a sequence to the value it would have reached when it was last used. Most sequences are used to auto-increment a dedicated column, so update-sequences.sql can take the maximum value of that column. Handles are the only exception: for them it takes the maximum suffix of any registered handle.
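
      The two rules above can be illustrated with a minimal sketch. It uses an in-memory SQLite database and a hypothetical, heavily simplified schema (the item and handle tables, their columns and the sample rows are assumptions for illustration, not DSpace's real schema), and it only computes the values a sequence would be reset to; it does not reproduce the actual statements in update-sequences.sql.

```python
import re
import sqlite3

# Hypothetical, simplified schema and sample data; not DSpace's real tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (item_id INTEGER PRIMARY KEY);
    CREATE TABLE handle (handle_id INTEGER PRIMARY KEY, handle TEXT);
    INSERT INTO item (item_id) VALUES (1), (2), (7);
    INSERT INTO handle (handle_id, handle) VALUES
        (1, '123456789/3'), (2, '123456789/12'), (3, '123456789/5');
""")


def max_column_value(conn, table, column):
    """Highest value stored in an id column, or 0 for an empty table."""
    # Identifiers are interpolated directly, so only pass trusted names.
    row = conn.execute(f"SELECT MAX({column}) FROM {table}").fetchone()
    return row[0] or 0


def max_handle_suffix(conn):
    """Highest numeric suffix (the digits after the last '/') of any handle."""
    suffixes = []
    for (handle,) in conn.execute("SELECT handle FROM handle"):
        match = re.search(r"/([0-9]+)$", handle)
        if match:
            # Compare as integers: as strings, '5' would sort above '12'.
            suffixes.append(int(match.group(1)))
    return max(suffixes, default=0)


# Rule 1: an id sequence is reset to the maximum of its dedicated column.
item_seq_value = max_column_value(conn, "item", "item_id")
# Rule 2: the handle sequence is reset to the largest handle suffix.
handle_seq_value = max_handle_suffix(conn)
print(item_seq_value, handle_seq_value)
```

      Note that the handle rule needs string parsing and a numeric comparison, which is the part that is harder to express in portable SQL.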

      For doi_seq we can either use the maximum value of the doi_id column or try to determine the largest DOI suffix. If we use the maximum value of the doi_id column, we could run into a problem if anyone manually adds DOIs to the DOI table that collide with the DOIs DSpace generates. On the other hand, determining the maximum value of the doi_id column in SQL is easy, while identifying the largest DOI suffix in SQL is not.
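
      The collision risk can be made concrete with a small simulation. This is only a sketch under stated assumptions: the prefix "10.5072", the namespace separator "dspace-", the "/" between prefix and suffix, and the in-memory dictionary standing in for the DOI table are all hypothetical, and mint_doi is an illustrative helper, not DSpace code.

```python
PREFIX = "10.5072"               # hypothetical DOI prefix
NAMESPACE_SEPARATOR = "dspace-"  # hypothetical namespace separator


def mint_doi(doi_id):
    """Build a DOI from prefix, namespace separator and doi_id (assumed format)."""
    return f"{PREFIX}/{NAMESPACE_SEPARATOR}{doi_id}"


# Stand-in for the DOI table: doi_id -> doi.
# Rows 1 and 2 were generated; row 3 was added by hand with a custom suffix.
doi_table = {
    1: mint_doi(1),
    2: mint_doi(2),
    3: "10.5072/dspace-5",
}

# Option 1 from the ticket: reset the sequence to the maximum doi_id.
restored_seq_value = max(doi_table)


def first_collision(doi_table, seq_value, attempts=10):
    """Return (doi_id, doi) for the first generated DOI that already exists."""
    existing = set(doi_table.values())
    for doi_id in range(seq_value + 1, seq_value + 1 + attempts):
        candidate = mint_doi(doi_id)
        if candidate in existing:
            return doi_id, candidate
    return None


print(restored_seq_value)
print(first_collision(doi_table, restored_seq_value))
```

      In this made-up data set the sequence resumes after 3, so the second DOI generated after the restore reuses the suffix that was entered manually.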

      Let's take a quick look at how DOIIdentifierProvider generates DOIs. A new DOI is built from three parts: the DOI prefix, the namespace separator and the value of the doi_id column (https://github.com/DSpace/DSpace/blob/master/dspace-api/src/main/java/org/dspace/identifier/DOIIdentifierProvider.java#L819). To select the largest assigned DOI suffix in SQL, we would need to know the namespace separator, but there is no easy way to get it into the update-sequences.sql script.
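
      To show why the namespace separator matters, here is a minimal sketch of the suffix-based alternative. The function name max_doi_suffix, the sample DOIs, the prefix "10.5072" and the separators "dspace-" and "other-" are assumptions made for illustration; the "/" between prefix and suffix follows general DOI syntax. It is not how DOIIdentifierProvider or update-sequences.sql is implemented.

```python
import re


def max_doi_suffix(dois, prefix, namespace_separator):
    """Largest numeric part among DOIs of the form prefix/separator<number>."""
    pattern = re.compile(
        re.escape(f"{prefix}/{namespace_separator}") + r"([0-9]+)"
    )
    values = []
    for doi in dois:
        match = pattern.fullmatch(doi)
        if match:
            values.append(int(match.group(1)))
    # DOIs that do not follow the expected pattern are ignored.
    return max(values, default=0)


# Hypothetical contents of the doi column.
dois = [
    "10.5072/dspace-1",
    "10.5072/dspace-2",
    "10.5072/dspace-17",
    "10.5072/other-99",
]

with_configured_separator = max_doi_suffix(dois, "10.5072", "dspace-")
with_different_separator = max_doi_suffix(dois, "10.5072", "other-")
without_separator = max_doi_suffix(dois, "10.5072", "")
print(with_configured_separator, with_different_separator, without_separator)
```

      The result changes with the separator that is passed in, which is exactly the value a static SQL script has no easy access to.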

      I personally think that we should use the maximum value of the doi_id column. Anyone who adds DOIs manually to the DOI table should know what they are doing.

      What do you think? How should we fix update-sequences.sql? Should we document any of this anywhere other than in this ticket?





              Assignee: Unassigned
              Reporter: Pascal-Nicolas Becker (pbecker)