DS-4036: Delete EPersons in DSpace





      Data protection laws such as the GDPR demand that personal accounts can be deleted. In DSpace, personal accounts are represented by EPersons. DSpace currently cannot delete an EPerson if it is referenced in any other database table, such as item or resource policy.

      I see two ways to do this:

      1. Create a dummy EPerson, like the dummy EPersonGroups we already have (Anonymous, Administrator). Replace every reference to the EPerson being deleted with a reference to that dummy EPerson, and then delete the original one when necessary.
      2. Change DSpace so that it does not expect every reference to an EPerson to be set. Set the EPerson reference to null and delete the EPerson.
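
To make the difference between the two options concrete, here is a minimal in-memory sketch. It is only an illustration under assumptions: `Person`, `Item`, `Repository` and both delete methods are hypothetical names invented for this example, not DSpace classes, and a real implementation would have to cover every table that references an EPerson, not just item submitters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model; none of these types are DSpace classes.
class DeletionApproaches {

    static final class Person {
        final String email;
        Person(String email) { this.email = email; }
    }

    static final class Item {
        Person submitter; // may become null under approach 2
        Item(Person submitter) { this.submitter = submitter; }
    }

    static final class Repository {
        final List<Person> persons = new ArrayList<>();
        final List<Item> items = new ArrayList<>();
    }

    // Approach 1: repoint every reference to a placeholder account, then delete.
    static void deleteWithPlaceholder(Repository repo, Person target, Person placeholder) {
        for (Item item : repo.items) {
            if (item.submitter == target) {
                item.submitter = placeholder;
            }
        }
        repo.persons.remove(target);
    }

    // Approach 2: clear every reference, then delete.
    // All readers of Item.submitter must now tolerate null.
    static void deleteWithNull(Repository repo, Person target) {
        for (Item item : repo.items) {
            if (item.submitter == target) {
                item.submitter = null;
            }
        }
        repo.persons.remove(target);
    }
}
```

In this sketch the two methods differ by a single assignment, but the second one pushes the cost elsewhere: every piece of code that reads the submitter has to handle a missing value.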

      The first approach is much simpler, but I would consider it a bad design and a dirty hack. The second one is more work but the cleaner approach.

      We created a PR some years ago that followed the second approach (https://github.com/DSpace/DSpace/pull/672). Unfortunately it never made it into DSpace. We updated the PR to work with DSpace 7, but we did not check again for all the places that were added in the meantime and that expect a reference to an EPerson to never be null.

      I hope our changes are a good starting point, and I would like to suggest that we finish this together as a community. For example, we have not yet looked into the new REST API or the new Angular UI. I hope we can work together to find all the remaining places where DSpace must learn to deal with, for example, an item submitter that is null.
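
The kind of change needed in each of those places might look like the following sketch. It assumes a hypothetical `Item` with an optional submitter and a hypothetical `submitterLabel` helper. These names and the fallback text are made up for illustration and are not taken from the DSpace code base or from the PR.

```java
import java.util.Optional;

// Hypothetical types for illustration; not DSpace classes.
class NullSafeSubmitter {

    static final class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    static final class Item {
        private final Person submitter; // null once the account was deleted

        Item(Person submitter) { this.submitter = submitter; }

        // Expose the reference as Optional so callers cannot forget the null case.
        Optional<Person> getSubmitter() {
            return Optional.ofNullable(submitter);
        }
    }

    // Display code falls back to a fixed label instead of dereferencing null.
    static String submitterLabel(Item item) {
        return item.getSubmitter()
                   .map(person -> person.name)
                   .orElse("(account deleted)");
    }
}
```

Returning an `Optional` is only one possible convention. Plain null checks at each call site would work as well, as long as every caller is actually reviewed.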

      There is one concern that is often raised: for legal reasons we need to know who uploaded a file and who submitted an item. Let me address this directly here: we can delete an EPerson without losing this information. We do not need the submitter of an item to be linked to an EPerson account, because this information is recorded in dc.description.provenance. This field is hidden by our default configuration (metadata.hide.dc.description.provenance = true), so it can be seen by Administrators only. That is important for GDPR too, and I think it is the right way to handle this.





              pbecker Pascal-Nicolas Becker