The SolrServiceImpl needs a major refactor to separate out the functionality based on design patterns. As part of a discussion with Tim and Andrea, it was agreed that this refactor is important for DSpace 7 because the SolrServiceImpl has deviated even further from a manageable class with the latest customizations.
The refactoring can be summarized as:
- For indexing, the SolrServiceImpl should be able to iterate over Spring-configured IndexFactory objects. This would use an ItemIndexFactory, CollectionIndexFactory, … which will take over the indexing logic currently in the old SolrServiceImpl. Additional factories will be used for WorkspaceItem, WorkflowItem, PoolTask, ClaimTask.
- This model can later be extended to index e.g. Metadata Schemas and Fields, a use case which would allow us to search for all fields containing "author" in the schema, element, qualifier or note. This would be useful in the admin edit item view. A similar feature would be a search in the bitstream formats when manually changing the bitstream format.
- To differentiate between the various types of objects being indexed, a name-based field (e.g. ObjectType containing Item, Collection, ClaimTask, MetadataField, …) can be used. This would also ensure that the model is flexible enough to allow any new content to be added without adding a Constant per indexed object. I would recommend keeping search.resourcetype indexed as well for the types where it was used, but deprecating it so that the parameter can slowly be phased out.
- To avoid requiring changes to the content classes in order to index them, the IndexableObject interface and IndexableObjectService can be removed. The changes to all the services can also be reverted. Whether an object can be indexed would instead be determined by the presence of a configured IndexFactory subclass. As an alternative to this IndexableObject interface, which requires changes to each content class and service, an IndexingObject can be created.
- This refactoring doesn't require changes to what's being indexed, apart from adding the ObjectType. Neither does it require changes to how the calls to Solr are created. It does not modify what's stored in Solr or returned for a query. The goal is rather to ensure that buildDocument(Context context, Item item) is moved to an ItemIndexFactory class, and that indexInProgressSubmissionItem(Context context, Item item) is moved to a WorkspaceTaskIndexFactory class and a PoolTaskIndexFactory class, which would use the WorkspaceTask and PoolTask as parameters instead of the Item, …. A MetadataFieldIndexFactory can additionally be created and configured to ensure the list of Metadata Fields can be indexed as well.
- This rework of the SolrServiceImpl doesn't impact the object model. It would rather ensure that new content is indexed according to the design patterns.
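
The factory-based approach summarized above could look roughly like the sketch below. This is only an illustration under assumptions, not the proposed implementation: the method names on IndexFactory (getSupportedClass, getObjectType), the IndexingService class, the simplified Item and PoolTask stand-ins, and the use of a plain Map in place of a Solr document are all hypothetical. The buildDocument signature is also simplified (no Context parameter). In DSpace the list of factories would be injected through the Spring configuration rather than passed in by hand.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-ins for content classes; they need no indexing-specific interface.
class Item {
    final String name;
    Item(String name) { this.name = name; }
}

class PoolTask {
    final Item item;
    PoolTask(Item item) { this.item = item; }
}

// Hypothetical contract: one factory per type of object that can be indexed.
interface IndexFactory<T> {
    Class<T> getSupportedClass();
    String getObjectType();                       // value for the name-based ObjectType field
    Map<String, Object> buildDocument(T object);  // Map stands in for a Solr document
}

class ItemIndexFactory implements IndexFactory<Item> {
    public Class<Item> getSupportedClass() { return Item.class; }
    public String getObjectType() { return "Item"; }
    public Map<String, Object> buildDocument(Item item) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("title", item.name);
        return doc;
    }
}

class PoolTaskIndexFactory implements IndexFactory<PoolTask> {
    public Class<PoolTask> getSupportedClass() { return PoolTask.class; }
    public String getObjectType() { return "PoolTask"; }
    public Map<String, Object> buildDocument(PoolTask task) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("title", task.item.name);
        return doc;
    }
}

// Hypothetical slimmed-down service: it only iterates over the configured factories.
class IndexingService {
    private final List<IndexFactory<?>> factories;

    IndexingService(List<IndexFactory<?>> factories) { this.factories = factories; }

    // An object is indexable if and only if a factory is configured for its class.
    boolean isIndexable(Object object) { return findFactory(object).isPresent(); }

    // Builds the document and adds the ObjectType field; empty if no factory matches.
    Optional<Map<String, Object>> index(Object object) {
        return findFactory(object).map(factory -> build(factory, object));
    }

    private Optional<IndexFactory<?>> findFactory(Object object) {
        for (IndexFactory<?> factory : factories) {
            if (factory.getSupportedClass().isInstance(object)) {
                return Optional.of(factory);
            }
        }
        return Optional.empty();
    }

    // Helper that captures the wildcard so the cast stays type-safe.
    private static <T> Map<String, Object> build(IndexFactory<T> factory, Object object) {
        Map<String, Object> doc = factory.buildDocument(factory.getSupportedClass().cast(object));
        doc.put("ObjectType", factory.getObjectType());
        return doc;
    }
}
```

With this shape, supporting a new type such as MetadataField would only require adding and configuring one more factory. Neither the service nor the content classes would need to change, and no new constant is needed per indexed object.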
A visualization of the various IndexFactory and IndexingObject classes can be found in the attachment.