DSpace's Solr configurations are an absolute mess (yes, I said it). They need to be cleaned up, and explanatory comments need to be added describing how and why we use particular fields or settings.
The "solrconfig.xml" and "schema.xml" files for each of the cores include a ton of example configurations. Sometimes these configurations are commented out (and just take up unneeded space) and other times they are left there for no apparent reason.
The messiness of our configs and lack of useful comments make it extremely difficult to locate issues/bugs in the Solr configuration/schema.
Here are some examples of unnecessary settings or leftover defaults that serve no purpose:
- Our "search" schema is named "example": https://github.com/DSpace/DSpace/blob/master/dspace/solr/search/conf/schema.xml#L48
- We have a "random_*" dynamic field which looks unused: https://github.com/DSpace/DSpace/blob/master/dspace/solr/search/conf/schema.xml#L623
- There may be other fieldTypes and fields which only exist because they were in a sample file.
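One rough way to hunt for leftover fieldTypes is to parse a schema.xml and list every fieldType that no field or dynamicField refers to. The Python sketch below does that. The function name `unused_field_types` and the inline `SAMPLE` schema are made up for illustration (the sample is not a copy of any DSpace file), and it assumes the older schema.xml layout where types and fields are plain XML elements.

```python
import xml.etree.ElementTree as ET


def unused_field_types(schema_xml: str) -> list[str]:
    """Return the names of fieldTypes that no <field> or <dynamicField> references."""
    root = ET.fromstring(schema_xml)

    # Collect declared types; very old schemas spell the tag <fieldtype>.
    declared = set()
    for tag in ("fieldType", "fieldtype"):
        declared |= {ft.get("name") for ft in root.iter(tag)}
    declared.discard(None)

    # Collect every type actually referenced by a field definition.
    used = {
        f.get("type")
        for tag in ("field", "dynamicField")
        for f in root.iter(tag)
    }
    return sorted(declared - used)


# Hypothetical, heavily trimmed schema used only to exercise the function.
SAMPLE = """<schema name="example" version="1.5">
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="random" class="solr.RandomSortField"/>
    <fieldType name="text_ws" class="solr.TextField"/>
  </types>
  <fields>
    <field name="handle" type="string" indexed="true" stored="true"/>
    <dynamicField name="random_*" type="random"/>
  </fields>
</schema>"""

print(unused_field_types(SAMPLE))  # ['text_ws']
```

This only catches types that are unused inside the schema itself. Deciding whether a field such as "random_*" is really unused still means searching the DSpace code and configuration for references to it.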
Essentially, our schema.xml files seem to be copies of the "example" schema, and we failed to remove much of the stuff we don't use.
Similarly, our solrconfig.xml files are copies of the example solrconfig.xml and feature a ton of commented-out, unused settings.
|Document the design and use of DSpace's Solr schemas||Accepted / Claimed|
|Internal documentation and cleanup of Solr schemas||Accepted / Claimed|