As discussed in this thread (https://groups.google.com/forum/#!topic/dspace-tech/l4Rzo4Pajoo), it appears that SOLR preserves the full-text indexing results from `dspace filter-media` only for the last bitstream that was processed.
As noted in the same thread, this also shows up on the demo site (http://demo.dspace.org/xmlui/discover). A search for "test word document" (including the quotes) should return the handle for Test PDF Document (http://demo.dspace.org/xmlui/handle/10673/5), but it does not, because the index preserved only the full text of the last bitstream on that handle. This mirrors the behavior of our 5.4 installation.
When an item has multiple bitstreams, only the last one is included in the index, as can be seen with the SOLR viewer (http://localhost:8080/solr/search/select?q=handle:...). If there are 4 bitstreams, the full text of the first 3 is not preserved in the index.
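For anyone who wants to reproduce the check above, here is a minimal sketch that builds the SOLR select URL for one handle. The function name `build_fulltext_query_url` is hypothetical, the base URL is the local one quoted above, and the example handle is the demo one (10673/5). The `fl` and `wt` parameters are standard SOLR query parameters that I am assuming are useful here to limit the output to the relevant fields; the handle is quoted because `/` is a special character in SOLR query syntax.

```python
from urllib.parse import urlencode

# Assumption: the search core is reachable at this local URL, as in the report.
SOLR_SELECT = "http://localhost:8080/solr/search/select"


def build_fulltext_query_url(handle, base=SOLR_SELECT):
    """Build a select URL that returns only the handle and fulltext fields."""
    params = {
        # Quote the handle so the "/" is not parsed as query syntax.
        "q": f'handle:"{handle}"',
        # Restrict the response to the fields relevant to this issue.
        "fl": "handle,fulltext",
        "wt": "json",
    }
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    # Example handle taken from the demo site mentioned above.
    print(build_fulltext_query_url("10673/5"))
```

Opening the printed URL in a browser should show whether `fulltext` holds text from every bitstream or only from the last one.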
I discovered that if I manually override the order in the DSpace table bundle2bitstream (field bitstream_order), whichever bitstream has the greatest integer is the one that is retained.
The `fulltext` XML element in the SOLR index ought to account for multiple bitstreams, or hopefully it can be expanded to hold multiple fulltext additions. I don't see anything in the SOLR view that alludes to there being multiple streams to choose from (e.g. only the last bitstream is mentioned in stream_name), so hopefully this isn't a SOLR limitation. We often have multiple bitstreams and would like them all indexed and full-text searchable.
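To make the symptom and the requested behavior concrete, here is a toy model in plain Python. It is not DSpace or SOLR code and I have not traced the actual cause; it only assumes that the text of each bitstream is written to `fulltext` in `bitstream_order`, and contrasts overwriting a single value with accumulating several. All function names, file names, and sample texts are hypothetical.

```python
def index_last_wins(bitstreams):
    """Model of the observed behavior: each bitstream overwrites fulltext."""
    doc = {}
    for b in sorted(bitstreams, key=lambda b: b["bitstream_order"]):
        doc["fulltext"] = [b["text"]]  # previous value is discarded
    return doc


def index_accumulate(bitstreams):
    """Model of the requested behavior: fulltext keeps one value per bitstream."""
    doc = {"fulltext": []}
    for b in sorted(bitstreams, key=lambda b: b["bitstream_order"]):
        doc["fulltext"].append(b["text"])
    return doc


def matches(doc, phrase):
    """Case-insensitive phrase match against any stored fulltext value."""
    return any(phrase.lower() in text.lower() for text in doc.get("fulltext", []))


# Hypothetical item with two bitstreams; the PDF has the greatest order.
bitstreams = [
    {"name": "test.docx", "bitstream_order": 0, "text": "This is a test Word document"},
    {"name": "test.pdf", "bitstream_order": 1, "text": "This is a test PDF document"},
]

if __name__ == "__main__":
    phrase = "test word document"
    print("last wins: ", matches(index_last_wins(bitstreams), phrase))
    print("accumulate:", matches(index_accumulate(bitstreams), phrase))
```

In this model the phrase search only succeeds when every bitstream's text is kept, which is the behavior we would like from the real index.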
This is my first JIRA posting so feel free to administratively update it as appropriate.