      (This is related to the Groups thread: https://groups.google.com/forum/#!topic/islandora/ZfKjKOSnD74 )

      I'm in the process of rebuilding our newspaper archives after a crash that occurred before my time. I am wondering if there is anything that can be done to address circumstances like the following (see attached image), which you see when searching for "strike".

      On this newspaper page there is a lot of labour-related striking going on, but the result the searcher sees is about some kind of wagon accident that occurred while a census was being taken, which just happens to be the earliest match in the OCR text. Meanwhile, there are numerous mentions of labour-related strikes further "down" the same newspaper page, which are much more likely to be what the searcher wants.

      For situations like this, instead of just one context, it would be good to (have the option to?) display multiple contexts:

      Text something something something <em>strike</em> something something something
      ... [1000 characters] ...
      something something something <em>strike</em> something something something


      I noticed that there is already a setting snippets=8 in /sites/all/modules/islandora/islandora_ocr/includes/solr.inc, but I wonder whether that only controls how many times the keyword gets highlighted. If so, it would not necessarily stop the other 7 matches, which are much farther down the text stream, from being truncated.
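      For reference, in stock Solr the hl.snippets parameter is the maximum number of highlighted snippets returned per field, and hl.fragsize is the approximate snippet length in characters (0 asks for the whole field with no fragmenting). Whether the snippets=8 in solr.inc maps directly onto hl.snippets is an assumption I have not verified. Below is a minimal sketch of the kind of query parameters I have in mind; the field name OCR_t and the function name are hypothetical, not taken from the module.

```python
from urllib.parse import urlencode


def build_highlight_params(term, field="OCR_t", snippets=8, fragsize=100):
    """Build Solr query parameters asking for several highlighted snippets.

    `field` is a hypothetical OCR text field name; adjust to the real schema.
    """
    return {
        "q": "%s:%s" % (field, term),
        "hl": "true",
        "hl.fl": field,                   # field to highlight
        "hl.snippets": str(snippets),     # max snippets returned per field
        "hl.fragsize": str(fragsize),     # approx. snippet length; 0 = whole field
        "hl.simple.pre": "<em>",          # markup placed before each match
        "hl.simple.post": "</em>",        # markup placed after each match
    }


# Query string that could be appended to a Solr select URL.
print(urlencode(build_highlight_params("strike")))
```

      Passing fragsize=0 would be the "highlight but don't truncate" case, which could then be trimmed client-side.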

      What I am suggesting is that, if the occurrences of the word don't all fit into the initial chunk of text, additional chunks be added, separated by ellipses that show how far apart the chunks are, or perhaps by a horizontal rule with that number embedded in it.
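      To make the idea concrete, here is a rough sketch of the output I am picturing, written as standalone Python rather than as Islandora code. It assumes the full OCR text is available as a string; the function names, the radius of context, and the gap-marker format are all my own inventions for illustration.

```python
import re
from html import escape


def _highlight(chunk, pattern):
    """Escape a chunk for HTML and wrap every match in <em>...</em>."""
    out, pos = [], 0
    for m in pattern.finditer(chunk):
        out.append(escape(chunk[pos:m.start()]))
        out.append("<em>" + escape(m.group(0)) + "</em>")
        pos = m.end()
    out.append(escape(chunk[pos:]))
    return "".join(out)


def multi_context(text, term, radius=30, max_chunks=8):
    """Return each occurrence of `term` with context, joined by gap markers."""
    pattern = re.compile(re.escape(term), re.IGNORECASE)

    # One [start, end) window per match; overlapping or touching windows merge.
    windows = []
    for m in pattern.finditer(text):
        start = max(0, m.start() - radius)
        end = min(len(text), m.end() + radius)
        if windows and start <= windows[-1][1]:
            windows[-1][1] = max(windows[-1][1], end)
        else:
            windows.append([start, end])

    parts, prev_end = [], None
    for start, end in windows[:max_chunks]:
        if prev_end is not None:
            # Show how many characters were skipped between the two chunks.
            parts.append("... [%d characters] ..." % (start - prev_end))
        parts.append(_highlight(text[start:end], pattern))
        prev_end = end
    return "\n".join(parts)


page = "a strike here" + "x" * 1000 + "another strike there"
print(multi_context(page, "strike", radius=5))
```

      The radius and max_chunks arguments are the sort of thing an admin could set as defaults and an end user could override.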

      Ideally this would be configurable by the admins or even better by the end user searching (where the admin just picks the defaults).

      I would be happy to attempt to code this feature myself, but I have no idea where to begin. Diego suggested using the book reader to do it, but I think it is still beyond my ken to do this independently.

      (If there were a way to get highlighting but not truncate the returned text, a JavaScript / CSS solution could be possible! But it really should be in the basic output.)

      Thanks very much for any suggestions and feedback you can provide!


