At UMD, our batch ingest operations began hitting frequent stuck threads in Tomcat. After some number of successful transactions, each containing many write operations, a transaction would eventually hang during the request to commit it, and the Tomcat stuck thread monitor would alert us.
We eventually determined that removing the fcrepo-audit module from our webapp build stopped the stuck threads. Our current theory is that, because the fcrepo-audit module caused additional writes to the repository for every piece of content we wrote, these write events ran into some sort of lock or resource contention and filled the ModeShape event buffer until it could no longer process writes to the repository.
Note that while the repository was in this stuck thread state, all writes to it would hang, but reads (e.g., those performed by the indexers) continued to succeed.
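To make the theory above more concrete, here is a toy model in Python of how extra audit writes could fill a fixed-size event buffer. It is only a sketch under assumptions: it is not ModeShape or fcrepo-audit code, the function name `writes_accepted`, the buffer size, and the one-event-per-write consumer rate are all hypothetical, and it models only the write path. It does not explain the lock or contention behaviour we actually observed.

```python
import queue


def writes_accepted(total_writes, buffer_size, audit_enabled):
    """Count how many writes fit before the bounded event buffer is full.

    Hypothetical model: each write enqueues one event, the consumer handles
    one event per write, and (if auditing is on) handling a write event
    enqueues one extra audit event.
    """
    events = queue.Queue(maxsize=buffer_size)
    accepted = 0
    for i in range(total_writes):
        try:
            events.put_nowait(("write", i))
        except queue.Full:
            # A blocking put would hang at this point, like a stuck commit.
            return accepted
        accepted += 1

        # The consumer drains exactly one event per incoming write.
        kind, _ = events.get_nowait()
        if audit_enabled and kind == "write":
            # The audit listener answers each write event with one more event.
            # A slot was just freed by get_nowait(), so this put cannot fail.
            events.put_nowait(("audit", i))
    return accepted
```

In this model, `writes_accepted(1000, 8, audit_enabled=False)` accepts every write because the consumer keeps pace, while the same call with `audit_enabled=True` stops early because events enter the buffer faster than they leave it.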
Mailing list threads with more details about the stuck thread issue, and our various solution attempts:
(Note that the stuck thread issue recurred several times, each time with a different solution. In particular, the first two mailing list threads linked above were not necessarily connected to the fcrepo-audit module.)