When building the OAI index in 6.x, the process becomes really slow after a while and consumes a lot of memory.
It is the same problem as described in ---
---. I have seen that the fix found for that bug was to clear the Hibernate session, which works great for my problem as well. DS-2965
I think the problem was that Session#clear() in the DBConnection not only evicts all items from the cache but also cancels all pending saves/updates/deletes.
However, it should be safe to use in a read-only use case, such as reading all items from the database in order to index them.
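To illustrate why clearing is risky with pending writes but harmless for pure reads, here is a minimal sketch. SketchSession is a hypothetical in-memory stand-in that only mimics the documented behaviour of Session#clear() (evict everything, cancel pending saves/updates/deletes) and of flush; it is not the Hibernate or DSpace API, and all names in it are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Hibernate-style session (NOT the real API).
class SketchSession {
    private final Map<Integer, String> cache = new HashMap<>();    // first-level cache
    private final List<String> pendingWrites = new ArrayList<>();  // queued saves
    private final List<String> database = new ArrayList<>();       // rows actually written

    // Reading an entity puts it into the cache.
    String load(int id) { return cache.computeIfAbsent(id, k -> "item-" + k); }

    // Saving only queues the write; nothing reaches the database yet.
    void save(String entity) { pendingWrites.add(entity); }

    // Flushing writes the queued changes out.
    void flush() { database.addAll(pendingWrites); pendingWrites.clear(); }

    // Clearing evicts the cache AND drops whatever is still queued.
    void clear() { cache.clear(); pendingWrites.clear(); }

    int cacheSize() { return cache.size(); }
    int persistedCount() { return database.size(); }
}

class ClearSemanticsSketch {
    // clear() with unflushed saves: the saves are lost.
    static int persistedAfterClearOnly() {
        SketchSession s = new SketchSession();
        for (int i = 0; i < 3; i++) s.save("new-item-" + i);
        s.clear();
        s.flush(); // nothing left to write
        return s.persistedCount();
    }

    // flush() first, then clear(): the saves survive.
    static int persistedAfterFlushThenClear() {
        SketchSession s = new SketchSession();
        for (int i = 0; i < 3; i++) s.save("new-item-" + i);
        s.flush();
        s.clear();
        return s.persistedCount();
    }

    // Read-only use: clear() only empties the cache, nothing can be lost.
    static int cacheSizeAfterReadOnlyClear() {
        SketchSession s = new SketchSession();
        for (int id = 1; id <= 5; id++) s.load(id);
        s.clear();
        return s.cacheSize();
    }

    public static void main(String[] args) {
        System.out.println("clear only:       " + persistedAfterClearOnly());
        System.out.println("flush then clear: " + persistedAfterFlushThenClear());
        System.out.println("read-only clear:  " + cacheSizeAfterReadOnlyClear());
    }
}
```

In the read-only case there are no queued writes, so the only effect of clearing is the eviction, which is exactly what the indexer needs.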
Some numbers to help understand the problem:
Using the current code to index 33013 items:
After 1463 minutes and 26400 items, the 2 GB of memory are completely in use, with 1043967 objects in the Hibernate cache. Indexing one item takes about 5 seconds (instead of ~5 milliseconds).
Trying to evict every object that is touched by the code, similar to the Discovery indexing code:
It is hard to find every place where something is loaded into the cache (collections, communities, items, metadatavalues, bundles, bitstreams ...), so the result is:
Finished after 175 minutes, with 107695 objects in the Hibernate cache. Indexing one item takes about 0.4 seconds (instead of ~5 milliseconds).
Using Session#clear() from the Hibernate Session object:
Finished after 9 minutes. As expected, 0 objects in the cache. About 4 milliseconds to index an item. This should be able to index any number of items, because the cache size stays constant.
So I am going to add a pull request that adds the possibility to clear the whole cache, which is great for batch reading of items.
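The batch-reading pattern I have in mind can be sketched roughly as follows. This is only an illustration under assumptions: CacheSketch is a hypothetical stand-in for the session cache, BATCH_SIZE is an arbitrary value, and the sketch caches one object per item, whereas the real indexer pulls in many related objects per item.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the session's first-level cache (NOT the real API).
class CacheSketch {
    private final Map<Integer, String> cache = new HashMap<>();

    // Loading an item keeps a reference to it in the cache.
    String load(int id) { return cache.computeIfAbsent(id, k -> "item-" + k); }

    // Evicts everything; safe here because the loop below never writes.
    void clear() { cache.clear(); }

    int size() { return cache.size(); }
}

class BatchIndexSketch {
    static final int BATCH_SIZE = 100; // assumption, not a tuned value

    // Reads all items and returns the largest cache size seen during the run.
    static int indexAll(int totalItems, boolean clearPeriodically) {
        CacheSketch session = new CacheSketch();
        int peak = 0;
        for (int id = 1; id <= totalItems; id++) {
            String item = session.load(id);
            // The real job would hand 'item' to the indexer at this point.
            peak = Math.max(peak, session.size());
            if (clearPeriodically && id % BATCH_SIZE == 0) {
                session.clear(); // keeps the cache bounded by BATCH_SIZE
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("peak without clear: " + indexAll(33013, false));
        System.out.println("peak with clear:    " + indexAll(33013, true));
    }
}
```

Without clearing, the peak grows with the number of items read; with clearing, it never exceeds the batch size, which is why the run time per item should stay flat.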
I could add the flush method as well, to make batch creation of a large number of items possible, as described here: