[ https://issues.apache.org/jira/browse/SOLR-13051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16713830#comment-16713830 ]
Gus Heck commented on SOLR-13051:
---------------------------------
As written the code executes a preemptive collection creation action in a separate thread. The current code does close it's threads reliably while the server is running and does not leak. However, due to the asynchronous operation in progress, the test requires a Thread.sleep() at the end to avoid thread leak warnings. The operation to preemptively create a collection was not considered important to be completed before shutdown since the goal is to keep indexing flowing without waiting for collection creation, which is not a concern during shutdown, but the effect on the tests is undesirable. This slows down the tests and in cases of extreme testing load even the 2 second sleep currently used could theoretically still fail.
The fundamental problem is that the thread MUST be able to run longer than the request so that updates can continue while collections are created in the background, but the thread and the server don't know about each other, so it's not possible to ensure the thread doesn't continue longer than the server. What needs to happen here is that the server shutdown process needs to be aware of the executor, properly initiate executor shutdown and then wait for that shutdown to complete.
My initial impulse was to register a CloseHook, which can be done in ways that make the test pass, but CloseHook seems to be best used on objects that live and die with the core, not on requests. Several issues arise with this tactic in a request level class such as an URP.
# One has to be careful not to register enormous numbers of close hooks, so it has to be done at initialization time, and only the first time the class is initialized.
# Such an initialization must come after the core is created, and relying on class loading order would be treacherous, so one winds up with a volatile field and/or some synchronization to ensure only one close hook is registered. This seems overly complicated.
# The executor needs a place to live that has a similar life span to the core holding the close hook. Putting it on the URP instance leads to executors that we don't know how to reference reliably from the close hook (because of when the close hook has to be created) and putting a static reference on the class means that once one core closes no other core can use the executor. This also gets complicated.
One could solve #3 via a map of executors keyed by core, or maybe alias, or perhaps by abusing the object store on the core that's meant for metrics stuff (ugly hack), or perhaps by other means, but everything down this road leads to lots of complicated code.
The simple solution is to add an ExecutorService to SolrCore and a method to submit a Runnable to it. Then we can make closing it part of the normal shutdown process with a single line of code. That solution shrinks and simplifies the code, is clear and easy to understand and is perfectly friendly to the tests. The risk is that if this facility is misused the core will not shutdown cleanly, and this risk will be clearly documented in the associated Javadocs.
If there is concern about having one ExecutorService per core, this could maybe be moved to CoreContainer, though I'm nervous about it being attached above the SolrCore level since resource loading is often once per core and the risk of not unloading resources referenced by the Runnable would come into play (though submitting long running tasks would be considered a misuse in the first place, so maybe it's fine?).
Post by Gus Heck (JIRA)Improve TimeRoutedAlias preemptive create to play nice with tests
-----------------------------------------------------------------
Key: SOLR-13051
URL: https://issues.apache.org/jira/browse/SOLR-13051
Project: Solr
Issue Type: Improvement
Security Level: Public(Default Security Level. Issues are Public)
Components: Tests
Affects Versions: master (8.0), 7.7
Reporter: Gus Heck
Assignee: Gus Heck
Priority: Minor
SOLR-12801 added AwaitsFix to TimeRoutedAliasUpdateProcessorTest. This ticket will fix the test to not require a sleep statement and remove the AwaitsFix.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-***@lucene.apache.org
For additional commands, e-mail: dev-***@lucene.apache.org