AbderRahman N Sobh

Better Queueing Behavior for Cluster Submission

Would it be possible for jobs that have been stuck in a queue for a long time (e.g., 1 or 2 days) to check whether a different node has a shorter wait, or no wait at all?

I sometimes see job submission behavior where one of my simulations is stuck in the queue of a "doomed" node, with someone else's long simulation keeping mine from running. Meanwhile, I can start another simulation and have it sent to a different node that was sitting free the whole time, where it runs.
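
A minimal sketch of the kind of check being requested, assuming the scheduler can supply an estimated wait per node. It is not how the cluster's submission system actually works; `QueuedJob`, `pick_better_node`, `STUCK_AFTER_HOURS`, and the `est_wait_hours` mapping are all hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Assumed threshold: the request mentions "1 or 2 days"; 1 day is used here.
STUCK_AFTER_HOURS = 24.0


@dataclass
class QueuedJob:
    job_id: str
    node: str             # node whose queue the job is currently waiting in
    hours_waiting: float  # time spent in that queue so far


def pick_better_node(job: QueuedJob,
                     est_wait_hours: Dict[str, float]) -> Optional[str]:
    """Return a node with a strictly shorter estimated wait, or None to stay put."""
    if job.hours_waiting < STUCK_AFTER_HOURS:
        return None  # not stuck long enough to re-check other queues

    # Unknown wait on the current node is treated as unbounded.
    current_wait = est_wait_hours.get(job.node, float("inf"))

    # Node with the smallest estimated wait (None if no estimates exist).
    best = min(est_wait_hours, key=est_wait_hours.get, default=None)
    if best is None or best == job.node:
        return None

    # Move only when the other node is strictly better.
    return best if est_wait_hours[best] < current_wait else None


if __name__ == "__main__":
    stuck = QueuedJob("sim-1", "nodeA", hours_waiting=36.0)
    print(pick_better_node(stuck, {"nodeA": 48.0, "nodeB": 0.0}))
```

In this sketch a job is only moved when another node's estimated wait is strictly shorter, so a job already on the best available node stays where it is.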


