OSCER users,
Schooner is working well again, thanks to our own Patrick Calhoun!
There are now 187 nodes running 1216 jobs that each use a
single node or part of a node, and 225 nodes running 46
multi-node (MPI) jobs (with a comparable number of jobs
pending).
In total, 510 nodes are either running jobs or ready to run
them.
(Presumably there's a bit more cleanup to do on some of the
compute nodes, but that's not a barrier to people getting
their work done.)
Henry
----------
On Sat, 6 Aug 2022, Henry Neeman wrote:
>OSCER users,
>
>Schooner is having problems.
>
>We apologize for the inconvenience!
>
>We'll get this resolved as soon as we can, but that might be
>Monday morning (Aug 8).
>
>Currently, almost all of Schooner's compute nodes are
>either down, drained of running jobs, or draining.
>
>At the moment, 117 nodes are running 559 jobs that each
>consume at most a single node, and 119 nodes are running
>multi-node (presumably MPI) jobs.
>
>That's about a third of the total compute nodes, and
>we expect that to get worse before it gets better.
>
>---
>
>Henry Neeman ([log in to unmask])
>Director, OU Supercomputing Center for Education & Research (OSCER)
>Associate Professor, Gallogly College of Engineering
>Adjunct Associate Professor, School of Computer Science
>OU Information Technology
>The University of Oklahoma
>
>Engineering Lab 212, 200 Felgar St, Norman OK 73019
>405-325-5386 (office), 405-325-5486 (fax), 405-245-3823 (cell),
>[log in to unmask] (to e-mail me a text message)
>http://www.oscer.ou.edu/