OSCER-USERS-L Archives

OSCER users

OSCER-USERS-L@LISTS.OU.EDU

Sender: OSCER users <[log in to unmask]>
Date: Sun, 7 Aug 2022 14:38:53 -0500
Reply-To: Henry Neeman <[log in to unmask]>
From: Henry Neeman <[log in to unmask]>
OSCER users,

We've got 699 compute nodes in production, with
1079 single-node or sub-node jobs running on 194 nodes
and 49 multi-node (MPI) jobs running on 252 nodes,
and 1315 jobs pending.

(Most of the idle nodes are condominium nodes.)

Many thanks to Patrick for resolving this!

Henry

----------

On Sun, 7 Aug 2022, Henry Neeman wrote:

>OSCER users,
>
>Schooner is again experiencing problems. Currently, most of
>its compute nodes are down, drained or draining.
>
>We'll get this tracked down and resolved, but probably
>not until Monday (Aug 8).
>
>We apologize for the inconvenience.
>
>Henry
>
>----------
>
>On Sat Aug 6 2022, Henry Neeman <[log in to unmask]> wrote:
>
>OSCER users,
>
>Schooner is working well again, thanks to our own Patrick Calhoun!
>
>There are now 187 nodes running 1216 jobs that each use a node
>or part of a node, and 225 nodes running 46 multi-node
>(MPI) jobs (with a comparable number of jobs pending).
>
>And there's a total of 510 nodes running jobs, or ready to run
>them.
>
>(Presumably there's a bit more cleanup to do on some of the
>compute nodes, but that's not a barrier to people getting
>their work done.)
>
>Henry
>
>----------
>
>On Sat, 6 Aug 2022, Henry Neeman wrote:
>
>OSCER users,
>
>Schooner is having problems.
>
>We apologize for the inconvenience!
>
>We'll get this resolved as soon as we can, but that might be
>Monday morning (Aug 8).
>
>Currently, almost all of Schooner's compute nodes are
>either down, drained of running jobs, or draining.
>
>At the moment, 117 nodes are running 559 jobs that each
>consume at most a single node, and 119 nodes are running
>multi-node (presumably MPI) jobs.
>
>That's about a third of the total compute nodes, and
>we expect that to get worse before it gets better.
>
>---
>
>Henry Neeman ([log in to unmask])
>Director, OU Supercomputing Center for Education & Research (OSCER)
>Associate Professor, Gallogly College of Engineering
>Adjunct Associate Professor, School of Computer Science
>OU Information Technology
>The University of Oklahoma
>
>Engineering Lab 212, 200 Felgar St, Norman OK 73019
>405-325-5386 (office), 405-325-5486 (fax), 405-245-3823 (cell),
>[log in to unmask] (to e-mail me a text message)
>http://www.oscer.ou.edu/
