OSCER-USERS-L Archives

OSCER users

OSCER-USERS-L@LISTS.OU.EDU

Subject:
From:
"Neeman, Henry J." <[log in to unmask]>
Reply To:
Neeman, Henry J.
Date:
Mon, 9 Oct 2023 16:47:19 +0000

OSCER users,

REMINDER:

Schooner/OURRstore scheduled maintenance outage
Wed Oct 11 8am-midnight

Details are earlier in this e-mail thread.

IMPORTANT IMPORTANT IMPORTANT IMPORTANT!!!

Before the scheduled maintenance outage starts on
Wed Oct 11 8:00am:

Jobs that couldn't finish before the scheduled maintenance
outage starts won't be able to start at all until after the
scheduled maintenance outage has ended (planned for
Wed night at midnight).

That's because the scheduler has to assume that each job
might run for the full wall clock time limit that it has
requested.

So, if you want a job to run before the scheduled
maintenance outage begins, then in your batch scripts,
you might need to reduce the wall clock time limit
that you request.

(Once the maintenance period ends, this won't apply
any more.)
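
As a rough illustration of the batch script change described above,
here is a minimal sketch of a script header that requests 30 hours.
The job name, task count and echo line are placeholders invented for
this example, not values from OSCER; only the --time line matters here.

```shell
#!/bin/bash
# Hypothetical batch script header: everything except --time is a placeholder.
#SBATCH --job-name=example_job
#SBATCH --ntasks=1
# Request 1 day + 6 hours = 30 hours of wall clock time (DD-HH:MM:SS).
#SBATCH --time=01-06:00:00

# Placeholder for the real workload.
echo "job body goes here"
```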

If you have jobs that you've already submitted that are
pending in the queue, then you might want to reduce
their requested wall clock time limit, to give them
a chance to start before the scheduled maintenance
outage begins.

The command is:

scontrol update JobId=######## TimeLimit=DD-HH:MM:SS

except REPLACE ######## with the job ID number, and
REPLACE DD-HH:MM:SS with 2-digit number of days,
2-digit number of hours (beyond the number of days),
2-digit number of minutes (beyond the days and hours)
and 2-digit number of seconds (beyond the days, hours
and minutes).
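
If it helps, the DD-HH:MM:SS value can be computed rather than worked
out by hand. The sketch below assumes bash; to_slurm_time is a
hypothetical helper name made up for this example, not a Slurm command.

```shell
#!/usr/bin/env bash
# to_slurm_time is a hypothetical helper, not part of Slurm.
# Usage: to_slurm_time SECONDS
# Prints SECONDS as DD-HH:MM:SS, each field zero-padded to 2 digits.
to_slurm_time() {
    local total=$1
    local days=$(( total / 86400 ))            # whole days
    local hours=$(( (total % 86400) / 3600 ))  # hours beyond the days
    local mins=$(( (total % 3600) / 60 ))      # minutes beyond the hours
    local secs=$(( total % 60 ))               # seconds beyond the minutes
    printf '%02d-%02d:%02d:%02d\n' "$days" "$hours" "$mins" "$secs"
}
```

For instance, to_slurm_time $((30*3600)) prints the value for 30 hours,
which could then be used as the TimeLimit in the scontrol command.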

You have to pick a wall clock time limit short enough that
the job can run to its time limit before the start of the
scheduled maintenance outage.

For example, suppose job 123456 had requested 48 hours of
wall clock time limit, but now there are fewer than 48 hours
before the start of the scheduled maintenance outage.

In that case, job 123456 won't be able to run until after
the scheduled maintenance outage ends.

But, we could change the requested wall clock time limit
for job 123456 to, for example, 30 hours, like this:

scontrol update JobId=123456 TimeLimit=01-06:00:00

In that case, job 123456 would have a *CHANCE*
(but NOT A GUARANTEE) to run before the outage starts.
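
The check in the example above can be sketched in bash as follows.
This is only an approximation of the rule described in this e-mail
(start time plus time limit must not pass the start of the outage);
fits_before_outage is a hypothetical helper name, and the real
scheduler makes the actual decision.

```shell
#!/usr/bin/env bash
# fits_before_outage is a hypothetical helper, not a Slurm command.
# Usage: fits_before_outage LIMIT_SECONDS NOW_EPOCH OUTAGE_EPOCH
# Succeeds (exit status 0) if a job that started at NOW_EPOCH could run
# for its whole time limit and still end by OUTAGE_EPOCH.
fits_before_outage() {
    local limit=$1 now=$2 outage=$3
    (( now + limit <= outage ))
}
```

Assuming GNU date is available, it could be called like this for a
30 hour limit:

fits_before_outage $((30*3600)) "$(date +%s)" "$(date -d '2023-10-11 08:00' +%s)" && echo "could fit"

Even when the check succeeds, the job still has to wait for free
resources, so this shows only a chance to run, not a guarantee.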

Henry


________________________________
From: Neeman, Henry J. <[log in to unmask]>
Sent: Friday, October 6, 2023 7:15 AM
To: [log in to unmask] <[log in to unmask]>
Subject: Schooner scheduled maintenance outage


OSCER users,

SUMMARY:

Schooner/OURRstore scheduled maintenance outage
Wed Oct 11 8am-midnight

DETAILS:

On Wed Oct 11 8am-midnight, OSCER will hold a scheduled
maintenance outage for the Schooner supercomputer and
the OURRstore tape archive, specifically:

* Update job scheduler software SLURM to version 23 on
the Schooner supercomputer.

* Update the configuration on the supercomputer Ethernet
switches.

* Shift the OURRstore Ethernet switches in Norman to
the new second tape library at OUHSC (to be delivered soon).

As always, we apologize for the inconvenience -- our goal
is to make OSCER capabilities even better!

The OSCER Team ([log in to unmask])

