OSCER-USERS-L Archives

OSCER users

OSCER-USERS-L@LISTS.OU.EDU

David Akin <[log in to unmask]>
Wed, 11 Oct 2023 17:31:46 -0500
OSCER users,
The Schooner and OURRstore maintenance is now complete. They are again
available for use at your convenience. Please let us know if you have any
issues.

Thanks
Dave

On Wed, Oct 11, 2023, 7:59 AM David Akin <[log in to unmask]> wrote:

> OSCER users,
> Schooner and OURRstore maintenance is now starting. Please save your work
> and log off. We expect them to be back online by midnight Oklahoma time.
> We'll send out another alert when the maintenance is complete.
>
> Dave
>
> On Mon, Oct 9, 2023 at 11:47 AM Neeman, Henry J. <[log in to unmask]> wrote:
>
>>
>> OSCER users,
>>
>> REMINDER:
>>
>> Schooner/OURRstore scheduled maintenance outage
>> Wed Oct 11 8am-midnight
>>
>> Details are earlier in this e-mail thread.
>>
>> IMPORTANT IMPORTANT IMPORTANT IMPORTANT!!!
>>
>> Before the scheduled maintenance outage starts on
>> Wed Oct 11 8:00am:
>>
>> Jobs that wouldn't finish before the scheduled maintenance
>> outage starts won't be able to start at all, until after the
>> scheduled maintenance outage has ended (planned for
>> Wed night at midnight).
>>
>> This is so that any job that does start can run for
>> the full wall clock time limit that it requested.
>>
>> So, if you want a job to run before the scheduled
>> maintenance outage begins, then in your batch scripts,
>> you might need to reduce the wall clock time limit
>> you request.
>>
>> (Once the maintenance period ends, this won't apply
>> any more.)
>>
>> If you have jobs that you've already submitted that are
>> pending in the queue, then you might want to reduce
>> their requested wall clock time limit, to give them
>> a chance to start before the scheduled maintenance
>> outage begins.
>>
>> The command is:
>>
>> scontrol update JobId=######## TimeLimit=DD-HH:MM:SS
>>
>> except REPLACE ######## with the job ID number, and
>> REPLACE DD-HH:MM:SS with 2-digit number of days,
>> 2-digit number of hours (beyond the number of days),
>> 2-digit number of minutes (beyond the days and hours)
>> and 2-digit number of seconds (beyond the days, hours
>> and minutes).
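As a convenience, the DD-HH:MM:SS string can be built from a plain number of hours rather than worked out by hand. The following is only a minimal sketch: the helper name `slurm_time_limit` is made up for illustration and is not part of SLURM or of any OSCER tooling.

```python
def slurm_time_limit(hours: int, minutes: int = 0, seconds: int = 0) -> str:
    """Format a duration as DD-HH:MM:SS for the TimeLimit field.

    Hypothetical helper for illustration only; not a SLURM API.
    """
    total = hours * 3600 + minutes * 60 + seconds
    if total <= 0:
        raise ValueError("time limit must be positive")
    days, rem = divmod(total, 86400)   # whole days
    hh, rem = divmod(rem, 3600)        # hours beyond the days
    mm, ss = divmod(rem, 60)           # minutes and seconds beyond that
    return f"{days:02d}-{hh:02d}:{mm:02d}:{ss:02d}"


if __name__ == "__main__":
    # 30 hours is 1 day plus 6 hours.
    print(slurm_time_limit(30))  # 01-06:00:00
```

The printed string is what would replace DD-HH:MM:SS in the scontrol command above.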
>>
>> You have to pick a wall clock time limit short enough that
>> the job can run to its time limit before the start of the
>> scheduled maintenance outage.
>>
>> For example, suppose job 123456 had requested 48 hours of
>> wall clock time limit, but now there are less than 48 hours
>> before the start of the scheduled maintenance outage.
>>
>> So job 123456 won't be able to run until after the
>> scheduled maintenance outage ends.
>>
>> But, we could change the requested wall clock time limit
>> for job 123456 to, for example, 30 hours, like this:
>>
>> scontrol update JobId=123456 TimeLimit=01-06:00:00
>>
>> In that case, job 123456 would have a *CHANCE*
>> (but NOT A GUARANTEE) to run before the outage starts.
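The rule described above (a job can only start if its full requested time limit ends before the outage begins) can be checked with a little date arithmetic. This is a rough sketch of the rule as stated in this e-mail, not of SLURM's actual scheduling logic; the function names and the 10-minute safety margin are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def fits_before_outage(now: datetime, time_limit: timedelta,
                       outage_start: datetime) -> bool:
    """True if a job starting now would hit its limit before the outage."""
    return now + time_limit <= outage_start


def longest_limit_before(now: datetime, outage_start: datetime,
                         margin: timedelta = timedelta(minutes=10)
                         ) -> Optional[timedelta]:
    """Largest time limit that still fits, minus an assumed safety margin.

    Returns None if there is no time left before the outage.
    """
    remaining = outage_start - now - margin
    return remaining if remaining > timedelta(0) else None


if __name__ == "__main__":
    outage = datetime(2023, 10, 11, 8, 0)   # Wed Oct 11 8:00am
    now = datetime(2023, 10, 10, 0, 0)      # example: 32 hours earlier
    print(fits_before_outage(now, timedelta(hours=48), outage))  # False
    print(fits_before_outage(now, timedelta(hours=30), outage))  # True
    print(longest_limit_before(now, outage))
```

Even when the check passes, the job still has to wait for free nodes, so a shorter limit only gives it a chance to start, as noted above.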
>>
>> Henry
>>
>>
>> ------------------------------
>> *From:* Neeman, Henry J. <[log in to unmask]>
>> *Sent:* Friday, October 6, 2023 7:15 AM
>> *To:* [log in to unmask] <[log in to unmask]>
>> *Subject:* Schooner scheduled maintenance outage
>>
>>
>> OSCER users,
>>
>> SUMMARY:
>>
>> Schooner/OURRstore scheduled maintenance outage
>> Wed Oct 11 8am-midnight
>>
>> DETAILS:
>>
>> On Wed Oct 11 8am-midnight, OSCER will hold a scheduled
>> maintenance outage for the Schooner supercomputer and
>> the OURRstore tape archive, specifically:
>>
>> * Update job scheduler software SLURM to version 23 on
>> the Schooner supercomputer.
>>
>> * Update the configuration on the supercomputer Ethernet
>> switches.
>>
>> * Shift the OURRstore Ethernet switches in Norman to
>> the new second tape library at OUHSC (to be delivered soon).
>>
>> As always, we apologize for the inconvenience -- our goal
>> is to make OSCER capabilities even better!
>>
>> The OSCER Team ([log in to unmask])
>>
>

