OSCER users


OSCER users <[log in to unmask]>
Wed, 11 Dec 2024 00:47:31 -0600
David Akin <[log in to unmask]>
To: "Neeman, Henry J." <[log in to unmask]>
OSCER Users,
Today's maintenance has been completed. If you find any
supercomputer-related issues, please let us know.

The OSCER team

On Tue, Dec 10, 2024 at 7:50 AM Neeman, Henry J. <[log in to unmask]> wrote:

> OSCER users,
> Scheduled maintenance outage Tue Dec 10 6am through midnight
> Central Time
> The Nov 26 outage reconfigured Side A;
> the Dec 10 outage will reconfigure Side B.
> This will affect the supercomputer, but not OURdisk, OURcloud,
> nor OURRstore.
> ------------------------------
> *From:* Neeman, Henry J. <[log in to unmask]>
> *Sent:* Monday, December 9, 2024 10:06 AM
> *To:* OSCER users <[log in to unmask]>
> *Subject:* Re: Scheduled maintenance outage Tue Dec 10 6am through
> midnight CT
> OSCER users,
> Before the scheduled maintenance outage starts
> **TOMORROW** (Tue Dec 10) at 6:00am:
> Jobs that wouldn't finish before the scheduled maintenance
> outage starts won't be able to start at all, until after
> the scheduled maintenance outage has ended (planned for
> Tue night at midnight).
> This is so that any job that does start can run for
> the full wall clock time limit that it has requested.
> So, if you want a job to run before the scheduled
> maintenance outage begins, then in your batch scripts,
> you might need to reduce the wall clock
> time limit you request.
> (Once the maintenance period ends, this won't apply
> any more.)
> If you have jobs that you've already submitted that are
> pending in the queue, then you might want to reduce
> their requested wall clock time limit, to give them
> a chance to start before the scheduled maintenance
> outage begins.
> The command is:
> scontrol update JobId=######## TimeLimit=DD-HH:MM:SS
> except REPLACE ######## with the job ID number, and
> REPLACE DD-HH:MM:SS with 2-digit number of days,
> 2-digit number of hours (beyond the number of days),
> 2-digit number of minutes (beyond the days and hours)
> and 2-digit number of seconds (beyond the days, hours
> and minutes).
> You have to pick a wall clock time limit short enough that
> the job can run to its time limit before the start of the
> scheduled maintenance outage.
> For example, suppose job 123456 had requested 48 hours of
> wall clock time limit, but now there are less than 48 hours
> before the start of the scheduled maintenance outage.
> So job 123456 won't be able to run until
> after the scheduled maintenance outage ends.
> But, we could change the requested wall clock time limit
> for job 123456 to, for example, 30 hours, like this:
> scontrol update JobId=123456 TimeLimit=01-06:00:00
> In that case, job 123456 would have a **CHANCE**
> (but **NOT A GUARANTEE**) to run before the outage starts.
> Henry
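
The time limit arithmetic described above can be sketched in bash. This is a minimal illustration, not an OSCER tool: the function names `seconds_to_timelimit` and `fits_before_outage` are hypothetical, and the rule that a job may start only if its full time limit ends at or before the outage start is a simplified reading of the announcement (the scheduler's exact behavior at the boundary is an assumption).

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not OSCER tools) illustrating the time limit arithmetic.

# Convert a number of seconds to the DD-HH:MM:SS time limit format.
seconds_to_timelimit() {
    local total=$1
    local days=$(( total / 86400 ))
    local hours=$(( (total % 86400) / 3600 ))
    local minutes=$(( (total % 3600) / 60 ))
    local seconds=$(( total % 60 ))
    printf '%02d-%02d:%02d:%02d\n' "$days" "$hours" "$minutes" "$seconds"
}

# Succeed (exit status 0) if a job starting at epoch time $1 with a limit of
# $2 seconds would reach its limit no later than the outage start at epoch $3.
# Assumption: a job ending exactly at the outage start is treated as fitting.
fits_before_outage() {
    local now=$1 limit=$2 outage_start=$3
    (( now + limit <= outage_start ))
}

# Example: a 30-hour limit for the job ID used in the announcement.
# The command is only printed here, not executed.
echo "scontrol update JobId=123456 TimeLimit=$(seconds_to_timelimit $(( 30 * 3600 )))"
```

The example only prints the `scontrol` command so that it can be reviewed before being run by hand. As the announcement notes, a shorter limit gives a job a chance to start before the outage, not a guarantee.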
> ------------------------------
> *From:* Neeman, Henry J. <[log in to unmask]>
> *Sent:* Thursday, December 5, 2024 12:26 PM
> *To:* OSCER users <[log in to unmask]>
> *Subject:* Scheduled maintenance outage Tue Dec 10 6am through midnight CT
> OSCER users,
> Scheduled maintenance outage Tue Dec 10 6am through midnight
> Central Time
> The Nov 26 outage reconfigured Side A;
> the Dec 10 outage will reconfigure Side B.
> This will affect the supercomputer, but not OURdisk, OURcloud,
> nor OURRstore.
> Planned maintenance tasks:
> OU Facilities Management will be finishing the reconfiguration
> of the power systems that feed electricity to the rows
> inside of the Four Partners Place Datacenter. This work is
> expected to take all of Tuesday next week. As such, we are
> scheduling downtime for our systems so that the
> electricians can complete the work described.
> The work for the project "16-25 4PP Computing Center Busbar
> Feed reconfiguration" allows us to better distribute the
> electrical load across the full breadth of the datacenter's
> power distribution panels. This gives OSCER the ability to
> distribute power utilization even more evenly, which
> gives us more available power per rack of servers.
> We expect the work to start by 7am Central Time on Tuesday.
> Also, we expect the work to be finished by 11:59pm Central
> Time on Tuesday, that same day.
> Time permitting, we'll also move a subset of scratch users from
> older scratch server(s) to a new larger scratch system.
> The user data will be copied to the new scratch system
> in advance, so the transition is expected to be smooth.
> As always, we apologize for the inconvenience -- our goal
> is to make OSCER resources even better!
> If you have any questions or concerns, please email us at:
> [log in to unmask]
> The OSCER Team