System and Infrastructure Status News

Delta maintenance 10-04-2023

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: delta-cpu.ncsa.access-ci.org, delta-gpu.ncsa.access-ci.org

Start Date: October 4, 2023, 11:00 a.m.

End Date: October 5, 2023, 3:00 a.m.

The Delta resource will undergo maintenance from 6:00AM to 8:00PM CDT on Wednesday, October 4th, 2023. Additional notices will be sent if changes to the plan occur.

During the maintenance period the following changes will be made:
- Minor updates to the OS.
- The pmix package will be updated to 3.2.5 to address a security vulnerability.
- ucx will be updated to a Mellanox release to improve functionality.
- apptainer will be updated to 1.2.2 from 1.1.9.
- No expected changes to the existing software stack.
- Software for the Delta high-speed network (HSN) switches and fabric manager will be upgraded.
- Lustre file system software will be updated to address known issues.
- The NVIDIA GPU driver will be upgraded to support CUDA 12.2, while the default CUDA software module will remain at CUDA 11.6.1.

All Delta resources will be unavailable during the maintenance period, including:
- Delta login nodes - unavailable
- Delta compute nodes - unavailable
- Delta services
  - Open OnDemand - unavailable
  - Delta Globus Online endpoint - unavailable

All running jobs will have been drained in advance of the maintenance. Queued jobs will persist and be eligible to run once maintenance is complete. A follow-up message will be sent once maintenance is complete.

Please send questions to help@ncsa.illinois.edu and be sure to mention Delta in the subject.

--Delta Project Office

Posted: March 20, 2026

SDSC Expanse NFS server issues [resolved]

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: expanse.sdsc.access-ci.org, expanse-gpu.sdsc.access-ci.org, expanse-ps.sdsc.access-ci.org

Start Date: September 29, 2023, 7:05 p.m.

End Date: October 1, 2023, 1:30 a.m.

The Expanse home directory server issues have been resolved and the machine is back in production and available for use.

>>> We are seeing the NFS server issues again on Expanse and this is causing login issues. We are working with the vendor to identify the source of the problem and will update once we have a resolution. In the interim we are putting in a system reservation to prevent new jobs from starting on Expanse.

Posted: March 20, 2026

SDSC Expanse: Home directory server and login issues [Resolved]

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: expanse.sdsc.access-ci.org, expanse-gpu.sdsc.access-ci.org

Start Date: September 28, 2023, 8:00 a.m.

End Date: September 28, 2023, 9:30 p.m.

Update: The home directory issues on Expanse have been resolved and the system is now available for logins and use.

>>> The SDSC Expanse home directory servers had issues overnight and that is leading to login problems. We are looking into the issues and will update once they are resolved.

Posted: March 20, 2026

Jira Service Management ticketing system is non-responsive

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: tickets.access-ci.org

Start Date: September 27, 2023, 6:25 p.m.

End Date: September 27, 2023, 7:02 p.m.

The Jira Service Management ticketing system is non-responsive again. We are working with support to resolve this and to avoid it in the future. We are not able to give an ETA at this time.

Posted: March 20, 2026

Jira Service Management ticketing system is non-responsive

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: tickets.access-ci.org

Start Date: September 27, 2023, 1:00 p.m.

End Date: September 27, 2023, 3:19 p.m.

Dear ACCESS colleagues, the Jira Service Management ticketing system is non-responsive in certain geographical regions. We have submitted a support ticket but have no ETA for the resolution.

Posted: March 20, 2026

Delta projects file system maintenance 09-14-2023

Published

Infrastructure News Type: Outage Partial

Affected Infrastructure: delta-storage.ncsa.access-ci.org

Start Date: September 14, 2023, 1:00 p.m.

End Date: September 15, 2023, 3:00 a.m.

The Delta projects (/projects) file system will be unavailable from 8:00AM to 10:00PM on Thursday, September 14th, 2023. The host file system, Taiga, will undergo semi-annual maintenance requiring Taiga mounts on Delta (/projects and /taiga) to be unmounted. No data can be read from or written to /projects or /taiga during the maintenance period.

The Slurm constraint for projects and taiga was removed Tuesday, September 12 at 8AM. Jobs with the projects constraint, even with a short wall clock time, will not be eligible for scheduling after that time and until maintenance is complete. Jobs that do not specify the /projects or /taiga file systems by Slurm constraint/feature will be allowed to run.

Please send questions to help@ncsa.illinois.edu and be sure to mention Delta in the subject or message body.

--Delta Project Office
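As a rough illustration of the scheduling rule described in the notice above, the following Python sketch scans a batch script for `#SBATCH` constraint directives and reports whether the job names one of the affected features. The feature names `projects` and `taiga` are taken from the notice. The helper names and the simplified constraint parsing are assumptions made for illustration and are not NCSA tooling.

```python
import re

# Feature names as written in the notice; their exact spelling on Delta is assumed.
MAINTENANCE_FEATURES = {"projects", "taiga"}

# Matches "#SBATCH --constraint=VALUE", "#SBATCH --constraint VALUE"
# and "#SBATCH -C VALUE".
_CONSTRAINT_RE = re.compile(
    r"^[ \t]*#SBATCH[ \t]+(?:--constraint(?:=|[ \t]+)|-C[ \t]*)(?P<value>\S+)",
    re.MULTILINE,
)


def requested_features(script_text: str) -> set[str]:
    """Return the feature names that appear in the script's constraint directives."""
    features: set[str] = set()
    for match in _CONSTRAINT_RE.finditer(script_text):
        value = match.group("value").strip("\"'")
        # Split on separators such as & and | (plus brackets and commas),
        # then drop any "*count" suffix from each feature name.
        for token in re.split(r"[&|,\[\]()]+", value):
            name = token.split("*", 1)[0]
            if name:
                features.add(name)
    return features


def waits_for_maintenance(script_text: str) -> bool:
    """True if the script requests a feature removed for the maintenance."""
    return bool(requested_features(script_text) & MAINTENANCE_FEATURES)
```

A script for which `waits_for_maintenance` returns `True` would, under the rule in the notice, stay queued until the maintenance is complete. The scheduler remains the authority on whether a job is eligible to run.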

Posted: March 20, 2026

Anvil Unplanned Outage

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: anvil.purdue.access-ci.org

Start Date: September 10, 2023, 5:12 p.m.

End Date: September 11, 2023, 5:17 p.m.

Anvil has been returned to service.

Posted: March 20, 2026

DUO authentication slowness or failure August 21, 2023

Published

Infrastructure News Type: Outage Partial

Affected Infrastructure: identity.access-ci.org

Start Date: August 21, 2023, 1:34 p.m.

End Date: August 21, 2023, 6:00 p.m.

Update as of 08-21-2023 1:00 pm Central: The DUO multi-factor service appears to be working normally again.

Original News: The ACCESS Multi-factor Authentication (MFA) service provided by DUO is experiencing slowness or failure, affecting all logins using ACCESS username and password. The vendor is aware of the problem and working to resolve it. The vendor is posting outage updates here (https://status.duo.com/incidents/rw7g0q7ztj8f) and here (https://status.duo.com).

Posted: March 20, 2026

SDSC Expanse Maintenance, 7AM-Midnight (PT), August 14, 2023

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: expanse.sdsc.access-ci.org, expanse-gpu.sdsc.access-ci.org, expanse-ps.sdsc.access-ci.org

Start Date: August 14, 2023, 2:00 p.m.

End Date: August 15, 2023, 6:59 a.m.

We will have a maintenance period on Expanse 7AM-Midnight (PT), Monday, August 14, 2023. There is a reservation in place to prevent jobs from running during this period. The "squeue" output will show "ReqNodeNotAvail, Reserved for maintenance" for jobs that do not fit in the time period before the maintenance begins. These jobs will run after we release the reservation. The SDSC Expanse portal and the SDSC Expanse Globus collections will also be unavailable during this maintenance period.
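The reservation behaviour described above comes down to a simple time comparison, sketched here in Python: a job can start only if its requested time limit ends before the maintenance begins. The start time (7AM PT, August 14, 2023) is from the notice. The function names, the fixed UTC-7 offset for Pacific Daylight Time, the supported time-limit formats, and the inclusive comparison are assumptions for illustration; Slurm itself makes the actual decision.

```python
from datetime import datetime, timedelta, timezone

# Pacific Daylight Time as a fixed offset (valid for August dates).
PDT = timezone(timedelta(hours=-7), "PDT")

# Start of the maintenance window from the notice: 7AM PT, August 14, 2023.
MAINTENANCE_START = datetime(2023, 8, 14, 7, 0, tzinfo=PDT)


def parse_walltime(text: str) -> timedelta:
    """Parse a time limit written as "HH:MM:SS" or "D-HH:MM:SS".

    This covers only a subset of the formats Slurm accepts.
    """
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def fits_before_maintenance(
    now: datetime, walltime: str, start: datetime = MAINTENANCE_START
) -> bool:
    """True if a job started at `now` would end no later than `start`."""
    return now + parse_walltime(walltime) <= start
```

For example, a job with a 48-hour limit submitted the evening before the maintenance would not fit, and would show the "ReqNodeNotAvail, Reserved for maintenance" reason until the reservation is released.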

Posted: March 20, 2026

RESOLVED -- 8/13/2023 Network outage upstream from Jetstream2

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: jetstream2.indiana.access-ci.org, jetstream2-gpu.indiana.access-ci.org, jetstream2-lm.indiana.access-ci.org, jetstream2-storage.indiana.access-ci.org

Start Date: August 13, 2023, 4:00 p.m.

End Date: August 13, 2023, 5:15 p.m.

UPDATE - 1:30pm Eastern: Network engineers have resolved the upstream network issues as of approximately 1:15pm Eastern. VMs on Jetstream2 should not have been affected by the outage. If you are seeing issues, please open a ticket via https://support.access-ci.org/open-a-ticket

------

There is a network outage upstream from Jetstream2 that is preventing access. Running VMs should be unaffected but they will likely not be accessible. Network engineers are aware of the issue and are working on it presently. We will update as soon as we have additional information.

Posted: March 20, 2026

8/8/2023 3:56pm ET - IU Bloomington power event affecting Jetstream2 services

Published

Infrastructure News Type: Outage Partial

Affected Infrastructure: jetstream2.indiana.access-ci.org, jetstream2-gpu.indiana.access-ci.org, jetstream2-lm.indiana.access-ci.org, jetstream2-storage.indiana.access-ci.org

Start Date: August 8, 2023, 1:30 p.m.

End Date: August 8, 2023, 7:56 p.m.

8/8/2023 3:56pm ET: UPDATE 2: The system has been restored to service. All nodes are operational again. All instances that were active prior to the outage should be active again. If you see any odd behavior, you may wish to reboot instances before opening a ticket. If there are lingering issues, please open a ticket via support.access-ci.org/open-a-ticket

8/8/2023 1:52pm ET: UPDATE 1: CPU and Large Memory nodes have been restored to service. We are still working to bring up GPU nodes. Exosphere is operational again. Non-GPU instances that were active prior to the outage should be active again. If you see any odd behavior, you may wish to reboot instances before opening a ticket. If there are lingering issues, please open a ticket via support.access-ci.org/open-a-ticket

8/8/2023 9:30am ET: Jetstream2 is currently experiencing an unexpected outage related to power maintenance in the IU Bloomington data center. Engineers are currently investigating; please refrain from submitting support tickets for inaccessible instances at this time. This status item will be updated and further notices sent out when all systems are back online. Thank you for your patience and understanding as we work through these issues.

-Jetstream2 Support

Posted: March 20, 2026

ACCESS User Identity and OAuth Client Registry Planned Maintenance July 1, 2023

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: registry.access-ci.org

Start Date: July 1, 2023, 11:00 a.m.

End Date: July 1, 2023, 12:00 p.m.

The ACCESS User Identity and OAuth Client Registry (https://registry.access-ci.org) will be down for upgrades early Saturday, July 1, 2023.

Posted: March 20, 2026

NCSA Delta Resource Power Outage Event

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: delta-cpu.ncsa.access-ci.org, delta-gpu.ncsa.access-ci.org

Start Date: June 30, 2023, 12:00 a.m.

End Date: June 30, 2023, 1:03 a.m.

The Delta resource has been returned to service. All compute nodes and file systems remained up during the outage. Please check your running jobs.

--Earlier Post--
The NCSA Delta resource experienced a second power interruption, causing an outage. Login nodes and file systems are currently accessible but the resource scheduler is unavailable. Please expect 4 hours for the system to recover and be returned to service. This posting will be updated when additional information is available.

Posted: March 20, 2026

Jetstream2 / Jetstream2 GPU / Jetstream2 Large Memory Power Outage

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: jetstream2.indiana.access-ci.org, jetstream2-gpu.indiana.access-ci.org, jetstream2-lm.indiana.access-ci.org, jetstream2-storage.indiana.access-ci.org

Start Date: June 29, 2023, 7:55 p.m.

End Date: June 30, 2023, 3:14 a.m.

Due to an ongoing adverse weather event at the Indiana University Bloomington data center, Jetstream2 is experiencing an unplanned outage. All services appear to be affected. Engineers are currently investigating the extent of the outage and next steps, though a timeline is still unknown. We appreciate your patience as we work through these technical difficulties. At this time, we ask that you please avoid submitting support tickets regarding this outage.

UPDATE - 5:20p ET - Engineers are currently working to bring all affected nodes back online. We are presently waiting for all network services to be restored. Once we are cleared by operations and networks, we will begin powering up Jetstream2 infrastructure. There is no ETA at this time for return to service. We will update as soon as we have additional information.

UPDATE 2 - 9:57p ET - Existing instances should be online within the next hour and available for use. New instances are not getting full connectivity. Engineers are still looking into that issue. We appreciate your patience.

UPDATE 3 - 11:12p ET - The system has been restored to service. VMs that were active prior to the outage should be active again. If you see any odd behavior, you may wish to reboot instances before opening a ticket. If there are lingering issues, please open a ticket via https://support.access-ci.org/open-a-ticket

Thanks,
Jetstream2 Support

Posted: March 20, 2026

NCSA Delta Resource Power Outage

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: delta-cpu.ncsa.access-ci.org, delta-gpu.ncsa.access-ci.org

Start Date: June 29, 2023, 6:30 p.m.

End Date: June 29, 2023, 8:45 p.m.

The NCSA Delta resource experienced a power outage during a severe thunderstorm event. Please expect 4 - 8 hours for the system to recover and be returned to service. This posting will be updated when additional information is available.

Posted: March 20, 2026

Anvil Cluster Maintenance

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: anvil.purdue.access-ci.org, anvil-gpu.purdue.access-ci.org

Start Date: June 28, 2023, 12:00 p.m.

End Date: June 29, 2023, 3:00 p.m.

As of 10:14 AM EDT, engineers have completed maintenance and have returned the Anvil system to normal service. All queues have been enabled and jobs have resumed scheduling. Please see the detailed list of changes (https://www.rcac.purdue.edu/news/5921) in a separate article. Please report any issues through the ACCESS Help Desk at https://support.access-ci.org/open-a-ticket

------

We have extended the maintenance until 10 AM EDT today.

------

The Anvil system will be unavailable from Wednesday, June 28th, 2023 at 8:00am EDT to Thursday, June 29th, 2023 at 8:00am EDT for scheduled maintenance. Any Slurm jobs that request a walltime that would take them past Wednesday, June 28th, 2023 at 8:00am EDT will not start and will remain in the queue until after the maintenance is completed.

How does this maintenance impact you?
* This is a standard maintenance.
* OS and Slurm will be updated to the newest stable release.
* Storage system configuration updates by the vendor.
* No user impact is expected once Anvil is returned to service.

Anvil will return to full production by Thursday, June 29th, 2023 at 8:00am EDT. Please submit a ticket through the ACCESS Help Desk at https://support.access-ci.org/open-a-ticket if you have any questions.

Posted: March 20, 2026

ACCESS Ticket System

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: tickets.access-ci.org

Start Date: May 19, 2023, 11:00 a.m.

End Date: May 19, 2023, 4:25 p.m.

5-19-2023 11:30 AM Central Update: Ticketing system functionality is restored.

Original post: The Jira Service Management ticketing system is unavailable. We are working to resolve the issue and expect ticketing capability to be restored soon.

Posted: March 20, 2026

Anvil Storage Restored

Published

Infrastructure News Type: Outage Partial

Affected Infrastructure: anvil.purdue.access-ci.org, anvil-gpu.purdue.access-ci.org

Start Date: May 16, 2023, 3:30 p.m.

End Date: May 16, 2023, 7:45 p.m.

The issues have been resolved and affected services are now functioning again.

------

Anvil storage began experiencing networking issues around 11:30 AM. The issues have affected access to ondemand.anvil, thinlinc, and ssh keys. Engineers are diagnosing the issues now. We will post an update by 5:00PM EDT.

Posted: March 20, 2026

SDSC Expanse Maintenance May 8-9, 2023

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: expanse.sdsc.access-ci.org, expanse-gpu.sdsc.access-ci.org

Start Date: May 8, 2023, 3:00 p.m.

End Date: May 10, 2023, 6:59 a.m.

We will have maintenance scheduled on Expanse starting 8AM (PT) May 8 through May 9, 2023. During this maintenance several upgrades will be performed. Details of the upgrades and the process are below:
- The operating system will be upgraded from Rocky Linux 8.5 to Rocky Linux 8.7. The OFED version will be updated to 5.8.1.1. In addition, the firmware on all nodes will be updated.
- The drivers on the GPU nodes will be updated to version 515.65.01 (CUDA 11.7).
- The Lustre client version will be updated to version 2.15.2.
- A new software stack built using Spack 0.17.3 will become available and will be the default (available as modules cpu/0.17.3b or gpu/0.17.3b).
- The original software stack will continue to be available and will work with the new OS and OFED versions. The old software stack can be accessed by using "module load cpu/0.15.4" or "module load gpu/0.15.4".
- We will be making the new software environment available in advance of the maintenance and will send out an update once it is available.
- Jobs submitted before the upgrade without specifying the version of cpu (0.15.4) or gpu (0.15.4) must be cancelled and resubmitted after the maintenance to ensure that the right environment is picked up for their jobs (since the defaults will change to 0.17.3b).
- We are reserving nodes for multiple days given the large number of updates involved. Jobs that will not fit in the time window before the maintenance will be pending and held before the upgrade.
- After our service nodes are upgraded and the Lustre update is completed, we will look to put fully upgraded nodes back into the queue where possible to help mitigate the long downtime.

Please submit a ticket via ACCESS Support (https://support.access-ci.org) or SDSC's local ticketing system (consult@sdsc.edu) if you have any questions.

Thanks,
SDSC User Services Team
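To illustrate the resubmission rule in the notice above, here is a minimal Python sketch that flags batch scripts loading the `cpu` or `gpu` module without a version, since those are the jobs that would pick up the new 0.17.3b default. The module names come from the notice. The function name and the simple line-based parsing are assumptions for illustration and not an SDSC-provided tool.

```python
# Stack modules named in the notice; loading one without a version means the
# job follows whatever the default is at run time.
STACK_MODULES = ("cpu", "gpu")


def unpinned_stack_loads(script_text: str) -> list[str]:
    """Return each stack module that the script loads without a version."""
    unpinned = []
    for line in script_text.splitlines():
        # Ignore comments (including #SBATCH directives) and split into words.
        tokens = line.split("#", 1)[0].split()
        if tokens[:2] != ["module", "load"]:
            continue
        for name in tokens[2:]:
            # "cpu/0.15.4" is pinned and does not match; bare "cpu" does.
            if name in STACK_MODULES:
                unpinned.append(name)
    return unpinned
```

A script for which this returns a non-empty list could either be resubmitted after the maintenance or edited to pin a version, for example "module load cpu/0.15.4" for the original stack.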

Posted: March 20, 2026

Bridges-2 Maintenance Wednesday April 26, 2023

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: bridges2-em.psc.access-ci.org, bridges2-gpu.psc.access-ci.org, bridges2-rm.psc.access-ci.org, bridges2-ocean.psc.access-ci.org

Start Date: April 26, 2023, 1:00 p.m.

End Date: April 26, 2023, 10:00 p.m.

Bridges-2, including all VMs and filesystems, will be unavailable due to scheduled maintenance on Wednesday, April 26, 8:00AM-5:00PM Eastern Time. The Slurm queue will be preserved and queued jobs will begin running once the machine has returned to service. Please direct any questions to help@psc.edu and our team will be happy to assist you.

Thank you,
Bridges-2 Team

Posted: March 20, 2026