System and Infrastructure Status News

Launch Maintenance - May 7

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: launch.tamu.access-ci.org

Start Date: May 7, 2025, 1:00 p.m.

End Date: May 7, 2025, 10:00 p.m.

The TAMU Launch cluster will be down for maintenance on Wednesday, May 7 from 8:00AM CDT to 5:00PM CDT.
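The Start Date and End Date fields above do not state a time zone, but they line up with the CDT window if they are read as UTC. The following is a minimal sketch of that arithmetic, assuming CDT is a fixed UTC-5 offset and that the date fields are in UTC; the helper name cdt_to_utc is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: CDT is modeled as a fixed UTC-5 offset. The notice does not
# state the time zone of its Start/End Date fields; UTC is an inference.
CDT = timezone(timedelta(hours=-5), "CDT")


def cdt_to_utc(year, month, day, hour, minute=0):
    """Convert a CDT wall-clock time to an aware UTC datetime."""
    local = datetime(year, month, day, hour, minute, tzinfo=CDT)
    return local.astimezone(timezone.utc)


# Maintenance window from the notice: May 7, 8:00 AM to 5:00 PM CDT.
window_start = cdt_to_utc(2025, 5, 7, 8)
window_end = cdt_to_utc(2025, 5, 7, 17)

print(window_start.strftime("%Y-%m-%d %H:%M UTC"))
print(window_end.strftime("%Y-%m-%d %H:%M UTC"))
```

Under these assumptions the window converts to 13:00 and 22:00 UTC on May 7, which matches the 1:00 p.m. and 10:00 p.m. values in the date fields.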

Posted: March 20, 2026

Bugfix for registry.access-ci.org plugin

Published

Infrastructure News Type: Outage Partial

Affected Infrastructure: registry.access-ci.org

Start Date: April 30, 2025, 1:00 p.m.

End Date: April 30, 2025, 1:30 p.m.

On April 30, 2025, a plugin used by the ACCESS User Registry (https://registry.access-ci.org/) will be updated to address a bug that prevents some users from resetting their ACCESS CI passwords. Server instances will be restarted during this update, which may cause in-progress registrations/logins to fail.

Posted: March 20, 2026

PSC, Bridges-2, Neocortex, etc. Maintenance

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: bridges2-em.psc.access-ci.org, bridges2-gpu.psc.access-ci.org, bridges2-rm.psc.access-ci.org, bridges2-ocean.psc.access-ci.org, neocortex-sdflex.psc.access-ci.org

Start Date: April 30, 2025, 1:00 a.m.

End Date: May 5, 2025, 6:00 p.m.

We are pleased to announce that the Bridges-2 system is now back online and available. Neocortex is fully available. Due to severe weather that impacted the Pittsburgh area on Tuesday, April 29th, our data center experienced significant disruptions. This weather event caused widespread issues across the region, leading to a state of emergency being declared in Allegheny County. The prolonged power outage affected our machine room at Northern Pike, including the building’s generator controls, resulting in the longest sustained outage in our nearly 40-year history. Our electrical provider has been working diligently to restore power, and we are grateful for their efforts. We appreciate your patience and understanding as we navigated this extreme weather event.

Posted: March 20, 2026

Reconfiguration of ACCESS User Registry to remove LDAP

Published

Infrastructure News Type: Reconfiguration

Affected Infrastructure: registry.access-ci.org

Start Date: April 24, 2025, 1:00 p.m.

End Date: April 24, 2025, 1:10 p.m.

On April 24, 2025, the ACCESS User Registry (https://registry.access-ci.org/) will be reconfigured to remove the LDAP Provisioner. No downtime is expected. LDAP was previously used for storing ACCESS User Registry attributes for consumption by third party services such as CILogon and the ACCESS SSH Pubkey Downloader. On September 10, 2024 (https://operations.access-ci.org/node/749), an alternate provisioner using DynamoDB was deployed. (DynamoDB (https://aws.amazon.com/dynamodb/) is an AWS-managed serverless NoSQL service with high performance and a 99.999% SLA.) CILogon has been using the DynamoDB deployment since that time. On April 9, 2025 (https://operations.access-ci.org/node/838), the ACCESS SSH Pubkey Downloader was reconfigured to also use the DynamoDB deployment. After the ACCESS User Registry has been reconfigured to remove the LDAP Provisioner, there will be no services using the ACCESS User Registry LDAP servers, so they will be decommissioned.

Posted: March 20, 2026

Ticketing System Automation Rule execution is delayed

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: tickets.access-ci.org

Start Date: April 23, 2025, 3:00 p.m.

End Date: April 24, 2025, 7:46 p.m.

Hi Everyone, Atlassian is investigating the cause of the rule execution delay. Ticket communications may be delayed for you as a result. We will post an update as soon as we have one. In the meantime, we appreciate your patience and understanding. https://jira-service-management.status.atlassian.com/incidents/4717qqxyt0nk Thanks

Posted: March 20, 2026

Atlassian rolling out an updated Jira Navigation in the coming weeks

Published

Infrastructure News Type: Reconfiguration

Affected Infrastructure: tickets.access-ci.org

Start Date: April 16, 2025, 2:40 p.m.

End Date: May 1, 2025, 1:00 p.m.

Atlassian is planning to roll out an updated Jira navigation 'in the coming weeks.' Details are available at https://support.atlassian.com/jira-software-cloud/docs/what-is-the-new-navigation-in-jira/. We can turn it on now or wait until Atlassian makes it the default.

Posted: March 20, 2026

SDSC Expanse: Scheduler issues [Resolved]

Published

Infrastructure News Type: Outage Partial

Affected Infrastructure: expanse.sdsc.access-ci.org, expanse-gpu.sdsc.access-ci.org

Start Date: April 9, 2025, 11:30 p.m.

End Date: April 10, 2025, 4:15 a.m.

The Expanse scheduler issue has been fixed and job submissions and queue commands are working now.

Thanks
SDSC User Services Staff

-----

Dear Expanse User,

We are currently seeing issues with the Expanse Slurm scheduler and troubleshooting the problem. At present new job submissions and Slurm commands are failing. We will update once the issue is resolved.

Thanks
SDSC User Services

Posted: March 20, 2026

Update to Production Deployment of RP SSH Pubkey Service

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: registry.access-ci.org

Start Date: April 9, 2025, 2:00 p.m.

End Date: April 9, 2025, 3:00 p.m.

On April 9, 2025, starting at 9:00am CDT (10:00am EDT), ACCESS Operations will be updating the production deployment of the SSH Public Key retrieval service for ACCESS RPs. This will require a very brief downtime for the existing service, during which queries from RPs registered to use the service may be interrupted. There will be no interruption to the ACCESS User Identity and OAuth Client Registry for this update.

Posted: March 20, 2026

idp.access-ci.org updated

Published

Infrastructure News Type: Reconfiguration

Affected Infrastructure: identity.access-ci.org

Start Date: April 2, 2025, 2:30 p.m.

End Date: April 2, 2025, 2:40 p.m.

On April 2, 2025, the ACCESS Identity Provider (https://idp.access-ci.org/idp) was updated to address a vulnerability in the OpenSAML library (https://shibboleth.net/community/advisories/secadv_20250326.txt) used by the Shibboleth Identity Provider software. (Updated to use Shib IdP v5.1.4 (https://shibboleth.atlassian.net/wiki/spaces/IDP5/pages/3199500367/ReleaseNotes#5.1.4-(March-27,-2025)).)

Posted: March 20, 2026

Jira changing terminology, 'issue' is being changed to 'work item.'

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: tickets.access-ci.org

Start Date: March 31, 2025, 1:56 p.m.

End Date: May 1, 2025, 1:00 p.m.

This change and a couple of other related term changes are detailed in the Atlassian community post (https://community.atlassian.com/forums/Jira-articles/It-s-here-Work-is-the-new-collective-term-for-all-items-you/ba-p/2954892).

Posted: March 20, 2026

502 Bad Gateway Error Affecting JSM Customer Portal in Some Regions

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: tickets.access-ci.org

Start Date: March 25, 2025, 6:29 a.m.

End Date: March 25, 2025, 6:49 a.m.

Users in some regions are currently experiencing a 502 Bad Gateway error when trying to access the JSM ticketing system Customer Portal. This issue is impacting both customers and agents in those regions. We are waiting for an update from Atlassian support and will provide further information as soon as we hear from them. https://jira-service-management.status.atlassian.com/incidents/vy4wm8b5v0yh

Posted: March 20, 2026

Some errors in JSM automation executions

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: tickets.access-ci.org

Start Date: March 24, 2025, 12:39 a.m.

End Date: March 24, 2025, 4:56 a.m.

Hi Everyone, We are encountering some errors in the automation executions and, as a result, the notifications for new tickets or ticket edits may not go out. Other than that, there is no impact on managing tickets or queues. Please bear with us while we troubleshoot this with Atlassian support. We will keep you posted as we learn more. Thank you

Posted: March 20, 2026

Unplanned outage to xdmod.access-ci.org and metrics.access-ci.org [RESOLVED]

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: xdmod.access-ci.org

Start Date: March 21, 2025, 10:28 a.m.

End Date: March 21, 2025, 2:55 p.m.

We are investigating an issue with connectivity to xdmod.access-ci.org and metrics.access-ci.org. Resolved 9:56 AM EDT

Posted: March 20, 2026

Reconfigure ACCESS User Registry Database

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: registry.access-ci.org

Start Date: March 19, 2025, 1:00 p.m.

End Date: March 19, 2025, 1:30 p.m.

On March 19, 2025, the ACCESS User Registry (https://registry.access-ci.org/) will experience a total outage for approximately 10-15 minutes. During this time, the database used by the ACCESS User Registry will be reconfigured to use "utf8mb4" as the character encoding for several user attributes such as name and organization. This update is necessary to correctly display non-Latin characters. During the outage, visitors to https://registry.access-ci.org/ will be redirected to this infrastructure news notice. Users will not be able to register for ACCESS accounts, make changes to their existing ACCESS accounts, or create/modify OIDC client registrations. Logging on to other ACCESS websites should not be affected. For questions or concerns with this update, please contact help@cilogon.org (mailto:help@cilogon.org) or open an ACCESS Help Ticket (https://access-ci.atlassian.net/servicedesk/customer/portal/2/create/30).
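As a rough illustration of why a four-byte character set matters, the sketch below counts how many bytes UTF-8 needs for a few sample characters. The sample characters and function names are hypothetical and chosen only for illustration. The notice does not say which encoding the database used before the change; the three-byte limit mentioned in the comments refers to MySQL's legacy "utf8" (utf8mb3) character set.

```python
def utf8_width(ch: str) -> int:
    """Return the number of bytes UTF-8 uses to encode a single character."""
    return len(ch.encode("utf-8"))


def needs_four_bytes(text: str) -> bool:
    """True if any character in text needs 4 bytes in UTF-8.

    Such characters do not fit in MySQL's legacy 3-byte "utf8" (utf8mb3)
    character set, but do fit in utf8mb4.
    """
    return any(utf8_width(ch) == 4 for ch in text)


# Illustrative samples, written as escapes so the file stays ASCII-only.
samples = {
    "Latin capital A": "A",
    "Latin e with acute": "\u00e9",
    "CJK ideograph U+4E2D": "\u4e2d",
    "CJK ideograph U+20BB7": "\U00020BB7",
}

for label, ch in samples.items():
    print(f"{label}: {utf8_width(ch)} byte(s)")
```

A name containing a character like U+20BB7 would make needs_four_bytes return True, which is the case a four-byte character set is designed to store.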

Posted: March 20, 2026

Some slowness in select Jira Services noted from Atlassian

Published

Infrastructure News Type: Degraded

Affected Infrastructure: tickets.access-ci.org

Start Date: March 19, 2025, 12:45 p.m.

End Date: March 20, 2025, 1:00 p.m.

Some users are experiencing a 'site maintenance' message when attempting to access Confluence and Jira on some APAC-based sites (https://jira-service-management.status.atlassian.com/incidents/bt1p90s32742). We have partially restored functionality to some of the affected Jira Work Management, Jira Service Management, and Jira Cloud customers. We will continue to work on the fix and share more information as it becomes available.

Posted: March 20, 2026

Kerberos Replica DNS Work

Published

Infrastructure News Type: Reconfiguration

Affected Infrastructure: kerberos.access-ci.org

Start Date: March 18, 2025, 6:00 p.m.

End Date: March 18, 2025, 7:00 p.m.

There will be DNS work to transition to the new Kerberos replica server hosted at PSC. No service outage is expected, as hosts should fail over to the other replicas.

Posted: March 20, 2026

idp.access-ci.org updated

Published

Infrastructure News Type: Reconfiguration

Affected Infrastructure: identity.access-ci.org

Start Date: March 18, 2025, 1:30 p.m.

End Date: March 18, 2025, 1:30 p.m.

On March 18, 2025, the ACCESS Identity Provider (https://idp.access-ci.org/idp) was updated to address a Tomcat vulnerability that was being actively exploited in the wild.

Posted: March 20, 2026

SDSC Expanse: Lustre filesystem back in production use (03/24/2025)

Published

Infrastructure News Type: Outage Partial

Affected Infrastructure: expanse.sdsc.access-ci.org, expanse-gpu.sdsc.access-ci.org, expanse-ps.sdsc.access-ci.org

Start Date: March 17, 2025, 4:00 p.m.

End Date: March 25, 2025, 12:00 a.m.

Dear Expanse User,

The Expanse Lustre filesystem issues have been resolved and the filesystem is back in production use. Thank you for your patience through the long outage. We will continue to monitor the filesystem and follow up as needed. Users are also reminded that both /expanse/lustre/scratch and /expanse/lustre/projects are not backed up. So please make offsite copies of anything critical in those locations.

Thanks
SDSC User Services Staff

-----

Dear Expanse User,

We are continuing to work on the Expanse Lustre filesystem. The initial problem was due to a hardware issue with one of the metadata server drives. The drive was replaced but problems persist with mounting of storage targets due to a software bug being triggered in the Lustre filesystem. Unfortunately since the metadata server controls the entire filesystem, the /expanse/lustre/scratch and /expanse/lustre/projects directories will continue to be unavailable. We are sorry for the impact this is causing and will keep users posted about any new developments.

Thanks
SDSC User Services Staff

-----

Dear Expanse User,

We are continuing to work on the metadata server problem on the Expanse Lustre filesystem. Unfortunately the outage is going to go longer and we will update once we have more information.

Thanks
SDSC User Services Staff

-----

Dear Expanse User,

We are continuing to work on the Lustre filesystem on Expanse. The problem is going to take much longer than anticipated to resolve and likely the earliest we can recover is tomorrow (03/18/2025). We recognize that a lot of Expanse users do not use the lustre directories and to enable them to run we will release the reservation. We have held current jobs that are clearly using lustre (e.g. if they specified it in the output path or working directory path). However, if you have jobs that use Lustre without specifying the need through a constraint, the jobs will fail.

We strongly recommend all jobs needing Lustre include the following line:

#SBATCH --constraint="lustre"

Please see more details in our user guide under the "SUBMITTING JOBS USING LUSTRE" subsection in the storage section (https://www.sdsc.edu/systems/expanse/user_guide.html#narrow-wysiwyg-10). We also want to remind users that the home and NFS directories are limited in performance and scaling so please don't submit intensive jobs that need the Lustre filesystem from there. We will keep you posted on the Lustre issue status tomorrow.

Thanks
SDSC User Services Staff
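The recommendation to add the Lustre constraint can be checked before submitting a job. The following is a minimal sketch, not an SDSC tool: it flags a batch script that mentions a path under /expanse/lustre/ but has no #SBATCH --constraint line naming lustre. The function names, the regular expression, and the sample script (including the demo_user directory) are hypothetical, and the check only recognizes the long --constraint form of the directive.

```python
import re

LUSTRE_PREFIX = "/expanse/lustre/"

# Captures the value of "#SBATCH --constraint=..." (quoted or unquoted).
CONSTRAINT_RE = re.compile(
    r"""^\s*#SBATCH\s+--constraint[=\s]\s*["']?([^"'\s]+)""",
    re.MULTILINE,
)


def uses_lustre(script: str) -> bool:
    """True if the script text references a path under /expanse/lustre/."""
    return LUSTRE_PREFIX in script


def has_lustre_constraint(script: str) -> bool:
    """True if any #SBATCH --constraint directive mentions lustre."""
    return any("lustre" in value for value in CONSTRAINT_RE.findall(script))


def missing_constraint(script: str) -> bool:
    """True if the script uses Lustre paths without declaring the constraint."""
    return uses_lustre(script) and not has_lustre_constraint(script)


# Hypothetical job script that writes output to Lustre scratch.
script_without = """#!/bin/bash
#SBATCH --job-name=demo
#SBATCH --output=/expanse/lustre/scratch/demo_user/demo.%j.out
srun ./my_app
"""

# The same script with the recommended directive inserted after the shebang.
script_with = script_without.replace(
    "#!/bin/bash\n", '#!/bin/bash\n#SBATCH --constraint="lustre"\n', 1
)

print(missing_constraint(script_without))
print(missing_constraint(script_with))
```

A check like this only inspects the script text, so a job that reaches Lustre through an environment variable or a symlink would not be caught; declaring the constraint on every job that needs Lustre remains the safer habit.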

Posted: March 20, 2026

Service Slowness in Multiple Products (Jira, Jira Service Management, and Confluence)

Published

Infrastructure News Type: Outage Full

Affected Infrastructure: tickets.access-ci.org

Start Date: March 17, 2025, 3:30 p.m.

End Date: March 18, 2025, 4:31 a.m.

This may have delayed some of the automation executions by up to one day. As of now, no other impact has been identified. https://jira-service-management.status.atlassian.com/incidents/hbhx13fzvdgz#

Posted: March 20, 2026

ACCESS XDMoD Partial Downtime

Published

Infrastructure News Type: Outage Partial

Affected Infrastructure: xdmod.access-ci.org

Start Date: March 17, 2025, 2:00 p.m.

End Date: March 18, 2025, 2:00 p.m.

ACCESS XDMoD will be upgraded to version 11.0.1 on Monday, March 17 at approximately 10:00 EDT. Various data in ACCESS XDMoD may be unavailable during the upgrade. Service is expected to be fully restored within 24 hours. Once the upgrade is started, release notes will be available at https://xdmod.access-ci.org/#main_tab_panel:about_xdmod?Release%20Notes

Posted: March 20, 2026