General Information
This page lists announcements and status messages for IT services managed by ISG.EE.
For notifications and announcements of central IT services managed by ID, please visit https://www.ethz.ch/services/de/it-services/service-desk.html
For a detailed status overview of central IT services managed by ID, please visit https://ueberwachung.ethz.ch
Status-Key
- Resolved
- Still working but with some errors
- Pending
Current status reports
Network disruption affecting several ISG.EE services
Status:
- 2021-03-11 08:00
- The ID networking team has rolled back a deployed configuration, pending further investigation and analysis.
- 2021-03-11 07:30
- There is currently a disruption affecting a VPZ with servers managed by ISG.EE. The ID networking team is investigating the issue. Several ISG.EE services are affected or malfunctioning because of this, in particular the FindYourData service.
login.ee.ethz.ch: downtime for server upgrade
Status:
- 2021-03-11 06:30
- Upgrade completed and service is up and running again.
- 2021-03-11 06:00
- The server servicing login.ee.ethz.ch will be upgraded to a new OS version (Debian buster). During the update, logins might not be possible.
Planned project/archive storage downtime and client reboot
Status:
- 2020-07-11 12:00
- Migration has been completed, all services are back to operational state.
- 2020-07-11 08:00
- Migration started, services are shut down.
- 2020-07-11 08:00-12:00
- Start of planned maintenance work. Project/archive storage services (known under the names "ocean", "bluebay", "lagoon" and "benderstor") will not be available. ISG-managed Linux clients will be rebooted.
svn.ee.ethz.ch downtime for server upgrade
Status:
- 2020-06-04 07:05
- Web services for managing SVN repositories are enabled.
- 2020-06-04 06:15
- System upgrade is done and access to the SVN repositories via the svn and https transport protocols is back online.
- 2020-06-04 06:00
- The server servicing the SVN repositories will be upgraded to a new operating system version. During this timeframe, interruptions in access to the SVN repositories are expected.
European HPC cluster abuse
Status:
Recently European HPC clusters have been attacked and abused for mining purposes. The D-ITET Slurm and SGE clusters have not been compromised. We are monitoring the situation closely.
- 2020-05-17 08:30
- No successful login from known attacker IP addresses could be found, and none of the files that would indicate a compromise have been found on our file systems.
- 2020-05-16 14:30
- No unusual cluster job activity was observed.
D-ITET Netscratch downtime for server upgrade
Status:
- 2020-05-04 06:00
- Server upgrade has been completed.
- 2020-05-04 06:00
- The server servicing the D-ITET Netscratch service will be upgraded to a new operating system version. During this timeframe, outages of the NFS service are to be expected.
Network outage ETx router
Status:
- 2020-04-07 05:30
- There was an issue on the router rou-etx. The ID networking team tackled and solved the issue. There was an interruption of about 10 minutes for the ETx networking zone, affecting almost all systems maintained by ISG.EE.
login.ee.ethz.ch: Reboot for maintenance
Status:
- 2020-04-06 05:35
- The system behind login.ee.ethz.ch has been rebooted for maintenance and to increase the available resources.
See the information on accessing D-ITET resources remotely. To distribute the load better, users are encouraged to use the VPN service whenever possible.
itet-stor (FindYourData) Server maintenance: Reconfiguration of VM parameters
Status:
- 2020-02-18 19:03
- System is up and running again.
- 2020-02-18 19:00
- Scheduled downtime for the itet-stor/FindYourData service due to maintenance work on the underlying server.
itet-stor (FindYourData) Server migration: New operating system version
Status:
- 2020-01-20 07:15
- OS upgrade done. There were short interruptions to the itet-stor/FindYourData service.
- 2020-01-20 06:00
- We will update the server servicing the FindYourData service from Debian jessie 8 to Debian stretch 9. There will be short downtimes when accessing this service during the update.
Archived status reports
2010 2011 2012 2013 2014 2015 2016 2017 2018 2019