Differences between revisions 17 and 118 (spanning 101 versions)

Status-Key
	Resolved
	Pending

NFS outage on oenone

STATUS:

2011-08-30: 02:00 AM - 08:30 AM: At around 02:00 last night the NFS services on oenone crashed. Users with homes on oenone were affected: BIWI, VAW, Collegium Helveticum, Control, IBT, IKT. The crash also affected all web services that depend on oenone, as well as the mail server (at least for users whose home directory is on oenone). No mails are lost, as we put them into a hold queue.
Update: 08:30: oenone is now up and running again. All mails in the hold queue are now gradually being delivered. The web services are also all available again.

We apologize for the inconvenience and are investigating the problem.

Connection problems to all admin.ch servers

STATUS:

2011-08-23: 03:00 PM: All admin.ch websites and mail servers are reachable.
2011-08-23: 01:25 PM: Currently all traffic to the admin.ch servers is disrupted. This includes the websites and email connections. Mails you have sent are not lost, as our server keeps them until it can connect to the destination.

Maintenance work on D-ITET's central IT infrastructure

STATUS:

2011-08-23: CANCELLATION: The maintenance is cancelled due to an unresolved error in the Solaris operating system introduced by the latest patch set.
2011-08-23: 7:00 PM - 10:00 PM: To keep our systems up to date with the newest software and security releases, we need to update our servers on a regular basis. For this reason we are going to patch and reboot some of our Solaris servers.

To prevent data corruption/loss please do the following:

save all open files
close all running applications
log out from all ITET systems (Linux, Solaris, Windows, Mac OS X)
shut down your personal PC/Desktop
do not establish any connection from outside

More details:

servers concerned: DRWHO, TARDIS, OENONE, SPITFIRE, YOSEMITE, MALINA
webpages hosted on these systems are NOT available
NO mail access, NO outgoing mails (incoming mails WON'T get lost)

Routing problems on switch.ch network

STATUS:

2011-08-19: 05:00 PM: Routing Problems solved.
2011-08-19: 02:22 PM: Because of a routing problem on the switch network, all traffic to http://www.virginia.edu and their mail server is disrupted.

Outage drwho.ee.ethz.ch

STATUS:

2011-08-13 09:00: Since around 1:00 we have been experiencing problems on one of our main servers, affecting most services.
2011-08-13 11:00: All services back to normal.

ETH wide DNS outage

STATUS:

2011-08-08 10:00: ETH-wide DNS outage. All services that rely on name resolution do not work. Our mail server temporarily rejects all incoming messages with 450 4.3.2 Service currently unavailable. Properly configured mail servers should retry delivery later, so no mail is lost.
2011-08-08 10:20: DNS works again. All Services up and running.
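
Why a 450 reply does not lose mail: SMTP reply codes starting with 4 signal a temporary failure, so the sending server keeps the message queued and retries, whereas codes starting with 5 are permanent and produce a bounce. A minimal Python sketch of that distinction; the function name and the returned labels are our own illustration, not part of any mail server software:

```python
def classify_smtp_reply(reply: str) -> str:
    """Classify an SMTP reply line by the first digit of its status code."""
    code = reply.strip()[:3]
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not an SMTP reply line: {reply!r}")
    if code[0] == "2":
        return "accepted"      # positive completion
    if code[0] == "4":
        return "retry later"   # transient failure: sender keeps the mail queued
    if code[0] == "5":
        return "bounce"        # permanent failure: sender gives up
    return "other"             # e.g. 3yz intermediate replies


print(classify_smtp_reply("450 4.3.2 Service currently unavailable"))
```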

colombo04 not available

STATUS:

2011-06-23 11:00: The power supply of colombo04 has been replaced. colombo04 is up and ready.
2011-06-22 07:00: The power supply of colombo04 failed. We are waiting for a replacement from Oracle.

biwinas01 not available

STATUS:

2011-06-08 08:00: biwinas01 is now back online.
2011-06-07 15:00: The hardware supplier returned the server today. They had to replace the following hardware:

1 CPU
1 power supply
1 fan

We will test the server now and bring it back online as soon as possible.

2011-06-06 15:00: biwinas01 is currently out of order due to a hardware failure. It might take several days until biwinas01 is back online.

ifhlux11 not available

STATUS:

2011-06-08 11:20: ifhlux11 is now back online.
2011-06-07 15:00: We are in contact with the supplier. Unfortunately, the reason for the crash of ifhlux11 is not known yet.
2011-06-06 15:00: ifhlux11 is currently out of order due to a hardware failure. It might take several days until ifhlux11 is back online.

Coming soon: Outage of several IT services due to cooling system replacement

STATUS:

2011-06-03 15:00: All servers are up and running again, with the exception of biwinas01 and ifhlux11 (see above).

UPCOMING: 2011-06-03 09:00 - 2011-06-06 09:00

The refrigeration supply in the ETZ building will be replaced. One server room has to be shut down completely (ETZ/D/96.2; all compute servers of the institutes). The other server room (ETZ/F/66) will get an emergency cooling system. This will, however, not be sufficient to cool all servers currently running in that room. Consequently, we will shut down as many servers as possible.

Basic services like email, user homes and other network shares, printing and login to Windows or Linux workstations are not expected to be affected by this construction work. In general, if you are not familiar with the server names or service terms below, you should not be affected at all.

These services will not be available during the construction work times:

Shut down 2011-06-03 at 09:00. Back online 2011-06-06 at 09:00:
- Remote-Desktop-Access to the Windows Terminal Servers
  - quinn
  - sivi
  - vega7
- All institute NAS servers (no access via Samba, link in home, ssh, etc.)
  - biwinas01-03
  - hamam01
  - ifenas01
  - ibtnas01
- Most IFE compute servers
  - bernstein
  - coltrane
  - dylan
  - haydn
  - marley
  - mozart
- Publication databases on sato (IFA)
Shut down 2011-06-03 at 16:00. Back online 2011-06-06 at 09:00:
- Computer rooms for students
  - ETZ D 61.1
  - ETZ D 61.2
  - ETZ D 96
- All institute compute servers
  - autserv*
  - bender*
  - biwilux*
  - casseri*
  - colombo*
  - IFE compute servers cash and elvis
  - ifhlux*
  - nariwork*
  - tik*x
  - vierzack*

Firewall Problem: some Services not reachable

STATUS:

2011-05-26 07:15 - 2011-05-26 16:00

Due to problems with the firewall hardware, some services of the D-ITET-servers were not reachable.

Maintenance reboot of Servers behind login.ee.ethz.ch and ipp2vpp.ee.ethz.ch (printing/licenses)

STATUS:

On 2011-05-19 around 7:00AM we will perform a maintenance reboot of polaris (serving login.ee.ethz.ch) and zaan (serving printing at D-ITET). During this downtime it will not be possible to print via samba or cups.

Maintenance reboot of Server behind people.ee.ethz.ch

STATUS:

2011-05-10 07:45

On 2011-05-10 around 7:00AM we will perform a maintenance reboot of galen. galen is the Server serving your personal homepage on people.ee.ethz.ch.

Horde Webmail outage

STATUS:

2011-05-02 23:49 - 2011-05-03 02:27

The Horde Webmail Client had to be taken down for security reasons. It was not clear whether someone had used a zero-day exploit or a phished account to send spam via our server. It turned out that the attackers used a phished account.

Please remember: We at ISG.EE will NEVER ask you for your password.

Short SMTP outage

STATUS:

2011-02-28 14:49 - 15:04: As a result of an LDAP failure we had to stop the mail server for 15 minutes to prevent incoming emails from being rejected. While the main server was down, our two backup MX servers collected incoming emails.
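
The fallback to the backup MX servers follows the usual MX preference rule: a sender tries the host with the lowest preference value first and moves on to the next one if that host is unreachable. A minimal Python sketch of this selection; the host names and preference values are hypothetical placeholders, not our real DNS records:

```python
from typing import List, Optional, Set, Tuple

# Hypothetical MX records as (preference, host); lower preference is tried first.
EXAMPLE_RECORDS: List[Tuple[int, str]] = [
    (20, "mx-backup1.example.org"),
    (10, "mail.example.org"),
    (30, "mx-backup2.example.org"),
]


def pick_mx(records: List[Tuple[int, str]], unreachable: Set[str]) -> Optional[str]:
    """Return the reachable host with the lowest preference value, or None."""
    for _preference, host in sorted(records):
        if host not in unreachable:
            return host
    return None


# With the main server stopped, delivery falls through to the first backup.
print(pick_mx(EXAMPLE_RECORDS, {"mail.example.org"}))
```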

Windows Terminal Server zhadum out-of-operation

STATUS:

2011-02-28 15:00: Please use the server vega7 from now on. If you had access to zhadum before, you should also be able to access vega7. Please use your NETHZ username and password to log in.
2011-02-28 14:00: The hardware of the departmental Windows terminal server zhadum is broken. A replacement server should be available soon...

Upgrade of Backup Server JABBA

STATUS:

2011-02-22 6:15 PM: The migration is finished and the new JABBA server is online.
2011-02-22 7:00 AM - 7:00 PM (approx.): We are going to upgrade the department's backup server JABBA. This upgrade includes changes in software as well as in hardware. During the upgrade, no restore requests for lost data can be fulfilled.

The complete backup infrastructure and all associated services are NOT available during the upgrade

Agilent ADS/ICCAP License Server Change

STATUS:

2011-02-07 10:00: As announced a week ago, the license server for Agilent ADS and ICCAP software has changed. If your client software still uses the old license server information, please make sure you change the license server to lic-agilent.ee.ethz.ch. The port stays the same.
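
If your client stores the license server in the common `port@host` form, only the host part needs to change. A small Python sketch under that assumption; the port 27000 and the old host name below are made-up placeholders, and where the setting is stored depends on your installation:

```python
NEW_HOST = "lic-agilent.ee.ethz.ch"


def switch_license_host(setting: str, new_host: str = NEW_HOST) -> str:
    """Replace the host in a 'port@host' setting and keep the port unchanged."""
    port, sep, _old_host = setting.partition("@")
    if not sep or not port.isdigit():
        raise ValueError(f"expected 'port@host', got {setting!r}")
    return f"{port}@{new_host}"


# Hypothetical old value; use the port from your existing configuration.
print(switch_license_host("27000@old-license.example.org"))
```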

Subversion server not reachable from outside ETH Zurich

STATUS:

2011-01-31 16:30: Our subversion server is now available again for users outside of ETH Zurich.
2011-01-31: 16:15: We have been told that the central firewall rules cannot be changed right now due to other (completely unrelated) problems. At the moment it is unknown when this will be fixed.
2011-01-31: 15:30: Our subversion server svn.ee.ethz.ch is at the moment not accessible through the svn:// protocol from outside ETH Zurich due to a firewall configuration error. We are working on it...
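
To check from outside ETH Zurich whether the svn:// port is reachable again, a plain TCP connection test is enough. A minimal Python sketch, assuming the server listens on 3690, the default port for svn://; the function name is our own illustration:

```python
import socket

SVN_PORT = 3690  # default port for the svn:// protocol


def port_reachable(host: str, port: int = SVN_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling `port_reachable("svn.ee.ethz.ch")` from a machine outside the ETH network tells you whether the firewall lets the connection through. It does not verify that the Subversion service itself answers correctly.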

Mailserver problems while patching

STATUS:

2011-01-26: 11:00 AM: During the server upgrade last night, a patch temporarily misconfigured the mail server. The server accepted incoming mails but could not place them into the users' mailboxes. EVERY such mail created a bounce. Because of this, our statement that no mails get lost while the servers are being updated is no longer fully true. Incoming mails that bounced have to be resent by the sender.

Solaris Server Patching

STATUS:

2011-01-25: 10:30 PM: All servers and services are back online
2011-01-25: 7:00 PM - 10:00 PM: To keep our systems up to date with the newest software and security releases, we need to update our servers on a regular basis. For this reason we are going to patch and reboot some of our Solaris servers.

Servers concerned: drwho, tardis, oenone, spitfire, yosemite, malina.

To prevent data corruption/loss please do the following:

All diskless clients (Linux): please log out and shut down all DL clients
All Windows systems: please log out and shut down all Windows clients
All Mac systems: please log out and shut down all Mac clients
All user homes: please log out from these servers
No NFS or SAMBA access to user homes
No mail access, no outgoing mails (incoming mails WON'T get lost)
Webpages hosted on these systems are unavailable

Delayed Email delivery

STATUS:

2011-01-19: 11:00 AM - 4:30 PM: As a result of a faulty ClamAV signature file, every email that contained a PDF file was marked as infected. We had to fix the issue before we could resend the quarantined emails. No mail was lost and everything was resent.
Update: 2011-01-20: 10:33 AM: The ClamAV signatures have been updated and tested. Everything is working as it should.

Maintenance Reboot of Solaris Server Yosemite

STATUS:

2010-12-07: 7:30 AM: Server yosemite has been rebooted successfully. All services are available.
2010-12-07: 7:00 AM: Due to a shortage of available memory we are forced to reboot the solaris server yosemite. Downtime approx. 30 minutes.

cooling water system outage on clusters

STATUS:

2010-11-26: 5:00 PM: Host autserv02 is running as well. All hosts can be used.
2010-11-26: 4:40 PM: Server racks are cooled again. All hosts except autserv02 are running and can be used.
2010-11-26: 4:00 PM: Server racks are still down. --> Update follows at 5 PM or earlier
2010-11-26: 3:10 PM: One of the cooling water pumps installed in ETZ/D/96.2 is not working correctly. This forces some of the racks in this server room to shut down in order to protect the servers from thermal damage. Clusters from IFH, IBT, BIWI, TIK, IKT and VAW are affected. Facility management is working on solving the problem. --> Update follows at 4 PM

email phishing attack

STATUS:

2010-11-23: Yesterday between 18:20 and 19:50, about 320 phishing mails were sent to various users at D-ITET. The mails pretend to come from IT Support Group and carry the subject ISG.EE Webmail Alert. They claim that spammers have compromised the ISG.EE Webmail Account and ask you to provide your Username, Password and an Alternate Email. Please remember that the ISG.EE team will NEVER ask you for your password! If you have nevertheless replied to this phishing mail, please contact us immediately at support@ee.ethz.ch so that we can plan the next steps with you to keep your account safe.

oenone home server crash

STATUS:

2010-11-17: At around 00:15 last night oenone, one of our home servers, crashed. Users with homes on oenone were affected: BIWI, VAW, Collegium Helveticum, Control, IBT, IKT. The server is now checking the filesystems and coming up again.

We apologize for the inconvenience and are investigating the problem.

Update: 08:00: oenone is now up and running again.
Update: 2010-11-18 07:30: We opened a support case at Sun/Oracle for this server.

cooling water system outage for some clusters

STATUS:

2010-11-16: Last Friday evening one of the cooling water pumps installed in ETZ/D/96.2 stopped working correctly. This forced some of the racks in this server room to shut down in order to protect the servers from thermal damage. All clusters from IFH, IBT, BIWI, TIK, IKT and VAW were affected.

The facility management is working on solving the problem.

The servers are currently (08:35) down again. Even if they come up again, please do not use them for long-running computations, as we do not yet know exactly when the technician will have resolved the issue.

Update: 2010-11-17 08:25: The rack systems are now running with only one cooling water pump. The rack company has ordered a new pump.
Update: 2010-11-18 16:00: The broken pump is scheduled to be replaced on 25.11 or 26.11.

-  ⇤ ← Revision 17 as of 2010-11-18 06:28:18 → 
  Size: 1465
  Editor: bonaccos
  Comment:
+   ← Revision 118 as of 2011-08-30 07:00:15 → ⇥
  Size: 17126
  Editor: maegger
  Comment:
-Deletions are marked like this.
+Additions are marked like this.
 Line 1:
-= Status =
+||||<style="border-width: 1px 0px; border-color: rgb(85, 136, 238); padding: 0.6em;">'''Status-Key''' ||
||<style="border: medium none;"> {{attachment:green.gif}} ||<style="border: medium none;">Resolved ||
||<style="border-width: medium medium 1px; border-top: medium none rgb(85, 136, 238); border-left: medium none rgb(85, 136, 238); border-right: medium none rgb(85, 136, 238); border-color: rgb(85, 136, 238);"> {{attachment:red.gif}} ||<style="border-width: medium medium 1px; border-top: medium none rgb(85, 136, 238); border-left: medium none rgb(85, 136, 238); border-right: medium none rgb(85, 136, 238); border-color: rgb(85, 136, 238);">Pending ||

<<Anchor(2011-08-30-nfs-problems-on-oenone)>>
= NFS outage on oenone =
'''STATUS:''' {{attachment:green.gif}}
 2011-08-30:  02:00 AM - 08:30 AM:: At around 02:00 last night the NFS services on '''oenone''' crashed. Users with homes on oenone were affected: '''BIWI''', '''VAW''', '''Collegium Helveticum''', '''Control''', '''IBT''', '''IKT'''. The crash also affected all web services that depend on oenone, as well as the mail server (at least for users whose home directory is on oenone). No mails are lost, as we put them into a hold queue.

 Update: 08:30:: oenone is now up and running again. All mails in the hold queue are now gradually being delivered. The web services are also all available again.

We apologize for the inconvenience and are investigating the problem.

----

<<Anchor(2011-08-23-admin-ch-connection-problems)>>
= Connection problems to all admin.ch servers =
'''STATUS:''' {{attachment:green.gif}}
 2011-08-23:  03:00 PM:: All admin.ch websites and mail servers are reachable.
 2011-08-23:  01:25 PM:: Currently all traffic to the admin.ch servers is disrupted. This includes the websites and email connections. Mails you have sent are not lost, as our server keeps them until it can connect to the destination.

----

<<Anchor(2011-08-23-solaris-server-patching)>>

= Maintenance work on D-ITET's central IT infrastructure =
'''STATUS:''' {{attachment:green.gif}}

 2011-08-23:  CANCELLATION::  The maintenance is cancelled due to an unresolved error in the Solaris operating system introduced by the latest patch set.

 2011-08-23:  7:00 PM - 10:00 PM:: To keep our systems up to date with the newest software and security releases, we need to update our servers on a regular basis. For this reason we are going to patch and reboot some of our Solaris servers.

To prevent data corruption/loss please do the following:
 * save all open files
 * close all running applications
 * log out from all ITET systems (Linux, Solaris, Windows, Mac OS X)
 * shut down your personal PC/Desktop
 * do not establish any connection from outside

More details:
 * servers concerned: DRWHO, TARDIS, OENONE, SPITFIRE, YOSEMITE, MALINA
 * webpages hosted on these systems are NOT available
 * NO mail access, NO outgoing mails (incoming mails WON'T get lost)
----

<<Anchor(2011-08-19-switch-routing-loop)>>

= Routing problems on switch.ch network =
'''STATUS:''' {{attachment:green.gif}}
 2011-08-19:  05:00 PM:: Routing Problems solved.
 2011-08-19:  02:22 PM:: Because of a routing problem on the switch network, all traffic to http://www.virginia.edu and their mail server is disrupted.

----

<<Anchor(2011-08-08-drwho-problems)>>

= Outage drwho.ee.ethz.ch =
'''STATUS:''' {{attachment:green.gif}}

 2011-08-13 09:00:: Since around 1:00 we have been experiencing problems on one of our main servers, affecting most services.

 2011-08-13 11:00:: All services back to normal.

----

<<Anchor(2011-08-08-dns_outage)>>

= ETH wide DNS outage =
'''STATUS:''' {{attachment:green.gif}}

 2011-08-08 10:00:: ETH-wide DNS outage. All services that rely on name resolution do not work. Our mail server temporarily rejects all incoming messages with {{{450 4.3.2 Service currently unavailable}}}. Properly configured mail servers should retry delivery later, so no mail is lost.

 2011-08-08 10:20:: DNS works again. All Services up and running.


----

<<Anchor(2011-06-22-colombo04)>>

= colombo04 not available =
'''STATUS:''' {{attachment:green.gif}}

 2011-06-23 11:00:: The power supply of colombo04 has been replaced. colombo04 is up and ready.

 2011-06-22 07:00:: The power supply of colombo04 failed. We are waiting for a replacement from Oracle.

----

<<Anchor(2011-06-06-biwinas01)>>

= biwinas01 not available =
'''STATUS:''' {{attachment:green.gif}}

 2011-06-08 08:00:: biwinas01 is now back online.

 2011-06-07 15:00:: The hardware supplier returned the server today. They had to replace the following hardware:
 * 1 CPU
 * 1 power supply
 * 1 fan

We will test the server now and bring it back online as soon as possible.

 2011-06-06 15:00:: biwinas01 is currently out of order due to a hardware failure. It might take several days until biwinas01 is back online.

----
<<Anchor(2011-06-06-ifhlux11)>>

= ifhlux11 not available =
'''STATUS:''' {{attachment:green.gif}}

 2011-06-08 11:20:: ifhlux11 is now back online.

 2011-06-07 15:00:: We are in contact with the supplier. Unfortunately, the reason for the crash of ifhlux11 is not known yet.

 2011-06-06 15:00:: ifhlux11 is currently out of order due to a hardware failure. It might take several days until ifhlux11 is back online.

----
<<Anchor(2011-06-03-Cooling-System-Replacement)>>

= Coming soon: Outage of several IT services due to cooling system replacement =

'''STATUS:''' {{attachment:green.gif}}

 2011-06-03 15:00:: All servers are up and running again, with the exception of '''biwinas01''' and '''ifhlux11''' (see above).

'''UPCOMING: 2011-06-03 09:00 - 2011-06-06 09:00'''

The refrigeration supply in the ETZ building will be replaced. One server room
has to be shut down completely (ETZ/D/96.2; all compute servers of the institutes).
The other server room will get an emergency cooling system (ETZ/F/66). This will,
however, not allow the cooling of all currently running servers within that room.
Consequently we will shut down as many servers as possible.

'''Basic services like email, user homes and other network shares, printing and login to
Windows or Linux workstations are not expected to be affected by this construction work.'''
In general, if you are not familiar with the server names or service terms below, you
should not be affected at all.

'''These services will not be available during the construction work times:'''
 * Shut down 2011-06-03 at '''09:00'''. Back online 2011-06-06 at 09:00:
  * Remote-Desktop-Access to the Windows Terminal Servers
   * quinn
   * sivi
   * vega7
  * All institute NAS servers (no access via Samba, link in home, ssh, etc.)
   * biwinas01-03
   * hamam01
   * ifenas01
   * ibtnas01
  * Most IFE compute servers
   * bernstein
   * coltrane
   * dylan
   * haydn
   * marley
   * mozart
  * Publication databases on sato (IFA)

 * Shut down 2011-06-03 at '''16:00'''. Back online 2011-06-06 at 09:00:
  * Computer rooms for students
   * ETZ D 61.1
   * ETZ D 61.2
   * ETZ D 96
  * All institute compute servers
   * autserv*
   * bender*
   * biwilux*
   * casseri*
   * colombo*
   * IFE compute servers cash and elvis
   * ifhlux*
   * nariwork*
   * tik*x
   * vierzack*

----
<<Anchor(2011-05-26-Firewall-Problem)>>

= Firewall Problem: some Services not reachable =
'''STATUS:''' {{attachment:green.gif}}

'''2011-05-26 07:15 - 2011-05-26 16:00'''

Due to problems with the firewall hardware, some services of the D-ITET-servers were not reachable.

----
<<Anchor(2011-05-19-polaris-and-zaan-reboot)>>

= Maintenance reboot of Servers behind login.ee.ethz.ch and ipp2vpp.ee.ethz.ch (printing/licenses) =
'''STATUS:''' {{attachment:green.gif}}

On '''2011-05-19''' around '''7:00AM''' we will perform a maintenance reboot of polaris (serving login.ee.ethz.ch) and zaan (serving printing at D-ITET). During this downtime it will not be possible to print via samba or cups.

----
<<Anchor(2011-05-09-galen-reboot)>>

= Maintenance reboot of Server behind people.ee.ethz.ch =
'''STATUS:''' {{attachment:green.gif}}

'''2011-05-10 07:45'''

On '''2011-05-10''' around '''7:00AM''' we will perform a maintenance reboot of galen. galen is the Server serving your personal homepage on people.ee.ethz.ch.

----

<<Anchor(2011-05-02-horde-outage)>>

= Horde Webmail outage =
'''STATUS:''' {{attachment:green.gif}}

'''2011-05-02 23:49 - 2011-05-03 02:27'''

The [[https://email.ee.ethz.ch|Horde Webmail Client]] had to be taken down for security reasons. It was not clear whether someone had used a zero-day exploit or a phished account to send spam via our server. It turned out that the attackers used a phished account.

/!\ '''Please Remember: We at ISG.EE will NEVER ask you for your password.'''

----

<<Anchor(2011-03-01-short-smtp-outage)>>

= Short SMTP outage =
'''STATUS:''' {{attachment:green.gif}}

 2011-02-28 14:49 - 15:04:: As a result of an LDAP failure we had to stop the mail server for 15 minutes to prevent incoming emails from being rejected. While the main server was down, our two backup MX servers collected incoming emails.

----

<<Anchor(2011-02-28-zhadum-crash)>>

= Windows Terminal Server zhadum out-of-operation =
'''STATUS:''' {{attachment:green.gif}}

 2011-02-28 15:00:: Please use the server '''vega7''' from now on. If you had access to zhadum before, you should also be able to access vega7. '''Please use your NETHZ username and password to log in.'''

 2011-02-28 14:00:: The hardware of the departmental Windows terminal server '''zhadum''' is broken. A replacement server should be available soon...

----
<<Anchor(2011-02-22-jabba-upgrade)>>

= Upgrade of Backup Server JABBA =
'''STATUS:''' {{attachment:green.gif}}

 2011-02-22 6:15 PM:: The migration is finished and the new JABBA server is online.

 2011-02-22 7:00 AM - 7:00 PM (approx.):: We are going to upgrade the department's backup server JABBA. This upgrade includes changes in software as well as in hardware. During the upgrade, no restore requests for lost data can be fulfilled.

'''The complete backup infrastructure and all associated services are NOT available during the upgrade'''

----
<<Anchor(2011-02-07-agilent-license-server-change)>>

= Agilent ADS/ICCAP License Server Change =
'''STATUS:''' {{attachment:green.gif}}

 2011-02-07 10:00:: As announced a week ago, the license server for Agilent ADS and ICCAP software has changed. If your client software still uses the old license server information, please make sure you change the license server to '''lic-agilent.ee.ethz.ch'''. The port stays the same.

----
<<Anchor(2011-01-31-svn-firewall)>>

= Subversion server not reachable from outside ETH Zurich =
'''STATUS:''' {{attachment:green.gif}}

 2011-01-31 16:30:: Our subversion server is now available again for users outside of ETH Zurich.

 2011-01-31: 16:15:: We have been told that the central firewall rules cannot be changed right now due to other (completely unrelated) problems. At the moment it is unknown when this will be fixed.

 2011-01-31:  15:30:: Our subversion server svn.ee.ethz.ch is at the moment not accessible through the svn:// protocol from outside ETH Zurich due to a firewall configuration error. We are working on it...

----
<<Anchor(2011-01-26-mailser-problems-after-patching)>>

= Mailserver problems while patching =
'''STATUS:''' {{attachment:green.gif}}

 2011-01-26:  11:00 AM:: During the server upgrade last night, a patch temporarily misconfigured the mail server. The server accepted incoming mails but could not place them into the users' mailboxes. EVERY such mail created a bounce. Because of this, our statement that no mails get lost while the servers are being updated is no longer fully true. Incoming mails that bounced have to be resent by the sender.

----
<<Anchor(2010-12-14-solaris-server-patching)>>

= Solaris Server Patching =
'''STATUS:''' {{attachment:green.gif}}

 2011-01-25:  10:30 PM:: All servers and services are back online

 2011-01-25:  7:00 PM - 10:00 PM:: To keep our systems up to date with the newest software and security releases, we need to update our servers on a regular basis. For this reason we are going to patch and reboot some of our Solaris servers.

Servers concerned: '''drwho''', '''tardis''', '''oenone''', '''spitfire''', '''yosemite''', '''malina'''.

To prevent data corruption/loss please do the following:

 * All diskless clients (Linux): please log out and shut down all DL clients
 * All Windows systems: please log out and shut down all Windows clients
 * All Mac systems: please log out and shut down all Mac clients
 * All user homes: please log out from these servers
 * No NFS or SAMBA access to user homes
 * No mail access, no outgoing mails (incoming mails WON'T get lost)
 * Webpages hosted on these systems are unavailable

----
<<Anchor(2010-12-19-delayed-email-delivery)>>

= Delayed Email delivery =
'''STATUS:''' {{attachment:green.gif}}

 2011-01-19: 11:00 AM - 4:30 PM:: As a result of a faulty [[http://lurker.clamav.net/thread/20110119.125839.2b4ce0e1.en.html|ClamAV signature file]], every email that contained a PDF file was marked as infected. We had to fix the issue before we could resend the quarantined emails. No mail was lost and everything was resent.

 Update: 2011-01-20: 10:33 AM:: The ClamAV signatures have been updated and tested. Everything is working as it should.

----
<<Anchor(2010-12-07-reboot-yosemite)>>

= Maintenance Reboot of Solaris Server Yosemite =
'''STATUS:''' {{attachment:green.gif}}

 2010-12-07:  7:30 AM:: Server yosemite has been rebooted successfully. All services are available.

 2010-12-07:  7:00 AM:: Due to a shortage of available memory we are forced to reboot the solaris server yosemite. Downtime approx. 30 minutes.

----
<<Anchor(2010-11-26-servers-down)>>

= cooling water system outage on clusters =
'''STATUS:''' {{attachment:green.gif}}

 2010-11-26:  5:00 PM:: Host autserv02 is running as well. All hosts can be used.

 2010-11-26:  4:40 PM:: Server racks are cooled again. All hosts except autserv02 are running and can be used.

 2010-11-26:  4:00 PM:: Server racks are still down. --> Update follows at 5 PM or earlier

 2010-11-26:  3:10 PM:: One of the cooling water pumps installed in ETZ/D/96.2 is not working correctly. This forces some of the racks in this server room to shut down in order to protect the servers from thermal damage. '''Clusters from IFH, IBT, BIWI, TIK, IKT and VAW are affected.''' Facility management is working on solving the problem. --> Update follows at 4 PM

----
<<Anchor(2010-11-23-email-phishing-attack)>>

= email phishing attack =
'''STATUS:''' {{attachment:green.gif}}

 2010-11-23:: Yesterday between 18:20 and 19:50, about 320 phishing mails were sent to various users at D-ITET. The mails pretend to come from ''IT Support Group'' and carry the subject ''ISG.EE Webmail Alert''. They claim that ''spammers'' have compromised ''the'' ISG.EE Webmail Account and ask you to provide your '''Username, Password''' and an '''Alternate Email'''. Please remember that the ISG.EE team will '''NEVER ask you for your password!''' If you have nevertheless replied to this phishing mail, please contact us '''immediately''' at support@ee.ethz.ch so that we can plan the next steps with you to keep your account safe.

----
-Line 5:
+Line 344:
-== oenone home server crash ==

'''2010-11-17'''

At around 00:15 last night '''oenone''', one of our home servers, crashed. Users with homes on oenone were affected: '''BIWI''', '''VAW''', '''Collegium Helveticum''', '''Control''', '''IBT''', '''IKT'''. The server is now checking the filesystems and coming up again.
+= oenone home server crash =
'''STATUS:''' {{attachment:green.gif}}

 2010-11-17:: At around 00:15 last night '''oenone''', one of our home servers, crashed. Users with homes on oenone were affected: '''BIWI''', '''VAW''', '''Collegium Helveticum''', '''Control''', '''IBT''', '''IKT'''. The server is now checking the filesystems and coming up again.
-Line 13:
+Line 351:
-'''Update: 08:00''': oenone is now up and running again.

'''Update: 2010-11-18 07:30''' We opened a support case at Sun/Oracle for this server.

----
+ Update: 08:00:: oenone is now up and running again.
 Update: 2010-11-18 07:30:: We opened a support case at Sun/Oracle for this server.

----
-Line 21:
+Line 357:
-== cooling water system outage for some clusters ==

'''2010-11-16:'''

Last Friday evening one of the cooling water pumps installed in ETZ/D/96.2 stopped working correctly. This forced some of the racks in this server room to shut down in order to protect the servers from thermal damage. '''All clusters from IFH, IBT, BIWI, TIK, IKT and VAW were affected.'''
+= cooling water system outage for some clusters =
'''STATUS:''' {{attachment:green.gif}}

 2010-11-16:: Last Friday evening one of the cooling water pumps installed in ETZ/D/96.2 stopped working correctly. This forced some of the racks in this server room to shut down in order to protect the servers from thermal damage. '''All clusters from IFH, IBT, BIWI, TIK, IKT and VAW were affected.'''
-Line 32:
+Line 366:
-'''Update: 2010-11-17 08:25:''' The rack systems are now running with only one cooling water pump. The rack company has ordered a new pump.

----
+ Update: 2010-11-17 08:25:: The rack systems are now running with only one cooling water pump. The rack company has ordered a new pump.
 Update: 2010-11-18 16:00:: The broken pump is scheduled to be replaced on 25.11 or 26.11.

----
