General Information

Status-Key

Status/green.gif: Resolved
Status/orange.gif: Still working but with some errors
Status/red.gif: Pending


Maintenance work on D-ITET's home servers TARDIS and OENONE

STATUS: Status/green.gif

2011-12-17: 11:00 AM
Successful reboot of TARDIS and OENONE.
2011-12-17: 10:00 AM - 11:00 AM
During the last installation of Oracle patches, a bug was introduced in the automount daemon that causes high CPU load on systems with a large number of auto-mounted file systems. We investigated this problem together with Oracle. A bug fix is now available, but it requires a server reboot. For this reason we are going to reboot TARDIS and OENONE.

To prevent data corruption/loss please do the following:

More details:


oenone home server crash

STATUS: Status/green.gif

2011-11-29

At around 21:30, oenone, one of our home servers, crashed. Users with home directories on oenone were affected: BIWI, VAW, Collegium Helveticum, Control, ISI, IKT. The system was up again at 23:00.

We apologize for the inconvenience and are investigating the problem.


Power outage affecting compute clusters

STATUS: Status/green.gif

2011-11-14
Due to a power outage some racks containing our compute clusters went down.
2011-11-14 08:00
All compute clusters should be up and running again.


Maintenance work on D-ITET's central IT infrastructure

STATUS: Status/green.gif

2011-11-12: 05:50 PM
Final reboot of TARDIS completed successfully.
2011-11-12: 04:00 PM
The final reboot of TARDIS is still outstanding due to a broken disk in TARDIS's internal RAID (boot device). The broken disk has been replaced successfully, but the RAID is still syncing. The reboot is postponed until the sync process has finished and the reboot can be carried out safely. Please be prepared for a short interruption today or tomorrow.
2011-11-12: 04:00 PM
All systems are back online.
2011-11-12: 10:00 AM - 04:00 PM
To keep our systems up to date with the newest software and security releases, we need to update our servers on a regular basis. For this reason we are going to upgrade specific storage software and install the latest Oracle patches on our main servers. The servers will be rebooted multiple times during this maintenance.

To prevent data corruption/loss please do the following:

More details:


Routing problems on switch.ch network

STATUS: Status/green.gif

2011-08-19: 05:00 PM
Routing Problems solved.
2011-08-19: 02:22 PM

Because of a routing problem on the SWITCH network, all traffic to http://www.virginia.edu and their mail server is disrupted.


Migration of cronbox.ee.ethz.ch to Debian Squeeze

STATUS: Status/green.gif

2011-10-14: 07:30

We plan to migrate the server behind cronbox.ee.ethz.ch to the new version of the Debian operating system. Expected downtime is from 07:30 up to around 10:00. The affected services are cron and ssh logins to the machine.

2011-10-14: 11:00
Migration completed.


Matlab License Server down

STATUS: Status/green.gif

2011-09-30: 07:00

Currently the server from Informatikdienste that provides the license service is down. You can track the current status on their ID-Service Status page under Lizenzen -> 1965@vnava.

2011-09-30: 08:45
License server from Informatikdienste is now up again.
2011-09-30: 15:30
The license server lic-matlab.ethz.ch is unavailable again. Because it is outside our control, we cannot estimate when it will work again.
2011-09-30: 16:00
lic-matlab.ethz.ch is available again.


NFS outage on oenone

STATUS: Status/green.gif

2011-08-30: 02:00 AM - 08:30 AM

At around 02:00 the NFS services on oenone crashed. Users with home directories on oenone were affected: BIWI, VAW, Collegium Helveticum, Control, IBT, IKT. This crash also affected all web services that depend on oenone, as well as the mail server (at least for users whose home directory is on oenone). No mails were lost, as we put them into a hold queue.

Update: 08:30
oenone is now up and running again. All mails in the hold queue are now gradually being delivered. The web services are all available again as well.

We apologize for the inconvenience and are investigating the problem.


Connection problems to all admin.ch servers

STATUS: Status/green.gif

2011-08-23: 03:00 PM
All admin.ch websites and mail servers are reachable again.
2011-08-23: 01:25 PM
Currently all traffic to the admin.ch servers is disrupted. This includes the websites as well as email connections. Mails you have sent are not lost, as our server keeps them until it can connect to the destination.


Maintenance work on D-ITET's central IT infrastructure

STATUS: Status/green.gif

2011-08-23: CANCELLATION
This maintenance has been cancelled due to an unresolved error in the Solaris operating system introduced by the latest patch set.
2011-08-23: 7:00 PM - 10:00 PM
To keep our systems up to date with the newest software and security releases, we need to update our servers on a regular basis. For this reason we are going to patch and reboot some of our Solaris servers.

To prevent data corruption/loss please do the following:

More details:


Outage drwho.ee.ethz.ch

STATUS: Status/green.gif

2011-08-13 09:00
Since around 01:00 we have been experiencing problems on one of our main servers, affecting most of the services.
2011-08-13 11:00
All services back to normal.


ETH wide DNS outage

STATUS: Status/green.gif

2011-08-08 10:00

ETH-wide DNS outage. No service that relies on name resolution is working. Our mail server rejects all incoming messages with a "450 4.3.2 Service currently unavailable" reply. Properly configured mail servers should retry the delivery later, so no mail is lost.

2011-08-08 10:20
DNS works again. All Services up and running.
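
As a rough illustration of why no mail is lost in this situation: under the SMTP standard, a reply code that starts with 4 (such as the 450 above) signals a transient failure that the sending server should retry later, while a code that starts with 5 signals a permanent failure. The following is a minimal Python sketch of that distinction. The function names are hypothetical and it does not describe the actual configuration of our mail server.

```python
def classify_smtp_reply(reply_line: str) -> str:
    """Classify an SMTP reply line by the first digit of its 3-digit reply code."""
    code = reply_line.strip()[:3]
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not an SMTP reply line: {reply_line!r}")
    classes = {
        "2": "success",       # requested action completed
        "3": "intermediate",  # more input expected from the client
        "4": "transient",     # temporary failure, sender should retry later
        "5": "permanent",     # permanent failure, sender should not retry
    }
    try:
        return classes[code[0]]
    except KeyError:
        raise ValueError(f"unknown SMTP reply class: {code}") from None


def should_retry(reply_line: str) -> bool:
    """Return True if the sending server should queue the message and retry."""
    return classify_smtp_reply(reply_line) == "transient"


if __name__ == "__main__":
    print(should_retry("450 4.3.2 Service currently unavailable"))
```

A sending server that follows this rule keeps the message in its queue after a 450 reply and tries again once name resolution works.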


colombo04 not available

STATUS: Status/green.gif

2011-06-23 11:00
Power supply of colombo04 replaced. colombo04 is up and running.
2011-06-22 07:00
The power supply of colombo04 broke. We are now waiting for a replacement from Oracle.


biwinas01 not available

STATUS: Status/green.gif

2011-06-08 08:00
biwinas01 is now back online.
2011-06-07 15:00
The hardware supplier returned the server today. They had to replace the following hardware:
  • 1 CPU
  • 1 power supply
  • 1 fan
We will test the server now and bring it back online as soon as possible.
2011-06-06 15:00
biwinas01 is currently out of order due to a hardware failure. It might take several days until biwinas01 is back online.


ifhlux11 not available

STATUS: Status/green.gif

2011-06-08 11:20
ifhlux11 is now back online.
2011-06-07 15:00
We are in contact with the supplier. Unfortunately, the reason for the crash of ifhlux11 is not known yet.
2011-06-06 15:00
ifhlux11 is currently out of order due to a hardware failure. It might take several days until ifhlux11 is back online.


Coming soon: Outage of several IT services due to cooling system replacement

STATUS: Status/green.gif

2011-06-03 15:00

All servers are up and running again, with the exception of biwinas01 and ifhlux11 (see above).

UPCOMING: 2011-06-03 09:00 - 2011-06-06 09:00

The refrigeration supply in the ETZ building will be replaced. One server room (ETZ/D/96.2; all compute servers of the institutes) has to be shut down completely. The other server room (ETZ/F/66) will get an emergency cooling system. This will, however, not be sufficient to cool all servers currently running in that room. Consequently, we will shut down as many servers as possible.

Basic services like email, user homes and other network shares, printing and login to Windows or Linux workstations are not expected to be affected by this construction work. In general, if you are not familiar with the server names or service terms below, you should not be affected at all.

These services will not be available during the construction work times:


Firewall problem: some services not reachable

STATUS: Status/green.gif

2011-05-26 07:15 - 2011-05-26 16:00

Due to problems with the firewall hardware, some services on the D-ITET servers were not reachable.


Maintenance reboot of servers behind login.ee.ethz.ch and ipp2vpp.ee.ethz.ch (printing/licenses)

STATUS: Status/green.gif

On 2011-05-19 at around 7:00 AM we will perform a maintenance reboot of polaris (serving login.ee.ethz.ch) and zaan (serving printing at D-ITET). During this downtime it will not be possible to print via Samba or CUPS.


Maintenance reboot of server behind people.ee.ethz.ch

STATUS: Status/green.gif

2011-05-10 07:45

On 2011-05-10 at around 7:00 AM we will perform a maintenance reboot of galen. galen is the server that hosts your personal homepage on people.ee.ethz.ch.


Horde Webmail outage

STATUS: Status/green.gif

2011-05-02 23:49 - 2011-05-03 02:27

The Horde Webmail client had to be taken down for security reasons. It was not clear whether someone had used a zero-day exploit or a phished account to send spam via our server. It turned out that the attackers used a phished account.

/!\ Please remember: We at ISG.EE will NEVER ask you for your password.


Short SMTP outage

STATUS: Status/green.gif

2011-02-28 14:49 - 15:04
As a result of an LDAP failure we had to stop the mail server for 15 minutes to prevent incoming emails from being rejected. While the main server was down, our two backup MX servers collected incoming emails.
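
The fallback to the backup MX servers can be sketched roughly as follows: a sending server tries the mail exchangers of a domain in order of ascending preference value and moves on to the next one when a host is unreachable. This is a simplified Python sketch. The function name, host names and preference values are hypothetical placeholders, not our real DNS records.

```python
from typing import Iterable, Optional, Set, Tuple


def pick_mail_exchanger(
    mx_records: Iterable[Tuple[int, str]],
    unreachable: Set[str],
) -> Optional[str]:
    """Return the reachable MX host with the lowest preference value, or None.

    mx_records holds (preference, host) pairs as published in DNS.
    Simplification: hosts with equal preference are tried in alphabetical
    order here, whereas real senders usually randomize among them.
    """
    for _preference, host in sorted(mx_records):
        if host not in unreachable:
            return host
    return None


if __name__ == "__main__":
    # Hypothetical records: one main server and two backups.
    records = [
        (20, "backup1.example.org"),
        (10, "main.example.org"),
        (30, "backup2.example.org"),
    ]
    # With the main server stopped, delivery goes to the first backup.
    print(pick_mail_exchanger(records, {"main.example.org"}))
```

The backup hosts accept and queue the mail, then hand it over once the main server accepts connections again.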


Windows Terminal Server zhadum out of operation

STATUS: Status/green.gif

2011-02-28 15:00

Please use the server vega7 from now on. If you had access to zhadum before, you should also be able to access vega7. Please use your NETHZ username and password to log in.

2011-02-28 14:00

The hardware of the departmental Windows terminal server zhadum is broken. A replacement server should be available soon.


Upgrade of Backup Server JABBA

STATUS: Status/green.gif

2011-02-22 6:15 PM
The migration is finished and the new JABBA server is online.
2011-02-22 7:00 AM - 7:00 PM (approx.)
We are going to upgrade the department's backup server JABBA. This upgrade includes changes in software as well as in hardware. During the upgrade, no restore requests for lost data can be fulfilled.

The complete backup infrastructure and all associated services are NOT available during the upgrade.


Agilent ADS/ICCAP License Server Change

STATUS: Status/green.gif

2011-02-07 10:00

As announced a week ago, the license server for Agilent ADS and ICCAP software has changed. If your client software still uses the old license server information, please make sure you change the license server to lic-agilent.ee.ethz.ch. The port stays the same.


Subversion server not reachable from outside ETH Zurich

STATUS: Status/green.gif

2011-01-31: 16:30
Our Subversion server is now available again for users outside of ETH Zurich.
2011-01-31: 16:15
We have been told that the central firewall rules cannot be changed right now due to other (completely unrelated) problems. At the moment it is unknown when this will be fixed.
2011-01-31: 15:30
Our Subversion server svn.ee.ethz.ch is currently not accessible through the svn:// protocol from outside ETH Zurich due to a firewall configuration error. We are working on it.


Mail server problems while patching

STATUS: Status/green.gif

2011-01-26: 11:00 AM
During the server upgrade last night, a patch temporarily misconfigured the mail server. The server accepted incoming mails but could not deliver them to the users' mailboxes. EVERY such mail created a bounce. Because of this, our statement that no mails get lost while updating the servers is no longer fully true. Incoming mails that bounced have to be resent by the sender.


Solaris Server Patching

STATUS: Status/green.gif

2011-01-25: 10:30 PM
All servers and services are back online.
2011-01-25: 7:00 PM - 10:00 PM
To keep our systems up to date with the newest software and security releases, we need to update our servers on a regular basis. For this reason we are going to patch and reboot some of our Solaris servers.

Servers concerned: drwho, tardis, oenone, spitfire, yosemite, malina.

To prevent data corruption/loss please do the following:


Delayed email delivery

STATUS: Status/green.gif

2011-01-19: 11:00 AM - 4:30 PM

As a result of a faulty ClamAV signature file, every email that contained a PDF file was marked as infected. Before we could resend the quarantined emails we had to fix the issue. No mail was lost and everything was resent.

Update: 2011-01-20: 10:33 AM
ClamAV signatures have been updated and tested. Everything is working as it should.


Status/Archive/2011 (last edited 2016-01-06 12:48:20 by alders)