Differences between revisions 1 and 39 (spanning 38 versions)
Revision 1 as of 2018-06-18 13:09:13
Size: 2381
Editor: davidsch
Comment:
Revision 39 as of 2021-04-20 06:47:35
Size: 3751
Editor: stroth
Comment:
Archiving personal data on the "lagoon" server

Overview and terms of use

The archive system is intended for long-term storage of private data (e.g. home data). The archive file space is not a working area like a standard user home, where you frequently create, read, write and delete files. Instead, use the archive space as a "quiet" location for backup copies or for data that is unlikely to be used for a longer period of time. A typical use is storing large output data of scientific applications that is no longer frequently used or modified after a deadline or project end, but that may be reused years later. If you need to work on previously archived data again, copy or move it back to your home (or another suitable target).

You must not use the archive system in any way that generates an avoidable, constant and high I/O load on the archive server over a longer period of time. Apart from moving or copying data to or from your personal archive, you should therefore not, for instance, configure applications to use your archive directory as a data source or write target, except for backup purposes.

The data stored in your personal archive folder is strictly private and cannot be shared with other users. Finally, note that the archive folder itself is not backed up: the data is well protected by redundancy against loss from hardware failure, but a file that you accidentally delete in the archive cannot be restored.

Usage

Obtaining access to the archive

The archive is not automatically accessible to every user. Instead, access must be requested from ISG.EE (write an e-mail to support@ee.ethz.ch). Only D-ITET users (staff, project account owners) may obtain archive space; students and short-term guests are excluded. The first 5 GB per user are free of charge.

How to access the archive

Once access has been granted by ISG.EE, you may access the archive in the ways shown below:

  • NFS (preferred on Linux): /itet-stor/USERNAME/lagoon_archive or /srv/beegfs01/archive/USERNAME

  • Samba or MS Windows mapped network drive: \\itet-stor\USERNAME\lagoon_archive or S:\lagoon_archive
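
Before copying large amounts of data, it can help to confirm that the archive directory is actually reachable from your machine. The following is a minimal sketch, not an official ISG.EE tool: the function name check_archive_dir is hypothetical, USERNAME is the same placeholder as in the paths above, and the free space reported by df refers to the whole filesystem, which may differ from the quota that applies to you.

```shell
#!/bin/sh
# Sketch only: check that an archive directory exists and is writable.
# check_archive_dir is a hypothetical helper name, not part of any ISG.EE tooling.

check_archive_dir() {
    archive_dir="$1"
    if [ ! -d "$archive_dir" ]; then
        # Typical causes: access not yet granted, or the share is not mounted.
        echo "not mounted or no access: $archive_dir" >&2
        return 1
    fi
    if [ ! -w "$archive_dir" ]; then
        echo "no write permission: $archive_dir" >&2
        return 2
    fi
    # Print size, used and available space of the filesystem holding the directory.
    df -h "$archive_dir"
}
```

A call could look like `check_archive_dir /itet-stor/USERNAME/lagoon_archive`; a non-zero exit status means the directory is missing or not writable.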

Notes, read carefully

Please follow these rules when storing data in the lagoon archive:

  • Pack small files (e.g. only a few kB per file) into larger archive files (e.g. zip archives or tarballs (.tar.gz)).

  • Do not store small files in the lagoon archive filesystem. In particular, avoid storing large numbers (thousands to millions) of such small files.

  • Take care when you delete files. You are working on the backup system and there is no additional backup of that data.

  • The quota applied to the files in the archive is shared among all users of the same laboratory. If one user consumes all the laboratory's available archive space, other users won't have enough space to add their own data to the archive. If you think that the amount of data you plan to add to the archive is extraordinarily large, please contact your local administrator or ISG.EE support first.
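
To judge whether a directory should be packed before it goes into the archive, you can count its small files and check its total size first. The sketch below is an illustration under assumptions: the helper names count_small_files and dir_size_kb are hypothetical, and the default threshold of 100 KiB is an arbitrary example value, not a limit defined by ISG.EE.

```shell
#!/bin/sh
# Sketch only: inspect a directory before archiving it.
# Helper names and the 100 KiB default are illustrative assumptions.

# count_small_files DIR [MAX_KB]
# Prints the number of regular files below DIR that are smaller than MAX_KB KiB.
# File names containing newlines would be counted more than once.
count_small_files() {
    max_bytes=$(( ${2:-100} * 1024 ))
    # "-size -Nc" matches files whose size is less than N bytes.
    find "$1" -type f -size "-${max_bytes}c" | wc -l | tr -d ' '
}

# dir_size_kb DIR
# Prints the disk usage of DIR in units of 1024 bytes.
dir_size_kb() {
    du -sk "$1" | cut -f1
}
```

If `count_small_files /phd/thesis/directory` reports thousands of files, packing the directory into a single tarball or zip archive before copying it is the safer choice; `dir_size_kb` gives a rough idea of how much of the shared laboratory quota the data will consume.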

Usage examples

  • Create a new archive file (tarball) on the archive system:

    tar cvfz /itet-stor/USERNAME/lagoon_archive/phd_thesis_files.tar.gz /phd/thesis/directory

  • Alternatively you may use zip, for instance:

    zip -r /itet-stor/USERNAME/lagoon_archive/phd_thesis_files.zip /phd/thesis/directory

  • It is no longer necessary to split large archive files (e.g. 50 GB+) into smaller chunks.
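
Because deleted or damaged files in the archive cannot be restored, it is worth checking a freshly written tarball before removing the original data, and unpacking into a working directory rather than working inside the archive. The following sketch assumes gzip-compressed tarballs as created above; the helper names verify_tarball and restore_tarball are hypothetical.

```shell
#!/bin/sh
# Sketch only: verify a tarball and restore it outside the archive.
# Helper names are illustrative, not part of any ISG.EE tooling.

# verify_tarball FILE
# Succeeds only if the gzip stream and the tar member list can be read in full.
verify_tarball() {
    tar -tzf "$1" > /dev/null
}

# restore_tarball FILE TARGET_DIR
# Unpacks FILE below TARGET_DIR, creating the directory if necessary.
restore_tarball() {
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}
```

For example, `verify_tarball /itet-stor/USERNAME/lagoon_archive/phd_thesis_files.tar.gz` followed by `restore_tarball /itet-stor/USERNAME/lagoon_archive/phd_thesis_files.tar.gz "$HOME/restored"` copies the data back to your home. GNU tar strips the leading "/" from member names when creating an archive, so the files should reappear under `$HOME/restored/phd/thesis/directory`.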


Services/DataArchivingNG (last edited 2023-10-16 11:33:32 by alders)