Differences between revisions 20 and 29 (spanning 9 versions)
Revision 20 as of 2019-05-13 15:26:46
Size: 12075
Editor: stroth
Comment:
Revision 29 as of 2019-05-14 11:56:43
Size: 9549
Editor: stroth
Comment:
Line 3: Line 3:
= Set up a python development environment for data science =
The following procedure shows how to set up a typical python development environment for master students in data sciences. It is installed with the [[https://conda.io/|conda]] package manager and will contain [[https://pytorch.org/|pytorch]] and [[https://www.tensorflow.org/|tensorflow]] including non-python dependencies like [[https://developer.nvidia.com/cuda-toolkit|CUDA toolkit]] and the [[https://developer.nvidia.com/cudnn|cuDNN library]].

== Install conda ==
 * Time to install: ~1 minute
 * Space required: ~350M

To provide conda, the minimal anaconda distribution '''miniconda''' can be installed and configured for the D-ITET infrastructure with the following bash script:
= Setting up a personal python development infrastructure =
This page shows how to [[#Installing_conda|set up a personal python development infrastructure]], how to [[#Using_conda|use it]], how to [[#Maintenance|maintain it]] and [[#Backup|make backups of your project environments]].

Some [[#Installation_examples|examples for software installation]] in the field of data sciences are provided.

The infrastructure is driven by the [[https://conda.io/|conda]] package manager which accesses the [[https://repo.continuum.io/pkgs/|Anaconda repositories]] to install software.

After familiarizing yourself with `conda`, read [[Programming/Languages/GPUCPU|further information]] about available platforms on which to use your infrastructure and particularities of the software packages involved.

== Installing conda ==
 * Time to install: ~1.5 minutes
 * Space required: ~370M

To provide `conda`, the minimal anaconda distribution '''miniconda''' can be installed and configured for the D-ITET infrastructure with the following bash script:
Line 39: Line 45:
# Update conda and conda base environment
conda update conda --yes
conda update -n 'base' --update-all --yes
Line 53: Line 63:
{{{#!highlight bash numbers=disable {{{
Line 57: Line 67:
{{{#!highlight bash numbers=disable {{{
Line 62: Line 72:
== Conda storage locations == == conda storage locations ==
Line 69: Line 79:
The purpose of this configuration is to store reproducible and space consuming data outside of your `$HOME` to prevent using up your quota.

== Using Conda ==

The purpose of this configuration is to store data according to its importance and prevent using up your quota. If you intend to deviate from the default configuration, consult the [[Services/StorageOverview|storage overview]] to choose your storage locations adequately and follow these recommendations:

 * Reproducible, space-consuming data like environments and the package cache belongs in storage class ''SCRATCH''.
 * Your code, for example, is not reproducible and should therefore be backed up regularly. Since it consumes little space, its ideal location is your `$HOME`, checked into your [[https://git.ee.ethz.ch/users/sign_in|git repository]].
 * Data generated over a long time period which would be time consuming to recreate from scratch and is in use regularly should be stored in the storage class ''PROJECT''.
 * Data generated as a final result which is not needed for ongoing work and needs to be available for later generations should be stored in the storage class ''ARCHIVE''.

== Using conda ==
Line 75: Line 91:
This means the dependency installed in an environment with both packages together might have a lower version number than in environments separating both packages. This means the dependency installed in an environment with both packages together might have a lower version number than in environments seperating both packages.
Line 81: Line 97:
=== Common commands ===
The official [[https://conda.io/projects/conda/en/latest/user-guide/cheatsheet.html|cheat sheet]] is a compact summary of common commands to get you started. An abbreviated list is shown here:

==== Environments ====
===== Create an environment called "my_env" with packages "package1" and "package2" installed =====
{{{#!highlight bash numbers=disable
The official [[https://conda.io/projects/conda/en/latest/user-guide/cheatsheet.html|cheat sheet]] is a compact summary of common commands. An abbreviated list to get you started is shown below.

=== Environments ===
The name of the automatically installed default environment is `base`.
==== Create an environment called "my_env" with packages "package1" and "package2" installed ====
{{{
Line 89: Line 105:
===== Activate the environment called "my_env" =====
{{{#!highlight bash numbers=disable
==== Activate the environment called "my_env" ====
{{{
Line 93: Line 109:
===== Deactivate the current environment =====
{{{#!highlight bash numbers=disable
==== Deactivate the current environment ====
{{{
Line 97: Line 113:
===== List available environments =====
{{{#!highlight bash numbers=disable
==== List available environments ====
{{{
Line 101: Line 117:
===== Remove the environment called "my_env" =====
{{{#!highlight bash numbers=disable
==== Remove the environment called "my_env" ====
{{{
Line 105: Line 121:
===== Create a cloned environment named "cloned_env" from "original_env" =====
{{{#!highlight bash numbers=disable
==== Create a cloned environment named "cloned_env" from "original_env" ====
{{{
Line 109: Line 125:
===== Export the active environment definition to the file "my_env.yml" =====
{{{#!highlight bash numbers=disable
==== Export the active environment definition to the file "my_env.yml" ====
{{{
Line 113: Line 129:
===== Recreate a previously exported environment =====
{{{#!highlight bash numbers=disable
==== Recreate a previously exported environment ====
{{{
Line 117: Line 133:
===== Creates the environment "my_env" in the specified location ===== ==== Creates the environment "my_env" in the specified location ====
Line 119: Line 135:
{{{#!highlight bash numbers=disable {{{
Line 122: Line 138:

==== Packages ====
===== Search for a package named "package1" =====
{{{#!highlight bash numbers=disable
==== Update an active environment ====
Make sure to create a [[#Backup|backup]] by exporting the active environment before updating.
{{{
conda update --update-all
}}}

=== Packages ===
==== Search for a package named "package1" ====
{{{
Line 128: Line 149:
===== Install the package named "package1" in the active environment =====
{{{#!highlight bash numbers=disable
==== Install the package named "package1" in the active environment ====
{{{
Line 132: Line 153:
===== List packages installed in the active environment =====
{{{#!highlight bash numbers=disable
==== List packages installed in the active environment ====
{{{
Line 137: Line 158:
==== Maintenance ====
===== Remove index cache, lock files, unused cache packages, and tarballs =====
{{{#!highlight bash numbers=disable
=== Maintenance ===
The cache of installed packages will consume a lot of space over time. The default location set for the package cache resides on [[Services/NetScratch|NetScratch]]; the terms of use for this storage area imply that you clean your cache regularly.
==== Remove index cache, lock files, unused cache packages, and tarballs ====
{{{
Line 142: Line 164:
The name of the default environment is `base`.

=== Installation examples ===
 * Time to install: ~5 minutes per environment
 * Space required: ~2G per environment, ~1.5G packages before cleanup, ~130M packages after cleanup

The following examples show how to install a specific `python` version, `pytorch` and `tensorflow` in an environment intended to be run either on a Linux managed client, a GPU cluster or a Linux machine without an NVIDIA GPU. The CUDA toolkit versions in the examples are derived from the version of the NVIDIA driver available on a given platform, which always has to be determined before installing an environment. For details see [[#NVIDIA-CUDA-Toolkit|the explanation below]].

For `conda`, `python` itself is just a software package like any other. Depending on all installation parameters, `conda` decides which `python` version works for all other packages. This means different environments may contain differing versions of `python`.

==== A specific python version ====
{{{#!highlight bash numbers=disable
conda create --name py37 python=3.7.3
}}}
==== pytorch on GPU cluster: CUDA toolkit 10 ====
{{{#!highlight bash numbers=disable
conda create --name pytcu10 pytorch torchvision cudatoolkit=10.0 --channel pytorch
}}}
==== pytorch on managed Linux client: CUDA toolkit 9 ====
{{{#!highlight bash numbers=disable
conda create --name pytcu9 pytorch torchvision cudatoolkit=9.0 --channel pytorch
}}}
==== pytorch on Linux machine without NVIDIA GPU ====
{{{#!highlight bash numbers=disable
conda create --name pytcpu pytorch-cpu torchvision-cpu --channel pytorch
}}}
==== tensorflow on GPU cluster: CUDA toolkit 10 ====
{{{#!highlight bash numbers=disable
conda create --name tencu10 tensorflow-gpu cudatoolkit=10.0
}}}
==== tensorflow on managed Linux client: CUDA toolkit 9 ====
{{{#!highlight bash numbers=disable
conda create --name tencu9 tensorflow-gpu cudatoolkit=9.0
}}}
==== tensorflow on Linux machine without NVIDIA GPU ====
{{{#!highlight bash numbers=disable
conda create --name tencpu tensorflow
}}}

A [[https://software.intel.com/en-us/articles/intel-optimization-for-tensorflow-installation-guide#Anaconda_Intel|CPU version of tensorflow optimized for Intel CPUs]] exists, which might be a tempting choice. Be aware that this version of `tensorflow` and installed dependencies will differ from versions installed from the default channel in the examples above.

As shown in the examples above, environments can be tailored to a platform for optimal performance. Make sure you set up environments for each platform you intend to use. The list of packages installed and their version numbers should be identical in all environments if you follow the examples. An identical list of versions will make sure your environments behave identically on all platforms.



=== Maintenance ===
The cache of installed packages will consume a lot of space over time. The default location set for the package cache resides on [[Services/NetScratch|NetScratch]]; the terms of use for this storage area imply that you [[#Remove_index_cache,_lock_files,_unused_cache_packages,_and_tarballs|clean up the cache]] regularly.
==== Update conda without any active environment ====
{{{
conda update conda
}}}
Line 191: Line 170:
Regular backups of environments are recommended to be able to reproduce an environment used at a certain point in time. Before installing or updating an environment, a backup should always be created in order to be able to revert the changes. Regular backups are recommended to be able to reproduce an environment used at a certain point in time. Before installing or updating an environment, a backup should always be created in order to be able to revert the changes.   It is not necessary to backup environments themselves, it is sufficient to backup the files of environment exports to recreate them exactly.
Line 209: Line 190:
=== Testing installations ===

==== Testing pytorch ====
To verify the successful installation of `pytorch` run the following python code in your python interpreter:
{{{#!highlight python numbers=disable
from __future__ import print_function
import torch
x = torch.rand(5, 3)
print(x)
}}}
The output should be similar to the following:
{{{
tensor([[0.4813, 0.8839, 0.1568],
        [0.0485, 0.9338, 0.1582],
        [0.1453, 0.5322, 0.8509],
        [0.2104, 0.4154, 0.9658],
        [0.6050, 0.9571, 0.3570]])
}}}
To verify CUDA availability in `pytorch`, run the following code:
{{{#!highlight python numbers=disable
import torch
torch.cuda.is_available()
}}}
It should return ''True''.

==== Testing TensorFlow ====
The following code prints information about your `tensorflow` installation:
{{{#!highlight python numbers=disable
import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
}}}
Lines containing `device: XLA_` show which CPU/GPU devices are available.

A line containing `cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version` means the NVIDIA driver installed on the system where you run the code is not compatible with the CUDA toolkit installed in the environment you run the code from.

== NVIDIA CUDA Toolkit ==
Which version of the CUDA toolkit is usable depends on the version of the NVIDIA driver installed on the machine on which you run your programs. The version can be checked by issuing the command `nvidia-smi` and looking for the number next to the text ''Driver Version''.

The CUDA compatibility document by NVIDIA shows a [[https://docs.nvidia.com/deploy/cuda-compatibility/index.html#binary-compatibility__table-toolkit-driver|dependency matrix]] matching driver and toolkit versions.
=== Installation examples ===
/!\ Time to install and space required need to be recounted

For `conda`, `python` itself is just a software package like any other. Depending on all installation parameters, `conda` decides which `python` version works for all other packages. This means different environments may contain differing versions of `python`.

==== Creating an environment with a specific python version ====
{{{
conda create --name py37 python=3.7.3
}}}
==== Creating an environment with the GPU version of pytorch and CUDA toolkit 10 ====
 * Time to install: ~5 minutes
 * Space required: ~2G, ~1.5G packages before cleanup, ~130M packages after cleanup
{{{
conda create --name pytcu10 pytorch torchvision cudatoolkit=10.0 --channel pytorch
}}}
==== Creating an environment with the GPU version of tensorflow and CUDA toolkit 10 ====
 * Time to install: ~5 minutes
 * Space required: ~2G, ~1.5G packages before cleanup, ~130M packages after cleanup
{{{
conda create --name tencu10 tensorflow-gpu cudatoolkit=10.0
}}}

Setting up a personal python development infrastructure

This page shows how to set up a personal python development infrastructure, how to use it, how to maintain it and make backups of your project environments.

Some examples for software installation in the field of data sciences are provided.

The infrastructure is driven by the conda package manager which accesses the Anaconda repositories to install software.

After familiarizing yourself with conda, read further information about available platforms on which to use your infrastructure and particularities of the software packages involved.

Installing conda

  • Time to install: ~1.5 minutes
  • Space required: ~370M

To provide conda, the minimal anaconda distribution miniconda can be installed and configured for the D-ITET infrastructure with the following bash script:

#!/bin/bash

# Locations to store environments
# net_scratch is used as default, local scratch needs to be chosen explicitly
LOCAL_SCRATCH="/scratch/${USER}"
NET_SCRATCH="/itet-stor/${USER}/net_scratch"

# Installer of choice for conda
CONDA_INSTALLER_URL='https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh'

# Unset pre-existing python paths
[[ -z ${PYTHONPATH} ]] || unset PYTHONPATH

# Download latest version of miniconda and install it
wget -O miniconda.sh "${CONDA_INSTALLER_URL}" \
    && chmod +x miniconda.sh \
    && ./miniconda.sh -b -p "${NET_SCRATCH}/conda" \
    && rm ./miniconda.sh

# Configure conda
eval "$(${NET_SCRATCH}/conda/bin/conda shell.bash hook)"
conda config --add pkgs_dirs "${NET_SCRATCH}/conda_pkgs" --system
conda config --add envs_dirs "${LOCAL_SCRATCH}/conda_envs" --system
conda config --add envs_dirs "${NET_SCRATCH}/conda_envs" --system
conda config --set auto_activate_base false
conda deactivate

# Update conda and conda base environment
conda update conda --yes
conda update -n 'base' --update-all --yes

# Show how to initialize conda
echo
echo 'Initialize conda immediately:'
echo "eval \"\$(${NET_SCRATCH}/conda/bin/conda shell.bash hook)\""
echo
echo 'Automatically initialize conda for future shell sessions:'
echo "echo 'eval \"\$(${NET_SCRATCH}/conda/bin/conda shell.bash hook)\"' >> ${HOME}/.bashrc"

# Show how to remove conda
echo
echo 'Completely remove conda:'
echo "rm -r ${NET_SCRATCH}/conda ${NET_SCRATCH}/conda_pkgs ${NET_SCRATCH}/conda_envs ${LOCAL_SCRATCH}/conda_envs ${HOME}/.conda"

Save this script as install_conda.sh, make it executable with

chmod +x install_conda.sh

and execute the script by issuing

./install_conda.sh

Choose your preferred method of initializing conda as recommended by the script.
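
If the automatic initialization is chosen, running the suggested command more than once appends the same line to .bashrc repeatedly. The following is a minimal sketch of an idempotent variant. The function name add_conda_hook and its arguments are hypothetical and not part of the install script above.

```shell
# Hypothetical helper: append the conda shell hook to a bashrc file only once.
# $1: path of the conda installation
# $2: bashrc file to modify (defaults to ~/.bashrc)
add_conda_hook() {
    conda_dir="$1"
    rc_file="${2:-${HOME}/.bashrc}"
    hook_line="eval \"\$(${conda_dir}/bin/conda shell.bash hook)\""
    # grep -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq -e "${hook_line}" "${rc_file}" 2>/dev/null; then
        printf '%s\n' "${hook_line}" >> "${rc_file}"
    fi
}

# Example call, assuming the default installation location:
# add_conda_hook "/itet-stor/${USER}/net_scratch/conda"
```

Calling the function a second time leaves the file unchanged, because the hook line is already present.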

conda storage locations

The directories listed in the command for complete conda removal contain the following data:

/itet-stor/$USER/net_scratch/conda

The miniconda installation

/itet-stor/$USER/net_scratch/conda_pkgs

Downloaded packages

/itet-stor/$USER/net_scratch/conda_envs

Virtual environments on NAS

/scratch/$USER/conda_envs

Virtual environments on local disk

/home/$USER/.conda

Personal conda configuration

The purpose of this configuration is to store data according to its importance and prevent using up your quota. If you intend to deviate from the default configuration, consult the storage overview to choose your storage locations adequately and follow these recommendations:

  • Reproducible, space-consuming data like environments and the package cache belongs in storage class SCRATCH.

  • Your code, for example, is not reproducible and should therefore be backed up regularly. Since it consumes little space, its ideal location is your $HOME, checked into your git repository.

  • Data generated over a long time period which would be time consuming to recreate from scratch and is in use regularly should be stored in the storage class PROJECT.

  • Data generated as a final result which is not needed for ongoing work and needs to be available for later generations should be stored in the storage class ARCHIVE.
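
To keep an eye on how much space the conda storage locations consume, their disk usage can be checked from time to time. The following is a minimal sketch. The function name conda_disk_usage is hypothetical, and the paths in the example call assume the default configuration described above.

```shell
# Hypothetical helper: print the disk usage of every given directory
# that exists and silently skip the others.
conda_disk_usage() {
    for dir in "$@"; do
        if [ -d "${dir}" ]; then
            du -sh "${dir}"
        fi
    done
}

# Example call, assuming the default storage locations:
# conda_disk_usage \
#     "/itet-stor/${USER}/net_scratch/conda" \
#     "/itet-stor/${USER}/net_scratch/conda_pkgs" \
#     "/itet-stor/${USER}/net_scratch/conda_envs" \
#     "/scratch/${USER}/conda_envs" \
#     "${HOME}/.conda"
```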

Using conda

conda lets you separate installed software packages from each other by creating so-called environments. Using environments is best practice to generate deterministic and reproducible tools.

conda takes care of dependencies common to the packages it is asked to install. If two packages have a common dependency but define a differing range of version requirements of said dependency, conda chooses the highest common version number. This means the dependency installed in an environment with both packages together might have a lower version number than in environments separating both packages.

It is best practice to separate packages into different environments if they don't need to interact.
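
The effect of a shared dependency on the installed version can be illustrated with a toy example. This is only a sketch of the idea of picking the highest version inside a range; it is not how conda's solver is implemented. The function highest_in_range, the packages and all version numbers are made up for illustration, and the sketch assumes a sort command that supports version sorting with -V.

```shell
# Toy illustration only, not conda's solver.
# Print the highest version from a list that lies within [min, max] (inclusive).
# $1: lowest accepted version, $2: highest accepted version,
# remaining arguments: available versions
highest_in_range() {
    min="$1"; max="$2"; shift 2
    for v in "$@"; do
        # v is in range if min sorts first in (min, v) and max sorts last in (v, max)
        lowest=$(printf '%s\n' "${min}" "${v}" | sort -V | head -n 1)
        highest=$(printf '%s\n' "${v}" "${max}" | sort -V | tail -n 1)
        if [ "${lowest}" = "${min}" ] && [ "${highest}" = "${max}" ]; then
            printf '%s\n' "${v}"
        fi
    done | sort -V | tail -n 1
}

# Made-up scenario: package1 accepts the dependency in versions 1.2 to 1.6,
# package2 accepts it in versions 1.4 to 2.0. The common range is 1.4 to 1.6.
available="1.2 1.3 1.4 1.5 1.6 1.8 2.0"
echo "package1 alone:         $(highest_in_range 1.2 1.6 ${available})"
echo "package2 alone:         $(highest_in_range 1.4 2.0 ${available})"
echo "both in one environment: $(highest_in_range 1.4 1.6 ${available})"
```

In this made-up scenario, an environment containing only package2 would get version 2.0 of the dependency, while an environment containing both packages is limited to 1.6.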

For a complete guide to conda see the official documentation.

The official cheat sheet is a compact summary of common commands. An abbreviated list to get you started is shown below.

Environments

The name of the automatically installed default environment is base.

Create an environment called "my_env" with packages "package1" and "package2" installed

conda create --name my_env package1 package2

Activate the environment called "my_env"

conda activate my_env

Deactivate the current environment

conda deactivate

List available environments

conda env list

Remove the environment called "my_env"

conda remove --name my_env --all

Create a cloned environment named "cloned_env" from "original_env"

conda create --name cloned_env --clone original_env

Export the active environment definition to the file "my_env.yml"

conda env export > my_env.yml

Recreate a previously exported environment

conda env create --file my_env.yml

Create the environment "my_env" in the specified location

This example creates the environment on local scratch for faster disk access.

conda create --prefix /scratch/$USER/conda_envs/my_env

Update an active environment

Make sure to create a backup by exporting the active environment before updating.

conda update --update-all
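
The export and the update can be combined so that no update runs without a backup. The following is a minimal sketch. The function name backup_and_update, the CONDA_BIN variable and the default backup directory are assumptions for illustration; the conda subcommands are the ones already used on this page.

```shell
# Hypothetical helper: export the named environment, then update all its packages.
# $1: environment name
# $2: backup directory (defaults to ~/conda_env_backup)
# CONDA_BIN can be set to another command, e.g. "echo", for a dry run.
backup_and_update() {
    env_name="$1"
    backup_dir="${2:-${HOME}/conda_env_backup}"
    conda_bin="${CONDA_BIN:-conda}"
    stamp=$(date '+%Y-%m-%d_%H-%M-%S')
    mkdir -p "${backup_dir}" || return 1
    # Stop if the export fails, so that no update runs without a backup
    "${conda_bin}" env export --name "${env_name}" \
        > "${backup_dir}/${env_name}_${stamp}.yml" || return 1
    "${conda_bin}" update --name "${env_name}" --update-all --yes
}

# Example call:
# backup_and_update my_env
```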

Packages

Search for a package named "package1"

conda search package1

Install the package named "package1" in the active environment

conda install package1

List packages installed in the active environment

conda list

Maintenance

The cache of installed packages will consume a lot of space over time. The default location set for the package cache resides on NetScratch; the terms of use for this storage area imply that you clean your cache regularly.

Remove index cache, lock files, unused cache packages, and tarballs

conda clean --all

Update conda without any active environment

conda update conda

Backup

Regular backups are recommended to be able to reproduce an environment used at a certain point in time. Before installing or updating an environment, a backup should always be created in order to be able to revert the changes.

It is not necessary to back up the environments themselves; backing up the environment export files is sufficient to recreate them exactly.

For a simple backup of all environments the following script can be used:

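#!/bin/bash

# Directory collecting the exported environment definitions
BACKUP_DIR="${HOME}/conda_env_backup"
# Timestamp format appended to each export file name
MY_TIME_FORMAT='%Y-%m-%d_%H-%M-%S'

NOW=$(date "+${MY_TIME_FORMAT}")
# Create the backup directory if it does not exist yet
[[ ! -d "${BACKUP_DIR}" ]] && mkdir -p "${BACKUP_DIR}"
# Environment names are the first column of all lines starting with a word character
ENVS=$(conda env list | grep '^\w' | cut -d' ' -f1)
for env in $ENVS; do
    echo "Exporting ${env} to ${BACKUP_DIR}/${env}_${NOW}.yml"
    conda env export --name "${env}" > "${BACKUP_DIR}/${env}_${NOW}.yml"
done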
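
To recreate an environment from such a backup, the newest export file for that environment has to be found first. The following is a minimal sketch which relies on the timestamp format used by the backup script. The function name latest_backup is hypothetical.

```shell
# Hypothetical helper: print the newest export file of an environment.
# $1: environment name
# $2: backup directory (defaults to ~/conda_env_backup)
latest_backup() {
    env_name="$1"
    backup_dir="${2:-${HOME}/conda_env_backup}"
    latest=""
    # Glob results are sorted and the timestamp format sorts chronologically,
    # so the last match is the newest export
    for file in "${backup_dir}/${env_name}"_????-??-??_??-??-??.yml; do
        if [ -e "${file}" ]; then
            latest="${file}"
        fi
    done
    if [ -z "${latest}" ]; then
        return 1
    fi
    printf '%s\n' "${latest}"
}

# Example call, recreating "my_env" from its newest export:
# conda env create --file "$(latest_backup my_env)"
```

An existing environment of the same name typically has to be removed before it can be recreated from the export.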

Installation examples

/!\ Time to install and space required need to be recounted

For conda, python itself is just a software package like any other. Depending on all installation parameters, conda decides which python version works for all other packages. This means different environments may contain differing versions of python.

Creating an environment with a specific python version

conda create --name py37 python=3.7.3

Creating an environment with the GPU version of pytorch and CUDA toolkit 10

  • Time to install: ~5 minutes
  • Space required: ~2G, ~1.5G packages before cleanup, ~130M packages after cleanup

conda create --name pytcu10 pytorch torchvision cudatoolkit=10.0 --channel pytorch

Creating an environment with the GPU version of tensorflow and CUDA toolkit 10

  • Time to install: ~5 minutes
  • Space required: ~2G, ~1.5G packages before cleanup, ~130M packages after cleanup

conda create --name tencu10 tensorflow-gpu cudatoolkit=10.0

Programming/Languages/Conda (last edited 2024-06-03 07:21:16 by stroth)