dlforcv2018spring's Introduction

Important Update

For cuDNN, please use cuDNN v7.0.5 or v7.0.4 instead of cuDNN 7.1. cuDNN 7.1 is not compatible with TensorFlow and will cause problems. If you have already installed cuDNN 7.1, you can run

sudo apt-get remove libcudnn7

to remove it. Then follow the cuDNN instructions below to install cuDNN 7.0. Sorry for any trouble this may cause.
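Before removing anything, it can help to check which cuDNN version is actually installed. The following is a minimal sketch, assuming cuDNN was installed from the `libcudnn7` Debian package; `cudnn_version_ok` is a hypothetical helper name, not part of any NVIDIA tool.

```shell
# Hypothetical helper: succeeds (exit status 0) only for cuDNN 7.0.x versions.
cudnn_version_ok() {
  case "$1" in
    7.0.*) return 0 ;;
    *)     return 1 ;;
  esac
}

# Ask dpkg for the installed package version; empty if the package is absent.
installed="$(dpkg-query -W -f='${Version}' libcudnn7 2>/dev/null || true)"

if [ -z "$installed" ]; then
  echo "libcudnn7 is not installed"
elif cudnn_version_ok "$installed"; then
  echo "libcudnn7 $installed is a 7.0.x release"
else
  echo "libcudnn7 $installed should be replaced with 7.0.x"
fi
```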

DLforCV cloud setting

Google Cloud instance

VMs

Create a VM instance using Google Cloud

Settings:

  • Zone: us-east1-d
  • Machine type: click customize
    • 8 vCPU
    • 30 GB Memory
    • CPU platform: Intel Haswell or later
    • GPU: 1 NVIDIA Tesla K80
  • Boot disk:
    • OS images: Ubuntu 16.04 LTS
    • Boot disk type: Standard persistent disk
    • Size: 200 GB
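The same settings can also be expressed on the command line. This is a sketch, assuming the Google Cloud SDK (`gcloud`) is installed and authenticated; `create_dl_instance` and the `GCLOUD` override variable are hypothetical names used only for illustration, and the GPU flag only works once the quota described below has been granted.

```shell
# Hypothetical wrapper around `gcloud compute instances create`.
# Set GCLOUD=echo to print the arguments instead of creating anything.
create_dl_instance() {
  # $1 = instance name of your choice
  "${GCLOUD:-gcloud}" compute instances create "$1" \
    --zone us-east1-d \
    --custom-cpu 8 \
    --custom-memory 30GB \
    --min-cpu-platform "Intel Haswell" \
    --accelerator type=nvidia-tesla-k80,count=1 \
    --maintenance-policy TERMINATE \
    --image-family ubuntu-1604-lts \
    --image-project ubuntu-os-cloud \
    --boot-disk-type pd-standard \
    --boot-disk-size 200GB
}

# Usage (not run here):
#   create_dl_instance my-dl-vm
```

GPU instances cannot live-migrate, which is why the sketch sets the maintenance policy to TERMINATE.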

Apply for GPU quota (Skip this part if you do not need one)

  1. Link your billing account to the credit you received. (If you use the $300 free trial credit in your account, you will not be able to use a GPU.)
  2. In the notifications at the top right of your browser, click "request increase".
  3. In Quotas, find "NVIDIA K80 GPUs" for us-east1. Select it and click EDIT QUOTAS at the top.
  4. Enter your information.
  5. Enter 1 as the limit.
  6. In the description, say that you will use the GPU for the Columbia CS 4995 Deep Learning for Computer Vision course project.
  7. The quota will be approved almost instantaneously.

Connect to VMs

  1. Web terminal
  2. SSH: generate a public/private key pair. Use your UNI for USERNAME; USERNAME will be used again later.
ssh-keygen -t rsa -f ~/.ssh/[KEY_FILENAME] -C [USERNAME]
cat ~/.ssh/[KEY_FILENAME].pub
Save the printed public key to the instance's Metadata (SSH keys), then connect:
ssh -i ~/.ssh/[KEY_FILENAME] [USERNAME]@[EXTERNAL_IP_ADDRESS]
  3. Cyberduck/WinSCP/PuTTY
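
If you prefer to add the key from the command line rather than pasting it into the console, the key has to be prefixed with the username. The sketch below assumes the `USERNAME:KEY` line format that `gcloud` expects for `ssh-keys` metadata files; `format_ssh_metadata` is a hypothetical helper name.

```shell
# Hypothetical helper: print a public key as a "USERNAME:KEY" metadata line.
format_ssh_metadata() {
  # $1 = USERNAME (your UNI), $2 = path to the .pub file
  printf '%s:%s\n' "$1" "$(cat "$2")"
}

# Usage (not run here):
#   format_ssh_metadata [USERNAME] ~/.ssh/[KEY_FILENAME].pub > ssh-keys.txt
#   gcloud compute instances add-metadata [INSTANCE_NAME] \
#     --metadata-from-file ssh-keys=ssh-keys.txt
```

Note that setting `ssh-keys` this way replaces any instance-level keys already stored under that metadata key, so include existing keys in the file if you need to keep them.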

Environments

Instances with or without a GPU both work, but training is usually faster on a GPU instance. Choose whichever you prefer.

Instances with GPU

The following script helps you install all the dependencies for Keras.
We highly recommend that you run the following code manually, line by line,
to avoid any problems.

You need to run the code in the given order to avoid dependency issues.

* Linux Essentials
* Nvidia Driver
* CUDA
* cuDNN
* Tensorflow
* Keras
# essential
sudo apt-get update
sudo apt-get upgrade  
sudo apt-get install build-essential cmake g++ gfortran 
sudo apt-get install git pkg-config python-dev 
sudo apt-get install software-properties-common wget
sudo apt-get autoremove 
sudo rm -rf /var/lib/apt/lists/*

# install driver
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-375

# reboot your machine
sudo reboot

# check driver is installed correctly
nvidia-smi

# install cuda
wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda_9.0.176_384.81_linux-run
sudo chmod +x cuda_9.0.176_384.81_linux-run
sudo sh cuda_9.0.176_384.81_linux-run
# The CUDA installer is interactive. Answer its prompts as follows:
# 1. Scroll to the bottom of the license agreement using d
# 2. Do you accept the previously read EULA?: accept
# 3. Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81?: no (we've installed it manually)
# 4. Install the CUDA 9.0 Toolkit?: yes
# 5. Enter Toolkit Location: use the default (press enter)
# 6. Do you want to install a symbolic link at /usr/local/cuda?: yes
# 7. Install the CUDA 9.0 Samples?: no

# change environment for cuda
echo 'export PATH=/usr/local/cuda-9.0/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc 

#verify your cuda is installed correctly
nvcc -V 

# install cuDNN 
# You need to register online and download the file to local machine then upload to the cloud
# https://developer.nvidia.com/cudnn
# download the cuDNN 7.0 for CUDA 9.0 and for Ubuntu 16.04. 
sudo apt install ./[YOUR_CUDNN_FILE].deb

# install python stuff
sudo apt-get update
sudo apt-get install git python-dev python3-dev python3-numpy build-essential python-pip python3-pip python-virtualenv swig python-wheel libcurl3-dev
sudo apt-get install -y libfreetype6-dev libpng12-dev
python3 -m pip install -U pip
pip3 install -U matplotlib ipython[all] jupyter pandas scikit-image

# install tensorflow
pip3 install --upgrade tensorflow-gpu

# keras
pip3 install keras

# pytorch
pip3 install http://download.pytorch.org/whl/cu90/torch-0.3.1-cp35-cp35m-linux_x86_64.whl 
pip3 install torchvision

# helpful tools 
# tmux: keeps a session running in the background, even if the SSH connection drops.
# nohup: similar to tmux; keeps a command running after you log out. Output goes to nohup.out unless redirected.
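
As an illustration of the `nohup` pattern mentioned above, here is a small sketch that runs a stand-in command in the background and collects its output in a log file. The quoted `sh -c` command and the temporary log file are placeholders chosen for the example; substitute your own training command and log path.

```shell
# Stand-in for a long training command; replace the quoted part with your own.
log_file="$(mktemp)"
nohup sh -c 'echo "training started"; sleep 1; echo "training finished"' \
  > "$log_file" 2>&1 &
job_pid=$!

# The job would survive a closed terminal; here we just wait and read the log.
wait "$job_pid"
tail -n 1 "$log_file"

# With tmux, the equivalent workflow is interactive:
#   tmux new -s train        # start a named session and run your command in it
#   (press Ctrl-b, then d)   # detach; the session keeps running
#   tmux attach -t train     # reattach later
```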

Instances without GPU

wget https://repo.continuum.io/archive/Anaconda3-5.1.0-Linux-x86_64.sh

bash Anaconda3-5.1.0-Linux-x86_64.sh
# Select yes to all options during the setup process

# install pytorch
pip3 install http://download.pytorch.org/whl/cpu/torch-0.3.1-cp35-cp35m-linux_x86_64.whl 
pip3 install torchvision

Examples

  • Go to the Google Cloud console, open the 'VPC network' panel, and select 'Firewall rules'. Add a rule as follows.
    • Name: tensorboard
    • Network: default
    • Priority: 1000
    • Direction: Ingress
    • Action on match: Allow
    • Source filters: IP ranges, 0.0.0.0/0
    • Protocols and ports: tcp:6006;udp:6006
  • Then click your own instance, go to 'Edit', and check the 'Enable connecting to serial ports' box.

  • Run the following code in ssh

# Get the code from tensorflow
git clone https://github.com/tensorflow/tensorflow.git

# Mnist is chosen as demo
cd tensorflow/tensorflow/examples/tutorials/mnist

# run it in the background, output is stored in train.log
nohup python3 mnist_with_summaries.py --max_steps=1000000 > train.log &

# close the terminal and open another one
# run TensorBoard in the background
nohup tensorboard --logdir=/tmp/tensorflow/mnist &
  • Close the terminal

  • Open your browser and go to '[external IP]:6006'; you should see TensorBoard. ([external IP] can be found on the VM instances page.)

Reference: https://bicepjai.github.io/machine-learning/2016/08/22/tensorboard-on-gcloud.html
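
The TensorBoard firewall rule from this example can also be created from the command line. This is a sketch, assuming the Google Cloud SDK is installed and authenticated; `create_tensorboard_rule` and the `GCLOUD` override variable are hypothetical names used for illustration.

```shell
# Hypothetical wrapper around `gcloud compute firewall-rules create`.
# Set GCLOUD=echo to print the arguments instead of creating the rule.
create_tensorboard_rule() {
  "${GCLOUD:-gcloud}" compute firewall-rules create tensorboard \
    --network default \
    --priority 1000 \
    --direction INGRESS \
    --action ALLOW \
    --rules tcp:6006,udp:6006 \
    --source-ranges 0.0.0.0/0
}

# Usage (not run here):
#   create_tensorboard_rule
```

A source range of 0.0.0.0/0 opens port 6006 to any address, which matches the rule above; narrowing it to your own IP is a stricter option.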

Running Jupyter Notebook Remotely

[Notice]

A couple of students informed us that after going through the tutorial linked below, they had issues with tensorflow-gpu. After inspection, the problem is due to the outdated Anaconda version installed in the tutorial. When installing Anaconda, make sure you install the newest version listed on the official website.

See this link: https://towardsdatascience.com/running-jupyter-notebook-in-google-cloud-platform-in-15-min-61e16da34d52


Remember to Stop the instance
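
Stopping can be done from the console or from the command line. A minimal sketch, assuming the Google Cloud SDK is installed; `stop_instance` and the `GCLOUD` override variable are hypothetical names, and the zone defaults to the us-east1-d zone used in the settings above.

```shell
# Hypothetical wrapper around `gcloud compute instances stop`.
# $1 = instance name, $2 = zone (defaults to us-east1-d)
stop_instance() {
  "${GCLOUD:-gcloud}" compute instances stop "$1" --zone "${2:-us-east1-d}"
}

# Usage (not run here):
#   stop_instance [INSTANCE_NAME]
```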

