Vincent Van Steenbergen - @nsteenv
Playing with Scala, Akka & Spark for about 3 years
Deeply interested in Artificial Intelligence and Data Analysis
a.k.a. Convolutional Neural Networks (ConvNets)
1. a lot of time (usually weeks/months)
2. a lot of computing power
Ex: AlphaGo - 1,202 CPUs and 176 GPUs - 6 weeks of training
from my laptop?
for a decent cost?
within a short timespan?
Technically possible on a (high-end) laptop, but very slow
Solution: distribute training over a cluster
Pool resources from all the Spark slaves on the cluster
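A minimal sketch of the idea, assuming plain data-parallel training: each worker computes a gradient on its own shard of the data, and the averaged gradient updates one shared model. The toy model (y = w * x), the function names, and the hyperparameters are illustrative assumptions, not something shown in the talk, and the workers are simulated in a loop rather than run on a real cluster.

```python
def shard_gradient(w, shard):
    """Gradient of the mean squared error of y = w * x on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def train(data, num_workers, steps=200, lr=0.01):
    # Split the dataset so that each (simulated) worker owns one shard.
    shards = [data[i::num_workers] for i in range(num_workers)]
    w = 0.0
    for _ in range(steps):
        # On a real cluster these gradients would be computed in parallel.
        grads = [shard_gradient(w, shard) for shard in shards]
        # Average the per-worker gradients and update the shared weight.
        w -= lr * sum(grads) / len(grads)
    return w


# Toy data generated from y = 3 * x; training should recover w close to 3.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
learned_w = train(data, num_workers=4)
```

With equally sized shards, the average of the per-worker gradients equals the gradient over the whole dataset, so the result matches single-machine training while the work is spread across workers.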
GPU instances (g2.2xlarge, g2.8xlarge)
Spot instances (spare capacity, generally 2-3 times cheaper than regular on-demand instances)
Four NVIDIA GRID GPUs, each with 1,536 CUDA cores and 4 GB of video memory
32 vCPUs
60 GiB of memory
240 GB (2 x 120) of SSD storage
Average price: $1.00 per hour
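A rough cost estimate can be sketched from the figures above ($1.00 per instance-hour, four GPUs per instance). The cluster size and training duration below are hypothetical values chosen only to illustrate the arithmetic; actual spot prices fluctuate.

```python
PRICE_PER_INSTANCE_HOUR = 1.00  # average spot price quoted above, in USD
GPUS_PER_INSTANCE = 4           # GPUs per instance, from the specs above


def cluster_cost(num_instances, hours, price_per_hour=PRICE_PER_INSTANCE_HOUR):
    """Total cost in USD of running the cluster for the given number of hours."""
    return num_instances * hours * price_per_hour


def total_gpus(num_instances, gpus_per_instance=GPUS_PER_INSTANCE):
    """Number of GPUs pooled across all instances."""
    return num_instances * gpus_per_instance


# Hypothetical scenario: 4 instances training for 24 hours.
cost = cluster_cost(num_instances=4, hours=24)  # 96.0 USD
gpus = total_gpus(num_instances=4)              # 16 GPUs
```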
TensorFlow (Google)
Caffe (Berkeley)
Torch (Facebook/DeepMind)
Distribute Caffe on a Spark cluster
Developed/maintained by Yahoo (Flickr)
Can run on an existing cluster along other Spark jobs
Leverage existing Caffe models
Use SQL, DataFrames, existing LMDB files
Peer-to-peer communication with message passing
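To illustrate what peer-to-peer message passing can look like, here is a small simulation of a ring all-reduce, one common scheme for summing gradients across workers without a central server. The slides do not say which scheme the library uses, so treat this as an assumption-laden sketch: the function name and the in-memory "messages" are hypothetical stand-ins for real network traffic.

```python
def ring_allreduce(worker_vectors):
    """Simulate a ring all-reduce: every worker ends up with the element-wise sum."""
    n = len(worker_vectors)
    size = len(worker_vectors[0])
    data = [list(v) for v in worker_vectors]
    # Chunk c covers indices bounds[c] .. bounds[c + 1] - 1.
    bounds = [c * size // n for c in range(n + 1)]

    def chunk(c):
        return range(bounds[c], bounds[c + 1])

    def exchange(first_chunk, combine):
        # Each worker i sends one chunk to its right-hand neighbour (i + 1) % n.
        for step in range(n - 1):
            # Snapshot all outgoing messages first so the sends are simultaneous.
            messages = []
            for i in range(n):
                c = (first_chunk(i) - step) % n
                messages.append((c, [data[i][k] for k in chunk(c)]))
            for i, (c, payload) in enumerate(messages):
                dest = (i + 1) % n
                for k, value in zip(chunk(c), payload):
                    data[dest][k] = combine(data[dest][k], value)

    # Phase 1 (reduce-scatter): worker i ends up owning the full sum of chunk i + 1.
    exchange(lambda i: i, lambda own, received: own + received)
    # Phase 2 (all-gather): completed chunks are passed around the ring.
    exchange(lambda i: i + 1, lambda own, received: received)
    return data


# Three simulated workers, each holding a local gradient vector.
gradients = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
reduced = ring_allreduce(gradients)
```

Each worker only ever talks to its neighbour in the ring, so no single node has to receive the gradients of the whole cluster.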
... let's give it a go!
Classifying handwritten digits
Any questions?
My email: v.vansteenbergen@gmail.com