class: inverse, center, middle
background-image: url(imgs/Keras_Logo.jpg),url(imgs/cbi.png),url(imgs/logo_UT3_RVB.png),url(imgs/index.jpeg)
background-position: 50% 0%,25% 100%,50% 100%,75% 100%
background-size: 10%,15%,20%,10%

## .center[Deep Learning with .large[__R__] and __`keras`__]
### .center[R user group]
<hr />
.large[Vincent ROCHER | Centre de Biologie Intégrative - Toulouse | 28/01/2021]

---
.pull-left[
.large[
### Table of contents

- What is Deep Learning
- Tensorflow & Keras
- Keras with R
- Some examples
  - text classification
  - image classification
]
]
.pull-right[
![deeplearning with R](imgs/dplwithr.jpeg)
]

---
## What is __Deep Learning__?

.pull-left[
![deeplearning](imgs/whatisdeeplearning.jpg)
]
.pull-right[
![deeplearning](imgs/01fig02.jpg)
]

- __AI:__ hard-coded rules.
- __Machine Learning:__ learns from data using hand-crafted features.
- __Deep Learning:__ learns representations from data.

---
## History

.pull-left[
#### Core of deep learning:
- __Artificial neuron__ (_1943_)
- __Perceptron__ (_1957_)

#### The network:
- __Multilayer perceptron__ (_1967_)

#### More complex networks:
- __Vision__:
  - __Neocognitron__ (_1980_)
  - __Convolution__ (_1998_)
- __Speech__:
  - __Recurrent neural networks__ (_1986_)
  - __Long short-term memory__ (_1997_)
]

--

.pull-right[
### Image generation with GANs
<img src="imgs/thispersondoesntexist.jpeg" width="70%" />

[thispersondoesnotexist.com](https://thispersondoesnotexist.com/)
]

---
## Deep Learning applications
### A Neural Network for Machine Translation, at Production Scale

.center[
<img src="imgs/google_translate.png" width="60%" />

_from the Google AI blog_
]

---
## Deep Learning applications
### Computer Vision

.center[
<img src="imgs/detection.png" width="70%" />

_from the Google AI blog_
]

---
## Deep Learning applications
### Neural style transfer

.center[
<img src="imgs/style_transfert2.png" width="35%" />

_from Image Style Transfer Using Convolutional Neural Networks_
]

---
## What is __Deep Learning__?

.pull-left[
#### An artificial neuron
![deeplearning](imgs/neural_example.jpeg)
]
.pull-right[
#### A _deep_ neural network
![deeplearning](imgs/layer_exemple.png)
]

.large[
- A layer applies a __geometric transformation__ to an input tensor and outputs a __tensor__, using _weights_ (which are also __tensors__).
- A _deep_ neural network consists of a (linear) __stack of successive layers__, from an _input layer_ to a single _output layer_.
]

---
## What is __Deep Learning__?

.pull-left[
![deeplearning](imgs/01fig07.jpg)

The __weights__ progressively transform `\(X\)` into `\(Y'\)`.
]

--

.pull-right[
![deeplearning](imgs/01fig08.jpg)

- The __loss score__ measures the __distance__ between `\(Y'\)` and `\(Y\)`.
]

---
## What is __Deep Learning__?

.pull-left[
![deeplearning](imgs/01fig07.jpg)

The __weights__ progressively transform `\(X\)` into `\(Y'\)`.
]
.pull-right[
![deeplearning](imgs/01fig09.jpg)

- The __loss score__ measures the __distance__ between `\(Y'\)` and `\(Y\)`.
- The __optimizer__ updates the __weights__ via _backpropagation_ and _stochastic gradient descent_.
]
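---
## What is __Deep Learning__?
### One gradient-descent step, by hand

A minimal sketch in plain R of the loop the __optimizer__ runs, for a single linear neuron (the toy data, learning rate, and variable names are illustrative — `keras` does all of this for you):

```r
# Toy data: x (input), y (target); w and b are the weights to learn
set.seed(1)
x <- rnorm(100)
y <- 3 * x + 2 + rnorm(100, sd = 0.1)
w <- 0; b <- 0; lr <- 0.1               # initial weights and learning rate

for (step in 1:100) {
  y_pred <- w * x + b                   # forward pass: X -> Y'
  loss   <- mean((y_pred - y)^2)        # loss score: distance between Y' and Y
  grad_w <- mean(2 * (y_pred - y) * x)  # gradients of the loss (chain rule)
  grad_b <- mean(2 * (y_pred - y))
  w <- w - lr * grad_w                  # optimizer: move weights against the gradient
  b <- b - lr * grad_b
}
c(w = w, b = b)                         # converges close to the true values 3 and 2
```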
---
## __Loss__ & __Activation__ functions

.large[
<table class="table table-striped table-hover" style="margin-left: auto; margin-right: auto;">
<caption>Activation and loss functions for different problems</caption>
<thead>
<tr>
<th style="text-align:left;"> Problem type </th>
<th style="text-align:left;"> Activation </th>
<th style="text-align:left;"> Loss </th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;font-weight: bold;color: white !important;background-color: #2980b9 !important;"> Regression </td>
<td style="text-align:left;width: 20em; font-weight: bold;color: black !important;"> None (linear) </td>
<td style="text-align:left;width: 40em; font-weight: bold;color: black !important;"> Mean Squared Error (MSE) or Mean Absolute Error (MAE) </td>
</tr>
<tr>
<td style="text-align:left;font-weight: bold;color: white !important;background-color: #2980b9 !important;"> Binary classification </td>
<td style="text-align:left;width: 20em; font-weight: bold;color: black !important;"> Sigmoid </td>
<td style="text-align:left;width: 40em; font-weight: bold;color: black !important;"> Binary crossentropy </td>
</tr>
<tr>
<td style="text-align:left;font-weight: bold;color: white !important;background-color: #2980b9 !important;"> Multiclass classification </td>
<td style="text-align:left;width: 20em; font-weight: bold;color: black !important;"> Softmax </td>
<td style="text-align:left;width: 40em; font-weight: bold;color: black !important;"> Categorical crossentropy </td>
</tr>
</tbody>
</table>
]

---
## Libraries for deep learning

.pull-left[
### Tensorflow
<img src="imgs/FullColorPrimary Icon.svg" width="50%" />

- Open-source platform for machine learning (and deep learning) developed by Google.
- _Tensor_: data are manipulated as __tensors__.
- _flow_: computations are built as a __graph__ of nodes.
]
.pull-right[
### Keras
<img src="imgs/Keras_Logo.jpg" width="40%" />

- High-level neural network API (`python`).
- Works with __Tensorflow__, __CNTK__, or __Theano__.
- __User-friendly__ & works very well with __Tensorflow__.
]

---
## R interface to Keras

#### Package `keras` by RStudio (https://keras.rstudio.com/).

--

```r
install.packages("keras")

# default installation
library(keras)
install_keras()

# install using a conda environment (default is virtualenv)
install_keras(method = "conda")

# install with GPU version of TensorFlow
# (NOTE: only do this if you have an NVIDIA GPU + CUDA!)
*install_keras(tensorflow = "gpu")

# install a specific version of TensorFlow
install_keras(tensorflow = "1.2.1")
install_keras(tensorflow = "1.2.1-gpu")
```

---
## R interface to Keras
### Change backend

.pull-left[
```r
library(keras)
use_backend("theano")
```
__Theano__ is an open-source symbolic tensor manipulation framework developed by the LISA Lab at Université de Montréal.
]
.pull-right[
```r
library(keras)
use_backend("cntk")
```
__CNTK__ is an open-source toolkit for deep learning developed by Microsoft.
]
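---
## R interface to Keras
### Check the installation

A quick sanity check after installing — a minimal sketch using two helpers from the `keras` R package (the printed values are illustrative):

```r
library(keras)

# TRUE once the underlying Python Keras/TensorFlow stack is usable
is_keras_available()

# Name of the active backend, e.g. "tensorflow"
k_backend()
```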
<table class="table table-striped table-hover" style="margin-left: auto; margin-right: auto;"> <caption>Examples of tensors</caption> <thead> <tr> <th style="text-align:left;"> Dimension </th> <th style="text-align:left;"> R object </th> <th style="text-align:left;"> Data </th> <th style="text-align:left;"> Description </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;font-weight: bold;color: white !important;background-color: #2980b9 !important;"> 0D </td> <td style="text-align:left;width: 40em; font-weight: bold;color: black !important;"> 42 </td> <td style="text-align:left;"> </td> <td style="text-align:left;width: 20em; "> </td> </tr> <tr> <td style="text-align:left;font-weight: bold;color: white !important;background-color: #2980b9 !important;"> 1D </td> <td style="text-align:left;width: 40em; font-weight: bold;color: black !important;"> c(42, 42, 42) </td> <td style="text-align:left;"> Vector data </td> <td style="text-align:left;width: 20em; "> 2D tensors of shape (samples, features) </td> </tr> <tr> <td style="text-align:left;font-weight: bold;color: white !important;background-color: #2980b9 !important;"> 2D </td> <td style="text-align:left;width: 40em; font-weight: bold;color: black !important;"> matrix(42, nrow = 2, ncol = 2) </td> <td style="text-align:left;"> Timeseries data </td> <td style="text-align:left;width: 20em; "> 3D tensors of shape (samples, timesteps, features) </td> </tr> <tr> <td style="text-align:left;font-weight: bold;color: white !important;background-color: #2980b9 !important;"> 3D </td> <td style="text-align:left;width: 40em; font-weight: bold;color: black !important;"> array(42, dim = c(2,3,2)) </td> <td style="text-align:left;"> Images </td> <td style="text-align:left;width: 20em; "> Images 4D tensors of shape (samples, height, width, channels) </td> </tr> <tr> <td style="text-align:left;font-weight: bold;color: white !important;background-color: #2980b9 !important;"> 4D </td> <td style="text-align:left;width: 40em; font-weight: bold;color: black !important;"> array(42, dim = c(2,3,2,3)) </td> <td style="text-align:left;"> Videos </td> <td style="text-align:left;width: 20em; "> Video 5D tensors of shape (samples, frames, height, width, channels) </td> </tr> </tbody> </table> __Tensors__ are a _generalization_ of vectors and matrices to an __arbitrary number of dimensions__ (or _axis_). --- ## Build a model in R .pull-left[ ### Sequential mode (basic) ```r library(keras) model <- keras_model_sequential() model %>% layer_dense(32, input_shape=c(784)) %>% layer_activation('relu') %>% layer_dense(10) %>% layer_activation('softmax') ``` ] .pull-right[ ### With Keras functional API ```r input <- layer_input(shape = c(784)) output <- input %>% layer_dense(32) %>% layer_activation('relu') %>% layer_dense(10) %>% layer_activation('softmax') model <- keras_model(input,output) ``` ] --- ## Build a model in R .pull-left[ ### Sequential mode (basic) ```r library(keras) model <- keras_model_sequential() model %>% * layer_dense(32, `input_shape=c(784)`) %>% layer_activation('relu') %>% `layer_dense(10)` %>% layer_activation('softmax') ``` ] .pull-right[ ### With Keras functional API ```r library(keras) input <- `layer_input(shape = c(784))` output <- input %>% * layer_dense(32) %>% layer_activation('relu') %>% layer_dense(10) %>% `layer_activation('softmax')` model <- keras_model(input,output) ``` ] --- ## R to Tensorflow graph .pull-left[ <!-- # ```{r out.width = '60%',echo=F} --> <!-- # knitr::include_graphics("imgs/model_one.bmp") --> <!-- # ``` -->
---
## R to Tensorflow graph

.pull-left[
_TensorFlow computation graph built from the model (image omitted)._
]
.pull-right[
### R
```r
library(keras)
model <- keras_model_sequential()
model %>%
  layer_dense(32, input_shape = c(784)) %>%
  layer_activation('relu') %>%
  layer_dense(10) %>%
  layer_activation('softmax')
```
]

---
### Summary of our model

```r
summary(model)
```

```
## Model: "sequential_1"
## ________________________________________________________________________________
## Layer (type)                        Output Shape                    Param #
## ================================================================================
## dense_4 (Dense)                     (None, 32)                      25120
## ________________________________________________________________________________
## activation_4 (Activation)           (None, 32)                      0
## ________________________________________________________________________________
## dense_5 (Dense)                     (None, 10)                      330
## ________________________________________________________________________________
## activation_5 (Activation)           (None, 10)                      0
## ================================================================================
*## Total params: 25,450
## Trainable params: 25,450
## Non-trainable params: 0
## ________________________________________________________________________________
```

---
### A very big model

```r
summary(vgg16_imagenet_model)
```

```
Model: "vgg16"
_________________________________________________________________________________
Layer (type)                        Output Shape                    Param #
=================================================================================
input_1 (InputLayer)                [(None, 224, 224, 3)]           0
_________________________________________________________________________________
block1_conv1 (Conv2D)               (None, 224, 224, 64)            1792
_________________________________________________________________________________
block1_conv2 (Conv2D)               (None, 224, 224, 64)            36928
_________________________________________________________________________________
block1_pool (MaxPooling2D)          (None, 112, 112, 64)            0
_________________________________________________________________________________
...
_________________________________________________________________________________
flatten (Flatten)                   (None, 25088)                   0
_________________________________________________________________________________
fc1 (Dense)                         (None, 4096)                    102764544
_________________________________________________________________________________
fc2 (Dense)                         (None, 4096)                    16781312
_________________________________________________________________________________
predictions (Dense)                 (None, 1000)                    4097000
=================================================================================
*Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
_________________________________________________________________________________
```

---
### Compilation

```r
model %>% compile(
  optimizer = "rmsprop",
  loss = "categorical_crossentropy",
  metrics = c("acc")
)
```

`compile()` modifies the model object in place (it does not return a new model).

- `optimizer`: specifies the training procedure; commonly used optimizers include adam, rmsprop, and sgd.
- `loss`: the function to minimize during optimization; common choices include mean squared error (mse), categorical_crossentropy, and binary_crossentropy.
- `metrics`: used to monitor training; for classification this is usually accuracy.
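---
### Compilation
#### Passing an optimizer object

Instead of the string `"rmsprop"`, `compile()` also accepts an optimizer object, which lets you tune its parameters — a minimal sketch (the learning-rate value here is an illustrative placeholder; the same pattern appears in the DeepG4 code later in this deck):

```r
model %>% compile(
  optimizer = optimizer_rmsprop(lr = 0.001),  # tune the learning rate explicitly
  loss = "categorical_crossentropy",
  metrics = c("acc")
)
```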
---
### Training

```r
history <- model %>% fit(
  x_train, y_train,
  epochs = 10,
  batch_size = 128,
  validation_split = 0.2,
  verbose = 1
)
```

- `epochs = 10`: iterate 10 times over the whole training set.
- `batch_size = 128`: feed the model mini-batches of 128 samples.
- `validation_split = 0.2`: hold out 20% of the training data for validation.

---
### Training

```r
plot(history)
```

.center[
<img src="imgs/example_plot_history.svg" width="80%" />
]

---
### Evaluation and prediction

```r
model %>% evaluate(x_test, y_test)
```

```
$loss
[1] 0.4052937

$acc
[1] 0.8454
```

```r
model %>% predict_classes(x_test[1:10,])
```

```
 [1] 0 1 1 0 1 1 1 0 1 1
```

```r
model %>% predict_proba(x_test[1:10,])
```

```
[1] 8.842140e-03 9.998803e-01 9.527058e-01 4.542440e-01 9.996748e-01 8.639503e-01 9.832536e-01
[8] 7.519126e-05 9.976131e-01 9.998198e-01
```

---
### Use __Hyperparameter Tuning__
#### Define flags for key parameters ...

```r
FLAGS <- flags(
  flag_string("activation", "relu"),
  flag_string("activation2", "softmax"),
  flag_string("optimizer", 'rmsprop'),
  flag_string("loss", 'binary_crossentropy'),
  flag_numeric("epoch", 10),
  flag_numeric("dense1", 128),
  flag_numeric("dense2", 16)
)
```

---
### Use __Hyperparameter Tuning__
#### ... and replace hard-coded hyperparameters by `FLAGS$hyper_parameter_name`:

```r
model_tuning <- keras_model_sequential() %>%
  layer_dense(`FLAGS$dense1`, input_shape = c(784)) %>%
  layer_activation(`FLAGS$activation`) %>%
  layer_dense(`FLAGS$dense2`) %>%
  layer_activation(`FLAGS$activation2`)

model_tuning %>% compile(
  optimizer = `FLAGS$optimizer`,
  loss = `FLAGS$loss`,
  metrics = c("acc")
)

history <- model_tuning %>% fit(
  x_train, y_train,
  epochs = `FLAGS$epoch`,
  batch_size = 128,
  validation_split = 0.2
)
```

---
### Use __Hyperparameter Tuning__
#### Then launch the tuning

```r
require("tfruns")
tuning_run("model_tuning.R",
           runs_dir = "hypertuning/model_tuning",
           flags = list(
             dense1 = c(512, 256, 128),
             dense2 = c(64, 32, 16)
           ))
```

Each run can also be launched by hand with explicit flag values:

```bash
Rscript model_tuning.R --epoch 5 --dense1 128 --dense2 64
```

---
## Python vs R

#### First examples found in the documentation ([R](https://keras.rstudio.com/articles/sequential_model.html), [python](https://keras.io/getting-started/sequential-model-guide/))

.pull-left[
### `python`
```python
from keras.models import Sequential
from keras.layers import Dense, Activation

model = Sequential([
    Dense(32, input_shape=(784,)),
    Activation('relu'),
    Dense(10),
    Activation('softmax'),
])
```
]
.pull-right[
### `R`
```r
library(keras)
model <- keras_model_sequential()
model %>%
  layer_dense(32, input_shape = c(784)) %>%
  layer_activation('relu') %>%
  layer_dense(10) %>%
  layer_activation('softmax')
```
]

---
## Layers available
### Dense layer

```r
layer_dense()
```

.center[
<img src="imgs/layer_exemple.png" width="60%" />
]

---
### Convolution layer
#### Convolution in 1D

```r
layer_conv_1d()
```

.center[
![conv1d](imgs/06fig22.jpg)
]

---
### Convolution layer
#### Convolution in 2D

```r
layer_conv_2d()
```

.center[
![conv2d](imgs/05fig03.jpg)
]

---
### Recurrent neural network (RNN)

```r
layer_simple_rnn()
layer_lstm()
layer_gru()
```

.center[
![rnn](imgs/RNN-unrolled.png)

_from http://colah.github.io/posts/2015-08-Understanding-LSTMs/_
]

---
### Embedding layer

```r
layer_embedding()
```

.center[
![embedding](imgs/06fig02.jpg)
]
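---
### Embedding layer
#### A minimal usage sketch

An embedding layer turns integer token indices into dense vectors. A hedged example of how it typically opens a text-classification model (vocabulary size, embedding dimension, and sequence length are illustrative placeholders):

```r
library(keras)
model <- keras_model_sequential() %>%
  # 10,000-word vocabulary, 8-dimensional embeddings, sequences of 20 tokens
  layer_embedding(input_dim = 10000, output_dim = 8, input_length = 20) %>%
  layer_flatten() %>%
  layer_dense(1) %>%
  layer_activation("sigmoid")
```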
---
### Regularization layer
#### L1 / L2 regularization

```r
regularizer_l1(l = 0.01)
regularizer_l2(l = 0.01)
regularizer_l1_l2(l1 = 0.01, l2 = 0.01)
```

.pull-left[
#### Dropout
```r
layer_dropout(0.5)
```
]
.pull-right[
![dropout](imgs/dropout.png)

_Srivastava, Nitish, et al. "Dropout: a simple way to prevent neural networks from overfitting", JMLR 2014_
]

---
### Regularization layer
#### Gaussian noise

```r
layer_gaussian_noise()
layer_gaussian_dropout()
```

- __GaussianNoise__: adds zero-centered noise with the specified standard deviation (similar in spirit to data augmentation).
- __GaussianDropout__: a combination of dropout and Gaussian noise.

#### Batch normalization

```r
layer_batch_normalization()
```

- __Batch normalization__: scales and centers each batch so that downstream layers do not have to adapt to a shifting input distribution at every training step.

---
### And many others

.center[
<iframe src="https://keras.rstudio.com/reference/index.html" width="100%" height="400px"></iframe>
]
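---
### Putting the layers together

A minimal, illustrative sketch combining the building blocks above — a small 1D convnet with dropout and an L2 weight penalty attached via `kernel_regularizer`. All layer sizes and the input shape are arbitrary placeholders, not a recommended architecture:

```r
library(keras)
model <- keras_model_sequential() %>%
  layer_conv_1d(filters = 32, kernel_size = 5, activation = "relu",
                input_shape = c(100, 4)) %>%   # e.g. one-hot encoded sequences
  layer_average_pooling_1d(pool_size = 2) %>%
  layer_dropout(0.5) %>%                       # dropout regularization
  layer_flatten() %>%
  layer_dense(16, kernel_regularizer = regularizer_l2(0.01)) %>%  # L2 penalty
  layer_dense(1) %>%
  layer_activation("sigmoid")
```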
---
## __Application:__ Predict DNA secondary structure from the DNA sequence

### .large[Deep__G4:__] A deep learning approach to predict active G-quadruplexes

__Input:__ `ACTGTGGGGATTTCTTAGGCTTTAGGGGACTTATGTGCTAGGGG`

__Output:__

.center[
<img src="imgs/G4.svg" width="45%" />
]

---
class: split-80
### Mapping G4s in vivo with __BG4-seq__ (2018)

.column[
* ChIP-seq on DNA secondary structures using a G4-structure-specific single-chain antibody (BG4).
* Refined chromatin immunoprecipitation followed by high-throughput sequencing.

<img src="imgs/Av_prof.png" width="100%" />
]
.column[
[Genome-wide mapping of endogenous G-quadruplex DNA structures by chromatin immunoprecipitation and high-throughput sequencing](https://www.nature.com/articles/nprot.2017.150)

<img src="imgs/BG4seq.png" width="70%" />
]

---
## __DeepG4__ model architecture

<img src="imgs/CNN_model_improved.svg" width="100%" />

1. __Conv1D__: scans the sequences with 20-bp kernels.
2. __Average pooling__: reduces the dimension and aggregates the kernel signal.
3. __Global max pooling__: outputs the maximum activation of each kernel.
4. __Dropout__: regularization layer.
5. __Dense layer__ `(100 units, linear)`: weighted combination of the kernel signals.
6. __Dense layer__ `(1 unit, sigmoid)`: outputs a probability.

---
## __DeepG4__ model architecture ... in __R__

```r
input <- layer_input(shape = input_shape)
output <- input %>%
  layer_conv_1d(filters = FLAGS$filters1, kernel_size = FLAGS$kernel_size1,
                activation = FLAGS$activation) %>%
  layer_average_pooling_1d(pool_size = FLAGS$pool_size1) %>%
  layer_global_max_pooling_1d() %>%
  layer_dropout(FLAGS$dropout1) %>%
  layer_dense(FLAGS$dense_1) %>%
  layer_dense(1) %>%
  layer_activation("sigmoid")
model <- keras_model(input, output)

model %>% compile(
  optimizer = optimizer_rmsprop(lr = FLAGS$learning_rate),
  loss = FLAGS$loss,
  metrics = c('accuracy')
)

history <- model %>% fit(
  x_train, y_train,
  epochs = FLAGS$epoch,
  batch_size = 128,
  validation_split = 0.2,
  verbose = 1,
  callbacks = list(
    callback_model_checkpoint("best_model.h5", save_best_only = TRUE)
  )
)
```

---
## Input data

```r
library(Biostrings)
library(DeepG4)

sequences <- system.file("extdata", "test_G4_data.fa", package = "DeepG4")
sequences <- readDNAStringSet(sequences)
sequences <- sequences %>% DNAToNumerical()
sequences %>% dim()
```

```
## [1]  50 201   5
```

<img src="imgs/oneHotEncoding.svg" width="100%" />

---
## Output

```r
model %>% predict(sequences) %>% head()
```

```
##           [,1]
## [1,] 0.9998598
## [2,] 0.9993761
## [3,] 0.9539083
## [4,] 0.9974855
## [5,] 0.9908580
## [6,] 0.9999917
```

---
### Convolution weights are motifs

```r
require(ggseqlogo)
require(cowplot)
res <- sequences %>% ExtractMotifFromModel(top_kernel = 4)
p.pcm <- lapply(res, function(x){
  ggseqlogo(as.matrix(x)) + ggplot2::theme_classic(base_size = 14)
})
print(plot_grid(plotlist = p.pcm, ncol = 2))
```

.center[
![](index_files/figure-html/unnamed-chunk-38-1.svg)<!-- -->
]

---
class: inverse, center, middle
background-image: url(imgs/logo.svg),url(imgs/cbi.png),url(imgs/logo_UT3_RVB.png),url(imgs/index.jpeg)
background-position: 50% 0%,25% 100%,50% 100%,75% 100%
background-size: 45%,15%,20%,10%

.center[
# Thanks!
]

.center[
[@Github](https://github.com/morphos30/DeepG4)
[@bioRxiv](https://www.biorxiv.org/content/10.1101/2020.07.22.215699v1)
]

<hr />

.large[Vincent ROCHER, Matthieu Genais, Elissar Nassereddine and Raphaël Mourad]

.large[CBI-Toulouse | Chromatin and DNA Repair]