class: center, middle, inverse, title-slide

# Deep Learning and R
## SEST - 2018
### Daniel Falbel (@Curso-R)
dfalbel@curso-r.com
### 24/05/2018

---

## Hi!

- Bachelor's degree in Statistics (2015)
- Partner at [Curso-R](http://curso-r.com)
- Partner at [R6](http://rseis.com.br)

---

## Deep Learning

* Neural networks with many layers.
* From 2012 onward it achieved, for the first time, superhuman results on image recognition tasks.
* More recently it has been very successful on image, audio, and text classification problems.

---

## Some applications

### [Show and Tell](https://ai.googleblog.com/2016/09/show-and-tell-image-captioning-open.html)

---

### [Zero Shot Translation](https://ai.googleblog.com/2016/11/zero-shot-translation-with-googles.html)

---

### [Speech Generation](https://google.github.io/tacotron/publications/tacotron2/index.html)

---

### [The Case for Learned Index Structures](https://arxiv.org/abs/1712.01208)

---

class: inverse, middle, center

## Example

### [Classifying duplicate questions from Quora with Keras](https://tensorflow.rstudio.com/blog/keras-duplicate-questions-quora.html)

---

## Keras

* An API for specifying *Deep Learning* models quickly and intuitively.
* Created by François Chollet (@fchollet).
* Originally implemented in `python`.

---

## Keras

* An API with multiple implementations.

---

## Keras + R

* R package: [`keras`](https://github.com/rstudio/keras).
* Built on [reticulate](https://github.com/rstudio/reticulate).
* Developed by JJ Allaire (CEO of RStudio).
* Has an R-like syntax that uses `%>%`.

---

* [Quora](https://www.quora.com): a general-purpose question-and-answer site.
* For Quora users, it is better to have a single version of each question.
* Dataset from a [Kaggle competition](https://www.kaggle.com/c/quora-question-pairs).
* ~400k question pairs labeled as duplicates (or not) by the site's moderators.
* **Goal:** identify the question pairs that have the same _meaning_.
* Before the competition the problem was solved with Random Forests; afterwards they moved to [Deep Learning](https://engineering.quora.com/Semantic-Question-Matching-with-Deep-Learning).

---

### Duplicates

<div>
.pull-left[
* How can I be a good geologist?
* How do I read and find my YouTube comments?
* What can make Physics easy to learn?
]
.pull-right[
- What should I do to be a great geologist?
- How can I see all my Youtube comments?
- How can you make physics easy to learn?
]
</div>

### Non-duplicates

<div>
.pull-left[
* How can I increase the speed of my internet connection while using a VPN?
* What is the step by step guide to invest in share market in india?
* How do I get over my ex's past?
]
.pull-right[
* How can Internet speed be increased by hacking through DNS?
* What is the step by step guide to invest in share market?
* What is the best way to get over your ex?
]
</div>

---

## Model architecture

<br>
<br>

* Siamese LSTM

---

#### Embedding

---

## LSTM

---

## Architecture in Keras

<br>
<br>

---

## Code

```r
library(keras)
```

--

```r
input1 <- layer_input(shape = c(20), name = "input_question1")
input2 <- layer_input(shape = c(20), name = "input_question2")
```

--

```r
word_embedder <- layer_embedding(
  input_dim = 50000,  # vocab size
  output_dim = 128,   # hyperparameter - embedding size
  input_length = 20   # padding size
)
```

--

```r
seq_embedder <- layer_lstm(units = 128)
```

--

```r
vector1 <- input1 %>% word_embedder() %>% seq_embedder()
vector2 <- input2 %>% word_embedder() %>% seq_embedder()
```

--

```r
#> Tensor("lstm_1/TensorArrayReadV3:0", shape=(?, 128), dtype=float32)
#> Tensor("lstm_1_1/TensorArrayReadV3:0", shape=(?, 128), dtype=float32)
```

---

## Model architecture

<br>
<br>

---

## Code

```r
# dot product; pass normalize = TRUE to layer_dot() for a true cosine similarity
cosine_similarity <- layer_dot(list(vector1, vector2), axes = 1)
```

--

```r
output <- cosine_similarity %>%
  layer_dense(units = 1, activation = "sigmoid")
```

--

```r
model <- keras_model(list(input1, input2), output)
model %>% compile(
  optimizer = "adam",
  metrics = list(acc = metric_binary_accuracy),
  loss = "binary_crossentropy"
)
```

---

```r
summary(model)
```

```r
# _______________________________________________________________________________________
# Layer (type)                 Output Shape       Param #    Connected to
# =======================================================================================
# input_question1 (InputLayer  (None, 20)         0
# _______________________________________________________________________________________
# input_question2 (InputLayer  (None, 20)         0
# _______________________________________________________________________________________
# embedding_1 (Embedding)      (None, 20, 128)    6400256    input_question1[0][0]
#                                                            input_question2[0][0]
# _______________________________________________________________________________________
# lstm_1 (LSTM)                (None, 128)        131584     embedding_1[0][0]
#                                                            embedding_1[1][0]
# _______________________________________________________________________________________
# dot_1 (Dot)                  (None, 1)          0          lstm_1[0][0]
#                                                            lstm_1[1][0]
# _______________________________________________________________________________________
# dense_1 (Dense)              (None, 1)          2          dot_1[0][0]
# =======================================================================================
# Total params: 6,531,842
# Trainable params: 6,531,842
# Non-trainable params: 0
# _______________________________________________________________________________________
```

---

## Training

```r
model %>% fit(
  list(train_question1_padded, train_question2_padded),
  train_is_duplicate,
  batch_size = 64,
  epochs = 10,
  validation_data = list(
    list(val_question1_padded, val_question2_padded),
    val_is_duplicate
  )
)
```

```r
# Train on 363861 samples, validate on 40429 samples
# Epoch 1/10
# 363861/363861 [==============================] - 89s 245us/step - loss: 0.5860 - acc: 0.7248 - val_loss: 0.5590 - val_acc: 0.7449
# Epoch 2/10
# 363861/363861 [==============================] - 88s 243us/step - loss: 0.5528 - acc: 0.7461 - val_loss: 0.5472 - val_acc: 0.7510
# Epoch 3/10
# 363861/363861 [==============================] - 88s 242us/step - loss: 0.5428 - acc: 0.7536 - val_loss: 0.5439 - val_acc: 0.7515
```

---

class: inverse, middle, center

## Predictions with Shiny

[Shiny](https://jjallaire.shinyapps.io/shiny-quora/)

---

## More

* [Keras examples gallery](https://keras.rstudio.com/articles/examples/index.html).
* [Book: Deep Learning with R](https://www.amazon.com/Deep-Learning-R-Francois-Chollet/dp/161729554X)
* [Blog: Tensorflow for R](https://tensorflow.rstudio.com/blog.html)

---

class: inverse, middle, center

## Thank you

dfalbel@curso-r.com

github.com/dfalbel

sest.curso-r.com
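
---

## Appendix: checking the parameter counts

The parameter counts printed by `summary(model)` can be reproduced by hand. The sketch below is a base-R check that is not part of the original deck. The helper names (`embedding_params`, `lstm_params`, `dense_params`) are hypothetical, and the formulas assume the usual Keras layouts: an embedding is a rows-by-dimension matrix, an LSTM has four gates that each hold input weights, recurrent weights and a bias vector, and a dense layer holds weights plus a bias.

```r
# Hypothetical helpers (not keras functions): closed-form parameter counts.
embedding_params <- function(input_dim, output_dim) input_dim * output_dim

# Four gates, each with input weights, recurrent weights and one bias vector.
lstm_params <- function(input_size, units) {
  4 * ((input_size + units) * units + units)
}

# One weight per input plus one bias, for every output unit.
dense_params <- function(input_size, units) (input_size + 1) * units

# The LSTM receives 128-dimensional embeddings and has 128 units.
lstm_n <- lstm_params(input_size = 128, units = 128)

# The dense layer receives the single value produced by the dot layer.
dense_n <- dense_params(input_size = 1, units = 1)

# The summary reports 6,400,256 embedding parameters; dividing by the
# embedding size recovers the number of rows in the embedding matrix.
embedding_rows <- 6400256 / 128

# The embedding and LSTM are shared by both questions, so each counts once.
total <- embedding_params(embedding_rows, 128) + lstm_n + dense_n

print(c(lstm = lstm_n, dense = dense_n,
        embedding_rows = embedding_rows, total = total))
```

The LSTM, dense, and total figures match the summary. The embedding count corresponds to 50,002 rows rather than the `input_dim = 50000` shown in the slide code, so the printed summary was presumably produced with a slightly larger vocabulary setting; that explanation is an inference from the numbers, not something the slides state.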