Papers

Papers that we have discussed or plan to discuss during the sessions.

Upcoming

| Date | Title | Presenter |
|------|-------|-----------|
| 30/03/2017 | Neural Multi-Source Morphological Reinflection | Onur Gungor |
| TBA | A Convolutional Neural Network for Modelling Sentences | Çağıl Sönmez |
| TBA | Rich feature hierarchies for accurate object detection and semantic segmentation | İlhan Adıyaman |

Past

| Date | Title | Presenter |
|------|-------|-----------|
| 23/03/2017 | Skip-Thought Vectors | Arda Çelebi |
| 16/03/2017 | Understanding Deep Learning Requires Rethinking Generalization | Miraç Göksu Öztürk |
| 09/03/2017 | Learning to learn by gradient descent by gradient descent | Arda Çelebi |
| 03/02/2017 | Hierarchical Attention Networks for Document Classification | Çağıl Sönmez |
| 02/02/2017 | Understanding Neural Networks through Representation Erasure | Onur Gungor |
| 26/01/2017 | Collaborative Deep Learning For Recommender Systems | Mine Öğretir |
| 19/01/2017 | Relation Extraction: Perspective from Convolutional Neural Networks | Çağıl Sönmez |

Detailed List of Papers Upcoming

Citation: Katharina Kann, Ryan Cotterell and Hinrich Schütze. Neural Multi-Source Morphological Reinflection. EACL. 2017.

From Abstract: We explore the task of multi-source morphological reinflection, which generalizes the standard, single-source version. The input consists of (i) a target tag and (ii) multiple pairs of source form and source tag for a lemma. The motivation is that it is beneficial to have access to more than one source form since different source forms can provide complementary information, e.g., different stems.

Presenter: Onur Gungor

Link: https://arxiv.org/pdf/1612.06027.pdf

More info: Implementation

Citation: Nal Kalchbrenner, Edward Grefenstette and Phil Blunsom. 2014. A Convolutional Neural Network for Modelling Sentences. In Proceedings of ACL 2014.

From Abstract: We describe a convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) that we adopt for the semantic modelling of sentences. The network uses Dynamic k-Max Pooling, a global pooling operation over linear sequences. The network handles input sentences of varying length and induces a feature graph over the sentence that is capable of explicitly capturing short and long-range relations.

Presenter: Çağıl Sönmez

Link: http://aclanthology.info/papers/a-convolutional-neural-network-for-modelling-sentences

More info: Code, Theano implementation, and a sentiment analysis tool written in Theano+Tornado
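The Dynamic k-Max Pooling operation named in the abstract can be illustrated with a minimal sketch. It assumes the feature sequence is a plain Python list; the function names `k_max_pooling` and `dynamic_k` are hypothetical. The rule for choosing k follows our reading of the paper (k scales with sentence length, shrinks with depth, and never drops below a fixed top-level value), so it should be checked against the paper before being relied on.

```python
def k_max_pooling(values, k):
    """Keep the k largest values of a sequence, preserving their original order."""
    if k >= len(values):
        return list(values)
    # Indices of the k largest values; ties go to the earlier position.
    top = sorted(range(len(values)), key=lambda i: (-values[i], i))[:k]
    # Re-sort the selected indices so the output keeps the input order.
    return [values[i] for i in sorted(top)]


def dynamic_k(sentence_length, layer, total_layers, k_top):
    """Choose k for a layer (assumed rule): max(k_top, ceil((L - l) / L * s))."""
    numerator = (total_layers - layer) * sentence_length
    scaled = (numerator + total_layers - 1) // total_layers  # integer ceiling
    return max(k_top, scaled)
```

For example, `k_max_pooling([3, 1, 5, 2, 4], 3)` returns `[3, 5, 4]`: the three largest values survive and stay in sentence order, which is what lets later layers still see relative positions.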

Citation: Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. CVPR 2014.

From Abstract: The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012, achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.

Presenter: İlhan Adıyaman

Link: https://arxiv.org/abs/1311.2524

More info: Implementation

Detailed List of Papers Past

Citation: Thien Huu Nguyen and Ralph Grishman. 2015. Relation extraction: Perspective from convolutional neural networks. In The NAACL Workshop on Vector Space Modeling for NLP (VSM).

From Abstract: In this work, we depart from these traditional approaches with complicated feature engineering by introducing a convolutional neural network for relation extraction that automatically learns features from sentences and minimizes the dependence on external toolkits and resources. Our model takes advantages of multiple window sizes for filters and pre-trained word embeddings as an initializer on a non-static architecture to improve the performance. We emphasize the relation extraction problem with an unbalanced corpus.

Link: http://aclanthology.info/papers/relation-extraction-perspective-from-convolutional-neural-networks

More info: Implementation

Citation: H. Wang, N. Wang, and D. Yeung. Collaborative deep learning for recommender systems. In KDD, 2015.

From Abstract: We generalize recent advances in deep learning from i.i.d. input to non-i.i.d. (CF-based) input and propose in this paper a hierarchical Bayesian model called collaborative deep learning (CDL), which jointly performs deep representation learning for the content information and collaborative filtering for the ratings (feedback) matrix.

Link: http://dl.acm.org/citation.cfm?id=2783273

More info: Related Material

Citation: Li, J., Monroe, W., & Jurafsky, D. (2016). Understanding Neural Networks through Representation Erasure. arXiv preprint arXiv:1612.08220.

From Abstract: In this paper, we propose a general methodology to analyze and interpret decisions from a neural model by observing the effects on the model of erasing various parts of the representation, such as input word-vector dimensions, intermediate hidden units, or input words.

Link: https://arxiv.org/abs/1612.08220
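The erasure idea in the abstract can be sketched on a toy model. This is only an illustration under stated assumptions: the "model" is a hypothetical linear scorer (`linear_score`), erasing means setting one input dimension to zero, and importance is taken as the raw drop in score. The paper works with real neural models and its own importance measure, so this is not the authors' method.

```python
def linear_score(weights, vector):
    """Hypothetical stand-in for a model: a plain dot product."""
    return sum(w * x for w, x in zip(weights, vector))


def erasure_importance(model, vector):
    """Score drop caused by zeroing each input dimension in turn."""
    baseline = model(vector)
    importance = []
    for d in range(len(vector)):
        erased = list(vector)
        erased[d] = 0.0  # erase one dimension, keep the rest intact
        importance.append(baseline - model(erased))
    return importance


weights = [2.0, -1.0, 0.5]
vector = [1.0, 3.0, 4.0]
importance = erasure_importance(lambda v: linear_score(weights, v), vector)
```

Here `importance` is `[2.0, -3.0, 2.0]`: erasing the second dimension raises the score, so it was working against the decision, while the other two dimensions supported it.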

Citation: Zichao Yang, Diyi Yang, Chris Dyer, Xiaodong He, Alex Smola, Eduard Hovy. Hierarchical Attention Networks for Document Classification. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1480-1489. ACL, 2016.

From Abstract: We propose a hierarchical attention network for document classification. Our model has two distinctive characteristics: (i) it has a hierarchical structure that mirrors the hierarchical structure of documents; (ii) it has two levels of attention mechanisms applied at the word- and sentence-level, enabling it to attend differentially to more and less important content when constructing the document representation. Experiments conducted on six large scale text classification tasks demonstrate that the proposed architecture outperform previous methods by a substantial margin. Visualization of the attention layers illustrates that the model selects qualitatively informative words and sentences.

Link: http://aclanthology.info/papers/hierarchical-attention-networks-for-document-classification
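The two levels of attention described in the abstract can be sketched as the same pooling step applied twice: once over the words of each sentence, then over the resulting sentence vectors. This is a simplified illustration, not the paper's architecture: it assumes word vectors are given as plain lists, scores each vector by a dot product with a context vector, and leaves out the recurrent encoders and the learned projection. The names `attention_pool` and `encode_document` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, context):
    """Weight each vector by its similarity to a context vector and sum them."""
    scores = [sum(a * b for a, b in zip(v, context)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(dim)]
    return pooled, weights


def encode_document(document, word_context, sentence_context):
    """Word-level attention per sentence, then sentence-level attention."""
    sentence_vectors = [attention_pool(words, word_context)[0] for words in document]
    return attention_pool(sentence_vectors, sentence_context)
```

Calling `encode_document` on a list of sentences (each a list of word vectors) returns the document vector together with the sentence-level attention weights, which are the quantities the paper visualizes to show which sentences the model attends to.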

Citation: Marcin Andrychowicz, Misha Denil, Sergio Gomez, Matthew W. Hoffman, David Pfau, Tom Schaul, Brendan Shillingford, Nando de Freitas. Learning to learn by gradient descent by gradient descent. NIPS 2016.

From Abstract: Optimization algorithms are still designed by hand. In this paper we show how the design of an optimization algorithm can be cast as a learning problem, allowing the algorithm to learn to exploit structure in the problems of interest in an automatic way.

Link: https://arxiv.org/abs/1606.04474

More info: Implementation

Citation: Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals. Understanding Deep Learning Requires Rethinking Generalization. ICLR 2017.

From Abstract: Our experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data. This phenomenon is qualitatively unaffected by explicit regularization, and occurs even if we replace the true images by completely unstructured random noise. We corroborate these experimental findings with a theoretical construction showing that simple depth two neural networks already have perfect finite sample expressivity as soon as the number of parameters exceeds the number of data points as it usually does in practice.

Link: https://arxiv.org/abs/1611.03530

(23/03/2017) Skip-Thought Vectors

Citation: Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard S. Zemel, Antonio Torralba, Raquel Urtasun, Sanja Fidler. Skip-Thought Vectors. NIPS 2015.

From Abstract: We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the surrounding sentences of an encoded passage. Sentences that share semantic and syntactic properties are thus mapped to similar vector representations. We next introduce a simple vocabulary expansion method to encode words that were not seen as part of training, allowing us to expand our vocabulary to a million words.

Link: https://arxiv.org/abs/1506.06726

More info: Implementation