Institution

Dalle Molle Institute for Artificial Intelligence Research

Facility • Lugano, Switzerland

About: The Dalle Molle Institute for Artificial Intelligence Research is a research facility based in Lugano, Switzerland. It is known for research contributions in the topics of artificial neural networks and reinforcement learning. The organization has 251 authors who have published 1,331 publications receiving 131,103 citations. The organization is also known as IDSIA and as the IDSIA Dalle Molle Institute for Artificial Intelligence.

Topics: Artificial neural network, Reinforcement learning, Recurrent neural network, Bayesian network, Robot

Papers

Journal Article

Long short-term memory

Sepp Hochreiter¹, Jürgen Schmidhuber²•Institutions (2)

Technische Universität München¹, Dalle Molle Institute for Artificial Intelligence Research²

01 Nov 1997-Neural Computation

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

Abstract: Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, back propagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.

72,897 citations
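
The "constant error carousel" named in the abstract above can be illustrated with a small sketch. It is not the paper's full architecture: the gate values are set by hand here, whereas in LSTM they are learned multiplicative units, and the function names (`carousel_step`, `plain_step`, `compare`) and the recurrent weight 0.9 are assumptions made for the illustration. The point is only that a self-connection with fixed weight 1 passes both the stored value and the backpropagated error through unchanged, while an ordinary recurrent weight below 1 shrinks them at every step.

```python
def carousel_step(c_prev, candidate, input_gate):
    """Memory cell update: the self-connection has fixed weight 1.0."""
    return c_prev + input_gate * candidate


def plain_step(s_prev, candidate, w_rec=0.9):
    """Ordinary linear recurrent unit, shown only for comparison."""
    return w_rec * s_prev + candidate


def compare(steps=1000, w_rec=0.9):
    # Step 0: the input gate is open, so both units store the value 0.5.
    c = carousel_step(0.0, candidate=0.5, input_gate=1.0)
    s = plain_step(0.0, candidate=0.5, w_rec=w_rec)
    # Error backflow = product of d(state_t)/d(state_{t-1}) over the time lag.
    flow_c, flow_s = 1.0, 1.0
    for _ in range(steps):
        # Gate closed: the distractor input 0.3 cannot overwrite the cell.
        c = carousel_step(c, candidate=0.3, input_gate=0.0)
        s = plain_step(s, candidate=0.0, w_rec=w_rec)
        flow_c *= 1.0    # carousel derivative is exactly 1
        flow_s *= w_rec  # plain unit derivative shrinks the error each step
    return c, s, flow_c, flow_s
```

Calling `compare(1000)` leaves the carousel state at 0.5 with a backflow factor of 1.0, while the plain unit's state and backflow factor both decay to practically zero over the same 1000-step lag.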

Journal Article

Ant colony system: a cooperative learning approach to the traveling salesman problem

Marco Dorigo¹, Luca Maria Gambardella²•Institutions (2)

Université Libre de Bruxelles¹, Dalle Molle Institute for Artificial Intelligence Research²

01 Apr 1997-IEEE Transactions on Evolutionary Computation

TL;DR: The results show that the ACS outperforms other nature-inspired algorithms such as simulated annealing and evolutionary computation. The paper concludes by comparing ACS-3-opt, a version of the ACS augmented with a local search procedure, to some of the best performing algorithms for symmetric and asymmetric TSPs.

Abstract: This paper introduces the ant colony system (ACS), a distributed algorithm that is applied to the traveling salesman problem (TSP). In the ACS, a set of cooperating agents called ants cooperate to find good solutions to TSPs. Ants cooperate using an indirect form of communication mediated by a pheromone they deposit on the edges of the TSP graph while building solutions. We study the ACS by running experiments to understand its operation. The results show that the ACS outperforms other nature-inspired algorithms such as simulated annealing and evolutionary computation, and we conclude comparing ACS-3-opt, a version of the ACS augmented with a local search procedure, to some of the best performing algorithms for symmetric and asymmetric TSPs.

7,596 citations
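
The pheromone-mediated cooperation described in the ACS abstract above can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the function names, the parameter values (`beta`, `rho`, `q0`, number of ants and iterations) and the initial pheromone level `tau0` are all assumptions chosen for the example, and no local search (the "3-opt" part) is included. Ants build tours edge by edge, preferring short edges with strong pheromone, and the best tour found so far is reinforced after each iteration.

```python
import math
import random


def tour_length(tour, dist):
    """Length of the closed tour that returns to its starting city."""
    n = len(tour)
    return sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))


def ant_colony_tsp(dist, n_ants=5, n_iters=50, beta=2.0, rho=0.1, q0=0.9, seed=0):
    rng = random.Random(seed)
    n = len(dist)
    tau0 = 1.0 / (n * sum(dist[0][1:]))  # rough initial pheromone (assumption)
    tau = [[tau0] * n for _ in range(n)]
    best_tour, best_len = None, math.inf
    for _ in range(n_iters):
        for _ in range(n_ants):
            tour = [rng.randrange(n)]
            while len(tour) < n:
                i = tour[-1]
                choices = [j for j in range(n) if j not in tour]
                # Attractiveness: pheromone times inverse distance to the power beta.
                scores = [tau[i][j] * (1.0 / dist[i][j]) ** beta for j in choices]
                if rng.random() < q0:
                    j = choices[scores.index(max(scores))]       # exploit best edge
                else:
                    j = rng.choices(choices, weights=scores)[0]  # biased exploration
                # Local update: pull the used edge back toward tau0.
                tau[i][j] = tau[j][i] = (1 - rho) * tau[i][j] + rho * tau0
                tour.append(j)
            length = tour_length(tour, dist)
            if length < best_len:
                best_tour, best_len = tour, length
        # Global update: only the best tour so far deposits pheromone.
        for k in range(n):
            a, b = best_tour[k], best_tour[(k + 1) % n]
            tau[a][b] = tau[b][a] = (1 - rho) * tau[a][b] + rho / best_len
    return best_tour, best_len
```

The function expects a symmetric distance matrix with strictly positive off-diagonal entries and returns the best tour found together with its length.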

Proceedings Article

Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks

Alex Graves¹, Santiago Fernández¹, Faustino Gomez¹, Jürgen Schmidhuber²•Institutions (2)

Dalle Molle Institute for Artificial Intelligence Research¹, Technische Universität München²

25 Jun 2006

TL;DR: This paper presents a novel method for training RNNs to label unsegmented sequences directly, removing the need for both pre-segmented training data and post-processing of the network outputs.

Abstract: Many real-world sequence learning tasks require the prediction of sequences of labels from noisy, unsegmented input data. In speech recognition, for example, an acoustic signal is transcribed into words or sub-word units. Recurrent neural networks (RNNs) are powerful sequence learners that would seem well suited to such tasks. However, because they require pre-segmented training data, and post-processing to transform their outputs into label sequences, their applicability has so far been limited. This paper presents a novel method for training RNNs to label unsegmented sequences directly, thereby solving both problems. An experiment on the TIMIT speech corpus demonstrates its advantages over both a baseline HMM and a hybrid HMM-RNN.

5,188 citations
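
The abstract above does not spell out how per-frame network outputs become a label sequence. As background, CTC adds a "blank" symbol and maps a frame-level path to a labelling by merging repeated symbols and then dropping blanks. The sketch below shows that mapping together with a simple best-path decoder. The names `ctc_collapse` and `greedy_decode`, the choice of `"-"` as the blank symbol, and the dictionary representation of per-frame probabilities are assumptions made for the example, and best-path decoding is only an approximation to the most probable labelling.

```python
from itertools import groupby


def ctc_collapse(path, blank="-"):
    """Merge repeated symbols, then remove blanks."""
    merged = [label for label, _ in groupby(path)]
    return [label for label in merged if label != blank]


def greedy_decode(frames, blank="-"):
    """Best-path decoding: take the most probable symbol in every frame."""
    path = [max(frame, key=frame.get) for frame in frames]
    return ctc_collapse(path, blank)
```

For instance, `ctc_collapse("hh-e-ll-lo-")` yields the letters of "hello": the blank between the two runs of "l" is what allows a doubled letter to survive the merge step.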

Proceedings Article

Multi-column deep neural networks for image classification

Dan Ciresan¹, Ueli Meier¹, Jürgen Schmidhuber¹•Institutions (1)

Dalle Molle Institute for Artificial Intelligence Research¹

16 Jun 2012

TL;DR: This paper proposes biologically plausible, wide and deep artificial neural network architectures that match human performance on tasks such as the recognition of handwritten digits or traffic signs, and that are the first to achieve near-human performance on the MNIST benchmark.

Abstract: Traditional methods of computer vision and machine learning cannot match human performance on tasks such as the recognition of handwritten digits or traffic signs. Our biologically plausible, wide and deep artificial neural network architectures can. Small (often minimal) receptive fields of convolutional winner-take-all neurons yield large network depth, resulting in roughly as many sparsely connected neural layers as found in mammals between retina and visual cortex. Only winner neurons are trained. Several deep neural columns become experts on inputs preprocessed in different ways; their predictions are averaged. Graphics cards allow for fast training. On the very competitive MNIST handwriting benchmark, our method is the first to achieve near-human performance. On a traffic sign recognition benchmark it outperforms humans by a factor of two. We also improve the state-of-the-art on a plethora of common image classification benchmarks.

3,717 citations
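
The abstract above states that the predictions of several columns are averaged. A minimal sketch of that combination step, assuming each column outputs one probability per class, might look like the following. The function name `average_columns` and the example numbers are hypothetical, and the columns themselves (the convolutional networks and their preprocessing) are not modelled here.

```python
def average_columns(column_probs):
    """Average per-class probabilities over columns and pick the top class."""
    n_cols = len(column_probs)
    n_classes = len(column_probs[0])
    avg = [sum(col[k] for col in column_probs) / n_cols for k in range(n_classes)]
    label = max(range(n_classes), key=avg.__getitem__)
    return avg, label


# Three hypothetical columns voting over three classes.
columns = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
]
avg, label = average_columns(columns)  # label is 1 for these numbers
```

In this example the first column prefers class 0 on its own, but the averaged prediction follows the other two columns and selects class 1.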

Journal Article

Learning to Forget: Continual Prediction with LSTM

Felix A. Gers¹, Jürgen Schmidhuber¹, Fred Cummins¹•Institutions (1)

Dalle Molle Institute for Artificial Intelligence Research¹

01 Oct 2000-Neural Computation

TL;DR: This work identifies a weakness of LSTM networks processing continual input streams that are not a priori segmented into subsequences with explicitly marked ends at which the network's internal state could be reset, and proposes a novel, adaptive forget gate that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources.

Abstract: Long short-term memory (LSTM; Hochreiter & Schmidhuber, 1997) can solve numerous tasks not solvable by previous learning algorithms for recurrent neural networks (RNNs). We identify a weakness of LSTM networks processing continual input streams that are not a priori segmented into subsequences with explicitly marked ends at which the network's internal state could be reset. Without resets, the state may grow indefinitely and eventually cause the network to break down. Our remedy is a novel, adaptive "forget gate" that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources. We review illustrative benchmark problems on which standard LSTM outperforms other RNN algorithms. All algorithms (including LSTM) fail to solve continual versions of these problems. LSTM with forget gates, however, easily solves them, and in an elegant way.

3,135 citations
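
The effect described in the abstract above, a cell state that grows without bound on a continual stream unless it can be reset, can be sketched with a scalar cell. This is a simplified illustration under stated assumptions: the gates are held at constant values, whereas in the paper the forget gate is a learned, adaptive unit, and the function name `cell_state_stream` and the input values are hypothetical.

```python
def cell_state_stream(inputs, forget_gate=1.0, input_gate=1.0):
    """Run a scalar memory cell over a stream and record its state."""
    c = 0.0
    trace = []
    for g in inputs:
        # forget_gate == 1.0 reproduces the original cell with no forgetting.
        c = forget_gate * c + input_gate * g
        trace.append(c)
    return trace


stream = [0.5] * 1000
unbounded = cell_state_stream(stream, forget_gate=1.0)  # keeps growing
bounded = cell_state_stream(stream, forget_gate=0.9)    # settles near 5.0
```

With the forget gate fixed at 1.0 the state reaches 500.0 after 1000 steps and would keep growing. With a forget gate of 0.9 the state approaches 0.5 / (1 - 0.9) = 5.0 and stays there.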

Authors

Showing all 256 results

Name	H-index	Papers	Citations
Marco Dorigo	105	657	91418
Jürgen Schmidhuber	99	539	122453
Emanuele Zucca	77	470	29055
Luca Maria Gambardella	71	336	35327
Petros Koumoutsakos	71	375	21325
Aude Billard	66	382	15554
Auke Jan Ijspeert	65	447	19103
Alex Graves	63	102	84198
Julian Togelius	58	420	13135
Daan Wierstra	51	71	51290
Klaus Jansen	45	344	6477
Paul Smolensky	42	178	15145
Sepp Hochreiter	42	168	72856
Tom Schaul	41	104	17380
Douglas Eck	40	127	6082

Network Information

Related Institutions (5)

Institution	Papers	Citations	Related
Google	39.8K	2.1M	(not listed)
(name not listed)	38.6K	1.3M	92%
Microsoft	86.9K	4.1M	91%
Facebook	10.9K	570.1K	90%
Carnegie Mellon University	104.3K	5.9M	90%

Performance

Metrics

Papers: 1,345
Citations: 167,947

No. of papers from the Institution in previous years
Year	Papers
2023	3
2022	10
2021	94
2020	101
2019	80
2018	69