Professor Jürgen Schmidhuber
Jürgen Schmidhuber, Ph.D., Habil., is Co-Director of the Swiss AI lab IDSIA in Lugano, Switzerland, Professor of Cognitive Robotics and Computer Science at TU Munich (Extraordinarius), Adjunct Professor of the Faculty of Computer Science at the University of Lugano, and Professor at SUPSI in Manno, Switzerland.
Jürgen is well known for his work on machine learning, universal Artificial Intelligence (AI), artificial neural networks, digital physics, and low-complexity art. His contributions also include generalizations of Kolmogorov complexity and the Speed Prior.
Recurrent Neural Networks
The dynamic recurrent neural networks developed in his lab are
simplified mathematical models of the biological neural networks found
in human brains. A particularly successful model of this type is called
Long Short-Term Memory (LSTM). From training sequences it learns to solve numerous tasks that previous such models could not. Applications range from automatic music composition to speech recognition, reinforcement learning, and robotics in partially observable environments.
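As a concrete illustration, here is a minimal sketch of one forward step of an LSTM cell in plain NumPy; the gate layout, weight shapes, and random initialization are illustrative assumptions, not details from the original papers.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def lstm_step(x, h_prev, c_prev, W, b):
    """One forward step of a standard LSTM cell (illustrative sketch).
    x: input vector; h_prev, c_prev: previous hidden and cell states.
    W stacks the input, forget, output, and candidate weights row-wise."""
    H = h_prev.size
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0:H])            # input gate: admit new information
    f = sigmoid(z[H:2 * H])        # forget gate: decay old cell contents
    o = sigmoid(z[2 * H:3 * H])    # output gate: expose the cell state
    g = np.tanh(z[3 * H:4 * H])    # candidate cell update
    c = f * c_prev + i * g         # cell state: the "constant error carousel"
    h = o * np.tanh(c)             # new hidden state
    return h, c

# Toy usage: a randomly initialized cell processing a short sequence.
rng = np.random.default_rng(0)
X, H = 3, 5                        # input and hidden sizes (arbitrary)
W = rng.normal(scale=0.1, size=(4 * H, X + H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for t in range(10):
    h, c = lstm_step(rng.normal(size=X), h, c, W, b)
```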
Artificial Evolution / Genetic Programming
As an undergrad at TUM, Jürgen evolved computer programs through genetic algorithms. The method was published in 1987 as one of the
first papers in the emerging field that later became known as genetic
programming. Since then he has coauthored numerous additional papers
on artificial evolution. Applications include robot control, soccer
learning, drag minimization, and time series prediction.
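As a toy illustration of evolving programs with a genetic algorithm, the sketch below evolves small arithmetic expressions toward a target function; the tree representation, mutation operator, and fitness measure are all illustrative choices, not the 1987 method.

```python
import random

random.seed(1)
OPS = ["+", "-", "*"]

def target(x):
    return x * x + x                 # function the evolved programs should fit

XS = [float(i) for i in range(-5, 6)]

def random_expr(depth=3):
    """Random expression tree over the variable x and small constants."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(["x", str(random.randint(-2, 2))])
    return (random.choice(OPS), random_expr(depth - 1), random_expr(depth - 1))

def evaluate(e, x):
    if e == "x":
        return x
    if isinstance(e, str):
        return float(e)
    op, a, b = e
    a, b = evaluate(a, x), evaluate(b, x)
    return a + b if op == "+" else a - b if op == "-" else a * b

def fitness(e):
    """Lower is better: squared error against the target on sample points."""
    return sum((evaluate(e, x) - target(x)) ** 2 for x in XS)

def mutate(e):
    """Replace a random subtree with a fresh random one."""
    if isinstance(e, str) or random.random() < 0.3:
        return random_expr(2)
    op, a, b = e
    return (op, mutate(a), b) if random.random() < 0.5 else (op, a, mutate(b))

pop = [random_expr() for _ in range(50)]
for _ in range(40):
    pop.sort(key=fitness)            # select the fittest programs ...
    pop = pop[:10] + [mutate(random.choice(pop[:10])) for _ in range(40)]
pop.sort(key=fitness)
print(pop[0], "error:", fitness(pop[0]))
```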
Neural Economy
In 1989 he created the first learning algorithm for neural networks
based on principles of the market economy (inspired by John Holland’s
bucket brigade algorithm for classifier systems): adaptive neurons compete to be active in response to certain input patterns; those that are active when there is external reward get stronger synapses, but active neurons have to pay those that activated them by transferring parts of their synapse strengths, thus rewarding “hidden” neurons that set the stage for later success.
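A minimal sketch of such an economy, assuming a single chain of units active in sequence; the payment fraction and the wealth-as-synapse-strength bookkeeping are illustrative assumptions, not the 1989 algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units = 8
strength = rng.uniform(0.1, 1.0, size=n_units)  # synapse strength = "wealth"
PAY = 0.1                                       # fraction paid to one's activator

def settle(active_chain, reward):
    """Bucket-brigade-style credit assignment for one episode (sketch).
    active_chain lists unit indices in order of activation; the final
    unit collects the external reward, then each active unit pays the
    unit that activated it, so credit trickles back to "hidden" units."""
    strength[active_chain[-1]] += reward
    for payer, payee in zip(reversed(active_chain), reversed(active_chain[:-1])):
        payment = PAY * strength[payer]
        strength[payer] -= payment
        strength[payee] += payment

settle([0, 3, 5, 2], reward=1.0)   # unit 2 fired last and earned the reward
```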
Artificial Curiosity
In 1990 he published the first in a long series of papers on artificial
curiosity for an autonomous agent. The agent is equipped with an
adaptive predictor trying to predict future events from the history of
previous events and actions. A reward-maximizing, adaptive reinforcement learning controller steers the agent and receives a curiosity reward for executing action sequences that improve the predictor. This
discourages it from executing actions leading to boring outcomes that
are either predictable or totally unpredictable. Instead the controller
is motivated to learn actions that help the predictor to learn new,
previously unknown regularities in its environment, thus improving its
model of the world, which in turn can greatly help to solve externally
given tasks. This has become an important concept of developmental
robotics.
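The core idea can be sketched as follows: the intrinsic reward is the predictor's improvement caused by the latest experience. The tabular predictor and its error measure are illustrative assumptions; as noted in the code, the published formulations measure learning progress more carefully.

```python
import numpy as np

class TabularPredictor:
    """Predicts the next observation from the current one (illustrative)."""
    def __init__(self, n_obs, lr=0.2):
        self.p = np.full((n_obs, n_obs), 1.0 / n_obs)  # P(next | current)
        self.lr = lr

    def error(self, obs, nxt):
        return 1.0 - self.p[obs, nxt]      # low if the transition is expected

    def learn(self, obs, nxt):
        target = np.zeros(self.p.shape[1])
        target[nxt] = 1.0
        self.p[obs] += self.lr * (target - self.p[obs])

def curiosity_reward(pred, obs, nxt):
    """Intrinsic reward = the predictor's error reduction on the latest
    transition. Already-predictable transitions yield ~0 reward. (This
    naive measure can still reward irreducible noise; the published
    formulations measure genuine learning progress to avoid that.)"""
    before = pred.error(obs, nxt)
    pred.learn(obs, nxt)
    return before - pred.error(obs, nxt)

pred = TabularPredictor(n_obs=4)
print(curiosity_reward(pred, obs=0, nxt=2))  # novel transition: larger reward
print(curiosity_reward(pred, obs=0, nxt=2))  # repeated: smaller reward
```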
Unsupervised Learning / Factorial Codes
During the early 1990s Jürgen also invented a neural method for
nonlinear independent component analysis (ICA) called predictability
minimization. It is based on co-evolution of adaptive predictors and
initially random, adaptive feature detectors processing input patterns
from the environment. For each detector there is a predictor trying to
predict its current value from the values of neighboring detectors,
while each detector is simultaneously trying to become as unpredictable
as possible. It can be shown that the best the detectors can do is to
create a factorial code of the environment, that is, a code that
conveys all the information about the inputs such that the code
components are statistically independent, which is desirable for many
pattern recognition applications.
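The opposing objectives can be sketched like this, assuming batch outputs and a squared-error measure (both illustrative choices):

```python
import numpy as np

# Predictability minimization as opposing objectives (sketch).
# y is a batch of code vectors: y[t, i] = output of detector i at time t.
# For each i, a predictor produces y_hat[:, i] from the other components
# y[:, j != i]; the two players optimize the same error in opposite ways.

def predictor_loss(y_hat_i, y_i):
    """Each predictor MINIMIZES its squared prediction error."""
    return np.mean((y_hat_i - y_i) ** 2)

def detector_loss(y_hat_i, y_i):
    """Each detector MAXIMIZES that same error (minimizes its negation),
    pushing the code components toward statistical independence and
    hence toward a factorial code of the inputs."""
    return -np.mean((y_hat_i - y_i) ** 2)
```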
Kolmogorov Complexity / Computer-Generated Universe
In 1997 Jürgen published a paper based on Konrad Zuse’s
assumption (1967) that the history of the universe is computable. He
pointed out that the simplest explanation of the universe would be a
very simple Turing machine programmed to systematically execute all
possible programs computing all possible histories for all types of
computable physical laws. He also pointed out that there is an
optimally efficient way of computing all computable universes based on
Leonid Levin’s universal search algorithm (1973). In 2000 he expanded
this work by combining Ray Solomonoff’s theory of inductive inference
with the assumption that quickly computable universes are more likely
than others. This work on digital physics also led to limit-computable generalizations of algorithmic information or Kolmogorov complexity, and to the concept of Super Omegas: limit-computable numbers that are (in a certain sense) even more random than Gregory Chaitin’s “number of wisdom” Omega.
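The schedule behind Levin's universal search can be sketched as follows; the run callback and the bit-string notion of “program” are assumptions made for illustration.

```python
from itertools import product

def levin_search(run, max_phase=12):
    """Levin-style universal search schedule (sketch). In phase k, every
    bit-string program p with len(p) <= k is run for 2**(k - len(p))
    steps, so short programs get exponentially more time and total
    effort per phase stays bounded. `run(p, steps)` is an assumed
    callback: it executes "program" p for the given step budget and
    returns a non-None result on success."""
    for k in range(1, max_phase + 1):
        for n in range(0, k + 1):
            for bits in product("01", repeat=n):
                p = "".join(bits)
                result = run(p, 2 ** (k - n))
                if result is not None:
                    return p, result
    return None

# Toy usage: the "solver" succeeds once the enumeration reaches "101".
print(levin_search(lambda p, steps: p if p == "101" else None))
```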
Universal AI
Important recent research topics of his group include universal
learning algorithms and universal AI. Contributions include the first
theoretically optimal decision makers living in environments obeying
arbitrary unknown but computable probabilistic laws, and mathematically
sound general problem solvers such as the remarkable asymptotically
fastest algorithm for all well-defined problems, by his former postdoc
Marcus Hutter. Based on the theoretical results obtained in the early 2000s, he is actively promoting the view that in the new millennium the field of general AI has matured and become a real formal science.
An old dream of computer scientists is to build an optimally efficient universal problem solver. Jürgen uses Gödel’s self-reference trick to achieve this. A Gödel Machine is a computer whose original software
includes axioms describing the hardware and the original
software (this is possible without circularity) plus whatever is
known about the (probabilistic) environment plus some formal goal in the form of an arbitrary user-defined utility function, e.g.,
cumulative future expected reward in a sequence of optimization
tasks. The original software also includes a proof searcher which
uses the axioms (and possibly an online variant of Levin’s universal search) to systematically generate pairs (“proof”, “program”)
until it finds a proof that a rewrite of the original software
through “program” will increase utility. The machine can be
designed such that each self-rewrite is necessarily globally
optimal in the sense of the utility function, even those rewrites
that destroy the proof searcher.
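The top-level control flow might be sketched like this; next_pair and proves_utility_gain are hypothetical placeholders standing in for the proof searcher, since the real machinery is a formal proof system rather than a few lines of code.

```python
def next_pair(axioms):
    """Hypothetical stand-in for the proof searcher: systematically
    generate the next ("proof", "program") candidate pair, e.g. via an
    online variant of Levin's universal search."""
    raise NotImplementedError

def proves_utility_gain(proof, program, software, utility):
    """Hypothetical stand-in for proof checking: does `proof` derive,
    from the axioms about hardware, software, and environment, that the
    rewrite `program` will increase expected utility?"""
    raise NotImplementedError

def godel_machine(software, axioms, utility):
    """Top-level control flow of a Gödel Machine (conceptual sketch)."""
    while True:
        proof, program = next_pair(axioms)
        if proves_utility_gain(proof, program, software, utility):
            software = program(software)  # provably beneficial self-rewrite
```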
Low-Complexity Art / Theory of Beauty
Jürgen’s low-complexity artworks (since 1997) can be described by
very short computer programs containing very few bits of information,
and reflect his formal theory of beauty based on the concepts of
Kolmogorov complexity and minimum description length.
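In that spirit, here is a deliberately tiny program whose entire output is fixed by a few constants; it is a generic illustration of the low-complexity idea, not one of Jürgen's artworks.

```python
import math

# A few bits of program, a whole image: concentric rings whose look is
# fully determined by two constants (canvas size and ring spacing).
SIZE, SPACING = 40, 4
for y in range(SIZE):
    print("".join(
        "#" if int(math.hypot(x - SIZE / 2, y - SIZE / 2)) % SPACING == 0 else " "
        for x in range(SIZE)))
```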
He writes that since age 15 or so his main scientific ambition
has been to build an optimal scientist, then retire. First he wants to
build a scientist better than himself (he quips that his colleagues claim that should be easy) who will then do the remaining
work. He says he “cannot see any more efficient way of using and
multiplying the little creativity he’s got”.
Publications
Jürgen’s more than 100 peer-reviewed articles cover Recurrent Neural Networks; Artificial Evolution; Active Exploration, Artificial Curiosity & What’s Interesting; Nonlinear Independent Component Analysis (ICA), Unsupervised Learning & Redundancy Reduction; Generalized Algorithmic Information; Generalized Algorithmic Probability; Super Omegas; the Speed Prior, a new simplicity measure for near-optimal computable predictions (based on the fastest way of describing objects, not the shortest); Computable Universes & the Algorithmic Theory of Everything; and the Theory of Beauty & Low-Complexity Art. Read the full list of his publications!
Jürgen earned his diploma in computer science in 1987, his Ph.D. in 1991, and his Habilitation in 1993, all from TUM.
Watch “In the beginning was the code”: Jürgen Schmidhuber at TEDxUHasselt.