Ebook: The Logic of Adaptive Behavior
Learning and reasoning in large, structured, probabilistic worlds is at the heart of artificial intelligence. Markov decision processes have become the de facto standard in modeling and solving sequential decision making problems under uncertainty. Many efficient reinforcement learning and dynamic programming techniques exist that can solve such problems. Until recently, the representational state-of-the-art in this field was based on propositional representations. However, it is hard to imagine a truly general, intelligent system that does not conceive of the world in terms of objects and their properties and relations to other objects. To this end, this book studies lifting Markov decision processes, reinforcement learning and dynamic programming to the first-order (or, relational) setting. Based on an extensive analysis of propositional representations and techniques, a methodological translation is constructed from the propositional to the relational setting. Furthermore, this book provides a thorough and complete description of the state-of-the-art. It surveys vital, related historical developments and contains extensive descriptions of several new model-free and model-based solution techniques.
One of my favorite stories is The Library of Babel (1941) by the Argentinean writer and librarian Jorge Luis Borges (1899–1986). In this fantastic story, Borges describes an imaginary library containing all possible books of a specific length, containing a specific number of pages and symbols. The library itself consists of an enormous amount of interconnected hexagonal rooms. Borges describes that “[W]hen it was proclaimed that the Library contained all books, the first impression was one of extravagant happiness. All men felt themselves to be the masters of an intact and secret treasure. There was no personal or world problem whose eloquent solution did not exist in some hexagon.”
For computer scientists and mathematicians, the concept of the universal library immediately activates connections with combinatorics and permutations. In fact, most computer scientists will be able to program in roughly ten minutes an algorithm to generate all possible books (though actually running it to completion will not be possible in their lifetime). The physical concept of the interconnected hexagons has interesting links to graph theory and search spaces. By all means, if the universal library would really exist, every writer could just ‘look up’, or search the connection graph for, the book he or she wants to write, instead of writing it him- or herself. Looking up the right book is, unfortunately, not very easy given that the library is extremely large. Traveling through even the slightest portion of corridors and hexagons will take up more than a lifetime for any mere mortal being. Even stronger, the physical size of the library exceeds the capacity of our universe, as the mathematician Goldbloom Bloch (2008) writes in his exciting book on the library. The labyrinth of the library is thus fictional, yet it is far more intuitive (and fun) to imagine wandering through the physical version. Borges tells us that “[F]or four centuries now men have exhausted the hexagons … There are official searchers, inquisitors. I have seen them in the performance of their function: they always arrive extremely tired from their journeys; they speak of a broken stairway which almost killed them; they talk with the librarian of galleries and stairs; sometimes they pick up the nearest volume and leaf through it, looking for infamous words. Obviously, no one expects to discover anything.”
Wandering through a library, not expecting to discover anything is one of the pleasant things in life, I think, although more modern variants exist through the vast network of interconnected webpages on the internet, where hyperlinks correspond to corridors. Umberto Eco was inspired by Borges' library for his book The Name of the Rose (1980), in which he describes an exciting hunt through a physical version of a similar library. Here though, the travelers were looking for something specific, for knowledge, even though it remains unclear what exactly that is but until the end of the story. Eco acknowledged the inspiration through the librarian in his story, named Jorge von Burgos. And although the travelers in Eco's story were looking for general knowledge, the ultimate form of general knowledge in the library has many connections with compression, orderings and information theory: “[W]e also know of another superstition of that time: that of the Man of the Book. On some shelf in some hexagon (men reasoned) there must exist a book which is the formula and perfect compendium of all the rest: some librarian has gone through it and he is analogous to a god. In the language of this zone vestiges of this remote functionary's cult still persist. Many wandered in search of Him.”
Out of the many interesting ideas in Borges' story, I find this concept of a book that provides a perfect compendium to all other books most intriguing. The existence of this one book that summarizes all other books in whatever ‘best’ way is an extraordinary phenomenon. It has to exist though, given that the library is finite. Yet, I find it hard to imagine what the contents would look like. In a much more modest setting, imagining a book that summarizes a well-chosen set of texts on a particular topic is much more conceivable. In fact, this is what I have tried to do in this book: providing a compendium of all work concerning learning sequential decision making under uncertainty in first-order and relational domains, in whatever ‘best’ way possible. I have wandered through a partially physical, partially electronic, Borgian library of literature, discovered new hexagons and books myself, and I have drawn a map of interesting hexagons, corridors, galleries and books that were visited by other travelers.
Research on this topic began in 1998 and has since then continued to grow. It is now an established subfield of AI and results appear at international scientific fora that deal with topics such as machine learning, knowledge representation, decision making and planning. Furthermore, it is also an active field, and there is a growing attention for its problems and results. Among others, this year the ICAPS 2008 workshop on a reality check for planning and scheduling under uncertainty was organized, as well as the AAAI 2008 workshop on transfer learning for complex tasks. Both are much related to the core problems in this book. Furthermore, there was a tutorial on decision-theoretic planning and learning in relational domains at AAAI 2008, and a tutorial on first-order planning techniques at ICAPS 2008. In the coming years, there will be many more scientific events where results concerning sequential decision making in first-order domains will be presented. However, the research in the past decade has already developed a thorough understanding of, and a relatively established core set of techniques for, the Markov decision process setting in first-order and relational domains. It is exactly this topic that I describe in the book. A similar setting, but for non-relational domains, was described a decade ago in the famous book by Sutton and Barto (1998).
Even though I did my ultimate best to cover every text that was relevant for the topic in this book, I cannot possibly guarantee that there are no hidden hexagons that I missed by mistake. In addition, almost all hexagons I describe contain books that are written in English. Though on my journey, I have encountered several texts written in Japanese and Chinese texts for example, that seemed to describe matters that were of interest to me. Unfortunately my knowledge of these languages is limited. Yet, for all English texts I am confident that I might have discovered them all. This book contains a bibliographic map that describes all relevant hexagons. As we will see, there are cases where fellow travelers were not aware of other travelers having discovered interesting hexagons before them, and others had locked themselves inside a particular hexagon without looking for paths that connected their hexagon with another that contained similar books. As far as possible, I have tried to establish links between hexagons, and books, assembling the knowledge of all travelers that have gone before me, and I have even dug new tunnels connecting hexagons when relevant or necessary.
This book grew out of my recent dissertation (van Otterlo, 2008a). Writing a book like this one has been a long, personal journey through a vast, Borgian library. Fortunately, during this journey, I have met many kind and interesting fellow travelers. Sometimes our paths crossed several times, sometimes I stumbled across them in some remote, dimly lit, hidden hexagon, and some people have watched over me while I was traveling from one hexagon to another. I have had the pleasure to engage in interesting conversations with many of the fellow travelers that I mention in this book. Many talks in person or by e-mail have helped me in increasing my understanding of sometimes complicated matters. A possible list of names would quickly turn into a Borgian space of its own, and thus here I take the opportunity to thank you all at once. I also thank all of my coauthors for working with me on interesting topics. In the recent years, I have had the honor to be a member of the Dutch discussion group on artificial intelligence, EMERGENTIA, and I would like to thank all its members and former members for fun and interesting discussions on many Sunday afternoons.
I owe much to my promotor John-Jules Meyer and assistant-promotor and referent Marco Wiering. Both have watched over my travels from a small distance and contributed to the final success of my journey with their kind, and complementary, support. I also thank all other wise men who agreed to take place in my committee and read my dissertation. I specifically want to thank Luc De Raedt and Joost Kok. Luc, for all the opportunities he has given me in Dagstuhl, Freiburg and Leuven, and also for showing me how science works, and Joost, for giving me the kind opportunity to create this book in this form.
The scientific community has much to offer, but friends and family are most important still. Many thanks go to my parents and my sister and in addition to all friends and people of all kinds of family. Yet, the one I have to – and gratefully want to – thank the most, is my dear Marieke. In addition to love, friendship, irresistible humor, and much support she has given me slightly longer already than the decade I describe in this book, she has always shown a great understanding of me and my journey. I admire that, especially given the scope and length of my travels. I'm looking forward to all our travels still to come.
Leuven, November 2008
Martijn van Otterlo