Large-Scale Cost Function Learning for Path Planning using Deep Inverse Reinforcement Learning

Abstract – We present an approach for learning spatial traversability maps for driving in complex, urban environments, based on an extensive dataset of demonstrations by expert human drivers. The direct, end-to-end mapping from raw input data to cost bypasses the effort of manually designing parts of the pipeline, exploits a large number of data samples, and can additionally be framed to refine handcrafted cost maps built from hand-engineered features.

To achieve this, we introduce a maximum-entropy-based, non-linear inverse reinforcement learning (IRL) framework which exploits the capacity of fully convolutional neural networks (FCNs) to represent the cost model underlying driving behaviours. The application of a high-capacity, deep, parametric approach successfully scales to more complex environments and driving behaviours, while its run time at deployment is independent of training dataset size. After benchmarking against state-of-the-art IRL approaches, we focus on demonstrating scalability and performance on an ambitious dataset collected over the course of one year, comprising more than 25,000 demonstration trajectories extracted from over 120 km of urban driving.

We evaluate the resulting cost representations by showing their advantages over a carefully hand-designed cost map, and we further demonstrate robustness to systematic errors by learning accurate representations even in the presence of calibration perturbations. Importantly, we show that a manually designed cost map can be refined to more accurately handle corner cases that are rarely seen in the environment, such as stairs, slopes and underpasses, by further incorporating human priors into the training framework.
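The core training loop of maximum-entropy IRL alternates a soft (entropy-regularised) planning step under the current cost with a gradient step that matches expected state visitations to the expert's. The sketch below illustrates this loop on a toy 1-D gridworld with a per-state linear reward as a stand-in for the paper's FCN cost model; all names, the chain environment, and the hyperparameters are illustrative assumptions, not the authors' code.

```python
import numpy as np

N, H = 5, 10  # number of chain states, planning horizon

def transitions():
    # Deterministic moves on a chain: action 0 = left, action 1 = right.
    left = np.maximum(np.arange(N) - 1, 0)
    right = np.minimum(np.arange(N) + 1, N - 1)
    return [left, right]

def soft_value_iteration(r, P):
    # MaxEnt backup: V_t(s) = logsumexp_a [ r(s) + V_{t+1}(next(s, a)) ],
    # yielding a stochastic policy proportional to exp(Q - V).
    V = np.zeros(N)
    policy = np.zeros((H, N, 2))
    for t in reversed(range(H)):
        Q = np.stack([r + V[P[a]] for a in range(2)], axis=1)  # shape (N, 2)
        Vn = np.log(np.exp(Q).sum(axis=1))
        policy[t] = np.exp(Q - Vn[:, None])
        V = Vn
    return policy

def expected_svf(policy, P, start=0):
    # Forward pass: propagate the state distribution under the soft policy
    # and accumulate expected state visitation frequencies over the horizon.
    d = np.zeros(N); d[start] = 1.0
    svf = d.copy()
    for t in range(H - 1):
        nd = np.zeros(N)
        for a in range(2):
            np.add.at(nd, P[a], d * policy[t, :, a])  # handles duplicate targets
        d = nd
        svf += d
    return svf

# Hypothetical expert statistics: demonstrations walk right and dwell at state N-1.
expert_svf = np.array([1.0, 1.0, 1.0, 1.0, H - 4.0])

P = transitions()
w = np.zeros(N)  # per-state reward parameters (linear stand-in for the FCN)
for _ in range(200):
    policy = soft_value_iteration(w, P)
    grad = expert_svf - expected_svf(policy, P)  # the MaxEnt IRL gradient
    w += 0.05 * grad

print(int(np.argmax(w)))  # learned reward should rank the expert's goal state highest
```

In the paper's deep variant, this same visitation-difference gradient is backpropagated through the FCN to its weights, so the cost is a learned function of the raw sensor input rather than a per-state table.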


  • [PDF] [DOI] M. Wulfmeier, D. Rao, D. Z. Wang, P. Ondruska, and I. Posner, “Large-scale cost function learning for path planning using deep inverse reinforcement learning,” The International Journal of Robotics Research, p. 0278364917722396, 2017.
    [Bibtex]

    @article{WulfmeierIJRR2017,
    author = {Markus Wulfmeier and Dushyant Rao and Dominic Zeng Wang and Peter Ondruska and Ingmar Posner},
    title = {Large-scale cost function learning for path planning using deep inverse reinforcement learning},
    journal = {The International Journal of Robotics Research},
    pages = {0278364917722396},
    year = {2017},
    doi = {10.1177/0278364917722396},
    URL = {http://dx.doi.org/10.1177/0278364917722396},
    pdf = {http://dx.doi.org/10.1177/0278364917722396},
    }

  • [PDF] M. Wulfmeier, D. Rao, and I. Posner, “Incorporating Human Domain Knowledge into Large Scale Cost Function Learning,” in Neural Information Processing Systems Conference, Deep Reinforcement Learning Workshop, 2016.
    [Bibtex]

    @inproceedings{WulfmeierNIPS2016,
    author = {Wulfmeier, Markus and Rao, Dushyant and Posner, Ingmar},
    title = {Incorporating Human Domain Knowledge into Large Scale Cost Function Learning},
    booktitle = {Neural Information Processing Systems Conference, Deep Reinforcement Learning Workshop},
    year = {2016},
    note = {arXiv preprint: http://arxiv.org/abs/1612.04318},
    pdf = {http://arxiv.org/abs/1612.04318},
    }

  • [PDF] M. Wulfmeier, D. Z. Wang, and I. Posner, “Watch This: Scalable Cost-Function Learning for Path Planning in Urban Environments,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2016.
    IROS ABB Best Student Paper Award [Bibtex]

    @inproceedings{WulfmeierIROS2016,
    author = {Wulfmeier, Markus and Wang, Dominic Zeng and Posner, Ingmar},
    title = {{Watch This: Scalable Cost-Function Learning for Path Planning in Urban Environments}},
    booktitle = {{IEEE/RSJ} International Conference on Intelligent Robots and Systems (IROS)},
    year = {2016},
    month = {October},
    note = {arXiv preprint: http://arxiv.org/abs/1607.02329},
    pdf = {http://ieeexplore.ieee.org/document/7759328/},
    award = {IROS ABB Best Student Paper Award},
    awardlink = {http://www.iros2016.org/awards.html},
    }

  • [PDF] M. Wulfmeier, P. Ondruska, and I. Posner, “Maximum Entropy Deep Inverse Reinforcement Learning,” in Neural Information Processing Systems Conference, Deep Reinforcement Learning Workshop, Montreal, Canada, 2015.
    [Bibtex]

    @inproceedings{2015deepIRL,
    author = {Markus Wulfmeier and Peter Ondruska and Ingmar Posner},
    title = {Maximum Entropy Deep Inverse Reinforcement Learning},
    booktitle = {Neural Information Processing Systems Conference, Deep Reinforcement Learning Workshop},
    address = {Montreal, Canada},
    year = {2015},
    note = {arXiv preprint: http://arxiv.org/abs/1507.04888},
    pdf = {http://www.robots.ox.ac.uk/~mobile/Papers/DeepIRL_2015.pdf},
    }