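Last night the west wind withered the green trees; alone I climbed the high tower and gazed down the road to the world's end.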

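Gazing Down the Road to the World's End

November, and cold. “Last night the west wind withered the green trees; alone I climbed the high tower and gazed down the road to the world's end.”

The students have started the final-year project reports for their master's degrees.

In July 2017, 類研究猿 prepared a deck of 333 slides explaining how to complete a final-year project, and posted it here. Now the students ask: could we have some Chinese-language resources?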

Domestic Videos

類研究猿 provides links to 10 videos, all from “輯思編譯SCI論文編輯”.

Benchmark Papers

類研究猿 provides links to 14 papers. They are the best papers at the Association for the Advancement of Artificial Intelligence (AAAI) conference since 2010.

2019

Yonathan Efroni, Gal Dalal, Bruno Scherrer and Shie Mannor. How to Combine Tree-Search Methods in Reinforcement Learning

Abstract: Finite-horizon lookahead policies are abundantly used in Reinforcement Learning and demonstrate impressive empirical success. Usually, the lookahead policies are implemented with specific planning methods such as Monte Carlo Tree Search (e.g. in AlphaZero). Referring to the planning problem as tree search, a reasonable practice in these implementations is to back up the value only at the leaves while the information obtained at the root is not leveraged other than for updating the policy. Here, we question the potency of this approach. Namely, the latter procedure is non-contractive in general, and its convergence is not guaranteed. Our proposed enhancement is straightforward and simple: use the return from the optimal tree path to back up the values at the descendants of the root. This leads to a γ^h-contracting procedure, where γ is the discount factor and h is the tree depth. To establish our results, we first introduce a notion called multiple-step greedy consistency. We then provide convergence rates for two algorithmic instantiations of the above enhancement in the presence of noise injected to both the tree search stage and value estimation stage.
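For readers new to the term, here is a hedged reading of “γ^h-contracting” in the abstract above. Assuming the usual max-norm setting, and writing T for the backup operator (a symbol introduced here for illustration, not taken from the abstract), it means that each application of T shrinks the distance between any two value estimates by at least a factor of γ^h:

```latex
\[
\| T v - T v' \|_\infty \;\le\; \gamma^{h} \, \| v - v' \|_\infty
\quad \text{for all value functions } v, v'.
\]
```

With γ = 0.9 and h = 3, for instance, the factor is 0.9^3 = 0.729, so a deeper tree gives a stronger guaranteed shrinkage per backup.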

2018

Chenjun Xiao, Jincheng Mei and Martin Müller. Memory-Augmented Monte Carlo Tree Search

Abstract: This paper proposes and evaluates Memory-Augmented Monte Carlo Tree Search (M-MCTS), which provides a new approach to exploit generalization in online real-time search. The key idea of M-MCTS is to incorporate MCTS with a memory structure, where each entry contains information of a particular state. This memory is used to generate an approximate value estimation by combining the estimations of similar states. We show that the memory based value approximation is better than the vanilla Monte Carlo estimation with high probability under mild conditions. We evaluate M-MCTS in the game of Go. Experimental results show that M-MCTS outperforms the original MCTS with the same number of simulations.

2017

Russell Stewart and Stefano Ermon. Label-Free Supervision of Neural Networks with Physics and Domain Knowledge

Abstract: In many machine learning applications, labeled data is scarce and obtaining more labels is expensive. We introduce a new approach to supervising neural networks by specifying constraints that should hold over the output space, rather than direct examples of input-output pairs. These constraints are derived from prior domain knowledge, e.g., from known laws of physics. We demonstrate the effectiveness of this approach on real world and simulated computer vision tasks. We are able to train a convolutional neural network to detect and track objects without any labeled examples. Our approach can significantly reduce the need for labeled training data, but introduces new challenges for encoding prior knowledge into appropriate loss functions.

2016

Robert C. Holte, Ariel Felner, Guni Sharon and Nathan R. Sturtevant. Bidirectional Search That Is Guaranteed to Meet in the Middle

Abstract: We present MM, the first bidirectional heuristic search algorithm whose forward and backward searches are guaranteed to “meet in the middle”, i.e. never expand a node beyond the solution midpoint. We also present a novel framework for comparing MM, A*, and brute-force search, and identify conditions favoring each algorithm. Finally, we present experimental results that support our theoretical analysis.

2015

Florian Pommerening, Malte Helmert, Gabriele Röger and Jendrik Seipp. From Non-Negative to General Operator Cost Partitioning

Abstract: Operator cost partitioning is a well-known technique to make admissible heuristics additive by distributing the operator costs among individual heuristics. Planning tasks are usually defined with non-negative operator costs and therefore it appears natural to demand the same for the distributed costs. We argue that this requirement is not necessary and demonstrate the benefit of using general cost partitioning. We show that LP heuristics for operator-counting constraints are cost-partitioned heuristics and that the state equation heuristic computes a cost partitioning over atomic projections. We also introduce a new family of potential heuristics and show their relationship to general cost partitioning.

2014

Elias Bareinboim, Jin Tian and Judea Pearl. Recovering from Selection Bias in Causal and Statistical Inference

Abstract: Selection bias is caused by preferential exclusion of units from the samples and represents a major obstacle to valid causal and statistical inferences; it cannot be removed by randomized experiments and can rarely be detected in either experimental or observational studies. In this paper, we provide complete graphical and algorithmic conditions for recovering conditional probabilities from selection biased data. We also provide graphical conditions for recoverability when unbiased data is available over a subset of the variables. Finally, we provide a graphical condition that generalizes the backdoor criterion and serves to recover causal effects when the data is collected under preferential selection.

2013

Janardhan Rao Doppa, Alan Fern and Prasad Tadepalli. HC-Search: Learning Heuristics and Cost Functions for Structured Prediction

Abstract: Structured prediction is the problem of learning a function from structured inputs to structured outputs. Inspired by the recent successes of search-based structured prediction, we introduce a new framework for structured prediction called HC-Search. Given a structured input, the framework uses a search procedure guided by a learned heuristic H to uncover high quality candidate outputs and then uses a separate learned cost function C to select a final prediction among those outputs. We can decompose the regret of the overall approach into the loss due to H not leading to high quality outputs, and the loss due to C not selecting the best among the generated outputs. Guided by this decomposition, we minimize the overall regret in a greedy stagewise manner by first training H to quickly uncover high quality outputs via imitation learning, and then training C to correctly rank the outputs generated via H according to their true losses. Experiments on several benchmark domains show that our approach significantly outperforms the state-of-the-art methods.

Gary Doran and Soumya Ray. SMILe: Shuffled Multiple-Instance Learning

Abstract: Resampling techniques such as bagging are often used in supervised learning to produce more accurate classifiers. In this work, we show that multiple-instance learning admits a different form of resampling, which we call “shuffling.” In shuffling, we resample instances in such a way that the resulting bags are likely to be correctly labeled. We show that resampling results in both a reduction of bag label noise and a propagation of additional informative constraints to a multiple-instance classifier. We empirically evaluate shuffling in the context of multiple-instance classification and multiple-instance active learning and show that the approach leads to significant improvements in accuracy.

2012

Suicheng Gu and Yuhong Guo. Learning SVM Classifiers with Indefinite Kernels

Abstract: Recently, training support vector machines with indefinite kernels has attracted great attention in the machine learning community. In this paper, we tackle this problem by formulating a joint optimization model over SVM classifications and kernel principal component analysis. We first reformulate the kernel principal component analysis as a general kernel transformation framework, and then incorporate it into the SVM classification to formulate a joint optimization model. The proposed model has the advantage of making consistent kernel transformations over training and test samples. It can be used for both binary classification and multi-class classification problems. Our experimental results on both synthetic data sets and real world data sets show the proposed model can significantly outperform related approaches.

Zhanying He, Chun Chen, Jiajun Bu, Can Wang and Lijun Zhang. Document Summarization Based on Data Reconstruction

Abstract: Document summarization is of great value to many real world applications, such as snippets generation for search results and news headlines generation. Traditionally, document summarization is implemented by extracting sentences that cover the main topics of a document with a minimum redundancy. In this paper, we take a different perspective from data reconstruction and propose a novel framework named Document Summarization based on Data Reconstruction (DSDR). Specifically, our approach generates a summary which consists of those sentences that can best reconstruct the original document. To model the relationship among sentences, we introduce two objective functions: (1) linear reconstruction, which approximates the document by linear combinations of the selected sentences; (2) nonnegative linear reconstruction, which allows only additive, not subtractive, linear combinations. In this framework, the reconstruction error becomes a natural criterion for measuring the quality of the summary. For each objective function, we develop an efficient algorithm to solve the corresponding optimization problem. Extensive experiments on summarization benchmark data sets DUC 2006 and DUC 2007 demonstrate the effectiveness of our proposed approach.

2011

Daniel Golovin, Andreas Krause, Beth Gardner, Sarah J. Converse and Steve Morey. Dynamic Resource Allocation in Conservation Planning

Abstract: Consider the problem of protecting endangered species by selecting patches of land to be used for conservation purposes. Typically, the availability of patches changes over time, and recommendations must be made dynamically. This is a challenging prototypical example of a sequential optimization problem under uncertainty in computational sustainability. Existing techniques do not scale to problems of realistic size. In this paper, we develop an efficient algorithm for adaptively making recommendations for dynamic conservation planning, and prove that it obtains near-optimal performance. We further evaluate our approach on a detailed reserve design case study of conservation planning for three rare species in the Pacific Northwest of the United States.

Jessica Davies, George Katsirelos, Nina Narodytska and Toby Walsh. Complexity of and Algorithms for Borda Manipulation

Abstract: We prove that it is NP-hard for a coalition of two manipulators to compute how to manipulate the Borda voting rule. This resolves one of the last open problems in the computational complexity of manipulating common voting rules. Because of this NP-hardness, we treat computing a manipulation as an approximation problem where we try to minimize the number of manipulators. Based on ideas from bin packing and multiprocessor scheduling, we propose two new approximation methods to compute manipulations of the Borda rule. Experiments show that these methods significantly outperform the previous best known approximation method. We are able to find optimal manipulations in almost all the randomly generated elections tested. Our results suggest that, whilst computing a manipulation of the Borda rule by a coalition is NP-hard, computational complexity may provide only a weak barrier against manipulation in practice.
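As background for the abstract above, the Borda rule itself is simple to state: with m candidates, a ballot gives m - 1 points to its first choice, m - 2 to its second, and so on down to 0. The sketch below only scores ballots; it is not the paper's approximation method, and the function name `borda_scores` and the three-voter election are made up for illustration.

```python
from collections import defaultdict


def borda_scores(ballots):
    """Return Borda scores for ballots that each rank the same m candidates.

    A candidate in position i (0-based) on a ballot earns m - 1 - i points.
    """
    scores = defaultdict(int)
    for ballot in ballots:
        m = len(ballot)
        for position, candidate in enumerate(ballot):
            scores[candidate] += m - 1 - position
    return dict(scores)


# Hypothetical election: three honest voters ranking candidates a, b, c.
honest = [["a", "b", "c"], ["a", "c", "b"], ["b", "c", "a"]]
print(borda_scores(honest))

# One extra strategic ballot that ranks b first and a last.
print(borda_scores(honest + [["b", "c", "a"]]))
```

In this toy election a leads with 4 points to b's 3, and the single strategic ballot lifts b to 5 against a's 4. The hardness result in the abstract concerns deciding how a coalition should fill in such ballots in general.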

2010

Giorgos Stoilos, Bernardo Cuenca Grau and Ian Horrocks. How Incomplete Is Your Semantic Web Reasoner? Systematic Analysis of the Completeness of Query Answering Systems

Abstract: Conjunctive query answering is a key reasoning service for many ontology-based applications. In order to improve scalability, many Semantic Web query answering systems give up completeness (i.e., they do not guarantee to return all query answers). It may be useful or even critical to the designers and users of such systems to understand how much and what kind of information is (potentially) being lost. We present a method for generating test data that can be used to provide at least partial answers to these questions, a purpose for which existing benchmarks are not well suited. In addition to developing a general framework that formalises the problem, we describe practical data generation algorithms for some popular ontology languages, and present some very encouraging results from our preliminary evaluation.

Ruoyun Huang, Yixin Chen and Weixiong Zhang. A Novel Transition Based Encoding Scheme for Planning as Satisfiability

Abstract: Planning as satisfiability is a principal approach to planning with many eminent advantages. The existing planning as satisfiability techniques usually use encodings compiled from the STRIPS formalism. We introduce a novel SAT encoding scheme based on the SAS+ formalism. It exploits the structural information in the SAS+ formalism, resulting in more compact SAT instances and reducing the number of clauses by up to 50 fold. Our results show that this encoding scheme improves upon the STRIPS-based encoding, in terms of both time and memory efficiency.

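The Three Stages of Research

Wang Guowei, Renjian Cihua (人間詞話):

All who have achieved great undertakings or great learning, past and present, must pass through three stages:

“Last night the west wind withered the green trees; alone I climbed the high tower and gazed down the road to the world's end.”

“My sash grows ever looser, yet I have no regrets; for her I am content to waste away.”

“I searched for her in the crowd a thousand times; suddenly I turned my head, and there she was, where the lantern light was dim.”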