This grant provides funding for the development of expedient, customized machine learning algorithms appropriate for decision problems with acyclic state spaces, and for their application to the problem of optimal disassembly planning (ODP). The theoretical framework for the proposed research is based on (i) the fact that the acyclic nature of the considered state spaces enables the explicit characterization of the dynamics of the learning process, and (ii) results from the area of statistical inference known as "ranking & selection". Furthermore, the embedding of these statistical results in the proposed methodological framework gives rise to Markov Decision Process (MDP)-type problems that are currently unexplored, yet possess very interesting special structure and significant practical relevance. The last stage of the proposed research will seek to apply the derived results to the ODP problem and to integrate them into a computational testbed for this problem that is currently under development. Both the expected theoretical developments and the aforementioned computational testbed will be made accessible to the broader community through a dedicated Website.
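To make concrete why acyclicity helps, the sketch below shows a standard backward dynamic-programming pass over a topologically ordered acyclic state space: because no state is ever revisited, the optimal cost-to-go of every state can be computed exactly in a single sweep. This is a generic illustration under assumed data (the states, actions, transition probabilities, and costs are all hypothetical), not the proposal's algorithm.

```python
# Illustrative sketch (hypothetical data, not the proposal's method):
# on an acyclic state space, expected costs-to-go are computable in
# one backward pass over a topological order of the states.

def backward_pass(states, actions, trans, cost, terminal):
    """states: list in topological order; actions[s]: available actions;
    trans[(s, a)]: dict mapping successor state -> probability;
    cost[(s, a)]: immediate expected cost; terminal: set of end states."""
    V = {s: 0.0 for s in terminal}        # terminal cost-to-go is zero
    policy = {}
    for s in reversed(states):            # successors are always done first
        if s in terminal:
            continue
        best_a, best_v = None, float("inf")
        for a in actions[s]:
            q = cost[(s, a)] + sum(p * V[sp]
                                   for sp, p in trans[(s, a)].items())
            if q < best_v:
                best_a, best_v = a, q
        V[s], policy[s] = best_v, best_a
    return V, policy

# Tiny hypothetical example: s0 branches to s1 or s2, both lead to t.
states = ["s0", "s1", "s2", "t"]
actions = {"s0": ["up", "down"], "s1": ["go"], "s2": ["go"]}
trans = {("s0", "up"): {"s1": 1.0}, ("s0", "down"): {"s2": 1.0},
         ("s1", "go"): {"t": 1.0}, ("s2", "go"): {"t": 1.0}}
cost = {("s0", "up"): 1.0, ("s0", "down"): 2.0,
        ("s1", "go"): 3.0, ("s2", "go"): 1.0}
V, policy = backward_pass(states, actions, trans, cost, {"t"})
print(V["s0"], policy["s0"])  # → 3.0 down (2.0 + 1.0 beats 1.0 + 3.0)
```

In a learning setting the transition probabilities and costs are unknown and must be estimated from samples, but the same topological structure bounds how estimation error can propagate, which is what makes the explicit characterization of the learning dynamics possible.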
If successful, the results of this research will extend existing results in the area of machine learning from both a theoretical and a practical standpoint. On the theoretical side, the expected results will demonstrate how the special structure underlying many sub-classes of these problems can be exploited to enhance the analytical tractability, efficiency, and rigor of the developed solutions. On the practical side, the expected results will lead to expedient, customized learning algorithms with provably established properties for the entire class of decision problems with acyclic state spaces, in general, and for the ODP problem, in particular. Finally, additional contributions are expected in the area of Markov Decision Processes, through the formulation and study of the specially structured MDPs mentioned in the previous paragraph; beyond their appearance in the machine learning context of this work, these MDPs also arise in many other application contexts that involve the optimal design of sequential experiments with random outcomes.
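The ranking & selection setting invoked above can be illustrated by its simplest instance: choosing the alternative with the highest unknown mean from noisy samples. The toy procedure below uses equal sample allocation; the alternatives, sample sizes, and Gaussian noise model are all hypothetical, and practical R&S procedures (e.g., indifference-zone or optimal-budget-allocation methods) instead allocate samples adaptively to control the probability of correct selection, which is where the sequential-experiment MDPs mentioned above come in.

```python
import random

# Hedged illustration of the basic ranking & selection task:
# pick the alternative with the highest sample mean after an
# equal-allocation sampling stage. All numbers are hypothetical.

def select_best(simulators, n_per_alt, rng):
    """Sample each alternative n_per_alt times and return the index
    of the alternative with the highest sample mean."""
    means = [sum(sim(rng) for _ in range(n_per_alt)) / n_per_alt
             for sim in simulators]
    return max(range(len(means)), key=means.__getitem__)

rng = random.Random(0)
# Three hypothetical alternatives with true means 0.0, 0.2, and 1.0
# observed through unit-variance Gaussian noise.
sims = [lambda r, m=m: r.gauss(m, 1.0) for m in (0.0, 0.2, 1.0)]
best = select_best(sims, 50, rng)
print(best)  # almost surely index 2, the true best alternative
```

A sequential variant of this problem, where each sampling decision depends on the outcomes observed so far, is itself an MDP with an acyclic state space (the sample counts only ever grow), which hints at the special structure the proposal intends to study.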