Surrogate-Based Optimization Using Continuous Piecewise Linear Models

Abstract

Nonlinear programming (NLP) and mixed-integer nonlinear programming (MINLP) are widely used in application domains such as scheduling, transportation, and chemical processes. Solving these problems to global optimality in reasonable time is a significant challenge. With advancements in state-of-the-art mixed-integer linear programming (MILP) solvers, however, we can tackle nonlinear problems effectively by converting them into MILPs. Continuous piecewise linear (CPWL) functions are popular modeling tools capable of approximating any continuous nonlinear function. Using CPWL approximations, we can convert NLP or MINLP problems into MILPs, which can be solved quickly and provide an initial point to warm-start the original problem. When the solution is sufficiently accurate, this approach can eliminate the need to solve the NLP/MINLP entirely. This thesis examines CPWL approximation methods for building surrogate models suitable for embedding in optimization problems. We focus on neural networks with rectified linear unit (ReLU) activation functions, known as ReLU-NNs, a commonly used CPWL representation. Our findings reveal that ReLU-NNs yield inefficient CPWL approximations and are less effective than other CPWL approximation techniques, such as Difference-of-Convex CPWL (DC-CPWL) [39].
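To make the CPWL property of ReLU networks concrete, the minimal sketch below (all weights and biases are made-up illustrative values, not a model from the thesis) evaluates a one-hidden-layer ReLU network in one dimension and checks that it is exactly affine between the kinks of its hidden units:

```python
import math

# A one-hidden-layer ReLU network in 1-D is a CPWL function whose kinks
# lie where a hidden unit's pre-activation w_i*x + b_i crosses zero.
# All parameters below are hypothetical illustrative values.
w = [1.0, -2.0, 0.5]   # hidden-layer weights
b = [0.5, 1.0, -1.0]   # hidden-layer biases
v = [1.0, 0.7, -2.0]   # output-layer weights

def relu_net(x):
    """Evaluate f(x) = sum_i v_i * max(0, w_i*x + b_i)."""
    return sum(vi * max(0.0, wi * x + bi) for vi, wi, bi in zip(v, w, b))

# Each unit's kink sits at x = -b_i / w_i.
breakpoints = sorted(-bi / wi for wi, bi in zip(w, b) if wi != 0.0)

def slope(x0, x1):
    """Secant slope of the network between two points."""
    return (relu_net(x1) - relu_net(x0)) / (x1 - x0)

# Between consecutive breakpoints the function is affine, so secant
# slopes are constant regardless of which interior points we sample.
for lo, hi in zip(breakpoints, breakpoints[1:]):
    p, q = lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)
    assert math.isclose(slope(lo + 1e-6, p), slope(q, hi - 1e-6), rel_tol=1e-6)
```

With these three hidden units the network has kinks at x = -0.5, 0.5, and 2, so it consists of only four linear pieces; the thesis's efficiency concern is that the number of *useful* pieces can grow far more slowly than the network size.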

Despite the popularity of ReLU-NNs as CPWL models, the structure of the linear pieces generated by these networks has yet to be fully explored. In Chapter 2, we first propose exact mathematical expressions for the linear functions and linear regions of small rectifier networks. We then analyze the performance of rectifier networks from a polyhedral perspective and introduce the three major challenges associated with these models: redundancy, degeneracy, and low efficiency. Furthermore, we explore DC-CPWL approximation and compare it to ReLU-based shallow and deep neural networks across four industrial case studies. Our findings demonstrate that DC-CPWL consistently yields highly efficient models without redundancy or degeneracy, while the ReLU-NN representation generates less efficient models with several inefficient linear regions.
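The DC-CPWL form expresses a CPWL function as the difference of two convex max-affine functions. The toy sketch below (the affine pieces are an illustrative example chosen for this note, not a model from the thesis) builds a non-convex CPWL function this way and checks it against its closed form:

```python
# Difference-of-convex CPWL (DC-CPWL) sketch: f(x) = g(x) - h(x), where
# g and h are convex "max-affine" functions (pointwise maxima of affine
# pieces). The pieces below are illustrative, not from the thesis.

def max_affine(pieces, x):
    """Pointwise max of affine pieces (a, b) -> a*x + b: convex and CPWL."""
    return max(a * x + b for a, b in pieces)

g_pieces = [(1.0, 0.0), (-1.0, 0.0)]   # g(x) = max(x, -x)    = |x|
h_pieces = [(1.0, -1.0), (-1.0, 1.0)]  # h(x) = max(x-1, 1-x) = |x - 1|

def f(x):
    """f(x) = |x| - |x - 1|: CPWL but neither convex nor concave."""
    return max_affine(g_pieces, x) - max_affine(h_pieces, x)

# Agrees with the closed form at every sampled point.
for x in [-2.0, -0.5, 0.0, 0.25, 1.0, 3.0]:
    assert f(x) == abs(x) - abs(x - 1.0)
```

Because each of g and h is convex, each can be encoded with comparatively few constraints when the surrogate is embedded in an optimization model, which is one intuition for the efficiency advantage reported above.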

In Chapter 3, the CPWL models derived from ReLU-NNs and DC-CPWL are reformulated as MILPs and applied to two optimization case studies: a benchmark function and a fuel cost minimization problem (FCMP) for a gas network. Our goal is to compare optimization solution times when each CPWL surrogate model is integrated into the optimization framework. Additionally, to accelerate optimization with ReLU-NNs, we employ a pruning technique that compresses the network while maintaining model accuracy. The case study results show that pruning can significantly reduce computational time; however, using DC-CPWL as a surrogate model offers an even greater reduction in computational time than the neural networks.
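A standard way a ReLU surrogate enters a MILP is the big-M encoding of each neuron, with one binary variable per unit. The sketch below (the bound M and the test grid are illustrative choices; no MILP solver is involved) brute-forces that this encoding admits exactly y = max(0, a):

```python
# Big-M MILP encoding of one ReLU neuron y = max(0, a), assuming the
# pre-activation is bounded, -M <= a <= M:
#   y >= a,  y >= 0,  y <= a + M*(1 - z),  y <= M*z,  z in {0, 1}.
# We brute-force over a coarse grid (no solver) to show that, for each a,
# the only feasible y is relu(a). M = 10 is an illustrative bound.

M = 10.0

def feasible(a, y, z, tol=1e-9):
    """Check the four big-M constraints for a given (a, y, z)."""
    return (y >= a - tol and y >= -tol
            and y <= a + M * (1 - z) + tol
            and y <= M * z + tol)

def encoded_values(a, step=0.25):
    """All grid values of y feasible for some binary z at this a."""
    grid = [i * step for i in range(int(-M / step), int(M / step) + 1)]
    return {y for y in grid for z in (0, 1) if feasible(a, y, z)}

# For each sampled pre-activation, the encoding pins y to max(0, a):
# z = 0 forces y = 0 (only feasible when a <= 0), and z = 1 forces
# y = a (only feasible when a >= 0).
for a in [-3.0, -0.5, 0.0, 2.25]:
    assert encoded_values(a) == {max(0.0, a)}
```

The size of M directly affects MILP tightness, which is one reason compressing the network (as with the pruning technique above) or avoiding the per-neuron binaries entirely (as DC-CPWL does) can shorten solve times.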

Keywords

Surrogate models, Optimization, Continuous Piecewise Linear Models, ReLU Neural Networks

Creative Commons license

Except where otherwise noted, this item's license is described as Attribution 4.0 International