Exploring Deep Reinforcement Learning based Bio-inspired Thruster Optimization

dc.contributor.authorWang, Zhipeng
dc.contributor.departmentMechanical and Materials Engineering
dc.contributor.supervisorFan, Dixia
dc.contributor.supervisorAbleson, Alan
dc.date.accessioned2025-09-26T19:47:11Z
dc.date.available2025-09-26T19:47:11Z
dc.date.issued2025-09-26
dc.degree.grantorQueen's University at Kingston
dc.description.abstractTo optimize flapping foil performance in both Computational Fluid Dynamics (CFD) simulations and real-world experimental settings, this study applies Deep Reinforcement Learning (DRL) to generate non-parametric foil motion trajectories. Traditional control techniques and simplified sinusoidal motions are insufficient to capture the complex, nonlinear, unsteady, and high-dimensional interactions between the foil and its surrounding vortex structures. To address this limitation, we propose a DRL training framework built upon the Proximal Policy Optimization (PPO) algorithm and a Transformer-based policy network. The framework is first initialized using expert demonstrations of sinusoidal motion in a CFD setting, and is then refined through a three-stage offline-to-online DRL training pipeline tailored for experimental deployment. To meet the demands of data-efficient DRL, we modified the CFD solver for high-performance parallel rollout training and designed a custom experimental system, the Intelligent Towing Farm (ITF), to efficiently collect hydrodynamic force data and train policies in parallel. In simulation, we first demonstrate the effectiveness of the proposed DRL training framework, which learns coherent foil flapping motions that generate thrust. Furthermore, by adjusting reward functions and action thresholds, DRL-optimized foil trajectories achieve significant improvements in both thrust and efficiency compared with sinusoidal motion. Finally, visualization of the wake morphology and instantaneous pressure distributions shows that the DRL-optimized foil adaptively adjusts the phase between its motion and the shedding vortices to improve hydrodynamic performance. In real-world experiments, we validate the three-stage DRL training pipeline through ablation studies.
Furthermore, by modifying the reward function, the DRL-controlled foil consistently outperforms the extended EMOA Pareto-front sinusoidal policies in both thrust and lift force generation. Finally, underwater robot experiments show that the DRL-optimized 3-DoF foil motion achieves up to 1.7 times the swimming distance of the EMOA-optimized counterpart in the same time. Our results offer insight into how complex fluid manipulation problems can be solved with DRL in both simulation and experiment.
dc.description.degreePhD
dc.identifier.urihttps://hdl.handle.net/1974/35281
dc.language.isoeng
dc.relation.ispartofseriesCanadian theses
dc.rightsAttribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.urihttp://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subjectFluid Mechanics
dc.subjectDeep Reinforcement Learning
dc.titleExploring Deep Reinforcement Learning based Bio-inspired Thruster Optimization
dc.typethesis

Files

Original bundle

Name:
Wang_Zhipeng_202509_PhD.pdf
Size:
21.76 MB
Format:
Adobe Portable Document Format

License bundle

Name:
license.txt
Size:
1.67 KB
Format:
Item-specific license agreed upon submission
Description: