**Project Heads**

Paolo Erdman

**Project Members**

Paolo Erdman

**Project Duration**

01.01.2023 − 31.12.2024

**Located at**

FU Berlin

This project develops a framework, based on reinforcement learning, to optimally control quantum thermal machines exploiting quantum measurements and feedback. The method finds Pareto-optimal tradeoffs between high power, high efficiency and low power fluctuations.

Applications to real-world quantum devices are foreseen.
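To illustrate what "Pareto-optimal" means here, the following is a minimal sketch, not the project's actual method: it filters a set of candidate cycles down to those that no other candidate beats in all three objectives at once. The names `dominates`, `pareto_front` and `cycles`, and all numerical values, are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Each candidate cycle is summarized as (power, efficiency, fluctuations).
# These numbers are invented for illustration; they are not project results.
Point = Tuple[float, float, float]


def dominates(a: Point, b: Point) -> bool:
    """Return True if cycle a is at least as good as b in every objective
    and strictly better in at least one.

    Power and efficiency are maximized; fluctuations are minimized.
    """
    no_worse = a[0] >= b[0] and a[1] >= b[1] and a[2] <= b[2]
    strictly_better = a[0] > b[0] or a[1] > b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


cycles = [
    (1.0, 0.2, 0.5),  # high power, low efficiency
    (0.6, 0.5, 0.3),  # balanced
    (0.5, 0.4, 0.4),  # worse than the balanced cycle in all three objectives
    (0.2, 0.7, 0.1),  # low power, high efficiency, low fluctuations
]

front = pareto_front(cycles)
for power, efficiency, fluctuations in front:
    print(f"power={power}, efficiency={efficiency}, fluctuations={fluctuations}")
```

In this toy example the third cycle is discarded because the second one outperforms it in every objective. The remaining three cycles form the Pareto front: improving one objective on any of them requires giving up another.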

**Related Publications**

- “*Pareto-optimal cycles for power, efficiency and fluctuations of quantum heat engines using reinforcement learning*”
  P. A. Erdman, A. Rolandi, P. Abiuso, M. Perarnau-Llobet, F. Noé, *Phys. Rev. Res.* **5**, L022017 (2023).
- “*Model-free optimization of power/efficiency tradeoffs in quantum thermal machines using reinforcement learning*”
  P. A. Erdman, F. Noé, *PNAS Nexus* **2**, pgad248 (2023).
- “*Measurement-based quantum thermal machines with feedback control*”
  B. Bhandari, R. Czupryniak, P. A. Erdman, A. N. Jordan, *Entropy* **25**, 204 (2023).

**Related Pictures**

**Fig. 1:** Schematic representation of the learning process. A computer agent learns how to optimally drive a quantum thermal machine by interacting with it multiple times (panel A). A neural network architecture, based on stacking multiple 1D convolution blocks, is employed to obtain a model-free method (panels B, C).

**Fig. 2:** Example of training a reinforcement learning agent to optimize the performance of a quantum refrigerator based on a superconducting qubit. As the training proceeds, the control becomes more deterministic (panel D) and finally converges to the protocol in panel E.

**Fig. 3:** Example of a Pareto front describing optimal tradeoffs between extracted power and efficiency (panel C) of a quantum heat engine based on a collection of non-interacting particles trapped in a harmonic potential (panel A). Two examples of optimal cycles are shown in panels D and E.