AA2 – Nano and Quantum Technologies

Project

AA2-18

Pareto-Optimal Control of Quantum Thermal Devices with Deep Reinforcement Learning

Project Heads

Paolo Erdman

Project Members

Paolo Erdman

Project Duration

01.01.2023 − 31.12.2024

Located at

FU Berlin

Description

Quantum thermal machines are micro-scale devices that convert between heat and work by exploiting quantum effects, and can find applications in on-chip active cooling and in energy harvesting from local temperature gradients. Optimally controlling such systems so as to maximize their performance is an extremely challenging task.
 
Here we develop a mathematical framework, based on reinforcement learning, to optimally control quantum thermal machines exploiting quantum measurements and feedback. The method finds Pareto-optimal tradeoffs between multiple objectives, such as high power, high efficiency and low power fluctuations.
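
To make the notion of a Pareto-optimal tradeoff concrete, the following minimal sketch filters a set of candidate operating points down to those that are not dominated by any other point. It illustrates the concept only and is not the project's reinforcement learning method; the function names `dominates` and `pareto_front` and the (power, efficiency) values are hypothetical, chosen purely for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` in every objective
    and strictly better in at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (power, efficiency) pairs in arbitrary units.
# An objective to be minimized, such as power fluctuations,
# could be included by storing its negative.
candidates = [(1.0, 0.2), (0.8, 0.5), (0.5, 0.7), (0.4, 0.6), (0.9, 0.1)]
front = pareto_front(candidates)
print(front)
```

Each point that survives the filter represents a tradeoff in which one objective cannot be improved without worsening another.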
 
We further apply our framework to quantum batteries – quantum many-body systems that can temporarily store and provide energy. They have received great interest in recent years thanks to theoretical predictions of collective advantages in the charging power. Our method finds charging strategies that increase the stored energy and the amount of extractable energy, while preserving the collective advantage.
 
Finally, we apply analytical and machine-learning-based methods to the field of quantum metrology. We design optimal interactions in quantum many-body systems to maximize the precision of measuring the temperature (quantum thermometry) and general external fields.

Related Publications

  • “Artificially intelligent Maxwell’s demon for optimal control of open quantum systems”
    P. A. Erdman, R. Czupryniak, B. Bhandari, A. N. Jordan, F. Noé, J. Eisert, and G. Guarnieri, Quantum Sci. Technol. 10, 025047 (2025).
  • “Reinforcement learning optimization of the charging of a Dicke quantum battery”
    P. A. Erdman, G. M. Andolina, V. Giovannetti, and F. Noé, Phys. Rev. Lett. 133, 243602 (2024).
  • “Dicke superradiant heat current enhancement in circuit quantum electrodynamics”
    G. M. Andolina, P. A. Erdman, F. Noé, J. Pekola, and M. Schirò, Phys. Rev. Res. 6, 043128 (2024).
  • “Dissipation-induced collective advantage of a quantum thermal machine”
    M. Carrega, L. Razzoli, P. A. Erdman, F. Cavaliere, G. Benenti, and M. Sassetti, AVS Quantum Sci. 6, 025001 (2024).
  • “Optimal thermometers with spin networks”
    P. Abiuso, P. A. Erdman, M. Ronen, F. Noé, G. Haack, and M. Perarnau-Llobet, Quantum Sci. Technol. 9, 035008 (2024).
  • “Pareto-optimal cycles for power, efficiency and fluctuations of quantum heat engines using reinforcement learning”
    P. A. Erdman, A. Rolandi, P. Abiuso, M. Perarnau-Llobet, and F. Noé, Phys. Rev. Res. 5, L022017 (2023).
  • “Model-free optimization of power/efficiency tradeoffs in quantum thermal machines using reinforcement learning”
    P. A. Erdman and F. Noé, PNAS Nexus 2, pgad248 (2023).
  • “Optimal control methods for quantum batteries”
    F. Mazzoncini, V. Cavina, G. M. Andolina, P. A. Erdman, and V. Giovannetti, Phys. Rev. A 107, 032218 (2023).
  • “Deep quantum Monte Carlo approach for polaritonic chemistry”
    Y. Tang, G. M. Andolina, A. Cuzzocrea, M. Mezera, B. P. Szabó, Z. Schätzle, F. Noé, and P. A. Erdman, arXiv:2503.15644 (2025).
  • “From dynamical to steady-state many-body metrology: Precision limits and their attainability with two-body interactions”
    R. Puig, P. Sekatski, P. A. Erdman, P. Abiuso, J. Calsamiglia, and M. Perarnau-Llobet, arXiv:2412.02754 (2025).

Patent Application

Submitted a patent application to the European Patent Register, number 4 388 461, entitled “Quantum thermal system”.

Related Pictures

Fig. 1: Schematic representation of the learning process. A computer agent learns how to optimally drive a quantum thermal machine by interacting with it multiple times (panel A). A neural network architecture, based on stacking multiple 1D convolution blocks, is employed to obtain a model-free method (panels B, C).

Fig. 2: Example of training a Reinforcement Learning agent to optimize the performance of a Quantum Refrigerator based on a Superconducting Qubit. As the training proceeds, the control becomes more deterministic (panel D) and finally converges to the protocol in panel E.

Fig. 3: Example of a Pareto front describing optimal tradeoffs between extracted Power and Efficiency (panel C) of a quantum heat engine based on a collection of non-interacting particles trapped in a harmonic potential (panel A). Two examples of optimal cycles are shown in panels D and E.