REINFORCEMENT LEARNING FOR WIND FARM PRODUCTION OPTIMIZATION

IP.com Disclosure Number: IPCOM000241602D
Publication Date: 2015-May-15
Document File: 4 page(s) / 131K

Publishing Venue

The IP.com Prior Art Database

Abstract

A technique for online optimization of wind farm control using reinforcement learning is disclosed. The technique employs an algorithm that optimizes wind farm output online, based on historical observations and online experimentation. To improve overall performance of the wind farm, the technique uses windows of opportunity, for example, periods when power demand is low, to explore effective, site-specific, customized control strategies. During non-peak hours of wind turbine operation, a control policy model suggests changes to the wind farm control strategy based on sensor output. Depending on whether performance of the wind farm improves or degrades, confidence information about the suggested change in control parameter is captured. During peak hours of wind turbine operation, the control policy model selects the control strategy that has the highest chance of producing optimal output based on sensor information.

This is the abbreviated version, containing approximately 52% of the total text.

BACKGROUND

The present disclosure relates generally to a wind farm and more particularly to a technique for optimizing farm level control using reinforcement learning.

Reinforcement learning is a machine learning technique in which software agents take actions that maximize a cumulative reward metric. Reinforcement learning is used to train computers to perform various tasks and, for a given task, to find an optimum mode of operation under varying conditions.

Reinforcement learning is generally used for real-time or online control applications. For example, in one conventional technique, the performance of wind turbine control is improved by combining existing controllers with a reinforcement learning agent. The reinforcement learning agent updates its control policy with every sampled state of the wind turbine and its environment. In another conventional technique, a reinforcement learning based adaptive critic controller is proposed for power capture control of variable-speed wind energy conversion systems (WECSs). The control objective is to optimize the power capture from wind by tracking a maximum power curve while, at the same time, minimizing a predefined long-term cost function. However, applying reinforcement learning to the optimization of wind farm production remains a challenge.

It would be desirable to have a technique that optimizes wind farm production using reinforcement learning.

BRIEF DESCRIPTION OF DRAWINGS

Figure 1 is a block diagram that depicts a non-peak mode of operation of a technique of wind farm control using reinforcement learning.

Figure 2 is a block diagram that depicts a peak mode of operation of the technique of wind farm control using reinforcement learning.

DETAILED DESCRIPTION

A technique for online optimization of wind farm control using reinforcement learning is disclosed. The technique employs an algorithm that optimizes wind farm output based on historical observations and online experimentation. To improve overall performance of the wind farm, the technique uses windows of opportunity, for example, periods when power demand is low, to explore effective, site-specific, customized control strategies. Exploration is an incremental process in which informative changes are made to the control system, such as changes to set-points and gains, among others, and the resulting performance is used to guide the learning process. Changes that improve performance of the control system are reinforced, and changes that degrade performance are discouraged.
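
As an illustration only, the reinforce-or-discourage loop described above might be sketched as follows. The disclosure does not specify the algorithm at this level of detail, so everything named here is an assumption made for the example: the confidence table, the learning rate, the candidate yaw changes, the class ControlPolicyModel, and the function simulated_farm_power, which stands in for measured farm output. The best_change method reflects the peak-hour selection mentioned in the abstract.

```python
import random
from collections import defaultdict

# Hypothetical candidate changes to one control parameter (yaw offset, degrees).
CANDIDATE_CHANGES = [-2.0, 0.0, 2.0]


def simulated_farm_power(change):
    """Assumed stand-in for measured farm output; peaks at a +2.0 degree change."""
    return 100.0 - (change - 2.0) ** 2


class ControlPolicyModel:
    """Toy confidence table keyed by (sensor state, suggested change)."""

    def __init__(self, learning_rate=0.2, seed=0):
        self.learning_rate = learning_rate
        self.confidence = defaultdict(float)  # every entry starts neutral at 0.0
        self.rng = random.Random(seed)

    def suggest_change(self, state):
        """Non-peak mode: pick a candidate change to try (uniform exploration)."""
        return self.rng.choice(CANDIDATE_CHANGES)

    def record_outcome(self, state, change, power_before, power_after):
        """Reinforce a change that raised output; discourage one that did not."""
        reward = 1.0 if power_after > power_before else -1.0
        key = (state, change)
        # Move the stored confidence a fraction of the way toward the reward.
        self.confidence[key] += self.learning_rate * (reward - self.confidence[key])

    def best_change(self, state):
        """Peak mode: pick the change with the highest captured confidence."""
        return max(CANDIDATE_CHANGES, key=lambda c: self.confidence[(state, c)])


def explore_non_peak(model, state, trials):
    """Run a number of non-peak experiments against the simulated farm."""
    for _ in range(trials):
        change = model.suggest_change(state)
        before = simulated_farm_power(0.0)    # output with no change applied
        after = simulated_farm_power(change)  # output after the suggested change
        model.record_outcome(state, change, before, after)


if __name__ == "__main__":
    model = ControlPolicyModel()
    explore_non_peak(model, state="wind_from_west", trials=50)
    print(model.best_change("wind_from_west"))
```

In this sketch the sensor state is a single label and the reward is only the sign of the change in output. A real farm would need a richer state description, noise handling, and a reward based on measured production, none of which are specified in the text above.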

According to one embodiment, the technique is implemented in a non-peak mode, for example, during non-peak hours of wind turbine operation or at times that are convenient to an operator. During the non-peak mode, a control policy model suggests changes to a wind farm control strategy based on a sensor output. Accordingly, a controller makes a change, for example, in a control parameter, such as a yaw angle of wind turbin...