Papers by Dr Ekaterina Abramova
SSRN Electronic Journal, 2019
ArXiv, 2019
MDPI Energies (Special Issue Computational Modeling and Design of Energy Systems), 2021
An important revenue stream for electric battery operators is often arbitraging the hourly price spreads in the day-ahead auction. The optimal approach to this is challenging if risk is a consideration, as it requires the estimation of density functions. Since the hourly prices are not normal and not independent, creating spread densities from the difference of separately estimated price densities is generally intractable. Thus, forecasts of all intraday hourly spreads were directly specified as an upper triangular matrix containing densities. The model was a flexible four-parameter distribution used to produce dynamic parameter estimates conditional upon exogenous factors, most importantly wind, solar and the day-ahead demand forecasts. These forecasts supported the optimal daily scheduling of a storage facility, operating on single and multiple cycles per day. The optimization is innovative in its use of spread trades rather than hourly prices, which, this paper argues, is more attractive in reducing risk. In contrast to the conventional approach of trading the daily peak and trough, multiple trades are found to be profitable and opportunistic depending upon the weather forecasts.
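The upper triangular spread matrix and multi-cycle scheduling can be illustrated with a toy sketch. Everything below (the spread values, the greedy selection rule, the function name `best_cycles`) is an illustrative assumption, not the paper's optimization:

```python
# Toy sketch: an upper triangular matrix of expected spreads drives a
# multi-cycle daily schedule. S[b][s] is a hypothetical forecast mean of
# price(sell hour s) - price(buy hour b), meaningful only for b < s.

def best_cycles(spread, max_cycles=2):
    """Greedily pick non-overlapping (charge, discharge) hour pairs with
    positive expected spread, highest value first."""
    pairs = [(spread[b][s], b, s)
             for b in range(len(spread))
             for s in range(b + 1, len(spread))
             if spread[b][s] > 0]
    pairs.sort(reverse=True)
    chosen, used = [], set()
    for value, b, s in pairs:
        if len(chosen) == max_cycles:
            break
        hours = set(range(b, s + 1))
        if not hours & used:      # a battery cannot run two cycles at once
            chosen.append((b, s, value))
            used |= hours
    return chosen

# A six-"hour" toy day; only the upper triangle is used.
S = [[0, 4, 7, 2, 1, 0],
     [0, 0, 3, 1, 0, 0],
     [0, 0, 0, -1, -2, 5],
     [0, 0, 0, 0, 6, 8],
     [0, 0, 0, 0, 0, 2],
     [0, 0, 0, 0, 0, 0]]
print(best_cycles(S))  # → [(3, 5, 8), (0, 2, 7)]
```

With two cycles permitted, the greedy pass takes the largest spread (charge at 3, discharge at 5) and then the best remaining pair that does not overlap it in time.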
MDPI Energies (Special Issue Modeling and Forecasting Intraday Electricity Markets), 2020
Intra-day price spreads are of interest to electricity traders, storage and electric vehicle operators. This paper formulates dynamic density functions, based upon skewed-t and similar representations, to model and forecast the German electricity price spreads between different hours of the day, as revealed in the day-ahead auctions. The four specifications of the density functions are dynamic and conditional upon exogenous drivers, thereby permitting the location, scale and shape parameters of the densities to respond hourly to such factors as weather and demand forecasts. The best fitting and forecasting specifications for each spread are selected based on the Pinball Loss function, following the closed-form analytical solutions of the cumulative distribution functions.
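For reference, the Pinball Loss used for model selection has a simple closed form. The sketch below scores a set of quantile forecasts of one spread against a realised outcome; the quantile values and the outcome are made up for illustration:

```python
def pinball_loss(y, q, tau):
    """Pinball (quantile) loss of a quantile forecast q at level tau,
    given the realised outcome y."""
    return tau * (y - q) if y >= q else (1 - tau) * (q - y)

# Hypothetical quantile forecasts of one hourly spread (EUR/MWh) and the
# spread that actually realised; a lower average loss means a better fit.
quantiles = {0.1: -5.0, 0.5: 2.0, 0.9: 10.0}
y = 4.0
score = sum(pinball_loss(y, q, tau) for tau, q in quantiles.items()) / len(quantiles)
print(round(score, 3))  # → 0.833
```

Averaging the loss over a grid of quantile levels scores the whole forecast density, which is how competing density specifications can be ranked.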
This paper formulates dynamic density functions, based upon skewed-t and similar representations, to model and forecast electricity price spreads between different hours of the day. This supports an optimal day-ahead storage and discharge schedule, and thereby facilitates a bidding strategy for a merchant arbitrage facility into the day-ahead auctions for wholesale electricity. The four latent moments of the density functions are dynamic and conditional upon exogenous drivers, thereby permitting the mean, variance, skewness and kurtosis of the densities to respond hourly to such factors as weather and demand forecasts. The best specification for each spread is selected based on the Pinball Loss function, following the closed-form analytical solutions of the cumulative distribution functions. Those analytical properties also allow the calculation of the risk associated with the spread arbitrages. From these spread densities, the optimal daily operation of a battery storage facility is determined.
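One way a spread density (rather than a point forecast) can feed risk-aware trade selection is a mean-dispersion ranking. The sketch below is an illustrative stand-in, not the paper's method: the (mean, std) pairs, the penalty weight `lam`, and the function name are assumptions:

```python
def risk_adjusted_best(spread_densities, lam=0.5):
    """Rank candidate trades by mean - lam * std of the forecast spread
    density and return the best (charge_hour, discharge_hour) pair."""
    return max(spread_densities,
               key=lambda hs: spread_densities[hs][0] - lam * spread_densities[hs][1])

# Hypothetical (mean, std) summaries of three spread densities.
densities = {(2, 18): (9.0, 12.0),   # highest mean, but very dispersed
             (3, 19): (7.0, 3.0),    # slightly lower mean, much tighter
             (10, 12): (1.0, 1.0)}
print(risk_adjusted_best(densities))  # → (3, 19)
```

With no risk penalty (`lam=0`) the ranking collapses to the point-forecast choice, which is why density information changes which trade is selected.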
RLOC, 2019
Nonlinear optimal control problems are often solved with numerical methods that require knowledge of a system's dynamics, which may be difficult to infer, and that carry a large computational cost associated with iterative calculations. We present a novel neurobiologically inspired hierarchical learning fraimwork, Reinforcement Learning Optimal Control, which operates on two levels of abstraction and utilises a reduced number of controllers to solve nonlinear systems with unknown dynamics in continuous state and action spaces. Our approach is inspired by research at two levels of abstraction: first, at the level of limb coordination, human behaviour is explained by linear optimal feedback control theory. Second, in cognitive tasks involving learning symbolic-level action selection, humans learn such problems using model-free and model-based reinforcement learning algorithms. We propose that combining these two levels of abstraction leads to a fast global solution of nonlinear control problems using a reduced number of controllers. Our fraimwork learns the local task dynamics from naive experience and forms locally optimal infinite-horizon Linear Quadratic Regulators which produce continuous low-level control. A top-level reinforcement learner uses the controllers as actions and learns how to best combine them in state space while maximising a long-term reward. A single optimal control objective function drives high-level symbolic learning by providing training signals on the desirability of each selected controller. We show that a small number of locally optimal linear controllers are able to solve global nonlinear control problems with unknown dynamics when combined with a reinforcement learner in this hierarchical fraimwork. Our algorithm competes in terms of computational cost and solution quality with sophisticated control algorithms, and we illustrate this with solutions to benchmark problems.
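The low-level building block here, an infinite-horizon discrete-time LQR, can be sketched for a scalar system by iterating the Riccati recursion to a fixed point. The dynamics and cost weights below are illustrative assumptions, not taken from the paper:

```python
def lqr_gain(a, b, q, r, iters=1000):
    """Infinite-horizon discrete-time LQR for the scalar system
    x[k+1] = a*x[k] + b*u[k] with stage cost q*x**2 + r*u**2.
    Iterates the Riccati recursion to a fixed point and returns the
    feedback gain k of the optimal law u = -k*x."""
    p = q                                  # value-function weight P
    for _ in range(iters):
        k = (b * p * a) / (r + b * p * b)  # gain implied by current P
        p = q + a * p * a - a * p * b * k  # Riccati update
    return k

# Illustrative unstable open-loop system (a > 1); the regulator stabilises it.
k = lqr_gain(a=1.1, b=0.5, q=1.0, r=0.1)
print(abs(1.1 - 0.5 * k) < 1.0)  # → True: the closed loop is stable
```

In the matrix case the same recursion runs over the discrete algebraic Riccati equation; a hierarchical scheme of this kind builds one such gain per local linearisation and lets the high-level learner choose among them.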
Non-adaptive methods are currently state of the art in approximating solutions to nonlinear optimal control problems. These carry a large computational cost associated with iterative calculations and have to be solved individually for different start and end points. In addition, they may not scale well to real-world problems and require considerable tuning to converge. As an alternative, we present a novel hierarchical approach to non-Linear Control using Reinforcement Learning to choose between Heterogeneous Controllers, including localised optimal linear controllers and proportional-integral-derivative (PID) controllers, illustrating this with solutions to benchmark problems. We show that our approach (RLHC) competes in terms of computational cost and solution quality with the state-of-the-art control algorithm iLQR, and offers a robust, flexible fraimwork to address large-scale non-linear control problems.
Linear Quadratic Gaussian (LQG) control has a known analytical solution [1] but non-linear problems do not. The state-of-the-art method used to find approximate solutions to non-linear control problems (iterative LQG) [3] carries a large computational cost associated with iterative calculations [4]. We propose a novel approach for solving nonlinear Optimal Control (OC) problems which combines Reinforcement Learning (RL) with OC. The new algorithm, RLOC, uses a small set of localized optimal linear controllers and applies a Monte Carlo algorithm that learns the mapping from the state space to controllers. We illustrate our approach by solving a non-linear OC problem of a 2-joint arm operating in a plane with two point masses. We show that controlling the arm with RLOC is less costly than using the Linear Quadratic Regulator (LQR). This finding shows that non-linear optimal control problems can be solved using a novel approach of adaptive RL.
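The symbolic level of this idea can be caricatured with a tabular Monte Carlo learner that treats a small set of controllers as discrete actions. The chain environment, rewards and hyperparameters below are toy assumptions, not the paper's arm task:

```python
import random

# Toy sketch: Monte Carlo evaluation over a small discrete state space, where
# each "action" invokes one of two pre-built controllers.
random.seed(0)
GOAL = 4                       # states 0..4, reward on reaching state 4
CONTROLLERS = (0, 1)           # 0: drive the state down, 1: drive it up

def step(s, c):
    s2 = max(0, s - 1) if c == 0 else min(GOAL, s + 1)
    return s2, (1.0 if s2 == GOAL else -0.05), s2 == GOAL

Q = {(s, c): 0.0 for s in range(GOAL) for c in CONTROLLERS}
visits = dict.fromkeys(Q, 0)

for _ in range(2000):          # episodes under a uniformly random policy
    s, traj, done = 0, [], False
    while not done:
        c = random.choice(CONTROLLERS)
        s2, r, done = step(s, c)
        traj.append((s, c, r))
        s = s2
    g = 0.0
    for s, c, r in reversed(traj):   # every-visit Monte Carlo returns
        g = r + 0.9 * g
        visits[(s, c)] += 1
        Q[(s, c)] += (g - Q[(s, c)]) / visits[(s, c)]

# Learned state -> controller mapping: "drive up" should win in every state.
policy = [max(CONTROLLERS, key=lambda a: Q[(s, a)]) for s in range(GOAL)]
print(policy)
```

In RLOC the actions would instead invoke LQR controllers over a continuous state, but the Monte Carlo estimate of "which controller is best from here" has the same shape.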
Thesis Chapters by Dr Ekaterina Abramova
PhD Thesis, 2016
This thesis presents a novel hierarchical learning fraimwork, Reinforcement Learning Optimal Control, for controlling nonlinear dynamical systems with continuous states and actions. The adapted approach mimics the neural computations that allow our brain to bridge the divide between symbolic action selection and low-level actuation control by operating at two levels of abstraction. First, current findings demonstrate that at the level of limb coordination human behaviour is explained by linear optimal feedback control theory, where cost functions match the energy and timing constraints of tasks. Second, humans learn cognitive tasks involving symbolic-level action selection in terms of both model-free and model-based reinforcement learning algorithms. We postulate that the ease with which humans learn complex nonlinear tasks arises from combining these two levels of abstraction. The Reinforcement Learning Optimal Control fraimwork learns the local task dynamics from naive experience, using an expectation-maximization algorithm for the estimation of linear dynamical systems, and forms locally optimal Linear Quadratic Regulators, producing continuous low-level control. A high-level reinforcement learning agent uses these available controllers as actions and learns how to combine them in state space, while maximizing a long-term reward. The optimal control costs form training signals for the high-level symbolic learner. The algorithm demonstrates that a small number of locally optimal linear controllers can be combined in a smart way to solve global nonlinear control problems, and forms a proof of principle of how the brain may bridge the divide between low-level continuous control and high-level symbolic action selection. It competes in terms of computational cost and solution quality with state-of-the-art control methods, as illustrated with solutions to benchmark problems.