Learning to Run a Power Network

Organized by camilo



Learning to Run a Power Network

Warning: The formal competition ended on June 19th, 2019, but you can still submit your solution to measure the strength of your method.

Following the success of the Horizon prize to predict flows in the French long-distance high-voltage electricity transmission grid, managed by the company "Réseau de Transport d'Électricité" (RTE), we are organizing a new challenge. The goal of this challenge is to test the potential of Reinforcement Learning (RL) and other advanced control algorithms to control electricity transportation in power grids, while keeping people and equipment safe. Hence, it is the "gamification" of a serious problem: operating the grid in an increasingly complex environment including less predictable energy sources (wind, solar, ...), energy market globalization, limitations on new line construction, and growing consumption. To make the "smart grid" materialize, it is becoming urgent to optimize grid operations more tightly, using a broader range of configuration changes more frequently, without compromising security.

To further proceed with the competition:

  • Visit our website https://l2rpn.chalearn.org/ for an interactive introduction to power grid operations.
  • Visit the Instructions subsection to get started with the competition.
  • Understand the rules of the game and the evaluation of your submission in the related subsection.
  • Review the terms and conditions that you will have to accept to make your first submission.

You are ready to get started, and we are looking forward to your first submission in the Participate section to become the control room operator of the future!

A forum is available for comments and help.



Instructions

In this section, we give you instructions to help you:

  • configure your own environment,
  • get quickly a first agent ready with a starting kit,
  • get additional data,
  • make a submission on Codalab for the challenge,
  • finally discover your results.

Get the Pypownet Platform

The challenge is based on an environment (and not solely a dataset) in which an agent can learn by interactions. It runs under the Pypownet platform, which is made of 3 components:

  • A power grid simulator, the open-source Matpower in this case, and more precisely Pypower, its Python port.
  • Data Chronics and parameters that define our environment.
  • The Gym framework from OpenAI, to run an environment sequentially while taking into account agent actions.
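The Gym-style interaction loop that Pypownet runs can be sketched as follows. This is a minimal, self-contained sketch: the stub environment and agent below are hypothetical stand-ins for the real pypownet classes, whose exact names and signatures are documented in the repository.

```python
# Sketch of the Gym-style loop: observe, act, receive a reward, repeat.
# StubEnv and DoNothingAgent are illustrative placeholders, not pypownet code.

class StubEnv:
    """Hypothetical stand-in for the pypownet environment."""
    def __init__(self, horizon=5):
        self.horizon = horizon  # number of timesteps in the chronic
        self.t = 0

    def reset(self):
        self.t = 0
        return {"flows": [0.0]}  # simplified observation

    def step(self, action):
        self.t += 1
        obs = {"flows": [0.0]}
        reward = 1.0                    # placeholder immediate score
        done = self.t >= self.horizon   # end of chronic (or game over)
        return obs, reward, done

class DoNothingAgent:
    """Baseline agent: always returns the 'no action' placeholder."""
    def act(self, observation):
        return None

def run_episode(env, agent):
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        action = agent.act(obs)
        obs, reward, done = env.step(action)
        total += reward
    return total

total = run_episode(StubEnv(), DoNothingAgent())
```

The real environment exposes power-grid observations and actions instead of these placeholders, but the control flow your agent must fit into is the same.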

The pypownet platform can be installed with the instructions on this git repository: https://github.com/MarvinLer/pypownet.

We recommend the Docker installation rather than a local installation from git, to avoid compatibility issues. A lighter Docker image without machine-learning libraries (scikit-learn, Keras, TensorFlow, PyTorch) is also available; see the repository for the image names.

Download the Starting Kit

A starting kit is available for download in the "Participate" tab, with the proper game environment for the competition. Several notebooks help you understand and properly run the Pypownet platform on chronics to train and test your agents. The kit also explains how to check that your submission is valid to run on the competition servers.

Get Data

A first sample of 50 scenarios of 1-month duration is available with the starting kit. This challenge runs in an environment that uses the "IEEE 14" case study grid. It includes 5 different power plants: nuclear, thermal, wind, solar (big), and solar (small). A larger additional sample of about 1000 scenarios is also available with the starting kit, to better train your agent. See Get Data in the Participate section.

Making a submission

Essentially, a submission should be a ZIP file containing at least these two files:

  • submission.py: code (i.e. agent) submitted by a challenger,
  • metadata: file giving the instructions to Codalab on how to process the submission (should never be changed).

Upon reception of the challenger's submission, Codalab will see the metadata file (mandatory), treat the submission as a code submission (as opposed to a results submission), and run the programs to process it. The structure of the file submission.py can be found in the starting kit.

The folder containing both submission.py and metadata should then be zipped (you can choose the zip file name arbitrarily; use it to track your various submissions) as explained in the starting kit notebook. Then, on Codalab:

  • Go to the competition homepage
  • Click on "Participate"
  • Click on "Submit / View Results"
  • Click on "Submit" and select your submission zip
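The zipping step can be scripted. Below is a sketch using Python's standard zipfile module; the folder and zip names are arbitrary examples, and only submission.py and metadata are required entry names.

```python
# Build a submission zip containing submission.py and the (unmodified)
# metadata file, as required by Codalab.
import os
import tempfile
import zipfile

def make_submission_zip(folder, zip_path):
    with zipfile.ZipFile(zip_path, "w") as zf:
        for name in ("submission.py", "metadata"):
            # arcname keeps both files at the root of the archive
            zf.write(os.path.join(folder, name), arcname=name)
    return zip_path

# Example with dummy files in a temporary folder:
folder = tempfile.mkdtemp()
for name in ("submission.py", "metadata"):
    with open(os.path.join(folder, name), "w") as f:
        f.write("# placeholder\n")
zip_path = make_submission_zip(folder, os.path.join(folder, "my_submission.zip"))
```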

Codalab will take some time to process the submission and will print the scores on the same page (after a refresh). As explained in the rules, if your submission takes more than 20 minutes, it will time out.

See your results

In the "Submit / View Results" sub-section of Participate, you can see the status of your submission. Once it is finished, you can review your submission, its score, and its duration. By clicking on the blue cross next to your submission, different logs become available. You can download your submission again if desired. More importantly, you can get logs of your agent's behavior over the different scenarios in the "output from scoring step" folder: indicators over all run scenarios can be visualized in the HTML file.

To compare your result to those of the other participants, please go to the Results page, where the Leaderboard is displayed. Be aware that only your last submission score is considered there.

Terms and Conditions

This challenge is governed by the general ChaLearn contest rules.

Challenge specific Rules

This challenge starts on May 15th 2019 and ends on June 19th 2019.

  • This challenge runs in two phases:
    • Validation phase until June 15th during which you can make submissions to appear on the leaderboard. Only the last submission will be considered for the leaderboard.
    • Test Phase from June 15th to 19th. Your latest submission will be selected to run on the secret dataset and your score will be revealed once submissions are closed on the 19th.
  • To compete in the Test Phase, you need to make a successful submission during the Validation Phase and your entry should be displayed on the leaderboard.
  • The organizers may provide additional baseline agents during the challenge to stimulate the competition.
  • Participants are limited to 5 submissions per day.
  • Submissions are limited to 300 MB in size.
  • Each submission has a limited time to finish all scenarios: 20 minutes. This is about 10 times the runtime of a do-nothing agent.
  • We will check that your submission is valid: it must not change the environment of the game in any way.
  • Teams should use a common account, under a group email. Multiple accounts are forbidden.
  • The final leaderboard (and the final ranking of the participants) will be based on scores in the Test Phase only.
  • The 2 best teams must open-source their code and share it with the community in order to be rewarded with the prizes.
  • Employees at RTE and people that participated in the design or organization of the challenge can make entries but will not be considered for the final ranking and for winning the prizes.


This challenge would not have been possible without the help of many people.

Main organizers:

  • Antoine Marot (RTE R&D, France)
  • Benjamin Donnot (RTE R&D, France)
  • Isabelle Guyon (U. Paris-Saclay; UPSud/INRIA, France and ChaLearn, USA)
  • Luca Veyrin-Forrer (U. Paris-Saclay; UPSud, France)

Developers of the pypownet platform:

  • Marvin Lerousseau (INSERM, and CVN - CentraleSupélec and INRIA Saclay, France) - lead developer
  • Marc Mozgawa (Sogeti High Tech, France)

Other contributors to the organization, starting kit, and datasets, include:

  • Balthazar Donnon (RTE R&D and UPSud/INRIA, France)
  • Camillo Romero (RTE R&D, France)
  • Kimang Khun (École Polytechnique, France)
  • Joao Araùjo

We also especially thank our advisors:

  • Marc Schoenauer (U. Paris-Saclay; UPSud/INRIA, France)
  • Patrick Panciatici (RTE R&D, France)
  • Olivier Pietquin (Google Brain, France)

The challenge is running on the Codalab platform, administered by Université Paris-Saclay and maintained by CKCollab LLC, with primary developers:

  • Eric Carmichael (CKCollab, USA)
  • Tyler Thomas (CKCollab, USA)

ChaLearn and RTE are the challenge organization coordinators and sponsors, and RTE donated prizes.

Rules of the Game

Conditions of Game Over

Like any system, a power grid can fail to operate properly, as illustrated on the challenge website. This can occur under conditions such as:

  • consumption is not met because no electricity flows to some loads, or more than n power plants get disconnected (n = 1 for this challenge);
  • the grid gets split apart into isolated sub-grids, making the whole grid no longer connected.

These conditions can appear when power lines in the grid get disconnected after being overloaded. When a line gets disconnected, its load gets redistributed over other power lines, which in turn might get overloaded and thus disconnected as well, leading to a cascading failure (blackout).

In Pypownet, there are 3 "game over" modes, which you can experiment with:

- hard - The scenario stops running as soon as you have a game over because of the conditions described previously. This is a "real-life" setting, which is used for evaluating the agents that you submit on the competition platform. Use it to emulate the way in which your agents will be evaluated.

- soft - If you have a game over, you can continue playing the scenario (input sequence), but the grid configuration is reset to its reference topology. During training, this can be a useful mode to let your agent experiment with the whole length of a scenario, not limited to the stretch extending before the game over.

- easy - Line disconnections due to overloads are not executed. Except for a "stupid" action that would split the grid apart, you should not run into a game over in this mode. It may be useful to start training your agent in this mode, making the task easier at first; it could help you build a curriculum for your agent.

Conditions on Overloads

When the power in a line increases above its thermal limit, the line becomes overloaded. It can stay overloaded for a few timesteps before it gets disconnected, if no proper agent action is taken to relieve this overload (2 timesteps are allowed in this challenge, see the configuration.yml file in the starting kit). If the overload is too high (above 150% of the thermal limit in this challenge), the line gets disconnected immediately. This is a 'hard' overload, as opposed to the 'soft' overload described before.
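These rules can be summarized in a small helper. This is a sketch for local experimentation, not pypownet code; the thresholds are the challenge values quoted above.

```python
# Classify a line's loading according to the challenge's overload rules.
HARD_OVERLOAD_RATIO = 1.5   # above 150% of the thermal limit: immediate trip
SOFT_OVERLOAD_STEPS = 2     # timesteps a soft overload is tolerated

def overload_status(flow, thermal_limit):
    ratio = abs(flow) / thermal_limit
    if ratio > HARD_OVERLOAD_RATIO:
        return "hard"   # disconnects immediately
    if ratio > 1.0:
        return "soft"   # disconnects after SOFT_OVERLOAD_STEPS if not relieved
    return "ok"
```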

Conditions on Actions

With Pypownet, actions can be taken on lines and substations. You can connect or disconnect lines. You can split a substation into two electrical nodes with different configurations, or merge it back into one node. This is explained and can be visualized in the 101 and visualize_grid notebooks. In real life, for safety reasons, it is not possible to take more than n actions at a time. To introduce this operational constraint on this small system, you are allowed a maximum of 1 action on a substation plus 1 action on a line per timestep. In addition, certain actions on a grid can only be performed at a maximum frequency: in particular, for lines and substations, a cooldown time is required before acting on them again. A cooldown of 3 timesteps is set in this challenge.

Those parameters are accessible in the configuration.yaml file of each environment. During training, you can modify some of them to relax constraints and bootstrap your training. However, the initial parameters you find there are the ones used on Codalab to evaluate your submission.
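The cooldown rule amounts to simple per-asset bookkeeping, which your agent can track locally to avoid illegal actions. A hypothetical sketch (not part of Pypownet):

```python
# Track per-asset cooldowns: after acting on a line or substation, it
# cannot be acted on again for COOLDOWN timesteps (challenge value: 3).
COOLDOWN = 3

class CooldownTracker:
    def __init__(self):
        self._timers = {}  # asset id -> remaining forbidden timesteps

    def can_act(self, asset):
        return self._timers.get(asset, 0) == 0

    def record_action(self, asset):
        self._timers[asset] = COOLDOWN

    def tick(self):
        # call once per timestep, after acting
        for asset in list(self._timers):
            if self._timers[asset] > 0:
                self._timers[asset] -= 1
```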


Observations to use

Observations about the state of the grid can be retrieved from the environment to be used for your agent. Please read the table in the Pypownet documentation. You can recover information over current productions, loads, and more importantly about the flows over the lines and the topology of the grid. You are free to use whatever observation available, make the best of it!


Evaluation

Your agent is evaluated on 10 scenarios and can be compared to a "do nothing" agent and to a "random" agent.

  •  At each timestep your agent gets an immediate score, which assesses the remaining available capacity of the grid to transport electricity. The immediate score is the sum of margins, that is the remaining capacity, over all lines. If a line is overloaded, there is no margin left on this line and its sub-score is 0:

    Line_Score = Max(0, 1 - (Flow/ThermalLimit)²)
    Score_step = Sum(Line_Score) over all lines
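In Python, the per-step score reads as follows (a sketch for local experimentation, not the official scoring code):

```python
# Per-step score: each line contributes its squared margin,
# clipped at zero once the line is overloaded.
def line_score(flow, thermal_limit):
    return max(0.0, 1.0 - (flow / thermal_limit) ** 2)

def step_score(flows, thermal_limits):
    return sum(line_score(f, t) for f, t in zip(flows, thermal_limits))
```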


  •  During a scenario the agent may encounter at most one "Game Over". A Game Over can occur for various reasons: a cascading failure after many overloads, or disconnections of production and consumption (game overs are described in Rules of the Game).
  • The score of a scenario is the sum of the immediate scores over the scenario (aka the cumulated reward). If you get a Game Over during a scenario, the score for this scenario is 0. As in the real world, we cannot afford to run into a blackout: no matter how well an agent did before the Game Over, we cannot rely on an agent that does not run the grid safely, so it gets a stiff penalty:
    • Score_scenario = Sum(Score_step) over all timesteps, if no Game Over
    • Score_scenario = 0, if Game Over
  •  The final score is the sum of the scores of all scenarios:
    • Score = Sum(Score_scenario) over all scenarios
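The scenario and final aggregation can be sketched likewise (again for local experimentation only):

```python
# A Game Over zeroes the whole scenario; the final score sums the scenarios.
def scenario_score(step_scores, game_over):
    return 0.0 if game_over else sum(step_scores)

def total_score(scenarios):
    # scenarios: list of (step_scores, game_over) pairs
    return sum(scenario_score(steps, go) for steps, go in scenarios)
```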

You can run and evaluate an agent in the related notebook in the starting kit.

Using Your own reward

Locally, during training, you can specify your own reward function in a reward_signal.py file to train your agent. However, the reward used to score submissions is, in any case, the one specified above.


Download       Size (MB)   Phase
Starting Kit   23.637      #1 Development phase
Public Data    380.017     #1 Development phase

Development phase

Start: May 15, 2019, midnight

Description: Development phase: you can try your models in this phase

Final Phase

Start: Aug. 8, 2019, midnight

Description: Final phase: your last submission is pushed automatically.

Competition Ends

June 30, 2020, 11 p.m.
