L2RPN Sandbox, on the case14 file

Organized by BDonnot


Development phase
April 20, 2020, midnight UTC


Competition Ends
March 31, 2025, midnight UTC

Learning to Run a Power Network, sandbox

Warning: This competition does not award anything. It has been developed as a sandbox to play around, get familiar with the problem of controlling powerflow as well as the competition platform. A forum is available for questions on Codalab; further discussion and help are also available on Discord: https://discord.gg/cYsYrPT.

Power grids transport electricity across states, countries and even continents. They are the backbone of power distribution, playing a central economic and societal role by supplying reliable power to industry, services, and consumers. Their importance appears even more critical today as we transition towards a more sustainable world within a carbon-free economy, and concentrate energy distribution in the form of electricity. Problems that arise within the power grid range from transient brownouts to complete electrical blackouts, which can create significant economic and social perturbations, i.e. de facto freezing society. Grid operators are still responsible for ensuring that a reliable supply of electricity is provided everywhere, at all times. With the advent of renewable energy, electric mobility, and limitations placed on engaging in new grid infrastructure projects, the task of controlling existing grids is becoming increasingly difficult, forcing grid operators to do “more with less”. This challenge aims at testing the potential of AI to address this important real-world problem for our future.

To proceed in the competition:

  • Visit our website https://l2rpn.chalearn.org/ for an interactive introduction to power grid operations and read the companion white paper.
  • Visit the Instructions subsection to get started with the competition
  • Understand the rules of the game and the evaluation of your submission in the related subsection
  • Review the terms and conditions that you will have to accept to make your first submission.
  • Dive into the starting kit for a guided tour and tutorial to get all set for the competition and start making submissions

You are ready to get started and we are looking forward to your first submission in the Participate section to become the control room operator of the future!


Important Note:

To match the most recent competitions and keep this "competition" relevant, it has been upgraded to use grid2op 1.2.2 instead of version 0.7.1 (the initial version of grid2op used in the early days of this competition).

Most important change: the way some actions are applied (especially the reconnection of powerlines) was modified between grid2op 0.7.1 and 1.2.0, which does have an impact on the score. Don't pay too much attention to the scores :-)


In this section, we give you the instructions to help you:

  • configure your own environment,
  • quickly get a first agent ready with a starting kit,
  • get additional data,
  • make a submission on Codalab for the challenge,
  • finally discover your results.

Get the Grid2op Platform

The challenge is based on an environment (and not only a dataset) in which an agent can learn from interactions. It runs under the Grid2op platform.

The Grid2op platform can be installed like any python package with:

pip install grid2op

We also strongly recommend using the l2rpn-baselines python package, which will be updated during the competition:

pip install l2rpn-baselines

Download the Starting Kit

A starting kit is available for you to download in the Participate section on Codalab, along with the proper game environment for the competition. Several notebooks should help you understand how to properly run the Grid2op platform using chronics to train and test your agents.

The starting kit also gives details about how to check that your submission is valid and ready to run on the competition servers.

Get the data

Once grid2op is installed, you can get the competition data (approximately 200-300 MB) directly from the internet. This download will happen automatically the first time you create the environment of the competition from within a python script or shell:

import grid2op
env = grid2op.make("l2rpn_case14_sandbox")
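Once the environment is created, an agent interacts with it through the usual gym-style reset/step loop. The sketch below is a minimal stand-in for that loop, assuming grid2op's agent convention (an `act(observation, reward, done)` method returning an action, and `action_space({})` building the "do nothing" action); the class and function names here are illustrative, not part of the official API.

```python
# Minimal sketch of the agent interface expected by grid2op
# (an act(observation, reward, done) method returning an action).
# The interaction loop mirrors the usual gym-style reset/step cycle.

class DoNothingAgent:
    """An agent that always returns the 'do nothing' action."""

    def __init__(self, action_space):
        self.action_space = action_space

    def act(self, observation, reward, done=False):
        # In grid2op, action_space({}) builds the do-nothing action;
        # here action_space is any callable with the same convention.
        return self.action_space({})


def run_episode(env, agent, max_steps=288):
    """Run one episode and return the cumulative reward."""
    obs = env.reset()
    reward, done, total = 0.0, False, 0.0
    for _ in range(max_steps):
        action = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(action)
        total += reward
        if done:
            break
    return total
```

This do-nothing agent is the baseline against which the competition scores are rescaled (see "Rescaling of the scores" below).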

Make a submission

Essentially, a submission should be a ZIP file containing at least these two elements:

  • submission: a folder in which your agent is defined.
  • metadata: a file giving instructions to Codalab on how to process the submission (it should never be changed).

In the starting kit, a script is provided to help you create your submission and check that it is valid:

python3 check_your_submission.py --help

/!\ This is a code submission challenge, meaning that participants have to submit their code (and not their results).

Upon reception, the submission will be read by Codalab and the code will be run on the competition servers. The detailed structure of the submission directory can be found in the starting kit.

Then, to upload your submission on Codalab:

  • Go to the competition homepage
  • Click on "Participate"
  • Click on "Submit / View Results"
  • Click on "Submit" and select your ZIP file to submit it

Codalab will take some time to process the submission and will display the scores on the same page once the submissions have been processed. You may need to refresh the page. As explained in the rules, if your submission takes more than 20 minutes to run, a timeout error will be raised and your submission will be ignored.

See your results

 In the "Submit / View Results" sub-section in the Participate section, you can see the status of your submission. Once it is processed, you can review your submission, see the score it obtained and the time it took to run. When clicking on the blue cross next to your submission, different logs are available for additional information. You can also download your submission again if you want to. More importantly, you can get logs of your agent's behavior over different scenarios in the folder "output from scoring step". Several indicators over all the scenarios that the agent was run on can be visualized in the html file.

To compare your score with those of the other participants, go to the Results page, where the Leaderboard is displayed. Be aware that only your last submission's score is considered there.

Terms and Conditions

This challenge is governed by the general ChaLearn contest rules.

Challenge specific Rules

This challenge starts on April 20th 2020 and ends on March 31st 2021. It does not award anything.

  • This challenge runs in a single phase where you can submit your code under the same conditions as the upcoming l2rpn competition.
  • The organizers may provide additional baseline agents during the challenge to stimulate the competition.
  • The participant will be limited to 5 submissions per day.
  • Submissions are limited to 300 MB in size.
  • Each submission has a limited time to finish all scenarios: 20 minutes.
  • We will check that your submission is valid: it should not change the environment of the game in any way. Doing so would be considered cheating.
  • Teams should use a common account, under a group email. Multiple accounts are forbidden.
  • The final leaderboard (and the final ranking of the participants) will be based on scores in the Test Phase only.
  • We strongly encourage all teams to share their code and make it accessible in the l2rpn-baselines python package (see https://github.com/rte-france/l2rpn-baselines; more information on the official discord https://discord.gg/cYsYrPT)
  • Anyone can make entries.


This challenge would not have been possible without the help of many people.

Principal coordinators:

  • Antoine Marot (RTE, France)
  • Isabelle Guyon (U. Paris-Saclay; UPSud/INRIA, France and ChaLearn, USA)

Protocol and task design:

  • Gabriel Dulac-Arnold (Google Research, France)
  • Olivier Pietquin (Google Research, France)
  • Isabelle Guyon (U. Paris-Saclay; UPSud/INRIA, France and ChaLearn, USA)
  • Patrick Panciatici (RTE, France)
  • Antoine Marot (RTE, France)
  • Benjamin Donnot (RTE, France)
  • Camilo Romero (RTE, France)
  • Jan Viebahn (TenneT, Netherlands)
  • Adrian Kelly (EPRI, Ireland)
  • Di Shi (Geirina, USA)
  • Mariette Awad (American University of Beirut, Lebanon)

Data format, software interfaces, and metrics:

  • Benjamin Donnot (RTE, France)
  • Mario Jothy (Artelys, France)
  • Gabriel Dulac-Arnold (Google Research, France)
  • Aidan O'Sullivan (UCL/Turing Institute, UK)
  • Zigfried Hampel-Arias (Lab 41, USA)
  • Jean Grizet (EPITECH & RTE, France)

Environment preparation and formatting:

  • Carlo Brancucci (Encoord, USA)
  • Vincent Renault (Artelys, France)
  • Camilo Romero (RTE, France)
  • Bri-Mathias Hodge (NREL, USA)
  • Florian Schäfer (Univ. Kassel/pandapower, Germany)
  • Antoine Marot (RTE, France)
  • Benjamin Donnot (RTE, France)

Baseline methods and beta-testing:

  • Kishan Prudhvi Guddanti (Arizona State Univ., USA)
  • Jiajun Duan (Geirina, USA)
  • Loïc Omnes (ENSAE & RTE, France)
  • Jan Viebahn (TenneT, Netherlands)
  • Medha Subramanian (TenneT & TU Delft, Netherlands)
  • Benjamin Donnot (RTE, France)
  • Jean Grizet (EPITECH & RTE, France)
  • Patrick de Mars (UCL, UK)
  • Jan-Hendrik Menke (Univ. Kassel/pandapower, Germany)
  • Yan Zan (Geirina, USA)
  • Lucas Tindall (Lab 41 & UCSD, USA)

Other contributors to the organization, starting kit, and datasets, include:

  • Balthazar Donnon (RTE R&D and UPSud/INRIA, France)
  • Kimang Khun (Ecole Polytechnique, France)
  • Luca Veyrin-Forrer (U. Paris-Saclay; UPSud, France)
  • Marvin Lerousseau
  • Joao Araùjo

Our special thanks go to:

  • Marc Schoenauer (U. Paris-Saclay; UPSud/INRIA, France)
  • Patrick Panciatici (RTE R&D, France)
  • Olivier Pietquin (Google Brain, France)

The challenge is running on the Codalab platform, administered by Université Paris-Saclay and maintained by CKCollab LLC, with primary developers:

  • Eric Carmichael (CKCollab, USA)
  • Tyler Thomas (CKCollab, USA)

ChaLearn and RTE are the challenge organization coordinators and sponsors, and RTE donated prizes.

Rules of the Game

Objective of the game

The objective of the competition is to design an agent that can successfully operate a power grid. Operating a power grid here means: finding ways to modify how the objects are interconnected together (aka "changing the topology") or modifying the productions to make sure the grid stays safe (see "Conditions of Game Over").

More information is given in the 1_Power_Grid_101_notebook provided in the starting kit.

If you have any questions, we are here to answer them on the official discord: https://discord.gg/cYsYrPT

Conditions of Game Over

As any system, a power grid can fail to operate properly, as illustrated on the challenge website. This can occur under conditions such as:

  • consumption is not met because no electricity is flowing to some loads, or more than n power plants get disconnected (n = 1 for this challenge);
  • the grid gets split apart into isolated sub-grids, making the whole grid non-connected.

These conditions can appear when power lines in the grid get disconnected after being overloaded. When a line gets disconnected, its load gets distributed over other power lines, which in turn might get overloaded and thus disconnected as well, leading to a cascading failure (blackout).

Conditions on Overloads

When the power in a line increases above its thermal limit, the line becomes overloaded. It can stay overloaded for a few timesteps before it gets disconnected, if no proper agent action is taken to relieve the overload (2 timesteps are allowed in this challenge; see the Parameters class in grid2op): this is what we call a "soft" overflow. If the overload is too high (above 200% of the thermal limit in this challenge), the line gets disconnected immediately: this is a "hard" overflow.
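These two disconnection rules can be sketched as follows. This is illustrative pseudo-logic, not grid2op's internal implementation; only the two thresholds (2 grace timesteps, 200% of the thermal limit) come from the challenge settings above.

```python
# Illustrative sketch of the soft/hard overflow rules described above
# (not grid2op's internal code). rho is the flow as a fraction of the
# thermal limit: rho = 1.0 means the line is exactly at its limit.

HARD_OVERFLOW_THRESHOLD = 2.0   # disconnect immediately above 200 % of the limit
SOFT_OVERFLOW_TIMESTEPS = 2     # allowed consecutive timesteps above 100 %

def update_line(rho, timesteps_overloaded):
    """Return (connected, new_overload_counter) after one timestep."""
    if rho >= HARD_OVERFLOW_THRESHOLD:
        return False, 0                      # hard overflow: instant trip
    if rho > 1.0:
        timesteps_overloaded += 1
        if timesteps_overloaded > SOFT_OVERFLOW_TIMESTEPS:
            return False, 0                  # soft overflow: tripped after grace period
        return True, timesteps_overloaded    # still connected, counter running
    return True, 0                           # back under the limit: counter resets
```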

Conditions on Actions

Actions can consist of:

  • reconnecting / disconnecting a powerline
  • changing the topology of the grid (choosing to isolate some objects [productions, loads, powerlines] from others)
  • modifying the production of the generators (redispatching)

These game parameters are accessible through the "Parameters" class of grid2op. During training, you can modify some of them to relax some constraints and give your training a better start.


Observations to use

Observations about the state of the grid can be retrieved from the environment to be used by your agent. Please read the table in the grid2op documentation. You can recover information about current productions, loads, and more importantly about the flows over the lines and the topology of the grid. You are free to use whatever observations are available, make the best of them!


Your agent is evaluated on 10 scenarios of different length starting at different times.

NB: this page will be cleaned up as soon as we can. In the mean time, you can have a look at the 2_Develop_And_RunLocally_An_agent notebook provided on the starting kit. Sorry about that.

Definition of a cost function

1) cost of energy losses

To begin with, we will recall that transporting electricity always generates some energy losses Eloss(t) due to the Joule effect in resistive power lines at any time t:

  • Eloss(t) = Σl rl × yl(t)²

At any time t, the operator of the grid is responsible for compensating those energy losses  by purchasing on the energy market the corresponding amount of production at the marginal price p(t). We can therefore define the following energy loss cost closs(t):

  • closs(t)=Eloss(t) × p(t)
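As a numerical sketch of these two definitions (all resistances, flows and prices below are arbitrary illustrative values):

```python
# Sketch of the energy-loss cost defined above:
#   Eloss(t) = sum over lines l of r_l * y_l(t)^2   (Joule effect)
#   closs(t) = Eloss(t) * p(t)                      (paid at the marginal price)

def energy_losses(r, y):
    """Joule losses: sum of r_l * y_l^2 over all lines l."""
    return sum(r_l * y_l ** 2 for r_l, y_l in zip(r, y))

def loss_cost(r, y, marginal_price):
    """Cost of compensating the losses at the marginal price p(t)."""
    return energy_losses(r, y) * marginal_price
```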

2) cost of redispatching productions after actions on generators

Then we should consider that operator decisions when taking an action can induce costs, especially when requiring market actors to perform specific actions, as they should be paid in return. Topological actions are mostly free, as the grid belongs to the power grid operator, and no energy cost is involved. However, redispatching actions involve producers which should get paid. As the grid operators ask to redispatch energy Eredispatch(t),  some power plants will increase their production by Eredispatch(t) while others will compensate by decreasing their production by the same amount to keep the power grid balanced. Hence, the grid operator will pay both producers for this redispatched energy at a cost credispatching(t) higher than the marginal price p(t) (possibly by some factor):

  • credispatching(t) = 2×Eredispatch(t)×p(t)

3) total cost of operations

If no flexibility is identified or integrated on the grid, operational costs related to redispatching can dramatically increase due to renewable energy sources as was the case recently in Germany with **an avoidable 1 billion €/year increase**.

We can hence define our overall operational cost coperations(t):

  • coperations(t) = closs(t) + credispatching(t)

Formally, we can define an "episode" e successfully managed by an agent up until time tend (over a scenario of maximum length Te) by:

  • e = {o1,a1,o2,a2, ... , otend,atend  }

where ot represents the observation at time t and at the actions the agent took at time t. In particular, o1 is the first observation and otend is the last one: either there is a game over at time tend or the agent reached the end of the scenario such that tend = Te.

An agent can either manage to operate the grid for the entire scenario or fail after some time tend because of a blackout. In case of a blackout, the cost cblackout(t) at a given time t would be proportional to the amount of consumption not supplied Load(t), at a price higher than the marginal price p(t) by some factor beta:

  • cblackout(t) = Load(t) × p(t) × beta, with beta > 1

Notice that Load(t) >> Eredispatch(t) , Eloss(t)
which means that the cost of a blackout is a lot higher than the cost of operating the grid as expected. It is even higher if we further consider the secondary effects on the economy (More information can be found on this blackout cost simulator: https://www.blackout-simulator.com). Furthermore, a blackout does not last forever and power grids restart at some point. But for the sake of simplicity while preserving most of the realism, all these additional complexities are not considered here.

Now we can define our overall cost c for an episode e:

  • c(e) = Σt=1..tend coperations(t) + Σt=tend+1..Te cblackout(t)
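As a sketch, the episode cost sums the operations cost over the timesteps the agent survived, then the blackout cost over the remaining timesteps of the scenario (the beta value used by the organizers is not disclosed; the default below is an arbitrary placeholder):

```python
# Sketch of the episode cost c(e) defined above: operations cost up to
# t_end, then blackout cost for the remaining timesteps up to Te.

def blackout_cost(load, marginal_price, beta=3.0):
    """cblackout(t) = Load(t) * p(t) * beta, with beta > 1.
    beta = 3.0 is a placeholder, not the organizers' actual value."""
    return load * marginal_price * beta

def episode_cost(operations_costs, blackout_costs):
    """operations_costs: per-timestep costs for t = 1..t_end;
    blackout_costs: per-timestep costs for t = t_end+1..Te
    (empty if the agent survived the whole scenario)."""
    return sum(operations_costs) + sum(blackout_costs)
```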

We still encourage the participants to operate the grid as long as possible, but penalize them for the remaining time after the game is over, as this is a critical system and safety is paramount.

Finally, participants will be tested on N hidden scenarios of different lengths, varying from one day to one week, and on various difficult situations according to our baselines. This will test agent behavior in various representative conditions. Under those episodes, our final score to minimize will be:

  • Score = Σi=1..N c(ei)

Rescaling of the scores

For a basic agent (an agent that does nothing) this score was really high on our scenarios, on the order of 33 billion. As such numbers are hard to read, we decided to apply a linear transformation so that the best possible agent (an agent that handles all scenarios, without redispatching, with minimal losses of 1% during all scenarios) gets 100, and the "do nothing" baseline gets 0.

This means that:
- the score should be maximized rather than minimized
- having a score of 100 is probably out of reach
- having a positive score is already pretty good!
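The rescaling described above is a simple linear map between the two anchor agents. In the sketch below the two anchor costs are illustrative placeholders, not the organizers' actual values:

```python
# Sketch of the linear rescaling: map the raw (to-minimize) episode cost
# so that the do-nothing baseline scores 0 and the best possible agent
# scores 100. Lower raw cost -> higher rescaled score.

def rescale_score(raw_cost, do_nothing_cost, best_cost):
    """Linear map sending best_cost to 100 and do_nothing_cost to 0."""
    return 100.0 * (do_nothing_cost - raw_cost) / (do_nothing_cost - best_cost)
```

An agent whose raw cost sits halfway between the two anchors thus scores 50, and an agent worse than the do-nothing baseline gets a negative score.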

Note on the hidden scenarios

For this sandbox competition, the hidden scenarios are defined as follows:
- 1 scenario lasts 3 days
- 6 scenarios last 2 days
- 3 scenarios last 1 day only

Scenarios have been cherry-picked to offer different levels of difficulty and can start at arbitrary time steps (but chronics always start at midnight!). The time interval between two consecutive time steps is fixed and will always be 5 minutes.
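With one timestep every 5 minutes, the scenario lengths above translate into the following step counts:

```python
# Number of timesteps per scenario, given one step every 5 minutes.
STEPS_PER_DAY = 24 * 60 // 5            # 288 steps per day

scenario_days = [3] + [2] * 6 + [1] * 3  # the 10 hidden scenarios listed above
scenario_steps = [d * STEPS_PER_DAY for d in scenario_days]
# e.g. the 3-day scenario lasts 864 steps, the 1-day ones 288 steps
```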


Using Your own reward

You can use any rewards you want in grid2op, both at training time (when you train your agent on your computer) or at test time.

To change the reward signal you are using, you can, at training time, specify it at the creation of the environment:

import grid2op
from grid2op.Reward import GameplayReward
env = grid2op.make("l2rpn_case14_sandbox", reward_class=GameplayReward)

We invite you to have a look at the official grid2op documentation about rewards at https://grid2op.readthedocs.io/en/latest/reward.html




Get the Starting Kit

We put at your disposal a starting kit that you can download in the Participate Section. It gives you an easy start for the competition, in the form of several notebooks:

  • 1_Power_Grid_101_notebook.ipynb explains the problem of power grid operation on a small grid using the grid2op platform.


  • 2_Develop_And_RunLocally_An_agent.ipynb shows how to define an agent, test it to make sure it is running correctly, and make a submission. In particular, this notebook illustrates how to check that your submission is valid and ready to run on the competition servers.


  • 3_TestAndFormatYourAgent.ipynb is a shorter version of the previous notebook that allows you to easily test whether your submission is valid and can be run on the Codalab platform.


  • 4_DebugYourAgent.ipynb is a step by step helper to help you debug your agent if your submission fails to run on the Codalab platform.

If you need any help, do not hesitate to contact the competition organizers on the dedicated discord forum server that we opened for the competition: https://discord.gg/cYsYrPT

A fast backend simulator

The default backend is pandapower, a well-known open-source library in the power system community. However, it can be a bit too slow when it comes to running thousands of simulations. To that end, the lightSim2Grid simulator (https://github.com/BDonnot/lightsim2grid) was developed in C++, imitating pandapower's behavior and reproducing its results for our current power grid modeling. A speedup factor of 30 can be achieved, which should be of great use when training an agent.

Simulate function

As operators do in real life, you can simulate the effect of different actions before taking a decision (a step in the Grid2op framework). This comes at a cost in terms of computation time, but allows you to validate the short-term relevance of your action.

Reward design

You can specify your own reward, a function that can be different from the score of the competition. We believe that reward design is an important aspect of the competition, and participants should think about which reward is best to let their agent learn and explore.

Changing difficulty levels

Different parameters used to configure an environment allow you to modulate how difficult that environment is for an agent. For instance, it is possible to inhibit line disconnection on overload, hence avoiding any blackout and allowing an agent to operate until the end of the scenario. This easy mode could be a preferred mode when you start training your agent. By modifying the environment parameters you can hence design a learning curriculum for your agent, making the environment more and more difficult until it eventually operates in the full environment setting.
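For instance, the relaxation mentioned above could look like the following configuration sketch, assuming grid2op's Parameters class and its NO_OVERFLOW_DISCONNECTION flag (check the grid2op documentation for the full list of flags):

```python
import grid2op
from grid2op.Parameters import Parameters

# Relaxed "easy mode" for early training: lines are never disconnected
# on overload, so no cascading failure can end the episode early.
params = Parameters()
params.NO_OVERFLOW_DISCONNECTION = True

env = grid2op.make("l2rpn_case14_sandbox", param=params)
```

Once your agent copes with this setting, revert the flag (and tighten other parameters) to train against the full environment rules used for scoring.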

Grid2Viz - visual study tool of your agents

To inspect and study some particular scenarios and compare the behavior of different agents, the Grid2Viz interface is a great tool to try and use (https://github.com/mjothy/grid2viz)

Chronix2Grid - generate additional chronics

To generate all the chronics of the environment for the competition, we used the chronix2grid package. If you want to generate additional chronics, you can use it yourself https://github.com/mjothy/ChroniX2Grid/tree/master/chronix2grid


Leaderboard (top 3):

  1. solpino: 62.05
  2. qiu_o: 57.79
  3. shhong: 56.13