Can you describe the full procedure for evaluating agents for the leaderboard? What parameters do you use? Is it the one from the first page, `python sim_test.py --P=100 --F=50 --U=200 --Utest=2000 --entries_dir='my_entries' --seed=42`? Are there separate "public leaderboard parameters" and "private leaderboard parameters"?
Thank you!Posted by: KarachunMichael @ Oct. 16, 2019, 6:57 p.m.
We use the following parameters:
--P=100 --F=50 --U=1000 --Utest=2000 --K=5
I used the parameters presented by @criteo, but I could not reproduce the leaderboard result...Posted by: bakanaouji @ Oct. 18, 2019, 3:29 a.m.
Could you please elaborate on how you tried to reproduce the results?
There are at least two reasons why a result may not be reproducible:
(1) The random seeds used to initialise the RecoGym environment for (a) scoring and (b) your own tests are different. This has a noticeable impact on the CTR score, as the resulting environments are quite different. However, we have observed that an agent with a generally good model shows almost the same performance across different random seeds.
(2) The agent itself is NOT deterministic. In that case, repeated runs can produce different results. For example, the agent provided in the Starting Kit for this challenge is not deterministic. In RecoGym you can find the 'deterministic_test.py' script, which checks whether or not an agent produces deterministic results.Posted by: Criteo @ Oct. 19, 2019, 8:25 p.m.
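The determinism check described above can be sketched as follows. This is not the actual `deterministic_test.py` from RecoGym, just a toy illustration of the idea: run the agent twice under the same seed and compare the resulting action sequences; `run_agent`, `is_deterministic`, and the lambda agents are hypothetical names introduced here.

```python
import random

def run_agent(agent_act, n_steps, seed):
    """Collect the actions an agent takes over n_steps, given a seeded RNG."""
    rng = random.Random(seed)
    return [agent_act(rng) for _ in range(n_steps)]

def is_deterministic(agent_act, n_steps=100, seed=42):
    """An agent is deterministic if two runs under the same seed
    produce identical action sequences."""
    return run_agent(agent_act, n_steps, seed) == run_agent(agent_act, n_steps, seed)

# An agent that draws only from the seeded RNG it is given is reproducible;
# one that draws from the unseeded global RNG is not.
seeded_agent = lambda rng: rng.randrange(10)
unseeded_agent = lambda rng: random.randrange(10)

print(is_deterministic(seeded_agent))  # True
print(is_deterministic(unseeded_agent))
```

The same two-run comparison applies whatever the agent's internals are: if any randomness is not derived from the environment's seed, the check will expose it.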
Thank you for your reply.
I ran `sim_test.py` as follows:
python sim_test.py --P=100 --F=50 --U=10000 --Utest=2000 --K=5 --entries_dir="my_entries"
Unfortunately, I didn't get the same results as the public leaderboard.
This is probably because the random seed is different from that of the public leaderboard.
What is the value of the random seed to achieve the same results as the public leaderboard?
Unfortunately, we cannot provide the random seed. Please try your agent with different random seeds.Posted by: Criteo @ Oct. 30, 2019, 2:17 p.m.
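Since the scoring seed is not published, the suggested approach is to evaluate the agent across several seeds and look at the spread rather than a single score. A minimal sketch, where `evaluate_ctr` is a hypothetical stand-in for a full `sim_test.py` run with a given seed:

```python
import random
import statistics

def evaluate_ctr(seed):
    """Hypothetical stand-in for one evaluation run: in practice this would
    build the RecoGym environment with the given seed, run the agent on the
    test users, and return its CTR. Here it just returns a toy CTR with
    seed-dependent noise so the sketch is self-contained."""
    rng = random.Random(seed)
    return 0.011 + rng.gauss(0, 0.0005)

# Average over several seeds to estimate how sensitive the score is
# to the (unknown) leaderboard seed.
seeds = [41, 42, 43, 44, 45]
ctrs = [evaluate_ctr(s) for s in seeds]
print(f"mean CTR = {statistics.mean(ctrs):.5f} "
      f"+/- {statistics.stdev(ctrs):.5f} over {len(seeds)} seeds")
```

If the standard deviation across seeds is small relative to the gap between agents, the leaderboard ranking should be stable regardless of the hidden seed.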
OK! Thanks!Posted by: bakanaouji @ Oct. 30, 2019, 2:30 p.m.