Diversification of image search results is an active research problem in multimedia. Search engines are adopting techniques that provide the user with a diverse representation of the search results rather than redundant information, e.g., the same perspective of a monument or location. The DivFusion task builds on the MediaEval Retrieving Diverse Social Images Tasks and challenges participants to develop highly effective information fusion techniques for diversifying social image search results.
Participation in this task involves the following steps:
Participants will receive a list of image search queries, each with up to 300 photos retrieved from Flickr and ranked with Flickr’s default "relevance" algorithm. These data are accompanied by various metadata and content descriptors. Each query also comes with a variety of diversification system outputs (participant runs from previous years).
The task requires participants to fuse the provided system outputs and return a ranked list of up to 50 photos that are both relevant and diverse representations of the query.
Relevance: a photo is considered to be relevant for the query if it is a common photo representation of all query concepts at once. Low quality photos (e.g., severely blurred, out of focus, etc.) are not considered relevant in this scenario.
Diversity: a set of photos is considered to be diverse if it depicts different visual characteristics of the query topics and subtopics, e.g., sub-locations, temporal information, typical actors/objects, genesis information, different views at different times of the day/year and under different weather conditions, close-ups on architectural details, sketches, creative views, with a certain degree of complementarity, i.e., most of the perceived visual information is different from one photo to another.
The provided data cover two use case scenarios: (i) a tourist (single-topic query) scenario where a person tries to find more information about a place or event she might visit or attend and is interested in getting a more complete visual description of the target; (ii) a general ad-hoc (multi-topic query) scenario where the user searches for general-purpose images.
For more information, see the challenge webpage on the ChaLearn website.
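To give a concrete flavour of the fusion step, below is a minimal Python sketch of reciprocal rank fusion over several previously computed runs. It is only one possible illustrative approach, not a provided baseline or a required method, and the data structures (a dict from query id to a ranked list of photo ids per run) are assumptions.

# Minimal reciprocal-rank-fusion sketch (illustrative only, not a required baseline).
# Assumes each input run is a dict mapping a query id to a ranked list of photo ids.
from collections import defaultdict

def reciprocal_rank_fusion(runs, k=60, cutoff=50):
    # k is the usual RRF smoothing constant; cutoff limits the fused list to 50 photos.
    fused = {}
    query_ids = set().union(*(run.keys() for run in runs))
    for qid in query_ids:
        scores = defaultdict(float)
        for run in runs:
            for rank, photo_id in enumerate(run.get(qid, [])):
                scores[photo_id] += 1.0 / (k + rank + 1)
        ranked = sorted(scores, key=scores.get, reverse=True)
        fused[qid] = ranked[:cutoff]
    return fused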
You may submit up to 5 runs during the entire duration of the challenge, making use either of the provided information (e.g., content descriptors, metadata, etc.) or of external information of your own.
Each run has to contain two separate run files, one for each test data set, as follows:
Important note: the system run on the seenIR and unseenIR data should be the same (same method and parameters). Please do not submit outputs from different systems; the idea is to allow comparing results across the two contexts.
A valid run consists of a .zip (zip archive) containing the two run files (seenIR.txt and unseenIR.txt). This is the file you should upload in the Participate/Submit section.
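For reference, packaging can be done with a few lines of Python; this is a minimal sketch, and the archive name my_submission.zip is only an example:

# Package the two run files into the required .zip archive.
import zipfile

with zipfile.ZipFile("my_submission.zip", "w", zipfile.ZIP_DEFLATED) as archive:
    archive.write("seenIR.txt")    # run on the seenIR test data
    archive.write("unseenIR.txt")  # run on the unseenIR test data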
Please submit your runs in the form of a TREC-format run file. This format is compatible with the trec_eval evaluation software (in the trec_eval distribution you will find two archives, trec_eval.8.1.tar.gz and trec_eval_latest.tar.gz; see the README file inside). The run file has the structure illustrated by the following example of a file line (note that values are separated by whitespace):
030 Q0 ZF08 0 4238 prise1
qid iter docno rank sim run_id
Please note that each run needs to contain at least one result for each query. An example of a run file should look like this:
1 0 3338743092 0 0.94 run1_audiovisualRF
1 0 3661411441 1 0.9 run1_audiovisualRF
1 0 7112511985 48 0.2 run1_audiovisualRF
1 0 711353192 49 0.12 run1_audiovisualRF
2 0 233474104 0 0.84 run1_audiovisualRF
2 0 3621431440 1 0.7 run1_audiovisualRF
When you format your runs, please make sure that the queries are ordered as provided with the topic file, i.e., by ascending qid order (see also the example above). This order is not necessarily alphabetical, so you may need to sort the queries yourself.
You can experiment with your own runs on the development and validation data; in our experience, this helps avoid formatting errors. You are provided with tools to check run consistency and to compute the metrics yourself. See also the information below.
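As an illustration, a run file in this format can be written with a short Python snippet; this is a minimal sketch, assuming your fused ranking is available as a dict from integer qid to a list of (photo_id, score) pairs, best first (the function name and run id are placeholders):

# Write a fused ranking to a trec-style run file, ordered by ascending qid.
def write_run_file(path, ranking, run_id):
    with open(path, "w") as out:
        for qid in sorted(ranking):                        # ascending qid order
            for rank, (photo_id, score) in enumerate(ranking[qid][:50]):
                out.write(f"{qid} 0 {photo_id} {rank} {score} {run_id}\n")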
Performance is assessed for both diversity and relevance. We compute Cluster Recall at X (CR@X), a measure that assesses how many different clusters from the ground truth are represented among the top X results (only relevant images are considered); Precision at X (P@X), which measures the number of relevant photos among the top X results; and F1-measure at X (F1@X), the harmonic mean of the previous two. Various cut-off points are considered, e.g., X = 5, 10, 20, 30, 40, 50.
The official ranking metric will be CR@20. This cut-off simulates the content of a single results page of a typical web image search engine and reflects user behavior, i.e., inspecting the first page of results first. Metrics are computed individually on each test data set, i.e., the seenIR data and the unseenIR data. The final ranking will be based on the overall mean values of CR@20, followed by P@20 and then F1@20.
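For intuition only, the following minimal Python sketch shows how P@X, CR@X and F1@X can be computed for a single query, assuming you have the set of relevant photo ids and a mapping from each relevant photo id to its ground-truth cluster; the official tool described below remains the reference implementation:

# Illustrative computation of P@X, CR@X and F1@X for one query.
def precision_cluster_recall_f1(ranked_ids, relevant_ids, cluster_of, x=20):
    top = ranked_ids[:x]
    relevant_in_top = [pid for pid in top if pid in relevant_ids]
    p = len(relevant_in_top) / x
    total_clusters = len(set(cluster_of.values()))         # clusters in the ground truth
    covered = {cluster_of[pid] for pid in relevant_in_top}
    cr = len(covered) / total_clusters if total_clusters else 0.0
    f1 = 2 * p * cr / (p + cr) if (p + cr) else 0.0
    return p, cr, f1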
To allow participants to evaluate the results of their systems on their own, the official evaluation tool is provided with the data (div_eval.jar). It computes the official evaluation metrics at different cut-off points (see the previous section) for each query, together with the overall average values. The tool is written in Java, so you need Java installed on your machine; to check, run "java -version" in a command window. If you don't have Java installed, download the Java package for your environment from the Java website and install it.
To run the script, use the following syntax (make sure you have the div_eval.jar file in your current folder):
java -jar div_eval.jar -r <runfilepath> -rgt <rGT directory path> -dgt <dGT directory path> -t <topic file path> -o <output file directory> [optional: -f <output file name>]
-r <runfilepath> - specifies the file path to the current run file for which you want to compute the evaluation metrics. The file should be formatted according to the instructions above;
-rgt <rGT directory path> - specifies the path to the relevance ground truth (denoted by rGT) for the current data set;
-dgt <dGT directory path> - specifies the path to the diversity ground truth (denoted by dGT) for the current data set;
-t <topic file path> - specifies the file path to the topic xml file for the current data set;
-o <output file directory> - specifies the path for storing the evaluation results. Evaluation results are saved as .csv files (comma separated values);
-f <output file name> - is optional and specifies the output file name. By default, the output file will be named according to the run file name + "_metrics.csv".
java -jar div_eval.jar -r c:\divtask\RUNd2.txt -rgt c:\divtask\rGT -dgt c:\divtask\dGT -t c:\divtask\devsetkeywordsGPS_topics.xml -o c:\divtask\results -f my_first_results
Output file example:
"Average P@20 = ",.7222
"Average CR@20 = ",.3901
"Average F1@20 = ",.4993
"Query Id ","Location
Runs must be submitted before April 25, 2018, midnight UTC. You may submit a total of 5 runs during the entire duration of the challenge.
These are the official rules that govern how the Multimedia Information Processing for Personality and Social Networks Analysis contest promotion will operate. This promotion will be simply referred to as the “contest” or the “challenge” throughout the rest of these rules and may be abbreviated on our website, in our documentation, and other publications as ChaLearn ICPR2018 LAP.
In these rules, “organizers”, “we,” “our,” and “us” refer to CHALEARN and "participant”, “you,” and “yourself” refer to an eligible contest participant.
This is a skill-based contest and chance plays no part in the determination of the winner(s). There are two tracks associated with this contest, as described below:
Figure 1. Example of query data to be diversified by the systems: Flickr image results for query “Pingxi Sky Lantern Festival” (first 14 images from 300) and metadata example for one image.
<photo date_taken="2007-10-02 22:35:43" description="Taking by a slow shutter with 30 seconds. Sky lanterns flied from the ground to the sky. People wrote the wishes on the lanterns and expect them to be true ! In Ping-Xi town, fly the sky lantern is a ceremony that hold in Lantern Festival every year. 被引用了: www.colourlovers.com/blog/2009/02/09/colors-of-the-lanter… 很高興能跟其他很棒的作品排在一起designsbuzz.com/index.php/inspiration/60-best-inspiration..." id="1475159652" latitude="0" license="1" longitude="0" nbComments="29" rank="2" tags="taiwan 平溪 天燈 元宵節 pingxi skylantern supershot 20mmf28d mywinners abigfave anawesomeshot thelanternfestival ysplix thelanternday" title="Burning Hell? No, they are sky lanterns !! #18" url_b="https://farm2.staticflickr.com/1317/1475159652_e319b240ea_b.jpg" username="勞動的小網管" userid="7178701@N02" views="3663"/>
Figure 2. Examples of two pairs of essays: image and text, left and right, respectively.
Una vez sali <FO:salí> con un amigo no muy cercano, fuimos a comer y en la comida el chico se comportaba de forma extraña algo como <DL> desagradable <DL> <DL> con un <MD> aire de superioridad <MD> algo muy desagradable tanto para <DL> mi <FO:mí> como para las personas que estaban en nuestro alrededor pero ya despues <FO:después> cuando se dio cuenta de <DL> su comportamiento cambio <FO:cambió> la forma de como <FO:cómo> se portaba y fue muy humilde.
Bueno soy un chico que le gusta divertirse busco todo lo bueno de cada cosa, y lo malo intento analizarlo y <NS> dar una solución. Me gusta escuchar a la gente y apoyarla si puedo , no me gusta el fut <FO:futbol> las mujeres jaja. mucho y tener amistades me encantan los retos y mis triunfos me saben mejor si me cuestan esfuerzo.
For the two tracks, eligible entries received will be judged using the criteria described above to determine winners.
The registered participants will be notified by email of any change in the schedule.
25th February, 2018: Beginning of the quantitative competition. Track 1: Release of labeled development and unlabeled validation data. Track 2: Release of labeled development and validation data, and unlabeled test data.
21st April, 2018: Deadline for code submission. Participants submit code for verification.
22nd April, 2018: For track 2 only: Release of final evaluation data and possibly validation labels (still to be confirmed). Participants can start training the final versions of their methods. Participants start submitting predictions on the final evaluation data.
24th April, 2018: End of both tracks of the competition. Deadline for submitting the predictions over the final evaluation data. The organizers start the code verification process.
27th April, 2018: Deadline for submitting the fact sheets.
3rd May, 2018: Release of verification results to the participants for review.
21st August 2018: ICPR 2018 Joint Contest on Multimedia Challenges Beyond Visual Analysis, challenge results, award ceremony.
You are eligible to enter this contest if you meet the following requirements:
This contest is void within the geographic area identified above and wherever else prohibited by law.
If you choose to submit an entry, but are not qualified to enter the contest, this entry is voluntary, and any entry you submit is governed by the remainder of these contest rules; CHALEARN reserves the right to evaluate it for scientific purposes. If you are not qualified to submit a contest entry and still choose to submit one, under no circumstances will such entries qualify for sponsored prizes.
To be eligible for judging, an entry must meet the following content/technical requirements:
Other than what is set forth below, we are not claiming any ownership rights to your entry. However, by submitting your entry, you:
If you do not want to grant us these rights to your entry, please do not enter this contest.
The organizers will select a panel of judges to judge the entries; all judges will be forbidden to enter the contest and will be experts in causality, statistics, machine learning, computer vision, or a related field, or experts in challenge organization. A list of the judges will be made available upon request. The judges will review all eligible entries received and select three winners for each of the two competition tracks based upon the prediction score on test data. The judges will verify that the winners complied with the rules, including that they documented their method by filling out a fact sheet.
The decisions of these judges are final and binding. The distribution of prizes according to the decisions made by the judges will be made within three (3) months after completion of the last round of the contest. If we do not receive a sufficient number of entries meeting the entry requirements, we may, at our discretion based on the above criteria, not award any or all of the contest prizes below. In the event of a tie between any eligible entries, the tie will be broken by giving preference to the earliest submission, using the time stamp of the submission platform.
The organizers may also sponsor other events to stimulate participation.
If there is any change to data, schedule, instructions of participation, or these rules, the registered participants will be notified at the email they provided with the registration.
If you are a potential winner, we will notify you by sending a message to the e-mail address listed on your final entry within seven days following the determination of winners. If the notification that we send is returned as undeliverable, or you are otherwise unreachable for any reason, we may award the prize to an alternate winner, unless forbidden by applicable law.
Winners who have entered the contest as a team will be responsible for sharing any prize among their members. The prize will be delivered to the registered team leader. If this person becomes unavailable for any reason, the prize will be delivered to the authorized account holder of the e-mail address used to make the winning entry.
If you are a potential winner, we may require you to sign a declaration of eligibility, use, indemnity and liability/publicity release and applicable tax forms. If you are a potential winner and are a minor in your place of residence, we will require that your parent or legal guardian be designated as the winner, and we may require that they sign a declaration of eligibility, use, indemnity and liability/publicity release on your behalf. If you (or your parent/legal guardian if applicable) do not sign and return these required forms within the time period listed on the winner notification message, we may disqualify you (or the designated parent/legal guardian) and select an alternate winner.
We will post changes in the rules or in the data, as well as the names of confirmed winners (after contest decisions are made by the judges), online at http://chalearnlap.cvc.uab.es. This list will remain posted for at least one year.
If an unforeseen or unexpected event (including, but not limited to: someone cheating; a virus, bug, or catastrophic event corrupting data or the submission platform; someone discovering a flaw in the data or modalities of the challenge) that cannot be reasonably anticipated or controlled, (also referred to as force majeure) affects the fairness and / or integrity of this contest, we reserve the right to cancel, change or suspend this contest. This right is reserved whether the event is due to human or technical error. If a solution cannot be found to restore the integrity of the contest, we reserve the right to select winners based on the criteria specified above from among all eligible entries received before we had to cancel, change or suspend the contest subject to obtaining the approval from the Régie des Alcools, des Courses et des Jeux with respect to the province of Quebec.
Computer “hacking” is unlawful. If you attempt to compromise the integrity or the legitimate operation of this contest by hacking or by cheating or committing fraud in any way, we may seek damages from you to the fullest extent permitted by law. Further, we may ban you from participating in any of our future contests, so please play fairly.
ChaLearn is sponsor of this contest.
955 Creston Road,
Berkeley, CA 94708, USA
Additional sponsors can be added during the competition period.
Country of residence:
Date of birth:
Rank in challenge:
By accepting this prize, I certify that I have read and understood the rules of the challenge and that I am a representative of the team authorized to receive the prize and sign this document. To the best of my knowledge, all the team members followed the rules and did not cheat in participating in the challenge. I certify that the team complied with all the challenge requirements, including that:
I recognize that I am solely responsible for all applicable taxes related to accepting the prize. NOTE: IF A PRIZE IS DONATED BY CHALEARN, THE RECIPIENT MUST FILL OUT A W9 OR W8BEN FORM
I grant CHALEARN, the ChaLearn LAP 2018 competition sponsors, and the contest organizers the right to use, review, assess, test and otherwise analyze the results and other material I submitted in connection with this contest and any future research or contests sponsored by CHALEARN and co-sponsors of this competition, and the right to feature my entry and all its content in connection with the promotion of this contest in all media (now known or later developed).
CHALEARN and ChaLearn LAP 2018 competition sponsors may use the name of my team, my name, and my place of residence online and in print, or in any other media, in connection with this contest, without payment or compensation.
Start: Feb. 25, 2018, midnight
End: April 25, 2018, midnight