You probably know the situation: you are traveling by train or bus and want to stream the next episode of your favorite series on your smartphone, but you have to wait for the content to buffer before the video starts. Another well-known annoyance is watching a live soccer game on your mobile phone when, at a match-winning moment, the screen freezes because your network connection is too slow. If you often stream content (unicast), you have likely noticed that the data rate of the mobile network is insufficient for uninterrupted video in many places.
Recently, the world's first 5G Broadcast transmission went on air at Wendelstein in Bavaria (link to press release). The strength of broadcast transmission is that it efficiently delivers the same data to many users simultaneously over a large area. It can now be used to complement existing unicast (OTT) transmissions in the mobile network.
In this context, your task is to develop an algorithm (machine learning) that optimizes the customer's user experience (by minimizing the unsatisfied demand) and maximizes the profit (advertising earnings) of the content provider.
The data provided consists of the requested bandwidth for five streams of media content (videos, live television), along with the bandwidth available for Over-the-Top (OTT) streaming, over a historical period of 9 months. Additionally, the advertising revenue is provided for each stream, but this money is only earned when the stream is broadcast.
Your tasks are now:
The results of the two tasks must be submitted in a CSV file with the correct submission format (see chapter 3). The submission is evaluated by a scoring application, and you will then see the result of your solution on the competition page (Results) on CodaLab.
The data is stored in two CSV files:
This dataset contains values that are used to train machine learning models. In order to do this, you need both the features (the data used for predicting) and targets (the actual columns you want to predict).
The test dataset contains only features, without the targets. As mentioned before, your task is to take the machine learning models trained on the train data and apply them to this dataset to generate predictions, which will be evaluated for your final score.
The columns (features, variables) in the training data are described below:
The name of the submission should be submission_[team name].csv.
The file extension MUST be .csv; otherwise the evaluation program will not accept your solution! This CSV file must be packed into a ZIP file for the upload to CodaLab. The name of the ZIP file is not important.
Delimiter: , (comma)
Decimal separator: . (full stop)
Thousands separator: nothing
All other formats with different separators and delimiters are not supported!
The first row of the CSV file must contain the column names exactly as they appear in the sample_submission.csv file.
If there are other names (check capitalization!), the scoring-program will reject your submission!
You should predict the demand of the media content of the five streams and the available bandwidth for OTT for the 3 months after the training dataset (train_data.csv) ends. Therefore, the file must contain:
The evaluation program will check if your solution contains 2256 rows and 6 columns. If the size is not correct it will throw an error!
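The format rules above can be checked locally before uploading. The following is only a minimal sketch, not the organizers' evaluation program: the function names `validate_submission` and `pack_submission` are made up for illustration, the expected header is meant to be read from sample_submission.csv rather than hard-coded, and the sketch assumes that the 2256 rows are data rows that do not include the header row.

```python
import csv
import io
import zipfile

EXPECTED_ROWS = 2256  # assumption: counts data rows, excluding the header row


def validate_submission(csv_text, expected_header):
    """Return a list of problems found in a submission given as CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text), delimiter=","))
    if not rows:
        return ["file is empty"]
    problems = []
    header, data = rows[0], rows[1:]
    if header != expected_header:  # exact comparison, so capitalization matters
        problems.append(f"header mismatch: {header}")
    if len(data) != EXPECTED_ROWS:
        problems.append(f"expected {EXPECTED_ROWS} data rows, found {len(data)}")
    for line_no, row in enumerate(data, start=2):
        if len(row) != len(expected_header):
            problems.append(f"line {line_no}: {len(row)} fields")
            break  # report only the first malformed row
    return problems


def pack_submission(csv_text, csv_name, zip_path):
    """Write the CSV text into a ZIP archive under the given file name."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(csv_name, csv_text)
```

An empty list from `validate_submission` means that none of these particular checks failed; the real evaluation program may perform additional checks.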
After you submit your ZIP file, which contains your solution as a CSV file, the following steps are performed:
All these steps are executed for each submission. You can see the status "Running" during the execution. The evaluation may take a few minutes when the server is busy with many other concurrent submissions.
If an error occurs, the evaluation is stopped and you will see the error message in the log-file and in the front-end of CodaLab.
The calculation of the Final Score is described below to ensure transparency and traceability. The scoring-program is written in Python.
The Final Score is the combination of three separate results. These three calculations have already been mentioned in the first chapter:
In statistics, the mean squared error (MSE) or mean squared deviation (MSD) of an estimator (of a procedure for estimating an unobserved quantity) measures the average of the squares of the errors—that is, the average squared difference between the estimated values and what is estimated. MSE is a risk function, corresponding to the expected value of the squared error loss. The fact that MSE is usually strictly positive (and not zero) is because of randomness or because the estimator does not account for information that could produce a more accurate estimate.
(from Wikipedia: https://en.wikipedia.org/wiki/Mean_squared_error)
We use this statistic because it measures the quality of an estimator. The result is always non-negative, and values closer to zero are better.
Best value: 0 Worst value: ∞
The scoring-program takes each prediction column (stream.1 to stream.5 and bandwidth_available_OTT) of your submission, compares it with the corresponding true column of the reference file, and computes the RMSE. For this, it uses the Python function sklearn.metrics.mean_squared_error from the scikit-learn library (https://scikit-learn.org/stable/modules/generated/sklearn.metrics.mean_squared_error.html).
This function calculates the result with the following formula:
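If $\hat{y}_i$ is the predicted value of the $i$-th sample, and $y_i$ is the corresponding true value, then the mean squared error (MSE) estimated over $n_{\text{samples}}$ is defined as:
$$\mathrm{MSE}(y, \hat{y}) = \frac{1}{n_{\text{samples}}} \sum_{i=0}^{n_{\text{samples}}-1} (y_i - \hat{y}_i)^2$$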
After the MSE of each of the 6 columns has been calculated, the square root of each value is taken, and the arithmetic mean of these 6 roots gives the RMSE result:
$$\mathrm{RMSE} = \frac{1}{6} \sum_{c=1}^{6} \sqrt{\mathrm{MSE}_c}$$
For the computation of the square root and the arithmetic mean the functions of NumPy are used.
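To estimate this part of the score on your own validation split, the calculation can be sketched as follows. This is not the scoring-program itself: it deliberately uses only the Python standard library instead of scikit-learn and NumPy, and the names `column_mse` and `mean_rmse` as well as the dictionary-of-lists data layout are assumptions made for illustration.

```python
import math
from statistics import fmean


def column_mse(y_true, y_pred):
    """Mean squared error of one column."""
    return fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def mean_rmse(reference, submission, columns):
    """Square root of each column's MSE, then the arithmetic mean of those roots."""
    return fmean(
        math.sqrt(column_mse(reference[c], submission[c])) for c in columns
    )
```

Here `reference` and `submission` map a column name such as stream.1 to a list of values, and `columns` lists the six column names to compare.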
Range of results:
In principle, the RMSE can range from 0 (best) to infinity ∞ (worst), but if your predictions stay within the value range of the training dataset, you will get a range from 0 (best) to 5 (worst). This is important for the scaling, which maps the RMSE to the range from 0 (best) to 100 (worst).
Predict the demand of the streams and the available bandwidth as accurately as possible in order to bring the RMSE close to 0.
The Unsatisfied Demand is a coefficient, which represents how much demand of the customer couldn't be satisfied. For example, a customer of a mobile network provider wants to watch a live soccer game via OTT but the bandwidth of the mobile network is too small and the video cannot be watched without interruptions. This is an example of an unsatisfied demand because the wish of a customer could not be fulfilled.
The calculation of the Unsatisfied Demand is not difficult. First, the required bandwidth for OTT must be calculated. To do this, take the sum of all streams and subtract the stream that is broadcast in this time slot:
The next step is to compute the Unsatisfied Demand for one hour (one time slot):
We are only interested in the unsatisfied demands, so we only consider negative values. For the final result of the calculation we create the absolute sum over all negative values for the whole 3 months:
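The three steps above can be sketched as follows. This is an illustration under assumptions, not the scoring-program: the hourly value is assumed to be the available OTT bandwidth minus the required OTT bandwidth (so that a negative value means a shortfall), the broadcast channel is assumed to be given as a 1-based stream number, and the function name and data layout are made up. The text does not state whether the scorer uses your predicted demand or the true demand from the reference file, so the sketch simply works on whatever values are passed in.

```python
def unsatisfied_demand(streams, available_ott, channel_broadcasted):
    """Absolute sum of the negative hourly differences over all time slots.

    streams: one list per time slot with the demand of each of the five streams
    available_ott: OTT bandwidth available in each time slot
    channel_broadcasted: 1-based stream number broadcast in each time slot
    """
    total = 0.0
    for demand, available, channel in zip(streams, available_ott, channel_broadcasted):
        # The broadcast stream does not need OTT bandwidth.
        required_ott = sum(demand) - demand[channel - 1]
        # Assumed sign convention: negative means demand exceeds the bandwidth.
        difference = available - required_ott
        if difference < 0:
            total += -difference
    return total
```

Broadcasting the stream with the highest demand in a time slot reduces the required OTT bandwidth the most, which is why the choice of channel influences this result.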
Range of results:
The minimum (best) and maximum (worst) sums of the Unsatisfied Demand are known only to us because we can calculate these values with the reference file. These boundaries must be known for the scaling to the range from 0 (best) to 100 (worst).
Minimize the Unsatisfied Demand to satisfy all customers and to improve their Quality of Service (watching TV without interruptions).
The calculation of the Advertising Earnings is very easy because it only depends on your choice of the channel (channel_broadcasted) that is broadcast in a given time slot (one hour).
The scoring-program therefore only takes advertising_earnings.x (from the reference file) of the broadcast channel x (from your solution) and adds up all earnings over the whole dataset (each time slot of the 3 months):
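As a small illustration of this sum, here is a sketch under assumptions: the function name and the list-of-lists layout are made up, and channel_broadcasted is assumed to hold a 1-based channel number that selects the matching advertising_earnings.x column.

```python
def advertising_earnings(earnings, channel_broadcasted):
    """Sum the earnings of the broadcast channel over all time slots.

    earnings: one list per time slot with advertising_earnings.1 to .5
    channel_broadcasted: 1-based channel number chosen in each time slot
    """
    return sum(
        slot[channel - 1] for slot, channel in zip(earnings, channel_broadcasted)
    )
```

Only the earnings of the channel you choose in a time slot are counted; the other four values of that slot do not contribute.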
Range of results:
The minimum (worst) and maximum (best) sums of the Advertising Earnings are known only to us because we know the ground truth from the reference file and can calculate these values. This is important for the scaling to the range from 0 (best) to 100 (worst).
Maximize the profit for the media content provider and therefore optimize and increase the Advertising Earnings.
Generating a ranking requires a single overall result. This is difficult as long as the individual results have different ranges. Therefore, they must first be scaled to the same range; afterwards they can be combined into one Final Score.
As a Final Score we want a value which is in the range from 0 (best) to 100 (worst). However, another problem is that the range of the Advertising Earnings is reversed. The following table shows the maximum/minimum values and the ranges of all individual results.
The values for Unsatisfied Demand and Advertising Earnings are secret and known only to us, but you can see which values are good and which are bad.
Now we need formulas to scale these values to the range from 0 to 100. The formula for the transformation for RMSE is easy:
The scaled value for Unsatisfied Demand is calculated with a linear transformation:
The last calculation is the most complex one because the range must not only be scaled but also reversed:
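The three transformations can be sketched as follows. These are assumptions derived from the stated ranges, not the organizers' exact formulas: the RMSE is assumed to be mapped linearly from 0 to 5 onto 0 to 100, and the other two results are assumed to use an ordinary min-max scaling. The function names are made up, and the boundary values in the usage note are invented because the real ones are secret.

```python
def scale_rmse(rmse, worst=5.0):
    """Map an RMSE from the range [0, worst] to [0, 100]; 0 stays the best value."""
    return rmse / worst * 100.0


def scale_linear(value, best, worst):
    """Linear map that sends `best` to 0 and `worst` to 100."""
    return (value - best) / (worst - best) * 100.0
```

For the Unsatisfied Demand, `best` is the minimum and `worst` the maximum. For the Advertising Earnings the roles are swapped (`best` is the maximum, `worst` the minimum), which reverses the range automatically. For example, with invented boundaries of 100 (best) and 20 (worst), earnings of 80 give `scale_linear(80, best=100, worst=20)`, which is 25.0.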
After these three formulas, the three individual scores are in the same range and can now be combined to one Final Score.
The last step of the whole evaluation is the calculation of the Final Score, which is used as the ranking criterion on the CodaLab leaderboard. We decided to weight the three individual results because they depend on each other, so it makes no sense to give all of them the same weight. The following table shows the weight of the individual scores:
For the calculation of the Final Score the numpy.average function is used:
This function computes the following formula:
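With weights, numpy.average returns the weighted arithmetic mean, that is, the sum of each value multiplied by its weight, divided by the sum of the weights. The sketch below reproduces this in plain Python so that it runs without NumPy. The function name, the scores and the weights are made up for illustration; the real weights are the ones from the organizers' table.

```python
def weighted_average(values, weights):
    """Weighted arithmetic mean: sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Hypothetical scaled scores (RMSE, Unsatisfied Demand, Advertising Earnings)
# and hypothetical weights, chosen only to make the arithmetic easy to follow.
scores = [20.0, 50.0, 80.0]
weights = [1.0, 2.0, 1.0]
final_score = weighted_average(scores, weights)  # (20 + 100 + 80) / 4
```

With NumPy installed, `numpy.average(scores, weights=weights)` performs the same calculation.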
This Final Score is then forwarded to CodaLab, where it is shown on the leaderboard if it is better (smaller) than your previous best submission or the best submission of your teammates.
What fields of study are acceptable for participation in the competition?
The tasks are aimed at students majoring in electrical engineering – ideally with specialization in communication engineering.
How many semesters must participants have completed to participate?
To participate, we generally recommend that students have completed at least their second year. We are also glad to give students who have completed fewer semesters a chance. However, they should pay attention to the composition of their group: simply get one or more participants who are at a later stage of their studies to join your group!
Which academic degrees can participants hold?
It doesn't matter whether the participants are working towards a bachelor's or master's.
Which groups are excluded from participating?
Exmatriculated students, graduates and PhD candidates are not allowed to participate. Please note that this is a competition for engineering students and that no students majoring in business or the social sciences are allowed to take part.
What knowledge do the participants need?
Technical knowledge of data processing (machine learning) may be helpful. The main thing is that you have fun :)
Can I take part without a team?
We ask you to register as a team. Teams may consist of three to five persons. However, we offer a teamfinder if you are a one-person team :)
For all additional questions simply e-mail us: firstname.lastname@example.org
When do we get the information on the finals?
All information on the finals will be mailed right after the announcement of the finalists on May 20th (except Asia: May 22nd).
Do we have to book an accommodation for the finals?
We have already booked accommodation for you. If you are traveling a long distance, we recommend arriving on June 4th. All costs will be covered by Rohde & Schwarz.
How to get to the finals?
We ask you to book your tickets (bus, train, plane, shared car, etc.) on your own. You will get more information once you have made it to the finals. All costs will be covered by Rohde & Schwarz.
Does R&S cover the travel costs for the finals?
Yes, see above :)
Can we extend our stay in Munich?
Sure! We booked the “Jägers Hostel” for you, so feel free to contact the hostel to ask for an extension of your stay. Please note that you have to cover the costs for the nights after the official competition yourself, that is, from June 8th onwards.
Does the data set consist of real data or fictitious data?
The data is artificially generated with the goal of simulating real-world data.
What is the format of the CSV files used?
The provided CSV files use a comma “,” as delimiter and a full stop “.” (dot) as decimal separator. Submit files only in this format; the scoring-program will reject any other format.
How can I open the provided files with an Office Program (Microsoft Excel, LibreOffice)?
Windows: On Windows, change the file extension to .txt and then open the file with Excel. After opening, a window appears in which you can select the format of the file. Enter all the settings mentioned above there. Afterwards, Excel should display the file correctly.
Linux: In Linux you can use the program LibreOffice Calc without any special configurations.
MacOS: You can use Numbers to open the file, or alternatively also Excel.
Which programming languages can I use for solving the challenge?
You can use any language you want because you only have to submit a csv-file with the solution.
Which units are used in the dataset?
The data has no units because they are not necessary for the challenge. You can imagine the demand and the bandwidth as something like Gb/s. The advertising earnings represent the income from advertisements shown on a stream when it is broadcast. These earnings could be expressed in a unit like ten thousand euros, meaning that a 1 could represent 10 000€, but this is only an example.
Are negative values possible in the dataset or submission?
Yes, you can upload solutions with negative values. In reality, however, a negative demand for media content is not possible, so the reference solution for the 3 months doesn’t contain negative values.
During my submission an error occurs and my solution is rejected. What should I do?
Please check whether your submission fulfills all requirements and has the same format as the sample_submission. If you can’t find any error and the error message of the scoring-program makes no sense to you, please contact us via Facebook, Forum or E-Mail. We are all human, and mistakes in the scoring-program are possible.
What is 5G?
Once every few years, a new generation of the mobile network is introduced, which basically only promises faster mobile data on smartphones. What is the difference between the fifth generation – 5G for short – and its predecessors 2G, 3G and 4G? Put simply: this time it’s no longer just about the telecommunications sector. Autonomous driving, smart homes and smart cities or medical operations from a distance: 5G makes them truly tangible, because this time a new, universal standard promises real technology convergence in a previously unattainable form and bandwidth.
What is the difference between 4G (LTE) and 5G?
Simply put, 5G is widely believed to be smarter, faster and more efficient than 4G. It promises mobile data speeds that far outstrip the fastest home broadband network currently available to consumers. With speeds of up to 100 gigabits per second, 5G is set to be as much as 100 times faster than 4G. One of the biggest improvements of 5G over 4G (LTE) is its low latency.
What is 5G Broadcast?
What does 5G actually mean for broadcast? The convergence of broadcasting and broadband networks will enable content to be transmitted to domestic television sets as well as mobile devices in a uniform standard in the future.
The technical term is “5G Broadcast” and refers to a large area transmission network with which broadcasting content can be transmitted terrestrially similar to DVB-T2. A preliminary stage of 5G broadcast is the ATSC 3.0 broadcast transmission system. With help of the so-called Next Gen TV, the 2018 Olympic Games in South Korea were already transmitted via 5G networks and the USA is also carrying out tests with the new technology.
The only thing you’ll need to receive 5G radio is a 5G-enabled device with an unlocked broadcast function. What is currently still in the development stage could be a mobile phone, an on-board television in the car, or a stationary television in the future. For example, mobile users will be able to view program content on their smartphones without having to fear that their data volume will be used up within a very short time. The content simply reaches the end device via the HPHT broadcasting network, without the respective mobile phone provider being directly involved.
In theory, the combination of personalized and location-based content could enable forms of advertising such as targeted advertising outside the classical World Wide Web. 5G Broadcast can also accommodate the increasingly individualized user behavior: The asynchronous distribution of program content in the mobile user area is only made possible by a data highway such as 5G with full feedback channel capability. Ultimately there is no need to worry in the standard version: Anonymous broadcasting will not be compromised.
What is 5G Today?
As part of the Bavarian research project 5G TODAY, a 5G test field for broadcasting has been set up in the Bavarian Oberland. Under the direction of the Institute of Broadcast Technology, project partners Kathrein and Rohde & Schwarz are investigating large-scale TV transmission in the FeMBMS (Further evolved Multimedia Broadcast Multicast Service) broadcast mode in 5G. They are supported by the associated partners Telefónica Deutschland and Bayerischer Rundfunk, which operates the 5G FeMBMS transmitter network as a test field at its transmitter sites.
Two Rohde & Schwarz high-power transmitters with 100 kW ERP are being installed at the Bavarian Broadcasting Corporation transmitter sites in Munich-Ismaning and on top of Wendelstein Mountain (1828 meters altitude). Kathrein antennas are being integrated and specially optimized for cellular reception. Both test transmitters will operate in a single frequency network over channel 56/57 (750 MHz to 760 MHz). The spectrum for the test transmitters is being provided by Telefónica.
The IRT has developed an FeMBMS receiver based on software defined radio (SDR) technology. IRT is also involved in transmitter network planning and test site measurements.
The 5G TODAY research project is funded by the Bavarian Research Foundation over a period of 28 months.
What is the aim of 5G Today?
The insights from the project 5G TODAY will contribute greatly towards advancing 5G broadcast, supporting standardization work and promoting the development of components all the way to market launch. The goal is to enable the user to take advantage of both broadcasting and unicast transmission mechanisms on a single device – comprehensively, continuously and without further restrictions.
Where can I find more information about 5G Today?
You can find more information about the project 5G Today on the website
There you can follow the project and get information about all important activities.
What is FeMBMS?
FeMBMS (Further evolved multimedia broadcast multicast service) is a further development of the LTE broadcast mode eMBMS in 3GPP Release 14. It enables 100% of the transmission capacity to be used for broadcasting services. FeMBMS also allows larger transmitter cells in a single frequency network.
FeMBMS receivers are designed based on software-defined radio technology. In the future, this technology could be integrated into smartphones, tablets and TVs.
What is 3GPP Release 14?
Release 14 of the 3G Partnership Project (3GPP) standard supports important requirements for economically broadcasting TV programs in large-cell 4G and 5G networks. Improvements include reception without a SIM card and without authentication, as well as the option of using up to 100 percent of the available transmission capacity in the new further evolved multimedia broadcast multicast service (FeMBMS) mode for broadcast applications. Considerably increased inter-site distances permit the use of broadcast transmitter stations for economical area coverage. The 3GPP standard established a receive-only mode without the need for a return channel and defined the audiovisual transport and coding formats that are now used in broadcasting technology.
What does the abbreviation OTT mean?
Over the top (OTT) is a term used to refer to content providers that distribute streaming media as a standalone product directly to viewers over the Internet, bypassing telecommunications, multichannel television, and broadcast television platforms that traditionally act as a controller or distributor of such content.
(from Wikipedia: https://en.wikipedia.org/wiki/Over-the-top_media_services)
Why is 5G Broadcast very interesting for mobile network operators?
New media-technologies bring large, rapidly growing AV data volumes (4K, 8K, VR / AR / 360°) and long usage times. 5G broadcast adds an additional feature to 5G mobile devices without any additional cost to the recipient and allows for innovative, attractive media services, e.g. the cost-effective combination of classic linear television with the fast-growing market for associated on-demand services. In addition, long-term investments in transmission towers, antennas and supply lines can be protected. Off-loading can be used to distribute large amounts of data to many equal user groups.
What is the advantage for the customer of 5G Broadcast?
5G Broadcast offers linear, highly reliable broadcast offerings without any additional costs on smartphones or tablets. It enables consumers to seamlessly transition from linear offerings to their home, mobile, and broadband networks.
Which advantages does 5G Broadcast offer for mobile device manufacturers?
5G as a global standard promises economies of scale through cheap production of technology components. Convergence with the classical 5G communication technology on mobile devices could be used to allow new services such as personalized services.
Do I need machine learning to solve the challenge?
No, you don’t necessarily need machine learning, but it will help you get a better result in the challenge. This is also a chance to learn more about the interesting topic of artificial intelligence, especially machine learning, so use it and become an expert.
What is machine learning?
Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to effectively perform a specific task without using explicit instructions, relying on patterns and inference instead. It is seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model of sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to perform the task. (Wikipedia).
Where can I find more information about machine learning?
Two excellent and free books on machine learning are Introduction to Statistical Learning and Elements of Statistical Learning. A less theoretical and more practical commercial book is Hands-On Machine Learning with Scikit-learn and Tensorflow.
What is CodaLab?
CodaLab Competitions is a powerful open source framework for running competitions that involve result or code submission. You can either participate in an existing competition or host a new competition.
Who developed the platform?
Codalab was created in 2013 as a joint venture between Microsoft and Stanford University. In 2014, ChaLearn joined to co-develop Codalab competitions. Since 2015, University Paris-Saclay is community lead of Codalab competitions, under the direction of Isabelle Guyon, professor of big data. Codalab is administered by CKCollab and the LRI staff.
Where can I find currently running competitions?
All publicly available competitions can be found under https://competitions.codalab.org/competitions/.
Where can I find the manual or help page of CodaLab?
You can find the platform tutorial on Facebook (Competition-Group) in which all necessary steps for the registration and the submission of a solution are described.
The manual and description of the platform is hosted on the GitHub page of CodaLab. This page can be found with the following link https://github.com/codalab/codalab-competitions/wiki
Important parts for participants:
Participating in a competition:
Creating and joining a team (only competition teams!)
Is it possible to participate in a competition without a registration on CodaLab?
No, it is not possible to enroll in a competition without a registration on CodaLab! Therefore, you have to create a CodaLab account (https://competitions.codalab.org/accounts/signup/?next=/) before you can participate in the Rohde & Schwarz Engineering Competition 2019.
Where can I find help on general topics?
Questions which could be also interesting for other participants should be preferably asked in the forum on the competition page. (https://competitions.codalab.org/forums/19282/) Another possibility for questions is to use one of the social media channels of Rohde & Schwarz like the Facebook-Group. They will be answered as soon as possible by the Rohde & Schwarz Engineering Competition Team.
For more detailed or personal questions you can directly write to the email address email@example.com.
Who can help me with problems with CodaLab (upload of the submission)?
In addition to the options already mentioned (forum, social media and e-mail), in urgent cases you can also contact the person responsible for the platform, Witsch Daniel (firstname.lastname@example.org).
Who can help me with dataset problems?
If you have any questions or problems concerning the dataset, please ask them in the forum so that all participants and teams have the same knowledge and nobody is disadvantaged.
Where can I find more information about 5G Broadcast?
If you want more information or have specific questions about 5G Broadcast, you can post them in the forum or in the Facebook-Group.
If you have technical questions, you can also send them directly to the email address email@example.com. These questions will be answered by 5G Broadcast Rohde & Schwarz specialists.
Who should I contact if I am interested in a career at Rohde & Schwarz?
We are always looking for young and motivated talents for working student employment, final theses or even a permanent position. Therefore, you can send us your questions about a career at Rohde & Schwarz to the following email address: firstname.lastname@example.org.
🗣 We want to be respectful in the way we communicate and act.
🦹🏻‍♂️ We respect each other as individuals, along with each other's opinions.
🤪 We want to have fun.
🔮 Honest and trustworthy feedback over misleading answers.
🇬🇧 In order to be transparent, the entire communication on all platforms will be done in English only.
We set out to provide a comfortable environment on every platform used during the datathon. This should lead to a harassment-free experience for everyone regardless of the following:
We absolutely do not accept any form of harassment of other participants. Any violent, sexual or abusive language and imagery is not appropriate and not accepted. This includes:
If such a violation is noticed during or after the Datathon, participants may be sanctioned or expelled from the hackathon without a refund (if applicable) at the discretion of the hackathon organizers.
Harassment includes offensive verbal comments related to gender, gender identity and expression, age, sexual orientation, disability, physical appearance, body size, race, ethnicity, nationality, religion or political views, sexual images in public spaces, deliberate intimidation, stalking, following, photography or audio/video recording against reasonable consent, sustained disruption of talks or other events, inappropriate physical contact, and unwelcome sexual attention.
Photography is encouraged, but other participants must be given a reasonable chance to opt out from being photographed. If they object to the taking of their photograph, comply with their request. It is inappropriate to take photographs in contexts where people have a reasonable expectation of privacy (in bathrooms or where participants are sleeping).
Participants asked to stop any harassing behavior are expected to comply immediately.
As this is a hackathon, we would like to explicitly note that the hacks created at our hackathon are equally subject to the anti-harassment policy.
Sponsors and partners are also subject to the anti-harassment policy. In particular, sponsors should not use sexualised images, activities, or other material. Sponsor representatives (including volunteers) should not use sexualised clothing/uniforms/costumes, or otherwise create a sexualised environment.
If you are being harassed, notice that someone else is being harassed, or have any other concerns, please contact a member of hackathon staff immediately.
Hackathon staff will be happy to help participants contact any local security or local law enforcement, provide escorts, or otherwise assist those experiencing harassment to feel safe for the duration of the hackathon. We value your attendance.
If a participant engages in harassing behavior, the hackathon organisers may take any action they deem appropriate. This includes warning the offender, expulsion from the hackathon with no refund (if applicable), or reporting their behaviour to local law enforcement.
We expect participants to follow these rules at hackathon and workshop venues and hackathon-related social events.
As is standard for similar events. Source: https://hackcodeofconduct.org
Start: April 26, 2019, 7 a.m.
Description: Phase for submitting your 5G Broadcast solutions
May 19, 2019, 10 p.m.