Author: Benjamin Kille
In 2015, we continue the quest for news recommendation. The CLEF initiative hosts a variety of labs, and NewsREEL appears among them for the second time.
We offer two tasks, each of which allows participants to evaluate the quality of their recommendation algorithms. Task 1 connects participants to actual consumers of news articles. The Open Recommendation Platform (ORP) provides an interface to various publishers. Participants receive information about visitors, news articles, and the interactions between them. Additionally, ORP issues recommendation requests and tracks the success of the recommendations provided.
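To give a concrete impression of what this looks like on the participant side, here is a minimal sketch of a service that answers ORP recommendation requests. It assumes ORP POSTs JSON messages to your endpoint and accepts a JSON response carrying a list of article IDs; the message types and field names used below ("item_update", "recommendation_request", "recs") are placeholders, so please consult the ORP documentation and our tutorial for the exact wire format.

```python
# Minimal sketch of a Task 1 recommender service.
# Assumptions: ORP POSTs JSON messages to this endpoint and expects a JSON
# reply; the field names "type", "id", and "recs" below are placeholders,
# not the official ORP message format.
import json
from collections import deque
from http.server import BaseHTTPRequestHandler, HTTPServer

RECENT_ITEMS = deque(maxlen=100)  # most recently seen article IDs

class OrpHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        message = json.loads(self.rfile.read(length) or b"{}")

        if message.get("type") == "item_update":
            # Remember newly published or updated articles as candidates.
            item_id = message.get("id")
            if item_id is not None:
                RECENT_ITEMS.appendleft(item_id)
            payload = {}
        elif message.get("type") == "recommendation_request":
            # Baseline: return the most recently seen articles.
            payload = {"recs": list(RECENT_ITEMS)[:6]}
        else:
            payload = {}

        body = json.dumps(payload).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), OrpHandler).serve_forever()
```

As a baseline, this simply recommends the most recently seen articles; a real submission would replace that part with your own algorithm.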
We seek to give participants enough time to implement and optimise their recommenders, while at the same time we have to track their performance. Consequently, we defined several time frames within which performance will be tracked:
Start date | End date
24 Feb     | 2 Mar
17 Mar     | 23 Mar
7 Apr      | 13 Apr
5 May      | 2 Jun
We left 2 weeks between these periods for implementing new recommenders or optimising existing ones. The last period spans 4 weeks. Downtimes will negatively affect your performance, so we suggest having a stable recommender ready before the start of the final period.
If you are looking for support to get started, you may find our tutorial helpful. The tutorial illustrates how to register an ORP account. Additionally, it shows you how to start recording data using a baseline recommender.
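In the same spirit, a minimal sketch of the data-recording step could look like the following; the file name and helper function are illustrative, not prescribed by ORP or the tutorial.

```python
# Sketch of recording incoming ORP messages for later offline analysis.
# Assumption: every message is kept verbatim, one JSON object per line;
# the file name and helper are illustrative.
import json
import time

LOG_FILE = "orp_messages.jsonl"

def record_message(message: dict) -> None:
    """Append an incoming ORP message, with a receive timestamp, to disk."""
    with open(LOG_FILE, "a", encoding="utf-8") as log:
        log.write(json.dumps({"received_at": time.time(), "message": message}) + "\n")
```

Calling record_message from the request handler sketched above keeps a complete trace of the traffic your recommender sees, which you can replay offline later.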
In Task 2, we provide a large-scale data set comprising interactions and news articles recorded with ORP over a two-month period. Your task is to build a recommender that most accurately predicts future interactions. We marked a subset of approximately 20% of impressions as recommendation requests. Your recommender ought to predict with which article the user will interact. More precisely, the recommender should provide a list of six article references. If the user visits one of these, we consider your recommendation a hit. The more hits your recommender gets, the more likely you are to win Task 2. Besides prediction accuracy, we also consider functional qualities, mainly how efficiently your recommender handles high workloads. We will provide more details about how exactly we measure scalability in the upcoming weeks.
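The accuracy part of this evaluation can be summarised in a few lines. The sketch below counts a request as a hit when the article the user actually visited appears among the six recommended ones; function and variable names are illustrative and not part of the official evaluation code.

```python
# Sketch of the Task 2 accuracy measure described above: a marked
# recommendation request is a hit if the article the user actually visited
# is among the six recommended articles. Names are illustrative.
from typing import Callable, Iterable, Tuple

def hit_rate(
    requests: Iterable[Tuple[dict, int]],   # (request context, article actually visited)
    recommend: Callable[[dict], list],       # your recommender: context -> list of article IDs
    list_size: int = 6,
) -> float:
    hits, total = 0, 0
    for context, visited_article in requests:
        recommendations = recommend(context)[:list_size]
        if visited_article in recommendations:
            hits += 1
        total += 1
    return hits / total if total else 0.0
```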
We are looking forward to your participation. You may register here.