Input and Output files - FLAGR - Rank Aggregation Library

Input 1: The file of the input preference lists
FLAGR requires that all input preference lists to be aggregated be stored in a single CSV file, regardless of the number of topics (queries) or voters (rankers) involved. The columns of this CSV file must be organized as follows:

Query/Topic String, Voter Name, Item Code, Item Score, Algorithm/Dataset

where:

Query/Topic: the query string or the topic for which the preference list is submitted.
Voter: the name of the voter, or the ranker who submits the preference list for the specified Query/Topic.
Item Code: a unique name that identifies a particular element of the preference list. A voter cannot submit the same element for the same query/topic two or more times. This means that each element appears exactly once in each preference list. However, the same element may appear in lists submitted by other voters.
Item Score: the preference score assigned to an item by a specific voter. It reflects the importance (or the relevance, or the weight) of the element. In many cases (e.g. search engine rankings), the preference scores are unknown. In such cases the score of an item can be replaced by its reverse rank, so that items ranked near the top of a list receive higher scores than items ranked lower.
Algorithm/Dataset: A user-defined string that usually represents the origin of a particular preference list. It may receive any non-blank value.
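The column layout above, together with the reverse-rank substitution for unknown scores, can be sketched in a few lines of Python. This is only an illustration: the helper name rows_from_ranking, the identifiers Q1, V1, d7, d2, d9 and the value demo are hypothetical, and the sketch assumes that the file carries no header row and that the score n - position is an acceptable reverse rank.

```python
import csv
import io

def rows_from_ranking(query, voter, ranked_items, dataset="demo"):
    """Turn one ranked list (best item first) into rows in the five-column layout."""
    n = len(ranked_items)
    # An item may appear only once in a given preference list.
    if len(set(ranked_items)) != n:
        raise ValueError("duplicate item in preference list")
    # Reverse rank used as score: the top item gets n, the last item gets 1.
    return [[query, voter, item, n - pos, dataset]
            for pos, item in enumerate(ranked_items)]

rows = rows_from_ranking("Q1", "V1", ["d7", "d2", "d9"])

# Serialize to CSV text (assumption: no header row).
buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerows(rows)
print(buf.getvalue())
```

Any strictly decreasing function of the rank would satisfy the stated requirement; the linear one is used here only because it is the simplest.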

You may find an example of an input list CSV file here. This example file contains the preference lists that were submitted by 50 voters for 20 queries. Each input list contains 30 elements. Therefore, the number of rows in this file is equal to \(50 \cdot 20 \cdot 30=30000\).
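Before running an aggregation it can be useful to check a file against the constraints described above. The following is a minimal sketch, not part of FLAGR: the function check_input and the sample data are hypothetical, and it assumes five columns per row and no header row. It counts rows, queries and voters, and reports any item that a voter submitted more than once for the same query.

```python
import csv
import io
from collections import Counter

def check_input(csv_text):
    """Return (rows, queries, voters, duplicates) for input CSV given as text."""
    seen = Counter()
    queries, voters = set(), set()
    for query, voter, item, _score, _origin in csv.reader(io.StringIO(csv_text)):
        seen[(query, voter, item)] += 1
        queries.add(query)
        voters.add(voter)
    # Any (query, voter, item) triple seen more than once violates the format.
    duplicates = sorted(key for key, count in seen.items() if count > 1)
    return sum(seen.values()), len(queries), len(voters), duplicates

sample = "Q1,V1,a,2,demo\nQ1,V1,b,1,demo\nQ1,V2,a,1,demo\nQ1,V2,a,1,demo\n"
print(check_input(sample))
```

For a file shaped like the example described above, such a check would be expected to report 30000 rows, 20 queries, 50 voters and no duplicates.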

Output 1: The file of the aggregate list/s
In this file FLAGR stores the output of the selected rank aggregation method, namely, the final lists that result from aggregating the input preference lists. The library creates one aggregate list per input query/topic; so, if there are \(Q\) input queries, FLAGR generates \(Q\) aggregate lists and stores them in a CSV file. Each row in the file represents one element of an aggregate list, and the elements of each list are stored in decreasing score order. The columns are organized as follows:

Query/Topic String, Voter Name, Item Code, Item Rank, Item Score
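A file in this layout can be split back into one list per query with the standard csv module. The sketch below is illustrative only: the function load_aggregate_lists and the sample values are hypothetical, and it assumes no header row, integer ranks and numeric scores.

```python
import csv
import io
from collections import defaultdict

def load_aggregate_lists(csv_text):
    """Group output rows by query; each list holds (rank, item, score) tuples."""
    lists = defaultdict(list)
    for query, _voter, item, rank, score in csv.reader(io.StringIO(csv_text)):
        lists[query].append((int(rank), item, float(score)))
    # Sort by rank so the result does not depend on the row order in the file.
    for entries in lists.values():
        entries.sort()
    return dict(lists)

sample = "Q1,agg,d2,1,0.9\nQ1,agg,d7,2,0.4\nQ2,agg,d1,1,1.0\n"
lists = load_aggregate_lists(sample)
print(lists["Q1"])
```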

Input 2: The file of relevant elements (or, the Rels file)
Optionally, the user may provide a second CSV file (called the Rels file) that contains relevance judgments, per query, for the elements of the preference lists in the primary input file. FLAGR's evaluation module uses the Rels file to evaluate each aggregate list that it creates. Its columns must be formatted as follows:

Query/Topic String, 0, Item Code, Relevance Score

where:

Query/Topic: the query string or the topic for which the list is submitted.
0: unused. This value must always be 0.
Item Code: a unique name that identifies a particular element. There cannot be two relevance judgments for the same element for the same Query.
Relevance Score: An integer value that represents the relevance of the item with respect to the mentioned Query/Topic. Typically, zero values represent irrelevant and incorrect elements; negative values represent spam elements; and positive values represent relevant, correct and informative elements.

You may find an example of an input Rels file here. This example file contains the relevance judgments for the elements of all preference lists for all queries of the previous input list file. Note that if FLAGR does not find a relevance judgment for an element, it automatically treats that element as irrelevant (that is, it sets its Relevance Score to 0).
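The Rels layout and the default of 0 for missing judgments can be mirrored in a small lookup structure. This is a sketch under stated assumptions rather than FLAGR's own loader: the names load_rels and relevance and the sample rows are hypothetical, and the file is assumed to have no header row.

```python
import csv
import io

def load_rels(csv_text):
    """Parse Rels rows into a {(query, item): relevance} dictionary."""
    rels = {}
    for query, unused, item, score in csv.reader(io.StringIO(csv_text)):
        if unused.strip() != "0":
            raise ValueError("the second column must always be 0")
        key = (query, item)
        # Only one judgment per element and query is allowed.
        if key in rels:
            raise ValueError(f"duplicate judgment for {key}")
        rels[key] = int(score)
    return rels

def relevance(rels, query, item):
    """Missing judgments count as irrelevant (score 0)."""
    return rels.get((query, item), 0)

rels = load_rels("Q1,0,d2,1\nQ1,0,d7,0\nQ1,0,d5,-1\n")
print(relevance(rels, "Q1", "d2"), relevance(rels, "Q1", "d99"))
```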

Output 2: The evaluation file
If a valid Rels file is provided, the evaluation process takes place automatically. In this case FLAGR evaluates each aggregate list individually and writes the results of the evaluation to a second CSV file.
If there are \(Q\) input queries, then \(Q\) aggregate lists are generated and the evaluation file contains \(Q+1\) rows. The first \(Q\) rows store the evaluation metrics for each aggregate list, whereas the last row contains the average values. The number of columns of the evaluation file depends on the eval_pts parameter that is set by the user. More specifically, there are \(6+4\cdot\texttt{eval_pts}\) columns:

q, num_ret, num_rel, num_rel_ret, ap, P@1, ..., P@eval_pts, R@1, ..., R@eval_pts, D@1, ..., D@eval_pts, N@1, ..., N@eval_pts, ram

where

q is the query string.
num_ret is the length of the aggregate list (i.e. the number of elements in it).
num_rel is the total number of relevant elements for this query.
num_rel_ret is the number of relevant elements included in the aggregate list.
ap is the Average Precision of a specific aggregate list with respect to q.
P@X is the running Precision at the X-th element of the aggregate list.
R@X is the running Recall at the X-th element of the aggregate list.
D@X is the running Discounted Cumulative Gain (DCG) at the X-th element of the aggregate list.
N@X is the running normalized Discounted Cumulative Gain (nDCG) at the X-th element of the aggregate list.
ram is the name of the applied rank aggregation method.
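To make the metric columns concrete, the following sketch computes them for a single aggregate list using common textbook definitions. It is not FLAGR's evaluation code, and FLAGR's exact formulas may differ. The assumptions are: an element counts as relevant when its score is positive, negative (spam) scores contribute zero gain, DCG uses the gain rel / log2(position + 1), and Average Precision is divided by num_rel. The function names dcg and evaluate and the sample data are hypothetical.

```python
import math

def dcg(gains):
    """Discounted Cumulative Gain with a log2(position + 1) discount."""
    return sum(g / math.log2(i + 1) for i, g in enumerate(gains, start=1))

def evaluate(ranked_items, rel_of, eval_pts):
    """ranked_items: aggregate list, best first; rel_of: item -> relevance score."""
    # Assumption: missing judgments and negative scores give zero gain.
    gains = [max(rel_of.get(item, 0), 0) for item in ranked_items]
    ideal = sorted((r for r in rel_of.values() if r > 0), reverse=True)
    num_rel = len(ideal)
    hits, ap_sum = 0, 0.0
    for i, g in enumerate(gains, start=1):
        if g > 0:
            hits += 1
            ap_sum += hits / i  # precision at each relevant position
    metrics = {"num_ret": len(ranked_items), "num_rel": num_rel,
               "num_rel_ret": hits,
               "ap": ap_sum / num_rel if num_rel else 0.0}
    for x in range(1, eval_pts + 1):
        rel_at_x = sum(1 for g in gains[:x] if g > 0)
        ideal_dcg = dcg(ideal[:x])
        metrics[f"P@{x}"] = rel_at_x / x
        metrics[f"R@{x}"] = rel_at_x / num_rel if num_rel else 0.0
        metrics[f"D@{x}"] = dcg(gains[:x])
        metrics[f"N@{x}"] = dcg(gains[:x]) / ideal_dcg if ideal_dcg else 0.0
    return metrics

m = evaluate(["a", "b", "c"], {"a": 1, "c": 1, "z": 1}, eval_pts=2)
print(m["P@2"], round(m["ap"], 4))
```

The dictionary holds 4 + 4 * eval_pts values; adding the q and ram columns gives the 6 + 4 * eval_pts columns described above.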
