The Comparator module includes a single class that implements several tools for conducting performance comparisons of rank aggregation algorithms. The input consists of a data file with the input preference lists, a second file that contains the relevance judgments for the involved list elements, and a group of rank aggregation algorithms to be compared. After running the selected algorithms on the input data, Comparator produces comparison tables in various formats (e.g., CSV, LaTeX) and plots of multiple evaluation measures.
Extended code examples of usage are presented in this notebook.
Implementation file
pyflagr/pyflagr/Comparator.py
Member variables
The class maintains three member variables:
- aggregators: A Python list that contains the objects that handle the rank aggregation algorithms, along with a user-defined description for each.
- results: A Pandas DataFrame that stores the results of the evaluation (namely, the values of various evaluation measures).
- ev_pts: An integer that represents the cutoff point at which the evaluation measures will be computed.
Member methods
add_aggregator(): This function appends new records into the aggregators list. Each record represents a rank aggregation method that will participate in the comparison tests.
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | String, required | - | The name of the rank aggregation algorithm that is inserted. |
| obj | Object, required | - | An object that handles the corresponding rank aggregation method. |
Here is a quick example that initializes a Comparator object and appends three rank aggregation methods:
import pyflagr.Linear as Linear
import pyflagr.Majoritarian as Majoritarian
import pyflagr.Weighted as Weighted
import pyflagr.Comparator as Comparator
EV_PTS = 10
cmp = Comparator.Comparator(EV_PTS)
cmp.add_aggregator("CombSUM-Borda", Linear.CombSUM(norm='borda', eval_pts=EV_PTS))
cmp.add_aggregator("Copeland", Majoritarian.CopelandWinners(eval_pts=EV_PTS))
cmp.add_aggregator("DIBRA-Prune", Weighted.DIBRA(aggregator='combsum:borda', gamma=1.2, prune=True,
w_norm='minmax', d1=0.3, d2=0.05, eval_pts=EV_PTS))
aggregate(): Sequentially invokes the aggregate() method of each algorithm registered in the aggregators list.
This method also requires a file (or a DataFrame) that contains relevance judgments for the individual list elements. The aggregate list generated by each algorithm is automatically evaluated by FLAGR against these relevance judgments. The class computes the values of multiple well-established evaluation measures, including Mean Average Precision (MAP), Precision, Recall, DCG (Discounted Cumulative Gain), and nDCG (normalized DCG). The computed values are written into the self.results DataFrame.
aggregate() takes four arguments:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| input_file | String, required unless input_df is set | Empty string | A CSV file that contains the input lists to be aggregated. |
| input_df | Pandas DataFrame, required unless input_file is set | None | A Pandas DataFrame that contains the input lists to be aggregated. Note: if both input_file and input_df are set, only the former is used; the latter is ignored. |
| rels_file | String, required unless rels_df is set | Empty string | A CSV file that contains the relevance judgements of the involved list elements. FLAGR will evaluate the generated aggregate list(s) by computing the values of multiple performance evaluation measures. The results of the evaluation will be stored in the self.results DataFrame. |
| rels_df | Pandas DataFrame, required unless rels_file is set | None | A Pandas DataFrame that contains the relevance judgements of the involved list elements. Note: if both rels_file and rels_df are set, only the former is used; the latter is ignored. |
Example:
# The input data file with the input lists to be aggregated.
lists = 'testdata.csv'

# The input data file with the relevance judgements.
qrels = 'testdata_qrels.csv'

cmp.aggregate(input_file=lists, rels_file=qrels)
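aggregate() can equally consume in-memory DataFrames through the input_df and rels_df parameters. The sketch below builds toy inputs; the column layout (query, voter, item, score for the input lists, and a TREC-style query, iteration, item, relevance for the judgments) is an assumption and should be matched to the layout of your own CSV files. The final call is commented out because it requires an initialized Comparator object.

```python
import pandas as pd

# Hypothetical toy inputs. The column layout below is an assumption:
# match it to the layout of your own testdata.csv / testdata_qrels.csv.
lists_df = pd.DataFrame([
    ['Q1', 'V1', 'item1', 0.9],   # voter V1 ranks item1 first
    ['Q1', 'V1', 'item2', 0.7],
    ['Q1', 'V2', 'item2', 0.8],   # voter V2 prefers item2
    ['Q1', 'V2', 'item3', 0.6],
])

qrels_df = pd.DataFrame([
    ['Q1', 0, 'item1', 1],        # item1 is relevant for Q1
    ['Q1', 0, 'item2', 0],
    ['Q1', 0, 'item3', 1],
])

# Pass the DataFrames instead of file paths:
# cmp.aggregate(input_df=lists_df, rels_df=qrels_df)
```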
plot_average_precision(): Creates a comparative bar plot of Mean Average Precision (MAP). The arguments include:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| dimensions | (x, y) tuple, optional | (10.24, 7.68) | The plot dimensions (width, height). |
| show_grid | Boolean, optional | True | Determines whether the plot will include grid lines. |
| query | String, optional | 'all' | In case the input data file contains preference lists for multiple queries, this parameter determines which query to plot. Notice that 'all' does not mean that all queries will be plotted; instead, it plots the MAP averaged over all queries. |
Example:
cmp.plot_average_precision((16, 7), True, query='all')

plot_metric(): Creates a plot for a metric at a given cutoff point. The input arguments include:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| cutoff | Integer, required | - | The cutoff point in the aggregate list. The cutoff point must be lower than self.ev_pts. |
| metric | String, required | - | Determines the evaluation measure to be plotted. Acceptable values are 'precision', 'recall', 'dcg', and 'ndcg'. |
| plot_type | String, optional | 'bar' | Determines the plot type. Acceptable values are 'bar' and 'lines'. |
| dimensions | (x, y) tuple, optional | (10.24, 7.68) | The plot dimensions (width, height). |
| show_grid | Boolean, optional | True | Determines whether the plot will include grid lines. |
| query | String, optional | 'all' | In case the input data file contains preference lists for multiple queries, this parameter determines which query to plot. Notice that 'all' does not mean that all queries will be plotted; instead, it plots the selected measure averaged over all queries. |
Example:
cmp.plot_metric(EV_PTS, metric='precision', plot_type='bar', dimensions=(16,8), show_grid=True, query='all')

get_results(): Returns slices of self.results by selecting specific evaluation measures (columns) and queries (rows). The input arguments include:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| cutoff | Integer, required | - | The cutoff point in the aggregate list. The cutoff point must be lower than self.ev_pts. |
| metric | String, required | - | Determines the evaluation measure to be retrieved. Acceptable values are 'precision', 'recall', 'dcg', and 'ndcg'. |
| query | String, optional | 'all' | In case the input data file contains preference lists for multiple queries, this parameter determines which query to retrieve. Notice that 'all' does not mean that all queries will be retrieved; instead, it returns the selected measure averaged over all queries. |
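No usage example is given for get_results() above; a call mirroring the earlier examples would look like cmp.get_results(EV_PTS, metric='precision', query='all'). The pandas-only sketch below illustrates the kind of row/column slicing this method performs on a results-style table; the column names and values are hypothetical.

```python
import pandas as pd

# A hypothetical results table: one row per (algorithm, query) pair,
# one column per evaluation measure at a given cutoff.
results = pd.DataFrame({
    'algorithm':    ['CombSUM-Borda', 'Copeland', 'CombSUM-Borda'],
    'query':        ['all', 'all', 'Q1'],
    'precision@10': [0.62, 0.58, 0.70],
})

# Select the precision@10 column for the averaged ('all') rows only,
# analogous to get_results(10, metric='precision', query='all').
subset = results.loc[results['query'] == 'all', ['algorithm', 'precision@10']]
```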
convert_to_latex(): Returns the LaTeX code for slices of self.results. The input arguments are:
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| cutoff | Integer, required | - | The cutoff point in the aggregate list. The cutoff point must be lower than self.ev_pts. |
| metric | String, required | - | Determines the evaluation measure to be retrieved. Acceptable values are 'precision', 'recall', 'dcg', and 'ndcg'. |
| query | String, optional | 'all' | In case the input data file contains preference lists for multiple queries, this parameter determines which query to retrieve. Notice that 'all' does not mean that all queries will be retrieved; instead, it returns the selected measure averaged over all queries. |
| dec_pts | Integer, optional (maximum value is 6) | 6 | Sets the precision (i.e., the number of decimal points) of the values of the returned evaluation measures. |
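A call such as cmp.convert_to_latex(EV_PTS, metric='ndcg', query='all', dec_pts=4) would return the LaTeX source of the corresponding results slice. The stdlib-only sketch below illustrates the effect of dec_pts, i.e., rounding each measure to a fixed number of decimal points before emitting the table rows; the algorithm names and scores are hypothetical.

```python
# Hypothetical nDCG scores for two algorithms.
scores = {'CombSUM-Borda': 0.6234567, 'Copeland': 0.5812345}

DEC_PTS = 4  # mirrors the dec_pts argument (maximum 6)

# Emit one LaTeX table row per algorithm, rounded to DEC_PTS decimals.
rows = [f"{name} & {value:.{DEC_PTS}f} \\\\" for name, value in scores.items()]
```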
