(Py)FLAGR¶

The fusion of multiple ranked lists of elements into a single aggregate list is a well-studied research field with numerous applications in Bioinformatics, recommendation systems, collaborative filtering, election systems and metasearch engines.

FLAGR is a high-performance, modular, open-source C++ library for rank aggregation problems. It implements baseline and recent state-of-the-art aggregation algorithms that accept multiple ranked preference lists and generate a single consensus list of elements. Several of these methods apply exploratory analysis techniques and belong to the broad family of unsupervised learning methods.

Installation¶

PyFLAGR is a Python library built on top of the FLAGR library core. It can be easily installed with pip:

pip install pyflagr

After installation, PyFLAGR can be used in standard Python programs and Jupyter notebooks. Representative code examples can be found in this notebook.

Downloads, Documentation, Version and License¶

  • The source code of FLAGR and PyFLAGR is available through the official GitHub repository.
  • The library is fully documented at https://flagr.site/.
  • The current (Py)FLAGR version is 1.0.18.
  • Both libraries are licensed under the Apache License, Version 2.0.

Importing and using PyFLAGR¶

PyFLAGR groups its supported rank aggregation methods into six modules:

  1. Linear: This module contains the CombSUM, CombMNZ, BordaCount and SimpleBordaCount classes. CombSUM and CombMNZ support five normalization methods (see Renda et al., 2003). BordaCount and SimpleBordaCount are wrappers around CombSUM with borda and simple-borda normalization, respectively.
  2. Majoritarian: Includes:
    • CondorcetWinners,
    • CopelandWinners and
    • OutrankingApproach (Farah and Vanderpooten, 2007).
  3. MarkovChains: Implements:
    • the four Markov Chains methods of Dwork et al., 2001 and
    • the MCT variant of DeConde et al., 2006.
  4. Kemeny: Includes KemenyOptimal (Kemeny Optimal Aggregation).
  5. RRA: Includes the RRA class (the Robust Rank Aggregation of Kolde et al., 2012, in two variants).
  6. Weighted: This module implements several self-weighting rank aggregation methods. These methods automatically identify the expert voters and include:
    • The Preference Relations Graph method of Desarkar et al., 2016.
    • The Agglomerative method of Chatterjee et al., 2018.
    • The Iterative, Distance-Based method (DIBRA) of Akritidis et al., 2022.

The following statements demonstrate the imports of all PyFLAGR rank aggregation modules in a typical Jupyter notebook.

In [1]:
# Import the PyFLAGR modules for rank aggregation
import pyflagr.Linear as Linear
import pyflagr.Majoritarian as Majoritarian
import pyflagr.MarkovChains as MarkovChains
import pyflagr.Kemeny as Kemeny
import pyflagr.RRA as RRA
import pyflagr.Weighted as Weighted
In [2]:
# Code snippet for displaying dataframes side by side
from IPython.display import display_html
from itertools import chain, cycle

def display_side_by_side(*args, titles=cycle([''])):
    html_str = ''
    for df, title in zip(args, chain(titles, cycle(['<br>']))):
        html_str += '<th style="text-align:center"><td style="vertical-align:top">'
        html_str += f'<h2 style="text-align: center;">{title}</h2>'
        html_str += df.to_html().replace('table', 'table style="display:inline"')
        html_str += '</td></th>'
    display_html(html_str, raw=True)

General Description¶

All PyFLAGR rank aggregation methods include:

  • a standard class constructor: the hyper-parameters of the rank aggregation algorithm, along with other execution parameters, can be passed through the constructor. All constructor arguments have default values and are therefore optional; every constructor can be called with no arguments at all.
  • an aggregate method that runs the algorithm on the selected input and (optionally) evaluates the generated aggregate list. In all algorithms, the aggregate method accepts the following arguments:
    • input_file (String; required unless input_df is set; default: empty string): A CSV file that contains the input lists to be aggregated.
    • input_df (Pandas DataFrame; required unless input_file is set; default: None): A Pandas DataFrame that contains the input lists to be aggregated. Note: if both input_file and input_df are set, only the former is used; the latter is ignored.
    • rels_file (String, optional; default: empty string): A CSV file that contains the relevance judgments of the involved list elements. If such a file is passed, FLAGR evaluates the generated aggregate list(s) by computing several retrieval effectiveness evaluation measures and stores the results in the eval_df DataFrame. Otherwise, no evaluation takes place and eval_df remains empty. Read more on the evaluation of rank aggregation quality.
    • rels_df (Pandas DataFrame, optional; default: None): A Pandas DataFrame that contains the relevance judgments of the involved list elements; it behaves exactly like rels_file. Note: if both rels_file and rels_df are set, only the former is used; the latter is ignored.
    • output_dir (String, optional; default: an OS-specific temporary directory): The directory where the output files (aggregate lists and evaluation) are stored. If it is not set, the default location is used.
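
As a minimal illustration of this pattern (the CSV path below is a placeholder):

import pyflagr.Linear as Linear

# Every constructor can be called without arguments, since all of its
# inputs fall back to their default values.
method = Linear.CombSUM()

# aggregate returns two DataFrames: the aggregate list and the evaluation.
# Without rels_file/rels_df, no evaluation takes place and df_eval is empty.
df_out, df_eval = method.aggregate(input_file='input_lists.csv')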

Input/Output files¶

A rank aggregation application involves a set of queries $Q=\{q_1,q_2,...,q_N\}$ and a set of rankers $R=\{r_1,r_2,...,r_m\}$. Each query $q\in{Q}$ is submitted to all rankers in $R$, each of which responds with a ranked list of preference items, sorted in decreasing order of importance or relevance. For example, in the context of recommendation systems, a set of users may be asked to enlist their preferences in response to the hypothetical query "which are your favorite games?". A rank aggregation algorithm must then merge all the submitted preference lists, discover any important latent information, and generate a single output list with improved element ordering.

For the requirements of this notebook, we use a sample dataset that was created by employing RASDaGen, a synthetic dataset generator for rank aggregation applications. The dataset includes two files: testdata.csv and testdata_qrels.csv.

The former contains preference lists that were submitted by 50 voters for 20 queries. Each input list contains 30 elements. Therefore, the number of rows in this file is equal to $50 \times 20 \times 30=30000$. The columns of this CSV file must be organized in the following manner:

Query, Voter Name, Item Code, Item Score, Algorithm/Dataset

where

  • Query represents the topic for which the preference list is submitted,
  • Voter Name is the name of the ranker who submitted a preference list for a particular Query,
  • Item Code is a unique name that identifies each element of the preference lists,
  • Item Score is the preference score assigned to an item by a Voter, and
  • Algorithm/Dataset is a user-defined string that represents the origin of a particular preference list.

The input CSV file must not contain a header row. On the other hand, testdata_qrels.csv contains relevance judgments for the preference list elements of the primary input file for each query. It is organized in the following fashion:

Query, 0, Item Code, Relevance

where:

  • Query represents the topic for which the preference list is submitted,
  • 0 is unused; this value must always be 0,
  • Item Code is a unique name that identifies each element of the preference lists, and
  • Relevance is an integer value that represents the relevance of the item with respect to the mentioned Query. Typically, zero values represent irrelevant and incorrect elements; negative values represent spam elements; and positive values represent relevant, correct and informative elements.

Please refer to this article for more details.
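
To make the expected layout concrete, the following sketch builds the two inputs in memory (the query, voter names, item codes, scores and relevance values are invented for illustration) and passes them to an aggregator through the input_df and rels_df arguments described in the General Description:

import pandas as pd
import pyflagr.Linear as Linear

# Input lists: Query, Voter Name, Item Code, Item Score, Algorithm/Dataset.
lists_df = pd.DataFrame([
    ['Q1', 'voter1', 'item_a', 0.9, 'demo'],
    ['Q1', 'voter1', 'item_b', 0.7, 'demo'],
    ['Q1', 'voter2', 'item_b', 0.8, 'demo'],
    ['Q1', 'voter2', 'item_a', 0.6, 'demo'],
    ['Q1', 'voter2', 'item_c', 0.2, 'demo'],
])

# Relevance judgments: Query, 0 (unused), Item Code, Relevance.
qrels_df = pd.DataFrame([
    ['Q1', 0, 'item_a', 1],
    ['Q1', 0, 'item_b', 1],
    ['Q1', 0, 'item_c', 0],
])

df_out, df_eval = Linear.CombSUM().aggregate(input_df=lists_df, rels_df=qrels_df)

Saving lists_df with to_csv(..., header=False, index=False) would produce a file in exactly the format expected by input_file.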

PyFLAGR Code examples¶

The following examples demonstrate the usage of all PyFLAGR rank aggregation methods.

Initially, we specify the two files that comprise our sample dataset: testdata.csv contains the input lists to be aggregated and testdata_qrels.csv stores relevance judgments about the elements of those input lists.

In [3]:
# The input data file with the input lists to be aggregated.
lists = '../testdata/testdata.csv'

# The input data file with the relevance judgments.
qrels = '../testdata/testdata_qrels.csv'

Linear combination methods¶

In linear combination methods, the score of each element is computed by summing its partial scores with respect to its rankings in the individual input preference lists. The Linear module implements two such combination methods, CombSUM and CombMNZ, in accordance with the following paper:

  • Renda E., Straccia U., "Web metasearch: rank vs. score based rank aggregation methods", In Proceedings of the 2003 ACM symposium on Applied computing, pp. 841-846, 2003.

Each method is accompanied by an element rank/score normalization technique, as described in the above paper. These techniques are: Rank normalization, Borda normalization, Score normalization and Z-Score normalization. FLAGR adds a fifth normalization technique, called Simple Borda. In contrast to the traditional Borda normalization, Simple Borda assigns a zero partial score to an element that is not ranked by an input preference list.
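
The following from-scratch sketch illustrates the underlying idea for the simple-borda case (an illustration only, not the PyFLAGR code path):

# CombSUM with simple-borda normalization, from scratch (illustration only).
# Each voter contributes a partial score of (n - rank) to every item it ranks;
# items missing from a list receive no contribution from that voter.
lists = [['a', 'b', 'c'], ['b', 'a', 'd']]
n = max(len(l) for l in lists)

scores = {}
for l in lists:
    for rank, item in enumerate(l):
        scores[item] = scores.get(item, 0) + (n - rank)

aggregate = sorted(scores, key=scores.get, reverse=True)
print(aggregate)  # ['a', 'b', 'c', 'd']  (a and b tie with a score of 5)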

CombSUM¶

Member of pyflagr.Linear.

Documentation: Linear Module.

The CombSUM constructor supports the following parameters:

  • eval_pts (Integer, optional; default: 10): Determines the elements in the aggregate list on which the evaluation measures (i.e., Precision and nDCG) are computed. For example, for eval_pts=10, FLAGR computes $P@1, P@2, ..., P@10$ and $N@1, N@2, ..., N@10$. Considered only if rels_file or rels_df is set.
  • norm (String, optional; default: borda): The rank or score normalization method:
    • borda: The aggregation is performed by normalizing the element rankings according to the Borda normalization method. Equivalent to the BordaCount class.
    • rank: The aggregation is performed by normalizing the element rankings according to the Rank normalization method.
    • score: The aggregation is performed by normalizing the element scores according to the Score normalization method.
    • z-score: The aggregation is performed by normalizing the element scores according to the Z-Score normalization method.
    • simple-borda: Similar to borda normalization, but no partial score is assigned to an element that is not ranked by a voter.
In [4]:
csum = Linear.CombSUM(norm='rank', eval_pts=5)

# In this case, rels_file has been specified, so PyFLAGR returns two non-blank dataframes:
# * df_out contains the aggregate list produced by the selected algorithm
# * df_eval contains the effectiveness evaluation based on the relevance judgments in the rels_file
df_out, df_eval = csum.aggregate(input_file=lists, rels_file=qrels)

display_side_by_side(df_out.head(20), df_eval, titles=['Aggregate list', 'Evaluation'])

Aggregate list

Query Voter ItemID Rank Score Aggregator
0 1 PyFLAGR Q1-E39 1 13.466667 101
1 1 PyFLAGR Q1-E48 2 12.933333 101
2 1 PyFLAGR Q1-E23 3 11.933333 101
3 1 PyFLAGR Q1-E85 4 11.400000 101
4 1 PyFLAGR Q1-E95 5 11.366667 101
5 1 PyFLAGR Q1-E94 6 11.300000 101
6 1 PyFLAGR Q1-E33 7 11.000000 101
7 1 PyFLAGR Q1-E63 8 10.700000 101
8 1 PyFLAGR Q1-E100 9 10.666667 101
9 1 PyFLAGR Q1-E5 10 10.433333 101
10 1 PyFLAGR Q1-E43 11 10.300000 101
11 1 PyFLAGR Q1-E32 12 10.000000 101
12 1 PyFLAGR Q1-E35 13 9.933333 101
13 1 PyFLAGR Q1-E88 14 9.933333 101
14 1 PyFLAGR Q1-E16 15 9.666667 101
15 1 PyFLAGR Q1-E56 16 9.566667 101
16 1 PyFLAGR Q1-E84 17 9.533333 101
17 1 PyFLAGR Q1-E3 18 9.466667 101
18 1 PyFLAGR Q1-E89 19 9.400000 101
19 1 PyFLAGR Q1-E62 20 9.400000 101

Evaluation

q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 recall_1 recall_2 recall_3 recall_4 recall_5 dcg_cut_1 dcg_cut_2 dcg_cut_3 dcg_cut_4 dcg_cut_5 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5
0 Topic 1 100 48 48 0.534255 1.00 1.000 0.666667 0.50 0.40 0.020833 0.041667 0.041667 0.041667 0.041667 1.00 1.630930 1.630930 1.630930 1.630930 1.00 1.000000 0.765361 0.636682 0.553146
1 Topic 2 100 46 46 0.539673 0.00 0.500 0.666667 0.50 0.40 0.000000 0.021739 0.043478 0.043478 0.043478 0.00 0.630930 1.130930 1.130930 1.130930 0.00 0.386853 0.530721 0.441492 0.383566
2 Topic 3 100 40 40 0.442181 1.00 1.000 0.666667 0.50 0.40 0.025000 0.050000 0.050000 0.050000 0.050000 1.00 1.630930 1.630930 1.630930 1.630930 1.00 1.000000 0.765361 0.636682 0.553146
3 Topic 4 100 40 40 0.427771 0.00 0.000 0.000000 0.25 0.20 0.000000 0.000000 0.000000 0.025000 0.025000 0.00 0.000000 0.000000 0.430677 0.430677 0.00 0.000000 0.000000 0.168128 0.146068
4 Topic 5 100 54 54 0.656236 1.00 1.000 1.000000 1.00 1.00 0.018519 0.037037 0.055556 0.074074 0.092593 1.00 1.630930 2.130930 2.561606 2.948459 1.00 1.000000 1.000000 1.000000 1.000000
5 Topic 6 100 50 50 0.496233 0.00 0.500 0.333333 0.25 0.40 0.000000 0.020000 0.020000 0.020000 0.040000 0.00 0.630930 0.630930 0.630930 1.017783 0.00 0.386853 0.296082 0.246302 0.345191
6 Topic 7 100 48 48 0.413981 0.00 0.000 0.000000 0.00 0.20 0.000000 0.000000 0.000000 0.000000 0.020833 0.00 0.000000 0.000000 0.000000 0.386853 0.00 0.000000 0.000000 0.000000 0.131205
7 Topic 8 100 48 48 0.487902 1.00 0.500 0.666667 0.75 0.60 0.020833 0.020833 0.041667 0.062500 0.062500 1.00 1.000000 1.500000 1.930677 1.930677 1.00 0.613147 0.703918 0.753698 0.654809
8 Topic 9 100 48 48 0.415008 0.00 0.500 0.333333 0.50 0.40 0.000000 0.020833 0.020833 0.041667 0.041667 0.00 0.630930 0.630930 1.061606 1.061606 0.00 0.386853 0.296082 0.414430 0.360055
9 Topic 10 100 50 50 0.480995 0.00 0.500 0.333333 0.50 0.40 0.000000 0.020000 0.020000 0.040000 0.040000 0.00 0.630930 0.630930 1.061606 1.061606 0.00 0.386853 0.296082 0.414430 0.360055
10 Topic 11 100 51 51 0.536177 1.00 0.500 0.333333 0.50 0.60 0.019608 0.019608 0.019608 0.039216 0.058824 1.00 1.000000 1.000000 1.430677 1.817529 1.00 0.613147 0.469279 0.558508 0.616434
11 Topic 12 100 38 38 0.406823 1.00 0.500 0.333333 0.25 0.20 0.026316 0.026316 0.026316 0.026316 0.026316 1.00 1.000000 1.000000 1.000000 1.000000 1.00 0.613147 0.469279 0.390380 0.339160
12 Topic 13 100 43 43 0.487894 1.00 1.000 0.666667 0.50 0.40 0.023256 0.046512 0.046512 0.046512 0.046512 1.00 1.630930 1.630930 1.630930 1.630930 1.00 1.000000 0.765361 0.636682 0.553146
13 Topic 14 100 45 45 0.470907 0.00 0.000 0.000000 0.25 0.20 0.000000 0.000000 0.000000 0.022222 0.022222 0.00 0.000000 0.000000 0.430677 0.430677 0.00 0.000000 0.000000 0.168128 0.146068
14 Topic 15 100 48 48 0.420168 0.00 0.000 0.000000 0.00 0.20 0.000000 0.000000 0.000000 0.000000 0.020833 0.00 0.000000 0.000000 0.000000 0.386853 0.00 0.000000 0.000000 0.000000 0.131205
15 Topic 16 100 52 52 0.506824 0.00 0.000 0.333333 0.50 0.40 0.000000 0.000000 0.019231 0.038462 0.038462 0.00 0.000000 0.500000 0.930677 0.930677 0.00 0.000000 0.234639 0.363318 0.315648
16 Topic 17 100 48 48 0.557906 0.00 0.000 0.333333 0.25 0.40 0.000000 0.000000 0.020833 0.020833 0.041667 0.00 0.000000 0.500000 0.500000 0.886853 0.00 0.000000 0.234639 0.195190 0.300785
17 Topic 18 100 39 39 0.509445 1.00 1.000 0.666667 0.75 0.80 0.025641 0.051282 0.051282 0.076923 0.102564 1.00 1.630930 1.630930 2.061606 2.448459 1.00 1.000000 0.765361 0.804810 0.830420
18 Topic 19 100 45 45 0.426503 0.00 0.000 0.333333 0.50 0.40 0.000000 0.000000 0.022222 0.044444 0.044444 0.00 0.000000 0.500000 0.930677 0.930677 0.00 0.000000 0.234639 0.363318 0.315648
19 Topic 20 100 45 45 0.492781 1.00 1.000 0.666667 0.75 0.80 0.022222 0.044444 0.044444 0.066667 0.088889 1.00 1.630930 1.630930 2.061606 2.448459 1.00 1.000000 0.765361 0.804810 0.830420
20 all 2000 926 926 0.485483 0.45 0.475 0.416667 0.45 0.44 0.010111 0.021014 0.027182 0.038999 0.047423 0.45 0.765465 0.915465 1.152337 1.307078 0.45 0.469343 0.429608 0.449849 0.443309
In [5]:
csum = Linear.CombSUM(norm='score')

# In this case, rels_file has NOT been specified, so PyFLAGR returns two dataframes,
# * df_out contains the aggregate list produced by the selected algorithm
# * df_eval is blank
df_out, df_eval = csum.aggregate(input_file=lists)

display_side_by_side(df_out.head(20), df_eval, titles=['Aggregate list','Evaluation'])

Aggregate list

Query Voter ItemID Rank Score Aggregator
0 1 PyFLAGR Q1-E39 1 13.206937 102
1 1 PyFLAGR Q1-E48 2 12.517291 102
2 1 PyFLAGR Q1-E23 3 11.551781 102
3 1 PyFLAGR Q1-E95 4 11.137948 102
4 1 PyFLAGR Q1-E85 5 11.103479 102
5 1 PyFLAGR Q1-E94 6 10.965562 102
6 1 PyFLAGR Q1-E33 7 10.689677 102
7 1 PyFLAGR Q1-E63 8 10.379323 102
8 1 PyFLAGR Q1-E100 9 10.344844 102
9 1 PyFLAGR Q1-E5 10 10.241416 102
10 1 PyFLAGR Q1-E43 11 10.103458 102
11 1 PyFLAGR Q1-E88 12 9.689687 102
12 1 PyFLAGR Q1-E35 13 9.689687 102
13 1 PyFLAGR Q1-E32 14 9.620729 102
14 1 PyFLAGR Q1-E16 15 9.413802 102
15 1 PyFLAGR Q1-E56 16 9.379333 102
16 1 PyFLAGR Q1-E84 17 9.241396 102
17 1 PyFLAGR Q1-E3 18 9.206927 102
18 1 PyFLAGR Q1-E89 19 9.206917 102
19 1 PyFLAGR Q1-E62 20 9.103500 102

Evaluation

BordaCount and SimpleBordaCount¶

Member of pyflagr.Linear.

Documentation: Linear Module.

BordaCount is equivalent to CombSUM with borda normalization; likewise, SimpleBordaCount is equivalent to CombSUM with simple-borda normalization. Both constructors support the following parameter:

  • eval_pts (Integer, optional; default: 10): Determines the elements in the aggregate list on which the evaluation measures (i.e., Precision and nDCG) are computed. For example, for eval_pts=10, FLAGR computes $P@1, P@2, ..., P@10$ and $N@1, N@2, ..., N@10$. Considered only if rels_file or rels_df is set.
In [6]:
borda = Linear.BordaCount(eval_pts=7)

df_out, df_eval = borda.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)
Out[6]:
q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 ... dcg_cut_5 dcg_cut_6 dcg_cut_7 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5 ndcg_cut_6 ndcg_cut_7
20 all 2000 926 926 0.48364 0.5 0.475 0.416667 0.425 0.42 ... 1.282464 1.496188 1.629522 0.5 0.480657 0.438268 0.44024 0.434961 0.45275 0.447917

1 rows × 33 columns

In [7]:
sborda = Linear.SimpleBordaCount(eval_pts=7)

df_out, df_eval = sborda.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)
Out[7]:
q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 ... dcg_cut_5 dcg_cut_6 dcg_cut_7 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5 ndcg_cut_6 ndcg_cut_7
20 all 2000 926 926 0.485026 0.55 0.5 0.466667 0.4375 0.44 ... 1.358739 1.608084 1.724751 0.55 0.511315 0.485196 0.462466 0.46083 0.48661 0.474093

1 rows × 33 columns

In [8]:
# Equivalent code for BordaCount: this cell produces the same results as cell [6] above
csum = Linear.CombSUM(norm='borda', eval_pts=7)

df_out, df_eval = csum.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)
Out[8]:
q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 ... dcg_cut_5 dcg_cut_6 dcg_cut_7 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5 ndcg_cut_6 ndcg_cut_7
20 all 2000 926 926 0.48364 0.5 0.475 0.416667 0.425 0.42 ... 1.282464 1.496188 1.629522 0.5 0.480657 0.438268 0.44024 0.434961 0.45275 0.447917

1 rows × 33 columns

CombMNZ¶

Member of pyflagr.Linear.

Documentation: Linear Module.

The CombMNZ constructor supports the following parameters:

  • eval_pts (Integer, optional; default: 10): Determines the elements in the aggregate list on which the evaluation measures (i.e., Precision and nDCG) are computed. For example, for eval_pts=10, FLAGR computes $P@1, P@2, ..., P@10$ and $N@1, N@2, ..., N@10$. Considered only if rels_file or rels_df is set.
  • norm (String, optional; default: borda): The rank or score normalization method:
    • borda: The aggregation is performed by normalizing the element rankings according to the Borda normalization method.
    • rank: The aggregation is performed by normalizing the element rankings according to the Rank normalization method.
    • score: The aggregation is performed by normalizing the element scores according to the Score normalization method.
    • z-score: The aggregation is performed by normalizing the element scores according to the Z-Score normalization method.
    • simple-borda: Similar to borda normalization, but no partial score is assigned to an element that is not ranked by a voter.
In [9]:
cmnz = Linear.CombMNZ(norm='rank', eval_pts=7)

df_out, df_eval = cmnz.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)
Out[9]:
q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 ... dcg_cut_5 dcg_cut_6 dcg_cut_7 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5 ndcg_cut_6 ndcg_cut_7
20 all 2000 926 926 0.485262 0.5 0.5 0.416667 0.4375 0.41 ... 1.271859 1.503394 1.653394 0.5 0.5 0.44134 0.451202 0.431364 0.454931 0.454479

1 rows × 33 columns

Majoritarian methods¶

These methods are based on the majority criterion which, under several different formulations, identifies the "winning" elements. The module includes the three classes described below: CondorcetWinners, CopelandWinners and OutrankingApproach. Each class implements a different variant of the majority criterion.

CondorcetWinners¶

Member of pyflagr.Majoritarian.

In the Condorcet Winners method, the score of an element $r_i$ is determined by the number of its "victories" against all the other involved elements. A victory for $r_i$ is achieved if the majority of the voters rank $r_i$ higher than another element $r_j$.

Documentation: Majoritarian Module.

The CondorcetWinners constructor supports the following parameters:

  • eval_pts (Integer, optional; default: 10): Determines the elements in the aggregate list on which the evaluation measures (i.e., Precision and nDCG) are computed. For example, for eval_pts=10, FLAGR computes $P@1, P@2, ..., P@10$ and $N@1, N@2, ..., N@10$. Considered only if rels_file or rels_df is set.
In [10]:
condorcet = Majoritarian.CondorcetWinners(eval_pts=7)

df_out, df_eval = condorcet.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)
Out[10]:
q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 ... dcg_cut_5 dcg_cut_6 dcg_cut_7 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5 ndcg_cut_6 ndcg_cut_7
20 all 2000 926 916 0.480078 0.5 0.475 0.433333 0.4125 0.43 ... 1.303082 1.534616 1.701283 0.5 0.480657 0.45 0.433187 0.441953 0.464379 0.467642

1 rows × 33 columns

CopelandWinners¶

Member of pyflagr.Majoritarian.

In Copeland Winners, the score of an element $r_i$ is determined by the number of its "victories" against all the other involved elements. A victory for $r_i$ is achieved if the majority of the voters rank $r_i$ higher than another element $r_j$. In contrast to the Condorcet method, Copeland Winners assigns "half" a victory (i.e., a score of 0.5) to each element of a pair $(r_i,r_j)$ in the case of a tie. A tie occurs when exactly half of the voters rank $r_i$ higher than $r_j$ and the other half rank $r_j$ higher than $r_i$.
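
A rough from-scratch sketch of this pairwise logic is shown below (an illustration, not the FLAGR implementation); dropping the tie branch yields the plain Condorcet scoring:

from itertools import combinations

# Pairwise-majority scoring in the spirit of Condorcet/Copeland winners.
# Assumes every item appears in every input list.
rankings = [['a', 'b', 'c'], ['b', 'a', 'c'], ['a', 'c', 'b']]
items = sorted({x for r in rankings for x in r})

score = {x: 0.0 for x in items}
for x, y in combinations(items, 2):
    wins_x = sum(r.index(x) < r.index(y) for r in rankings)
    wins_y = len(rankings) - wins_x
    if wins_x > wins_y:
        score[x] += 1
    elif wins_y > wins_x:
        score[y] += 1
    else:                 # exact tie: Copeland assigns half a victory to each
        score[x] += 0.5
        score[y] += 0.5

print(sorted(score, key=score.get, reverse=True))  # ['a', 'b', 'c']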

Documentation: Majoritarian Module.

The CopelandWinners constructor supports the following parameters:

  • eval_pts (Integer, optional; default: 10): Determines the elements in the aggregate list on which the evaluation measures (i.e., Precision and nDCG) are computed. For example, for eval_pts=10, FLAGR computes $P@1, P@2, ..., P@10$ and $N@1, N@2, ..., N@10$. Considered only if rels_file or rels_df is set.
In [11]:
copeland = Majoritarian.CopelandWinners(eval_pts=7)

df_out, df_eval = copeland.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)
Out[11]:
q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 ... dcg_cut_5 dcg_cut_6 dcg_cut_7 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5 ndcg_cut_6 ndcg_cut_7
20 all 2000 926 922 0.481959 0.5 0.45 0.416667 0.4 0.41 ... 1.252192 1.483727 1.66706 0.5 0.461315 0.435196 0.420872 0.424694 0.448979 0.458235

1 rows × 33 columns

OutrankingApproach¶

Member of pyflagr.Majoritarian.

The Outranking Approach of Farah and Vanderpooten is a majoritarian method that identifies the "winning" elements by performing pairwise comparisons of their individual rankings. The method is implemented in accordance with the following paper:

  • Farah, M., Vanderpooten, D., "An outranking approach for rank aggregation in information retrieval", In Proceedings of the 30th ACM Conference on Research and Development in Information Retrieval, pp. 591-598, 2007.

The algorithm is based on four threshold values that introduce different perspectives of the majority criterion: the concordance, discordance, preference, and veto thresholds. The user may pass all of them to FLAGR as hyper-parameters, through the arguments of the OutrankingApproach constructor (see below).

Documentation: Majoritarian Module.

The OutrankingApproach constructor supports the following parameters:

  • eval_pts (Integer, optional; default: 10): Determines the elements in the aggregate list on which the evaluation measures (i.e., Precision and nDCG) are computed. For example, for eval_pts=10, FLAGR computes $P@1, P@2, ..., P@10$ and $N@1, N@2, ..., N@10$. Considered only if rels_file or rels_df is set.
  • preference (Hyperparameter, Float, optional; default: 0): Preference threshold.
  • veto (Hyperparameter, Float, optional; default: 0.75): Veto threshold.
  • concordance (Hyperparameter, Float, optional; default: 0): Concordance threshold.
  • discordance (Hyperparameter, Float, optional; default: 0.25): Discordance threshold.
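
For instance, all four thresholds can be passed explicitly; the values below are arbitrary and serve only as an illustration:

outrank = Majoritarian.OutrankingApproach(
    preference=0.05,    # preference threshold
    veto=0.70,          # veto threshold
    concordance=0.10,   # concordance threshold
    discordance=0.30,   # discordance threshold
    eval_pts=7)

df_out, df_eval = outrank.aggregate(input_file=lists, rels_file=qrels)
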
In [12]:
outrank = Majoritarian.OutrankingApproach(eval_pts=7)

df_out, df_eval = outrank.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)
Out[12]:
q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 ... dcg_cut_5 dcg_cut_6 dcg_cut_7 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5 ndcg_cut_6 ndcg_cut_7
20 all 2000 926 926 0.481719 0.5 0.425 0.433333 0.4125 0.44 ... 1.309331 1.469624 1.636291 0.5 0.441972 0.443856 0.428076 0.444073 0.444712 0.449778

1 rows × 33 columns

Markov Chain methods¶

The Markov Chain methods constitute a well-established family of rank aggregation methods. They treat the list elements as the states of a system that moves from one state to another according to a particular criterion. Dwork et al. (2001) introduced four such methods in the following article:

  • C. Dwork, R. Kumar, M. Naor, D. Sivakumar, "Rank Aggregation Methods for the Web", In Proceedings of the 10th International Conference on World Wide Web, pp. 613-622, 2001.

In addition, DeConde et al. (2006) introduced MCT, a variant that constructs the transition matrix by considering the proportion of lists that prefer one element over another.

  • R.P. DeConde, S. Hawley, S. Falcon, N. Clegg, B. Knudsen, R. Etzioni, "Combining results of microarray experiments: a rank aggregation approach", Statistical Applications in Genetics and Molecular Biology, vol. 5, no. 1, 2006.
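
To give a feel for the general recipe, the following toy sketch (written from scratch, not taken from FLAGR) builds an MC4-style transition matrix from pairwise majorities, mixes in a uniform component for ergodicity (in the spirit of the ergodic_number hyper-parameter below), and power-iterates to the stationary distribution, which induces the final ranking:

import numpy as np

# MC4-style toy chain: from state i, pick a candidate state j uniformly and
# move there only if a majority of the voters rank j above i; otherwise stay.
rankings = [['a', 'b', 'c'], ['b', 'a', 'c'], ['a', 'c', 'b']]
items = sorted({x for r in rankings for x in r})
n = len(items)

P = np.zeros((n, n))
for i, x in enumerate(items):
    for j, y in enumerate(items):
        if i != j and sum(r.index(y) < r.index(x) for r in rankings) > len(rankings) / 2:
            P[i, j] = 1.0 / n
    P[i, i] = 1.0 - P[i].sum()   # stay put with the remaining probability mass

alpha = 0.15                     # uniform mixing, analogous to ergodic_number
P = (1 - alpha) * P + alpha / n

pi = np.full(n, 1.0 / n)         # power iteration to the stationary distribution
for _ in range(200):             # cf. max_iterations
    pi = pi @ P

print(sorted(items, key=lambda x: pi[items.index(x)], reverse=True))  # ['a', 'b', 'c']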

MC1, MC2, MC3, MC4, and MCT¶

Members of pyflagr.MarkovChains.

Documentation: MarkovChains Module.

The class constructors support the following parameters:

  • eval_pts (Integer, optional; default: 10): Determines the elements in the aggregate list on which the evaluation measures (i.e., Precision and nDCG) are computed. For example, for eval_pts=10, FLAGR computes $P@1, P@2, ..., P@10$ and $N@1, N@2, ..., N@10$. Considered only if rels_file or rels_df is set.
  • ergodic_number (Hyperparameter, Float, optional; default: 0.15): The ergodic number.
  • max_iterations (Hyperparameter, Integer, optional; default: 200): The maximum number of iterations.
In [13]:
mch = MarkovChains.MC1(eval_pts=7, max_iterations=50)

df_out, df_eval = mch.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)
Out[13]:
q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 ... dcg_cut_5 dcg_cut_6 dcg_cut_7 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5 ndcg_cut_6 ndcg_cut_7
20 all 2000 926 926 0.487571 0.65 0.575 0.5 0.475 0.46 ... 1.467477 1.64558 1.79558 0.65 0.591972 0.535196 0.512466 0.49771 0.497957 0.493563

1 rows × 33 columns

In [14]:
mch = MarkovChains.MC2(eval_pts=7, max_iterations=50)

df_out, df_eval = mch.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)
Out[14]:
q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 ... dcg_cut_5 dcg_cut_6 dcg_cut_7 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5 ndcg_cut_6 ndcg_cut_7
20 all 2000 926 926 0.487571 0.65 0.575 0.5 0.475 0.46 ... 1.467477 1.64558 1.79558 0.65 0.591972 0.535196 0.512466 0.49771 0.497957 0.493563

1 rows × 33 columns

In [15]:
mch = MarkovChains.MC3(eval_pts=7, max_iterations=50)

df_out, df_eval = mch.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)
Out[15]:
q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 ... dcg_cut_5 dcg_cut_6 dcg_cut_7 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5 ndcg_cut_6 ndcg_cut_7
20 all 2000 926 926 0.505721 0.5 0.475 0.566667 0.525 0.52 ... 1.524615 1.73834 1.905006 0.5 0.480657 0.543856 0.51967 0.517089 0.526026 0.523641

1 rows × 33 columns

In [16]:
mch = MarkovChains.MC4(eval_pts=7, max_iterations=50)

df_out, df_eval = mch.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)
Out[16]:
q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 ... dcg_cut_5 dcg_cut_6 dcg_cut_7 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5 ndcg_cut_6 ndcg_cut_7
20 all 2000 926 926 0.481005 0.5 0.5 0.416667 0.4125 0.42 ... 1.286819 1.447112 1.630446 0.5 0.5 0.44134 0.43439 0.436438 0.4379 0.448171

1 rows × 33 columns

In [17]:
mch = MarkovChains.MCT(eval_pts=7, max_iterations=50)

df_out, df_eval = mch.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)
Out[17]:
q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 ... dcg_cut_5 dcg_cut_6 dcg_cut_7 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5 ndcg_cut_6 ndcg_cut_7
20 all 2000 926 926 0.481263 0.45 0.5 0.466667 0.5 0.48 ... 1.410158 1.570452 1.770452 0.45 0.488685 0.467876 0.49009 0.47827 0.475222 0.486655

1 rows × 33 columns

RRA: Robust Rank Aggregation¶

Robust Rank Aggregation (RRA) is mostly used in bioinformatics applications to aggregate gene lists. It is based on a probabilistic model (the beta distribution) that renders the algorithm parameter-free and robust to outliers, noise and errors. The method is implemented in accordance with the following paper:

  • R. Kolde, S. Laur, P. Adler, J. Vilo, "Robust rank aggregation for gene list integration and meta-analysis", Bioinformatics, vol. 28, no. 4, pp. 573-580, 2012.
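
The core probabilistic idea can be sketched in a few lines (an illustration under the stated assumptions, not FLAGR's implementation): under the null hypothesis, an item's normalized ranks across $m$ input lists behave like $m$ uniform samples, so the $k$-th smallest of them follows a $Beta(k, m-k+1)$ distribution; the item is then scored by the minimum of these order-statistic p-values.

from scipy.stats import beta

# Toy RRA-style score: the smaller the score, the more consistently the item
# is ranked near the top across the input lists.
def rra_score(norm_ranks):
    m = len(norm_ranks)
    r = sorted(norm_ranks)   # order statistics of the normalized ranks
    # P(k-th order statistic of m uniforms <= r[k]) = Beta(k+1, m-k).cdf(r[k])
    return min(beta.cdf(r[k], k + 1, m - k) for k in range(m))

# An item ranked near the top in 3 of 4 lists obtains a small (strong) score.
print(rra_score([0.05, 0.10, 0.15, 0.90]))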

Member of pyflagr.RRA.

Documentation: RRA Module.

The RRA constructor supports the following parameters:

  • eval_pts (Integer, optional; default: 10): Determines the elements in the aggregate list on which the evaluation measures (i.e., Precision and nDCG) are computed. For example, for eval_pts=10, FLAGR computes $P@1, P@2, ..., P@10$ and $N@1, N@2, ..., N@10$. Considered only if rels_file or rels_df is set.
  • exact (Hyperparameter, Boolean, optional; default: False): Determines whether the exact p-value correction algorithm of Stuart will be applied.
In [18]:
robust = RRA.RRA(eval_pts=7, exact=True)

df_out, df_eval = robust.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)
Out[18]:
q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 ... dcg_cut_5 dcg_cut_6 dcg_cut_7 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5 ndcg_cut_6 ndcg_cut_7
20 all 2000 926 904 0.437941 0.05 0.15 0.183333 0.2625 0.28 ... 1.530184 1.690477 1.857144 0.55 0.588685 0.544412 0.536945 0.518977 0.511542 0.510485

1 rows × 33 columns

Weighted methods¶

The weighted rank aggregation methods employ exploratory analysis techniques to automatically identify the expert voters in an unsupervised fashion. They then assign higher weights to the voters identified as experts, thus boosting the scores of the elements those voters submitted.

FLAGR implements three such methods: Preference Relations Graph, Agglomerative and DIBRA.

PreferenceRelationsGraph¶

This method constructs a preference relations graph whose vertices are the individual elements and whose weighted edges encode the pairwise preference relations between them. It has been implemented in accordance with the following paper:

  • M.S. Desarkar, S. Sarkar, P. Mitra, "Preference relations based unsupervised rank aggregation for metasearch", Expert Systems with Applications, vol. 49, pp. 86-98, 2016.

Member of pyflagr.Weighted.

Documentation: Weighted Module.

The PreferenceRelationsGraph constructor supports the following parameters:

  • eval_pts (Integer, optional; default: 10): Determines the elements in the aggregate list on which the evaluation measures (i.e., Precision and nDCG) are computed. For example, for eval_pts=10, FLAGR computes $P@1, P@2, ..., P@10$ and $N@1, N@2, ..., N@10$. Considered only if rels_file or rels_df is set.
  • alpha (Hyperparameter, Float, optional; default: 0.1): The $\alpha$ hyper-parameter of the algorithm.
  • beta (Hyperparameter, Float, optional; default: 0.5): The $\beta$ hyper-parameter of the algorithm.
In [19]:
prf_graph = Weighted.PreferenceRelationsGraph(alpha=0.1, beta=0.5, eval_pts=7)

df_out, df_eval = prf_graph.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)
Out[19]:
q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 ... dcg_cut_5 dcg_cut_6 dcg_cut_7 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5 ndcg_cut_6 ndcg_cut_7
20 all 2000 926 926 0.485022 0.55 0.5 0.466667 0.4375 0.44 ... 1.358739 1.608084 1.724751 0.55 0.511315 0.485196 0.462466 0.46083 0.48661 0.474093

1 rows × 33 columns

Agglomerative¶

This method works much like the well-established agglomerative clustering algorithm. Specifically, it repeatedly merges the two most similar input lists into a temporary aggregate list. During list merging, it modifies the weights of the respective voters, thus affecting future merges. It has been implemented in accordance with the following paper:

  • S. Chatterjee, A. Mukhopadhyay, M. Bhattacharyya, "A weighted rank aggregation approach towards crowd opinion analysis", Knowledge-Based Systems, vol. 149, pp. 47-60, 2018.

Member of pyflagr.Weighted.

Documentation: Weighted Module.

The Agglomerative constructor supports the following parameters:

  • eval_pts (Integer, optional; default: 10): Determines the elements in the aggregate list on which the evaluation measures (i.e., Precision and nDCG) are computed. For example, for eval_pts=10, FLAGR computes $P@1, P@2, ..., P@10$ and $N@1, N@2, ..., N@10$. Considered only if rels_file or rels_df is set.
  • c1 (Hyperparameter, Float, optional; default: 2.5): The $c_1$ hyper-parameter of the algorithm.
  • c2 (Hyperparameter, Float, optional; default: 1.5): The $c_2$ hyper-parameter of the algorithm.
In [20]:
agg = Weighted.Agglomerative(c1=0.1, c2=0.2, eval_pts=7)

df_out, df_eval = agg.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)
Out[20]:
q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 ... dcg_cut_5 dcg_cut_6 dcg_cut_7 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5 ndcg_cut_6 ndcg_cut_7
20 all 2000 926 456 0.256305 0.4 0.4 0.4 0.425 0.47 ... 1.319165 1.461647 1.694981 0.4 0.4 0.4 0.416813 0.447408 0.442298 0.46591

1 rows × 33 columns

DIBRA: Iterative Distance-Based Aggregation¶

DIBRA first employs a standard non-weighted method to generate an initial aggregate ranking (see the aggregator parameter in the following table for the list of supported non-weighted methods). Subsequently, it iteratively assigns higher weights to the input lists that have smaller distances from the current aggregate list. The process terminates when the voter weights converge and a stable aggregate list is obtained.

The algorithm also includes an optional list pruning mechanism that adjusts the input list lengths according to the respective voter weights. DIBRA has been implemented in accordance with the following paper:

  • L. Akritidis, A. Fevgas, P. Bozanis, Y. Manolopoulos, "An Unsupervised Distance-Based Model for Weighted Rank Aggregation with List Pruning", Expert Systems with Applications, vol. 202, article 117435, 2022.
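
The iterative weighting loop can be illustrated with a small from-scratch sketch; the Borda encoding, cosine similarity and additive update below are simplifying assumptions for illustration, not DIBRA's exact update rule:

import numpy as np

# Toy DIBRA-style loop: lists are encoded as Borda score vectors; voters whose
# lists stay close to the current weighted aggregate earn larger weights.
rankings = [['a', 'b', 'c', 'd'], ['a', 'c', 'b', 'd'], ['d', 'c', 'b', 'a']]
items = sorted({x for r in rankings for x in r})
V = np.array([[len(r) - r.index(x) for x in items] for r in rankings], dtype=float)

w = np.ones(len(rankings))
for _ in range(50):                            # cf. max_iter
    agg = w @ V                                # weighted CombSUM-style aggregate
    sims = V @ agg / (np.linalg.norm(V, axis=1) * np.linalg.norm(agg))
    new_w = 1.0 + sims                         # closer lists earn larger weights
    if np.max(np.abs(new_w - w)) < 0.01:       # cf. tol: weights have converged
        w = new_w
        break
    w = new_w

print(np.round(w, 3))  # the dissenting third voter ends up with the smallest weight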

Member of pyflagr.Weighted.

Documentation: Weighted Module.

The DIBRA constructor supports the following parameters:

  • eval_pts (Integer, optional; default: 10): Determines the elements in the aggregate list on which the evaluation measures (i.e., Precision and nDCG) are computed. For example, for eval_pts=10, FLAGR computes $P@1, P@2, ..., P@10$ and $N@1, N@2, ..., N@10$. Considered only if rels_file or rels_df is set.
  • aggregator (Hyperparameter, String, optional; default: combsum:borda): The baseline aggregation method. An extended weighted variant of the baseline method is applied internally by plugging in the computed voter weights. The list of supported values includes:
    • combsum:borda: CombSUM with Borda rank normalization.
    • combsum:rank: CombSUM with rankings normalization.
    • combsum:score: CombSUM with score min-max normalization.
    • combsum:z-score: CombSUM with score z-normalization.
    • combsum:simple-borda: CombSUM with simple Borda normalization.
    • combmnz:borda: CombMNZ with Borda rank normalization.
    • combmnz:rank: CombMNZ with rankings normalization.
    • combmnz:score: CombMNZ with score min-max normalization.
    • combmnz:z-score: CombMNZ with score z-normalization.
    • combmnz:simple-borda: CombMNZ with simple Borda normalization.
    • condorcet: The Condorcet Winners method.
    • copeland: The Copeland Winners method.
    • outrank: The Outranking Approach.
  • w_norm (Hyperparameter, String, optional; default: minmax): The voter weight normalization method. The list of supported values includes:
    • none: The voter weights will not be normalized.
    • minmax: The voter weights will be normalized with min-max scaling.
    • z: The voter weights will be z-normalized.
  • dist (Hyperparameter, String, optional; default: cosine): The metric used to measure the distance between an input list and the temporary aggregate list. The list of supported values includes:
    • rho: Spearman's $\rho$ correlation coefficient.
    • cosine: The cosine similarity of the lists' vector representations.
    • tau: Kendall's $\tau$ correlation coefficient.
    • footrule: A scaled variant of Spearman's Footrule distance.
  • gamma (Hyperparameter, Float, optional; default: 1.50): The $\gamma$ hyper-parameter that determines the step length of the weight learning process.
  • prune (Hyperparameter, String, optional; default: None): Triggers a weight-dependent list pruning mechanism. The list of supported values includes:
    • None: No pruning takes place (i.e., all list elements are preserved).
    • low: The list pruning method of Akritidis et al., 2022.
    • wire: The item selection method of Akritidis and Bozanis, 2025.
  • num_buckets (Hyperparameter, Integer, optional; default: 3): The number of voter buckets, used in the list pruning method of Akritidis and Bozanis, 2025. Applies only if prune='wire'.
  • d1 (Hyperparameter, Float, optional; default: 0.4): The hyper-parameter $\delta_1$ of the weight-dependent list pruning mechanism. Applies only if prune is not None.
  • d2 (Hyperparameter, Float, optional; default: 0.1): The hyper-parameter $\delta_2$ of the weight-dependent list pruning mechanism. Applies only if prune is not None.
  • tol (Hyperparameter, Float, optional; default: 0.01): Controls the convergence precision. DIBRA stops when the difference between a voter's weight in two successive iterations falls below this threshold.
  • max_iter (Hyperparameter, Integer, optional; default: 50): Controls the maximum number of iterations. FLAGR stops the execution of DIBRA when this number of iterations has been performed, even if the voter weights have not fully converged.
  • pref (Hyperparameter, Float, optional; default: 0): Preference threshold. Applies only if aggregator=outrank.
  • veto (Hyperparameter, Float, optional; default: 0.75): Veto threshold. Applies only if aggregator=outrank.
  • conc (Hyperparameter, Float, optional; default: 0): Concordance threshold. Applies only if aggregator=outrank.
  • disc (Hyperparameter, Float, optional; default: 0.25): Discordance threshold. Applies only if aggregator=outrank.
In [21]:
method_1 = Weighted.DIBRA(aggregator='outrank', eval_pts=7)

df_out, df_eval = method_1.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)
Out[21]:
q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 ... dcg_cut_5 dcg_cut_6 dcg_cut_7 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5 ndcg_cut_6 ndcg_cut_7
20 all 2000 926 926 0.493753 0.55 0.525 0.533333 0.5 0.47 ... 1.448134 1.590617 1.77395 0.55 0.530657 0.535196 0.512466 0.491149 0.481324 0.487617

1 rows × 33 columns

In [22]:
method_2 = Weighted.DIBRA(eval_pts=7, gamma=1.5, prune='low', d1=0.3, d2=0.05)

df_out, df_eval = method_2.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)
Out[22]:
q num_ret num_rel num_rel_ret map P_1 P_2 P_3 P_4 P_5 ... dcg_cut_5 dcg_cut_6 dcg_cut_7 ndcg_cut_1 ndcg_cut_2 ndcg_cut_3 ndcg_cut_4 ndcg_cut_5 ndcg_cut_6 ndcg_cut_7
20 all 1986 926 922 0.502947 0.55 0.55 0.583333 0.5875 0.61 ... 1.751214 1.929318 2.045985 0.55 0.55 0.573464 0.577925 0.593942 0.583816 0.562393

1 rows × 33 columns
