Type: Package
Title: Compute the Median Ranking(s) According to the Kemeny's Axiomatic Approach
Version: 3.0
Date: 2026-02-24
Maintainer: Antonio D'Ambrosio <antdambr@unina.it>
Depends: R (≥ 4.1.0)
LinkingTo: Rcpp
Imports: rlist (≥ 0.4.2), methods, proxy, gtools, tidyr, Rcpp
Suggests: rgl, plotly
Description: Compute the median ranking according to the Kemeny's axiomatic approach. Rankings can or cannot contain ties, rankings can be both complete or incomplete. The package contains both branch-and-bound algorithms and heuristic solutions recently proposed. The searching space of the solution can either be restricted to the universe of the permutations or unrestricted to all possible ties. The package also provide some useful utilities for deal with preference rankings, including both element-weight Kemeny distance and correlation coefficient. This release includes also the median constrained bucket order algorithm. This release removes the functions previously declared as deprecated. These functions are now defunct and no longer available in the package. Essential references: Emond, E.J., and Mason, D.W. (2002) <doi:10.1002/mcda.313>; D'Ambrosio, A., Amodio, S., and Iorio, C. (2015) <doi:10.1285/i20705948v8n2p198>; Amodio, S., D'Ambrosio, A., and Siciliano R. (2016) <doi:10.1016/j.ejor.2015.08.048>; D'Ambrosio, A., Mazzeo, G., Iorio, C., and Siciliano, R. (2017) <doi:10.1016/j.cor.2017.01.017>; Albano, A., and Plaia, A. (2021) <doi:10.1285/i20705948v14n1p117>; D'Ambrosio, A., Iorio, C., Staiano, M. and Siciliano, R (2019) <doi:10.1007/s00180-018-0858-z>.
License: GPL-3
Encoding: UTF-8
URL: https://www.r-project.org/
Repository: CRAN
RoxygenNote: 7.3.3
NeedsCompilation: yes
Packaged: 2026-02-24 18:37:20 UTC; antda
Author: Antonio D'Ambrosio [aut, cre], Sonia Amodio [ctb], Giulio Mazzeo [ctb], Alessandro Albano [ctb], Antonella Plaia [ctb]
Date/Publication: 2026-02-25 06:31:40 UTC

Median Ranking Approach According to the Kemeny's Axiomatic Approach

Description

Compute the median ranking according to the Kemeny's axiomatic approach. Rankings can or cannot contain ties, rankings can be both complete or incomplete. The package contains both branch-and-bound and heuristic solutions as well as routines for computing the median constrained bucket order and the K-median cluster component analysis. The package also contains routines for visualize rankings and for detecting the universe of rankings including ties.

Details

Package: ConsRank
Type: Package
Version: 2.1.0
Date: 2017-04-28
License: GPL-3

Author(s)

Antonio D'Ambrosio [cre,aut] <antdambr@unina.it>, Sonia Amdio <sonia.amodio@unina.it> [ctb], Giulio Mazzeo [ctb] <giuliomazzeo@gmail.com>

Maintainer: Antonio D'Ambrosio <antdambr@unina.it>

References

Kemeny, J. G., & Snell, J. L. (1962). Mathematical models in the social sciences (Vol. 9). New York: Ginn.

Marden, J. I. (1996). Analyzing and modeling rank data. CRC Press.

Emond, E. J., & Mason, D. W. (2002). A new rank correlation coefficient with application to the consensus ranking problem. Journal of Multi-Criteria Decision Analysis, 11(1), 17-28.

D'Ambrosio, A. (2008). Tree based methods for data editing and preference rankings. Ph.D. thesis. http://www.fedoa.unina.it/id/eprint/2746

Heiser, W. J., & D'Ambrosio, A. (2013). Clustering and prediction of rankings within a Kemeny distance framework. In Algorithms from and for Nature and Life (pp. 19-31). Springer International Publishing.

Amodio, S., D'Ambrosio, A. & Siciliano, R (2016). Accurate algorithms for identifying the median ranking when dealing with weak and partial rankings under the Kemeny axiomatic approach. European Journal of Operational Research, vol. 249(2).

D'Ambrosio, A., Amodio, S. & Iorio, C. (2015). Two algorithms for finding optimal solutions of the Kemeny rank aggregation problem for full rankings. Electronic Journal of Applied Statistical Analysis, vol. 8(2).

D'Ambrosio, A., Mazzeo, G., Iorio, C., & Siciliano, R. (2017). A differential evolution algorithm for finding the median ranking under the Kemeny axiomatic approach. Computers & Operations Research, vol. 82.

D'Ambrosio, A., & Heiser, W.J. (2019). A Distribution-free Soft Clustering Method for Preference Rankings. Behaviormetrika , vol. 46(2), pp. 333–351.

D'Ambrosio, A., Iorio, C., Staiano, M., & Siciliano, R. (2019). Median constrained bucket order rank aggregation. Computational Statitstics, vol. 34(2), pp. 787–802,

Examples

## load APA data set, full version
data(APAFULL)
## Emond and Mason Branch-and-Bound algorithm.
#CR=consrank(APAFULL)
#use frequency tables
#TR=tabulaterows(APAFULL)
#quick algorithm
#CR2=consrank(TR$X,wk=TR$Wk,algorithm="quick")
#FAST algorithm
#CR3=consrank(TR$X,wk=TR$Wk,algorithm="fast",itermax=10)
#Decor algorithm
#CR4=consrank(TR$X,wk=TR$Wk,algorithm="decor",itermax=10)

#####################################
### load sports data set
#data(sports)
### FAST algorithm
#CR=consrank(sports,algorithm="fast",itermax=10)
#####################################

#######################################
### load Emond and Mason data set
#data(EMD)
### matrix X contains rankings
#X=EMD[,1:15]
### vector Wk contains frequencies
#Wk=EMD[,16]
### QUICK algorithm
#CR=consrank(X,wk=Wk,algorithm="quick")
#######################################


American Psychological Association dataset, full version

Description

The American Psychological Association dataset includes 15449 ballots of the election of the president in 1980, 5738 of which are complete rankings, in which the candidates are ranked from most to least favorite.

Usage

data(APAFULL)

Source

Diaconis, P. (1988). Group representations in probability and statistics. Lecture Notes-Monograph Series, i-192., pag. 96.


American Psychological Association dataset, reduced version with only full rankings

Description

The American Psychological Association reduced dataset includes 5738 ballots of the election of the president in 1980, in which the candidates are ranked from most to least favorite.

Usage

data(APAred)

Source

Diaconis, P. (1988). Group representations in probability and statistics. Lecture Notes-Monograph Series, i-192., pag. 96.


Brook and Upton data

Description

The data consist of ballots of three candidates, where the 948 voters rank the candidates from 1 to 3. Data are in form of frequency table.

Usage

data(BU)

Source

Brook, D., & Upton, G. J. G. (1974). Biases in local government elections due to position on the ballot paper. Applied Statistics, 414-419.

References

Marden, J. I. (1996). Analyzing and modeling rank data. CRC Press, pag. 153.

Examples

data(BU)
polyplot(BU[,1:3],Wk=BU[,4])


Defunct functions in ConsRank

Description

These functions have been removed from ConsRank. Use the alternatives listed below.

Usage

EMCons(...)

QuickCons(...)

BBFULL(...)

FASTcons(...)

DECOR(...)

FASTDECOR(...)

labels(...)

Details


Emond and Mason data

Description

Data simuated by Emond and Mason to check their branch-and-bound algorithm. There are 112 voters ranking 15 objects. There are 21 uncomplete rankings. Data are in form of frequency table.

Usage

data(EMD)

Source

Emond, E. J., & Mason, D. W. (2000). A new technique for high level decision support. Department of National Defence, Operational Research Division, pag. 28.

References

Emond, E. J., & Mason, D. W. (2000). A new technique for high level decision support. Department of National Defence, Operational Research Division, pag. 28.

Examples

data(EMD)
CR=consrank(EMD[,1:15],EMD[,16],algorithm="quick")


German political goals

Description

Ranking data of 2262 German respondents about the desirability of the four political goals: a = the maintenance of order in the nation; b = giving people more say in the decisions of government; c = growthing rising prices; d = protecting freedom of speech

Usage

data(German)

Source

Croon, M. A. (1989). Latent class models for the analysis of rankings. Advances in psychology, 60, 99-121.

Examples

data(German)
TR=tabulaterows(German)
polyplot(TR$X,Wk=TR$Wk,nobj=4)

Idea data set

Description

98 college students where asked to rank five words, (thought, play, theory, dream, attention) regarding its association with the word idea, from 5=most associated to 1=least associated.

Usage

data(Idea)

Source

Fligner, M. A., & Verducci, J. S. (1986). Distance based ranking models. Journal of the Royal Statistical Society. Series B (Methodological), 359-369.

Examples

data(Idea)
revIdea=6-Idea
TR=tabulaterows(revIdea)
CR=consrank(TR$X,wk=TR$Wk,algorithm="quick")
colnames(CR$Consensus)=colnames(Idea)


USA rank data

Description

Random subset of the rankings collected by O'Leary Morgan and Morgon (2010) on the 50 American States. The 368 number of items (the number of American States) is equal to 50, and the number of rankings is equal to 104. These data concern rankings of the 50 American States on three particular aspects: socio-demographic characteristics, health care expenditures and crime statistics.

Usage

data(USAranks)

Source

Amodio, S., D'Ambrosio, A. & Siciliano, R (2015). Accurate algorithms for identifying the median ranking when dealing with weak and partial rankings under the Kemeny axiomatic approach. European Journal of Operational Research. DOI: 10.1016/j.ejor.2015.08.048

References

O'Leary Morgan, K., Morgon, S., (2010). State Rankings 2010: A Statistical view of America; Crime State Ranking 2010: Crime Across America; Health Care State Rankings 2010: Health Care Across America. CQ Press.

Examples

data(USAranks)

Combined input matrix with C++ optimization

Description

Compute the Combined input matrix of a data set as defined by Emond and Mason (2002). This version uses C++ for improved performance.

Usage

combinpmatr(X, Wk = NULL, use_cpp = TRUE)

Arguments

X

A data matrix N by M, in which there are N judges and M objects to be judged. Each row is a ranking of the objects which are represented by the columns. Alternatively X can contain the rankings observed only once. In this case the argument Wk must be used

Wk

Optional: the frequency of each ranking in the data

use_cpp

Logical. If TRUE (default), use the optimized C++ implementation. If FALSE, use the original R implementation.

Details

This function now uses an optimized C++ implementation (combinpmatr_impl) for improved performance. The original R implementation is preserved for reference and can be accessed by setting use_cpp = FALSE.

The C++ implementation provides significant speedup (typically 10-100x) compared to the R implementation, especially for large datasets with many judges and objects.

Value

The M by M combined input matrix

Author(s)

Antonio D'Ambrosio antdambr@unina.it

References

Emond, E. J., and Mason, D. W. (2002). A new rank correlation coefficient with application to the consensus ranking problem. Journal of Multi-Criteria Decision Analysis, 11(1), 17-28.

See Also

tabulaterows frequency distribution of a ranking data.

Examples

# Simple example
X <- matrix(c(1,2,3,4, 2,1,4,3, 1,3,2,4), nrow=3, byrow=TRUE)
CI <- combinpmatr(X)

# With weights
CI_weighted <- combinpmatr(X, Wk=c(2, 1, 3))

# Compare implementations
## Not run: 
data(APAred) 
system.time(CI1 <- combinpmatr(APAred, use_cpp=TRUE))
system.time(CI2 <- combinpmatr(APAred, use_cpp=FALSE))
all.equal(CI1, CI2)  # Should be TRUE

## End(Not run)


Branch-and-bound and heuristic algorithms to find consensus (median) ranking according to the Kemeny's axiomatic approach

Description

Branch-and-bound, Quick , FAST and DECOR algorithms to find consensus (median) ranking according to the Kemeny's axiomatic approach. The median ranking(s) can be restricted to be necessarily a full ranking, namely without ties

Usage

consrank(
  X,
  wk = NULL,
  ps = TRUE,
  algorithm = "BB",
  full = FALSE,
  itermax = 10,
  np = 15,
  gl = 100,
  ff = 0.4,
  cr = 0.9,
  proc = FALSE,
  use_cpp = TRUE
)

Arguments

X

A n by m data matrix, in which there are n judges and m objects to be judged. Each row is a ranking of the objects which are represented by the columns. If X contains the rankings observed only once, the argument wk can be used

wk

Optional: the frequency of each ranking in the data

ps

If PS=TRUE, on the screen some information about how many branches are processed are displayed.

algorithm

Specifies the used algorithm. One among "BB", "quick", "fast" and "decor". algorithm="BB" is the default option.

full

Specifies if the median ranking must be searched in the universe of rankings including all the possible ties (full=FALSE) or in the restricted space of full rankings (permutations). full=FALSE is the default option.

itermax

maximum number of iterations for FAST and DECOR algorithms. itermax=10 is the default option.

np

For DECOR algorithm only: the number of population individuals. np=15 is the default option.

gl

For DECOR algorithm only: generations limit, maximum number of consecutive generations without improvement. gl=100 is the default option.

ff

For DECOR algorithm only: the scaling rate for mutation. Must be in [0,1]. ff=0.4 is the default option.

cr

For DECOR algorithm only: the crossover range. Must be in [0,1]. cr=0.9 is the default option.

proc

For BB algorithm only: proc=TRUE allows the branch and bound algorithm to work in difficult cases, i.e. when the number of objects is larger than 15 or 25. proc=FALSE is the default option

use_cpp

Logical. If TRUE (default), uses C++ implementation for faster computation. If FALSE, uses the R implementation.

Details

The BB algorithm can take long time to find the solutions if the number objects to be ranked is large with some missing (>15-20 if full=FALSE, <25-30 if full=TRUE). quick algorithm works with a large number of items to be ranked. The solution is quite accurate. fast algorithm works with a large number of items to be ranked by repeating several times the quick algorithm with different random starting points. decor algorithm works with a very large number of items to be ranked. For decor algorithm, empirical evidence shows that the number of population individuals (the 'np' parameter) can be set equal to 10, 20 or 30 for problems till 20, 50 and 100 items. Both scaling rate and crossover ratio (parameters 'ff' and 'cr') must be set by the user. The default options (ff=0.4, cr=0.9) work well for a large variety of data sets All algorithms allow the user to set the option 'full=TRUE' if the median ranking(s) must be searched in the restricted space of permutations instead of in the unconstrained universe of rankings of n items including all possible ties

Value

a "list" containing the following components:

Consensus the Consensus Ranking
Tau averaged TauX rank correlation coefficient
Eltime Elapsed time in seconds

#'

Author(s)

Antonio D'Ambrosio antdambr@unina.it

References

Emond, E. J., and Mason, D. W. (2002). A new rank correlation coefficient with application to the consensus ranking problem. Journal of Multi-Criteria Decision Analysis, 11(1), 17-28. D'Ambrosio, A., Amodio, S., and Iorio, C. (2015). Two algorithms for finding optimal solutions of the Kemeny rank aggregation problem for full rankings. Electronic Journal of Applied Statistical Analysis, 8(2), 198-213. Amodio, S., D'Ambrosio, A. and Siciliano, R. (2016). Accurate algorithms for identifying the median ranking when dealing with weak and partial rankings under the Kemeny axiomatic approach. European Journal of Operational Research, 249(2), 667-676. D'Ambrosio, A., Mazzeo, G., Iorio, C., and Siciliano, R. (2017). A differential evolution algorithm for finding the median ranking under the Kemeny axiomatic approach. Computers and Operations Research, vol. 82, pp. 126-138.

See Also

iwquickcons

Examples

data(Idea)
RevIdea<-6-Idea 
# as 5 means "most associated", it is necessary compute the reverse ranking of 
# each rankings to have rank 1 = "most associated" and rank 5 = "least associated"
CR<-consrank(RevIdea)
CR<-consrank(RevIdea,algorithm="quick")
#CR<-consrank(RevIdea,algorithm="fast",itermax=10)
#not run
#data(EMD)
#CRemd<-consrank(EMD[,1:15],wk=EMD[,16],algorithm="decor",itermax=1)
#data(APAFULL)
#CRapa<-consrank(APAFULL,full=TRUE)



Item-weighted Kemeny distance

Description

Compute the item-weighted Kemeny distance of a data matrix containing preference rankings, or compute the kemeny distance between two (matrices containing) rankings.

Usage

iw_kemenyd(x, y = NULL, w)

Arguments

x

A N by M data matrix, in which there are N judges and M objects to be judged. Each row is a ranking of the objects which are represented by the columns. If there is only x as input, the output is a square distance matrix

y

A row vector, or a N by M data matrix in which there are N judges and the same M objects as x to be judged.

w

A M-dimensional row vector (individually weighted items), or a M by M matrix (item similarities)

Value

If there is only x as input, d = square distance matrix. If there is also y as input, d = matrix with N rows and n columns.

Author(s)

Alessandro Albano alessandro.albano@unipa.it
Antonella Plaia antonella.plaia@unipa.it

References

Kemeny, J. G., & Snell, L. J. (1962). Preference ranking: an axiomatic approach. Mathematical models in the social sciences, 9-23.
Albano, A. and Plaia, A. (2021) Element weighted Kemeny distance for ranking data. Electronic Journal of Applied Statistical Analysis, doi: 10.1285/i20705948v14n1p117

See Also

iw_tau_x item-weighted tau_x rank correlation coefficient

kemenyd Kemeny distance

Examples

#Individually weighted items
data("German")
w=c(10,5,5,10)
iw_kemenyd(x= German[c(1,200,300,500),],w= w)
iw_kemenyd(x= German[1,],y=German[400,],w= w)

#Item similarity weights
data(sports)
P=matrix(NA,nrow=7,ncol=7)
P[1,]=c(0,5,5,10,10,10,10)
P[2,]=c(5,0,5,10,10,10,10)
P[3,]=c(5,5,0,10,10,10,10)
P[4,]=c(10,10,10,0,5,5,5)
P[5,]=c(10,10,10,5,0,5,5)
P[6,]=c(10,10,10,5,5,0,5)
P[7,]=c(10,10,10,5,5,5,0)
iw_kemenyd(x=sports[c(1,3,5,7),], w= P)
iw_kemenyd(x=sports[1,],y=sports[100,], w= P)

Item-weighted TauX rank correlation coefficient

Description

Compute the item-weighted TauX rank correlation coefficient of a data matrix containing preference rankings, or compute the item-weighted correlation coefficient between two (matrices containing) rankings.

Usage

iw_tau_x(x, y = NULL, w)

Arguments

x

A N by M data matrix, in which there are N judges and M objects to be judged. Each row is a ranking of the objects which are represented by the columns. If there is only x as input, the output is a square matrix

y

A row vector, or a N by M data matrix in which there are N judges and the same M objects as x to be judged.

w

A M-dimensional row vector (individually weighted items), or a M by M matrix (item similarities)

Value

Item-weighted TauX rank correlation coefficient

Author(s)

Alessandro Albano alessandro.albano@unipa.it
Antonella Plaia antonella.plaia@unipa.it

References

Emond, E. J., and Mason, D. W. (2002). A new rank correlation coefficient with application to the consensus ranking problem. Journal of Multi-Criteria Decision Analysis, 11(1), 17-28.
Albano, A. and Plaia, A. (2021) Element weighted Kemeny distance for ranking data. Electronic Journal of Applied Statistical Analysis, doi: 10.1285/i20705948v14n1p117

See Also

tau_x TauX rank correlation coefficient

iw_kemenyd item-weighted Kemeny distance

Examples

#Individually weighted items
data("German")
w=c(10,5,5,10)
iw_tau_x(x= German[c(1,200,300,500),],w= w)
iw_tau_x(x= German[1,],y=German[400,],w= w)

#Item similarity weights
data(sports)
P=matrix(NA,nrow=7,ncol=7)
P[1,]=c(0,5,5,10,10,10,10)
P[2,]=c(5,0,5,10,10,10,10)
P[3,]=c(5,5,0,10,10,10,10)
P[4,]=c(10,10,10,0,5,5,5)
P[5,]=c(10,10,10,5,0,5,5)
P[6,]=c(10,10,10,5,5,0,5)
P[7,]=c(10,10,10,5,5,5,0)
iw_tau_x(x=sports[c(1,3,5,7),], w= P)
iw_tau_x(x=sports[1,],y=sports[100,], w= P)

Item-weighted Combined input matrix of a data set

Description

Compute the item-weighted Combined input matrix of a data set as defined by Albano and Plaia (2021)

Usage

iwcombinpmatr(X, w, Wk = NULL)

Arguments

X

A data matrix N by M, in which there are N judges and M objects to be judged. Each row is a ranking of the objects which are represented by the columns. Alternatively X can contain the rankings observed only once. In this case the argument Wk must be used

w

A M-dimensional row vector (individually weighted items), or a M by M matrix (item similarities)

Wk

Optional: the frequency of each ranking in the data

Value

The M by M item-weighted combined input matrix

Author(s)

Alessandro Albano alessandro.albano@unipa.it
Antonella Plaia antonella.plaia@unipa.it

References

Emond, E. J., and Mason, D. W. (2002). A new rank correlation coefficient with application to the consensus ranking problem. Journal of Multi-Criteria Decision Analysis, 11(1), 17-28.
Albano, A. and Plaia, A. (2021). Element weighted Kemeny distance for ranking data. Electronic Journal of Applied Statistical Analysis, doi: 10.1285/i20705948v14n1p117

See Also

tabulaterows frequency distribution of a ranking data.

combinpmatr combined input matrix of a ranking data set.

Examples

data(sports)
np <- dim(sports)[2]
P <- matrix(NA,nrow=np,ncol=np)
P[1,] <- c(0,5,5,10,10,10,10)
P[2,] <- c(5,0,5,10,10,10,10)
P[3,] <- c(5,5,0,10,10,10,10)
P[4,] <- c(10,10,10,0,5,5,5)
P[5,] <- c(10,10,10,5,0,5,5)
P[6,] <- c(10,10,10,5,5,0,5)
P[7,] <- c(10,10,10,5,5,5,0)
CIW <- iwcombinpmatr(sports,w=P) 


The item-weighted Quick algorithm to find up to 4 solutions to the consensus ranking problem

Description

The item-weighted Quick algorithm finds up to 4 solutions. Solutions reached are most of the time optimal solutions.

Usage

iwquickcons(X, w, Wk = NULL, full = FALSE, PS = FALSE)

Arguments

X

A N by M data matrix in which there are N judges and M objects to be judged. Each row is a ranking of the objects which are represented by the columns. Alternatively X can contain the rankings observed only once in the sample. In this case the argument Wk must be used

w

A M-dimensional row vector (individually weighted items), or a M by M matrix (item similarities)

Wk

Optional: the frequency of each ranking in the data

full

Default full=FALSE. If full=TRUE, the searching is limited to the space of full rankings.

PS

Default PS=FALSE. If PS=TRUE the number of evaluated branches is diplayed

Details

The item-weigthed Quick algorithm finds up the consensus (median) ranking according to the Kemeny's axiomatic approach. The median ranking(s) can be restricted to be necessarily a full ranking, namely without ties.

Value

a "list" containing the following components:

Consensus the Consensus Ranking
Tau averaged item-weighted TauX rank correlation coefficient
Eltime Elapsed time in seconds

Author(s)

Alessandro Albano alessandro.albano@unipa.it
Antonella Plaia antonella.plaia@unipa.it

References

Amodio, S., D'Ambrosio, A. and Siciliano, R. (2016). Accurate algorithms for identifying the median ranking when dealing with weak and partial rankings under the Kemeny axiomatic approach. European Journal of Operational Research, 249(2), 667-676.
Albano, A. and Plaia, A. (2021). Element weighted Kemeny distance for ranking data. Electronic Journal of Applied Statistical Analysis, doi: 10.1285/i20705948v14n1p117

See Also

consrank

Examples

#Individually weighted items
data("German")
w=c(10,5,5,10)
iwquickcons(X= German,w= w)

#Item similirity weights
data(sports)
dim(sports)
P=matrix(NA,nrow=7,ncol=7)
P[1,]=c(0,5,5,10,10,10,10)
P[2,]=c(5,0,5,10,10,10,10)
P[3,]=c(5,5,0,10,10,10,10)
P[4,]=c(10,10,10,0,5,5,5)
P[5,]=c(10,10,10,5,0,5,5)
P[6,]=c(10,10,10,5,5,0,5)
P[7,]=c(10,10,10,5,5,5,0)
iwquickcons(X= sports, w= P)


Kemeny distance

Description

Compute the Kemeny distance of a data matrix containing preference rankings, or compute the kemeny distance between two (matrices containing) rankings.

Usage

kemenyd(X, Y = NULL)

Arguments

X

A N by M data matrix, in which there are N judges and M objects to be judged. Each row is a ranking of the objects which are represented by the columns. If there is only X as input, the output is a square distance matrix

Y

A row vector, or a n by M data matrix in which there are n judges and the same M objects as X to be judged.

Value

If there is only X as input, d = square distance matrix. If there is also Y as input, d = matrix with N rows and n columns.

Author(s)

Antonio D'Ambrosio antdambr@unina.it

References

Kemeny, J. G., & Snell, L. J. (1962). Preference ranking: an axiomatic approach. Mathematical models in the social sciences, 9-23.

See Also

tau_x TauX rank correlation coefficient

iw_kemenyd item-weighted Kemeny distance

Examples

data(Idea)
RevIdea<-6-Idea ##as 5 means "most associated", it is necessary compute the reverse 
#ranking of each rankings to have rank 1 = "most associated" and rank 5 = "least associated"
KD<-kemenyd(RevIdea)
KD2<-kemenyd(RevIdea[1:10,],RevIdea[55,])


Auxiliary function

Description

Define a design matrix to compute Kemeny distance

Usage

kemenydesign(X)

Arguments

X

A N by M data matrix, in which there are N judges and M objects to be judged. Each row is a ranking of the objects represented by the columns.

Details

This function now uses an optimized C++ implementation (kemenydesign_cpp) for improved performance. The original R implementation is preserved for reference.

Value

Design matrix

Author(s)

Antonio D'Ambrosio antdambr@unina.it

References

D'Ambrosio, A. (2008). Tree based methods for data editing and preference rankings. Unpublished PhD Thesis. Universita' degli Studi di Napoli Federico II.

See Also

kemenydesign_cpp


Kemeny design matrix (C++ implementation)

Description

Compute the design matrix for Kemeny distance calculation. This function uses an optimized C++ implementation for better performance.

Usage

kemenydesign_cpp(X)

Arguments

X

A N by M numeric matrix or vector, where each row represents a ranking of M objects. If X is a vector, it will be converted to a single-row matrix.

Details

This is a C++ reimplementation of the original kemenydesign function for improved computational efficiency. The function processes all pairwise comparisons between M objects, generating M*(M-1)/2 binary features that encode the ranking structure.

The optimization provides significant speedup (typically 20-50x) compared to the R implementation, especially for large matrices.

Value

A numeric matrix with N rows and M*(M-1)/2 columns. Each column represents a pairwise comparison between objects:

Author(s)

Antonio D'Ambrosio antdambr@unina.it

References

D'Ambrosio, A. (2008). Tree based methods for data editing and preference rankings. Unpublished PhD Thesis. Universita' degli Studi di Napoli Federico II.

See Also

kemenydesign for the wrapper function

kemenyd for computing Kemeny distance using this design matrix

Examples

# Single ranking
x <- c(1, 3, 2, 4)
kemenydesign_cpp(x)

# Multiple rankings
X <- matrix(c(1,2,3,4,
              4,3,2,1,
              1,1,2,2), nrow=3, byrow=TRUE)
kemenydesign_cpp(X)


Score matrix according Kemeny (1962)

Description

Given a ranking, it computes the score matrix as defined by Emond and Mason (2002)

Usage

kemenyscore(X)

Arguments

X

a ranking (must be a row vector or, better, a matrix with one row and M columns)

Value

the M by M score matrix

Author(s)

Antonio D'Ambrosio antdambr@unina.it

References

Kemeny, J and Snell, L. (1962). Mathematical models in the social sciences.

See Also

scorematrix The score matrix as defined by Emond and Mason (2002)

Examples

Y <- matrix(c(1,3,5,4,2),1,5)
SM<-kemenyscore(Y)
#
Z<-c(1,2,3,2)
SM2<-kemenyscore(Z)


Median Constrained Bucket Order (MCBO)

Description

Find the median ranking constrained to exactly b buckets (tied groups) according to Kemeny's axiomatic approach. This implements the algorithms described in D'Ambrosio et al. (2019) for rank aggregation with fixed bucket constraints.

Usage

mcbo(
  X,
  nbuckets,
  wk = NULL,
  ps = TRUE,
  algorithm = "BB",
  itermax = 10,
  np = 10,
  gl = 100,
  ff = 0.4,
  cr = 0.8,
  use_cpp = TRUE
)

Arguments

X

A N by M data matrix, in which there are N judges and M objects to be judged. Each row is a ranking of the objects which are represented by the columns. If X contains the rankings observed only once, the argument wk can be used

nbuckets

Integer. The number of buckets (tied groups) the consensus ranking must contain. Must be between 2 and M-1 (where M is the number of objects)

wk

Optional: the frequency of each ranking in the data

ps

Logical. If TRUE, displays progress information on screen. Default TRUE

algorithm

Character string specifying the algorithm to use. One of:

  • "BB" - Branch-and-Bound (exact, default). Best for M <= 15

  • "quick" - Quick algorithm (heuristic). Best for 15 < M <= 50

  • "decor" - Differential Evolution (metaheuristic). Best for M > 50

itermax

Integer. Maximum number of iterations for "quick" and "decor" algorithms. Default 10

np

Integer. Number of population individuals for "decor" algorithm. Default 10

gl

Integer. Generations limit for "decor": maximum number of consecutive generations without improvement. Default 100

ff

Numeric. Scaling rate for mutation in "decor". Must be in [0,1]. Default 0.4

cr

Numeric. Crossover range for "decor". Must be in [0,1]. Default 0.8

use_cpp

Logical. If TRUE (default), use optimized C++ implementations for core functions (combinpmatr, scorematrix, PenaltyBB2, etc.)

Details

The median constrained bucket order problem finds the ranking that minimizes the sum of Kemeny distances to the input rankings, subject to the constraint that the solution must have exactly nbuckets groups of tied items.

This is useful in applications where the output must conform to predetermined categories, such as:

The search space is restricted to Z^{n/b}, which contains

\sum_{b=1}^{n} b! S(n,b)

rankings, where S(n,b) is the Stirling number of the second kind.

Algorithm Selection Guidelines:

For stochastic algorithms (quick, decor), consider running multiple times (controlled by itermax) to avoid local optima.

Value

A list containing the following components:

Consensus The consensus ranking with exactly nbuckets buckets
Tau Averaged TauX rank correlation coefficient
Eltime Elapsed time in seconds

Author(s)

Antonio D'Ambrosio antdambr@unina.it

References

D'Ambrosio, A., Iorio, C., Staiano, M., and Siciliano, R. (2019). Median constrained bucket order rank aggregation. Computational Statistics, 34(2), 787-802. doi:10.1007/s00180-018-0858-z

See Also

consrank for unconstrained consensus ranking

combinpmatr for computing the combined input matrix

scorematrix for computing score matrices

tau_x for TauX correlation coefficient

kemenyd for Kemeny distance

stirling2 for Stirling numbers (bucket combinatorics)

Examples

## Not run: 
# Simple example with 98 judges ranking 5 items into 3 buckets
data(Idea)
RevIdea <- 6 - Idea  # Reverse ranking
CR <- mcbo(RevIdea, nbuckets = 3, algorithm = "BB")
print(CR$Consensus)
print(CR$Tau)

# Large dataset with Quick algorithm
data(EMD)
CR_quick <- mcbo(EMD[,1:15], nbuckets = 5, wk = EMD[,16], 
                 algorithm = "quick", itermax = 20)

## End(Not run)


Given an ordering, it is transformed to a ranking

Description

From ordering to rank. IMPORTANT: check which symbol denotes tied rankings in the X matrix

Usage

order2rank(X, TO = "{", TC = "}", NAs = "NA")

Arguments

X

A ordering or a matrix containing orderings

TO

symbol indicating the start of a set of items ranked in a tie

TC

symbol indicating the end of a set of items ranked in a tie

NAs

symbol indicating missing ranked object. Can be any symbol in the original ordering matrix

Value

a ranking or a matrix of rankings:

R ranking or matrix of rankings

Author(s)

Antonio D'Ambrosio antdambr@unina.it

Examples

data(APAred)
ord=rank2order(APAred) #transform rankings into orderings
ran=order2rank(ord) #transform the orderings into rankings


Generate partitions of n items constrained into k non empty subsets

Description

Generate all possible partitions of n items constrained into k non empty subsets. It does not generate the universe of rankings constrained into k buckets.

Usage

partitions(n, k = NULL, items = NULL, itemtype = "L")

Arguments

n

a (integer) number denoting the number of items

k

The number of the non-empty subsets. Default value is NULL, in this case all the possible partitions are displayed

items

items: the items to be placed into the ordering matrix. Default are the first c small letters

itemtype

to be used only if items is not set. The default value is "L", namely letters. Any other symbol produces items as the first c integers

Details

If the objects to be ranked is large (>15-20) with some missing, it can take long time to find the solutions. If the searching space is limited to the space of full rankings (also incomplete rankings, but without ties), use the function BBFULL or the functions FASTcons and QuickCons with the option FULL=TRUE.

Value

the ordering matrix (or vector)

Author(s)

Antonio D'Ambrosio antdambr@unina.it

See Also

stirling2 Stirling number of second kind.

rank2order Convert rankings into orderings.

order2rank Convert orderings into ranks.

univranks Generate the universe of rankings given the input partition

Examples

X<-partitions(4,3)
#shows all the ways to partition 4 items (say "a", "b", "c" and "d" into 3 non-empty subets
 #(i.e., into 3 buckets). The Stirling number of the second kind (4,3) indicates that there
 #are 6 ways.
s2<-stirling2(4,3)$S
X2<-order2rank(X) #it transform the ordering into ranking



Plot rankings on a permutation polytope of 3 o 4 objects containing all possible ties

Description

Plot rankings a permutation polytope that is the geometrical space of preference rankings. The plot is available for 3 or for 4 objects

Usage

polyplot(X = NULL, L = NULL, Wk = NULL, nobj = 3)

Arguments

X

the sample of rankings. Most of the time it is returned by tabulaterows

L

labels of the objects

Wk

frequency associated to each ranking

nobj

number of objects. It must be either 3 or 4

Details

polyplot() plots the universe of 3 objects. polyplot(nobj=4) plots the universe of 4 objects. Note: 4-object plots require the 'rgl' package to be installed.

Value

the permutation polytope

Author(s)

Antonio D'Ambrosio antdambr@unina.it and Sonia Amodio sonia.amodio@unina.it

References

Thompson, G. L. (1993). Generalized permutation polytopes and exploratory graphical methods for ranked data. The Annals of Statistics, 1401-1430. # Heiser, W. J., and D'Ambrosio, A. (2013). Clustering and prediction of rankings within a Kemeny distance framework. In Algorithms from and for Nature and Life (pp. 19-31). Springer International Publishing.

See Also

tabulaterows frequency distribution for ranking data.

Examples

## Not run: 
polyplot()

## End(Not run)

## Not run: 
# Requires rgl package
polyplot(nobj=4)
data(BU)
polyplot(BU[,1:3],Wk=BU[,4])

## End(Not run)


Plot rankings on a permutation polytope of 3 or 4 objects

Description

Plot rankings on a permutation polytope of 3 or 4 objects

Usage

polyplotnew(X = NULL, L = NULL, Wk = NULL, nobj = 3, engine = "auto")

Arguments

X

Sample of rankings (matrix). Usually returned by tabulaterows()

L

Labels of the objects (character vector)

Wk

Frequency weights for each ranking (numeric vector)

nobj

Number of objects: 3 or 4 (integer)

engine

Rendering engine for 4-object plots: "auto", "rgl", or "plotly" (character)

Details

For 3 objects: uses base R graphics (always available) For 4 objects: - engine="auto": tries rgl first, falls back to plotly if unavailable - engine="rgl": requires 'rgl' package (may need XQuartz on macOS) - engine="plotly": requires 'plotly' package (works on all platforms)

Value

For nobj=3: invisible NULL (creates base graphics plot) For nobj=4: invisible NULL for rgl, plotly object for plotly engine

Author(s)

Antonio D'Ambrosio, Sonia Amodio

References

Thompson, G. L. (1993). Generalized permutation polytopes and exploratory graphical methods for ranked data. The Annals of Statistics, 1401-1430.

Heiser, W. J., and D'Ambrosio, A. (2013). Clustering and prediction of rankings within a Kemeny distance framework. In Algorithms from and for Nature and Life (pp. 19-31). Springer International Publishing.


Given a rank, it is transformed to a ordering

Description

From ranking to ordering. IMPORTANT: check which symbol denotes tied rankings in the X matrix

Usage

rank2order(X, items = NULL, TO = "{", TC = "}", itemtype = "L")

Arguments

X

A ordering or a matrix containing orderings

items

items to be placed into the ordering matrix. Default are the

TO

symbol indicating the start of a set of items ranked in a tie

TC

symbol indicating the end of a set of items ranked in a tie

itemtype

to be used only if items=NULL. The default value is "L", namely

Value

a ordering or a matrix of orderings:

out ranking or matrix of rankings

Author(s)

Antonio D'Ambrosio antdambr@unina.it

Examples

data(APAred)
ord<-rank2order(APAred)


Given a vector (or a matrix), returns an ordered vector (or a matrix with ordered vectors)

Description

Given a ranking of M objects (or a matrix with M columns), it reduces it in "natural" form (i.e., with integers from 1 to M)

Usage

reordering(X)

Arguments

X

a ranking, or a ranking data matrix

Value

a ranking in natural form

Author(s)

Antonio D'Ambrosio antdambr@unina.it


Score matrix according Emond and Mason (2002)

Description

Given a ranking, it computes the score matrix as defined by Emond and Mason (2002)

Usage

scorematrix(X, use_cpp = TRUE)

Arguments

X

a ranking (must be a row vector or, better, a matrix with one row and M columns)

use_cpp

Logical. If TRUE (default), uses C++ implementation for faster computation. If FALSE, uses the R implementation.

Value

the M by M score matrix

Author(s)

Antonio D'Ambrosio antdambr@unina.it

References

Emond, E. J., and Mason, D. W. (2002). A new rank correlation coefficient with application to the consensus ranking problem. Journal of Multi-Criteria Decision Analysis, 11(1), 17-28.

See Also

combinpmatr The combined inut matrix

Examples

Y <- matrix(c(1,3,5,4,2),1,5)
SM<-scorematrix(Y)
#
Z<-c(1,2,4,3)
SM2<-scorematrix(Z)


sports data

Description

130 students at the University of Illinois ranked seven sports according to their preference (Baseball, Football, Basketball, Tennis, Cycling, Swimming, Jogging).

Usage

data(sports)

Source

Marden, J. I. (1996). Analyzing and modeling rank data. CRC Press.

Examples

data(sports)

Stirling numbers of the second kind

Description

Denote the number of ways to partition a set of n objects into k non-empty subsets

Usage

stirling2(n, k)

Arguments

n

(integer): the number of the objects

k

(integer <=n): the number of the non-empty subsets (buckets)

Value

a "list" containing the following components:

S the stirling number of the second kind
SM a matrix showing, for each k (on the columns) in how many ways the n objects (on the rows) can be partitioned

Author(s)

Antonio D'Ambrosio antdambr@unina.it

References

Comtet, L. (1974). Advanced Combinatorics: The art of finite and infinite expansions. D. Reidel, Dordrecth, The Netherlands.

Examples

parts<-stirling2(4,2)


Frequency distribution of a sample of rankings

Description

Given a sample of preference rankings, it compute the frequency associated to each ranking

Usage

tabulaterows(X, miss = FALSE)

Arguments

X

a N by M data matrix containing N judges judging M objects

miss

TRUE if there are missing data (either partial or incomplete rankings): default: FALSE

Value

a "list" containing the following components:

X the unique rankings
Wk the frequency associated to each ranking
tabfreq frequency table

Author(s)

Antonio D'Ambrosio antdambr@unina.it

Examples

data(Idea)
TR<-tabulaterows(Idea)
FR<-TR$Wk/sum(TR$Wk)
RF<-cbind(TR$X,FR)
colnames(RF)<-c(colnames(Idea),"fi")
#compute modal ranking
maxfreq<-which(RF[,6]==max(RF[,6]))
rank2order(RF[maxfreq,1:5],items=colnames(Idea))
#
data(APAred)
TR<-tabulaterows(APAred)
#
data(APAFULL)
TR<-tabulaterows(APAFULL)
CR1<-consrank(TR$X,wk=TR$Wk)
CR2<-consrank(TR$X,wk=TR$Wk,algorithm="fast",itermax=15)
CR3<-consrank(TR$X,wk=TR$Wk,algorithm="quick")


TauX (tau exstension) rank correlation coefficient

Description

Tau exstension is a new rank correlation coefficient defined by Emond and Mason (2002)

Usage

tau_x(X, Y = NULL)

Tau_X(X, Y = NULL)

Arguments

X

a M by N data matrix, in which there are N judges and M objects to be judged. Each row is a ranking of the objects which are represented by the columns. If there is only X as input, the output is a square matrix containing the Tau_X rcc.

Y

A row vector, or a n by M data matrix in which there are n judges and the same M objects as X to be judged.

Value

Tau_x rank correlation coefficient

Author(s)

Antonio D'Ambrosio antdambr@unina.it

References

Emond, E. J., and Mason, D. W. (2002). A new rank correlation coefficient with application to the consensus ranking problem. Journal of Multi-Criteria Decision Analysis, 11(1), 17-28.

See Also

kemenyd Kemeny distance

iw_tau_x item-weighted tau_x rank correlation coefficient

Examples

data(BU)
RD<-BU[,1:3]
Tau<-tau_x(RD)
Tau1_3<-tau_x(RD[1,],RD[3,])


Generate the universe of rankings

Description

Generate the universe of rankings given the input partition

Usage

univranks(X, k = NULL, ordering = TRUE)

Arguments

X

A ranking, an ordering, a matrix of rankings, a matrix of orderings or a number

k

Optional: the number of the non-empty subsets. It has to be used only if X is anumber. The default value is NULL, In this case the universe of rankings with n=X items are computed

ordering

The universe of rankings must be returned as orderings (default) or rankings?

Details

The function should be used with small numbers because it can generate a large number of permutations. The use of X greater than 9, of X matrices with more than 9 columns as input is not reccomended.

Value

a "list" containing the following components:

Runiv The universe of rankings
Cuniv A list containing:
R The universe of rankings in terms of rankings;
Parts for each ranking in input the produced rankings
Univinbuckets the universe of rankings within each bucket

Author(s)

Antonio D'Ambrosio antdambr@unina.it

See Also

stirling2 Stirling number of second kind.

rank2order Convert rankings into orderings.

order2rank Convert orderings into ranks.

partitions Generate partitions of n items constrained into k non empty subsets.

Examples

S2<-stirling2(4,4)$SM[4,] #indicates in how many ways 4 objects
                         #can be placed, respectively, into 1, 2,
                         #3 or 4 non-empty subsets.
CardConstr<-factorial(c(1,2,3,4))*S2  #the cardinality of rankings 
                                     #constrained into 1, 2, 3 and 4
                                     #buckets
Card<-sum(CardConstr)  #Cardinality of the universe of rankings with 4
                      #objects                                                              
U<-univranks(4)$Runiv #the universe of rankings with four objects
                     # we know that the universe counts 75 
                     #different rankings
Uk<-univranks(4,2)$Runiv    #the universe of rankings of four objects 
                           #constrained into k=2 buckets, we know they are 14
Up<-univranks(c(1,4,3,1))$Runiv  #the universe of rankings with 4 objects
                                #for which the first and the fourth item
                                #are tied