net.sourceforge.nite.datainspection.calc
Class CoincidenceMatrixM

java.lang.Object
  extended by net.sourceforge.nite.datainspection.calc.CoincidenceMatrixM

public class CoincidenceMatrixM
extends java.lang.Object

Note (DR): this documentation is out of date. It describes two Classifications, but this class is for multiple annotators.
Note: this class considers String values. The UNDEF or MISSING VALUE is therefore also specified as a String. We may want to change this.
CoincidenceMatrixM has methods for comparing two Classifications.
The coincidence matrix is indexed by Values (labels).
coincidence_matrix[Val1][Val2] is the number of times that one classification labeled some item Val1 and the other labeled it Val2.
coincidence_matrix is symmetrical about the main diagonal.
The sum of all entries is 2*N, where N is the number of items judged (2*N values are given).
This class contains a method for computing kappa (a measure of agreement between classifications).
The computation uses a distance metric on the Values (class labels) assigned to the items (units).
Usually the Boolean Metric is used: distance = 0 iff the values are equal, otherwise it is 1.
If you think that some labels are more similar than others, you may use a weighted kappa that uses your own DistanceMetric.
kappaKrippendorff - returns the same value as alphaNominal with the standard BooleanMetric.
alphaNominal(DistanceMetric) - according to Krippendorff; requires a DistanceMetric defined on the values of the classification.
kappaCohen - computed using the confusion matrix (this kappa may differ from kappaKrippendorff).
The implementation is based on "Computing Krippendorff's Alpha-Reliability"
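
The matrix properties listed above (symmetry, entries summing to 2*N) can be illustrated for the two-classification case with a minimal sketch. This is not the NXT implementation: the class `CoincidenceSketch` and its methods `build` and `total` are hypothetical, labels are plain strings rather than Value objects, and the sketch assumes each item is entered once per ordering of its label pair.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of net.sourceforge.nite.datainspection.calc.
class CoincidenceSketch {

    // Builds a coincidence matrix for two classifications of the same ordered items.
    // The order of 'values' defines the order of the rows and columns.
    static double[][] build(List<String> first, List<String> second, List<String> values) {
        if (first.size() != second.size()) {
            throw new IllegalArgumentException("both classifications must cover the same items");
        }
        double[][] m = new double[values.size()][values.size()];
        for (int i = 0; i < first.size(); i++) {
            int a = values.indexOf(first.get(i));
            int b = values.indexOf(second.get(i));
            if (a < 0 || b < 0) {
                throw new IllegalArgumentException("label missing from values list at item " + i);
            }
            // Assumption: each item is entered once per ordering of its label pair,
            // so an agreement adds 2 to the diagonal and the matrix stays symmetric.
            m[a][b] += 1.0;
            m[b][a] += 1.0;
        }
        return m;
    }

    // Sums all entries; equals 2*N for N items under the assumption above.
    static double total(double[][] m) {
        double sum = 0.0;
        for (double[] row : m) {
            for (double cell : row) {
                sum += cell;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("yes", "no");
        List<String> first = Arrays.asList("yes", "yes", "no", "no", "yes");
        List<String> second = Arrays.asList("yes", "no", "no", "no", "yes");
        double[][] m = build(first, second, values);
        System.out.println(Arrays.deepToString(m));
        System.out.println("total = " + total(m) + ", items = " + first.size());
    }
}
```

With five items the entries sum to 10, and the single disagreement appears once on each side of the diagonal.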

See Also:
Classification, DistanceMetric

Constructor Summary
CoincidenceMatrixM(double[][] m)
          Creates a coincidence matrix with the given contents, of size m.length, whose default values are the numbers (0, 1, ..., size-1).
CoincidenceMatrixM(int size)
          Creates an empty coincidence matrix of the given size, whose default values are the numbers (0, 1, ..., size-1).
CoincidenceMatrixM(java.util.List cls)
           
CoincidenceMatrixM(java.util.List cls, Value undef)
          Use this constructor when the Classifications contain "MISSING CASES" or a special value to be interpreted as "UNDEF"
 
Method Summary
 double alpha()
           
 double alphaNominal(DistanceMetric dist)
          Computes alpha for nominal values using the given distance metric.
The distance metric should be appropriate for the Values that occur in the Classification for which
this CoincidenceMatrix is computed at construction.
 double entry(int row, int col)
           
 double entry(Value rowValue, Value colValue)
           
 java.util.List getValues()
           
 double kappaKrippendorff()
          returns the same value as alpha when using the standard Boolean Metric.
static void main(java.lang.String[] args)
           
 int nrOfItems()
           
 int numberOfValues()
           
 void printMatrix(java.io.PrintWriter pw)
           
 void printMatrix(java.lang.String filename)
           
 void printValues(java.io.PrintWriter pw)
          Prints all Values that occur in the first or second Classification to the given PrintWriter.
 CoincidenceMatrixM remove(Value val)
          Makes a copy of this matrix without the row and column of the given Value. If the Value does not occur in the values list, this matrix itself is returned (not a copy!).
 void setDistanceMetrics(DistanceMetric dist)
          set the distance metric used
 void setEntry(int row, int col, double cv)
          set coinm[row][col] to cv
 void setEntry(Value rowValue, Value colValue, double cv)
          set coinm[rowValue][colValue] to cv
 void setValues(java.util.List vals)
          Sets the list of class labels used. The order should be the same as the order of Values in the rows and columns of the matrix.
 void showDistanceMatrix(java.lang.String outFile)
           
 void showMatrix()
           
 void showValues()
          Prints all Values that occur in the first or second Classification to standard output.
 int size()
           
 int totalNumberOfItemsLabeledUndefined()
           
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CoincidenceMatrixM

public CoincidenceMatrixM(java.util.List cls)
Parameters:
cls - the Classifications to compare. Required: all are classifications of the same ordered list of items/units. The constructor computes the coincidence matrix.

CoincidenceMatrixM

public CoincidenceMatrixM(java.util.List cls,
                          Value undef)
Use this constructor when the Classifications contain "MISSING CASES" or a special value to be interpreted as "UNDEF"

Parameters:
undef - the Value, specified as a StringValue, that stands for "UNDEFINED" (or "MISSING VALUE")

CoincidenceMatrixM

public CoincidenceMatrixM(int size)
Creates an empty coincidence matrix of the given size, whose default values are the numbers (0, 1, ..., size-1).


CoincidenceMatrixM

public CoincidenceMatrixM(double[][] m)
Creates a coincidence matrix with the given contents, of size m.length, whose default values are the numbers (0, 1, ..., size-1).

Method Detail

size

public int size()
Returns:
the size of this matrix (equals the number of rows, equals the number of columns, equals the nr of Values in the values list)

remove

public CoincidenceMatrixM remove(Value val)
Makes a copy of this matrix without the row and column of the given Value. If the Value does not occur in the values list, this matrix itself is returned (not a copy!).

Returns:
a new CoincidenceMatrixM constructed from this one by removing the given Value entries
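
The copy-or-self behaviour described above can be sketched as follows. This is an illustration only, under the assumption that labels are strings held in a list whose order matches the matrix rows and columns; the class `RemoveSketch` and its fields are hypothetical and do not reflect the internals of CoincidenceMatrixM.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the CoincidenceMatrixM implementation.
class RemoveSketch {
    final List<String> values;   // labels, in row/column order
    final double[][] coinm;      // square matrix, one row and column per label

    RemoveSketch(List<String> values, double[][] coinm) {
        this.values = values;
        this.coinm = coinm;
    }

    // Returns a reduced copy without the row and column of 'val',
    // or this very instance when 'val' is not in the values list.
    RemoveSketch remove(String val) {
        int idx = values.indexOf(val);
        if (idx < 0) {
            return this; // not a copy
        }
        int n = values.size();
        double[][] reduced = new double[n - 1][n - 1];
        int rr = 0;
        for (int r = 0; r < n; r++) {
            if (r == idx) {
                continue; // skip the removed row
            }
            int cc = 0;
            for (int c = 0; c < n; c++) {
                if (c == idx) {
                    continue; // skip the removed column
                }
                reduced[rr][cc++] = coinm[r][c];
            }
            rr++;
        }
        List<String> kept = new ArrayList<>(values);
        kept.remove(idx); // removes by position
        return new RemoveSketch(kept, reduced);
    }
}
```

Because the missing-label case returns the same instance, a caller that intends to modify the result may want to check `result == original` first.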

setValues

public void setValues(java.util.List vals)
Sets the list of class labels used. The order should be the same as the order of Values in the rows and columns of the matrix.


entry

public double entry(int row,
                    int col)
Returns:
coinm[row][col]

entry

public double entry(Value rowValue,
                    Value colValue)
Returns:
coinm[rowValue][colValue]
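
The relationship between the two entry overloads and the values list can be sketched as below. This is a simplified illustration, assuming string labels and a lookup by position in the values list; `LabelLookupSketch` is a hypothetical class, and the default labels "0" to "size-1" only mimic the default values described for the constructors.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the CoincidenceMatrixM implementation.
class LabelLookupSketch {
    private final double[][] coinm;
    private List<String> values;

    LabelLookupSketch(double[][] coinm) {
        this.coinm = coinm;
        // Assumption: default labels are the numbers 0..size-1, stored as strings.
        this.values = new ArrayList<>();
        for (int i = 0; i < coinm.length; i++) {
            values.add(Integer.toString(i));
        }
    }

    // The order of 'vals' must match the order of the rows and columns.
    void setValues(List<String> vals) {
        if (vals.size() != coinm.length) {
            throw new IllegalArgumentException("need one label per row/column");
        }
        this.values = vals;
    }

    double entry(int row, int col) {
        return coinm[row][col];
    }

    // Label-based access resolves each label to its position in the values list.
    double entry(String rowValue, String colValue) {
        return entry(indexOf(rowValue), indexOf(colValue));
    }

    private int indexOf(String label) {
        int i = values.indexOf(label);
        if (i < 0) {
            throw new IllegalArgumentException("unknown label: " + label);
        }
        return i;
    }
}
```

If the labels passed to `setValues` were in a different order than the matrix rows, label-based lookups would silently return the wrong cells, which is why the ordering requirement matters.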

setEntry

public void setEntry(Value rowValue,
                     Value colValue,
                     double cv)
set coinm[rowValue][colValue] to cv


setEntry

public void setEntry(int row,
                     int col,
                     double cv)
set coinm[row][col] to cv


getValues

public java.util.List getValues()

nrOfItems

public int nrOfItems()

totalNumberOfItemsLabeledUndefined

public int totalNumberOfItemsLabeledUndefined()

showMatrix

public void showMatrix()

printMatrix

public void printMatrix(java.lang.String filename)

printMatrix

public void printMatrix(java.io.PrintWriter pw)

showValues

public void showValues()
Prints all Values that occur in the first or second Classification to standard output.


printValues

public void printValues(java.io.PrintWriter pw)
Prints all Values that occur in the first or second Classification to the given PrintWriter.


numberOfValues

public int numberOfValues()
Returns:
number of class labels used

alphaNominal

public double alphaNominal(DistanceMetric dist)
Computes alpha for nominal values using the given distance metric.
The distance metric should be appropriate for the Values that occur in the Classification for which
this CoincidenceMatrix is computed at construction.

Returns:
alpha = 1.0 - (D_observed / D_chance )
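
As a rough sketch of the formula alpha = 1.0 - (D_observed / D_chance), the code below computes both disagreements from a coincidence matrix. It assumes Krippendorff's usual definitions: D_observed averages the distance over the matrix entries, and D_chance is built from the row totals with an n(n-1) normalization. Whether this class normalizes in exactly that way is not stated here. `AlphaSketch` and `IndexDistance` are hypothetical names, and the metric works on label indices rather than Value objects.

```java
// Hypothetical illustration; not the CoincidenceMatrixM implementation.
class AlphaSketch {

    // Stand-in for a distance metric, defined on label indices (assumption).
    interface IndexDistance {
        double distance(int c, int k);
    }

    // Boolean metric: 0 iff the labels are equal, otherwise 1.
    static final IndexDistance BOOLEAN = (c, k) -> c == k ? 0.0 : 1.0;

    static double alpha(double[][] o, IndexDistance d) {
        int size = o.length;
        double[] marginal = new double[size]; // row totals n_c
        double n = 0.0;                       // total number of values
        for (int c = 0; c < size; c++) {
            for (int k = 0; k < size; k++) {
                marginal[c] += o[c][k];
            }
            n += marginal[c];
        }
        double observed = 0.0;
        double chance = 0.0;
        for (int c = 0; c < size; c++) {
            for (int k = 0; k < size; k++) {
                observed += o[c][k] * d.distance(c, k);
                chance += marginal[c] * marginal[k] * d.distance(c, k);
            }
        }
        observed /= n;
        chance /= n * (n - 1.0);
        // Note: if only one label ever occurs, chance is 0 and the result is NaN.
        return 1.0 - observed / chance;
    }

    public static void main(String[] args) {
        double[][] o = {{10, 4}, {4, 2}}; // 10 items, 2 labels, 4 disagreements
        System.out.println(alpha(o, BOOLEAN));
    }
}
```

For the matrix in `main`, D_observed is 8/20 and D_chance is 168/380, so alpha is 1 - 76/84, roughly 0.095. A weighted variant only needs a different `IndexDistance` that returns values between 0 and 1 for partially similar labels.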

alpha

public double alpha()

kappaKrippendorff

public double kappaKrippendorff()
Returns the same value as alpha when using the standard Boolean Metric. Computes kappa = (pa - pe) / (1 - pe), with pa = 1 - Dobserved() and pe = 1 - Dchance().

Returns:
kappa according to Krippendorff's alpha method using the standard Boolean Metric
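
The equality with alpha follows by substitution. Writing D_o for Dobserved() and D_e for Dchance() (shorthand introduced here for readability):

```latex
\[
\kappa = \frac{p_a - p_e}{1 - p_e}
       = \frac{(1 - D_o) - (1 - D_e)}{1 - (1 - D_e)}
       = \frac{D_e - D_o}{D_e}
       = 1 - \frac{D_o}{D_e}
       = \alpha
\]
```

The identity holds for any distance metric as long as D_e is nonzero.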

setDistanceMetrics

public void setDistanceMetrics(DistanceMetric dist)
set the distance metric used


showDistanceMatrix

public void showDistanceMatrix(java.lang.String outFile)

main

public static void main(java.lang.String[] args)