These are the files for the multi-label version of C4.5, as described in my PhD thesis:

Clare, A. (2003) Machine learning and data mining for yeast functional genomics. PhD thesis. University of Wales Aberystwyth.
http://users.aber.ac.uk/afc/papers/AClarePhDThesis.pdf (1Mb)

They should be untarred over the top of Ross Quinlan's C4.5 Release 8, which can be downloaded from
http://www.cse.unsw.edu.au/~quinlan/c4.5r8.tar.gz

To install, run:

    tar xvfz c4.5r8.tar.gz
    tar xvfz MultiLabelC4.5.tar.gz
    cd R8/Src
    make all

.data and .names files follow the normal C4.5 conventions, except that multiple class labels are allowed per data item and a class hierarchy can be provided. Multiple classes for a data item should be separated by the '@' character. For example:

    3.4,2.5,sunny,class1@class2

Example data is given in testmulti.data and testmulti.names.

The windowing and attribute subset options of c4.5 do not work - I haven't updated this part of the code. I developed this only for use with c4.5 and c4.5rules - any other executables are untested and are used at your own risk.

Confusion matrices are still produced, but should really be ignored - how can confusion matrices of multi-label problems be represented truthfully? If a classification is wrong, which column should it be reported under?

Email me if you have any problems: afc@aber.ac.uk
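
As an illustration of the multi-label class field described above, here is a minimal Python sketch that splits one .data line into its attribute values and its '@'-separated class labels. It is not part of this distribution and is not how the C code reads its input; the function name parse_multilabel_line is hypothetical, and the sketch assumes plain comma-separated fields with the class field last, ignoring C4.5 comments and escaped characters.

```python
def parse_multilabel_line(line, label_sep="@"):
    """Split one .data line into (attribute values, class labels)."""
    text = line.strip()
    # Assumption: a single trailing '.' terminates the case and is not
    # part of the last class label.
    if text.endswith("."):
        text = text[:-1]
    fields = [field.strip() for field in text.split(",")]
    # The class field is the last comma-separated field on the line.
    attributes, class_field = fields[:-1], fields[-1]
    # Multiple class labels are separated by '@'.
    labels = [label.strip() for label in class_field.split(label_sep) if label.strip()]
    return attributes, labels


attributes, labels = parse_multilabel_line("3.4,2.5,sunny,class1@class2")
print(attributes)  # ['3.4', '2.5', 'sunny']
print(labels)      # ['class1', 'class2']
```

A sketch like this may be useful for checking a data file before running c4.5, for example to count how many items carry more than one label.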