Rule 10: (11, lift 89.0)
    amino_acid_pair_ratio_el > 18
    amino_acid_pair_ratio_wv > 4.6
    -> class 'function4(IS6110 )' [0.923]

Evaluation on proper test data (811 items):
tb95 - 4,2,2,0 Other IS elements, Repeated sequences, and Phage REP13E12 family REP13E12 family 'REP' "null"
tb1160 - 2,1,5,0 Macromolecule metabolism Synthesis and modification of macromolecules DNA replication, repair, recombination and restriction/modification DNA replication, repair, recombination and restriction/modification 'mutT2' "MutT homologue"
tb1370 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb1756 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb1757 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb1763 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb1764 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb2105 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb2167 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb2278 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb2371 - 4,3,1,1 Other PE and PPE families PE family PE subfamily 'PE' "null"
tb2648 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb2815 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb3167 - 1,10,1,0 Small-molecule metabolism Broad regulatory functions Repressors/activators Repressors/activators 'null' "null"
tb3186 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb3380 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb3381 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
Proper test Accuracy: 13/17 (76.47%)

Application to new data (498 items):
tb1466 - 5,0,0,0 Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals 'null' "null"
tb2260 - 5,0,0,0 Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals 'null' "null"
tb2367 - 5,0,0,0 Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals 'null' "null"
Total: 3

Evaluation on training data (1060 items):
tb795 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb796 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb1369 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb2106 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb2168 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb2279 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb2354 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb2355 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb2479 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb2480 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb2649 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
Training Accuracy: 11/11 (100.00%)

Evaluation on test data (531 items):
tb2814 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb3184 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb3185 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb3187 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb3325 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb3326 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb3474 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
tb3475 4,2,1,1 Other IS elements, Repeated sequences, and Phage IS elements IS6110 'IS6110' "null"
Test Accuracy: 8/8 (100.00%)

Application to new data (1023 items):
tb2468 - 6,0,0,0 Unknowns Unknowns Unknowns Unknowns 'null' "null"
tb2695 - 5,0,0,0 Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals 'null' "null"
tb2808 - 6,0,0,0 Unknowns Unknowns Unknowns Unknowns 'null' "null"
tb2901 - 6,0,0,0 Unknowns Unknowns Unknowns Unknowns 'null' "null"
tb2923 - 5,0,0,0 Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals 'null' "null"
tb3165 - 6,0,0,0 Unknowns Unknowns Unknowns Unknowns 'null' "null"
tb3338 - 5,0,0,0 Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals 'null' "null"
tb3908 - 5,0,0,0 Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals 'null' "null"
Total: 8
------------------
Rule 15: (30, lift 31.1)
    amino_acid_ratio_x > 58.9
    [amino_acid_ratio_rule(p,3)] = 0
    -> class 'function4(PE_PGRS subfamily )' [0.969]

Evaluation on proper test data (811 items):
tb109 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb278 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb742 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb872 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1396 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1803 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb2162 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb2487 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb2591 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb2615 - 4,3,1,1 Other PE and PPE families PE family PE subfamily 'PE' "null"
tb2634 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb3344 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb3388 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb3514 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb3595 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb3652 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb3658 - 2,3,5,0 Macromolecule metabolism Cell envelope Other membrane proteins Other membrane proteins 'null' "null"
Proper test Accuracy: 15/17 (88.24%)

Application to new data (498 items):
tb426 - 6,0,0,0 Unknowns Unknowns Unknowns Unknowns 'null' "null"
tb3706 - 6,0,0,0 Unknowns Unknowns Unknowns Unknowns 'null' "null"
Total: 2

Evaluation on training data (1060 items):
tb124 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb279 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb297 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb532 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb578 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb746 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb747 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb833 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb834 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb977 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1067 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1068 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1087 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1091 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1243 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1325 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1441 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1450 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1452 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1468 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1651 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1759 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "wag22, member of the PGRS"
tb1768 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1818 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1840 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb2098 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb2099 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb2126 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb2396 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb2490 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
Training Accuracy: 30/30 (100.00%)

Evaluation on test data (531 items):
tb2741 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb2853 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb3345 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb3367 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb3507 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb3508 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb3511 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb3512 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb3590 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb3653 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
Test Accuracy: 10/10 (100.00%)

Application to new data (1023 items):
tb1158 - 5,0,0,0 Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals 'null' "null"
tb1435 - 5,0,0,0 Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals 'null' "null"
tb3654 - 6,0,0,0 Unknowns Unknowns Unknowns Unknowns 'null' "null"
Total: 3
------------------
Rule 16: (9, lift 29.2)
    amino_acid_pair_ratio_vs > 24.7
    [hom( A ),keyword( A ,alternative_splicing)] = 1
    -> class 'function4(PE_PGRS subfamily )' [0.909]

Evaluation on proper test data (811 items):
tb2487 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
Proper test Accuracy: 1/1 (100.00%)

Application to new data (498 items):
tb1057 - 5,0,0,0 Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals 'null' "null"
tb3413 - 6,0,0,0 Unknowns Unknowns Unknowns Unknowns 'null' "null"
Total: 2

Evaluation on training data (1060 items):
tb124 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb747 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb832 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb978 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb980 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1067 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1087 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1091 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
tb1243 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
Training Accuracy: 9/9 (100.00%)

Evaluation on test data (531 items):
tb3221 - 1,8,1,0 Small-molecule metabolism Lipid Biosynthesis Synthesis of fatty and mycolic acids Synthesis of fatty and mycolic acids 'null' "null"
tb3511 4,3,1,2 Other PE and PPE families PE family PE_PGRS subfamily 'PE_PGRS' "null"
Test Accuracy: 1/2 (50.00%)

Application to new data (1023 items):
tb378 - 5,0,0,0 Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals Conserved hypotheticals 'null' "null"
Total: 1
------------------
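The figures in the rule headers and accuracy lines above can be cross-checked with a few lines of arithmetic. The sketch below assumes the bracketed value after each class label is a Laplace-corrected confidence, (n - m + 1) / (n + 2), where n is the number of training items the rule covers and m the number it misclassifies. This convention is typical of C5.0-style rule learners, but the listing itself does not state it. The function names `laplace_confidence` and `accuracy_pct` are hypothetical helpers introduced only for this check.

```python
def laplace_confidence(covered: int, errors: int = 0) -> float:
    """Laplace-corrected confidence (n - m + 1) / (n + 2).

    Assumption: the listing's bracketed value follows this convention.
    """
    return (covered - errors + 1) / (covered + 2)


def accuracy_pct(correct: int, total: int) -> float:
    """Plain accuracy as a percentage, as in the 'Accuracy: a/b (x%)' lines."""
    return 100.0 * correct / total


# (rule number, training items covered, bracketed value printed in the header)
headers = [(10, 11, 0.923), (15, 30, 0.969), (16, 9, 0.909)]
for rule, covered, printed in headers:
    estimate = laplace_confidence(covered)  # training errors are 0 for all three rules
    print(f"Rule {rule}: Laplace estimate {estimate:.3f}, printed {printed}")

# (rule number, data split, correct, total) copied from the accuracy lines
splits = [
    (10, "proper test", 13, 17),
    (15, "proper test", 15, 17),
    (16, "test", 1, 2),
]
for rule, split, correct, total in splits:
    print(f"Rule {rule} {split}: {correct}/{total} = {accuracy_pct(correct, total):.2f}%")
```

Under that assumption the three estimates come out as 0.923, 0.969 and 0.909, matching the headers, and the recomputed percentages (76.47%, 88.24%, 50.00%) agree with the accuracy lines.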