Rule 1: (23, lift 37.6)
    amino_acid_pair_ratio_nm <= 3
    [hom( A ),classification( A ,ascaridida)] = 0
    [hom( A ),species( A ,leishmania_tarentolae__sauroleishmania_tarentolae_),mol_wt_gt( A ,55220)] = 1
    -> class 'ABC superfamily (atp_bind)' [0.960]

Evaluation on test data (712 items):
ecoli924 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'b0949' 'ABC superfamily (atp_bind) paral putative ATP-binding component of transport system'
ecoli3275 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'yheS' 'ABC superfamily (atp_bind) paral putative ATP-binding component of transport system'
ecoli805 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'b0829' 'ABC superfamily (atp_bind) paral putative ATP-binding component of transport system'
ecoli785 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'glnQ' 'ABC superfamily (atp_bind) ATP-binding component of glutamine high-affinity ABC transport system(2nd module)'
ecoli2622 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'proV' 'ABC superfamily (atp_bind) ATP-binding component of transport system for glycine betaine and proline(1st module)'
ecoli3125 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'b3195' 'ABC superfamily (atp_bind) paral putative ATP-binding component of transport system'
ecoli4248 - 1,6,2 Cell processes Adaptation Osmotic adaptation 'mdoB' 'phosphoglycerol transferase I add phosphoglycerols to OPG backbone'
ecoli1089 - 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'b1116' 'ABC superfamily (membrane) paral putative membrane component of ABC transport system'
ecoli3201 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'yhdZ' 'ABC superfamily (atp_bind) paral putative ATP-binding component of transport system'
ecoli2264 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'hisP' 'ABC superfamily (atp_bind) ATP-binding component of histidine ABC transport system'
ecoli578 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'fepC' 'ABC superfamily (atp_bind) ATP-binding component of ferric enterobactin transport(2nd module)'
ecoli1261 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'sapF' 'ABC superfamily (atp_bind) ATP-binding protein of peptide ABC transport system(2nd module)'
ecoli358 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'b0366' 'ABC superfamily (atp_bind) ATP-binding component of a taurine transport system'
ecoli3981 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'yjcW' 'ABC superfamily (atp_bind) ATP-binding component of allose transport system (2nd module)'
ecoli3805 - 1,5,37 Cell processes Transport/binding proteins Sugar-specific PTS system 'frvB' 'PTS family fructose-like enzyme IIBC component(2nd module)'
ecoli4277 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'yjjK' 'ABC superfamily (atp_bind) paral putative ATP-binding component of transport system (2nd module)'
ecoli1826 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'b1858' 'ABC superfamily (atp_bind) ATP-binding component of a high affinity Zn transport system(1st module)'
ecoli840 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'artP' 'ABC superfamily (atp&memb) ATP-binding component of 3rd arginine transport system(2nd module)'
ecoli2373 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'cysA' 'ABC superfamily (atp_bind) ATP-binding component of sulfate permease A protein of ABC transport; chromate resistance (1st module)'
ecoli3131 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'yhbG' 'ABC superfamily (atp_bind) paral putative ATP-binding component of transport system'
ecoli127 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'yadG' 'ABC superfamily (atp_bind) ATP-binding component of transport protein (1st module)'
ecoli1455 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'b1484' 'ABC superfamily (atp_bind) paral putative ATP-binding component of transport system'
ecoli3952 - 2,2,3 Macromolecule metabolism Macromolecule synthesis, modification DNA - replication, repair, restraction/modification 'uvrA' 'excision nuclease subunit (3rd module prob. DNA binding)'
ecoli770 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'b0794' 'ABC superfamily (atp_bind) paral putative ATP-binding component of transport system'
ecoli254 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'yagC' 'ABC superfamily (atp_bind) paral putative ATP-binding component of transport system'
Test Accuracy: 21/25 (84.00%)
Test Frequency class 'ABC superfamily (atp_bind)': 27/712 (3.79%)
Test Significance: dev(21.00) ; prob(1.565342E-26)

Application to new data (2167 items):
ecoli4153 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yjgR' 'orf'
ecoli2858 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yggA' 'orf'
ecoli1484 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b1513' 'paral putative ATP-binding component of transport system (2nd module)'
ecoli2863 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'yggC' 'paral putative kinase'
ecoli2498 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b2547' 'paral putative ATP-binding component of transport system'
ecoli2769 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b2832' 'putative transport protein'
ecoli1454 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b1483' 'paral putative ATP-binding component of transport system'
Frequency rule on new data: 7/2167 (0.32%)

Evaluation on training data (939 items):
ecoli1099 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'potA' 'ABC superfamily (atp_bind) ATP-binding component of spermidine/putrescine ABC transport system (1st module)'
ecoli3489 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'xylG' 'ABC superfamily (atp_bind) ATP-binding component of D-xylose ABC transport system(2nd module)'
ecoli3991 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'phnK' 'ABC superfamily (atp_bind) ATP-binding component of phosphonate ABC transportbelieved to be part of carbon-phosphorus (C-P) lyase in phosphonate metabolism'
ecoli3386 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'ftsE' 'ABC superfamily (atp_bind) ATP-binding component of a membrane-associated complex involved in cell division'
ecoli66 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'yabJ' 'ABC superfamily (atp_bind) ATP-binding component of thiamine ABC transport system'
ecoli3403 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'nikE' 'ABC superfamily (stp_bind) ATP-binding component of nickel ABC transport system probably couples energy to transport system'
ecoli1650 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'b1682' 'ABC superfamily (atp_bind) paral putative ATP-binding component of transport system'
ecoli3373 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'ugpC' 'ABC superfamily (atp_bind) ATP-binding component of sn-glycerol 3-phosphate ABC transport system (1st module)'
ecoli3670 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'rbsA' 'ABC superfamily (atp_bind) ATP-binding component of d-ribose high-affinity transport system (2nd module)'
ecoli1090 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'b1117' 'ABC superfamily (atp_bind) paral putative ATP-binding component of transport system'
ecoli908 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'ycbE' 'paral putative ATP-binding component of transport system'
ecoli1262 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'sapD' 'ABC superfamily (atp_bind) ATP-binding protein of peptide transport system(2nd module) affects potassium transport;'
ecoli3463 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'dppF' 'ABC superfamily (atp_bind) ATP-binding component of a dipeptide transport system(1st module)'
ecoli2139 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'yejF' 'ABC superfamily (atp_bind) paral putative ATP-binding component of transport system'
ecoli3378 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'livG' 'ABC superfamily (atp_bind) ATP-binding component of high-affinity branched-chain amino acid ABC transport system'
ecoli1218 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'oppF' 'ABC superfamily (atp_bind) ATP-binding protein of oligopeptide ABC transport system'
ecoli2159 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'ccmA' 'ABC superfamily (atp_bind) ATP-binding component of heme exporter A heme exporter protein A cytochrome c-type biogenesis protein'
ecoli2108 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'mglA' 'ABC superfamily (atp_bind) ATP-binding component of methyl-galactoside transport and galactose taxis (2nd module)'
ecoli3402 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'nikD' 'ABC superfamily (atp_bind) ATP-binding component of nickel ABC transport system probably couples energy to transport system(2nd module)'
ecoli1882 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'b1917' 'ABC superfamily (atp_bind) paral putative ATP-binding component of transport system'
ecoli481 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'b0490' 'ABC superfamily (atp_bind) paral putative ATP-binding component of transport system'
ecoli736 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'modF' 'ABC superfamily (atp_bind) ATP-binding component of molybdenum transport system (2nd module)'
ecoli440 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'mdlA' 'ATP-binding component of a transport system (2nd module)'
Training Accuracy: 23/23 (100.00%)
Training Frequency class 'ABC superfamily (atp_bind)': 24/939 (2.56%)
Training Significance: dev(29.61) ; prob(2.363469E-37)

Evaluation on validation data (471 items):
ecoli441 - 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'mdlB' 'ABC superfamily (atp&memb) paral putative ATP-binding component of transport system'
ecoli4119 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'ytfS' 'ABC superfamily (atp_bind) putative ATP-binding component of a transport system'
ecoli3990 - 3,3,13 Metabolism of small molecules Central intermediary metabolism Phosphorus compounds 'phnL' 'ABC superfamily (atp_bind) ATP-binding component of phosphonate ABC transport believed to be part of carbon-phosphorus (C-P) lyase in phosphonate metabolism'
ecoli1074 - 1,5,37 Cell processes Transport/binding proteins Sugar-specific PTS system 'ptsG' 'Sugar Specific PTS family glucose-specific IIBCcomponent (3rd module hydrophilic second phosphorylation domain) mutant form transports D-ribose'
ecoli2271 - 5,1,1 Extrachromosomal Laterally acquirred elements Colicin-related functions 'cvpA' 'membrane protein required for colicin V production'
ecoli642 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'gltL' 'ABC superfamily (atp_bind) ATP-binding protein of glutamate/aspartate transport system'
ecoli3377 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'livF' 'ABC superfamily (atp_bind) ATP-binding component of leucine ABC transport system'
ecoli1677 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'btuD' 'ABC superfamily (atp_bind) ATP-binding component of vitamin B12 ABC transport system(2nd module)'
ecoli199 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'abc' 'ABC superfamily (atp_bind) ATP-binding component of ABC transport system(1st module)'
ecoli1868 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'araG' 'ABC superfamily (atp_bind) ATP-binding component of high-affinity l-arabinose transport system (2nd module)'
ecoli741 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'modC' 'ABC superfamily (atp_bind) ATP-binding component of molybdate ABC transport (1st module)'
ecoli1724 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'b1756' 'ABC superfamily (atp_bind) paral putative ATP-binding component of transport system'
ecoli831 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'potG' 'ABC superfamily (atp_bind) ATP-binding component of putrescine ABC transport system(1st module)'
ecoli4000 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'phnC' 'ABC superfamily (atp_bind) ATP-binding component of phosphonate ABC transport system'
ecoli1412 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'b1441' 'ABC superfamily (atp_bind) paral putative ATP-binding component of transport system'
ecoli2169 - 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'yojI' 'ABC superfamily (atp&memb) paral putative ATP-binding component of transport system'
ecoli727 - 1,5,23 Cell processes Transport/binding proteins Mechanism not stated 'pnuC' 'required for NMN transport'
ecoli1217 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'oppD' 'ABC superfamily (atp_bind) ATP-binding protein of oligopeptide ABC transport system'
Validation Accuracy: 12/18 (66.67%)
Validation Frequency class 'ABC superfamily (atp_bind)': 17/471 (3.61%)
Validation Significance: dev(14.34) ; prob(7.278090E-14)

------------------

Rule 2: (14/3, lift 20.1)
    amino_acid_pair_ratio_cg <= 2
    amino_acid_pair_ratio_ci <= 1.7
    amino_acid_pair_ratio_sw <= 1.8
    [ss_beta( A ,gt,3),nss_beta( A , B ,gt,1)] = 0
    [hom( A ),keyword( A ,transmembrane),classification( A ,rhodobacter)] = 1
    -> class 'ABC superfamily (membrane)' [0.750]

Evaluation on test data (712 items):
ecoli3798 - 3,5,2 Metabolism of small molecules Energy metabolism, carbon Anaerobic respiration 'fdoI' 'formate dehydrogenase cytochrome B556 (FDO) subunit'
ecoli1485 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'b1514' 'ABC superfamily (membrane)paral putative membrane component of ABC transport system'
ecoli3374 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'ugpE' 'ABC superfamily (membrane)sn-glycerol 3-phosphate transport system integral membrane protein(1st module)'
ecoli4120 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'ytfT' 'ABC superfamily (membrane)paral putative membrane component'
ecoli808 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'b0832' 'ABC superfamily (membrane) paral putative membrane component of transport system'
ecoli3980 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'yjcV' 'ABC superfamily (membrane) membrane component of allose ABC transport system(1st module)'
ecoli644 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'gltJ' 'ABC superfamily (membrane) glutamate/aspartate transport system permease'
ecoli1457 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'b1486' 'ABC superfamily (membrane) paral putative membrane component of transport system'
ecoli3380 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'livH' 'ABC superfamily (membrane) membrane component of high-affinity branched-chain amino acid ABC transport system(2nd module)'
ecoli1827 - 1,5,23 Cell processes Transport/binding proteins Mechanism not stated 'yebI' 'inner memrane component of a high affininty Zn transport system'
ecoli786 - 1,5,23 Cell processes Transport/binding proteins Mechanism not stated 'glnP' 'glutamine high-affinity transport system; membrane component(1st module)'
ecoli1633 - 1,5,20 Cell processes Transport/binding proteins MATE family 'ydhE' 'MATE family of transport protein(2nd module)'
ecoli1097 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'potC' 'ABC superfamily (membrane) membrane component of spermidine/putrescine ABC transport system(2nd module)'
ecoli3604 - 1,5,37 Cell processes Transport/binding proteins Sugar-specific PTS system 'glvC' 'PTS family arbutin-like IIC component'
ecoli198 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'yaeE' 'ABC superfamily (membrane) paral putative membrane component of transport system'
ecoli4155 - 1,5,19 Cell processes Transport/binding proteins GntP family 'yjgT' 'GntP family l-idonate transporter (2nd module)'
ecoli2137 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'yejB' 'ABC superfamily (membrane) paral putative membrane component of transport system'
ecoli420 - 3,5,1 Metabolism of small molecules Energy metabolism, carbon Aerobic respiration 'cyoE' 'protohaeme IX farnesyltransferase (haeme O biosynthesis)'
ecoli3005 - 3,4,3 Metabolism of small molecules Degradation of small molecules Carbon compounds 'air' 'aerotaxis sensor receptor flavoprotein(2nd module)'
Test Accuracy: 11/19 (57.89%)
Test Frequency class 'ABC superfamily (membrane)': 26/712 (3.65%)
Test Significance: dev(12.61) ; prob(8.864562E-12)

Application to new data (2167 items):
ecoli3445 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yhjD' 'orf'
ecoli3250 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'hofF' 'orf'
ecoli2625 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b2680' 'orf'
ecoli2111 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yeiB' 'orf'
ecoli3028 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b3095' 'orf'
ecoli1562 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b1592' 'orf'
ecoli763 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b0787' 'orf'
ecoli155 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'yadQ' 'putative channel transporter'
ecoli2095 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yohD' 'orf'
ecoli3357 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b3434' 'orf'
ecoli3739 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'yigM' 'paral putative transport protein (2nd module)'
ecoli2943 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yghB' 'orf'
ecoli2585 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b2639' 'putative pump protein'
ecoli2558 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b2611' 'orf'
ecoli945 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'yccA' 'putative carrier/transport protein(2nd module)'
ecoli3083 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yraQ' 'orf'
ecoli2858 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yggA' 'orf'
ecoli789 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'ybiF' 'orf'
ecoli320 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b0328' 'paral putative transport protein'
ecoli1726 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b1758' 'putative cytochrome oxidase'
ecoli2079 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b2120' 'orf'
ecoli65 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yabI' 'orf(1st module)'
ecoli2087 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'yehW' 'paral putative membrane component of transport system'
ecoli3568 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yicG' 'orf'
ecoli1030 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b1057' 'putative cytochrome'
ecoli3869 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yijC' 'orf'
ecoli1571 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b1601' 'paral putative transport protein (2nd module)'
ecoli2349 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b2392' 'putative transport system permease(1st module)'
ecoli762 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b0786' 'orf'
ecoli2275 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'dedA' 'orf'
ecoli255 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b0263' 'paral putative membrane component of transport system'
Frequency rule on new data: 31/2167 (1.43%)

Evaluation on training data (939 items):
ecoli1456 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'b1485' 'ABC superfamily (membrane) paral putative membrane component of transport system'
ecoli643 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'gltK' 'ABC superfamily (membrane) glutamate/aspartate transport (1st module)'
ecoli1705 - 1,5,37 Cell processes Transport/binding proteins Sugar-specific PTS system 'celB' 'PTS family sugar specific enzyme II for cellobiose arbutin and salicin'
ecoli3379 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'livM' 'ABC superfamily (membrane) membrane component of high-affinity branched-chain amino acid ABC transport system (2nd module)'
ecoli3648 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'pstC' 'ABC superfamily (membrane) membrane component of high-affinity phosphate-specific ABC transport system (2nd module)'
ecoli833 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'potI' 'ABC superfamily (membrane) membrane component of putrescine ABC transport system(2nd module)'
ecoli1570 - 1,5,34 Cell processes Transport/binding proteins SMR family 'b1600' 'SMR family of transport protein'
ecoli1098 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'potB' 'ABC superfamily (membrane) membrane component of spermidine/putrescine ABC transport system(2nd module)'
ecoli3401 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'nikC' 'ABC superfamily (membrane) membrane component in nickel transport system probably forms heterodimeric pore with NikB(1st module)'
ecoli1413 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'b1442' 'ABC superfamily (membrane) paral putative membrane component of transport system'
ecoli3934 - 3,2,8 Metabolism of small molecules Biosynthesis of cofactors, carriers Menaquinone, ubiquinone 'ubiA' 'p-hydroxybenzoate: octaprenyltransferase(1st module)'
ecoli1215 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'oppB' 'ABC superfamily (membrane) membrane component of oligopeptide ABC transport system(2nd module)'
ecoli2107 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'mglC' 'ABC superfamily (membrane) membrane component of methyl-galactoside ABC transport system and galactose taxis(1st module)'
ecoli359 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'b0367' 'ABC superfamily (membrane) membrane component of taurine ABC transport system'
Training Accuracy: 11/14 (78.57%)
Training Frequency class 'ABC superfamily (membrane)': 35/939 (3.73%)
Training Significance: dev(14.78) ; prob(6.327689E-14)

Evaluation on validation data (471 items):
ecoli837 - 1,5,23 Cell processes Transport/binding proteins Mechanism not stated 'artM' 'arginine 3rd transport system permease protein'
ecoli1883 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'yecC' 'ABC superfamily (membrane) paral putative membrane component of transport system'
ecoli2237 - 3,5,1 Metabolism of small molecules Energy metabolism, carbon Aerobic respiration 'nuoK' 'NADH dehydrogenase I chain K'
ecoli3375 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'ugpA' 'ABC superfamily (membrane) sn-glycerol 3-phosphate integral membrane protein ABC transport system'
ecoli2089 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'yehY' 'ABC superfamily (membrane) paral putative membrane component of transport system'
ecoli2158 - 1,2,1 Cell processes Chromosome replication Chromosome replication 'ccmB' 'heme exporter protein B cytochrome c-type biogenesis protein'
ecoli1282 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'b1311' 'ABC superfamily (membrane) paral putative membrane component of transport system'
ecoli3815 - 1,5,37 Cell processes Transport/binding proteins Sugar-specific PTS system 'kdgT' '2-keto-3-deoxy-D-gluconate transport system'
ecoli579 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'fepG' 'ABC superfamily (membrane) ferric enterobactin transport protein'
ecoli3659 - 1,5,24 Cell processes Transport/binding proteins Membrane-bound ATP synthase 'atpB' 'membrane-bound ATP synthase F0 sector subunit a'
ecoli3490 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'xylH' 'ABC superfamily (membrane)d-xylose transport permease (2nd module might interact with atp hydrolysing subunit )'
ecoli3465 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'dppC' 'ABC superfamily (membrane) membrane component of dipeptide ABC transport system; permease protein 2 (2nd module)'
ecoli1189 - 1,5,23 Cell processes Transport/binding proteins Mechanism not stated 'chaA' 'sodium-calcium/proton antiporter'
Validation Accuracy: 7/13 (53.85%)
Validation Frequency class 'ABC superfamily (membrane)': 19/471 (4.03%)
Validation Significance: dev(9.13) ; prob(2.329975E-07)

------------------

Rule 39: (4, lift 60.2)
    amino_acid_pairs_kt > 2
    [ss_coil( A ,gt,1),nss_coil( A , B ,lteq,3),nss_coil( B , C ,gt,1),nss_coil( C , D ,gt,10),nss_coil( D , E ,lteq,10)] = 0
    [hom( A ),e_val_gt( A ,2e-37),species( A ,streptococcus_agalactiae)] = 1
    [hom( A ),species( A ,streptomyces_lividans),mol_wt_lteq( A ,55220)] = 0
    [hom( A ),species( A ,leishmania_tarentolae__sauroleishmania_tarentolae_),mol_wt_gt( A ,55220)] = 0
    [hom( A ),species( A ,agrobacterium_tumefaciens),mol_wt_lteq( A ,55220)] = 0
    -> class 'Degradation of DNA' [0.833]

Evaluation on test data (712 items):
ecoli115 - 3,5,7 Metabolism of small molecules Energy metabolism, carbon Pyruvate dehydrogenase 'aceF' 'pyruvate dehydrogenase (dihydrolipoyltransacetylase component)(2nd module)'
Test Accuracy: 0/1 (0.00%)
Test Frequency class 'Degradation of DNA': 8/712 (1.12%)
Test Significance: dev(-0.11) ; prob(1.000000E+00)

Application to new data (2167 items):
ecoli53 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'surA' 'survival protein(1st module)'
ecoli433 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b0441' 'orf'
Frequency rule on new data: 2/2167 (0.09%)

Evaluation on training data (939 items):
ecoli4235 2,1,1 Macromolecule metabolism Macromolecule degradation Degradation of DNA 'mcrB' 'component of 5-methylcytosine-specific restriction enzyme McrBC a truncated product of mcrB McrB(S) regulates McrBC activity'
ecoli755 2,1,1 Macromolecule metabolism Macromolecule degradation Degradation of DNA 'uvrB' 'DNA repair; excision nuclease subunit B(2nd module)'
ecoli389 2,1,1 Macromolecule metabolism Macromolecule degradation Degradation of DNA 'sbcC' 'ATP-dependent dsDNA exonuclease (2nd module)'
ecoli2756 2,1,1 Macromolecule metabolism Macromolecule degradation Degradation of DNA 'recD' 'DNA helicase ATP-dependent dsDNA/ssDNA exonuclease V subunit ssDNA endonuclease chi sequence recognition'
Training Accuracy: 4/4 (100.00%)
Training Frequency class 'Degradation of DNA': 13/939 (1.38%)
Training Significance: dev(16.88) ; prob(3.673762E-08)

Evaluation on validation data (471 items):
ecoli4239 2,1,1 Macromolecule metabolism Macromolecule degradation Degradation of DNA 'hsdR' 'host restriction; endonuclease R (2nd module)'
Validation Accuracy: 1/1 (100.00%)
Validation Frequency class 'Degradation of DNA': 4/471 (0.85%)
Validation Significance: dev(10.81) ; prob(8.492569E-03)

------------------

Rule 53: (3, lift 187.8)
    [ss_beta( A ,lteq,830),nss_beta( A , B ,lteq,6),nss_beta( B , C ,gt,10)] = 1
    [hom( A ),species( A ,pseudomonas_putida)] = 1
    [hom( A ),e_val_gt( A ,6e-14),classification( A ,aculeata)] = 1
    [hom( A ),e_val_lteq( A ,3e-06),species( A ,streptococcus_mutans)] = 1
    -> class 'GPH family' [0.800]

Evaluation on test data (712 items):
ecoli3409 - 1,5,1 Cell processes Transport/binding proteins ABC superfamily (atp_bind) 'b3486' 'ABC superfamily (membrance) paral putative membrane component of transport system (3rd module)'
ecoli3579 1,5,16 Cell processes Transport/binding proteins GPH family 'yicJ' 'GPH family paral putative transport protein'
ecoli2114 - 4,1,3 Structural elements Cell envelop Outer membrane constituents 'cirA' 'outer membrane receptor for iron-regulated colicin I receptor; porin; requires tonB gene product(1st module)'
Test Accuracy: 1/3 (33.33%)
Test Frequency class 'GPH family': 1/712 (0.14%)
Test Significance: dev(15.35) ; prob(4.207568E-03)

Application to new data (2167 items):
ecoli262 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'yagG' 'paral putative transport protein'
ecoli2908 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b2974' 'putative endoglucanase'
Frequency rule on new data: 2/2167 (0.09%)

Evaluation on training data (939 items):
ecoli3783 1,5,16 Cell processes Transport/binding proteins GPH family 'yihP' 'GPH family paral putative transport protein'
ecoli3782 1,5,16 Cell processes Transport/binding proteins GPH family 'b3876' 'GPH family paral putative transport protein (2nd module)'
ecoli1586 1,5,16 Cell processes Transport/binding proteins GPH family 'uidB' 'GPH family glucuronide permease'
Training Accuracy: 3/3 (100.00%)
Training Frequency class 'GPH family': 4/939 (0.43%)
Training Significance: dev(26.48) ; prob(7.730066E-08)

Evaluation on validation data (471 items):
ecoli4014 1,5,16 Cell processes Transport/binding proteins GPH family 'melB' 'GPH family melibiose permease II'
Validation Accuracy: 1/1 (100.00%)
Validation Frequency class 'GPH family': 1/471 (0.21%)
Validation Significance: dev(21.68) ; prob(2.123142E-03)

------------------

Rule 56: (5, lift 26.8)
    amino_acid_pairs_vx <= 0
    [hom( A ),e_val_lteq( A ,2e-37),classification( A ,streptomyces)] = 1
    [hom( A ),species( A ,leishmania_tarentolae__sauroleishmania_tarentolae_),mol_wt_gt( A ,55220)] = 0
    [hom( A ),keyword( A ,transmembrane),classification( A ,rhodobacter)] = 1
    -> class 'Global regulatory functions' [0.857]

Evaluation on test data (712 items):
ecoli2726 6,1,1 Global functions Global regulatory functions Global regulatory functions 'barA' 'sensor module of sensor-regulator activates ompr by phophorylation (2nd module potential phosphoacceptor 665-784)'
ecoli2176 - 4,1,4 Structural elements Cell envelop Surface polysaccharides & antigens 'rcsC' 'sensor for ctr capsule biosynthesis probable histidine kinase acting on rcsb (fourth module receiver ?)'
ecoli3901 - 3,5,4 Metabolism of small molecules Energy metabolism, carbon Fermentation 'hydH' 'sensor kinase for HydG hydrogenase 3 activity(2nd module)'
ecoli678 6,1,1 Global functions Global regulatory functions Global regulatory functions 'kdpD' 'sensor for high-affinity potassium transport system bifunctional enzyme catalyzing the autophosphorylation by ATP and the dephosphorylation of the corresponding response regulator KdpE(3rd module)'
ecoli4006 - 2,2,11 Macromolecule metabolism Macromolecule synthesis, modification RNA synthesis, modification, DNA transcription 'basS' 'sensor protein for basR(2nd module)'
Test Accuracy: 2/5 (40.00%)
Test Frequency class 'Global regulatory functions': 16/712 (2.25%)
Test Significance: dev(5.70) ; prob(4.826709E-03)

Application to new data (2167 items):
ecoli560 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b0570' 'paral putative sensor/kinase in regulatory system'
ecoli2419 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'yffG' 'paral putative oxidoreductase (2nd module)'
ecoli2507 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b2556' 'paral putative sensor/kinase in regulatory system'
ecoli2960 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b3026' 'paral putative sensor/kinase in regulatory system'
ecoli2337 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b2380' 'putative sensor protein(1st module)'
ecoli1932 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b1968' 'paral putative sensor/kinase in regulatory system'
ecoli609 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b0619' 'sensory kinase in two component regulatory system with DpiA(2nd module)'
Frequency rule on new data: 7/2167 (0.32%)

Evaluation on training data (939 items):
ecoli3327 6,1,1 Global functions Global regulatory functions Global regulatory functions 'envZ' 'protein histidine kinase/phosphatase sensor for ompR modulates expression of ompF and ompC (sensor)(2nd module)'
ecoli1579 6,1,1 Global functions Global regulatory functions Global regulatory functions 'rstB' 'sensor histidine protein kinase (RstA regulator)(2nd module)'
ecoli4285 6,1,1 Global functions Global regulatory functions Global regulatory functions 'creC' 'catabolite repression sensor kinase for PhoB; alternative sensor for pho regulon(2nd module)'
ecoli3817 6,1,1 Global functions Global regulatory functions Global regulatory functions 'cpxA' 'membrane sensor in 2-component cpxAR signal transduction system(2nd module)'
ecoli1102 6,1,1 Global functions Global regulatory functions Global regulatory functions 'phoQ' 'periplasmic sensor protein in two component system with PhoP ligand is Mg+(2nd module)'
Training Accuracy: 5/5 (100.00%)
Training Frequency class 'Global regulatory functions': 30/939 (3.19%)
Training Significance: dev(12.31) ; prob(3.328728E-08)

Evaluation on validation data (471 items):
ecoli392 6,1,1 Global functions Global regulatory functions Global regulatory functions 'phoR' 'positive and negative sensor protein for pho regulon (and asr gene)(2nd module)'
Validation Accuracy: 1/1 (100.00%)
Validation Frequency class 'Global regulatory functions': 8/471 (1.70%)
Validation Significance: dev(7.61) ; prob(1.698514E-02)

------------------

Rule 63: (22/1, lift 37.4)
    [ss_beta( A ,lteq,830),nss_beta( A , B ,lteq,6),nss_beta( B , C ,gt,10)] = 1
    [hom( A ),species( A ,pseudomonas_putida)] = 1
    [hom( A ),e_val_lteq( A ,3e-06),species( A ,streptococcus_mutans)] = 0
    [hom( A ),species( A ,leishmania_tarentolae__sauroleishmania_tarentolae_),mol_wt_gt( A ,55220)] = 1
    -> class 'MFS family' [0.917]

Evaluation on test data (712 items):
ecoli335 1,5,21 Cell processes Transport/binding proteins MFS family 'lacY' 'MFS family of transport protein galactoside permease (M protein)(1st module)'
ecoli2711 1,5,21 Cell processes Transport/binding proteins MFS family 'b2771' 'MFS family of transport protein (3rd module (function unknown)'
ecoli470 1,5,21 Cell processes Transport/binding proteins MFS family 'fsr' 'MFS family of transport protein fosmidomycin resistance protein(2nd module)'
ecoli818 - 1,5,23 Cell processes Transport/binding proteins Mechanism not stated 'b0842' 'transmembrane multidrug/chloramphenicol efflux transporter (2nd module)'
ecoli1499 - 1,5,2 Cell processes Transport/binding proteins ABC superfamily (membrane) 'ydeA' 'ABC superfamily (membrane)putative membrane component of ABC transport system appears to facilitate arabinose export contributes to control of arabinose regulon'
ecoli2324 1,5,21 Cell processes Transport/binding proteins MFS family 'emrY' 'MFS family of transport protein multidrug resistance protein y (2nd module)'
ecoli3587 1,5,21 Cell processes Transport/binding proteins MFS family 'uhpT' 'MFS family of transport protein hexose phosphate transport protein (2nd module)'
ecoli2538 1,5,21 Cell processes Transport/binding proteins MFS family 'kgtP' 'MFS family
of transport protein alpha-ketoglutarate permease(1st module)' ecoli3580 1,5,21 Cell processes Transport/binding proteins MFS family 'yicK' 'MFS family of transport protein two-module paral putative transport protein (2nd module)' ecoli2198 - 1,5,23 Cell processes Transport/binding proteins Mechanism not stated 'glpT' 'sn-glycerol-3-phosphate permease' ecoli1514 1,5,21 Cell processes Transport/binding proteins MFS family 'b1543' 'MFS family of transport protein (1st module)' ecoli2357 - 1,5,23 Cell processes Transport/binding proteins Mechanism not stated 'xapB' 'xanthosine permease' ecoli3594 1,5,21 Cell processes Transport/binding proteins MFS family 'emrD' 'MFS family of transport protein 2-module integral membrane pump; multidrug resistance (2nd module)' ecoli45 1,5,21 Cell processes Transport/binding proteins MFS family 'yaaU' 'MFS family transport protein' ecoli3026 1,5,21 Cell processes Transport/binding proteins MFS family 'exuT' 'MFS family of transport protein transport of hexuronates (2nd module)' ecoli2141 1,5,21 Cell processes Transport/binding proteins MFS family 'bcr' 'MFS family of transport protein bicyclomycin resistance protein; transmembrane protein (2nd module)' Test Accuracy: 12/16 (75.00%) Test Frequency class 'MFS family': 14/712 (1.97%) Test Significance: dev(21.04) ; prob(5.649800E-18) Application to new data (2167 items): ecoli3780 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yihN' 'orf' ecoli3419 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yhiP' 'orf' ecoli2626 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b2681' 'orf' ecoli1658 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b1690' 'paral putative MFS family of transport protein' ecoli4221 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'yjiJ' 'paral putative 
transport protein (2nd module)' ecoli1505 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'ydeF' 'paral putative transport protein (1st module)' ecoli1743 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b1775' 'paral putative transport protein (1st module)' ecoli2715 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b2775' 'orf' ecoli2204 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b2246' 'putative transport protein' ecoli1943 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b1981' 'shikimate and dehydroshikimate permease (2nd module)' ecoli873 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'ycaD' 'paral putative transport protein (1st module)' ecoli4245 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b4356' 'paral putative transport protein cryptic orf joins former yjiZ and yjjL' ecoli1038 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b1065' 'orf' ecoli2729 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b2789' 'paral putative membrane component of transport system (2nd module)' Frequency rule on new data: 14/2167 (0.65%) Evaluation on training data (939 items): ecoli2057 1,5,21 Cell processes Transport/binding proteins MFS family 'b2098' 'MFS family of transport protein (2nd module)' ecoli1659 1,5,21 Cell processes Transport/binding proteins MFS family 'b1691' 'MFS family of transport protein' ecoli3396 1,5,21 Cell processes Transport/binding proteins MFS family 'yhhS' 'MFS family of transport protein (2nd module)' ecoli1566 1,5,21 Cell processes Transport/binding proteins 
MFS family 'b1596' 'MFS family transport protein (2nd module)' ecoli3154 1,5,21 Cell processes Transport/binding proteins MFS family 'nanT' 'MFS family of transport protein sialic acid transporter cryptic in K12?(1st module)' ecoli2631 1,5,21 Cell processes Transport/binding proteins MFS family 'emrB' 'MFS family of transport protein multidrug resistance; probably membrane translocase(1st module)' ecoli1026 1,5,21 Cell processes Transport/binding proteins MFS family 'yceE' 'MFS family of transport protein (2nd module)' ecoli425 1,5,21 Cell processes Transport/binding proteins MFS family 'ampG' 'MFS family of transport protein ampicillin resistance (1st module)' ecoli4226 1,5,21 Cell processes Transport/binding proteins MFS family 'yjiO' 'MFS family of transport protein (1st module)' ecoli2036 1,5,21 Cell processes Transport/binding proteins MFS family 'b2077' 'MFS family of transport protein (1st module)' ecoli3675 1,5,21 Cell processes Transport/binding proteins MFS family 'yieO' 'MFS family of transport protein (1st module)' ecoli345 1,5,21 Cell processes Transport/binding proteins MFS family 'b0353' 'MFS family transport protein (2nd module function unknown)' ecoli3059 1,5,21 Cell processes Transport/binding proteins MFS family 'yhaU' 'MFS family of transport protein (D)-glucarate or galactarate transporter (1st module)' ecoli2129 1,5,21 Cell processes Transport/binding proteins MFS family 'yeiO' 'MFS family proton-coupled sugar efflux pump transport selective monosaccharides and disaccharides narrower substr. 
specificity than SetA(2nd module)' ecoli2899 1,5,21 Cell processes Transport/binding proteins MFS family 'nupG' 'MFS family of transport protein transport of nucleosides (2nd module)' ecoli388 1,5,21 Cell processes Transport/binding proteins MFS family 'araJ' 'MFS family of transport protein involved in either transport or processing of arabinose polymers (2nd module function unknown)' ecoli3287 1,5,21 Cell processes Transport/binding proteins MFS family 'yhfC' 'MFS family of transport protein paral putative transport protein' ecoli2878 1,5,21 Cell processes Transport/binding proteins MFS family 'galP' 'MFS family of transport protein galactose-proton symport of transport system (2nd module)' ecoli2741 1,5,21 Cell processes Transport/binding proteins MFS family 'fucP' 'MFS family of transport protein fucose permease(1st module)' ecoli3469 - 1,4,3 Cell processes Protection responses Drug/analog sensitivity 'yhjX' 'putative resistance protein' ecoli3583 1,5,21 Cell processes Transport/binding proteins MFS family 'yicM' 'MFS family of transport protein (1st module)' ecoli1440 1,5,21 Cell processes Transport/binding proteins MFS family 'narU' 'MFS family of transport protein nitrate sensor-transmitter protein anaerobic respiratory path(1st module)' Training Accuracy: 21/22 (95.45%) Training Frequency class 'MFS family': 23/939 (2.45%) Training Significance: dev(28.22) ; prob(3.180205E-33) Evaluation on validation data (471 items): ecoli419 1,5,21 Cell processes Transport/binding proteins MFS family 'b0427' 'MFS family transport protein' ecoli2778 1,5,21 Cell processes Transport/binding proteins MFS family 'araE' 'MFS family of transport protein low-affinity L-arabinose transport system proton symport protein(1st module)' ecoli1196 - 1,5,23 Cell processes Transport/binding proteins Mechanism not stated 'narK' 'nitrite extrusion protein(2nd module)' ecoli1627 1,5,21 Cell processes Transport/binding proteins MFS family 'b1657' 'MFS family of transport protein (2nd module)' 
ecoli2772 - 1,4,3 Cell processes Protection responses Drug/analog sensitivity 'ygeD' 'putative resistance proteins' ecoli1630 1,5,21 Cell processes Transport/binding proteins MFS family 'ydhC' 'MFS family transport protein (2nd module)' ecoli70 1,5,21 Cell processes Transport/binding proteins MFS family 'yabM' 'MFS family of transport protein proton-coupled beta-galactosidase/sugar efflux pump ? role in lactose metabolism (2nd module)' ecoli4168 1,5,21 Cell processes Transport/binding proteins MFS family 'yjhB' 'MFS family of transport protein (1st module)' ecoli1796 1,5,21 Cell processes Transport/binding proteins MFS family 'b1828' 'MFS family of transport protein (2nd module)' ecoli3925 1,5,21 Cell processes Transport/binding proteins MFS family 'xylE' 'MFS family of transport protein xylose-proton symport (2nd module)' ecoli3612 - 1,5,37 Cell processes Transport/binding proteins Sugar-specific PTS system 'yidT' 'D-galactonate transport' ecoli3588 1,5,21 Cell processes Transport/binding proteins MFS family 'uhpC' 'regulator of uhpT (1st module)' ecoli4005 1,5,21 Cell processes Transport/binding proteins MFS family 'proP' 'MFS family of transport protein low-affinity constitutive transport system; proline permease II transports proline and betaine under conditions of hyperosmolarity(2nd module)' ecoli3631 1,5,21 Cell processes Transport/binding proteins MFS family 'yidY' 'MFS family of transport protein (1st module)' ecoli1737 1,5,21 Cell processes Transport/binding proteins MFS family 'ydjE' 'MFS family of transport protein (1st module)' Validation Accuracy: 12/15 (80.00%) Validation Frequency class 'MFS family': 14/471 (2.97%) Validation Significance: dev(17.57) ; prob(1.976891E-16) ------------------ Rule 84: (7/1, lift 25.2) amino_acid_pair_ratio_qp <= 1.1 [ss_beta( A ,gt,6),nss_beta( A , B ,lteq,1),nss_beta( B , C ,lteq,10)] = 0 [ss_beta( A ,lteq,5),nss_beta( A , B ,gt,6),nss_beta( B , C ,lteq,5)] = 1 [ss_beta( A ,lteq,5),nss_beta( A , B ,gt,3),nss_beta( B , C 
,gt,10)] = 0 [ss_alpha( A ,lteq,3),nss_alpha( A , B ,lteq,830),nss_alpha( B , C ,lteq,3)] = 0 [hom( A ),classification( A ,platyhelminthes)] = 0 [hom( A ),species( A ,cyanophora_paradoxa),mol_wt_lteq( A ,32892)] = 0 [hom( A ),mol_wt_lteq( A ,77359),classification( A ,gentiananae)] = 1 -> class 'Pool multipurpose conversions of intermed metm' [0.778] Evaluation on test data (712 items): ecoli2645 - 3,4,3 Metabolism of small molecules Degradation of small molecules Carbon compounds 'srlD' 'glucitol (sorbitol)-6-phosphate dehydrogenase' ecoli2274 - 3,6,1 Metabolism of small molecules Fatty acid biosynthesis Fatty acid and phosphatidic acid biosynth 'accD' 'acetylCoA carboxylase carboxytransferase component beta subunit' ecoli378 - 3,1,16 Metabolism of small molecules Amino acid biosynthesis Proline 'proC' 'pyrroline-5-carboxylate reductase' ecoli2900 - 3,3,14 Metabolism of small molecules Central intermediary metabolism Polyamine biosynthesis 'speC' 'ornithine decarboxylase isozyme' Test Accuracy: 0/4 (0.00%) Test Frequency class 'Pool multipurpose conversions of intermed metm': 20/712 (2.81%) Test Significance: dev(-0.34) ; prob(1.000000E+00) Application to new data (2167 items): ecoli2211 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b2253' 'paral putative regulator protein' ecoli484 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b0493' 'paral putative oxidoreductase' ecoli4083 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yjfS' 'orf' Frequency rule on new data: 3/2167 (0.14%) Evaluation on training data (939 items): ecoli1648 3,3,15 Metabolism of small molecules Central intermediary metabolism Pool, multipurpose conversions of intermed. 
met'm 'b1680' 'selenocysteine lyase (in iron sulfur component of FhuF ) regulated by Fur repressor(1st module)' ecoli2607 3,3,15 Metabolism of small molecules Central intermediary metabolism Pool, multipurpose conversions of intermed. met'm 'gabT' '4-aminobutyrate aminotransferase activity' ecoli2877 3,3,15 Metabolism of small molecules Central intermediary metabolism Pool, multipurpose conversions of intermed. met'm 'metK' 'methionine adenosyltransferase 1 (AdoMet synthetase); methyl and propylamine donor corepressor of met genes' ecoli903 - 3,1,4 Metabolism of small molecules Amino acid biosynthesis Aspartate 'aspC' 'aspartate aminotransferase' ecoli1273 3,3,15 Metabolism of small molecules Central intermediary metabolism Pool, multipurpose conversions of intermed. met'm 'goaG' '4-aminobutyrate aminotransferase' ecoli2779 3,3,15 Metabolism of small molecules Central intermediary metabolism Pool, multipurpose conversions of intermed. met'm 'kduD' '2-deoxy-D-gluconate 3-dehydrogenase' ecoli412 3,3,15 Metabolism of small molecules Central intermediary metabolism Pool, multipurpose conversions of intermed. met'm 'b0420' '1-deoxyxylulose-5-phosphate synthase; flavoprotein' Training Accuracy: 6/7 (85.71%) Training Frequency class 'Pool multipurpose conversions of intermed metm': 29/939 (3.09%) Training Significance: dev(12.64) ; prob(5.913450E-09) Evaluation on validation data (471 items): ecoli3539 3,3,15 Metabolism of small molecules Central intermediary metabolism Pool, multipurpose conversions of intermed. met'm 'kbl' '2-amino-3-ketobutyrate CoA ligase (glycine acetyltransferase)' ecoli498 3,3,15 Metabolism of small molecules Central intermediary metabolism Pool, multipurpose conversions of intermed. 
met'm 'gcl' 'glyoxylate carboligase (2nd module function unknown)' ecoli3530 - 3,5,1 Metabolism of small molecules Energy metabolism, carbon Aerobic respiration 'gpsA' 'glycerol-3-phosphate dehydrogenase (NAD+)' ecoli3494 - 3,1,1 Metabolism of small molecules Amino acid biosynthesis Alanine 'avtA' 'alanine-alpha-ketoisovalerate (or valine-pyruvate) transaminase transaminase C' Validation Accuracy: 2/4 (50.00%) Validation Frequency class 'Pool multipurpose conversions of intermed metm': 16/471 (3.40%) Validation Significance: dev(5.15) ; prob(6.461456E-03) ------------------ Rule 93: (2, lift 352.1) ecoli_aliphatic_index > 96.62 amino_acid_pairs_vx > 0 [hom( A ),species( A ,leishmania_tarentolae__sauroleishmania_tarentolae_),mol_wt_gt( A ,55220)] = 0 [hom( A ),keyword( A ,transmembrane),classification( A ,rhodobacter)] = 1 -> class 'RNDfamily' [0.750] Evaluation on test data (712 items): ecoli3833 - 1,5,22 Cell processes Transport/binding proteins MIP family 'glpF' 'MIP family facilitated diffusion of glycerol' ecoli3589 - 1,5,21 Cell processes Transport/binding proteins MFS family 'uhpB' 'sensor histidine protein kinase phosphorylates UhpA(2nd module)' ecoli1623 - 2,2,3 Macromolecule metabolism Macromolecule synthesis, modification DNA - replication, repair, restraction/modification 'lhr' 'member of ATP-dependent helicase superfamily II (2nd module)' ecoli3836 - 3,2,8 Metabolism of small molecules Biosynthesis of cofactors, carriers Menaquinone, ubiquinone 'menA' '14-dihydroxy-2-naphthoate --> dimethylmenaquinone' Test Accuracy: 0/4 (0.00%) Test Frequency class 'RNDfamily': 2/712 (0.28%) Test Significance: dev(-0.11) ; prob(1.000000E+00) Application to new data (2167 items): ecoli2117 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yeiH' 'orf' ecoli3568 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yicG' 'orf' ecoli3635 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown 
function 'yieG' 'orf (2nd module)' Frequency rule on new data: 3/2167 (0.14%) Evaluation on training data (939 items): ecoli453 1,5,33 Cell processes Transport/binding proteins RNDfamily 'acrB' 'RND family of transport protein acridine efflux pump(2nd module)' ecoli2035 1,5,33 Cell processes Transport/binding proteins RNDfamily 'b2076' 'RND family of transport protein paral putative outer membrane receptor' Training Accuracy: 2/2 (100.00%) Training Frequency class 'RNDfamily': 2/939 (0.21%) Training Significance: dev(30.61) ; prob(4.536582E-06) Evaluation on validation data (471 items): ecoli3196 1,5,33 Cell processes Transport/binding proteins RNDfamily 'acrF' 'RND family of transport protein acriflavin resistance protein F multidrug efflux (?encodes lipoprotein with signal peptide; osmotically remedial envelope defect)' ecoli3446 - 1,5,21 Cell processes Transport/binding proteins MFS family 'yhjE' 'MFS family of transport protein (2nd module)' Validation Accuracy: 1/2 (50.00%) Validation Frequency class 'RNDfamily': 1/471 (0.21%) Validation Significance: dev(15.30) ; prob(4.237269E-03) ------------------ Rule 94: (14, lift 40.0) amino_acid_pairs_iw <= 0 [ss_beta( A ,lteq,5),nss_beta( A , B ,lteq,6),nss_beta( B , C ,lteq,5)] = 0 [hom( A ),e_val_gt( A ,0.0006),classification( A ,artiodactyla)] = 0 [hom( A ),species( A ,mycoplasma_pneumoniae),mol_wt_lteq( A ,43194)] = 1 [hom( A ),species( A ,leishmania_tarentolae__sauroleishmania_tarentolae_),mol_wt_gt( A ,55220)] = 0 [hom( A ),species( A ,cyanophora_paradoxa),mol_wt_lteq( A ,32892)] = 1 -> class 'Ribosomal proteins - synthesis modificationRiboso' [0.938] Evaluation on test data (712 items): ecoli3264 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpsG' '30S ribosomal subunit protein S7 initiates assembly' ecoli3243 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rplC' '50S ribosomal subunit protein L3' ecoli3238 
4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rplV' '50S ribosomal subunit protein L22' ecoli1067 - 3,2,1 Metabolism of small molecules Biosynthesis of cofactors, carriers Acyl carrier protein (ACP) 'acpP' 'acyl carrier protein' ecoli4092 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpsR' '30S ribosomal subunit protein S18' ecoli3882 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rplA' '50S ribosomal subunit protein L1 regulates synthesis of L1 and L11' ecoli3116 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rplU' '50S ribosomal subunit protein L21' ecoli3656 - 3,4,0 Metabolism of small molecules Degradation of small molecules ATP-proton motive force interconversion 'atpH' 'membrane-bound ATP synthase F1 sector delta-subunit' ecoli3231 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rplE' '50S ribosomal subunit protein L5' ecoli3229 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpsH' '30S ribosomal subunit protein S8 and regulator' ecoli2556 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpsP' '30S ribosomal subunit protein S16' ecoli3323 - 1,6,1 Cell processes Adaptation Adaptations, atypical conditions 'yrfH' 'heat shock protein 15 DNA/RNA binding' ecoli4182 - 6,1,1 Global functions Global regulatory functions Global regulatory functions 'fecI' 'sigma factor in two component regulatory system with FecR FecR interacts with the periplasmic iron binding FecA' ecoli2553 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rplS' '50S ribosomal subunit protein L19' ecoli3244 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, 
modificationRiboso 'rpsJ' '30S ribosomal subunit protein S10' Test Accuracy: 11/15 (73.33%) Test Frequency class 'Ribosomal proteins - synthesis modificationRiboso': 23/712 (3.23%) Test Significance: dev(15.36) ; prob(4.837919E-14) Application to new data (2167 items): ecoli499 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'gip' 'glyoxylate-induced protein(2nd module)' ecoli4190 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'yjhK' 'paral putative epimerase' Frequency rule on new data: 2/2167 (0.09%) Evaluation on training data (939 items): ecoli3559 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpmB' '50S ribosomal subunit protein L28' ecoli4090 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpsF' '30S ribosomal subunit protein S6' ecoli3881 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rplK' '50S ribosomal subunit protein L11' ecoli3220 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpsK' '30S ribosomal subunit protein S11' ecoli3240 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rplB' '50S ribosomal subunit protein L2' ecoli1684 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rplT' '50S ribosomal subunit protein L20 and regulator' ecoli3234 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpsQ' '30S ribosomal subunit protein S17' ecoli3230 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpsN' '30S ribosomal subunit protein S14' ecoli3558 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, 
modificationRiboso 'rpmG' '50S ribosomal subunit protein L33' ecoli3624 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpmH' '50S ribosomal subunit protein L34' ecoli3265 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpsL' '30S ribosomal subunit protein S12' ecoli3161 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rplM' '50S ribosomal subunit protein L13' ecoli3222 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpmJ' '50S ribosomal subunit protein X' ecoli3233 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rplN' '50S ribosomal subunit protein L14' Training Accuracy: 14/14 (100.00%) Training Frequency class 'Ribosomal proteins - synthesis modificationRiboso': 22/939 (2.34%) Training Significance: dev(24.16) ; prob(1.501756E-23) Evaluation on validation data (471 items): ecoli4035 - 1,7,1 Cell processes Cell division Cell division 'mopB' 'GroES 10 Kd chaperone binds to Hsp60 suppressing its ATPase activity; affects cell division' ecoli3237 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpsC' '30S ribosomal subunit protein S3' ecoli3226 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpsE' '30S ribosomal subunit protein S5' ecoli3239 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpsS' '30S ribosomal subunit protein S19' ecoli1685 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpmI' '50S ribosomal subunit protein A' ecoli3227 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rplR' '50S ribosomal subunit protein L18' ecoli3160 4,2,2 Structural elements 
Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpsI' '30S ribosomal subunit protein S9' ecoli3221 4,2,2 Structural elements Ribosome constituents Ribosomal proteins - synthesis, modificationRiboso 'rpsM' '30S ribosomal subunit protein S13' Validation Accuracy: 7/8 (87.50%) Validation Frequency class 'Ribosomal proteins - synthesis modificationRiboso': 12/471 (2.55%) Validation Significance: dev(15.25) ; prob(5.432555E-11) ------------------ Rule 101: (4, lift 30.1) amino_acid_pair_ratio_ac <= 0.8 [hom( A ),classification( A ,zymomonas)] = 0 [hom( A ),e_val_gt( A ,2e-37),species( A ,streptococcus_agalactiae)] = 0 [hom( A ),species( A ,leishmania_tarentolae__sauroleishmania_tarentolae_),mol_wt_gt( A ,55220)] = 0 [hom( A ),mol_wt_gt( A ,55220),keyword( A ,inner_membrane)] = 1 [hom( A ),mol_wt_gt( A ,43194),classification( A ,shigella)] = 1 [hom( A ),keyword( A ,transmembrane),classification( A ,rhodobacter)] = 0 -> class 'Surface structures' [0.833] Evaluation on test data (712 items): ecoli222 4,1,5 Structural elements Cell envelop Surface structures 'fhiA' 'flagellar biosynthesis paral putative transport protein' ecoli454 - 1,4,3 Cell processes Protection responses Drug/analog sensitivity 'acrA' 'acridine efflux pump(1st module)' ecoli36 - 3,4,1 Metabolism of small molecules Degradation of small molecules Amines 'caiD' 'Carnitine racemase' Test Accuracy: 1/3 (33.33%) Test Frequency class 'Surface structures': 22/712 (3.09%) Test Significance: dev(3.03) ; prob(8.986191E-02) Application to new data (2167 items): ecoli2068 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'yehB' 'paral putative outer membrane protein (2nd module)' ecoli3248 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yheF' 'orf (2nd module)' ecoli1343 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b1372' 'putative 
membrane protein' ecoli916 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b0941' 'paral putative fimbrial-like protein' ecoli2097 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yohG' 'orf' ecoli701 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b0718' 'paral putative outer membrane protein (2nd module)' ecoli1422 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b1451' 'paral putative outer membrane receptor' ecoli3974 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yjcP' 'orf' ecoli1476 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b1505' 'paral putative outer membrane protein (2nd module)' ecoli3960 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yjcF' 'orf' ecoli2989 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'ygiM' 'orf' ecoli3146 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'yhcD' 'paral putative outer membrane protein (2nd module)' ecoli1225 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yciB' 'orf' ecoli3314 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'hofQ' 'orf (2nd module)' ecoli1442 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b1471' 'putative glycoprotein' Frequency rule on new data: 15/2167 (0.69%) Evaluation on training data (939 items): ecoli1888 4,1,5 Structural elements Cell envelop Surface structures 'fliC' 'flagellar biosynthesis; flagellin filament structural protein(2nd module)' ecoli1049 4,1,5 Structural elements Cell envelop Surface structures 'flgE' 'flagellar 
biosynthesis hook protein(1st module)' ecoli1847 4,1,5 Structural elements Cell envelop Surface structures 'flhA' 'flagellar biosynthesis; possible export of flagellar proteins(2nd module)' ecoli1906 4,1,5 Structural elements Cell envelop Surface structures 'fliI' 'flagellum-specific ATP synthase(2nd module)' Training Accuracy: 4/4 (100.00%) Training Frequency class 'Surface structures': 26/939 (2.77%) Training Significance: dev(11.85) ; prob(5.878020E-07) Evaluation on validation data (471 items): ecoli3335 - 3,2,2 Metabolism of small molecules Biosynthesis of cofactors, carriers Biotin 'bioH' 'biotin biosynthesis; reaction prior to pimeloyl CoA' ecoli1055 4,1,5 Structural elements Cell envelop Surface structures 'flgK' 'flagellar biosynthesis hook-filament junction protein 1 C-terminal involved in chaperone (probably FlgN) binding(2nd module)' ecoli1056 4,1,5 Structural elements Cell envelop Surface structures 'flgL' 'flagellar biosynthesis; hook-filament junction protein C-terminal involved in chaperone (probably FlgN) binding' Validation Accuracy: 2/3 (66.67%) Validation Frequency class 'Surface structures': 12/471 (2.55%) Validation Significance: dev(7.05) ; prob(1.897727E-03) ------------------ Rule 106: (9/1, lift 34.9) amino_acid_pair_ratio_ay <= 1.5 amino_acid_pairs_ah > 0 [ss_coil( A ,gt,1),nss_coil( A , B ,lteq,3),nss_coil( B , C ,lteq,3),nss_coil( C , D ,lteq,3),nss_coil( D , E ,lteq,3)] = 0 [hom( A ),e_val_gt( A ,0.0006),classification( A ,clostridium)] = 1 [hom( A ),e_val_gt( A ,2e-37),species( A ,streptococcus_agalactiae)] = 0 [hom( A ),e_val_lteq( A ,2e-37),classification( A ,eurotiales)] = 0 [hom( A ),psi_iter_lteq( A ,13),species( A ,lactococcus_lactis__subsp__lactis___streptococcus_lactis_)] = 0 [hom( A ),species( A ,mycoplasma_pneumoniae),mol_wt_lteq( A ,32892)] = 0 [hom( A ),mol_wt_gt( A ,55220),keyword( A ,inner_membrane)] = 0 [hom( A ),keyword( A ,transmembrane),classification( A ,rhodobacter)] = 0 -> class 'Transposon-related functions' 
[0.818] Evaluation on test data (712 items): ecoli3863 - 3,1,2 Metabolism of small molecules Amino acid biosynthesis Arginine 'argE' 'acetylornithine deacetylase(1st module)' ecoli1989 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'yi52_7' 'IS5 protein' ecoli2246 - 3,5,1 Metabolism of small molecules Energy metabolism, carbon Aerobic respiration 'nuoA' 'NADH dehydrogenase I chain A' ecoli2460 - 2,1,1 Macromolecule metabolism Macromolecule degradation Degradation of DNA 'xseA' 'exonuclease VII large subunit' ecoli2979 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'yi22_5' 'IS2 protein' Test Accuracy: 2/5 (40.00%) Test Frequency class 'Transposon-related functions': 13/712 (1.83%) Test Significance: dev(6.38) ; prob(3.213624E-03) Application to new data (2167 items): ecoli314 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b0322' 'orf' ecoli814 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b0838' 'paral putative S-transferase(1st module)' ecoli2904 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b2970' 'orf' ecoli3972 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yjcO' 'orf(2nd module)' ecoli793 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b0817' 'putative toxin' ecoli2207 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b2249' 'orf' ecoli2590 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b2644' 'orf' ecoli3084 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yraR' 'orf' ecoli3750 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b3838' 'transmembrane protein part of sec-independent protein export' ecoli845 - 
7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b0869' 'putative dTDP-glucose enz' ecoli2011 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'yefB' 'fucose synthetase (epimerase and reductase) in GDP-L-fucose synthetis colanic acid gene cluster' ecoli1466 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yddB' 'orf' ecoli1802 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b1834' 'orf(1st module)' ecoli4197 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'yjhR' 'putative frameshift suppressor' ecoli2259 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b2301' 'orf' ecoli2684 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'surE' 'survival protein; protein damage control possibly acts with pcm' ecoli1308 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b1337' 'p-aminobenzoyl-glutamate utilization paral putative aminohydrolase (2nd module)' ecoli977 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yccJ' 'orf' ecoli4162 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b4273' 'putative transposase' Frequency rule on new data: 19/2167 (0.88%) Evaluation on training data (939 items): ecoli251 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'yi52_1' 'IS5 protein 1' ecoli1373 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'yi22_2' 'IS22 protein 2' ecoli1955 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'yi22_3' 'IS2 protein' ecoli2179 - 3,4,4 Metabolism of small molecules Degradation of small 
molecules Fatty acids 'atoD' 'acetyl-CoA:acetoacetyl-CoA transferase alpha subunit' ecoli3428 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'yi52_11' 'IS5 protein 11' ecoli2150 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'yi52_8' 'IS5 protein' ecoli542 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'yi52_2' 'IS 5 protein' ecoli353 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'yi22_1' 'IS22 protein 1' ecoli2916 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'yi52_9' 'IS5 protein' Training Accuracy: 8/9 (88.89%) Training Frequency class 'Transposon-related functions': 22/939 (2.34%) Training Significance: dev(17.16) ; prob(8.001274E-13) Evaluation on validation data (471 items): ecoli31 - 3,1,13 Metabolism of small molecules Amino acid biosynthesis Lysine 'dapB' 'dihydrodipicolinate reductase' ecoli1302 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'yi52_4' 'IS5 protein' ecoli3155 - 4,1,4 Structural elements Cell envelop Surface polysaccharides & antigens 'nanA' 'N-acetylneuraminate lyase (aldolase); catabolism of sialic acid; not K-12?' 
ecoli3045 - 3,4,2 Metabolism of small molecules Degradation of small molecules Amino acids 'yhaQ' 'L-serine dehydratase 3 part 1 (EC 4 2 1 13) fuse with b3111' ecoli3148 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'yi52_10' 'IS5 protein 10' ecoli2797 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'yi22_4' 'IS2 protein' Validation Accuracy: 3/6 (50.00%) Validation Frequency class 'Transposon-related functions': 14/471 (2.97%) Validation Significance: dev(6.78) ; prob(4.797746E-04) ------------------ Rule 107: (5/1, lift 30.5) ecoli_atomic_comp_s <= 18 [hom( A ),e_val_gt( A ,0.0006),classification( A ,clostridium)] = 0 [hom( A ),psi_iter_lteq( A ,7),species( A ,methanobacterium_thermoautotrophicum)] = 0 [hom( A ),species( A ,mycoplasma_pneumoniae),mol_wt_lteq( A ,32892)] = 0 [hom( A ),species( A ,drosophila_melanogaster__fruit_fly_),keyword( A ,repeat)] = 1 [hom( A ),species( A ,cyanophora_paradoxa),mol_wt_lteq( A ,32892)] = 0 [hom( A ),keyword( A ,transmembrane),classification( A ,enterobacteriaceae)] = 0 -> class 'Transposon-related functions' [0.714] Evaluation on test data (712 items): ecoli1375 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'tra8_2' 'transposase 2 for IS30' ecoli4173 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'tra8_3' 'transposase for IS30' ecoli22 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'insA_1' 'IS1 protein InsA' ecoli1374 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'yi21_2' 'IS21 protein 2' ecoli2527 - 2,2,11 Macromolecule metabolism Macromolecule synthesis, modification RNA synthesis, modification, DNA transcription 'srmB' 'ATP-dependent RNA helicase (2nd module)' Test Accuracy: 4/5 (80.00%) Test Frequency class 'Transposon-related functions': 13/712 (1.83%) Test Significance: dev(13.06) ; prob(5.475617E-07) Application to 
new data (2167 items): ecoli2879 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'sprT' 'orf' ecoli365 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b0373' 'orf' ecoli1425 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b1454' 'paral putative S-transferase(1st module)' ecoli1899 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b1934' 'orf' ecoli290 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b0298' 'orf' ecoli1568 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b1598' 'orf' ecoli2047 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b2088' 'orf' ecoli2923 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b2989' 'paral putative S-transferase(1st module)' ecoli1423 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b1452' 'putative receptor' ecoli2208 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b2250' 'orf' ecoli3035 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'yqjG' 'putative transferase' ecoli3820 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b3914' 'orf' ecoli530 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b0540' 'orf' ecoli2224 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b2266' 'orf' ecoli2134 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b2175' 'suppresses thermosensitivity of prc mutants at low osmolality; in turn suppressed by multicopy expression of PBP 7' ecoli1230 - 0,0,0 Open reading frames Unknown proteins, 
no known homologs Unknown function 'yciG' 'orf' ecoli4161 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b4272' 'orf' ecoli3748 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b3836' 'orf' ecoli1001 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b1027' 'orf' ecoli3147 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'yhcE' 'orf' ecoli247 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b0255' 'orf' ecoli310 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b0318' 'putative transcription factor' Frequency rule on new data: 22/2167 (1.02%) Evaluation on training data (939 items): ecoli248 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'tra8_1' 'transposase1 for IS30' ecoli1956 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'yi21_3' 'IS2 protein' ecoli267 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'insA_3' 'IS1 protein InsA' ecoli1678 - 1,5,23 Cell processes Transport/binding proteins Mechanism not stated 'btuE' 'not required for vitamin B12 transport perhaps periplasmic protein' ecoli1862 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'insA_5' 'IS1 protein InsA' Training Accuracy: 4/5 (80.00%) Training Frequency class 'Transposon-related functions': 22/939 (2.34%) Training Significance: dev(11.48) ; prob(1.478363E-06) Evaluation on validation data (471 items): ecoli257 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'insA_2' 'IS1 protein InsA 2' ecoli950 - 3,5,1 Metabolism of small molecules Energy metabolism, carbon Aerobic respiration 'hyaE' 'processing of HyaA and HyaB proteins' ecoli1977 - 3,1,10 Metabolism of small molecules Amino acid biosynthesis Histidine 'hisL' 
'his operon leader peptide' ecoli1360 - 3,4,3 Metabolism of small molecules Degradation of small molecules Carbon compounds 'b1389' 'phenylacetic acid degradation protein possibly part of multicomponent oxygenase' ecoli352 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'yi21_1' 'IS21 protein 1' ecoli4183 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'insA_7' 'IS1 protein InsA' ecoli3270 - 2,2,9 Macromolecule metabolism Macromolecule synthesis, modification Protein modufication 'fkpA' 'FKBP-type peptidyl-prolyl cis-trans isomerase (rotamase)' ecoli3367 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'insA_6' 'IS1 protein InsA' Validation Accuracy: 4/8 (50.00%) Validation Frequency class 'Transposon-related functions': 14/471 (2.97%) Validation Significance: dev(7.83) ; prob(4.842925E-05) ------------------ Rule 108: (3/1, lift 25.6) amino_acid_pair_ratio_gx > 0 [hom( A ),classification( A ,zymomonas)] = 0 [hom( A ),e_val_gt( A ,2e-37),species( A ,streptococcus_agalactiae)] = 0 [hom( A ),mol_wt_gt( A ,55220),keyword( A ,inner_membrane)] = 0 [hom( A ),keyword( A ,transmembrane),classification( A ,enterobacteriaceae)] = 1 [hom( A ),keyword( A ,transmembrane),classification( A ,rhodobacter)] = 0 -> class 'Transposon-related functions' [0.600] Evaluation on test data (712 items): ecoli2365 - 3,1,6 Metabolism of small molecules Amino acid biosynthesis Cysteine 'cysK' 'cysteine synthase A O-acetylserine sulfhydrolase A' ecoli2274 - 3,6,1 Metabolism of small molecules Fatty acid biosynthesis Fatty acid and phosphatidic acid biosynth 'accD' 'acetylCoA carboxylase carboxytransferase component beta subunit' ecoli2523 - 6,1,1 Global functions Global regulatory functions Global regulatory functions 'rseA' 'sigma-E factor negative regulatory protein' Test Accuracy: 0/3 (0.00%) Test Frequency class 'Transposon-related functions': 13/712 (1.83%) Test Significance: dev(-0.24) ; 
prob(1.000000E+00) Application to new data (2167 items): ecoli2801 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b2865' 'orf (2nd module)' ecoli648 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b0658' 'orf' ecoli765 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'b0789' 'paral putative synthase (2nd module)' ecoli2472 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'sseA' 'enhances serine sensitivity (inhibits homoserine deHase) on lactate; rhodanese-like protein' ecoli2535 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b2584' 'orf' ecoli313 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b0321' 'orf(2nd module)' ecoli686 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b0703' 'orf(2nd module)' ecoli1577 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b1607' 'orf' ecoli3255 - 7,0,0 Miscellaneous Some information, but not classifiable Not classified (included putative assignments) 'yheJ' 'putative export protein J for general secretion pathway (GSP)' ecoli1141 - 0,0,0 Open reading frames Unknown proteins, no known homologs Unknown function 'b1168' 'orf' Frequency rule on new data: 10/2167 (0.46%) Evaluation on training data (939 items): ecoli3405 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'rhsB' 'rhsB protein in rhs element' ecoli683 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'rhsC' 'rhsC protein in rhs element' ecoli611 - 1,5,13 Cell processes Transport/binding proteins DcuC 'b0621' 'DcuC family of tranport protein transport of dicarboxylates succinate efflux during glucose fermentation(2nd module)' Training Accuracy: 2/3 (66.67%) Training Frequency class 
'Transposon-related functions': 22/939 (2.34%) Training Significance: dev(7.37) ; prob(1.621058E-03) Evaluation on validation data (471 items): ecoli488 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'rhsD' 'rhsD protein in rhs element' ecoli1427 5,1,4 Extrachromosomal Laterally acquirred elements Transposon-related functions 'rhsE' 'rhsE protein in rhs element' ecoli1908 - 4,1,5 Structural elements Cell envelop Surface structures 'fliK' 'flagellar hook-length control protein(1st module)' ecoli2652 - 3,5,2 Metabolism of small molecules Energy metabolism, carbon Anaerobic respiration 'hypF' 'transcriptional regulatory protein(2nd module)' Validation Accuracy: 2/4 (50.00%) Validation Frequency class 'Transposon-related functions': 14/471 (2.97%) Validation Significance: dev(5.54) ; prob(4.990638E-03) ------------------
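Note on the figures above: the report does not say how the bracketed rule confidence, the lift, or the dev(...) values are computed. The sketch below rests on assumptions that happen to reproduce the printed numbers for Rules 106 to 108: confidence is the Laplace ratio (n - m + 1) / (n + 2) taken from the (n/m) counts in the rule header, lift is that confidence divided by the class frequency in the training data, and dev is the standardized deviation of the number of correct predictions from what the class frequency alone would give. The function names are illustrative and are not part of the original tool. The prob(...) values are not reproduced here.

```python
from math import sqrt


def laplace_confidence(covered, errors):
    """Assumed rule confidence: Laplace ratio from the (n/m) header counts."""
    return (covered - errors + 1) / (covered + 2)


def lift(confidence, class_count, total):
    """Assumed lift: confidence divided by the class's relative frequency."""
    return confidence / (class_count / total)


def dev(correct, fired, class_count, total):
    """Assumed dev: (observed - expected) / binomial standard deviation,
    where expected = fired * class frequency."""
    p = class_count / total
    expected = fired * p
    return (correct - expected) / sqrt(fired * p * (1 - p))


if __name__ == "__main__":
    # Rule 108 header: (3/1, lift 25.6), confidence [0.600];
    # training class frequency 22/939.
    conf_108 = laplace_confidence(3, 1)
    print(round(conf_108, 3), round(lift(conf_108, 22, 939), 1))

    # Rule 108: training 2/3 of 22/939, validation 2/4 of 14/471,
    # test 0/3 of 13/712.
    print(round(dev(2, 3, 22, 939), 2))
    print(round(dev(2, 4, 14, 471), 2))
    print(round(dev(0, 3, 13, 712), 2))
```

Under these assumptions, a negative dev (as in the Rule 108 test evaluation, 0/3) means the rule did worse on that split than assigning the class at its base rate would.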