-- dump date 20140620_042956
-- class Genbank::CDS
-- table cds_note
-- id note
YP_004728927.1 Weakly similar to several. Appears as an insertion in this genome compared to S. Typhi
YP_004728939.1 this CDS is represented by two in S. Typhi CT18 STY0019 and STY0020
YP_004728968.1 no significant database hits
YP_004728969.1 no significant database hits
YP_004728985.1 highly similar to STY0093 and Stm0081, with slight variation to the very C-termini
YP_004729056.1 the similarity to the uropathogenic E. coli protein found within the Usp pathogenicity island
YP_004729078.1 similarity to transposases and the presence of a PFAM PF04754 Transposase_31 domain
YP_004729158.1 similarities to ShdA and SapA are extensive but weak
YP_004729172.1 no significant database hits
YP_004729179.1 no significant database hits
YP_004729180.1 no significant database hits
YP_004729181.1 rich in Ala and Gly residues
YP_004729183.1 no significant database hits
YP_004729184.1 no significant database hits
YP_004729187.1 Similar to the C-terminus of RadC and many RadC-like DNA repair proteins
YP_004729188.1 no significant database hits
YP_004729191.1 the longer N-terminus in the database hit
YP_004729194.1 no significant database hits
YP_004729204.1 alternative possible translational start site
YP_004729206.1 no significant database hits
YP_004729211.1 no significant database hits
YP_004729220.1 doubtful CDS
YP_004729291.1 the differing N-termini when compared to the other closely related orthologues
YP_004729293.1 no significant database hits
YP_004729369.1 no significant database hits
YP_004729382.1 Similarity to the C-terminus of many known isomerases
YP_004729415.1 the frameshift following codon 82
YP_004729435.1 transcriptional regulatory protein
YP_004729497.1 Database matches are either to the N- or C-terminus only
YP_004729518.1 no significant database hits
YP_004729567.1 no significant database hits. Note the excess Ser, Asn, Leu and Lys residues in the predicted protein product
YP_004729568.1 carries a frameshift following codon 55
YP_004729646.1 gpE-like
YP_004729647.1 also gpE
YP_004729649.1 also gpD
YP_004729676.1 no significant database hits, doubtful CDS
YP_004729712.1 the differing N-termini when compared to the database hits
YP_004729731.1 methyltransferase
YP_004729742.1 doubtful CDS with no significant database hits
YP_004729744.1 doubtful CDS with no significant database hits
YP_004729776.1 no significant database hits
YP_004729780.1 also gp14
YP_004729783.1 no significant database hits
YP_004729784.1 no significant database hits
YP_004729786.1 the database similarity is not full length and is weak
YP_004729788.1 the database similarity is not full length and is weak
YP_004729790.1 no significant database hits
YP_004729792.1 the database similarity is not full length and is weak
YP_004729794.1 the database similarity is not full length and is weak
YP_004729803.1 the Pfam KilA motif thought to define a conserved nucleotide binding domain
YP_004729804.1 no significant database hits
YP_004729807.1 no significant database hits
YP_004729809.1 no significant database hits
YP_004729811.1 sequence similarity to SopA
YP_004729815.1 no significant database hits
YP_004729817.1 This CDS is found within a prophage in S. bongori. The orthologue of this CDS is known to be involved in lipopolysaccharide biosynthesis in E. coli
YP_004729818.1 no significant database hits
YP_004729819.1 no significant database hits
YP_004729820.1 no significant database hits
YP_004729879.1 doubtful CDS with no significant database hits
YP_004729962.1 similar to many cytolethal distending toxins
YP_004729963.1 no significant database hits
YP_004730013.1 the differing N-termini
YP_004730113.1 no significant database hits
YP_004730115.1 no significant database hits
YP_004730124.1 no significant database hits
YP_004730182.1 doubtful CDS with no significant database hits
YP_004730289.1 the differing N-termini of the product of this CDS and those of Salmonella typhimurium and Escherichia coli orthologues
YP_004730299.1 the Pfam and Prosite peptidase motifs in the predicted protein product
YP_004730324.1 doubtful CDS with no significant database hits
YP_004730325.1 no significant database hits
YP_004730340.1 the
YP_004730483.1 the alternative possible translational start site at codon 21
YP_004730554.1 also proP
YP_004730557.1 the similarities to mce (mammalian cell entry) proteins originally described in Mycobacterium tuberculosis
YP_004730629.1 no significant database hits
YP_004730675.1 no significant database hits
YP_004730677.1 no significant database hits
YP_004730678.1 that this CDS is highly similar to SBG1842 88.379% identity in 327 aa overlap
YP_004730680.1 no significant database hits
YP_004730683.1 this CDS appears to have been subject to a deletion event removing the 3' end
YP_004730685.1 that this CDS is highly similar to SBG1829 88.379% identity in 327 aa overlap
YP_004730691.1 no significant database hits
YP_004730698.1 no significant database hits
YP_004730700.1 alternative possible translational start site at codon 14
YP_004730718.1 doubtful CDS with no significant database hits
YP_004730727.1 This is a chimeric gene where the first 450 bases represent sopA (type III secretion system effector protein). The following 828 bps are derived from a gene similar to non-LEE encoded effector proteins from pathogenic E. coli. The Nle region of this gene appears to be part of a later insertion that deleted the sopA 3 prime region
YP_004730856.1 also significantly similar to SBG3382 30% identity in 190 aa overlap
YP_004730857.1 also significantly similar to SBG3383 56% identity in 824 aa overlap
YP_004730858.1 also significantly similar to SBG3384 49% identity in 366 aa overlap
YP_004730878.1 that this CDS is almost identical to SBG3374 99% identity in 340 aa overlap
YP_004730879.1 that this CDS is identical to SBG3375
YP_004730880.1 that this CDS is identical to SBG3376
YP_004730881.1 that this CDS is identical to SBG3377
YP_004730882.1 that this CDS is identical to SBG3378
YP_004730883.1 that this CDS is identical to SBG3379
YP_004730884.1 that this CDS is identical to SBG3380
YP_004730914.1 the PFAM transposase motif
YP_004730916.1 sequence similarity extends further upstream for these two CDS but the only suitable translational start site for S. bongori would incorporate a stop codon into the sequence. Consequently it is possible that this CDS is a pseudogene
YP_004731003.1 no significant database hits
YP_004731006.1 no significant database hits
YP_004731008.1 doubtful CDS with no significant database hits
YP_004731009.1 doubtful CDS with no significant database hits
YP_004731011.1 no significant database hits
YP_004731020.1 no significant database hits
YP_004731024.1 no significant database hits
YP_004731026.1 doubtful CDS with no significant database hits
YP_004731027.1 protease VII precursor
YP_004731082.1 no significant database hits
YP_004731083.1 no significant database hits
YP_004731084.1 no significant database hits
YP_004731205.1 doubtful CDS with no significant database hits
YP_004731206.1 similar to many non-LEE encoded EspJ effector proteins
YP_004731249.1 the N-terminal extension in comparison to its orthologues. There is no similar translational start site for this CDS
YP_004731311.1 oxygen-regulated invasion protein
YP_004731312.1 oxygen-regulated invasion protein
YP_004731341.1 no significant database hits
YP_004731529.1 the central region of the predicted product of this CDS shares 95.092% identity in 163 aa overlap with the downstream CDS remnant SBG2715
YP_004731531.1 no significant database hits
YP_004731532.1 no significant database hits except to the downstream CDS SBG2723 51.466% identity in 307 aa overlap
YP_004731535.1 the similarity to the database match is very low
YP_004731536.1 the similarity to the database match is very low
YP_004731537.1 no significant database hits except to the upstream CDS SBG2718 51.466% identity in 307 aa overlap
YP_004731539.1 no significant database hits
YP_004731540.1 no significant database hits
YP_004731544.1 also exuT
YP_004731547.1 also uxaC
YP_004731585.1 similar to many proposed modulator of drug activity proteins e.g. Escherichia coli MdaB
YP_004731620.1 also gpU
YP_004731623.1 also gpH
YP_004731627.1 also gpH
YP_004731642.1 also gpV
YP_004731645.1 no significant database hits
YP_004731646.1 no significant database hits
YP_004731650.1 no significant database hits
YP_004731660.1 no significant database hits
YP_004731675.1 also uxaC
YP_004731676.1 also exuT
YP_004731765.1 no significant database hits. Highly similar to the two downstream CDS SBG2953 69.615% identity in 260 aa overlap and SBG2954 71.538% identity in 260 aa overlap
YP_004731766.1 no significant database hits. Highly similar to CDS SBG2952 69.615% identity in 260 aa overlap and SBG2954 96.875% identity in 256 aa overlap
YP_004731767.1 no significant database hits. Highly similar to CDS SBG2952 69.615% identity in 260 aa overlap and SBG2953 96.875% identity in 256 aa overlap
YP_004731768.1 that the database similarity is weak and does not cover the whole sequence
YP_004731819.1 the PFAM motifs to PF03707 MHYT, Bacterial signalling protein N terminal repeats
YP_004732046.1 no significant database hits
YP_004732068.1 no significant database hits
YP_004732077.1 the Pfam sulfatase motif and the database similarities to many sulfatases
YP_004732138.1 no significant database hits
YP_004732155.1 no significant database hits
YP_004732174.1 that this CDS is almost identical to SBG2045 99% identity in 340 aa overlap
YP_004732175.1 that this CDS is identical to SBG2046
YP_004732176.1 that this CDS is identical to SBG20472
YP_004732177.1 that this CDS is identical to SBG2048
YP_004732178.1 that this CDS is identical to SBG2049
YP_004732180.1 that this CDS is identical to SBG2051
YP_004732182.1 also significantly similar to SBG2023 30% identity in 190 aa overlap
YP_004732183.1 also significantly similar to SBG2024 56% identity in 824 aa overlap
YP_004732184.1 also significantly similar to SBG2025 49% identity in 366 aa overlap
YP_004732349.1 Carries a UGA stop codon which is read through as an opal codon by the selenocysteine tRNA
YP_004732397.1 no significant database hits
YP_004732484.1 no significant database hits
YP_004732500.1 the product of this CDS is composed of multiple degenerate repeats; note similarity to SBG3755 29.701% identity in 3313 aa overlap
YP_004732516.1 this CDS contains a UGA stop codon which is read through as an opal codon by the selenocysteine tRNA
YP_004732518.1 the biased amino acid content with an excess of Arg and Asp amino acids
YP_004732522.1 also proP
YP_004732547.1 The product of this CDS is composed of multiple degenerate repeats. Note similarity to the SPI-4 gene product SBG3705 29.701% identity in 3313 aa overlap
YP_004732548.1 no significant database hits
YP_004732549.1 database match is weak and incomplete
YP_004732550.1 no significant database hits
YP_004732552.1 no significant database hits
YP_004732556.1 the database hit is only partial
YP_004732591.1 HflX protein, GTP-binding protein specific for phage lambda cII repressor
YP_004732619.1 the differing N-termini compared to orthologues in the database
YP_004732676.1 no significant database hits
YP_004732691.1 no significant database hits
YP_004732692.1 doubtful CDS with no significant database hits
YP_004732693.1 doubtful CDS with no significant database hits
YP_004732694.1 no significant database hits
YP_004732697.1 no significant database hits
YP_004732700.1 the database matches are not full length
YP_004732736.1 similarity to database hits is limited to their very C-terminus. It is possible that this CDS is an adhesin gene remnant
YP_004732775.1 Similar to the N-terminus of many related membrane proteins