An Integrated Prediction Pipeline for Bacterial Type III Secreted Effectors.  


How to use the T3SEpp website to predict T3SS effector proteins?

  • Paste the candidate protein sequences into the text area or upload a FASTA file (.fa/.fasta/.FASTA). Each protein name must contain no spaces and no illegal characters such as "|".

  • The T3SEpp website also offers 4 optional checkboxes. Select them to upload additional files that improve the prediction accuracy. Options 2~4 require result files predicted by PSORTb, TMHMM and SignalP4.0, respectively. Details are described below.

  • Provide an email address when you submit your job if you want to be notified once the job has finished.

  • A unique job name and a link will be provided once your job has been submitted successfully. Use the link to check your job results.

  • A standalone version of T3SEpp is freely available at http://www.szu-bioinf.org/T3SEpp/download.html, and its manual can be downloaded here.
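The naming rule in the first step can be checked locally before submitting. The following is a minimal sketch, not part of T3SEpp: the function names are hypothetical, and it assumes that only whitespace and "|" are rejected, whereas the website may reject further characters.

```python
import re

# Assumption: whitespace and "|" are the only characters checked here;
# the website may treat other characters as illegal too.
ILLEGAL = re.compile(r"[\s|]")

def read_fasta_names(text):
    """Return the header (text after '>') of every FASTA record."""
    return [line[1:] for line in text.splitlines() if line.startswith(">")]

def invalid_names(text):
    """Return the headers that are empty or contain whitespace or '|'."""
    return [n for n in read_fasta_names(text) if not n or ILLEGAL.search(n)]

# Made-up records for illustration only.
example = ">SopE\nMTKITLSP\n>sp|P12345|bad name\nMKK\n"
print(invalid_names(example))  # ['sp|P12345|bad name']
```

Any name the check reports should be renamed in the FASTA file, and in every optional result file that refers to it, before upload.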


NOTE:


  • Promoter sequences should be in FASTA format, and each promoter must be shorter than 2001 nt.

  • Each promoter name must exactly match the name of the corresponding protein sequence.

  • The name must contain no spaces and no illegal characters such as "|".
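The promoter requirements above can be verified before upload. This is a small sketch under stated assumptions: `parse_fasta` and `check_promoters` are hypothetical helpers, and "< 2001 nt" is read as a maximum of 2000 nt per promoter.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping record name to sequence."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif line and name is not None:
            records[name] += line
    return records

MAX_PROMOTER_NT = 2000  # "< 2001 nt" in the note above

def check_promoters(protein_fasta, promoter_fasta):
    """Report promoters that are too long or have no matching protein name."""
    proteins = parse_fasta(protein_fasta)
    promoters = parse_fasta(promoter_fasta)
    problems = []
    for name, seq in promoters.items():
        if name not in proteins:
            problems.append(f"{name}: no protein with this name")
        if len(seq) > MAX_PROMOTER_NT:
            problems.append(f"{name}: promoter is {len(seq)} nt, limit is {MAX_PROMOTER_NT}")
    return problems
```

An empty list from `check_promoters` means every promoter has a same-named protein and fits the length limit.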

NOTE:


  • PSORTb is available at https://www.psort.org/psortb/index.html.

  • Use the Tab-delimited Output Format (Short Format) with 3 columns.

  • Each protein name must exactly match the name of the corresponding protein sequence, and must contain no spaces and no illegal characters such as "|".

  • Upload the result as a TXT file without a header.
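The PSORTb requirements above (three tab-delimited columns, matching names, no header) can be checked with a short script. This is a sketch only: `clean_psortb` is a hypothetical helper, it assumes the header occupies exactly the first non-empty line of the downloaded file, and the sample rows are made up.

```python
def clean_psortb(text, protein_names, drop_first_line=True):
    """Return (kept_lines, problems) for a 3-column, tab-delimited result file."""
    lines = [l for l in text.splitlines() if l.strip()]
    if drop_first_line:
        lines = lines[1:]  # assumption: the header is a single leading line
    kept, problems = [], []
    for line in lines:
        fields = line.split("\t")
        if len(fields) != 3:
            problems.append(f"expected 3 columns, got {len(fields)}: {line!r}")
        elif fields[0] not in protein_names:
            problems.append(f"unknown protein name: {fields[0]!r}")
        else:
            kept.append(line)
    return kept, problems

# Made-up rows for illustration only.
sample = "SeqID\tLocalization\tScore\nSopE\tExtracellular\t9.5\nX\tCytoplasmic\n"
kept, problems = clean_psortb(sample, {"SopE"})
print(kept)
print(problems)
```

Writing the `kept` lines to a TXT file gives a header-free file in the shape the notes describe.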

NOTE:


  • TMHMM is available at https://services.healthtech.dtu.dk/service.php?TMHMM-2.0.

  • Use the "One line per protein" output format.

  • Each protein name must exactly match the name of the corresponding protein sequence, and must contain no spaces and no illegal characters such as "|".

  • Upload the result as a TXT file without a header.

NOTE:


  • SignalP is available at https://services.healthtech.dtu.dk/service.php?SignalP-4.1.

  • Use the "Short (no graphics)" output format.

  • Each protein name must exactly match the name of the corresponding protein sequence, and must contain no spaces and no illegal characters such as "|".

  • Upload the result as a TXT file without a header.



You are here : HCD Lab ;    H: Haplotype    C: Cloud computing    D: Deep learning



How to interpret the results from the T3SEpp website?

  • At the top of the Results Page, the summarized prediction results of the different modules, the general prediction scores and the classification (T3S/nonT3S) are shown first; they are also written to "T3SEpp.out.txt".

  • In the table, “1” represents “a positive contribution to the final prediction of T3SEpp” while “0” means no positive contribution. (Note: transHMM, TMHMM, PSORTb and SignalP default to "0" if the related input files are not provided.)

  • All result files from the different modules are packed in a zip file that can be downloaded from the Results Page. 'T3SEpp.out.txt' can also be downloaded separately.

Parameter and Performance :

Cutoff value for web-based T3SEpp prediction: 0.5;

Selectivity: ~95% (FPR: 0.05);

Sensitivity: ~91%.
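To illustrate how the cutoff and the performance figures relate, here is a brief sketch. It assumes that a score at or above the cutoff is called T3S (the page does not say how a score of exactly 0.5 is handled) and reads "selectivity" as specificity, i.e. 1 - FPR. The confusion-matrix counts are hypothetical values chosen only to reproduce the quoted percentages.

```python
CUTOFF = 0.5  # web-based T3SEpp cutoff quoted above

def classify(score, cutoff=CUTOFF):
    # Assumption: scores at or above the cutoff are called T3S.
    return "T3S" if score >= cutoff else "nonT3S"

def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical counts, not taken from any T3SEpp benchmark.
sens, spec = sensitivity_specificity(tp=91, fn=9, tn=95, fp=5)
fpr = 1 - spec
print(classify(0.73), classify(0.12))
print(round(sens, 2), round(spec, 2), round(fpr, 2))
```

Raising the cutoff would normally trade sensitivity for a lower false positive rate; the values above apply only to the default of 0.5.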



Prediction results from modules:

  • We provide nine modules to predict candidate T3S proteins, i.e. T3SEppML, flBlast, sigHMM, cbdHMM, effectHMM, transHMM [optional], TMHMM [optional], PSORTb [optional] and SignalP [optional]. The newly developed modules are shown in italic blue font, while those adopted from other groups are shown in red.

  • Each module generates an output file named "MODULE.out.txt"; all of these files are included in the zip file.

  • Details are described below.


T3SEppML

The prediction result contains two columns: the protein ID and the classification at the default cutoff of each model, where “1” means "T3S signal" and “0” means "non-T3S signal".


flBlast

flBlast screens for proteins with homology to validated T3SEs.

Its result has seven columns: protein ID, candidate protein name, the most significant homologous hit, effector family, functional group, length coverage, and similarity over the covered regions.


sigHMM

sigHMM screens for proteins with conserved T3S signal sequences.

Its result has two columns. The first column lists the names of the proteins to be predicted, and the second gives the prediction results: ‘-’ represents no hit, while "SigFAM_NUMBER_N50" indicates the T3S signal family profile that the predicted protein contains.
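A two-column result of this kind can be read into a dictionary for downstream filtering. The sketch below is an illustration under assumptions: the columns are taken to be separated by whitespace (safe only because protein names contain no spaces), `parse_hits` is a hypothetical helper, and the sample family label merely follows the "SigFAM_NUMBER_N50" pattern.

```python
def parse_hits(text):
    """Map each protein name to its hit label, or None when the result is '-'."""
    hits = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        name, result = fields
        hits[name] = None if result == "-" else result
    return hits

# Made-up content in the shape described above.
sample = "SopE\tSigFAM_1_N50\nRpoD\t-\n"
hits = parse_hits(sample)
with_signal = [name for name, fam in hits.items() if fam is not None]
print(with_signal)
```

The same approach should fit any other module output that uses a name column followed by a single result column with ‘-’ for no hit.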


cbdHMM

cbdHMM screens for proteins containing a conserved motif in the chaperone-binding domain (CBD).

Its result has three columns. The first column lists the names of the proteins to be predicted, and the last two show the locations and sequences of the conserved CBD motif, respectively.


effectHMM

effectHMM screens for proteins with known effector domains.

Its result has two columns. The first column lists the names of the proteins to be predicted, and the second gives the prediction results: ‘-’ represents no hit, while ‘Effector_FAM_NUMBER’ indicates the effector domain family that the predicted protein belongs to.


transHMM

transHMM screens for proteins whose gene promoter region contains a putative T3SE regulatory motif.

Its result has four columns: protein ID, the transcription regulators, and the locations and sequences of the binding motifs within the promoters of the regulated genes.


TMHMM, PSORTb & SignalP

These results are available only if you submit the optional input files (PSORTb, SignalP and TMHMM prediction files). Since a typical T3SE protein should be neither a cytoplasmic protein nor a transmembrane protein, and should lack a putative signal peptide, these prediction results are used as contributors in the T3SEpp pipeline:


  • A non-transmembrane topology is considered a positive contribution to the final prediction of T3SEpp.
  • A non-'Cytoplasmic' localization is considered a positive contribution to the final prediction of T3SEpp.
  • A non-‘SP’ result is considered a positive contribution to the final prediction of T3SEpp.
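The three rules above, together with the default of "0" when an optional file is missing, can be written out as a small function. This is a sketch and not the pipeline's own code: `optional_contributions` is a hypothetical helper, "non-transmembrane" is assumed to mean zero predicted helices, and the localization and signal peptide labels are assumed to have been parsed from the uploaded files beforehand.

```python
def optional_contributions(tm_helices=None, localization=None, signalp_call=None):
    """Return the 0/1 contribution of each optional module.

    None means the corresponding result file was not provided, which
    yields 0, matching the stated default.
    """
    return {
        # Assumption: zero predicted helices means non-transmembrane.
        "TMHMM": int(tm_helices is not None and tm_helices == 0),
        "PSORTb": int(localization is not None and localization != "Cytoplasmic"),
        "SignalP": int(signalp_call is not None and signalp_call != "SP"),
    }

print(optional_contributions())                         # nothing uploaded
print(optional_contributions(0, "Extracellular", "OTHER"))
print(optional_contributions(2, "Cytoplasmic", "SP"))
```

How the pipeline treats edge cases, such as an undetermined localization, is not stated on this page, so the literal reading used here may differ from the actual implementation.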





CONTACT

Please feel free to send us a message if you have any questions.

Yejun Wang, PhD.

HCD Laboratory,

Shenzhen University Health Science Center