Catalogue of Molecular Biology Programs Release 6.1 23 Jul 1999
All thanks are in the file biocatal.thanks.
=====================================================================
AC BC00056
NAME PLSEARCH
DOMAIN Alignment Search software
DOMAIN Pattern Identification
DESCRIPTION A database of primary sequence patterns,
DESCRIPTION constructed from all sequence families in
DESCRIPTION EMBL's SWISS-PROT protein sequence database,
DESCRIPTION and a dynamic-programming search tool for
DESCRIPTION matching newly-generated sequences against
DESCRIPTION the pattern library
AUTHOR Smith, R.F. and Smith T.F.
RA Smith R.F., Smith T.F.;
RT "Automatic generation of primary sequence patterns from sets
RT of related protein sequences.";
RL Proc. Natl. Acad. Sci. U.S.A. 87:118-122(1990).
RX Medline; 90115821.
RX SeqAnalRef; SMIR9001.
ADDRESS Molecular Biology Computer Research Resource
ADDRESS Dana-Farber Cancer Institute and School of Public Health
ADDRESS Harvard University, 44 Binney St., Boston MA 02115 USA.
CONTACT Smith T: tsmith@mbcrr.harvard.edu
SITE ftp anonymous mbcrr.harvard.edu
SITE Directory /MBCRR-Package
SITE-CONTACT Smith T: tsmith@mbcrr.harvard.edu
SITE ftp anonymous ftp.ebi.ac.uk
SITE Directory /pub/software/unix UNIX software is stored
SITE as archive (tar) files compressed (.Z).
SITE-CONTACT nethelp@ebi.ac.uk
SITE ftp anonymous ftp.bchs.uh.edu
SITE Directory /pub/gene-server/unix
SITE-CONTACT Dan Davison: dbd@theory.bchs.uh.edu
OS Unix
LANGUAGE C
VOLUME -
AC BC00160
NAME ARIADNE
DOMAIN Pattern Identification
DESCRIPTION Complex Pattern Identification Tool using the
DESCRIPTION GenBank database.
AUTHOR -
RT -
ADDRESS DNASTAR, Inc.
CONTACT Philip E. Doggett
SITE -
SITE-CONTACT -
OS VAX, Unix Sun-3 and Sun-4
LANGUAGE Any Common Lisp System
VOLUME -
AC BC00161
NAME PATTERN
DOMAIN Pattern Identification
DESCRIPTION Pattern searching program
AUTHOR Giles I.
RA Cockwell K.Y., Giles I.G.;
RT "Software tools for motif and pattern scanning: program descriptions
RT including a universal sequence reading algorithm.";
RL Comput. Appl. Biosci. 5:227-232(1989).
RX Medline; 89353669.
RX SeqAnalRef; COCK8901.
ADDRESS -
CONTACT -
SITE ftp anonymous ftp.ebi.ac.uk
SITE Directory /pub/software/vax VAX software is stored as
SITE uuencoded (.UUE) files.
SITE-CONTACT nethelp@ebi.ac.uk
SITE ftp anonymous ftp.bchs.uh.edu (129.7.2.43)/
SITE Directory /pub/gene-server/vms
SITE-CONTACT Dan Davison: dbd@theory.bchs.uh.edu
OS VAX/VMS
LANGUAGE C
VOLUME -
AC BC00162
NAME PROSEARCH
DOMAIN Pattern Identification
DESCRIPTION ProSearch allows direct searching of regular expressions on
DESCRIPTION protein sequences. The PROSITE patterns must be translated
DESCRIPTION into Unix-style regular expressions by hand or by cregex (see
DESCRIPTION the description for that program). One or more protein
DESCRIPTION sequences can be swiftly searched for all patterns in the
DESCRIPTION database. Patterns can be easily developed and added to the
DESCRIPTION database. All output is written to standard output, and as
DESCRIPTION such can be redirected to any device. The input format is
DESCRIPTION constrained to just sequence, but many input formats can be
DESCRIPTION handled by filtering the input through READSEQ (gilbert@iubio.
DESCRIPTION vax.indiana.edu). There are two levels of output. The simplest
DESCRIPTION is a table of sites in the protein which match patterns in the
DESCRIPTION database, and the pattern's name. The more complete output is
DESCRIPTION this table, the sequence matched, and the PROSITE
DESCRIPTION documentation associated with the pattern.
AUTHOR Kolakowski L. F.
RA Kolakowski L.F. Jr., Leunissen J.A.M., Smith J.E.;
RT "ProSearch: fast searching of protein sequences with regular
RT expression patterns related to protein structure and function.";
RL Biotechniques 13:919-921(1992).
RX Medline; 93119652.
RX SeqAnalRef; KOLL9201.
ADDRESS Massachusetts General Hospital, Endocrine Unit,
ADDRESS Wellman 5, Fruit Street, Boston, MA 02114, USA
CONTACT kolakowski@helix.mgh.harvard.edu
SITE ftp anonymous ftp.bio.indiana.edu
SITE Directory /molbio/search
SITE-CONTACT D Gilbert: gilbertd@cricket.bio.indiana.edu
SITE ftp anonymous ftp.ebi.ac.uk
SITE Directory /pub/software/unix UNIX software is stored
SITE as archive (tar) files compressed (.Z).
SITE-CONTACT nethelp@ebi.ac.uk
SITE ftp anonymous ftp.bchs.uh.edu (129.7.2.43)/
SITE Directory /pub/gene-server/unix
SITE-CONTACT Dan Davison: dbd@theory.bchs.uh.edu
SITE ftp anonymous ftp.nig.ac.jp
SITE Directory /pub/unix
SITE-CONTACT Y Ugawa: yugawa@genes.nig.ac.jp
OS Unix
LANGUAGE C, awk
VOLUME -
REQUIRES CREGEX
AC BC00163
NAME SEQ_VIS
DOMAIN Pattern Identification
DESCRIPTION MBCRR Software Package
DESCRIPTION Generates a compact pictorial display of the
DESCRIPTION distribution of user-defined regular-
DESCRIPTION expression patterns along any sequence
AUTHOR TF Smith et al.
RA Smith T.F., Srinivasan A., Schochetman G., Marcus M., Myers G.;
RT "The phylogenetic history of immunodeficiency viruses.";
RL Nature 333:573-575(1988).
ADDRESS Molecular Biology Computer Research Resource
ADDRESS Dana-Farber Cancer Institute and School of Public Health
ADDRESS Harvard University, 44 Binney St., Boston MA 02115 USA.
CONTACT Smith T: tsmith@mbcrr.harvard.edu
SITE ftp anonymous mbcrr.harvard.edu
SITE Directory /MBCRR-Package
SITE-CONTACT Smith T: tsmith@mbcrr.harvard.edu
OS Unix
LANGUAGE C
VOLUME -
AC BC00164
NAME WORDUP
DOMAIN Pattern Identification
DESCRIPTION The huge production of nucleic acid sequences has generated an
DESCRIPTION impelling need for specific algorithms aimed at decoding the
DESCRIPTION genetic information written in the DNA. In a previous paper,
DESCRIPTION Pesole et al.
(1992) presented the WORDUP method which is based on a
DESCRIPTION first order Markov analysis and allows us to detect statistically
DESCRIPTION significant sequence motifs from six to ten nucleotides long in
DESCRIPTION the sequences under investigation.
DESCRIPTION We present here an improvement of the previous algorithm. The
DESCRIPTION latest version allows us to dynamically detect significant
DESCRIPTION signals of any length in the same analysis. The problem addressed
DESCRIPTION is the singling out of short nucleic sequences with non-random
DESCRIPTION statistical properties, which may thus be biologically active.
DESCRIPTION The key element in the processing is the use of a statistical
DESCRIPTION analysis based on a special chi square test between the pattern
DESCRIPTION of observed frequencies and that of frequencies assessed by
DESCRIPTION probability calculation. The statistical significance of each
DESCRIPTION pattern is determined by comparing the expected number of
DESCRIPTION sequences containing a given pattern with the observed one.
DESCRIPTION The results are then further tested to reject the falsely
DESCRIPTION significant patterns, caused by partial overlapping with the
DESCRIPTION true biological signal searched. The problem was solved by
DESCRIPTION starting from the analysis of shorter oligomers and, through
DESCRIPTION subsequent iterations, by checking that the oligomers found to be
DESCRIPTION statistically significant are not actually components of
DESCRIPTION longer biological signals.
DESCRIPTION The theoretical and practical importance of this method resides
DESCRIPTION in the application of a very fast string matching algorithm
DESCRIPTION (FastPat). The algorithm was adjusted to suit the specific
DESCRIPTION requirements of biological research, whereby both the pattern and
DESCRIPTION the sequences are compressed so that the natural four letter
DESCRIPTION alphabet of DNA sequences is considerably enlarged.
The novelty
DESCRIPTION of this method is that it does not require a priori alignment
DESCRIPTION of the sequences to be analysed, and it performs an accurate
DESCRIPTION statistical analysis accounting for the non-random sequence of
DESCRIPTION the four nucleotides in the DNA.
AUTHOR Pesole, Prunella, Liuni, Attimonelli, Saccone
RA Pesole G., Prunella N., Liuni S., Attimonelli M., Saccone C.;
RT "WORDUP: an efficient algorithm for discovering statistically
RT significant patterns in DNA sequences.";
RL Nucleic Acids Res. 20:2871-2875(1992).
ADDRESS Sabino Liuni
ADDRESS Area di Ricerca CNR
ADDRESS Via Amendola 166/5
ADDRESS 70126 Bari (Italy)
CONTACT sabino@area.ba.cnr.it
SITE ftp anonymous area.ba.cnr.it
SITE Directory /pub/embnet/software/wordup
SITE-CONTACT sabino@area.ba.cnr.it
OS Unix
LANGUAGE C
VOLUME -
AC BC00212
NAME SCRUTINEER
DOMAIN Pattern Identification
DESCRIPTION Scrutineer is a program designed for searching protein
DESCRIPTION sequences for patterns, motifs, alignments and so on. A very
DESCRIPTION wide range of patterns can be handled. One option in
DESCRIPTION Scrutineer is to place patterns in a file and then search
DESCRIPTION typically a small number of sequences for all patterns. We
DESCRIPTION provide PROSITE patterns in Scrutineer format in a file. A new
DESCRIPTION protein sequence can be readily searched for all PROSITE
DESCRIPTION patterns. The interface to the program is simple and command
DESCRIPTION driven. The code is portable. The user may easily generate
DESCRIPTION their own patterns for later use. Scrutineer does NOT make use
DESCRIPTION of the documentation associated with each PROSITE pattern (to
DESCRIPTION tour the documentation we use routines like VAX SEARCH or Unix
DESCRIPTION grep instead). A manual comes with the program.
AUTHOR Peter R.
Sibbald, Hubert Sommerfeldt, and Patrick Argos
RA Sibbald P.R., Argos P.;
RT "Scrutineer: a computer program that flexibly seeks and
RT describes motifs and profiles in protein sequence databases.";
RL Comput. Appl. Biosci. 6:279-288(1990).
RX Medline; 91003522.
RX SeqAnalRef; SIBP9001.
RA Sibbald P.R., Sommerfeldt H., Argos P.;
RT "Automated protein sequence pattern handling and PROSITE
RT searching.";
RL Comput. Appl. Biosci. 7:535-536(1991).
RX Medline; 92083367.
RX SeqAnalRef; SIBP9101.
ADDRESS EMBL Data Library
ADDRESS PostFach 10.2209, D-6900 Heidelberg, Germany
CONTACT Peter Sibbald, Biocomputing Programme
CONTACT sibbald@embl-heidelberg.de
SITE ftp anonymous ftp.ebi.ac.uk
SITE Directory /pub/software/vax VAX software is stored as
SITE uuencoded (.UUE) files.
SITE-CONTACT nethelp@ebi.ac.uk
OS VAX/VMS, Unix
LANGUAGE Pascal
VOLUME -
AC BC00349
NAME PATTERN
DOMAIN Pattern Identification
DESCRIPTION Part of the ANTHEPROT software package for protein sequence
DESCRIPTION analysis. Can scan a sequence with the patterns stored in the
DESCRIPTION PROSITE.DAT file or with user-defined patterns stored using the
DESCRIPTION same syntax. The user can set an optional similarity threshold
DESCRIPTION or mismatch number.
AUTHOR C. Geourjon, Gilbert Deleage
RA Geourjon C., Deleage G.;
RT "Interactive and graphic coupling between multiple alignments,
RT secondary structure predictions and motif/pattern scanning
RT into proteins.";
RL Comput. Appl. Biosci. 9:87-91(1993).
RX Medline; 93169503.
RX SeqAnalRef; GEOC9301.
ADDRESS Institut de Biologie et Chimie des Proteines
ADDRESS UPR 412-CNRS, Universite Claude Bernard Lyon 1
ADDRESS 7, passage du Vercors
ADDRESS F-69367 Lyon Cedex 07, France
ADDRESS Tel: +33-72 72 26 47, Fax: +33-72 72 26 01
CONTACT deleage@ibcp.fr
SITE -
SITE-CONTACT -
OS IBM RISC 6000 under AIX (Unix), DOS
LANGUAGE F77 Fortran
VOLUME -
REQUIRES ANTHEPROT software package
COMMENTS Availability: Can be obtained for non-commercial use; write
COMMENTS to the authors or contact them by electronic mail.
AC BC00350
NAME CREGEX
DOMAIN Pattern Identification
DESCRIPTION CREGEX reformats the native PROSITE database into a file
DESCRIPTION containing regular expressions, which can be used in the
DESCRIPTION pattern matching language AWK. Entries containing a range
DESCRIPTION are split into multiple patterns. The regular expression
DESCRIPTION file is most conveniently used in combination with
DESCRIPTION PROSEARCH scripts.
AUTHOR Jack A.M. Leunissen
RT None (if necessary cite as unpublished method)
ADDRESS CAOS/CAMM Center, University of Nijmegen
ADDRESS Toernooiveld, 6525 ED Nijmegen, The Netherlands
ADDRESS Tel: +31-80-652248
CONTACT jackl@caos.caos.kun.nl
SITE ftp anonymous ftp.ebi.ac.uk
SITE Directory /pub/software/unix
SITE-CONTACT nethelp@ebi.ac.uk
SITE WWW Server at URL http://www.ebi.ac.uk/
OS VAX with VMS, Unix, MS-DOS
LANGUAGE C
VOLUME -
REQUIRES -
AC BC00351
NAME MOTIFS
DOMAIN Pattern Identification
DESCRIPTION MOTIFS looks for protein sequence motifs by checking your
DESCRIPTION protein sequence for every sequence pattern in the PROSITE
DESCRIPTION database. Symbol mismatches can be allowed. For each find, the
DESCRIPTION output file displays the original complex pattern from the
DESCRIPTION PROSITE database and the actual (simplified) pattern that was
DESCRIPTION identified. As an option, the PROSITE documentation relevant
DESCRIPTION to each 'found' pattern can be included in the output file.
AUTHOR -
RT -
ADDRESS Genetics Computer Group
ADDRESS 575 Science Drive
ADDRESS Madison, WI 53711, U.S.A.
CONTACT help@gcg.com
SITE -
SITE-CONTACT -
OS VAX with VMS 5.0 or greater, DEC-AXP with Open VMS 1.0 or greater
OS Silicon Graphics (RISC) with IRIX Version 5.0
OS SUN (Sparc) with SunOS Version 4.1.3 or Solaris version 2.2
LANGUAGE C, Fortran
VOLUME -
REQUIRES The Wisconsin Package (GCG)
COMMENTS Commercial software
AC BC00352
NAME MacPattern
DOMAIN Pattern Identification
DESCRIPTION MacPattern allows you to use PROSITE (or any pattern database
DESCRIPTION adhering to the PROSITE conventions) for searching protein (or
DESCRIPTION DNA) sequences for the occurrence of known patterns. You may
DESCRIPTION search for all patterns in the database, for selected entries
DESCRIPTION or you may create "pattern sets". The complete documentation
DESCRIPTION for each pattern in PROSITE can be accessed easily. Output can
DESCRIPTION be viewed on screen, printed, saved to disk or copied to the
DESCRIPTION Clipboard. MacPattern reads protein sequences in the
DESCRIPTION following formats: SWISS-PROT, NBRF/PIR, Pearson's FASTA,
DESCRIPTION Intelligenetics, DNA-Strider, DNAid, and ASCII. Works with
DESCRIPTION individual sequences or sets of sequences, including
DESCRIPTION SWISS-PROT on the EMBL CD-ROM.
DESCRIPTION Version 2.0 and later also utilise Henikoff's BLOCKS database
DESCRIPTION for searches of protein sequences with site-specific scoring
DESCRIPTION matrices. Output is evaluated according to the "strength"
DESCRIPTION value provided in each database entry. Another option is the
DESCRIPTION identification of statistically significant protein segments
DESCRIPTION using the Maximal Segment Score method by Karlin and Altschul.
AUTHOR Rainer Fuchs
RA Fuchs R.;
RT "MacPattern: protein pattern searching on the
RT Apple MacIntosh.";
RL Comput. Appl. Biosci. 7:105-106(1991).
RX Medline; 91167980.
RX SeqAnalRef; FUCR9101.
ADDRESS Glaxo Research Institute, 5 Moore Drive,
ADDRESS Research Triangle Park, NC 27709, USA
CONTACT rf11522@glaxo.com
SITE ftp anonymous ftp.ebi.ac.uk
SITE Directory /pub/software/mac
SITE-CONTACT nethelp@ebi.ac.uk
SITE WWW Server at URL http://www.ebi.ac.uk
OS MacOS
LANGUAGE -
VOLUME -
REQUIRES -
AC BC00353
NAME Motif Master
DOMAIN Pattern Identification
DESCRIPTION Motif Master searches protein sequence files or sequence
DESCRIPTION databases against the PROSITE database. Files & databases may
DESCRIPTION also be searched against user-generated databases or patterns.
DESCRIPTION Search results (pattern matches) may be displayed on-screen
DESCRIPTION or outputted to a file. Motif Master also searches the PROSITE
DESCRIPTION documentation file, from which specific chapters may be
DESCRIPTION retrieved. The program accepts DNA or RNA pattern databases as
DESCRIPTION well, for searching nucleic acid sequence files. PIR, SWISS-
DESCRIPTION PROT, EMBL and GenBank formats are all compatible with Motif
DESCRIPTION Master.
AUTHOR Maltchenko S.
RT -
ADDRESS Maltchenko S.
ADDRESS Zabolotny str.150,
ADDRESS Institute of Molecular Biology and Genetics,
ADDRESS Ukrainian Acad. of Science,
ADDRESS Kiev-143, 252627, Ukraine
ADDRESS Tel: +7-44-2665405, Fax: +7-44-2243243
ADDRESS National Biosciences Inc.
ADDRESS 3650 Annapolis Lane
ADDRESS Plymouth, MN 55447, USA
ADDRESS Tel: +1-612-550 2012 or +1-800-747 4362
ADDRESS Fax: +1-612-550 9625 or +1-800-369 5118
CONTACT chacsh%imbg.kiev.ua@relay.ussr.eu.net
SITE -
SITE-CONTACT -
OS DOS 3.1 and higher
LANGUAGE -
VOLUME -
REQUIRES -
COMMENTS Commercial software distributed by National Biosciences Inc.
AC BC00354
NAME PATMAT
DOMAIN Pattern Identification
DOMAIN Searching databases
DESCRIPTION General database searching tool with a simple and clear user
DESCRIPTION interface. Can search PROSITE.DAT as a pattern database.
DESCRIPTION Searches other databases with patterns as probes.
AUTHOR James C.
Wallace
RA Wallace J.C., Henikoff S.;
RT "PATMAT: a searching and extraction program for sequence,
RT pattern and block queries and databases.";
RL Comput. Appl. Biosci. 8:249-254(1992).
RX Medline; 92339038.
RX SeqAnalRef; WALJ9201.
ADDRESS Fred Hutchinson Cancer Research Center
ADDRESS 1124 Columbia Street, Seattle, WA 98104, USA
ADDRESS Tel: +1-206-6674501
CONTACT root@fred.fhcrc.org
SITE ftp anonymous ncbi.nlm.nih.gov
SITE Directory /repository/blocks/patmat.dos (for dos version)
SITE-CONTACT -
OS DOS 3.1 and higher, Unix
LANGUAGE -
VOLUME -
REQUIRES -
AC BC00355
NAME PROMOT
DOMAIN Pattern Identification
DESCRIPTION PROMOT will scan one or more protein sequences against
DESCRIPTION PROSITE. Statistics of the matches are evaluated. User-defined
DESCRIPTION patterns can also be input. PROMOT will also scan one or more
DESCRIPTION motifs against a protein sequence database in SWISS-PROT or
DESCRIPTION PIR-NBRF formats.
AUTHOR Sternberg M.J.E.
RA Sternberg M.J.E.;
RT "PROMOT: A FORTRAN program to scan protein sequences against a
RT library of known motifs.";
RL Comput. Appl. Biosci. 7:257-260(1991).
RX Medline; 91283902.
RX SeqAnalRef; STEJ9102.
ADDRESS Oxford Molecular Ltd (OML)
ADDRESS Magdalen Centre, Oxford Science Park
ADDRESS Sandford-on-Thames, Oxford OX4 4GA, UK
ADDRESS Tel: +44-865-784600, Fax: +44-865-784601
CONTACT -
SITE -
SITE-CONTACT -
OS Silicon Graphics IRIX, Hewlett Packard 700, SUN Sparcstations
LANGUAGE -
VOLUME -
REQUIRES -
COMMENTS Commercial software
AC BC00356
NAME PROSITE
DOMAIN Pattern Identification
DESCRIPTION Scans a given protein sequence for the occurrence of any
DESCRIPTION pattern listed in PROSITE. Input files in SWISS-PROT, FASTA or
DESCRIPTION plain ASCII formats are recognized. The program output can be
DESCRIPTION saved in an ASCII file and shows the position of the patterns
DESCRIPTION under the sequence, as well as the complete PROSITE pattern.
DESCRIPTION Program flags can be set to exclude the PROSITE.DOC or to
DESCRIPTION include the PROSITE.DAT information. A position table of
DESCRIPTION pattern occurrences is produced for each pattern.
AUTHOR Klaus Hartmuth, Manfred D. Zorn (Unix version)
RT -
ADDRESS Klaus Hartmuth, Institut fuer Biochemie, University of Vienna
ADDRESS Waehringerstrasse 17, A-1090 Vienna, Austria
ADDRESS Tel: +43-222-436141-56
ADDRESS Manfred D. Zorn, Human Genome Center MS 50B-3238
ADDRESS Lawrence Berkeley Laboratory, 1 Cyclotron Road
ADDRESS Berkeley, CA 94720, USA
ADDRESS Tel: +1-510-4865041
CONTACT a5161dad@vm.univie.ac.at, mdzorn@lbl.gov
SITE ftp anonymous genome.lbl.gov
SITE Directory /pub/prosite
SITE-CONTACT -
OS IBM 3090 mainframe under VM/CMS, Unix
LANGUAGE C
VOLUME -
REQUIRES -
COMMENTS Availability: VM/CMS version: send an email message to the
COMMENTS author or write to him.
AC BC00357
NAME PROSITE
DOMAIN Pattern Identification
DOMAIN Database and analysis
DESCRIPTION PROSITE is incorporated in three different parts of the
DESCRIPTION GeneWorks program. All or a portion of the patterns described
DESCRIPTION in PROSITE can be displayed in the sequence view of GeneWorks.
DESCRIPTION The sites described in PROSITE can also be seen in GeneWorks'
DESCRIPTION Graphic view, which shows a graphic schematic of the sequence
DESCRIPTION and can compare the PROSITE graphic with other elements, such
DESCRIPTION as plots of hydrophobicity, structure, and others. GeneWorks
DESCRIPTION is also capable of searching all of the SWISS-PROT database
DESCRIPTION for patterns. It is possible to enter any of the PROSITE
DESCRIPTION patterns in GeneWorks Query view to search the data bank. The
DESCRIPTION PROSITE documentation describing each of the patterns is
DESCRIPTION available as on-line GeneWorks help. By selecting the pattern,
DESCRIPTION the user can bring up a printable window containing the
DESCRIPTION appropriate PROSITE help.
AUTHOR -
RT -
ADDRESS IntelliGenetics, Inc.
ADDRESS 700 El Camino Real East, Suite 300
ADDRESS Mountain View, CA 94040-2216, USA
ADDRESS Tel: +1-415-9627300, Fax: +1-415-9627302
ADDRESS IntelliGenetics Belgium
ADDRESS Amocolaan 2
ADDRESS B 2440 Geel, Belgium
ADDRESS Tel: +32-3-2195352, Fax: +32-3-2195354
CONTACT -
SITE -
SITE-CONTACT -
OS Apple MacIntosh; System 6.x and 7
LANGUAGE -
VOLUME -
REQUIRES GeneWorks(R) package
COMMENTS Commercial software
AC BC00358
NAME PROSITE
DOMAIN Pattern Identification
DESCRIPTION This program is part of the PC/Gene protein and nucleotide
DESCRIPTION sequence analysis package. It allows the user to scan a
DESCRIPTION protein sequence for the occurrence of PROSITE patterns. The
DESCRIPTION output shows the list of hits in a tabular form as well as
DESCRIPTION annotations under a standard three-letter code representation
DESCRIPTION of the sequence. So as to expound the biological relevance of
DESCRIPTION the sites detected in a sequence the program can display the
DESCRIPTION documentation relevant to each pattern detected in a sequence.
DESCRIPTION The PROSITE documentation file can also be searched for the
DESCRIPTION occurrence of one or more word(s).
AUTHOR -
RT -
ADDRESS IntelliGenetics, Inc.
ADDRESS 700 El Camino Real East, Suite 300
ADDRESS Mountain View, CA 94040-2216, USA
ADDRESS Tel: +1-415-9627300, Fax: +1-415-9627302
ADDRESS IntelliGenetics Belgium
ADDRESS Amocolaan 2
ADDRESS B 2440 Geel, Belgium
ADDRESS Tel: +32-3-2195352, Fax: +32-3-2195354
CONTACT -
SITE -
SITE-CONTACT -
OS DOS 3.1 and higher
LANGUAGE -
VOLUME -
REQUIRES PC/Gene package
COMMENTS Commercial software
AC BC00360
NAME PROTOMAT
DOMAIN Pattern Identification
DESCRIPTION Reads an entry from PROSITE.DAT and extracts all true positive
DESCRIPTION and false negative sequences listed there from SWISS-PROT.
DESCRIPTION Runs H.O. Smith's "motif" algorithm (PNAS 87:826-830) on the
DESCRIPTION group of proteins and extends the motifs into blocks.
DESCRIPTION Assembles a "best path" set of blocks describing the group of
DESCRIPTION proteins.
AUTHOR Jorja Henikoff and Steven Henikoff
RA Henikoff S., Henikoff J.;
RT "Automated assembly of protein blocks for database searching.";
RL Nucleic Acids Res. 19:6565-6572(1991).
RX Medline; 92093619.
RX SeqAnalRef; HENS9101.
ADDRESS Howard Hughes Medical Institute
ADDRESS Fred Hutchinson Cancer Research Center
ADDRESS 1124 Columbia Street, M-684
ADDRESS Seattle, WA 98104, USA
ADDRESS Tel: +1-206-6674515
CONTACT henikoff@sparky.fhcrc.org
SITE ftp anonymous ncbi.nlm.nih.gov
SITE Directory /repository/blocks/protomat.unix (unix version)
SITE ftp anonymous ncbi.nlm.nih.gov
SITE Directory /repository/blocks/protomat.dos (dos version)
SITE-CONTACT -
OS DOS 3.1 and higher, SUN (Sparc) with SunOS Version 4
LANGUAGE -
VOLUME -
REQUIRES -
AC BC00362
NAME QUEST
DOMAIN Pattern Identification
DESCRIPTION The QUEST program allows a user to scan a data bank for any
DESCRIPTION pattern of characters in a sequence or annotations. Patterns
DESCRIPTION can be entered by the user or can be taken from over 3000
DESCRIPTION known patterns provided with the program. These protein and
DESCRIPTION nucleic acid patterns include those found in PROSITE, patterns
DESCRIPTION in the Transcription Factors Database (TFD), and patterns
DESCRIPTION entered from the literature by IntelliGenetics.
AUTHOR -
RT -
ADDRESS IntelliGenetics, Inc.
ADDRESS 700 El Camino Real East, Suite 300
ADDRESS Mountain View, CA 94040-2216, USA
ADDRESS Tel: +1-415-9627300, Fax: +1-415-9627302
ADDRESS IntelliGenetics Belgium
ADDRESS Amocolaan 2
ADDRESS B 2440 Geel, Belgium
ADDRESS Tel: +32-3-2195352, Fax: +32-3-2195354
CONTACT -
SITE -
SITE-CONTACT -
OS Sun with SunOS Version 4.0 or greater, VAX with VMS 5.4 or greater
LANGUAGE -
VOLUME -
REQUIRES IntelliGenetics Suite package
COMMENTS Commercial software
AC BC00364
NAME SIGNPT
DOMAIN Pattern Identification
DESCRIPTION SIGNPT is a program which is integrated in a sequence
DESCRIPTION analysis package (written in FORTRAN). It searches for
DESCRIPTION the PROSITE patterns in any protein sequence (personal or from
DESCRIPTION PIR, GENPRO, SWISS-PROT databanks). Probability of occurrence
DESCRIPTION is evaluated from the mean abundance of amino acids in the
DESCRIPTION database. A given level of mismatch is allowed.
AUTHOR Philippe Dessen
RA Dessen P., Fondrat C., Valencien C., Mugnier C.;
RT "BISANCE: A French service for access to biomolecular sequence
RT databases.";
RL Comput. Appl. Biosci. 6:355-356(1990).
RX Medline; 91077760.
RX SeqAnalRef; DESP9001.
ADDRESS Service de Bioinformatique CNRS-INSERM,
ADDRESS 7 rue Guy Moquet, BP8
ADDRESS 94801 Villejuif Cedex, France
CONTACT dessen@genome.vjf.inserm.fr
SITE -
SITE-CONTACT -
OS Vax/VMS
LANGUAGE FORTRAN
VOLUME -
REQUIRES -
AC BC00367
NAME PIP
DOMAIN Pattern Identification
DESCRIPTION PIP, SPLITP1, SPLITP2, SPLITP3 :
DESCRIPTION PIP can screen protein sequences against the whole PROSITE
DESCRIPTION database or against a specific pattern entry in PROSITE.
DESCRIPTION SPLITP1: splits the file PROSITE.DAT so that there is a
DESCRIPTION separate file for each entry. Each file is automatically named
DESCRIPTION "psentry-number.dat" (example PS00197.DAT). In addition an
DESCRIPTION index file is created.
DESCRIPTION SPLITP2: splits the file PROSITE.DOC so that there is a
DESCRIPTION separate file for each document entry.
Each file is
DESCRIPTION automatically named "pdocentry-number.doc" (example PDOC00128.
DESCRIPTION DOC).
DESCRIPTION SPLITP3: interprets the PROSITE.DAT file and rewrites it in a
DESCRIPTION notation specific for the PIP program. This utility produces
DESCRIPTION various files.
AUTHOR Rodger Staden
RA Staden R.;
RT "Screening protein and nucleic acid sequences
RT against libraries of patterns.";
RL DNA Seq. 1:369-374(1991).
RX Medline; 92119325.
RX SeqAnalRef; STAR9101.
ADDRESS MRC Laboratory of Molecular Biology, Hills Road,
ADDRESS Cambridge, CB2 2QH, United Kingdom
ADDRESS Tel: +44-1223-248011
CONTACT rs@mrc-lmb.cam.ac.uk
SITE -
SITE-CONTACT -
OS VAX with VMS or Ultrix, DEC Alpha (OSF 1), Unix
LANGUAGE -
VOLUME -
REQUIRES -
COMMENTS Availability: Send an email message to the author.
AC BC00368
NAME PRO-EXPLORE
DOMAIN Molecular modelling and graphics
DOMAIN Alignment editing and display
DOMAIN Pattern Identification
DESCRIPTION PRO-EXPLORE is a graphical protein modelling package. It
DESCRIPTION provides tools for multi-sequence editing and alignment,
DESCRIPTION sequence analysis, database investigation, and 3D modelling.
AUTHOR -
RT -
ADDRESS OXFORD MOLECULAR, THE OXFORD SCIENCE PARK, OXFORD, OX4 4GA UK
CONTACT Paul DAVIE, tel: +44 1865 784600
SITE -
SITE-CONTACT -
OS All Silicon Graphics (IRIX 4.0)
LANGUAGE -
VOLUME -
REQUIRES PRO-EXPLORE package
COMMENTS Silicon Graphics : gl_s run time library
COMMENTS 24 bit planes for colour or RGB dithering
COMMENTS 24 bit planes for Z-buffering
COMMENTS ESV Graphics : PEX 2.0
COMMENTS Commercial software
AC BC00387
NAME PRATT
DOMAIN Protein sequence analysis
DOMAIN Pattern Identification
DESCRIPTION Pratt is a program that allows the user to efficiently
DESCRIPTION search for patterns conserved in a set of protein sequences.
DESCRIPTION It allows the user to define the class of patterns to be
DESCRIPTION searched for, and is then guaranteed to find all conserved
DESCRIPTION patterns in this class.
The time used by the program
DESCRIPTION depends on the set of sequences, the class of patterns
DESCRIPTION defined, and the minimum number of sequences a pattern
DESCRIPTION is to match. Pratt can discover conserved patterns of the
DESCRIPTION PROSITE type, including patterns with flexible wildcard
DESCRIPTION regions and ambiguous positions. It can for example
DESCRIPTION discover the pattern C-x(2,4)-C-x(3)-[ILVFYC]-x(8)-H-x(3,5)-H
DESCRIPTION conserved in 240 unaligned zinc finger protein sequences
DESCRIPTION (ZINC_FINGER_C2H2 in PROSITE).
DESCRIPTION Version 2.1 announced February 1997 includes new functionality:
DESCRIPTION patterns may be restricted to those matching a given sequence
DESCRIPTION alignment, or a special query sequence. Heuristics and branch-
DESCRIPTION and-bound have also been implemented making the program faster,
DESCRIPTION especially for analysis of relatively similar sequences. New
DESCRIPTION methods for scoring identified patterns have also been incorporated.
DESCRIPTION Version 2.1 has not been tested on VMS.
AUTHOR Inge Jonassen
RA Jonassen I., Collins J.F., Higgins D.G.;
RT "Finding flexible patterns in unaligned protein sequences.";
RL Protein Science (1995) 4:1587-159;
RA Jonassen I.;
RT "Efficient discovery of conserved patterns using a pattern graph.";
RL subm. to CABIOS February 1997
ADDRESS Inge Jonassen,
ADDRESS Dept.
of Informatics,
ADDRESS University of Bergen,
ADDRESS HIB
ADDRESS N5020 BERGEN
ADDRESS NORWAY
CONTACT Inge.Jonassen@ii.uib.no
SITE ftp anonymous ftp.ebi.ac.uk
SITE Directory /pub/software/unix
SITE-CONTACT higgins@ebi.ac.uk
SITE ftp anonymous ftp.ii.uib.no
SITE Directory /pub/bio/Pratt
SITE-CONTACT Inge.Jonassen@ii.uib.no
OS UNIX, VMS, LINUX
LANGUAGE ANSI C
VOLUME 1 MByte diskspace
REQUIRES at least 20Mbyte memory
AC BC00391
NAME BCM Search Launcher
DOMAIN WWW server
DOMAIN Sequence format conversion tools
DOMAIN Sequence analysis
DOMAIN Protein sequence analysis
DOMAIN Protein structure analysis
DOMAIN Structure prediction
DOMAIN Pattern Identification
DOMAIN Alignment Search software
DOMAIN Alignment editing and display
DOMAIN Alignment browser
DOMAIN Restriction maps
DOMAIN Database and analysis
DOMAIN Searching databases
DOMAIN Statistical significance
DOMAIN Comparative analysis
SERVER http://www.hgsc.bcm.tmc.edu/SearchLauncher/
DESCRIPTION The BCM Search Launcher is an integrated set of World-
DESCRIPTION Wide Web (WWW) pages that organize molecular biology-
DESCRIPTION related search and analysis services available on the
DESCRIPTION WWW by function, and provide a single point-of-entry for
DESCRIPTION related searches. The Protein Sequence Search Page, for
DESCRIPTION example, provides a single sequence entry form for
DESCRIPTION submitting sequences to WWW servers that provide remote
DESCRIPTION access to a variety of different protein sequence search
DESCRIPTION tools, including BLAST, FASTA, Smith-Waterman, BEAUTY,
DESCRIPTION PROSITE, and BLOCKS searches. Other Launch pages provide
DESCRIPTION access to 1) nucleic acid sequence searches, 2) multiple
DESCRIPTION and pairwise sequence alignments, 3) gene feature searches,
DESCRIPTION 4) protein secondary structure prediction, and 5)
DESCRIPTION miscellaneous sequence utilities (e.g., 6-frame
DESCRIPTION translation).
The BCM Search Launcher also provides a
DESCRIPTION mechanism to extend the utility of other WWW services by
DESCRIPTION adding supplementary hypertext links to results returned
DESCRIPTION by remote servers. For example, links to the NCBI's
DESCRIPTION Entrez database and to the Sequence Retrieval System
DESCRIPTION (SRS) are added to search results returned by the NCBI's
DESCRIPTION WWW BLAST server. These links provide easy access to
DESCRIPTION auxiliary information, such as Medline abstracts, that
DESCRIPTION can be extremely helpful when analyzing BLAST database
DESCRIPTION hits. For new or infrequent users of sequence database
DESCRIPTION search tools, we have pre-set the default search parameters
DESCRIPTION to provide the most informative first-pass sequence
DESCRIPTION analysis possible. We have also developed a batch client
DESCRIPTION interface for Unix and Macintosh computers that allows
DESCRIPTION multiple input sequences to be automatically searched as
DESCRIPTION a background task, with the results returned as individual
DESCRIPTION HTML documents directly to the user's system.
AUTHOR Randall F. Smith, Brent A. Wiese, Mary K. Wojzynski,
AUTHOR Daniel B. Davison, and Kim C. Worley
RA Smith R.F., Wiese B.A., Wojzynski M.K., Davison D.B., Worley K.C.;
RT "BCM Search Launcher--An integrated interface to molecular
RT biology database search and analysis services available on
RT the World-Wide Web.";
RL Submitted.
ADDRESS Randall F. Smith, Ph.D.
ADDRESS Department of Molecular and Human Genetics, T-921, ADDRESS Baylor College of Medicine ADDRESS One Baylor Plaza, Houston, TX 77030, USA ADDRESS Tel: +1 (713) 798-4735 Fax: +1 (713) 798-5386 CONTACT rsmith@bcm.tmc.edu CONTACT kworley@bcm.tmc.edu SITE WWW Server at URL http://www.hgsc.bcm.tmc.edu/SearchLauncher/ SITE-CONTACT rsmith@bcm.tmc.edu SITE WWW Server at URL http://dot.imgen.bcm.tmc.edu:9331/downloads/software/batch_client.html SITE gives information about the downloads SITE ftp anonymous dot.imgen.bcm.tmc.edu for a batch client SITE Directory /pub/software/search-launcher OS Unix and Macintosh for the batch client LANGUAGE Perl (on Unix), AppleScript (on Macintosh) VOLUME - REQUIRES MacTCP on Macintosh and AppleScript, Perl on Unix AC BC00445 NAME ANTHEPROT Web page DOMAIN WWW server DOMAIN Protein sequence analysis DOMAIN Structure prediction DOMAIN Pattern Identification DOMAIN Alignment browser SERVER http://www.ibcp.fr/predict.html DESCRIPTION ANTHEPROT is a server that provides: DESCRIPTION * Secondary structure predictions: GOR, DPM, HOMOLOGUE, SOPMA, DESCRIPTION Phd DESCRIPTION * Multiple alignment: Clustalw (Higgins), Multalin (Corpet) DESCRIPTION * Search for biological sites using PROSITE DESCRIPTION * Search for homologous proteins: Program FASTA (W. Pearson) AUTHOR G. Deleage, C. Geourjon, C. Blanchet RA Geourjon C., Deleage G.; RT "SOPM: a self-optimised method for protein secondary RT structure prediction."; RL Protein Eng. 7:157-164(1994). RX Medline; 94224747. RX SeqAnalRef; GEOC9302. RA Geourjon C., Deleage G.; RT "SOPMA: significant improvements in protein secondary RT structure prediction by consensus prediction from RT multiple alignments."; RL Comput. Appl. Biosci. 11(6):681-684(1995).
ADDRESS Institut de Biologie et Chimie des Proteines ADDRESS IBCP-CNRS UPR 412, Groupe de modelisation et RMN ADDRESS 7, Passage du Vercors ADDRESS 69 367 Lyon cedex 07, FRANCE CONTACT deleage@ibcp.fr, geourjon@ibcp.fr SITE WWW server at URL http://www.ibcp.fr/predict.html SITE-CONTACT geourjon@ibcp.fr or deleage@ibcp.fr OS - LANGUAGE - VOLUME - REQUIRES - AC BC00461 NAME DNA Stacks DOMAIN Sequence format conversion tools DOMAIN Sequence display DOMAIN Sequence editor DOMAIN Sequence analysis DOMAIN Sequence tools DOMAIN Protein sequence analysis DOMAIN Pattern Identification DOMAIN Alignment editing and display DOMAIN Alignment browser DOMAIN Genome Mapping Databases DOMAIN Phylogeny DESCRIPTION DNA Stacks (v. 1.1) is a HyperCard 2.x stack package DESCRIPTION for Macintosh computers featuring utilities for editing or DESCRIPTION coloring multiple DNA or protein sequence alignments, DESCRIPTION performing numerous data conversions or analyses related to DESCRIPTION molecular systematics, displaying auto-rescalable gene maps DESCRIPTION of mitochondrial or chloroplast genomes, extracting DNA DESCRIPTION gene sequences or translated protein sequences from about DESCRIPTION 26 animal mitochondrial genomes, and graphically depicting DESCRIPTION codon usage patterns. AUTHOR Eernisse, Douglas J. RA Eernisse, D. J.; RT "DNA Translator and Aligner: HyperCard utilities to aid RT phylogenetic analysis of molecules."; RL Comput. Appl. Biosci. 8:177-184(1992). RX Medline; 92274217. RX SeqAnalRef; EERD9201. ADDRESS D. J. Eernisse ADDRESS Dept. Biol. Sci. 
MH282 ADDRESS California State University ADDRESS Fullerton, CA 92634, USA CONTACT deernisse@fullerton.edu SITE ftp anonymous ftp.bio.indiana.edu SITE Directory /molbio/mac/ SITE-CONTACT archive@bio.indiana.edu SITE gopher Host: gopher://ftp.bio.indiana.edu Port: 70 SITE URL gopher://ftp.bio.indiana.edu:70/11/IUBio-Software+Data/molbio/mac SITE WWW Server at URL http://biology.fullerton.edu/people/faculty/doug-eernisse/ OS MacOS LANGUAGE HyperTalk, C VOLUME 2 MB REQUIRES HyperCard 2.0 or greater, Macintosh AC BC00506 NAME BioMotif DOMAIN Pattern Identification DESCRIPTION Pattern searches in protein or nucleic acid sequences. DESCRIPTION A wide range of patterns can be defined, including DESCRIPTION hydrophobicity-like constraints, local homologies, DESCRIPTION and some other built-in or user-defined functions. DESCRIPTION Works with an individual sequence or a full database, DESCRIPTION in Pearson's FASTA format. AUTHOR Mennessier Gerard RT - ADDRESS Lab.Phys.Math. ADDRESS Case 50 ADDRESS Universite Montpellier II ADDRESS 34095 Montpellier Cedex 5, FRANCE CONTACT menes@lpm.univ-montp2.fr SITE ftp anonymous ftp.lpm.univ-montp2.fr SITE Directory /pub/BioMotif SITE-CONTACT - OS Unix LANGUAGE C VOLUME - REQUIRES - AC BC00529 NAME The USC Sequence Alignment Package DOMAIN Sequence analysis DOMAIN Sequence tools DOMAIN Protein sequence analysis DOMAIN Pattern Identification DOMAIN Alignment Search software DOMAIN Statistical significance DOMAIN WWW server SERVER http://www-hto.usc.edu/software/seqaln/ DESCRIPTION This library of functions aligns nucleotide and DESCRIPTION protein sequences finding global, local, overlapping, DESCRIPTION and fitting regions of alignment using the generalized DESCRIPTION Smith-Waterman linear gap function g(k) = DESCRIPTION a + b(k-1) with one or two DNA, RNA or one-letter coded protein sequences. Also finds DESCRIPTION statistical significance of local alignments using the Poisson clumping heuristic.
DESCRIPTION DESCRIPTION The package also includes GFSR, a good pseudorandom number generator, and standalone DESCRIPTION programs to perform global, local, fit and overlap alignments. AUTHOR Paul Hardy, Michael S. Waterman RT - ADDRESS Department of Mathematics ADDRESS University of Southern California ADDRESS 1042 W. 36th Place, DRB 155 ADDRESS Los Angeles, CA 90089-1113 CONTACT phardy@hto.usc.edu SITE WWW Server at URL http://www-hto.usc.edu/software/seqaln/ OS - LANGUAGE C VOLUME - REQUIRES - AC BC00557 NAME PESTFIND DOMAIN Protein sequence analysis DOMAIN Pattern Identification DESCRIPTION Finds PEST protein degradation signals in protein sequences. This version is DESCRIPTION a straight translation to ANSI C from the original BASICA DESCRIPTION version from Martin Rechsteiner's laboratory. AUTHOR David Mathog, Bob Stellwagen, Martin Rechsteiner RA - RT - RL J. Biol. Chem. 266(17):11213-11220(1991). RL Science 234:364-368(1986). ADDRESS - CONTACT mathog@seqaxp.bio.caltech.edu SITE WWW Server at URL http://seqaxp.bio.caltech.edu/pub/SOFTWARE/ OS any LANGUAGE ANSI C VOLUME - REQUIRES ANSI C compiler, UNZIP program AC BC00564 NAME SEALS: A System for Easy Analysis of Lots of Sequences DOMAIN Sequence format conversion tools DOMAIN Sequence tools DOMAIN Protein sequence analysis DOMAIN Pattern Identification DOMAIN Genome Analysis DOMAIN Phylogeny DOMAIN Database and analysis DOMAIN Searching databases DESCRIPTION SEALS (A System for Easy Analysis of Lots of Sequences) is a DESCRIPTION free, public domain package for large-scale sequence analysis DESCRIPTION investigations. It is still under development, but already DESCRIPTION provides a number of very useful programs, including unique DESCRIPTION taxonomy-aware tools. AUTHOR Walker DR, Koonin EV RA Walker DR, Koonin EV; RT "SEALS: A System for Easy Analysis of Lots of Sequences"; RL ISMB 5:333-339(1997).
ADDRESS D Roland Walker ADDRESS National Center for Biotechnology Information ADDRESS National Library of Medicine ADDRESS National Institutes of Health, Bldg. 38A ADDRESS Bethesda, MD 20894, USA ADDRESS Voice: +1 (301)435-5909 Fax: +1 (301)435-5909 ADDRESS email: walker@ncbi.nlm.nih.gov ADDRESS ADDRESS Eugene V. Koonin, PhD ADDRESS National Center for Biotechnology Information ADDRESS National Library of Medicine ADDRESS National Institutes of Health, Bldg. 38A ADDRESS Bethesda, MD 20894, USA ADDRESS Voice: +1(301)435-5913 Fax:+1 (301)480-9241 ADDRESS email: koonin@ncbi.nlm.nih.gov CONTACT walker@ncbi.nlm.nih.gov (Roland Walker) SITE WWW Server at URL http://www.ncbi.nlm.nih.gov/Walker/SEALS/index.html OS Unix LANGUAGE Perl VOLUME 150 MB REQUIRES a number of other programs and db detailed on the home page AC BC00570 NAME RASA 2.2 DOMAIN Sequence analysis DOMAIN Structure prediction DOMAIN Pattern Identification DOMAIN Phylogeny DOMAIN Statistical significance DOMAIN Comparative analysis DESCRIPTION RASA 2.2 provides a set of tools for measuring DESCRIPTION phylogenetic signal and for studying its distribution DESCRIPTION among taxa. DESCRIPTION DESCRIPTION This software is used for DESCRIPTION * measuring signal DESCRIPTION * testing the suitability of available outgroup taxa DESCRIPTION * detecting long branches in the taxon variance plot DESCRIPTION * performing the mixed-character taxon test for differential DESCRIPTION lineage sorting DESCRIPTION * performing bootstrap power and effect to examine the DESCRIPTION power curve for your data. DESCRIPTION DESCRIPTION Exploratory options include DESCRIPTION * determining the signal spectrum for a set of sequences DESCRIPTION * removing noisy sites DESCRIPTION DESCRIPTION Some researchers have used the software to examine DESCRIPTION hypotheses of alignment using phylogenetic signal as the DESCRIPTION criterion. 
DESCRIPTION DESCRIPTION This software can be used on discrete data (molecular, DESCRIPTION morphological, or mixed). AUTHOR James Lyons-Weiler, PhD RA Lyons-Weiler, J., Hoelzer, G.A. and Tausch R.J.; RT "Relative Apparent Synapomorphy Analysis (RASA) I: RT the statistical measurement of phylogenetic signal."; RL Molecular Biology and Evolution 13:749-757(1996). RA Lyons-Weiler, J. and Milinkovitch, M.C.; RT "A phylogenetic approach to the problem of differential RT lineage sorting."; RL Molecular Biology and Evolution 14:968-975(1997). RA Lyons-Weiler, J. and Hoelzer, G.A.; RT "Escaping from the Felsenstein Zone by detecting long RT branches in phylogenetic data."; RL Molecular Phylogenetics and Evolution 8:375-384(1997). RA Lyons-Weiler, J. and Hoelzer, G.A.; RT "Optimal outgroup analysis."; RL Biological Journal of the Linnean Society 64:493(1998). RA Milinkovitch, M.C. and Lyons-Weiler, J.; RT "Finding optimal outgroup topologies and convexities RT when the choice of outgroups is not obvious."; RL Molecular Phylogenetics and Evolution 9:348-357(1998). ADDRESS - CONTACT jfl8@psu.edu SITE WWW Server at URL http://test1.bio.psu.edu/LW/list.htm OS Mac/PowerPC REQUIRES All superfluous extensions off. COMMENTS The software will be updated and bugs fixed at COMMENTS irregular intervals. Watch for updates. AC BC00582 NAME TargetFinder DOMAIN Searching databases DOMAIN Pattern Identification DOMAIN Genome Analysis DOMAIN WWW server SERVER http://hercules.tigem.it/TargetFinder.html DESCRIPTION Search for target genes of transcription factors DESCRIPTION TargetFinder is a new tool to facilitate database searches DESCRIPTION for candidate target genes of transcription factors.
The DESCRIPTION use of this program makes it possible to search a database of annotated DESCRIPTION sequences for binding sites located in context with other DESCRIPTION important transcription regulatory signals and regions, DESCRIPTION like the TATA element, the transcription start site, the DESCRIPTION promoter and so on, thereby greatly reducing the background DESCRIPTION usually associated with this kind of search. AUTHOR Giovanni Lavorgna RA Lavorgna et al. RL Trends in Genetics 14:375-376(1998). RA Lavorgna et al, 1998. RL Bioinformatics, in press ADDRESS DIBIT-HSR ADDRESS Via Olgettina, 58 ADDRESS 20132 Milano Italy CONTACT giovanni.lavorgna@hsr.it SITE WWW Server at URL http://hercules.tigem.it/TargetFinder.html OS - LANGUAGE - VOLUME - REQUIRES - AC BC00583 NAME CloneIt Online DOMAIN Sequence tools DOMAIN Pattern Identification DOMAIN Restriction maps DOMAIN WWW server SERVER http://topaze.jouy.inra.fr DESCRIPTION Molecular biologists often have to sub-clone plasmidic DESCRIPTION vectors: a DNA plasmid is cut and ligated with an exogenous DNA DESCRIPTION fragment previously excised from another plasmid. The DESCRIPTION necessary cuts are achieved by restriction enzymes which DESCRIPTION then must be carefully chosen in order to minimize the DESCRIPTION steps required to obtain the desired molecule. During the DESCRIPTION selection of those enzymes, the main difficulties DESCRIPTION encountered come from the knowledge of: DESCRIPTION the enzymes' characteristics DESCRIPTION the localization of the cuts within the sequence DESCRIPTION the complementarity between the protruding ends DESCRIPTION the possible self ligation of the vector DESCRIPTION the use of modifying DNA polymerases generating blunt ends DESCRIPTION the constraint to clone the insert in-frame with a vector sequence DESCRIPTION the use of partial digestions DESCRIPTION the creation of a stop codon after the ligation.
DESCRIPTION We developed the CloneIt program that quickly finds DESCRIPTION in-frame deletions using restriction enzymes and DESCRIPTION frameshifts (using digestion, fill-in and ligation) in a DESCRIPTION plasmid sequence. Then, as the main functions and DESCRIPTION procedures were being developed, we have extended the DESCRIPTION capacities of the program to find strategies to sub-clone a DESCRIPTION fragment from a plasmid to another vector while still DESCRIPTION controlling the problems described above. This program is DESCRIPTION not an expert system, as it does not "learn" the logical DESCRIPTION steps accomplished by the biologist and it does not have to DESCRIPTION be accompanied in its search: it just runs an algorithm DESCRIPTION that explores all the possible enzyme combinations that DESCRIPTION could be used to clone the molecules. DESCRIPTION This program provides a useful aid for any molecular DESCRIPTION biologist who wants to quickly find sub-cloning, in-frame DESCRIPTION deletion, and frameshift strategies, which would otherwise be DESCRIPTION difficult to discover. AUTHOR Pierre LINDENBAUM, Christophe CARON RA Lindenbaum P.; RT "CloneIt: finding cloning strategies, in-frame deletions and frameshifts."; RL Bioinformatics 14(5):465-466(1998). ADDRESS Pierre LINDENBAUM. ADDRESS Laboratoire de Biologie Moleculaire des Rotavirus. ADDRESS Virologie et Immunologie Moleculaires. ADDRESS Institut National de la Recherche Agronomique. ADDRESS 78350 Jouy-en-Josas Cedex FRANCE. ADDRESS lindenb@biotec.jouy.inra.fr CONTACT lindenb@biotec.jouy.inra.fr SITE WWW Server at URL http://topaze.jouy.inra.fr/cgi-bin/CloneIt/CloneIt OS - LANGUAGE - VOLUME - REQUIRES - COMMENTS Thanks to Miss DERAT for her help.
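The in-frame deletion search mentioned in the CloneIt entry can be illustrated with a small sketch. This is not CloneIt's algorithm, which the entry does not describe in detail; it is a simplified illustration under stated assumptions: only single-enzyme deletions are considered (cut at two sites of the same enzyme and religate the compatible ends), the sequence is treated as linear, and the enzyme table holds just two well-known recognition sites (EcoRI and BamHI). The function names `find_sites` and `in_frame_deletions` are hypothetical.

```python
import re

# Recognition sites of two well-known enzymes. A real tool would load a
# complete enzyme table; this two-entry table is an illustrative assumption.
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}


def find_sites(seq, site):
    """Return 0-based start positions of every occurrence of `site` in `seq`
    (a lookahead is used so that overlapping occurrences are also found)."""
    return [m.start() for m in re.finditer("(?=" + site + ")", seq)]


def in_frame_deletions(seq, enzymes=ENZYMES):
    """List (enzyme, first_site, second_site, deleted_length) tuples.

    Cutting at two sites of the same enzyme and religating removes exactly
    the distance between the two site positions, because the cut offset
    within the site is identical at both ends. The deletion keeps the
    reading frame when that distance is a multiple of 3.
    """
    seq = seq.upper()
    results = []
    for name, site in enzymes.items():
        positions = find_sites(seq, site)
        for i, first in enumerate(positions):
            for second in positions[i + 1:]:
                deleted = second - first
                if deleted % 3 == 0:
                    results.append((name, first, second, deleted))
    return results


if __name__ == "__main__":
    # Two EcoRI sites 9 bp apart (in frame), two BamHI sites 10 bp apart (not).
    plasmid = "GAATTCAAAGAATTC" + "GGATCCAAAAGGATCC"
    for hit in in_frame_deletions(plasmid):
        print(hit)
```

Deletions that combine two different enzymes, fill-in reactions, and circular plasmids would each need extra handling (end compatibility, blunt-end arithmetic, wrap-around positions) that this sketch leaves out.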