PATENT GRANT DATE April 2, 2002
PATENT TITLE Prostate specific secreted protein

PATENT ABSTRACT The present invention relates to a novel human protein called Prostate Specific Secreted Protein, and isolated polynucleotides encoding this protein. Also provided are vectors, host cells, antibodies, and recombinant methods for producing this human protein. The invention further relates to diagnostic and therapeutic methods useful for diagnosing and treating disorders related to this novel human protein.

PATENT FILE DATE March 30, 1999
PATENT REFERENCES CITED Amino acid and Nucleic acid database, Sequence 1 of U.S. Patent 6,025,197, 1998.*
Amino acid database, Accession #U00964, 1994.*
Krizman, David B. et al. Construction of a Representative cDNA Library from Prostatic Intraepithelial Neoplasia, Cancer Research 56:5380-5383, 1996.*
Amino acid database, Accession #AA716150, 1997.*
Amino acid database, Accession #AA722847, 1998.*
Amino acid database, Accession #AA559906, 1997.*
Alberts, Bruce et al. Molecular Biology of the Cell, second edition. pp. 95-98, 1989.*
Biotechnology Industry Organization, Critical Synergy: The Biotechnology Industry and Intellectual Property Protection, pp. 101, 103 and 104, 1994.*
Lin, M. et al. Structure-Function Relationships in Glucagon: Properties of Highly Purified Des-His (1), Monoiodo-, and [Des-Asn (28), Thr (29)] (homoserine lactone 27)-glucagon, Biochemistry 14(8):1559-1563, 1975.*
Schwartz G. et al. A superactive insulin : [B10-Aspartic acid] insulin (human), Proc. Natl. Acad. Sci. USA 84:6408-6411, 1987.*
Ricke, D.O., et al., GenBank Accession No. AC005361 Aug. 1, 1998.
NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap, Contact: Robert Strausberg, Ph.D., GenBank Accession No. AI076929 Aug. 27, 1998.
NCI-CGAP http://www.ncbi.nlm.nih.gov/ncicgap Contact: Robert Strausberg, Ph.D., GenBank Accession No. AI027322 Aug. 28, 1998.
Genbank Accession No. AA722847 (Jan. 2, 1998).
Genbank Accession No. AA716150 (Dec. 29, 1997).
Genbank Accession No. AA577045 (Sep. 3, 1997).
Genbank Accession No. AA559906 (Aug. 18, 1997).
Genbank Accession No. AA533247 (Aug. 18, 1997).
Genbank Accession No. AA367486 (Apr. 21, 1997).
Genbank Accession No. AA340605 (Apr. 21, 1997).
PATENT CLAIMS What is claimed is:

1. An isolated polynucleotide comprising a nucleic acid encoding an amino acid sequence selected from the group consisting of:

(a) amino acids +2 to +178 of SEQ ID NO:2;

(b) amino acids +1 to +178 of SEQ ID NO:2;

(c) the amino acid sequence of the full length polypeptide encoded by the cDNA clone contained in ATCC Deposit No. 209664; and

(d) the amino acid sequence of the polypeptide encoded by the cDNA clone contained in ATCC Deposit No. 209664, except wherein the N-terminal methionine codon of said clone is deleted.

2. An isolated polynucleotide complementary to the polynucleotide of claim 1.

3. The isolated polynucleotide of claim 1 further comprising a heterologous polynucleotide.

4. A vector comprising the polynucleotide of claim 1.

5. A method of producing a vector comprising inserting the isolated polynucleotide of claim 1 into a vector.

6. A host cell comprising the vector of claim 4.

7. A host cell comprising the isolated polynucleotide of claim 1 operably associated with a heterologous regulatory sequence.

8. A method of producing a polypeptide comprising:

(a) culturing the host cell of claim 7 under conditions such that the polypeptide is expressed; and

(b) recovering said polypeptide.

9. A composition comprising the isolated polynucleotide of claim 1.
--------------------------------------------------------------------------------

PATENT DESCRIPTION

FIELD OF THE INVENTION

The present invention relates to a novel human gene encoding a polypeptide expressed in prostate tissue. More specifically, the present invention relates to a polynucleotide encoding a novel human polypeptide named Prostate Specific Secreted Protein, or "PSSP." This invention also relates to PSSP polypeptides, as well as vectors, host cells, antibodies directed to PSSP polypeptides, and the recombinant methods for producing the same. Also provided are diagnostic methods for detecting disorders related to the urogenital system, and therapeutic methods for treating such disorders. The invention further relates to screening methods for identifying agonists and antagonists of PSSP activity.

BACKGROUND OF THE INVENTION

The mechanisms involved in the development and maintenance of prostatic tissue are poorly understood. Although it has been recognized for years that normal development and continued expression in adults of the male secondary sexual phenotype is androgen-dependent, there is relatively little known about the genes on which androgens act or the downstream pathways that lead to development of differentiated tissue. As with prostate development, the fundamental mechanisms underlying prostate cancer also remain obscure. However, androgen regulation and the loss thereof play a critical role. In both developing and mature prostate, the maintenance of prostate-specific cellular functions requires continuous stimulation by androgens; in prostate cancer tissue, the reciprocal loss of this cellular differentiation, which occurs during progression of the disease, is largely concomitant with a loss of androgen responsiveness by prostatic cells. Identifying the genes involved in either of these largely opposing processes will likely lead to a greater understanding of the fundamental mechanisms involved in both.

Thus, there is a need for polypeptides involved in the development of the prostate and other urogenital tissues, since disturbances of such regulation may be involved in disorders relating to the urogenital system. Therefore, there is a need for identification and characterization of such human polypeptides which can play a role in detecting, preventing, ameliorating or correcting such disorders.

SUMMARY OF THE INVENTION

The present invention relates to a novel polynucleotide and the encoded polypeptide of PSSP. Moreover, the present invention relates to vectors, host cells, antibodies, and recombinant methods for producing the polypeptides and polynucleotides. Also provided are diagnostic methods for detecting disorders related to the polypeptides, and therapeutic methods for treating such disorders. The invention further relates to screening methods for identifying binding partners of PSSP.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows the nucleotide sequence (SEQ ID NO:1) and the deduced amino acid sequence (SEQ ID NO:2) of PSSP. The predicted leader sequence located at about amino acids 1-22 is underlined.

FIG. 2 shows the regions of identity between the amino acid sequence of the PSSP protein and the translation product of the rat Common Salivary Protein 1 (U00964) (SEQ ID NO:3), determined by BLAST analysis. Identical amino acids between the two polypeptides are shaded, while conservative amino acids are boxed. By examining the regions of amino acids shaded and/or boxed, the skilled artisan can readily identify conserved domains between the two polypeptides. These conserved domains are preferred embodiments of the present invention.

FIG. 3 shows an analysis of the PSSP amino acid sequence. Alpha, beta, turn and coil regions; hydrophilicity and hydrophobicity; amphipathic regions; flexible regions; antigenic index and surface probability are shown, and all were generated using the default settings. In the "Antigenic Index or Jameson-Wolf" graph, the positive peaks indicate locations of the highly antigenic regions of the PSSP protein, i.e., regions from which epitope-bearing peptides of the invention can be obtained. The domains defined by these graphs are contemplated by the present invention. Tabular representation of the data summarized graphically in FIG. 3 can be found in Table 1.

DETAILED DESCRIPTION

Definitions

The following definitions are provided to facilitate understanding of certain terms used throughout this specification.

In the present invention, "isolated" refers to material removed from its original environment (e.g., the natural environment if it is naturally occurring), and thus is altered "by the hand of man" from its natural state. For example, an isolated polynucleotide could be part of a vector or a composition of matter, or could be contained within a cell, and still be "isolated" because that vector, composition of matter, or particular cell is not the original environment of the polynucleotide.

In the present invention, a "secreted" PSSP protein refers to a protein capable of being directed to the ER, secretory vesicles, or the extracellular space as a result of a signal sequence, as well as a PSSP protein released into the extracellular space without necessarily containing a signal sequence. If the PSSP secreted protein is released into the extracellular space, the PSSP secreted protein can undergo extracellular processing to produce a "mature" PSSP protein. Release into the extracellular space can occur by many mechanisms, including exocytosis and proteolytic cleavage.

As used herein, a PSSP "polynucleotide" refers to a molecule having a nucleic acid sequence contained in SEQ ID NO:1 or the cDNA contained within the clone deposited with the ATCC. For example, the PSSP polynucleotide can contain the nucleotide sequence of the full length cDNA sequence, including the 5' and 3' untranslated sequences, the coding region, with or without the signal sequence, the secreted protein coding region, as well as fragments, epitopes, domains, and variants of the nucleic acid sequence. Moreover, as used herein, a PSSP "polypeptide" refers to a molecule having the translated amino acid sequence generated from the polynucleotide as broadly defined.

In specific embodiments, the polynucleotides of the invention are less than 300 kb, 200 kb, 100 kb, 50 kb, 15 kb, 10 kb, or 7.5 kb in length. In a further embodiment, polynucleotides of the invention comprise at least 15 contiguous nucleotides of PSSP coding sequence, but do not comprise all or a portion of any PSSP intron. In another embodiment, the nucleic acid comprising PSSP coding sequence does not contain coding sequences of a genomic flanking gene (i.e., 5' or 3' to the PSSP gene in the genome).

In the present invention, the full length PSSP sequence identified as SEQ ID NO:1 was generated by overlapping sequences of the deposited clone (contig analysis). A representative clone containing all or most of the sequence for SEQ ID NO:1 was deposited with the American Type Culture Collection ("ATCC") on Mar. 6, 1998, and was given the ATCC Deposit Number 209664. The ATCC is located at 10801 University Boulevard, Manassas, Va. 20110-2209, USA. The ATCC deposit was made pursuant to the terms of the Budapest Treaty on the international recognition of the deposit of microorganisms for purposes of patent procedure.

A PSSP "polynucleotide" also includes those polynucleotides capable of hybridizing, under stringent hybridization conditions, to sequences contained in SEQ ID NO:1, the complement thereof, or the cDNA within the deposited clone. "Stringent hybridization conditions" refers to an overnight incubation at 42° C. in a solution comprising 50% formamide, 5×SSC (750 mM NaCl, 75 mM sodium citrate), 50 mM sodium phosphate (pH 7.6), 5× Denhardt's solution, 10% dextran sulfate, and 20 µg/ml denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1×SSC at about 65° C.

Also contemplated are nucleic acid molecules that hybridize to the PSSP polynucleotides at moderately high stringency hybridization conditions. Changes in the stringency of hybridization and signal detection are primarily accomplished through the manipulation of formamide concentration (lower percentages of formamide result in lowered stringency), salt conditions, or temperature. For example, moderately high stringency conditions include an overnight incubation at 37° C. in a solution comprising 6×SSPE (20×SSPE = 3 M NaCl; 0.2 M NaH2PO4; 0.02 M EDTA, pH 7.4), 0.5% SDS, 30% formamide, 100 µg/ml salmon sperm blocking DNA; followed by washes at 50° C. with 1×SSPE, 0.1% SDS. In addition, to achieve even lower stringency, washes performed following stringent hybridization can be done at higher salt concentrations (e.g. 5×SSC).
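The buffer strengths quoted above follow from simple linear dilution of the stated stocks. The following is a minimal sketch of that arithmetic, assuming only the stock compositions given in the text; the function and variable names are hypothetical and introduced for illustration.

```python
# Stock compositions quoted in the text, expressed in mol/L.
SSC_5X = {"NaCl": 0.750, "sodium citrate": 0.075}
SSPE_20X = {"NaCl": 3.0, "NaH2PO4": 0.2, "EDTA": 0.02}


def dilute(stock, stock_fold, target_fold):
    """Scale every component linearly from a stock_fold-x to a target_fold-x buffer."""
    factor = target_fold / stock_fold
    return {name: conc * factor for name, conc in stock.items()}


stringent_wash = dilute(SSC_5X, 5, 0.1)   # 0.1x SSC wash (stringent conditions)
moderate_hyb = dilute(SSPE_20X, 20, 6)    # 6x SSPE hybridization buffer
moderate_wash = dilute(SSPE_20X, 20, 1)   # 1x SSPE wash

for label, mix in [
    ("0.1x SSC", stringent_wash),
    ("6x SSPE", moderate_hyb),
    ("1x SSPE", moderate_wash),
]:
    # Report each component in millimolar for easy comparison.
    print(label, {name: f"{conc * 1000:.1f} mM" for name, conc in mix.items()})
```

Lower salt in the wash (for example 15 mM NaCl in 0.1×SSC versus 150 mM NaCl in 1×SSPE) corresponds to higher stringency, which is consistent with the direction described in the text.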

Note that variations in the above conditions may be accomplished through the inclusion and/or substitution of alternate blocking reagents used to suppress background in hybridization experiments. Typical blocking reagents include Denhardt's reagent, BLOTTO, heparin, denatured salmon sperm DNA, and commercially available proprietary formulations. The inclusion of specific blocking reagents may require modification of the hybridization conditions described above, due to problems with compatibility.

Of course, a polynucleotide which hybridizes only to polyA+ sequences (such as any 3' terminal polyA+ tract of a cDNA shown in the sequence listing), or to a complementary stretch of T (or U) residues, would not be included in the definition of "polynucleotide," since such a polynucleotide would hybridize to any nucleic acid molecule containing a poly (A) stretch or the complement thereof (e.g., practically any double-stranded cDNA clone).

The PSSP polynucleotide can be composed of any polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or DNA or modified RNA or DNA. For example, PSSP polynucleotides can be composed of single- and double-stranded DNA, DNA that is a mixture of single- and double-stranded regions, single- and double-stranded RNA, and RNA that is a mixture of single- and double-stranded regions, hybrid molecules comprising DNA and RNA that may be single-stranded or, more typically, double-stranded or a mixture of single- and double-stranded regions. In addition, the PSSP polynucleotides can be composed of triple-stranded regions comprising RNA or DNA or both RNA and DNA. PSSP polynucleotides may also contain one or more modified bases or DNA or RNA backbones modified for stability or for other reasons. "Modified" bases include, for example, tritylated bases and unusual bases such as inosine. A variety of modifications can be made to DNA and RNA; thus, "polynucleotide" embraces chemically, enzymatically, or metabolically modified forms.

PSSP polypeptides can be composed of amino acids joined to each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres, and may contain amino acids other than the 20 gene-encoded amino acids. The PSSP polypeptides may be modified by either natural processes, such as posttranslational processing, or by chemical modification techniques which are well known in the art. Such modifications are well described in basic texts and in more detailed monographs, as well as in a voluminous research literature. Modifications can occur anywhere in the PSSP polypeptide, including the peptide backbone, the amino acid side-chains and the amino or carboxyl termini. It will be appreciated that the same type of modification may be present in the same or varying degrees at several sites in a given PSSP polypeptide. Also, a given PSSP polypeptide may contain many types of modifications. PSSP polypeptides may be branched, for example, as a result of ubiquitination, and they may be cyclic, with or without branching. Cyclic, branched, and branched cyclic PSSP polypeptides may result from posttranslational natural processes or may be made by synthetic methods. Modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent cross-links, formation of cysteine, formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, pegylation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination.
(See, for instance, PROTEINS--STRUCTURE AND MOLECULAR PROPERTIES, 2nd Ed., T. E. Creighton, W.H. Freeman and Company, New York (1993); POSTTRANSLATIONAL COVALENT MODIFICATION OF PROTEINS, B.C. Johnson, Ed., Academic Press, New York, pgs. 1-12 (1983); Seifter et al., Meth Enzymol 182:626-646 (1990); Rattan et al., Ann N.Y. Acad Sci 663:48-62 (1992).)

"SEQ ID NO:1" refers to a PSSP polynucleotide sequence while "SEQ ID NO:2" refers to a PSSP polypeptide sequence.

A PSSP polypeptide "having biological activity" refers to polypeptides exhibiting activity similar, but not necessarily identical to, an activity of a PSSP polypeptide, including mature forms, as measured in a particular biological assay, with or without dose dependency. In the case where dose dependency does exist, it need not be identical to that of the PSSP polypeptide, but rather substantially similar to the dose-dependence in a given activity as compared to the PSSP polypeptide (i.e., the candidate polypeptide will exhibit greater activity or not more than about 25-fold less and, preferably, not more than about tenfold less activity, and most preferably, not more than about three-fold less activity relative to the PSSP polypeptide.)

PSSP Polynucleotides and Polypeptides

Clone HPRCF77 was isolated from a human prostate cDNA library. This clone contains the entire coding region identified as SEQ ID NO:2. The deposited clone contains a cDNA having a total of 825 nucleotides, which encodes a predicted open reading frame of 178 amino acid residues. (See FIG. 1.) The open reading frame begins at an N-terminal methionine located at nucleotide position 146, and ends at a stop codon at nucleotide position 682. The predicted molecular weight of the PSSP protein should be about 19.6 kDa.
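The coordinates stated for the open reading frame can be cross-checked with a few lines of arithmetic. The sketch below assumes that position 682 is the last base of the stop codon and uses a rule-of-thumb average residue mass of about 110 Da; that average is an assumption introduced here for illustration and does not come from the specification.

```python
ORF_START = 146            # first base of the initiating ATG (from the text)
ORF_END = 682              # taken here as the last base of the stop codon
AVG_RESIDUE_MASS_DA = 110  # assumption: rule-of-thumb average residue mass

orf_bases = ORF_END - ORF_START + 1   # coordinates are inclusive
codons = orf_bases // 3
residues = codons - 1                 # the stop codon encodes no residue
approx_mass_kda = residues * AVG_RESIDUE_MASS_DA / 1000

print(f"ORF spans {orf_bases} bases = {codons} codons ({residues} residues)")
print(f"Rough mass estimate: {approx_mass_kda:.1f} kDa")
```

Under these assumptions the span from 146 to 682 is a whole number of codons and yields the 178 residues stated in the text, with a rough mass on the order of 20 kDa. An accurate mass would require summing the actual residue masses of SEQ ID NO:2.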

Subsequent Northern analysis also showed PSSP expression primarily in prostate, and to a lesser extent in salivary gland, stomach, and trachea tissues, a pattern consistent with urogenital specific expression.

Using BLAST analysis, SEQ ID NO:2 contains domains homologous to the translation product of the rat mRNA for Common Salivary Protein 1 (U00964) (FIG. 2) (SEQ ID NO:3), including the following conserved domains: (a) a predicted homologous domain located at about amino acids F96-S108; (b) a predicted homologous domain located at about amino acids Y126-Y141; and (c) a predicted homologous domain located at about amino acids E80-E88. These polypeptide fragments of PSSP are specifically contemplated in the present invention. Because Common Salivary Protein 1 (U00964) is thought to be important in the development of the rat salivary glands, the homology between Common Salivary Protein 1 (U00964) and PSSP suggests that PSSP may also be involved in the development of the prostate and other urogenital tissues.

Moreover, the encoded polypeptide has a predicted leader sequence located at about amino acids 1-22. (See FIG. 1.) Also shown in FIG. 1, the predicted secreted form of PSSP encompasses about amino acids 23-178. These polypeptide fragments of PSSP are specifically contemplated in the present invention.

The PSSP nucleotide sequence identified as SEQ ID NO:1 was assembled from partially homologous ("overlapping") sequences obtained from the deposited clone, and in some cases, from additional related DNA clones. The overlapping sequences were assembled into a single contiguous sequence of high redundancy (usually three to five overlapping sequences at each nucleotide position), resulting in a final sequence identified as SEQ ID NO:1.

Therefore, SEQ ID NO:1 and the translated SEQ ID NO:2 are sufficiently accurate and otherwise suitable for a variety of uses well known in the art and described further below. For instance, SEQ ID NO:1 is useful for designing nucleic acid hybridization probes that will detect nucleic acid sequences contained in SEQ ID NO:1 or the cDNA contained in the deposited clone. These probes will also hybridize to nucleic acid molecules in biological samples, thereby enabling a variety of forensic and diagnostic methods of the invention. Similarly, polypeptides identified from SEQ ID NO:2 may be used to generate antibodies which bind specifically to PSSP.

Nevertheless, DNA sequences generated by sequencing reactions can contain sequencing errors. The errors exist as misidentified nucleotides, or as insertions or deletions of nucleotides in the generated DNA sequence. The erroneously inserted or deleted nucleotides cause frame shifts in the reading frames of the predicted amino acid sequence. In these cases, the predicted amino acid sequence diverges from the actual amino acid sequence, even though the generated DNA sequence may be greater than 99.9% identical to the actual DNA sequence (for example, one base insertion or deletion in an open reading frame of over 1000 bases).
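The effect described above can be illustrated with a toy sequence: a single inserted base leaves the DNA more than 99.9% identical, yet changes the downstream reading frame. The following is a minimal sketch using the standard genetic code; the 1206-base sequence is a made-up repeat chosen for illustration and is not PSSP sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate complete codons in frame 1; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical reading frame of 1206 bases (toy data, not PSSP).
true_dna = "ATGGCTGAAACCTTGCGT" * 67
# Simulated sequencing error: one extra base inserted at position 600.
read_dna = true_dna[:600] + "A" + true_dna[600:]

# With an ideal alignment, the inserted base is the only non-matching column.
identity = len(true_dna) / len(read_dna)

true_protein = translate(true_dna)
read_protein = translate(read_dna)

print(f"DNA identity: {identity:.4%}")
print("Upstream residues agree:  ", true_protein[:200] == read_protein[:200])
print("Downstream residues agree:", true_protein[200:] == read_protein[200:])
print("True protein after the error site:", true_protein[200:206])
print("Read protein after the error site:", read_protein[200:206])
```

The residues encoded upstream of the insertion are unaffected, while those encoded downstream diverge, which is why the deposited clone is offered as the reference for applications that require an exact sequence.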

Accordingly, for those applications requiring precision in the nucleotide sequence or the amino acid sequence, the present invention provides not only the generated nucleotide sequence identified as SEQ ID NO:1 and the predicted translated amino acid sequence identified as SEQ ID NO:2, but also a sample of plasmid DNA containing a human cDNA of PSSP deposited with the ATCC. The nucleotide sequence of the deposited PSSP clone can readily be determined by sequencing the deposited clone in accordance with known methods. The predicted PSSP amino acid sequence can then be verified from such deposits. Moreover, the amino acid sequence of the protein encoded by the deposited clone can also be directly determined by peptide sequencing or by expressing the protein in a suitable host cell containing the deposited human PSSP cDNA, collecting the protein, and determining its sequence.

The present invention also relates to the PSSP gene corresponding to SEQ ID NO:1, SEQ ID NO:2, or the deposited clone. The PSSP gene can be isolated in accordance with known methods using the sequence information disclosed herein. Such methods include preparing probes or primers from the disclosed sequence and identifying or amplifying the PSSP gene from appropriate sources of genomic material.

Also provided in the present invention are species homologs of PSSP. Species homologs may be isolated and identified by making suitable probes or primers from the sequences provided herein and screening a suitable nucleic acid source for the desired homologue.

The PSSP polypeptides can be prepared in any suitable manner. Such polypeptides include isolated naturally occurring polypeptides, recombinantly produced polypeptides, synthetically produced polypeptides, or polypeptides produced by a combination of these methods. Means for preparing such polypeptides are well understood in the art.

The PSSP polypeptides may be in the form of the secreted protein, including the mature form, or may be a part of a larger protein, such as a fusion protein (see below). It is often advantageous to include an additional amino acid sequence which contains secretory or leader sequences, pro-sequences, sequences which aid in purification, such as multiple histidine residues, or an additional sequence for stability during recombinant production.

PSSP polypeptides are preferably provided in an isolated form, and preferably are substantially purified. A recombinantly produced version of a PSSP polypeptide, including the secreted polypeptide, can be substantially purified by the one-step method described in Smith and Johnson, Gene 67:31-40 (1988). PSSP polypeptides also can be purified from natural or recombinant sources using antibodies of the invention raised against the PSSP protein in methods which are well known in the art.

Polynucleotide and Polypeptide Variants

"Variant" refers to a polynucleotide or polypeptide differing from the PSSP polynucleotide or polypeptide, but retaining essential properties thereof. Generally, variants are overall closely similar, and, in many regions, identical to the PSSP polynucleotide or polypeptide.

By a polynucleotide having a nucleotide sequence at least, for example, 95% "identical" to a reference nucleotide sequence of the present invention, it is intended that the nucleotide sequence of the polynucleotide is identical to the reference sequence except that the polynucleotide sequence may include up to five point mutations per each 100 nucleotides of the reference nucleotide sequence encoding the PSSP polypeptide. In other words, to obtain a polynucleotide having a nucleotide sequence at least 95% identical to a reference nucleotide sequence, up to 5% of the nucleotides in the reference sequence may be deleted or substituted with another nucleotide, or a number of nucleotides up to 5% of the total nucleotides in the reference sequence may be inserted into the reference sequence. The query sequence may be an entire sequence shown in SEQ ID NO:1, the ORF (open reading frame), or any fragment specified as described herein.

As a practical matter, whether any particular nucleic acid molecule or polypeptide is at least 90%, 95%, 96%, 97%, 98% or 99% identical to a nucleotide sequence of the present invention can be determined conventionally using known computer programs. A preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, can be determined using the FASTDB computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci. (1990) 6:237-245.) In a sequence alignment the query and subject sequences are both DNA sequences. An RNA sequence can be compared by converting U's to T's. The result of said global sequence alignment is in percent identity. Preferred parameters used in a FASTDB alignment of DNA sequences to calculate percent identity are: Matrix=Unitary, k-tuple=4, Mismatch Penalty=1, Joining Penalty=30, Randomization Group Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size Penalty=0.05, Window Size=500 or the length of the subject nucleotide sequence, whichever is shorter.

If the subject sequence is shorter than the query sequence because of 5' or 3' deletions, not because of internal deletions, a manual correction must be made to the results. This is because the FASTDB program does not account for 5' and 3' truncations of the subject sequence when calculating percent identity. For subject sequences truncated at the 5' or 3' ends, relative to the query sequence, the percent identity is corrected by calculating the number of bases of the query sequence that are 5' and 3' of the subject sequence, which are not matched/aligned, as a percent of the total bases of the query sequence. Whether a nucleotide is matched/aligned is determined by results of the FASTDB sequence alignment. This percentage is then subtracted from the percent identity, calculated by the above FASTDB program using the specified parameters, to arrive at a final percent identity score. This corrected score is what is used for the purposes of the present invention. Only bases outside the 5' and 3' bases of the subject sequence, as displayed by the FASTDB alignment, which are not matched/aligned with the query sequence, are calculated for the purposes of manually adjusting the percent identity score.

For example, a 90 base subject sequence is aligned to a 100 base query sequence to determine percent identity. The deletions occur at the 5' end of the subject sequence and therefore, the FASTDB alignment does not show a matching/alignment of the first 10 bases at the 5' end. The 10 unpaired bases represent 10% of the sequence (number of bases at the 5' and 3' ends not matched/total number of bases in the query sequence) so 10% is subtracted from the percent identity score calculated by the FASTDB program. If the remaining 90 bases were perfectly matched the final percent identity would be 90%. In another example, a 90 base subject sequence is compared with a 100 base query sequence. This time the deletions are internal deletions so that there are no bases on the 5' or 3' of the subject sequence which are not matched/aligned with the query. In this case the percent identity calculated by FASTDB is not manually corrected. Once again, only bases 5' and 3' of the subject sequence which are not matched/aligned with the query sequence are manually corrected for. No other manual corrections are to be made for the purposes of the present invention.
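The manual correction described above can be sketched as a small function. This is a minimal illustration of the stated rule only; it does not run or reproduce FASTDB, and the function name, argument names, and the 88.0% input used in the second call are hypothetical.

```python
def corrected_percent_identity(reported_identity, query_length,
                               unmatched_5prime=0, unmatched_3prime=0):
    """Apply the terminal-truncation correction to a reported percent identity.

    reported_identity: percent identity as reported by the alignment program.
    query_length:      total number of bases in the query sequence.
    unmatched_5prime:  query bases lying 5' of the subject that are not aligned.
    unmatched_3prime:  query bases lying 3' of the subject that are not aligned.
    """
    overhang = unmatched_5prime + unmatched_3prime
    penalty = 100.0 * overhang / query_length  # overhang as a percent of the query
    return reported_identity - penalty


# First example from the text: 90-base subject, 100-base query, 10 bases
# missing at the 5' end, remaining 90 bases perfectly matched.
print(corrected_percent_identity(100.0, 100, unmatched_5prime=10))  # 90.0

# Second example: the deletions are internal, so no terminal bases are
# unmatched and the reported value (hypothetical here) is left unchanged.
print(corrected_percent_identity(88.0, 100))
```

The same calculation applies to amino acid alignments by counting residues at the N- and C-termini instead of bases at the 5' and 3' ends.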

By a polypeptide having an amino acid sequence at least, for example, 95% "identical" to a query amino acid sequence of the present invention, it is intended that the amino acid sequence of the subject polypeptide is identical to the query sequence except that the subject polypeptide sequence may include up to five amino acid alterations per each 100 amino acids of the query amino acid sequence. In other words, to obtain a polypeptide having an amino acid sequence at least 95% identical to a query amino acid sequence, up to 5% of the amino acid residues in the subject sequence may be inserted, deleted, (indels) or substituted with another amino acid. These alterations of the reference sequence may occur at the amino or carboxy terminal positions of the reference amino acid sequence or anywhere between those terminal positions, interspersed either individually among residues in the reference sequence or in one or more contiguous groups within the reference sequence.

As a practical matter, whether any particular polypeptide is at least 90%, 95%, 96%, 97%, 98% or 99% identical to, for instance, the amino acid sequences shown in SEQ ID NO:2 or to the amino acid sequence encoded by the deposited DNA clone can be determined conventionally using known computer programs. A preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, can be determined using the FASTDB computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci. (1990) 6:237-245). In a sequence alignment the query and subject sequences are either both nucleotide sequences or both amino acid sequences. The result of said global sequence alignment is in percent identity. Preferred parameters used in a FASTDB amino acid alignment are: Matrix=PAM 0, k-tuple=2, Mismatch Penalty=1, Joining Penalty=20, Randomization Group Length=0, Cutoff Score=1, Window Size=sequence length, Gap Penalty=5, Gap Size Penalty=0.05, Window Size=500 or the length of the subject amino acid sequence, whichever is shorter.

If the subject sequence is shorter than the query sequence due to N- or C-terminal deletions, not because of internal deletions, a manual correction must be made to the results. This is because the FASTDB program does not account for N- and C-terminal truncations of the subject sequence when calculating global percent identity. For subject sequences truncated at the N- and C-termini, relative to the query sequence, the percent identity is corrected by calculating the number of residues of the query sequence that are N- and C-terminal of the subject sequence, which are not matched/aligned with a corresponding subject residue, as a percent of the total residues of the query sequence. Whether a residue is matched/aligned is determined by results of the FASTDB sequence alignment. This percentage is then subtracted from the percent identity, calculated by the above FASTDB program using the specified parameters, to arrive at a final percent identity score. This final percent identity score is what is used for the purposes of the present invention. Only residues to the N- and C-termini of the subject sequence, which are not matched/aligned with the query sequence, are considered for the purposes of manually adjusting the percent identity score. That is, only query residue positions outside the farthest N- and C-terminal residues of the subject sequence are considered.

For example, a 90 amino acid residue subject sequence is aligned with a 100 residue query sequence to determine percent identity. The deletion occurs at the N-terminus of the subject sequence and therefore, the FASTDB alignment does not show a matching/alignment of the first 10 residues at the N-terminus. The 10 unpaired residues represent 10% of the sequence (number of residues at the N- and C-termini not matched/total number of residues in the query sequence) so 10% is subtracted from the percent identity score calculated by the FASTDB program. If the remaining 90 residues were perfectly matched the final percent identity would be 90%. In another example, a 90 residue subject sequence is compared with a 100 residue query sequence. This time the deletions are internal deletions so there are no residues at the N- or C-termini of the subject sequence which are not matched/aligned with the query. In this case the percent identity calculated by FASTDB is not manually corrected. Once again, only residue positions outside the N- and C-terminal ends of the subject sequence, as displayed in the FASTDB alignment, which are not matched/aligned with the query sequence are manually corrected for. No other manual corrections are to be made for the purposes of the present invention.
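The manual correction described above is simple arithmetic and may be sketched as follows. This is a minimal illustration only, assuming the FASTDB percent identity and the counts of unmatched terminal query residues have already been read from the alignment; the function and parameter names are hypothetical and are not part of FASTDB.

```python
def corrected_percent_identity(fastdb_identity, query_length,
                               unmatched_n_terminal=0, unmatched_c_terminal=0):
    """Subtract the terminal-truncation penalty from a global percent identity.

    fastdb_identity      -- percent identity reported by the alignment program
    query_length         -- total number of residues in the query sequence
    unmatched_*_terminal -- query residues lying outside the N- or C-terminal
                            end of the subject that are not matched/aligned
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    unmatched = unmatched_n_terminal + unmatched_c_terminal
    # Unmatched terminal residues expressed as a percent of the query length.
    penalty = 100.0 * unmatched / query_length
    return fastdb_identity - penalty


# First example from the text: 10 N-terminal residues of a 100 residue
# query are unmatched and the remaining 90 residues match perfectly.
print(corrected_percent_identity(100.0, 100, unmatched_n_terminal=10))  # 90.0

# Second example: internal deletions only, so no correction is applied.
print(corrected_percent_identity(90.0, 100))  # 90.0
```

Internal gaps are deliberately not passed to the function, since only residues outside the terminal ends of the subject sequence enter the correction.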

The PSSP variants may contain alterations in the coding regions, non-coding regions, or both. Especially preferred are polynucleotide variants containing alterations which produce silent substitutions, additions, or deletions, but do not alter the properties or activities of the encoded polypeptide. Nucleotide variants produced by silent substitutions due to the degeneracy of the genetic code are preferred. Moreover, variants in which 5-10, 1-5, or 1-2 amino acids are substituted, deleted, or added in any combination are also preferred. PSSP polynucleotide variants can be produced for a variety of reasons, e.g., to optimize codon expression for a particular host (change codons in the human mRNA to those preferred by a bacterial host such as E. coli).

Naturally occurring PSSP variants are called "allelic variants," and refer to one of several alternate forms of a gene occupying a given locus on a chromosome of an organism. (Genes II, Lewin, B., ed., John Wiley & Sons, New York (1985).) These allelic variants can vary at either the polynucleotide and/or polypeptide level. Alternatively, non-naturally occurring variants may be produced by mutagenesis techniques or by direct synthesis.

Using known methods of protein engineering and recombinant DNA technology, variants may be generated to improve or alter the characteristics of the PSSP polypeptides. For instance, one or more amino acids can be deleted from the N-terminus or C-terminus of the secreted protein without substantial loss of biological function. The authors of Ron et al., J. Biol. Chem. 268: 2984-2988 (1993), reported variant KGF proteins having heparin binding activity even after deleting 3, 8, or 27 amino-terminal amino acid residues. Similarly, Interferon gamma exhibited up to ten times higher activity after deleting 8-10 amino acid residues from the carboxy terminus of this protein. (Dobeli et al., J. Biotechnology 7:199-216 (1988).)

Moreover, ample evidence demonstrates that variants often retain a biological activity similar to that of the naturally occurring protein. For example, Gayle and coworkers (J. Biol. Chem 268:22105-22111 (1993)) conducted extensive mutational analysis of human cytokine IL-1a. They used random mutagenesis to generate over 3,500 individual IL-1a mutants that averaged 2.5 amino acid changes per variant over the entire length of the molecule. Multiple mutations were examined at every possible amino acid position. The investigators found that "[m]ost of the molecule could be altered with little effect on either [binding or biological activity]." (See, Abstract.) In fact, only 23 unique amino acid sequences, out of more than 3,500 nucleotide sequences examined, produced a protein that significantly differed in activity from wild-type.

Furthermore, even if deleting one or more amino acids from the N-terminus or C-terminus of a polypeptide results in modification or loss of one or more biological functions, other biological activities may still be retained. For example, the ability of a deletion variant to induce and/or to bind antibodies which recognize the secreted form will likely be retained when less than the majority of the residues of the secreted form are removed from the N-terminus or C-terminus. Whether a particular polypeptide lacking N- or C-terminal residues of a protein retains such immunogenic activities can readily be determined by routine methods described herein and otherwise known in the art.

Thus, the invention further includes PSSP polypeptide variants which show substantial biological activity. Such variants include deletions, insertions, inversions, repeats, and substitutions selected according to general rules known in the art so as to have little effect on activity. For example, guidance concerning how to make phenotypically silent amino acid substitutions is provided in Bowie, J. U. et al., Science 247:1306-1310 (1990), wherein the authors indicate that there are two main strategies for studying the tolerance of an amino acid sequence to change.

The first strategy exploits the tolerance of amino acid substitutions by natural selection during the process of evolution. By comparing amino acid sequences in different species, conserved amino acids can be identified. These conserved amino acids are likely important for protein function. In contrast, the fact that substitutions have been tolerated by natural selection at other amino acid positions indicates that those positions are not critical for protein function. Thus, positions tolerating amino acid substitution could be modified while still maintaining biological activity of the protein.

The second strategy uses genetic engineering to introduce amino acid changes at specific positions of a cloned gene to identify regions critical for protein function. For example, site directed mutagenesis or alanine-scanning mutagenesis (introduction of single alanine mutations at every residue in the molecule) can be used. (Cunningham and Wells, Science 244:1081-1085 (1989).) The resulting mutant molecules can then be tested for biological activity.
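Alanine-scanning mutagenesis as described above can be illustrated by enumerating the single-alanine variants of a sequence. The following is a minimal sketch; the peptide "MKTAY" is a hypothetical toy sequence rather than any portion of SEQ ID NO:2, and skipping positions that are already alanine is an assumption made here for convenience.

```python
def alanine_scan(sequence):
    """Return (label, variant) pairs, one per non-alanine position.

    Each variant differs from the input by a single residue replaced with
    alanine. Labels use the common form <original><position>A, with
    positions numbered from 1.
    """
    variants = []
    for index, residue in enumerate(sequence):
        if residue == "A":
            # Already alanine: replacing it would not create a mutant.
            continue
        label = f"{residue}{index + 1}A"
        variant = sequence[:index] + "A" + sequence[index + 1:]
        variants.append((label, variant))
    return variants


for label, variant in alanine_scan("MKTAY"):
    print(label, variant)
```

Each resulting variant would then be expressed and tested for biological activity, as the paragraph above indicates.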

As the authors state, these two strategies have revealed that proteins are surprisingly tolerant of amino acid substitutions. The authors further indicate which amino acid changes are likely to be permissive at certain amino acid positions in the protein. For example, most buried (within the tertiary structure of the protein) amino acid residues require nonpolar side chains, whereas few features of surface side chains are generally conserved. Moreover, tolerated conservative amino acid substitutions involve replacement of the aliphatic or hydrophobic amino acids Ala, Val, Leu and Ile; replacement of the hydroxyl residues Ser and Thr; replacement of the acidic residues Asp and Glu; replacement of the amide residues Asn and Gln, replacement of the basic residues Lys, Arg, and His; replacement of the aromatic residues Phe, Tyr, and Trp, and replacement of the small-sized amino acids Ala, Ser, Thr, Met, and Gly.
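The conservative substitution groups listed above can be written down as a small lookup. The sketch below is illustrative only: the group names and the function name are assumptions, and membership in a shared group is used here as the sole test, which is a simplification of the guidance given by the cited authors.

```python
# One-letter codes for the groups named in the preceding paragraph.
CONSERVATIVE_GROUPS = {
    "aliphatic_or_hydrophobic": frozenset("AVLI"),
    "hydroxyl": frozenset("ST"),
    "acidic": frozenset("DE"),
    "amide": frozenset("NQ"),
    "basic": frozenset("KRH"),
    "aromatic": frozenset("FYW"),
    "small": frozenset("ASTMG"),
}


def is_conservative(original, replacement):
    """True if both residues differ and share at least one listed group."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS.values())


print(is_conservative("D", "E"))  # True: both acidic
print(is_conservative("K", "D"))  # False: basic versus acidic
print(is_conservative("A", "G"))  # True: both in the small-sized group
```

Because a residue such as Ala appears in more than one group, the check tests every group rather than assigning each residue a single class.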

Besides conservative amino acid substitution, variants of PSSP include (i) substitutions with one or more of the non-conserved amino acid residues, where the substituted amino acid residues may or may not be one encoded by the genetic code, or (ii) substitution with one or more of amino acid residues having a substituent group, or (iii) fusion of the mature polypeptide with another compound, such as a compound to increase the stability and/or solubility of the polypeptide (for example, polyethylene glycol), or (iv) fusion of the polypeptide with additional amino acids, such as an IgG Fc fusion region peptide, or leader or secretory sequence, or a sequence facilitating purification. Such variant polypeptides are deemed to be within the scope of those skilled in the art from the teachings herein.

For example, PSSP polypeptide variants containing amino acid substitutions of charged amino acids with other charged or neutral amino acids may produce proteins with improved characteristics, such as less aggregation. Aggregation of pharmaceutical formulations both reduces activity and increases clearance due to the aggregate's immunogenic activity. (Pinckard et al., Clin. Exp. Immunol. 2:331-340 (1967); Robbins et al., Diabetes 36: 838-845 (1987); Cleland et al., Crit. Rev. Therapeutic Drug Carrier Systems 10:307-377 (1993).)

A further embodiment of the invention relates to a polypeptide which comprises the amino acid sequence of a PSSP polypeptide having an amino acid sequence which contains at least one amino acid substitution, but not more than 50 amino acid substitutions, even more preferably, not more than 40 amino acid substitutions, still more preferably, not more than 30 amino acid substitutions, and still even more preferably, not more than 20 amino acid substitutions. Of course, in order of ever-increasing preference, it is highly preferable for a peptide or polypeptide to have an amino acid sequence which comprises the amino acid sequence of a PSSP polypeptide, which contains at least one, but not more than 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 amino acid substitutions. In specific embodiments, the number of additions, substitutions, and/or deletions in the amino acid sequence of FIG. 1 or fragments thereof (e.g., the mature form and/or other fragments described herein), is 1-5, 5-10, 5-25, 5-50, 10-50 or 50-150; conservative amino acid substitutions are preferable.

Polynucleotide and Polypeptide Fragments

In the present invention, a "polynucleotide fragment" refers to a short polynucleotide having a nucleic acid sequence contained in the deposited clone or shown in SEQ ID NO:1. The short nucleotide fragments are preferably at least about 15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt, and even more preferably, at least about 40 nt in length. A fragment "at least 20 nt in length," for example, is intended to include 20 or more contiguous bases from the cDNA sequence contained in the deposited clone or the nucleotide sequence shown in SEQ ID NO:1. These nucleotide fragments are useful as diagnostic probes and primers as discussed herein. Of course, larger fragments (e.g., 50, 150, 500, 600, 2000 nucleotides) are preferred.

Moreover, representative examples of PSSP polynucleotide fragments include, for example, fragments having a sequence from about nucleotide number 1-50, 51-100, 101-150, 151-200, 201-250, 251-300, 301-350, 351-400, 401-450, 451-500, 501-550, 551-600, 651-700, 701-750, 751-800, 800-850, 851-900, 901-950, 951-1000, 1001-1050, 1051-1100, 1101-1150, 1151-1200, 1201-1250, 1251-1300, 1301-1350, 1351-1400, 1401-1450, 1451-1500, 1501-1550, 1551-1600, 1601-1650, 1651-1700, 1701-1750, 1751-1800, 1801-1850, 1851-1900, 1901-1950, 1951-2000, or 2001 to the end of SEQ ID NO:1 or the cDNA contained in the deposited clone. In this context "about" includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. Preferably, these fragments encode a polypeptide which has biological activity. More preferably, these polynucleotides can be used as probes or primers as discussed herein.

In the present invention, a "polypeptide fragment" refers to a short amino acid sequence contained in SEQ ID NO:2 or encoded by the cDNA contained in the deposited clone. Protein fragments may be "free-standing," or comprised within a larger polypeptide of which the fragment forms a part or region, most preferably as a single continuous region. Representative examples of polypeptide fragments of the invention include, for example, fragments from about amino acid number 1-20, 21-40, 41-60, 61-80, 81-100, 102-120, 121-140, 141-160, 161-180, 181-200, 201-220, 221-240, 241-260, 261-280, or 281 to the end of the coding region. Moreover, polypeptide fragments can be about 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, or 150 amino acids in length. In this context "about" includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1) amino acids, at either extreme or at both extremes.

Preferred polypeptide fragments include the secreted PSSP protein as well as the mature form. Further preferred polypeptide fragments include the secreted PSSP protein or the mature form having a continuous series of deleted residues from the amino or the carboxy terminus, or both. For example, any number of amino acids, ranging from 1-60, can be deleted from the amino terminus of either the secreted PSSP polypeptide or the mature form. Similarly, any number of amino acids, ranging from 1-30, can be deleted from the carboxy terminus of the secreted PSSP protein or mature form. Furthermore, any combination of the above amino and carboxy terminus deletions are preferred. Similarly, polynucleotide fragments encoding these PSSP polypeptide fragments are also preferred.

Particularly, N-terminal deletions of the PSSP polypeptide can be described by the general formula m-178, where m is an integer from 2 to 177, where m corresponds to the position of the amino acid residue identified in SEQ ID NO:2. More in particular, the invention provides polynucleotides encoding polypeptides comprising, or alternatively consisting of, the amino acid sequence of: H-2 to R-178; R-3 to R-178; P-4 to R-178; E-5 to R-178; A-6 to R-178; M-7 to R-178; L-8 to R-178; L-9 to R-178; L-10 to R-178; L-11 to R-178; T-12 to R-178; L-13 to R-178; A-14 to R-178; L-15 to R-178; L-16 to R-178; G-17 to R-178; G-18 to R-178; P-19 to R-178; T-20 to R-178; W-21 to R-178; A-22 to R-178; G-23 to R-178; K-24 to R-178; M-25 to R-178; Y-26 to R-178; G-27 to R-178; P-28 to R-178; G-29 to R-178; G-30 to R-178; G-31 to R-178; K-32 to R-178; Y-33 to R-178; F-34 to R-178; S-35 to R-178; T-36 to R-178; T-37 to R-178; E-38 to R-178; D-39 to R-178; Y-40 to R-178; D-41 to R-178; H-42 to R-178; E-43 to R-178; I-44 to R-178; T-45 to R-178; G-46 to R-178; L-47 to R-178; R-48 to R-178; V-49 to R-178; S-50 to R-178; V-51 to R-178; G-52 to R-178; L-53 to R-178; L-54 to R-178; L-55 to R-178; V-56 to R-178; K-57 to R-178; S-58 to R-178; V-59 to R-178; Q-60 to R-178; V-61 to R-178; K-62 to R-178; L-63 to R-178; G-64 to R-178; D-65 to R-178; S-66 to R-178; W-67 to R-178; D-68 to R-178; V-69 to R-178; K-70 to R-178; L-71 to R-178; G-72 to R-178; A-73 to R-178; L-74 to R-178; G-75 to R-178; G-76 to R-178; N-77 to R-178; T-78 to R-178; Q-79 to R-178; E-80 to R-178; V-81 to R-178; T-82 to R-178; L-83 to R-178; Q-84 to R-178; P-85 to R-178; G-86 to R-178; E-87 to R-178; Y-88 to R-178; I-89 to R-178; T-90 to R-178; K-91 to R-178; V-92 to R-178; F-93 to R-178; V-94 to R-178; A-95 to R-178; F-96 to R-178; Q-97 to R-178; A-98 to R-178; F-99 to R-178; L-100 to R-178; R-101 to R-178; G-102 to R-178; V-103 to R-178; V-104 to R-178; M-105 to R-178; Y-106 to R-178; T-107 to R-178; S-108 to R-178; K-109 to R-178; D-110 to R-178; R-111 to R-178; Y-112 to R-178; F-113 to R-178; Y-114 to R-178; F-115 to R-178; G-116 to R-178; K-117 to R-178; L-118 to R-178; D-119 to R-178; G-120 to R-178; Q-121 to R-178; I-122 to R-178; S-123 to R-178; S-124 to R-178; A-125 to R-178; Y-126 to R-178; P-127 to R-178; S-128 to R-178; Q-129 to R-178; E-130 to R-178; G-131 to R-178; Q-132 to R-178; V-133 to R-178; L-134 to R-178; V-135 to R-178; G-136 to R-178; I-137 to R-178; Y-138 to R-178; G-139 to R-178; Q-140 to R-178; Y-141 to R-178; Q-142 to R-178; L-143 to R-178; L-144 to R-178; G-145 to R-178; I-146 to R-178; K-147 to R-178; S-148 to R-178; I-149 to R-178; G-150 to R-178; F-151 to R-178; E-152 to R-178; W-153 to R-178; N-154 to R-178; Y-155 to R-178; P-156 to R-178; L-157 to R-178; E-158 to R-178; E-159 to R-178; P-160 to R-178; T-161 to R-178; T-162 to R-178; E-163 to R-178; P-164 to R-178; P-165 to R-178; V-166 to R-178; N-167 to R-178; L-168 to R-178; T-169 to R-178; Y-170 to R-178; S-171 to R-178; A-172 to R-178; N-173 to R-178; of SEQ ID NO:2. Polynucleotides encoding these polypeptides are also encompassed by the invention.

Moreover, C-terminal deletions of the PSSP polypeptide can also be described by the general formula 1-n, where n is an integer from 2 to 177, where n corresponds to the position of amino acid residue identified in SEQ ID NO:2. More in particular, the invention provides polynucleotides encoding polypeptides comprising, or alternatively consisting of, the amino acid sequence of residues: M-1 to G-177; M-1 to V-176; M-1 to P-175; M-1 to S-174; M-1 to N-173; M-1 to A-172; M-1 to S-171; M-1 to Y-170; M-1 to T-169; M-1 to L-168; M-1 to N-167; M-1 to V-166; M-1 to P-165; M-1 to P-164; M-1 to E-163; M-1 to T-162; M-1 to T-161; M-1 to P-160; M-1 to E-159; M-1 to E-158; M-1 to L-157; M-1 to P-156; M-1 to Y-155; M-1 to N-154; M-1 to W-153; M-1 to E-152; M-1 to F-151; M-1 to G-150; M-1 to I-149; M-1 to S-148; M-1 to K-147; M-1 to I-146; M-1 to G-145; M-1 to L-144; M-1 to L-143; M-1 to Q-142; M-1 to Y-141; M-1 to Q-140; M-1 to G-139; M-1 to Y-138; M-1 to I-137; M-1 to G-136; M-1 to V-135; M-1 to L-134; M-1 to V-133; M-1 to Q-132; M-1 to G-131; M-1 to E-130; M-1 to Q-129; M-1 to S-128; M-1 to P-127; M-1 to Y-126; M-1 to A-125; M-1 to S-124; M-1 to S-123; M-1 to I-122; M-1 to Q-121; M-1 to G-120; M-1 to D-119; M-1 to L-118; M-1 to K-117; M-1 to G-116; M-1 to F-115; M-1 to Y-114; M-1 to F-113; M-1 to Y-112; M-1 to R-111; M-1 to D-110; M-1 to K-109; M-1 to S-108; M-1 to T-107; M-1 to Y-106; M-1 to M-105; M-1 to V-104; M-1 to V-103; M-1 to G-102; M-1 to R-101; M-1 to L-100; M-1 to F-99; M-1 to A-98; M-1 to Q-97; M-1 to F-96; M-1 to A-95; M-1 to V-94; M-1 to F-93; M-1 to V-92; M-1 to K-91; M-1 to T-90; M-1 to I-89; M-1 to Y-88; M-1 to E-87; M-1 to G-86; M-1 to P-85; M-1 to Q-84; M-1 to L-83; M-1 to T-82; M-1 to V-81; M-1 to E-80; M-1 to Q-79; M-1 to T-78; M-1 to N-77; M-1 to G-76; M-1 to G-75; M-1 to L-74; M-1 to A-73; M-1 to G-72; M-1 to L-71; M-1 to K-70; M-1 to V-69; M-1 to D-68; M-1 to W-67; M-1 to S-66; M-1 to D-65; M-1 to G-64; M-1 to L-63; M-1 to K-62; M-1 to V-61; M-1 to Q-60; M-1 to V-59; M-1 to S-58; M-1 to K-57; M-1 to V-56; M-1 to L-55; M-1 to L-54; M-1 to L-53; M-1 to G-52; M-1 to V-51; M-1 to S-50; M-1 to V-49; M-1 to R-48; M-1 to L-47; M-1 to G-46; M-1 to T-45; M-1 to I-44; M-1 to E-43; M-1 to H-42; M-1 to D-41; M-1 to Y-40; M-1 to D-39; M-1 to E-38; M-1 to T-37; M-1 to T-36; M-1 to S-35; M-1 to F-34; M-1 to Y-33; M-1 to K-32; M-1 to G-31; M-1 to G-30; M-1 to G-29; M-1 to P-28; M-1 to G-27; M-1 to Y-26; M-1 to M-25; M-1 to K-24; M-1 to G-23; M-1 to A-22; M-1 to W-21; M-1 to T-20; M-1 to P-19; M-1 to G-18; M-1 to G-17; M-1 to L-16; M-1 to L-15; M-1 to A-14; M-1 to L-13; M-1 to T-12; M-1 to L-11; M-1 to L-10; M-1 to L-9; M-1 to L-8; M-1 to M-7; of SEQ ID NO:2. Polynucleotides encoding these polypeptides are also encompassed by the invention.

In addition, any of the above listed N- or C-terminal deletions can be combined to produce a N- and C-terminal deleted PSSP polypeptide. The invention also provides polypeptides having one or more amino acids deleted from both the amino and the carboxyl termini, which may be described generally as having residues m-n of SEQ ID NO:2, where n and m are integers as described above. Polynucleotides encoding these polypeptides are also encompassed by the invention.
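The m-n notation used above can be illustrated by a short helper that extracts a terminally deleted fragment and labels it in the same residue-position style as the lists above. This is a sketch under stated assumptions: the function name is hypothetical, and the eight-residue peptide used in the example is a toy input chosen for illustration, not the full sequence of SEQ ID NO:2.

```python
def terminal_deletion(sequence, m, n):
    """Return (label, fragment) for residues m through n, numbered from 1.

    m > 1 corresponds to an N-terminal deletion, n < len(sequence) to a
    C-terminal deletion, and both together to a combined deletion.
    """
    if not (1 <= m <= n <= len(sequence)):
        raise ValueError("require 1 <= m <= n <= sequence length")
    fragment = sequence[m - 1:n]
    label = f"{fragment[0]}-{m} to {fragment[-1]}-{n}"
    return label, fragment


toy_peptide = "MHRPEAML"  # illustrative eight-residue input only

print(terminal_deletion(toy_peptide, 2, 8))  # N-terminal deletion of 1 residue
print(terminal_deletion(toy_peptide, 1, 7))  # C-terminal deletion of 1 residue
print(terminal_deletion(toy_peptide, 3, 6))  # combined deletion
```

Applied to a full-length sequence, the same helper would reproduce labels of the form used in the N-terminal and C-terminal deletion lists.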

Also preferred are PSSP polypeptide and polynucleotide fragments characterized by structural or functional domains. Preferred embodiments of the invention include fragments that comprise alpha-helix and alpha-helix forming regions ("alpha-regions"), beta-sheet and beta-sheet-forming regions ("beta-regions"), turn and turn-forming regions ("turn-regions"), coil and coil-forming regions ("coil-regions"), hydrophilic regions, hydrophobic regions, alpha amphipathic regions, beta amphipathic regions, flexible regions, surface-forming regions, substrate binding regions, and high antigenic index regions. As set out in the Figures, such preferred regions include Garnier-Robson alpha-regions, beta-regions, turn-regions, and coil-regions, Chou-Fasman alpha-regions, beta-regions, and turn-regions, Kyte-Doolittle hydrophilic regions and hydrophobic regions, Eisenberg alpha and beta amphipathic regions, Karplus-Schulz flexible regions, Emini surface-forming regions, and Jameson-Wolf high antigenic index regions. Polypeptide fragments of SEQ ID NO:2 falling within conserved domains are specifically contemplated by the present invention. (See FIG. 3.) Moreover, polynucleotide fragments encoding these domains are also contemplated.

Other preferred fragments are biologically active PSSP fragments. Biologically active fragments are those exhibiting activity similar, but not necessarily identical, to an activity of the PSSP polypeptide. The biological activity of the fragments may include an improved desired activity, or a decreased undesirable activity.

However, many polynucleotide sequences, such as EST sequences, are publicly available and accessible through sequence databases. Some of these sequences are related to SEQ ID NO:1 and may have been publicly available prior to conception of the present invention. Preferably, such related polynucleotides are specifically excluded from the scope of the present invention. To list every related sequence would be cumbersome. Accordingly, preferably excluded from the present invention are one or more polynucleotides comprising a nucleotide sequence described by the general formula of a-b, where a is any integer from 1 to 811 of SEQ ID NO:1, b is an integer from 15 to 825, where both a and b correspond to the positions of nucleotide residues shown in SEQ ID NO:1, and where b is greater than or equal to a+14.
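The a-b formula above defines a finite set of nucleotide ranges, and its constraints can be checked mechanically. The following is a minimal sketch, assuming that the stated bounds on a and b are inclusive; the constant and function names are illustrative only.

```python
A_MIN, A_MAX = 1, 811    # permitted start positions a
B_MIN, B_MAX = 15, 825   # permitted end positions b
MIN_OFFSET = 14          # b must be at least a + 14 (a span of 15 nt or more)


def satisfies_formula(a, b):
    """True if the range a-b meets every constraint of the general formula."""
    return (A_MIN <= a <= A_MAX
            and B_MIN <= b <= B_MAX
            and b >= a + MIN_OFFSET)


def count_ranges():
    """Count all distinct a-b ranges permitted by the formula."""
    return sum(1
               for a in range(A_MIN, A_MAX + 1)
               for b in range(B_MIN, B_MAX + 1)
               if satisfies_formula(a, b))


print(satisfies_formula(1, 15))     # True: shortest range at the 5' end
print(satisfies_formula(811, 824))  # False: spans only 14 nucleotides
print(count_ranges())               # 329266
```

The count agrees with the closed form 811 x 812 / 2, since for each a there are 812 - a admissible values of b.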

Epitope-Bearing Portions

In another aspect, the invention provides peptides and polypeptides comprising epitope-bearing portions of the polypeptides of the present invention. These epitopes are immunogenic or antigenic epitopes of the polypeptides of the present invention. An "immunogenic epitope" is defined as a part of a protein that elicits an antibody response in vivo when the whole polypeptide of the present invention, or fragment thereof, is the immunogen. On the other hand, a region of a polypeptide to which an antibody can bind is defined as an "antigenic determinant" or "antigenic epitope." The number of in vivo immunogenic epitopes of a protein generally is less than the number of antigenic epitopes. See, e.g., Geysen, et al. (1983) Proc. Natl. Acad. Sci. USA 81:3998-4002. However, antibodies can be made to any antigenic epitope, regardless of whether it is an immunogenic epitope, by using methods such as phage display. See e.g., Petersen G. et al. (1995) Mol. Gen. Genet. 249:425-431. Therefore, included in the present invention are both immunogenic epitopes and antigenic epitopes.

A list of exemplified amino acid sequences comprising immunogenic epitopes is shown in Table 1 below. It is pointed out that Table 1 only lists amino acid residues comprising epitopes predicted to have the highest degree of antigenicity using the algorithm of Jameson and Wolf, (1988) Comp. Appl. Biosci. 4:181-186 (said references incorporated by reference in their entireties). The Jameson-Wolf antigenic analysis was performed using the computer program PROTEAN, using default parameters (Version 3.11 for the Power Macintosh, DNASTAR, Inc., 1228 South Park Street, Madison, Wis.). Portions of polypeptides not listed in Table 1 are not thereby considered non-immunogenic. The immunogenic epitopes of Table 1 are an exemplified list, not an exhaustive list, because other immunogenic epitopes are merely not recognized as such by the particular algorithm used. Amino acid residues comprising other immunogenic epitopes may be routinely determined using algorithms similar to the Jameson-Wolf analysis or by in vivo testing for an antigenic response using methods known in the art. See, e.g., Geysen et al., supra; U.S. Pat. Nos. 4,708,781; 5,194,392; 4,433,092; and 5,480,971 (said references incorporated by reference in their entireties). As shown in Table 1, SEQ ID NO:2 was found antigenic at amino acids: Y26-F34, S35-E43, Y26-E43, K62-K70, L74-T82, M105-F113, F115-S123, A125-V133, P156-P165, and S171-R178.

It is particularly pointed out that the amino acid sequences of Table 1 comprise immunogenic epitopes. Table 1 lists only the critical residues of immunogenic epitopes determined by the Jameson-Wolf analysis. Thus, additional flanking residues on either the N-terminal, C-terminal, or both N- and C-terminal ends may be added to the sequences of Table 1 to generate an epitope-bearing polypeptide of the present invention. Therefore, the immunogenic epitopes of Table 1 may include additional N-terminal or C-terminal amino acid residues. The additional flanking amino acid residues may be contiguous flanking N-terminal and/or C-terminal sequences from the polypeptides of the present invention, heterologous polypeptide sequences, or may include both contiguous flanking sequences from the polypeptides of the present invention and heterologous polypeptide sequences. Polypeptides of the present invention comprising immunogenic or antigenic epitopes are at least 7 amino acid residues in length. "At least" means that a polypeptide of the present invention comprising an immunogenic or antigenic epitope may be 7 amino acid residues in length or any integer between 7 amino acids and the number of amino acid residues of the full length polypeptides of the invention. Preferred polypeptides comprising immunogenic or antigenic epitopes are at least 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 amino acid residues in length. However, it is pointed out that each and every integer between 7 and the number of amino acid residues of the full length polypeptide are included in the present invention.
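As a simple consistency check, the residue ranges recited for Table 1 can be parsed and compared against the 7-residue minimum stated above. The sketch below is illustrative; the range string is copied from the recitation of Table 1 in this section, while the function names and the parsing approach are assumptions.

```python
import re

# Antigenic ranges as recited for Table 1 (residue letter plus position).
TABLE_1_RANGES = ("Y26-F34, S35-E43, Y26-E43, K62-K70, L74-T82, M105-F113, "
                  "F115-S123, A125-V133, P156-P165, S171-R178")

MIN_EPITOPE_LENGTH = 7


def parse_ranges(text):
    """Return (start, end) position pairs parsed from entries such as 'Y26-F34'."""
    pattern = re.compile(r"[A-Z](\d+)-[A-Z](\d+)")
    return [(int(m.group(1)), int(m.group(2))) for m in pattern.finditer(text)]


def span_length(start, end):
    """Number of residues from start to end, inclusive."""
    return end - start + 1


lengths = [span_length(start, end) for start, end in parse_ranges(TABLE_1_RANGES)]
print(lengths)                                          # [9, 9, 18, 9, 9, 9, 9, 9, 10, 8]
print(all(n >= MIN_EPITOPE_LENGTH for n in lengths))    # True
```

Flanking residues added to either end, as described above, would only lengthen these spans.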

The immunogenic and antigenic epitope-bearing fragments may be specified by either the number of contiguous amino acid residues, as described above, or further specified by N-terminal and C-terminal positions of these fragments on the amino acid sequence of SEQ ID NO:2. Every combination of a N-terminal and C-terminal position that a fragment of, for example, at least 7 or at least 15 contiguous amino acid residues in length could occupy on the amino acid sequence of SEQ ID NO:2 is included in the invention. Again, "at least 7 contiguous amino acid residues in length" means 7 amino acid residues in length or any integer between 7 amino acids and the number of amino acid residues of the full length polypeptide of the present invention. Specifically, each and every integer between 7 and the number of amino acid residues of the full length polypeptide are included in the present invention.
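The set of N-terminal and C-terminal position combinations described above can be enumerated directly. The following is a minimal sketch, assuming a full-length sequence of 178 residues as suggested by the m-178 formula used earlier in this section; the function names are hypothetical.

```python
def fragment_positions(sequence_length, min_length=7):
    """Yield every (start, end) pair, numbered from 1, that spans at least
    min_length contiguous residues within the sequence."""
    for start in range(1, sequence_length - min_length + 2):
        for end in range(start + min_length - 1, sequence_length + 1):
            yield start, end


def count_fragments(sequence_length, min_length=7):
    """Count the same pairs without enumerating them one by one."""
    return sum(sequence_length - length + 1
               for length in range(min_length, sequence_length + 1))


FULL_LENGTH = 178  # assumed length, taken from the m-178 formula

print(count_fragments(FULL_LENGTH))                     # 14878
print(count_fragments(FULL_LENGTH, min_length=15))      # 13530
print(next(fragment_positions(FULL_LENGTH)))            # (1, 7)
```

The two functions agree because a fragment of a given length can start at sequence_length - length + 1 distinct positions.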

Immunogenic and antigenic epitope-bearing polypeptides of the invention are useful, for example, to make antibodies which specifically bind the polypeptides of the invention, and in immunoassays to detect the polypeptides of the present invention. The antibodies are useful, for example, in affinity purification of the polypeptides of the present invention. The antibodies may also routinely be used in a variety of qualitative or quantitative immunoassays, specifically for the polypeptides of the present invention using methods known in the art. See, e.g., Harlow et al., Antibodies: A Laboratory Manual, (Cold Spring Harbor Laboratory Press; 2nd Ed. 1988).

The epitope-bearing polypeptides of the present invention may be produced by any conventional means for making polypeptides including synthetic and recombinant methods known in the art. For instance, epitope-bearing peptides may be synthesized using known methods of chemical synthesis. Houghten, for example, has described a simple method for the synthesis of large numbers of peptides, such as 10-20 mgs of 248 individual and distinct 13 residue peptides representing single amino acid variants of a segment of the HA1 polypeptide, all of which were prepared and characterized (by ELISA-type binding studies) in less than four weeks (Houghten, R. A. Proc. Natl. Acad. Sci. USA 82:5131-5135 (1985)). This "Simultaneous Multiple Peptide Synthesis (SMPS)" process is further described in U.S. Pat. No. 4,631,211 to Houghten and coworkers (1986). In this procedure the individual resins for the solid-phase synthesis of various peptides are contained in separate solvent-permeable packets, enabling the optimal use of the many identical repetitive steps involved in solid-phase methods. A completely manual procedure allows 500-1000 or more syntheses to be conducted simultaneously (Houghten et al. (1985) Proc. Natl. Acad. Sci. 82:5131-5135 at 5134).

Epitope-bearing polypeptides of the present invention are used to induce antibodies according to methods well known in the art including, but not limited to, in vivo immunization, in vitro immunization, and phage display methods. See, e.g., Sutcliffe, et al., supra; Wilson, et al., supra, and Bittle, et al. (1985) J. Gen. Virol. 66:2347-2354. If in vivo immunization is used, animals may be immunized with free peptide; however, anti-peptide antibody titer may be boosted by coupling of the peptide to a macromolecular carrier, such as keyhole limpet hemocyanin (KLH) or tetanus toxoid. For instance, peptides containing cysteine residues may be coupled to a carrier using a linker such as m-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS), while other peptides may be coupled to carriers using a more general linking agent such as glutaraldehyde. Animals such as rabbits, rats and mice are immunized with either free or carrier-coupled peptides, for instance, by intraperitoneal and/or intradermal injection of emulsions containing about 100 µg of peptide or carrier protein and Freund's adjuvant. Several booster injections may be needed, for instance, at intervals of about two weeks, to provide a useful titer of anti-peptide antibody which can be detected, for example, by ELISA assay using free peptide adsorbed to a solid surface. The titer of anti-peptide antibodies in serum from an immunized animal may be increased by selection of anti-peptide antibodies, for instance, by adsorption to the peptide on a solid support and elution of the selected antibodies according to methods well known in the art.

As one of skill in the art will appreciate, and as discussed above, the polypeptides of the present invention comprising an immunogenic or antigenic epitope can be fused to heterologous polypeptide sequences. For example, the polypeptides of the present invention may be fused with the constant domain of immunoglobulins (IgA, IgE, IgG, IgM), or portions thereof (CH1, CH2, CH3, any combination thereof including both entire domains and portions thereof) resulting in chimeric polypeptides. These fusion proteins facilitate purification, and show an increased half-life in vivo. This has been shown, e.g., for chimeric proteins consisting of the first two domains of the human CD4-polypeptide and various domains of the constant regions of the heavy or light chains of mammalian immunoglobulins. See, e.g., EPA 0,394,827; Traunecker et al. (1988) Nature 331:84-86. Fusion proteins that have a disulfide-linked dimeric structure due to the IgG portion can also be more efficient in binding and neutralizing other molecules than monomeric polypeptides or fragments thereof alone. See, e.g., Fountoulakis et al. (1995) J. Biochem. 270:3958-3964. Nucleic acids encoding the above epitopes can also be recombined with a gene of interest as an epitope tag to aid in detection and purification of the expressed polypeptide.

Antibodies

The present invention further relates to antibodies and T-cell antigen receptors (TCR) which specifically bind the polypeptides of the present invention. The antibodies of the present invention include IgG (including IgG1, IgG2, IgG3, and IgG4), IgA (including IgA1 and IgA2), IgD, IgE, IgM, and IgY. As used herein, the term "antibody" (Ab) is meant to include whole antibodies, including single-chain whole antibodies, and antigen-binding fragments thereof. Most preferably the antibodies are human antigen binding antibody fragments of the present invention and include, but are not limited to, Fab, Fab' and F(ab')2, Fd, single-chain Fvs (scFv), single-chain antibodies, disulfide-linked Fvs (sdFv) and fragments comprising either a VL or VH domain. The antibodies may be from any animal origin including birds and mammals. Preferably, the antibodies are human, murine, rabbit, goat, guinea pig, camel, horse, or chicken.

Antigen-binding antibody fragments, including single-chain antibodies, may comprise the variable region(s) alone or in combination with all or part of the following: hinge region, CH1, CH2, and CH3 domains. Also included in the invention are any combinations of variable region(s) and hinge region, CH1, CH2, and CH3 domains. The present invention further includes chimeric, humanized, and human monoclonal and polyclonal antibodies which specifically bind the polypeptides of the present invention. The present invention further includes antibodies which are anti-idiotypic to the antibodies of the present invention.

The antibodies of the present invention may be monospecific, bispecific, trispecific or of greater multispecificity. Multispecific antibodies may be specific for different epitopes of a polypeptide of the present invention or may be specific for both a polypeptide of the present invention as well as for heterologous compositions, such as a heterologous polypeptide or solid support material. See, e.g., WO 93/17715; WO 92/08802; WO 91/00360; WO 92/05793; Tutt, A. et al. (1991) J. Immunol. 147:60-69; U.S. Pat. Nos. 5,573,920, 4,474,893, 5,601,819, 4,714,681, 4,925,648; Kostelny, S. A. et al. (1992) J. Immunol. 148:1547-1553.

Antibodies of the present invention may be described or specified in terms of the epitope(s) or portion(s) of a polypeptide of the present invention which are recognized or specifically bound by the antibody. The epitope(s) or polypeptide portion(s) may be specified as described herein, e.g., by N-terminal and C-terminal positions, by size in contiguous amino acid residues, or listed in the Tables and Figures. Antibodies which specifically bind any epitope or polypeptide of the present invention may also be excluded. Therefore, the present invention includes antibodies that specifically bind polypeptides of the present invention, and allows for the exclusion of the same.

Antibodies of the present invention may also be described or specified in terms of their cross-reactivity. Antibodies that do not bind any other analog, ortholog, or homolog of the polypeptides of the present invention are included. Antibodies that do not bind polypeptides with less than 95%, less than 90%, less than 85%, less than 80%, less than 75%, less than 70%, less than 65%, less than 60%, less than 55%, and less than 50% identity (as calculated using methods known in the art and described herein) to a polypeptide of the present invention are also included in the present invention. Further included in the present invention are antibodies which only bind polypeptides encoded by polynucleotides which hybridize to a polynucleotide of the present invention under stringent hybridization conditions (as described herein). Antibodies of the present invention may also be described or specified in terms of their binding affinity. Preferred binding affinities include those with a dissociation constant or Kd less than 5 x 10^-6 M, 10^-6 M, 5 x 10^-7 M, 10^-7 M, 5 x 10^-8 M, 10^-8 M, 5 x 10^-9 M, 10^-9 M, 5 x 10^-10 M, 10^-10 M, 5 x 10^-11 M, 10^-11 M, 5 x 10^-12 M, 10^-12 M, 5 x 10^-13 M, 10^-13 M, 5 x 10^-14 M, 10^-14 M, 5 x 10^-15 M, and 10^-15 M.
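
For illustration only, the dissociation-constant thresholds listed above can be related to equilibrium binding with a short calculation. The sketch below assumes simple 1:1 binding with free antibody in large excess over antigen, so that the fraction of antigen bound is [Ab]/([Ab] + Kd). The function names and example values are hypothetical and are not taken from the specification.

```python
# Illustrative sketch only: simple 1:1 equilibrium binding, antibody in excess.
# Function names and example values are hypothetical, not from the specification.
from typing import List

# Upper Kd bounds (molar) enumerated in the text: 5x10^-6, 10^-6, ..., 5x10^-15, 10^-15.
KD_THRESHOLDS_M: List[float] = [
    m * 10.0 ** -e for e in range(6, 16) for m in (5.0, 1.0)
]


def fraction_bound(free_antibody_m: float, kd_m: float) -> float:
    """Fraction of antigen bound at equilibrium under the 1:1 binding assumption."""
    if free_antibody_m < 0 or kd_m <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_antibody_m / (free_antibody_m + kd_m)


def thresholds_met(kd_m: float) -> List[float]:
    """Return the listed Kd thresholds that a measured Kd falls below."""
    return [t for t in KD_THRESHOLDS_M if kd_m < t]
```

Under these assumptions, an antibody with a Kd of 10^-9 M at a free concentration of 10^-9 M binds half of the available antigen; a lower Kd gives a higher bound fraction at the same concentration.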

Antibodies of the present invention have uses that include, but are not limited to, methods known in the art to purify, detect, and target the polypeptides of the present invention including both in vitro and in vivo diagnostic and therapeutic methods. For example, the antibodies have use in immunoassays for qualitatively and quantitatively measuring levels of the polypeptides of the present invention in biological samples. See, e.g., Harlow et al., ANTIBODIES: A LABORATORY MANUAL, (Cold Spring Harbor Laboratory Press, 2nd ed. 1988) (incorporated by reference in the entirety).

The antibodies of the present invention may be used either alone or in combination with other compositions. The antibodies may further be recombinantly fused to a heterologous polypeptide at the N- or C-terminus or chemically conjugated (including covalent and non-covalent conjugations) to polypeptides or other compositions. For example, antibodies of the present invention may be recombinantly fused or conjugated to molecules useful as labels in detection assays and effector molecules such as heterologous polypeptides, drugs, or toxins. See, e.g., WO 92/08495; WO 91/14438; WO 89/12624; U.S. Pat. No. 5,314,995; and EP 0 396 387.

The antibodies of the present invention may be prepared by any suitable method known in the art. For example, a polypeptide of the present invention or an antigenic fragment thereof can be administered to an animal in order to induce the production of sera containing polyclonal antibodies. Monoclonal antibodies can be prepared using a wide variety of techniques known in the art including the use of hybridoma and recombinant technology. See, e.g., Harlow et al., ANTIBODIES: A LABORATORY MANUAL, (Cold Spring Harbor Laboratory Press, 2nd ed. 1988); Hammerling, et al., in: MONOCLONAL ANTIBODIES AND T-CELL HYBRIDOMAS 563-681 (Elsevier, N.Y., 1981) (said references incorporated by reference in their entireties). Fab and F(ab')2 fragments may be produced by proteolytic cleavage, using enzymes such as papain (to produce Fab fragments) or pepsin (to produce F(ab')2 fragments).

Alternatively, antibodies of the present invention can be produced through the application of recombinant DNA technology or through synthetic chemistry using methods known in the art. For example, the antibodies of the present invention can be prepared using various phage display methods known in the art. In phage display methods, functional antibody domains are displayed on the surface of a phage particle which carries polynucleotide sequences encoding them. Phage with a desired binding property are selected from a repertoire or combinatorial antibody library (e.g., human or murine) by selecting directly with antigen, typically antigen bound or captured to a solid surface or bead. Phage used in these methods are typically filamentous phage including fd and M13 with Fab, Fv or disulfide stabilized Fv antibody domains recombinantly fused to either the phage gene III or gene VIII protein. Examples of phage display methods that can be used to make the antibodies of the present invention include those disclosed in Brinkman U. et al. (1995) J. Immunol. Methods 182:41-50; Ames, R. S. et al. (1995) J. Immunol. Methods 184:177-186; Kettleborough, C. A. et al. (1994) Eur. J. Immunol. 24:952-958; Persic, L. et al. (1997) Gene 187:9-18; Burton, D. R. et al. (1994) Advances in Immunology 57:191-280; PCT/GB91/01134; WO 90/02809; WO 91/10737; WO 92/01047; WO 92/18619; WO 93/11236; WO 95/15982; WO 95/20401; and U.S. Pat. Nos. 5,698,426, 5,223,409, 5,403,484, 5,580,717, 5,427,908, 5,750,753, 5,821,047, 5,571,698, 5,427,908, 5,516,637, 5,780,225, 5,658,727 and 5,733,743 (said references incorporated by reference in their entireties).

As described in the above references, after phage selection, the antibody coding regions from the phage can be isolated and used to generate whole antibodies, including human antibodies, or any other desired antigen binding fragment, and expressed in any desired host including mammalian cells, insect cells, plant cells, yeast, and bacteria. For example, techniques to recombinantly produce Fab, Fab' and F(ab')2 fragments can also be employed using methods known in the art such as those disclosed in WO 92/22324; Mullinax, R. L. et al. (1992) BioTechniques 12(6):864-869; Sawai, H. et al. (1995) AJRI 34:26-34; and Better, M. et al. (1988) Science 240:1041-1043 (said references incorporated by reference in their entireties).

Examples of techniques which can be used to produce single-chain Fvs and antibodies include those described in U.S. Pat. Nos. 4,946,778 and 5,258,498; Huston et al. (1991) Methods in Enzymology 203:46-88; Shu, L. et al. (1993) PNAS 90:7995-7999; and Skerra, A. et al. (1988) Science 240:1038-1040. For some uses, including in vivo use of antibodies in humans and in vitro detection assays, it may be preferable to use chimeric, humanized, or human antibodies. Methods for producing chimeric antibodies are known in the art. See, e.g., Morrison, Science 229:1202 (1985); Oi et al., BioTechniques 4:214 (1986); Gillies, S. D. et al. (1989) J. Immunol. Methods 125:191-202; and U.S. Pat. No. 5,807,715. Antibodies can be humanized using a variety of techniques including CDR-grafting (EP 0 239 400; WO 91/09967; U.S. Pat. Nos. 5,530,101 and 5,585,089), veneering or resurfacing (EP 0 592 106; EP 0 519 596; Padlan E. A., (1991) Molecular Immunology 28(4/5):489-498; Studnicka G. M. et al. (1994) Protein Engineering 7(6):805-814; Roguska M. A. et al. (1994) PNAS 91:969-973), and chain shuffling (U.S. Pat. No. 5,565,332). Human antibodies can be made by a variety of methods known in the art including phage display methods described above. See also, U.S. Pat. Nos. 4,444,887, 4,716,111, 5,545,806, and 5,814,318; and WO 98/46645 (said references incorporated by reference in their entireties).

Further included in the present invention are antibodies recombinantly fused or chemically conjugated (including both covalent and non-covalent conjugations) to a polypeptide of the present invention. The antibodies may be specific for antigens other than polypeptides of the present invention. For example, antibodies may be used to target the polypeptides of the present invention to particular cell types, either in vitro or in vivo, by fusing or conjugating the polypeptides of the present invention to antibodies specific for particular cell surface receptors. Antibodies fused or conjugated to the polypeptides of the present invention may also be used in in vitro immunoassays and purification methods using methods known in the art. See, e.g., Harlow et al. supra and WO 93/21232; EP 0 439 095; Naramura, M. et al. (1994) Immunol. Lett. 39:91-99; U.S. Pat. No. 5,474,981; Gillies, S. D. et al. (1992) PNAS 89:1428-1432; Fell, H. P. et al. (1991) J. Immunol. 146:2446-2452 (said references incorporated by reference in their entireties).

The present invention further includes compositions comprising the polypeptides of the present invention fused or conjugated to antibody domains other than the variable regions. For example, the polypeptides of the present invention may be fused or conjugated to an antibody Fc region, or portion thereof. The antibody portion fused to a polypeptide of the present invention may comprise the hinge region, CH1 domain, CH2 domain, and CH3 domain or any combination of whole domains or portions thereof. The polypeptides of the present invention may be fused or conjugated to the above antibody portions to increase the in vivo half-life of the polypeptides or for use in immunoassays using methods known in the art. The polypeptides may also be fused or conjugated to the above antibody portions to form multimers. For example, Fc portions fused to the polypeptides of the present invention can form dimers through disulfide bonding between the Fc portions. Higher multimeric forms can be made by fusing the polypeptides to portions of IgA and IgM. Methods for fusing or conjugating the polypeptides of the present invention to antibody portions are known in the art. See, e.g., U.S. Pat. Nos. 5,336,603, 5,622,929, 5,359,046, 5,349,053, 5,447,851, 5,112,946; EP 0 307 434, EP 0 367 166; WO 96/04388, WO 91/06570; Ashkenazi, A. et al. (1991) PNAS 88:10535-10539; Zheng, X. X. et al. (1995) J. Immunol. 154:5590-5600; and Vil, H. et al. (1992) PNAS 89:11337-11341 (said references incorporated by reference in their entireties).

The invention further relates
PATENT EXAMPLES This data is not available for free
PATENT PHOTOCOPY Available on request
