Product USA. C

PATENT ASSIGNEE'S COUNTRY USA
UPDATE 05.00
PATENT GRANT DATE 30.05.00
PATENT TITLE Ubiquitin expression system

PATENT ABSTRACT The present invention provides an improved ubiquitin fusion system for gene expression in yeast systems which allows for the regulatable high level production of heterologous proteins having destabilizing amino terminal residues. The ubiquitin fusion proteins expressed in yeast are cleaved precisely in vivo by an endogenous ubiquitin-specific hydrolase to yield heterologous proteins such as human alpha-1-antitrypsin, human gamma-interferon and human immunodeficiency virus integrase protein, all of which initiate with destabilizing residues. An expression vector containing a synthetic gene for monomeric yeast ubiquitin was constructed and expressed under the control of a glucose regulatable yeast promoter. Inclusion of unique restriction sites at the 3'-end of the synthetic ubiquitin gene allows for precise in-frame insertion of heterologous genes. The system can be used to increase expression of poorly expressed proteins and to produce proteins having selective amino-terminal destabilizing residues.

PATENT FILE DATE 25.04.95
PATENT REFERENCES CITED Gene, 61:265-275, 1987, Cousens et al. High level expression of proinsulin in the yeast, Saccharomyces cerevisiae.
Virology, 167:634-638, 1988, Hizi et al. Expression of the Moloney Murine Leukemia Virus and Human Immunodeficiency Virus Integration Proteins in E. coli.
Bachmair et al., Science 234:179-186 (1986).
Varshavsky et al., Ubiquitin (Rechsteiner, ed) pp. 287-324, Plenum Press, New York (1988).
Varshavsky et al., Yeast Genetic Engineering (Barr, Brake, and Valenzuela, eds) pp. 109-143 (1989), Butterworths, New York.
Butt et al., J Biol Chem 263:16364-16371 (1988).
Ecker et al., J Biol Chem 264(13):7715-7719 (1989).
Butt et al., Proc Natl Acad Sci USA 86:2540-2544 (1989).
Barr et al., Yeast 4:S24 (Abstract) (1988), 14th International Conference on Yeast Genetics and Molecular Biology, Aug. 7-13, 1988 in Espoo, Finland.
Sabin et al., Biotechnology 7:705-709 (1989).
Barr et al., Production of Recombinant DNA-Derived Pharmaceuticals in the Yeast Saccharomyces Cerevisiae, American Chemical Society Meeting, Sep. 1988, Los Angeles, CA.
Bachmair and Varshavsky, Cell 56:1019-1032 (1989).
Jonnalagadda et al., J Biol Chem 262(36):17750-17756 (1987).
Chau et al., Science 243:1576-1583 (1989).
Ben-Bassat and Bauer et al., Nature 326:315.
Barr et al., J Biol Chem 268:1671-1678 (1988).
Barr et al., Vaccine 5:90-101 (1987).
Bialy, H. (1989) Bio/technology 7:649.
Miller et al. (1989) Bio/technology 7:698-704.

PATENT CLAIMS I claim:

1. A method for expression in yeast of a heterologous protein to a level of at least 5% of total cell protein, wherein the protein comprises a destabilizing N-terminal amino acid residue selected from the group consisting of isoleucine, glutamic acid, histidine, tyrosine, glutamine, aspartic acid, asparagine, phenylalanine, leucine, tryptophan, lysine, and arginine, comprising:

(a) providing a DNA molecule encoding a ubiquitin fusion protein, said DNA molecule comprising a first DNA sequence that encodes ubiquitin and is immediately adjacent to, upstream of and in reading frame with a second DNA sequence that encodes the heterologous protein,

wherein said first and second DNA sequences are operably linked to a promoter capable of promoting expression of the fusion protein to a level of at least 5% of total cell protein;

(b) transforming a yeast host cell with said DNA molecule; and

(c) culturing the transformed yeast host cell under conditions capable of inducing the expression and quantitative in vivo processing of the fusion protein, thereby yielding expression of the heterologous protein to a level of at least 5% of total cell protein.

2. The method of claim 1 further comprising the step of recovering from the transformed cell culture, the heterologous protein free of ubiquitin.

3. The method of claim 1 wherein the heterologous protein is a mammalian or viral protein.

4. The method of claim 3 wherein the mammalian protein is selected from the group consisting of human alpha-1-antitrypsin, human gamma-interferon and human immunodeficiency virus integrase protein.

5. The method of claim 4 wherein the first codon of said nucleotide sequence encoding human alpha-1-antitrypsin encodes the amino acid glutamic acid.

6. The method of claim 4 wherein the first codon of said nucleotide sequence encoding human gamma-interferon encodes the amino acid glutamic acid.

7. The method of claim 4 wherein the first codon of said nucleotide sequence encoding the human immunodeficiency virus integrase protein encodes the amino acid phenylalanine.

8. The method of claim 1 wherein said yeast cell is from the genus Saccharomyces.

9. The method of claim 8 wherein said yeast cell is S. cerevisiae.

10. The method of claim 1 wherein the promoter is a regulatable promoter.

11. The method of claim 10 wherein the promoter is an ADH2-GAPDH hybrid yeast promoter.

12. The method of claim 11 wherein the N-terminal destabilizing amino acid residue is arginine.

13. The method of claim 11, wherein the N-terminal destabilizing amino acid residue is one selected from the group consisting of: lysine, phenylalanine, leucine, aspartic acid, asparagine, and tryptophan.

14. The method of claim 11, wherein the N-terminal destabilizing amino acid residue is one selected from the group consisting of: histidine, tyrosine, and glutamine.

15. The method of claim 11, wherein the N-terminal destabilizing amino acid residue is one selected from the group consisting of isoleucine and glutamic acid.

16. The method of claim 11, wherein the heterologous protein is an HIV-1 envelope polypeptide env4.

17. The method of claim 11, wherein the promoter is a promoter derived from a yeast glycolytic enzyme gene or a hybrid yeast promoter.

18. The method of claim 1, wherein the heterologous protein is a eukaryotic protein.

19. The method of claim 1, wherein the heterologous protein is a hormone or a growth factor.

20. The method of claim 1, wherein the heterologous protein is selected from the group consisting of growth hormone, somatomedins, epidermal growth factor, fibroblast growth factors, insulin, nerve growth factor, vasopressin, renin, calcitonin, erythropoietin, colony-stimulating factors, lymphokines and enzymes.

21. The method of claim 1, wherein the destabilizing N-terminal amino acid is authentic.

22. The method of claim 21, wherein the heterologous protein is expressed to a level of between about 5% and 50% of total cell protein.

23. The method of claim 22, wherein the heterologous protein is expressed to a level of between about 10% and 50% of total cell protein.

24. The method of claim 23, wherein the heterologous protein is expressed to a level of between about 10% to 30% of total cell protein.

25. The method of claim 22, wherein heterologous protein is expressed to a level of between about 5% and 30% of total cell protein.

26. The method of claim 25, wherein heterologous protein is expressed to a level of between about 5% and 10% of total cell protein.

27. The method of claim 11, wherein the heterologous protein is expressed to a level of between about 5% and 50% of total cell protein.

28. The method of claim 27, wherein the heterologous protein is expressed to a level of between about 10% and 50% of total cell protein.

29. The method of claim 28, wherein the heterologous protein is expressed to a level of between about 10% to 30% of total cell protein.

30. The method of claim 27, wherein heterologous protein is expressed to a level of between about 5% and 30% of total cell protein.

31. The method of claim 30 wherein heterologous protein is expressed to a level of between about 5% and 10% of total cell protein.

32. The method of claim 27 further comprising the step of recovering from the transformed cell culture, the heterologous protein free of ubiquitin.

33. The method of claim 27 wherein the heterologous protein is a mammalian or viral protein.

34. The method of claim 27 wherein the heterologous protein is selected from the group consisting of human alpha-1-antitrypsin, human gamma-interferon and human immunodeficiency virus integrase protein.

35. The method of claim 34 wherein the first codon of said nucleotide sequence encoding human alpha-1-antitrypsin encodes the amino acid glutamic acid.

36. The method of claim 34 wherein the first codon of said nucleotide sequence encoding human gamma-interferon encodes the amino acid glutamic acid.

37. The method of claim 34 wherein the first codon of said nucleotide sequence encoding the human immunodeficiency virus integrase protein encodes the amino acid phenylalanine.

38. The method of claim 27 wherein said yeast cell is from the genus Saccharomyces.

39. The method of claim 38 wherein said yeast cell is S. cerevisiae.

40. The method of claim 27 wherein the promoter is a regulatable promoter.

41. The method of claim 40 wherein the promoter is derived from a yeast glycolytic enzyme gene or a hybrid yeast promoter.

42. The method of claim 41 wherein the promoter is an ADH2-GAPDH hybrid yeast promoter.

43. The method of claim 27 wherein the N-terminal destabilizing amino acid residue is arginine.

44. The method of claim 27, wherein the N-terminal destabilizing amino acid residue is one selected from the group consisting of: lysine, phenylalanine, leucine, aspartic acid, asparagine, and tryptophan.

45. The method of claim 27, wherein the N-terminal destabilizing amino acid residue is one selected from the group consisting of histidine, tyrosine, and glutamine.

46. The method of claim 27 wherein the N-terminal destabilizing amino acid residue is one selected from the group consisting of isoleucine and glutamic acid.

47. The method of claim 27, wherein the heterologous protein is an HIV-1 envelope polypeptide env4.

48. A DNA construct for expression in yeast of a heterologous protein having a destabilizing N-terminal amino acid residue selected from the group consisting of isoleucine, glutamic acid, histidine, tyrosine, glutamine, aspartic acid, asparagine, phenylalanine, leucine, tryptophan, lysine, and arginine, wherein the construct comprises a first DNA sequence that encodes ubiquitin and that is immediately adjacent to, upstream of and in reading frame with a second DNA sequence that encodes the heterologous protein, and wherein said first and second DNA sequences are operably linked to a promoter capable of promoting expression of the fusion protein to a level of at least 5% of total cell protein,

wherein the fusion protein is capable of being quantitatively processed in vivo to yield the heterologous protein in yeast.

49. The DNA construct of claim 48, wherein the vector further comprises a transcription termination sequence located 3' to the second DNA fragment.

50. The DNA construct of claim 49, wherein the transcription termination sequence is the alpha-factor terminator.

51. The DNA construct of claim 49, wherein the transcription termination sequence is the GAPDH terminator.

52. The DNA construct of claim 48, wherein the promoter is a regulatable promoter.

53. The DNA construct of claim 52, wherein the regulatable promoter is an ADH2-GAPDH hybrid yeast promoter.

54. The DNA construct of claim 48, wherein the heterologous protein is a eukaryotic protein.

55. The DNA construct of claim 48, wherein the heterologous protein is a mammalian or a viral protein.

56. The DNA construct of claim 48, wherein the heterologous protein is a hormone or a growth factor.

57. The DNA construct of claim 48, wherein the heterologous protein is selected from the group consisting of growth hormone, somatomedins, epidermal growth factor, fibroblast growth factors, insulin, nerve growth factor, vasopressin, renin, calcitonin, erythropoietin, colony-stimulating factors, lymphokines and enzymes.

58. The DNA construct of claim 48, wherein the promoter is capable of promoting expression of the fusion protein to a level of between about 5% and 50% of total cell protein.

59. The DNA construct of claim 58, wherein the promoter is capable of promoting expression of the fusion protein to a level of between about 10% and 50% of total cell protein.

60. The DNA construct of claim 59, wherein the promoter is capable of promoting expression of the fusion protein to a level of between about 10% to 30% of total cell protein.

61. The method of claim 58, wherein heterologous protein is expressed to a level of between about 5% and 30% of total cell protein.

62. The method of claimt wherein heterologous protein is expressed to a level of between about 5% and 10% of total cell protein.

63. A yeast host cell transformed by a vector that provides for expression of a heterologous protein having a destabilizing N-terminal amino acid residue selected from the group consisting of isoleucine, glutamic acid, histidine, tyrosine, glutamine, aspartic acid, asparagine, phenylalanine, leucine, tryptophan, lysine, and arginine, wherein the vector comprises a first DNA sequence that encodes ubiquitin and that is immediately adjacent to, upstream of and in reading frame with a second DNA sequence that encodes the heterologous protein, and wherein said first and second DNA sequences are operably linked to a promoter capable of promoting expression of the fusion protein to a level of at least 5% of total cell protein, and

wherein the fusion protein is capable of being quantitatively processed in vivo to yield the heterologous protein.

64. The host cell of claim 63 wherein the yeast expression vector is a high copy number vector.

65. The yeast host cell of claim 63, wherein the yeast host cell is from the genus Saccharomyces.

66. The yeast host cell of claim 65, wherein the yeast host cell is S. cerevisiae.

67. The yeast host cell of claim 63, wherein the promoter is a regulatable promoter.

68. The yeast host cell of claim 67, wherein the regulatable promoter is an ADH2-GAPDH hybrid yeast promoter.

69. The yeast host cell of claim 63, wherein the heterologous protein is a eukaryotic protein.

70. The yeast host cell of claim 63, wherein the heterologous protein is a mammalian or a viral protein.

71. The yeast host cell of claim 63, wherein the heterologous protein is a hormone or a growth factor.

72. The yeast host cell of claim 63, wherein the heterologous protein is selected from the group consisting of growth hormone, somatomedins, epidermal growth factor, fibroblast growth factors, insulin, nerve growth factor, vasopressin, renin, calcitonin, erythropoietin, colony-stimulating factors, lymphokines and enzymes.

73. The yeast host cell of claim 63, wherein the promoter is capable of promoting expression of the fusion protein to a level of between about 5% and 50% of total cell protein.

74. The yeast host cell of claim 73, wherein the promoter is capable of promoting expression of the fusion protein to a level of between about 10% and 50% of total cell protein.

75. The yeast host cell of claim 74, wherein the promoter is capable of promoting expression of the fusion protein to a level of between about 10% to 30% of total cell protein.

76. The method of claim 73, wherein heterologous protein is expressed to a level of between about 5% and 30% of total cell protein.

77. The method of claim 76, wherein heterologous protein is expressed to a level of between about 5% and 10% of total cell protein.
--------------------------------------------------------------------------------

PATENT DESCRIPTION

TECHNICAL FIELD

The present invention is directed to methods and materials useful for the production of heterologous proteins in yeast by recombinant DNA methods. More particularly, the present invention is directed to methods and materials based on the use of a ubiquitin DNA fusion cassette to improve the yield of heterologous proteins made in yeast and provide for amino terminal authentic heterologous proteins.

BACKGROUND OF THE INVENTION

It has been generally recognized that when naturally secreted eukaryotic proteins are expressed intracellularly in microorganisms such as bacteria or yeast, the expressed mature polypeptide will frequently contain an additional, obligatory, initiation codon-derived methionine residue at the amino terminus. In many situations the extra amino acid is not detrimental, but when these proteins are used for pharmaceutical indications, immunogenicity associated with this additional residue can be problematic.

Subsequent to the development of the initial intracellular expression systems, where the "methionine problem" was first encountered, several methods for circumventing this problem were developed. First and foremost, the viability of heterologous secretion systems for bacteria, yeast, filamentous fungi, and insect and mammalian cells was demonstrated. Indeed, for complex high molecular weight and glycosylated proteins, mammalian cell secretion systems have been essential. Heterologous secretion has, however, tended to be much lower yielding when compared with intracellular expression, and in some cases, secretion systems have failed altogether to generate relevant quantities of recombinant protein.

Secondly, in vitro systems involving chemical removal of methionine by cyanogen bromide and specific processing of fusion proteins with aminopeptidases, enterokinase, collagenase, and factor Xa have been developed in order to retain the high yields often associated with intracellular expression systems. It would be, however, desirable to avoid the use of the additional processing steps required for the cleavage reaction.

Ubiquitin (Ub), a highly conserved 76 residue protein, is found in eukaryotes either free or covalently joined via its carboxy-terminal glycine residue to a variety of cytoplasmic, nuclear, and integral membrane proteins. The coupling of ubiquitin to such proteins serves to target that protein as a proteolytic substrate for degradation. An important component of the degradation signal in a short-lived protein is the protein's amino-terminal residue (Bachmair et al., (1986) Science 234:179-186). The degradative pathway whose initial steps involve amino-terminal recognition of proteolytic substrates is called the N-end rule pathway, to distinguish it from other proteolytic pathways and also from other ubiquitin-dependent processes, some of which may not involve degradation of target proteins.

Varshavsky and coworkers (Varshavsky et al., (1988) in Ubiquitin (Rechsteiner, ed) pp 287-324, Plenum Press, New York; Varshavsky et al., (1989) in Yeast Genetic Engineering (Barr, Brake and Valenzuela, eds) pp 109-143, Butterworths, N.Y.; and PCT WO88/02406, published Apr. 7, 1988) have shown that ubiquitin may be utilized for the production of recombinant proteins with specifically engineered amino termini. Initially, the production of bacterial beta-galactosidase derivatives and murine dihydrofolate reductases (DHFRs) that differed exclusively at their amino-terminal residues led to the definition of the N-end rule. According to this general rule, specific amino acids can be ranked according to the degree of stabilization, or destabilization, that they confer upon proteins when positioned at their amino termini. Specifically, in Saccharomyces cerevisiae, any of the stabilizing amino-terminal residues (Met, Gly, Val, Pro, Cys, Ala, Ser, Thr) confers a long (greater than 20 hr) half-life on the test protein beta-galactosidase, whereas destabilizing amino-terminal residues (Ile and Glu, about 30 min; His, Tyr and Gln, about 10 min; Asp, Asn, Phe, Leu, Trp and Lys, about 3 min; and Arg, about 2 min) confer on beta-galactosidase half-lives from less than 3 min to 30 min.
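As an illustration only, the N-end rule ranking quoted above can be written as a small lookup table. The half-life figures are the approximate values given in the preceding paragraph for beta-galactosidase in S. cerevisiae; the names `DESTABILIZING_HALF_LIFE_MIN`, `STABILIZING` and `classify_n_terminus` are hypothetical and do not appear in the patent.

```python
# Approximate half-lives (in minutes) conferred on beta-galactosidase in
# S. cerevisiae by each destabilizing amino-terminal residue, as quoted above.
DESTABILIZING_HALF_LIFE_MIN = {
    "Ile": 30, "Glu": 30,
    "His": 10, "Tyr": 10, "Gln": 10,
    "Asp": 3, "Asn": 3, "Phe": 3, "Leu": 3, "Trp": 3, "Lys": 3,
    "Arg": 2,
}

# Stabilizing residues confer a half-life greater than 20 hr.
STABILIZING = {"Met", "Gly", "Val", "Pro", "Cys", "Ala", "Ser", "Thr"}


def classify_n_terminus(residue):
    """Return (class, approximate half-life in minutes or None)."""
    if residue in STABILIZING:
        # The text gives only a lower bound (> 20 hr), so no figure is returned.
        return ("stabilizing", None)
    if residue in DESTABILIZING_HALF_LIFE_MIN:
        return ("destabilizing", DESTABILIZING_HALF_LIFE_MIN[residue])
    raise ValueError(f"unknown residue: {residue!r}")
```

For example, `classify_n_terminus("Glu")` reports a destabilizing residue with a half-life of about 30 minutes, which is the amino terminus specified in the claims for alpha-1-antitrypsin and gamma-interferon.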

Bachmair et al., supra, in their N-end rule work described above, showed the capacity of the endogenous yeast processing enzyme to accurately cleave Ub from heterologous fusion proteins containing any of the 20 amino acids at the Ub-protein junction. Only in the case of proline was this process slow enough to observe Ub-fusion intermediates. For accurate determination of half-lives of the amino-terminally mutated test proteins (a beta-galactosidase derivative and dihydrofolate reductase) it was important to avoid such complications as inclusion body formation, and this was achieved by the use of a relatively weak promoter system.
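The cleavage described above can be sketched as a simple string operation on a one-letter protein sequence. This is a minimal model, assuming only what the text states: ubiquitin is 76 residues long, ends in glycine, and is removed precisely at the Ub-protein junction, with proline at the junction being processed slowly. The function name and the placeholder sequence are hypothetical; `PLACEHOLDER_UB` is not the real ubiquitin sequence.

```python
UB_LENGTH = 76  # ubiquitin is a 76-residue protein ending in glycine

# Placeholder standing in for ubiquitin (NOT the real sequence): 76 residues
# with a carboxy-terminal glycine, which is all this sketch relies on.
PLACEHOLDER_UB = "M" + "A" * 73 + "GG"


def cleave_ub_fusion(fusion, ub_length=UB_LENGTH):
    """Split a Ub-fusion protein at the junction after the last Ub residue.

    Returns (ubiquitin, product, slow), where `slow` is True when the
    product begins with proline, the one junction residue reported to be
    processed slowly.
    """
    if len(fusion) <= ub_length:
        raise ValueError("fusion is too short to contain a product")
    ubiquitin, product = fusion[:ub_length], fusion[ub_length:]
    if ubiquitin[-1] != "G":
        raise ValueError("ubiquitin portion must end in glycine")
    return ubiquitin, product, product[0] == "P"
```

Calling `cleave_ub_fusion(PLACEHOLDER_UB + "EDPQ")` yields the product `"EDPQ"`, whose first residue (Glu) becomes the new amino terminus.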

More recently, Butt et al., (1988) J Biol Chem 263:16364-16371 describe studies of ubiquitin fused with a homologous yeast protein, metallothionein. The hybrid gene is under control of the yeast metallothionein promoter, a promoter of intermediate strength. Ecker et al., (1989) J Biol Chem 264(13):7715-7719 also describe the use of the yeast metallothionein promoter to increase ubiquitin-fused gene expression of Gs alpha, sCD4, and the protease domain of human urokinase in yeast, while Butt et al., Proc Natl Acad Sci USA 86:2540-2544 (1989) describe a similar ubiquitin expression system developed for use in E. coli.

It would be desirable to develop a yeast expression vector system, preferably an inducible system, for expression of ubiquitin fusion proteins with simultaneous in vivo processing to yield authentic biologically active proteins having destabilizing amino terminal residues.

It would also be desirable to develop a general method using this vector system for quantitative processing of ubiquitin fusions to produce high expression levels of the desired heterologous protein in a yeast host.

In furtherance of these objectives, Barr et al., (1988) Yeast 4:S24 (Abstract) and Sabin et al., (1989) Biotechnology 7:705-709 have extended the observations by Varshavsky and, using strong and regulatable promoters, have produced in yeast high levels of heterologous eukaryotic proteins. Surprisingly, all of the proteins initiate with residues that are known to be destabilizing, yet, with one exception, each of the proteins expressed using the ubiquitin vector system was found to be correctly processed. The results of this work are reproduced herein.

SUMMARY OF THE INVENTION

Methods and compositions are provided for the high fidelity production in yeast hosts of heterologous proteins having destabilizing amino terminal residues. A ubiquitin fusion expression system is provided wherein ubiquitin fusion proteins are cleaved precisely in vivo by an endogenous ubiquitin-specific hydrolase to yield heterologous proteins such as human alpha-1-antitrypsin, human gamma-interferon and human immunodeficiency virus integrase protein, all of which initiate with destabilizing residues.

In one embodiment, the present invention provides a method of producing a selectively processed recombinant eukaryotic protein in yeast wherein said eukaryotic protein is encoded by a ubiquitin-heterologous hybrid gene, which method comprises:

(a) constructing a hybrid gene expression cassette comprising a first DNA sequence encoding a promoter-ubiquitin expression cassette ligated in translational reading phase, immediately adjacent to and upstream of a second DNA sequence encoding a heterologous eukaryotic protein, wherein the first codon of said second DNA sequence encodes a selective destabilizing amino acid;

(b) transforming a yeast host cell with a yeast expression vector comprising the hybrid gene expression cassette of step (a); and

(c) culturing the transformed yeast host cell under conditions capable of expressing the ubiquitin-heterologous hybrid gene whereby in vivo processing of the fusion protein yields the recombinant heterologous protein having the selective destabilizing amino terminus.

The recombinant heterologous protein may be recovered from the transformed cell culture and purified.
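The design constraints in step (a) can be checked in a few lines: the ubiquitin coding sequence must be a whole number of codons so that the heterologous gene stays in reading frame, and the first codon of the heterologous gene must encode one of the destabilizing residues. The following is a rough sketch using the standard genetic code; the function names and the toy sequences in the usage note are hypothetical, and sequences are assumed to be uppercase DNA.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter codes for the destabilizing residues: Ile, Glu, His, Tyr, Gln,
# Asp, Asn, Phe, Leu, Trp, Lys, Arg.
DESTABILIZING = set("IEHYQDNFLWKR")


def translate(cds):
    """Translate a coding sequence up to (not including) the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def check_fusion(ub_cds, het_cds):
    """Fuse a ubiquitin CDS (no stop codon) to a heterologous CDS and report
    the fusion protein and the residue exposed after ubiquitin is removed."""
    if len(ub_cds) % 3 != 0:
        raise ValueError("ubiquitin sequence must be a whole number of codons")
    n_terminal = CODON_TABLE[het_cds[:3]]
    return {
        "fusion_protein": translate(ub_cds + het_cds),
        "n_terminal_residue": n_terminal,
        "destabilizing": n_terminal in DESTABILIZING,
    }
```

With a toy four-codon stand-in for ubiquitin, `check_fusion("ATGCAGGGTGGT", "GAAGATCCTTAA")` reports an amino-terminal `"E"` (glutamic acid) flagged as destabilizing.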

In yet another embodiment, the invention employs the ubiquitin fusion system to increase the expression levels of eukaryotic proteins. The present ubiquitin fusion approach thus not only leads to greatly increased levels of protein expression but also provides increased expression of quantitatively processed heterologous proteins.

Other embodiments will also be readily apparent to those of ordinary skill in the art.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic representation of the ubiquitin fusion expression plasmid pBS24UbX. The parent vector pBS24Ub contains selectable markers for yeast growth in uracil- or leucine-deficient media. The promoter-ubiquitin "cassette" is cloned as a BamHI(B)-SalI(L) fragment. Heterologous genes (X) are cloned into pBS24Ub as SstII(S)/SalI fragments. Precise junction sequences are shown in FIG. 2.

FIG. 2 illustrates the synthetic DNA (boxed) and encoded amino acid sequences at ubiquitin-heterologous gene and protein junctions. The fusion protein cleavage site is shown by the arrow above each protein sequence.

FIG. 3 shows Coomassie blue-stained 15-22% gradient (a) and 12.5% gradient (b) SDS-polyacrylamide gels of heterologous proteins expressed in yeast. FIG. 3(a) provides in Lane 1, molecular weight standards (Biorad); Lane 2 is yeast ubiquitin (arrowed); Lane 3 is human gamma-interferon (IFN) and ubiquitin derived from in vivo cleavage of the fusion protein; Lane 4 is N-methionyl-gamma-IFN; Lane 5 is alpha-1-antitrypsin (AT) and ubiquitin from in vivo cleavage of the fusion protein; Lane 6 is N-methionyl-alpha-1-AT; Lane 7 is a control lysate from yeast cells transformed with the parent plasmid pBS24. FIG. 3(b) provides in Lane 1, molecular weight standards; Lane 2 is directly expressed env4; Lane 3 is ubiquitin fusion-derived env4, clearly visible close to its calculated molecular weight of 27.1 kD; Lane 4 is hSOD-env4 fusion having a molecular weight of 42.9 kD; Lane 5 is directly expressed HIV-1 integrase; and Lane 6 is ubiquitin fusion-derived HIV-1 integrase of calculated molecular weight of 32.3 kD.

DETAILED DESCRIPTION OF THE INVENTION

The practice of the present invention will employ, unless otherwise indicated, conventional molecular biology, microbiology, and recombinant DNA techniques within the skill of the art. Such techniques are explained fully in the literature. See e.g., Maniatis, Fritsch & Sambrook, "Molecular Cloning: A Laboratory Manual" (1982); "DNA Cloning: A Practical Approach," Volumes I and II (D. N. Glover ed. 1985); "Oligonucleotide Synthesis" (M. J. Gait ed. 1984); "Nucleic Acid Hybridization" (B. D. Hames & S. J. Higgins eds. 1985); "Transcription and Translation" (B. D. Hames & S. J. Higgins eds. 1984); "Animal Cell Culture" (R. I. Freshney ed. 1986); "Immobilized Cells and Enzymes" (IRL Press, 1986); B. Perbal, "A Practical Guide To Molecular Cloning" (1984).

In describing the present invention, the following terminology will be used in accordance with the definitions set out below.

A "replicon" is any genetic element (e.g., plasmid, chromosome, virus) that functions as an autonomous unit of DNA replication in vivo; i.e., capable of replication under its own control.

A "vector" is a replicon, such as plasmid, phage or cosmid, to which another DNA segment may be attached so as to bring about the replication of the attached segment.

A "DNA molecule" refers to the polymeric form of deoxyribonucleotides (adenine, guanine, thymine, or cytosine) in either its single stranded form, or a double-stranded helix. This term refers only to the primary and secondary structure of the molecule, and does not limit it to any particular tertiary forms. Thus, this term includes double-stranded DNA found, inter alia, in linear DNA molecules (e.g., restriction fragments), viruses, plasmids, and chromosomes. In discussing the structure of particular double-stranded DNA molecules, sequences may be described herein according to the normal convention of giving only the sequence in the 5' to 3' direction along the nontranscribed strand of DNA (i.e., the strand having a sequence homologous to the mRNA).

A DNA "coding sequence" is a double-stranded DNA sequence which is transcribed and translated into a polypeptide in vivo when placed under the control of appropriate regulatory sequences. The boundaries of the coding sequence are determined by a start codon at the 5' (amino) terminus and a translation stop codon at the 3' (carboxy) terminus. A coding sequence can include, but is not limited to, procaryotic sequences, cDNA from eucaryotic mRNA, genomic DNA sequences from eucaryotic (e.g., mammalian) DNA, and even synthetic DNA sequences. A polyadenylation signal and transcription termination sequence will usually be located 3' to the coding sequence or, as used in the present invention, at the 3' end of the fusion protein's coding sequence.
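The boundary rule in this definition (a start codon at the 5' end, a translation stop codon at the 3' end) can be illustrated with a short scan of the nontranscribed strand. This is a simplified sketch, assuming an uppercase DNA string read 5' to 3', taking the first ATG as the start codon, and ignoring introns and regulatory context; `find_coding_sequence` is a hypothetical name.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_sequence(dna):
    """Return (start, end) so that dna[start:end] runs from the first ATG
    through the first in-frame stop codon, or None if no such span exists."""
    start = dna.find("ATG")
    if start == -1:
        return None
    # Step codon by codon from the start codon, staying in reading frame.
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return (start, i + 3)
    return None
```

For the toy input `"CCATGGAATTTTAAGG"`, the function returns the span covering `ATG GAA TTT TAA`.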

"Transcriptional and translational control sequences" are DNA regulatory sequences, such as promoters, enhancers, polyadenylation signals, terminators, and the like, that provide for the expression of a coding sequence in a host cell.

A "promoter sequence" is a DNA regulatory region capable of binding RNA polymerase in a cell and initiating transcription of a downstream (3' direction) coding sequence. For purposes of defining the present invention, the promoter sequence is bounded at its 3' terminus by the transcription initiation site and extends upstream (5' direction) to include the minimum number of bases or elements necessary to initiate transcription at levels detectable above background. Within the promoter sequence will be found a transcription initiation site (conveniently defined by mapping with nuclease S1), as well as protein binding domains (consensus sequences) responsible for the binding of RNA polymerase. Eucaryotic promoters will often, but not always, contain "TATA" boxes and "CAT" boxes. Procaryotic promoters contain Shine-Dalgarno sequences in addition to the -10 and -35 consensus sequences. Preferred promoters for use in the present invention are strong and regulatable.

A coding sequence is "under the control" of transcriptional and translational control sequences in a cell when RNA polymerase transcribes the coding sequence into mRNA, which is then translated into the protein encoded by the coding sequence.

A cell has been "transformed" by exogenous or heterologous DNA when such DNA has been introduced inside the cell. The transforming DNA may or may not be integrated (covalently linked) into chromosomal DNA making up the genome of the cell. In prokaryotes, yeast, and mammalian cells for example, the transforming DNA may be maintained on an episomal element such as a plasmid. The cell has been stably transformed when the cell is able to establish cell lines or clones comprised of a population of daughter cells containing the transforming DNA. A "clone" is a population of cells derived from a single cell or common ancestor by mitosis. A "cell line" is a clone of a cell that is capable of stable growth in vitro for many generations.

Two DNA sequences are "substantially homologous" when at least about 85% (preferably at least about 90%, and most preferably at least about 95%) of the nucleotides match over the defined length of the DNA sequences. Sequences that are substantially homologous can be identified in a Southern hybridization experiment under, for example, stringent conditions as defined for that particular system. Defining appropriate hybridization conditions is within the skill of the art. See, e.g., Maniatis et al., supra; DNA Cloning, Vols. I & II, supra; Nucleic Acid Hybridization, supra.
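The percentage thresholds in this definition can be made concrete with a position-by-position comparison. The sketch below assumes the two sequences have already been aligned without gaps over the same defined length, which is a simplification; a real comparison would normally involve an alignment step. The function names are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of matching positions between two equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def is_substantially_homologous(seq_a, seq_b, threshold=85.0):
    """Apply the 'at least about 85%' criterion; pass 90.0 or 95.0 for the
    preferred and most preferred levels named in the definition."""
    return percent_identity(seq_a, seq_b) >= threshold
```

Two 20-nucleotide sequences that differ at 2 positions score 90% and meet the default threshold; at 4 differences they score 80% and do not.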

A "heterologous" region of the DNA construct is an identifiable segment of DNA within a larger DNA molecule that is not found in association with the larger molecule in nature. Thus, when the heterologous region encodes a mammalian gene, the gene will usually be flanked by DNA that does not flank the mammalian genomic DNA in the genome of the source organism. Another example of a heterologous coding sequence is a construct where the coding sequence itself is not found in nature (e.g., a cDNA where the genomic coding sequence contains introns, or synthetic sequences having codons different from those of the native gene). Allelic variations or naturally-occurring mutational events do not give rise to a heterologous region of DNA as defined herein.

"Eukaryotic," as used herein with reference to heterologous proteins, refers to non-yeast proteins and includes mammalian- and viral-derived material. Usually, the eukaryotic polypeptide sequence will be at least about 8 amino acids in length and can include polypeptides up to about 100,000 daltons or higher. Of particular interest are polypeptides of from about 5,000 to about 150,000 daltons, more particularly of about 5,000 to about 100,000 daltons.

The term "analog" includes muteins, fusion proteins comprising domains of the desired polypeptide, and fragments. Examples of analogs which fall within the scope of the present invention include cysteine substitutions to facilitate protein purification and refolding, methionine substitution to reduce a protein's susceptibility to oxidation, and lysine residue substitution to improve protein stability. Preferred analogs have biological activity. A mutein is a protein substantially homologous to a native sequence of the desired polypeptide (e.g., a minimum of about 75%, 85%, 90% or 95% homologous) wherein at least one amino acid is different. The term fusion protein includes a protein comprising the complete native sequence of the desired polypeptide or a functional domain of the protein, and a heterologous N- or C-terminal sequence (such as a signal sequence or sequence which protects the protein from degradation). A fragment or domain is an amino acid sequence of sufficient length from the desired polypeptide such that it is identifiable as having been derived from such a polypeptide. The origin of a particular peptide can be determined, for example, by comparing its sequence to those found in public databases.

Fragment analogs can be produced by, e.g., expressing truncated coding sequences. Synthetic DNA sequences allow convenient construction of genes which will express fragments or muteins. Alternatively, DNA encoding muteins can be made by site-directed mutagenesis of native genes or cDNAs. Analogs exhibiting "biological activity" may be identified by in vivo and/or in vitro assays, such as those described in the examples of alpha-1-antitrypsin, gamma-interferon, and the HIV-1 integrase and envelope proteins.

As used herein, "destabilizing amino acids" or "destabilizing residues" refers to the set of amino-terminal residues (unblocked) that confer on beta-gal half-lives from less than 3 min to 30 min in S. cerevisiae at 30° C., i.e., isoleucine, glutamic acid, histidine, tyrosine, glutamine, aspartic acid, asparagine, phenylalanine, leucine, tryptophan, lysine, and arginine, as defined by the N-end rule formulated by Bachmair et al., supra.
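
The definition above amounts to a membership test on the first residue of the mature protein. The following is a minimal sketch, assuming protein sequences are written in standard one-letter amino acid code and begin at the residue exposed after removal of ubiquitin; the names and the short example peptides are hypothetical illustrations.

```python
# One-letter codes for the twelve residues listed in the definition:
# Ile, Glu, His, Tyr, Gln, Asp, Asn, Phe, Leu, Trp, Lys, Arg.
DESTABILIZING_RESIDUES = frozenset("IEHYQDNFLWKR")


def has_destabilizing_n_terminus(mature_protein: str) -> bool:
    """Return True if the amino-terminal residue of the mature protein
    belongs to the destabilizing set defined above."""
    if not mature_protein:
        raise ValueError("protein sequence must not be empty")
    return mature_protein[0].upper() in DESTABILIZING_RESIDUES


if __name__ == "__main__":
    # Hypothetical peptides: one starting with Gln, one with Met.
    for peptide in ("QDPYVK", "MKTAYI"):
        print(peptide, has_destabilizing_n_terminus(peptide))
```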

An expression vector is constructed according to the present invention so that the ubiquitin-heterologous protein coding sequence is located in the vector with the appropriate regulatory sequences, the positioning and orientation of the hybrid gene coding sequence with respect to the control sequences being such that the hybrid gene coding sequence is transcribed and translated under the control of the control sequences. The control sequences may be ligated to the coding sequence prior to insertion into a vector as taught below. Alternatively, the coding sequence can be cloned directly into an expression vector which already contains the control sequences and an appropriate restriction site. For expression of a heterologous protein in yeast, the control sequences will necessarily be heterologous to the coding sequence.

An "expression cassette" is a DNA construct comprising a coding sequence under the control of transcription initiation and termination sequences. In the practice of the present invention, such constructs will involve bacterial-derived or yeast-recognized transcription and termination sequences, and, for example, a coding sequence for the ubiquitin gene or alternatively, a ubiquitin-heterologous fused gene construction. It is particularly preferred to flank the expression cassettes with restriction sites that will provide for the convenient cloning of the cassettes into an appropriate vector.

The promoter-ubiquitin expression construct of the invention provides a portable sequence for insertion into vectors, which provide the desired replication system. In yeast, promoters involved with enzymes in the glycolytic pathway can provide for high rates of transcription. These promoters are associated with such enzymes as phosphoglucoisomerase, phosphofructokinase, phosphotriose isomerase, phosphoglucomutase, enolase, pyruvate kinase, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), and alcohol dehydrogenase (ADH1 and ADH2), as well as hybrids of these promoters. A particularly preferred hybrid promoter is the hybrid formed from the 5' regulatory sequences of the ADH2 gene (including the upstream enhancer sequence) and the GAPDH promoter transcription initiation site and consensus sequences, referred to as an "ADH2/GAPDH hybrid promoter." See, e.g., EPO Publication Nos. 120,551; 164,556; 196,056. In like manner, a transcription terminator sequence located 3' to the translation stop codon can be a yeast-recognized termination sequence, such as those from the genes for other glycolytic enzymes.

The heterologous gene may encode any type of polypeptide of interest. Clinically and veterinarily important genes that may be employed in the invention include, for example, genes encoding hormones and growth factors, such as growth hormone, somatomedins, epidermal growth factor, fibroblast growth factors, insulin, nerve growth factor, vasopressin, renin, calcitonin, erythropoietin, colony-stimulating factors, lymphokines such as interleukin-2, globins, immunoglobulins, interferons (e.g., alpha, beta or gamma), enzymes, etc. Representative viral proteins include those from the human immunodeficiency virus (e.g., envelope, integrase, and gag precursor). Preferred embodiments include human gamma-interferon, human alpha-1-antitrypsin and the integrase protein of the human immunodeficiency virus (HIV).

The genes encoding the heterologous proteins of interest may be synthetic or natural, or combinations thereof. A natural gene (or portion thereof) may be obtained by preparing a cDNA or genomic library and screening for the presence of the gene of interest. Preparation of cDNA libraries from a messenger RNA population is well known and described fully in Huynh et al., (1984) in DNA Cloning, Vol. 1: A Practical Approach (D. Glover, ed.), pp. 49-78, IRL Press, Oxford.

When preparing a synthetic nucleotide sequence, it may be desirable to modify the natural nucleotide sequence. For example, it will often be preferred to use codons which are preferentially recognized by the desired host. In some instances, it may be desirable to further alter the nucleotide sequence to create or remove restriction sites to, for example, enhance insertion of the gene sequence into convenient expression vectors or to substitute one or more amino acids in the resulting polypeptide to increase stability. A general method for site-specific mutagenesis is described in Noren et al., (1989), Science 244:182-188. This method may be used to create analogs with unnatural amino acids.

Synthetic oligonucleotides are prepared by either the phosphotriester method as described by Duckworth et al., (1981) Nuc Acids Res 9:1691 or the phosphoramidite method as described by Beaucage and Caruthers, (1981) Tet Letts 22:1859 and Matteucci and Caruthers, (1981) J Am Chem Soc 103:3185, and can be prepared using commercially available automated oligonucleotide synthesizers, such as the Applied Biosystems 380A DNA synthesizers.

Adaptor sequences may be necessary to fuse the ubiquitin and heterologous gene sequences in-frame at the junction site to generate the specific processing site recognized by the endogenous yeast endoprotease. A yeast hydrolase responsible for this cleavage has been characterized at the molecular level by cloning and overexpression of its gene product (Miller et al., (1989) Biotechnology 7:831). The yeast hydrolase cleaves the junction peptide bond between the C-terminal Gly76 of ubiquitin and the heterologous fusion protein rapidly in all cases, except when the first amino acid of the extension protein is proline.
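
The junction requirements described above (in-frame fusion, a C-terminal glycine codon on ubiquitin, and a first downstream residue other than proline) can be checked on a candidate construct before cloning. The following is a minimal sketch under stated assumptions: the function name, the report keys, and the placeholder ubiquitin sequence are hypothetical, and the placeholder is not the synthetic ubiquitin gene of the invention. The downstream sequence is taken to be any adaptor codons plus the heterologous coding sequence.

```python
GLYCINE_CODONS = {"GGT", "GGC", "GGA", "GGG"}
PROLINE_CODONS = {"CCT", "CCC", "CCA", "CCG"}


def check_fusion_junction(ubiquitin_cds: str, downstream_cds: str) -> dict:
    """Report whether a ubiquitin/heterologous junction is in frame and
    whether cleavage by the yeast hydrolase would be expected."""
    ubiquitin_cds = ubiquitin_cds.upper()
    downstream_cds = downstream_cds.upper()
    # Both parts must be whole numbers of codons to preserve the frame.
    in_frame = len(ubiquitin_cds) % 3 == 0 and len(downstream_cds) % 3 == 0
    # The last ubiquitin codon should encode the C-terminal glycine.
    ends_in_glycine = ubiquitin_cds[-3:] in GLYCINE_CODONS
    # Proline immediately after the junction is the stated exception.
    proline_first = downstream_cds[:3] in PROLINE_CODONS
    return {
        "in_frame": in_frame,
        "ubiquitin_ends_in_glycine": ends_in_glycine,
        "first_residue_is_proline": proline_first,
        "cleavage_expected": in_frame and ends_in_glycine and not proline_first,
    }


# Placeholder of 76 codons ending in a glycine codon (illustration only).
PLACEHOLDER_UBIQUITIN = "ATG" + "GCT" * 74 + "GGT"

if __name__ == "__main__":
    # Hypothetical downstream sequences: Gln-first, then Pro-first.
    print(check_fusion_junction(PLACEHOLDER_UBIQUITIN, "CAGGACCCATAA"))
    print(check_fusion_junction(PLACEHOLDER_UBIQUITIN, "CCAGACTAA"))
```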

The vector construct will be an episomal element capable of stable maintenance in a host, particularly a fungal host such as yeast. The construct will include one or more replication systems, desirably two replication systems allowing for maintenance of the replicon in both a yeast host for expression and a bacterial (e.g., E. coli) host for cloning. Numerous cloning vectors are known to those of skill in the art. They may be used as intermediates in the construction of replicable expression vectors, or as integrating expression vectors, when the intended expression host does not recognize the cloning vector's origin of replication. Examples of recombinant DNA vectors for cloning and host cells which they can transform include the various bacteriophage lambda vectors (E. coli), pBR322 (E. coli), pACYC177 (E. coli), pKT230 (gram-negative bacteria), pGV1106 (gram-negative bacteria), pLAFR1 (gram-negative bacteria), pME290 (non-E. coli gram-negative bacteria), pHV14 (E. coli and Bacillus subtilis), pBD9 (Bacillus), pIJ61 (Streptomyces), pUC6 (Streptomyces), actinophage and phiC31 (Streptomyces). Examples of such yeast-bacteria shuttle vectors include YEp24 [Botstein et al., Gene 8:17-24], pC1/1 [Brake et al., (1984) Proc Natl Acad Sci USA 81:4643-4646], and YRp17 [Stinchcomb et al., (1982) J Mol Biol 158:157]. See generally, DNA Cloning: Vols. I & II, supra; T. Maniatis et al., supra; B. Perbal, supra.

Furthermore, an extrachromosomal vector may be a high or low copy number vector, the copy number generally ranging from about 1 to about 500. High copy number plasmids may be employed as one means to promote the high level expression of the heterologous proteins of the invention. With high copy number yeast vectors, there will generally be at least 10, preferably at least 20, and usually not exceeding about 150-500 copies in a single host. DNA constructs of the present invention can also be integrated into the yeast genome by an integrating vector. Examples of such vectors are known in the art as shown by Botstein et al., supra. Preferably, less than 50 copies of the cassette are integrated into the genome, more preferably less than about 10, and usually less than about 5. Typically, only 1 or 2 copies are integrated.

The selection of suitable yeast and other micro-organism hosts (e.g., diploid, haploid, auxotrophs, etc.) for the practice of the invention is within the skill of the art. When selecting yeast hosts for expression, suitable hosts may include those shown to have, inter alia, good secretion capacity, low proteolytic activity, and overall robustness. Integrating yeast vectors will not contain an origin of replication recognizable by the yeast host. Replicating yeast vectors generally will contain an origin of replication from the 2 micron yeast plasmid or an autonomously replicating sequence (ARS). The yeast vectors will also typically contain a gene encoding a selectable marker used to confirm transformation, as well as an origin of replication recognizable by a non-yeast host, such as bacteria for convenient cloning. Yeast and other microorganisms are available from a variety of sources, including the Yeast Genetic Stock Center, Department of Biophysics and Medical Physics, University of California, Berkeley, Calif.; and the American Type Culture Collection, Rockville, Md. Yeast expression vectors are known in the art. See, e.g., U.S. Pat. Nos. 4,446,235; 4,443,539; 4,430,428; see also European Pub. Nos. 103,409; 100,561; 96,491.

Methods of introducing exogenous DNA into microbial hosts are well known in the art. There is a wide variety of ways to transform yeast. For example, spheroplast transformation is taught by Hinnen et al., (1978) Proc Natl Acad Sci USA 75:1919-1933 and Stinchcomb et al., EP Publication 45,573. The calcium chloride treatment as described by Cohen et al., (1972) Proc Natl Acad Sci USA 69:2110, or the RbCl method described in Maniatis et al., supra, may be used for prokaryotes or other cells which contain substantial cell wall barriers. Transformants are grown in an appropriate nutrient medium and, where appropriate, maintained under selective pressure to ensure retention of the exogenous DNA. Where expression is inducible, the microbial host can first be grown to a high cell density, and expression is then induced. The heterologous protein is then isolated from cell lysates and purified.

As used herein, a "selectively processed protein" refers to a protein that is quantitatively processed from the ubiquitin fusion protein to yield a free recombinant heterologous protein having an amino-terminus identical to that of the selected codon in the DNA encoding the heterologous protein.

"High level expression" or "increased expression" refers to the yield of heterologous protein determined from a microbial cell lysate and calculated as a percentage of total cell protein. Quantitation of protein yield may be determined by densitometric scanning of protein bands on a polyacrylamide gel or by alternative methods such as, for example, quantitation using a purified yeast-derived or bacterial-derived protein standard and quantitation of the total soluble proteins by Lowry protein assay. At minimum, high level expression refers to a level of at least 5% of total cell protein, more preferably in the range of about 10 to about 50% of total cell protein, and most preferably in the range of about 10 to about 30% total cell protein.

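The thresholds in the preceding definition can be applied to densitometric readings with a short calculation. The following is a minimal sketch, assuming the band signal for the heterologous protein and the summed signal for total cell protein are given in the same arbitrary units; the function names, category labels, and example readings are hypothetical.

```python
def expression_percent(target_signal: float, total_signal: float) -> float:
    """Yield of heterologous protein as a percentage of total cell protein."""
    if total_signal <= 0 or target_signal < 0 or target_signal > total_signal:
        raise ValueError("signals must satisfy 0 <= target <= total, total > 0")
    return 100.0 * target_signal / total_signal


def classify_expression(percent: float) -> str:
    """Place a yield within the ranges given in the definition above."""
    if percent < 5.0:
        return "below high level"
    if 10.0 <= percent <= 30.0:
        return "high level (most preferred range)"
    if 10.0 <= percent <= 50.0:
        return "high level (preferred range)"
    return "high level"


if __name__ == "__main__":
    # Hypothetical densitometry readings in arbitrary units.
    pct = expression_percent(12.0, 100.0)
    print(pct, classify_expression(pct))
```
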
The heterologous protein can be harvested by any conventional means. In the case of yeast expression, in vivo cleavage of the ubiquitin-heterologous fusion protein occurs to provide quantitatively processed "authentic" amino terminal proteins. The desired heterologous protein may be purified by employing conventional purification techniques. Such techniques include, but are not limited to, size exclusion chromatography, ion-exchange chromatography, HPLC, electrophoresis, dialysis, solvent-solvent extraction, and the like.

As demonstrated herein and in the art, specific cellular mechanisms exist for the amino terminal modification of proteins prior to degradation via the N-end rule pathway. The existence of these modification pathways has several implications for heterologous gene expression using the ubiquitin fusion approach. Human gamma-interferon and alpha-1-antitrypsin have tertiary (Gln) and secondary (Glu) destabilizing amino acids, respectively, at their amino termini. It has been shown previously by Varshavsky and coworkers that amino-terminal Gln and Asn can be hydrolyzed to Glu and Asp residues, respectively. The secondary destabilizing residues Glu, Asp and Cys can be modified by arginyl-tRNA-protein transferase catalyzed addition of Arg, a primary destabilizing amino acid, to the amino terminus of a substrate protein. The definite possibility existed, therefore, that the amino terminal residue could be modified by the addition of Arg, to differ from the native sequence. Indeed, sequence analysis of one of the human immunodeficiency viral proteins, env4, expressed using the system of the present invention gives a heterogeneous final product.
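
The modification hierarchy described above (tertiary residues converted to secondary residues, secondary residues extended with Arg) can be traced step by step. The following is a minimal sketch, assuming sequences in one-letter code; it lists the modifications that are possible under the stated rules, not what actually occurs for a given protein in vivo. The function name and the example peptides are hypothetical.

```python
# Tertiary destabilizing residues and the secondary residues they yield:
# Gln -> Glu and Asn -> Asp.
TERTIARY_TO_SECONDARY = {"Q": "E", "N": "D"}
# Secondary destabilizing residues that can receive an amino-terminal Arg.
SECONDARY_RESIDUES = {"E", "D", "C"}


def possible_n_terminal_forms(protein: str) -> list:
    """Return the successive forms a protein could take if its amino
    terminus were modified according to the hierarchy described above."""
    if not protein:
        raise ValueError("protein sequence must not be empty")
    current = protein.upper()
    forms = [current]
    if current[0] in TERTIARY_TO_SECONDARY:
        # Step 1: conversion of the tertiary residue to a secondary residue.
        current = TERTIARY_TO_SECONDARY[current[0]] + current[1:]
        forms.append(current)
    if current[0] in SECONDARY_RESIDUES:
        # Step 2: addition of Arg, a primary destabilizing residue.
        current = "R" + current
        forms.append(current)
    return forms


if __name__ == "__main__":
    # Hypothetical peptides starting with Gln, Glu, and Met.
    for peptide in ("QDPYVK", "EDPQGD", "MKTAYI"):
        print(peptide, possible_n_terminal_forms(peptide))
```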

The mechanisms for increased expression using the ubiquitin fusion approach of the present invention are unresolved. Several mechanisms that could account for the observed increase include, for example: increased transcription or message stability; increased mRNA translatability; cellular compartmental targeting by the yeast ubiquitin fusion partner; or a combination of these factors. Moreover, for any given protein, a variety of factors in addition to the N-end rule may combine to modulate its half-life in vivo. Among such factors may be the solubility or insolubility of a protein, the flexibility and accessibility of the protein's amino terminus, the distribution of ubiquitinatable lysine residues near the amino terminus, and the presence of chemically blocking amino terminal groups. Surprisingly, the amino termini of many of the proteins exemplified in the present invention are amino acids of the destabilizing class according to the N-end rule. While the present invention does not delineate the mechanism(s) responsible for increased expression, it does provide a means to achieve such desirable ends.

PATENT EXAMPLES This data is not available for free
PATENT PHOTOCOPY Available on request
