An informatics search for the low-molecular weight chromium-binding peptide
© Dinakarpandian et al. 2004
Received: 10 August 2004
Accepted: 16 December 2004
Published: 16 December 2004
The amino acid composition of a low molecular weight chromium binding peptide (LMWCr), isolated from bovine liver, is reportedly E:G:C:D::4:2:2:2, though its sequence has not been discovered. There is some controversy surrounding the exact biochemical forms and the action of Cr(III) in biological systems; the topic has been the subject of many experimental reports and continues to be investigated. Clarification of Cr-protein interactions will further the understanding of Cr(III) biochemistry and provide a basis for novel therapies based on metallocomplexes or small molecules.
A genomic search of the non-redundant database for all possible decapeptides of the reported composition yields three exact matches, EDGEECDCGE, DGEECDCGEE and CEGGCEEDDE. The first two sequences are found in ADAM 19 (A Disintegrin and Metalloproteinase domain 19) proteins in man and mouse; the last is found in a protein kinase in rice (Oryza sativa). A broader search for pentameric sequences (and assuming a disulfide dimer) corresponding to the stoichiometric ratio E:D:G:C::2:1:1:1, within the set of human proteins and the set of proteins in, or related to, the insulin signaling pathway, yields a match at an acidic region in the α-subunit of the insulin receptor (-EECGD-, within the acidic region at residues 175–184). A synthetic peptide derived from this sequence binds chromium(III) and forms a metal-peptide complex that has properties matching those reported for isolated LMWCr and Cr(III)-containing peptide fractions.
The search for an acidic decameric sequence indicates that LMWCr may not be a contiguous sequence. The identification of a distinct pentameric sequence in a significant insulin-signaling pathway protein suggests a possible identity for the LMWCr peptide. This identification clarifies directions for further investigation of LMWCr peptide fractions, chromium bio-coordination chemistry and a possible role in the insulin signaling pathway. Implications for models of chromium action in the insulin-signaling pathway are discussed.
Recent reports [18, 21, 24], using extracts from bovine liver, suggest that LMWCr is of peptide origin, having an approximate stoichiometric ratio of E:D:G:C::4:2:2:2. In the absence of the exact sequence of this peptide, we generated all possible permutations matching the reported stoichiometry of LMWCr ($10!/(4!\,2!\,2!\,2!) = 18{,}900$) and performed an exact match search of the non-redundant (nr) protein database for their occurrence. Based on a simple model in which all amino acids are equally probable and the nr database holds about 1.5 million sequences of average length 200, the E-value for an independent exact match is about 0.55 [$(0.05)^{10} \times 200 \times 1.5 \times 10^{6} \times 18{,}900$].
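The permutation count and the E-value estimate can be checked with a short calculation. The following is a minimal sketch in Python, not the authors' script: the function names are illustrative, and the parameter values (uniform residue probability of 0.05, 1.5 million sequences, mean length 200) are those of the simple model described above.

```python
from collections import Counter
from math import factorial


def count_unique_permutations(composition):
    """Number of distinct orderings of a multiset of residues (multinomial coefficient)."""
    total = factorial(len(composition))
    for n in Counter(composition).values():
        total //= factorial(n)
    return total


def expected_random_matches(n_peptides, peptide_len, n_sequences, mean_len, p_residue=0.05):
    """Expected number of exact matches under a uniform, independent residue model."""
    return (p_residue ** peptide_len) * mean_len * n_sequences * n_peptides


# E:D:G:C = 4:2:2:2 written out as a ten-residue multiset
n_decamers = count_unique_permutations("EEEEGGCCDD")
e_value = expected_random_matches(n_decamers, 10, 1_500_000, 200)
print(n_decamers, round(e_value, 2))
```

Under these assumptions the sketch reproduces both figures quoted in the text, 18,900 permutations and an E-value of about 0.55.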
Multiple sequence alignment showing conservation of the EDGEECDCGE motif in all known mammalian forms of ADAM19 and a homologous putative ADAM in fission yeast (top). Multiple sequence alignment showing proposed Cr(III) binding sequence, NKDDNEECGD, conserved in the insulin receptor (INSR) across species, but not in the insulin-like growth factor receptor (IG1R) (bottom).
Noting that the reported stoichiometry consists of even numbers of amino acids, we further considered the possibility that LMWCr might be a disulfide-linked dimer, with each monomeric unit having the composition E:D:G:C::2:1:1:1. Such a hypothesis is consistent with the limited resolution of the experimental data used to derive the stoichiometry and yields 60 sequence permutations. The expected number of random matches in the nr database is much higher, $\sim 10^{4}$. Restricting this search to the human proteome gave rise to 439 (expected: $\sim 10^{2}$) matches, with no reason to prefer one over the other.
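The dimer hypothesis can be illustrated by enumerating the monomer permutations directly. This is a sketch under the same uniform-residue model used for the decamer search (residue probability 0.05, 1.5 million nr sequences of mean length 200); the variable names are illustrative and the human-proteome estimate is not reproduced because the text does not state the proteome size used.

```python
from collections import Counter
from itertools import permutations

MONOMER = "EEGCD"  # E:D:G:C = 2:1:1:1

# A set collapses orderings that differ only by swapping the two E residues.
pentapeptides = sorted({"".join(p) for p in permutations(MONOMER)})

# Two monomers together give the reported 4:2:2:2 composition.
dimer_composition = Counter(MONOMER * 2)

# Expected random exact matches in nr under the uniform model (an assumption).
expected_nr = len(pentapeptides) * (0.05 ** 5) * 200 * 1_500_000

print(len(pentapeptides), dict(dimer_composition), round(expected_nr))
```

The enumeration gives 60 distinct pentapeptides, and the model gives roughly 5.6 × 10^3 expected matches, which is the order of magnitude quoted above.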
Considering that LMWCr might be a subsequence of a protein related to "insulin" or known to be involved in the insulin signaling pathway, we compiled a set of 96 such sequences. This set comprised two components: 1) proteins playing a role in the insulin signaling pathway (23 protein sequences were selected), derived from pathway charts, and 2) the set of all protein sequences retrieved from Swiss-Prot for the search "insulin + human" (78 entries). The two components were compared for redundancy and the duplicates removed (5 instances).
A cross comparison of the resulting 439 entries using the BLAST query method versus the set of 96 pathway and insulin-related proteins results in a unique match for one of the 60 pentapeptides, EECGD, within the insulin receptor (INSR), residues 180–184 (expected matches: $\sim 10^{-1}$). Comparison of the pentapeptide set to a more detailed insulin signaling pathway construct yielded the same result.
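The order of magnitude of the expected number of chance matches in the 96-sequence set can be reproduced with the same uniform-residue model used for the nr search. The estimate below is a sketch that assumes these sequences also average about 200 residues, an assumption carried over from the nr model rather than a value stated for this set; $N_{\text{pep}}$, $N_{\text{seq}}$ and $\bar{L}$ are illustrative symbols for the number of pentapeptides, the number of sequences and the mean sequence length.

```latex
E \approx N_{\text{pep}} \, p^{5} \, N_{\text{seq}} \, \bar{L}
  = 60 \times (0.05)^{5} \times 96 \times 200
  \approx 0.36 \sim 10^{-1}
```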
This sub-sequence lies in the extracellular α-subunit of INSR in an acid-rich region (-NKDDNEECGD-) towards the end of the L1 domain, and at the start of the "cysteine-rich region". A BLAST query for all INSR homologs in the nr database shows that this sub-sequence and the surrounding acidic region are conserved in mouse and rat. Given the location of this acidic sequence within a molecule central to glucose homeostasis, the correspondence with experimentally measured stoichiometry and conservation across multiple species, we speculate that this sequence, or a fragment from this acid-rich region, may give rise to Cr-peptide fractions isolated from tissue. This suggestion implies that such fractions may not be homogeneous, discrete Cr(III) complexes, i.e. proteolysis may lead to a group of similar peptides that differ by one or more amino acid residues on either side of the Cr binding site. This interpretation differs significantly from reports on the isolation of LMWCr, which attempt to avoid a proteolytic product.
A crystal or solution structure of this region of the insulin receptor has not been determined. However, the crystal structure of a homologous molecule, the insulin-like growth factor receptor (IGR1, GeneID: 124240) has been published [30, 31]. Sequence alignment shows that this molecule exhibits a conserved difference in this region, having the subsequence -KECGD- in the same region instead of -EECGD- as found in INSR. The decameric "acidic region" found in INSR is not present in this growth factor receptor – cf. -NKPPKECGDLCPGTL-. The cysteine in -KECGD- forms a disulfide bond with another cysteine residue 28 positions away in the crystal structure. However, we do not know if this is due to the crystallization conditions or representative of the natural form. Further, the insulin receptor possesses 3 additional cysteine residues compared to the insulin-like growth factor receptor, thus the pattern of pairing of cysteine residues to form disulfide bonds may be different in the two receptor types. The observed difference may be great enough to preclude Cr(III) acting on the insulin-like growth factor receptor.
Barring the discovery of a novel, unsequenced or unidentified protein or peptide, these data point to the possible sequence of LMWCr fractions and may point to new strategies in therapeutic design. In addition, the question of sequence specificity in Cr(III)-peptide complexes must be fully addressed, along with thermodynamic and kinetic aspects of Cr(III) binding and transfer.
Models of non-toxicological action of Cr(III) in biological systems may broadly fall into three categories: structural, redox, and iron homeostasis. The earliest models, and those advanced by Vincent, fall into a structural category and focus on the interactions of Cr(III) with peptides and proteins to affect insulin signaling and glucose metabolism, either directly or indirectly. There is an important redox model recently advanced that suggests higher oxidation states of Cr interact with tyrosine phosphatases to inhibit the down-regulation of the insulin receptor. Finally, the chemical similarity of the Cr(III) and Fe(III) cations, and various in vitro studies, suggest that Cr(III) replacement in the physiological iron transport and storage apparatus may lead to some small beneficial outcome for certain diseases. The biological relevance of these models, and of in vitro experiments (including our own), may be finally ascertained only after the fact.
Although unexpected, the results in this report and a critical review of other literature [9, 11–18] suggest that an extracellular model for Cr(III) biochemistry with respect to insulin signaling may be plausible (see Supporting Information). Such a structural model would include the known aspects of INSR cycling and insulin degradation, and include the proposed interactions between Cr(III) and the INSR at the acidic site identified by our genomic search. This model is reminiscent of Mertz and co-workers' original proposal of a ternary interaction between Cr(III), insulin and insulin receptor. It is substantially different from intracellular mechanisms for LMWCr action [16, 18], redox mechanisms, and the iron homeostasis model. In addition, the cycling of the insulin receptor and insulin degradation may satisfy the problems of cellular distribution of Cr(III) and production of LMWCr via proteolysis. Experimentally observed insulin potentiating activity of Cr(III) may result from binding to the α subunit or a bridging interaction between the two α subunits of an intact INSR molecule.
This model is a parsimonious alternative to current proposals of Cr action in the insulin signaling pathway. However, this model points directly back to significant kinetic and thermodynamic questions about Cr(III) in biological systems. For instance, what is the physical form of Cr in the bloodstream? How is Cr(III) transported and exchanged between ligands in the serum? Is transport specific? What structure/activity relationship exists in Cr(III) complexes to allow their transport across biological membranes? Thermodynamically, a hydrolyzed, multinuclear Cr cluster should predominate at neutral pH, but transport by transferrin would presumably be in the mononuclear Fe binding sites. Alternatively, Cr(III) clusters may be transported non-specifically in serum by proteins, possibly including transferrin and serum albumin. At this point, there exist significant gaps in understanding the possible biochemistry of Cr(III) and what molecular processes it may affect.
The proposed extracellular model of Cr(III) action in this report is upstream of IRS1, a therapeutic target of White and others [7, 8], and may lend itself to small-molecule therapeutic strategies for diabetes and other metabolic conditions. We hope this model may pave the way for innovative experiments, better models of Cr(III) biochemistry and excretion, and further understanding of signaling events in complex biochemical pathways.
A bioinformatic search for an acidic decameric sequence matching reported stoichiometries of LMWCr amino acid composition indicates that the peptide may not be a contiguous sequence. An expanded search localized a pentameric sequence in the insulin receptor and suggests a possible identity for the Cr(III)-containing peptide fractions derived from liver. Disulfide linked penta- and hexameric peptides based on the identified sequence bind Cr(III) in a similar fashion to LMWCr fractions reported in the literature.
The nr database was downloaded and Perl scripts were used to search for exact matches corresponding to all unique permutations of EEEEGGCCDD. In a separate search, a set of 78 unique protein sequences was obtained from Swiss-Prot using the query "insulin human." This set was searched for exact matches corresponding to all unique permutations of EEGCD. In addition, the human proteome was downloaded and searched for matches to the same set of pentameric peptides.
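The search described above can be sketched in a few lines. The authors used Perl scripts that are not reproduced here; the Python sketch below is an illustration that swaps in a different but equivalent technique: instead of enumerating every permutation and searching for each one, it slides a window along each sequence and compares the window's residue composition with the target. The function names and the toy FASTA records are hypothetical.

```python
from collections import Counter


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def composition_hits(sequence, composition):
    """Return (1-based start, window) for every window matching the composition."""
    target = Counter(composition)
    k = len(composition)
    hits = []
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        if Counter(window) == target:
            hits.append((start + 1, window))
    return hits


# Toy input (made up for illustration; not real database entries).
toy_fasta = """>toy_receptor
AAANKDDNEECGDLLL
>toy_other
MKTAYIAKQR
"""

for name, seq in parse_fasta(toy_fasta):
    print(name, composition_hits(seq, "EEGCD"))
```

A window matches exactly when it is one of the unique permutations of the target composition, so the hits are the same as those from an exhaustive permutation search. For a database the size of nr, an incrementally updated window count would be faster than rebuilding a `Counter` at every position.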
The peptide, AcEECGD-CONH2, was synthesized by continuous-flow automated solid-phase synthesis on a Perseptive Biosystems Pioneer Peptide Synthesis System. Peptide synthesis was performed using standard Fmoc-protection strategies with TBTU/DIEA activation. Typically, syntheses were run on a 0.5 mmol scale using Rink amide resin (loading ~0.65 mmol/g) and a fourfold excess of the other reagents (TBTU, protected amino acid). After the solid-phase synthesis was complete, acetylation at the amino terminus was carried out for 2 hours (50:50:1:: acetic anhydride:dimethylformamide:pyridine). The peptide was then cleaved from the solid-phase resin by adding a slurry of 94% trifluoroacetic acid, 2.5% water, 2.5% ethanedithiol, and 1% triisopropylsilane to the reaction vessel and shaking it for 5 h. The trifluoroacetic acid, containing the peptide, was filtered off by vacuum, and the resin was washed 2–3 times with trifluoroacetic acid. The filtrate was evaporated under nitrogen gas until the volume was reduced to 15 mL. Ice-cold diethyl ether (30 mL) was added to the filtrate, causing the peptide to precipitate, and the mixture was centrifuged to form a pellet of the peptide. The diethyl ether was decanted and the peptide pellet was washed with diethyl ether three times. Finally, the peptide was dissolved in 20 mL water containing 0.1% trifluoroacetic acid and extracted with diethyl ether three times. The aqueous portion was collected and freeze-dried to give the synthetic peptide. AcEECGD-CONH2 has a retention time of 5.3 minutes on a 250 mm × 4.5 mm C-18 reversed-phase HPLC column using a gradient of 5 to 20% acetonitrile in 0.1% trifluoroacetic acid/water mobile phase running at 1 mL per minute, with detection at a wavelength of 220 nm.
The synthetic peptide, AcEECGD-CONH2, was dissolved to 10 mg/mL in a 0.1 M solution of ammonium carbonate at pH 7 and allowed to oxidize under ambient air for 48 hours. The product was isolated by lyophilization and analyzed by HPLC. The disulfide peptide dimer, (AcEECGD-CONH2)2, has a retention time of 6.0 minutes using the above conditions. The product, (AcEECGD-CONH2)2, has a calculated mass of 1183.35141 Da and exhibits an observed mass of 1183.3477 Da when analyzed by electrospray mass spectrometry.
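The calculated mass can be cross-checked from elemental composition. The sketch below is an illustration, not the authors' calculation: it sums standard monoisotopic atomic masses for the N-acetylated, C-amidated peptide, forms the disulfide dimer by removing two hydrogen atoms, and adds one proton. That the reported value corresponds to a singly protonated ion is an inference from the agreement of the numbers; the text does not state the ion type. All names in the code are illustrative.

```python
# Standard monoisotopic atomic masses (Da) and the proton mass.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146, "S": 31.9720707}
PROTON = 1.00727647

# Elemental composition of each residue (free amino acid minus H2O).
RESIDUES = {
    "E": {"C": 5, "H": 7, "N": 1, "O": 3},
    "C": {"C": 3, "H": 5, "N": 1, "O": 1, "S": 1},
    "G": {"C": 2, "H": 3, "N": 1, "O": 1},
    "D": {"C": 4, "H": 5, "N": 1, "O": 3},
}
N_ACETYL = {"C": 2, "H": 3, "O": 1}  # CH3CO- cap on the N-terminus
C_AMIDE = {"N": 1, "H": 2}           # -NH2 cap on the C-terminus


def mass(formula):
    """Monoisotopic mass of an elemental formula given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def capped_peptide_mass(sequence):
    """Monoisotopic mass of an N-acetylated, C-amidated peptide."""
    return sum(mass(RESIDUES[aa]) for aa in sequence) + mass(N_ACETYL) + mass(C_AMIDE)


monomer = capped_peptide_mass("EECGD")
dimer = 2 * monomer - 2 * MONO["H"]  # disulfide formation removes two H atoms
mz_protonated = dimer + PROTON       # assumption: singly protonated ion
ppm_error = (1183.3477 - mz_protonated) / mz_protonated * 1e6

print(round(mz_protonated, 4), round(ppm_error, 1))
```

Under these assumptions the computed value agrees with the reported calculated mass to within about 0.001 Da, and the observed mass lies within a few parts per million of it.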
Fifteen milligrams of (AcEECGD-CONH2)2 was weighed and dissolved in 30 mL of water. A portion of this solution (5 mL) was taken up in a 25 mL tube, and 4 equivalents of chromium(III) chloride were added as a solid or in an aqueous solution. The reaction of the two components took several minutes and was monitored by ultraviolet-visible spectrophotometry. The chromium peptide complex, Cr3O(AcEECGD-CONH2)2, exhibits characteristic spectral absorbance features at 432 nm and 615 nm. The features are consistent with chromium bound to oxygen atom donors and similar to the reported spectrum of LMWCr [24, 32]. EPR spectra were collected using the following parameters: microwave frequency, 9.632 GHz; microwave power incident to the cavity, 2 mW; temperature, 10 K (LHe cryostat). Samples were prepared by incubating a solution of peptide with chromium chloride at a final concentration of 1 mM in metal with excess peptide. The complete characterization (EPR, MS, etc.) of this and analogous Cr-peptide complexes will be reported elsewhere.
This work was supported by an award from the American Heart Association, and by funds from the University of Missouri (to J.D.V.H. and V.M.) and University of Missouri Research Board. EPR training and experiments at the National Biomedical EPR Center, Milwaukee, WI, were supported by NIH grant EB001980.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.