SolupHred is the first phenomenological predictor that accounts for the pH of the protein environment when calculating the aggregation propensity of intrinsically disordered proteins (IDPs).
Polypeptides are exposed to changing environmental conditions that modulate their intrinsic aggregation propensities. Intrinsically disordered proteins (IDPs) constitutively expose their aggregation determinants to the solvent and are therefore especially sensitive to its fluctuations. Accordingly, IDP aggregation is strongly modulated by factors extrinsic to the sequence, such as ion concentration, ligands, or pH (Uversky, 2009). However, this effect has traditionally been disregarded or only barely parametrized in state-of-the-art aggregation predictors.
In a recent work, we developed a phenomenological model to predict IDP aggregation as a function of pH, based on the assumption that protein lipophilicity and charge both depend on the solution pH (Santos et al., 2020). SolupHred exploits this model to predict the pH-dependent solubility of IDPs. It combines a pH-dependent lipophilicity scale derived from implicit solvation calculations (Zamora et al., 2019) with the Henderson-Hasselbalch equation to estimate the polypeptide lipophilicity and net charge. SolupHred is the first computational tool dedicated to this task.
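To illustrate how the Henderson-Hasselbalch equation yields a pH-dependent net charge, the following minimal Python sketch sums the fractional charges of the ionizable groups in a sequence. The pKa values and the function name net_charge are illustrative assumptions and are not the parameters or code used by SolupHred; the lipophilicity term of the model is not covered here.

```python
# Illustrative pKa values for ionizable groups (assumed textbook-style values,
# NOT the parameters used by SolupHred).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0, "N_TERM": 9.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "C_TERM": 2.0}


def net_charge(sequence: str, ph: float) -> float:
    """Estimate the net charge of a fully solvent-exposed chain at a given pH."""
    seq = sequence.upper()
    charge = 0.0
    # Basic groups carry +1 when protonated:
    # protonated fraction = 1 / (1 + 10^(pH - pKa))
    for group, pka in POSITIVE_PKA.items():
        count = 1 if group == "N_TERM" else seq.count(group)
        charge += count / (1.0 + 10 ** (ph - pka))
    # Acidic groups carry -1 when deprotonated:
    # deprotonated fraction = 1 / (1 + 10^(pKa - pH))
    for group, pka in NEGATIVE_PKA.items():
        count = 1 if group == "C_TERM" else seq.count(group)
        charge -= count / (1.0 + 10 ** (pka - ph))
    return charge


if __name__ == "__main__":
    peptide = "MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYV"  # arbitrary example
    for ph in (3.0, 7.0, 11.0):
        print(f"pH {ph:4.1f}: net charge {net_charge(peptide, ph):+.2f}")
```

With any set of pKa values, the estimated net charge decreases monotonically as the pH increases, which is the behaviour the model relies on.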
Under the "Submission" section, users may upload a file or paste the sequence(s) in FASTA format. Next, select the pH range for the analysis and the desired pH interval or step size (default is 0.5) and click Submit. Users may alternatively select to predict solubility at a specific pH by clicking the checkbox and choosing a pH for the analysis.
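As a rough illustration of how a pH range and step size translate into the set of pH values that are analysed, the sketch below builds such a grid in Python. The function name ph_grid and the default 0 to 14 range are assumptions made for this example; only the default step of 0.5 comes from the text above.

```python
from typing import List


def ph_grid(ph_min: float = 0.0, ph_max: float = 14.0, step: float = 0.5) -> List[float]:
    """Return evenly spaced pH values from ph_min to ph_max (inclusive when the step fits)."""
    if step <= 0:
        raise ValueError("step must be positive")
    if ph_max < ph_min:
        raise ValueError("ph_max must not be smaller than ph_min")
    # Small tolerance so that floating-point division does not drop the last point.
    n_steps = int((ph_max - ph_min) / step + 1e-9)
    # Rounding removes accumulated floating-point noise such as 0.30000000000000004.
    return [round(ph_min + i * step, 6) for i in range(n_steps + 1)]


if __name__ == "__main__":
    grid = ph_grid(2.0, 4.0, 0.5)
    print(grid)
    print(len(ph_grid()), "pH values for the assumed 0 to 14 range at step 0.5")
```

Under these assumptions, a 0 to 14 range with the default 0.5 step produces 29 pH values.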
Users can also pre-populate the application form with alpha-synuclein by clicking the Example button.
SolupHred will treat the submitted sequence(s) as disordered over the analysed pH range. To assess whether this assumption holds, Santos et al. (2020) recently developed an algorithm, DispHred, to explore the pH-dependent folding and unfolding of IDPs.
SolupHred will display two clickable links, one to a JSON file and one to a ZIP folder, containing all calculations and generated files (including graphs and tables).
The primary output is an interactive table with the scores computed by SolupHred. Clicking an identifier shows the corresponding figure for that entry. Graphs represent solubility variations across the pH range, with colored regions marking the pH ranges that correspond to the 10% maximum (blue) and 10% minimum (red) solubilities. When only one sequence is submitted, the graph is displayed directly in the primary output. Graphs will not be generated when 10 or more sequences are submitted.
Column descriptions for the online output and the downloadable summary table:
ID: Sequence identifier, tag line.
Maximum pH: pH within the analysed range where solubility is maximum.
Maximum solubility: Maximum solubility score.
10% maximum: The pH interval that accounts for the 10% maximum solubility, followed by the mean solubility within the interval.
Minimum pH: pH within the analysed range where solubility is minimal.
Minimum solubility: Minimum solubility score.
10% minimum: The pH interval that accounts for the 10% minimum solubility, followed by the mean solubility within the interval.
pI: Isoelectric point.
Length: The length of the introduced disordered region.
Additionally, the ZIP file contains a CSV summary table with an extra column:
pH:solub Scores: The solubility calculated for each pH.
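The per-pH scores can be post-processed to obtain summary values similar to the columns described above. The following Python sketch assumes that the scores have already been loaded into a dictionary mapping pH to solubility, and it assumes one possible reading of the 10% columns: the pH values whose solubility lies within 10% of the score range from the maximum or from the minimum. The exact definition used by SolupHred, as well as the function and key names, are assumptions here.

```python
from typing import Dict, List


def summarize_profile(profile: Dict[float, float], fraction: float = 0.10) -> dict:
    """Summarize a pH -> solubility profile (hypothetical helper, not SolupHred code)."""
    if not profile:
        raise ValueError("profile must not be empty")
    phs = sorted(profile)
    scores = [profile[p] for p in phs]
    s_max, s_min = max(scores), min(scores)
    span = s_max - s_min
    # Assumed definition: band = points within `fraction` of the score range
    # from the maximum (high band) or from the minimum (low band).
    high_band = [p for p in phs if profile[p] >= s_max - fraction * span]
    low_band = [p for p in phs if profile[p] <= s_min + fraction * span]

    def mean_score(band: List[float]) -> float:
        return sum(profile[p] for p in band) / len(band)

    return {
        "max_ph": phs[scores.index(s_max)],
        "max_solubility": s_max,
        # Reported as (lowest pH, highest pH); the band may not be contiguous.
        "high_interval": (high_band[0], high_band[-1]),
        "high_mean": mean_score(high_band),
        "min_ph": phs[scores.index(s_min)],
        "min_solubility": s_min,
        "low_interval": (low_band[0], low_band[-1]),
        "low_mean": mean_score(low_band),
    }


if __name__ == "__main__":
    example = {3.0: 0.0, 4.0: 0.05, 5.0: 0.5, 6.0: 0.95, 7.0: 1.0}  # made-up scores
    for key, value in summarize_profile(example).items():
        print(key, value)
```

If several pH values share the maximum or minimum score, this sketch reports the lowest such pH.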
The SolupHred architecture limits individual jobs to 5,000 sequences at 29 pH values, or a similar load. For more extensive calculations, users should split their queries into smaller jobs.
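Splitting a large query can be done locally before submission. The sketch below, in Python, parses FASTA text and yields batches that each stay under a chosen number of sequences. The function names and the simple parser are assumptions for illustration; only the 5,000-sequence figure comes from the limit stated above.

```python
from typing import Iterator, List, Tuple


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs; blank lines are ignored."""
    records: List[Tuple[str, str]] = []
    header = None
    parts: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(parts)))
            header, parts = line[1:], []
        elif header is not None:
            # Sequence lines before the first header are skipped.
            parts.append(line)
    if header is not None:
        records.append((header, "".join(parts)))
    return records


def split_into_jobs(records: List[Tuple[str, str]], max_sequences: int = 5000) -> Iterator[str]:
    """Yield FASTA-formatted text blocks with at most max_sequences records each."""
    if max_sequences < 1:
        raise ValueError("max_sequences must be at least 1")
    for start in range(0, len(records), max_sequences):
        batch = records[start:start + max_sequences]
        yield "".join(f">{header}\n{sequence}\n" for header, sequence in batch)


if __name__ == "__main__":
    fasta = ">a\nMKV\nLLE\n>b\nDDEE\n>c\nKKRR\n"
    for i, job in enumerate(split_into_jobs(parse_fasta(fasta), max_sequences=2), 1):
        print(f"--- job {i} ---")
        print(job, end="")
```

Each yielded block can then be saved to its own file and submitted as a separate job.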
The solubility equation implemented in the SolupHred web server was originally developed in Santos et al., 2020. To develop the equation, we modelled the role of lipophilicity and net charge in the pH-dependent aggregation of three variants (wild-type, acidic and basic) of the N-terminal moiety of the measles virus phosphoprotein (PNT), a canonical IDP. The main trait of this training set was that the three proteins shared the same sequence except for the reversed charge of specific ionizable residues, so that their experimentally recorded solubility fluctuations along the analysed pH interval were directly connected to the protonation state of their residues. By parametrizing this experimental dataset, we sought to build an empirical equation to model the pH-dependent aggregation of IDPs based on the assumption that both the global protein charge and lipophilicity depend on the solution pH. This simple phenomenological approach showed unprecedented accuracy in predicting the pH dependence of the aggregation of both pathogenic and functional amyloidogenic IDPs (Santos et al., 2020).
SolupHred is designed to profile the solubility of an IDP along a continuous pH interval, assuming that the analysed protein is completely unfolded. The presence of native structure may affect the prediction, since structural elements reshape the physicochemical properties and solvent exposure of the polypeptide chain. Thus, SolupHred is not intended for the analysis of folded proteins.
Importantly, IDPs could undergo disorder-to-order transitions in the assessed pH interval and the appearance of a pH-conditioned secondary or tertiary structure could affect the accuracy of the prediction. In the absence of experimental data on the disordered state of the analysed protein, we suggest using a pH-dependent disorder predictor (DispHred).
1.- Pintado, C.; Santos, J.; Iglesias, V. and Ventura, S. SolupHred: A Server to Predict the pH-dependent Aggregation of Intrinsically Disordered Proteins. Bioinformatics 2020, btaa909. doi:10.1093/bioinformatics/btaa909
2.- Santos, J.; Iglesias, V.; Santos-Suárez, J.; Mangiagalli, M.; Brocca, S.; Pallarès, I.; Ventura, S. pH-Dependent Aggregation in Intrinsically Disordered Proteins Is Determined by Charge and Lipophilicity. Cells 2020, 9, 145. doi:10.3390/cells9010145
3.- Uversky, V. Intrinsically disordered proteins and their environment: effects of strong denaturants, temperature, pH, counter ions, membranes, binding partners, osmolytes, and macromolecular crowding. The Protein Journal 2009, 28, 305–325. doi:10.1007/s10930-009-9201-4
4.- Zamora, W.J.; Campanera, J.M.; Luque, F.J. Development of a Structure-Based, pH-Dependent Lipophilicity Scale of Amino Acids from Continuum Solvation Calculations. J Phys Chem Lett 2019, 10, 883-889. doi:10.1021/acs.jpclett.9b00028
5.- Santos et al. DispHred: A server for the prediction of pH-dependent order-disorder transitions of intrinsically disordered proteins from the primary sequence. IJMS 2020, 21, 5814. doi:10.3390/ijms21165814