About PICKLE

PICKLE is a meta-database of the human direct protein-protein interaction (PPI) network, developed and maintained by the PICKLE team listed below.

The need for source PPI database integration and related issues

At present, a comprehensive view of the known human interactome can be obtained only by integrating the source PPI databases that collect experimental PPI data from the literature, because these databases overlap only partially: they differ in objectives, curation rules and the subsets of the literature they process. Integration is cumbersome, however, because PPI annotations are inconsistent across the source databases. The discrepancies stem from distinct curation rules, incompatible interactor descriptors and the choice of different primary identifier types (i.e. gene, nucleotide sequence (mRNA) or protein (UniProt) IDs) with which PPIs are recorded.

Typically, these heterogeneous primary PPI datasets are integrated via normalization: interactions are first converted to a chosen target level of genetic reference and then merged. This top-down process has two drawbacks. First, the node set of the integrated network is not standardized but depends each time on the number and type of the combined sources. Different meta-databases are therefore not directly comparable, which limits our ability to evaluate how the reconstructed human protein interactome expands over time. This lack of standardization may also introduce artifacts into the integrated PPI network that originate from the currently unresolved part of the human proteome. Second, the integration is irreversible. A priori normalization severs the connection between the primary and the integrated PPI networks, which hinders the identification of normalization artifacts that arise from the inherent nonlinearity of the genetic information flow.
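The irreversibility of integration via normalization can be illustrated with a minimal Python sketch. All identifiers, source names and the mapping table are invented for this example and do not come from any real database; the point is only that, once interactors are mapped to one level and merged, the original representations are no longer recoverable from the result.

```python
# Toy illustration of a priori (top-down) normalization.
# Every identifier and the mapping table below are hypothetical.
TO_GENE = {
    "P_A1": "GENE_A",    # protein recorded for GENE_A
    "P_A2": "GENE_A",    # second protein mapped to the same gene
    "NM_B": "GENE_B",    # mRNA-level record for GENE_B
    "GENE_B": "GENE_B",  # gene-level record
    "P_C": "GENE_C",
}

# Primary PPIs as recorded by two hypothetical sources,
# each using its own identifier type: (source, interactor_a, interactor_b).
primary = [
    ("source1", "P_A1", "NM_B"),
    ("source2", "P_A2", "GENE_B"),
    ("source2", "P_A1", "P_C"),
]

def normalize_then_merge(records, mapping):
    """Map every interactor to the gene level, then merge duplicates."""
    merged = set()
    for _source, a, b in records:
        merged.add(frozenset((mapping[a], mapping[b])))
    return merged

gene_network = normalize_then_merge(primary, TO_GENE)

# Two distinct primary records collapse into one gene-level PPI, and the
# merged set no longer says which sources or identifiers produced it.
print(len(primary), "primary records ->", len(gene_network), "gene-level PPIs")
```

In this toy case the first two records collapse into a single GENE_A/GENE_B interaction, so an artifact introduced by the mapping could not be traced back to its primary record.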

The PICKLE approach to source PPI database integration

At its launch, PICKLE (Protein InteraCtion KnowLedgebasE) took a bottom-up approach to the node set standardization problem, adopting the reviewed human complete proteome (RHCP) of UniProtKB/Swiss-Prot as a standardized reference node set. Moreover, focusing primarily on direct physical PPIs, PICKLE introduced a PPI filtering protocol that selects from the source PPI databases only those PPIs with at least one supporting experiment capable of suggesting a direct interaction.

In its second version, PICKLE provided a major advance in the field of primary PPI dataset integration by introducing the concept of ontological integration as an alternative to the traditional integration via normalization. PICKLE 2.0 relies on the reconstruction of the RHCP genetic information ontology network, which allows primary PPI datasets to be integrated into a heterogeneous network without any a priori transformation. Ontological integration allows the integrated network to be reversibly normalized to any level of genetic reference without loss of source information for any PPI. This establishes a direct correspondence between the instances of the integrated network at different levels of genetic reference and enables cross-checking of the primary PPI datasets. On this basis, PICKLE 2.0 establishes an enhanced protocol for scoring the confidence that a PPI is direct, based on the "cross-checked" quality of the supporting experimental evidence, and provides equivalent integrated human PPI networks at both the protein (UniProt) and the gene level in three PPI filtering modes: unfiltered, standard and cross-checked (default).
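The idea of reversible normalization can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the PICKLE implementation: the ontology table, the identifiers, the source names and the `direct_capable` flag are all hypothetical, and each entity is assumed to map to exactly one UniProt entry and one gene, which the real genetic information ontology does not guarantee. Primary records are stored untouched, and each normalized network is a view computed on demand that keeps the records behind every PPI.

```python
from collections import defaultdict

# Hypothetical ontology: every recorded entity (protein, mRNA or gene ID)
# is linked to its UniProt-level and gene-level reference. Invented IDs.
ONTOLOGY = {
    "P_A1":   {"uniprot": "P_A1", "gene": "GENE_A"},
    "P_A2":   {"uniprot": "P_A2", "gene": "GENE_A"},
    "NM_B":   {"uniprot": "P_B",  "gene": "GENE_B"},
    "GENE_B": {"uniprot": "P_B",  "gene": "GENE_B"},
    "P_C":    {"uniprot": "P_C",  "gene": "GENE_C"},
}

# Primary records, kept as reported:
# (source, interactor_a, interactor_b, direct_capable).
# The boolean is an assumed stand-in for "the supporting experiment is
# capable of suggesting a direct interaction".
RECORDS = [
    ("source1", "P_A1", "NM_B", True),
    ("source2", "P_A2", "GENE_B", False),
    ("source2", "P_A1", "P_C", False),
]

def normalize(records, level):
    """Build a normalized view; each PPI keeps its primary records."""
    view = defaultdict(list)
    for rec in records:
        _source, a, b, _direct = rec
        ppi = frozenset((ONTOLOGY[a][level], ONTOLOGY[b][level]))
        view[ppi].append(rec)
    return dict(view)

def keep_direct(view):
    """Keep PPIs with at least one direct-capable supporting record."""
    return {ppi: recs for ppi, recs in view.items()
            if any(rec[3] for rec in recs)}

uniprot_view = normalize(RECORDS, "uniprot")
gene_view = normalize(RECORDS, "gene")
direct_gene_view = keep_direct(gene_view)

# Each gene-level PPI can be traced back to the sources that reported it.
for ppi, recs in gene_view.items():
    print(sorted(ppi), "<-", [rec[0] for rec in recs])
```

Because both views are derived from the same untouched records, a PPI at the gene level can always be traced back to the UniProt-level pairs and primary entries that support it. The `keep_direct` filter mirrors only the "at least one direct-capable experiment" rule described above; the cross-checked mode is not modeled here.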

The advantages of the PICKLE structural scheme

Using the UniProtKB-defined RHCP as the basis for instantiating the PICKLE ontological network provides a standardized point of reference between the PPI networks of successive versions and any future releases of PICKLE. By comparing the default PICKLE network at the UniProt level between the two current PICKLE versions and subsequent releases, we can directly and reliably evaluate how the RHCP PPI network (and its contributing primary datasets) expands.

Moreover, the reversible normalization achieved by PICKLE guarantees a direct correspondence between the various normalized instances of the integrated PPI network. Hence, one can compare a PPI in the integrated network with all the different ways in which it was originally represented in the primary PPI datasets. This feature enables the identification of potential normalization artifacts by comparing the human protein interactome at the UniProt and gene levels, and an assessment of PPI reliability by cross-checking the experimental evidence sets contributed by the various primary datasets. In addition, PICKLE 2.0 can store interactions between protein and gene or nucleotide sequence (mRNA) entities, e.g. those derived from protein-RNA or chromatin immunoprecipitation arrays/assays, as reported by certain sources; these can be used to cross-evaluate the supporting evidence provided in other primary datasets. In the same context, assorted types of data (e.g. disease-related genes, genomic, transcriptomic or proteomic data) can be consistently integrated, viewed and interpreted in the context of the protein interaction network.
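As a rough illustration of why a standardized node set makes releases comparable, the following Python sketch diffs two network snapshots defined over the same reference identifiers. The release contents and protein IDs are invented for the example, and the helper names are hypothetical; this is not a PICKLE export format.

```python
def as_edges(pairs):
    """Represent undirected PPIs as a set of unordered ID pairs."""
    return {frozenset(pair) for pair in pairs}

def nodes(edges):
    """Collect every identifier that appears in at least one PPI."""
    return set().union(*edges) if edges else set()

# Two hypothetical UniProt-level snapshots over the same reference node set.
release_1 = as_edges([("P1", "P2"), ("P2", "P3")])
release_2 = as_edges([("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P1", "P4")])

added = release_2 - release_1          # PPIs that are new in the later release
removed = release_1 - release_2        # PPIs that no longer appear
new_nodes = nodes(release_2) - nodes(release_1)

print("added PPIs:", sorted(sorted(ppi) for ppi in added))
print("removed PPIs:", sorted(sorted(ppi) for ppi in removed))
print("newly connected proteins:", sorted(new_nodes))
```

The set differences are meaningful only because both snapshots use the same identifier space; with networks normalized to different, source-dependent node sets, the same comparison would mix genuine growth with mapping differences.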


 

Contributing Databases

PPI Databases: IntAct, BioGRID, HPRD, MINT (from IntAct), DIP

Biological Databases: UniProt, GenBank, Ensembl, The European Nucleotide Archive

 

The PICKLE team

Nicholas K. Moschonas, Professor and Head
Department of General Biology
School of Medicine
University of Patras
Rio, Patras
&
Collaborating Faculty Member
Institute of Chemical Engineering Sciences (ICE-HT)
Foundation for Research & Technology - Hellas (FORTH)
Rio, Patras
GREECE
Phone: +30-2610-997602
e-mail: n_moschonas [a.t.] med [DOT] upatras [DOT] gr

Maria I. Klapa, Principal Researcher (Rank B) and Head
Metabolic Engineering and Systems Biology Laboratory (MESBL)
Institute of Chemical Engineering Sciences (ICE-HT)
Foundation for Research & Technology - Hellas (FORTH)
Rio, Patras
GREECE
Phone: +30-2610-965249
e-mail: mklapa [a.t.] iceht [DOT] forth [DOT] gr

Aris G. Gioutlakis, PhD Candidate
Department of General Biology, School of Medicine
University of Patras
&
Metabolic Engineering and Systems Biology Laboratory (MESBL)
Institute of Chemical Engineering Sciences (ICE-HT)
Foundation for Research & Technology - Hellas (FORTH)
Rio, Patras
GREECE
Phone: +30-2610-997603
e-mail: gioutlakis [a.t.] upatras [DOT] gr

Georgios N. Dimitrakopoulos, PhD
Department of General Biology, School of Medicine
University of Patras
&
Metabolic Engineering and Systems Biology Laboratory (MESBL)
Institute of Chemical Engineering Sciences (ICE-HT)
Foundation for Research & Technology - Hellas (FORTH)
Rio, Patras
GREECE
Phone: +30-2610-997603
e-mail: geodimitrak [a.t.] upatras [DOT] gr


 

PICKLE publications

For more information consult these publications: