Skip to content

Result Interpretation Guidelines

This page provides guidance for scientifically responsible interpretation of BioRemPP results.


Purpose of Interpretation Guidance

BioRemPP generates results of bioremediation potential based on functional annotations. These results require careful interpretation within their appropriate scientific context.

This page addresses:

  • The scope and limitations of computational functional inference
  • The distinction between genetic potential and biological activity
  • The proper contextualization of regulatory and toxicological predictions
  • Common interpretation errors that may lead to invalid conclusions

Target audience: Researchers interpreting BioRemPP outputs for hypothesis generation, experimental design, or comparative functional profiling.


What the Results Represent

BioRemPP results indicate genetic potential for bioremediation functions based on the presence of KEGG Orthology (KO) identifiers in user-submitted samples.

Results are derived from:

  • Database integration: Merging user KO annotations with curated bioremediation databases
  • Pathway mapping: Quantifying the presence of enzymes in metabolic pathways
  • Functional inference: Identifying gene-compound relationships relevant to pollutant biotransformation
  • Toxicity modeling: Predicting compound properties using machine learning (toxCSM)

Biological Interpretation

Results represent:

  • Gene presence: Detection of orthologous genes encoding specific functions
  • Functional capacity: The theoretical ability to perform bioremediation processes if genes are expressed
  • Pathway completeness: The proportion of enzymatic steps present in defined metabolic routes
  • Comparative profiling: Relative functional differences across samples

Functional Potential vs Biological Activity

Genetic Potential

Definition: The presence of genes encoding bioremediation functions in genomic or metagenomic samples.

Measured by: Detection of KO identifiers matching curated enzyme-compound relationships.

Interpretation example:

"Sample A contains KOs associated with all enzymatic steps of the benzoate degradation pathway (100% pathway completeness)."

This indicates the potentialcapacity for benzoate degradation if genes are expressed and active.

Biological Activity

Definition: The measurable degradation, transformation, or detoxification of pollutants under specific environmental conditions.

Measured by: Experimental assays (e.g., HPLC quantification of substrate depletion, respirometry, isotopic labeling).

Interpretation example:

"Sample A degraded 85% of benzoate within 48 hours at pH 7.0 and 25°C."

This indicates confirmed activity under defined conditions.

The Gap Between Potential and Activity

Factors influencing discrepancy:

  • Regulatory mechanisms (gene silencing, repression)
  • Environmental conditions (pH, temperature, oxygen availability)
  • Substrate bioavailability
  • Competing metabolic pathways
  • Post-translational modifications
  • Enzyme inhibition or cofactor limitation

Implication for interpretation:

Functional potential results should inform hypothesis generation and candidate prioritization for experimental validation, not serve as definitive evidence of biological activity.


Regulatory and Toxicological Context

Regulatory Framework Integration

BioRemPP integrates compound classifications from:

  • IARC (International Agency for Research on Cancer)
  • EPA (U.S. Environmental Protection Agency)
  • ATSDR (Agency for Toxic Substances and Disease Registry)
  • WFD (Water Framework Directive, EU)
  • PSL (Priority Substances List, Canada)
  • EPC (Environmental Priority Chemicals)
  • CONAMA (National Environment Council, Brazil)

Purpose: Provide contextual reference for the environmental or health significance of compounds associated with detected genes.

Critical disclaimer:

Regulatory classifications are for informational context only. They do not constitute:

  • Legal compliance assessments
  • Risk evaluations
  • Exposure limit determinations
  • Regulatory approval or certification

Users must consult jurisdiction-specific regulations and conduct formal risk assessments for compliance purposes.

Toxicity Predictions (toxCSM)

Nature of predictions:

toxCSM provides machine learning–based predictions for 66 toxicological endpoints, including:

  • Nuclear Response
  • Stress Response
  • Genomic
  • Environmental
  • Dose Response

Interpretation guidelines:

  • Predictions should be used for prioritization, not definitive hazard assessment
  • Never use for clinical decisions or regulatory submissions without experimental validation

Recommended use:

  • Screening large compound sets for experimental design
  • Identifying candidates for further investigation
  • Comparative hazard profiling across samples

Common Misinterpretations

Misinterpretation 1: Pathway Completeness as Proof of Activity

Incorrect interpretation:

"This sample has 100% completeness for the naphthalene degradation pathway, therefore it degrades naphthalene."

Correct interpretation:

"This sample encodes all enzymes required for naphthalene degradation. Experimental validation is needed to confirm degradation activity."

Rationale: Pathway completeness reflects gene presence, not gene expression or catalytic activity.


Misinterpretation 2: Toxicity Predictions as Measured Endpoints

Incorrect interpretation:

"toxCSM predicts LD50 of 250 mg/kg, so this compound is moderately toxic."

Correct interpretation:

"toxCSM computational model predicts LD50 of 250 mg/kg. Experimental toxicity testing is required for hazard classification."

Rationale: Machine learning predictions have inherent uncertainty and are not substitutes for laboratory assays.


Misinterpretation 3: Database Agreement as Validation

Incorrect interpretation:

"This KO appears in BioRemPP, KEGG, and HADEG, so the functional annotation is experimentally validated."

Correct interpretation:

"This KO is consistently annotated across multiple databases, suggesting robust computational evidence. Experimental validation remains necessary."

Rationale: Database consensus reflects agreement among curation efforts, not experimental confirmation.


Misinterpretation 4: Quantitative Comparisons Across Databases

Incorrect interpretation:

"Sample A has 500 KOs in KEGG but only 200 in BioRemPP, so KEGG is more comprehensive."

Correct interpretation:

"Sample A has broader functional coverage in KEGG (500 KOs) than in the bioremediation-specific BioRemPP database (200 KOs), reflecting differences in database scope."

Rationale: Databases have different curation focuses. BioRemPP prioritizes bioremediation-relevant enzymes, while KEGG covers general metabolism.


Misinterpretation 5: Absence of Results as Absence of Function

Incorrect interpretation:

"No results were returned for the ToxCSM database, so this sample has no potential for degradation."

Correct interpretation:

"No matches were found between user KOs and ToxCSM compound mappings. This does not exclude the presence of toxic metabolites not covered by the database."

Rationale: Empty results reflect database coverage limitations, not definitive absence of function.


For scientifically rigorous interpretation:

  1. Identify functional potential using BioRemPP results
  2. Prioritize candidates based on pathway completeness, database agreement, or sample ranking
  3. Design experiments to validate gene expression (RT-qPCR, RNA-seq)
  4. Confirm enzymatic activity with substrate depletion assays or metabolomics
  5. Assess in situ performance under environmentally relevant conditions
  6. Document assumptions and limitations in Methods sections

Limitations

BioRemPP results indicate genetic potential for exploratory analysis, not definitive biological activity or regulatory compliance.

For complete scope and limitations, see Limitations and Scope Boundaries.


Reproducibility and Transparency

Required Documentation

For reproducible interpretation:

  • BioRemPP version: Report version used (e.g., v1.0.0)
  • Analysis parameters: Document thresholds, filters, or Top N settings
  • Database access dates: Especially for KEGG (updated regularly)
  • Annotation tool: Report how KO identifiers were generated (e.g., eggNOG-mapper v2.1.12)

See Downloads Guide for complete reproducibility requirements when exporting results.