Mappers

Application Layer - Mappers Package.

This package contains mappers for converting between domain entities and application DTOs. Mappers are stateless, can be used as plain functions, and follow the Single Responsibility Principle.

Notes

  • Mappers are stateless and can be used as functions
  • Follow Single Responsibility Principle
  • Enable layer independence (Domain ↔ Application)

Mappers

  • SampleMapper: maps between Sample entities and DataFrames
  • MergedDataMapper: maps between MergedData entities and DTOs

SampleMapper

Map between Sample entities and DataFrames.

Provides bidirectional conversion between domain entities (Sample, Dataset) and pandas DataFrames used by the application layer.

Methods:

Name             Description
to_dataframe     Convert Dataset to DataFrame
from_dataframe   Convert DataFrame to Dataset
samples_to_dict  Convert Dataset to dictionary format
dict_to_dataset  Convert dictionary to Dataset

Notes
  • All methods are static (stateless)
  • Preserves KO identifiers exactly
  • Handles empty datasets gracefully

Functions

to_dataframe staticmethod
to_dataframe(dataset: Dataset) -> pd.DataFrame

Convert Dataset to DataFrame.

Parameters:

Name     Type     Description                       Default
dataset  Dataset  Domain entity containing samples  required

Returns:

Type       Description
DataFrame  DataFrame with columns: Sample, KO (each row is one Sample-KO pair)

Notes
  • Each KO creates a new row
  • Sample ID is repeated for multiple KOs
  • Empty datasets return empty DataFrame with correct columns
Source code in src/application/mappers/sample_mapper.py
@staticmethod
def to_dataframe(dataset: Dataset) -> pd.DataFrame:
    """
    Convert Dataset to DataFrame.

    Parameters
    ----------
    dataset : Dataset
        Domain entity containing samples

    Returns
    -------
    pd.DataFrame
        DataFrame with columns: Sample, KO (each row is one Sample-KO pair)

    Notes
    -----
    - Each KO creates a new row
    - Sample ID is repeated for multiple KOs
    - Empty datasets return empty DataFrame with correct columns
    """
    if not dataset.samples:
        return pd.DataFrame(columns=["Sample", "KO"])

    data = []
    for sample in dataset.samples:
        sample_id_str = sample.id.value
        for ko in sample.ko_list:
            data.append({"Sample": sample_id_str, "KO": ko.id})

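    # Pass columns explicitly so samples without any KOs still yield the expected schema.
    return pd.DataFrame(data, columns=["Sample", "KO"])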
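
The long-format layout described above (one row per Sample-KO pair) can be illustrated with a minimal sketch. The `KO`, `SampleId`, `Sample`, and `Dataset` classes below are hypothetical stand-ins for the domain classes, which are not shown on this page and may validate their inputs; the function mirrors the mapper logic as a plain function, with the column names passed explicitly.

```python
from dataclasses import dataclass

import pandas as pd


# Hypothetical stand-ins for the domain classes (assumption: the real ones
# expose the same attributes used by the mapper: .samples, .id.value, .ko_list, .id).
@dataclass(frozen=True)
class KO:
    id: str


@dataclass(frozen=True)
class SampleId:
    value: str


@dataclass
class Sample:
    id: SampleId
    ko_list: list


@dataclass
class Dataset:
    samples: list


def to_dataframe(dataset: Dataset) -> pd.DataFrame:
    """Flatten a dataset into one row per (sample, KO) pair."""
    rows = [
        {"Sample": sample.id.value, "KO": ko.id}
        for sample in dataset.samples
        for ko in sample.ko_list
    ]
    # Explicit columns keep the schema stable even when there are no rows.
    return pd.DataFrame(rows, columns=["Sample", "KO"])


dataset = Dataset(
    [
        Sample(SampleId("S1"), [KO("K00001"), KO("K00002")]),
        Sample(SampleId("S2"), [KO("K00001")]),
    ]
)
df = to_dataframe(dataset)       # three rows: S1 twice, S2 once
empty = to_dataframe(Dataset([]))  # no rows, but still Sample and KO columns
```

The sample ID is repeated once per KO, which is what makes the result easy to merge with other tables keyed on KO.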
from_dataframe staticmethod
from_dataframe(df: DataFrame) -> Dataset

Convert DataFrame to Dataset.

Parameters:

Name  Type       Description                         Default
df    DataFrame  DataFrame with columns: Sample, KO  required

Returns:

Type     Description
Dataset  Domain entity with reconstructed samples

Raises:

Type        Description
ValueError  If the DataFrame is missing required columns
ValueError  If the DataFrame contains invalid KO identifiers

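Notes
  • Groups KOs by Sample ID
  • Validates KO format during conversion
  • Preserves the order in which samples first appear (groupby with sort=False)
  • An empty DataFrame returns an empty Dataset before the column check runs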
Source code in src/application/mappers/sample_mapper.py
@staticmethod
def from_dataframe(df: pd.DataFrame) -> Dataset:
    """
    Convert DataFrame to Dataset.

    Parameters
    ----------
    df : pd.DataFrame
        DataFrame with columns: Sample, KO

    Returns
    -------
    Dataset
        Domain entity with reconstructed samples

    Raises
    ------
    ValueError
        If DataFrame missing required columns
    ValueError
        If DataFrame contains invalid KO identifiers

    Notes
    -----
    - Groups KOs by Sample ID
    - Validates KO format during conversion
    - Preserves original order of samples
    """
    if df.empty:
        return Dataset([])

    if "Sample" not in df.columns or "KO" not in df.columns:
        raise ValueError("DataFrame must have 'Sample' and 'KO' columns")

    # Group by Sample to collect KOs
    samples = []
    for sample_id, group in df.groupby("Sample", sort=False):
        # Create KO value objects
        kos = [KO(ko_id) for ko_id in group["KO"].values]

        # Create Sample entity
        sample = Sample(id=SampleId(str(sample_id)), ko_list=kos)
        samples.append(sample)

    return Dataset(samples)
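
The grouping step relies on `groupby(..., sort=False)` to keep samples in order of first appearance. The sketch below shows that behaviour on plain pandas data, without the domain classes. `require_columns` is a hypothetical helper written for this illustration, not part of the mapper.

```python
import pandas as pd


def require_columns(df: pd.DataFrame, required=("Sample", "KO")) -> None:
    """Hypothetical helper: raise ValueError if any required column is absent."""
    missing = [name for name in required if name not in df.columns]
    if missing:
        raise ValueError(f"DataFrame is missing columns: {missing}")


df = pd.DataFrame(
    {
        "Sample": ["S2", "S1", "S2", "S1"],
        "KO": ["K00003", "K00001", "K00002", "K00004"],
    }
)
require_columns(df)

# sort=False keeps groups in order of first appearance rather than sorting
# the keys, so "S2" comes before "S1" here.
grouped = {
    str(sample_id): group["KO"].tolist()
    for sample_id, group in df.groupby("Sample", sort=False)
}
```

Within each group the rows keep their original relative order, so the KO list for a sample follows the order in the input DataFrame.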
samples_to_dict staticmethod
samples_to_dict(dataset: Dataset) -> dict

Convert Dataset to dictionary format.

Parameters:

Name     Type     Description                 Default
dataset  Dataset  Domain entity with samples  required

Returns:

Type  Description
dict  Dictionary with sample_id as keys, list of KO IDs as values

Notes
  • Useful for JSON serialization
  • Preserves all KOs per sample
  • Returns empty dict for empty dataset
Source code in src/application/mappers/sample_mapper.py
@staticmethod
def samples_to_dict(dataset: Dataset) -> dict:
    """
    Convert Dataset to dictionary format.

    Parameters
    ----------
    dataset : Dataset
        Domain entity with samples

    Returns
    -------
    dict
        Dictionary with sample_id as keys, list of KO IDs as values

    Notes
    -----
    - Useful for JSON serialization
    - Preserves all KOs per sample
    - Returns empty dict for empty dataset
    """
    if not dataset.samples:
        return {}

    result = {}
    for sample in dataset.samples:
        sample_id = sample.id.value
        ko_ids = [ko.id for ko in sample.ko_list]
        result[sample_id] = ko_ids

    return result
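
Because the result contains only strings and lists, it can be passed straight to the standard `json` module. The dictionary literal below is example data in the shape described above, not output captured from the real mapper.

```python
import json

# Example data in the documented shape: sample ID -> list of KO IDs.
samples = {"S1": ["K00001", "K00002"], "S2": ["K00001"]}

payload = json.dumps(samples)   # serialize to a JSON string
restored = json.loads(payload)  # parse it back into a dict
```

The round trip returns an equal dictionary with the same key order, which is the form `dict_to_dataset` expects as input.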
dict_to_dataset staticmethod
dict_to_dataset(data: dict) -> Dataset

Convert dictionary to Dataset.

Parameters:

Name  Type  Description                                                  Default
data  dict  Dictionary with sample_id as keys, list of KO IDs as values  required

Returns:

Type     Description
Dataset  Reconstructed domain entity

Notes
  • Validates KO format during conversion
  • Preserves dictionary key order (Python 3.7+)
Source code in src/application/mappers/sample_mapper.py
@staticmethod
def dict_to_dataset(data: dict) -> Dataset:
    """
    Convert dictionary to Dataset.

    Parameters
    ----------
    data : dict
        Dictionary with sample_id as keys, list of KO IDs as values

    Returns
    -------
    Dataset
        Reconstructed domain entity

    Notes
    -----
    - Validates KO format during conversion
    - Preserves dictionary key order (Python 3.7+)
    """
    if not data:
        return Dataset([])

    samples = []
    for sample_id, ko_ids in data.items():
        kos = [KO(ko_id) for ko_id in ko_ids]
        sample = Sample(id=SampleId(str(sample_id)), ko_list=kos)
        samples.append(sample)

    return Dataset(samples)
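
Validation happens when each `KO` value object is constructed. The `KO` class is not shown on this page, so the sketch below uses a hypothetical stand-in that assumes the common KEGG Orthology format (`K` followed by five digits); the real class may apply different rules. `build_ko_lists` is likewise an illustrative helper, not part of the mapper.

```python
import re
from dataclasses import dataclass

# Assumed format: the letter "K" followed by exactly five digits.
_KO_PATTERN = re.compile(r"K\d{5}")


@dataclass(frozen=True)
class KO:
    """Hypothetical value object; the real KO class may validate differently."""

    id: str

    def __post_init__(self):
        if not _KO_PATTERN.fullmatch(self.id):
            raise ValueError(f"Invalid KO identifier: {self.id!r}")


def build_ko_lists(data: dict) -> dict:
    """Turn {sample_id: [ko_id, ...]} into {str(sample_id): [KO, ...]}."""
    return {
        str(sample_id): [KO(ko_id) for ko_id in ko_ids]
        for sample_id, ko_ids in data.items()
    }


# Keys are converted with str(), so a non-string key such as 7 becomes "7".
valid = build_ko_lists({"S1": ["K00001", "K00002"], 7: ["K12345"]})

try:
    build_ko_lists({"S1": ["not-a-ko"]})
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

With this design an invalid identifier stops the conversion at the first bad value, so a partially built dataset is never returned.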

MergedDataMapper

Map between MergedData entity and MergedDataDTO.

Converts between the domain MergedData entity and the application layer DTO used for data transfer.

Methods:

Name              Description
to_dto            Convert MergedData entity to DTO
from_dto          Convert DTO to MergedData entity
create_empty_dto  Create empty DTO for no matches scenario

Notes
  • All methods are static (stateless)
  • Does not copy DataFrames (shares references)
  • Preserves all metadata

Functions

to_dto staticmethod
to_dto(entity: MergedData, cache_key: str, processing_time_seconds: float = 0.0) -> MergedDataDTO

Convert MergedData entity to DTO.

Parameters:

Name                     Type        Description                         Default
entity                   MergedData  Domain entity with merged data      required
cache_key                str         Cache key for this merge operation  required
processing_time_seconds  float       Time taken for processing           0.0

Returns:

Type           Description
MergedDataDTO  Immutable DTO for application layer

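Notes
  • Shares DataFrame references (no copying)
  • Adds cache_key and processing_time metadata
  • Preserves all optional database results
  • match_count is the number of rows in biorempp_data (0 if it is absent); total_records is the number of samples in the original dataset, raised to match_count if it would otherwise be smaller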
Source code in src/application/mappers/merged_data_mapper.py
@staticmethod
def to_dto(
    entity: MergedData, cache_key: str, processing_time_seconds: float = 0.0
) -> MergedDataDTO:
    """
    Convert MergedData entity to DTO.

    Parameters
    ----------
    entity : MergedData
        Domain entity with merged data
    cache_key : str
        Cache key for this merge operation
    processing_time_seconds : float, default=0.0
        Time taken for processing

    Returns
    -------
    MergedDataDTO
        Immutable DTO for application layer

    Notes
    -----
    - Shares DataFrame references (no copying)
    - Adds cache_key and processing_time metadata
    - Preserves all optional database results
    """
    # match_count is the number of rows in biorempp_data, if it is a DataFrame.
    # isinstance() is already False for None, so no separate None check is needed.
    match_count = 0
    if isinstance(entity.biorempp_data, pd.DataFrame):
        match_count = len(entity.biorempp_data)

    # total_records is the sample count, but never less than match_count.
    sample_count = (
        len(entity.original_dataset.samples) if entity.original_dataset else 0
    )
    total_records = max(sample_count, match_count)

    return MergedDataDTO(
        biorempp_data=entity.biorempp_data,
        hadeg_data=entity.hadeg_data,
        toxcsm_data=entity.toxcsm_data,
        match_count=match_count,
        total_records=total_records,
        cache_key=cache_key,
        processing_time_seconds=processing_time_seconds,
    )
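
The counting rule in `to_dto` can be isolated into a small function to see how the two values relate. `compute_counts` is a hypothetical helper written for this illustration; it takes the sample count directly instead of a `MergedData` entity.

```python
import pandas as pd


def compute_counts(biorempp_data, sample_count: int):
    """Return (match_count, total_records) following the rule used in to_dto."""
    # Rows in the result table count as matches; anything else counts as zero.
    match_count = len(biorempp_data) if isinstance(biorempp_data, pd.DataFrame) else 0
    # total_records is never allowed to fall below match_count.
    total_records = max(sample_count, match_count)
    return match_count, total_records


matches = pd.DataFrame(
    {"Sample": ["S1", "S1", "S2"], "KO": ["K00001", "K00002", "K00001"]}
)

many_matches = compute_counts(matches, sample_count=2)  # more rows than samples
no_data = compute_counts(None, sample_count=2)          # no result table at all
```

When there are more matched rows than samples, `total_records` is lifted to the row count, so `match_count <= total_records` always holds.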
from_dto staticmethod
from_dto(dto: MergedDataDTO) -> MergedData

Convert DTO to MergedData entity.

Parameters:

Name  Type           Description            Default
dto   MergedDataDTO  Application layer DTO  required

Returns:

Type        Description
MergedData  Reconstructed domain entity

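Notes
  • Shares DataFrame references (no copying)
  • Loses cache_key and processing_time (domain doesn't need them)
  • Preserves all database results
  • Does not restore original_dataset: the DTO does not store it, so the entity is rebuilt with an empty Dataset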
Source code in src/application/mappers/merged_data_mapper.py
@staticmethod
def from_dto(dto: MergedDataDTO) -> MergedData:
    """
    Convert DTO to MergedData entity.

    Parameters
    ----------
    dto : MergedDataDTO
        Application layer DTO

    Returns
    -------
    MergedData
        Reconstructed domain entity

    Notes
    -----
    - Shares DataFrame references (no copying)
    - Loses cache_key and processing_time (domain doesn't need them)
    - Preserves all database results
    """
    from src.domain.entities.dataset import Dataset

    return MergedData(
        original_dataset=Dataset([]),  # Empty dataset as we don't store it in DTO
        biorempp_data=dto.biorempp_data,
        hadeg_data=dto.hadeg_data,
        toxcsm_data=dto.toxcsm_data,
    )
create_empty_dto staticmethod
create_empty_dto(cache_key: str) -> MergedDataDTO

Create empty DTO for no matches scenario.

Parameters:

Name       Type  Description                   Default
cache_key  str   Cache key for this operation  required

Returns:

Type           Description
MergedDataDTO  DTO with empty DataFrames and zero counts

Notes
  • Useful when no matches found in database
  • All DataFrames are empty but not None
  • Maintains DTO structure consistency
Source code in src/application/mappers/merged_data_mapper.py
@staticmethod
def create_empty_dto(cache_key: str) -> MergedDataDTO:
    """
    Create empty DTO for no matches scenario.

    Parameters
    ----------
    cache_key : str
        Cache key for this operation

    Returns
    -------
    MergedDataDTO
        DTO with empty DataFrames and zero counts

    Notes
    -----
    - Useful when no matches found in database
    - All DataFrames are empty but not None
    - Maintains DTO structure consistency
    """
    # Use a separate empty DataFrame per field so the three results never
    # alias the same mutable object.
    return MergedDataDTO(
        biorempp_data=pd.DataFrame(),
        hadeg_data=pd.DataFrame(),
        toxcsm_data=pd.DataFrame(),
        match_count=0,
        total_records=0,
        cache_key=cache_key,
        processing_time_seconds=0.0,
    )
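
Since every DataFrame in an empty DTO is present but has no rows, callers can test for the no-match case without `None` checks. The sketch below assumes a DTO with the seven fields passed in the source above. The `MergedDataDTO` class here is a hypothetical frozen dataclass stand-in, and `has_matches` and the cache key `"key-1"` are made up for illustration.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass(frozen=True)
class MergedDataDTO:
    """Hypothetical stand-in with the fields used by the mapper."""

    biorempp_data: pd.DataFrame
    hadeg_data: pd.DataFrame
    toxcsm_data: pd.DataFrame
    match_count: int
    total_records: int
    cache_key: str
    processing_time_seconds: float


def has_matches(dto: MergedDataDTO) -> bool:
    """Hypothetical helper: True if any database result contains rows."""
    frames = (dto.biorempp_data, dto.hadeg_data, dto.toxcsm_data)
    return any(not frame.empty for frame in frames)


empty_dto = MergedDataDTO(
    biorempp_data=pd.DataFrame(),
    hadeg_data=pd.DataFrame(),
    toxcsm_data=pd.DataFrame(),
    match_count=0,
    total_records=0,
    cache_key="key-1",  # made-up key for the example
    processing_time_seconds=0.0,
)
```

Note that a frozen dataclass only prevents reassigning fields; the DataFrames it holds remain mutable objects.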