Cache Service

cache_service

Cache Service - Application-Level Caching with Hash-Based Keys.

Provides caching functionality for expensive operations like database merges. Uses hash-based keys for cache invalidation and supports multiple data types (DataFrame, Dataset, MergedDataDTO).

Classes:

Name          Description
CacheService  Manages application-level caching with TTL and size limits

Classes

CacheService

CacheService(max_size: int = 100, default_ttl_seconds: int = 3600)

Manage application-level caching with hash-based keys.

Provides caching for expensive operations with automatic expiration, size limits, and hash-based key generation.

Parameters:

Name                 Type  Description                               Default
max_size             int   Maximum number of cached items            100
default_ttl_seconds  int   Default time-to-live in seconds (1 hour)  3600

Attributes:

Name          Type                       Description
_cache        Dict[str, Dict[str, Any]]  Internal cache storage
_max_size     int                        Maximum cache size
_default_ttl  int                        Default TTL

Methods:

Name               Description
set                Store value in cache
get                Retrieve value from cache
delete             Remove value from cache
clear              Clear entire cache
generate_hash_key  Generate hash-based cache key
has                Check if key exists and is valid
size               Get current cache size

Notes

Cache entries structure:

    {
        'key': {
            'value': cached_value,
            'timestamp': creation_time,
            'ttl': time_to_live_seconds
        }
    }

Uses SHA256 for hash key generation.
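To make the entry layout concrete, here is a minimal sketch of a single cache entry and an expiry check against it. The helper names `make_entry` and `is_expired`, the injectable `now` argument, and the exact comparison are assumptions for illustration; the service's own `_is_expired` is not shown on this page and may differ.

```python
import time
from typing import Any, Dict, Optional


def make_entry(value: Any, ttl_seconds: int, now: Optional[float] = None) -> Dict[str, Any]:
    """Build an entry with the documented keys: value, timestamp, ttl."""
    return {
        "value": value,
        "timestamp": time.time() if now is None else now,
        "ttl": ttl_seconds,
    }


def is_expired(entry: Dict[str, Any], now: Optional[float] = None) -> bool:
    """Assumed rule: an entry expires once its age exceeds its TTL."""
    current = time.time() if now is None else now
    return (current - entry["timestamp"]) > entry["ttl"]


entry = make_entry({"rows": 3}, ttl_seconds=60, now=1_000.0)
fresh = is_expired(entry, now=1_030.0)  # 30 s old, inside the 60 s TTL
stale = is_expired(entry, now=1_061.0)  # 61 s old, past the TTL
```

Passing `now` explicitly keeps the check deterministic in tests; in normal use both helpers fall back to `time.time()`.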

Initialize cache service.

Parameters:

Name                 Type  Description           Default
max_size             int   Maximum cached items  100
default_ttl_seconds  int   Default TTL (1 hour)  3600
Source code in src/application/services/cache_service.py
def __init__(self, max_size: int = 100, default_ttl_seconds: int = 3600) -> None:
    """
    Initialize cache service.

    Parameters
    ----------
    max_size : int, default=100
        Maximum cached items
    default_ttl_seconds : int, default=3600
        Default TTL (1 hour)
    """
    self._cache: Dict[str, Dict[str, Any]] = {}
    self._max_size = max_size
    self._default_ttl = default_ttl_seconds
Functions
set
set(key: str, value: Any, ttl_seconds: Optional[int] = None) -> None

Store value in cache.

Parameters:

Name         Type           Description                          Default
key          str            Cache key                            required
value        Any            Value to cache                       required
ttl_seconds  Optional[int]  Time-to-live (uses default if None)  None
Notes

If the cache is full, the oldest entry is removed (FIFO eviction) before the new value is stored.

Source code in src/application/services/cache_service.py
def set(self, key: str, value: Any, ttl_seconds: Optional[int] = None) -> None:
    """
    Store value in cache.

    Parameters
    ----------
    key : str
        Cache key
    value : Any
        Value to cache
    ttl_seconds : Optional[int]
        Time-to-live (uses default if None)

    Notes
    -----
    If cache is full, removes oldest entry (FIFO eviction).
    """
    # Evict only when adding a new key to a full cache; overwriting an
    # existing key does not grow the cache.
    if key not in self._cache and len(self._cache) >= self._max_size:
        self._evict_oldest()

    ttl = ttl_seconds if ttl_seconds is not None else self._default_ttl

    self._cache[key] = {
        "value": value,
        "timestamp": time.time(),
        "ttl": ttl,
    }
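The eviction helper `_evict_oldest` is not listed on this page. One way it could work is sketched below, under the assumption that "oldest" means the entry with the smallest `timestamp`. The functions `evict_oldest` and `put` are hypothetical stand-ins that operate on a plain dict, not the service's actual code.

```python
from typing import Any, Dict


def evict_oldest(cache: Dict[str, Dict[str, Any]]) -> None:
    """Remove the entry with the smallest timestamp (assumed FIFO rule)."""
    if not cache:
        return
    oldest_key = min(cache, key=lambda k: cache[k]["timestamp"])
    del cache[oldest_key]


def put(cache: Dict[str, Dict[str, Any]], key: str, value: Any,
        now: float, max_size: int = 2, ttl: int = 3600) -> None:
    """Simplified stand-in for set(): evict when full, then store."""
    if len(cache) >= max_size:
        evict_oldest(cache)
    cache[key] = {"value": value, "timestamp": now, "ttl": ttl}


cache: Dict[str, Dict[str, Any]] = {}
put(cache, "a", 1, now=1.0)
put(cache, "b", 2, now=2.0)
put(cache, "c", 3, now=3.0)  # cache is full, so "a" (the oldest) is evicted
remaining = sorted(cache)
```

Since Python dicts preserve insertion order, taking the first key would be an alternative, but the two approaches diverge once a key is overwritten: the key keeps its original position while its timestamp is refreshed.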
get
get(key: str) -> Optional[Any]

Retrieve value from cache.

Parameters:

Name  Type  Description  Default
key   str   Cache key    required

Returns:

Type           Description
Optional[Any]  Cached value or None if not found/expired

Notes

Expiration is checked lazily: if the requested entry has expired, it is deleted from the cache and None is returned.

Source code in src/application/services/cache_service.py
def get(self, key: str) -> Optional[Any]:
    """
    Retrieve value from cache.

    Parameters
    ----------
    key : str
        Cache key

    Returns
    -------
    Optional[Any]
        Cached value or None if not found/expired

    Notes
    -----
    Automatically removes expired entries.
    """
    if key not in self._cache:
        return None

    entry = self._cache[key]

    # Check expiration
    if self._is_expired(entry):
        del self._cache[key]
        return None

    return entry["value"]
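A common way to use `get` together with `set` is the cache-aside pattern: look the key up, and only run the expensive operation on a miss. The sketch below is illustrative; `get_or_compute`, `DictCache`, and `expensive_merge` are hypothetical names, and `DictCache` is a tiny stand-in that only mimics the `get`/`set` signatures so the example runs on its own.

```python
from typing import Any, Callable, Dict, Optional


class DictCache:
    """Stand-in exposing the same get/set signatures (no TTL, no eviction)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        return self._data.get(key)

    def set(self, key: str, value: Any, ttl_seconds: Optional[int] = None) -> None:
        self._data[key] = value


def get_or_compute(cache: Any, key: str, compute: Callable[[], Any],
                   ttl_seconds: Optional[int] = None) -> Any:
    """Return the cached value, computing and storing it on a miss."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = compute()
    cache.set(key, value, ttl_seconds)
    return value


calls = []


def expensive_merge() -> list:
    calls.append(1)  # record each time the real work runs
    return [1, 2, 3]


cache = DictCache()
first = get_or_compute(cache, "merge:v1", expensive_merge)
second = get_or_compute(cache, "merge:v1", expensive_merge)  # served from cache
```

With this pattern a computed value of None is treated as a miss and recomputed on every call, because `get` uses None to signal "not found".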
delete
delete(key: str) -> bool

Remove value from cache.

Parameters:

Name  Type  Description  Default
key   str   Cache key    required

Returns:

Type  Description
bool  True if key was deleted, False if not found

Source code in src/application/services/cache_service.py
def delete(self, key: str) -> bool:
    """
    Remove value from cache.

    Parameters
    ----------
    key : str
        Cache key

    Returns
    -------
    bool
        True if key was deleted, False if not found
    """
    if key in self._cache:
        del self._cache[key]
        return True
    return False
clear
clear() -> None

Clear entire cache.

Source code in src/application/services/cache_service.py
def clear(self) -> None:
    """Clear entire cache."""
    self._cache.clear()
generate_hash_key
generate_hash_key(content: str) -> str

Generate SHA256 hash-based cache key.

Parameters:

Name     Type  Description                                                   Default
content  str   Content to hash (e.g., upload content or dataset identifier)  required

Returns:

Type  Description
str   SHA256 hash hexadecimal string

Notes
  • Same content always generates same key (deterministic)
  • Compatible with legacy cache key generation
Source code in src/application/services/cache_service.py
def generate_hash_key(self, content: str) -> str:
    """
    Generate SHA256 hash-based cache key.

    Parameters
    ----------
    content : str
        Content to hash (e.g., upload content or dataset identifier)

    Returns
    -------
    str
        SHA256 hash hexadecimal string

    Notes
    -----
    - Same content always generates same key (deterministic)
    - Compatible with legacy cache key generation
    """
    return hashlib.sha256(content.encode("utf-8")).hexdigest()
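Because the key depends only on the input string, callers that build keys from several fields need a stable serialization first. The sketch below repeats the one-line hash as a free function and adds `key_from_parts`, a hypothetical helper (not part of the service) that serializes its arguments as JSON with sorted keys before hashing.

```python
import hashlib
import json
from typing import Any


def generate_hash_key(content: str) -> str:
    """Same one-liner as the method above, written as a free function."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def key_from_parts(**parts: Any) -> str:
    """Hypothetical helper: serialize parts in a stable order, then hash."""
    canonical = json.dumps(parts, sort_keys=True, separators=(",", ":"))
    return generate_hash_key(canonical)


key_a = key_from_parts(dataset="sales", version=2)
key_b = key_from_parts(version=2, dataset="sales")  # same parts, different order
key_c = key_from_parts(dataset="sales", version=3)  # different content
```

Here `key_a` and `key_b` are identical while `key_c` differs, and each key is a 64-character hexadecimal string.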
has
has(key: str) -> bool

Check if key exists and is valid (not expired).

Parameters:

Name  Type  Description  Default
key   str   Cache key    required

Returns:

Type  Description
bool  True if key exists and not expired

Source code in src/application/services/cache_service.py
def has(self, key: str) -> bool:
    """
    Check if key exists and is valid (not expired).

    Parameters
    ----------
    key : str
        Cache key

    Returns
    -------
    bool
        True if key exists and not expired
    """
    if key not in self._cache:
        return False

    entry = self._cache[key]
    if self._is_expired(entry):
        del self._cache[key]
        return False

    return True
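One practical use of `has` is telling a cached None apart from a miss, since `get` returns None in both cases. The sketch below uses `MiniCache`, a condensed stand-in for the documented `set`/`get`/`has` logic. The injectable `clock`, the `_live_entry` helper, and the expiry comparison are assumptions added to make the example deterministic, and eviction is left out.

```python
import time
from typing import Any, Callable, Dict, Optional


class MiniCache:
    """Condensed stand-in for the documented set/get/has behaviour."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._cache: Dict[str, Dict[str, Any]] = {}
        self._clock = clock  # assumption: injectable clock for testing

    def set(self, key: str, value: Any, ttl_seconds: int = 3600) -> None:
        self._cache[key] = {
            "value": value,
            "timestamp": self._clock(),
            "ttl": ttl_seconds,
        }

    def _live_entry(self, key: str) -> Optional[Dict[str, Any]]:
        entry = self._cache.get(key)
        if entry is None:
            return None
        if self._clock() - entry["timestamp"] > entry["ttl"]:
            del self._cache[key]  # lazy removal on lookup
            return None
        return entry

    def get(self, key: str) -> Optional[Any]:
        entry = self._live_entry(key)
        return None if entry is None else entry["value"]

    def has(self, key: str) -> bool:
        return self._live_entry(key) is not None


now = [0.0]
cache = MiniCache(clock=lambda: now[0])
cache.set("empty-result", None, ttl_seconds=10)

hit_with_none = (cache.has("empty-result"), cache.get("empty-result"))
miss = (cache.has("unknown"), cache.get("unknown"))

now[0] = 11.0  # advance the fake clock past the TTL
after_expiry = cache.has("empty-result")
```

Both lookups return None from `get`, but only the stored key reports True from `has`; once the TTL has passed, `has` reports False and the entry is dropped.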
size
size() -> int

Get current cache size (number of entries). Expired entries are removed before counting.

Returns:

Type  Description
int   Number of cached entries

Source code in src/application/services/cache_service.py
def size(self) -> int:
    """
    Get current cache size (number of entries).

    Returns
    -------
    int
        Number of cached entries
    """
    # Clean expired entries first
    self._clean_expired()
    return len(self._cache)
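The `_clean_expired` helper called above is not listed on this page. A minimal sketch of such a sweep follows, assuming the documented entry layout and an "age greater than TTL" expiry rule; `clean_expired` and its `now` parameter are hypothetical.

```python
from typing import Any, Dict


def clean_expired(cache: Dict[str, Dict[str, Any]], now: float) -> int:
    """Delete every expired entry and return how many were removed."""
    # Collect keys first: deleting while iterating a dict raises RuntimeError.
    expired = [
        key for key, entry in cache.items()
        if now - entry["timestamp"] > entry["ttl"]
    ]
    for key in expired:
        del cache[key]
    return len(expired)


cache = {
    "old": {"value": 1, "timestamp": 0.0, "ttl": 10},
    "new": {"value": 2, "timestamp": 95.0, "ttl": 10},
}
removed = clean_expired(cache, now=100.0)  # "old" is 100 s old, "new" is 5 s old
size = len(cache)
```

A sweep like this visits every entry, so a `size()` built on it costs time proportional to the number of cached items rather than being a constant-time lookup.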