Dot Plot Strategy - Scatter and Bubble Chart Visualizations.
This module implements DotPlotStrategy, which creates scatter plots and bubble charts with Plotly. It supports both simple scatter plots with uniform markers and bubble charts that encode quantitative variables through marker size and color.
Classes:
| Name | Description |
| --- | --- |
| DotPlotStrategy | Strategy for creating scatter and bubble chart visualizations. |
Notes
- Supports simple scatter plots and bubble charts
- Flexible axis mappings (categorical or continuous)
- Data aggregation and filtering capabilities
For supported use cases, refer to the official documentation.
Classes
DotPlotStrategy
DotPlotStrategy(config: Dict[str, Any])
Bases: BasePlotStrategy
Strategy for creating scatter and bubble chart visualizations.
This strategy creates scatter-based visualizations with flexible configuration for both simple scatter plots and bubble charts with size and color encoding.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| config | Dict[str, Any] | Complete configuration dictionary from YAML. Must contain a 'visualization' section with plot parameters. | required |
Attributes:
| Name | Type | Description |
| --- | --- | --- |
| config | Dict[str, Any] | Stored configuration dictionary. |
| data_config | Dict[str, Any] | Data processing configuration. |
| plotly_config | Dict[str, Any] | Plotly-specific visualization config. |
Methods:
| Name | Description |
| --- | --- |
| validate_data | Validate input data for dot plot requirements. |
| process_data | Process data with filtering, grouping, and sorting. |
| create_figure | Create dot plot figure from processed data. |
| generate | Generate complete dot plot visualization. |
Notes
- Supports simple scatter and bubble chart modes
- Flexible axis mappings (categorical or continuous)
- Data aggregation and filtering capabilities
Initialize strategy with configuration.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| config | Dict[str, Any] | Complete configuration from YAML file. | required |
Source code in src/domain/plot_strategies/charts/dot_plot_strategy.py
    def __init__(self, config: Dict[str, Any]):
        """
        Initialize strategy with configuration.

        Parameters
        ----------
        config : Dict[str, Any]
            Complete configuration from YAML file.
        """
        super().__init__(config)
        self.data_config = config.get("data", {})
        self.plotly_config = self.viz_config.get("plotly", {})
        logger.info("DotPlotStrategy initialized")
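The lookups in `__init__`, `validate_data`, and `generate` suggest a configuration shaped roughly like the dictionary below. This is a sketch inferred from those lookups only: the column names are placeholders, and the assumption that the base class exposes `config["visualization"]` as `self.viz_config` is not confirmed by the source shown here.

```python
from typing import Any, Dict

# Hypothetical configuration, shaped after the keys the strategy reads.
# Column names ("sample", "pathway", "ko") are placeholders.
config: Dict[str, Any] = {
    "data": {
        "required_columns": ["sample", "pathway", "ko"],
        "processing": {
            "steps": [
                {
                    "name": "group_and_count",
                    "enabled": True,
                    # validate_data only reads "result_column"; any other
                    # params depend on the private step helpers.
                    "params": {"result_column": "unique_ko_count"},
                },
            ]
        },
    },
    "visualization": {
        "plotly": {
            "scatter": {
                "mode": "bubble",  # anything else falls back to a simple scatter
                "x": "sample",
                "y": "pathway",
                "size": "unique_ko_count",
                "color": "unique_ko_count",
            },
            "layout": {},
        }
    },
}

# Assumption: BasePlotStrategy sets viz_config = config["visualization"].
viz_config = config.get("visualization", {})
plotly_config = viz_config.get("plotly", {})
scatter_config = plotly_config.get("scatter", {})
```

In YAML the same structure would be nested mappings under `data` and `visualization`.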
Functions
validate_data
validate_data(df: DataFrame) -> None
Validate input data for dot plot requirements.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| df | DataFrame | Input data to validate. | required |
Raises:
| Type | Description |
| --- | --- |
| ValueError | If the DataFrame is empty, required columns are missing, or the x/y columns are not found in the data. |
Source code in src/domain/plot_strategies/charts/dot_plot_strategy.py
    def validate_data(self, df: pd.DataFrame) -> None:
        """
        Validate input data for dot plot requirements.

        Parameters
        ----------
        df : pd.DataFrame
            Input data to validate.

        Raises
        ------
        ValueError
            If DataFrame is empty, required columns missing, or x/y columns
            not found in data.
        """
        logger.debug("Starting data validation for DotPlotStrategy")
        # Check if DataFrame is empty
        if df.empty:
            raise ValueError("DataFrame is empty")
        # Get required columns from config
        required_cols = self.data_config.get("required_columns", [])
        if required_cols:
            missing_cols = set(required_cols) - set(df.columns)
            if missing_cols:
                raise ValueError(
                    f"Missing required columns: {missing_cols}. "
                    f"Available: {df.columns.tolist()}"
                )
        # Validate x and y columns from scatter config
        scatter_config = self.plotly_config.get("scatter", {})
        x_col = scatter_config.get("x")
        y_col = scatter_config.get("y")
        if not x_col or not y_col:
            raise ValueError(
                "Configuration must specify 'x' and 'y' in "
                "'visualization.plotly.scatter'"
            )
        if x_col not in df.columns:
            raise ValueError(f"X column '{x_col}' not found in data")
        if y_col not in df.columns:
            raise ValueError(f"Y column '{y_col}' not found in data")
        # Validate size/color columns if bubble mode
        # NOTE: Skip validation for columns created during processing
        # (e.g., 'unique_ko_count' created by group_and_count step)
        mode = scatter_config.get("mode", "simple")
        if mode == "bubble":
            size_col = scatter_config.get("size")
            color_col = scatter_config.get("color")
            # Check if column will be created during processing
            processing_steps = self.data_config.get("processing", {}).get("steps", [])
            result_columns = []
            for step in processing_steps:
                if step.get("name") == "group_and_count":
                    result_col = step.get("params", {}).get("result_column")
                    if result_col:
                        result_columns.append(result_col)
            # Only validate if column is not created by processing
            if (
                size_col
                and size_col not in df.columns
                and size_col not in result_columns
            ):
                raise ValueError(
                    f"Size column '{size_col}' not found in data and "
                    f"not created by processing steps"
                )
            if (
                color_col
                and color_col not in df.columns
                and color_col not in result_columns
            ):
                raise ValueError(
                    f"Color column '{color_col}' not found in data and "
                    f"not created by processing steps"
                )
        logger.info(f"Data validation passed: {len(df)} rows")
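In bubble mode the check tolerates size and color columns that do not exist yet, provided a `group_and_count` step declares them through `result_column`. The sketch below isolates that rule using plain lists of column names instead of a DataFrame. The function names and the `compound_class` column are hypothetical and are not part of the module.

```python
from typing import Any, Dict, List


def collect_result_columns(steps: List[Dict[str, Any]]) -> List[str]:
    """Return the columns that group_and_count steps promise to create."""
    result_columns = []
    for step in steps:
        if step.get("name") == "group_and_count":
            result_col = step.get("params", {}).get("result_column")
            if result_col:
                result_columns.append(result_col)
    return result_columns


def missing_bubble_columns(
    columns: List[str],
    scatter_config: Dict[str, Any],
    steps: List[Dict[str, Any]],
) -> List[str]:
    """List size/color columns neither present nor created by processing."""
    if scatter_config.get("mode", "simple") != "bubble":
        return []  # simple mode: size/color are not checked at all
    known = set(columns) | set(collect_result_columns(steps))
    missing = []
    for key in ("size", "color"):
        col = scatter_config.get(key)
        if col and col not in known:
            missing.append(col)
    return missing


steps = [{"name": "group_and_count", "params": {"result_column": "unique_ko_count"}}]
scatter = {
    "mode": "bubble",
    "x": "sample",
    "y": "pathway",
    "size": "unique_ko_count",   # created later by group_and_count
    "color": "compound_class",   # hypothetical column, absent from the data
}
print(missing_bubble_columns(["sample", "pathway"], scatter, steps))
```

As in the method above, this lookup does not consult a step's `enabled` flag, so a disabled `group_and_count` step still exempts its `result_column` from validation.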
process_data
process_data(df: DataFrame) -> pd.DataFrame
Process data for dot plot visualization.
Applies processing steps defined in configuration including filtering, grouping, aggregation, and sorting.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| df | DataFrame | Input data. | required |
Returns:
| Type | Description |
| --- | --- |
| DataFrame | Processed data ready for visualization. |
Source code in src/domain/plot_strategies/charts/dot_plot_strategy.py
    def process_data(self, df: pd.DataFrame) -> pd.DataFrame:
        """
        Process data for dot plot visualization.

        Applies processing steps defined in configuration including filtering,
        grouping, aggregation, and sorting.

        Parameters
        ----------
        df : pd.DataFrame
            Input data.

        Returns
        -------
        pd.DataFrame
            Processed data ready for visualization.
        """
        logger.debug(f"Processing data: {len(df)} rows")
        processed_df = df.copy()
        # Get processing steps from config
        processing_steps = self.data_config.get("processing", {}).get("steps", [])
        for step in processing_steps:
            step_name = step.get("name")
            enabled = step.get("enabled", True)
            if not enabled:
                logger.debug(f"Skipping disabled step: {step_name}")
                continue
            params = step.get("params", {})
            if step_name == "filter":
                processed_df = self._apply_filter_step(processed_df, params)
            elif step_name == "group_and_count":
                processed_df = self._apply_grouping_step(processed_df, params)
            elif step_name == "sort":
                processed_df = self._apply_sort_step(processed_df, params)
            else:
                logger.warning(f"Unknown processing step: {step_name}")
        logger.info(
            f"Data processing completed: {len(processed_df)} rows "
            f"(from {len(df)} original rows)"
        )
        return processed_df
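The step loop can be illustrated without pandas. The sketch below replaces the if/elif chain with a dictionary dispatch, which is a substitution for illustration rather than the module's approach, and runs on lists of dicts. The two handlers and their `params` keys are hypothetical, since the private `_apply_*_step` helpers are not shown on this page.

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger(__name__)
Rows = List[Dict[str, Any]]


def keep_positive(rows: Rows, params: Dict[str, Any]) -> Rows:
    """Hypothetical filter handler: keep rows where params['column'] > 0."""
    return [r for r in rows if r[params["column"]] > 0]


def sort_rows(rows: Rows, params: Dict[str, Any]) -> Rows:
    """Hypothetical sort handler: order rows by params['by']."""
    return sorted(rows, key=lambda r: r[params["by"]])


HANDLERS: Dict[str, Callable[[Rows, Dict[str, Any]], Rows]] = {
    "filter": keep_positive,
    "sort": sort_rows,
}


def run_steps(rows: Rows, steps: List[Dict[str, Any]]) -> Rows:
    """Apply enabled steps in order; skip disabled ones, warn on unknown names."""
    out = list(rows)  # work on a copy, as process_data does with df.copy()
    for step in steps:
        name = step.get("name")
        if not step.get("enabled", True):
            continue  # steps are enabled unless explicitly disabled
        handler = HANDLERS.get(name)
        if handler is None:
            logger.warning("Unknown processing step: %s", name)
            continue
        out = handler(out, step.get("params", {}))
    return out


rows = [{"n": 3}, {"n": -1}, {"n": 1}]
steps = [
    {"name": "filter", "params": {"column": "n"}},
    {"name": "sort", "enabled": False, "params": {"by": "n"}},
    {"name": "normalize"},  # unknown name: logged and ignored
]
print(run_steps(rows, steps))  # [{'n': 3}, {'n': 1}]
```

Because the sort step is disabled, the surviving rows keep their input order. An unknown step name only produces a warning, so a typo in a step name does not stop processing.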
create_figure
create_figure(processed_df: DataFrame) -> go.Figure
Create dot plot figure from processed data.
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| processed_df | DataFrame | Processed data ready for visualization. | required |
Returns:
| Type | Description |
| --- | --- |
| Figure | Plotly figure (scatter or bubble chart). |
Source code in src/domain/plot_strategies/charts/dot_plot_strategy.py
    def create_figure(self, processed_df: pd.DataFrame) -> go.Figure:
        """
        Create dot plot figure from processed data.

        Parameters
        ----------
        processed_df : pd.DataFrame
            Processed data ready for visualization.

        Returns
        -------
        go.Figure
            Plotly figure (scatter or bubble chart).
        """
        return self.generate(processed_df)
generate
generate(data: DataFrame) -> go.Figure
Generate dot plot (scatter or bubble chart).
Parameters:
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| data | DataFrame | Processed data ready for visualization. | required |
Returns:
| Type | Description |
| --- | --- |
| Figure | Plotly figure with scatter plot or bubble chart. |
Raises:
| Type | Description |
| --- | --- |
| ValueError | If data is empty or required configuration is missing. |
Source code in src/domain/plot_strategies/charts/dot_plot_strategy.py
    def generate(self, data: pd.DataFrame) -> go.Figure:
        """
        Generate dot plot (scatter or bubble chart).

        Parameters
        ----------
        data : pd.DataFrame
            Processed data ready for visualization.

        Returns
        -------
        go.Figure
            Plotly figure with scatter plot or bubble chart.

        Raises
        ------
        ValueError
            If data is empty or required configuration missing.
        """
        logger.info("Generating dot plot", extra={"rows": len(data)})
        # Validate data
        if data.empty:
            raise ValueError("Cannot create plot: DataFrame is empty")
        # Extract configuration
        scatter_config = self.plotly_config.get("scatter", {})
        layout_config = self.plotly_config.get("layout", {})
        # Get plot mode
        mode = scatter_config.get("mode", "simple")
        # Create figure based on mode
        if mode == "bubble":
            fig = self._create_bubble_chart(data, scatter_config)
        else:
            fig = self._create_simple_scatter(data, scatter_config)
        # Apply layout
        fig = self._apply_layout(fig, layout_config, scatter_config)
        logger.info(
            "Dot plot generated successfully", extra={"mode": mode, "points": len(data)}
        )
        return fig
apply_filters
apply_filters(df: DataFrame, filters: Optional[Dict[str, Any]] = None) -> pd.DataFrame
Apply filters to data.
This is a common implementation that can be overridden by subclasses if needed.
Parameters:
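| Name | Type | Description | Default |
| --- | --- | --- | --- |
| df | DataFrame | Data to filter. | required |
| filters | Optional[Dict[str, Any]] | Filter specifications. | None |
Returns:
| Type | Description |
| --- | --- |
| DataFrame | Filtered data. |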
Source code in src/domain/plot_strategies/base/base_plot_strategy.py
    def apply_filters(
        self, df: pd.DataFrame, filters: Optional[Dict[str, Any]] = None
    ) -> pd.DataFrame:
        """
        Apply filters to data.

        This is a common implementation that can be overridden
        by subclasses if needed.

        Parameters
        ----------
        df : pd.DataFrame
            Data to filter.
        filters : Optional[Dict[str, Any]], default=None
            Filter specifications.

        Returns
        -------
        pd.DataFrame
            Filtered data.
        """
        import logging

        logger = logging.getLogger(__name__)
        if not filters:
            logger.debug("No filters provided, returning original data")
            return df
        logger.info(
            f"Applying filters - Input shape: {df.shape}, "
            f"Columns: {df.columns.tolist()}"
        )
        logger.info(f"Filters to apply: {filters}")
        filtered_df = df.copy()
        # Get filter configurations
        filter_configs = self.config.get("filters", [])
        for filter_config in filter_configs:
            filter_id = filter_config.get("filter_id")
            filter_type = filter_config.get("type")
            if filter_id not in filters:
                continue
            filter_value = filters[filter_id]
            data_binding = filter_config.get("data_binding", {})
            column = data_binding.get("column")
            if not column or column not in filtered_df.columns:
                logger.warning(
                    f"Filter '{filter_id}': Column '{column}' not found. "
                    f"Available: {filtered_df.columns.tolist()}"
                )
                continue
            # Apply range filter
            if filter_type == "range" and isinstance(filter_value, list):
                min_val, max_val = filter_value
                logger.info(
                    f"Applying range filter on '{column}': [{min_val}, {max_val}]"
                )
                filtered_df = filtered_df[
                    (filtered_df[column] >= min_val) & (filtered_df[column] <= max_val)
                ]
                logger.info(f"After filter: {len(filtered_df)} rows remaining")
        logger.info(f"Final filtered shape: {filtered_df.shape}")
        return filtered_df
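To make the expected inputs concrete, here is a minimal pandas-free sketch of the range filter. It assumes a `filters` entry in the configuration with the `filter_id`, `type`, and `data_binding.column` keys that the method reads. The filter ID, the column name, and the function name are hypothetical, and lists of dicts stand in for the DataFrame.

```python
from typing import Any, Dict, List, Optional

Rows = List[Dict[str, Any]]

# Hypothetical filter definition, as it might appear under config["filters"].
filter_configs = [
    {
        "filter_id": "count_range",
        "type": "range",
        "data_binding": {"column": "unique_ko_count"},
    },
]


def apply_range_filters(rows: Rows, filters: Optional[Dict[str, Any]] = None) -> Rows:
    """Keep rows whose bound column lies in the inclusive [min, max] range."""
    if not filters:
        return rows  # nothing to apply: hand back the input unchanged
    out = list(rows)
    for cfg in filter_configs:
        filter_id = cfg.get("filter_id")
        if filter_id not in filters:
            continue  # no value supplied for this filter
        column = cfg.get("data_binding", {}).get("column")
        if not column or not all(column in r for r in out):
            continue  # mirrors the "column not found" warning path
        value = filters[filter_id]
        if cfg.get("type") == "range" and isinstance(value, list):
            min_val, max_val = value
            out = [r for r in out if min_val <= r[column] <= max_val]
    return out


rows = [{"unique_ko_count": c} for c in (1, 5, 10, 20)]
print(apply_range_filters(rows, {"count_range": [5, 10]}))
```

Both bounds are inclusive. As in the method above, only `range` filters whose value is a `list` are applied, so a tuple such as `(5, 10)` passes through without filtering.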
apply_customizations
apply_customizations(fig: Figure, customizations: Optional[Any] = None) -> go.Figure
Apply custom styling to figure.
This is a hook for future customization features (FLEXIVEL and FLEXIVEL2).
Parameters:
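| Name | Type | Description | Default |
| --- | --- | --- | --- |
| fig | Figure | Base figure. | required |
| customizations | Optional[Any] | Customization specifications. | None |
Returns:
| Type | Description |
| --- | --- |
| Figure | Customized figure. |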
Source code in src/domain/plot_strategies/base/base_plot_strategy.py
    def apply_customizations(
        self, fig: go.Figure, customizations: Optional[Any] = None
    ) -> go.Figure:
        """
        Apply custom styling to figure.

        This is a hook for future customization features
        (FLEXIVEL and FLEXIVEL2).

        Parameters
        ----------
        fig : go.Figure
            Base figure.
        customizations : Optional[Any], default=None
            Customization specifications.

        Returns
        -------
        go.Figure
            Customized figure.
        """
        # Hook for future implementation
        return fig
generate_plot
generate_plot(data: DataFrame, filters: Optional[Dict[str, Any]] = None, customizations: Optional[Any] = None) -> go.Figure
Generate complete plot (Template Method).
This method orchestrates the entire plot generation process:
1. Validate input data
2. Process data
3. Apply filters
4. Create figure
5. Apply customizations
Parameters:
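| Name | Type | Description | Default |
| --- | --- | --- | --- |
| data | DataFrame | Input data. | required |
| filters | Optional[Dict[str, Any]] | Filters to apply. | None |
| customizations | Optional[Any] | Customizations to apply. | None |
Returns:
| Type | Description |
| --- | --- |
| Figure | Complete Plotly figure. |
Raises:
| Type | Description |
| --- | --- |
| ValueError | If validation fails. |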
Source code in src/domain/plot_strategies/base/base_plot_strategy.py
    def generate_plot(
        self,
        data: pd.DataFrame,
        filters: Optional[Dict[str, Any]] = None,
        customizations: Optional[Any] = None,
    ) -> go.Figure:
        """
        Generate complete plot (Template Method).

        This method orchestrates the entire plot generation process:
        1. Validate input data
        2. Process data
        3. Apply filters
        4. Create figure
        5. Apply customizations

        Parameters
        ----------
        data : pd.DataFrame
            Input data.
        filters : Optional[Dict[str, Any]], default=None
            Filters to apply.
        customizations : Optional[Any], default=None
            Customizations to apply.

        Returns
        -------
        go.Figure
            Complete Plotly figure.

        Raises
        ------
        ValueError
            If validation fails.
        """
        # 1. Validate
        self.validate_data(data)
        # 2. Process
        processed_df = self.process_data(data)
        # 3. Filter
        filtered_df = self.apply_filters(processed_df, filters)
        # 4. Create figure
        figure = self.create_figure(filtered_df)
        # 5. Apply customizations (hook for future)
        figure = self.apply_customizations(figure, customizations)
        return figure
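The five-step order is the core of the Template Method pattern used here: the base class fixes the sequence and subclasses supply the individual steps. The toy classes below are stand-ins written for illustration, not the project's `BasePlotStrategy`. They use lists and strings in place of DataFrames and figures and record the call order.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional


class MiniStrategy(ABC):
    """Toy stand-in for a plot strategy base class (hypothetical)."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    @abstractmethod
    def validate_data(self, data: List[int]) -> None: ...

    @abstractmethod
    def process_data(self, data: List[int]) -> List[int]: ...

    @abstractmethod
    def create_figure(self, data: List[int]) -> str: ...

    def apply_filters(
        self, data: List[int], filters: Optional[Dict[str, Any]] = None
    ) -> List[int]:
        self.calls.append("filter")
        if not filters:
            return data
        return [v for v in data if v >= filters["min"]]

    def apply_customizations(self, fig: str, customizations: Optional[Any] = None) -> str:
        self.calls.append("customize")
        return fig  # hook: returns the figure unchanged

    def generate_plot(self, data, filters=None, customizations=None) -> str:
        # The template method: the order of steps is fixed here.
        self.validate_data(data)
        processed = self.process_data(data)
        filtered = self.apply_filters(processed, filters)
        fig = self.create_figure(filtered)
        return self.apply_customizations(fig, customizations)


class ToyDotPlot(MiniStrategy):
    def validate_data(self, data: List[int]) -> None:
        self.calls.append("validate")
        if not data:
            raise ValueError("data is empty")

    def process_data(self, data: List[int]) -> List[int]:
        self.calls.append("process")
        return sorted(data)

    def create_figure(self, data: List[int]) -> str:
        self.calls.append("create")
        return f"figure with {len(data)} points"


toy = ToyDotPlot()
print(toy.generate_plot([3, 1, 2], filters={"min": 2}))  # figure with 2 points
print(toy.calls)  # ['validate', 'process', 'filter', 'create', 'customize']
```

Because filtering runs after processing, a filter in the real class has to reference a column that still exists once the processing steps have run.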