Box Scatter Strategy¶
box_scatter_strategy ¶
Box Scatter Strategy.
Strategy for creating combined box plot with jittered scatter plot visualizations using Plotly. Ideal for showing statistical distributions while preserving visibility of individual data points.
Classes:
| Name | Description |
|---|---|
| BoxScatterStrategy | Strategy combining a box plot with a jittered scatter overlay |
Notes
This strategy is particularly useful for:
- Comparing distributions across categories
- Identifying outliers while seeing all data points
- Visualizing statistical summaries with a raw data overlay
For supported use cases, refer to the official documentation.
Classes¶
BoxScatterStrategy ¶
Bases: BasePlotStrategy
Strategy for creating box plot with jittered scatter overlay.
This strategy combines two visualization types:
1. Box plot: shows the statistical distribution (median, IQR, outliers)
2. Scatter plot: shows individual data points with horizontal jitter
The combination provides both statistical summary and granular detail, making it ideal for distribution analysis with moderate sample sizes.
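As a rough illustration of the box-plus-jitter combination described above, the sketch below builds a Plotly-style figure as a plain dictionary using only the standard library. The function name `box_scatter_figure`, the choice to place the box at x = 0, and the uniform jitter are assumptions made for this example, not the strategy's actual implementation.

```python
import random
from typing import Any, Dict, Sequence


def box_scatter_figure(values: Sequence[float], jitter: float = 0.01,
                       seed: int = 0) -> Dict[str, Any]:
    """Build a figure dict with one box trace and one jittered scatter trace."""
    rng = random.Random(seed)
    # Horizontal jitter: spread the points around x = 0 so they overlap less.
    xs = [rng.uniform(-jitter, jitter) for _ in values]
    box = {
        "type": "box",
        "x": [0] * len(values),   # every value belongs to the same box
        "y": list(values),
        "marker": {"color": "#198754"},
        "boxpoints": False,       # points are drawn by the scatter trace instead
    }
    scatter = {
        "type": "scatter",
        "x": xs,
        "y": list(values),
        "mode": "markers",
        "marker": {"color": "rgba(0,0,0,0.5)", "size": 8},
    }
    layout = {"yaxis": {"title": {"text": "Unique KO Count"}}, "showlegend": False}
    return {"data": [box, scatter], "layout": layout}


fig = box_scatter_figure([12, 15, 9, 22, 17])
```

A dictionary of this shape can be passed to `plotly.graph_objects.Figure(fig)` when Plotly is installed.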
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | Dict[str, Any] | Complete configuration dictionary from YAML (must contain a 'visualization' section with plot parameters) | required |
Attributes:
| Name | Type | Description |
|---|---|---|
| config | Dict[str, Any] | Stored configuration dictionary |
Notes
Configuration Structure (YAML):

    visualization:
      strategy: "BoxScatterStrategy"
      plotly:
        box:
          y: "unique_ko_count"
          marker:
            color: "#198754"
        scatter:
          y: "unique_ko_count"
          mode: "markers"
          marker:
            color: "rgba(0,0,0,0.5)"
            size: 8
          jitter: 0.01
          hovertemplate: "%{customdata[0]}<br>..."
          customdata_columns: ["sample", "rank"]
        layout:
          yaxis:
            title: "Unique KO Count"

Refer to the official documentation for supported use cases and detailed configuration examples.
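To make the configuration layout more concrete, here is a minimal sketch of the same settings as a Python dictionary, with a small helper for reading nested keys such as `plotly.scatter.y`. The exact nesting (for example, `jitter` sitting directly under `scatter`) and the helper `get_nested` are assumptions for illustration; the hover template is omitted because it is elided in the source.

```python
from typing import Any, Dict

# Assumed nesting of the YAML configuration, expressed as a plain dict.
config: Dict[str, Any] = {
    "visualization": {
        "strategy": "BoxScatterStrategy",
        "plotly": {
            "box": {"y": "unique_ko_count", "marker": {"color": "#198754"}},
            "scatter": {
                "y": "unique_ko_count",
                "mode": "markers",
                "marker": {"color": "rgba(0,0,0,0.5)", "size": 8},
                "jitter": 0.01,
                "customdata_columns": ["sample", "rank"],
            },
            "layout": {"yaxis": {"title": "Unique KO Count"}},
        },
    }
}


def get_nested(cfg: Dict[str, Any], path: str, default: Any = None) -> Any:
    """Hypothetical helper: walk a dotted path, returning default if a key is missing."""
    node: Any = cfg
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


y_column = get_nested(config, "visualization.plotly.scatter.y")
jitter = get_nested(config, "visualization.plotly.scatter.jitter", 0.0)
```

Reading values through one helper with a default keeps a missing optional key (such as `jitter`) from raising a `KeyError` deep inside the plotting code.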
Initialize BoxScatterStrategy with configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | Dict[str, Any] | Complete configuration from YAML file | required |
Source code in src/domain/plot_strategies/charts/box_scatter_strategy.py
Functions¶
validate_data ¶
Validate input data for box scatter plot requirements.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | Input data to validate (already aggregated by the callback) | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If validation fails |
Notes
Expected columns in aggregated data:
- 'sample': Sample identifier
- 'unique_ko_count': Aggregated count of unique KOs
- 'rank': Ranking within database
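The checks implied by the expected columns could look roughly like the following. This is a sketch under assumptions: the function name `check_aggregated_frame`, the error messages, and the numeric-type check are illustrative and may differ from what `validate_data` actually does.

```python
import pandas as pd

REQUIRED_COLUMNS = ("sample", "unique_ko_count", "rank")


def check_aggregated_frame(df: pd.DataFrame) -> None:
    """Raise ValueError if the aggregated frame cannot be plotted."""
    if df.empty:
        raise ValueError("Input data is empty")
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")
    # Assumed extra check: the plotted column must hold numbers.
    if not pd.api.types.is_numeric_dtype(df["unique_ko_count"]):
        raise ValueError("'unique_ko_count' must be numeric")


good = pd.DataFrame({"sample": ["S1"], "unique_ko_count": [3], "rank": [1]})
check_aggregated_frame(good)  # passes silently
```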
Source code in src/domain/plot_strategies/charts/box_scatter_strategy.py
process_data ¶
Process data for box scatter visualization.
For BoxScatterStrategy, the data is expected to be already aggregated by the callback (grouped by sample, with unique KO counts and ranks). This method performs minimal processing: it only returns a clean copy of the input.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | Input data (already aggregated by the callback). Expected columns: ['sample', 'unique_ko_count', 'rank'] | required |
Returns:
| Type | Description |
|---|---|
| DataFrame | Processed data ready for visualization (unchanged copy) |
Notes
The aggregation pipeline is handled in the callback:
1. Extract raw data from the store (Sample, KO columns)
2. Clean data (remove empty values and duplicates)
3. Aggregate: groupby('sample')['ko'].nunique()
4. Calculate rank
5. Sort by unique_ko_count descending

The strategy receives the final aggregated data.
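The callback-side pipeline listed above might be sketched with pandas as follows. The function name `aggregate_ko_counts`, the column renaming, and the ranking rule (rank 1 for the highest count, ties sharing the lowest rank) are assumptions; the source does not specify how rank is computed.

```python
import pandas as pd


def aggregate_ko_counts(raw: pd.DataFrame) -> pd.DataFrame:
    """Turn raw (Sample, KO) rows into one row per sample."""
    # Steps 1-2: keep the two relevant columns, drop missing and empty values.
    clean = (
        raw[["Sample", "KO"]]
        .rename(columns={"Sample": "sample", "KO": "ko"})
        .dropna()
    )
    clean = clean[(clean["sample"] != "") & (clean["ko"] != "")].drop_duplicates()
    # Step 3: count distinct KOs per sample.
    counts = (
        clean.groupby("sample")["ko"].nunique().reset_index(name="unique_ko_count")
    )
    # Step 4: rank samples (assumed rule: 1 = most unique KOs, ties share a rank).
    counts["rank"] = (
        counts["unique_ko_count"].rank(method="min", ascending=False).astype(int)
    )
    # Step 5: highest counts first.
    return counts.sort_values("unique_ko_count", ascending=False).reset_index(drop=True)


raw = pd.DataFrame({
    "Sample": ["S1", "S1", "S1", "S2", "S2", None, "S3"],
    "KO": ["K1", "K2", "K1", "K1", "", "K9", "K3"],
})
aggregated = aggregate_ko_counts(raw)
```

The resulting frame has the three columns the strategy expects: `sample`, `unique_ko_count`, and `rank`.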
Source code in src/domain/plot_strategies/charts/box_scatter_strategy.py
create_figure ¶
Create box scatter figure from processed data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | Processed data ready for visualization | required |
Returns:
| Type | Description |
|---|---|
| Figure | Plotly figure with box plot and scatter overlay |
Source code in src/domain/plot_strategies/charts/box_scatter_strategy.py
generate ¶
Generate box plot with jittered scatter overlay.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | Processed data ready for visualization (must contain the column specified in plotly.scatter.y) | required |
Returns:
| Type | Description |
|---|---|
| Figure | Plotly figure with box plot and scatter overlay |
Raises:
| Type | Description |
|---|---|
| ValueError | If the data is empty or required columns are missing |
Source code in src/domain/plot_strategies/charts/box_scatter_strategy.py
apply_filters ¶
Apply filters to data.
This is a common implementation that can be overridden by subclasses if needed.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | Data to filter. | required |
| filters | Optional[Dict[str, Any]] | Filter specifications. | None |
Returns:
| Type | Description |
|---|---|
| DataFrame | Filtered data. |
Source code in src/domain/plot_strategies/base/base_plot_strategy.py
apply_customizations ¶
Apply custom styling to figure.
This is a hook for future customization features (FLEXIVEL and FLEXIVEL2).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fig | Figure | Base figure. | required |
| customizations | Optional[Any] | Customization specifications. | None |
Returns:
| Type | Description |
|---|---|
| Figure | Customized figure. |
Source code in src/domain/plot_strategies/base/base_plot_strategy.py
generate_plot ¶
generate_plot(data: DataFrame, filters: Optional[Dict[str, Any]] = None, customizations: Optional[Any] = None) -> go.Figure
Generate complete plot (Template Method).
This method orchestrates the entire plot generation process:
1. Validate input data
2. Process data
3. Apply filters
4. Create figure
5. Apply customizations
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame | Input data. | required |
| filters | Optional[Dict[str, Any]] | Filters to apply. | None |
| customizations | Optional[Any] | Customizations to apply. | None |
Returns:
| Type | Description |
|---|---|
| Figure | Complete Plotly figure. |
Raises:
| Type | Description |
|---|---|
| ValueError | If validation fails. |
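The five-step orchestration performed by generate_plot follows the Template Method pattern, which can be sketched as below. Everything here is illustrative: `PlotStrategySketch` and `MiniBoxScatter` are hypothetical classes, a list of dicts stands in for the DataFrame, a plain dict stands in for the Plotly figure, and the equality-based filter format is an assumption, since the source does not describe the filter specification.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional

Rows = List[Dict[str, Any]]  # stand-in for a DataFrame in this sketch


class PlotStrategySketch(ABC):
    def __init__(self, config: Dict[str, Any]) -> None:
        self.config = config

    @abstractmethod
    def validate_data(self, data: Rows) -> None: ...

    @abstractmethod
    def process_data(self, data: Rows) -> Rows: ...

    @abstractmethod
    def create_figure(self, data: Rows) -> Dict[str, Any]: ...

    def apply_filters(self, data: Rows, filters: Optional[Dict[str, Any]] = None) -> Rows:
        # Assumed filter format: column name -> required value.
        if not filters:
            return data
        return [r for r in data if all(r.get(k) == v for k, v in filters.items())]

    def apply_customizations(self, fig: Dict[str, Any],
                             customizations: Optional[Any] = None) -> Dict[str, Any]:
        return fig  # hook: no-op by default

    def generate_plot(self, data: Rows, filters: Optional[Dict[str, Any]] = None,
                      customizations: Optional[Any] = None) -> Dict[str, Any]:
        self.validate_data(data)                               # 1. validate
        processed = self.process_data(data)                    # 2. process
        filtered = self.apply_filters(processed, filters)      # 3. filter
        fig = self.create_figure(filtered)                     # 4. create figure
        return self.apply_customizations(fig, customizations)  # 5. customize


class MiniBoxScatter(PlotStrategySketch):
    def validate_data(self, data: Rows) -> None:
        if not data:
            raise ValueError("data is empty")

    def process_data(self, data: Rows) -> Rows:
        return [dict(row) for row in data]  # clean copy, no other processing

    def create_figure(self, data: Rows) -> Dict[str, Any]:
        ys = [row["unique_ko_count"] for row in data]
        return {"data": [{"type": "box", "y": ys}], "layout": {}}


rows = [{"sample": "S1", "unique_ko_count": 2}, {"sample": "S2", "unique_ko_count": 1}]
fig = MiniBoxScatter(config={}).generate_plot(rows, filters={"sample": "S1"})
```

Keeping the step order in the base class means a concrete strategy only supplies validation, processing, and figure creation, while filtering and customization remain overridable hooks.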