haam.haam_visualizations module

Visualization Module

Creates interpretable visualizations of the DML-LME results to reveal how different perceivers (human vs AI) utilize the perceptual cue space. These visualizations are designed to communicate complex statistical results to both technical and non-technical audiences.

Main Visualizations:

3D UMAP Projections: Shows how judgments vary across the high-dimensional embedding space, with PC arrows indicating directions of maximum variance
Framework Diagrams: Illustrates the mediation pathways from criterion through PCs to judgments
Coefficient Grids: Displays all 200 PC effects in a compact, interpretable format
Word Clouds: Reveals semantic content associated with each principal component

The visualizations implement design principles from the paper, using:

Red/blue color schemes for high/low PC values
Arrow overlays showing PC directions in UMAP space
Interactive Plotly figures for exploration
Validity coloring to indicate measurement quality

All visualizations support custom PC naming for domain-specific interpretation.

HAAM Visualization Module

Functions for creating interactive visualizations.

class haam.haam_visualizations.NumpyEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]

Bases: JSONEncoder

Custom JSON encoder that handles numpy types.

Methods

`default`(obj)	Implement this method in a subclass such that it returns a serializable object for `obj`, or calls the base implementation (to raise a `TypeError`).
`encode`(o)	Return a JSON string representation of a Python data structure.
`iterencode`(o[, _one_shot])	Encode the given object and yield each string representation as available.

default(obj)[source]

Implement this method in a subclass such that it returns a serializable object for obj, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
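
The docstring above is the generic contract inherited from JSONEncoder. For the stated purpose of this class (handling numpy types), an override might look like the sketch below. It is not the library's source: the class name NumpyEncoderSketch and the sample payload are hypothetical, and the real NumpyEncoder may convert a different set of types.

```python
import json

import numpy as np


class NumpyEncoderSketch(json.JSONEncoder):
    """Hypothetical stand-in for NumpyEncoder; the real class may differ."""

    def default(self, obj):
        # numpy scalars (np.int64, np.float32, np.bool_, ...) convert via item()
        if isinstance(obj, np.generic):
            return obj.item()
        # arrays become (possibly nested) Python lists
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        # anything else: let the base class raise TypeError
        return super().default(obj)


# Placeholder payload, not real HAAM output
payload = {"r2": np.float32(0.5), "selected_pcs": np.array([0, 3, 6])}
encoded = json.dumps(payload, cls=NumpyEncoderSketch)
```

Passing the encoder through the cls argument of json.dumps or json.dump is the standard way to apply a JSONEncoder subclass.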

class haam.haam_visualizations.HAAMVisualizer(haam_results: Dict, topic_summaries: Dict | None = None)[source]

Bases: object

Create visualizations for HAAM analysis results.

Methods

`create_3d_pca_with_arrows`(pca_features, ...)	Create 3D PCA visualization with directional arrows showing PC gradients.
`create_3d_umap_with_pc_arrows`(...[, ...])	Create 3D UMAP visualization with PC directional arrows.
`create_all_pc_umap_visualizations`(...[, ...])	Create UMAP visualizations for multiple PCs.
`create_main_visualization`(pc_indices[, ...])	Create main HAAM framework visualization with dynamic metrics.
`create_metrics_summary`([output_file])	Create a comprehensive summary of all HAAM metrics.
`create_mini_visualization`([n_components, ...])	Create mini grid visualization of all PCs.
`create_pc_effects_plot`(pc_indices[, output_file])	Create bar chart showing PC effects.
`create_pc_umap_with_topics`(pc_idx, ...[, ...])	Create UMAP visualization colored by PC scores with topic labels.
`create_umap_visualization`(umap_embeddings[, ...])	Create interactive UMAP visualization.
`plot_pc_effects`(pc_idx, topic_associations)	Create 4-panel bar chart showing PC effects on outcomes.

__init__(haam_results: Dict, topic_summaries: Dict | None = None)[source]

Initialize visualizer.

Parameters:

haam_results (Dict) – Results from HAAMAnalysis
topic_summaries (Dict, optional) – Topic summaries from TopicAnalyzer

create_main_visualization(pc_indices: List[int], output_file: str | None = None, pc_names: Dict[int, str] | None = None, ranking_method: str = 'HU') → str[source]

Create main HAAM framework visualization with dynamic metrics.

The visualization now shows:

- Generic “X” label instead of “SC” for criterion
- Dynamically calculated R², PoMA, and unmodeled path percentages
- Custom PC names when provided (shows “-” otherwise)
- Enhanced topic display using c-TF-IDF

Parameters:

pc_indices (List[int]) – List of PC indices to display (0-based)
output_file (str, optional) – Path to save HTML file
pc_names (Dict[int, str], optional) – Manual names for PCs. Keys are PC indices (0-based), values are names. If not provided, uses “-” for all PCs. Example: {0: "Formality", 3: "Complexity", 6: "Sentiment"}
ranking_method (str, default='HU') – Method used to rank PCs: ‘HU’, ‘AI’, ‘X’, or ‘triple’

Returns:

HTML content

Return type:

str
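
The pc_names fallback described above can be illustrated with a small sketch. The helper resolve_pc_names is hypothetical and not part of the documented API; it only shows how displayed indices without a manual name would end up with the “-” label.

```python
from typing import Dict, List, Optional


def resolve_pc_names(pc_indices: List[int],
                     pc_names: Optional[Dict[int, str]] = None) -> Dict[int, str]:
    """Hypothetical helper: label each displayed PC (0-based), or use "-"."""
    names = pc_names or {}
    return {idx: names.get(idx, "-") for idx in pc_indices}


# PC 5 has no manual name, so it falls back to "-";
# the name for PC 6 is unused because PC 6 is not displayed.
labels = resolve_pc_names([0, 3, 5], {0: "Formality", 3: "Complexity", 6: "Sentiment"})
```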

create_mini_visualization(n_components: int = 200, n_highlight: int = 20, output_file: str | None = None) → str[source]

Create mini grid visualization of all PCs.

Parameters:

n_components (int) – Total number of components to show
n_highlight (int) – Number of top components to highlight
output_file (str, optional) – Path to save HTML file

Returns:

HTML content

Return type:

str
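
How the top n_highlight components are chosen is not stated here. One plausible reading, sketched below purely as an assumption, is to rank components by absolute effect size. The helper top_component_indices and the synthetic coefficients are hypothetical.

```python
import numpy as np


def top_component_indices(coefficients: np.ndarray, n_highlight: int = 20) -> np.ndarray:
    """Hypothetical helper: 0-based indices of the n_highlight largest |coefficients|."""
    # Negate so that argsort (ascending) yields largest magnitudes first
    order = np.argsort(-np.abs(coefficients), kind="stable")
    return order[:n_highlight]


rng = np.random.default_rng(0)
coefs = rng.normal(size=200)  # synthetic placeholder values, not real results
highlighted = top_component_indices(coefs, n_highlight=20)
```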

plot_pc_effects(pc_idx: int, topic_associations: Dict, figsize: Tuple[int, int] = (15, 6)) → Figure[source]

Create 4-panel bar chart showing PC effects on outcomes.

Parameters:

pc_idx (int) – PC index (0-based)
topic_associations (Dict) – Topic associations from TopicAnalyzer
figsize (Tuple[int, int]) – Figure size

Returns:

Matplotlib figure

Return type:

plt.Figure

create_umap_visualization(umap_embeddings: ndarray, color_by: str = 'X', topic_labels: Dict | None = None, show_topics: bool = True, output_file: str | None = None) → Figure[source]

Create interactive UMAP visualization.

Parameters:

umap_embeddings (np.ndarray) – 2D or 3D UMAP embeddings
color_by (str) – Variable to color by: ‘X’, ‘AI’, ‘HU’, or ‘PC1’, ‘PC2’, etc.
topic_labels (Dict, optional) – Topic labels for points
show_topics (bool) – Whether to show topic labels
output_file (str, optional) – Path to save HTML file

Returns:

Plotly figure

Return type:

go.Figure
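
The color_by values mix outcome names with PC labels. The sketch below shows one way such a string could be interpreted. The helper parse_color_by is hypothetical, and mapping the label 'PC1' to 0-based index 0 is an assumption, chosen because the other methods in this class take 0-based PC indices.

```python
import re
from typing import Optional, Tuple


def parse_color_by(color_by: str) -> Tuple[str, Optional[int]]:
    """Hypothetical helper: classify a color_by value.

    Returns ("outcome", None) for 'X', 'AI' or 'HU', and
    ("pc", zero_based_index) for labels such as 'PC1'.
    """
    if color_by in {"X", "AI", "HU"}:
        return ("outcome", None)
    match = re.fullmatch(r"PC([1-9]\d*)", color_by)
    if match:
        # Assumption: 'PC1' refers to the first component, i.e. index 0
        return ("pc", int(match.group(1)) - 1)
    raise ValueError(f"Unsupported color_by value: {color_by!r}")
```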

create_pc_umap_with_topics(pc_idx: int, pc_scores: ndarray, umap_embeddings: ndarray, cluster_labels: ndarray, topic_keywords: Dict[int, str], pc_associations: Dict[int, List[Dict]], output_file: str | None = None, show_top_n: int = 5, show_bottom_n: int = 5, display: bool = True) → Figure[source]

Create UMAP visualization colored by PC scores with topic labels.

Parameters:

pc_idx (int) – PC index (0-based)
pc_scores (np.ndarray) – PC scores for all samples
umap_embeddings (np.ndarray) – 3D UMAP embeddings
cluster_labels (np.ndarray) – Cluster assignments for each point
topic_keywords (Dict[int, str]) – Topic ID to keyword mapping
pc_associations (Dict[int, List[Dict]]) – PC-topic associations from TopicAnalyzer
output_file (str, optional) – Path to save HTML file
show_top_n (int) – Number of high-scoring topics to label
show_bottom_n (int) – Number of low-scoring topics to label
display (bool) – Whether to display in notebook/colab

Returns:

Plotly figure object

Return type:

go.Figure

create_all_pc_umap_visualizations(pc_indices: List[int], pc_scores_all: ndarray, umap_embeddings: ndarray, cluster_labels: ndarray, topic_keywords: Dict[int, str], pc_associations: Dict[int, List[Dict]], output_dir: str, show_top_n: int = 5, show_bottom_n: int = 5, display: bool = False) → Dict[int, str][source]

Create UMAP visualizations for multiple PCs.

Parameters:

pc_indices (List[int]) – List of PC indices to visualize
pc_scores_all (np.ndarray) – All PC scores (n_samples x n_components)
umap_embeddings (np.ndarray) – 3D UMAP embeddings
cluster_labels (np.ndarray) – Cluster assignments
topic_keywords (Dict[int, str]) – Topic keywords
pc_associations (Dict[int, List[Dict]]) – PC-topic associations
output_dir (str) – Directory to save visualizations
show_top_n (int) – Number of high topics to show
show_bottom_n (int) – Number of low topics to show
display (bool) – Whether to display each plot

Returns:

Mapping of PC index to output file path

Return type:

Dict[int, str]

create_pc_effects_plot(pc_indices: List[int], output_file: str | None = None) → Figure[source]

Create bar chart showing PC effects.

Parameters:

pc_indices (List[int]) – List of PC indices to plot
output_file (str, optional) – Path to save HTML file

Returns:

Plotly figure

Return type:

go.Figure

create_metrics_summary(output_file: str | None = None) → Dict[str, Any][source]

Create a comprehensive summary of all HAAM metrics.

This method exports:

- Model performance metrics (R² values for X, AI, HU)
- Policy similarities between predictions
- Mediation analysis results (PoMA percentages)
- Feature selection statistics
- Compatible with the new generic “X” labeling

Parameters:

output_file (str, optional) – Path to save JSON file with metrics

Returns:

Dictionary containing all metrics, including:

- model_performance: R² values for each model
- policy_similarities: Correlations between predictions
- mediation_analysis: PoMA and effect decomposition
- feature_selection: Number and indices of selected PCs

Return type:

Dict[str, Any]
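
To make the shape of the summary concrete, the sketch below writes and reloads a placeholder dictionary with the four top-level keys named above. Only those four keys come from this documentation. The nested keys, the None and empty values, and the file name haam_metrics.json are illustrative assumptions, not actual output.

```python
import json
import os
import tempfile

# Placeholder structure: top-level keys follow the description above;
# everything nested inside them is an assumption for illustration.
metrics = {
    "model_performance": {"X": None, "AI": None, "HU": None},
    "policy_similarities": {},
    "mediation_analysis": {},
    "feature_selection": {"n_selected": 0, "selected_indices": []},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "haam_metrics.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(metrics, fh, indent=2)
    with open(path, encoding="utf-8") as fh:
        loaded = json.load(fh)

top_level_keys = sorted(loaded)
```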

create_3d_umap_with_pc_arrows(umap_embeddings: ndarray, cluster_labels: ndarray, topic_keywords: Dict[int, str], pc_scores_all: ndarray, pc_indices: int | List[int] | None = None, top_k: int = 1, percentile_threshold: float = 90.0, arrow_mode: str = 'all', color_by_usage: bool = True, color_mode: str = 'legacy', criterion: ndarray | None = None, human_judgment: ndarray | None = None, ai_judgment: ndarray | None = None, show_topic_labels: bool | int = 10, output_file: str | None = None, display: bool = True) → Figure[source]

Create 3D UMAP visualization with PC directional arrows.

This method creates a 3D UMAP space where:

- Topics are positioned based on their UMAP embeddings
- Arrows show PC directions from low to high scoring topics
- Arrow endpoints are averages of top-k and bottom-k topic positions
- Topics are colored by HU/AI usage patterns
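
The arrow construction described above can be sketched with numpy. This is an illustration under stated assumptions, not the library's implementation: the helper pc_arrow_endpoints is hypothetical, topics are ranked by their mean PC score, topic positions are taken as cluster centroids, and percentile_threshold is left out.

```python
import numpy as np


def pc_arrow_endpoints(umap_embeddings, cluster_labels, pc_scores, top_k=1):
    """Hypothetical sketch: arrow from low-scoring to high-scoring topics.

    Each endpoint is the average centroid of the bottom-k or top-k topics,
    with topics ranked by mean PC score (an assumption).
    """
    topic_ids = np.unique(cluster_labels)
    centroids = np.array([umap_embeddings[cluster_labels == t].mean(axis=0)
                          for t in topic_ids])
    mean_scores = np.array([pc_scores[cluster_labels == t].mean()
                            for t in topic_ids])
    order = np.argsort(mean_scores, kind="stable")  # ascending by mean score
    start = centroids[order[:top_k]].mean(axis=0)   # low-scoring end
    end = centroids[order[-top_k:]].mean(axis=0)    # high-scoring end
    return start, end


# Toy data: three topics with two points each (synthetic, for illustration)
embeddings = np.array([[0, 0, 0], [0, 0, 2],
                       [1, 1, 1], [1, 1, 1],
                       [4, 0, 0], [4, 0, 2]], dtype=float)
labels = np.array([0, 0, 1, 1, 2, 2])
scores = np.array([-1.0, -3.0, 0.0, 0.2, 2.0, 4.0])
arrow_start, arrow_end = pc_arrow_endpoints(embeddings, labels, scores, top_k=1)
```

With top_k=1 the arrow runs from the centroid of the single lowest-scoring topic to that of the single highest-scoring one, which matches the note that the default gives cleaner arrows.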

Parameters:

umap_embeddings (np.ndarray) – 3D UMAP embeddings (n_samples x 3)
cluster_labels (np.ndarray) – Cluster assignments for each point
topic_keywords (Dict[int, str]) – Topic ID to keyword mapping
pc_scores_all (np.ndarray) – PC scores for all samples (n_samples x n_components)
pc_indices (int or List[int], optional) – PC indices to show arrows for. If None and arrow_mode='all', shows first 3
top_k (int, default=1) – Number of top/bottom topics to average for arrow endpoints (default=1 for cleaner arrows)
percentile_threshold (float, default=90.0) – Percentile threshold for determining top/bottom topics
arrow_mode (str, default='all') – Arrow display mode: ‘single’, ‘list’, or ‘all’
color_by_usage (bool, default=True) – Whether to color topics by HU/AI usage patterns
color_mode (str, default='legacy') – Coloring mode when color_by_usage=True:
- ‘legacy’: Use PC coefficient-based inference (original behavior)
- ‘validity’: Use direct X/HU/AI measurement (consistent with word clouds)
criterion (np.ndarray, optional) – Ground truth values (X) for validity coloring mode
human_judgment (np.ndarray, optional) – Human judgment values (HU) for validity coloring mode
ai_judgment (np.ndarray, optional) – AI judgment values (AI) for validity coloring mode
show_topic_labels (bool or int, default=10) –
- If True: Show all topic labels
- If False: Hide all topic labels (hover still works)
- If int: Show only the N topics closest to camera (dynamic)
output_file (str, optional) – Path to save HTML file
display (bool, default=True) – Whether to display in notebook/colab

Returns:

Plotly 3D figure object

Return type:

go.Figure
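
Because show_topic_labels accepts either a bool or an int, any code that dispatches on it has to test for bool first: in Python, bool is a subclass of int. The sketch below shows that dispatch. The helper select_labeled_topics and its distances_to_camera argument are hypothetical, and the real method may compute the camera-dependent part differently (for example in the rendered figure rather than in Python).

```python
from typing import List, Sequence, Union


def select_labeled_topics(topic_ids: Sequence[int],
                          distances_to_camera: Sequence[float],
                          show_topic_labels: Union[bool, int] = 10) -> List[int]:
    """Hypothetical helper: which topic ids get a visible label."""
    # Check bool before int: isinstance(True, int) is True in Python
    if isinstance(show_topic_labels, bool):
        return list(topic_ids) if show_topic_labels else []
    # Integer case: keep the N topics with the smallest camera distance
    nearest = sorted(zip(distances_to_camera, topic_ids))[:show_topic_labels]
    return [topic_id for _, topic_id in nearest]
```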

create_3d_pca_with_arrows(pca_features: ndarray, cluster_labels: ndarray, topic_keywords: Dict[int, str], pc_indices: int | List[int] | None = None, arrow_mode: str = 'all', color_by_usage: bool = True, output_file: str | None = None, display: bool = True) → Figure[source]

Create 3D PCA visualization with directional arrows showing PC gradients.

This method creates a 3D scatter plot of the first 3 PCs with:

- Topic clusters floating in 3D space
- Directional arrows showing high->low gradients for specified PCs
- Color coding based on HU/AI usage patterns
- Interactive tooltips with topic information

Parameters:

pca_features (np.ndarray) – PCA-transformed features (n_samples x n_components)
cluster_labels (np.ndarray) – Cluster assignments for each point
topic_keywords (Dict[int, str]) – Topic ID to keyword mapping
pc_indices (int or List[int], optional) – PC indices to show arrows for. If None and arrow_mode='all', shows first 3
arrow_mode (str, default='all') – Arrow display mode: ‘single’, ‘list’, or ‘all’
color_by_usage (bool, default=True) – Whether to color topics by HU/AI usage patterns
output_file (str, optional) – Path to save HTML file
display (bool, default=True) – Whether to display in notebook/colab

Returns:

Plotly 3D figure object

Return type:

go.Figure