haam.haam_visualizations module

Visualization Module

Creates interpretable visualizations of the DML-LME results to reveal how different perceivers (human vs AI) utilize the perceptual cue space. These visualizations are designed to communicate complex statistical results to both technical and non-technical audiences.

Main Visualizations:

  • 3D UMAP Projections: Show how judgments vary across the high-dimensional embedding space, with PC arrows indicating directions of maximum variance

  • Framework Diagrams: Illustrate the mediation pathways from the criterion through PCs to judgments

  • Coefficient Grids: Display all 200 PC effects in a compact, interpretable format

  • Word Clouds: Reveal the semantic content associated with each principal component

The visualizations implement design principles from the paper, using:

  • Red/blue color schemes for high/low PC values

  • Arrow overlays showing PC directions in UMAP space

  • Interactive Plotly figures for exploration

  • Validity coloring to indicate measurement quality

All visualizations support custom PC naming for domain-specific interpretation.

HAAM Visualization Module

Functions for creating interactive visualizations.

class haam.haam_visualizations.NumpyEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)

Bases: JSONEncoder

Custom JSON encoder that handles numpy types.

Methods

default(obj)

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

encode(o)

Return a JSON string representation of a Python data structure.

iterencode(o[, _one_shot])

Encode the given object and yield each string representation as available.

default(obj)

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
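For the numpy case specifically, a minimal sketch of the same pattern follows. It is not the library's NumpyEncoder: the class name ArrayLikeEncoder and the FakeArray stub are hypothetical, and the sketch assumes that numpy arrays and scalars can be recognized by their tolist() method, so that it runs without numpy installed.

```python
import json


class ArrayLikeEncoder(json.JSONEncoder):
    """Hypothetical stand-in for a numpy-aware encoder (not the library's code)."""

    def default(self, o):
        # numpy arrays and numpy scalars both expose tolist(), which returns
        # plain Python lists or numbers that json can serialize.
        to_list = getattr(o, "tolist", None)
        if callable(to_list):
            return to_list()
        # Anything else falls through to the base class, which raises TypeError.
        return super().default(o)


class FakeArray:
    """Tiny stub with the same tolist() contract, used only for this demo."""

    def __init__(self, values):
        self._values = list(values)

    def tolist(self):
        return list(self._values)


encoded = json.dumps({"values": FakeArray([1, 2, 3])}, cls=ArrayLikeEncoder)
print(encoded)
```

With numpy installed, passing an actual array in place of FakeArray should behave the same way, since the sketch only relies on tolist().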
class haam.haam_visualizations.HAAMVisualizer(haam_results: Dict, topic_summaries: Dict | None = None)

Bases: object

Create visualizations for HAAM analysis results.

Methods

create_3d_pca_with_arrows(pca_features, ...)

Create 3D PCA visualization with directional arrows showing PC gradients.

create_3d_umap_with_pc_arrows(...[, ...])

Create 3D UMAP visualization with PC directional arrows.

create_all_pc_umap_visualizations(...[, ...])

Create UMAP visualizations for multiple PCs.

create_main_visualization(pc_indices[, ...])

Create main HAAM framework visualization with dynamic metrics.

create_metrics_summary([output_file])

Create a comprehensive summary of all HAAM metrics.

create_mini_visualization([n_components, ...])

Create mini grid visualization of all PCs.

create_pc_effects_plot(pc_indices[, output_file])

Create bar chart showing PC effects.

create_pc_umap_with_topics(pc_idx, ...[, ...])

Create UMAP visualization colored by PC scores with topic labels.

create_umap_visualization(umap_embeddings[, ...])

Create interactive UMAP visualization.

plot_pc_effects(pc_idx, topic_associations)

Create 4-panel bar chart showing PC effects on outcomes.

__init__(haam_results: Dict, topic_summaries: Dict | None = None)

Initialize visualizer.

Parameters:
  • haam_results (Dict) – Results from HAAMAnalysis

  • topic_summaries (Dict, optional) – Topic summaries from TopicAnalyzer

create_main_visualization(pc_indices: List[int], output_file: str | None = None, pc_names: Dict[int, str] | None = None, ranking_method: str = 'HU') → str

Create main HAAM framework visualization with dynamic metrics.

The visualization now shows:

  • A generic “X” label instead of “SC” for the criterion

  • Dynamically calculated R², PoMA, and unmodeled path percentages

  • Custom PC names when provided (shows “-” otherwise)

  • Enhanced topic display using c-TF-IDF

Parameters:
  • pc_indices (List[int]) – List of PC indices to display (0-based)

  • output_file (str, optional) – Path to save HTML file

  • pc_names (Dict[int, str], optional) – Manual names for PCs. Keys are PC indices (0-based), values are names. If not provided, “-” is used for all PCs. Example: {0: "Formality", 3: "Complexity", 6: "Sentiment"}

  • ranking_method (str, default='HU') – Method used to rank PCs: ‘HU’, ‘AI’, ‘X’, or ‘triple’

Returns:

HTML content

Return type:

str
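The documented fallback for pc_names (a “-” label for any PC without a manual name) can be sketched as below. The helper resolve_pc_labels is hypothetical and only illustrates the described behavior; it is not taken from the library.

```python
from typing import Dict, List, Optional


def resolve_pc_labels(pc_indices: List[int],
                      pc_names: Optional[Dict[int, str]] = None) -> List[str]:
    """Return one display label per PC, using "-" where no name is given."""
    names = pc_names or {}
    return [names.get(idx, "-") for idx in pc_indices]


# Keys are 0-based PC indices, as in the pc_names parameter.
labels = resolve_pc_labels([0, 3, 5], {0: "Formality", 3: "Complexity"})
print(labels)
```

Passing no mapping at all yields “-” for every requested PC.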

create_mini_visualization(n_components: int = 200, n_highlight: int = 20, output_file: str | None = None) → str

Create mini grid visualization of all PCs.

Parameters:
  • n_components (int) – Total number of components to show

  • n_highlight (int) – Number of top components to highlight

  • output_file (str, optional) – Path to save HTML file

Returns:

HTML content

Return type:

str

plot_pc_effects(pc_idx: int, topic_associations: Dict, figsize: Tuple[int, int] = (15, 6)) → Figure

Create 4-panel bar chart showing PC effects on outcomes.

Parameters:
  • pc_idx (int) – PC index (0-based)

  • topic_associations (Dict) – Topic associations from TopicAnalyzer

  • figsize (Tuple[int, int]) – Figure size

Returns:

Matplotlib figure

Return type:

plt.Figure

create_umap_visualization(umap_embeddings: ndarray, color_by: str = 'X', topic_labels: Dict | None = None, show_topics: bool = True, output_file: str | None = None) → Figure

Create interactive UMAP visualization.

Parameters:
  • umap_embeddings (np.ndarray) – 2D or 3D UMAP embeddings

  • color_by (str) – Variable to color by: ‘X’, ‘AI’, ‘HU’, or ‘PC1’, ‘PC2’, etc.

  • topic_labels (Dict, optional) – Topic labels for points

  • show_topics (bool) – Whether to show topic labels

  • output_file (str, optional) – Path to save HTML file

Returns:

Plotly figure

Return type:

go.Figure
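The color_by values listed above fall into two groups: the named variables and the 'PCn' strings. A minimal sketch of telling them apart follows. The helper parse_color_by is hypothetical, and the mapping of 'PC1' to 0-based index 0 is an assumption; the source does not state how the library numbers these strings internally.

```python
import re
from typing import Optional, Tuple


def parse_color_by(color_by: str) -> Tuple[str, Optional[int]]:
    """Split a color_by value into (kind, pc_index); pc_index is None for X/AI/HU."""
    if color_by in ("X", "AI", "HU"):
        return color_by, None
    match = re.fullmatch(r"PC(\d+)", color_by)
    if match and int(match.group(1)) >= 1:
        # Assumption: 'PC1' names the first component, i.e. 0-based index 0.
        return "PC", int(match.group(1)) - 1
    raise ValueError(f"unsupported color_by value: {color_by!r}")


print(parse_color_by("HU"))
print(parse_color_by("PC2"))
```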

create_pc_umap_with_topics(pc_idx: int, pc_scores: ndarray, umap_embeddings: ndarray, cluster_labels: ndarray, topic_keywords: Dict[int, str], pc_associations: Dict[int, List[Dict]], output_file: str | None = None, show_top_n: int = 5, show_bottom_n: int = 5, display: bool = True) → Figure

Create UMAP visualization colored by PC scores with topic labels.

Parameters:
  • pc_idx (int) – PC index (0-based)

  • pc_scores (np.ndarray) – PC scores for all samples

  • umap_embeddings (np.ndarray) – 3D UMAP embeddings

  • cluster_labels (np.ndarray) – Cluster assignments for each point

  • topic_keywords (Dict[int, str]) – Topic ID to keyword mapping

  • pc_associations (Dict[int, List[Dict]]) – PC-topic associations from TopicAnalyzer

  • output_file (str, optional) – Path to save HTML file

  • show_top_n (int) – Number of high-scoring topics to label

  • show_bottom_n (int) – Number of low-scoring topics to label

  • display (bool) – Whether to display in notebook/colab

Returns:

Plotly figure object

Return type:

go.Figure

create_all_pc_umap_visualizations(pc_indices: List[int], pc_scores_all: ndarray, umap_embeddings: ndarray, cluster_labels: ndarray, topic_keywords: Dict[int, str], pc_associations: Dict[int, List[Dict]], output_dir: str, show_top_n: int = 5, show_bottom_n: int = 5, display: bool = False) → Dict[int, str]

Create UMAP visualizations for multiple PCs.

Parameters:
  • pc_indices (List[int]) – List of PC indices to visualize

  • pc_scores_all (np.ndarray) – All PC scores (n_samples x n_components)

  • umap_embeddings (np.ndarray) – 3D UMAP embeddings

  • cluster_labels (np.ndarray) – Cluster assignments

  • topic_keywords (Dict[int, str]) – Topic keywords

  • pc_associations (Dict[int, List[Dict]]) – PC-topic associations

  • output_dir (str) – Directory to save visualizations

  • show_top_n (int) – Number of high topics to show

  • show_bottom_n (int) – Number of low topics to show

  • display (bool) – Whether to display each plot

Returns:

Mapping of PC index to output file path

Return type:

Dict[int, str]

create_pc_effects_plot(pc_indices: List[int], output_file: str | None = None) → Figure

Create bar chart showing PC effects.

Parameters:
  • pc_indices (List[int]) – List of PC indices to plot

  • output_file (str, optional) – Path to save HTML file

Returns:

Plotly figure

Return type:

go.Figure

create_metrics_summary(output_file: str | None = None) → Dict[str, Any]

Create a comprehensive summary of all HAAM metrics.

This method exports:

  • Model performance metrics (R² values for X, AI, HU)

  • Policy similarities between predictions

  • Mediation analysis results (PoMA percentages)

  • Feature selection statistics

The output is compatible with the new generic “X” labeling.

Parameters:

output_file (str, optional) – Path to save JSON file with metrics

Returns:

Dictionary containing all metrics, including:

  • model_performance: R² values for each model

  • policy_similarities: Correlations between predictions

  • mediation_analysis: PoMA and effect decomposition

  • feature_selection: Number and indices of selected PCs

Return type:

Dict[str, Any]
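When consuming the returned dictionary or the saved JSON file, it can help to check that the four documented top-level sections are present. The sketch below does only that. The helper missing_sections is hypothetical, the nested contents of each section are not specified here, and the empty dictionaries are placeholders rather than real metrics.

```python
import json
from typing import Any, Dict, List

# Top-level keys named in the return description above.
EXPECTED_SECTIONS = (
    "model_performance",
    "policy_similarities",
    "mediation_analysis",
    "feature_selection",
)


def missing_sections(summary: Dict[str, Any]) -> List[str]:
    """List the documented sections that are absent from a metrics summary."""
    return [key for key in EXPECTED_SECTIONS if key not in summary]


# Placeholder summary with one section deliberately left out.
summary = {
    "model_performance": {},
    "policy_similarities": {},
    "mediation_analysis": {},
}
round_tripped = json.loads(json.dumps(summary))  # mimics save and reload
missing = missing_sections(round_tripped)
print(missing)
```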

create_3d_umap_with_pc_arrows(umap_embeddings: ndarray, cluster_labels: ndarray, topic_keywords: Dict[int, str], pc_scores_all: ndarray, pc_indices: int | List[int] | None = None, top_k: int = 1, percentile_threshold: float = 90.0, arrow_mode: str = 'all', color_by_usage: bool = True, color_mode: str = 'legacy', criterion: ndarray | None = None, human_judgment: ndarray | None = None, ai_judgment: ndarray | None = None, show_topic_labels: bool | int = 10, output_file: str | None = None, display: bool = True) → Figure

Create 3D UMAP visualization with PC directional arrows.

This method creates a 3D UMAP space where:

  • Topics are positioned based on their UMAP embeddings

  • Arrows show PC directions from low to high scoring topics

  • Arrow endpoints are averages of top-k and bottom-k topic positions

  • Topics are colored by HU/AI usage patterns
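The arrow construction described above (from the averaged positions of the lowest-scoring topics to those of the highest-scoring topics) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's implementation: the helper arrow_endpoints is hypothetical, topics are ranked by their mean PC score, each topic is placed at the centroid of its points, and the percentile_threshold filter is not modeled.

```python
from collections import defaultdict
from statistics import fmean
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def arrow_endpoints(points: Sequence[Point], labels: Sequence[int],
                    scores: Sequence[float], top_k: int = 1) -> Tuple[Point, Point]:
    """Return (start, end) of one PC arrow in 3D embedding space."""
    members: Dict[int, List[Tuple[Point, float]]] = defaultdict(list)
    for point, label, score in zip(points, labels, scores):
        members[label].append((point, score))

    # One (mean score, centroid) pair per topic.
    topics = []
    for items in members.values():
        centroid = tuple(fmean(p[d] for p, _ in items) for d in range(3))
        topics.append((fmean(s for _, s in items), centroid))
    topics.sort(key=lambda pair: pair[0])

    low = [c for _, c in topics[:top_k]]    # bottom-k topics by mean score
    high = [c for _, c in topics[-top_k:]]  # top-k topics by mean score
    start = tuple(fmean(c[d] for c in low) for d in range(3))
    end = tuple(fmean(c[d] for c in high) for d in range(3))
    return start, end


points = [(0, 0, 0), (2, 0, 0), (10, 0, 0), (12, 0, 0), (5, 5, 5)]
labels = [0, 0, 1, 1, 2]
scores = [-1.0, -3.0, 4.0, 6.0, 0.0]
start, end = arrow_endpoints(points, labels, scores, top_k=1)
print(start, end)
```

With top_k=1 the arrow runs between exactly two topic centroids; larger values average several centroids at each end.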

Parameters:
  • umap_embeddings (np.ndarray) – 3D UMAP embeddings (n_samples x 3)

  • cluster_labels (np.ndarray) – Cluster assignments for each point

  • topic_keywords (Dict[int, str]) – Topic ID to keyword mapping

  • pc_scores_all (np.ndarray) – PC scores for all samples (n_samples x n_components)

  • pc_indices (int or List[int], optional) – PC indices to show arrows for. If None and arrow_mode='all', shows arrows for the first 3 PCs

  • top_k (int, default=1) – Number of top/bottom topics to average for arrow endpoints. The default of 1 gives cleaner arrows

  • percentile_threshold (float, default=90.0) – Percentile threshold for determining top/bottom topics

  • arrow_mode (str, default='all') – Arrow display mode: ‘single’, ‘list’, or ‘all’

  • color_by_usage (bool, default=True) – Whether to color topics by HU/AI usage patterns

  • color_mode (str, default='legacy') – Coloring mode when color_by_usage=True:

    • ‘legacy’: Use PC coefficient-based inference (original behavior)

    • ‘validity’: Use direct X/HU/AI measurement (consistent with word clouds)

  • criterion (np.ndarray, optional) – Ground truth values (X) for validity coloring mode

  • human_judgment (np.ndarray, optional) – Human judgment values (HU) for validity coloring mode

  • ai_judgment (np.ndarray, optional) – AI judgment values (AI) for validity coloring mode

  • show_topic_labels (bool or int, default=10) –

    • If True: Show all topic labels

    • If False: Hide all topic labels (hover still works)

    • If int: Show only the N topics closest to camera (dynamic)

  • output_file (str, optional) – Path to save HTML file

  • display (bool, default=True) – Whether to display in notebook/colab

Returns:

Plotly 3D figure object

Return type:

go.Figure
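The three forms of show_topic_labels (True, False, or an integer N) can be handled as in the sketch below. The helper topics_to_label is hypothetical, and the selection is computed once for a fixed camera position, whereas the documented behavior is dynamic. It mainly illustrates that the boolean cases must be checked before the integer case, because True and False are also integers in Python.

```python
import math
from typing import Dict, List, Sequence, Union

Point = Sequence[float]


def topics_to_label(show_topic_labels: Union[bool, int],
                    centroids: Dict[int, Point],
                    camera: Point) -> List[int]:
    """Return the topic IDs whose labels should be drawn."""
    # Check the bool cases first: isinstance(True, int) is True in Python.
    if show_topic_labels is True:
        return sorted(centroids)
    if show_topic_labels is False:
        return []
    # Integer case: keep the N topics nearest to the camera position.
    nearest = sorted(centroids, key=lambda t: math.dist(centroids[t], camera))
    return nearest[:max(0, show_topic_labels)]


centroids = {0: (0.0, 0.0, 0.0), 1: (5.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0)}
camera = (0.0, 0.0, 0.0)
print(topics_to_label(2, centroids, camera))
```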

create_3d_pca_with_arrows(pca_features: ndarray, cluster_labels: ndarray, topic_keywords: Dict[int, str], pc_indices: int | List[int] | None = None, arrow_mode: str = 'all', color_by_usage: bool = True, output_file: str | None = None, display: bool = True) → Figure

Create 3D PCA visualization with directional arrows showing PC gradients.

This method creates a 3D scatter plot of the first 3 PCs with:

  • Topic clusters floating in 3D space

  • Directional arrows showing high->low gradients for specified PCs

  • Color coding based on HU/AI usage patterns

  • Interactive tooltips with topic information

Parameters:
  • pca_features (np.ndarray) – PCA-transformed features (n_samples x n_components)

  • cluster_labels (np.ndarray) – Cluster assignments for each point

  • topic_keywords (Dict[int, str]) – Topic ID to keyword mapping

  • pc_indices (int or List[int], optional) – PC indices to show arrows for. If None and arrow_mode='all', shows arrows for the first 3 PCs

  • arrow_mode (str, default='all') – Arrow display mode: ‘single’, ‘list’, or ‘all’

  • color_by_usage (bool, default=True) – Whether to color topics by HU/AI usage patterns

  • output_file (str, optional) – Path to save HTML file

  • display (bool, default=True) – Whether to display in notebook/colab

Returns:

Plotly 3D figure object

Return type:

go.Figure