haam.haam_visualizations module
Visualization Module
Creates interpretable visualizations of the DML-LME results to reveal how different perceivers (human vs AI) utilize the perceptual cue space. These visualizations are designed to communicate complex statistical results to both technical and non-technical audiences.
Main Visualizations:
3D UMAP Projections: Show how judgments vary across the high-dimensional embedding space, with PC arrows indicating directions of maximum variance
Framework Diagrams: Illustrate the mediation pathways from criterion through PCs to judgments
Coefficient Grids: Display all 200 PC effects in a compact, interpretable format
Word Clouds: Reveal semantic content associated with each principal component
The visualizations implement design principles from the paper, using:
Red/blue color schemes for high/low PC values
Arrow overlays showing PC directions in UMAP space
Interactive Plotly figures for exploration
Validity coloring to indicate measurement quality
All visualizations support custom PC naming for domain-specific interpretation.
HAAM Visualization Module
Functions for creating interactive visualizations.
- class haam.haam_visualizations.NumpyEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]
Bases:
JSONEncoder
Custom JSON encoder that handles numpy types.
Methods
default(obj) – Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).
encode(o) – Return a JSON string representation of a Python data structure.
iterencode(o[, _one_shot]) – Encode the given object and yield each string representation as available.
- default(obj)[source]
Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).
For example, to support arbitrary iterators, you could implement default like this:
    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return JSONEncoder.default(self, o)
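As a rough illustration of how an encoder of this kind can handle numpy types, the sketch below overrides default to convert any object that exposes a tolist() method, which numpy arrays and numpy scalars both do. The class name ArrayFriendlyEncoder and the FakeArray stand-in are hypothetical and exist only so the example runs without numpy installed; this is not the source of NumpyEncoder.

```python
import json


class ArrayFriendlyEncoder(json.JSONEncoder):
    """Hypothetical stand-in for NumpyEncoder (not the library's code)."""

    def default(self, obj):
        # numpy arrays and numpy scalars expose tolist(), which returns
        # plain Python lists or numbers that json can serialize.
        to_list = getattr(obj, "tolist", None)
        if callable(to_list):
            return to_list()
        # Anything else falls through to the base class, which raises TypeError.
        return super().default(obj)


class FakeArray:
    """Tiny stand-in for a numpy array so the sketch runs without numpy."""

    def __init__(self, values):
        self._values = list(values)

    def tolist(self):
        return list(self._values)


payload = {"pc_scores": FakeArray([0.5, -1.25]), "n_components": 2}
encoded = json.dumps(payload, cls=ArrayFriendlyEncoder)
print(encoded)  # → {"pc_scores": [0.5, -1.25], "n_components": 2}
```

Passing the encoder through the cls argument of json.dumps keeps the conversion logic in one place instead of converting every array by hand before serialization.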
- class haam.haam_visualizations.HAAMVisualizer(haam_results: Dict, topic_summaries: Dict | None = None)[source]
Bases:
object
Create visualizations for HAAM analysis results.
Methods
create_3d_pca_with_arrows(pca_features, ...) – Create 3D PCA visualization with directional arrows showing PC gradients.
create_3d_umap_with_pc_arrows(...[, ...]) – Create 3D UMAP visualization with PC directional arrows.
create_all_pc_umap_visualizations(...[, ...]) – Create UMAP visualizations for multiple PCs.
create_main_visualization(pc_indices[, ...]) – Create main HAAM framework visualization with dynamic metrics.
create_metrics_summary([output_file]) – Create a comprehensive summary of all HAAM metrics.
create_mini_visualization([n_components, ...]) – Create mini grid visualization of all PCs.
create_pc_effects_plot(pc_indices[, output_file]) – Create bar chart showing PC effects.
create_pc_umap_with_topics(pc_idx, ...[, ...]) – Create UMAP visualization colored by PC scores with topic labels.
create_umap_visualization(umap_embeddings[, ...]) – Create interactive UMAP visualization.
plot_pc_effects(pc_idx, topic_associations) – Create 4-panel bar chart showing PC effects on outcomes.
- __init__(haam_results: Dict, topic_summaries: Dict | None = None)[source]
Initialize visualizer.
- Parameters:
haam_results (Dict) – Results from HAAMAnalysis
topic_summaries (Dict, optional) – Topic summaries from TopicAnalyzer
- create_main_visualization(pc_indices: List[int], output_file: str | None = None, pc_names: Dict[int, str] | None = None, ranking_method: str = 'HU') str [source]
Create main HAAM framework visualization with dynamic metrics.
The visualization now shows:
  - Generic “X” label instead of “SC” for criterion
  - Dynamically calculated R², PoMA, and unmodeled path percentages
  - Custom PC names when provided (shows “-” otherwise)
  - Enhanced topic display using c-TF-IDF
- Parameters:
pc_indices (List[int]) – List of PC indices to display (0-based)
output_file (str, optional) – Path to save HTML file
pc_names (Dict[int, str], optional) – Manual names for PCs. Keys are PC indices (0-based), values are names. If not provided, uses “-” for all PCs. Example: {0: "Formality", 3: "Complexity", 6: "Sentiment"}
ranking_method (str, default='HU') – Method used to rank PCs: ‘HU’, ‘AI’, ‘X’, or ‘triple’
- Returns:
HTML content
- Return type:
str
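The pc_names fallback described above can be pictured with a small helper. This is a minimal sketch, assuming the lookup is a plain dictionary access keyed by 0-based PC index; resolve_pc_labels is a hypothetical name and not part of HAAMVisualizer.

```python
from typing import Dict, List, Optional


def resolve_pc_labels(pc_indices: List[int],
                      pc_names: Optional[Dict[int, str]] = None) -> Dict[int, str]:
    """Map each 0-based PC index to its custom name, or "-" if none was given."""
    names = pc_names or {}
    return {idx: names.get(idx, "-") for idx in pc_indices}


labels = resolve_pc_labels(
    [0, 3, 5],
    {0: "Formality", 3: "Complexity", 6: "Sentiment"},
)
print(labels)  # → {0: 'Formality', 3: 'Complexity', 5: '-'}
```

Names supplied for indices that are not in pc_indices (index 6 here) are simply unused, and displayed PCs without a name (index 5) fall back to “-”.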
- create_mini_visualization(n_components: int = 200, n_highlight: int = 20, output_file: str | None = None) str [source]
Create mini grid visualization of all PCs.
- plot_pc_effects(pc_idx: int, topic_associations: Dict, figsize: Tuple[int, int] = (15, 6)) Figure [source]
Create 4-panel bar chart showing PC effects on outcomes.
- create_umap_visualization(umap_embeddings: ndarray, color_by: str = 'X', topic_labels: Dict | None = None, show_topics: bool = True, output_file: str | None = None) Figure [source]
Create interactive UMAP visualization.
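- Parameters:
umap_embeddings (np.ndarray)
color_by (str, default='X')
topic_labels (Dict, optional)
show_topics (bool, default=True)
output_file (str, optional)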
- Returns:
Plotly figure
- Return type:
go.Figure
- create_pc_umap_with_topics(pc_idx: int, pc_scores: ndarray, umap_embeddings: ndarray, cluster_labels: ndarray, topic_keywords: Dict[int, str], pc_associations: Dict[int, List[Dict]], output_file: str | None = None, show_top_n: int = 5, show_bottom_n: int = 5, display: bool = True) Figure [source]
Create UMAP visualization colored by PC scores with topic labels.
- Parameters:
pc_idx (int) – PC index (0-based)
pc_scores (np.ndarray) – PC scores for all samples
umap_embeddings (np.ndarray) – 3D UMAP embeddings
cluster_labels (np.ndarray) – Cluster assignments for each point
topic_keywords (Dict[int, str]) – Topic ID to keyword mapping
pc_associations (Dict[int, List[Dict]]) – PC-topic associations from TopicAnalyzer
output_file (str, optional) – Path to save HTML file
show_top_n (int) – Number of high-scoring topics to label
show_bottom_n (int) – Number of low-scoring topics to label
display (bool) – Whether to display in notebook/colab
- Returns:
Plotly figure object
- Return type:
go.Figure
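One way to picture how show_top_n and show_bottom_n pick the topics to label is to rank topics by a per-topic summary of the PC score. The sketch below uses the mean score of each cluster as that summary, which is an assumption: the method itself takes pc_associations from TopicAnalyzer, whose structure is not documented here. topics_to_label is a hypothetical helper.

```python
from collections import defaultdict
from statistics import fmean
from typing import Dict, List, Sequence, Tuple


def topics_to_label(pc_scores: Sequence[float],
                    cluster_labels: Sequence[int],
                    topic_keywords: Dict[int, str],
                    show_top_n: int = 5,
                    show_bottom_n: int = 5) -> Tuple[List[str], List[str]]:
    """Return keywords of the highest- and lowest-scoring topics."""
    # Group the per-sample PC scores by the topic each sample belongs to.
    by_topic = defaultdict(list)
    for score, topic in zip(pc_scores, cluster_labels):
        by_topic[topic].append(score)

    # Rank topics from highest to lowest mean score (assumed ranking rule).
    ranked = sorted(by_topic, key=lambda t: fmean(by_topic[t]), reverse=True)
    top = ranked[:show_top_n]
    # Take the bottom topics from what is left so a topic is never in both lists.
    bottom = ranked[len(top):][::-1][:show_bottom_n]

    def keyword(t: int) -> str:
        return topic_keywords.get(t, str(t))

    return [keyword(t) for t in top], [keyword(t) for t in bottom]


high, low = topics_to_label(
    pc_scores=[2.0, 1.8, 0.1, -0.2, -1.5, -1.7],
    cluster_labels=[0, 0, 1, 1, 2, 2],
    topic_keywords={0: "formal tone", 1: "neutral", 2: "slang"},
    show_top_n=1,
    show_bottom_n=1,
)
print(high, low)  # → ['formal tone'] ['slang']
```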
- create_all_pc_umap_visualizations(pc_indices: List[int], pc_scores_all: ndarray, umap_embeddings: ndarray, cluster_labels: ndarray, topic_keywords: Dict[int, str], pc_associations: Dict[int, List[Dict]], output_dir: str, show_top_n: int = 5, show_bottom_n: int = 5, display: bool = False) Dict[int, str] [source]
Create UMAP visualizations for multiple PCs.
- Parameters:
pc_indices (List[int]) – List of PC indices to visualize
pc_scores_all (np.ndarray) – All PC scores (n_samples x n_components)
umap_embeddings (np.ndarray) – 3D UMAP embeddings
cluster_labels (np.ndarray) – Cluster assignments
topic_keywords (Dict[int, str]) – Topic ID to keyword mapping
pc_associations (Dict[int, List[Dict]]) – PC-topic associations
output_dir (str) – Directory to save visualizations
show_top_n (int) – Number of high topics to show
show_bottom_n (int) – Number of low topics to show
display (bool) – Whether to display each plot
- Returns:
Mapping of PC index to output file path
- Return type:
Dict[int, str]
- create_pc_effects_plot(pc_indices: List[int], output_file: str | None = None) Figure [source]
Create bar chart showing PC effects.
- create_metrics_summary(output_file: str | None = None) Dict[str, Any] [source]
Create a comprehensive summary of all HAAM metrics.
This method exports:
  - Model performance metrics (R² values for X, AI, HU)
  - Policy similarities between predictions
  - Mediation analysis results (PoMA percentages)
  - Feature selection statistics
  - Compatible with the new generic “X” labeling
- Parameters:
output_file (str, optional) – Path to save JSON file with metrics
- Returns:
Dictionary containing all metrics including:
  - model_performance: R² values for each model
  - policy_similarities: Correlations between predictions
  - mediation_analysis: PoMA and effect decomposition
  - feature_selection: Number and indices of selected PCs
- Return type:
Dict[str, Any]
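When consuming the returned dictionary or the saved JSON file, it can help to check that the four documented top-level sections are present before reading from them. The sketch below does that with the standard library only. The nested keys and None values are placeholders for illustration, not real results, and missing_sections is a hypothetical helper.

```python
import json
from typing import Any, Dict, List

# The four top-level sections named in the Returns description above.
EXPECTED_SECTIONS = (
    "model_performance",
    "policy_similarities",
    "mediation_analysis",
    "feature_selection",
)


def missing_sections(summary: Dict[str, Any]) -> List[str]:
    """List the documented sections that are absent from a metrics summary."""
    return [name for name in EXPECTED_SECTIONS if name not in summary]


# Placeholder structure only: nested keys and values are illustrative.
summary = {
    "model_performance": {"X": None, "AI": None, "HU": None},
    "policy_similarities": {},
    "mediation_analysis": {},
}

# Round-trip through JSON, as would happen when reading a saved file.
reloaded = json.loads(json.dumps(summary))
print(missing_sections(reloaded))  # → ['feature_selection']
```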
- create_3d_umap_with_pc_arrows(umap_embeddings: ndarray, cluster_labels: ndarray, topic_keywords: Dict[int, str], pc_scores_all: ndarray, pc_indices: int | List[int] | None = None, top_k: int = 1, percentile_threshold: float = 90.0, arrow_mode: str = 'all', color_by_usage: bool = True, color_mode: str = 'legacy', criterion: ndarray | None = None, human_judgment: ndarray | None = None, ai_judgment: ndarray | None = None, show_topic_labels: bool | int = 10, output_file: str | None = None, display: bool = True) Figure [source]
Create 3D UMAP visualization with PC directional arrows.
This method creates a 3D UMAP space where:
  - Topics are positioned based on their UMAP embeddings
  - Arrows show PC directions from low to high scoring topics
  - Arrow endpoints are averages of top-k and bottom-k topic positions
  - Topics are colored by HU/AI usage patterns
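The arrow construction described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the method's actual code: each topic is assumed to already have one 3D position and one PC score, percentile_threshold is ignored, and pc_arrow and centroid are hypothetical helper names.

```python
from statistics import fmean
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float, float]


def centroid(points: Sequence[Point]) -> Point:
    """Average a set of 3D points coordinate by coordinate."""
    xs, ys, zs = zip(*points)
    return (fmean(xs), fmean(ys), fmean(zs))


def pc_arrow(topic_positions: Dict[int, Point],
             topic_scores: Dict[int, float],
             top_k: int = 1) -> Tuple[Point, Point]:
    """Return (start, end) of an arrow from low- to high-scoring topics."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    # Sort topic IDs by PC score, lowest first.
    ranked = sorted(topic_scores, key=topic_scores.get)
    start = centroid([topic_positions[t] for t in ranked[:top_k]])   # bottom-k
    end = centroid([topic_positions[t] for t in ranked[-top_k:]])    # top-k
    return start, end


positions = {0: (0, 0, 0), 1: (1, 1, 1), 2: (2, 0, 4), 3: (4, 2, 0)}
scores = {0: -2.0, 1: -1.0, 2: 1.0, 3: 3.0}
print(pc_arrow(positions, scores, top_k=2))
# → ((0.5, 0.5, 0.5), (3.0, 1.0, 2.0))
```

With top_k=1 the arrow runs directly from the single lowest-scoring topic to the single highest-scoring one; larger values average several topic positions at each end.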
- Parameters:
umap_embeddings (np.ndarray) – 3D UMAP embeddings (n_samples x 3)
cluster_labels (np.ndarray) – Cluster assignments for each point
topic_keywords (Dict[int, str]) – Topic ID to keyword mapping
pc_scores_all (np.ndarray) – PC scores for all samples (n_samples x n_components)
pc_indices (int or List[int], optional) – PC indices to show arrows for. If None and arrow_mode='all', arrows are shown for the first 3 PCs
top_k (int, default=1) – Number of top and bottom topics averaged to form each arrow endpoint. The default of 1 gives cleaner arrows
percentile_threshold (float, default=90.0) – Percentile threshold for determining top/bottom topics
arrow_mode (str, default='all') – Arrow display mode: ‘single’, ‘list’, or ‘all’
color_by_usage (bool, default=True) – Whether to color topics by HU/AI usage patterns
color_mode (str, default='legacy') – Coloring mode when color_by_usage=True:
  - 'legacy': Use PC coefficient-based inference (original behavior)
  - 'validity': Use direct X/HU/AI measurement (consistent with word clouds)
criterion (np.ndarray, optional) – Ground truth values (X) for validity coloring mode
human_judgment (np.ndarray, optional) – Human judgment values (HU) for validity coloring mode
ai_judgment (np.ndarray, optional) – AI judgment values (AI) for validity coloring mode
show_topic_labels (bool or int, default=10) –
If True: Show all topic labels
If False: Hide all topic labels (hover still works)
If int: Show only the N topics closest to camera (dynamic)
output_file (str, optional) – Path to save HTML file
display (bool, default=True) – Whether to display in notebook/colab
- Returns:
Plotly 3D figure object
- Return type:
go.Figure
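Because show_topic_labels accepts either a bool or an int, code that interprets it has to test for the bool cases first: in Python, bool is a subclass of int. The sketch below shows one way to resolve the three documented cases. It assumes the camera is given as a single 3D point and that "closest" means Euclidean distance, neither of which is confirmed by this reference; visible_label_indices is a hypothetical helper.

```python
from typing import List, Sequence, Tuple, Union

Point = Tuple[float, float, float]


def visible_label_indices(positions: Sequence[Point],
                          camera: Point,
                          show_topic_labels: Union[bool, int] = 10) -> List[int]:
    """Return indices of the topics whose labels should be drawn."""
    # Check the bool cases by identity before treating the value as a count.
    if show_topic_labels is True:
        return list(range(len(positions)))
    if show_topic_labels is False:
        return []

    def squared_distance(i: int) -> float:
        return sum((p - c) ** 2 for p, c in zip(positions[i], camera))

    # Keep the N topics nearest to the assumed camera position.
    nearest = sorted(range(len(positions)), key=squared_distance)
    return nearest[:show_topic_labels]


topics = [(0, 0, 0), (5, 5, 5), (1, 0, 0), (9, 9, 9)]
print(visible_label_indices(topics, camera=(0, 0, 0), show_topic_labels=2))
# → [0, 2]
```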
- create_3d_pca_with_arrows(pca_features: ndarray, cluster_labels: ndarray, topic_keywords: Dict[int, str], pc_indices: int | List[int] | None = None, arrow_mode: str = 'all', color_by_usage: bool = True, output_file: str | None = None, display: bool = True) Figure [source]
Create 3D PCA visualization with directional arrows showing PC gradients.
This method creates a 3D scatter plot of the first 3 PCs with:
  - Topic clusters floating in 3D space
  - Directional arrows showing high->low gradients for specified PCs
  - Color coding based on HU/AI usage patterns
  - Interactive tooltips with topic information
- Parameters:
pca_features (np.ndarray) – PCA-transformed features (n_samples x n_components)
cluster_labels (np.ndarray) – Cluster assignments for each point
topic_keywords (Dict[int, str]) – Topic ID to keyword mapping
pc_indices (int or List[int], optional) – PC indices to show arrows for. If None and arrow_mode='all', arrows are shown for the first 3 PCs
arrow_mode (str, default='all') – Arrow display mode: ‘single’, ‘list’, or ‘all’
color_by_usage (bool, default=True) – Whether to color topics by HU/AI usage patterns
output_file (str, optional) – Path to save HTML file
display (bool, default=True) – Whether to display in notebook/colab
- Returns:
Plotly 3D figure object
- Return type:
go.Figure
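The pc_indices parameter of the two arrow methods accepts an int, a list, or None. A caller-side helper that normalizes these forms into a list might look like the sketch below. Only the None with arrow_mode='all' case (first 3 PCs) is taken from this reference; the handling of 'single' and 'list' and the error on a missing value are assumptions, and normalize_pc_indices is a hypothetical name.

```python
from typing import List, Union


def normalize_pc_indices(pc_indices: Union[int, List[int], None],
                         arrow_mode: str = "all",
                         n_default: int = 3) -> List[int]:
    """Turn the accepted pc_indices forms into a list of 0-based indices."""
    if arrow_mode not in {"single", "list", "all"}:
        raise ValueError(f"unknown arrow_mode: {arrow_mode!r}")
    if pc_indices is None:
        if arrow_mode == "all":
            # Documented default: arrows for the first 3 PCs.
            return list(range(n_default))
        # Assumed behavior: other modes need explicit indices.
        raise ValueError("pc_indices is required unless arrow_mode='all'")
    if isinstance(pc_indices, int):
        return [pc_indices]
    return list(pc_indices)


print(normalize_pc_indices(None))            # → [0, 1, 2]
print(normalize_pc_indices(4, "single"))     # → [4]
print(normalize_pc_indices([1, 6], "list"))  # → [1, 6]
```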