Visualization

Plotting helpers for embeddings, local ancestry, global admixture proportions, and association summaries. Most functions accept in-memory objects or result files with the column names documented in each signature. Scatter plots expect a fitted PCA, mdPCA, or maasMDS model with X_new_ and samples_.

snputils.visualization.scatter(dimredobj, labels_file, abbreviation_inside_dots=True, arrows_for_titles=False, dots=True, legend=True, color_palette=None, show=True, save_path=None, *, label_mode=None, style='default', figsize=None, label_colors=None, legend_outside=None, despine=None, axis_xlabel=None, axis_ylabel=None, point_size=None, centroid_size=None, point_alpha=None, savefig_kwargs=None, equal_aspect=None)

Plot a scatter with group centroids and optional label styling.

Parameters:
  • dimredobj – Object produced by a dimensionality-reduction step, e.g. maasMDS, mdPCA, or PCA. Must expose X_new_ ((n, 2) embedding) and samples_ (identifiers aligned with embedding rows).

  • labels_file (str or pandas.DataFrame) – TSV path or in-memory table with columns indID and label.

  • abbreviation_inside_dots (bool) – If True, show a short acronym inside each centroid marker.

  • arrows_for_titles (bool) – If True, draw arrows from text labels to centroids.

  • dots (bool) – If True, draw scatter points; if False, print coordinates and use text markers instead.

  • legend (bool) – If True, include a legend for group labels.

  • color_palette (optional) – Colormap or indexable color list; default palette is chosen automatically if None.

  • show (bool, optional) – If True, call plt.show(); otherwise close the figure after saving. Default True.

  • save_path (str, optional) – If set, save the figure to this path (plt.savefig). Prefer .pdf or .svg for publication: dense scatter is rasterized at dpi (default 300) while axes and text stay vector. Bitmap formats (.png, …) also default to that dpi. Override via savefig_kwargs.

  • label_mode (str, optional) – Overrides abbreviation_inside_dots, arrows_for_titles, and legend. "legend" — legend plus abbreviations inside centroids. "acronym" — abbreviations inside centroids only. "arrow" — labels near centroids with adjustText arrows; best for many groups. None keeps the individual boolean flags.

  • style (str) – "default" — legacy appearance. "publication" — typography, despine, room for an outside legend, slightly larger markers, MDS-oriented axis labels.

  • figsize (tuple, optional) – Figure size in inches; chosen from style when None.

  • label_colors (Mapping, optional) – Map group labels (as in the TSV) to matplotlib color strings; unlisted labels use the palette.

  • legend_outside (bool, optional) – If True, place the legend outside the axes. Default True when style=="publication".

  • despine (bool, optional) – Hide top and right spines. Default True when style=="publication".

  • axis_xlabel (str, optional) – X-axis label; the default depends on style.

  • axis_ylabel (str, optional) – Y-axis label; the default depends on style.

  • point_size (float, optional) – Override the marker size of the individual scatter points.

  • centroid_size (float, optional) – Override the marker size of the group centroids.

  • point_alpha (float, optional) – Override the alpha (opacity) of the individual scatter points.

  • savefig_kwargs (dict, optional) – Extra keyword arguments for plt.savefig when save_path is set.

  • equal_aspect (bool, optional) – If True, use an equal data aspect ratio (typical for MDS/PCA). Default True when style=="publication".

Returns:

None
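
The labels_file format described above (tab-separated, columns indID and label) can be produced with the standard library alone. The following is a minimal sketch; the sample IDs, group labels, and the helper name write_labels_tsv are hypothetical and only illustrate the expected layout.

```python
import csv
import os
import tempfile

# Hypothetical sample-to-group assignments; in practice the IDs should
# match the entries of dimredobj.samples_.
assignments = [
    ("sample_01", "EUR"),
    ("sample_02", "EUR"),
    ("sample_03", "AFR"),
]


def write_labels_tsv(rows, path):
    """Write (indID, label) pairs as a tab-separated file with a header row."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["indID", "label"])
        writer.writerows(rows)
    return path


labels_path = write_labels_tsv(
    assignments, os.path.join(tempfile.mkdtemp(), "labels.tsv")
)

# Read the header back to confirm the column names.
with open(labels_path, newline="") as handle:
    header = next(csv.reader(handle, delimiter="\t"))
print(header)
```

The resulting path can then be passed as labels_file, for example scatter(model, labels_path, label_mode="arrow", style="publication", save_path="embedding.pdf", show=False), where model stands for an already fitted PCA, mdPCA, or maasMDS object.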

snputils.visualization.lai.plot_lai(laiobj, colors, sort=True, figsize=None, legend=False, legend_kwargs=None, title=None, fontsize=None, scale=2)

Plot LAI (Local Ancestry Inference) data with customizable options. Each row represents the ancestry of a sample at the window level, distinguishing between maternal and paternal strands. Whitespace is used to separate individual samples.

Parameters:
  • laiobj – A LocalAncestryObject containing LAI data.

  • colors – A dictionary with ancestry-color mapping.

  • sort – If True, sort samples based on the most frequent ancestry. Samples are displayed with the most predominant ancestry first, followed by the second most predominant, and so on. Defaults to True.

  • figsize – Figure size. If None, the figure is displayed with a default size of (25, 25). Defaults to None.

  • legend – If True, display a legend. If sort==True, ancestries in the legend are sorted based on their total frequency in descending order. Defaults to False.

  • legend_kwargs – Optional keyword arguments passed through to Axes.legend. Defaults keep the legend centered below the x-axis label.

  • title – Title for the plot. If None, no title is displayed. Defaults to None.

  • fontsize – Font sizes for various plot elements. If None, default font sizes are used. Defaults to None.

  • scale – Number of times to duplicate rows for enhanced vertical visibility. Defaults to 2.
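
The effect of sort and scale can be illustrated on plain lists. This is a rough sketch, not the plot_lai implementation: it assumes each sample is a (maternal, paternal) pair of window-level ancestry codes, and the helper names ancestry_order, sort_samples, and expand_rows are hypothetical.

```python
from collections import Counter


def ancestry_order(samples):
    """Return ancestry codes ordered from most to least frequent overall."""
    counts = Counter(
        code for maternal, paternal in samples for code in maternal + paternal
    )
    return [code for code, _ in counts.most_common()]


def sort_samples(samples, order):
    """Put samples richest in the most frequent ancestry first, then the next."""
    def key(sample):
        maternal, paternal = sample
        combined = maternal + paternal
        # Negative counts give a descending sort for each ancestry in turn.
        return tuple(-combined.count(code) for code in order)
    return sorted(samples, key=key)


def expand_rows(samples, scale=2):
    """Duplicate each strand `scale` times so it is taller in the image."""
    rows = []
    for maternal, paternal in samples:
        rows.extend([maternal] * scale)
        rows.extend([paternal] * scale)
    return rows


# Hypothetical data: three samples, three windows, ancestry codes 0 and 1.
samples = [
    ([0, 0, 1], [0, 1, 1]),
    ([1, 1, 1], [1, 1, 0]),
    ([0, 0, 0], [0, 0, 0]),
]
order = ancestry_order(samples)
ordered = sort_samples(samples, order)
rows = expand_rows(ordered, scale=2)
```

The whitespace that separates samples in the actual plot is omitted here.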

snputils.visualization.admixture.reorder_admixture(Q_mat)

Reorder Q_mat rows so that rows are grouped by each sample’s dominant ancestry, and columns are sorted by descending average ancestry proportion.
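
The reordering described above can be sketched in plain Python. This is an illustration under assumptions, not the library's implementation: the helper name reorder_q_sketch is hypothetical, the within-group ordering (descending dominant proportion) is an assumption, and so is the returned tuple. The real function's return value is not documented in this section.

```python
def reorder_q_sketch(q_rows):
    """Group rows by dominant ancestry; sort columns by descending mean."""
    n = len(q_rows)
    k = len(q_rows[0])

    # Columns: descending average ancestry proportion.
    col_means = [sum(row[j] for row in q_rows) / n for j in range(k)]
    col_order = sorted(range(k), key=lambda j: -col_means[j])
    reordered = [[row[j] for j in col_order] for row in q_rows]

    def dominant(row):
        return max(range(k), key=row.__getitem__)

    # Rows: grouped by dominant ancestry; within a group, strongest first
    # (the within-group rule is an assumption made for this sketch).
    sorted_rows = sorted(reordered, key=lambda r: (dominant(r), -max(r)))

    # Boundaries: row indices where the dominant ancestry changes.
    boundaries = [0]
    for i in range(1, n):
        if dominant(sorted_rows[i]) != dominant(sorted_rows[i - 1]):
            boundaries.append(i)
    boundaries.append(n)
    return sorted_rows, boundaries, col_order


# Hypothetical Q matrix: three samples, K = 2.
q = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
sorted_rows, boundaries, col_order = reorder_q_sketch(q)
```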

snputils.visualization.admixture.plot_admixture(ax, Q_mat_sorted, boundary_list, col_order=None, colors=None, show_boundaries=True, show_axes_labels=True, show_ticks=True, set_limits=True, minimal=False)

Plot a structure-style bar chart of Q_mat_sorted in the given Axes ax. If colors is not None, it should be a list or array of length K. If col_order is not None, colors are reordered according to col_order.

Optional controls:

  • show_boundaries (bool) – Draw vertical lines at group boundaries. Default True.

  • show_axes_labels (bool) – Set X/Y axis labels. Default True.

  • show_ticks (bool) – Show axis ticks. Default True.

  • set_limits (bool) – Set xlim and ylim to [0, n_samples-1] and [0, 1]. Default True.

  • minimal (bool) – If True, overrides the options above to disable boundaries, labels, ticks, and limits, and hides spines.

snputils.visualization.manhattan_plot.manhattan_plot(data, colors=None, significance_threshold=0.05, point_size=7.0, line_width=1.0, line_color='r', figsize=None, title=None, fontsize=None, save=None, output_filename=None)

Generate a Manhattan plot from association study results.

Accepts either a file path or an in-memory pandas.DataFrame. The input must contain columns #CHROM, POS, and P (p-values).

Parameters:
  • data – Path to a tab-separated results file or an in-memory DataFrame with columns #CHROM, POS, and P. PLINK2-style output files are supported directly.

  • colors – List of colors to apply per chromosome. The chromosome number modulo len(colors) is used to select the color. Defaults to ["black", "grey"].

  • significance_threshold – Nominal significance threshold used to derive the Bonferroni-corrected threshold (significance_threshold / n_variants). Default is 0.05.

  • point_size – Marker area for scatter points (matplotlib s). Default is 7.0.

  • line_width – Width of the Bonferroni reference line. Default is 1.0.

  • line_color – Color of the Bonferroni reference line. Default is "r".

  • figsize – Optional (width, height) tuple passed to matplotlib.pyplot.figure(). Defaults to (12, 6) (2:1 aspect ratio).

  • title – Plot title. Default is None (no title).

  • fontsize – Mapping with optional keys 'title', 'xlabel', and 'ylabel' controlling font sizes. Missing keys fall back to sensible defaults (20 for title, 15 for axis labels).

  • save – If True, saves the figure to output_filename.

  • output_filename – Destination path for the saved figure (.pdf, .svg, .png, …).
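
The Bonferroni threshold and the per-chromosome color rule described in the parameters above reduce to two small formulas. The sketch below shows them in isolation; the helper names and p-values are hypothetical, integer chromosome numbers are assumed, and drawing the reference line at -log10(threshold) follows the usual Manhattan plot convention rather than anything stated here.

```python
import math


def bonferroni_threshold(n_variants, significance_threshold=0.05):
    """Nominal threshold divided by the number of tested variants."""
    return significance_threshold / n_variants


def chromosome_color(chrom, colors=("black", "grey")):
    """Pick a color by chromosome number modulo the number of colors."""
    return colors[chrom % len(colors)]


# Hypothetical p-values for four variants.
p_values = [0.2, 3e-9, 0.04, 0.6]
threshold = bonferroni_threshold(len(p_values))
line_height = -math.log10(threshold)  # height of the line on a -log10(p) axis
significant = [p for p in p_values if p < threshold]
```

With the default palette, odd-numbered chromosomes map to "grey" and even-numbered ones to "black", so neighboring chromosomes alternate.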

snputils.visualization.qq_plot.qq_plot(data, color='black', significance_threshold=0.05, point_size=7.0, line_width=1.0, expected_line_color='red', threshold_line_color='orange', figsize=None, title=None, fontsize=None, save=None, output_filename=None)

Generate a quantile-quantile (QQ) plot of association study p-values.

Plots observed -log10(p) against the expected -log10(p) under the null hypothesis of no association (uniform distribution), together with the identity reference line and a Bonferroni significance threshold.

Accepts either a file path or an in-memory pandas.DataFrame. The input must contain a column P with p-values.

Parameters:
  • data – Path to a tab-separated results file or an in-memory DataFrame with a column P. PLINK2-style output files are supported directly.

  • color – Color for the scatter points. Defaults to "black".

  • significance_threshold – Nominal significance threshold used to derive the Bonferroni-corrected threshold (significance_threshold / n_variants). Default is 0.05.

  • point_size – Marker area for scatter points (matplotlib s). Default is 7.0.

  • line_width – Width of the expected-null and Bonferroni reference lines. Default is 1.0.

  • expected_line_color – Color of the identity (expected under null) reference line. Default is "red".

  • threshold_line_color – Color of the Bonferroni threshold line. Default is "orange".

  • figsize – Optional (width, height) tuple passed to matplotlib.pyplot.figure().

  • title – Plot title. Default is None (no title).

  • fontsize – Mapping with optional keys 'title', 'xlabel', and 'ylabel' controlling font sizes. Missing keys fall back to sensible defaults (20 for title, 15 for axis labels).

  • save – If True, saves the figure to output_filename.

  • output_filename – Destination path for the saved figure (.pdf, .svg, .png, …).
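
The expected quantiles under the uniform null can be computed as follows. This is a sketch under one assumption: the plotting position rank / (n + 1) is a common choice, but the exact formula used by qq_plot is not stated in this section. The helper name qq_points is hypothetical.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs on the -log10 scale."""
    observed = sorted(p_values)  # smallest p-value first
    n = len(observed)
    # Expected p-value of the rank-th smallest of n uniform draws.
    expected = [rank / (n + 1) for rank in range(1, n + 1)]
    return [
        (-math.log10(e), -math.log10(o)) for e, o in zip(expected, observed)
    ]


# Hypothetical p-values that match the uniform expectation exactly.
points = qq_points([0.5, 0.25, 0.75])
```

Points that follow the null lie on the identity line; points above it at the high end indicate an excess of small p-values.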

snputils.visualization.admixture_viz.pong_viz(folder_runs, output_dir, k=None, min_k=None, max_k=None, runs=None, run_prefix='train', ind2pop_path=None, pop_names_path=None, color_list_path=None, verbose=False)

Executes Pong visualization with the specified parameters.

snputils.visualization.admixture_viz.create_filemap(folder, k=None, min_k=None, max_k=None, runs=None, run_prefix='train_demo')

Creates a filemap for training files organized by k values and runs and saves it to a file.

Parameters:
  • folder (str) – Base folder path

  • k (Optional[int]) – Single k value to process. If specified, min_k and max_k are ignored.

  • min_k (Optional[int]) – Minimum k value for range processing

  • max_k (Optional[int]) – Maximum k value for range processing

  • runs (List[int]) – List of run numbers

  • run_prefix (str) – Prefix for the run files (default: 'train_demo')

Returns:

str – Path to saved file

Raises:

FileMapError – If invalid parameters are provided or if configuration is incorrect
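
The precedence between k and the min_k/max_k range can be sketched as below. The helper name resolve_k_values is hypothetical, ValueError stands in for the library's FileMapError, and treating max_k as inclusive is an assumption.

```python
def resolve_k_values(k=None, min_k=None, max_k=None):
    """Return the list of k values to process."""
    if k is not None:
        # A single k takes precedence; min_k and max_k are ignored.
        return [k]
    if min_k is None or max_k is None:
        raise ValueError("Provide either k or both min_k and max_k")
    if min_k > max_k:
        raise ValueError("min_k must not exceed max_k")
    # Assumption: the range includes max_k.
    return list(range(min_k, max_k + 1))


single = resolve_k_values(k=5, min_k=2, max_k=8)
span = resolve_k_values(min_k=2, max_k=4)
```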

snputils.visualization.chromosome_painting(source, output_dir, sample_id=None, build='hg38', color_map=None, num_labels=8, fill_empty=True, fill_marker_gaps=False, output_format='png', force=True, verbose=False, show=False, keep_bed_files=False)

Generate chromosome paintings from a local ancestry source.

Accepts a LocalAncestryObject, one or more MSP files, or one or more BED files and dispatches to the appropriate internal pipeline automatically.

Source types

  • LocalAncestryObject — in-memory LAI data; chromosomes and physical_pos must be populated.

  • str / pathlib.Path ending with .msp or .msp.tsv — a single MSP file; also accepts a list of such paths spanning multiple chromosomes.

  • str / pathlib.Path ending with .bed — one pre-formatted BED file; also accepts a list to paint multiple samples at once.

Selecting samples

  • sample_id=None (default) — paint every sample in the source.

  • sample_id="0001" — paint only the sample whose ID is "0001".

  • sample_id=["0001", "0002"] — paint a subset.

sample_id is not applicable to BED sources (a BED file already represents one sample); it is silently ignored when BED files are provided.

Parameters:
  • source – The data source; see description above.

  • output_dir – Directory where output files will be saved.

  • sample_id – Sample identifier(s) to paint. None paints all samples. Accepts a single string or a list of strings.

  • build – Genome build version ('hg37' or 'hg38').

  • color_map – A TSV filename or a {int: hex_color} dict mapping numeric ancestry codes to hex color strings. Uses the default snputils palette when None.

  • num_labels – Number of distinct colors to generate when color_map is None.

  • fill_empty – If True, fill unassigned chromosome regions with a neutral grey color.

  • fill_marker_gaps – If True, extend painted segments through inter-marker gaps until the next segment on the same chromosome copy. This avoids rendering sparse marker intervals as missing ancestry. Defaults to False.

  • output_format – Output format, 'png' or 'pdf'.

  • force – If True, overwrite existing output files.

  • verbose – If True, emit progress log messages.

  • show – If True, display each PNG in a matplotlib figure (PNG only).

  • keep_bed_files – If True, retain intermediate BED files generated from MSP sources.

Returns:

List[str] – Paths to the generated output files, one per sample.

Raises:

ValueError – If the source type cannot be determined from the file extension, or if a requested sample_id is not found.

Examples

Paint all samples from a LAI object:

import snputils as su

# lai is a LocalAncestryObject with chromosomes and physical_pos populated
su.viz.chromosome_painting(lai, "paintings/")

Paint a single sample:

su.viz.chromosome_painting(lai, "paintings/", sample_id="0001")

Paint a subset from MSP files:

su.viz.chromosome_painting(
    ["chr1.msp", "chr2.msp"],
    "paintings/",
    sample_id=["0001", "0002"],
)
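
The extension rules listed under "Source types" and the accepted forms of sample_id can be sketched as two small helpers. This is an illustration only: the names classify_source and normalize_sample_ids are hypothetical, and the library's internal dispatch may differ.

```python
from pathlib import Path


def classify_source(path):
    """Return 'msp' or 'bed' based on the file name, else raise ValueError."""
    name = Path(path).name
    if name.endswith(".msp") or name.endswith(".msp.tsv"):
        return "msp"
    if name.endswith(".bed"):
        return "bed"
    raise ValueError(f"Cannot determine source type from extension: {name}")


def normalize_sample_ids(sample_id):
    """Turn None, a single ID, or a list of IDs into None or a list."""
    if sample_id is None:
        return None  # None means every sample in the source
    if isinstance(sample_id, str):
        return [sample_id]
    return list(sample_id)


kinds = [classify_source(p) for p in ["chr1.msp", "chr2.msp.tsv", "0001.bed"]]
ids = normalize_sample_ids("0001")
```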