IBD

Identity-by-descent segment objects and readers.

Objects

class snputils.IBDObject(sample_id_1, haplotype_id_1, sample_id_2, haplotype_id_2, chrom, start, end, length_cm=None, segment_type=None)

Bases: object

A class for Identity-By-Descent (IBD) segment data.

Parameters:
  • sample_id_1 (array of shape (n_segments,)) – Sample identifiers for the first individual.

  • haplotype_id_1 (array of shape (n_segments,)) – Haplotype identifiers for the first individual (values in {1, 2}, or -1 if unknown).

  • sample_id_2 (array of shape (n_segments,)) – Sample identifiers for the second individual.

  • haplotype_id_2 (array of shape (n_segments,)) – Haplotype identifiers for the second individual (values in {1, 2}, or -1 if unknown).

  • chrom (array of shape (n_segments,)) – Chromosome identifier for each IBD segment.

  • start (array of shape (n_segments,)) – Start physical position (1-based, bp) for each IBD segment.

  • end (array of shape (n_segments,)) – End physical position (1-based, bp) for each IBD segment.

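  • length_cm (array of shape (n_segments,), optional) – Genetic length (cM) for each segment, if available.

  • segment_type (array of shape (n_segments,), optional) – Segment type labels (e.g., ‘IBD1’, ‘IBD2’), if available.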

property sample_id_1

Retrieve sample_id_1.

Returns:

array of shape (n_segments,) – Sample identifiers for the first individual.

property haplotype_id_1

Retrieve haplotype_id_1.

Returns:

array of shape (n_segments,) – Haplotype identifiers for the first individual (values in {1, 2}, or -1 if unknown).

property sample_id_2

Retrieve sample_id_2.

Returns:

array of shape (n_segments,) – Sample identifiers for the second individual.

property haplotype_id_2

Retrieve haplotype_id_2.

Returns:

array of shape (n_segments,) – Haplotype identifiers for the second individual (values in {1, 2}, or -1 if unknown).

property chrom

Retrieve chrom.

Returns:

array of shape (n_segments,) – Chromosome identifier for each IBD segment.

property start

Retrieve start.

Returns:

array of shape (n_segments,) – Start physical position (1-based, bp) for each IBD segment.

property end

Retrieve end.

Returns:

array of shape (n_segments,) – End physical position (1-based, bp) for each IBD segment.

property length_cm

Retrieve length_cm.

Returns:

array of shape (n_segments,) – Genetic length (cM) for each segment if available; otherwise None.

property segment_type

Retrieve segment_type.

Returns:

array of shape (n_segments,) – Segment type labels (e.g., ‘IBD1’, ‘IBD2’), or None if unavailable.

property n_segments

Retrieve n_segments.

Returns:

int – The total number of IBD segments.

property n_samples

Retrieve the number of unique samples represented in the segments.

property n_chromosomes

Retrieve the number of unique chromosomes represented in the segments.

property shape

Retrieve the one-dimensional segment shape.

property pairs

Retrieve pairs.

Returns:

array of shape (n_segments, 2) – Per-segment sample identifier pairs.

property haplotype_pairs

Retrieve haplotype_pairs.

Returns:

array of shape (n_segments, 2) – Per-segment haplotype identifier pairs.
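The relationship between the per-segment columns and the derived properties can be illustrated without the library. The sketch below uses plain Python lists as stand-ins for the arrays. The variable names mirror the attributes above, the sample values are made up, and the code is an illustration of the layout rather than the snputils implementation.

```python
# Stand-in columns for three IBD segments (plain lists instead of arrays).
sample_id_1 = ["A", "A", "B"]
haplotype_id_1 = [1, 2, -1]   # -1 marks an unknown haplotype
sample_id_2 = ["B", "C", "C"]
haplotype_id_2 = [2, 1, -1]
chrom = ["1", "1", "2"]

# n_segments: every column holds one entry per segment.
n_segments = len(sample_id_1)

# pairs / haplotype_pairs: the two columns stacked side by side,
# giving one two-element row per segment.
pairs = [list(p) for p in zip(sample_id_1, sample_id_2)]
haplotype_pairs = [list(p) for p in zip(haplotype_id_1, haplotype_id_2)]

# n_samples / n_chromosomes: counts of unique values.
n_samples = len(set(sample_id_1) | set(sample_id_2))
n_chromosomes = len(set(chrom))

print(n_segments, n_samples, n_chromosomes)
print(pairs)
```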

copy()

Create and return a copy of self.

Returns:

IBDObject – A new instance of the current object.

keys()

Retrieve a list of public attribute names for self.

Returns:

list of str – A list of attribute names, with internal name-mangling removed.
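As background on what "name-mangling removed" can mean in Python, the sketch below uses a hypothetical `Segments` class (not `snputils.IBDObject`) whose attributes are stored with double-underscore names. It assumes the public names are recovered by stripping the `_ClassName__` prefix; the library's actual approach may differ.

```python
class Segments:
    """Hypothetical stand-in class used only for this illustration."""

    def __init__(self, chrom, start):
        # Inside a class body, self.__chrom is stored as _Segments__chrom.
        self.__chrom = chrom
        self.__start = start

    def keys(self):
        prefix = f"_{type(self).__name__}__"
        # Strip the mangling prefix to recover the public-facing names.
        return [name.replace(prefix, "", 1) for name in vars(self)]


obj = Segments(["1"], [100])
print(list(vars(obj)))  # mangled storage names
print(obj.keys())       # names with the prefix removed
```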

filter_segments(chrom=None, samples=None, min_length_cm=None, segment_types=None, inplace=False)

Filter IBD segments by chromosome, sample names, minimum genetic length, and/or segment type.

Parameters:
  • chrom (sequence of str, optional) – Chromosome(s) to include.

  • samples (sequence of str, optional) – Sample names to include if present in either column.

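  • min_length_cm (float, optional) – Minimum cM length threshold.

  • segment_types (sequence of str, optional) – Segment type label(s) to include (e.g., ‘IBD1’, ‘IBD2’).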

  • inplace (bool, default=False) – If True, modifies self in place. If False, returns a new IBDObject.

Returns:

Optional[IBDObject] – A filtered IBDObject if inplace=False. If inplace=True, returns None.
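The filtering semantics can be sketched in plain Python. The example below is not the snputils implementation: `filter_rows` and the dictionary keys are hypothetical, and it assumes that all supplied criteria are combined with a logical AND and that the length threshold is inclusive. What it does take from the description above is that a sample filter matches when the name appears in either column.

```python
# Made-up segments, one dict per segment.
segments = [
    {"s1": "A", "s2": "B", "chrom": "1", "length_cm": 12.5, "type": "IBD1"},
    {"s1": "A", "s2": "C", "chrom": "2", "length_cm": 3.0, "type": "IBD1"},
    {"s1": "C", "s2": "D", "chrom": "1", "length_cm": 8.0, "type": "IBD2"},
]


def filter_rows(rows, chrom=None, samples=None, min_length_cm=None, segment_types=None):
    """Keep rows that pass every criterion that is not None (assumed AND)."""
    kept = []
    for row in rows:
        if chrom is not None and row["chrom"] not in chrom:
            continue
        # A segment matches if either member of the pair was requested.
        if samples is not None and row["s1"] not in samples and row["s2"] not in samples:
            continue
        # Assumption: segments exactly at the threshold are kept.
        if min_length_cm is not None and row["length_cm"] < min_length_cm:
            continue
        if segment_types is not None and row["type"] not in segment_types:
            continue
        kept.append(row)
    return kept


result = filter_rows(segments, chrom=["1"], samples=["A", "D"], min_length_cm=5.0)
print([(r["s1"], r["s2"]) for r in result])
```

Returning a new list here corresponds to `inplace=False`; an in-place variant would overwrite the stored columns and return None.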

restrict_to_ancestry(*, laiobj, ancestry, require_both_haplotypes=False, min_bp=None, min_cm=None, inplace=False, method='clip')

Filter and/or trim IBD segments to intervals where both individuals carry the specified ancestry according to a LocalAncestryObject.

This performs an interval intersection per segment against ancestry tracts. If haplotype IDs are known (for example Hap-IBD), ancestry is checked on those specific haplotypes. If haplotype IDs are unknown (for example ancIBD, where haplotype_id_* == -1), ancestry is considered present for an individual if at least one haplotype matches the target ancestry, unless require_both_haplotypes=True.

method='strict' drops entire segments when any overlapping LAI window has non-target ancestry for either individual.

method='clip' trims segments to contiguous regions where both individuals carry target ancestry, clipped to LAI window boundaries and original IBD segment boundaries.

Parameters:
  • laiobj – LocalAncestryObject containing 2D lai of shape (n_windows, n_haplotypes), physical_pos (n_windows, 2), and chromosomes (n_windows,).

  • ancestry – Target ancestry code or label. Compared as string, so both int and str work.

  • require_both_haplotypes – If True, require both haplotypes of each individual to have the target ancestry within a window. This only has an effect for segments with unknown haplotypes (== -1) or for IBD2 segments; otherwise the haplotypes recorded for the segment are checked.

  • min_bp – Minimum base-pair length to retain a segment (strict) or subsegment (clip).

  • min_cm – Minimum centiMorgan length to retain a segment (strict) or subsegment (clip).

  • inplace – If True, replace self with the restricted object; else return a new object.

  • method – Method to use for filtering. ‘strict’ drops entire segments that overlap with non-target ancestry. ‘clip’ trims segments to target ancestry regions.

Returns:

Optional[IBDObject] – A restricted IBDObject if inplace=False. If inplace=True, returns None.
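The difference between the two methods can be shown on a single segment. The sketch below is a simplified illustration, not the library's algorithm: `restrict_segment` is a hypothetical helper, each LAI window is reduced to a `(start, end, both_target)` tuple where the flag already says whether both individuals carry the target ancestry, windows adjacent in the list are assumed to be contiguous, and the `min_bp` / `min_cm` thresholds are left out.

```python
def restrict_segment(seg_start, seg_end, windows, method="clip"):
    """Restrict one segment; windows is a sorted list of (start, end, both_target)."""
    overlapping = [w for w in windows if w[0] <= seg_end and w[1] >= seg_start]
    if method == "strict":
        # Keep the whole segment only if no overlapping window is non-target.
        if overlapping and all(ok for _, _, ok in overlapping):
            return [(seg_start, seg_end)]
        return []
    # method == "clip": merge runs of consecutive target windows and clip
    # each run to the original segment boundaries.
    pieces, run = [], None
    for win_start, win_end, ok in overlapping:
        if ok:
            lo, hi = max(win_start, seg_start), min(win_end, seg_end)
            run = (run[0], hi) if run else (lo, hi)
        elif run:
            pieces.append(run)
            run = None
    if run:
        pieces.append(run)
    return pieces


windows = [(1, 100, True), (101, 200, True), (201, 300, False), (301, 400, True)]
print(restrict_segment(50, 350, windows, method="strict"))
print(restrict_segment(50, 350, windows, method="clip"))
```

With these made-up windows, the strict call drops the segment because the third window is non-target, while the clip call keeps the two target stretches on either side of it.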

Readers

class snputils.IBDReader(file)

Bases: object

A factory class that attempts to detect the IBD file format and returns the corresponding reader.

Supported detections:

  • Hap-IBD: .ibd or .ibd.gz files (headerless, 8 columns)

  • ancIBD: directories with ch_all.tsv/ch*.tsv or .tsv / .tsv.gz files with ancIBD schema
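A name-based version of this detection might look like the sketch below. `guess_ibd_format` is a hypothetical helper written for illustration; it checks only file names and directory contents, whereas the list above indicates that the real reader also inspects the ancIBD schema for TSV files.

```python
from pathlib import Path


def guess_ibd_format(path):
    """Hypothetical helper: classify a path by name only (no schema check)."""
    p = Path(path)
    name = p.name.lower()
    if name.endswith((".ibd", ".ibd.gz")):
        return "hap-ibd"
    if name.endswith((".tsv", ".tsv.gz")):
        return "ancibd"
    # A directory counts as ancIBD output if it holds ch_all.tsv or ch*.tsv files.
    if p.is_dir() and any(p.glob("ch*.tsv*")):
        return "ancibd"
    return "unknown"


print(guess_ibd_format("results/chr20.ibd.gz"))
print(guess_ibd_format("ancibd_out/ch_all.tsv"))
```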

class snputils.HapIBDReader(file)

Bases: IBDBaseReader

Reads an IBD file in Hap-IBD format and processes it into an IBDObject.

Parameters:

file (str or pathlib.Path) – Path to the IBD file to read.

read(separator=None)

Read a Hap-IBD file into an IBDObject.

The Hap-IBD format is headerless delimited text with the columns: sample_id_1, haplotype_id_1, sample_id_2, haplotype_id_2, chromosome, start, end, length_cm

Notes:

  • Haplotype identifiers are 1-based and take values in {1, 2}.

Parameters:

separator (str, optional) – Field delimiter. If None, whitespace (any number of spaces or tabs) is assumed.

Returns:

IBDObject – An IBDObject instance.
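To make the eight-column layout concrete, the sketch below parses a single record in plain Python. `parse_hapibd_line` is a hypothetical helper and the sample line is made up; the reader itself may parse files quite differently. The `separator=None` default maps onto `str.split`, which splits on any run of whitespace.

```python
HAPIBD_COLUMNS = [
    "sample_id_1", "haplotype_id_1", "sample_id_2", "haplotype_id_2",
    "chrom", "start", "end", "length_cm",
]


def parse_hapibd_line(line, separator=None):
    """Split one headerless record; separator=None splits on any whitespace."""
    fields = line.strip().split(separator)
    if len(fields) != len(HAPIBD_COLUMNS):
        raise ValueError(f"expected {len(HAPIBD_COLUMNS)} fields, got {len(fields)}")
    record = dict(zip(HAPIBD_COLUMNS, fields))
    # Convert the numeric columns; identifiers and chromosome stay strings.
    for key in ("haplotype_id_1", "haplotype_id_2", "start", "end"):
        record[key] = int(record[key])
    record["length_cm"] = float(record["length_cm"])
    return record


row = parse_hapibd_line("NA1\t1\tNA2\t2\t20\t1000\t250000\t3.25")
print(row)
```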

class snputils.AncIBDReader(file)

Bases: IBDBaseReader

Reads IBD data from ancIBD outputs (TSV), accepting a file (ch_all.tsv or ch*.tsv) or a directory.

Parameters:

file (str or pathlib.Path) – Path to the ancIBD TSV file or output directory to read.

read(path=None, include_segment_types=('IBD1', 'IBD2'))

Read ancIBD outputs and convert to IBDObject.

Inputs accepted:

  • A single TSV (optionally gzipped), e.g. ch_all.tsv[.gz] or ch{CHR}.tsv[.gz].

  • A directory containing per-chromosome TSVs or ch_all.tsv.

Column schema (tab-separated with header): iid1, iid2, ch, Start, End, length, StartM, EndM, lengthM, StartBP, EndBP, segment_type

Notes:

  • Haplotype indices are not provided by ancIBD; they are set to -1.

  • Positions in IBDObject use the base-pair columns StartBP/EndBP.

  • Length is given in centiMorgans, computed as lengthM * 100.

Parameters:
  • path (str or Path, optional) – Override input path. Defaults to self.file.

  • include_segment_types (sequence of str, optional) – Filter by segment_type (e.g., IBD1, IBD2). None to disable.

Returns:

IBDObject – An IBDObject instance.
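The conversion rules in the notes above (haplotypes set to -1, positions from StartBP/EndBP, length as lengthM * 100) can be sketched with the standard `csv` module. `convert_rows` is a hypothetical helper, and the two-row excerpt is made up and carries only a subset of the ancIBD columns; this is an illustration of the mapping, not the reader's code.

```python
import csv
import io

# Made-up excerpt with only the columns this sketch needs.
text = (
    "iid1\tiid2\tch\tlengthM\tStartBP\tEndBP\tsegment_type\n"
    "I001\tI002\t3\t0.125\t1000\t900000\tIBD1\n"
    "I001\tI003\t3\t0.08\t5000\t700000\tIBD2\n"
)


def convert_rows(handle, include_segment_types=("IBD1", "IBD2")):
    """Map ancIBD rows onto IBDObject-style fields; None disables type filtering."""
    records = []
    for row in csv.DictReader(handle, delimiter="\t"):
        if include_segment_types is not None and row["segment_type"] not in include_segment_types:
            continue
        records.append({
            "sample_id_1": row["iid1"],
            "haplotype_id_1": -1,  # ancIBD does not report haplotypes
            "sample_id_2": row["iid2"],
            "haplotype_id_2": -1,
            "chrom": row["ch"],
            "start": int(row["StartBP"]),
            "end": int(row["EndBP"]),
            "length_cm": float(row["lengthM"]) * 100,  # Morgans to centiMorgans
            "segment_type": row["segment_type"],
        })
    return records


records = convert_rows(io.StringIO(text), include_segment_types=("IBD1",))
print(records)
```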

Read Functions

snputils.read_ibd(file, **kwargs)

Automatically detect the IBD data file format from the file’s extension and read it into an IBDObject.

Supported formats:

  • Hap-IBD (no standard extension; defaults to tab-delimited columns without header).

  • ancIBD (template only).

Parameters:
  • file (str or pathlib.Path) – Path to the file to be read.

  • **kwargs – Additional arguments passed to the reader method.