IBD
Identity-by-descent segment objects and readers.
Objects
- class snputils.IBDObject(sample_id_1, haplotype_id_1, sample_id_2, haplotype_id_2, chrom, start, end, length_cm=None, segment_type=None)
Bases: object
A class for Identity-By-Descent (IBD) segment data.
- Parameters:
sample_id_1 (array of shape (n_segments,)) – Sample identifiers for the first individual.
haplotype_id_1 (array of shape (n_segments,)) – Haplotype identifiers for the first individual (values in {1, 2}, or -1 if unknown).
sample_id_2 (array of shape (n_segments,)) – Sample identifiers for the second individual.
haplotype_id_2 (array of shape (n_segments,)) – Haplotype identifiers for the second individual (values in {1, 2}, or -1 if unknown).
chrom (array of shape (n_segments,)) – Chromosome identifier for each IBD segment.
start (array of shape (n_segments,)) – Start physical position (1-based, bp) for each IBD segment.
end (array of shape (n_segments,)) – End physical position (1-based, bp) for each IBD segment.
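length_cm (array of shape (n_segments,), optional) – Genetic length (cM) for each segment, if available.
segment_type (array of shape (n_segments,), optional) – Segment type labels (e.g., ‘IBD1’, ‘IBD2’), if available.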
- property sample_id_1
Retrieve sample_id_1.
- Returns:
array of shape (n_segments,) – Sample identifiers for the first individual.
- property haplotype_id_1
Retrieve haplotype_id_1.
- Returns:
array of shape (n_segments,) – Haplotype identifiers for the first individual (values in {1, 2}, or -1 if unknown).
- property sample_id_2
Retrieve sample_id_2.
- Returns:
array of shape (n_segments,) – Sample identifiers for the second individual.
- property haplotype_id_2
Retrieve haplotype_id_2.
- Returns:
array of shape (n_segments,) – Haplotype identifiers for the second individual (values in {1, 2}, or -1 if unknown).
- property chrom
Retrieve chrom.
- Returns:
array of shape (n_segments,) – Chromosome identifier for each IBD segment.
- property start
Retrieve start.
- Returns:
array of shape (n_segments,) – Start physical position (1-based, bp) for each IBD segment.
- property end
Retrieve end.
- Returns:
array of shape (n_segments,) – End physical position (1-based, bp) for each IBD segment.
- property length_cm
Retrieve length_cm.
- Returns:
array of shape (n_segments,) – Genetic length (cM) for each segment if available; otherwise None.
- property segment_type
Retrieve segment_type.
- Returns:
array of shape (n_segments,) – Segment type labels (e.g., ‘IBD1’, ‘IBD2’), or None if unavailable.
- property n_segments
Retrieve n_segments.
- Returns:
int – The total number of IBD segments.
- property n_samples
Retrieve the number of unique samples represented in the segments.
- property n_chromosomes
Retrieve the number of unique chromosomes represented in the segments.
- property shape
Retrieve the one-dimensional segment shape.
- property pairs
Retrieve pairs.
- Returns:
array of shape (n_segments, 2) – Per-segment sample identifier pairs.
- property haplotype_pairs
Retrieve haplotype_pairs.
- Returns:
array of shape (n_segments, 2) – Per-segment haplotype identifier pairs.
- copy()
Create and return a copy of self.
- Returns:
IBDObject – A new instance of the current object.
- keys()
Retrieve a list of public attribute names for self.
- Returns:
list of str – A list of attribute names, with internal name-mangling removed.
- filter_segments(chrom=None, samples=None, min_length_cm=None, segment_types=None, inplace=False)
Filter IBD segments by chromosome, sample names, minimum genetic length, and/or segment type.
- Parameters:
chrom (sequence of str, optional) – Chromosome(s) to include.
samples (sequence of str, optional) – Sample names to include. A segment is kept if a listed sample appears in either sample column.
min_length_cm (float, optional) – Minimum cM length threshold.
segment_types (sequence of str, optional) – Segment type labels to include (e.g., ‘IBD1’, ‘IBD2’).
inplace (bool, default=False) – If True, modifies self in place. If False, returns a new IBDObject.
- Returns:
Optional[IBDObject] – A filtered IBDObject if inplace=False. If inplace=True, returns None.
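The filter semantics described above can be illustrated on plain Python tuples. This is only a sketch, not the snputils implementation: the helper name keep_segment and the sample data are made up, and it assumes that the filters combine with logical AND and that the length threshold is inclusive.

```python
# Plain-Python sketch of the documented filter semantics; not snputils code.
segments = [
    # (sample_id_1, sample_id_2, chrom, length_cm) -- made-up values
    ("A", "B", "1", 3.2),
    ("A", "C", "1", 8.5),
    ("B", "C", "2", 12.0),
    ("C", "D", "2", 1.1),
]


def keep_segment(seg, chrom=None, samples=None, min_length_cm=None):
    """Return True if the segment passes every filter that is not None."""
    s1, s2, ch, length = seg
    if chrom is not None and ch not in chrom:
        return False
    # Keep the segment when either member of the pair is a requested sample.
    if samples is not None and s1 not in samples and s2 not in samples:
        return False
    # Assumption: the threshold is inclusive (length >= min_length_cm).
    if min_length_cm is not None and length < min_length_cm:
        return False
    return True


filtered = [
    seg for seg in segments
    if keep_segment(seg, chrom=["2"], samples=["C"], min_length_cm=2.0)
]
print(filtered)
```

With an IBDObject, the equivalent call would use the documented signature, for example filter_segments(chrom=["2"], samples=["C"], min_length_cm=2.0).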
- restrict_to_ancestry(*, laiobj, ancestry, require_both_haplotypes=False, min_bp=None, min_cm=None, inplace=False, method='clip')
Filter and/or trim IBD segments to intervals where both individuals carry the specified ancestry according to a LocalAncestryObject.
This performs an interval intersection per segment against ancestry tracts. If haplotype IDs are known (for example Hap-IBD), ancestry is checked on those specific haplotypes. If haplotype IDs are unknown (for example ancIBD, where haplotype_id_* == -1), ancestry is considered present for an individual if at least one haplotype matches the target ancestry, unless require_both_haplotypes=True.
method='strict' drops entire segments when any overlapping LAI window has non-target ancestry for either individual.
method='clip' trims segments to contiguous regions where both individuals carry target ancestry, clipped to LAI window boundaries and original IBD segment boundaries.
- Parameters:
laiobj – LocalAncestryObject containing 2D lai of shape (n_windows, n_haplotypes), physical_pos (n_windows, 2), and chromosomes (n_windows,).
ancestry – Target ancestry code or label. Compared as string, so both int and str work.
require_both_haplotypes – If True, require both haplotypes of each individual to have the target ancestry within a window. When haplotypes are known per segment, this only affects cases with unknown haplotypes (== -1) or IBD2 segments.
min_bp – Minimum base-pair length to retain a segment (strict) or subsegment (clip).
min_cm – Minimum centiMorgan length to retain a segment (strict) or subsegment (clip).
inplace – If True, replace self with the restricted object; else return a new object.
method – Method to use for filtering. ‘strict’ drops entire segments that overlap with non-target ancestry. ‘clip’ trims segments to target ancestry regions.
- Returns:
Optional[IBDObject] – A restricted IBDObject if inplace=False. If inplace=True, returns None.
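The difference between method='clip' and method='strict' can be sketched for a single segment. The code below is an illustration under stated assumptions, not the library's implementation: the function names are hypothetical, each LAI window is reduced to a (start, end, both_target) tuple where both_target already says whether both individuals carry the target ancestry, and windows are assumed to be sorted and adjacent.

```python
def clip_segment(seg_start, seg_end, windows):
    """Trim one segment to contiguous runs of target-ancestry windows.

    windows: list of (win_start, win_end, both_target), sorted by position
    and assumed to be adjacent (no gaps between consecutive windows).
    """
    pieces = []
    current = None  # [start, end] of the run currently being extended
    for win_start, win_end, both_target in windows:
        lo = max(seg_start, win_start)
        hi = min(seg_end, win_end)
        if lo > hi:
            continue  # window does not overlap the segment
        if both_target:
            if current is None:
                current = [lo, hi]
            else:
                current[1] = hi
        elif current is not None:
            pieces.append(tuple(current))
            current = None
    if current is not None:
        pieces.append(tuple(current))
    return pieces


def strict_segment(seg_start, seg_end, windows):
    """Keep the whole segment only if every overlapping window is target."""
    overlapping = [
        w for w in windows if max(seg_start, w[0]) <= min(seg_end, w[1])
    ]
    if overlapping and all(w[2] for w in overlapping):
        return [(seg_start, seg_end)]
    return []


# Made-up windows: the third one has non-target ancestry.
windows = [(1, 100, True), (101, 200, True), (201, 300, False), (301, 400, True)]

clipped = clip_segment(50, 350, windows)
strict = strict_segment(50, 350, windows)

# Assumption: length in bp is counted inclusively (end - start + 1).
long_pieces = [(lo, hi) for lo, hi in clipped if hi - lo + 1 >= 100]
print(clipped, strict, long_pieces)
```

In this sketch, clipping splits the segment around the non-target window, strict mode drops it entirely, and the final list comprehension mimics a min_bp threshold applied to the clipped subsegments.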
Readers
- class snputils.IBDReader(file)
Bases: object
A factory class that attempts to detect the IBD file format and returns the corresponding reader.
Supported detections:
Hap-IBD: .ibd or .ibd.gz files (headerless, 8 columns)
ancIBD: directories with ch_all.tsv/ch*.tsv or .tsv / .tsv.gz files with ancIBD schema
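A rough sketch of name-based detection following the rules listed above. The function guess_ibd_format is hypothetical and not part of snputils; in particular, it does not inspect file contents, so it cannot confirm that a .tsv file actually follows the ancIBD schema.

```python
from pathlib import Path


def guess_ibd_format(path):
    """Guess the IBD format from the path name only (hypothetical helper)."""
    p = Path(path)
    name = p.name.lower()
    # A directory counts as ancIBD if it holds ch_all.tsv or ch*.tsv files.
    if p.is_dir() and any(p.glob("ch*.tsv*")):
        return "ancibd"
    if name.endswith((".ibd", ".ibd.gz")):
        return "hapibd"
    if name.endswith((".tsv", ".tsv.gz")):
        return "ancibd"
    return None  # unknown format


print(guess_ibd_format("results/chr20.ibd.gz"))
print(guess_ibd_format("ch_all.tsv"))
```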
- class snputils.HapIBDReader(file)
Bases: IBDBaseReader
Reads an IBD file in Hap-IBD format and processes it into an IBDObject.
- Parameters:
file (str or pathlib.Path) – Path to the IBD file to read.
- read(separator=None)
Read a Hap-IBD file into an IBDObject.
The Hap-IBD format is headerless delimited text with the columns: sample_id_1, haplotype_id_1, sample_id_2, haplotype_id_2, chromosome, start, end, length_cm
Notes:
- Haplotype identifiers are 1-based and take values in {1, 2}.
- Parameters:
separator (str, optional) – Field delimiter. If None, whitespace (any number of spaces or tabs) is assumed.
- Returns:
IBDObject – An IBDObject instance.
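To make the column layout concrete, here is a minimal sketch that parses one Hap-IBD line with the same whitespace default as separator=None. It is not the reader's actual code: parse_hapibd_line is a hypothetical helper and the sample line is made up.

```python
HAPIBD_COLUMNS = (
    "sample_id_1", "haplotype_id_1",
    "sample_id_2", "haplotype_id_2",
    "chrom", "start", "end", "length_cm",
)


def parse_hapibd_line(line, separator=None):
    """Split one headerless Hap-IBD line into a typed record.

    With separator=None, str.split() treats any run of spaces or tabs
    as a single delimiter.
    """
    fields = line.strip().split(separator)
    if len(fields) != len(HAPIBD_COLUMNS):
        raise ValueError(f"expected 8 columns, got {len(fields)}")
    record = dict(zip(HAPIBD_COLUMNS, fields))
    for key in ("haplotype_id_1", "haplotype_id_2", "start", "end"):
        record[key] = int(record[key])
    record["length_cm"] = float(record["length_cm"])
    return record


# Made-up line mixing spaces and tabs.
record = parse_hapibd_line("HG00096  1\tHG00097 2 20   1000000 2500000\t3.417")
print(record)
```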
- class snputils.AncIBDReader(file)
Bases: IBDBaseReader
Reads IBD data from ancIBD outputs (TSV), accepting a file (ch_all.tsv or ch*.tsv) or a directory.
- Parameters:
file (str or pathlib.Path) – Path to the IBD file to read.
- read(path=None, include_segment_types=('IBD1', 'IBD2'))
Read ancIBD outputs and convert them to an IBDObject.
Inputs accepted:
- A single TSV (optionally gzipped), e.g. ch_all.tsv[.gz] or ch{CHR}.tsv[.gz].
- A directory containing per-chromosome TSVs or ch_all.tsv.
Column schema (tab-separated with header): iid1, iid2, ch, Start, End, length, StartM, EndM, lengthM, StartBP, EndBP, segment_type
Notes:
- Haplotype indices are not provided by ancIBD; they are set to -1.
- Positions in the IBDObject use the base-pair columns StartBP/EndBP.
- Length is reported in centiMorgans, computed as lengthM * 100.
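The conversion rules in the notes can be sketched with the standard csv module. This is an illustrative sketch rather than the reader's implementation: convert_row is a hypothetical helper, the two data rows are made up, and the output keys simply mirror the IBDObject field names.

```python
import csv
import io

# Made-up ancIBD-style rows using the documented column schema.
tsv_text = (
    "iid1\tiid2\tch\tStart\tEnd\tlength\tStartM\tEndM\tlengthM\t"
    "StartBP\tEndBP\tsegment_type\n"
    "I001\tI002\t3\t100\t900\t800\t0.10\t0.25\t0.15\t1200000\t9800000\tIBD1\n"
    "I001\tI003\t3\t50\t400\t350\t0.40\t0.48\t0.08\t30000000\t36000000\tIBD2\n"
)


def convert_row(row):
    """Map one ancIBD row to IBDObject-style fields."""
    return {
        "sample_id_1": row["iid1"],
        "haplotype_id_1": -1,  # ancIBD does not report haplotype indices
        "sample_id_2": row["iid2"],
        "haplotype_id_2": -1,
        "chrom": row["ch"],
        "start": int(row["StartBP"]),  # physical positions in bp
        "end": int(row["EndBP"]),
        "length_cm": float(row["lengthM"]) * 100,  # Morgans to centiMorgans
        "segment_type": row["segment_type"],
    }


include_segment_types = ("IBD1", "IBD2")
reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
segments = [
    convert_row(row) for row in reader
    if row["segment_type"] in include_segment_types
]
for seg in segments:
    print(seg["sample_id_1"], seg["sample_id_2"], round(seg["length_cm"], 3))
```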
Read Functions
- snputils.read_ibd(file, **kwargs)
Automatically detect the IBD data file format from the file’s extension and read it into an IBDObject.
Supported formats:
- Hap-IBD (no standard extension; defaults to tab-delimited columns without header).
- ancIBD (template only).
- Parameters:
file (str or pathlib.Path) – Path to the file to be read.
**kwargs – Additional arguments passed to the reader method.