Phenotypes

Phenotype data containers and readers for single-trait and multi-trait tables. Phenotype objects align sample IDs with trait values and feed run_gwas(), run_admixture_mapping(), and covariate-aware association workflows.

Objects

class snputils.PhenotypeObject(samples, values, phenotype_name='PHENO', quantitative=None)

Bases: object

Generic phenotype container for single-trait analyses.

The object stores sample IDs, normalized phenotype values, inferred/declared trait type, and binary case/control convenience attributes.
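To make the stored attributes concrete, here is a minimal standalone sketch of a single-trait container. It does not use or reproduce the snputils implementation: the class name PhenotypeSketch, the "exactly two distinct values means binary" inference rule, and the 0/1 recoding are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PhenotypeSketch:
    """Hypothetical stand-in for a single-trait container (not snputils code)."""

    samples: List[str]
    values: List[float]
    phenotype_name: str = "PHENO"
    quantitative: Optional[bool] = None  # None means "infer from the values"
    cases: List[str] = field(init=False, default_factory=list)
    controls: List[str] = field(init=False, default_factory=list)

    def __post_init__(self) -> None:
        if len(self.samples) != len(self.values):
            raise ValueError("samples and values must have the same length")
        self.values = [float(v) for v in self.values]
        if self.quantitative is None:
            # Assumed rule: exactly two distinct values means a binary trait.
            self.quantitative = len(set(self.values)) != 2
        if not self.quantitative:
            levels = sorted(set(self.values))
            if len(levels) != 2:
                raise ValueError("a binary trait needs exactly two distinct values")
            case_level = levels[1]
            # Assumed normalization: lower level becomes 0 (control), higher becomes 1 (case).
            self.values = [1.0 if v == case_level else 0.0 for v in self.values]
            self.cases = [s for s, v in zip(self.samples, self.values) if v == 1.0]
            self.controls = [s for s, v in zip(self.samples, self.values) if v == 0.0]


# PLINK-style 1/2 coding is inferred as binary; continuous values stay quantitative.
binary = PhenotypeSketch(["s1", "s2", "s3"], [1, 2, 2])
height = PhenotypeSketch(["s1", "s2", "s3"], [170.5, 165.0, 181.2], phenotype_name="height")
```

In this sketch, passing quantitative=True or quantitative=False skips the inference step, which mirrors the "inferred/declared" distinction described above.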

class snputils.MultiPhenotypeObject(phen_df, sample_column=None)

Bases: object

Sample-aligned table for multiple phenotypes or sample traits.

filter_samples(samples=None, indexes=None, include=True, reorder=False, inplace=False)

Filter rows by sample ID or row index.
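The reference does not spell out how include and reorder interact, so the following standalone sketch shows one plausible reading on a plain list of rows. The function filter_rows, the id_key argument, and the exact semantics (include=False drops the selection, reorder=True returns rows in the order requested) are assumptions, not the documented behavior of filter_samples().

```python
from typing import Dict, List, Optional, Sequence

Row = Dict[str, object]


def filter_rows(
    rows: List[Row],
    samples: Optional[Sequence[str]] = None,
    indexes: Optional[Sequence[int]] = None,
    include: bool = True,
    reorder: bool = False,
    id_key: str = "IID",
) -> List[Row]:
    """Hypothetical row filter by sample ID or by row index."""
    if (samples is None) == (indexes is None):
        raise ValueError("pass exactly one of samples or indexes")
    if samples is not None:
        position = {row[id_key]: i for i, row in enumerate(rows)}
        # Unknown sample IDs are silently ignored in this sketch.
        selected = [position[s] for s in samples if s in position]
    else:
        # Normalize negative indexes so they can be compared with enumerate().
        selected = [i if i >= 0 else i + len(rows) for i in indexes]
    if not include:
        dropped = set(selected)
        return [row for i, row in enumerate(rows) if i not in dropped]
    if reorder:
        # Follow the order in which the caller listed the selection.
        return [rows[i] for i in selected]
    kept = set(selected)
    return [row for i, row in enumerate(rows) if i in kept]


rows = [
    {"IID": "s1", "bmi": 22.1},
    {"IID": "s2", "bmi": 24.3},
    {"IID": "s3", "bmi": 19.8},
]
kept_original_order = filter_rows(rows, samples=["s3", "s1"])
kept_requested_order = filter_rows(rows, samples=["s3", "s1"], reorder=True)
without_first = filter_rows(rows, indexes=[0], include=False)
```

The sketch always returns a new list; the inplace flag of the real method is not modeled here.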

class snputils.CovariateObject(samples, values, covariate_names=None)

Bases: object

Sample-aligned covariate matrix for association analyses.

Covariate construction

snputils.phenotype.read_covar_file(path, col_nums=None, *, variance_standardize=False)

Read a PLINK-style covariate table with an IID column.

snputils.phenotype.build_association_covariates(*, embedding=None, n_components=None, global_ancestry=None, drop_ancestry=-1, columns=None, ancestry_names=None, file=None, col_nums=None)

Compose optional embedding, global ancestry, and file covariate blocks.

CovariateObject factories (class methods on CovariateObject):

  • from_file(path, col_nums=None) — read a PLINK-style covariate table (IID plus numeric columns).

  • from_embedding(model, n_components=None) — PCs or MDS coordinates from a fitted PCA, mdPCA, or maasMDS model.

  • from_global_ancestry(admobj, columns=None, drop_ancestry=-1) — ADMIXTURE Q proportions; drops the last ancestry column by default.

  • merge(*objs) — inner-join sample IDs and concatenate covariate columns.
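The two less obvious steps in the list above, dropping one ancestry column and inner-joining on sample IDs, can be sketched without the library. Ancestry proportions sum to 1 for each sample, so keeping every column alongside an intercept makes the design matrix collinear; that is presumably why one column is dropped by default. The Block tuple layout and the helper names drop_column and merge_blocks are hypothetical, and keeping the first block's sample order is an assumption.

```python
from typing import List, Tuple

# (sample IDs, one row of covariate values per sample, covariate names)
Block = Tuple[List[str], List[List[float]], List[str]]


def drop_column(block: Block, index: int = -1) -> Block:
    """Remove one covariate column; index=-1 removes the last one."""
    samples, rows, names = block
    index = index % len(names)
    kept_rows = [row[:index] + row[index + 1:] for row in rows]
    return samples, kept_rows, names[:index] + names[index + 1:]


def merge_blocks(*blocks: Block) -> Block:
    """Inner-join blocks on sample ID and concatenate their columns."""
    if not blocks:
        raise ValueError("need at least one block")
    shared = set(blocks[0][0])
    for samples, _, _ in blocks[1:]:
        shared &= set(samples)
    # Assumed ordering: follow the first block, restricted to shared IDs.
    kept = [s for s in blocks[0][0] if s in shared]
    lookups = [dict(zip(samples, rows)) for samples, rows, _ in blocks]
    merged_rows = [[x for lookup in lookups for x in lookup[s]] for s in kept]
    merged_names = [n for _, _, names in blocks for n in names]
    return kept, merged_rows, merged_names


pcs: Block = (
    ["s1", "s2", "s3"],
    [[0.1, -0.2], [0.0, 0.3], [0.4, 0.1]],
    ["PC1", "PC2"],
)
ancestry: Block = (
    ["s3", "s1", "s4"],
    [[0.2, 0.5, 0.3], [0.7, 0.2, 0.1], [0.1, 0.1, 0.8]],
    ["A1", "A2", "A3"],
)
samples, rows, names = merge_blocks(pcs, drop_column(ancestry))
```

Only s1 and s3 appear in both blocks, so the merged table has two rows and the columns PC1, PC2, A1, A2.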

Readers

snputils.read_pheno(file, col=None, *, quantitative=None)

Read a phenotype file into a PhenotypeObject.

Parameters:
  • file – Path to a headered phenotype table (.txt, .phe, .pheno, …).

  • col – Phenotype column to load (header name, with or without #). If the file has a single phenotype column, this may be omitted.

  • quantitative – If set, force quantitative (linear) or binary (logistic) mode. When None, inferred from the column values.

class snputils.PhenotypeReader(file)

Bases: PhenotypeBaseReader

Reader for phenotype files (any extension; common: .txt, .phe, .pheno).

Expected format (headered, whitespace-delimited):
  • Must include IID (optionally preceded by FID)

  • Must include one or more phenotype columns after IID

  • If multiple phenotype columns are present, select one explicitly
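The format rules above can be illustrated with a small standalone parser. This is a sketch of the file convention only, not the PhenotypeReader implementation: the function read_pheno_sketch is hypothetical, and the handling of a leading '#' on header names and of blank lines is assumed.

```python
import io
from typing import List, Optional, TextIO, Tuple


def read_pheno_sketch(
    handle: TextIO, col: Optional[str] = None
) -> Tuple[List[str], List[float], str]:
    """Hypothetical parser for a headered, whitespace-delimited phenotype table."""
    # Tolerate a leading '#' on header names such as '#FID' or '#IID'.
    names = [name.lstrip("#") for name in handle.readline().split()]
    if "IID" not in names:
        raise ValueError("header must contain an IID column")
    iid_idx = names.index("IID")
    pheno_names = names[iid_idx + 1:]
    if not pheno_names:
        raise ValueError("no phenotype column after IID")
    if col is None:
        if len(pheno_names) > 1:
            raise ValueError("multiple phenotype columns: pass col explicitly")
        col = pheno_names[0]
    col = col.lstrip("#")
    if col not in pheno_names:
        raise ValueError(f"unknown phenotype column: {col}")
    col_idx = names.index(col, iid_idx + 1)

    samples: List[str] = []
    values: List[float] = []
    for line in handle:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        samples.append(parts[iid_idx])
        values.append(float(parts[col_idx]))
    return samples, values, col


table = io.StringIO("#FID IID height bmi\nf1 s1 170.5 22.1\nf2 s2 165.0 24.3\n")
samples, values, name = read_pheno_sketch(table, col="bmi")
```

With two phenotype columns present, omitting col raises an error in this sketch, which matches the rule that one column must be selected explicitly.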

property file

Return the path of the phenotype file passed to the reader.

Returns:

pathlib.Path – Path to the file containing phenotype data.

read(phenotype_col=None, quantitative=None)

Read the phenotype file and return a snputils.PhenotypeObject.

phenotype_col selects the phenotype column and may be omitted when the file has a single phenotype column. quantitative forces the trait type or, when None, leaves it to be inferred from the column values. Both correspond to the col and quantitative arguments of read_pheno().

class snputils.MultiPhenReader(file)

Bases: PhenotypeBaseReader

Reader for headered multi-phenotype tables with an IID column.

property file

Return the path of the phenotype file passed to the reader.

Returns:

pathlib.Path – Path to the file containing phenotype data.

read(samples_idx=0, phen_names=None, sep=',', header=0, drop=False)

Read a multi-phenotype table using the same IID convention as read_pheno().