# PPP 0.1.13

{{< admonition success "Installed" true >}}
This software should be available with no extra configuration.
{{< /admonition >}}

## ppp-0.1.13 - Popgen Pipeline Platform (PPP)

The PPP is a software platform with the goal of reducing the computational expertise required for conducting population genomic analyses. The PPP was designed as a collection of scripts that facilitate common population genomic workflows in a consistent and standardized environment. Functions were developed to encompass entire workflows, including input preparation, file format conversion, various population genomic analyses, output generation, and visualization.

By facilitating entire workflows, the PPP offers several benefits to prospective end users: it reduces the need for redundant in-house software and scripts that would require development time and may be error-prone or incorrect, depending on the expertise of the investigator. The platform has also been developed with reproducibility and extensibility of analyses in mind.

The current documentation may be found [here](https://ppp.readthedocs.io/). A PDF of the documentation is also available for [download](https://readthedocs.org/projects/ppp/downloads/pdf/latest/).

-------------------------------------------------------------------------------

## Location and version

```console
$ which vcf_filter.py
/local/cluster/bin/vcf_filter.py
$ vcf_filter.py -h
initLogger - WARNING: PPP, version 0.1.13
```

## Help message

There are other scripts associated with this software as well. Please see the full documentation for more information.

```console
$ vcf_filter.py -h
initLogger - WARNING: PPP, version 0.1.13
usage: vcf_filter.py [-h] --vcf VCF [--model-file MODEL_FILE] [--model MODEL]
                     [--out OUT] [--out-prefix OUT_PREFIX]
                     [--out-format {vcf, vcf.gz, bcf, bed, sites}]
                     [--overwrite] [--force-samples]
                     [--filter-include-indv FILTER_INCLUDE_INDV [FILTER_INCLUDE_INDV ...]
                     | --filter-exclude-indv FILTER_EXCLUDE_INDV [FILTER_EXCLUDE_INDV ...]]
                     [--filter-include-indv-file FILTER_INCLUDE_INDV_FILE | --filter-exclude-indv-file FILTER_EXCLUDE_INDV_FILE]
                     [--filter-only-biallelic]
                     [--filter-min-alleles FILTER_MIN_ALLELES]
                     [--filter-max-alleles FILTER_MAX_ALLELES]
                     [--filter-max-missing FILTER_MAX_MISSING | --filter-max-missing-count FILTER_MAX_MISSING_COUNT]
                     [--filter-include-indels | --filter-exclude-indels]
                     [--filter-include-snps | --filter-exclude-snps]
                     [--filter-include-pos FILTER_INCLUDE_POS [FILTER_INCLUDE_POS ...]]
                     [--filter-exclude-pos FILTER_EXCLUDE_POS [FILTER_EXCLUDE_POS ...]]
                     [--filter-include-pos-file FILTER_INCLUDE_POS_FILE]
                     [--filter-exclude-pos-file FILTER_EXCLUDE_POS_FILE]
                     [--filter-include-bed FILTER_INCLUDE_BED]
                     [--filter-exclude-bed FILTER_EXCLUDE_BED]
                     [--filter-include-passed] [--filter-exclude-passed]
                     [--filter-include-flag FILTER_INCLUDE_FLAG [FILTER_INCLUDE_FLAG ...]]
                     [--filter-exclude-flag FILTER_EXCLUDE_FLAG [FILTER_EXCLUDE_FLAG ...]]
                     [--filter-include-snp FILTER_INCLUDE_SNP [FILTER_INCLUDE_SNP ...]]
                     [--filter-exclude-snp FILTER_EXCLUDE_SNP [FILTER_EXCLUDE_SNP ...]]
                     [--filter-include-snp-file FILTER_INCLUDE_SNP_FILE]
                     [--filter-exclude-snp-file FILTER_EXCLUDE_SNP_FILE]
                     [--filter-maf-min FILTER_MAF_MIN]
                     [--filter-maf-max FILTER_MAF_MAX]
                     [--filter-mac-min FILTER_MAC_MIN]
                     [--filter-mac-max FILTER_MAC_MAX]

optional arguments:
  -h, --help            show this help message and exit
  --vcf VCF             Defines the filename of the VCF (default: None)
  --model-file MODEL_FILE
                        Defines the model file (default: None)
  --model MODEL         Defines the model and the individual(s) to include
                        (default: None)
  --out OUT             Defines the complete output filename, overrides --out-
                        prefix (default: None)
  --out-prefix OUT_PREFIX
                        Defines the output prefix (i.e.
                        filename without file extension) (default: out)
  --out-format {vcf, vcf.gz, bcf, bed, sites}
                        Defines the desired output format (default: vcf.gz)
  --overwrite           Overwrite previous output (default: False)
  --force-samples       Ignore the error rasied when a sample that does not
                        exist (default: False)
  --filter-include-indv FILTER_INCLUDE_INDV [FILTER_INCLUDE_INDV ...]
                        Defines the individual(s) to include. May be used
                        multiple times (default: None)
  --filter-exclude-indv FILTER_EXCLUDE_INDV [FILTER_EXCLUDE_INDV ...]
                        Defines the individual(s) to exclude. May be used
                        multiple times (default: None)
  --filter-include-indv-file FILTER_INCLUDE_INDV_FILE
                        Defines a file of individuals to include (default:
                        None)
  --filter-exclude-indv-file FILTER_EXCLUDE_INDV_FILE
                        Defines a file of individuals to exclude (default:
                        None)
  --filter-only-biallelic
                        Only include variants that are biallelic (default:
                        False)
  --filter-min-alleles FILTER_MIN_ALLELES
                        Include variants with a number of allele >= to the
                        given number (default: None)
  --filter-max-alleles FILTER_MAX_ALLELES
                        Include variants with a number of allele <= to the
                        given number (default: None)
  --filter-max-missing FILTER_MAX_MISSING
                        Max proportion of missing data allowed (0.0: no
                        missing data, 1.0: include all data) (default: None)
  --filter-max-missing-count FILTER_MAX_MISSING_COUNT
                        Max number of sample with missing data allowed
                        (default: None)
  --filter-include-indels
                        Include variants if they contain an insertion or a
                        deletion (default: False)
  --filter-exclude-indels
                        Exclude variants if they contain an insertion or a
                        deletion (default: False)
  --filter-include-snps
                        Include variants if they contain a SNP (default:
                        False)
  --filter-exclude-snps
                        Exclude variants if they contain a SNP (default:
                        False)
  --filter-include-pos FILTER_INCLUDE_POS [FILTER_INCLUDE_POS ...]
                        Defines comma seperated positions (i.e. CHROM:START-
                        END) to include. START and END are optional.
                        May be used multiple times (default: None)
  --filter-exclude-pos FILTER_EXCLUDE_POS [FILTER_EXCLUDE_POS ...]
                        Defines comma seperated positions (i.e. CHROM:START-
                        END) to exclude. START and END are optional. May be
                        used multiple times (default: None)
  --filter-include-pos-file FILTER_INCLUDE_POS_FILE
                        Defines a file of positions to include within a file
                        (default: None)
  --filter-exclude-pos-file FILTER_EXCLUDE_POS_FILE
                        Defines a file of positions to exclude within a file
                        (default: None)
  --filter-include-bed FILTER_INCLUDE_BED
                        Defines a BED file of positions to include (default:
                        None)
  --filter-exclude-bed FILTER_EXCLUDE_BED
                        Defines a BED file of positions to exclude (default:
                        None)
  --filter-include-passed
                        Include variants with the 'PASS' filter flag
                        (default: False)
  --filter-exclude-passed
                        Exclude variants with the 'PASS' filter flag
                        (default: False)
  --filter-include-flag FILTER_INCLUDE_FLAG [FILTER_INCLUDE_FLAG ...]
                        Include variants with the specified filter flag
                        (default: None)
  --filter-exclude-flag FILTER_EXCLUDE_FLAG [FILTER_EXCLUDE_FLAG ...]
                        Exclude variants with the specified filter flag
                        (default: None)
  --filter-include-snp FILTER_INCLUDE_SNP [FILTER_INCLUDE_SNP ...]
                        Include SNP(s) with the matching ID. This argument
                        may be used multiple times (default: None)
  --filter-exclude-snp FILTER_EXCLUDE_SNP [FILTER_EXCLUDE_SNP ...]
                        Exclude SNP(s) with the matching ID.
                        This argument may be used multiple times (default:
                        None)
  --filter-include-snp-file FILTER_INCLUDE_SNP_FILE
                        Defines a file of SNP IDs to include (default: None)
  --filter-exclude-snp-file FILTER_EXCLUDE_SNP_FILE
                        Defines a file of SNP IDs to exclude (default: None)
  --filter-maf-min FILTER_MAF_MIN
                        Include variants with equal or greater MAF values
                        (default: None)
  --filter-maf-max FILTER_MAF_MAX
                        Include variants with equal or lesser MAF values
                        (default: None)
  --filter-mac-min FILTER_MAC_MIN
                        Include variants with equal or greater MAC values
                        (default: None)
  --filter-mac-max FILTER_MAC_MAX
                        Include variants with equal or lesser MAC values
                        (default: None)
```

software ref:

research ref:
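
The options listed in the `vcf_filter.py` help message can be combined in a single call. The following is a minimal sketch, not a recommended workflow: the input file `example.vcf.gz` and the output prefix `example.filtered` are hypothetical names, and the thresholds (0.2 for missing data, 0.05 for MAF) are illustrative assumptions. Every flag comes from the help text, but how the filters interact should be checked against the PPP documentation. The script always prints the command and runs it only when `vcf_filter.py` is on the `PATH` and the input file exists.

```shell
#!/bin/sh
# Hypothetical input and output names; replace with your own files.
VCF="example.vcf.gz"
OUT_PREFIX="example.filtered"

# Build the argument list from flags shown in the help message:
# keep only biallelic variants, exclude indels, allow at most 20%
# missing data, and require a minor allele frequency of at least 0.05.
# The threshold values are illustrative assumptions.
set -- --vcf "$VCF" \
       --out-prefix "$OUT_PREFIX" \
       --out-format vcf.gz \
       --filter-only-biallelic \
       --filter-exclude-indels \
       --filter-max-missing 0.2 \
       --filter-maf-min 0.05

# Show the command so it can be reviewed before running.
CMD="vcf_filter.py $*"
echo "$CMD"

# Run the filter only if the script is installed and the input exists.
if command -v vcf_filter.py >/dev/null 2>&1 && [ -f "$VCF" ]; then
    vcf_filter.py "$@"
fi
```

According to the help text, `--out-prefix` defaults to `out` and `--out-format` defaults to `vcf.gz`, so both could be omitted. They are spelled out here to make the output name explicit.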