# PPanGGOLiN 1.1.136

## PPanGGOLiN : Depicting microbial species diversity via a Partitioned PanGenome Graph Of Linked Neighbors

PPanGGOLiN is a software suite used to create and manipulate prokaryotic pangenomes from a set of either genomic DNA sequences or provided genome annotations. It is designed to scale up to tens of thousands of genomes. Its distinguishing feature is that it partitions the pangenome using a statistical approach rather than fixed thresholds. This lets it work with low-quality data such as Metagenome-Assembled Genomes (MAGs) or Single-cell Amplified Genomes (SAGs), so it can take advantage of large-scale environmental studies and lets users study the pangenome of uncultivable species.

PPanGGOLiN builds pangenomes through a graphical model and a statistical method to partition gene families in persistent, shell and cloud genomes. It integrates information on both protein-coding genes and their genomic neighborhood to build a graph of gene families, where each node is a gene family and each edge is a relation of genetic contiguity. The partitioning method favors assigning two gene families that are consistent neighbors in the graph to the same partition. The result is a Partitioned Pangenome Graph (PPG) made of persistent, shell and cloud nodes, drawing genomes on rails like a subway map to help biologists navigate the great diversity of microbial life.

Moreover, the panRGP method (Bazin et al. 2020) included in PPanGGOLiN predicts, for each genome, Regions of Genome Plasticity (RGPs), which are clusters of genes made of shell and cloud genomes in the pangenome graph. Most of them arise from Horizontal Gene Transfer (HGT) and correspond to Genomic Islands (GIs). RGPs from different genomes are then grouped in spots of insertion based on their conserved flanking persistent genes.
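The graph structure described above (one node per gene family, one edge per relation of genetic contiguity) can be illustrated with a minimal Python sketch. This is not PPanGGOLiN's implementation: the function name `build_family_graph`, the toy genomes and the family identifiers are all hypothetical, each genome is assumed to be a single linear contig, and strand and the statistical partitioning step are ignored.

```python
from collections import Counter


def build_family_graph(genomes):
    """Build a toy gene-family graph (hypothetical helper, not PPanGGOLiN code).

    genomes: dict mapping a genome name to the ordered list of gene family
    identifiers found along one linear contig.
    Returns (nodes, edges), where nodes counts the genomes containing each
    family and edges counts the genomes in which two families are adjacent.
    """
    nodes = Counter()
    edges = Counter()
    for families in genomes.values():
        # Count each family once per genome, even if it is duplicated.
        nodes.update(set(families))
        # Consecutive genes define contiguity; edges are undirected.
        adjacent = {
            frozenset(pair)
            for pair in zip(families, families[1:])
            if pair[0] != pair[1]
        }
        edges.update(adjacent)
    return nodes, edges


# Toy input: three genomes described by their ordered gene families.
toy = {
    "genome_A": ["fam1", "fam2", "fam3", "fam4"],
    "genome_B": ["fam1", "fam2", "fam4"],
    "genome_C": ["fam1", "fam2", "fam3", "fam5"],
}

nodes, edges = build_family_graph(toy)

for family, count in sorted(nodes.items()):
    print(f"{family}: present in {count} genome(s)")
for pair, count in sorted(edges.items(), key=lambda item: sorted(item[0])):
    print(f"{' - '.join(sorted(pair))}: adjacent in {count} genome(s)")
```

In this toy graph, families present in every genome and linked by edges supported by every genome behave like persistent nodes, while rarely observed families such as `fam5` resemble cloud nodes. PPanGGOLiN itself assigns partitions with a statistical model rather than with counts like these.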
To activate:

```console
bash
source /local/cluster/ppanggolin/activate.sh
```

Location and version:

```console
$ which ppanggolin
/local/cluster/ppanggolin/bin/ppanggolin
$ ppanggolin --version
ppanggolin 1.1.136
```

Help message:

```console
$ ppanggolin --help
usage: ppanggolin [-h] [-v]  ...

Depicting microbial species diversity via a Partitioned PanGenome Graph Of Linked Neighbors

optional arguments:
  -h, --help     show this help message and exit
  -v, --version  show program's version number and exit

subcommands:
  All of the following subcommands have their own set of options. To see them for a given subcommand, use it with -h or --help, as such:
    ppanggolin <subcommand> -h

  Basic:
    workflow      Easy workflow to run a pangenome analysis in one go
    panrgp        Easy workflow to run a pangenome analysis with genomic islands and spots of insertion detection

  Expert:
    annotate      Annotate genomes
    cluster       Cluster proteins in protein families
    graph         Create the pangenome graph
    partition     Partition the pangenome graph
    rarefaction   Compute the rarefaction curve of the pangenome
    msa           Compute Multiple Sequence Alignments for pangenome gene families

  Output:
    draw          Draw figures representing the pangenome through different aspects
    write         Writes 'flat' files representing the pangenome that can be used with other softwares
    fasta         Writes fasta files for different elements of the pangenome
    info          Prints information about a given pangenome graph file

  Regions of genomic Plasticity:
    align         Aligns a genome or a set of proteins to the pangenome gene families representatives and predict informations from it
    rgp           Predicts Regions of Genomic Plasticity in the genomes of your pangenome
    spot          Predicts spots in your pangenome
```

software ref:

research ref: