# pb-assembly 0.0.8

## pb-assembly

pb-assembly is the bioconda recipe encompassing all code and dependencies necessary to run:

* FALCON assembly pipeline
* FALCON-Unzip to phase the genome and perform phased-polishing with Arrow
* FALCON-Phase to extend phasing between unzipped haplotig blocks (requires HiC data)

Installed package recipes include:

- pb-falcon
- pb-dazzler
- genomicconsensus
- etc (all other dependencies)

## FALCON and FALCON-Unzip

FALCON and FALCON-Unzip are de novo genome assemblers for PacBio long reads, also known as Single-Molecule Real-Time (SMRT) sequences.

FALCON is a diploid-aware assembler that follows the hierarchical genome assembly process (HGAP). It is optimized for large genome assembly, though microbial genomes can also be assembled. FALCON produces a set of primary contigs (p-contigs) as the primary assembly and a set of associated contigs (a-contigs) which represent divergent allelic variants. Each a-contig is associated with a homologous genomic region on a p-contig.

FALCON-Unzip is a true diploid assembler. It takes the contigs from FALCON and phases the reads based on heterozygous SNPs identified in the initial assembly. It then produces a set of partially-phased primary contigs and fully-phased haplotigs which represent divergent haplotypes.

**NOTE:** Please ensure that the settings in your `fc_run.cfg` file satisfy NPROC * njobs = NTOTAL, where NTOTAL is the number of processors you check out using `SGE_Batch` and/or `SGE_Array`. For example, if NTOTAL = 64, use `-P 64` in your SGE command.
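The arithmetic in the note can be checked before submitting a job. The following is a minimal sketch, not part of pb-assembly: `check_slots` is a hypothetical helper, and the values 4 and 16 are assumed purely for illustration. Substitute the NPROC and njobs values from your own `fc_run.cfg`.

```shell
# Hypothetical helper (not shipped with pb-assembly):
# verifies that NPROC * njobs equals the number of processors requested.
check_slots() {
    nproc=$1; njobs=$2; ntotal=$3
    product=$(( nproc * njobs ))
    if [ "$product" -eq "$ntotal" ]; then
        echo "ok: $nproc * $njobs = $ntotal"
    else
        echo "mismatch: $nproc * $njobs = $product, expected $ntotal" >&2
        return 1
    fi
}

# Assumed example values: NPROC=4 and njobs=16 fill a 64-processor request (-P 64).
check_slots 4 16 64
```

A non-zero return from the helper signals that the configuration would request more or fewer processors than were checked out, so it can be used to guard a submission script.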
To activate:

```console
bash
source /local/cluster/pb-assembly/activate.sh
```

Location:

```console
$ which fc_run
/local/cluster/pb-assembly/bin/fc_run
(/local/cluster/pb-assembly)
```

Help message:

```console
$ fc_run --help
falcon-kit 1.8.1 (pip thinks "falcon-kit 1.8.1")
pypeflow 2.3.0
usage: fc_run [-h] config [logger]

positional arguments:
  config      .cfg/.ini/.json
  logger      (Optional)JSON config for standard Python logging module

optional arguments:
  -h, --help  show this help message and exit
(/local/cluster/pb-assembly)
```

Configuration files can be found here:

```console
/nfs1/CGRB/databases/software/pb-assembly/cfgs
```

software ref:

research ref:

research ref: