.. _quickstart:

Quick Start
==============================

Prerequisites
~~~~~~~~~~~~~~~~~~~~~~

- Python 3.11 or higher is recommended.
- ``odgi`` is required to prepare custom data.

.. code-block:: bash

   git clone https://github.com/ScottMastro/pangyplot.git
   cd pangyplot
   pip install -r requirements.txt

``gunicorn`` is additionally recommended for production deployment but is not required for local development, where Flask's built-in server is used instead. See the commented line in ``requirements.txt``.

Quick Start - Running PangyPlot
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: bash

   python pangyplot.py run --db hprc.clip --ref GRCh38 --annotations gencode48.chrY

This should launch a local web server at http://127.0.0.1:5700 serving the chrY data that is included with the codebase.

.. dropdown:: What is it doing?

   ``pangyplot run`` loads the specified database (``--db``) and launches the Flask web server.

   The database is loaded from ``datastore/graphs/{db}``. This directory is expected to contain one subdirectory per chromosome (e.g. ``datastore/graphs/hprc.clip/chrY``). Each chromosome directory holds the database files created from a GFA file.

   The ``--ref`` option specifies the primary reference path.

   The optional gene annotation file (``--annotations``) is similarly loaded from ``datastore/annotations/{ref}/{annotations}`` (e.g. ``datastore/annotations/GRCh38/gencode48.chrY``).

Quick Start - Loading Prepared Data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: bash

   wget https://zenodo.org/records/17173731/files/chrY.zip
   unzip chrY.zip
   mkdir -p datastore/graphs/hprc.prepared
   mv chrY datastore/graphs/hprc.prepared/chrY
   python pangyplot.py run --db hprc.prepared --ref GRCh38

.. dropdown:: What is it doing?

   HPRC chromosome data has been preprocessed and is available at: https://doi.org/10.5281/zenodo.17173731

   Here we manually set up the directory structure to store the prepared data.
   Zipping up the directory structure is a convenient way to share prepared PangyPlot data.

Quick Start - Preparing Data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. tip::

   The steps below can be generated for you interactively with :ref:`pangyplot-preprocess`, which writes a tailored shell (or SLURM) script from a few prompts. The manual walkthrough below is kept for reference and for cases where you want finer control over the individual ``odgi`` invocations.

.. code-block:: bash

   cd pangyplot
   wget https://s3-us-west-2.amazonaws.com/human-pangenomics/pangenomes/freeze/freeze1/minigraph-cactus/hprc-v1.1-mc-grch38/hprc-v1.1-mc-grch38.chroms/chrY.vg

   # convert to odgi format - odgi cannot read GFA files with W-lines
   vg convert --no-wline chrY.vg -f > chrY_unsorted.gfa
   odgi build -O -g chrY_unsorted.gfa -o chrY_unsorted.og

   # one-dimensional sort
   odgi paths -L -i chrY_unsorted.og | grep GRCh38 > path_sort_order.txt
   odgi paths -L -i chrY_unsorted.og | grep CHM13 >> path_sort_order.txt
   odgi sort -t 4 --optimize -Y -H path_sort_order.txt -i chrY_unsorted.og -o chrY.og -P

   # create layout file
   odgi layout -t 4 -i chrY.og --tsv chrY.lay.tsv -P

   # create GFA file
   odgi view -i chrY.og -g > chrY.gfa

   python pangyplot.py add --ref GRCh38 --chr chrY --db hprc.test --gfa chrY.gfa --layout chrY.lay.tsv
   python pangyplot.py status --db hprc.test
   python pangyplot.py run --db hprc.test --ref GRCh38

.. dropdown:: What is it doing?

   This is how the data was prepared for the previous example.

   PangyPlot requires a GFA file and a layout file to create the database.
   Here we optimize the graph for the primary reference path GRCh38 during the 1D sort.
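
The directory convention used throughout this page (``datastore/graphs/{db}/{chromosome}``) can be inspected from the shell before starting the server. The following is a minimal sketch and not part of PangyPlot: ``list_chromosomes`` is a hypothetical helper name, and it assumes only that each chromosome lives in its own subdirectory of the database directory. It does not validate the database files themselves.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of PangyPlot): print the chromosome
# subdirectories found under <root>/<db>, one per line.
# Assumes the layout datastore/graphs/{db}/{chromosome} described on this page.
list_chromosomes() {
    local db="$1"
    local root="${2:-datastore/graphs}"   # default root, relative to the repository
    local dir
    for dir in "$root/$db"/*/; do
        # An unmatched glob stays literal, so confirm it is a real directory.
        if [ -d "$dir" ]; then
            basename "$dir"
        fi
    done
}
```

For example, after the "Loading Prepared Data" steps, running ``list_chromosomes hprc.prepared`` from the repository root should print ``chrY``. An empty result suggests the chromosome directory was moved to the wrong location.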