# NCBI datasets 20221007

{{< admonition success "Installed" true >}}
This software should be available with no extra configuration.
{{< /admonition >}}

## ncbi-datasets-20221007

**Note: syntax will be changing significantly in the version 14 release. Stay tuned...**

## Getting started

Welcome to NCBI Datasets

NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases. You have the choice of getting the data through three interfaces:

* NCBI Datasets website
* Command-line tools
* API: accessible through our Python package, or in combination with other UNIX tools (such as wget and curl).

![schematic showing the NCBI Datasets available interfaces: web, API and command line tool](../images/datasets/datasets_getting_started.png)

### Data delivery

#### How is the data delivered?

NCBI Datasets delivers data and metadata as a cohesive data package contained in a zip archive. When unzipped, files can be found in the folder ncbi_dataset/data.

#### What do we mean by “cohesive”?

For all data packages, users can include multiple files associated with the requested accession. For example, if users want to download the human reference genome assembly, they can also simultaneously select from transcript, protein, GFF, GTF, GBFF, and metadata files. For more information about data packages and their contents, please see our [Data packages](https://www.ncbi.nlm.nih.gov/datasets/docs/v1/reference-docs/data-packages/) page.

#### Where can I learn more about NCBI Datasets?

You can read more about how to use NCBI Datasets by checking out our [How-to guides](https://www.ncbi.nlm.nih.gov/datasets/docs/v1/how-tos/) where you can find instructions on how to download data and metadata for genomes, genes, ortholog sets, and viruses.
Additionally, we also have an extensive documentation page for our [API](https://www.ncbi.nlm.nih.gov/datasets/docs/v1/reference-docs/rest-api/) and detailed information about our [command-line tools](https://www.ncbi.nlm.nih.gov/datasets/docs/v1/reference-docs/command-line/).

-------------------------------------------------------------------------------

## Location and version

```console
$ which datasets
/local/cluster/bin/datasets
$ datasets version
13.43.2
```

## Help message

```console
$ datasets help
datasets is a command-line tool that is used to query and download biological sequence data
across all domains of life from NCBI databases.

Refer to NCBI's [download and install](https://www.ncbi.nlm.nih.gov/datasets/docs/download-and-install/) documentation for information about getting started with the command-line tools.

Usage
  datasets [command]

Data Retrieval Commands
  summary     print a summary of a gene or genome dataset
  download    download a gene, genome or coronavirus dataset as a zip file
  rehydrate   rehydrate a downloaded, dehydrated dataset

Miscellaneous Commands
  completion  generate autocompletion scripts
  version     print the version of this client and exit
  help        Help about any command

Flags
      --api-key string   NCBI Datasets API Key
  -h, --help             help for datasets
      --no-progressbar   hide progress bar

Use datasets help for detailed help about a command.
```

software ref: