Cloud Cost Overview
Data for AMP PD is made available on the Google Cloud Platform (GCP). This data is stored in Google Cloud projects paid for by the AMP PD partnership.
To use AMP PD data, you must have a Google Cloud billing account, and any charges related to using the data or Google Cloud services will be billed to you by Google. This document provides information and examples for working with AMP PD data so that you can make informed decisions about costs. For up-to-date billing information, see the documentation for GCP Pricing.
Google Compute Engine
Google Compute Engine (GCE) provides virtual machines (VMs) and block storage (disks) which can be used for running analyses such as converting a CRAM file to a BAM file or running a Jupyter Notebook to transform and visualize data.
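As a rough illustration of how the two GCE cost components (VM time and disk) combine, here is a minimal Python sketch. The function name, its parameters, and the 730-hours-per-month approximation are assumptions made for this example, and the rates are placeholders that you supply yourself from the GCP Pricing documentation. It is not an official calculator.

```python
# Approximate number of hours in an average month (an assumption used
# here to convert a per-GB-month disk rate into an hourly figure).
HOURS_PER_MONTH = 730


def estimate_vm_cost(vm_hours, vm_hourly_rate_usd,
                     disk_gb, disk_gb_month_rate_usd, disk_hours):
    """Return a rough USD estimate for one VM plus one attached disk.

    All rates are placeholders supplied by the caller; look up the
    current values in the GCP Pricing documentation.
    """
    if min(vm_hours, disk_gb, disk_hours) < 0:
        raise ValueError("hours and disk size must be non-negative")
    # VM cost: time the VM is running multiplied by its hourly rate.
    vm_cost = vm_hours * vm_hourly_rate_usd
    # Disk cost: provisioned size times the monthly rate, prorated by
    # how long the disk exists.
    disk_cost = disk_gb * disk_gb_month_rate_usd * (disk_hours / HOURS_PER_MONTH)
    return vm_cost + disk_cost
```

The VM and disk durations are kept separate because a disk may exist for longer than the VM is running, for example when a VM is stopped but its disk is kept.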
Google BigQuery
Google BigQuery (BQ) is a database where "tables" are stored in "datasets". You can issue SQL queries to filter and retrieve data in BigQuery.
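To reason about what a query might cost, the sketch below assumes on-demand pricing in which a query is charged by the amount of data it processes, expressed as a rate per TiB. The function name and the rate parameter are hypothetical and exist only for this example; take the actual rate from the GCP Pricing documentation.

```python
BYTES_PER_TIB = 2 ** 40  # bytes in one tebibyte


def estimate_query_cost(bytes_processed, usd_per_tib):
    """Return a rough USD estimate for a single query.

    `usd_per_tib` is a placeholder rate supplied by the caller.
    """
    if bytes_processed < 0:
        raise ValueError("bytes_processed must be non-negative")
    # Cost scales linearly with the amount of data the query processes.
    return (bytes_processed / BYTES_PER_TIB) * usd_per_tib
```

Under this assumption, selecting only the columns you need and filtering early are the main levers for reducing query cost, since both reduce the number of bytes processed.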
Cloud Use Cases
With the key Cloud costs listed above, we can revisit the original use cases and provide a framework for estimating their costs. Specific costs will vary based on software versions, data sizes, and storage and access locations.
Running a Notebook
The Terra environment lets you run analyses using Jupyter notebooks. In this section, we look at the costs of using the Jupyter notebook service, along with the costs of running a couple of example notebooks.
This section gives general advice for controlling costs when using Google Cloud for typical life sciences work. The following "quick tips" are explained in more detail below: