update master branch with blow5 tutorials by cying111 · Pull Request #33 · GoekeLab/sg-nex-data · GitHub
update master branch with blow5 tutorials #33

Merged: 21 commits, Mar 7, 2023
Changes from 14 commits
4 changes: 3 additions & 1 deletion README.md
@@ -43,6 +43,7 @@ This release includes 86 samples from 11 different cell lines.
You can access the following data through the [AWS Open Data Registry](https://registry.opendata.aws/sgnex/):

- raw files (fast5)
- raw files (blow5)
- basecalled files (fastq)
- aligned reads (genome and transcriptome) (bam)
- tracks for visualisation (bigwig and bigbed)
@@ -89,6 +90,8 @@ The following short tutorials are available that demonstrate how to analyse the

- [Identification of m6A with the SG-NEx samples (using m6Anet)](./docs/SG-NEx_m6Anet_tutorial.md)

- [Basecalling and analysing SG-NEx samples in S/BLOW5 format](./docs/SG-NEx_blow5_tutorial.md)

Additional, more detailed workflows can be found here:

- [Transcript discovery, quantification, and differential transcript expression from long read RNA-Seq data (using Bambu)](https://github.com/GoekeLab/bambu)
@@ -108,7 +111,6 @@ Viktoriia Iakovleva, Puay Leng Lee, Lixia Xin, Hui En Vanessa Ng, Jia Min Loo, X

**Statistical Modeling and Data Analytics**
Ying Chen, Nadia M. Davidson, Harshil Patel, Yuk Kei Wan, Min Hao Ling, Yu Song Chuah, Naruemon Pratanwanich, Christopher Hendra, Laura Watten, Chelsea Sawyer, Dominik Stanojevic, Philip Andrew Ewels, Andreas Wilm, Mile Sikic, Alexandre Thiery, Michael I. Love, Alicia Oshlak, Jonathan Göke

## Citing the SG-NEx project

The SG-NEx resource is described in:
30 changes: 20 additions & 10 deletions docs/AWS_data_access_tutorial.md
100644 → 100755
@@ -4,15 +4,18 @@ SG-NEx data source contains long read (Oxford Nanopore) RNA sequencing data for

The SG-NEx S3 bucket contains the following types of data:

- [Raw sequencing signal (fast5)](#raw-sequencing-signal)
- [Basecalled sequences (fastq)](#basecalled-sequences)
- [Aligned sequences (bam)](#aligned-sequences)
- [Data visualisation tracks (bigwig/bigbed)](#data-visualisation-tracks)
- [Annotations](#annotations)
- [Processed data for RNA modification detection](#processed-data)
- [Sample and experiment information](#sample-and-experimental-data)

The SG-NEx S3 BLOW5 bucket contains the following types of data:
- [Raw sequencing signal (blow5)](#raw-sequencing-signal-in-blow5-format)

Below is the folder index for the open data buckets:

![folder indexing\!](/images/folder_index.png)

@@ -24,6 +27,14 @@ aws s3 ls --no-sign-request s3://sg-nex-data/data/sequencing_data_ont/fast5/ # l
aws s3 sync --no-sign-request s3://sg-nex-data/data/sequencing_data_ont/fast5/sample_name . # download fast5 files to your local directory
```

# Raw sequencing signal in BLOW5 format
To access raw sequencing (blow5) files:

```bash
aws s3 ls --no-sign-request s3://sg-nex-data-blow5/ # list samples
aws s3 sync --no-sign-request s3://sg-nex-data-blow5/sample_name . # download blow5 file and the index to your local directory
```
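
The two commands above can be wrapped in small helpers so that a download is previewed before any data is transferred. This is a minimal sketch, not part of the SG-NEx repository: the function names `blow5_prefix` and `preview_blow5` are hypothetical, and it assumes each sample is stored under its own prefix in the bucket, as the `sync` command above suggests. `--dryrun` is a standard `aws s3 sync` option that lists the planned operations without performing them.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of the SG-NEx repository.

# Bucket name taken from the commands above.
BLOW5_BUCKET="s3://sg-nex-data-blow5"

# Build the S3 prefix for one sample (sample names come from `aws s3 ls`).
blow5_prefix() {
    local sample="$1"
    printf '%s/%s/\n' "$BLOW5_BUCKET" "$sample"
}

# Preview what a sync would transfer, without downloading anything.
# The destination defaults to the current directory.
preview_blow5() {
    local sample="$1"
    local dest="${2:-.}"
    aws s3 sync --no-sign-request --dryrun "$(blow5_prefix "$sample")" "$dest"
}
```

For example, `preview_blow5 sample_name ./blow5` would list the blow5 file and its index for that sample; dropping `--dryrun` from the function performs the actual download.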

# Basecalled sequences
To access basecalled sequencing (fastq) files:

@@ -90,7 +101,6 @@ aws s3 sync --no-sign-request s3://sg-nex-data/data/annotations/gtf_file . # do

## RNA modification detection
Long read direct RNA sequencing allows the detection of RNA modifications with tools such as [xPore](https://github.com/GoekeLab/xpore) and [m6Anet](https://github.com/GoekeLab/m6anet). To simplify the analysis of RNA modifications using the SG-NEx datasets, you can download the processed files to use with xPore and m6Anet.

To download the processed data for differential RNA modification analysis with xPore:
```bash
aws s3 ls --no-sign-request s3://sg-nex-data/data/processed_data/xpore/ # list all samples that have processed data for RNA modification detection using xPore
@@ -106,7 +116,7 @@ These files are provided for a subset of samples, please see [here](/docs/sample

# Sample and experimental data

Detailed information for each sequencing sample is provided [here](/docs/samples.tsv). The data also includes multiplexed samples which share the same fast5/blow5 files. The information about the multiplexed samples can be found [here](/docs/multiplexed_samples.tsv). The files can also be accessed directly on S3:


```bash