. . . . "WorkflowHub" . "https://about.workflowhub.eu/" . . "Workflow RO-Crate Profile" . "0.2.0" . . "Małgorzata Wolniewicz" . . "https://doi.org/10.1093/bioinformatics/bts480" . "Snakemake" . "https://snakemake.readthedocs.io/" . . "https://orcid.org/0000-0001-8197-3303" . "Stevie Pederson" . . "- Coverage bigWig files for each individual sample are produced using CPM values (i.e. Signal Per Million Reads, SPMR)\n- For all combinations of target and treatment coverage bigWig files are also produced, along with fold-enrichment bigWig files" . . "18.06167400881057" . "8.2" . "geology" . . "75.6271411814711" . "1.3413619995117188" . "### Outputs\n\n- Trimmed fastq.gz files will be written to `data/fastq/trimmed`\n- `FastQC` and `MultiQC` will also be run, with output in `docs/qc/trimmed`\n- AdapterRemoval 'settings' files will be written to `output/adapterremoval`" . . "19.162995594713653" . "8.7" . "fastq.gz file" . . "8.302808302808302" . "6.8" . "### Outputs\r\n\r\n- Trimmed fastq.gz files will be written to `data/fastq/trimmed`\r\n- `FastQC` and `MultiQC` will also be run, with output in `docs/qc/trimmed`\r\n- AdapterRemoval 'settings' files will be written to `output/adapterremoval`" . . "19.162995594713653" . "8.7" . "log" . . "9.011264080100124" . "7.2" . "earth sciences" . . "24.372858818528897" . "0.43228960037231445" . "Civil law" . . "Crime, law and justice/Law/Civil law" . "Civil law" . . "Crime, law and justice/Law/Civil law" . "output" . . "15.465116279069768" . "26.6" . "data" . . "15.143929912390487" . "12.1" . "service-account-enrichment" . . . . "75099"^^ . "https://api.rohub.org/api/ros/3fdc0374-95f4-4c7d-928c-24dd80fbd26f/crate/download/" . "Work-in-progress" . . "2023-09-08 11:58:23.205668+00:00" . "2024-03-05 12:23:16.330026+00:00" . "2023-09-08 11:58:23.205668+00:00" . "# prepareChIPs\r\n\r\nThis is a simple `snakemake` workflow template for preparing **single-end** ChIP-Seq data.\r\nThe steps implemented are:\r\n\r\n1. 
Download raw fastq files from SRA\r\n2. Trim and Filter raw fastq files using `AdapterRemoval`\r\n3. Align to the supplied genome using `bowtie2`\r\n4. Deduplicate Alignments using `Picard MarkDuplicates`\r\n5. Call Macs2 Peaks using `macs2`\r\n\r\nA pdf of the rulegraph is available [here](workflow/rules/rulegraph.pdf)\r\n\r\nFull details for each step are given below.\r\nAny additional parameters for tools can be specified using `config/config.yml`, along with many of the requisite paths\r\n\r\nTo run the workflow with default settings, simply run as follows (after editing `config/samples.tsv`)\r\n\r\n```bash\r\nsnakemake --use-conda --cores 16\r\n```\r\n\r\nIf running on an HPC cluster, a snakemake profile will be required for submission to the queueing system and appropriate resource allocation.\r\nPlease discuss this with your HPC support team.\r\nNodes may also have restricted internet access and rules which download files may not work on many HPCs.\r\nPlease see below or discuss this with your support team\r\n\r\nWhilst no snakemake wrappers are explicitly used in this workflow, the underlying scripts are utilised where possible to minimise any issues with HPC clusters with restrictions on internet access.\r\nThese scripts are based on `v1.31.1` of the snakemake wrappers\r\n\r\n### Important Note Regarding OSX Systems\r\n\r\nIt should be noted that this workflow is **currently incompatible with OSX-based systems**. \r\nThere are two unsolved issues\r\n\r\n1. `fasterq-dump` has a bug which is specific to conda environments. This has been updated in v3.0.3 but this patch has not yet been made available to conda environments for OSX. Please check [here](https://anaconda.org/bioconda/sra-tools) to see if this has been updated.\r\n2. 
The following error appears in some OSX-based R sessions, in a system-dependent manner:\r\n```\r\nError in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : \r\n polygon edge not found\r\n```\r\n\r\nThe fix for this bug is currently unknown\r\n\r\n## Download Raw Data\r\n\r\n### Outline\r\n\r\nThe file `samples.tsv` is used to specify all steps for this workflow.\r\nThis file must contain the columns: `accession`, `target`, `treatment` and `input`\r\n\r\n1. `accession` must be an SRA accession. Only single-end data is currently supported by this workflow\r\n2. `target` defines the ChIP target. All files common to a target and treatment will be used to generate summarised coverage in bigWig Files\r\n3. `treatment` defines the treatment group each file belongs to. If only one treatment exists, simply use the value 'control' or similar for every file\r\n4. `input` should contain the accession for the relevant input sample. These will only be downloaded once. Valid input samples are *required* for this workflow\r\n\r\nAs some HPCs restrict internet access for submitted jobs, *it may be prudent to run the initial rules in an interactive session* if at all possible.\r\nThis can be performed using the following (with 2 cores provided as an example)\r\n\r\n```bash\r\nsnakemake --use-conda --until get_fastq --cores 2\r\n```\r\n\r\n### Outputs\r\n\r\n- Downloaded files will be gzipped and written to `data/fastq/raw`.\r\n- `FastQC` and `MultiQC` will also be run, with output in `docs/qc/raw`\r\n\r\nBoth of these directories are able to be specified as relative paths in `config.yml`\r\n\r\n## Read Filtering\r\n\r\n### Outline\r\n\r\nRead trimming is performed using [AdapterRemoval](https://adapterremoval.readthedocs.io/en/stable/).\r\nDefault settings are customisable using config.yml, with the defaults set to discard reads shorter than 50nt, and to trim using quality scores with a threshold of Q30.\r\n\r\n### Outputs\r\n\r\n- Trimmed fastq.gz files will be 
written to `data/fastq/trimmed`\r\n- `FastQC` and `MultiQC` will also be run, with output in `docs/qc/trimmed`\r\n- AdapterRemoval 'settings' files will be written to `output/adapterremoval`\r\n\r\n## Alignments\r\n\r\n### Outline\r\n\r\nAlignment is performed using [`bowtie2`](https://bowtie-bio.sourceforge.net/bowtie2/manual.shtml) and it is assumed that this index is available before running this workflow.\r\nThe path and prefix must be provided using config.yml\r\n\r\nThis index will also be used to produce the file `chrom.sizes` which is essential for conversion of bedGraph files to the more efficient bigWig files.\r\n\r\n### Outputs\r\n\r\n- Alignments will be written to `data/aligned`\r\n- `bowtie2` log files will be written to `output/bowtie2` (not the conventional log directory)\r\n- The file `chrom.sizes` will be written to `output/annotations`\r\n\r\nBoth sorted and the original unsorted alignments will be returned.\r\nHowever, the unsorted alignments are marked with `temp()` and can be deleted using \r\n\r\n```bash\r\nsnakemake --delete-temp-output --cores 1\r\n```\r\n\r\n## Deduplication\r\n\r\n### Outline\r\n\r\nDeduplication is performed using [MarkDuplicates](https://gatk.broadinstitute.org/hc/en-us/articles/360037052812-MarkDuplicates-Picard-) from the Picard set of tools.\r\nBy default, deduplication will remove the duplicates from the set of alignments.\r\nAll resultant bam files will be sorted and indexed.\r\n\r\n### Outputs\r\n\r\n- Deduplicated alignments are written to `data/deduplicated` and are indexed\r\n- DuplicationMetrics files are written to `output/markDuplicates`\r\n\r\n## Peak Calling\r\n\r\n### Outline\r\n\r\nThis is performed using [`macs2 callpeak`](https://pypi.org/project/MACS2/).\r\n\r\n- Peak calling will be performed on:\r\n a. each sample individually, and \r\n b. 
merged samples for those sharing a common ChIP target and treatment group.\r\n- Coverage bigWig files for each individual sample are produced using CPM values (i.e. Signal Per Million Reads, SPMR)\r\n- For all combinations of target and treatment coverage bigWig files are also produced, along with fold-enrichment bigWig files\r\n\r\n### Outputs\r\n\r\n- Individual outputs are written to `output/macs2/{accession}`\r\n\t+ Peaks are written in `narrowPeak` format along with `summits.bed`\r\n\t+ bedGraph files are automatically converted to bigWig files, and the originals are marked with `temp()` for subsequent deletion\r\n\t+ callpeak log files are also added to this directory\r\n- Merged outputs are written to `output/macs2/{target}/`\r\n\t+ bedGraph Files are also converted to bigWig and marked with `temp()`\r\n\t+ Fold-Enrichment bigWig files are also created with the original bedGraph files marked with `temp()`\r\n" . . "application/ld+json" . . . . . . . . . "https://w3id.org/ro-id/3fdc0374-95f4-4c7d-928c-24dd80fbd26f" . "https://github.com/smped/prepareChIPs.git" . . "workflow/Snakefile" . "Research Object Crate for prepareChIPs:" . "https://workflowhub.eu/workflows/528/ro_crate?version=1" . . . . "https://w3id.org/ro-id/a1bb094f-e223-4eb7-90c8-2d55ec57e261" . "https://w3id.org/ro-id/af02653a-41cb-4cbc-86f3-4da526e29d72" . "https://w3id.org/ro-id/c55159f8-5154-4d92-b6c8-ebd2fcfc4bf0" . "https://w3id.org/ro-id/3c10da67-cce5-459d-88dd-2bce722516d0" . "https://w3id.org/ro-id/6277e661-77ff-4bd7-b1b3-f2f51c0ab0ec" . "https://w3id.org/ro-id/8e9e8062-ddbd-46c3-a771-5be65444212f" . "https://w3id.org/ro-id/b1a61304-f050-4541-9d32-259ff31f5fdc" . "https://w3id.org/ro-id/d35c78c4-748a-4706-a8b1-cee0aa47f9ab" . "https://w3id.org/ro-id/d80380ce-e484-491a-b68b-e978fd2930c2" . "https://w3id.org/ro-id/dfdcee9f-16ae-4a8f-bae0-4b0b69a73d53" . "https://w3id.org/ro-id/f7b300ce-ad48-4976-9503-40223a6a9144" . "https://w3id.org/ro-id/ff38c043-b00d-41e1-b3e0-4f9aa80de812" . 
"https://w3id.org/ro-id/0f19c8c2-dbdb-4207-b24c-c97de87452ac" . "https://w3id.org/ro-id/28fa92df-fcc2-479d-b1db-71bdf50be326" . "https://w3id.org/ro-id/46d241be-9337-4749-9543-e01808c828ce" . "https://w3id.org/ro-id/64f93c98-c150-4431-885a-8ac5fb5add82" . "https://w3id.org/ro-id/2abd0462-b126-4e65-beed-dd55db1ab4b4" . "https://w3id.org/ro-id/339f5ed4-2f45-49a7-926d-a2b248657c9b" . "https://w3id.org/ro-id/61a7eb88-5e9f-4cfc-b578-c406b7a43144" . "https://w3id.org/ro-id/708386a0-6582-4493-a7a2-7e22bc2c0dbb" . "https://w3id.org/ro-id/a6991c9d-c113-435e-9455-dfbbd1f0ed19" . "https://w3id.org/ro-id/b827efbe-6c40-47b1-bba2-b9623dc2ed2f" . "https://w3id.org/ro-id/c33311df-7f79-464a-98d9-4ca351c53eb6" . "https://w3id.org/ro-id/e08487bf-fd3c-4882-90b4-5f9ce844a146" . "https://w3id.org/ro-id/1efe80d0-1f70-4a71-8b23-3e1de8d01a44" . "https://w3id.org/ro-id/3d3af63c-e9de-41b9-8795-881508422b9f" . "https://w3id.org/ro-id/5bf0c07d-8b93-4a78-b020-101fa59fd22a" . "https://w3id.org/ro-id/6b05e40d-078b-4023-b984-3dd996317ee9" . "https://w3id.org/ro-id/9156e762-e630-40b0-b9aa-c059ba0a40c1" . "https://w3id.org/ro-id/d98db78b-77a7-454e-af9d-4ef2514949dc" . "https://w3id.org/ro-id/dba77b48-6599-4bbd-82d2-f9d4cfd9abd8" . "https://w3id.org/ro-id/4366f40c-5956-4a88-87bf-794b7c12d33b" . "https://w3id.org/ro-id/49f24138-b352-4b9d-bbcd-b10c266846ca" . "https://w3id.org/ro-id/c65c143c-5478-46a4-a257-4ec365270d90" . "https://w3id.org/ro-id/c9a9f204-3830-48be-a7a8-9f4e19d0fd00" . "https://w3id.org/ro-id/113a166f-2c22-4c99-a811-eb9943175d7b" . "https://w3id.org/ro-id/52e0bc84-491c-4913-9a92-5fe4af038b25" . "https://w3id.org/ro-id/69951488-193c-43c5-ae9a-c5135803cb3e" . "https://w3id.org/ro-id/a49165ba-fbd5-4483-9c39-9cb8258e6c5a" . "https://w3id.org/ro-id/b673a29e-23c3-487e-ad95-8fc69f7d7547" . "https://w3id.org/ro-id/ccf5e4c4-0c4e-48dd-8bdd-84006c60ec4b" . "https://w3id.org/ro-id/e193bcdc-d8c4-4b4e-8331-c1d46c14c206" . "https://w3id.org/ro-id/faf7e5b3-b0ef-4e09-b55c-a745db352edf" . 
"https://w3id.org/ro-id/0d2c2af7-0ab6-4218-885f-a838c95f4871" . "https://w3id.org/ro-id/0fc5108a-3620-4c1b-94d3-a22e717bd424" . "https://w3id.org/ro-id/13a2600e-aeba-4334-9fd1-abb8f2542e75" . "https://w3id.org/ro-id/5cf3d332-4cd5-432a-8423-200792b8d509" . "https://w3id.org/ro-id/78fbbea2-7dd0-4f4b-984d-625c6013c161" . "Stevie Pederson. \"Research Object Crate for prepareChIPs:.\" ROHub. Sep 08 ,2023. https://w3id.org/ro-id/3fdc0374-95f4-4c7d-928c-24dd80fbd26f." . . . "config" . . . . . . . . . . . . . . "envs" . . . . . . . "workflow" . . . . . . . . . . "analysis" . . . . . . . . . . . . . . . . . "rules" . . . . "docs" . . . . . . . . . . . . . "scripts" . . . . "89"^^ . "https://api.rohub.org/api/resources/03bab0c3-27a1-4bad-9192-54fd5718efb6/download/" . . "2023-09-08 11:58:24.200662+00:00" . "2023-09-08 11:58:25.174162+00:00" . . ".gitignore" . "2023-09-08 11:58:24.200662+00:00" . . . . "6262"^^ . "https://api.rohub.org/api/resources/165ceeeb-0982-40e6-a0ad-a77d0eec7316/download/" . . "2023-09-08 11:58:24.201386+00:00" . "2023-09-08 11:58:25.579736+00:00" . "text/markdown" . . "README.md" . "2023-09-08 11:58:24.201386+00:00" . . . "https://ror.org/https://doi.org/10.48546/workflowhub.workflow.528.1" . . "24823"^^ . "https://api.rohub.org/api/resources/17e5cc71-3d5c-46fd-8a27-709516f324bd/download/" . . "2023-09-08 11:58:24.234824+00:00" . "2023-09-08 11:58:45.182676+00:00" . "text/html" . . "ro-crate-preview.html" . "2023-09-08 11:58:24.234824+00:00" . . . . . "444"^^ . "https://api.rohub.org/api/resources/19242bbc-9eae-461c-9535-fff02a359034/download/" . . "2023-09-08 11:58:24.203231+00:00" . "2023-09-08 11:58:25.984117+00:00" . "text/html" . . "footer.html" . "2023-09-08 11:58:24.203231+00:00" . . . . "861"^^ . "https://api.rohub.org/api/resources/1b909ce5-a128-47dc-93dd-4e0a1795d8c7/download/" . . "2023-09-08 11:58:24.223469+00:00" . "2023-09-08 11:58:34.130492+00:00" . . "adapterremoval.smk" . "2023-09-08 11:58:24.223469+00:00" . . . . "114"^^ . 
"https://api.rohub.org/api/resources/22f177c1-06e1-49bf-ade6-d3ba16e6d622/download/" . . "2023-09-08 11:58:24.232471+00:00" . "2023-09-08 11:58:38.631179+00:00" . . "adapterremoval.yml" . "2023-09-08 11:58:24.232471+00:00" . . . . "3680"^^ . "https://api.rohub.org/api/resources/23c6fd11-d3c3-4615-8946-3c11f0876dac/download/" . . "2023-09-08 11:58:24.203909+00:00" . "2023-09-08 11:58:26.201236+00:00" . . "index.Rmd" . "2023-09-08 11:58:24.203909+00:00" . . . . "1427"^^ . "https://api.rohub.org/api/resources/2699beb6-1816-4b09-a671-c7b4e8e27632/download/" . . "2023-09-08 11:58:24.226059+00:00" . "2023-09-08 11:58:35.413099+00:00" . . "bedgraph_to_bigwig.smk" . "2023-09-08 11:58:24.226059+00:00" . . . . "95"^^ . "https://api.rohub.org/api/resources/26f96889-0418-4090-92f3-3040ccf7daad/download/" . . "2023-09-08 11:58:24.226985+00:00" . "2023-09-08 11:58:35.650367+00:00" . . "macs2.yml" . "2023-09-08 11:58:24.226985+00:00" . . . . "1162"^^ . "https://api.rohub.org/api/resources/2d4cc403-0af3-4ea5-9cae-06b8e07ae93f/download/" . . "2023-09-08 11:58:24.210823+00:00" . "2023-09-08 11:58:29.206586+00:00" . "text/x-python" . . "fasterq-dump.py" . "2023-09-08 11:58:24.210823+00:00" . . . . "1107"^^ . "https://api.rohub.org/api/resources/2e217125-2078-43e1-a55c-b9db9862ce5a/download/" . . "2023-09-08 11:58:24.212088+00:00" . "2023-09-08 11:58:29.561861+00:00" . . "cross_correlations.R" . "2023-09-08 11:58:24.212088+00:00" . . . . "2066"^^ . "https://api.rohub.org/api/resources/365c007e-1313-401c-8ac1-1f681739363a/download/" . . "2023-09-08 11:58:24.216182+00:00" . "2023-09-08 11:58:31.336377+00:00" . "text/x-python" . . "picard_markduplicates.py" . "2023-09-08 11:58:24.216182+00:00" . . . . "1216"^^ . "https://api.rohub.org/api/resources/3cded656-98eb-4101-a93e-5842ba43a49e/download/" . . "2023-09-08 11:58:24.222810+00:00" . "2023-09-08 11:58:33.768883+00:00" . . "bowtie2.smk" . "2023-09-08 11:58:24.222810+00:00" . . . . "100"^^ . 
"https://api.rohub.org/api/resources/3daa2270-78fe-4674-aa82-9272af94aee1/download/" . . "2023-09-08 11:58:24.228827+00:00" . "2023-09-08 11:58:36.395143+00:00" . . "fastqc.yml" . "2023-09-08 11:58:24.228827+00:00" . . . . "0"^^ . "https://api.rohub.org/api/resources/44eaf315-663c-450e-b246-b417fb1251ba/download/" . . "2023-09-08 11:58:24.202011+00:00" . "2023-09-08 11:58:25.770572+00:00" . . ".here" . "2023-09-08 11:58:24.202011+00:00" . . . . "584"^^ . "https://api.rohub.org/api/resources/4cdb7678-4556-4b45-9bd8-05790d8e59b7/download/" . . "2023-09-08 11:58:24.208900+00:00" . "2023-09-08 11:58:28.428222+00:00" . . "config.yml" . "2023-09-08 11:58:24.208900+00:00" . . . . "10145"^^ . "https://api.rohub.org/api/resources/4ff57492-0375-4aa6-a794-2009fbbbf255/download/" . . "2023-09-08 11:58:24.205266+00:00" . "2023-09-08 11:58:26.779572+00:00" . . "trimmed_qc.Rmd" . "2023-09-08 11:58:24.205266+00:00" . . . . "512"^^ . "https://api.rohub.org/api/resources/52bc888b-bb39-405f-8491-53b49d676d40/download/" . . "2023-09-08 11:58:24.213388+00:00" . "2023-09-08 11:58:30.562624+00:00" . . "check_callpeak_logs.R" . "2023-09-08 11:58:24.213388+00:00" . . . . "8553"^^ . "https://api.rohub.org/api/resources/5aacd754-ef17-459c-a829-f6e0af1013f6/download/" . . "2023-09-08 11:58:24.204589+00:00" . "2023-09-08 11:58:26.388423+00:00" . . "raw_qc.Rmd" . "2023-09-08 11:58:24.204589+00:00" . . . . "0"^^ . "https://api.rohub.org/api/resources/5c0d3b16-c3ec-4020-8d86-927d5ed39ad4/download/" . . "2023-09-08 11:58:24.224071+00:00" . "2023-09-08 11:58:34.342232+00:00" . "application/pdf" . . "rulegraph.pdf" . "2023-09-08 11:58:24.224071+00:00" . . . . "514"^^ . "https://api.rohub.org/api/resources/5d6b0cd0-9e8e-456c-bc4d-f371f89984c3/download/" . . "2023-09-08 11:58:24.217793+00:00" . "2023-09-08 11:58:31.909187+00:00" . . "picard_markduplicates.smk" . "2023-09-08 11:58:24.217793+00:00" . . . . "122"^^ . "https://api.rohub.org/api/resources/60e2708f-da85-46e8-b4a9-0907b507fd15/download/" . . 
"2023-09-08 11:58:24.227608+00:00" . "2023-09-08 11:58:35.870545+00:00" . . "bedgraph_to_bigwig.yml" . "2023-09-08 11:58:24.227608+00:00" . . . . "5110"^^ . "https://api.rohub.org/api/resources/672f2b4f-fac0-4518-8150-408e4469fbca/download/" . . "2023-09-08 11:58:24.224720+00:00" . "2023-09-08 11:58:35.034203+00:00" . . "rmarkdown.smk" . "2023-09-08 11:58:24.224720+00:00" . . . . "1391"^^ . "https://api.rohub.org/api/resources/68fd58a3-bed8-4164-9bca-bc4aabe88cec/download/" . . "2023-09-08 11:58:24.221583+00:00" . "2023-09-08 11:58:33.403422+00:00" . . "peak_stats.smk" . "2023-09-08 11:58:24.221583+00:00" . . . . "745"^^ . "https://api.rohub.org/api/resources/6d7b6733-4b86-4ace-98af-8db33f4eba06/download/" . . "2023-09-08 11:58:24.214056+00:00" . "2023-09-08 11:58:30.750540+00:00" . . "create_macs2_summary.R" . "2023-09-08 11:58:24.214056+00:00" . . . . "1286"^^ . "https://api.rohub.org/api/resources/7746fab3-7efa-4abf-a940-7b31e2c7fd98/download/" . . "2023-09-08 11:58:24.216821+00:00" . "2023-09-08 11:58:31.532681+00:00" . . "get_frip.R" . "2023-09-08 11:58:24.216821+00:00" . . . . "5270"^^ . "https://api.rohub.org/api/resources/890b303b-e8c0-4c2b-b1a0-cabcdbe1e1f2/download/" . . "2023-09-08 11:58:24.220935+00:00" . "2023-09-08 11:58:33.229069+00:00" . . "macs2.smk" . "2023-09-08 11:58:24.220935+00:00" . . . . "660"^^ . "https://api.rohub.org/api/resources/8a3c0ab0-e7e0-43b9-99b0-d199e24487c3/download/" . . "2023-09-08 11:58:24.219048+00:00" . "2023-09-08 11:58:32.418575+00:00" . . "multiqc.smk" . "2023-09-08 11:58:24.219048+00:00" . . . . "379"^^ . "https://api.rohub.org/api/resources/90ea7f1f-c153-4af1-8544-9f4dc0889864/download/" . . "2023-09-08 11:58:24.219670+00:00" . "2023-09-08 11:58:32.626231+00:00" . . "fasterq-dump.smk" . "2023-09-08 11:58:24.219670+00:00" . . . . "138"^^ . "https://api.rohub.org/api/resources/9223fbb7-c378-4a6d-87fe-0da8b12a0404/download/" . . "2023-09-08 11:58:24.230046+00:00" . "2023-09-08 11:58:37.165732+00:00" . . "samtools.yml" . 
"2023-09-08 11:58:24.230046+00:00" . . . . "7330"^^ . "https://api.rohub.org/api/resources/93bebfa3-e544-4175-9efb-7d82ea7d27bb/download/" . . "2023-09-08 11:58:24.206676+00:00" . "2023-09-08 11:58:27.291046+00:00" . "text/x-bibtex" . . "references.bib" . "2023-09-08 11:58:24.206676+00:00" . . . "https://bioschemas.org/profiles/ComputationalWorkflow/1.0-RELEASE/" . . "3868" . "https://api.rohub.org/api/resources/985f7fa0-bee5-4e8d-88cc-b1aba653c3fd/download/" . . "2023-07-09 08:54:36+00:00" . "2023-09-08 11:58:44.807004+00:00" . "# prepareChIPs\r\n\r\nThis is a simple `snakemake` workflow template for preparing **single-end** ChIP-Seq data.\r\nThe steps implemented are:\r\n\r\n1. Download raw fastq files from SRA\r\n2. Trim and Filter raw fastq files using `AdapterRemoval`\r\n3. Align to the supplied genome using `bowtie2`\r\n4. Deduplicate Alignments using `Picard MarkDuplicates`\r\n5. Call Macs2 Peaks using `macs2`\r\n\r\nA pdf of the rulegraph is available [here](workflow/rules/rulegraph.pdf)\r\n\r\nFull details for each step are given below.\r\nAny additional parameters for tools can be specified using `config/config.yml`, along with many of the requisite paths\r\n\r\nTo run the workflow with default settings, simply run as follows (after editing `config/samples.tsv`)\r\n\r\n```bash\r\nsnakemake --use-conda --cores 16\r\n```\r\n\r\nIf running on an HPC cluster, a snakemake profile will be required for submission to the queueing system and appropriate resource allocation.\r\nPlease discuss this with your HPC support team.\r\nNodes may also have restricted internet access and rules which download files may not work on many HPCs.\r\nPlease see below or discuss this with your support team\r\n\r\nWhilst no snakemake wrappers are explicitly used in this workflow, the underlying scripts are utilised where possible to minimise any issues with HPC clusters with restrictions on internet access.\r\nThese scripts are based on `v1.31.1` of the snakemake wrappers\r\n\r\n### 
Important Note Regarding OSX Systems\r\n\r\nIt should be noted that this workflow is **currently incompatible with OSX-based systems**. \r\nThere are two unsolved issues\r\n\r\n1. `fasterq-dump` has a bug which is specific to conda environments. This has been updated in v3.0.3 but this patch has not yet been made available to conda environments for OSX. Please check [here](https://anaconda.org/bioconda/sra-tools) to see if this has been updated.\r\n2. The following error appears in some OSX-based R sessions, in a system-dependent manner:\r\n```\r\nError in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : \r\n polygon edge not found\r\n```\r\n\r\nThe fix for this bug is currently unknown\r\n\r\n## Download Raw Data\r\n\r\n### Outline\r\n\r\nThe file `samples.tsv` is used to specify all steps for this workflow.\r\nThis file must contain the columns: `accession`, `target`, `treatment` and `input`\r\n\r\n1. `accession` must be an SRA accession. Only single-end data is currently supported by this workflow\r\n2. `target` defines the ChIP target. All files common to a target and treatment will be used to generate summarised coverage in bigWig Files\r\n3. `treatment` defines the treatment group each file belongs to. If only one treatment exists, simply use the value 'control' or similar for every file\r\n4. `input` should contain the accession for the relevant input sample. These will only be downloaded once. 
Valid input samples are *required* for this workflow\r\n\r\nAs some HPCs restrict internet access for submitted jobs, *it may be prudent to run the initial rules in an interactive session* if at all possible.\r\nThis can be performed using the following (with 2 cores provided as an example)\r\n\r\n```bash\r\nsnakemake --use-conda --until get_fastq --cores 2\r\n```\r\n\r\n### Outputs\r\n\r\n- Downloaded files will be gzipped and written to `data/fastq/raw`.\r\n- `FastQC` and `MultiQC` will also be run, with output in `docs/qc/raw`\r\n\r\nBoth of these directories are able to be specified as relative paths in `config.yml`\r\n\r\n## Read Filtering\r\n\r\n### Outline\r\n\r\nRead trimming is performed using [AdapterRemoval](https://adapterremoval.readthedocs.io/en/stable/).\r\nDefault settings are customisable using config.yml, with the defaults set to discard reads shorter than 50nt, and to trim using quality scores with a threshold of Q30.\r\n\r\n### Outputs\r\n\r\n- Trimmed fastq.gz files will be written to `data/fastq/trimmed`\r\n- `FastQC` and `MultiQC` will also be run, with output in `docs/qc/trimmed`\r\n- AdapterRemoval 'settings' files will be written to `output/adapterremoval`\r\n\r\n## Alignments\r\n\r\n### Outline\r\n\r\nAlignment is performed using [`bowtie2`](https://bowtie-bio.sourceforge.net/bowtie2/manual.shtml) and it is assumed that this index is available before running this workflow.\r\nThe path and prefix must be provided using config.yml\r\n\r\nThis index will also be used to produce the file `chrom.sizes` which is essential for conversion of bedGraph files to the more efficient bigWig files.\r\n\r\n### Outputs\r\n\r\n- Alignments will be written to `data/aligned`\r\n- `bowtie2` log files will be written to `output/bowtie2` (not the conventional log directory)\r\n- The file `chrom.sizes` will be written to `output/annotations`\r\n\r\nBoth sorted and the original unsorted alignments will be returned.\r\nHowever, the unsorted alignments are marked 
with `temp()` and can be deleted using \r\n\r\n```bash\r\nsnakemake --delete-temp-output --cores 1\r\n```\r\n\r\n## Deduplication\r\n\r\n### Outline\r\n\r\nDeduplication is performed using [MarkDuplicates](https://gatk.broadinstitute.org/hc/en-us/articles/360037052812-MarkDuplicates-Picard-) from the Picard set of tools.\r\nBy default, deduplication will remove the duplicates from the set of alignments.\r\nAll resultant bam files will be sorted and indexed.\r\n\r\n### Outputs\r\n\r\n- Deduplicated alignments are written to `data/deduplicated` and are indexed\r\n- DuplicationMetrics files are written to `output/markDuplicates`\r\n\r\n## Peak Calling\r\n\r\n### Outline\r\n\r\nThis is performed using [`macs2 callpeak`](https://pypi.org/project/MACS2/).\r\n\r\n- Peak calling will be performed on:\r\n a. each sample individually, and \r\n b. merged samples for those sharing a common ChIP target and treatment group.\r\n- Coverage bigWig files for each individual sample are produced using CPM values (i.e. Signal Per Million Reads, SPMR)\r\n- For all combinations of target and treatment coverage bigWig files are also produced, along with fold-enrichment bigWig files\r\n\r\n### Outputs\r\n\r\n- Individual outputs are written to `output/macs2/{accession}`\r\n\t+ Peaks are written in `narrowPeak` format along with `summits.bed`\r\n\t+ bedGraph files are automatically converted to bigWig files, and the originals are marked with `temp()` for subsequent deletion\r\n\t+ callpeak log files are also added to this directory\r\n- Merged outputs are written to `output/macs2/{target}/`\r\n\t+ bedGraph Files are also converted to bigWig and marked with `temp()`\r\n\t+ Fold-Enrichment bigWig files are also created with the original bedGraph files marked with `temp()`" . "Bioinformatics, Genomics, Transcriptomics" . . "prepareChIPs:" . "https://workflowhub.eu/projects/148" . "#snakemake" . "2023-07-09 08:54:36+00:00" . "https://about.workflowhub.eu/" . 
"https://workflowhub.eu/workflows/528?version=1" . "1" . . . . . . "1307"^^ . "https://api.rohub.org/api/resources/98f60e6a-e6fa-4fbb-bc56-ccda2a3340e3/download/" . . "2023-09-08 11:58:24.214903+00:00" . "2023-09-08 11:58:30.946579+00:00" . . "make_greylist.R" . "2023-09-08 11:58:24.214903+00:00" . . . . "155"^^ . "https://api.rohub.org/api/resources/99e9f2a6-a57d-4443-86af-15afee791284/download/" . . "2023-09-08 11:58:24.233076+00:00" . "2023-09-08 11:58:38.816175+00:00" . . "bowtie2.yml" . "2023-09-08 11:58:24.233076+00:00" . . . . "2297"^^ . "https://api.rohub.org/api/resources/9b9b25b1-77f6-4dac-af7e-92fa55863cea/download/" . . "2023-09-08 11:58:24.211472+00:00" . "2023-09-08 11:58:29.377116+00:00" . . "create_site_yaml.R" . "2023-09-08 11:58:24.211472+00:00" . . . . "9719"^^ . "https://api.rohub.org/api/resources/9d59e482-707c-4522-87c0-c5ca05313e38/download/" . . "2023-09-08 11:58:24.207339+00:00" . "2023-09-08 11:58:27.495049+00:00" . . "align_qc.Rmd" . "2023-09-08 11:58:24.207339+00:00" . . . . "0"^^ . "https://api.rohub.org/api/resources/a034a7c7-cf41-47cb-920b-f4904701712a/download/" . . "2023-09-08 11:58:24.234013+00:00" . "2023-09-08 11:58:38.997256+00:00" . . ".nojekyll" . "2023-09-08 11:58:24.234013+00:00" . . . . "669"^^ . "https://api.rohub.org/api/resources/a0f3633c-223b-439d-99f3-2a4926996f00/download/" . . "2023-09-08 11:58:24.218411+00:00" . "2023-09-08 11:58:32.106362+00:00" . . "fastqc.smk" . "2023-09-08 11:58:24.218411+00:00" . . . . "33"^^ . "https://api.rohub.org/api/resources/a2b0640e-a918-4ddf-8f36-2021c706c9fe/download/" . . "2023-09-08 11:58:24.208262+00:00" . "2023-09-08 11:58:27.690429+00:00" . "text/tab-separated-values" . . "samples.tsv" . "2023-09-08 11:58:24.208262+00:00" . . . . "2640"^^ . "https://api.rohub.org/api/resources/a4a0093b-349f-443b-9fac-bb4a60a213bd/download/" . . "2023-09-08 11:58:24.220277+00:00" . "2023-09-08 11:58:32.819647+00:00" . "application/msword" . . "rulegraph.dot" . "2023-09-08 11:58:24.220277+00:00" . . 
. . "18397"^^ . "https://api.rohub.org/api/resources/acfc63b3-0045-4720-80a4-8a36d26a72b4/download/" . . "2023-09-08 11:58:24.206000+00:00" . "2023-09-08 11:58:26.969578+00:00" . . "_macs2_summary.Rmd" . "2023-09-08 11:58:24.206000+00:00" . . . . "167"^^ . "https://api.rohub.org/api/resources/b520ea10-64d3-4124-b441-154cd4749048/download/" . . "2023-09-08 11:58:24.229450+00:00" . "2023-09-08 11:58:36.574390+00:00" . . "picard_markduplicates.yml" . "2023-09-08 11:58:24.229450+00:00" . . . . "774"^^ . "https://api.rohub.org/api/resources/b7726c93-0c81-4159-89fb-e8d59cdc7182/download/" . . "2023-09-08 11:58:24.212723+00:00" . "2023-09-08 11:58:30.135042+00:00" . "text/x-python" . . "samtools_sort.py" . "2023-09-08 11:58:24.212723+00:00" . . . . "99"^^ . "https://api.rohub.org/api/resources/bceff5e5-5748-47fa-8225-46eda16071b0/download/" . . "2023-09-08 11:58:24.230660+00:00" . "2023-09-08 11:58:37.859407+00:00" . . "multiqc.yml" . "2023-09-08 11:58:24.230660+00:00" . . . . "504"^^ . "https://api.rohub.org/api/resources/bf76b440-5e45-49fc-bc9d-fe7b0398a545/download/" . . "2023-09-08 11:58:24.231251+00:00" . "2023-09-08 11:58:38.065476+00:00" . . "rmarkdown.yml" . "2023-09-08 11:58:24.231251+00:00" . . . . "713"^^ . "https://api.rohub.org/api/resources/c4e0cdb1-351c-4fc2-8f72-55d35d80297e/download/" . . "2023-09-08 11:58:24.222185+00:00" . "2023-09-08 11:58:33.600249+00:00" . . "make_greylist.smk" . "2023-09-08 11:58:24.222185+00:00" . . . . "145"^^ . "https://api.rohub.org/api/resources/d29dda34-5600-44a7-80be-d89fe409e4d0/download/" . . "2023-09-08 11:58:24.228219+00:00" . "2023-09-08 11:58:36.226230+00:00" . . "greylist.yml" . "2023-09-08 11:58:24.228219+00:00" . . . . "2643"^^ . "https://api.rohub.org/api/resources/d38663ba-d7a4-4c02-981c-f14cef1ade0a/download/" . . "2023-09-08 11:58:24.215552+00:00" . "2023-09-08 11:58:31.150098+00:00" . "text/x-python" . . "bowtie2.py" . "2023-09-08 11:58:24.215552+00:00" . . . . "139"^^ . 
"https://api.rohub.org/api/resources/dd1cc899-8d02-43b3-8bf7-38de97153ed0/download/" . . "2023-09-08 11:58:24.231866+00:00" . "2023-09-08 11:58:38.456481+00:00" . . "fasterq-dump.yml" . "2023-09-08 11:58:24.231866+00:00" . . . . "1274"^^ . "https://api.rohub.org/api/resources/e0e3ae31-7616-4e18-9b9c-fa616bffa161/download/" . . "2023-09-08 11:58:24.225409+00:00" . "2023-09-08 11:58:35.254805+00:00" . . "samtools.smk" . "2023-09-08 11:58:24.225409+00:00" . . . . . . "space sciences (general)" . . "5.774895344811626" . "0.05067460238933563" . "oceanography" . . "24.372858818528897" . "0.43228960037231445" . "space sciences" . . "5.774895344811626" . "0.05067460238933563" . "treatment coverage bigwig file" . . "15.384615384615383" . "12.6" . "default" . . "8.760951188986231" . "7.0" . "- Coverage bigWig files for each individual sample are produced using CPM values (i.e. Signal Per Million Reads, SPMR)\r\n- For all combinations of target and treatment coverage bigWig files are also produced, along with fold-enrichment bigWig files" . . "18.06167400881057" . "8.2" . "IT-computer sciences" . . "Science and technology/Technology and engineering/IT-computer sciences" . "data" . . "11.627906976744185" . "20.0" . "earth sciences" . . "75.6271411814711" . "1.3413619995117188" . "bedGraph file" . . "14.529914529914528" . "11.9" . "output" . . "19.899874843554443" . "15.9" . "Hardware" . . "Economy, business and finance/Economic sector/Computing and information technology/Hardware" . "This index will also be used to produce the file `chrom.sizes` which is essential for conversion of bedGraph files to the more efficient bigWig files." . . "25.550660792951543" . "11.6" . "file" . . "33.95348837209303" . "58.400000000000006" . "setting" . . "9.136420525657073" . "7.300000000000001" . "computer hardware" . . "5.787781350482315" . "3.6" . "Fold-Enrichment bigwig file" . . "19.536019536019534" . "16.0" . "Hardware" . . 
"Economy, business and finance/Economic sector/Computing and information technology/Hardware" . "software" . . "11.254019292604502" . "7.0" . "sample" . . "5.5813953488372094" . "9.6" . "callpeak log file" . . "14.163614163614163" . "11.6" . "Computer networking" . . "Economy, business and finance/Economic sector/Computing and information technology/Computer networking" . "Computer networking" . . "Economy, business and finance/Economic sector/Computing and information technology/Computer networking" . "computer science" . . "82.95819935691318" . "51.6" . "mathematical and computer sciences" . . "94.22510465518837" . "0.8268235921859741" . "computer operations and hardware" . . "94.22510465518837" . "0.8268235921859741" . "default setting" . . "8.79120879120879" . "7.2" . "target" . . "6.046511627906977" . "10.4" . "default option" . . "6.511627906976744" . "11.2" . "bigwig" . . "9.011264080100124" . "7.2" . "file" . . "29.036295369211512" . "23.2" . "electronic journal" . . "7.093023255813954" . "12.2" . "IT-computer sciences" . . "Science and technology/Technology and engineering/IT-computer sciences" . "single-end data" . . "10.378510378510377" . "8.5" . "kingpin" . . "6.8604651162790695" . "11.8" . "fold-enrichment bigwig file" . . "8.913308913308914" . "7.300000000000001" . "setting" . . "6.8604651162790695" . "11.8" . "Stevie Pederson" . . "Black Ochre Data Labs" . . . . "2025-11-11T16:09:17.560+01:00"^^ . . . "Research Object Crate for prepareChIPs:" . "RSA" . "MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA4pPaESKwmC6l37P86K6TNLq6yeQtc7m9CvcqauLs/1FC0viHvQnFBgxj0a+loPDv/Egwe6OqFpa0iW9Ypnyz9YPoh+pxbRXonbuMOb+8Ry9hXZ+TEKfWjhjVDGEaClwfRwglh2HI/xfV4CD9AgvDOEoZQiyta8a90PYwJ3G6e70oCHTn61+OWTkI9KRYHOYgg3btdy2Z7q/30PTFawb2ZT5aIfIJYobUYv2a7yhtcqWCHZeKv0bxGnRjTFNx1rscBMlLJSzvRtpQc1cCRVEPFZHo1adaXCI9tGvn4cxeNQ96y8dxkN1XhpaJairde+23MDzf42Oe97KG2HYzKiyVnQIDAQAB" . 
"mHFnagI75sf3KahssYHdh575grpPujTQBmaVNxnezsLeu4xH6ffMcCAF0nH6VKY+f9XVONOMSqL/KHy5ARfK+riaWu1Oghia1roOAOHzSzak/G9fzTxmlyTMxWEZ7qg5Ehy9mM+YF2hriGqcQrpYD0gdkCd3OSZE2h/plbL/y1Qib6ftuwa9deMyv1auU4EwHiM8jnbGEYFHd3mWXyjYSWKqpHTG5bG3pOMCrEns1Ty74oR8gioMq9eGfyFH5E0I59TiiRF8yt+oVGQjE0jEEZE6I8VpNEFtz3gMQRmkj3tgO1hw+h4/WN//fzd5SeBPfzf2ad7poAEdh8d0k09X4Q==" . . .