
Tag Archive for snakemake

Capturing output, stderr and stdout in Snakemake

I am learning Snakemake and cannot manage to capture stdout and stderr in the log file.
This is my code:

Access Profile Information in Snakemake Rule

I am using Snakemake to execute a workflow across multiple cluster environments. I therefore use Snakemake profiles per cluster to define CPUs, memory, etc. However, there are a few inputs that need to be run in single threaded mode.

Snakemake not re-running workflow or rule after script update

In snakemake, how do I indicate that a script is a dependency of an output file such that if the script changes, the rule will re-run?
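
One way to picture the dependency being asked about: if the script is treated like any other input file, a newer script makes the output look out of date. The plain-Python sketch below only illustrates that timestamp comparison; it is not Snakemake's actual implementation, and `is_stale`, `demo`, and the file names are hypothetical.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterable


def is_stale(output: Path, dependencies: Iterable[Path]) -> bool:
    """Return True if the output is missing or older than any dependency."""
    if not output.exists():
        return True
    out_mtime = output.stat().st_mtime
    return any(dep.stat().st_mtime > out_mtime for dep in dependencies)


def demo():
    """Check staleness before and after the (hypothetical) script is touched."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "plot.py"
        data = Path(tmp) / "data.csv"
        result = Path(tmp) / "plot.png"
        for path in (script, data, result):
            path.write_text("placeholder")

        # Inputs are older than the output: nothing to redo.
        os.utime(script, (1000, 1000))
        os.utime(data, (1000, 1000))
        os.utime(result, (2000, 2000))
        before_edit = is_stale(result, [data, script])

        # Simulate editing the script after the output was produced.
        os.utime(script, (3000, 3000))
        after_edit = is_stale(result, [data, script])
        return before_edit, after_edit
```

In a Snakefile, the corresponding idea would be to list the script among the rule's inputs so that its modification time is compared in the same way as any data file.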

Prevent Snakemake from recreating specific files

I’ve developed a simple testing framework for Snakemake workflows. Each test case has some number of input/output files. Tests are run like so:

How to obtain syntax highlight/coloring for Snakemake file?

The Snakemake documentation shows syntax highlighting for the Snakefile, like here

Snakemake checkpoint is not evaluated

I have a checkpoint rule that creates files whose names are unknown in advance, and each of those files has to be processed by rule BaseCellCounter. I can’t work out how to make Snakemake understand that bam=f"SplitBam/{{scDNA}}/{{scDNA}}.{{clone}}.bam" is created by checkpoint SplitBam, and linking rule AggregateSplitBamOutput doesn’t help. As a result, the DAG is not built.
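
The discovery step that such a workflow needs once the checkpoint has finished (listing the files that were actually written and recovering the `clone` values from their names) can be sketched in plain Python. This does not use Snakemake's checkpoint API; `discover_clones`, `demo`, and the sample name `cellA` are hypothetical, and only the `SplitBam/<scDNA>/<scDNA>.<clone>.bam` layout is taken from the question.

```python
import re
import tempfile
from pathlib import Path


def discover_clones(split_dir: Path, sc_dna: str):
    """List clone names from files named <sc_dna>.<clone>.bam in split_dir/<sc_dna>."""
    pattern = re.compile(rf"^{re.escape(sc_dna)}\.(?P<clone>[^.]+)\.bam$")
    clones = []
    for path in sorted((split_dir / sc_dna).iterdir()):
        match = pattern.match(path.name)
        if match:  # skip index files and anything else that is not a BAM
            clones.append(match.group("clone"))
    return clones


def demo():
    """Create a throwaway directory mimicking the checkpoint output and scan it."""
    with tempfile.TemporaryDirectory() as tmp:
        sample_dir = Path(tmp) / "SplitBam" / "cellA"
        sample_dir.mkdir(parents=True)
        for name in ("cellA.clone1.bam", "cellA.clone2.bam",
                     "cellA.clone1.bam.bai", "notes.txt"):
            (sample_dir / name).touch()
        return discover_clones(Path(tmp) / "SplitBam", "cellA")
```

The key point is that the list of clones is computed from the directory contents after the splitting step has run, rather than being declared up front.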

Snakemake – How Do I Combine Parameters Conditionally?

Suppose I have the following input FASTQ files:

Snakemake: how to produce multiple outputs from one input

I am trying to make a Snakemake workflow for Earth Observation applications, and I have to download data from S3. First, I have a rule that queries the data I need based on parameters in a file. The output of this rule is a list containing the data I need to download.

Making a pipeline with a variable number of LANES and a variable number inside filename

# Directory paths
DATA_DIR = "/home/debian/data1/breast/data/wes"
REFERENCE_DIR = "/home/debian/data2/genomes/human_GRCh38.p14"
REFERENCE_GENOME = os.path.join(REFERENCE_DIR, "GRCh38_latest_genomic.fna.gz")

samples = ['100B','101B']

rule bwa_index:
    input:
        REFERENCE_GENOME
    output:
        multiext(os.path.join(REFERENCE_DIR, "GRCh38_latest_genomic.fna.gz"), ".amb", ".ann", ".bwt", ".pac", ".sa")
    log:
        "logs/bwa_index.log"
    shell:
        """
        bwa index {input} &> {log}
        """

rule run_fastp:
    input:
        r1=os.path.join(DATA_DIR,{SAMPLE},"raw","{SAMPLE}-exome-tumor_{SEQ_ID}_L{LANE}_R1_001.fastq.gz",sample=SAMPLES),
        r2=os.path.join(DATA_DIR,{SAMPLE},"raw","{SAMPLE}-exome-tumor_{SEQ_ID}_L{LANE}_R2_001.fastq.gz",sample=SAMPLES)
    output:
        r1_trimmed="fastp/{SAMPLE}/{SAMPLE}-exome-tumor_{SEQ_ID}_L{LANE}_R1_001.fastq.gz",
        r2_trimmed="fastp/{SAMPLE}/{SAMPLE}-exome-tumor_{SEQ_ID}_L{LANE}_R1_001.fastq.gz"
    threads: 8
    shell:
        """
        fastp -i {input.r1} -I {input.r2} […]

expand based on a python dictionary

Considering I have the following python dictionary:
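
The dictionary itself is cut off in this excerpt. As a purely hypothetical stand-in, the sketch below assumes a mapping from sample name to a list of lanes and builds one path per (sample, lane) pair in plain Python. Snakemake's `expand` combines its wildcard values as a full product by default, so a per-key pairing like this is one way to avoid generating combinations that do not exist in the dictionary. `expand_from_dict`, `lanes_by_sample`, and the path pattern are all invented for illustration.

```python
def expand_from_dict(pattern, mapping):
    """Format `pattern` once for every (key, value) pair in a dict of lists."""
    return [
        pattern.format(sample=sample, lane=lane)
        for sample, lanes in mapping.items()
        for lane in lanes
    ]


# Hypothetical data: not the dictionary from the original question.
lanes_by_sample = {"A": ["L001", "L002"], "B": ["L001"]}

paths = expand_from_dict("trimmed/{sample}_{lane}.fastq.gz", lanes_by_sample)
```

The resulting list could then be passed as the input of a target rule in place of a plain `expand` call.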
