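Common workflow tools for bioinformatics analysis fall into a few camps: snakemake, WDL, CWL, and nextflow, and there is even a make-based school. After bouncing back and forth between snakemake and nextflow, I concluded that it is best to know both.
Official documentation: https://www.nextflow.io/docs/latest/index.html
Installation
nextflow depends on a Java environment. The current nextflow version at the time of writing is 25.10.2.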
curl -s https://get.nextflow.io | bash
chmod +x nextflow
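Since the installer depends on Java, it can help to check the Java version first. The following is a minimal sketch, assuming that nextflow 25.x needs Java 17 or later (an assumption here; the official installation page gives the exact supported range). java_major is a hypothetical helper name, not part of nextflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the major version from a Java version string.
# Handles both the legacy "1.8.0_392" scheme and the modern "17.0.9" scheme.
java_major() {
    local v="$1"
    case "$v" in
        1.*) v="${v#1.}" ;;   # legacy scheme: 1.8.0_392 -> 8.0_392
    esac
    echo "${v%%[!0-9]*}"      # keep only the leading digits
}

# Check the installed Java, if any.
# The minimum of 17 is an assumption; confirm it against the official docs.
if command -v java >/dev/null 2>&1; then
    ver=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
    major=$(java_major "$ver")
    if [ "${major:-0}" -ge 17 ]; then
        echo "Java $ver looks new enough"
    else
        echo "Java $ver is probably too old for nextflow" >&2
    fi
else
    echo "java not found on PATH" >&2
fi
```

Once the launcher is installed, ./nextflow -version prints the installed version and is a quick way to confirm that the download worked.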
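Building the test files
Create 20 pairs of empty paired-end FASTQ files in the data directory for later use (run the two commands below from inside data, since they write to the current directory):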
seq 1 20|xargs -I {} touch sample_{}_1.fq
seq 1 20|xargs -I {} touch sample_{}_2.fq
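The same files can also be created from the project root in one step. A sketch that creates the directory first and then counts the result; make_test_reads is a hypothetical helper name, and passing the directory and sample count as arguments is a choice made here, not something from the original commands.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create N pairs of empty FASTQ files in a directory.
make_test_reads() {
    local dir="$1" n="$2" i
    mkdir -p "$dir"
    for i in $(seq 1 "$n"); do
        touch "$dir/sample_${i}_1.fq" "$dir/sample_${i}_2.fq"
    done
}

make_test_reads data 20
ls data/sample_*_1.fq | wc -l   # count the forward-read files that now exist
```

Because touch only updates timestamps on existing files, running this again is harmless.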
Writing a test example
#!/usr/bin/env nextflow
nextflow.enable.dsl = 2

params.read_path = "${workflow.projectDir}/data"
params.outdir = "${workflow.projectDir}/result"
params.pattern = "*_{1,2}.fq"

process chd {
    publishDir params.outdir, mode: 'link'

    input:
    tuple val(sample_id), path(reads)

    output:
    path "${sample_id}_info.txt", emit: sample_info

    script:
    """
    echo "sample_id: $sample_id, seq_file: ${reads[0]}:\t:${reads[1]}" > ${sample_id}_info.txt
    """
    // Alternative: in a shell: block written with triple single quotes,
    // Nextflow variables are referenced as !{variable}
    // ''' echo "sample_id: !{sample_id}, seq_file: !{reads}" '''
}

process cats {
    publishDir params.outdir, mode: 'link'
    cache 'lenient' // avoid needless re-runs

    input:
    path(sample_files)

    output:
    path "merged.txt"

    script:
    """
    for file in ${sample_files}; do
        cat \$file >> "merged.txt"
    done
    """
}

workflow {
    println "workdir: ${workflow.projectDir}"
    ch_fq = channel.fromFilePairs("${params.read_path}/${params.pattern}", flat: false, checkIfExists: true)
    // Declare the closure parameter explicitly (e.g. sample_data) instead of relying on the implicit it
    ch_fq.view { sample_data -> "raw ctx: sm=${sample_data[0]}, fq1 = ${sample_data[1][0]}, fq2= ${sample_data[1][1]}" }
    chd_out = chd(ch_fq)
    res1 = chd_out.sample_info.collect()
    cats(res1)
}
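In the workflow above, channel.fromFilePairs turns the files matching *_{1,2}.fq into tuples of a grouping key and a list of two files, which is what the chd process receives as sample_id and reads. The sketch below is a rough shell imitation of that grouping for this particular pattern, meant only to show what the tuples should look like; it is not how nextflow implements it, and pair_id and list_pairs are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the grouping key for a file matching *_{1,2}.fq
# by stripping the directory and the _1.fq / _2.fq suffix.
pair_id() {
    local name="${1##*/}"
    name="${name%_1.fq}"
    name="${name%_2.fq}"
    echo "$name"
}

# Print one "sample_id<TAB>fq1<TAB>fq2" line per complete pair in a directory.
list_pairs() {
    local dir="$1" fq1 fq2
    for fq1 in "$dir"/*_1.fq; do
        [ -e "$fq1" ] || continue   # the glob matched nothing
        fq2="${fq1%_1.fq}_2.fq"
        [ -e "$fq2" ] || continue   # this sketch skips files without a mate
        printf '%s\t%s\t%s\n' "$(pair_id "$fq1")" "${fq1##*/}" "${fq2##*/}"
    done
}

list_pairs data
```

Comparing this listing with the lines printed by ch_fq.view is a quick sanity check on the pattern. To launch the workflow itself, assuming the script is saved under a name such as main.nf (a filename chosen here for illustration), run ./nextflow run main.nf. Adding -resume reuses cached task results, and a double-dash option such as --outdir overrides the matching params value.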