Directory:
tophat_alignment_1511_6.fastq
Files:
accepted_hits.bam
accepted_hits.txt
deletions.bed
insertions.bed
junctions.bed
prep_reads.info
unmapped.bam
with no indication of the sample they came from. This is not a problem for me, as my subsequent pipelines use the directory name when importing samples. However, because this analysis will be used by many different members of the lab long after I have left, there is potential for confusion. To remedy this I wrote the script below, which takes the directory name (for example tophat_alignment_1511_6.fastq), cuts out the sample name (1511_6) and prefixes it to each file name in that directory. I kept the script simple so that it was easy to read and test before running it for real: because it loops through every alignment directory renaming files, a bug has the potential to cause massive problems.
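# Loop over every alignment directory, e.g. tophat_alignment_1511_6.fastq
for i in /SEQDATA/RNASEQ/my_mutants/Alignments/*
do
    cd "$i"
    # Loop over every file in that directory
    for x in "$i"/*
    do
        FNAME=$(basename "$x")
        DNAME=$(dirname "$x")
        # Cut the sample name (e.g. 1511_6) out of the directory name
        SAMPLENAME=$(echo "$DNAME" | sed "s/^.*tophat_alignment_\(.*\)\.fastq$/\1/")
        NEW="$DNAME/$SAMPLENAME.$FNAME"
        OLD="$x"
        # Quoted so that paths containing spaces are handled correctly
        mv "$OLD" "$NEW"
    done
done
cd /SEQDATA/RNASEQ/my_mutants/Alignments/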
Returning:
1511_6.accepted_hits.bam
1511_6.accepted_hits.txt
1511_6.deletions.bed
1511_6.insertions.bed
1511_6.junctions.bed
1511_6.prep_reads.info
1511_6.unmapped.bam
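
Since a bug in a bulk rename is costly, one way to rehearse the idea before touching real data is to run it against a throwaway directory first. The following is only a minimal sketch under a few assumptions: the scratch directory comes from mktemp, the function name rename_with_sample_prefix is made up for illustration, and the check that skips files already carrying the prefix is an addition that is not part of my script above.

```shell
#!/usr/bin/env bash
# Sketch: rehearse the rename on a throwaway copy of the directory layout.
set -eu

# Hypothetical scratch location, created fresh so no real data is touched.
ROOT=$(mktemp -d)
mkdir "$ROOT/tophat_alignment_1511_6.fastq"
touch "$ROOT/tophat_alignment_1511_6.fastq/accepted_hits.bam" \
      "$ROOT/tophat_alignment_1511_6.fastq/junctions.bed"

# Hypothetical helper (name is an assumption, not from the original script).
rename_with_sample_prefix() {
    # $1: directory holding the tophat_alignment_<sample>.fastq directories
    local dir sample path fname
    for dir in "$1"/tophat_alignment_*.fastq; do
        [ -d "$dir" ] || continue
        # Strip the fixed prefix and suffix to leave the sample name, e.g. 1511_6
        sample=$(basename "$dir")
        sample=${sample#tophat_alignment_}
        sample=${sample%.fastq}
        for path in "$dir"/*; do
            [ -f "$path" ] || continue
            fname=$(basename "$path")
            # Assumption: skip files that already carry the prefix,
            # so running the function a second time changes nothing.
            case "$fname" in
                "$sample".*) continue ;;
            esac
            mv -- "$path" "$dir/$sample.$fname"
        done
    done
}

rename_with_sample_prefix "$ROOT"
rename_with_sample_prefix "$ROOT"   # second run should be a no-op
ls "$ROOT/tophat_alignment_1511_6.fastq"
```

If the listing shows only the prefixed names, the same function could then be pointed at the real Alignments directory. The sample name is extracted here with shell parameter expansion rather than sed; that is a swapped-in technique, chosen only because it avoids a subshell per file.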