Paul Devine

Yes, with remote commands it takes around an hour to pull the data. I think the results are held in memory and then dumped to the text file at the end. If you don't specify the max alignments you will get 100-200 results for each sequence, and this can take ages. If the search goes on for longer than an hour, BLAST terminates it. I think there are so many accessions for SARS-CoV-2 that it just won't run; as you say, you can even get the same errors on web BLAST, around 25% of the time for me. But for 30-50 contigs I just leave it running for an hour, and it has a good success rate if SARS-CoV-2 is omitted. You are reliant on an internet connection, though, so local databases can be quicker and more reliable, even though the searches might be smaller.

The output of Trinity is around 70-90 GB, which is insane if you only want to look at a 70 MB FASTA file. I also needed to split the data from SRA differently:

fastq-dump --defline-seq '@$sn[_$rn]/$ri' --split-files SRR10971381.sra

I can't get it to finish any other way. If you want to change GCC to 11 without brew or any Docker-type images, you can edit the makefile in Chrysalis (around lines 300-310) and manually change the GCC version there; it should build fine after that.

Just to add, removing some of the output options on remote BLAST also made it run quicker and more successfully, so I keep these to a minimum.
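
For anyone wanting to try this, here is a rough sketch of what a trimmed-down remote search might look like. It is not the exact command used above: the file names, the cap of 5 hits, the choice of output columns and the Entrez filter for leaving out SARS-CoV-2 (taxid 2697049) are all assumptions for illustration. The flags themselves (-remote, -max_target_seqs, -outfmt, -entrez_query) are standard BLAST+ options.

```shell
#!/usr/bin/env bash
# Sketch only: file names and limits below are assumptions, not the settings from the comment.
query="contigs.fasta"      # hypothetical input, e.g. the 30-50 contigs to search
out="blast_remote.tsv"     # hypothetical output file

blast_args=(
  -remote                  # run the search on NCBI's servers
  -db nt
  -query "$query"
  -max_target_seqs 5       # cap the hits per contig rather than leaving it unset
  -outfmt "6 qseqid sacc pident length evalue stitle"   # only a few tabular columns
  -entrez_query "all[filter] NOT txid2697049[Organism:exp]"   # assumed way to omit SARS-CoV-2
  -out "$out"
)

# Show the command, then run it only if BLAST+ and the query file are actually present.
echo "blastn ${blast_args[*]}"
if command -v blastn >/dev/null 2>&1 && [ -f "$query" ]; then
  blastn "${blast_args[@]}"
fi
```

Keeping the options in an array makes it easy to comment one out and see which of them changes the run time.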
