The latest version of newbler, version 2.6, has some welcome additions for input and output. As I have so far only treated de novo assembly, I will skip the updates on the gsMapper (except for mentioning that it is now able to provide a bam file using the -bam option).
Posts Tagged ‘newbler parameters’
Posted by lexnederbragt on July 12, 2011
Posted by lexnederbragt on March 22, 2011
Recently, newbler version 2.5.3 became available. In this post, I’ll describe the changes between this version and the previous one (2.3). As I have not yet described the gsMapper function of newbler, I will only discuss changes relevant to assembly (gsAssembler, runAssembly).
Posted by lexnederbragt on August 31, 2010
Since version 2.3, newbler has a -cdna option for de novo transcriptome assembly. In this post, I’ll explain the principles of transcriptome assembly and how to set one up. The next post will discuss the output of a transcriptome assembly.
1) Principles of transcriptome assembly
The first steps of a transcriptome assembly are identical to those of other assembly projects: newbler builds a contig graph, see this post. Ideally, the reads coming from the transcript of a certain gene should result in a single contig. However, because of splice variants (and other sequence particularities), there may be several contigs for each transcript, which themselves form a small contig graph. Splice variants result in reads that, relative to other reads, have an insert (representing an additional exon in the transcript), thereby breaking the contig graph, see the figure.
So, for transcriptome projects, there will be numerous subgraphs, each potentially representing one gene. Each of these subgraphs is called an isogroup. Next, newbler will traverse the contigs in the subgraph of each isogroup to generate transcript variants, which are called isotigs, again, see the figure. There are certain rules for this traversal step, for example, for starting a path and for ending it. Another rule, for complex graphs, is a cutoff such that no more than a maximum number of isotigs is generated per isogroup (by default set to 100 isotigs). If fully traversing the graph would result in more isotigs than this maximum, the contigs of this isogroup are reported in the output instead of the isotigs.
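To make the traversal step a bit more concrete, here is a minimal Python sketch of the idea. It is not newbler's actual algorithm: the function name `enumerate_isotigs`, the dictionary representation of the contig graph, and the start and end rules (start at contigs with no incoming edge, end at contigs with no outgoing edge) are all assumptions made for illustration. Only the cutoff of 100 isotigs and the fallback to reporting contigs come from the description above.

```python
from typing import Dict, List


def enumerate_isotigs(graph: Dict[str, List[str]],
                      max_isotigs: int = 100) -> List[List[str]]:
    """Toy traversal of one isogroup's contig graph (hypothetical, not newbler code).

    `graph` maps a contig name to the contigs that can follow it.
    The graph is assumed to be acyclic.
    """
    # Collect every contig, including those that only appear as a successor.
    successors_seen = {t for targets in graph.values() for t in targets}
    nodes = set(graph) | successors_seen

    # Assumed start rule: a path starts at a contig without incoming edges.
    starts = sorted(n for n in nodes if n not in successors_seen)
    paths: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        # Stop exploring once the cutoff has already been exceeded.
        if len(paths) > max_isotigs:
            return
        following = graph.get(node, [])
        # Assumed end rule: a path ends at a contig without outgoing edges.
        if not following:
            paths.append(path)
            return
        for nxt in following:
            walk(nxt, path + [nxt])

    for start in starts:
        walk(start, [start])

    # Too many paths: report the individual contigs instead of isotigs.
    if len(paths) > max_isotigs:
        return [[n] for n in sorted(nodes)]
    return paths


# A gene with one optional exon: contig2 is present in one splice variant only.
splice_graph = {
    "contig1": ["contig2", "contig3"],
    "contig2": ["contig3"],
}
print(enumerate_isotigs(splice_graph))
# [['contig1', 'contig2', 'contig3'], ['contig1', 'contig3']]
```

With the default cutoff, the two splice variants come out as two isotigs. Lowering `max_isotigs` to 1 for the same graph makes the sketch fall back to the three individual contigs, which mimics the behaviour described for complex isogroups.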
Posted by lexnederbragt on June 10, 2010
This post is about how to start up newbler for de novo assembly projects. I will describe setting up newbler using the command line. Most of the options I will mention are also available through the GUI version, but I will not describe how to use them here.
For a description of the progress that newbler reports during assembly, please check this post. The different output files are described in a series of previous posts.
1) default newbler on one or more files
This is the simplest way of running newbler: just provide it with one sff file. It will generate a folder with a name along the lines of P_yyyy_mm_dd_hh_min_sec_runAssembly and put all output in there. If you want to have control over the name of this folder, use
runAssembly -o project1 /data/sff/EYV886410.sff
-o sets the name of the folder newbler will write all output to, in this case ‘project1’