Discussion:
[galaxy-dev] Salmon references and data manager
Previti
2018-09-07 07:48:23 UTC
Dear Björn,

I just installed Salmon on our Galaxy instance and I have a couple of
basic questions.

Currently the reference transcriptomes are put in the same data table as
the genomes. Would it be of interest to separate these and give the
transcriptomes their own table? I could probably try to do this...

There is a data manager available that unfortunately has a bug. We fixed
that and it now populates the reference genome data table.

I would probably modify this as well to use the new table. Could this be
useful? I'm not sure how to proceed... would I give you the modified
Salmon wrapper for inclusion in the package?

Best regards,

Christopher
--
*Dr. Christopher Previti*
Genomics and Proteomics Core Facility
High Throughput Sequencing (W190)
Bioinformatician

German Cancer Research Center (DKFZ)
Foundation under Public Law
Im Neuenheimer Feld 580
69120 Heidelberg
Germany
Room: B2.102 (INF580/TP3)
Phone: +49 6221 42-4661

***@dkfz.de <http://www.dkfz.de/>
www.dkfz.de <http://www.dkfz.de/>

Management Board: Prof. Dr. Michael Baumann, Prof. Dr. Josef Puchta
VAT-ID No.: DE143293537

Confidentiality notice: This message is intended solely for the
persons to whom it is addressed.
It may contain confidential information and/or information intended
only for the recipient(s). If you are not
the intended recipient, please contact the
sender and delete the message.
Any unauthorized use of the information in this message is
prohibited.
Björn Grüning
2018-09-07 07:56:41 UTC
Hi Christopher!
Post by Previti
Dear Björn,
I just installed Salmon on our Galaxy instance and I have a couple of
basic questions.
Sure, thanks for getting in touch!
Post by Previti
Currently the reference transcriptomes are put in the same data table as
the genomes, would it be of interest to separate this and give the
transcriptomes their own table? I could probably try to do this...
That I don't understand?
Salmon is using this one here, isn't it?

https://github.com/bgruening/galaxytools/blob/master/tools/salmon/salmon.xml#L233
Post by Previti
There is a data manager available that unfortunately has a bug. We fixed
that and it now populates the reference genome data table.
Do you mean this one?

https://github.com/ieguinoa/data_manager_salmon_index_builder
Post by Previti
I would probably modify this as well to use the new table. Could this be
useful? I'm not sure how to proceed...would I give you the modified
Salmon wrapper for inclusion in the package?
If you can, please feel free to create PRs to the repositories, so we
can all review it. And then, when we merge, it gets automatically
updated to the Tool Shed :)

Thanks!
Bjoern
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client. To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
http
Ignacio EGUINOA
2018-09-07 08:46:07 UTC
Hi Christopher and Björn,

I have some comments about this because I also came up with these questions some time ago...
Post by Previti
Currently the reference transcriptomes are put in the same data table as
the genomes, would it be of interest to separate this and give the
transcriptomes their own table? I could probably try to do this...
That I don't understand?
Salmon is using this one here, isn't it?
https://github.com/bgruening/galaxytools/blob/master/tools/salmon/salmon.xml#L233
What he means, I think, is the table to build the index from. Data managers that take a transcriptome as input get it from the all_fasta table; I think that is what he means by the genomes table.
As I said, at some point I also thought it might be useful to have a separate table (e.g. all_transcriptomes) so that the genome and transcriptome entries of the same build don't get mixed. I think it would be good to have a way of listing only the transcriptomes from the all_gff, but that would require some kind of naming standard to filter on. We had this in our instance at some point, but it didn't help at all, so I just modified the data manager to use all_fasta, and that is what I published.
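To make the naming idea concrete, here is a minimal sketch of filtering all_fasta entries by display name. It assumes the usual tab-separated .loc layout of value, dbkey, name, path, and it assumes a purely hypothetical convention in which transcriptome entries carry the word "transcriptome" in their name; the helper names and example paths are made up for illustration and are not part of Galaxy or of the data manager.

```python
from typing import Iterable, List, NamedTuple


class FastaEntry(NamedTuple):
    """One row of an all_fasta-style .loc file (assumed column order)."""
    value: str
    dbkey: str
    name: str
    path: str


def parse_loc_lines(lines: Iterable[str]) -> List[FastaEntry]:
    """Parse tab-separated .loc lines, skipping blank lines and '#' comments."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # malformed line, ignore it
        entries.append(FastaEntry(*fields[:4]))
    return entries


def only_transcriptomes(entries: List[FastaEntry],
                        marker: str = "transcriptome") -> List[FastaEntry]:
    """Keep entries whose display name contains the (hypothetical) marker."""
    return [e for e in entries if marker.lower() in e.name.lower()]


# Made-up example rows; real entries would come from tool-data/all_fasta.loc
example = [
    "# value\tdbkey\tname\tpath\n",
    "hg38\thg38\tHuman (hg38) genome\t/data/hg38/hg38.fa\n",
    "hg38_tx\thg38\tHuman (hg38) transcriptome\t/data/hg38/hg38_tx.fa\n",
]
transcriptomes = only_transcriptomes(parse_loc_lines(example))
```

A filter like this only works as well as the naming discipline behind it, which is the weakness of the approach compared with a dedicated table.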
Post by Previti
There is a data manager available that unfortunately has a bug. We fixed
that and it now populates the reference genome data table.
Do you mean this one?
https://github.com/ieguinoa/data_manager_salmon_index_builder
Post by Previti
I would probably modify this as well to use the new table. Could this be
useful? I'm not sure how to proceed...would I give you the modified
Salmon wrapper for inclusion in the package?
If you can, please feel free to create PRs to the repositories, so we
can all review it. And then, when we merge, it gets automatically
updated to the Tool Shed :)
As Björn said, if that's the one you are talking about, please create a PR or an issue, or contact me.

Cheers,
Ignacio
Previti
2018-09-07 10:33:49 UTC
Yeah, I got confused about the data tables. Sorry about this. I too
would keep the transcriptome indices separate from the reference
genomes, it just makes sense.

@Ignacio, I found that you need to insert the following (in red)

if not os.path.exists( target_directory ):
    os.mkdir( target_directory )

args = ['salmon','index']

in order for anything to happen.

I think that's it...but I'll test some more.
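
For context, a rough sketch of how that directory check could sit in front of the index call. The function names below are hypothetical and the real data manager script may be organised differently; the only salmon options used are -t (transcript FASTA) and -i (index output directory).

```python
import os
import subprocess
import tempfile


def prepare_salmon_index_args(fasta_filename, target_directory):
    """Create target_directory if needed and return the salmon command line."""
    # The fix described above: make sure the output directory exists first.
    if not os.path.exists(target_directory):
        os.mkdir(target_directory)
    # -t: transcript FASTA, -i: directory the index is written to
    return ["salmon", "index", "-t", fasta_filename, "-i", target_directory]


def build_salmon_index(fasta_filename, target_directory):
    """Run salmon index; raises CalledProcessError on a non-zero exit code."""
    args = prepare_salmon_index_args(fasta_filename, target_directory)
    subprocess.check_call(args)


# Demonstration of the directory handling only; salmon itself is not run here.
with tempfile.TemporaryDirectory() as tmp:
    demo_target = os.path.join(tmp, "salmon_index")
    demo_args = prepare_salmon_index_args("transcripts.fa", demo_target)
    demo_dir_created = os.path.isdir(demo_target)
```

Note that os.mkdir only creates the last path component; if parent directories may also be missing, os.makedirs(target_directory, exist_ok=True) would be the more forgiving choice.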

Best regards,

Christopher
Post by Ignacio EGUINOA
What he means, I think, is the table to build the index from. Data
managers that take a transcriptome as input get it from the all_fasta
table; I think that is what he means by the genomes table.
As I said, at some point I also thought it might be useful to have a
separate table (e.g. all_transcriptomes) so that the genome and
transcriptome entries of the same build don't get mixed. I think it
would be good to have a way of listing only the transcriptomes from
the all_gff, but that would require some kind of naming standard to
filter on. We had this in our instance at some point, but it didn't
help at all, so I just modified the data manager to use all_fasta,
and that is what I published.
although it would be easier for the GUI. For now, just giving the
entries a descriptive name to indicate that they correspond to a
transcriptome is enough and works OK for us. In any case, this is not
for users, and at least for us it's all handled through the API, so,
again, it's just a matter of taking care of the entry names and you
are fine using the all_fasta table.