Pablo's research and teaching areas
[Image: El Teide and tajinastes rojos, Tenerife, Canary Islands]


The aims of the following three tutorials are: i) to introduce the reader to the key aspects that have to be taken into account in order to make a rigorous phylogenetic analysis of protein-coding gene (CDS), protein and small-subunit ribosomal RNA gene (rrs) sequences, and ii) to illustrate, for the latter, the use of three websites that serve powerful analysis tools to perform these tasks correctly and efficiently.

Tutorials (English and Spanish) on phyloinformatics: working with sequences efficiently on the command line

The following is a selection of tutorials I've prepared for workshops taught at the International Workshops on Bioinformatics - TIB2019

You can access the complete materials I've prepared for the Workshop on Microbial Genomics and Phylogenomics at this GitHub repository

Introduction to biocomputing on Linux systems
Conceptual introduction to biocomputing on Linux systems, in Spanish - PDF
TIB19: Biocomputing practicals on Linux systems, in Spanish - html
Advanced and extensive course on AWK and Bash programming for bioinformatics and genomics
GitHub repository of the advanced course on AWK and Bash programming for biocomputing on GNU/Linux systems
Running BLAST+ on the command line: formatting databases, performing searches and parsing results with Linux tools
Download BLAST+
sequence data for BLAST+ practice - tgz
Running BLAST+ on the command line: example code in English - txt
Performing multiple sequence alignments with clustalo on the command line
Download clustal omega
sequence data for alignment exercises - tgz
clustalo calls from the command line: code in Spanish - txt
Model selection (parametric DNA substitution models and empirical protein matrices) using PhyML on the command line
Download PhyML from GitHub
Tutorial (in Spanish) on using PhyML v3 for model selection and intensive tree searches - html
Microbial pan-genomics, using GET_HOMOLOGUES
Download GET_HOMOLOGUES from GitHub
pIncA/C plasmid sequences for GET_HOMOLOGUES practice - tgz
Running GET_HOMOLOGUES - code (English)
Microbial phylogenomics, using GET_PHYLOMARKERS
Download GET_PHYLOMARKERS from GitHub
Tutorial (English) on using GET_PHYLOMARKERS for estimating core- and pan-genome phylogenies - html

Tutorials on inferring phylogenies from 16S rRNA sequences

The rrs gene has been by far the most widely used molecular marker in bacterial molecular systematics and ecology. However, this marker is not easy to analyze properly. Pay attention to the following points:

  1. Obtain a correct multiple sequence alignment based on secondary structure motifs.
  2. Check for the presence of intragenic mosaicism or gene chimeras.
  3. Select a realistic (or at least not so unrealistic) nucleotide substitution model.
  4. Perform tree searches using optimality criteria instead of algorithmic distance-matrix based tree reconstruction methods (such as neighbor-joining).

The following tutorials will focus on points 1, 3 and 4. We will learn how to perform correct multiple sequence alignments using the GreenGenes and RDPII sites. Once the alignment is obtained, the best-fitting nucleotide substitution model(s), or a close approximation to them, will be selected using Modeltest, as implemented in the FindModel web server. Finally, the reader will learn how to run a maximum-likelihood tree search using the PhyML online tool.

This tutorial shows how to use the GreenGenes and RDPII web servers to perform different tasks related to the analysis of ribosomal gene sequences.

These are very useful sites when you need to align or retrieve 16S rDNA sequences, or to check them for chimeric structures!

This tutorial briefly explains what model fitting is and why it is important for model-based phylogeny inference methods (distance-matrix based methods, ME, ML, Bayesian), and shows how this fundamental task can be performed automatically using Modeltest, as implemented in the FindModel web server.
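As a rough illustration of what model fitting involves, the sketch below ranks candidate substitution models with the Akaike information criterion (AIC = 2k - 2 lnL). It assumes AIC as the selection criterion purely for illustration. The log-likelihood values are hypothetical, and the function names `aic` and `rank_models` are invented for this example; they are not part of Modeltest or FindModel.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: AIC = 2k - 2 lnL (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def rank_models(fits):
    """Rank models by AIC.

    fits maps a model name to (lnL, number of free substitution parameters).
    Returns (name, AIC, delta AIC) tuples, best model first.
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
    best = min(scores.values())
    ranked = [(name, score, score - best) for name, score in scores.items()]
    return sorted(ranked, key=lambda row: row[1])


# Hypothetical log-likelihoods, all assumed to come from the same
# alignment and tree topology (values invented for illustration).
example_fits = {
    "JC69": (-3500.0, 0),   # equal rates, equal base frequencies
    "K80": (-3450.0, 1),    # transition/transversion ratio
    "HKY85": (-3420.0, 4),  # ts/tv ratio + 3 free base frequencies
    "GTR": (-3415.0, 8),    # 5 free rates + 3 free base frequencies
    "GTR+G": (-3390.0, 9),  # GTR + gamma shape parameter
}

for name, score, delta in rank_models(example_fits):
    print(f"{name:6s} AIC={score:8.1f} dAIC={delta:6.1f}")
```

Branch lengths are left out of the parameter counts because every model is evaluated on the same tree, so they cancel out in the AIC differences.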

Here the reader will get an intuitive notion of phylogeny inference in a ML framework and learn the most important things to know about it from a practical, user-oriented standpoint. After laying the groundwork, which includes the previous tutorial on model fitting and concepts such as the likelihood function and the likelihood ratio test, this tutorial shows how to run a maximum-likelihood tree search using the PhyML online tool.
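To make the likelihood ratio test mentioned above more concrete, here is a minimal sketch for two nested substitution models. The test statistic is 2(lnL1 - lnL0), compared against a chi-square distribution whose degrees of freedom equal the difference in free parameters. The log-likelihoods are hypothetical and the helper names `chi2_sf` and `likelihood_ratio_test` are invented for this example. The chi-square approximation is only assumed to hold for properly nested models with no parameters fixed at a boundary.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution (integer df >= 1).

    Uses the recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    with a = df / 2 and z = x / 2, starting from the closed forms for
    df = 1 (erfc) and df = 2 (exp).
    """
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(lnl_null, k_null, lnl_alt, k_alt):
    """Return (statistic, degrees of freedom, p-value) for nested models."""
    statistic = 2.0 * (lnl_alt - lnl_null)
    df = k_alt - k_null
    return statistic, df, chi2_sf(statistic, df)


# Hypothetical example: HKY85 (4 free parameters) nested within GTR (8).
stat, df, p = likelihood_ratio_test(-3420.0, 4, -3415.0, 8)
print(f"LRT statistic = {stat:.2f}, df = {df}, p = {p:.4f}")
```

A small p-value suggests that the extra parameters of the more complex model improve the fit by more than chance alone would explain.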

The primers4clades web server - design PCR oligonucleotide primers based on multiple sequence alignments and phylogenetic trees

Contreras-Moreira, B. and Vinuesa, P. (2009).
primers4clades: a web server to design PCR primers for cross-species amplification of molecular markers, that uses phylogenies to target oligonucleotide formulations for sequence clusters and to evaluate the phylogenetic information content of the amplicons. (to be published).
primers4clades is an easy-to-use web server developed for researchers interested in designing PCR primers for cross-species amplification of novel sequences from metagenomic DNA or from uncharacterized organisms belonging to user-specified phylogenetic clades or taxa. It implements complementary primer design strategies based on both DNA and protein multiple sequence alignments of coding sequences. It evaluates a comprehensive set of thermodynamic properties of the oligonucleotide pairs, as well as the phylogenetic information content of the theoretical amplicons, which is computed from the branch support values of maximum likelihood phylogenies estimated for each molecular marker. Phylogenetic trees displayed on screen make it easy to target the primer design at particular species groups or sequence clusters selected by the user. It is developed by Bruno Contreras-Moreira (Laboratorio de Biología Computacional, Estación Experimental Aula Dei, CSIC, Spain) and Pablo Vinuesa (Center for Genomic Sciences, UNAM, Mexico) and is mirrored at two sites: primers4clades - Mexico and primers4clades - Spain. If you use it, we would greatly appreciate your feedback in order to improve the documentation and extend the FAQs list!
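For readers unfamiliar with primer properties, the sketch below computes two very basic ones for a single non-degenerate oligonucleotide: GC content and a rule-of-thumb melting temperature (the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius, generally used only for short oligos). This is a generic illustration, not a description of how primers4clades evaluates thermodynamic properties. The primer sequence and the function names are hypothetical.

```python
def gc_content(primer):
    """Fraction of G and C bases in a non-degenerate primer sequence."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("expected a non-empty sequence of A, C, G and T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer):
    """Rule-of-thumb melting temperature: 2 * (A + T) + 4 * (G + C)."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("expected a non-empty sequence of A, C, G and T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 18-mer, invented for illustration only.
primer = "ATGGCGTACCTGAAGCTT"
print(f"GC content: {gc_content(primer):.0%}")
print(f"Wallace Tm: {wallace_tm(primer)} C")
```

Degenerate primers would need each ambiguity code expanded or averaged before applying either function, which this sketch deliberately does not attempt.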