Previous Workshops

View details about workshops taught over the last few years.
Registration Closed
With the increasing adoption of next-generation sequencing technologies in infectious disease surveillance and outbreak investigations, genomic epidemiology (combining pathogen genomics data with epidemiological investigations to track the spread of infectious diseases) is poised to change the practice of public health and infection control and to provide an unprecedented amount of data for pathogen evolution studies. The CBW has developed a 3-day course providing an introduction to genomic epidemiology analysis followed by hands-on practical tutorials demonstrating the use of selected analysis tools. The tutorials are designed as self-contained units that include example data and detailed instructions for installation of all required bioinformatics tools or access to publicly available web applications. Participants will gain practical experience and skills to be able to: Understand next-generation sequencing (NGS) platforms as applied to pathogen genomics and metagenomics sequencing; Analyze NGS data for pathogen surveillance and outbreak investigations; Analyze antimicrobial resistance genes; Detect emerging pathogens in metagenomics data; Perform phylogeographic analysis; Use different visualization tools for genomic epidemiology analysis.
Registration Closed
With the introduction of high-throughput sequencing platforms, it is becoming feasible to consider sequencing approaches to address many research projects. However, knowing how to manage and interpret the large volume of sequence data resulting from such technologies is less clear. The CBW has developed a popular 2-day course covering the bioinformatics tools available for managing and interpreting high-throughput sequencing data, where the focus is on Illumina reads although the information is applicable to all sequencer reads. Beginning with an understanding of the workflow involved to move from platform images to sequence generation, participants will gain practical experience and skills to be able to: Assess sequence quality; Map sequence data onto a reference genome; Perform de novo assembly tasks; Quantify sequence data; Integrate biological context with sequence information.
Registration Closed
As the cost of collecting transcriptomics data continues to drop, researchers in the environmental life sciences are increasingly seeking to use these data as part of their investigations. In many cases, this means using non-model organisms that have few or no genomics and bioinformatics resources for comprehensive data analysis and interpretation. The objective of this workshop is to equip researchers in the environmental life sciences with easy-to-use tools to process and analyze transcriptomics data from non-model organisms, and strategies for leveraging databases and statistical methods originally designed for model organisms.
Registration Closed
(2021) Epigenomics Analysis September 13-15, 2021
High-throughput sequencing of Chromatin-Immunoprecipitated libraries (ChIP-seq) and of bisulfite-converted DNA (WGBS) have become increasingly common and have largely supplanted microarrays for chromatin and DNA methylation profiling. When processed appropriately, ChIP-seq data provides base-pair resolution representations of transcription factor DNA-binding events and nucleosome (histone) modifications genome-wide. Similarly, WGBS can provide a quantitative genome-wide profile of cytosine methylation. The CBW has developed a 2-day course providing an introduction to histone ChIP-seq and WGBS data analysis followed by integrated tutorials demonstrating the use of open source ChIP-Seq and WGBS analysis packages. The tutorials are designed as self-contained units that include example data and detailed instructions for installation of all required bioinformatics tools (FASTQC, BWA, MACS2, FindER, samtools, Picard, BisSNP). The course also includes an overview of integrative epigenomic tools that have been developed to explore ChIP-Seq and WGBS data together with other epigenomic datasets such as RNA-seq, DHS-seq and ATAC-seq. Participants will gain practical experience and skills to be able to: Align ChIP-seq and WGBS sequence data to a reference genome (required); Identify narrow and broad peaks from ChIP-seq data; Identify methylation levels from WGBS data; Visualize and summarize the output of ChIP-Seq and WGBS analyses; Explore integrative tools for epigenomic data sets.
Registration Closed
(2021) RNA-Seq Analysis September 08-10, 2021
High-throughput sequencing of RNA libraries (RNA-seq) has become increasingly common and has largely supplanted gene microarrays for transcriptome profiling. When processed appropriately, RNA-seq data has the potential to provide a considerably more detailed view of the transcriptome. The CBW has developed a 3-day course providing an introduction to RNA-seq data analysis followed by integrated tutorials demonstrating the use of popular RNA-seq analysis packages. The tutorials are designed as self-contained units that include example data (Illumina paired-end RNA-seq data) and detailed instructions for installation of all required bioinformatics tools (HISAT, StringTie, etc.). Participants will gain practical experience and skills to be able to: Perform command-line Linux-based analysis on the cloud (Amazon AWS); Perform basic bioinformatics tasks such as tool installation; Assess quality of RNA-seq data and perform trimming; Align RNA-seq data to a reference genome; Visualize RNA-seq alignments and variants; Estimate known gene and transcript expression using multiple approaches; Perform differential expression analysis; Visualize and summarize the output of RNA-seq analyses in R; Perform batch correction; Perform pathway analysis; Perform alignment-free expression estimation.
Registration Closed
(2021) Microbiome Analysis September 01-03, 2021
Metagenomics, the sequencing of DNA directly from a sample without first culturing and isolating the organisms, has become the principal tool of “meta-omic” analysis. It can be used to explore the diversity, function, and ecology of microbial communities. The CBW has developed a 3-day course providing an introduction to metagenomic data analysis followed by hands-on practical tutorials demonstrating the use of metagenome analysis tools. The tutorials are designed as self-contained units that include example data and detailed instructions for installation of all required bioinformatics tools. Participants will gain practical experience and skills to be able to: Design appropriate microbiome-focused experiments; Understand the advantages and limitations of metagenomic data analysis; Devise an appropriate bioinformatics workflow for processing and analyzing metagenomic sequence data (marker-gene, shotgun metagenomic, and metatranscriptomic data); Apply appropriate statistics to undertake rigorous data analysis; Visualize datasets to gain intuitive insights into the composition and/or activity of their data set.
Registration Closed
(2021) Analysis Using R June 28-29, 2021
Before we can begin to apply rigorous statistical tools to research data, we often need to approach our data intuitively, and look for meaningful associations, surprising patterns, or irregularities, to formulate hypotheses. This is Exploratory Data Analysis (EDA). This workshop introduces the essential tools and strategies that are available for EDA through the free statistical workbench R. Steps covered in this workshop are broadly relevant for many areas of modern, quantitative biology such as flow cytometry, expression profile analysis, function prediction and more. Participants will gain practical experience and skills to be able to use R to visualize and investigate patterns in their data.
Registration Closed
(2021) Introduction to R June 21-22, 2021
R is one of the most important scripting languages for both experimental and computational biologists. It is well-designed, efficient, widely adopted and has a very large base of contributors who add new functionality for all modern aspects of data analysis and visualization. Moreover, it is free and open source. However, R’s great power and expressivity can at first be difficult to approach without guidance, especially for those who are new to programming. This workshop introduces the essential ideas and tools of R. Although this workshop will cover running statistical tests in R, it does not cover statistical concepts. Participants will gain practical experience and skills to be able to: Meet the challenges of data handling; Break down problems into structured parts; Use R syntax, functions and packages.
Registration Closed
Using high-throughput technologies, life science researchers can identify and characterize all the small molecules or metabolites in a given cell, tissue, or organism. The CBW course covers many topics ranging from understanding metabolomics technologies, data collection and analysis, using pathway databases, performing pathway analysis, conducting univariate and multivariate statistics, working with metabolomic databases, and exploring chemical databases. Hands-on practical tutorials using various data sets and tools will assist participants in learning metabolomics analysis techniques. Participants will gain practical experience and skills to be able to: Design appropriate metabolome-focused experiments; Understand the advantages and limitations of metabolomic data analysis; Devise an appropriate bioinformatics workflow for processing and analyzing metabolomic data; Apply appropriate statistics to undertake rigorous data analysis; Visualize datasets to gain intuitive insights into the composition and/or activity of their metabolome.
Registration Closed
(2021) Cancer Analysis June 07-11, 2021
Cancer research has rapidly embraced high-throughput technologies and Cloud computing. Large amounts of data are being created from various microarray, tissue array, and next-generation sequencing platforms. Dedicated compute clouds such as the Cancer Genome Collaboratory [http://cancercollaboratory.org/] facilitate complex analyses on big cancer data sets from projects hosting their data in the Cloud, such as the ICGC and PCAWG. Now more than ever, it is critical to have informatic skills, to know which bioinformatic resources specific to cancer are available, and to know how to access and use available data sets in the Cloud. This 5-day workshop will cover the key bioinformatics concepts and tools required to analyze cancer genomic data sets and access and work with data sets in the Cloud. Participants will gain practical experience and skills to: Visualize genomic data; Analyze cancer –omic data for gene expression, genome rearrangement, somatic mutations, and copy number variation; Analyze and conduct pathway analysis on the resultant cancer gene list; Integrate clinical data; Launch, configure, customize, and scale virtual machines (VM); Navigate and work with data sets from Cloud repositories; and Follow best practices in data and workflow management.
Registration Closed
(2021) Machine Learning May 25-26, 2021
This workshop is intended to provide an introduction to machine learning and its application to bioinformatics. This workshop is not intended for machine learning experts. Instead, it targets biologists or other life scientists who want to understand what machine learning is, what it can do and how it can be used for a variety of bioinformatic or medical informatics applications. Students will gain experience in: Applications and Limitations of Machine Learning and Deep Learning; Decision Trees and Random Forests – how they work, how they are coded in Python and R, and how they can be used in bioinformatic applications (biomarker discovery and modeling); Artificial Neural Networks (ANNs) – how they work, how data is encoded, how they are coded in Python and R, and how they can be used in bioinformatic applications (classification and secondary structure prediction); Hidden Markov Models (HMMs) – how they work, how they are coded in Python and R and how they can be used in bioinformatics applications (gene finding); Using Machine Learning tools (Decision Trees, ANNs and HMMs) on the Web (SciKit Learn and Keras/Colab).
Registration Closed
The CBW has developed a 2.5-day course covering the bioinformatics concepts and tools available for interpreting a gene list using pathway and network information. The workshop focuses on the principles and concepts required for analyzing and conducting pathway and network analysis on a gene list from any organism, although focus will be on human and model eukaryotic organisms. Participants will gain practical experience and skills to be able to: Get more information about a gene list; Discover what pathways are enriched in a gene list (and use it for hypothesis generation); Find out how a set of genes is connected by e.g. protein interactions and identify pathways, systems and modules within this network; Predict gene function and extend a gene list; Identify master regulators, such as transcription factors, active in the experiment. We will develop a unified analysis flow chart throughout the course that students will be able to follow after the workshop to conduct their own analysis.
Registration Closed
High-throughput sequencing of Chromatin-Immunoprecipitated libraries (ChIP-seq) and of bisulfite-converted DNA (WGBS) have become increasingly common and have largely supplanted microarrays for chromatin and DNA methylation profiling. When processed appropriately, ChIP-seq data provides base-pair resolution representations of transcription factor DNA-binding events and nucleosome (histone) modifications genome-wide. Similarly, WGBS can provide a quantitative genome-wide profile of cytosine methylation. The CBW has developed a 2-day course providing an introduction to histone ChIP-seq and WGBS data analysis followed by integrated tutorials demonstrating the use of open source ChIP-Seq and WGBS analysis packages. The tutorials are designed as self-contained units that include example data and detailed instructions for installation of all required bioinformatics tools (FASTQC, BWA, MACS2, FindER, samtools, Picard, BisSNP). The course also includes an overview of integrative epigenomic tools that have been developed to explore ChIP-Seq and WGBS data together with other epigenomic datasets such as RNA-seq, DHS-seq and ATAC-seq. Participants will gain practical experience and skills to be able to: Align ChIP-seq and WGBS sequence data to a reference genome (required); Identify narrow and broad peaks from ChIP-seq data; Identify methylation levels from WGBS data; Visualize and summarize the output of ChIP-Seq and WGBS analyses; Explore integrative tools for epigenomic data sets.
Registration Closed
(2020) Machine Learning September 21-22, 2020
This workshop is intended to provide an introduction to machine learning and its application to bioinformatics. This workshop is not intended for machine learning experts. Instead, it targets biologists or other life scientists who want to understand what machine learning is, what it can do and how it can be used for a variety of bioinformatic or medical informatics applications. Students will gain experience in: Applications and Limitations of Machine Learning and Deep Learning; Data encoding for Machine Learning; Artificial Neural Networks (ANNs) – how they work and how they can be used in bioinformatic applications (secondary structure prediction); ANNs – how to program a useful ANN for bioinformatics in Python; Hidden Markov Models (HMMs) – how they work and how they can be used in bioinformatics applications (gene finding); HMMs – how to program a useful HMM for bioinformatics in Python; Support Vector Machines, Decision Trees and Random Forests – how they work and how they can be used in bioinformatic applications (biomarker discovery and modeling); Using Machine Learning tools on the Web (WEKA); Using Machine Learning Apps (TENSORFLOW).
Registration Closed
The CBW has developed a 2.5-day course covering the bioinformatics concepts and tools available for interpreting a gene list using pathway and network information. The workshop focuses on the principles and concepts required for analyzing and conducting pathway and network analysis on a gene list from any organism, although focus will be on human and model eukaryotic organisms. Participants will gain practical experience and skills to be able to: Get more information about a gene list; Discover what pathways are enriched in a gene list (and use it for hypothesis generation); Find out how a set of genes is connected by e.g. protein interactions and identify pathways, systems and modules within this network; Predict gene function and extend a gene list; Identify master regulators, such as transcription factors, active in the experiment. We will develop a unified analysis flow chart throughout the course that students will be able to follow after the workshop to conduct their own analysis.
Registration Closed
With the introduction of high-throughput sequencing platforms, it is becoming feasible to consider sequencing approaches to address many research projects. However, knowing how to manage and interpret the large volume of sequence data resulting from such technologies is less clear. The CBW has developed a popular 2-day course covering the bioinformatics tools available for managing and interpreting high-throughput sequencing data, where the focus is on Illumina reads although the information is applicable to all sequencer reads. Beginning with an understanding of the workflow involved to move from platform images to sequence generation, participants will gain practical experience and skills to be able to: Assess sequence quality; Map sequence data onto a reference genome; Perform de novo assembly tasks; Quantify sequence data; Integrate biological context with sequence information.
Registration Closed
High-throughput sequencing of RNA libraries (RNA-seq) has become increasingly common and has largely supplanted gene microarrays for transcriptome profiling. When processed appropriately, RNA-seq data has the potential to provide a considerably more detailed view of the transcriptome. The CBW has developed a 3-day course providing an introduction to RNA-seq data analysis followed by integrated tutorials demonstrating the use of popular RNA-seq analysis packages. The tutorials are designed as self-contained units that include example data (Illumina paired-end RNA-seq data) and detailed instructions for installation of all required bioinformatics tools (HISAT, StringTie, etc.). Participants will gain practical experience and skills to be able to: Perform command-line Linux-based analysis on the cloud; Assess quality of RNA-seq data; Align RNA-seq data to a reference genome; Estimate known gene and transcript expression; Perform differential expression analysis; Discover novel isoforms; Visualize and summarize the output of RNA-seq analyses in R; Assemble transcripts from RNA-Seq data.
Registration Closed
Before we can begin to apply rigorous statistical tools to research data, we often need to approach our data intuitively, and look for meaningful associations, surprising patterns, or irregularities, to formulate hypotheses. This is Exploratory Data Analysis (EDA). This workshop introduces the essential tools and strategies that are available for EDA through the free statistical workbench R. Steps covered in this workshop are broadly relevant for many areas of modern, quantitative biology such as flow cytometry, expression profile analysis, function prediction and more. Participants will gain practical experience and skills to be able to: Use R and its analysis tools, read and modify code, and explore protocols that can be adapted for their own research tasks. Write R functions and analysis scripts. Plot and visualize data, from the elementary built-in routines with their (sometimes bewildering) array of parameters to sophisticated, publication-ready presentations.
Registration Closed
(2020) Introduction to R June 09-10, 2020
R is rapidly becoming the most important scripting language for both experimental and computational biologists. It is well designed, efficient, widely adopted and has a very large base of contributors who add new functionality for all modern aspects of data analysis and visualization. Moreover, it is free and open source. However, R’s great power and expressivity can at first be difficult to approach without guidance, especially for those who are new to programming. This workshop introduces the essential ideas and tools of R. Although this workshop will cover running statistical tests in R, it does not cover statistical concepts. Participants will gain practical experience and skills to be able to: Meet the challenges of data handling; Break down problems into structured parts; Use R syntax, functions and packages; Understand best practices for scientific computational work.
Registration Closed
High-throughput sequencing of RNA libraries (RNA-seq) has become increasingly common and has largely supplanted gene microarrays for transcriptome profiling. When processed appropriately, RNA-seq data has the potential to provide a considerably more detailed view of the transcriptome. The CBW has developed a 3-day course providing an introduction to RNA-seq data analysis followed by integrated tutorials demonstrating the use of popular RNA-seq analysis packages. The tutorials are designed as self-contained units that include example data (Illumina paired-end RNA-seq data) and detailed instructions for installation of all required bioinformatics tools (HISAT, StringTie, etc.). Participants will gain practical experience and skills to be able to: Perform command-line Linux-based analysis on the cloud; Assess quality of RNA-seq data; Align RNA-seq data to a reference genome; Estimate known gene and transcript expression; Perform differential expression analysis; Discover novel isoforms; Visualize and summarize the output of RNA-seq analyses in R; Assemble transcripts from RNA-Seq data.
Registration Closed
The CBW has developed a 2.5-day course covering the bioinformatics concepts and tools available for interpreting a gene list using pathway and network information. The workshop focuses on the principles and concepts required for analyzing and conducting pathway and network analysis on a gene list from any organism, although focus will be on human and model eukaryotic organisms. Participants will gain practical experience and skills to be able to: Get more information about a gene list; Discover what pathways are enriched in a gene list (and use it for hypothesis generation); Find out how a set of genes is connected by e.g. protein interactions and identify pathways, systems and modules within this network; Predict gene function and extend a gene list; Identify master regulators, such as transcription factors, active in the experiment. We will develop a unified analysis flow chart throughout the course that students will be able to follow after the workshop to conduct their own analysis.
Registration Closed
With the introduction of high-throughput sequencing platforms, it is becoming feasible to consider sequencing approaches to address many research projects. However, knowing how to manage and interpret the large volume of sequence data resulting from such technologies is less clear. The CBW has developed a popular 2-day course covering the bioinformatics tools available for managing and interpreting high-throughput sequencing data, where the focus is on Illumina reads although the information is applicable to all sequencer reads. Beginning with an understanding of the workflow involved to move from platform images to sequence generation, participants will gain practical experience and skills to be able to: Assess sequence quality; Map sequence data onto a reference genome; Perform de novo assembly tasks; Quantify sequence data; Integrate biological context with sequence information.
Registration Closed
High-throughput sequencing of Chromatin-Immunoprecipitated libraries (ChIP-seq) and of bisulfite-converted DNA (WGBS) have become increasingly common and have largely supplanted microarrays for chromatin and DNA methylation profiling. When processed appropriately, ChIP-seq data provides base-pair resolution representations of transcription factor DNA-binding events and nucleosome (histone) modifications genome-wide. Similarly, WGBS can provide a quantitative genome-wide profile of cytosine methylation. The CBW has developed a 2-day course providing an introduction to histone ChIP-seq and WGBS data analysis followed by integrated tutorials demonstrating the use of open source ChIP-Seq and WGBS analysis packages. The tutorials are designed as self-contained units that include example data and detailed instructions for installation of all required bioinformatics tools (FASTQC, BWA, MACS2, FindER, samtools, Picard, BisSNP). The course also includes an overview of integrative epigenomic tools that have been developed to explore ChIP-Seq and WGBS data together with other epigenomic datasets such as RNA-seq, DHS-seq and ATAC-seq. Participants will gain practical experience and skills to be able to: Align ChIP-seq and WGBS sequence data to a reference genome (required); Identify narrow and broad peaks from ChIP-seq data; Identify methylation levels from WGBS data; Visualize and summarize the output of ChIP-Seq and WGBS analyses; Explore integrative tools for epigenomic data sets.
Registration Closed
Using high-throughput technologies, life science researchers can identify and characterize all the small molecules or metabolites in a given cell, tissue, or organism. The CBW course covers many topics ranging from understanding metabolomics technologies, data collection and analysis, using pathway databases, performing pathway analysis, conducting univariate and multivariate statistics, working with metabolomic databases, and exploring chemical databases. Hands-on practical tutorials using various data sets and tools will assist participants in learning metabolomics analysis techniques. Participants will gain practical experience and skills to be able to: Design appropriate metabolome-focused experiments; Understand the advantages and limitations of metabolomic data analysis; Devise an appropriate bioinformatics workflow for processing and analyzing metabolomic data; Apply appropriate statistics to undertake rigorous data analysis; Visualize datasets to gain intuitive insights into the composition and/or activity of their metabolome.
Registration Closed
Before we can begin to apply rigorous statistical tools to research data, we often need to approach our data intuitively, and look for meaningful associations, surprising patterns, or irregularities, to formulate hypotheses. This is Exploratory Data Analysis (EDA). This workshop introduces the essential tools and strategies that are available for EDA through the free statistical workbench R. Steps covered in this workshop are broadly relevant for many areas of modern, quantitative biology such as flow cytometry, expression profile analysis, function prediction and more. Participants will gain practical experience and skills to be able to: Use R and its analysis tools, read and modify code, and explore protocols that can be adapted for their own research tasks. Write R functions and analysis scripts. Plot and visualize data, from the elementary built-in routines with their (sometimes bewildering) array of parameters to sophisticated, publication-ready presentations.
Registration Closed
This course focuses on the analysis of DNA and RNA sequencing data, and is offered at Cold Spring Harbor in odd-numbered years. See this page for a companion course in cancer genomics data analysis, offered in even-numbered years. With the introduction of next-generation sequencing platforms, it is now feasible to use high-throughput sequencing approaches to address many research questions. Now more than ever, it is crucial to know what bioinformatic tools and resources are available, and it is necessary to develop informatic skills to analyze high-throughput data using those tools. The Canadian Bioinformatics Workshops (CBW), in collaboration with Cold Spring Harbor Laboratory, has developed a comprehensive seven-day course covering key bioinformatics concepts and tools required to analyze DNA- and RNA-sequence reads using a reference genome. This course combines the material and concepts from three established CBW workshops; see a full outline here. The course will begin with the workflow involved in moving from platform images to sequence generation, after which participants will gain practical skills for evaluating sequence read quality, mapping reads to a reference genome, and analyzing sequence reads for variation and expression level. The course will conclude with pathway and network analysis on the resultant ‘gene’ list. Participants will gain experience in cloud computing and data visualization tools. All class exercises will be self-contained units that include example data (e.g., Illumina paired-end data) as well as detailed instructions for installing all required bioinformatics tools.