Abstract

Online learning initiatives over the past decade have become increasingly comprehensive in their selection of courses and sophisticated in their presentation, culminating in the recent announcement of a number of consortium and startup activities that promise to make a university education on the internet, free of charge, a real possibility. At this pivotal moment it is appropriate to explore the potential for obtaining comprehensive bioinformatics training with currently existing free video resources. This article presents such a bioinformatics curriculum in the form of a virtual course catalog, together with editorial commentary, and an assessment of strengths, weaknesses, and likely future directions for open online learning in this field.

Online Learning Comes of Age

Online academic “courseware” at the university level has now been available to the public for a decade, the earliest concerted effort having originated in 2002 with the Massachusetts Institute of Technology (MIT) and their OpenCourseWare initiative (http://ocw.mit.edu). This project offered up the syllabi, lecture notes, quizzes, exams, and/or other study materials for a very large number of courses, at the discretion of professors but with strong support and encouragement from the MIT administration. Only in a minority of cases were videos of lectures posted.

Even before this, the University of California, Berkeley, had started webcasting lectures, and eventually began posting both audio and video for public consumption at their Berkeley Webcast site (http://webcast.berkeley.edu), though without the ancillary materials of MIT's OpenCourseWare. A number of other universities followed suit, though seldom so extensively; among these was Stanford with its ClassX streaming service (http://classx.stanford.edu/ClassX) and an earlier effort called Stanford Engineering Everywhere (http://see.stanford.edu/see/courses.aspx). In many cases, individual faculty members took the initiative to post course materials, including video, in widely varying formats. Some adopted the use of “Khan-style videos” or tablet-based screencasts of the sort popularized by the Khan Academy with its vast library of instructional videos, which started as a viral YouTube sensation and has now become its own well-funded institution (http://www.khanacademy.org).

YouTube indeed became the destination of many academic videos, which are now aggregated by institution under YouTube EDU (http://www.youtube.com/education). Apple has also put its distinctive stamp on online learning with iTunes U (http://www.apple.com/education/itunes-u), also organized by institution but with integrated search capability and, of course, deployment to iPad and iPhone apps. Countless aggregators also assemble collections of video courses, but generally with little value added.

Yale University began in 2007 to release Open Yale Courses (http://oyc.yale.edu) in a more curated and consistent format than most other efforts, including high-quality video and extensive syllabi; courses appeared incrementally, with just under 50 available to date. Then, in 2011, MIT revamped several of its online courses into a much more structured instructional format, with learning modules in outline form containing videos interspersed with self-assessment and other activities. In a somewhat different vein, the non-profit Saylor Foundation compiled a comprehensive online university curriculum comprising courses that are essentially mashups of video and text resources from many existing sources, including a number of those described above (http://www.saylor.org).

In the fall of 2011, a highly publicized online course, “Introduction to Artificial Intelligence” (AI), was conducted by Stanford University Prof. Sebastian Thrun and Google's Director of Research, Peter Norvig, based on the Stanford AI course. It ran “live” in the sense that new videos were released and homework assignments collected on a weekly basis, and quizzes and exams were given at set times, while discussion logs allowed for some degree of interaction. The course attracted 160,000 students from 190 countries, 22,000 of whom finished successfully and were granted “certificates of completion” [1]. Shortly afterwards, MIT set up a similar approach on a new platform called MITx, offering a course in electronic circuits that attracted comparable numbers of students (https://6002x.mitx.mit.edu).

The trend to structured presentation and high production quality then accelerated remarkably, and took an entrepreneurial turn. The AI course was effectively spun off by Prof. Thrun into a Web startup called Udacity (http://www.udacity.com), which is currently live with six courses. In April of 2012, two other Stanford scientists, Profs. Andrew Ng and Daphne Koller, announced a similar newco called Coursera (https://www.coursera.org), with backing from major Silicon Valley venture capital firms. Coursera, also now live, is being stocked with courses from academic partners Stanford, Princeton University, the University of Pennsylvania, and the University of Michigan; this list was recently augmented with a tranche of a dozen more top-tier universities. And in May of 2012, barely six months after MIT had rolled out its new MITx platform, they and Harvard announced that the institutions were investing $30 million each in a joint online learning initiative called edX (http://www.edxonline.org).

All of these initiatives promise to offer undiluted, highly interactive university-level courses to the public, free of charge. Moreover, there is every indication that the instruction can be effective; the U.S. Department of Education, in an exhaustive meta-analysis of 51 published head-to-head trials, found that “on average, students in online learning conditions performed better than those receiving face-to-face instruction” [2].

An Online Bioinformatics Education

Clearly a revolution in open online learning is at hand. This is a welcome addition to a movement that also encompasses open online scientific publication, of which this journal is an example. As such, this is an appropriate forum to assess the current potential for a freely accessible online bioinformatics education.

Both the completeness and the quality of such an unconventional education should be evaluated. Such judgments cannot be entirely objective, and even curricula in conventional university settings vary widely. Thus, this must ultimately be considered an “opinion piece.” Even its purely factual content has to be viewed as evanescent, given the rate of change in online education, and the fact that newly announced initiatives may increase the selection and quality of courses available to a considerable extent even within the year.

Even so, the first opinion offered here is that it is probably already possible for a motivated student to become a competent, employable bioinformatics professional in the comfort of his or her own home—with certain important caveats to be elaborated in the discussion at the end. By way of evidence, a suggested curriculum will be laid out that is supported by existing online resources.

This central thesis, that online bioinformatics education has in some sense “arrived,” can certainly be challenged on a number of counts. The fundamental question of the optimal content for bioinformatics training would probably elude universal consensus in any case, and perhaps the most that can be hoped for is that what follows will contribute meaningfully to the dialogue. Even so, the reader has a right to question both the author's qualifications and methodology in offering these opinions.

The author has advanced degrees in both biology and computer science, has published original research in both fields, and has passing familiarity with but is by no means expert in all of the advanced course topics described below. He has helped design academic curricula as part of a major training grant and taught at both an undergraduate and graduate level, though not extensively, having spent most of his career in the computer and then the pharmaceutical industries. However, in the latter positions he was directly or indirectly responsible for hiring well over a hundred scientists and engineers for bioinformatics-related roles. Thus if any bias exists, it is probably in favor of the practical over the theoretical, though the author's own research is somewhat more in the latter category.

In terms of methodology, the author has personally sampled all of the main courses listed below that are currently available, as well as most of those offered as alternatives or suggested for advanced study. Of these, he has actually completed six of the main courses and seven in the latter categories (most recently, two of the inaugural offerings by Coursera), and has made significant progress in several more. In each case the main course offering for a given topic was adjudged superior to the alternatives based on a variety of criteria including coverage, production quality, availability of ancillary course material, and incorporation of the latest modular courseware technologies described above. Less tangible factors such as teaching style, clarity, and pace were also considered. Courses listed as alternatives to the main courses still met basic standards of quality, and in addition to offering redundancy often had other features that might appeal to specific students, for instance in terms of areas of emphasis. In several cases, courses were selected as main offerings despite being scheduled but not yet online; such judgments were made based on instructors' proven teaching backgrounds and in some instances after direct consultation with them on the syllabi.

Only courses offered without charge were considered. Paid online courses and entire degree programs are widely available, though troubling to some given issues of accreditation and mounting student debt. Course discussion logs on free resources like Coursera indicate a tremendous demand for online education in the developing world, and students anywhere may need to be thrifty, particularly if they are retraining or exploring career change. There are certainly extension programs of universities and other for-profit resources that offer good value-for-money in this arena, and those who can afford it should not be discouraged from taking advantage of such benefits as personalized instruction. Nevertheless, part of the challenge in the present instance is to see just how far the free resources have come. Moreover, there is the practical issue that extending the analysis to paid courses would open up a much larger set of alternatives, most of which are inaccessible to evaluation without expenditure.

Only video courses are included, either showing the instructor with slides and/or blackboard, or in screencast format. Learning from course notes only, or even disembodied audio, simply doesn't have the immediacy of the visual experience of a lecture hall or even a tablet-based screencast. At the other extreme, one could maintain that reading textbooks at one's own speed is a more efficient and focused way to learn. That is certainly true for some, and perhaps more so for experienced and mature scholars, but it is probably also true that a lecture format offers much-needed structure to the learning process for others. Moreover, cognitive psychology offers both a theoretical basis and empirical evidence for the benefits of multimedia learning [3]. In any case, most of the courses below require reading at least selections from one or more textbooks in close coordination with the lectures (though in a surprising number of cases the textbooks are freely available online).

What follows, then, is a virtual catalog for a course of study in bioinformatics. It includes both core courses and electives, as will be evident in the commentaries included with each course. Even at that, different paths are possible depending on preparation (whether the student starts with a biology and/or computer science background already) and inclination (whether the student plans to focus on bioinformatics analysis and needs less programming experience, or hopes to develop algorithms and systems that require considerably more computational sophistication). Since this virtual program awards no degrees and makes no guarantees, it will not attempt to set absolute standards for numbers of credits and distribution of core and elective subjects, but will suggest possible study threads in the penultimate section of this article.

Biology Department

Fundamentals of Biology

Source

MIT, 7.012, Profs. Eric Lander, Robert Weinberg, Tyler Jacks, Hazel Sive, Graham Walker, Sallie Chisholm, and Dr. Michelle Mischke (Fall 2011)

Provider description

“Fundamentals of Biology focuses on the basic principles of biochemistry, molecular biology, genetics, and recombinant DNA. These principles are necessary to understanding the basic mechanisms of life and anchor the biological knowledge that is required to understand many of the challenges in everyday life, from human health and disease to loss of biodiversity and environmental quality.”

Commentary

Anyone motivated to enter the field of bioinformatics is unlikely to need a freshman-level introduction to biology, but this one is included for the sake of completeness. The faculty are stellar, and the course has recently been converted to modular form with interactive quizzes, problem sets, exams, and additional helpful features.

Going further

All of the remaining courses in this virtual Department extend the material in this course in various ways.

Principles of Evolution, Ecology, and Behavior

Source

Yale, EEB122, Prof. Stephen Stearns (Spring 2009)

Provider description

“This course presents the principles of evolution, ecology, and behavior for students beginning their study of biology and of the environment … Recent advances have energized these fields with results that have implications well beyond their boundaries: ideas, mechanisms, and processes that should form part of the toolkit of all biologists and educated citizens.”

Commentary

This is a modern treatment of evolution and ecology but not one especially geared to quantitative analysis, so it may be considered optional for students of bioinformatics. Still, it is a valuable reminder that molecular biology is not all there is. Especially interesting is the coverage of evolutionary medicine, in which Prof. Stearns is a leading light.

Alternatives

The continuation of the first-year Berkeley program, Biology 1B, spends a third of the course covering plant biology in more detail than is necessary for bioinformatics, but also provides a solid introduction to genetics and phylogeny that may be preferred as being more molecular (http://webcast.berkeley.edu/playlist#c,d,Biology,434C6A29FA3A4580). Another interesting alternative is the introductory course by Stanford Prof. Robert Sapolsky on “Human Behavioral Biology,” which actually covers a wide swath of evolution, molecular genetics, and neuroscience (http://www.youtube.com/playlist?list=PL848F2368C90DDC3D).

Biochemistry

Source

Indian Institute of Technology (IIT), Kharagpur, BT20001, Prof. Swagata Dasgupta

Provider description

“Chemistry and metabolism of biopolymers (carbohydrates, lipids, proteins, nucleic acids, and nucleoproteins), vitamins, and hormones. Amino acid, primary, secondary, tertiary, and quaternary structure of proteins … Enzymes and co-enzymes. Glycolytic pathway and TCA cycle. Electron transport and oxidative phosphorylation …”

Commentary

Exposure to biochemistry in greater detail than is found in the introductory biology courses is particularly recommended for those interested in biochemical pathway analysis, metabolomics, and structural bioinformatics. With this video course we introduce a resource developed by the Indian National Programme on Technology Enhanced Learning (NPTEL), whose ambition is “to build at least one version of each course offered in all of Science and Engineering in India, from BTech/BSc to PhD programs” (http://nptel.iitm.ac.in). It currently offers some 110 full video courses, skewed toward engineering, but with plans for up to 400 total. The courses tend to follow very traditional syllabi and sometimes move slowly, but are generally well produced and exhaustive in their coverage. The lectures are delivered in English that is more or less accented but nearly always impeccable, and altogether make for a rather refreshing multicultural experience.

Prerequisites

Introduction to Biology. Organic Chemistry.

Genetics

Source

Berkeley, PMB 160, Profs. Robert Fischer and Jennifer Fletcher (Spring 2012)

Provider description

“A consideration of plant genetics and molecular biology. Principles of nuclear and organellar genome structure and function: regulation of gene expression in response to environmental and developmental stimuli; clonal analysis; investigation of the molecular and genetic bases for the exceptional cellular and developmental strategies adopted by plants.”

Source

Berkeley, MCB C148, Profs. Daniel Barsky and Louise Glass (Spring 2011)

Provider description

“Course emphasizes bacterial and archaeal genetics and comparative genomics. Genetics and genomic methods used to dissect metabolic and development processes in bacteria, archaea, and selected microbial eukaryotes. Genetic mechanisms integrated with genomic information to address integration and diversity of microbial processes. Introduction to the use of computational tools for a comparative analysis of microbial genomes and determining relationships among bacteria, archaea, and microbial eukaryotes.”

Commentary

This pair of courses together provides in-depth coverage of classical genetics through modern genomics of the non-human variety. The first course, entitled “Plant Molecular Genetics,” actually begins with a comprehensive introduction to general Mendelian genetics, before delving into plant genetics in detail. The student may wish to skip some of the latter lectures, but they do cover many aspects of molecular genetics that are completely general. The second entry, “Microbial Genetics and Genomics,” starts halfway through the actual course with the lectures of Prof. Glass, focusing on comparative genomics, and includes an extended exercise in annotation of a new microbial genome from the Joint Genome Institute. Finally, for some exposure to current human genetics, the student should take the “Genetics for Epidemiologists” short course conducted by the National Human Genome Research Institute in 2008 (http://www.youtube.com/playlist?list=PL6D747D95EBB33F2D). While this pastiche of sources may not be ideal, it touches on the major themes in this diverse subject and will give a good sense of the tools underlying many laboratory methods used in molecular biology.

Prerequisites

Introduction to Biology.

Going further

The book “Human Molecular Genetics” by Drs. Tom Strachan and Andrew Read, now in its 4th edition, goes deeper into modern techniques [5]. Though now a bit dated, a freely accessible online version of the 2nd edition is available from the National Center for Biotechnology Information (NCBI) of the U.S. National Institutes of Health (NIH) (http://www.ncbi.nlm.nih.gov/books/NBK7580).

Molecular Biology

Source

Berkeley, MCB110, Profs. Thomas Alber, Qiang Zhou and Qing Zhong (Fall 2009)

Provider description

“Molecular biology of prokaryotic and eukaryotic cells and their viruses. Mechanisms of DNA replication, transcription, translation. Structure of genes and chromosomes. Regulation of gene expression. Biochemical processes and principles in membrane structure and function, intracellular trafficking and subcellular compartmentation, cytoskeletal architecture, nucleocytoplasmic transport, signal transduction mechanisms, and cell cycle control.”

Commentary

This upper-level Berkeley course in their Biochemistry and Molecular Biology track, which is subtitled “Macromolecular Synthesis and Cellular Function,” is a thorough introduction to basic cellular information processing and as such is important background for bioinformatics. The first third (taught by Prof. Alber) covers DNA replication and repair, the second third (Prof. Zhou) does RNA and protein synthesis, and the final third (Prof. Zhong) includes cell membranes, membrane proteins, trafficking, signaling, the cell cycle, and apoptosis. Note that there are some missing lectures in the first third of the Fall 2009 version, but the student can use the Fall 2008 version for Prof. Alber's lectures (http://itunes.apple.com/itunes-u/molecular-cell-biology-110/id354820355), which, however, is missing the final third of the course. Note that in all cases iTunes has the order of courses reversed in its listing. (An iTunes link is provided rather than a Berkeley Webcast link because a significant number of courses were dropped from the latter website during a redesign in 2011.)

Prerequisites

Introduction to Biology, Biochemistry, or equivalent.

Cell and Systems Biology

Source

Berkeley, MCB130, Profs. Randy Schekman, Kunxin Luo and David Drubin (Spring 2009)

Provider description

“This course is aimed at conveying an understanding of how cellular structure and function arise as a result of the properties of cellular macromolecules. An emphasis will be placed on the dynamic nature of cellular organization and will include a description of physical properties of cells (dimensions, concepts of free energy, diffusion, biophysical properties). Students will be introduced to quantitative aspects of cell biology and a view of cellular function that is based on integrating multiple pathways and modes of regulation (systems biology).”

Commentary

Another upper-level Berkeley course, this one in their Cell and Developmental Biology track, offers a different take on the cell that is geared to current systems biology. Berkeley does not allow this course and the previous one to be taken together for elective credit, but the overlap is mainly with the last third of the Molecular Biology course, so students may want to take only the first two thirds of that course and then this course in its entirety.

Prerequisites

Introduction to Biology, Biochemistry, or equivalent.

Eukaryotic Gene Expression

Source

Indian Institute of Science (IISc), Bangalore, Prof. P.N. Rangarajan

Provider description

“[Topics include] cis-acting elements and trans-acting factors … domain structure of eukaryotic transcription factors … role of chromatin … synthesis of mRNA, rRNA, and tRNA … cell surface receptors … intracellular receptors … regulation of gene expression during development … recombinant protein expression systems … gene therapy and transgenic technology …”

Commentary

This NPTEL course offers a significantly more detailed view of gene regulation than the courses above, though it overlaps with them. It is not absolutely current but will still be of interest to those interested in bioinformatics of signaling pathways and genetic networks. For the larger perspective students should also view a seminar by Dr. Robert Tjian on “The Molecular Biology of Gene Regulation” (http://www.ibioseminars.org/lectures/bio-mechanisms/robert-tjian.html) and, for more recent aspects of microRNA-based regulation, talks by Dr. Adrian Ferré-D'Amaré on “Catalytic and Gene Regulatory RNAs” (http://videocast.nih.gov/launch.asp?17170), by Dr. Victor Ambros on “MicroRNA Pathways in Animal Development” (http://videocast.nih.gov/launch.asp?14844), and by Dr. Witold Filipowicz on “Regulating the Regulators: Mechanisms Controlling Function and Metabolism of microRNAs” (http://videocast.nih.gov/launch.asp?17234).

Prerequisites

Introduction to Biology and Biochemistry or equivalent.

Computational Molecular Biology

Source

Stanford, Biochem 218, Prof. Doug Brutlag (Spring 2012)

Provider description

“… a practical, hands-on approach to the field of computational molecular biology. The course is recommended for both molecular biologists and computer scientists desiring to understand the major issues concerning analysis of genomes, sequences and structures.”

Commentary

A wide-ranging bioinformatics practicum covering aspects of sequence analysis, genomics, phylogenetic reconstruction, gene regulation, and metabolic networks. There is an excellent set of slides in PDF format, which should be viewed in parallel with the video lectures, and a set of practical how-to videos as well. This course provides a biologist's approach to computational biology, and is thus listed separately from a corresponding course in the Computer Science Department. The emphasis here is more on how to use the algorithms than on the details of their construction.

Prerequisites

Molecular Biology.

Alternatives

MIT offers “Genomics and Computational Biology” by Prof. George Church (http://ocw.mit.edu/courses/health-sciences-and-technology/hst-508-genomics-and-computational-biology-fall-2002), but the online version is now 10 years old, and is audio-only so that the user must coordinate the lecture with a separate, rather massive set of slides. One hopes that the recently announced edX initiative will provide a Harvard-MIT course in this area soon. A short practical course on “DNA/Protein Sequence Analysis” is offered by Prof. Amy Denton of California State University, Channel Islands (http://itunes.apple.com/WebObjects/MZStore.woa/wa/viewPodcast?id=472584215). The author is aware of at least one graduate-level course in bioinformatics that is in preparation for one of the major online venues, but is as yet unannounced. While lacking any videos, Stanford Prof. Russ Altman's course “Representations and Algorithms for Computational Molecular Biology” has a wealth of notes, slides, readings, and other useful links (http://helix-web.stanford.edu/bmi214-2006).

Going further

The University of Illinois at Urbana-Champaign conducted a Summer School on “Computational Approaches for Simulation of Biological Systems” in 2003 that posted a number of videos relating to biophysical modeling and bioinformatics analyses of macromolecular structures, a topic otherwise underrepresented here (http://www.ks.uiuc.edu/Training/SumSchool/lectures.html). The laboratory of Prof. Burkhard Rost of the Technische Universität München maintains several short video courses with separate slides, having titles such as “Protein Prediction” and “Computational Systems Biology” (http://rostlab.org/cms/teaching/materials). The Canadian Bioinformatics Workshops provide a number of short courses annually on topics including pathway and network analysis, high-throughput sequencing data, metabolomics, microarrays, and cancer genomics, all of which are archived (http://bioinformatics.ca/workshops/open_access); some lecture videos are missing, but the slide sets are complete.

Introduction to Genome Science

Source

University of Pennsylvania on Coursera, Profs. John Hogenesch and John Isaac Murray (Fall 2012)

Provider description

“This course serves as an introduction to the main laboratory and theoretical aspects of genomics and is divided into themes: genomes, genetics, functional genomics, systems biology, single cell approaches, proteomics, and applications. We start with the basics, DNA sequencing and the genome project, then move to high throughput sequencing methods and applications. Next we introduce principles of genetics and then apply them in clinical genetics and other large-scale sequencing projects. In the functional genomics unit, we start with RNA expression dynamics, analysis of alternative splicing, epigenomics and ChIP-seq, and metagenomics. Model organisms and forward and reverse genetics screens are then discussed, along with quantitative trait locus (QTL) and eQTL analysis. After that, we introduce integrative and single cell genomics approaches and systems biology. Finally, we conclude by introducing … proteomic approaches.”

Commentary

This anticipated Coursera entry promises to touch on all the “hot topics” in genomics, chip technologies, and next-generation sequencing, making it central to this curriculum. It will be closely based on the long-established core course in the Penn Graduate Group in Genomics and Computational Biology, and in fact the instructors plan to use the material with their own students. Prof. Hogenesch in particular has a strong computational orientation and indicates that the material taught in this course will be “bioinformatics-ready” (personal communication).

Prerequisites

Molecular Biology.

Current Topics in Genome Analysis

Source

National Human Genome Research Institute (Winter 2012)

Provider description

“A lecture series covering contemporary areas in genomics and bioinformatics.”

Commentary

This series of 13 extended guest lectures in course format is offered every other year by the National Human Genome Research Institute (NHGRI) of the U.S. National Institutes of Health (NIH). Coverage includes biological sequence analysis, genome browsers, regulatory and epigenomic landscapes of mammalian genomes, next-generation sequencing technologies, population genetics, genome-wide association studies, pharmacogenomics, large-scale expression analysis, genomic medicine, and genomics of microbes and microbiomes. Handouts are provided. As part of this course, students should also do the NHGRI tutorial “Next-Gen 101” from 2011, which has 9 shorter lectures on whole-exome sequencing and analysis (http://videocast.nih.gov/launch.asp?16885), as well as the “1000 Genomes Tutorial” of 6 even shorter lectures on this important resource for bioinformatics (http://www.youtube.com/playlist?list=PLF61543E11FF78240).

Prerequisites

Molecular Biology.

Going further

The “EMBO Practical Course on Analysis of High-Throughput Sequence Data” (http://www.ebi.ac.uk/training/online/course/embo-practical-course-analysis-high-throughput-seq) is highly recommended as a hands-on introduction to modern genomic analysis. It closely coordinates video lectures with detailed analysis exercises, with tutorial handouts and code supplied, using R and Bioconductor. Topics include short read analysis, ChIP-Seq data and analysis, statistical concepts, differential expression by RNA-Seq, and allele-specific expression and eQTL.

Biological Seminars

Source

Howard Hughes Medical Institute, iBioSeminars

Provider description

“iBioSeminars is a freely available library of video seminars from outstanding scientists, including many HHMI investigators. These lectures, which describe on-going research in leading laboratories, feature an extensive introduction to the subject matter, making them accessible to advanced undergraduates or beginning graduate students and researchers outside of the specific field. The main subject areas are biological mechanisms, cell biology and medicine, developmental biology and evolution, chemical biology and biophysics, and global health and energy.”

Commentary

Much of a biologist's advanced training is down to departmental seminars, invited speakers, conferences, etc. This star-studded collection amassed by the Howard Hughes Medical Institute now has some 80 extended seminars covering a wide range of topics, including some that are underrepresented in the available online courseware, such as neurosciences and developmental biology. An important side benefit of learning the scientific content itself is the educational experience of becoming familiar with the names, faces, and presentation techniques of many of the top scientists in the American biological community.

Alternatives

A particularly rich lode of talks by distinguished scientists is the NIH Director's Wednesday Afternoon Lecture series (http://videocast.nih.gov/PastEvents.asp?c=3). While there are almost 15 years' worth of these videos available for mining, the online student might be well advised to make a habit of tuning in to the live streaming of these events, for more of a flavor of the campus experience.

Mathematics Department

Differential Equations

Source

MIT, 18.03SC, Prof. Arthur Mattuck (Fall 2011)

Provider description

“The laws of nature are expressed as differential equations. Scientists and engineers must know how to model the world in terms of differential equations, and how to solve those equations and interpret the solutions. This course focuses on the equations and techniques most useful in science and engineering.”

Commentary

Bioinformatics students who have somehow only studied math through integral calculus may find that some knowledge of differential equations is an important addition to their skill set. Not only are differential equations a mainstay of mathematical biology in areas such as enzyme kinetics and population dynamics, but they are the basis of many approaches to modeling of biological systems. Prof. Mattuck's development of the subject is fairly traditional, but is supplemented by updated “wrappers” in the MIT courseware that provide helpful visualizations and simulations of the sort to which many modern treatments of the subject are trending.

Numerical Methods

Source

University of South Florida, EML3041, Prof. Autar Kaw (Summer 2012)

Provider description

“Numerical methods are techniques to approximate mathematical procedures … Approximations are needed because we either cannot solve the procedure analytically … or because the analytical method is intractable. In this course, you will learn the numerical methods for the following mathematical procedures and topics - Differentiation, Nonlinear Equations, Simultaneous Linear Equations, Interpolation, Regression, Integration, and Ordinary Differential Equations. Calculation of errors and their relationship to the accuracy of the numerical solutions is emphasized throughout the course.”

Commentary

Numerical methods are an important skill set for those who will actually need to solve differential equations and other formulations that have no easy closed-form expression, which applies to a lot of real-world mathematical biology. Math packages can handle much of the dirty work, but the real pros need to understand what's under the hood. Although Prof. Kaw's course at the University of South Florida is listed here, the linked material is actually an independent e-learning course funded by major grants to Prof. Kaw from the U.S. National Science Foundation and used by a variety of universities. It is modular, including not only hundreds of short videos but also quizzes, slides, examples, and demonstrations using a free Mathematica Player. An associated textbook is also freely available online, a chapter at a time [8]. Sample code is provided in each of Maple, Mathcad, Mathematica, and MATLAB, none of which are free, but the Octave free software package (http://www.gnu.org/software/octave) closely approaches the core functionality of MATLAB, which is heavily used in this and several other listed courses for numerical computation and matrix math.
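As a taste of what lies "under the hood," the following is a minimal sketch (not taken from Prof. Kaw's materials) of the forward Euler method, one of the simplest techniques for ordinary differential equations. It is applied to logistic population growth, chosen here only because it has a known exact solution to compare against; the function names and the parameter values r, K, and n0 are illustrative assumptions.

```python
import math


def euler(f, y0, t0, t1, steps):
    """Integrate dy/dt = f(t, y) from t0 to t1 with fixed-step forward Euler."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y += h * f(t, y)  # follow the local slope for one step of size h
        t += h
    return y


def logistic_rhs(r, K):
    """Right-hand side of dN/dt = r * N * (1 - N / K) (hypothetical parameters)."""
    return lambda t, n: r * n * (1.0 - n / K)


def logistic_exact(n0, r, K, t):
    """Closed-form solution of the logistic equation, used only as a reference."""
    return K / (1.0 + (K - n0) / n0 * math.exp(-r * t))


if __name__ == "__main__":
    r, K, n0, t_end = 0.5, 100.0, 10.0, 10.0  # illustrative values
    exact = logistic_exact(n0, r, K, t_end)
    for steps in (10, 100, 1000):
        approx = euler(logistic_rhs(r, K), n0, 0.0, t_end, steps)
        print(steps, approx, abs(approx - exact))
```

Running the script prints the approximation and its absolute error for three step counts, which illustrates the course's emphasis on how error relates to the accuracy of a numerical solution. Production work would normally use a higher-order, adaptive solver from an established package.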

Linear Algebra

Source

MIT, 18.06SC, Prof. Gilbert Strang (Fall 2011)

Provider description

“This course covers matrix theory and linear algebra, emphasizing topics useful in other disciplines such as physics, economics and social sciences, natural sciences, and engineering.”

Commentary

Prof. Strang is a legend as an educator, charmingly diffident in his delivery yet never lacking in clarity. He has long held that linear algebra should be given at least as much teaching emphasis as calculus and differential equations, and the rise of Big Data is now proving him correct beyond any doubt. No bioinformatics professional dealing with high-dimensional data can afford to neglect an understanding of matrix math, with many bioinformatics methods currently making use of various matrix factorizations, transformations, decompositions, and eigenwhatevers.

Going further

The Harvard Extension School has an advanced course in “Abstract Algebra” taught by Prof. Benedict Gross, starting from a linear algebra foundation to study group theory, vector spaces, fields, etc. (http://www.extension.harvard.edu/open-learning-initiative/abstract-algebra). Prof. Edwin Connell of the University of Miami has a free online textbook “Elements of Abstract and Linear Algebra” with a similar approach (http://www.math.miami.edu/~ec/book). While these may be overkill for bioinformatics, they might just inspire some to seek deeper insights into structures in large datasets. Prof. Strang himself teaches two follow-on video courses in applied mathematics, developing his linear algebra-oriented approach to networks, structures, estimation, Fourier analysis, convolution filtering, etc. (http://ocw.mit.edu/courses/mathematics/18-085-computational-science-and-engineering-i-fall-2008 and http://ocw.mit.edu/courses/mathematics/18-086-mathematical-methods-for-engineers-ii-spring-2006). His magisterial self-published textbook for these courses includes a treatment of microarray analysis to discover “eigengenes” [9].

Statistics

Source

Princeton on Coursera, Prof. Andrew Conway (Fall 2012)

Provider description

“Statistics One is designed to be a friendly introduction to very simple, very basic, fundamental concepts in statistics … Random sampling and assignment. Distributions … Descriptive statistics. Measurement … Correlation. Causality … Multiple regression. Ordinary least squares … Confidence intervals. Statistical power … t-tests, chi-square tests. Analysis of Variance.”

Commentary

Only those with no exposure at all to statistics, or those who would benefit from a refresher, should feel the need to take this rather elementary introduction, but the skills are certainly essential to bioinformatics analysis. If necessary it can also provide a gentle lead-in to the Introduction to Probability course, which in turn will be required for more advanced work in statistics. The course makes use of the free statistical software package R (http://www.r-project.org), which bioinformatics practitioners should have in their toolbox not only for classical statistical tests taught here but for more advanced applications such as linear and nonlinear modeling, time-series analysis, classification, clustering, etc.

Alternatives

Udacity is offering a similar introductory course by Stanford Prof. Sebastian Thrun (http://www.udacity.com/overview/Course/st101). Profs. Susan Dean and Barbara Illowsky of De Anza College offer an “Elementary Statistics” video course that also has a free online textbook and a full complement of quizzes, exams, and assignments (http://sofia.fhda.edu/gallery/statistics/index.html). For a stimulating change, one can consider learning or reviewing the basics of statistics from the perspectives of other disciplines. For instance, another way to pick up R while learning a little epidemiology is through Berkeley Prof. Tomas Aragon's course in “Applied Epidemiology using R” (http://www.youtube.com/view_play_list?p=1CBCB8C53D0CBE1F). A somewhat more detailed (but also considerably more protracted) treatment of basic research statistics is to be found in Berkeley Prof. Frederic Theunissen's “Research and Data Analysis in Psychology” (http://www.youtube.com/view_play_list?p=A07B0BAB1D82C53C). For those with more math and less time, an “Introduction to Statistical Methods for High-Energy Physics” by Prof. Glen Cowan (http://videolectures.net/cernstudentsummerschool09_cowan_is) is a four-lecture overview of material taught in the University of London course.

Going further

Prof. Wim Krijnen of Hanze University in the Netherlands has a free online textbook “Applied Statistics for Bioinformatics using R” [10] that does a lovely job of combining a course in statistics with instruction in R and more advanced applications to bioinformatics such as microarray analysis. Further study of statistics should be undertaken only after completing the Introduction to Probability below.

Introduction to Probability

Source

Harvard, Statistics 110, Prof. Joseph Blitzstein (Fall 2011)

Provider description

“A comprehensive introduction to probability. Basics: sample spaces and events, conditional probability, and Bayes' Theorem. Univariate distributions: density functions, expectation and variance, Normal, t, Binomial, Negative Binomial, Poisson, Beta, and Gamma distributions. Multivariate distributions: joint and conditional distributions, independence, transformations, and Multivariate Normal. Limit laws: law of large numbers, central limit theorem. Markov chains: transition probabilities, stationary distributions, convergence.”

Commentary

Bioinformatics methods depend on statistics to a much greater degree and in much greater depth than biologists typically encounter in their training for analysis of variance and experimental design. Consequently a solid foundation in probability is de rigueur, particularly in preparation for data mining and machine learning applications. Prof. Blitzstein has an unintimidating, even laid-back style, always striving to convey valuable intuitions, but does not lack in rigor or depth of coverage.
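To make one of the listed topics concrete, here is a minimal sketch (not drawn from the course itself) of how the stationary distribution of a small Markov chain can be approximated by repeatedly applying the transition matrix. The two-state matrix is a made-up example, and the approach assumes an irreducible, aperiodic chain so that the iteration converges.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate pi with pi = pi P by repeated multiplication (power iteration).

    P is a row-stochastic matrix given as a list of lists: P[i][j] is the
    probability of moving from state i to state j.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


# Hypothetical two-state chain, e.g. "background" vs. "island" sequence regions.
P = [
    [0.9, 0.1],
    [0.5, 0.5],
]

if __name__ == "__main__":
    print(stationary_distribution(P))
```

For this particular matrix the balance condition 0.1 * pi[0] = 0.5 * pi[1] can be solved by hand, giving (5/6, 1/6), which offers a quick check on the numerical result.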

Prerequisites

As noted above, those who lack even a basic working knowledge of statistics should take Statistics One, which can also serve as a less demanding lead-in to this course.

Going further

IIT Kharagpur offers two courses through NPTEL that start with a more mathematically intensive treatment of probability founded in measure theory (usually kept “behind the curtain” for non-mathematicians), but then extend it in two different directions: “Probability and Statistics” by Prof. Somesh Kumar (http://nptel.iitm.ac.in/courses/111105041) and “Probability and Random Processes” by Prof. Mrityunjoy Chakraborty (http://www.youtube.com/playlist?list=PLD85E88483F782338). One flavor of stochastic processes that is especially important in bioinformatics is taught in “Introduction to Markov Processes” by Prof. Christof Schütte, head of the Biocomputing Group at the Freie Universität Berlin (http://www.networkmaths.ie/videos/list_videos.php?course=mar). In terms of books, a quick tour of statistical inference suited to a computer science world view can be found in the ambitiously titled “All of Statistics” by Carnegie Mellon University Prof. Larry Wasserman [12]. For a treatment of probability, statistics, and stochastic processes that makes reference to bioinformatics throughout, see the book “Statistical Methods in Bioinformatics” by University of Pennsylvania Prof. Warren Ewens and Gregory Grant [13]. The first edition of Stanford Prof. Robert Gray's “Probability, Random Processes, and Ergodic Properties,” since reissued in a revised second edition, is freely available online [14].

Automata

Source

Stanford, CS154 on Coursera, Prof. Jeffrey Ullman (Spring 2012)

Provider description

“The course covers four broad areas: (1) Finite automata and regular expressions, (2) Context-free grammars, (3) Turing machines and decidability, and (4) the theory of intractability, or NP-complete problems.”

Prerequisites

Data Structures or equivalent. Prof. Ullman recommends portions of his free online textbook “Foundations of Computer Science” as preparation [15]. The optional programming assignments require Java or Python.

Commentary

Despite the name, this course also extends to formal language theory and introduces tractability. The primary attraction of this Coursera offering is its illustrious instructor, who literally wrote the book on automata (and on databases, on algorithms, etc.). It's hard to imagine a better way for biologists to be introduced to the theory of computation. Topics such as automata and grammars are important in areas like pattern matching and RNA fold prediction, while an awareness of tractability and decidability is essential in contemplating algorithmic approaches to new problems. Perhaps most importantly, as Prof. Ullman points out, surveys of Stanford grads show that this course was one of the most useful in their subsequent careers, for the mindset it engendered in solving many real-world computational challenges.
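The link between automata and pattern matching can be illustrated with a short sketch. The code below is an assumption-laden example, not material from the course: it builds a deterministic finite automaton whose state records how much of a motif has been matched so far, then scans a DNA string in a single pass. The function names, the motif, and the restriction to the alphabet ACGT are all illustrative choices.

```python
def build_dfa(pattern, alphabet="ACGT"):
    """Return delta, where delta[state][symbol] is the next state.

    State k means: the longest prefix of `pattern` that is a suffix of the
    input read so far has length k. State len(pattern) is the accepting state.
    """
    m = len(pattern)
    delta = []
    for state in range(m + 1):
        row = {}
        for symbol in alphabet:
            candidate = pattern[:state] + symbol
            # Find the longest k such that pattern[:k] is a suffix of candidate.
            k = min(m, len(candidate))
            while k > 0 and not candidate.endswith(pattern[:k]):
                k -= 1
            row[symbol] = k
        delta.append(row)
    return delta


def find_matches(text, pattern, alphabet="ACGT"):
    """Return 0-based start positions of all (possibly overlapping) matches.

    Symbols outside `alphabet` (such as N) raise KeyError in this sketch.
    """
    delta = build_dfa(pattern, alphabet)
    accept = len(pattern)
    state = 0
    hits = []
    for i, symbol in enumerate(text):
        state = delta[state][symbol]
        if state == accept:
            hits.append(i - accept + 1)
    return hits


if __name__ == "__main__":
    print(find_matches("GATATATAC", "TATA"))
```

The table construction here is deliberately naive for readability; the scan itself reads each input symbol exactly once, which is the property that makes automata attractive for searching long sequences.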

Alternatives

For a somewhat more extensive treatment, the Harvard Extension School has an outstanding “Introduction to Formal Systems and Computation” by Prof. Harry Lewis (http://itunes.apple.com/WebObjects/MZStore.woa/wa/viewPodcast?id=429428100). A “Theory of Automata, Formal Languages and Computation” is offered by Prof. Kamala Krithivasan of IIT Madras through NPTEL, which includes lectures on natural language processing and DNA computing (http://nptel.iitm.ac.in/courses/106106049). The book by MIT Prof. Michael Sipser is standard [16], but for a free online alternative try the text by the late Prof. Eitan Gurari of Ohio State University [17].

Discrete Math

Source

Stony Brook University, Prof. Steven Skiena, CSE 547 (1999)

Provider description

“The mathematical analysis of algorithms uses a variety of topics from discrete mathematics—combinatorial analysis, number theory, and graph theory. The purpose of this course is to provide fluency with summations, congruences, generating functions, graph theory, and other tools of the trade. The emphasis will be on learning how to attack and solve problems.”
