Motivation

The explosion in the quantity and complexity of experimental data generated by biomedical research is widely recognized.

The volume of data produced in genomics is doubling roughly every seven months; within the next decade, genomics is projected to generate between 2 and 40 exabytes of data per year.

This has created a bottleneck in converting new discoveries into clinical applications (the so-called "translational medicine" pipeline), and it is widely understood that machine learning and other AI approaches must be applied to accelerate data processing and close this gap. A software infrastructure is needed to process and store the data, analyze and summarize it in an understandable form, integrate it into comprehensive predictive models of normal and pathological processes, and apply these models to diagnose and treat patients.