While the disciplines of biology, chemistry, and medicine have anchored drug discovery research since its inception, data science is a recent development in comparison. Yet, it is widely recognized that public and proprietary data, together with the ability to extract knowledge from them, are key assets that can drive competitive advantage.
In the context of the pharmaceutical industry, data science can be defined as the discipline at the interface of statistics, computer science, and drug discovery. Data scientists use traditional drug discovery research and add the ability to extract knowledge from the data that can drive competitive advantage. Members of the data discovery team are likely to include clinical statisticians, computational chemists, biostatisticians, and computational biologists who have been contributing to drug discovery and development through analyses of large datasets long before the term data science was popularized. More recently, machine learning engineers and specialized data scientists with specific skillsets (e.g., deep learning, image processing, or body sensors analysis) have joined the ranks of growing data science teams in pharmaceutical companies.
While the impact these scientists are having in both early and late drug discovery projects is recognized and often highly visible within an organization, they may not be well-recognized among higher leadership as essential to the organization.
In order to establish data science as a core drug discovery discipline, team composition needs to evolve at all levels: from leadership to project teams. Developing a greater understanding within leadership teams of the potential, applications, limitations, and pitfalls of data science in the pharmaceutical industry is now critical. Inclusion of data science leaders in decision-making bodies connects data scientists to critical business questions, raises organizational awareness of computational approaches and data management, and further connects disease-focused departments with discovery and clinical platforms. While the relatively recent emergence of data science means its practitioners may have less extensive career experience in pharmaceutical research than their peers in other functions, they are likely to provide novel perspectives and take orthogonal approaches to the difficult task of discovering and developing new drugs.
Traditional drug discovery project teams are composed of key scientific experts: biologists, pharmacologists, chemists, and clinicians, who collaborate to move the programs from target discovery to clinical trials. For projects to be fueled by computational insights and predictions, data scientists need to be integral members of the project teams and engage as collaborators (as opposed to being perceived as just a support function). This enables the development of a project-specific data strategy, deployment of resources required for the more data-intensive phases of the program, and application of the most effective computational methods to address the key project questions.