Discovering, developing, and bringing a new medicine to market has long been synonymous with high costs, long timelines, and staggering failure rates. On average, it takes 10–15 years and more than $2.5 billion to develop a single drug [1]. Out of 10,000 compounds, only about 250 enter preclinical testing, 5 reach clinical trials, and just 1 is eventually approved [2]. Despite scientific progress, these figures have remained largely unchanged over the past decade, reflecting the inefficiency and complexity of the traditional pharmaceutical R&D model.
Today, we stand at the edge of a profound transformation. The widespread adoption of artificial intelligence and digital tools is beginning to disrupt every stage of drug discovery, development, and even advanced manufacturing. Insilico Medicine and Exscientia (now part of Recursion Pharmaceuticals) have used AI to identify novel drug candidates in 30 and 12–15 months, respectively, a process that traditionally took 4–5 years [3,4]. Pfizer employed AI and automation to accelerate COVID-19 vaccine development, reducing timelines from years to months [5]. AstraZeneca has invested in smart factories that use AI to create digital twins, optimizing operational flow and reducing lead times [6]. GSK has also adopted AI strategies to significantly compress drug discovery timelines [7]. Novo Nordisk is leveraging data science and AI to optimize clinical development and improve therapeutic decision-making, and it recently reported cutting the time needed to compile regulatory documents from 12 weeks to under 10 minutes [8].
On average, we scientists now produce about a million times more data per unit of time than we did 40 years ago [7]. As labs embrace high-throughput experimentation and automated workflows, the amount of data generated daily has reached unprecedented levels. A single high-throughput chemistry lab can screen tens of thousands of chemical reactions per day [9,10]. Droplet-based microfluidic platforms, for example, have the potential to run thousands of experiments per second [11], while digital tools can simulate billions of conditions virtually [12]. We are not just creating data; we are drowning in it.
However, this data deluge is only valuable if harnessed properly. A robust data strategy is no longer optional; it is essential. First, data capture and ingestion must be streamlined across instruments and platforms. Second, data standardization and normalization must ensure compatibility, sharing, and reuse. Third, to unlock value, data must be AI- and automation-ready, which means proper formatting, metadata tagging and enrichment [13], and accessible infrastructure. Only then can we fully leverage AI to uncover patterns, predict outcomes, and accelerate innovation.
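As a rough illustration of the three steps above (capture, standardization, and metadata tagging), the following Python sketch may help. It is only a minimal example under stated assumptions: the field names (`temp`, `temp_unit`, `yield_pct`), the `Record` schema, the instrument label, and the list of required metadata keys are all hypothetical and are not taken from any specific platform or from the cited references.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical conversion table: every temperature is standardized to Celsius.
TO_CELSIUS = {
    "C": lambda v: v,
    "K": lambda v: v - 273.15,
    "F": lambda v: (v - 32.0) * 5.0 / 9.0,
}

# Hypothetical minimum metadata for a record to count as "AI-ready".
REQUIRED_METADATA = ("instrument", "ingested_at", "schema_version")


@dataclass
class Record:
    """One standardized experimental result (illustrative schema)."""
    experiment_id: str
    temperature_c: float
    yield_pct: float
    metadata: dict = field(default_factory=dict)


def ingest(raw: dict, instrument: str) -> Record:
    """Capture a raw instrument reading, normalize units, and tag metadata."""
    # Step 1: capture. Raw values often arrive as strings, so parse them.
    temperature = float(raw["temp"])
    yield_pct = float(raw["yield_pct"])
    # Step 2: standardize. Convert the temperature to a single unit.
    temperature_c = TO_CELSIUS[raw["temp_unit"]](temperature)
    # Step 3: tag. Attach provenance so the record can be found and reused.
    metadata = {
        "instrument": instrument,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
        "schema_version": "1.0",
    }
    return Record(raw["id"], temperature_c, yield_pct, metadata)


def is_ai_ready(record: Record) -> bool:
    """Check that all required metadata keys are present and non-empty."""
    return all(record.metadata.get(key) for key in REQUIRED_METADATA)


raw_reading = {"id": "rxn-001", "temp": "298.15", "temp_unit": "K", "yield_pct": "87.5"}
record = ingest(raw_reading, instrument="hplc-01")
print(record.experiment_id, round(record.temperature_c, 2), is_ai_ready(record))
```

In practice, the schema and the required metadata would be agreed across instruments and teams; the point of the sketch is only that standardization and tagging happen at ingestion time rather than as a later clean-up step.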
The pharmaceutical industry is at a pivotal moment. With the fusion of AI, automation, and data-driven strategies, we have the opportunity to redefine what is possible from drug discovery and development all the way to commercial manufacturing. But success depends not just on generating data—it depends on managing it intelligently. The future belongs to those who can turn data into decisions, and decisions into breakthroughs.
1. Mullard, A. 2019 FDA Drug Approvals. Nat. Rev. Drug Discov. 2020, 19, 79–84. https://doi.org/10.1038/d41573-020-00001-7
2. Wouters, O. J.; McKee, M.; Luyten, J. Estimated Research and Development Investment Needed to Bring a New Medicine to Market, 2009–2018. JAMA 2020, 323 (9), 844–853. https://doi.org/10.1001/jama.2020.1166
3. Insilico Medicine. Insilico Medicine Announces First AI-Designed Drug to Advance to Phase I Trials. https://insilico.com/phase1 (accessed May 2025).
4. UK Research and Innovation. Exscientia: A Clinical Pipeline for AI-Designed Drug Candidates. https://www.ukri.org/who-we-are/how-we-are-doing/research-outcomes-and-impact/bbsrc/exscientia-a-clinical-pipeline-for-ai-designed-drug-candidates/ (accessed May 2025).
5. Pfizer. Accelerating Digital Technology for COVID-19 Vaccine Rollout. https://www.pfizer.com/news/behind-the-science/accelerating-digital-technology-covid-19-vaccine-rollout (accessed May 2025).
6. AstraZeneca. Smart Factories: Delivering more for Patients Worldwide. https://www.astrazeneca.com/content/astraz/what-science-can-do/topics/technologies/smart-factories-delivering-more-for-patients-worldwide.html# (accessed May 2025).
7. How GSK is Speeding up Drug Discovery Timelines with AI. The Tech Talks Daily Podcast, March 16, 2025. https://techblogwriter.co.uk/gsk/ (accessed May 11, 2025).
8. Novo Nordisk. Data Science and AI. Novo Nordisk, March 7, 2025. https://www.novonordisk.com/content/dam/nncorp/global/en/investors/irmaterial/cmd/2024/P10-Data-Science-and-AI.pdf (accessed May 11, 2025).
From 12 Weeks to 10 Minutes: How Novo Nordisk Accelerates Time to Value with GenAI and MongoDB. https://www.mongodb.com/solutions/customer-case-studies/novo-nordisk (accessed May 2025).
9. Perera, D.; Tucker, J. W.; Brahmbhatt, S.; Helal, C. J.; Chong, A.; Farrell, W.; Richardson, P.; Sach, N. W. A Platform for Automated Nanomole-Scale Reaction Screening and Micromole-Scale Synthesis in Flow. Science 2018, 359 (6374), 429–434. https://doi.org/10.1126/science.aap9112
10. Buitrago Santanilla, A.; Regalado, E. L.; Pereira, T.; Shevlin, M.; Bateman, K.; Campeau, L.-C.; Schneeweis, J.; Berritt, S.; Shi, Z.-C.; Dreher, S. D. Nanomole-Scale High-Throughput Chemistry for the Synthesis of Complex Molecules. Science 2014, 347 (6217), 49–53. https://doi.org/10.1126/science.1259203
11. Samimi, A.; Hengoju, S.; Rosenbaum, M. A. Combinatorial Sample Preparation Platform for Droplet-Based Applications in Microbiology. Sens. Actuators, B 2024, 417, 136162. https://doi.org/10.1016/j.snb.2024.136162
12. Li, S.-C.; Wang, P.-H.; Su, J.-W.; Chiang, W.-Y.; Huang, S.-H.; Lin, Y.-C.; Ou, C.-H.; Chen, C.-Y. Application of the Digital Annealer Unit in Optimizing Chemical Reaction Conditions for Enhanced Production Yields. arXiv 2024, arXiv:2407.17485 [physics.chem-ph]. https://doi.org/10.48550/arXiv.2407.17485
13. Della Corte, D.; Young, N.; Della Corte, K. A. Delivering Digital Transformation with Self-Reporting Data Assets. Lect. Notes Netw. Syst. 2021, 358, 975–985. https://doi.org/10.1007/978-3-030-89906-6_62