Benefits and Drawbacks of Generative AI in Biotech

By Zareh Zurabyan 9 min read 02 May 2024

Biotech R&D has produced some of the most impressive innovations, from recombinant DNA to genome editing. While the road to commercialisation has always been challenging, many fundamental barriers to innovation have grown. Researchers now face an overload of unstructured data and ideas, and translating these into world-changing innovations presents a huge organisational and logistical challenge.

With the rise of generative AI in the past year, a solution to some of these issues is on the horizon. Generative AI can promote divergent thinking, challenge the bias of experts, evaluate and refine ideas, and facilitate collaboration across niche research areas. It’s also streamlining the data lifecycle and changing the creative aspects of biotech lab operations, such as automating and improving the quality of content, from lab notebooks to published scientific literature.

In this blog post, we look at what generative AI is, how it works, and its applications in biotech and the broader life sciences.

What is Generative AI?

Generative AI, or Gen AI, is a class of artificial intelligence techniques and algorithms that generate new data samples or content, including audio, code, images, text, simulations, and videos.

Unlike discriminative models that focus on classification or prediction tasks based on existing data, generative models learn the underlying patterns and structures of the data to generate new instances that are statistically similar to the training data.
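To make the idea of "learning the data and then generating new instances" concrete, here is a deliberately tiny sketch. It assumes the training data can be described by a single Gaussian, which is far simpler than any real generative model, and the optical density readings are hypothetical values invented for illustration.

```python
import random
import statistics

def fit_gaussian(samples):
    """Estimate the mean and standard deviation of the training data."""
    return statistics.mean(samples), statistics.stdev(samples)

def generate(mean, stdev, n, seed=0):
    """Draw n new values from the fitted distribution."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [rng.gauss(mean, stdev) for _ in range(n)]

# Hypothetical training data: optical density readings from a plate reader.
training = [0.52, 0.48, 0.55, 0.50, 0.47, 0.53, 0.49, 0.51]

mean, stdev = fit_gaussian(training)
synthetic = generate(mean, stdev, n=1000)

# The synthetic values are new, but statistically similar to the training data.
print(round(mean, 3), round(statistics.mean(synthetic), 3))
```

Real generative models replace the single Gaussian with a neural network that can capture much richer structure, but the two steps (fit a distribution, then sample from it) are the same in spirit.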

Gen AI Encompasses a Wide Range of Algorithms and Approaches

Gen AI has applications across various domains, including image generation, text synthesis, music composition, drug discovery, and content creation. It enables the generation of realistic and diverse data samples, facilitates data augmentation for training machine learning models, and fosters creativity and innovation in AI-driven applications.

There are several algorithms and approaches, including:

Generative Adversarial Networks (GANs): GANs consist of two neural networks, a generator and a discriminator, which are trained simultaneously in a competitive manner. The generator learns to generate realistic data samples, such as images, text, or audio, while the discriminator learns to distinguish between real and generated samples. Through adversarial training, GANs produce high-quality synthetic data that closely resembles the distribution of real data.
Variational Autoencoders (VAEs): VAEs are probabilistic generative models that learn to encode data samples into a lower-dimensional latent space and to decode points in that space back into the original data space. By sampling from the latent space and decoding the samples, VAEs can generate new data samples that capture the variability and structure of the training data. VAEs are commonly used for generating images, text, and other complex data types.
Autoregressive Models: Autoregressive models, such as autoregressive neural networks (ARNs) and autoregressive moving average (ARMA) models, generate data sequences by modeling the conditional probability distribution of each data point given the previous observations. By iteratively sampling from the conditional distribution, autoregressive models generate sequences of data samples, such as time series data, speech, or text.
Transformers: Transformers are a class of deep learning architectures that have achieved state-of-the-art performance in natural language processing (NLP) tasks. Decoder-based variants like GPT (Generative Pre-trained Transformer) can generate coherent and contextually relevant text by modeling the relationships between words or tokens in a sequence. Encoder-only variants like BERT (Bidirectional Encoder Representations from Transformers) model the same relationships but are used mainly for understanding tasks, such as classification, rather than open-ended text generation.
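Of the approaches above, the autoregressive idea is the easiest to show in a few lines. The sketch below is a minimal illustration, not a production model: it assumes each symbol depends only on the one before it (a first-order, or bigram, model), and the three DNA strings are made-up training data.

```python
import random
from collections import Counter, defaultdict

def train_bigram(sequences):
    """Count how often each symbol follows each other symbol."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def sample_sequence(counts, start, length, seed=0):
    """Generate a sequence one symbol at a time, each conditioned on the previous one."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    out = [start]
    while len(out) < length:
        followers = counts.get(out[-1])
        if not followers:
            break  # this symbol was never followed by anything in training
        symbols = list(followers)
        weights = [followers[s] for s in symbols]
        out.append(rng.choices(symbols, weights=weights, k=1)[0])
    return "".join(out)

# Hypothetical training data: three short DNA fragments.
training = ["ATGCGTACGT", "ATGGCTAGCT", "ATGCCGTAAG"]

model = train_bigram(training)
generated = sample_sequence(model, start="A", length=12)
print(generated)
```

Transformer-based language models follow the same generate-one-token-at-a-time loop, but condition each step on the whole preceding context rather than on a single previous symbol.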

How Does Gen AI Work?

Gen AI uses various techniques, including neural networks and deep learning algorithms, to identify patterns and generate new outcomes based on them. The training process for a generative model involves feeding it a large dataset of examples, such as images, text, audio, and videos. While traditional AI aims to perform specific tasks based on predefined rules and patterns, gen AI goes beyond this limitation and strives to create entirely new data that resembles human-created content. A language model is an example of gen AI: it is a probabilistic model of natural language that assigns probabilities to sequences of words, based on the text corpora, in one or more languages, on which it was trained. Large language models, the most advanced form, are built on the transformer architecture, which combines attention layers with feedforward neural networks.
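The phrase "probabilities of a series of words" can be illustrated with a toy word-level model. This is a sketch under strong assumptions: the three-sentence corpus is invented, and each word is assumed to depend only on the previous word, whereas a large language model conditions on a much longer context using a neural network.

```python
from collections import Counter

# Hypothetical training corpus.
corpus = [
    "the enzyme binds the substrate",
    "the enzyme cleaves the substrate",
    "the inhibitor binds the enzyme",
]

context_counts = Counter()  # how often each word appears as a context
pair_counts = Counter()     # how often each (previous, next) pair appears
for sentence in corpus:
    words = ["<s>"] + sentence.split()  # <s> marks the start of a sentence
    context_counts.update(words[:-1])
    pair_counts.update(zip(words, words[1:]))

def next_word_probs(prev):
    """Estimated distribution over the next word, given the previous word."""
    return {
        nxt: count / context_counts[prev]
        for (p, nxt), count in pair_counts.items()
        if p == prev
    }

def sequence_prob(sentence):
    """Probability of a whole sentence: the product of each step's conditional probability."""
    words = ["<s>"] + sentence.split()
    prob = 1.0
    for prev, nxt in zip(words, words[1:]):
        if context_counts[prev] == 0:
            return 0.0  # context never seen in training
        prob *= pair_counts[(prev, nxt)] / context_counts[prev]
    return prob

print(next_word_probs("the"))
print(sequence_prob("the enzyme binds the substrate"))
```

A sentence the model has seen patterns for receives a non-zero probability, while a word order it has never observed, such as "the substrate binds", receives zero. Larger models smooth over such gaps by generalising from far more data.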

Natural language processing also relies on neural networks, a method in artificial intelligence that teaches computers to process data in a way inspired by the human brain. Deep learning is a type of machine learning that uses interconnected nodes, or neurons, arranged in a layered structure that resembles the human brain. These algorithms can take different data inputs and are also used for tasks such as speech and voice recognition.

How is Gen AI Currently Used in Biotech?

Gen AI is increasingly utilised in biotech and life sciences across several applications, leveraging its ability to generate realistic and diverse data samples. Some key areas where Gen AI is currently being used in biotech and life sciences include:

Drug Discovery and Development: Gen AI generates novel molecular structures with desired properties for drug candidates. Generative models like GANs and VAEs can generate new chemical compounds with specific pharmacological properties, helping identify potential drug candidates and accelerating the drug discovery process.
Protein Design and Engineering: Gen AI techniques are employed to design and engineer proteins with enhanced functionalities or specific biological activities. By generating protein sequences or structures with desired properties, researchers can design novel enzymes, antibodies, or therapeutics for various applications, including enzyme engineering, drug delivery, and immunotherapy.
Biological Image Synthesis: Generative models synthesise realistic biological images, such as microscopy images of cells, tissues, or organisms. These synthesised images can be used to augment training datasets for image analysis algorithms, improve the generalisation of machine learning models, and generate data for virtual screening and testing of algorithms in silico.
Omics Data Generation: Gen AI techniques are applied to generate synthetic omics data, including genomics, transcriptomics, proteomics, and metabolomics data. Synthetic omics data can be used to supplement real experimental data, simulate biological processes, and validate computational models, enabling researchers to explore complex biological systems and discover biomarkers or therapeutic targets.
Text and Literature Generation: Generative models generate text-based content, such as scientific articles, literature reviews, or drug interaction reports. These generated texts can assist researchers in literature mining, knowledge discovery, and data summarisation, facilitating literature-based research and decision-making in biotech and the life sciences.
Biomolecule Design and Synthesis: Gen AI techniques are used to design and synthesise novel biomolecules, such as peptides, aptamers, or nucleic acids, with specific functions or properties. By generating sequences or structures with desired characteristics, researchers can develop biomolecules for diagnostics, therapeutics, and biosensing applications.

Overall, Gen AI is revolutionising biotech and life sciences by enabling the generation of novel data samples, molecules, and biological entities, fostering innovation, and accelerating research and development efforts in various domains. As the field continues to advance, Gen AI is expected to play an increasingly pivotal role in shaping the future of biotechnology and life sciences, including clinical research, where it can help identify which patient cohorts will respond best to specific drugs and so support more personalised medicine. On the operational and marketing sides, generative AI can optimise the supply chain, manufacturing processes, and marketing and advertising strategy.

What Are the Concerns About Using Generative AI in Biotech?

Data Quality and Bias: Gen AI models rely heavily on the quality and representativeness of training data. In biotech and life sciences, datasets may suffer from biases, inaccuracies, or limited diversity, impacting performance and generalisation ability. Biases in training data can lead to the generation of biased or unrealistic samples, hindering the reliability and validity of generated results.
Interpretability and Trustworthiness: Many Gen AI techniques, such as deep neural networks, are complex and opaque, making it challenging to interpret and trust the generated outputs. In critical applications like drug discovery and biomolecule design, it is crucial to understand how and why generative models generate specific outputs. Lack of interpretability can impede the adoption of Gen AI in decision-making processes and regulatory approval, limiting its utility in real-world applications.
Mapping Data: Mapping data in biotech and life sciences presents challenges due to the complexity and heterogeneity of biological systems and the diversity of data types and sources. Integrating and mapping heterogeneous data types requires harmonising data structures, ontologies, and metadata to ensure data consistency across datasets. Challenges in mapping data can affect the accuracy and reliability of Gen AI models trained on such data, so addressing them is essential to ensure the quality of data inputs and the applicability of the models in real-world scenarios.
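As a small illustration of the data mapping concern, the sketch below renames fields from two differently structured sources into one shared schema. Everything in it is an assumption made for illustration: the field names, the FIELD_MAP table, and the harmonise helper are hypothetical and do not come from any particular lab system or ontology.

```python
# Hypothetical mapping from source-specific field names to a shared schema.
FIELD_MAP = {
    "sample_id": "sample_id",
    "SampleID": "sample_id",
    "organism": "organism",
    "species": "organism",
    "conc_ng_ul": "concentration_ng_per_ul",
    "concentration": "concentration_ng_per_ul",
}

def harmonise(record):
    """Rename known fields to the shared schema; set unknown fields aside for review."""
    mapped, unmapped = {}, {}
    for key, value in record.items():
        target = FIELD_MAP.get(key)
        if target is None:
            unmapped[key] = value  # needs a human decision before it enters training data
        else:
            mapped[target] = value
    return mapped, unmapped

# Two hypothetical records describing similar samples with different field names.
lab_a = {"SampleID": "A-001", "species": "E. coli", "conc_ng_ul": 42.0}
lab_b = {"sample_id": "B-117", "organism": "E. coli", "concentration": 38.5, "operator": "JS"}

for record in (lab_a, lab_b):
    mapped, unmapped = harmonise(record)
    print(mapped, unmapped)
```

Keeping unmapped fields visible, rather than silently dropping them, is one simple way to stop inconsistencies from reaching a model unnoticed. Real harmonisation also has to reconcile units, controlled vocabularies, and ontologies, which this sketch does not attempt.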

Day-to-Day Usage of AI for Lab Operations

Above, we’ve listed the broader research applications in which Gen AI can be used. However, in day-to-day lab operations, the average life sciences and biotech scientist can utilise Gen AI in several ways. Gen AI can replace manual and repetitive tasks, freeing up resources for more complex and creative tasks.

Data Analysis and Interpretation: Gen AI can assist scientists in analysing and interpreting experimental data more efficiently. By employing machine learning algorithms, scientists can train models to recognise patterns, identify correlations, and extract meaningful insights from complex datasets, such as omics data, biological images, or high-throughput screening results. Gen AI can automate data analysis tasks, streamline data interpretation processes, and provide actionable insights to guide experimental design and decision-making in the lab.
Experimental Design and Planning: Gen AI can aid scientists in designing and planning experiments by generating hypotheses, optimising experimental conditions, and predicting outcomes. By leveraging predictive modeling techniques, scientists can simulate experimental scenarios, predict experimental outcomes, and identify optimal experimental parameters to achieve desired objectives. Gen AI can assist in experimental design optimisation, resource allocation, and risk assessment, helping scientists make informed decisions and maximise experimental efficiency in the lab.
Literature Mining and Knowledge Discovery: Gen AI can assist scientists in literature mining and knowledge discovery by analysing scientific literature, extracting relevant information, and synthesising knowledge from diverse sources. Natural language processing (NLP) techniques can extract key concepts, identify relationships between scientific entities, and summarise findings from research articles, patents, and databases. Gen AI can automate literature review processes, facilitate literature-based research, and accelerate knowledge discovery in specific research areas, enabling scientists to stay up-to-date with the latest advancements and make informed decisions in the lab.

Summary

Integrating generative AI into the biotech industry brings significant benefits and notable concerns. By leveraging generative AI, researchers can tackle the challenge of navigating through vast amounts of unstructured data and ideas, fostering divergent thinking, and facilitating collaboration across niche research areas. Moreover, generative AI streamlines various aspects of biotech lab operations, from automating content generation to improving the quality of scientific literature. However, concerns regarding data quality and bias, interpretability, trustworthiness, and data mapping remain pertinent. Despite these challenges, the day-to-day usage of generative AI in lab operations holds promise, enabling scientists to enhance data analysis and interpretation, optimise experimental design and planning, and accelerate literature mining and knowledge discovery. As generative AI continues to evolve, its role in revolutionising biotech research and development is poised to expand, shaping the future of precision medicine, drug discovery, and other critical areas within the life sciences.
