AI in Research and Development

GAURI BHARAT
7 min read · Dec 14, 2020

“We have seen AI providing conversation and comfort to the lonely; we have also seen AI engaging in racial discrimination. Yet the biggest harm that AI is likely to do to individuals in the short term is job displacement, as the amount of work we can automate with AI is vastly larger than before. As leaders, it is incumbent on all of us to make sure we are building a world in which every individual has an opportunity to thrive.”

Andrew Ng, Co-founder and lead of Google Brain

Artificial Intelligence is a constant topic of interest in domains such as finance, customer service, sports, and many more. Research and Development across these sectors is no exception. AI not only promises to improve existing goods, services, and organizational efficiency, but will also play an important role in shaping the future of innovation and R&D.

In a report by the National Bureau of Economic Research, the authors (economists at MIT, Harvard, and Boston University) argue that AI is in a lane of its own as a research tool that can be applied to any domain and doesn’t discriminate based on the organization’s industry, size, or makeup.

Let’s look at some applications of AI that have proven useful in pharmaceutical Research and Development and at NASA.

Artificial Intelligence for Pharmaceutical Research and Development

The use of AI in pharmaceutical drug testing can be viewed as three different processes.

1. NLP for Drug Discovery

Most software vendors that offer drug research and discovery solutions claim to handle big data analytics or microscope imaging of molecular drug compounds.
In contrast, natural language processing software could enrich drug research by extracting information from unstructured data sources to be incorporated into the testing of current and future drug molecules.
NLP applications could sift through previous research documents for findings that are more relevant to a pharmaceutical company’s research than they were when originally discovered.
If a data scientist used NLP software to search for previously discovered chemical reactions of the drug they were testing, they might no longer need to conduct a given experiment. Pharmaceutical companies typically save their lab notes and clinical trial data in a database in order to keep track of their experiences with certain drugs, molecules, and chemicals. Clinical trial data, along with physicians’ notes, can also reveal patients’ experiences with adverse effects from drugs.
Because of this, NLP software could be trained on clinical trial reports and data from electronic medical records to identify these experiences and mark any relevant drugs or compounds as possible contributors to the adverse effects described in those documents.
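To make the idea concrete, here is a minimal sketch of flagging drug and adverse-effect pairs that appear in the same sentence of a clinical note. It is a simple keyword co-occurrence stand-in, not a trained NLP model, and the drug names, effect terms, and sample notes are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical vocabularies; a real system would rely on curated medical
# lexicons or a trained entity-recognition model instead of fixed word lists.
DRUG_TERMS = {"drugalpha", "drugbeta"}
ADVERSE_TERMS = {"nausea", "rash", "dizziness", "headache"}


def flag_adverse_effects(notes):
    """Count (drug, adverse effect) pairs mentioned in the same sentence."""
    pair_counts = Counter()
    for note in notes:
        # Split each note into rough sentences on '.', '!' or '?'.
        for sentence in re.split(r"[.!?]+", note.lower()):
            words = set(re.findall(r"[a-z]+", sentence))
            for drug in words & DRUG_TERMS:
                for effect in words & ADVERSE_TERMS:
                    pair_counts[(drug, effect)] += 1
    return pair_counts


# Made-up notes, for illustration only.
notes = [
    "Patient started DrugAlpha on Monday. Reported nausea after DrugAlpha dose.",
    "DrugBeta well tolerated. No complaints.",
    "Mild rash and nausea observed following DrugAlpha.",
]
counts = flag_adverse_effects(notes)
for (drug, effect), n in counts.most_common():
    print(f"{drug} -> {effect}: {n} mention(s)")
```

Note that this sketch does not understand negation, so a sentence such as "no nausea with DrugAlpha" would still be counted. Handling that kind of context is one reason real systems use trained language models.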

2. Predictive Analytics for Drug Discovery

The capability of predictive analytics technology to use past drugs, molecules and research assets that may have had some previous clinical testing is a large factor in setting AI software apart from drug discovery methods of the past.
Some AI applications are advertised to handle large amounts of pharmaceutical data to discern the physical and chemical traits of a drug molecule that could be useful in predicting success in the long term.
Pharmaceutical companies could use predictive analytics software to search through these data points about chemicals and molecules to find similarities to the one being tested.
A company might use predictive analytics to determine how likely a newly identified drug is to do well in clinical trials.
This would help prevent harm from adverse effects of the drug as well as drive sales by selecting the drug most likely to succeed. The prediction would be based on past documentation of clinical trials that involve drugs with a similar chemical makeup to the one being tested, as well as drugs used to treat the same illnesses.
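One simple way to picture "finding similar past drugs and using their trial outcomes" is a nearest-neighbour estimate. The sketch below is only an illustration under stated assumptions: the three numeric descriptors per compound, the historical records, and the choice of k are all invented, and commercial predictive analytics tools may work quite differently.

```python
import math

# Hypothetical historical records: (descriptor vector, passed_trial).
# The three descriptors are illustrative stand-ins (for example scaled
# molecular weight, solubility, and polarity); real data would differ.
PAST_COMPOUNDS = [
    ((0.20, 0.80, 0.10), True),
    ((0.25, 0.75, 0.15), True),
    ((0.90, 0.10, 0.70), False),
    ((0.85, 0.20, 0.65), False),
    ((0.30, 0.70, 0.20), True),
]


def euclidean(a, b):
    """Straight-line distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def estimate_success(candidate, history, k=3):
    """Fraction of the k most similar past compounds that passed trials."""
    nearest = sorted(history, key=lambda rec: euclidean(candidate, rec[0]))[:k]
    return sum(1 for _, passed in nearest if passed) / k


candidate = (0.22, 0.78, 0.12)
score = estimate_success(candidate, PAST_COMPOUNDS)
print(f"Estimated success likelihood: {score:.2f}")
```

The estimate is only as good as the historical records and the descriptors chosen, which is why the quality of a company's past trial documentation matters so much here.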

3. AI for Salt and Polymorph Testing

An important step in pharmaceutical research and development is determining a drug compound’s level of solubility in water and other liquids, its crystalline form once it becomes solid, and the stability of that structure once bound together.

Salt and polymorph testing, or screening, is the process of choosing the best physical form in which to manufacture and distribute the drug.
This also helps to determine how long the drug can sit before it expires.
Pharmaceutical companies could use machine learning and AI applications to facilitate this process on multiple levels. Predictive analytics applications could find any existing data on crystalline structures of drug molecules to give the user an idea of what the drug may look like under a microscope once in pill form.
A data scientist at a pharmaceutical research facility may be able to predict the results of a salt or polymorph screening using machine learning, and thus be able to adjust the drug before the screening begins.
A data scientist is needed here because a pharmaceutical researcher who is not familiar enough with the machine learning technology aggregating the company’s data won’t be able to use it in any helpful way.

Artificial Intelligence at NASA — Current Projects and Applications

The focus of the National Aeronautics and Space Administration (NASA) is to provide information to civilian institutions to help them solve scientific problems at home and in space. This requires a continuous stream of raw data in a constantly shifting environment.

1. TESS and Exoplanets

The Transiting Exoplanet Survey Satellite (TESS) is the next step in the search for planets outside of our solar system, including those that could support life. The mission will find exoplanets that periodically block part of the light from their host stars, events called transits.

Using the Kepler Space Telescope, scientists have observed more than 2,500 exoplanets so far. They do this by detecting dips in the brightness of stars as seen from Earth, a technique known as transit photometry. Since TESS launched, abundant data about exoplanets and some 2,000 other candidate planets has been produced. This is where AI comes into the picture.

The challenge is to narrow down the number of candidates to the best ones to follow up. The Kepler data has provided excellent material for researchers at MIT and Google to develop and train machine learning algorithms. These machine vision algorithms search rapidly through these data sets and identify the dips that warrant a closer look.
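To show what "identifying a dip" means in practice, here is a minimal sketch that scans a light curve for stretches where the brightness falls below a baseline. It uses a plain threshold rather than the machine learning models described above, and the synthetic light curve and the 1% depth threshold are assumptions made for illustration.

```python
from statistics import median


def find_transit_dips(flux, depth_threshold=0.01):
    """Return (start, end) index pairs where flux drops below the baseline.

    The baseline is the median flux; a sample counts as "in transit" when it
    falls more than depth_threshold (as a fraction) below that baseline.
    """
    baseline = median(flux)
    cutoff = baseline * (1.0 - depth_threshold)
    dips, start = [], None
    for i, value in enumerate(flux):
        if value < cutoff and start is None:
            start = i                        # a dip begins here
        elif value >= cutoff and start is not None:
            dips.append((start, i - 1))      # dip ended on the previous sample
            start = None
    if start is not None:                    # dip runs to the end of the data
        dips.append((start, len(flux) - 1))
    return dips


# Synthetic light curve: constant brightness with two 2% dips.
flux = [1.0] * 30
for i in (5, 6, 7, 20, 21, 22):
    flux[i] = 0.98

print(find_transit_dips(flux))  # prints [(5, 7), (20, 22)]
```

Real light curves are noisy and contain dips that have nothing to do with planets, which is why researchers train classifiers on them instead of relying on a fixed cutoff like this one.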

Speed is of the essence in this undertaking, as TESS spends between 27 and 30 days sweeping each sector. Quick identification of these candidates will give scientists the opportunity for a closer look before TESS moves on to the next sector.

As TESS continues its mission, more data will become available. Scientists hope to establish an HPC pipeline that extracts light curves from the TESS instrument, classifies them using neural networks on ADAPT, distributes them to interested NASA astronomers, re-trains the neural networks on ADAPT using astronomer feedback, and repeats this process for new data. Old data can continue to be reprocessed as new insights are gained.

2. AI and Aliens

Transit photometry is not just for finding exoplanets, however. Together with spectroscopy, it may also be useful for finding aliens.

The Fermi Paradox and the Drake Equation continue to spark debate among scientists, but the basis for both is incontrovertible: the universe is vast. Despite there being no direct proof of intelligent life on other planets, the sheer number of stars with potentially habitable planets indicates that such life is highly probable, and the Drake Equation suggests as many as 10,000 life forms in the Milky Way galaxy alone, although perhaps not life as humans know it.
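For readers who have not seen it, the Drake Equation is a simple product of seven factors: N = R* × fp × ne × fl × fi × fc × L. The sketch below only shows the arithmetic. The input values are placeholders chosen for illustration and are not scientific estimates; different choices give wildly different answers, which is exactly why the equation sparks debate.

```python
def drake_equation(r_star, f_p, n_e, f_l, f_i, f_c, lifetime):
    """Estimate N, the number of detectable civilizations in the galaxy.

    r_star   : average rate of star formation (stars per year)
    f_p      : fraction of stars that have planets
    n_e      : habitable planets per star that has planets
    f_l      : fraction of those planets where life develops
    f_i      : fraction of those where intelligent life develops
    f_c      : fraction of those that emit detectable signals
    lifetime : years such a civilization keeps emitting signals
    """
    return r_star * f_p * n_e * f_l * f_i * f_c * lifetime


# Placeholder inputs, chosen only to demonstrate the calculation.
n = drake_equation(r_star=1.0, f_p=0.5, n_e=2.0, f_l=1.0,
                   f_i=0.1, f_c=0.1, lifetime=10_000)
print(f"N is roughly {n:.0f}")
```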

Researchers at NASA’s Frontier Development Lab (FDL) used generative adversarial networks, or GANs, to create 3.5 million possible permutations of alien life based on signals from Kepler and the European Space Agency’s Gaia telescope. With these permutations, scientists have a better idea of the type of conditions that are most likely to sustain some form of life, such as gases present, temperature, density, and biological structure, whether carbon-based, such as that on Earth, or something more exotic.

A GAN is a set of two deep learning machines that work off each other to create new data based on a given data set. One machine generates new data, and the other judges whether that data is plausible (as opposed to implausible) or real (as opposed to fake), depending on the form of the ideal data fed into the machines at the outset.

In simple terms, the two machines in a GAN act like two sides of an argument. One tries to convince the other and, by degrees, they come to a compromise. A GAN can “fill in the gaps” in data sets to present a complete picture, provided it has enough time to produce enough iterations for the algorithm to create “realistic” results.

For example, if the first machine produces a stick figure as a representation of a human, the second machine will point out the flaws in the stick figure. The first machine then produces another image for further critique. In the end, the image should be a good approximation of a human.
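The back-and-forth between the two machines can be sketched with a toy GAN that works on single numbers instead of images. This is a deliberately tiny illustration, not the NASA FDL setup. It assumes a one-parameter-pair generator (x = a·z + b), a logistic discriminator, hand-written gradients, and "real" data drawn from a bell curve centred on 4, all of which are hypothetical choices.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


class ToyGAN:
    """1-D GAN: generator x = a*z + b, discriminator D(x) = sigmoid(w*x + c)."""

    def __init__(self):
        self.a, self.b = 1.0, 0.0   # generator parameters
        self.w, self.c = 0.0, 0.0   # discriminator parameters

    def generate(self, z):
        return self.a * z + self.b

    def discriminate(self, x):
        return sigmoid(self.w * x + self.c)

    def discriminator_step(self, real, fake, lr):
        # Gradient ascent on log D(real) + log(1 - D(fake)):
        # the critic learns to rate real data high and generated data low.
        d_real, d_fake = self.discriminate(real), self.discriminate(fake)
        self.w += lr * ((1.0 - d_real) * real - d_fake * fake)
        self.c += lr * ((1.0 - d_real) - d_fake)

    def generator_step(self, z, lr):
        # Gradient ascent on log D(G(z)):
        # the generator adjusts itself so its output is rated "real".
        grad_x = (1.0 - self.discriminate(self.generate(z))) * self.w
        self.a += lr * grad_x * z
        self.b += lr * grad_x


def train(gan, steps=2000, lr=0.01, real_mean=4.0, seed=0):
    rng = random.Random(seed)
    for _ in range(steps):
        real = rng.gauss(real_mean, 1.0)           # sample of the "ideal" data
        fake = gan.generate(rng.gauss(0.0, 1.0))   # the generator's attempt
        gan.discriminator_step(real, fake, lr)     # critic points out the flaws
        gan.generator_step(rng.gauss(0.0, 1.0), lr)  # generator tries again
    return gan


gan = train(ToyGAN())
print(f"generator offset b = {gan.b:.2f}, scale a = {gan.a:.2f}")
```

Each loop iteration is one round of critique and revision. Toy setups like this one often oscillate around the target rather than settling on it, which hints at why real GANs need many iterations and careful tuning before their output looks "realistic".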

One of these GANs is Atmos, simulation software created by Google Cloud developers in response to NASA’s FDL astrobiology challenge to simulate alien atmospheres. The software is open source.

Another neural network at play is a convolutional neural network or CNN. Embodied in a machine-learning tool called Intelligent Exoplanet Atmosphere Retrieval or INARA, it trains on spectral signatures from millions of exoplanets. It is able to assign values for each spectral signature and differentiate one from the other. When INARA encounters spectral signatures in an image, it can generate the possible chemical composition and the probability of life in seconds.

This shows that AI has scope in nearly every field. It’s definitely time to gain skills in AI and put them to use in the future!
