Testing AI for information retrieval

Dear Readers,

With the continued interest in AI among our partners, we’ve dedicated this month’s insights to our experiment on using AI (Artificial Intelligence) for information retrieval. Please note that this experiment is still ongoing, led by our internal digital & medical writing teams.

Without further ado, let’s dive in!

Context

Throughout 2024, AI’s role in medical writing came up repeatedly in our conversations with partners, who wanted to know whether AI could assist them in any meaningful way.

The main purpose of our experiment was to test the accuracy of the output & references, along with the response time.

September 2023 – The creation of model “A1”


We started with an open-source Large Language Model (LLM), self-hosted and fed with a limited set of input documents. Here were the main observations from our first test of model “A1”, run on 12 inputs (a minimal sketch of this kind of setup follows the list):

  • Speed: >1 minute
  • Output accuracy: Mixed; sometimes the output was incorrectly phrased.
  • References: Mixed; sometimes irrelevant references were listed.
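
For readers curious what this first setup looked like in practice, below is a minimal Python sketch of a self-hosted, retrieval-style pipeline. It is illustrative only and not our production code: the endpoint URL, model name and documents are placeholders, and it assumes the local model exposes an OpenAI-compatible API (as many self-hosting tools do).

    # Minimal retrieval sketch: pick the document that shares the most words
    # with the question, then ask a locally hosted model to answer from it.
    # Endpoint URL, model name and documents below are placeholders.
    import requests

    documents = {
        "paper_1.txt": "Full text of the first input document ...",
        "paper_2.txt": "Full text of the second input document ...",
    }

    def retrieve(question: str) -> tuple[str, str]:
        """Return the (name, text) of the document with the most word overlap."""
        q_words = set(question.lower().split())
        return max(documents.items(),
                   key=lambda item: len(q_words & set(item[1].lower().split())))

    def ask(question: str) -> str:
        source, context = retrieve(question)
        response = requests.post(
            "http://localhost:8000/v1/chat/completions",  # hypothetical self-hosted endpoint
            json={
                "model": "local-open-source-model",  # placeholder model name
                "messages": [
                    {"role": "system",
                     "content": f"Answer only from this document ({source}):\n{context}"},
                    {"role": "user", "content": question},
                ],
            },
            timeout=120,
        )
        return response.json()["choices"][0]["message"]["content"]

    print(ask("What was the primary endpoint of the study?"))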

Time for major changes!

Early November 2023 – Revamped model “B1”


After 1.5 months, we made two key changes for the newly revamped model “B1”: calling OpenAI directly and hosting on a dedicated server (a sketch of this configuration follows the observations below). We tested it with the same 12 inputs:

  • Speed: ~5 seconds
  • Output accuracy: Better than model “A1”, but sometimes it couldn’t provide specific data
  • References: Better than model “A1”, but it would be better still if the model indicated specific references for each claim

Overall, model “B1” was better than model “A1”, but it still fell short on specificity, so further modifications were needed.
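
To illustrate the “call OpenAI directly” change, here is a minimal Python sketch using the official OpenAI SDK, with simple response timing. The model name, prompt and context are placeholders and do not reflect our actual configuration.

    # Minimal sketch of the "call OpenAI directly" pattern, with timing.
    # Model name, prompt and context are placeholders, not our configuration.
    import time
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def ask(question: str, context: str) -> tuple[str, float]:
        """Send one question with supporting context and return (answer, seconds)."""
        start = time.perf_counter()
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder; any chat-capable model works
            messages=[
                {"role": "system",
                 "content": "Answer strictly from the provided documents and cite them.\n" + context},
                {"role": "user", "content": question},
            ],
        )
        return response.choices[0].message.content, time.perf_counter() - start

    answer, seconds = ask("What adverse events were reported?",
                          "Document excerpts would go here ...")
    print(f"Answered in {seconds:.1f}s:\n{answer}")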

Late November 2023 – Remodeled model “B2”


This time, our digital team made a few more tweaks for our medical writers to test model “B2” with the same 12 inputs. We switched to OpenAI on Azure (instead of calling OpenAI directly), and answers were now separated per paper (sketched after the observations below). These were our main observations:

  • Speed: <5 seconds
  • Output accuracy: Similar to model “B1”; it would still be better if it could provide specific data
  • References: Slightly better than model “B1”, but there is still room for improvement.
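
To illustrate the Azure variant and the per-paper separation, here is a minimal Python sketch using the Azure OpenAI client. The endpoint, key, API version, deployment name and papers are placeholders; the point is only the structure of asking the question once per paper so that each answer maps to a single source.

    # Minimal sketch of the Azure OpenAI variant, answering per paper so each
    # answer maps to exactly one source document. Endpoint, key, API version
    # and deployment name are placeholders.
    from openai import AzureOpenAI

    client = AzureOpenAI(
        azure_endpoint="https://example-resource.openai.azure.com",  # placeholder
        api_key="YOUR_AZURE_OPENAI_KEY",                              # placeholder
        api_version="2024-02-01",
    )

    papers = {
        "Paper 1": "Full text or extract of the first paper ...",
        "Paper 2": "Full text or extract of the second paper ...",
    }

    def answer_per_paper(question: str) -> dict[str, str]:
        """Ask the same question once per paper and collect one answer per source."""
        answers = {}
        for title, text in papers.items():
            response = client.chat.completions.create(
                model="my-gpt-deployment",  # placeholder Azure deployment name
                messages=[
                    {"role": "system",
                     "content": f"Answer only from this paper ({title}). "
                                f"Say 'not reported' if it is not covered.\n{text}"},
                    {"role": "user", "content": question},
                ],
            )
            answers[title] = response.choices[0].message.content
        return answers

    for title, answer in answer_per_paper("What was the sample size?").items():
        print(f"{title}: {answer}")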

While we’re still making tweaks and running further tests, it remains inconclusive (as of today) whether AI can reliably assist with information retrieval. It showed some promise, but the output & referencing need to be extremely accurate. We’ll continue to provide updates in due time!

For any clarifications or further contact, please email: michael.phee@drcomgroup.com | marketing@drcomgroup.com