A key issue faced in pharmaceutical laboratories is the integration and standardisation of data from the array of instruments. Here, Samantha Kanza from the University of Southampton outlines some challenges with using laboratory information management systems or other digital tools. How might these be overcome?
What are the key data integration challenges faced in labs?
At the risk of citing a circular issue, I would say two key challenges are both lack of data standards and a saturation of data standards. Other challenges include proprietary data formats, lack of up-to‑date equipment and inconsistent datasets.
An important aspect of data integration is having different datasets in the same format (or at least a format that can be converted to match the other). Unfortunately, there are many programs and instruments that use different proprietary formats. They are often hard to convert into other formats and don’t necessarily align with the other datasets that people want to use. This is sometimes due to outdated equipment, but also the absence of a set of agreed standards for the different types of data that can be generated.
This issue of standards is a major topic that comes up at every event I attend related to laboratory research. Recently, a postgraduate I spoke to said one of the programs they use produces data that cannot be used with any of their other programs; this was a major pain point. When working on my ethnography during my PhD, I observed a student copy down numbers into a notebook from a program on one computer and then walk over to another computer and input those numbers into another program. When I asked why, they simply said the software doesn’t integrate.
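Where an instrument program can at least export plain delimited text, the kind of manual transcription described above could in principle be replaced by a short conversion script. The following is only a minimal sketch under that assumption: the function name `convert_export`, the semicolon delimiter and the column names are hypothetical, and a genuinely proprietary binary format would need a vendor-supported export first.

```python
import csv
import io


def convert_export(raw_text, column_map, delimiter=";"):
    """Rewrite a delimited text export as comma-separated text with agreed column names.

    raw_text   -- contents of the (hypothetical) instrument export, with a header row
    column_map -- maps the instrument's column headings to the lab's agreed headings
    """
    reader = csv.DictReader(io.StringIO(raw_text), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=list(column_map.values()), lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Keep only the mapped columns and trim stray whitespace around values.
        writer.writerow({new: row[old].strip() for old, new in column_map.items()})
    return out.getvalue()


# Illustrative data only; not taken from any real instrument.
example_export = "Sample ID;Abs 280\nS1; 0.52\nS2;0.61\n"
example_map = {"Sample ID": "sample_id", "Abs 280": "absorbance_280nm"}
converted = convert_export(example_export, example_map)
```

The converted text can then be read by any program that accepts comma-separated values, which removes the need to retype the numbers by hand.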
To achieve data integration, we must first address the data standards issues. While it is far harder than it sounds, research labs must carefully consider the software they use to create their data and what formats they can produce it in. Additionally, vendors should assist by not using difficult-to-integrate, proprietary formats.
Another big issue for data integration is being consistent with the data you collect/produce. Theoretically data produced directly from instruments should be consistent with other datasets produced by the same instruments, but if there is any human intervention in the data (eg, adding headings, units, terms) this risks inconsistencies. A round table discussion at a recent conference heard multiple delegates note that their teams tend to record data in their own ways, leading to a lot of inconsistencies.
Inconsistency issues should, theoretically, be easier to overcome because they are human driven, although this isn’t always straightforward. Ideally a lab would set some internal guidelines for how to structure data (eg, what units, terms, etc, to use) and would ensure that datasets are produced with descriptive metadata such that other researchers could use them.
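As an illustration of what such internal guidelines might look like in practice, the sketch below checks a dataset's column terms, units and metadata against a lab's agreed conventions. Everything named here is a hypothetical assumption for illustration: the allowed units, the required metadata fields and the function `check_dataset` would need to be replaced by whatever a given lab actually agrees on.

```python
# Hypothetical lab conventions: one agreed unit per measured quantity,
# plus the metadata fields every dataset is expected to carry.
ALLOWED_UNITS = {"mass": "mg", "volume": "mL", "temperature": "degC"}
REQUIRED_METADATA = {"operator", "instrument", "date"}


def check_dataset(metadata, columns):
    """Return a list of problems; an empty list means the dataset conforms.

    metadata -- dict of descriptive fields attached to the dataset
    columns  -- dict mapping each column term to the unit it was recorded in
    """
    problems = []
    # Report any required metadata fields that are absent.
    for key in sorted(REQUIRED_METADATA - set(metadata)):
        problems.append(f"missing metadata field: {key}")
    # Report column terms or units that differ from the agreed conventions.
    for name, unit in columns.items():
        expected = ALLOWED_UNITS.get(name)
        if expected is None:
            problems.append(f"unknown column term: {name}")
        elif unit != expected:
            problems.append(f"column {name} uses unit {unit}, expected {expected}")
    return problems


# Illustrative usage with made-up values.
issues = check_dataset({"operator": "A. Researcher"}, {"mass": "g", "weight": "mg"})
```

A check like this could be run before a dataset is shared, so that inconsistencies introduced by hand (a different heading, a different unit) are caught while they are still easy to fix.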
What are the challenges faced when using electronic systems?
There are several challenges regarding electronic lab notebook (ELN) usage. After many lengthy discussions and considering my research over the last eight years, I believe the first challenge may be the term ELN itself. It carries a stigma. Researchers have both an impression and expectation of what an ELN should be, and I feel like this has evolved. ELNs were originally introduced to replace paper lab notebooks, creating misconceptions ranging from ‘this means I can’t use my paper notebook’ to ‘this will now do everything in one place’. Neither is true. We must therefore reassess the best use of ELNs by considering what we need from them with fresh eyes.
Having said that, there are obviously some challenges associated with the tools themselves. One is proprietary data formats. Many researchers have expressed great concern that if they put their data into an ELN they won’t be able to retrieve it in any useful form (or it will potentially cost a lot of money to achieve this). Interestingly, when our team researched import and export formats (based on ELNs that were active and had available documentation on this matter), the available import formats were substantially greater than the export formats.
What are the other issues to consider?
Another challenge is data security and trust. Whether rightly or wrongly, researchers don’t necessarily trust ELN systems, particularly if they are in the cloud…While trust is important, some of this comes down to a lack of education and understanding on data security coupled with some learned behaviour that needs to be addressed.
Another issue with respect to using ELNs is that labs are hostile environments for technology. There is not always room for a laptop, there are concerns about chemical spills and researchers may find having to type in information while conducting their experiments very intrusive. Additionally, computers that do exist in the lab are usually there to be used with certain equipment and may have outdated operating systems or not be connected to the internet.
Another huge issue – and perhaps most challenging to address – is the human element. Adoption of any technology is tricky and requires buy-in from everyone in the lab. Persuading researchers to change their methods and use new software is a difficult task. Researchers also consider note taking to be very personal, thus making them adhere to a structured approach is often unappealing.
Social theories of adoption suggest that success requires the following:
Users must believe that said technology will be easy to use
Users must see the value of using the technology
Users must believe that it will make measurable improvements to their work.
Naturally we know that capturing our work digitally, having it all in one place and not leaving it lying around in a dusty lab book is a better option, but the steps to progress to full ELN adoption are complex and require a change of attitude, willingness to learn, training and a complete overhaul of lab culture.
Additionally, whilst many ELNs and LIMS have made leaps and bounds with respect to helping scientists record their processes, it can still be arduous for researchers to facilitate and capture user workflows, and we need methods that enable researchers to achieve this with less effort/time.
Finally, the ELN market is saturated. Some ELNs are trying to be a one‑stop‑shop for all domains, which is often an issue because specific subdomains will find that they don’t have the required functionality. Alternatively, some are very niche, which is great for those specific groups that want to use them, but makes wider adoption trickier. Ultimately, different groups should carefully consider their requirements before choosing an ELN and be aware that what works for one group won’t necessarily work for another.
What technologies are needed for the ‘lab of the future’?
I fully believe that the lab of the future will not involve a keyboard. One of the major barriers to capturing information digitally in the laboratory is having to use a keyboard. We are at a stage where a lot of the instrument-generated data is captured electronically, and yet the data and information that ends up in paper lab books is the recording of the process, observations and some results depending on the type of experiment.
There are several key reasons that researchers don’t capture this information digitally in the first place. First, it is intrusive and time-consuming to type it out. Second, researchers are often wearing gloves as they are handling dangerous chemicals and so find it too intrusive to consider removing those gloves mid-experiment to type something out. Third, computers in laboratories are often dedicated to specific pieces of equipment and/or are outdated and using legacy systems (necessitated by the instruments) and so don’t necessarily have access to the internet or required software. Finally, the lab can be a hostile environment for technology, given concerns about chemical spills, for example.
If we could provide other mechanisms for recording this information that did not necessitate use of a keyboard, processes could be streamlined and we could truly move towards the lab of the future.
I think that the two key technologies will be smart labs and hybrid devices. Smart homes are commonplace now, so why not smart labs? Several companies are now working on voice-based tools for the lab including smart laboratory assistants. I envisage this technology as something that is core to the lab of the future.
There is also great promise for hybrid devices (eg, smart notebooks). While these devices may have some issues for those wearing gloves, if researchers could scribble down notes in a device that would act like paper and yet still capture the notes electronically, this would represent a great stride for the lab of the future.
About the author
Samantha Kanza, PhD is a Senior Enterprise Fellow at the University of Southampton in the UK. She leads a pathfinder on Process Recording as part of the Physical Sciences Data Infrastructure (PSDI) initiative, and co‑ordinates the AI 4 Scientific Discovery Network run out of Southampton and the Future Blood Testing Network run out of Reading. Samantha works in the interdisciplinary research area of applying computer science techniques to the scientific domain, specifically through use of Semantic Web technologies and artificial intelligence. Her research includes looking at electronic lab notebooks and smart laboratories to improve the digitalisation and knowledge management of the scientific record using Semantic Web technologies, and using IoT devices in the laboratory.