High-Throughput Platforms Big Data in Healthcare: Potential Impact, Challenges and Opportunities in Data Integration and Transformation

In today’s world, advancement in technology has become synonymous with “never quenching one’s thirst.” Big data is a fast-growing complex process in which a large amount of data can work wonders by providing insights that lead to better strategic moves and new opportunities for business growth. The main task of big data is to collect data, store it, and transform it into a unified data structure.
Healthcare industries have been slower to capitalize on big data than other industries. But today, many data-rich organizations are focused on using big data and analytics to make better decisions in treatment, personalized medicine, and patient health. The volume of data collected in hospitals during treatments and across clinical trials is a gold mine for physicians. Our paper briefly describes the potential impact of big data in healthcare.
This white paper also describes the notable concerns in big data, which are increasing day by day as the volume of medical data grows. The challenges associated with big data include insufficient understanding of big data, uncertainty in converting data into the big data platform to provide useful insights, difficulties in data sharing, security problems, a lack of technology specialists, and high cost. This paper also presents solutions for these challenges, such as data lakes and optimized energy consumption to reduce cost, cloud computing, and a two-way authentication process for high security.

BIG DATA IN HEALTHCARE

Information has long been the key to new and better developments. Data collection is therefore a crucial part of every organization’s effort to identify current trends and forecast future ones. The 21st century, a ‘Modern Era’, is also known as the era of Big Data in the fields of information technology (IT), science, and engineering. The term ‘Big Data’, introduced in the 1990s, refers to a large amount of data that can be collected, analyzed, and modified to generate significant revenue by providing insights that lead to better strategic moves and new business opportunities.

In the healthcare industry, big data is driving a revolutionary shift by collecting, protecting, and analyzing information that is too complex to be handled by traditional methods of data processing. In today’s world, every sector of the healthcare system, mainly hospitals and organizations, is generating and analyzing big data for various purposes. The use of big data has been increasing tremendously because the large amount of information can be analyzed to give real-time medical care, increase revenue, and improve health outcomes. Monitoring software and devices, which can generate alerts and share patient information with the respective healthcare provider, depend mainly on IT for their development and management. A survey predicted that big data would be a primary driver of innovation in the healthcare industry in 2019. According to the survey, 38% of healthcare industry participants predicted that big data analytics would be the most significant trending technology in the pharmaceutical industry, followed by 33% who predicted artificial intelligence (AI).

Different Sources of Big Data in Healthcare

In the healthcare industry, the sources of big data include patients’ medical and hospital records, personal health records (PHR), observations from medical examinations, and many other healthcare data components that allow electronic storage, retrieval, and modification of health and medical records.

Electronic health records (EHR) are computerized medical records containing information about a patient’s past and present health condition. The EHR allows organizations to retrieve data more quickly and improves public health supervision by providing real-time reporting of disease outbreaks. An advantage of using an EHR is easy access to the entire medical history of a patient, which can also be used for future research.
The electronic medical record (EMR) stores medical information gathered from patients electronically by clinicians. All practitioners are now required to electronically record all medical data.

Dimensions of Big Data

In the healthcare industry, big data has become important for all operational and clinical tasks, including health management, strategic decision-making, quality standards, predictive analytics, and revenue management. Therefore, the complexity of big data has been broken down into five dimensions: volume, velocity, variety, veracity, and value.

Three vital dimensions of big data are volume, velocity, and variety (the 3V’s), which have become the standard definition of big data.

  • Volume indicates the large amount of clinical information, which includes medical records, genomics, clinical trial data, 3D imaging, FDA submission data, etc. High-throughput technologies, which allow effective storage and manipulation of data, now make it possible to generate a large volume of data every minute. The volume of data collected in hospitals during treatments and across clinical trials is a gold mine for physicians.
  • Velocity indicates the speed or rate at which data is collected for further analysis in near real-time. In healthcare, some of the most important data, such as the vital sign records of patients, must be updated and displayed immediately.
  • Variety indicates the ability to classify the different types of organized and unorganized data that a system can collect. In healthcare, data can take many forms, including text, numbers, graphics, images, and coded data.

Additional V’s used by others to describe big data include veracity and value.

  • Veracity indicates the degree of assurance that data of variable quality can be trusted, so that big data outcomes are as error-free as possible. Big data cannot be 100% accurate, but data assurance is critical in healthcare, where important decisions rely on having the correct information, especially when the data is disorganized.
  • Value indicates improved outcomes, strategic decision-making, and better business efficiency.

Overall, more than 2.5 million terabytes of data are created each day. For example, 3.5 billion searches are performed on Google, 0.5 million videos are uploaded to YouTube, and 0.9 billion photos are uploaded to Facebook every day.

Advantages of Big Data in the Healthcare Industry

The use of big data provides benefits in the healthcare sector, such as more accurate diagnoses, data protection, fewer medication errors, and various other insights delivered in a timely manner.

The prominent benefits of big data are:

  • Provide thousands to millions of data points within a few hours to days, far more quickly than the traditional standard of scientific findings.
  • Collect and analyze all the structured and unstructured data to predict patients at risk for disease and match treatments with outcomes.
  • Create a 360-degree view of patients, physicians, and consumers.
  • Increase hospital growth by improving patient care and health outcomes and by speeding new treatments to market.
  • Execute gene sequencing more effectively to make genomic analysis a part of medical care.
  • Provide an unbiased collection of data, resulting in accurate analysis of important patterns.
  • Diagnose diseases at their initial stages to make treatment more precise and effective.
  • Address numerous questions regarding certain developments and outcomes by drawing on a large amount of historical data.
  • Provide statistical tools to improve the design of the clinical study and participant recruitment, thus minimizing the risk of trial failure.
  • Identify new adverse effects before products reach the market, thus enabling cost savings.
  • Provide real-time tracking of clinical trials.
  • Detect fraud or mistakes quickly by using advanced algorithms to go through different reports.

2. POTENTIAL IMPACT OF BIG DATA IN HEALTHCARE

2.1 Big Data in Cancer Care

Cancer remains a leading cause of death worldwide. Pharmaceutical companies and health authorities are engaged in the development of drugs for cancer control, and various strategies have also been adopted for the prevention of cancer.

Big data provides opportunities to address these issues by refining the theoretical models designed to understand cancer. The goal of big data is to collect pre-diagnosis and pre-treatment data that can be combined with clinical data to make feasible predictions (predictive analysis) and so improve cancer care. This requires analyzing historical patient data, but 96% of the available cancer data is not analyzed.

To solve this issue, Flatiron Health developed cloud-based oncology software known as Oncology Cloud (OncoCloud™) to collect data during diagnosis and treatment and then make it available to physicians to advance their research on cancer. OncoCloud™ includes:
  • OncoAnalytics® for deep company insights
  • SeeYourChart® for sharing lab data with patients
  • OncoEMR® for EMR
  • OncoBilling® for generating claims

2.2 Big Data Helps Fight Ebola Virus

Ebola virus infection is a rare and deadly disease that causes fever, body pain, bleeding, and organ failure. The Ebola virus outbreaks in Africa have demonstrated that any country with weak treatment options is in danger.

Big data plays a crucial role in detecting disease outbreaks using location attributes. IBM Big Data Analytics, through the location tracker, can predict the Ebola virus infection and curb the spread of epidemics. This provides information about the most affected areas to help plan for treatment centers.
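As a rough illustration of how location attributes can point to the most affected areas, the sketch below counts reported cases per district and ranks the districts. It is a minimal Python example under stated assumptions, not IBM's method: the `district` field, the sample reports, and the `most_affected_areas` helper are all hypothetical.

```python
from collections import Counter


def most_affected_areas(case_reports, top_n=3):
    """Rank districts by number of reported cases (hypothetical helper)."""
    # Each report is assumed to be a dict carrying a "district" location attribute.
    counts = Counter(report["district"] for report in case_reports)
    # most_common returns (district, case_count) pairs, highest count first.
    return counts.most_common(top_n)


# Made-up sample reports, for illustration only.
sample_reports = [
    {"case_id": 1, "district": "North"},
    {"case_id": 2, "district": "North"},
    {"case_id": 3, "district": "East"},
    {"case_id": 4, "district": "North"},
    {"case_id": 5, "district": "East"},
    {"case_id": 6, "district": "South"},
]

for district, cases in most_affected_areas(sample_reports, top_n=2):
    print(district, cases)
```

A real surveillance system would work with far richer location data and statistical models; the ranking here only shows how a location attribute can guide where treatment centers are planned.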

2.3 Big Data in Precision Public Health

Precision medicine is an emerging approach to customizing treatments that aims to deliver the right medicine at the right time to the right patient by using new data and technologies. Big data has improved the precision medicine approach by delivering a volume and variety of organized and unorganized data to physicians. Big data helps in predicting risk, targeting therapies, and performing disease surveillance.
The precision medicine knowledgebase (PreMedKB) is a large database that compiles information related to precision medicine, such as diseases, drugs, genes, and variants. PreMedKB is a user-friendly database that assists physicians and researchers in gathering genetic information about patients.

Currently, the PreMedKB database consists of approximately 311,678 variants, 66,437 genes, 18,185 diseases, and 8,604 drugs. This database combines information from various sources and presents around 496,689 relationships among variants, diseases, genes, and drugs.

2.4 Big Data in Clinical Trials

Big data plays an important role in modernizing the methods by which clinical trials are carried out. With big data analytics, researchers can improve clinical trial design, site selection, risk and cost reduction, and overall decision-making. One of the main reasons for clinical trial failure is the insufficient enrollment of patients. Big data improves patient recruitment by helping to identify the patients who are most likely to respond to the medicine based on their genetic profile. A survey estimated that the majority of clinical trials are now using preliminary online analysis to help identify the potential efficacy and safety of the drug.

3. CHALLENGES IN DATA INTEGRATION AND TRANSFORMATION AND OPPORTUNITIES TO SOLVE THEM

From data collection to data analysis, big data integration is a complex process. Data integration is hampered by the mix of structured, semi-structured, and unstructured data and by the information barriers among organizations, hospitals, and institutions. During this integration, both IT specialists and business sponsors face several challenges. The main challenges with big data in healthcare are:

3.1 Insufficient Understanding of Big Data

Most companies fail to understand the basics of big data, such as how it actually works, what infrastructure it requires, and what benefits it offers. IT professionals must therefore organize regular training to ensure that big data is understood.

3.2 Converting Data into the Big Data Platform to Provide Useful Insights

A major challenge of big data is sorting, analyzing, and manipulating a large amount of unorganized information, which is harder to handle than organized information and can result in the mismanagement of information. With unsynchronized data, it becomes difficult to determine which data point will provide useful insights. When information is collected from various sources at different times and speeds, it can get out of sync with the systems. Inconsistent, duplicate, or invalid data might then lead to wrong insights, which eventually cause great damage to the big data environment. Hence, it becomes difficult to manage the quality of the data.
Big data can never be 100% accurate, but to minimize this problem, there is first a need to create a proper model for big data that compares the data from every perspective and then matches and merges the records. The characteristics needed to manage the quality of data are:

  • Organized structure – The data should be in a defined format that complies with all the requirements
  • Consistency – There should be logical relations rather than duplications or gaps
  • Completeness – Data should contain all the needed elements
  • Accuracy – Data curation should reflect the real state of things, i.e., give true results
  • Auditing – Regular manual and automatic audits should be conducted to maintain data quality
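The match-and-merge step and two of the quality characteristics above (consistency and completeness) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production data model: the `PatientRecord` fields, the choice of `patient_id` as the matching key, and the helper names are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional, Tuple

# Fields every merged record is expected to contain (assumed for this sketch).
FIELDS = ("name", "birth_date", "blood_type")


@dataclass
class PatientRecord:
    patient_id: str
    name: Optional[str] = None
    birth_date: Optional[str] = None  # ISO date string, e.g. "1980-02-01"
    blood_type: Optional[str] = None


def match_and_merge(
    records: List[PatientRecord],
) -> Tuple[Dict[str, PatientRecord], List[Tuple[str, str]]]:
    """Match records by patient_id and merge them into one record per patient.

    Missing values are filled from later records. When two records disagree
    on a field, the first value is kept and the conflict is reported.
    """
    merged: Dict[str, PatientRecord] = {}
    conflicts: List[Tuple[str, str]] = []
    for rec in records:
        if rec.patient_id not in merged:
            merged[rec.patient_id] = replace(rec)  # work on a copy of the input
            continue
        current = merged[rec.patient_id]
        for field in FIELDS:
            old, new = getattr(current, field), getattr(rec, field)
            if old is None:
                setattr(current, field, new)  # fill a gap
            elif new is not None and new != old:
                conflicts.append((rec.patient_id, field))  # consistency issue
    return merged, conflicts


def missing_fields(record: PatientRecord) -> List[str]:
    """Completeness check: list the expected fields that are still empty."""
    return [f for f in FIELDS if getattr(record, f) is None]
```

Records for the same patient arriving from different sources are combined into one, while conflicting values and remaining gaps are reported for a manual or automatic audit instead of being silently overwritten.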

3.3 Security Challenges

Confidentiality in healthcare big data is another concern, as information about patients is more sensitive than other types of big data. Although big data technologies provide data security, many challenges remain, and no complete solution has yet emerged.
When everyone in an organization or hospital works with patients’ personal data, the risk of a privacy breach rises, which is a major concern.

To minimize this problem, big data systems need to be designed with security first. Organizations should select good big data vendors with a well-supported distribution system and security. To protect data privacy, organizations should limit access to a few specialists rather than an entire team. The security models related to big data in healthcare are:

  • Cloud Computing in Healthcare Big Data: Cloud computing provides security for healthcare data. It also makes data sharing very easy for users.
  • Two-Way Authentication Process: A process in which only users who have access can modify the data. Users first enter their login details, then enter a one-time password sent to their mobile phone or email, and only then gain access to cloud data storage.
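The two steps of the authentication process described above (login details, then a one-time password) can be sketched as follows. This is a minimal in-memory Python illustration, not a hardened implementation: the `TwoStepLogin` class, the six-digit code, the five-minute validity window, and the single-attempt rule are assumptions, and delivery of the code by SMS or email is left out.

```python
import hashlib
import hmac
import secrets
import time

OTP_TTL_SECONDS = 300  # assumed validity window for a one-time password


def hash_password(password: str, salt: bytes) -> bytes:
    # Store a salted PBKDF2 hash instead of the plain password.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


class TwoStepLogin:
    """Hypothetical in-memory login service with a password step and an OTP step."""

    def __init__(self):
        self._users = {}    # username -> (salt, password_hash)
        self._pending = {}  # username -> (otp, expires_at)

    def register(self, username, password):
        salt = secrets.token_bytes(16)
        self._users[username] = (salt, hash_password(password, salt))

    def start_login(self, username, password, now=None):
        """Step 1: check the login details and issue a one-time password.

        Returns the code to be delivered to the user's phone or email,
        or None if the login details are wrong.
        """
        entry = self._users.get(username)
        if entry is None:
            return None
        salt, stored = entry
        if not hmac.compare_digest(stored, hash_password(password, salt)):
            return None
        otp = f"{secrets.randbelow(10**6):06d}"
        now = time.time() if now is None else now
        self._pending[username] = (otp, now + OTP_TTL_SECONDS)
        return otp

    def finish_login(self, username, otp, now=None):
        """Step 2: grant access only if the one-time password matches in time."""
        # pop() makes every issued code usable for a single attempt only.
        pending = self._pending.pop(username, None)
        if pending is None:
            return False
        expected, expires_at = pending
        now = time.time() if now is None else now
        return now <= expires_at and hmac.compare_digest(
            expected.encode("utf-8"), otp.encode("utf-8")
        )
```

Access to cloud data storage would be granted only after `finish_login` returns `True`. A real deployment would add rate limiting, persistent storage, and a secure delivery channel for the code.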

3.4 Lack of Technology Specialists

The complexity of the data is increasing rapidly, and the size of healthcare data was expected to reach around 40 ZB in 2020. Data on this scale creates many difficulties in the data analysis process and poses significant challenges to traditional computing technology. Even some commonly used big data technologies face major challenges. Hadoop, for example, solves the storage issue of big data and improves the speed of operation but has technical challenges with security and storage. Similarly, cloud computing also has security issues.
These challenges demand the development of new tools with alternative data layouts to increase speed, maximize security, and identify actionable insights, which in turn requires talented experts or experienced scientists. Finding the right person with the right skill set has become another very difficult challenge: only a few companies worldwide have mastered the core technology of big data, and the rest of the world still needs technology specialists.

3.5 High Cost of Big Data

Big data provides big business benefits but hides high costs and complexity barriers that organizations struggle with afterward. Big data projects involve lots of expenses, mostly for software development, configuration, and maintenance. In healthcare, the government doesn’t allocate sufficient funds to accelerate the development of big data.

There are cost-effective hybrid solutions in which half of the data is stored and processed in the cloud and the other half on-premises.

Data lakes provide cheap data storage by capturing and storing raw data at low cost, so that data management transformations, processing, and analytics can be performed for specific use cases. This approach has shown positive results through increased speed and quality of web search and improved behavior analysis.

Optimized energy consumption minimizes energy costs by reducing power consumption by a factor of 5 to 100.

SUMMARY OF KEY TAKEAWAYS

Big data has a potentially large impact on the healthcare industry, as it offers a variety of benefits. However, big data integration is a complex process, and it must overcome the complications that arise from the mix of structured, semi-structured, and unstructured data as well as from the information barriers among organizations, hospitals, and institutions.

Author: Shikha Sharma

Reviewer: Samyukta
