Medical Big Data: Neurological Diseases Diagnosis Through Medical Data Analysis

TLDR
The challenges of handling medical big data are explored, the concept of the computer-aided diagnosis (CAD) system and how it works is introduced, and a survey of CAD methods developed for the diagnosis of neurological diseases is provided.
Abstract
Diagnosis of neurological diseases is a growing concern and one of the most difficult challenges for modern medicine. According to the World Health Organisation’s recent report, neurological disorders, ranging from epilepsy, Alzheimer’s disease and stroke to headache, affect up to one billion people worldwide. An estimated 6.8 million people die every year as a result of neurological disorders. Current diagnosis technologies (e.g. magnetic resonance imaging, electroencephalogram) produce huge quantities of data (in size and dimension) for the detection, monitoring and treatment of neurological diseases. In general, analysis of those medical big data is performed manually by experts to identify and understand the abnormalities. It is a really difficult task for a person to accumulate, manage, analyse and assimilate such large volumes of data by visual inspection. As a result, the experts have been demanding computerised diagnosis systems, called “computer-aided diagnosis (CAD)”, that can automatically detect neurological abnormalities using the medical big data. Such a system improves the consistency of diagnosis, increases the success of treatment, saves lives and reduces cost and time. Recently, some research has been performed on the development of CAD systems for the management of medical big data for diagnosis assessment. This paper explores the challenges of medical big data handling and introduces the concept of the CAD system and how it works. This paper also provides a survey of developed CAD methods in the area of neurological diseases diagnosis. This study will help the experts to gain some idea and understanding of how the CAD system can assist them in this respect.

Data Sci. Eng. (2016) 1(2):54–64
DOI 10.1007/s41019-016-0011-3
REVIEW
Siuly Siuly¹ · Yanchun Zhang¹,²
Received: 26 May 2016 / Accepted: 23 June 2016 / Published online: 27 July 2016
© The Author(s) 2016. This article is published with open access at Springerlink.com
✉ Siuly Siuly
siuly.siuly@vu.edu.au
Yanchun Zhang
yanchun.zhang@vu.edu.au
¹ Centre for Applied Informatics, College of Engineering and Science, Victoria University, Melbourne, Australia
² School of Computer Science, Fudan University, Shanghai, China
Keywords Medical big data analysis · Computer-aided
diagnosis system · Neurological diseases diagnosis
1 Background
In the world of medicine, neurological disorders are the
most challenging to diagnose, manage and monitor due to
the complex nervous system. Diagnosis of neurological dis-
eases and their treatments demand high precision, dedication
and experience. Nowadays, modern technology and sys-
tems allow neurologists to provide proper neurological care.
Neurological disorders are diseases of the body’s nervous
system. Structural, biochemical or electrical abnormalities
in the brain, spinal cord or other nerves can result in a
range of symptoms. There are more than 600 diseases of
the nervous system, such as epilepsy, dementias, Alzheimer’s
disease and cerebrovascular diseases including stroke, multi-
ple sclerosis, Parkinson’s disease, migraine, neuroinfections,
brain tumours and traumatic disorders of the nervous system
such as brain trauma and autism. According to the World
Health Organisation (WHO) report, more than 50 million
people suffer from epilepsy [1]. It is estimated that 35.6
million people have dementia, with 7.7 million new cases
every year; Alzheimer’s disease is the most common cause
of dementia and may contribute to 60–70 % of cases [2].
These disorders affect people in all countries, irrespective
of age, sex, education or income. Neurological disorders are
typically devastating to affected patients and their families,
often diminishing the patient’s quality of life. A rapid and timely
diagnosis of these diseases can save and significantly improve
patients’ lives by enabling appropriate procedures. Recently,
a variety of advanced diagnostic technologies have been used
to detect, manage and treat neurological disease, such as brain
wave tests (Electroencephalography or EEG), computerised
tomography (CT scan), magnetic resonance imaging (MRI
scan), electromyography (EMG) and arteriogram (also called
an angiogram), positron emission tomography (PET scan or
PET imagery). These technologies are vital tools that help
physicians confirm or rule out the presence of a neurological
disorder or other medical conditions.
The huge amounts of medical data produced by these
technologies are an important source for
diagnosis, therapy assessment and planning. In general, med-
ical image data range anywhere from a few megabytes for a
single study to hundreds of megabytes per study (e.g. thin-slice
CT studies comprise up to 2500+ scans per study)
[3,4]. Such data require large storage capacities if stored for
long term. Due to high volume, velocity and complexity of
the medical data, it is really difficult for the experts to accu-
mulate, manage, analyse and assimilate the large volumes of
data for diagnosis, therapy assessment and planning. Integra-
tion of high quantity physiological data is the grand challenge
for the experts to deliver clinical recommendations. Support-
ing medical experts or neurologists in the process of finding
a correct diagnosis to a hypothesis in a timely manner is
very desirable to improve a patient’s outcome. In general, the
analysis of those vast amounts of information is performed
manually through visual inspection by neurologists/experts
to identify and understand abnormalities from medical imag-
ing and signal data [5]. The visual inspection of such huge
data is not a satisfactory procedure for precise and reliable
diagnosis as it is time-consuming, error prone and subject
to fatigue. Thus, medical analytics calls for the development of
automatic decision systems that use computational intelligence
for fast, accurate and efficient diagnosis, prognosis
and treatment processes.
Recently, the idea of an automated CAD system has been
introduced to help experts/neurologists detect neurological
abnormalities from medical big data. The algorithms of major
CAD systems are developed using techniques and theories from
the pattern recognition field, and CAD is therefore regarded
as one of the pattern recognition fields [6]. A CAD system
consists of data pre-processing, feature extraction and
classification, as discussed in Sect. 3. CAD systems assist
the experts in accurately interpreting medical big data, so
that the accuracy and consistency of diagnosis can be improved
and the analysis time reduced. Many methods and frameworks
based on the CAD concept have been developed for medical image
and signal analysis, as discussed in Sect. 4. The CAD system
is cost-effective and efficient and can be used as a decision
support system by the experts in the diagnosis and treatment
of neurological disorders [6].
Section 2 of this paper provides brief information about
current medical technologies in the neurological disease
diagnosis and also discusses challenges in medical big data
analysis. Section 3 introduces the CAD system and briefly
describes how it works for the automatic diagnosis
of neurological diseases. Section 4 provides a short review
of the CAD system on the diagnosis of various neurological
diseases, and Sect. 5 concludes the paper with the potentials
of CAD systems in the future.
2 Current Medical Technologies for Medical Data
Collections and Challenges in Medical Big Data
Analysis
Currently, neurological diseases are diagnosed by using var-
ious medical techniques such as electroencephalography
(EEG), computerised tomography (CT scan or CAT scan),
magnetic resonance imaging (MRI scan), electromyography
(EMG), positron emission tomography (PET scan or PET
imagery), arteriogram (also called an angiogram) and single-
photon emission-computed tomography (SPECT). These
diagnostic tests help physicians confirm or rule out the
presence of a neurological disorder or other medical conditions.
To diagnose brain-related conditions such as
epilepsy, certain seizure disorders, degenerative disorders,
sleep disorders, autism, brain tumours and migraines,
EEG is used to record brain cell activity through the skull
and to study the functional states of the brain, helping
physicians to detect and monitor brain abnormalities [7].
Variations or abnormalities in brain waves indicate different
types of neurological disorders. To diagnose neurological
conditions such as tumours, blood clots and degenerative disease,
to locate strokes and to identify brain abnormalities,
a CT or CAT scan is used to obtain cross-sectional images
of the body using X-rays and a computer [8]. Such tests are
mainly used for swelling and lesions in certain areas, broken
bones, heart disease and internal bleeding.
In finding brain and spinal cord abnormalities, MRI tests
are valuable to investigate detailed images of body structures
including tissues, organs, bones and nerves [9–11]. MRI tests
help physicians to diagnose torn ligaments, tumours, circula-
tion (blood flow) problems, eye disease, inflammation (e.g.
arthritis) and infection. MRI scans are also used to detect
and monitor degenerative disorders such as multiple sclerosis
and can document brain injury from trauma. If the physicians
need to investigate the brain in action (e.g. speaking or mov-
ing) and to pinpoint areas of the brain that become active and
note how long they stay active, fMRI is a suitable diagnostic
test. The fMRI test measures small changes in blood flow as
a person completes tasks while in the MRI scanner [12]. The
fMRI imaging process is used to assess brain damage from
head injury or degenerative disorders such as Alzheimer’s
disease and to identify and monitor other neurological disor-
ders, including multiple sclerosis, stroke and brain tumours.
As a follow-up to a CT or MRI scan, a PET test can be used
to provide the physician with a greater understanding of spe-
cific areas of the brain including two- and three-dimensional
pictures of brain activity. SPECT tests are also ordered as a
follow-up to an MRI to diagnose tumours, infections, degen-
erative spinal disease and stress fractures. In order to detect
abnormal electrical activity of muscle that can occur in many
diseases and conditions such as amyotrophic lateral sclero-
sis (ALS, Lou Gehrig’s disease), carpal tunnel syndrome,
muscular dystrophy, neuropathy, sciatic nerve dysfunction and
inflammation of muscles, an EMG scan is used to record
the electrical activity of muscles [13]. For detecting different
types of heart problems such as heart attack, coronary heart
diseases and stroke, ECG is used to record the heart’s electri-
cal activity to understand how the heart works [14]. To detect
blockage or narrowing of the vessels, an arteriogram is used to
obtain an X-ray image of the arteries and veins. To investigate spinal
nerve injury, herniated discs, fractures, back or leg pain and
spinal tumours, myelograms are used. Ultrasounds are used
to assess blood flow through various vessels, and transcranial
Doppler ultrasounds are used to view arteries and blood vessels
in the neck and determine blood flow and risk of stroke.
These medical technologies produce huge quantities of
complex, high-dimensional data that are an important
source for diagnosing neurological diseases and for treatment
and therapy planning. Medical big data analysis has the
potential to be a valuable tool, but implementation can pose
a challenge. It requires careful data analysis that can provide
authentic, accurate and reliable information for good
decision-making in disease diagnosis. In practice, in most
cases these data are interpreted visually by
experts/neurologists [15]. It is very natural
that clinicians are not always able to make optimal use of the
acquired data due to the limitations of the human eye–brain
system, limitations in training and experience and factors
such as fatigue and distraction. The medical data interpre-
tation by humans is limited owing to the non-systematic
search patterns of humans, the presence of structure noise
and the vast amounts of data. To handle such high volumes
of complex data, it is essential to use digital technologies
to support medical data analysis. Hence, there is an
ever-increasing need to develop CAD systems
for the experts/neurologists that can automatically make an
accurate assessment for the detection of different neurological
problems.
3 Computer-Aided Diagnosis System for
Automatic Diagnosis of Neurological Diseases
Recently, CAD has become very popular in medical and
diagnostic imaging for automatically detecting abnormalities
from medical big data sources. The basic concept of CAD
was proposed at The University of Chicago in the mid-1980s;
the idea was to provide a computer output as a
“second opinion” to assist experts in interpreting medical
data, so that the accuracy and consistency of diagnosis could
be improved and the analysis time reduced
[16–18]. The CAD system consists of three main steps [6]:
pre-processing, feature extraction and classification,
as shown in Fig. 1. In the pre-processing part, the acquired
medical data (e.g. medical image data or medical signal data) are
processed to remove noise, which reduces the complexity
and computation time of the CAD algorithms. The feature
extraction part of the CAD system is one of the most important
parts, where the biomarkers for disease identification are
extracted from the original source data. In the classification
part, the extracted feature vector is used
as input to the classifier model, which assigns the candidate to
one of the possible categories (e.g. abnormal or normal) according
to the output of the classifier. Generally, a CAD system can
be of two types. When a CAD system classifies all
candidates into two categories, such as abnormal and normal
candidates, it is called a two-class categorisation system.
On the other hand, if a CAD system can classify unknown
cases into more than two types of abnormalities,
it is called a multi-class categorisation system. Many
researchers are working to develop CAD schemes for detection
and classification of various kinds of abnormalities from
medical data.
Fig. 1 Diagram of the CAD system for automatically detecting abnormalities from medical big data
[Figure panels: Patient; Clinician; Computer-aided diagnosis (CAD) system; Diagnosis procedure in CAD: pre-processing, feature extraction, then pattern recognition and classification; Automatic disease diagnosis; Treatments, therapies or rehabilitation]
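The three CAD stages described above can be illustrated with a minimal sketch. It is not taken from any of the surveyed systems: the moving-average filter, the three features (mean, standard deviation, line length), the nearest-centroid classifier and the synthetic signals are all assumptions chosen only to keep the two-class example short and self-contained.

```python
from statistics import mean, pstdev

def preprocess(signal, window=3):
    """Pre-processing: reduce noise with a centred moving average (assumed filter)."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        smoothed.append(mean(signal[lo:hi]))
    return smoothed

def extract_features(signal):
    """Feature extraction: mean, standard deviation and line length (assumed features)."""
    line_length = sum(abs(b - a) for a, b in zip(signal, signal[1:]))
    return [mean(signal), pstdev(signal), line_length]

class NearestCentroid:
    """Classification: assign the label of the closest class centroid (toy classifier)."""

    def fit(self, vectors, labels):
        self.centroids = {}
        for label in set(labels):
            rows = [v for v, l in zip(vectors, labels) if l == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict(self, vector):
        def squared_distance(centroid):
            return sum((x - y) ** 2 for x, y in zip(vector, centroid))
        return min(self.centroids, key=lambda lab: squared_distance(self.centroids[lab]))

def cad_features(signal):
    """Chain the first two stages: pre-processing, then feature extraction."""
    return extract_features(preprocess(signal))

# Synthetic signals (hypothetical, for illustration only).
normal_1 = [0.0, 0.1, -0.1, 0.05, -0.05, 0.1, 0.0, -0.1]
normal_2 = [0.05, -0.05, 0.1, 0.0, -0.1, 0.05, 0.0, 0.1]
abnormal_1 = [0.0, 2.0, -2.0, 2.5, -2.5, 2.0, -2.0, 0.0]
abnormal_2 = [0.5, -2.0, 2.2, -2.4, 2.1, -1.9, 2.0, -0.5]

model = NearestCentroid().fit(
    [cad_features(s) for s in (normal_1, normal_2, abnormal_1, abnormal_2)],
    ["normal", "normal", "abnormal", "abnormal"],
)

unknown = [0.2, -1.8, 2.1, -2.2, 1.9, -2.0, 1.8, -0.3]
print(model.predict(cad_features(unknown)))
```

Adding a third label to the training set would turn the same sketch into a multi-class categorisation system, since the classifier simply picks the closest of however many centroids it has learned.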
As in pattern recognition, the performance of CAD systems
is assessed using k-fold cross-validation, the bootstrap
method, leave-one-out [19], etc. The free-response receiver
operating characteristic (FROC) and ROC curves are used
for evaluation of the overall performance of the CAD sys-
tems for various operating points. The FROC curve shows
the relationship between the sensitivity and the number of
false positives, which can be obtained by thresholding a
certain parameter of the CAD system or the output of the
classifier [6]. Recently, there has been a lot of research per-
formed on the development of the CAD systems for detecting
neurological problems such as epileptic seizures, dementia,
Alzheimer’s disease, autism, strokes, brain tumours, alco-
holism related neurological disorders and sleeping disorders.
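The evaluation ideas mentioned above can be sketched as follows, under stated assumptions. The function names, scores and labels are hypothetical. `k_fold_indices` splits sample indices so that every sample is tested exactly once (leave-one-out is the special case where k equals the number of samples), and `roc_points` produces one operating point per threshold on the classifier output. An FROC curve would be built the same way but would report the number of false positives per case instead of the false positive rate.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def roc_points(scores, labels):
    """Return (false_positive_rate, sensitivity) for each threshold.

    labels: 1 = abnormal, 0 = normal. A case is flagged as abnormal
    when its score is greater than or equal to the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

# Hypothetical classifier outputs for six cases.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

for train_idx, test_idx in k_fold_indices(len(scores), k=3):
    print("train", train_idx, "test", test_idx)

for fpr, sensitivity in roc_points(scores, labels):
    print(f"FPR={fpr:.2f} sensitivity={sensitivity:.2f}")
```

Lowering the threshold moves the operating point towards higher sensitivity at the cost of more false positives, which is the trade-off the ROC and FROC curves make visible.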
4 Research on Neurological Diseases Diagnosis
Through the CAD System
Recently, a number of automated computerised classification
methods have been proposed to diagnose neurological diseases.
They are sufficiently robust to handle data from different
scanners for many applications. The number of developed
CAD approaches is too large to review in a single article.
Thus, in this section, we provide a brief review covering
some of the essential and recent studies aimed at
assisting neurologists in the detection of neurological diseases.
4.1 Epilepsy and Epileptic Seizure Diagnosis
Epilepsy is one of the most common and devastating neurological
diseases worldwide. Epilepsy is characterised by
recurrent seizures [20,21]. Seizures are defined as sudden
changes in the electrical functioning of the brain, resulting
in altered behaviours, such as losing consciousness, jerky
movements, temporary loss of breath and memory loss. The
EEG is an important clinical tool which contains valuable
information for understanding epilepsy [7,22]. The chief
manifestation of epilepsy is the epileptic seizure, which can
involve a discrete part of the brain (partial) or the complete
cerebral mass (generalised). Over the past few years, numerous
epileptic seizure detection and prediction algorithms have been
developed in several countries throughout the world. Figure 2
shows an illustration of an epileptic EEG signal during seizure
activity. As can be seen in Fig. 2, the abnormal pattern of the
signal is clearly visible in the seizure period.
Fig. 2 An illustration of an EEG signal containing seizure [22]
More recently, Shen et al. [23] introduced a method based
on a cascade of wavelet-approximate entropy for feature
extraction in epileptic EEG signal classification. They
tested three existing classification methods, support vector
machine (SVM), k-nearest neighbour (kNN) and radial
basis function neural network (RBFNN), to determine which
performs best in such a cascaded EEG analysis
system. Acharjee and Shahnaj [24] used twelve Cohen
class kernel functions to transform EEG data in order to
facilitate time–frequency analysis. The transformed data
were used to form a feature vector consisting of modular energy
and modular entropy, and the feature vector was fed to an
artificial neural network (ANN) classifier. Siuly et al. [25]
introduced a computerised approach based on simple random
sampling (SRS) techniques and the least square support
vector machine (LS-SVM) to classify epileptic EEG signals.
In another work, Siuly and Li [26] developed a new algorithm
for feature extraction, called the optimum allocation approach,
which considers the variability of the observations within a
time window. The extracted features were then assessed
by various multiclass least square support vector machines
(MLS-SVM) for classifying epileptic EEG signals. Aslan et
al. [27] carried out a study to develop a classification method
for epileptic patients. Cases were classified into partial and
primary generalised epilepsy by
employing RBFNN and multilayer perceptron neural networks
(MLPNNs). The experimental results demonstrated that the RBFNN
model can be used as a decision support tool in clinical studies
to validate the classification of epilepsy. Li [28] proposed an approach
based on multi-resolution analysis to automatically indicate
epileptic seizures or other abnormal events in EEG. Song
and Liò [29] developed an EEG epilepsy detection scheme
based on entropy-based feature extraction and the extreme
learning machine. Subasi [30] applied a method for analysing
EEG signals using the discrete wavelet transform
and classifying them using an ANN. Gular et al. [31]
assessed the accuracy of recurrent
neural networks (RNN) employing Lyapunov exponents in
detecting seizures in EEG signals. For the detection of
epilepsy and seizure, Adeli et al. [32] developed a wavelet
chaos methodology for analysis of EEGs and the delta, theta,
alpha, beta and gamma sub-bands of EEGs. Siuly et al. [5]
introduced a clustering technique-based LS-SVM for EEG
signal classification. Akin et al. [33] sought a new solution
for diagnosing epilepsy.
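Several of the methods surveyed above build their features on a wavelet decomposition of the EEG signal. The following is a minimal sketch of that idea, assuming the Haar wavelet and sub-band energies as features. These choices, the function names and the toy signals are illustrative assumptions and do not reproduce the wavelets or features used in the cited studies.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    scale = 1 / math.sqrt(2)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approximation = [(a + b) * scale for a, b in pairs]
    detail = [(a - b) * scale for a, b in pairs]
    return approximation, detail

def subband_energies(signal, levels):
    """Energy of the detail coefficients at each level, then of the final approximation."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    energies = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in current))
    return energies

# Toy signals: a constant segment and a rapidly alternating one.
flat = [1.0] * 8
spiky = [1.0, -1.0] * 4

print(subband_energies(flat, levels=3))   # energy concentrated in the final approximation
print(subband_energies(spiky, levels=3))  # energy concentrated in the finest detail band
```

Because the Haar transform used here is orthonormal, the sub-band energies sum to the energy of the original signal, so the feature vector describes how that energy is distributed across scales. Such a vector could then be passed to any of the classifiers mentioned above.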
4.2 Dementia, Alzheimer’s and Parkinson Diseases
Diagnosis
Dementia refers to a diverse group of neurodegenerative
disorders caused by gradual neuronal dysfunction and the
death of brain cells. It can be defined clinically
as a syndrome that causes a decline in cognitive domains (i.e.
attention, memory, executive function, visual–spatial ability
and language) [34] and is common in the elderly.
According to the American Academy of Neurology summary
report, 10 % of people over the age of 65 and up to 50 % of
people over 85 have dementia [35]. In 2011, it was estimated
that nearly 300,000 people in Australia had dementia out of
a total population of 23 million. This number is anticipated
to increase to 900,000 by 2050 [36].
Dementia is classified into Alzheimer’s disease (AD),
Parkinson’s disease (PD), dementia with Lewy bodies,
Creutzfeldt–Jakob disease, normal pressure hydrocephalus,
vascular dementia and frontotemporal dementia [38,39]. AD is
the most well-known and common type of dementia: about
two-thirds of patients with dementia suffer from AD.
In this section, we provide a brief
review of the CAD methods developed for detecting dementia,
AD and PD from medical image data and signal data.
Figure 3 displays PET scans showing a large
area of normal activity in the brain of a normal person compared
with reduced activity in the brain of a person who has
dementia.
Fig. 3 An illustration image of brain activity in dementia [37]
A large number of automatic computer-assisted methods have been
developed for the identification of dementia.
Koikkalainen et al. [40] completed an extensive study on
various CAD methods for detecting
dementias using only structural MRI data. An extensive set
of features quantifying volumetric, morphometric and vascular
characteristics was extracted from the MRI images.
The classification process was performed
based on the Disease State Index methodology. Hirata et al. [41]
developed software based on voxel-based specific region
analysis for AD, which automatically analyses 3D MRI
data through a series of steps: segmentation, anatomical
standardisation and smoothing performed in software, followed
by Z-score analysis. Li et al.
[42] employed an SVM to characterise hippocampal
volume changes in AD and to differentiate AD patients
from healthy control subjects. Kloppel et al. [43] developed a
CAD method for the diagnosis of AD from MRI scans obtained
from two different centres and two different sets of equipment [28]
using a linear support vector machine. In their method, MRI images
were segmented into grey matter (GM), white matter and CSF
using SPM. Colliot et al. [44] developed an automated
segmentation method to help distinguish between patients with
AD, patients with mild cognitive impairment (MCI) and elderly controls.
Hamou et al. [45] proposed a computerised method based on
cluster analysis and decision trees for analysing
MRI images in AD diagnosis.
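The Z-score analysis mentioned above compares a patient's measurement in each brain region with the distribution of the same measurement in a control group. A minimal sketch follows. The regional values, the sign convention (positive means lower than the controls, as one would expect for atrophy) and the flagging threshold of 2 are assumptions made for illustration and are not taken from the cited software.

```python
from statistics import mean, stdev

def z_scores(patient, controls):
    """Per-region Z-score of one patient relative to a control group.

    Assumed sign convention: Z = (control mean - patient value) / control SD,
    so a positive score means the patient's value is below the control mean.
    """
    result = []
    for region, value in enumerate(patient):
        reference = [subject[region] for subject in controls]
        result.append((mean(reference) - value) / stdev(reference))
    return result

# Hypothetical grey-matter volumes (arbitrary units) for two regions.
controls = [
    [4.0, 7.0],
    [5.0, 7.5],
    [6.0, 8.0],
]
patient = [3.0, 7.5]

scores = z_scores(patient, controls)
flagged = [region for region, z in enumerate(scores) if z >= 2.0]  # assumed threshold
print(scores, flagged)
```

In this toy case the first region lies two control standard deviations below the control mean and is flagged, while the second region equals the control mean and is not.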
Neural changes associated with dementia can also be
detected through clinical biomarkers, such as EEG [34].
Numerous studies have been conducted on the CAD system
to deal with EEG changes associated with dementia. The
researchers developed CAD methods to identify the degree
of severity of dementia, and some studies support the possibility
of using EEG to detect dementia in its early stages [46–50]. For
example, Henderson et al. [51] detected early dementia pres-
ence in EEGs with high sensitivity and specificity [46,51].
EEG may play an important role in detecting and classifying
dementia because of its significant influence on dementia
abnormalities in terms of rhythm activity. Joshi et al. [52]
used neural network methods (NN) to classify AD by consid-
ering common risk factors. In that study, they also tested some
other machine learning techniques such as DT, bagging, BF