Big Data Analytics for Unstructured Clinical Data



Health Big Data (HBD) is more than just a very large amount of data or a large number of data sources. The data collected or produced during the clinical care process can be exploited at different levels and across different domains, especially for questions related to clinical and translational research. To leverage these big, heterogeneous, sensitive and multi-domain clinical data, new infrastructures intended to integrate, reuse and share data for research are emerging in most academic hospitals.

Yet a well-known challenge for the secondary use of HBD is that much of the detailed patient information is embedded in narrative text, mostly stored as unstructured data. The lack of efficient Natural Language Processing (NLP) resources dedicated to clinical narratives, especially for French, leads to the development of ad hoc NLP tools with narrowly targeted purposes. Moreover, scalability and real-time constraints are rarely taken into account for these potentially costly NLP tools, which makes them unsuitable for real-world scenarios.


The project addresses the scientific issues involved in leveraging these large, heterogeneous, unstructured clinical data at a large scale:

1) We propose to develop a new clinical record representation relying on fine-grained semantic annotation, made possible by new NLP tools dedicated to French clinical narratives.

2) Since the aim is to efficiently map this added semantic information to existing structured data for further analysis in Big Data infrastructures, the project also addresses distributed systems issues: scalability, management of uncertain data, privacy and stream processing.

The targeted applications are epidemiology and pharmacovigilance, clinical practice assessment and health care quality research.
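As a rough illustration of the two objectives, the sketch below annotates a French clinical sentence with semantic labels and attaches the result to a structured patient record. It is a toy under stated assumptions, not the project's method: the lexicon, the `DEMO:` codes, the `Annotation` class, the `annotate` and `enrich` functions and the record fields are all hypothetical, and simple dictionary matching stands in for the dedicated NLP tools.

```python
import re
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical toy lexicon: surface form -> (semantic type, made-up code).
# A real system would rely on a curated terminology, not this hand-written table.
LEXICON = {
    "diabète de type 2": ("Disorder", "DEMO:0001"),
    "hypertension": ("Disorder", "DEMO:0002"),
    "metformine": ("Drug", "DEMO:0003"),
}


@dataclass(frozen=True)
class Annotation:
    start: int      # character offset where the mention begins
    end: int        # character offset just past the mention
    text: str       # mention exactly as written in the note
    sem_type: str   # semantic type taken from the lexicon
    code: str       # illustrative concept code taken from the lexicon


def annotate(note: str) -> List[Annotation]:
    """Return case-insensitive lexicon matches, sorted by character offset."""
    found = []
    for term, (sem_type, code) in LEXICON.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, note, flags=re.IGNORECASE):
            found.append(Annotation(m.start(), m.end(), m.group(0), sem_type, code))
    return sorted(found, key=lambda a: a.start)


def enrich(record: dict, note: str) -> dict:
    """Return a copy of the structured record with annotation codes grouped by type."""
    by_type: Dict[str, List[str]] = {}
    for ann in annotate(note):
        by_type.setdefault(ann.sem_type, []).append(ann.code)
    enriched = dict(record)
    enriched["annotations"] = by_type
    return enriched


note = "Patient suivi pour un diabète de type 2, traité par Metformine."
record = {"patient_id": "P-001", "age": 64}  # invented structured fields
result = enrich(record, note)
print(result)
```

Keeping character offsets on each annotation lets later processing point back to the exact span in the narrative, while the grouped codes can be queried alongside the structured fields.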

Partners

CNRS-IRISA

External collaborator: CNRS-STL
