Sophia Genetics’ new space-saving and privacy-preserving technology for genomic big data

Posted: 6 December 2016

The new technology makes sure the privacy of genomic information is not compromised as patients’ data go through various processing steps…

Sophia Genetics has unveiled a new privacy-protecting technology for storing and accessing patients’ genomic information worldwide. The company, which designed SOPHiA (artificial intelligence for Data-Driven Medicine) and runs a large clinical genomics platform connecting 215 hospitals in 35 countries, decided to use its position in the genomics industry to ensure the secure democratisation of Data-Driven Medicine worldwide.


Developed with genomic data privacy and security experts from the Swiss Federal Institute of Technology Lausanne (EPFL) and biomedical researchers from Stanford University, the new technology ensures that the privacy of genomic information is not compromised as patients’ data go through various processing steps, from compression to storage and finally access by healthcare institutions.

Privacy of personal data

Adam Molyneaux, Chief Information Officer at Sophia Genetics, commented, “Privacy has become an increasingly serious concern given both the highly personal nature of genomic information, and the increasing number of requests for access to patients’ genomic information generated by healthcare institutions all along patients’ life. Preventing unsolicited use of personal data requires to not only encrypt data but also to define data access privileges, enabling selective retrieval of genomic data.”

Cost of data storage

The new genomic data privacy technology also tackles another challenge of the Data-Driven Medicine revolution: the cost of storing genomic data.

Adam Molyneaux explains, “The data of one human genome can go from 30 GB to 200 GB. And with new high speed DNA sequencing technologies driving costs of generating genomic data down, the growth rate of genomic data has more than doubled each year since 2007, outpacing Moore’s law.”

The world’s 20 largest biological research institutions already use over 100 petabytes of storage, which corresponds to more than US$1 million in storage maintenance costs every month.
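To put those figures in per-genome terms, the short Python sketch below works out the storage rate they imply. It is illustrative arithmetic only: it treats the quoted lower bounds (100 petabytes, $1 million a month) as exact, assumes decimal units (1 PB = 1,000 TB), and uses a hypothetical helper name, `cost_per_tb_month`, that does not come from Sophia Genetics or SECRAM.

```python
# Figures quoted in the article, treated here as exact values (they are lower bounds).
STORAGE_PB = 100              # petabytes used by the 20 largest institutions
MONTHLY_COST_USD = 1_000_000  # monthly storage maintenance cost in US dollars
TB_PER_PB = 1_000             # assumption: decimal units


def cost_per_tb_month(storage_pb: float, monthly_cost_usd: float) -> float:
    """Return the implied maintenance cost in USD per terabyte per month."""
    return monthly_cost_usd / (storage_pb * TB_PER_PB)


rate = cost_per_tb_month(STORAGE_PB, MONTHLY_COST_USD)

# Genome sizes quoted by Adam Molyneaux: 30 GB to 200 GB per human genome.
for genome_gb in (30, 200):
    monthly = rate * genome_gb / 1_000  # convert GB to TB
    print(f"{genome_gb} GB genome: about ${monthly:.2f} per month at ${rate:.2f}/TB")
```

On these assumptions the implied rate is about $10 per terabyte per month, so any percentage saved in compression translates directly into the same percentage saved on that recurring bill.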

‘…space-saving and privacy-preserving…’

Jean-Pierre Hubaux, Professor at EPFL and senior author of the publication, commented, “This new technology offers an effective space-saving and privacy-preserving solution for the storage of clinical genomic data. The currently available solutions were developed before the widespread usage of high-throughput technologies and do not consider effective protection when compressing genomic sequences; the current standard saves 34% storage on average in lossless compression. Our no-compromise solution uses 18% less storage while allowing for unprecedented levels of security in genomic data storage as well as selective retrieval.”

Presenting the new solution today, Jurgi Camblong, Sophia Genetics’ CEO and co-founder, declared, “I am proud that, in partnership with EPFL, we are raising the bar in genomic data privacy and at the same time lowering the cost of storage of genomic data. This is necessary for the large-scale application of personal genomics in research and clinical settings.”

Nicknamed SECRAM, the technology has been made available to the whole genomics community as open source.