QnGene.png QnGene

GENETIC FORMAT FOR HPC  


Extension of the Adam schema to include genotyping and clinical data and perform genetic analysis on a large scale.


 
     
             CHALLENGE 




 

Health data analysis is typically done using a pipeline of Python scripts that run SQL queries against a relational database. This approach is fast becoming obsolete because it cannot process the Big Data volumes found in genetic data. The AMPLab at UC Berkeley (see their technical paper) has published the Adam format, which can take advantage of recent open-source Big Data technologies. Having genetic data on a Big Data platform is useful, but precision medicine research also needs the clinical and demographic data obtained during clinical trials. This research aims to extend the Adam schema without impacting its current APIs. A first advantage is that all this data is colocated in an efficient binary format for access by Hadoop/Spark processing clusters (see recent publication here) (recent video instructions here).
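As a rough illustration of the idea of extending a genotype schema without touching its existing fields, here is a minimal Python sketch. The record types and field names (`GenotypeRecord`, `ClinicalRecord`, `hba1c`, and so on) are hypothetical stand-ins and are not the actual Adam schema, which is defined in Avro and is much richer.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


# Hypothetical, simplified stand-in for a genotype record.
@dataclass
class GenotypeRecord:
    sample_id: str
    contig: str
    position: int
    alleles: List[str]


# Hypothetical clinical/demographic extension, keyed by the same sample_id.
@dataclass
class ClinicalRecord:
    sample_id: str
    age: Optional[int] = None
    sex: Optional[str] = None
    measurements: Dict[str, float] = field(default_factory=dict)


def colocate(
    genotypes: List[GenotypeRecord], clinical: List[ClinicalRecord]
) -> List[Tuple[GenotypeRecord, Optional[ClinicalRecord]]]:
    """Pair each genotype with the clinical record of the same sample.

    Genotype records are passed through unchanged, so code that only
    reads genotype fields keeps working.
    """
    by_sample = {c.sample_id: c for c in clinical}
    return [(g, by_sample.get(g.sample_id)) for g in genotypes]


genotypes = [
    GenotypeRecord("S1", "chr1", 12345, ["A", "G"]),
    GenotypeRecord("S2", "chr1", 12345, ["A", "A"]),
]
clinical = [ClinicalRecord("S1", age=61, sex="F", measurements={"hba1c": 7.2})]

pairs = colocate(genotypes, clinical)
```

The point of the sketch is only that the clinical data sits next to the genotype data under a shared sample key while the genotype structure itself stays untouched.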

 
   
             WHAT ARE WE DOING

 
This is a lab project carried out without any external funding. Thanks to the students and researchers for their time and effort. We are currently extending the Adam format to accept the genetic and clinical data of large clinical trials, for example Advance and CDK Gene, provided by Dr. Pavel Hamet of the CRCHUM. Thanks also to Prof. Larry Hall of the University of South Florida for his advice on machine learning to validate the segmentation factors used to obtain a preventive type 2 diabetes score. An additional collaboration with Dr. Michael Phillips of Sequencebio will be initiated during the summer of 2018. This ADAM HPC genetic sequencing and analysis pipeline is inspired by the GATK best practices.
 



             STUDENTS INVOLVED     Fodil Belgait, Michel Hénault-Éthier, Béatriz Kanzki


             TECHNOLOGIES     
 
Python, Adam, HBase, Parquet, Avro, H2O, genomeBrowser, IGV, VarSeq

 
             
Thanks to  Screen-Shot-2016-12-07-at-10-42-18-AM.png for providing the power behind our algorithms




 
  Screen-Shot-2017-05-03-at-10-07-50-AM.png GENOME VIEWER


Creating an open-source, Web-based interactive visualizer for genomic somatic mutations. Upload your data and identify a variant; all matches found are displayed with all the cancer types or tissues where the mutation was found. The types of mutations to visualize can be selected and include substitutions and indels. The objective is to improve on the functionality of LocusZoom, IGV, UCSC Genome Browser and the generic browser of COSMIC. See the M.Ing. track paper published at ACM Digital Health 2017.
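The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the reference records, the tuple layout and the function names are hypothetical, and classifying a variant by comparing allele lengths is a simplification, not the viewer's actual logic.

```python
# Hypothetical reference records: (chrom, pos, ref, alt, tissue).
REFERENCE = [
    ("chr7", 100, "A", "T", "lung"),
    ("chr7", 100, "A", "T", "skin"),
    ("chr12", 200, "G", "GA", "colon"),
    ("chr17", 300, "CT", "C", "breast"),
]


def mutation_type(ref, alt):
    """Classify a variant as a substitution or an indel by allele length."""
    return "substitution" if len(ref) == len(alt) else "indel"


def find_matches(variant, reference=REFERENCE, types=("substitution", "indel")):
    """Return the sorted tissues where the variant was seen.

    Returns an empty list when the variant's mutation type is not among
    the types selected for display.
    """
    chrom, pos, ref, alt = variant
    if mutation_type(ref, alt) not in types:
        return []
    return sorted(
        tissue
        for (c, p, r, a, tissue) in reference
        if (c, p, r, a) == (chrom, pos, ref, alt)
    )


tissues = find_matches(("chr7", 100, "A", "T"))
hidden = find_matches(("chr12", 200, "G", "GA"), types=("substitution",))
```

Here `tissues` lists every tissue in the toy reference where the substitution was observed, while `hidden` is empty because indels were deselected.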

 
 

             CHALLENGE 

 
 
Health researchers want to interact with their genetic data in real time, but growing volumes and complexity are limiting them. Currently they have to ask their bioinformatics staff to run queries in batch mode for hours. This approach consumes a lot of resources and slows down the discovery process. There is a need for an interactive visualisation tool that can handle large amounts of data and lets researchers work with the data on their own, interactively and efficiently.
             WHAT ARE WE DOING 
 
       
 

First, we studied the genetic query processes and tools (i.e. LocusZoom, IGV, UCSC Genome Browser, COSMIC) used at the CRCHUM. Next, we looked at the requirements of Dr. Sinnett at the research center of hôpital Sainte-Justine. A first throwaway proof-of-concept prototype, named GOAT v1, was developed by Beatriz Kanzki (git@github.com:jokerbea/GOAT-Genetic-Output-Analysis-Tool.git). From this proof of concept, a number of reengineering projects were initiated (see the Cédric v2 and Victor v3 reports, in French); then, during the winter 2017 term, a v4 refactoring was done (report also in French). Major changes were made: the Bokeh-Server front end was replaced by AmCharts for the visualisation, and the SQL database was replaced by the UC Berkeley Adam format on Spark. Previously, in v3, loading a large .vcf file took 6 hours; it now takes 30 minutes for researchers to load their genetic data. The best improvement is that a query against the reference data loaded in Adam format now takes only 5 seconds (previously it took 3 minutes). Finally, running on AWS makes this version of the prototype available from anywhere. We are now planning v5, which should include more than the 1000 genome reference.
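To put the reported timings in perspective, the speedup factors can be worked out directly from the figures quoted above (6 hours versus 30 minutes for loading, 3 minutes versus 5 seconds for a query). The helper name is illustrative only.

```python
def speedup(before_seconds, after_seconds):
    """Ratio of the old duration to the new one."""
    return before_seconds / after_seconds


# Loading a large .vcf file: 6 hours before, 30 minutes now.
load_speedup = speedup(6 * 3600, 30 * 60)

# Querying the reference data: 3 minutes before, 5 seconds now.
query_speedup = speedup(3 * 60, 5)

print(f"load: {load_speedup:.0f}x faster, query: {query_speedup:.0f}x faster")
```

So loading is about 12 times faster and querying about 36 times faster than before.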

Screen-Shot-2017-04-26-at-8-43-11-AM.png

Screen-Shot-2017-04-26-at-8-44-11-AM.png

 
 
             STUDENTS INVOLVED   Beatriz Kanzki, Max St-Onge, Raphaël Papillon, David Guay, Émile Filteau-Tessier, Cédric Urvoy and Victor Dupuy


             TECHNOLOGIES     
 
SMACSS, Sass, React (Facebook), Reflux, React-JSX, Griddle, Django, Gulp, Babel, Browserify, Bootstrap, MySQL, Python, Biopython, AmCharts, NumPy, Pandas, Blaze, Flask and Matplotlib, Adam, AWS-EMR (Spark)
     
   
   This project runs on Screen-Shot-2016-12-07-at-10-42-18-AM-(1).png



     


 
PACIQ


Monitoring the continuous improvement and quality program for health institutions

Creating a Web-based system that includes a dashboard to track the progress of quality initiatives and measures against the national excellence standards used by the health industry (Accreditation Canada, Planetree, BOMA BESt and the Quebec Network of Health Institutions) to monitor conformance.

 

                Screen-Shot-2017-02-03-at-1-34-25-PM.png

                       Screen-Shot-2017-02-03-at-1-34-42-PM.png

                                          Screen-Shot-2017-02-03-at-1-35-58-PM.png
  
             CHALLENGE 
  Today, in a large health institution, it is hard to obtain precise and up-to-date information about the progress of all the improvement projects toward conformance, especially since there are so many dimensions of quality. On top of these challenges, the many standards have overlapping requirements, which leads to duplicated questions, analyses and reports. Geneviève Parisien, assistant director of the Quality section of Sainte-Justine hospital, has presented a summary of her requirements to address this problem (PowerPoint presentation in French).
             WHAT ARE WE DOING 

 

 

 

  We studied the current process and looked at the many Excel sheets involved. The students then designed a centralized database (using a Microsoft .NET technology recommended by the hospital IT department). Then, using an Agile methodology, we developed a software prototype that has two components: 1) a Web front end and 2) a BI dashboard to help the decision makers. Students put into practice the software engineering concepts taught at ÉTS:

1) produced and asked for the approval of a vision document that describes the functional and non-functional requirements of the customer;

2) developed, presented and asked for the approval of a project plan;

3) developed a detailed design document, the SRS (in French), including a business intelligence module that was prototyped in a development environment. Then a technological architecture was developed for the software. The first prototype is currently at the user acceptance stage at the CHU Sainte-Justine in Montréal. It is planned to move to production at the end of 2016 or early 2017. Finally, a quality assessment report was produced to identify what reengineering will be required to produce a solid and final version of this software.


 

 

 


             STUDENTS INVOLVED     M.Y.Tariq, U.Ghomsi, N.Brousseau, R.Chebli,  G.Gbelai and A. Elmoul


             TECHNOLOGIES        .NET 4, IIS 7, SQL Server 2008 R2, SQL, MDX, XML, SSIS, SSAS, SSRS


 


   

CoreLabNow


A real-time dashboard to closely follow the processing of a very high volume of blood tests in a large hospital. Doctors complete a test requisition. Samples are collected all over the hospital. The test order is captured and each sample is prepared. Samples are placed in a basket and, when the basket is full, it is fed to the CoreLab equipment (see below).


 
  Screen-Shot-2017-02-03-at-3-23-27-PM.png
 

  
            
CHALLENGE  
   
Human and equipment errors generate a lot of rework for the MPA technician. It is hard to follow which step of the process each sample is at, and therefore hard to ensure a set level of service, predict surges in test demand, spot bottlenecks, and review historical trends by day or time of day.


            WHAT ARE WE DOING 
 
Design of a real-time dashboard that shows the state of the workload from different perspectives. If the SLA for a test is in trouble, a red square appears. The technician also has access to a drill-down screen that identifies the individual sample numbers that are not meeting the service level, so that they can locate those samples and take rapid action.
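The red-square rule and the drill-down can be illustrated with a short Python sketch. Everything specific here is an assumption made for the example: the SLA targets, the test names, the sample dictionary layout and the function names are hypothetical and are not taken from the actual CoreLabNow implementation.

```python
from datetime import datetime, timedelta

# Hypothetical SLA targets: maximum turnaround time per test type.
SLA = {"troponin": timedelta(minutes=60), "cbc": timedelta(minutes=120)}


def late_samples(samples, now):
    """Drill-down view: sample numbers past their SLA, grouped by test."""
    late = {}
    for s in samples:
        if s["completed"]:
            continue  # finished samples can no longer breach the SLA
        if now - s["received"] > SLA[s["test"]]:
            late.setdefault(s["test"], []).append(s["sample_no"])
    return late


def tile_colors(samples, now):
    """Dashboard view: one square per test, red if any sample is late."""
    late = late_samples(samples, now)
    return {test: ("red" if test in late else "green") for test in SLA}


now = datetime(2017, 2, 3, 15, 0)
samples = [
    {"sample_no": "T-001", "test": "troponin",
     "received": datetime(2017, 2, 3, 13, 30), "completed": False},
    {"sample_no": "T-002", "test": "troponin",
     "received": datetime(2017, 2, 3, 14, 30), "completed": False},
    {"sample_no": "C-001", "test": "cbc",
     "received": datetime(2017, 2, 3, 14, 0), "completed": False},
]

colors = tile_colors(samples, now)
drill_down = late_samples(samples, now)
```

In this toy data set only sample T-001 has waited longer than its target, so the troponin square turns red and the drill-down lists that single sample number.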


Screen-Shot-2017-02-03-at-3-09-58-PM-(1).png

3_troponin-event.jpg
   
   

             STUDENTS INVOLVED     D. Lauzon, C. Vallières, P. Herrera, A. Boussif, A. Zakharov, D. Olano, M-A Tardif, P-E Viau, M. Ouellet, P-A St-Jean



             ALL OPEN SOURCE TECHNOLOGIES   
Highcharts JS, WebSockets, Socket.io, Node.js, VirtualBox, Ubuntu Server LTS, LXDE 






 

FixMyShoulder

I had a number of shoulder problems from 2001 to 2013. George Demirakos, a well-known physiotherapist, helped me and has since published a book on this topic. This project developed an Android and iOS app, plus its CMS platform, so that George Demirakos can update the information.



 
  
             CHALLENGE 
   
Create a multiplatform app (iPhone (iOS), iPad and Android) that is easy to update from one central CMS.

 

  
            
WHAT WE HAVE DONE 
   
Developed a CMS that updates the mobile applications on both platforms. Developed two mobile applications: iteration 1 (iOS): Julie Vincent; iteration 1 (Android and Web platform): Mathieu Crochet; iteration 2 (Android and CMS).


  STUDENTS INVOLVED    J. Vincent, M. Crochet, M. Awada, S. Kadi, M. Khalil, M. Mammar, T. Warnant



  TECHNOLOGIES  Objective-C, XML, HTML, CSS, PHP, Cocoa, TestFlight, Java, JavaScript, ADT for Eclipse, WampServer, Xcode platform, yEd, Photoshop, Jira and Subversion