Previous Work
Manage projects | Handle collaborations | Mentor PhD students
Experimental Design | Data Analysis | Presentation & Publication

Assistant Computational Biologist

GIGA-Research, Unit of Human Genetics (Prof V.Bours)

April 2017 – Present

Random Forest behavior on Biological data

Short Biomarker Discovery

Machine learning approaches are heavily used to produce models that may one day support clinical decisions. To be used reliably in medical decision-making, such diagnostic and prognostic tools must be highly precise. Random Forests have already been used in cancer diagnosis, prognosis, and screening, and numerous Random Forest methods have been derived from the original algorithm published by Breiman in 2001. Nevertheless, the precision of the models they generate remains unknown on biological data. Their classification accuracy may therefore vary too much from one model to the next, which would make them useless in daily clinical practice. Here, we perform an empirical comparison of Random Forest-based strategies, assessing the precision of their model accuracy and their overall computational time.

Computational & Molecular Biologist

  • Data analysis of NGS, qPCR arrays & Mass Spectrometry
  • Data analysis of microRNA, mRNA, Metabolites & Clinical data
  • Design and development of a comparative tool to identify the best-suited Random Forest strategy for a given dataset
  • Design of short biomarker signatures for diagnostics and response to treatment
  • Class prediction using Machine Learning (Random Forests and Support Vector Machines)
  • External Data integration (TCGA, ICGC) & pathways (IPA)
  • Integrative analysis algorithms development.
  • Development of a robust feature selection to reduce high-throughput data dimensions.
  • Development of a method to predict which biomarker signature will perform best.
  • Development of qPCR normalization without reference genes
  • Integration of mRNA, microRNA, & Metabolites data along with Clinical data
  • Algorithm development and Algorithm integration to user interfaces (GenePattern)
  • User interface tools for the team including:
      • Clinical Management tool (Interactive and Dynamic website)
      • Interactive graphical interface to display analyzed results (R, Plotly, D3.js).
  • Taught data analysis to scientists
  • Mentored one PhD student in Bioinformatics throughout his PhD thesis

Computational Biologist, Bioinformatics Analyst

Aduro Biotech, Berkeley, USA, Immune Monitoring and Biomarker Discovery Group (Dr. Chan Whiting)

September 2015 – April 2017

Immune cells signature discovery

Molecular and Cellular Data Integration

Analysis of samples from longitudinal monitoring in clinical trials. Molecular and cellular data integration.

Computational & Molecular Biologist

  • Genomic analyses (Class Comparisons, Genomic signatures, Pattern Recognition and Class Discovery).
  • Internal Data integration between NGS, microarray, NanoString, FACS, LUMINEX, ELISPOT and Clinical Data.
  • External Data integration (TCGA, ICGC)
  • Integrative analysis algorithms development.
  • Integrated Network analysis.
  • Signatures / Pattern Discovery / Classification (Unsupervised Learning).
  • Pattern recognition / Machine Learning (Random Forests), Linear Regressions and ElasticNet.
  • Algorithm development and Algorithm integration to user interfaces (GenePattern).
  • User interface tools for the team including:
      • Clinical Management tool (Interactive and Dynamic website)
      • Interactive graphical interface to display analyzed results (R, Plotly, D3.js).

PhD Thesis

GIGA-Research, Unit of Human Genetics (Prof V.Bours)

October 2006 – November 2014

Cancer characterization

The intra-tumor heterogeneity bias

Relapses are inevitable for many types of tumors. Such events are thought to be the consequence of cell sub-populations that are resistant to therapy. My project aimed to describe the bias introduced by the internal heterogeneity of tumors, in order to enable the early identification of treatment-resistant sub-populations and thus improve the design of future therapies.

Computational & Molecular Biologist

  • Production of High-Throughput (HT) data from DNA & RNA
  • Management and Storage of HT data
  • Genome and Network Analyses of HT data
  • Statistical analyses
  • Coding using R and Bioconductor, Python & Perl
  • Production of meaningful graphs
  • DNA / RNA Preparation (Extraction & QC)
  • FISH, Western Blot

PhD Collaboration

MRC, Imperial College London, C3-Neuroscience Lab (Prof F. Turkheimer)

September 2009 – December 2009

Wavelet smoothing

Statistical analyses of microarray signals

The aim of this project was to adapt wavelet denoising methods to improve the analysis of CGH microarrays. The resulting signal was then used to analyze the clonal heterogeneity of tumor samples.

Computational Statistics

  • Simulations of heterogeneous signal mixtures.
  • Assessment of tumor sub-population proportions.
  • Analysis with Gaussian mixture models fitted by Expectation-Maximization.
  • Smoothing using the CHROMOWAVE wavelet smoothing algorithm
  • Coding using MATLAB, R and Bioconductor

Master Thesis

ULG, Mass Spectrometry Laboratory (Prof E. DePauw)

January 2006 – September 2006

Single Cell Fingerprinting


The aim of this thesis was to identify the pattern of proteins expressed by distinct, intact single tumor cells. Tumor cells were analyzed by MALDI-TOF after cell culture. This is a quick and powerful method to identify the class and the grade of each tumor sample.


  • In vitro expansion of immortalized cell lines (MCF7, MDA, LNCaP)
  • Spheroid culture of LNCaP cells
  • Protein extraction
  • Instruments: MALDI-TOF & Nano HPLC coupled with Easy-Q-TOF
  • Immunostaining, Western Blot & 2D gel assays

Master Internship

CNRS, Laboratoire de NeuroImmunologie des Annélides (Prof M.Salzet)

October 2004 – August 2005

Role of granulin in Hirudo medicinalis

Cloning and expression of a granulin of Hirudo medicinalis

My aim in this project was to clone the leech GRN gene into an E. coli plasmid and to validate its insertion using Sanger sequencing. Because of its action on the cell cycle, this granulin was then overexpressed in neurons to study neural regeneration. This study led to a better understanding of the mechanisms triggered by the granulin, and thus of the main biological pathways involved.

Molecular Biologist

  • Sanger Sequencing
  • DNA extraction
  • PCR
  • Cell culture
  • Clonal Selection

Antibody Adviser NPO

December 2012 – Present

Antibody Adviser NPO

Find, rate and manage antibodies: Antibody Adviser is the fastest social way to find and share reviews about antibodies. It offers an independent online resource of commercially available antibodies that have been tested by scientists, and gives them the opportunity to share personal experiences with the scientific community.

Co-founder & Web developer

My Skills


Genomics 80%
Transcriptomics 90%
Proteomics 70%
Statistics 70%
Machine Learning 70%

Nothing in biology makes sense except in the light of evolution.

sub specie evolutionis

- Theodosius Dobzhansky (1973) -

Get in touch

Feel free to contact me if you have something to say!