Yan Yan
My research is mainly in bioinformatics. My lab focuses on multi-omics analysis, which provides novel information on the mechanisms of biological processes and cell states in disease development. We develop computational models based on artificial intelligence, machine learning, and statistics. Recent research includes statistical association and deep learning on genome-wide population data, deep learning on single-cell RNA-seq data, and de novo peptide sequencing. We are particularly interested in plant genomes and AI-assisted digital agriculture. We are also interested in a broad range of data analytics using AI and machine learning.
Selected Publications (underline = research trainee)
- B S Puliparambil, J Tomal, Y Yan (2022): A Novel Algorithm for Feature Selection Using Penalized Regression with Applications to Single-Cell RNA Sequencing Data. Biology 11.10: 1495.
- Y Yan, C Burbridge, J Shi, J Liu, AJ Kusalik (2019): Effects of Input Data Quantity on Genome-Wide Association Studies (GWAS), International Journal of Data Mining and Bioinformatics, 22(1), 19-43.
- J Shi, Y Yan, M Links, L Li, J Dillon, M Horsch, AJ Kusalik (2019): Antimicrobial resistance genetic factor identification from whole-genome sequence data using deep feature selection, BMC Bioinformatics, 20(Suppl15): 535.
- Y Yan, AJ Kusalik, FX Wu (2017): NovoExD: De novo peptide sequencing for ETD/ECD spectra, IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), 14(2), 337-344.
- Y Yan, K Zhang (2016): Spectra library assisted de novo peptide sequencing for HCD and ETD spectra pairs, BMC Bioinformatics, 17(17): 205-211.
- Y Yan, AJ Kusalik, FX Wu (2016): De novo Peptide Sequencing using CID and HCD Spectra Pairs, Proteomics, 16(20): 2615-2624.
- Y Yan, AJ Kusalik, FX Wu (2015): Recent developments in computational methods for de novo peptide sequencing from tandem mass spectrometry (MS/MS). Protein and Peptide Letters, 22(11): 983-991.
- Y Yan, AJ Kusalik, FX Wu (2015): A framework of de novo peptide sequencing for multiple tandem mass spectra. IEEE Transactions on NanoBioscience, 14(4): 478-484.
Bioinformatics, Machine Learning, Statistical Association, Graph Theory, Big Data Analysis