- Old Caoling Tunnel Biking -

Embrace the joy of riding the wind!

- Laser Gun Game -

Enjoy the excitement of the game!

- Shi Fen Trip -

Forget the busyness and enjoy nature!

- Chinese Year End Parties -

A yearly celebration of everyone’s effort. Hope y’all keep having fun in bioinformatics!

- Lunch Gathering to Welcome Aboard -

Together, let us innovate and take our laboratory to new heights!

Join Us

Interested in Bioinformatics?
We are now hiring research assistants.
Click the button to learn more.

Photo credit: Saksham Gangwar on Unsplash

Apply

Bio-IT-Station

HK Tsai Lab of Bioinformatics

Institute of Information Science, Academia Sinica

News

Spring Outing: Strawberry Picking

After a heated vote, we decided to pick strawberries at Dong Lin strawberry farm, located at Bishan in Neihu. We enjoyed nature and a relaxing time! Making jam with strawberries. Cozy hiking

Ting-Yu Yeh

Last updated on Apr 1, 2026 · 1 min read

Lab Lunch & Trip: Riddle City

We had a wonderful lab gathering with intern presentations, a farewell & welcome lunch! A puzzle game “Riddle City - 捷運踩地

Bing-Shiun Tsai

Last updated on Aug 26, 2025 · 1 min read

Summer: Welcome New Friends

Welcoming new interns and friends to our lab—let’s explore, learn, and grow together!

Bing-Shiun Tsai

Last updated on Jul 5, 2025 · 1 min read

See all posts

Projects

Host defense against viruses

Investigate how viral infections shape the host immune response and cell-cell communication

Mechanisms of structural variation

Classify and decode structural variant formation mechanisms from genomic sequence using machine learning.

Dark Genes

Reveal dark genes from a network perspective using multiomic data

Drug-Target Affinity

Enhancing drug-target affinity with protein LLMs and diffusion models

Lab Members

Principal Investigator

Huai-Kuang Tsai

Research Fellow/Professor

Evolutionary Algorithm, Bioinformatics, Regulatory Mechanism, Metagenomics, Computational Biology

Researchers

Shu-Qi Yu

Research Assistant

Bioinformatics, Network, Graph Theory, Algorithm

Ting-Yu Yeh

Research Assistant

Machine Learning, Network Biology, Genomics

Grad Students

Ru-Yin Jian

Doctoral Student

Machine Learning, Bioinformatics, Cancer

Administration

Visiting Scholars

Jia-Hsin Huang

Assistant Professor

Insect Physiology, Bioinformatics, Genomics

Wong Jin Yung

Assistant Professor

Evolution, Genomics, Machine Learning, Biomechanics

Alumni

Recent Publications

Quickly discover relevant content by filtering publications.

Complete end-to-end learning from protein feature representation to protein interactome inference

Background: Co-fractionation coupled with mass spectrometry (CF-MS) is a powerful strategy for mapping protein–protein interactions (PPIs) under near-physiological conditions. Despite recent progress, existing analysis pipelines remain constrained by reliance on handcrafted features, sensitivity to experimental noise, and an inherent focus on pairwise interactions, which limit their scalability and generalizability. To address these difficulties, we introduce FREEPII (Feature Representation Enhancement End-to-End Protein Interaction Inference), a unified deep learning framework that integrates CF-MS data with sequence-derived features to learn biologically meaningful protein-level representations for accurate and efficient inference of PPIs and protein complexes.

Results: FREEPII employs a convolutional neural network architecture to learn protein-level representations directly from raw data, enabling feature sharing across interaction pairs and reducing computational complexity. To enhance robustness against CF-MS noise, protein sequences are introduced as auxiliary input to enrich the feature space with complementary biological cues. The supervised protein embeddings further encode network-level context derived from complex annotations, allowing the model to capture higher-order interactions and enhance the expressive power of protein representations. Extensive benchmarking demonstrates that FREEPII consistently outperforms state-of-the-art CF-MS analysis tools, capturing more biologically coherent and discriminative protein features. Cross-dataset evaluations further reveal that integrating multimodal data from diverse experimental contexts substantially improves the generalization and sensitivity of data-driven models, offering a scalable, cross-species strategy for reliable protein interaction inference.

Conclusions: FREEPII provides a unified computational framework that integrates CF-MS data and sequence-derived features to learn discriminative and biologically consistent protein representations. By leveraging multimodal inputs through a coherent deep learning architecture, the model achieves accurate and scalable inference of PPIs and protein complexes across species. Its modality-aware design and supervised protein embeddings capture higher-order interaction contexts, ensuring robust generalization and reliable discovery of novel interactions. Overall, FREEPII offers a flexible and extensible foundation for data-driven exploration of protein interaction networks.

Yu-Hsin Chen, Chien-Fu Liu, Jun-Yi Leu, Huai-Kuang Tsai


A large language model framework for literature-based disease–gene association prediction

With the exponential growth of biomedical literature, leveraging Large Language Models (LLMs) for automated medical knowledge understanding has become increasingly critical for advancing precision medicine. However, current approaches face significant challenges in reliability, verifiability, and scalability when extracting complex biological relationships from scientific literature using LLMs. To overcome the obstacles of LLM development in biomedical literature understanding, we propose LORE, a novel unsupervised two-stage reading methodology with LLM that models literature as a knowledge graph of verifiable factual statements and, in turn, as semantic embeddings in Euclidean space. LORE captured essential gene pathogenicity information when applied to PubMed abstracts for large-scale understanding of disease–gene relationships. We demonstrated that modeling a latent pathogenic flow in the semantic embedding with supervision from the ClinVar database led to a 90% mean average precision in identifying relevant genes across 2097 diseases. This work provides a scalable and reproducible approach for leveraging LLMs in biomedical literature analysis, offering new opportunities for researchers to identify therapeutic targets efficiently.

Peng-Hsuan Li, Yih-Yun Sun, Hsueh-Fen Juan, Chien-Yu Chen, Huai-Kuang Tsai, Jia-Hsin Huang


Discovery and prioritization of genetic determinants of kidney function in 297,355 individuals from Taiwan and Japan

Current genome-wide association studies (GWAS) for kidney function lack ancestral diversity, limiting the applicability to broader populations. The East-Asian population is especially under-represented, despite having the highest global burden of end-stage kidney disease. We conducted a meta-analysis of multiple GWASs (n = 244,952) on estimated glomerular filtration rate and a replication dataset (n = 27,058) from Taiwan and Japan. This study identified 111 lead SNPs in 97 genomic risk loci. Functional enrichment analyses revealed that variants associated with F12 gene and a missense mutation in ABCG2 may contribute to chronic kidney disease (CKD) through influencing inflammation, coagulation, and urate metabolism pathways. In independent cohorts from Taiwan (n = 25,345) and the United Kingdom (n = 260,245), polygenic risk scores (PRSs) for CKD significantly stratified the risk of CKD (p < 0.0001). Further research is required to evaluate the clinical effectiveness of PRSCKD in the early prevention of kidney disease.

Hung-Lin Chen, Hsiu-Yin Chiang, David Ray Chang, Chi-Fung Cheng, Charles C. N. Wang, Tzu-Pin Lu, Chien-Yueh Lee, Amrita Chattopadhyay, Yu-Ting Lin, Che-Chen Lin, Pei-Tzu Yu, Chien-Fong Huang, Chieh-Hua Lin, Hung-Chieh Yeh, I-Wen Ting, Huai-Kuang Tsai, Eric Y. Chuang, Adrienne Tin, Fuu-Jen Tsai, Chin-Chi Kuo


See all publications

Contact

hktsai@iis.sinica.edu.tw
+886-2-27883799 ext. 1472
Room N416, 128 Academia Road, Section 2, Nankang, Taipei 115
Enter the New Building from the Old Building and take the elevator to Room N416.
