HUN-REN RCNS - Institute of Molecular Life Sciences, Budapest HU
| +36 1 3826 710 | szepesi-nagy.istvan@ttk.hu | sznistvan.hu | LinkedIn | ORCID | GitHub |
Bioinformatician and PhD candidate with a strong foundation in both computer science and life sciences. Passionate about pursuing a research career in the interdisciplinary field of bioinformatics, with a constant drive for learning and personal growth. Always eager to explore new opportunities and contribute to impactful research.
Semmelweis University Doctoral College, Budapest HU
PHD IN BIOLOGICAL SCIENCES - MOLECULAR MEDICINE DIVISION
Research topic: DNA repair mechanisms in neurons
DNA Repair Research Group ‑ ronalab.org
2024 - present
Pázmány Péter Catholic University, Budapest HU
MSC IN INFO-BIONICS ENGINEERING
Grade: Honours
Specialization: Systems biology
Thesis topic: Microbiome gene expression prediction using transformer‑based neural networks
Skills: Microbiome genetics, RNA-seq analysis, bioinformatics tools, Python, HPC environment, NNs
2022 - 2024
Pázmány Péter Catholic University, Budapest HU
GUEST STUDENT - BSC PROGRAM IN MOLECULAR-BIONICS ENGINEERING
Courses: Molecular cell biology, neurobiology, biology laboratory practice
2020 - 2022
University of Technology and Economics, Budapest HU
BSC IN COMPUTER SCIENCE ENGINEERING
Grade: Excellent
Specialization: Infocommunications
Thesis topic: Network programmability
Skills: Networking, graph theory, local area networks, network switches, Python
2017 - 2021
HUN‑REN RCNS ‑ Institute of Molecular Life Sciences
PhD candidate - bioinformatician
Multi‑omics analysis of neurodegenerative diseases
Proteomics bioinformatics
2024-09 - present
Hungarian Centre of Excellence for Molecular Medicine (HCEMM)
Scientific Computing ACF Member - part-time, remote
Solutions for Advanced Core Facility projects
High performance computing cluster
Bioinformatics, pipeline and data analysis
2025-08 - present
Danubia Patent and Law Office LLC
Scientific Counselor - part-time contract
Scientific counselling on intellectual property related to biochemical and bioinformatics innovations
2024-03 - present
Semmelweis University, Department of Pharmacology and Pharmacotherapy
BIOINFORMATICIAN - RESEARCH ASSISTANT - full-time
Conducted scientific literature searches to support ongoing projects
Performed NGS data analysis; developed and deployed new omics analysis solutions
2024-02 - 2024-08
GENyO ‐ Centre for Genomics and Oncological Research, Granada ES
BIOINFORMATICS INTERN - part-time
Erasmus internship program opportunity
Worked in the Gene Expression Regulation and Cancer Research Group
Conducted bioinformatics analysis of miRNAs from DLBCL samples
2023-09 - 2023-12
BBraun Medical Kft., Budapest HU | SOFTWARE DEVELOPER INTERN | 2022-06 - 2022-08
Contributed to the development of software for acute dialysis machines.
Conducted unit testing to ensure software quality and reliability
Programming experience in C/C++
CISCO Systems Hungary, Budapest HU | DEVOPS ENGINEER INTERN | 2021-04 - 2022-01
Engaged in networking and API programming activities
Supported DevOps practices and infrastructure
Background:
Analyzing large-scale, mass spectrometry–based complex proteomics datasets often overwhelms desktop computational resources and requires substantial manual configuration. While FragPipe provides rapid peptide identification across diverse sample preparation and acquisition modes (DDA, DIA, TMT), deploying it at scale remains challenging.
Results:
We introduce Frag’n’Flow, a Nextflow-based pipeline that encapsulates FragPipe, automates input manifest creation and workflow generation, manages tool dependencies, and includes downstream data analysis options. This enables reproducible, high-performance analysis on HPC, cloud, and cluster environments. Benchmarking against other workflow-based solutions demonstrates that Frag’n’Flow maintains quantitative accuracy while reducing runtime by nearly half on a typical DIA dataset (~58 GB), alleviating memory and I/O bottlenecks. We further validate Frag’n’Flow across three representative datasets—label-free DDA, DIA, and TMT—successfully recapitulating published biological signatures with minimal user intervention.
Conclusions:
By combining the sensitivity and speed of FragPipe with Nextflow’s orchestration capabilities, Frag’n’Flow enables scalable analysis of large proteomics datasets and empowers researchers, regardless of computational expertise, to extract deeper insights from existing MS datasets.
Availability: Frag’n’Flow is available at: https://github.com/ronalabrcns/FragNFlow
TurboID-based proximity labeling is a powerful approach to capture protein-protein interactions within their native cellular environment. Here, we present a step-by-step protocol for fusing proliferating cell nuclear antigen (PCNA) to TurboID and generating stable cell lines via lentiviral transduction. We describe steps for cell synchronization, DNA damage induction, and proximity labeling, followed by fractionation, affinity purification, and mass spectrometry to identify biotinylated proteins.
For complete details on the use and execution of this protocol, please refer to Rona et al. 2024.
Background: In the evolving landscape of microbiology and microbiome analysis, the integration of machine learning is crucial for understanding complex microbial interactions, and predicting and recognizing novel functionalities within extensive datasets. However, the effectiveness of these methods in microbiology faces challenges due to the complex and heterogeneous nature of microbial data, further complicated by low signal-to-noise ratios, context-dependency, and a significant shortage of appropriately labeled datasets. This study introduces the ProkBERT model family, a collection of large language models, designed for genomic tasks. It provides a generalizable sequence representation for nucleotide sequences, learned from unlabeled genome data. This approach helps overcome the above-mentioned limitations in the field, thereby improving our understanding of microbial ecosystems and their impact on health and disease.
Methods: ProkBERT models are based on transfer learning and self-supervised methodologies, enabling them to use the abundant yet complex microbial data effectively. The introduction of the novel Local Context-Aware (LCA) tokenization technique marks a significant advancement, allowing ProkBERT to overcome the contextual limitations of traditional transformer models. This methodology not only retains rich local context but also demonstrates remarkable adaptability across various bioinformatics tasks.
Results: In practical applications such as promoter prediction and phage identification, the ProkBERT models show superior performance. For promoter prediction tasks, the top-performing model achieved a Matthews Correlation Coefficient (MCC) of 0.74 for E. coli and 0.62 in mixed-species contexts. In phage identification, ProkBERT models consistently outperformed established tools like VirSorter2 and DeepVirFinder, achieving an MCC of 0.85. These results underscore the models' exceptional accuracy and generalizability in both supervised and unsupervised tasks.
Conclusions: The ProkBERT model family is a compact yet powerful tool in the field of microbiology and bioinformatics. Its capacity for rapid, accurate analyses and its adaptability across a spectrum of tasks mark a significant advancement in machine learning applications in microbiology. The models are available on GitHub (https://github.com/nbrg-ppcu/prokbert) and HuggingFace (https://huggingface.co/neuralbioinfo), providing an accessible tool for the community.
2025
Oral presentation - Hungarian Society for Bioinformatics Conference, Budapest HU
Poster presentation - Central and Eastern European Proteomics Conference, Budapest HU (poster competition award)
Poster presentation - Central European Genome Stability Meeting, Debrecen HU
Poster presentation - Semmelweis University Scientific Days Conference, Budapest HU (best poster award Molecular Medicine)
Poster presentation - Hungarian Biotechnology Student Association Biotech Days, Budapest HU
2025
Proteomics Bioinformatics Course 2025 - EMBL EBI, Hinxton UK
2025
University Researchers’ Scholarship Program Cooperative Doctoral Program (EKÖP-KDP) - 36 months
Pannonia Scholarships - Travel Grant
European Proteomics Association - Travel Grant 2026
2024
University Researchers’ Scholarship Program (EKÖP) - 12 months
Hungarian Bioinformatics Society (2025- )
Hungarian Neuroscience Society (2025- )
Hungarian Biochemical Society (2024- )
Scientific Association for Infocommunications (2022- )
Mensa HungarIQa (2017- )