Fancy seeing you here!
I'm Maria Balos, a data scientist based in Cambridge, UK. I currently work at Vocality.ai, training and deploying multilingual TTS models in production. When I'm not behind a screen, you'll probably find me next to a cup of coffee, solving LeetCode problems, or experimenting with personal projects.
Grab a coffee and make yourself at home in this small corner of my work!
LeetCode Stats
Right now I am working on:
- Training and deploying TTS models across multiple languages (en-GB, es-ES, eu-ES, sl-SI) for enterprise clients at Vocality.ai.
- Replicating results from the Manifold HyperConnections (mHC) paper and exploring its applications to computer vision.
- Sharpening my coding skills with LeetCode (300+ problems solved).
Latest achievements:
- 2026/01: Started replicating the Manifold HyperConnections paper for computer vision research.
- 2025/12/24: Finished and published CON(e)VOLUTION - A Walkthrough From LeNet to Vision Transformers.
- 2025/07/16: Presented my Master's dissertation, RAG-Driven Educational Assistant, for the Deep Learning and Generative AI Master's at Datamecum.
- 2025/01/27: Started working as a Data Scientist at Vocality.ai.
- 2024/12/01: Solved Advent of Code 2024 challenges.
- 2024/10/30: Built Ryanair Timecapsule, reverse-engineering the Ryanair API to collect daily flight prices.
- 2024/10/17: Completed NLP HuggingFace course.
- 2024/05/28: Completed Practical Deep Learning by fast.ai.
- 2024/05/17: Completed Advanced Learning Algorithms by Andrew Ng (Coursera).
- 2024/05/09: Won 1st place in the Datamecum Datathon (Kaggle-style competition) with an ensemble of Random Forest and XGBoost (AUC 0.9851).
Medium posts & Kaggle notebooks:
- CON(e)VOLUTION - From LeNet to Vision Transformers
- PyTorch Cross-Entropy: The Double Softmax Trap
- Exploratory Data Analysis (EDA) for Python Programmers - Part 1
- Machine Learning Applied to the Design Industry: K-Means for Image Palette Generation
- The Power of Decision Stumps
ML & AI:
- Manifold HyperConnections for CV β Replicating and extending the mHC paper for computer vision (in progress).
- Image Classification (CON(e)VOLUTION) β 10 CNN architectures implemented from scratch in PyTorch, benchmarked on the NEU Surface Defects dataset.
- RAG-Driven Educational Assistant β Full-stack RAG app using LangChain, ChromaDB, and LLMs (GPT-4o-mini, Claude-3-Haiku). Ingested ~364h of video content via Whisper transcriptions. Master's dissertation.
- Ryanair Timecapsule β Reverse-engineered the Ryanair API for daily flight price collection. Open-source Python package with tests and CLI.
- Mohs Hardness EDA β Decision Stump for a Kaggle competition (position 598/1632). Includes a presentation on decision stumps.
- Datamecum Datathon β Binary classification capstone project: EDA, self-organising maps, correlation analysis, ensemble modelling.
Python & Web:
- Weever Watermark β K-Means colour palette extraction deployed as a Flask app. Demo
- LinkedIn Toggler β Selenium automation for repetitive LinkedIn tasks.
- MochaMaps β Coffee shop directory using SQLite, SQLAlchemy, Flask, and Jinja2. Demo
- Typing Thunder β Speed-typing GUI app. Demo
- Morse Code Converter β Command-line morse code translator. Demo
Thank you for visiting my GitHub! Feel free to take a deeper look at my repositories to find more specific projects. Feedback, suggestions, and tips are always welcome!
I'm always up for a coffee, a chat, or a discussion about possible collaborations. Drop me an email at mariabalos16@gmail.com or send me a message through LinkedIn.
View my full CV