mbalos16/README.md

Hello 👩‍💻!

Fancy seeing you here 😌 !

I'm Maria Balos, a data scientist based in Cambridge, UK. I currently work at Vocality.ai, training and deploying multilingual TTS models in production. When I'm not behind a screen, you'll probably find me next to a cup of coffee, solving LeetCode problems, or experimenting with personal projects.

Please grab a coffee and make yourself at home in this small corner of my work!

LeetCode Stats

Right now I am working on:

  • Training and deploying TTS models across multiple languages (en-GB, es-ES, eu-ES, sl-SI) for enterprise clients at Vocality.ai.
  • Replicating results from the Manifold HyperConnections (mHC) paper and exploring its applications to computer vision.
  • Sharpening my coding skills with LeetCode (300+ problems solved).

Latest achievements:

  • 2026/01: Started replicating the Manifold HyperConnections paper for computer vision research.
  • 2025/12/24: Finished and published CON(e)VOLUTION – A Walkthrough From LeNet to Vision Transformers.
  • 2025/07/16: Presented Master's Dissertation: RAG-Driven Educational Assistant for the Deep Learning and Generative AI Master's at Datamecum.
  • 2025/01/27: Started working as a Data Scientist at Vocality.ai.
  • 2024/12/01: Solved Advent of Code 2024 challenges.
  • 2024/10/30: Built Ryanair Timecapsule, reverse-engineering the Ryanair API to collect daily flight prices.
  • 2024/10/17: Completed the Hugging Face NLP course.
  • 2024/05/28: Completed Practical Deep Learning by fast.ai.
  • 2024/05/17: Completed Advanced Learning Algorithms by Andrew Ng (Coursera).
  • 2024/05/09: Won 1st place in the Datamecum Datathon (Kaggle-style competition) with an ensemble of Random Forest and XGBoost (AUC 0.9851).

Medium posts & Kaggle notebooks:

Projects

ML & AI:

  • Manifold HyperConnections for CV: Replicating and extending the mHC paper for computer vision (in progress).
  • Image Classification (CON(e)VOLUTION): 10 CNN architectures implemented from scratch in PyTorch, benchmarked on the NEU Surface Defects dataset.
  • RAG-Driven Educational Assistant: Full-stack RAG app using LangChain, ChromaDB, and LLMs (GPT-4o-mini, Claude-3-Haiku). Ingested ~364h of video content via Whisper transcriptions. Master's dissertation.
  • Ryanair Timecapsule: Reverse-engineered the Ryanair API for daily flight price collection. Open-source Python package with tests and CLI.
  • Mohs Hardness EDA: Decision Stump for a Kaggle competition (position 598/1632). Includes a presentation on decision stumps.
  • Datamecum Datathon: Binary classification capstone project: EDA, self-organising maps, correlation analysis, ensemble modelling.

Python & Web:

Final Notes & Contact ☎️

Thank you for visiting my GitHub! Feel free to have a deeper look at my repositories to find more specific projects. Any feedback, suggestions, or tips are always welcome!

I'm always happy for a coffee, a chit-chat, or a discussion about possible collaborations. Drop me an email at mariabalos16@gmail.com or send me a message through LinkedIn.

📄 View my full CV

Pinned

  1. ryanair_timecapsule Public

    Ryanair's API was reverse-engineered to collect daily flight prices and train machine learning models to forecast price changes.

    Python 1

  2. linkedin_toggler Public

    Selenium script in Python that automates repetitive LinkedIn maintenance tasks.

    Python

  3. python_100_days_of_code Public

    This repository showcases my Python learning journey and includes 100+ solved exercises utilizing various libraries.

    Jupyter Notebook

  4. datamecum_tfm Public

    Master's dissertation for the DL and GenAI Master's degree at Datamecum. RAG for enhancing education.

    Python

  5. image_classification Public

    Using the neu_surface_defect_database to understand CNNs and image classification.

    Jupyter Notebook 2 1