kimble125/README.md

Hi, I'm Hoyeon (호연) 👋

AI Engineer passionate about Retrieval-Augmented Generation (RAG) and document automation: OCR → retrieval → grounded answers.

What I focus on:

  • RAG pipelines with Azure (retrieval, embeddings, API integration)
  • OCR automation (Azure Document Intelligence, PaddleOCR)
  • Document-to-knowledge workflows and evaluation

🔭 Representative Projects

  • HistPath (Team Project, Sesac)
    OCR of Joseon Annals (조선왕조실록), integrated with Azure AI Search and RAG chatbot (strict vs. creative modes).
    📄 Final Presentation (PDF)
    🔗 Repository

  • PoliSight (Team Project, Sesac)
    National Assembly attendance data parser, large-scale PDF OCR, data cleaning, and RAG-ready pipeline.
    📄 Mid-term Report (PDF)
    🔗 Parser Repo

  • rag-azure-starter (Personal)
    Minimal RAG setup with Azure OpenAI + Cosmos DB (or Azure AI Search).

  • rag-eval-harness (Personal)
    Automated evaluation for RAG pipelines (accuracy, faithfulness, evidence exposure).


💼 Career

  • TheSunhan, Inc. (더선한 주식회사) – Data Engineer/Analyst (Intern) · 2024.11–2024.12
    Built ETL → dashboard for B2B service The Peak; wrote a research article on HyDE for RAG (theory & applications). 📄 Article: Enhancing RAG with HyDE – Theory & Applications

  • Strasse (sole proprietorship) – Founder/Operator, E-commerce · 2021.05–2024.10
    Ran a women’s plus-size footwear shop end-to-end (product sourcing, operations, CS) and made data-driven decisions.


📚 Education & Training

  • Microsoft AI Engineering Program (청년취업사관학교) · 2025.06–2025.08
    Learned Azure-based AI solution development; built RAG pipelines and deployed them in projects using Azure ML and Azure Functions.

  • Salesforce Tableau Bootcamp · 2024.10–2024.11
    Advanced dashboards and data storytelling with Tableau.

  • Sparta Coding Club – Data Analysis Bootcamp · 2024.06–2024.11
    SQL, Python, ML, statistics; four domain projects.


🧩 Tech Stack & Interests

  • OCR: Azure Document Intelligence, PaddleOCR
  • Vector Search: Azure AI Search, Cosmos DB
  • LLM Engineering: Embeddings, GPT-based apps
  • Backend/API: FastAPI, Azure Functions
  • UI/Prototyping: Gradio, Streamlit
  • Visualization: Tableau, Plotly/Matplotlib (Python)
  • Data: PostgreSQL, CI/CD on Azure

Also interested in broader AI engineering domains beyond RAG, including NLP, data pipelines, and applied ML.


📫 Contact

Pinned

  1. AutoAuthor (Public)

    AutoAuthor: AI-powered content planning & SEO keyword analyzer for Korean bloggers | content trend detection + SEO keyword analysis + automatic AI content-plan generation

    Python

  2. AutoInvest (Public)

    A personalized AI-driven investment data pipeline and daily economic briefing system. This repository automates the collection of global market indicators (KOSPI, S&P500, Tech Stocks), calculates t…

    Python

  3. causal-inference-lab (Public)

    Learning notes on causal inference & Bayesian statistics for marketing analytics, based on the Pseudo-Lab Marketing Science study (2025).

    Jupyter Notebook

  4. PMIK-sns-analysis (Public)

    PM-International Korea SNS data crawling and analysis

    HTML

  5. cutmaster-webtoon-analysis (Public)

    Data Analysis Portfolio: Discovering Webtoon Success DNA and Building a What-If Simulator for Growth Actions.

    Python

  6. movie-club-ticket-notifier (Public)

    🎬 Telegram bot for movie clubs: CGV IMAX booking alerts, film festival notifications, and universal web change detection. A ticket-booking alert bot for movie clubs.

    Python