AI Engineer passionate about Retrieval-Augmented Generation (RAG) and document automation: OCR → retrieval → grounded answers.
What I focus on:
- RAG pipelines with Azure (retrieval, embeddings, API integration)
- OCR automation (Azure Document Intelligence, PaddleOCR)
- Document-to-knowledge workflows and evaluation
- HistPath (Team Project, Sesac)
OCR of the Joseon Annals (조선왕조실록), integrated with Azure AI Search and a RAG chatbot (strict vs. creative modes).
📄 Final Presentation (PDF)
🔗 Repository
- PoliSight (Team Project, Sesac)
National Assembly attendance data parser, large-scale PDF OCR, data cleaning, and RAG-ready pipeline.
📄 Mid-term Report (PDF)
🔗 Parser Repo
- rag-azure-starter (Personal)
Minimal RAG setup with Azure OpenAI + Cosmos DB (or Azure AI Search).
- rag-eval-harness (Personal)
Automated evaluation for RAG pipelines (accuracy, faithfulness, evidence exposure).
- TheSunhan, Inc. (더선한 주식회사) – Data Engineer/Analyst (Intern) · 2024.11–2024.12
Built an ETL-to-dashboard pipeline for the B2B service The Peak; wrote a research article on HyDE for RAG (theory & applications).
📄 Article: Enhancing RAG with HyDE – Theory & Applications
- Strasse (sole proprietorship) – Founder/Operator, E-commerce · 2021.05–2024.10
Ran a women’s plus-size footwear shop end-to-end (product sourcing, operations, CS). Practiced data-driven decision-making.
- Microsoft AI Engineering Program (청년취업사관학교) · 2025.06–2025.08
Learned Azure-based AI solution development; built RAG pipelines and deployed them via Azure ML and Azure Functions in project work.
- Salesforce Tableau Bootcamp · 2024.10–2024.11
Advanced dashboards and data storytelling with Tableau.
- Sparta Coding Club – Data Analysis Bootcamp · 2024.06–2024.11
SQL, Python, ML, statistics; four domain projects.
- OCR: Azure Document Intelligence, PaddleOCR
- Vector Search: Azure AI Search, Cosmos DB
- LLM Engineering: Embeddings, GPT-based apps
- Backend/API: FastAPI, Azure Functions
- UI/Prototyping: Gradio, Streamlit
- Visualization: Tableau, Plotly/Matplotlib (Python)
- Data: PostgreSQL, CI/CD on Azure
Also interested in broader AI engineering domains beyond RAG, including NLP, data pipelines, and applied ML.
- 📧 Email: hoykim125@gmail.com
- 💼 LinkedIn: linkedin.com/in/hoyeon-kim125