My Projects

LLM Vulnerabilities in RAG Systems

Featured
  • Exposed 40% performance drops in RAG systems via poisoned vector databases; proposed mitigation strategies.
  • Built a scalable fine-tuning setup for LLaMA-2 using LoRA/QLoRA and 4-bit quantization.
  • Developed a semantic QA pipeline with ChromaDB, InstructorEmbedding, and LangChain integration.
  • Streamlined large-scale document processing with PyPDFLoader and SentenceTransformers.
  • Technologies:
    Python Hugging Face LangChain ChromaDB InstructorEmbedding RAG Systems

Emotion-Aware Chatbot

In Progress
  • Developing a real-time face detection and emotion classification system using OpenCV, Streamlit, and ResNet SSD with EfficientNetB0.
  • Converted EfficientNetB0 to TensorFlow Lite with float16 quantization for efficient mobile deployment.
  • Designed a Streamlit interface with live webcam, bounding box toggling, and confidence threshold tuning.
  • Integrating a Hugging Face-powered chatbot using LangChain to adapt conversations based on detected emotions.
  • Containerizing the full application with Docker for reproducible, cross-platform deployment.
  • Technologies:
    Python OpenCV TensorFlow Streamlit Gradio Hugging Face LangChain Docker

Optimizing Question-Answering in LLMs

  • Developed a medical QA assistant using Mistral 7B v0.2, with LoRA/QLoRA fine-tuning and RAG/RAFT techniques for enhanced response quality.
  • Built a retrieval-augmented pipeline to supply domain-specific medical context during model inference.
  • Fine-tuned the model on curated medical datasets (MedQuad, MedicalQA), boosting answer accuracy from 84% to over 95%.
  • Technologies:
    Python Mistral 7B Hugging Face LangChain LoRA QLoRA RAFT

Diabetic Retinopathy Detection

  • Built a deep learning model with Inception V3 (TensorFlow/Keras) using transfer learning to classify retinal images into five severity categories.
  • Achieved 79% accuracy on the APTOS 2019 Blindness Detection dataset through model fine-tuning and hyperparameter optimization.
  • Enhanced diagnostic performance by applying preprocessing techniques such as Gaussian blur, CLAHE, and edge detection (Sobel, Canny) with OpenCV.
  • Technologies:
    Python TensorFlow Keras OpenCV Inception V3

Machine Learning-based Surge Pricing Predictor

  • Built machine learning models (XGBoost, Random Forest, SVM, Neural Networks) to predict price surges in hourly market data.
  • Achieved 76% precision by fine-tuning models on historical pricing patterns and market movement trends.
  • Applied Principal Component Analysis (PCA) for feature selection and dimensionality reduction to improve model performance.
  • Technologies:
    Python XGBoost Random Forest SVM Neural Networks Scikit-learn PCA