my projects
- early cloudburst predetermination system
- signal decomposition: EEMD (Ensemble Empirical Mode Decomposition) splits the original signal into multiple IMFs (Intrinsic Mode Functions)
- feature extraction: identifying the frequency components of each IMF using FCR (fine-to-coarse reconstruction)
- integrated hardware for real-time data: ESP32 with DHT11 & beta rain sensor
- data pre-processing: datetime transformations, cyclic/seasonal features, sliding window
- prediction pipeline: LSTM, GRU, CNN-1D, & TFT (Temporal Fusion Transformer)
- model evaluation: MSE, MAE, RMSE, false negative rate, and F1-score
- model re-training: DVC
- status: not yet deployed
tech used: Python, Numpy, Pandas, Matplotlib, Seaborn, Scikit-learn, PyTorch, TensorFlow, Keras, HuggingFace Transformers, LangChain, Firebase
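The pre-processing step of the cloudburst project (cyclic/seasonal features and a sliding window) can be sketched roughly as below. This is a minimal illustration and not the project's actual code: the function names, the choice of hour and month as the cyclic fields, and the sample humidity readings are all assumptions, and plain Python is used instead of Pandas/NumPy to keep the sketch dependency-free.

```python
import math
from datetime import datetime


def cyclic_encode(value, period):
    """Map a periodic value (e.g. hour of day) onto the unit circle."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def datetime_features(ts):
    """Return sin/cos encodings for hour-of-day and month-of-year (assumed fields)."""
    hour_sin, hour_cos = cyclic_encode(ts.hour, 24)
    month_sin, month_cos = cyclic_encode(ts.month - 1, 12)
    return [hour_sin, hour_cos, month_sin, month_cos]


def sliding_windows(series, window, horizon=1):
    """Split a sequence into (input window, target) pairs.

    The target is the value `horizon` steps after the end of each window.
    """
    pairs = []
    for start in range(len(series) - window - horizon + 1):
        end = start + window
        pairs.append((series[start:end], series[end + horizon - 1]))
    return pairs


# Hypothetical humidity readings, one per time step.
readings = [61, 63, 66, 70, 75, 81]
pairs = sliding_windows(readings, window=3)
features = datetime_features(datetime(2024, 1, 1, 6))
```

The sin/cos pair keeps neighbouring times close together (23:00 and 00:00 end up adjacent), which a raw hour number would not. Each `(window, target)` pair is the kind of supervised sample a sequence model such as an LSTM or GRU would consume.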
- low-latency, high-accuracy RAG system for enterprise query resolution
- semantic search: SBERT embeddings + FAISS dense retrieval & vector store
- contextual response generation: Llama3.2 2B with document re-ranking, plus retrieval and hallucination grading
- processed 1K+ agriculture and sustainable-energy documents and 100+ queries using spaCy & Tesseract OCR
- outperformed baselines such as GPT-3.5, BART-only, TF-IDF, and BM25 on F1, ROUGE-L, BLEU, and faithfulness
- reduced response latency by 28% (~620 ms/query)
- pipeline orchestration & re-training: LangChain, FastAPI, DVC
- status: not yet deployed
tech used: Python, Pandas, spaCy, Tesseract OCR, PyTorch, HuggingFace Transformers, Sentence Transformers, ChromaDB, FAISS, LangChain, FastAPI, Docker, DVC
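The semantic-search step of the RAG project (embed, then retrieve the nearest documents) can be illustrated with a small sketch. It is not the project's implementation: brute-force cosine similarity in plain Python stands in for SBERT embeddings and a FAISS index, and the `normalize` and `top_k` helpers and the toy 3-dimensional vectors are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        return list(vec)
    return [x / norm for x in vec]


def top_k(query, doc_vectors, k=2):
    """Return (doc_index, cosine_similarity) for the k most similar documents."""
    q = normalize(query)
    scores = []
    for idx, vec in enumerate(doc_vectors):
        d = normalize(vec)
        scores.append((idx, sum(a * b for a, b in zip(q, d))))
    # Highest similarity first.
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:k]


# Toy "embeddings"; real sentence embeddings have hundreds of dimensions.
docs = [
    [1.0, 0.0, 0.0],
    [0.7, 0.7, 0.0],
    [0.0, 0.0, 1.0],
]
hits = top_k([1.0, 0.1, 0.0], docs, k=2)
```

In a pipeline like the one described, the returned candidates would then go to the re-ranking and grading stages before generation. An exhaustive scan like this is only practical for small collections, which is the usual reason for using a vector index instead.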