KSHITIJ

my projects

  • cloudy (research project)

    • early cloudburst prediction system
    • signal decomposition: decomposing the original signal into multiple IMFs (Intrinsic Mode Functions) using EEMD (Ensemble Empirical Mode Decomposition)
    • feature extraction: identifying the frequency components of each IMF using FCR (fine-to-coarse reconstruction)
    • integrated hardware for real-time data: ESP32 with DHT11 & beta rain sensor
    • data pre-processing: datetime transformations, cyclic/seasonal features, sliding window
    • prediction pipeline: LSTM, GRU, CNN-1D, & TFT (Temporal Fusion Transformer)
    • model evaluation: MSE, MAE, RMSE, false negative rate, and F1-score
    • model re-training: DVC
    • not yet deployed
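
    a minimal sketch of the data pre-processing step (cyclic time features + sliding window), using only the Python standard library. the readings, the 24-step window, and the helper names `cyclic_encode` and `sliding_windows` are illustrative assumptions, not the project's actual code:

```python
import math
from datetime import datetime, timedelta


def cyclic_encode(value, period):
    """Map a periodic value (e.g. hour 0-23) onto the unit circle as (sin, cos)."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def sliding_windows(rows, targets, window):
    """Pair each run of `window` consecutive rows with the target that follows it."""
    X, y = [], []
    for start in range(len(rows) - window):
        X.append(rows[start:start + window])
        y.append(targets[start + window])
    return X, y


# Hypothetical hourly readings: (timestamp, humidity %, rainfall mm).
start = datetime(2024, 7, 1, 0, 0)
readings = [(start + timedelta(hours=h), 60.0 + h, 0.1 * h) for h in range(48)]

features = []
for ts, humidity, rain in readings:
    # Hour of day and day of year wrap around, so encode them cyclically.
    hour_sin, hour_cos = cyclic_encode(ts.hour, 24)
    doy_sin, doy_cos = cyclic_encode(ts.timetuple().tm_yday, 365.25)
    features.append([humidity, rain, hour_sin, hour_cos, doy_sin, doy_cos])

rainfall = [rain for _, _, rain in readings]
X, y = sliding_windows(features, rainfall, window=24)
print(len(X), len(X[0]), len(X[0][0]))  # windows, steps per window, features per step
```

    the sin/cos pair keeps hour 23 close to hour 0, which a raw integer hour would not.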

    tech used: Python, NumPy, Pandas, Matplotlib, Seaborn, Scikit-learn, PyTorch, TensorFlow, Keras, Hugging Face Transformers, LangChain, Firebase
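
    a minimal sketch of how the evaluation metrics listed for cloudy can be computed, standard library only. the sample values and the 100 mm/h event threshold are assumptions made for illustration; they are not results from the project:

```python
import math


def regression_errors(actual, predicted):
    """MSE, MAE, and RMSE between two equal-length sequences."""
    n = len(actual)
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
    return {"mse": mse, "mae": mae, "rmse": math.sqrt(mse)}


def event_scores(actual_events, predicted_events):
    """False negative rate and F1-score for binary event labels."""
    pairs = list(zip(actual_events, predicted_events))
    tp = sum(1 for a, p in pairs if a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    fp = sum(1 for a, p in pairs if not a and p)
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {"fnr": fnr, "f1": f1}


# Hypothetical rainfall intensities in mm/h.
actual = [2.0, 110.0, 5.0, 120.0]
predicted = [4.0, 100.0, 105.0, 90.0]
THRESHOLD = 100.0  # assumed cutoff for flagging an event

errors = regression_errors(actual, predicted)
scores = event_scores(
    [a >= THRESHOLD for a in actual],
    [p >= THRESHOLD for p in predicted],
)
print(errors, scores)
```

    false negative rate is tracked separately because a missed event is costlier than a false alarm in an early-warning setting.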

  • adaptive rag for enterprise support systems (research project)

    • low-latency, high-accuracy RAG system for enterprise query resolution
    • semantic search: SBERT embeddings + FAISS dense retrieval & vector store
    • contextual response generation: Llama 3.2 2B with document re-ranking, plus retrieval and hallucination grading
    • processed 1K+ agriculture- and sustainable-energy-related docs & 100+ queries using spaCy & Tesseract OCR
    • outperformed baselines (GPT-3.5, BART-only, TF-IDF, and BM25) on F1, ROUGE-L, BLEU, and faithfulness
    • reduced response latency by 28% (~620 ms/query)
    • pipeline orchestration & re-training: LangChain, FastAPI, DVC
    • not yet deployed

    tech used: Python, Pandas, spaCy, Tesseract OCR, PyTorch, Hugging Face Transformers, Sentence Transformers, ChromaDB, FAISS, LangChain, FastAPI, Docker, DVC
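
    a toy sketch of the dense-retrieval idea behind the semantic search step: embed the query and the documents, then rank by cosine similarity and keep the top k. to stay dependency-free, a bag-of-words counter stands in for SBERT embeddings and a brute-force scan stands in for a FAISS index; the documents, the query, and the function names are made up for illustration:

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(query, docs, k=2):
    """Return (doc_index, score) for the k documents most similar to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(d)), i) for i, d in enumerate(docs)]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, ties by index
    return [(i, score) for score, i in scored[:k]]


docs = [
    "solar panels convert sunlight into electricity",
    "drip irrigation reduces water use in agriculture",
    "wind turbines generate electricity from wind",
]
hits = retrieve("how do solar panels make electricity", docs, k=2)
for index, score in hits:
    print(round(score, 3), docs[index])
```

    in a real pipeline the retrieved passages would then go through re-ranking and grading before being handed to the generator.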