IIT Kanpur (IITK)

Leveraging Offline Public Data in Online Differentially Private Policy Fine-Tuning (Prof. Sayak Chowdhury, Computer Science & Engineering)

Modern machine learning models are often trained on offline data and then continue to learn from online user interactions. This raises privacy concerns, especially during fine-tuning stages that involve sensitive data. Differential Privacy (DP) mitigates these risks by adding calibrated noise during training, though the noise can reduce accuracy. Using offline public data helps reduce this privacy-accuracy trade-off. This project aims to design DP-compliant bandit and reinforcement learning algorithms that use such data, to establish theoretical performance guarantees for them, and to compare them against offline and online baselines. It also seeks to develop DP policy fine-tuning for aligning large language models, ultimately enabling privacy-preserving, trustworthy AI systems such as secure chatbots.
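To make the privacy-accuracy trade-off concrete, here is a minimal sketch of the standard Laplace mechanism applied to a mean of bounded rewards. It is not the project's algorithm. The function names (`laplace_scale`, `private_mean`), the synthetic rewards, and the replace-one-record notion of neighbouring datasets are all assumptions made for illustration.

```python
import random


def laplace_scale(n, epsilon):
    """Laplace noise scale for the mean of n values in [0, 1] under epsilon-DP.

    Assumption: neighbouring datasets differ by replacing one record, so the
    mean changes by at most 1/n (its sensitivity). Scale = sensitivity / epsilon.
    """
    return 1.0 / (n * epsilon)


def private_mean(values, epsilon, rng):
    """Release the mean of values, clipped to [0, 1], with Laplace noise added."""
    clipped = [min(1.0, max(0.0, v)) for v in values]
    scale = laplace_scale(len(clipped), epsilon)
    # The difference of two independent Exp(1) draws is a Laplace(0, 1) sample.
    noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
    return sum(clipped) / len(clipped) + noise


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for private user feedback (hypothetical data).
    rewards = [rng.random() for _ in range(200)]
    for eps in (0.1, 1.0, 10.0):
        print(eps, laplace_scale(len(rewards), eps), private_mean(rewards, eps, rng))
```

The noise scale is 1/(n·ε), so it grows as the privacy budget ε shrinks and falls as the number of private samples n grows. Public data needs no noise at all, which is one intuition for why it can ease the trade-off described above.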
