Subhojyoti Mukherjee

I am a research scientist at Adobe Research. My expertise spans algorithm research and development, training machine learning models, reinforcement learning, and fine-tuning and alignment of LLMs.

Email: subhomuk [at] adobe [dot] com

Education

Ph.D. candidate
(Fall 2019 to Feb 2025)
at ECE, University of Wisconsin Madison
advised by Dr. Robert Nowak, Dr. Josiah Hanna, and Dr. Qiaomin Xie

Areas of Research: Reinforcement learning, active learning, deep active learning strategies for Large Language Models (LLMs), aligning LLMs with human feedback (RLHF), and understanding sequential decision-making using decision transformers (DT).

PhD Thesis: Adaptive Data Collection for Policy Evaluation, Multi-task Learning and LLM Alignment pdf

(Joint) Masters Thesis: Active Sequential Hypothesis Testing with Extension to Active Regression and Multi-armed Bandits pdf
M.S. by Research
(2015 to 2018)
at CSE, Indian Institute of Technology (IIT) Madras
advised by Dr. Balaraman Ravindran and Dr. Nandan Sudarsanam
RISE Lab

Areas of Research: Reinforcement learning, stochastic and non-stochastic multi-armed bandit settings.

Masters Thesis: Finite-time Analysis of Frequentist Strategies for Multi-armed Bandits pdf
Bachelor of Technology
(2009 to 2013)
at Dept. of Computer Science and Engineering
Meghnad Saha Institute of Technology, Kolkata
under West Bengal University of Technology, India

Research Internships

Amazon AWS AI, Santa Clara, USA
Summer 2024 (Full-time)
hosted by Branislav Kveton, Anusha Lalitha
and: Sailik Sengupta, Yifei Ma, Aniket Deshmukh, Gaurush Hiranandani.

Area of Research: Multi-objective alignment for LLMs.
Amazon AWS AI, Santa Clara, USA
Fall 2023 (Part-time)
hosted by Branislav Kveton
and: Yifei Ma, Anusha Lalitha, Kousha Kalantiri, Ge Liu, Aniket Deshmukh, Anoop Deoras.

Area of Research: RLHF with LLMs.
Amazon AWS AI, Santa Clara, USA
Summer 2023 (Full-time)
hosted by Branislav Kveton
and: Yifei Ma, Anusha Lalitha, Ge Liu, Aniket Deshmukh, Anoop Deoras.

Area of Research: Active In-Context Learning with LLMs.
CMU, ECE Dept., Pittsburgh, USA
Summer 2019
hosted by Prof. Gauri Joshi
Area of Research: Structured Bandits.
Adobe Research, San Jose, USA
Spring 2018
hosted by Branislav Kveton
Area of Research: Item recommendation with Ranking and Bandits.
INRIA, SequeL Lab, Lille, France
Fall 2017
hosted by Odalric Maillard
Area of Research: Non-stationary Bandits.
