Current Research Projects
Consensus Based Machine Learning | Healthcare Analytics | Digital Humanities Research | Agentic AI and Recommendation Systems | Accounting Fraud Detection | Mining Incident Streams from Social Media
Consensus Based Machine Learning - Development of efficient distributed algorithms for machine learning

Related Projects
    Distributed Primal-Dual Methods
    The goal is to construct scalable, efficient, distributed primal-dual algorithms.

    Current students
  • Davoud Moradi
  • Naoman Rukdikar

    Distributed Relational Learning
  • Haimonti Dutta and Ashwin Srinivasan. "Consensus-based modeling using distributed feature construction with ILP." Machine Learning 107 (5), 825-858

    Distributed Support Vector Machines
  • Haimonti Dutta, "Games, Auctions and Consensus Based Machine Learning", FLAIRS 2019.
  • Haimonti Dutta, "The effect of stochastic approximations on a Gossip bAseD subGradiEnt solver for Linear SVMs", FLAIRS 2019.
  • Haimonti Dutta and Nitin Natraj, "GADGET SVM: a Gossip-bAseD sub-GradiEnT SVM Solver".
  • GADGET SVM ver1.0 release: available on GitHub.
  • Chase Hensel and Haimonti Dutta, "GERMS: a distributed sub-Gradient ERM Solver", 4th Annual Machine Learning Symposium at the New York Academy of Sciences (NYAS), New York, November, 2009.
  • Chase Hensel and Haimonti Dutta, "GADGET SVM: a Gossip-bAseD sub-GradiEnT SVM Solver", International Conference on Machine Learning (ICML), Numerical Mathematics in Machine Learning Workshop, Montreal, Quebec, 2009.

    Distributed Linear Programming
  • Xianshu Zhu, Tushar Mahule, Haimonti Dutta, Sugandha Arora, Hillol Kargupta, Kirk D. Borne: Peer-to-peer distributed text classifier learning in PADMINI. Statistical Analysis and Data Mining 5 (5): 446-462, 2012.
  • Haimonti Dutta, "A Randomized Gossip-based Algorithm for Classification on Peer-to-Peer Networks", In Proceedings of the NIPS Workshop on Big Learning: Algorithms, Systems, and Tools for Learning at Scale, Granada, Spain, Dec 2011.
  • Haimonti Dutta and Hillol Kargupta, "Distributed Linear Programming and Resource Management for Data Mining in Distributed Environments", 10th International Workshop on High Performance Data Mining (HPDM) held in conjunction with the International Conference on Data Mining (ICDM), Pisa, Italy.
  • Haimonti Dutta and Ananda Mathur, "Distributed Optimization Strategies for Mining on Peer-to-Peer Networks", Accepted for publication in International Conference on Machine Learning and Applications (ICMLA), 2008. (Nominated for the Best Paper Award)
Healthcare Analytics - Design and development of healthcare solutions using Machine Learning, Deep Learning, Reinforcement Learning, Natural Language Processing, and Generative AI

Related Projects
    Clinical Outcome Prediction
    Electronic Health Records (EHRs) have been widely adopted in the last few decades, leading to an increase in the availability of digital healthcare data (Navaz et al., 2022). An EHR usually encompasses a full range of data relevant to a patient’s care, such as demographics, medical history, immunization records, problems, medications, physician’s notes, clinically relevant updates, imaging reports, billing, claims, and insurance information. In this research, we are interested in accurately predicting clinical outcomes of hospital admissions (such as mortality, readmissions, diagnoses, procedures, and length of stay) using EHRs.

  • Shahrzad Khanizadeh, Justin Kwok, Haimonti Dutta, "Predicting Long Length of Hospital Stays from Clinical Notes and Structured EHRs", INFORMS Annual Meeting, Atlanta GA, 2025.
  • Patrick Zhang and Haimonti Dutta, "Data cleaning and pre-processing of Electronic Health Records (EHRs) collected during COVID-19 using R", SUNY Undergraduate Research Conference (SUNY SURC), Binghamton University, 2025
  • Haimonti Dutta, "Data driven approaches to estimating hospital length of stay during COVID", Second International Business Analytics Conference (IBAC), Fredonia, 2025.
  • Haimonti Dutta, "Using Patient Similarity Networks to Estimate Hospital Length of Stay", Decision Sciences Institute (DSI) Annual Pedagogy Conference, 2024.

    Our community partners include:
  • Buffalo Endovascular and Vascular Surgical Associates https://bevsapractice.com/
  • Roswell Park Comprehensive Cancer Center https://www.roswellpark.org/

    Current students working on this project:
  • Shahrzad Khanizadeh (First year Ph.D. student, Department of Management Science and Systems, UB)
  • Justin Kwok (Senior, Department of Computer Science and Economics, UB)
  • Siddhi Sunil Nalawade (M.S., Data Science, UB)
  • Fyrose Nowar (M.S., Online Business Analytics, UB)
  • Steven Adebisi (Senior, Department of Computer Science, UB). Funded by https://www.buffalo.edu/cpmc/cstep.html
Digital Humanities Research - Machine Learning on the Historic Newspaper Archive of the New York Public Library

This research was funded by the National Endowment for the Humanities, and the project is designated as a "We the People" project.

Computers may have defeated humans in chess and arithmetic, but there are many areas where the human mind still excels, such as visual cognition and language processing (Comm. of ACM, Vol 52, No 3, Mar. 2009). If one mind is good, it has been argued, several minds are likely to outperform individuals, and even experts, in certain tasks. This project aims to leverage the wisdom of the crowds (von Ahn, 2008) to collaboratively tag historical newspaper articles in the holdings of the New York Public Library (NYPL). Patrons and scholars will be encouraged to generate custom tags for articles they read and use often; these will be integrated into a metadata library and evaluated for their contribution to improving retrieval of documents. Novel machine learning algorithms will be designed for automatic categorization of newspaper articles. The creation and analysis of this corpus will enable advanced search mechanisms on these holdings, making them more useful to the general public.

Related Projects: Chronicling America, National Digital Newspaper Program

Related Publications
  • Haimonti Dutta, Jayashree Chandrasekaran, Kaushik Panneerselvam. Person Name Disambiguation based on Profession. INFORMS Annual Meeting, Workshop on Data Mining and Decision Analytics, 2018.
  • Aayushee Gupta, Haimonti Dutta, Srikanta Bedathur and Lipika Dey. A Machine Learning Approach to Quantitative Prosopography.
  • Aayushee Gupta and Haimonti Dutta. A Machine Learning Framework for Quantitative Prosopography, Grace Hopper Celebration in India, December 2015.
  • Aayushee Gupta and Haimonti Dutta. Evaluation of Spell Correction on Noisy OCR Data. INFORMS Workshop on Data Mining and Analytics at INFORMS Annual Meeting, Philadelphia, October 2015.
  • Haimonti Dutta and William Chan, "Using Community Structure Detection to Rank Annotators when Ground Truth is subjective", NIPS Workshop on Human Computation in Science and Computational Sustainability, Lake Tahoe, Dec 7-8, 2012.
  • Haimonti Dutta, Rebecca J. Passonneau, Austin Lee, Axinia Radeva, Boyi Xie, David Waltz and Barbara Taranto, "Learning Parameters of the K-Means Algorithm from Subjective Human Annotation", The 24th International FLAIRS Conference, Special Track on Data Mining, Palm Beach, FL, May 18-20, 2011.
  • Austin Lee, Haimonti Dutta, Rebecca Passonneau, David Waltz and Barbara Taranto, "Topic Identification from Historic Newspaper Articles of the New York Public Library: A Case Study", 5th Annual Machine Learning Symposium, NYAS, 2010.
Agentic AI and Recommendation Systems - Design of large-scale Machine Learning systems (such as recommendation systems and pricing models) using Agentic AI

Related Projects
    Agentic AI for Pricing in Recommendation Systems
    Design of autonomous AI agents to solve pricing problems in recommendation systems.
    Current students working on this project:
  • Jin Dai (First year Ph.D. student, Department of Management Science and Systems, UB)
    Folk-art Recommendation Systems
    A description of this project is available here
Mining Incident Streams From Social Media - The goal of this project is to develop mobile apps for situational awareness during emergencies (such as shooting incidents and weather-related catastrophes) that use sophisticated machine learning and natural language processing algorithms.

Related Publications
  • H Dutta, H Kwon, HR Rao. A System for Intergroup Prejudice Detection: The Case of Microblogging under Terrorist Attacks. Decision Support Systems.
    Media Impressions: Campus Reform, UBNow.
Past Research Projects