Current Research Projects
Consensus Based Machine Learning | Digital Humanities Research | Mining Incident Streams from Social Media
Consensus Based Machine Learning - Development of efficient distributed algorithms for machine learning

Related Publications
    Distributed Relational Learning
  • Haimonti Dutta and Ashwin Srinivasan. "Consensus-based modeling using distributed feature construction with ILP." Machine Learning 107 (5), 825-858

    Distributed Support Vector Machines
  • Haimonti Dutta, "Games, Auctions and Consensus Based Machine Learning", FLAIRS 2019.
  • Haimonti Dutta, "The effect of stochastic approximations on a Gossip bAseD subGradiEnt solver for Linear SVMs", FLAIRS 2019.
  • Haimonti Dutta and Nitin Natraj, "GADGET SVM: a Gossip-bAseD sub-GradiEnT SVM Solver".
  • GADGET SVM ver1.0 has been released; the source code is in the project's GitHub repository.
  • Chase Hensel and Haimonti Dutta, "GERMS: a distributed sub-Gradient ERM Solver", 4th Annual Machine Learning Symposium at the New York Academy of Sciences (NYAS), New York, November, 2009.
  • Chase Hensel and Haimonti Dutta, "GADGET SVM: a Gossip-bAseD sub-GradiEnT SVM Solver", International Conference on Machine Learning (ICML), Numerical Mathematics in Machine Learning Workshop, Montreal, Quebec, 2009.


    Distributed Linear Programming
  • Xianshu Zhu, Tushar Mahule, Haimonti Dutta, Sugandha Arora, Hillol Kargupta, Kirk D. Borne: Peer-to-peer distributed text classifier learning in PADMINI. Statistical Analysis and Data Mining 5 (5): 446-462, 2012.
  • Haimonti Dutta, "A Randomized Gossip-based Algorithm for Classification on Peer-to-Peer Networks", In Proceedings of the NIPS Workshop on Big Learning: Algorithms, Systems, and Tools for Learning at Scale, Granada, Spain, Dec 2011.
  • Haimonti Dutta and Hillol Kargupta, "Distributed Linear Programming and Resource Management for Data Mining in Distributed Environments", 10th International Workshop on High Performance Data Mining (HPDM), held in conjunction with the International Conference on Data Mining (ICDM), Pisa, Italy.
  • Haimonti Dutta and Ananda Mathur, "Distributed Optimization Strategies for Mining on Peer-to-Peer Networks", Accepted for publication in International Conference on Machine Learning and Applications (ICMLA), 2008. (Nominated for the Best Paper Award)
Digital Humanities Research - Machine Learning on the Historic Newspaper Archive of the New York Public Library

This research was funded by the National Endowment for the Humanities, and the project is designated a "We the People" project.

Computers may have defeated humans at chess and arithmetic, but there are many areas where the human mind still excels, such as visual cognition and language processing (Comm. of ACM, Vol 52, No 3, Mar. 2009). If one mind is good, it has been argued that several minds working together are likely to outperform individuals, and even experts, at certain tasks. This project aims to leverage the wisdom of crowds (von Ahn, 2008) to collaboratively tag historical newspaper articles in the holdings of the New York Public Library (NYPL). Patrons and scholars will be encouraged to generate custom tags for articles they read and use often; these tags will be integrated into a metadata library and evaluated for their contribution to improving document retrieval. Novel machine learning algorithms will be designed for automatic categorization of newspaper articles. The creation and analysis of this corpus will enable advanced search mechanisms on these holdings, making them more useful to the general public.

Webpage

Related Projects: Chronicling America, National Digital Newspaper Program

Related Publications
  • Haimonti Dutta, Jayashree Chandrasekaran, Kaushik Panneerselvam. Person Name Disambiguation based on Profession. INFORMS Annual Meeting, Workshop on Data Mining and Decision Analytics, 2018.
  • Aayushee Gupta, Haimonti Dutta, Srikanta Bedathur and Lipika Dey. A Machine Learning Approach to Quantitative Prosopography.
  • Aayushee Gupta and Haimonti Dutta. A Machine Learning Framework for Quantitative Prosopography, Grace Hopper Celebration in India, December 2015.
  • Aayushee Gupta and Haimonti Dutta. Evaluation of Spell Correction on Noisy OCR Data. INFORMS Workshop on Data Mining and Analytics at INFORMS Annual Meeting, Philadelphia, October 2015.
  • Haimonti Dutta and William Chan, "Using Community Structure Detection to Rank Annotators when Ground Truth is subjective", NIPS Workshop on Human Computation in Science and Computational Sustainability, Lake Tahoe, Dec 7-8, 2012.
  • Haimonti Dutta, Rebecca J. Passonneau, Austin Lee, Axinia Radeva, Boyi Xie, David Waltz and Barbara Taranto, "Learning Parameters of the K-Means Algorithm from Subjective Human Annotation", The 24th International FLAIRS Conference, Special Track on Data Mining, Palm Beach, FL, May 18-20, 2011.
  • Austin Lee, Haimonti Dutta, Rebecca Passonneau, David Waltz and Barbara Taranto, "Topic Identification from Historic Newspaper Articles of the New York Public Library: A Case Study", 5th Annual Machine Learning Symposium, NYAS, 2010.
Mining Incident Streams From Social Media - The goal of this project is to develop mobile apps for situational awareness during emergencies (such as shooting incidents and weather-related catastrophes) that use sophisticated machine learning and natural language processing algorithms.

Related Publications
  • H Dutta, H Kwon, HR Rao. A System for Intergroup Prejudice Detection: The Case of Microblogging under Terrorist Attacks. Decision Support Systems.
    Media Impressions: Campus Reform, UBNow.
Past Research Projects