THE PUTNAM WAY JOURNAL


Publications

3) Topic Models on Ubiquitous Document Structure - Documents Tagged from Two Perspectives

This paper explores correspondence and mixture topic modeling of documents tagged from two different perspectives. The proposed models are novel in three respects: (i) they consider two different tag perspectives, a document-level perspective relevant to the document as a whole and a word-level perspective pertaining to each word in the document; (ii) they attribute latent topics with word-level tags and, in the case of multimedia documents, label latent topics with images; and (iii) they discover the possible correspondence of words to document-level tags.
(+) Simultaneous Joint and Conditional Modeling of Documents Tagged from Two Perspectives - Pradipto Das, Rohini Srihari and Yun Fu, 20th ACM Conference on Information and Knowledge Management, Glasgow, Scotland, UK, 24th-28th October 2011 paper
ERRATA: The typos in the expressions in the lines just above equations (17) and (20) have been corrected in this pdf - the pi's on the right-hand sides have been replaced with beta's. The corrections are also noted in the "Comments" tab/section of the ACM DL repository entry for this paper.



2) Topic Model based Multi-document Summarization

This paper addresses the problem of building a summarization system based on a Latent Dirichlet Allocation style topic model. We estimate the model parameters and, given query words (for example: "why are sugar substitutes bad for you?"), infer summaries from relevant documents in a very short time using variational Bayesian inference.
(+) Learning to Summarize Using Coherence - Pradipto Das and Rohini Srihari, NIPS Workshop on Applications of Topic Models: Text and Beyond, Whistler, BC, 2009. (short paper) paper

(+) A longer version of this paper appeared in the TAC09 proceedings. [NOTE: slightly modified here] paper

(+) Talk at TAC on Nov 16, 2009 and poster at TAC on Nov 17, 2009 [NOTE: slightly modified here] presentation


1) First Contact with Blogs

Which way is a blogger oriented: toward the speaker or toward the content of his/her speech? (A very tricky and hard problem, especially if generalized to data in any language.)
(+) Discovering Voter Preferences in Blogs using Mixtures of Topic Models - Pradipto Das, Rohini Srihari and Smruthi Mukund, AND'09, July 23-24, 2009, Barcelona, Spain [Noisy Text Data Analytics Workshop in conjunction with ICDAR'09] paper presentation

Major research events attended

Fall 2011

Gaithersburg, MD, USA > Text Analysis Conference 2011

Summer 2010

Intern at Janya Inc. Responsible for integrating an extensible topic modeling framework within the text analytics product called SEMANTEX.

Previously

Previous Research > Visiting Research Fellow at the "Center for Soft Computing Research" at the Indian Statistical Institute. Mostly worked on soft computing clustering algorithms.
Ad hoc Datasets > Visit the datasets page.


email: pdas3 at [university domain]