THE PUTNAM WAY JOURNAL


Research Interests

How can we combine the best aspects of robust statistical latent "theme" generation models of text, such as topic models, which are mostly unsupervised, with supervised natural language feature generation models, for better document organization and simultaneous multi-document snippet extraction?

In our most recent work we tackled this exact problem by modeling documents tagged from two different perspectives. It was good to see that supervised models can be improved independently of unsupervised models while still aiding the latter in better document organization, and even in providing faceted document navigation.


What I Try To Do

My C.V. : pdf
LDA from two perspectives

Academic activities

(+) I have been an RA since Spring 2011
(+) My previous TA assignments can be found here


Frequently Visited Sites

Activities: Swimming


Important references

(+) An all-time hit reference on matrix algebra for statistics is Kaare Brandt Petersen and Michael Syskind Pedersen's Matrix Cookbook.

(+) Jonathan Shewchuk's painless conjugate gradient


Quite a feat!

Read my uncle Anadish's story on Wikipedia and on his homepage



Site under construction

Caution: This tiny part of the internet is under constant repair. Slow down as you browse and check back frequently - State Law.

Disclaimer: All views expressed in this subset of web pages are my own and do not necessarily reflect the views of my department or school.


Can we restructure the shelves to accommodate unshelved plates easily?

What I Have Been Doing Recently

Applying unsupervised statistical topic models to different problems of interest

Code used in "Simultaneous Joint and Conditional Modeling of Documents Tagged from Two Perspectives (CIKM 2011)"

Special thanks to Jordan Boyd-Graber for an important hint on parameter regularization in TagLDA and its extensions.

Latent Dirichlet Allocation in Slow Motion

I spent some time drafting what goes through LDA's mind when it crunches count data: Journey of parametric hierarchical multinomial generative mixture models, a.k.a. LDA/TagLDA

Datasets I Have Recently Used

Multidomain sentiment dataset based on Amazon Reviews

Find more datasets from Sandbox

Yahoo! Answers Dataset

A small sample from Yahoo! Answers collected for summarization - YAHOO! Answers Jul-Aug-2009 DataSet

AND09 Dataset

Blog data from the US presidential election campaign, August to October 2008 (all data were verified by visiting the blog sites) - AND09 Obama Mccain Speeches and Reactions in Blogs Dataset
email: pdas3 at [university domain]