NLPx

Tales of Data Science

Conditional Random Fields (CRF): Short Survey

In the picture above you can see a random field.

Currently, many of us are overwhelmed by the mighty power of Deep Learning, and we are starting to forget about humble graphical models. CRF is not as trendy as LSTM, but it is robust, reliable and worth noting.

In this post, you will find a short summary of CRF (Conditional Random Fields): what it is, what it is for, and some interesting facts. Enjoy!


A tale about LDA2vec: when LDA meets word2vec

UPD: following a very useful comment by Oren, I see that I went too far in describing the differences between word2vec and LDA; in fact, they are not so different from an algorithmic point of view. So I have corrected this post. Errare humanum est, stultum est in errore perseverare, you know. I also strongly recommend reading this presentation by Yoav Goldberg.

A few days ago I found out about lda2vec (by Chris Moody), a hybrid algorithm combining the best ideas from the well-known LDA (Latent Dirichlet Allocation) topic modeling algorithm and from a somewhat less well-known language modeling tool named word2vec.

You can also read this text in Russian, if you like.

And now I’m going to tell you a tale about lda2vec and my attempts to try it out and compare it with a simple LDA implementation (I used the gensim package for this). So, once upon a time…
