Tony Ojeda, Benjamin Bengfort, Laura Lorenz - Natural Language Processing with NLTK and Gensim
May 30, 2016
Speakers: Tony Ojeda, Benjamin Bengfort, Laura Lorenz

In this tutorial, we will begin by exploring the features of the NLTK library. We will then focus on building a language-aware data product: a topic identification and document clustering algorithm built from a web crawl of blog sites. The clustering algorithm will start with a simple Lesk K-Means clustering and will then be improved with an LDA analysis using the popular Gensim library.

Slides can be found at: https://speakerdeck.com/pycon2016 and https://github.com/PyCon/2016-slides
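To give a rough feel for the document clustering step mentioned in the description, here is a minimal sketch of plain K-Means over normalized bag-of-words vectors, using only the Python standard library. It is not the tutorial's code and not the "Lesk K-Means" variant the description names; the toy documents, the helper names (`tokenize`, `vectorize`, `kmeans`) and the fixed initial centroids are all assumptions made for illustration. A real pipeline would more likely use NLTK for tokenization and Gensim for the LDA stage.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase, whitespace split, alphabetic tokens only.
    return [w for w in text.lower().split() if w.isalpha()]


def vectorize(docs):
    # Build unit-length bag-of-words vectors over a shared, sorted vocabulary.
    vocab = sorted({w for d in docs for w in tokenize(d)})
    vectors = []
    for d in docs:
        counts = Counter(tokenize(d))
        vec = [counts[w] for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def sq_dist(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, init_indices, max_iter=20):
    # Plain K-Means; centroids start at the documents named in init_indices
    # (fixed here so the toy run is deterministic).
    centroids = [list(vectors[i]) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy stand-ins for crawled blog posts (made up for this sketch).
docs = [
    "python code library python",
    "python library tutorial code",
    "recipe butter flour sugar",
    "sugar flour recipe oven",
]
vocab, vectors = vectorize(docs)
labels = kmeans(vectors, init_indices=[0, 2])
print(labels)
```

On this toy input the two programming posts share one cluster label and the two baking posts share the other. Topic modeling with LDA would replace the hard cluster assignment with a per-document mixture over topics.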

PyCon 2016
