Speakers: Tony Ojeda, Benjamin Bengfort, Laura Lorenz
In this tutorial, we will begin by exploring the features of the NLTK library. We will then focus on building a language-aware data product: a topic identification and document clustering algorithm built from a web crawl of blog sites. The clustering algorithm will start with a simple Lesk K-Means clustering and will then be improved with an LDA analysis using the popular Gensim library.
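To make the document clustering step concrete, here is a minimal sketch that uses only the Python standard library. It is not the speakers' code and it does not use NLTK or Gensim: the regex tokenizer, the smoothed TF-IDF weighting, the cosine-similarity variant of k-means (often called spherical k-means), the deterministic seeding, and the four sample sentences are all assumptions made for illustration. The Lesk and LDA parts of the tutorial are not covered here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Crude regex tokenizer; a stand-in for a real tokenizer such as NLTK's.
    return re.findall(r"[a-z]+", text.lower())


def normalize(vec):
    # Scale a sparse {term: weight} vector to unit length.
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else dict(vec)


def tfidf_vectors(docs):
    # One unit-length {term: weight} dict per document, with smoothed IDF.
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vec = {
            t: c * (math.log((1 + n) / (1 + doc_freq[t])) + 1.0)
            for t, c in counts.items()
        }
        vectors.append(normalize(vec))
    return vectors


def cosine(a, b):
    # Dot product of two unit-length sparse vectors.
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def kmeans(vectors, k, max_iter=20):
    # Deterministic seeding (an assumption, chosen for reproducibility):
    # start from the first document, then repeatedly add the document
    # that is least similar to the centroids picked so far.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        )

    labels = None
    for _ in range(max_iter):
        # Assignment step: each document joins its most similar centroid.
        new_labels = [
            max(range(k), key=lambda i: cosine(v, centroids[i]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so we have converged
        labels = new_labels

        # Update step: each centroid becomes the normalized sum of its members.
        for i in range(k):
            total = {}
            for v, lab in zip(vectors, labels):
                if lab == i:
                    for t, w in v.items():
                        total[t] = total.get(t, 0.0) + w
            if total:  # keep the old centroid if a cluster ends up empty
                centroids[i] = normalize(total)
    return labels


# Hypothetical sample documents, not taken from the tutorial's blog crawl.
docs = [
    "cats and dogs are pets",
    "dogs chase cats",
    "python code calls functions",
    "functions in python code return values",
]
labels = kmeans(tfidf_vectors(docs), k=2)
print(labels)
```

The two pet sentences share no vocabulary with the two programming sentences, so the sketch separates them into two clusters. On a real crawl, a proper tokenizer, stop-word removal, and stemming or lemmatization would matter far more than they do in this toy input.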
Slides can be found at: https://speakerdeck.com/pycon2016 and https://github.com/PyCon/2016-slides…
Tony Ojeda, Benjamin Bengfort, Laura Lorenz - Natural Language Processing with NLTK and Gensim
103 likes
9,661 views
May 30, 2016