NLP Work

Continuing my study of machine learning, I decided to focus on language processing and take a class on NLP. The class focused on the various libraries and ML techniques we use to understand language, scaling up in Python all the way to deep learning. We covered:

  • Foundational NLP distinctions like parts of speech and words, sentences, and corpora
  • Basic Python usage with NLTK for preprocessing
  • Wordnet and building word relationships
  • N-gram models for language generation
  • Context Free Grammars
  • Numpy, pandas, scikit-learn, and seaborn
  • Naive Bayes and Logistic Regression for NLP
  • Keras for CNNs, RNNs, LSTMs, and GRUs
  • Using embeddings along with decoders and encoders

For all of these topics, we did various projects to get better at applying what we learned and sharing it using Jupyter notebooks.

The Projects

If you would like to view the code and notebook work related to these projects, they are still posted on GitHub. Here are some short summaries of my work in NLP. I especially value my analysis of attention as an explainability metric, if you would like to take a look!

  • Wordnets: This is an exploration of how wordnets can reveal complex meanings of words that are not found in the definition alone
  • N-grams: Just a brief description of n-grams to illustrate their usefulness
  • Netscraping for LLMs: I used BeautifulSoup to scrape the web for an LLM
  • text-classification.pdf: I used simple neural networks, with the goal of building a network that could be trained to imitate characters (in this case, the voice and tone of Rick and Morty)
  • The Impact of Attention: This short paper summarizes the paper “Is Attention Explanation” and connects the creation of modern GPTs to the now-pressing alignment problem and other consequences of modern attention. A personal favorite project, where I explored the upheaval in AI research caused by the sudden prominence of new AI techniques.
  • More Rick And Morty: I like to have fun, so I did a take two on classifying text based on the Rick and Morty voice. However, it turned out to be more of a study on how you can’t squeeze data to fit your use case. You just have to work with the data you have.
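
To make the N-gram idea a bit more concrete, here is a minimal sketch of a bigram text generator in plain Python. It is not the code from the notebook: the toy corpus and the function names `build_bigram_model` and `generate` are made up for illustration, and a real model would add smoothing and train on far more text.

```python
import random
from collections import defaultdict


def build_bigram_model(tokens):
    """Map each token to the list of tokens that followed it in the corpus."""
    model = defaultdict(list)
    for current, following in zip(tokens, tokens[1:]):
        model[current].append(following)
    return model


def generate(model, start, length=8, seed=0):
    """Walk the bigram table, sampling each next word from observed followers."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same text
    words = [start]
    for _ in range(length - 1):
        followers = model.get(words[-1])
        if not followers:  # dead end: this word never had a successor
            break
        words.append(rng.choice(followers))
    return words


# Toy corpus, invented purely for illustration.
corpus = "the cat sat on the mat and the dog sat on the rug".split()
model = build_bigram_model(corpus)
sample = generate(model, "the")
print(" ".join(sample))
```

Because followers are stored with repetition, a word that follows "the" more often is sampled more often, which is the counting intuition behind n-gram language models.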

I came out of this class really wanting to do more research, but I did not want to jump right into a master's. Perhaps one day, but I need a break after 16 or so years of schooling. I do feel very comfortable in data science, and I value that greatly!


Last update : October 29, 2023
Created : October 26, 2023