Ask HN: Suggestions for improving current sentiment analysis algorithms?

11 years ago

Hello everyone,

My group is working on our thesis project for Hack Reactor, and one major aspect of our project is exploring and trying to improve upon sentiment analysis, in our case specifically for tweets.

Of course, natural language processing has decades of research behind it, so we're unlikely to come up with something truly new and better. Nevertheless, our goal is to do what we can and build something we can be proud of, something that is of at least some value to someone.

Our approach to sentiment analysis currently uses the concept of "layers." The sentiment analysis libraries we have found so far seem to do little more than check whether individual words are likely to be positive or negative. Of course, that approach breaks down as soon as the language becomes even remotely complex or interesting.
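
To make the limitation concrete, here is a rough sketch of word-level scoring. It is written in Python for brevity and is not code from sentimentjs; the `WORD_SCORES` table and the `word_layer` name are made up for illustration, and a real lexicon would be far larger.

```python
# Hypothetical word list for illustration only.
WORD_SCORES = {"good": 1, "great": 2, "love": 2, "bad": -1, "awful": -2, "hate": -2}


def word_layer(text: str) -> int:
    """Sum the per-word scores; unknown words count as 0."""
    tokens = text.lower().split()
    # Strip trailing punctuation so "good!" still matches "good".
    return sum(WORD_SCORES.get(tok.strip(".,!?"), 0) for tok in tokens)


if __name__ == "__main__":
    print(word_layer("I love this movie"))       # scored as positive
    print(word_layer("This movie is not good"))  # also scored as positive
```

The second tweet is clearly negative, but a word-by-word sum only sees "good" and scores it as positive.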

One of our "layers" that seems to work especially well for tweets is an "emoticon layer," which picks out the emoticons and emojis used in a tweet and uses them as an extra sentiment signal.
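
A minimal sketch of what such an emoticon layer could look like, again in Python and not taken from our repo. The `EMOTICON_SCORES` table and its values are assumptions for illustration, and counting raw substrings is a simplification.

```python
# Hypothetical scores; the symbols and weights are placeholders.
EMOTICON_SCORES = {
    ":)": 1,
    ":-)": 1,
    ":D": 2,
    ":(": -1,
    ":-(": -1,
    "\U0001F600": 1,   # grinning face emoji
    "\U0001F622": -1,  # crying face emoji
}


def emoticon_layer(text: str) -> int:
    """Add up the scores of every known emoticon or emoji occurrence."""
    return sum(text.count(symbol) * score for symbol, score in EMOTICON_SCORES.items())


if __name__ == "__main__":
    print(emoticon_layer("great day :) :)"))
    print(emoticon_layer("missed my train :("))
```

Each layer returns its own score, so the results of several layers can be combined afterwards, for example by summing them.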

Other potential ideas for "layers" are a "negation" layer or a "movies" layer that identifies movie keywords.
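
As a rough idea of what a negation layer might do, the Python sketch below flips the score of a word that directly follows a negator. The `NEGATORS` set, the `WORD_SCORES` table, and the one-word window are all assumptions for illustration, not something we have built yet.

```python
# Hypothetical word lists for illustration only.
NEGATORS = {"not", "no", "never", "isn't", "don't", "can't"}
WORD_SCORES = {"good": 1, "great": 2, "love": 2, "bad": -1, "awful": -2, "hate": -2}


def negation_layer(text: str) -> int:
    """Score words, flipping the sign of a word that directly follows a negator."""
    total = 0
    negate = False
    for raw in text.lower().split():
        token = raw.strip(".,!?")
        if token in NEGATORS:
            negate = True
            continue
        score = WORD_SCORES.get(token, 0)
        total += -score if negate else score
        # The negation only reaches the very next word.
        negate = False
    return total


if __name__ == "__main__":
    print(negation_layer("This movie is not good"))
    print(negation_layer("I love this movie"))
```

Because the window is a single word, "not very good" would still come out positive; a wider window or a proper parse would be needed for cases like that.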

Our project is completely open source, and we'd love suggestions and contributions from the Hacker News community. This is our first week (out of about three) of working on this project, so it is still very much in its infancy. We're all relatively new to coding, so we're figuring things out as we go.

If you are interested, here are the relevant links to our project:

Deployed website (has inputs for various kinds of searches, and returns our current, very basic, sentiment analysis results) - http://crowdparser.azurewebsites.net/

sentimentjs NPM package - https://github.com/crowd-parser/sentimentjs

web app repo - https://github.com/crowd-parser/crowd-parser

We'd love to hear your thoughts and suggestions, and thanks in advance!