This week we finished the backend part of the project. The analysis functionality is complete; it uses the indico library for sentiment analysis. Stop-word elimination works almost perfectly, and the word cloud functionality is finished as well.
The next step is to integrate our backend functionality with our frontend design. For this course, however, the backend was our only scope. Two of our teammates will cover the frontend for another course.
It’s me again with another update on what we will be working on this week. Last week, I worked on mining all the tweets from a Twitter account. This week, I will polish that work, and I will work with Alfonso on eliminating the stop words from each tweet. Stop words are very common words that carry little meaning on their own and are usually filtered out in natural language processing, such as: the, is, at, which, and on.
By doing this, we will be able to create a more accurate word map, one that is not dominated by occurrences of stop words and focuses only on the information that matters.
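To make the idea concrete, here is a minimal sketch of stop-word filtering followed by a word count, which is the kind of input a word map needs. The stop-word list, the function names, and the sample tweets are all made up for illustration; this is not the project's actual code.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "is", "at", "which", "on", "a", "and", "of", "to", "in"}


def remove_stop_words(text):
    """Lowercase the text, split it into words, and drop the stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word not in STOP_WORDS]


def word_frequencies(tweets):
    """Count how often each remaining word appears across all tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(remove_stop_words(tweet))
    return counts


# Made-up sample tweets, only to exercise the functions.
sample_tweets = [
    "The match is on at the stadium",
    "Which stadium is the match at?",
]
frequencies = word_frequencies(sample_tweets)
print(frequencies.most_common(2))
```

Without the filter, words like "the" and "is" would top the count; with it, only "match" and "stadium" remain.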
Stay tuned for this week’s progress in the post-mortem blog post.
This week I added environment variables for the database connection information. I also renamed some classes and cleaned up the code to make it easier to understand. Connecting to the database works, and so does insertion. However, we still have problems encoding the text as UTF-8 and avoiding garbled characters. Emojis are a big problem: whenever a user has an emoji in their user name or in a tweet, the program halts.
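For the environment variables mentioned above, a rough sketch of the pattern could look like the following. The variable names (DB_HOST, DB_USER, and so on) and the defaults are assumptions for illustration, not the names used in the project.

```python
import os


def load_db_config(env=os.environ):
    """Build connection settings from environment variables.

    The variable names here are hypothetical. Required values raise
    KeyError when missing, so a bad setup fails early and loudly.
    """
    return {
        "host": env.get("DB_HOST", "localhost"),
        "port": int(env.get("DB_PORT", "3306")),
        "user": env["DB_USER"],
        "password": env["DB_PASSWORD"],
        "database": env["DB_NAME"],
    }


# Pass an explicit mapping so the sketch runs without real credentials.
example = load_db_config(
    {"DB_USER": "tweets", "DB_PASSWORD": "secret", "DB_NAME": "tweets_db"}
)
print(example["host"], example["port"])
```

The point of the pattern is that user names and passwords stay out of the source code and can differ per machine.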
We are still figuring out a way to store the text together with the emojis for later sentiment analysis. I know it should be possible.
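One likely cause, assuming the tables use MySQL's older utf8 character set: that character set stores at most 3 bytes per character, while most emojis need 4 bytes in UTF-8. MySQL's utf8mb4 character set is the one that accepts 4-byte characters. The small sketch below only checks whether a text contains such characters; the function name and sample strings are made up, and it does not touch the database.

```python
def needs_four_byte_utf8(text):
    """Return True if any character in text takes 4 bytes in UTF-8."""
    return any(len(ch.encode("utf-8")) == 4 for ch in text)


# \U0001F600 is the "grinning face" emoji.
tweet = "Great game today \U0001F600"

print(len("\U0001F600".encode("utf-8")))  # the emoji alone takes 4 bytes
print(needs_four_byte_utf8(tweet))        # True: contains an emoji
print(needs_four_byte_utf8("café"))       # False: "é" takes only 2 bytes
```

A check like this could help confirm whether the failing inserts are exactly the ones with 4-byte characters before changing the table and connection character sets.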
This week is going to be about improving what I previously achieved on the project. I will clean up the code, applying better coding practices, and I will add some functionality. Reading from and deleting from the database will be implemented. I will also try to store the tweets without modifying their text, so it can be used later for sentiment analysis. For this last part, everything matters, from emojis to raw text. Therefore, I will be researching how to store tweets with special characters and emojis in the database.
This is what I will try to do, and I will keep learning about Python, MySQL, and all these topics that are new to me.