--Originally published at TC3045 – Sagnelli's blog
2018 is an important year for Mexico: the Mexican people are supposed to decide the course of the next six years. However, corruption has long interfered with democracy, and the government has been accused of manipulating the votes.
This is the problem we are trying to solve with our project. Now, the important question is:
Who are we?
We are a group of students between the 6th and 8th semesters of Computer Science at Tec de Monterrey, Campus Guadalajara:
- Alfonso Contreras
- Arturo González
- Alejandro Vázquez
- Michelle Sagnelli
What is our solution?
Basically, in one sentence: we are building a series of microservices that will let us determine which presidential candidate is the most acclaimed, and which is the most popular, according to Twitter.
How are we supposed to do it?
We will apply data mining using Python Streaming Jobs and Twitter’s API, temporarily storing tweets in JSON files. Afterwards, this data will be displayed and saved for later use.
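As a rough idea of what the temporary JSON storage could look like, here is a minimal sketch. It assumes each tweet has already been fetched and is available as a Python dict; the field names (`id`, `text`) and the helper names `save_tweets` and `load_tweets` are made up for illustration and are not part of Twitter's API or of our actual code. It writes one JSON object per line, so new tweets can be appended as they arrive.

```python
import json
import os
import tempfile


def save_tweets(tweets, path):
    """Append each tweet (a dict) to `path` as one JSON object per line."""
    with open(path, "a", encoding="utf-8") as f:
        for tweet in tweets:
            # ensure_ascii=False keeps accented characters readable in the file
            f.write(json.dumps(tweet, ensure_ascii=False) + "\n")


def load_tweets(path):
    """Read the stored tweets back as a list of dicts."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    # Hypothetical sample data standing in for tweets pulled from the API
    sample = [
        {"id": 1, "text": "Hoy hablamos de las elecciones"},
        {"id": 2, "text": "México decide en 2018"},
    ]
    path = os.path.join(tempfile.mkdtemp(), "tweets.jsonl")
    save_tweets(sample, path)
    print(load_tweets(path))
```

One line per tweet makes the file easy to process in a streaming fashion, since each line can be parsed on its own without loading everything into memory.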
The challenge is to clean the data by mining keywords, eliminating stopwords, and assigning tokens by tweet importance. This “clean” data will then be analyzed with machine learning to gauge the importance of this year’s candidates and political parties. Finally, the results will be stored in JSON format for further analysis of the political parties and of each candidate’s level of acceptance.
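To give a feel for the cleaning step, here is a small sketch using only Python's standard library. It is an illustration under assumptions, not our final pipeline: the stopword list is a tiny hand-written sample, the function names `clean_tweet` and `keyword_counts` are hypothetical, and plain keyword counting stands in for the importance weighting mentioned above.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (Spanish + English); a real one would be much larger.
STOPWORDS = {
    "el", "la", "los", "las", "de", "que", "y", "en", "a", "es", "por",
    "un", "una", "the", "and", "is", "of", "to", "rt",
}

URL_RE = re.compile(r"https?://\S+")
# A token is a word, optionally prefixed by # (hashtag) or @ (mention).
TOKEN_RE = re.compile(r"[#@]?\w+")


def clean_tweet(text):
    """Lowercase the text, drop URLs, tokenize, and remove stopwords."""
    text = URL_RE.sub(" ", text.lower())
    tokens = TOKEN_RE.findall(text)
    return [t for t in tokens if t not in STOPWORDS and len(t) > 1]


def keyword_counts(texts):
    """Count how often each remaining keyword appears across all tweets."""
    counts = Counter()
    for text in texts:
        counts.update(clean_tweet(text))
    return counts


if __name__ == "__main__":
    tweets = [
        "RT @user: El candidato es el mejor https://t.co/abc #Elecciones2018",
        "El candidato y el partido",
    ]
    for keyword, n in keyword_counts(tweets).most_common(3):
        print(keyword, n)
```

Hashtags and mentions are kept as tokens on purpose, since they are likely to be useful signals when matching tweets to candidates and parties.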
Extras
We are also trying to implement location-based analysis of tweets, and to detect which tweets come from bots, in order to make the analysis more accurate.
This should be fun. I am very interested in this project because it is both challenging and interesting. If you are interested too, do not hesitate to contact me, and stay tuned for my and my colleagues’ future blog posts.
Regards,
Mike.