Week 5: Post-Mortem

--Originally published at TC3045 – Sagnelli's blog

As you know, we already had a streamer that uses the Twitter API for streaming tweets. With that streamer we could only filter by keywords, so tweets from all sources arrived. We have now added a second streamer that lets us filter by author: the list of users we want tweets from is provided via a .txt file, and we can also pass a list of keywords for more thorough filtering.
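
A minimal sketch of what this dual filtering looks like, assuming Tweepy's classic StreamListener API (the file name users.txt and the helper below are illustrative, not our actual code):

import tweepy

# Illustrative helper: read one Twitter user ID per line from a .txt file.
def load_user_ids(path):
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

class FilteredListener(tweepy.StreamListener):
    def on_status(self, status):
        # Print who tweeted each matching result; inserting into the
        # database would happen here as well.
        print(status.user.screen_name, status.text)

# Placeholder credentials.
auth = tweepy.OAuthHandler('CONSUMER_KEY', 'CONSUMER_SECRET')
auth.set_access_token('ACCESS_TOKEN', 'ACCESS_SECRET')

stream = tweepy.Stream(auth, FilteredListener())
# follow= filters by author (user IDs as strings); track= filters by keyword.
stream.filter(follow=load_user_ids('users.txt'), track=['keyword1', 'keyword2'])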

What did we do with the first streamer?

It is now fully working, adding filtered tweets to the database, and printing who tweeted the matching results.

The project’s development is going great.

Stay tuned for next week's developments.

Week 5: Filtering tweets

--Originally published at TC3045 – Sagnelli's blog

This week is going to be about filtering the stream of tweets: determining which users we gather tweets from, and which keywords a tweet must contain. Afterwards, we are going to store these tweets in our database.

After this filtering is done, we are going to create our Tweet object in the database so we can start applying natural language processing and sentiment analysis.

These improvements are very important because they make up the main functionality of our project. It is also worth mentioning that we are writing tests in tandem with the code, to start producing more efficient, better code.

Stay tuned for the post-mortem analysis.

Week 4: Post-Mortem

--Originally published at TC3045 – Sagnelli's blog

Hey! Good to see you again.

This week I added environment variables for the database connection information. I also renamed some classes and improved the code for better readability. The connection to the database is working, and so is insertion. However, we still have problems encoding the text as UTF-8 and avoiding garbled characters. Emojis are a big problem: whenever a user has an emoji in their username or in their tweet, the program halts.

We are still figuring out a way to store the text with the emojis for further sentiment analysis. I know it should be possible.
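
One fix I am looking at, assuming the culprit is MySQL's legacy utf8 charset (which holds at most 3 bytes per character, while emojis need 4), is switching the connection and the tables to utf8mb4. A sketch with placeholder names:

import pymysql

# Placeholder connection values; the key part is charset='utf8mb4'.
conn = pymysql.connect(host='', user='', passwd='', db='',
                       charset='utf8mb4')
cur = conn.cursor()

# Hypothetical table name: convert existing columns so that 4-byte
# characters (emojis) can be stored without modification.
cur.execute("ALTER TABLE Tweet CONVERT TO CHARACTER SET utf8mb4 "
            "COLLATE utf8mb4_unicode_ci")
cur.close()
conn.close()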

Stay tuned.

Week 4: Database & Python 4 life

--Originally published at TC3045 – Sagnelli's blog

What’s new, it’s Mike again.

This week is going to be about improving what I have previously achieved on the project. I will fix the code, applying better practices, and I will add some functionality: reading from and deleting from the database shall be implemented. I will also try to store the tweets without modifying their text, for later use in sentiment analysis. For that, everything matters, from emojis to raw text. Hence, I will be researching how to store tweets with special characters and emojis in the database.

This is what I will try to do, and I will keep learning about Python, MySQL, and all these topics that are new to me.

Stay tuned for what I achieve this week.

From now on, bad practices shall be avoided.

Post-Mortem: Tweets buffer

--Originally published at TC3045 – Sagnelli's blog

Sup, it’s me again.

What I did this week was not so different from what I had planned; however, I did not fulfill everything I had in mind. I did something a little different, but it was nice progress towards the final project.

I was able to add tweets’ raw text to a list in Python, and when a specified number of tweets are in the list, those tweets are stored in a database.

I did this by modifying streamer.py to append tweets to a list, and then calling the function in the storage.insert-tweet.py micro-service to insert however many tweets have been specified.
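
In rough terms, the buffering works like this (the names below are illustrative, not the actual code in streamer.py or storage.insert-tweet.py):

import pymysql

BUFFER_SIZE = 100  # flush to the database once this many tweets are buffered
tweet_buffer = []

# Placeholder connection values.
conn = pymysql.connect(host='', user='', passwd='', db='', autocommit=True)

def insert_tweets(tweets):
    # Bulk-insert the buffered raw texts in a single round trip.
    with conn.cursor() as cur:
        cur.executemany("INSERT INTO Tweet (text) VALUES (%s)",
                        [(t,) for t in tweets])

def on_tweet(raw_text):
    tweet_buffer.append(raw_text)
    if len(tweet_buffer) >= BUFFER_SIZE:
        insert_tweets(tweet_buffer)
        del tweet_buffer[:]  # empty the buffer for the next batch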

I had trouble when the text had line breaks, unknown characters, or emojis. Hence, I had to modify the text a little to be able to store it in the database.

Up next is adding some environment variables for the database connection, and adding update and delete functions to the application.

Week 3: PyMySQL is about to be applied

--Originally published at TC3045 – Sagnelli's blog

“When you want to succeed as bad as you want to breathe, then you’ll be successful.”

– Eric Thomas

Hey, I'm still awake. Now, there's a reason for it, and as bad as I want that reason to be watching a Netflix series or playing video games, it is not. I am thinking about what I will be doing this week for the Elections Analyzer 2018 project.

This week is going to be all about creating generic functions in Python for DML queries on a MySQL database. I will be dividing these functions into micro-services; hence, there is going to be a separate micro-service for each of inserting (creating), selecting (reading), updating, and deleting (CRUD) on a generic database, all using the PyMySQL library.
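
One possible shape for these generic helpers, as a sketch of what I have in mind rather than final code (connection values are placeholders):

import pymysql

def get_connection():
    # Placeholder connection values.
    return pymysql.connect(host='', user='', passwd='', db='',
                           autocommit=True)

def insert(table, columns, values):
    # Table and column names are trusted input here; the values themselves
    # go through parameterized %s placeholders.
    query = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ', '.join(columns), ', '.join(['%s'] * len(values)))
    conn = get_connection()
    cur = conn.cursor()
    cur.execute(query, values)
    cur.close()
    conn.close()

def select(table, where=None):
    # Generic SELECT returning all rows.
    query = "SELECT * FROM " + table + (" WHERE " + where if where else "")
    conn = get_connection()
    cur = conn.cursor()
    cur.execute(query)
    rows = cur.fetchall()
    cur.close()
    conn.close()
    return rows

# update() and delete() would follow the same pattern.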

Stay tuned for the post-mortem update of this week’s development.

Update – Elections Analyzer 2018 improvements

--Originally published at TC3045 – Sagnelli's blog

This week a member of the team automated the setup and run methods of the application. As this is a two-part project, we are focusing on the Python micro-services that gather, clean, and store data in our database.

Database

We are using JSON to normalize data for our relational MySQL database. We have already figured out how to establish a connection from Python to a MySQL database using the PyMySQL library to run DDL & DML queries.

This is an example of the code to do so, with placeholder connection values:

from __future__ import print_function
import pymysql

# Placeholder connection values; fill in your own host, credentials, and database.
conn = pymysql.connect(host='', port=3306, user='', passwd='', db='',
                       autocommit=True)
cur = conn.cursor()

# The table needs a third column to match the three-value INSERTs below
# (the column name here is illustrative).
# cur.execute("CREATE TABLE Partidos (ID int NOT NULL, nombre varchar(50), ala varchar(10), PRIMARY KEY(ID));")
cur.execute("INSERT INTO Partidos VALUES (111,'PAN','IZQ')")
cur.execute("INSERT INTO Partidos VALUES (112,'MORENA','DER')")
cur.execute("INSERT INTO Partidos VALUES (113,'PRI','IZQ')")
cur.execute("INSERT INTO Partidos VALUES (114,'MOVIMIENTO CIUDADANO','IZQ')")

# Read the rows back before deleting them; iterating the cursor after the
# DELETE would print nothing, since the cursor would then hold the DELETE result.
cur.execute("SELECT * FROM Partidos")
for row in cur:
    print(row)

cur.execute("DELETE FROM Partidos")
cur.close()
conn.close()


This is what I've done so far in the project, and I have learned how to use micro-services in Python. I will continue working on generic, automated queries for when the database is up and running.

Stay tuned for further news on the development of the project.

Week 2: The beginning

--Originally published at TC3045 – Sagnelli's blog

As the title states, this week is where it all begins. We already know the objective, and how we are going to achieve it, so let’s dive into work.

If you want to see our progress, you can do so by visiting our repo. This week is going to be about getting up and running with the database, and starting to structure the Python project.

Summary:

  • Define micro-services and divide functionality.
  • Structure the Python project.
  • Get up and running with the database.
  • Start developing with unit testing and an Agile methodology.

So, let’s begin…

Election year: let’s be genuine, shall we?

--Originally published at TC3045 – Sagnelli's blog

2018 is an important year for Mexico: the next six years are supposed to be defined by the Mexican people. However, corruption has always interfered with democracy, and the government has been accused of manipulating votes.

This is the problem we are trying to solve with our project. Now, the important question is:

Who are we?

We are a group of students between 6th and 8th semester of Computer Science at Tec de Monterrey Campus Guadalajara:

  • Alfonso Contreras
  • Arturo González
  • Alejandro Vázquez
  • Michelle Sagnelli

What is our solution?

Basically, in one sentence: we are building a series of micro-services that will let us determine who is the most acclaimed and most popular presidential candidate according to Twitter.

How are we supposed to do it?

We will apply data mining, using Python streaming jobs and Twitter's API, to temporarily store tweets as JSON. Afterwards, this data will be displayed and saved for later use.

The challenge is to clean the data by mining keywords, eliminating stopwords, and assigning tokens by tweet importance. This "clean" data will then be used to analyze, with machine learning, the importance of this year's candidates and political parties. Finally, this information will be stored in JSON format for further analysis of the political parties' information and the candidates' level of acceptance.
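
As a rough sketch of the cleaning step (assuming NLTK and its Spanish stopword list, which we have not settled on yet):

import nltk
from nltk.corpus import stopwords

# One-time downloads of the tokenizer model and the stopword lists.
nltk.download('punkt')
nltk.download('stopwords')

def clean_tweet(text):
    # Lowercase, tokenize, and keep alphabetic tokens that are not stopwords.
    tokens = nltk.word_tokenize(text.lower(), language='spanish')
    stop = set(stopwords.words('spanish'))
    return [t for t in tokens if t.isalpha() and t not in stop]

print(clean_tweet("Los candidatos hablaron de la corrupción en México"))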

Extras

We are also trying to implement location-based analysis of tweets, and to detect which tweets come from bots, in order to achieve a more accurate analysis.

This should be fun. I am very interested in this project, as it is challenging and engaging. If you are interested too, do not hesitate to contact me, and stay tuned for mine and my colleagues' future blog posts.

Regards,

Mike.