Adding the Data to a Database

Build a Twitter Analytics App

1 Introduction: Start Here

2 The First Step: Design Your Solution

3 In Which I Rant a Little

4  Design Solution

5 Writing Great Code

6 Writing the Backend Twitter Server

Writing the Code in Small Parts: Part 1, The Basic App

Part 2: Adding a Counter to Exit

Part 3: Adding Language and Retweet Count

Part 4: Organising Our Code

7 Adding the Data to a Database

8 Testing: What and How to Test

8.1 Testing Our Frontend

8.2 Testing Our Backend

9 Displaying our Data using the Flask Webserver

9.1 Introduction to Flask

9.2 Adding templates to our Flask app

9.3 Displaying our Tweets in the Flask Web Server

10 Future Work and Improvements

So far, we were just printing our data. But if we want to pass it to the webserver, we need to store it somewhere.

Before we go ahead, I hope you downloaded the Sqlite Browser, as I recommend earlier. It’ll help you check you are writing the values to the database correctly.

While you could have the Flask server gather the data in real time, that is bad design practice, because the two things are then closely coupled. This is a technical term that means changing one part of the code will affect the other. This is very bad, as a bug in one part will cause failures in another, and that will make it impossible to debug.

We want to keep our codebase independent, so each part can be tested and debugged separately. That’s why the code that reads live tweets is separate from the code that displays it on a web browser.

Okay, let’s get started. First thing we need to do is create the database. The code is taken from here, though I will only go over a few relevant parts (enough for you to understand what’s going on:

We open our database and connect to it. We are using Sqlite3, which automatically creates the database if it doesn’t exist.

 

We’re just going to look at one table, to see how it works.

We are using raw sql in the example above, to create a table called lang_data that contains 3 fields. language and top_language is what we get from Twitter. We are also storing the date-time, so that we can do things like Get all tweets in the last hour.

If you run this file, it will create the database for you.

Writing to the database

Okay, let’s see how we will write to our database. Code is here.

We create a connection to our database in the main code:

As you can see, we are passing this database to the TwitterMain class, but opening and closing it from the main code. This is important, as if the class code crashes, we’ll still be able to close the database safely. That’s what the finally code above does.

How do we write to the database?

In the get_streaming_data() function, we get the results from the stats class, and write them to the database.  Note that this happens after all the Tweets have been gathered (so the real time nature of the script is not compromised):

We add the lang and top_lang to the database, along with the current time. Note that I’m doing something weird with the languages:

This is because we can’t store Python objects in a database. So I convert them to a Counter, and then a string. This is how it looks in the actual database:

 

db_capture

If you look at the top_language field above, you’ll see it’s a string that contains a Python list. That means I can read it easily.

These were the top languages. The actual top tweets are returned as a list. We’ll write each one to the database:

We loop over the tweets, writing each one, and the current time. We then commit the database.

And this is how we write the Trends:

This time, we have a trend and 3 Tweets with each trend.

So we build up our list in this way: The first entry is the trend, and the next three are the example tweets of the trend.

We then create a tuple with this list. This allows us to write the whole thing as one block into the database.

Have a look at the complete code, run it a few times, make sure you understand what is happening.

This concludes our backend code.

But you may have noticed there is a big thing missing: Tests. We’ll fix that in the next section. But first, some theory on testing.