Build a Twitter Analytics App
2 The First Step: Design Your Solution
6 Writing the Backend Twitter Server
Writing the Code in Small Parts: Part 1, The Basic App
Part 2: Adding a Counter to Exit
Part 3: Adding Language and Retweet Count
7 Adding the Data to a Database
8 Testing: What and How to Test
9 Displaying our Data using the Flask Webserver
9.2 Adding templates to our Flask app
9.3 Displaying our Tweets in the Flask Web Server
10 Future Work and Improvements
So we have most of our code in place. Our backend reads and writes the tweets to a database. We have a front end webserver that can display the tweets. Now we just need to connect the two.
Let’s look at our code.
1 2 3 4 |
@app.route("/top_tweets") def top_tweets(): tweets, datetime_toptweets = get_top_tweets() return render_template('top_tweets.html', tweets = tweets, datetime_toptweets = datetime_toptweets) |
We saw the top_tweets template in the last section, so let’s do that first. We are calling a function get_top_tweets() to get the top tweets, and the date-time. The date-time is only for debug, just so that we can check the script is running regularly. Let’s look at the function get_top_tweets():
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
def get_top_tweets(): conn = sqlite3.connect(db) conn.row_factory = sqlite3.Row c = conn.cursor() c.execute("SELECT * from twit_data ORDER BY datetime DESC LIMIT 30") result = c.fetchall() tweets = [] datetime_toptweets = result[0]['datetime'] for tweet in result: tweets.append(tweet['top_tweet']) conn.close() return tweets, datetime_toptweets |
We open the database and read the last 30 tweets, as well as the datetime. This is then returned to our function. As we saw in the templates section, the tweets are just printed on the screen:
1 2 3 4 5 6 7 8 9 10 |
{% extends "index.html" %} {% block tweets %} <b> Most Popular Tweets </b> {% for tweet in tweets %} {{ tweet | safe}} {% endfor %} {% endblock %} |
Let’s look at the function to read the trends next:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
def get_trends(): conn = sqlite3.connect(db) conn.row_factory = sqlite3.Row c = conn.cursor() trend = [] trend_tweet = [] c.execute("SELECT * from trend_data ORDER BY datetime DESC LIMIT 10") result = c.fetchall() datetime_trends = result[0]['datetime'] for r in result: trend.append(r['trend']) trend_tweet.append(r['trend_id1']) trend_tweet.append(r['trend_id2']) trend_tweet.append(r['trend_id3']) conn.close() return trend, trend_tweet, datetime_trends |
This time, we only read the last 10 values; but remember, we have 1 trend, and 3 tweets for each trend.
We return both the trends list, as well as the trend_tweet list, which contains the trending tweets.
Displaying graphs
Till now, our code is simple. All we are doing is taking data from our database, and displaying it on the screen. We already got the tweets in the right format using the twitter api, so we needed to do nothing else.
But the languages data is different. This time, we have to actually plot it.
For plotting, we will be using Google Charts.
I tried dozens of JS charting libraries, but found Google Charts to be the easiest to use. Most of the others require a lot of fiddling.
I am not a JS expert, I just want the libraries to work out of the box. Highcharts failed (at least for me), d3.js is too complex. Flotcharts worked, but looks a bit dated.
Google Charts surprised me: It worked out of the box, and produces beautiful graphs.
Have a brief look at Google Charts examples, as in most cases, I just copy pasted the code and tweaked it a little.
Let’s look at the function to read the language data:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
def get_lang(): conn = sqlite3.connect(db) conn.row_factory = sqlite3.Row c = conn.cursor() c.execute("SELECT * from lang_data ORDER BY datetime DESC LIMIT 1") result = c.fetchone() lang = ast.literal_eval(result['language']) top_lang = ast.literal_eval(result['top_language']) conn.close() return lang, top_lang |
This time, we only read one value, as the data is in an array. Since SqLite can’t store Python lists, I had converted the list to a string. Here, I convert it back and return it.
1 2 3 4 5 6 7 8 9 10 11 12 |
@app.route("/") def main(): language_data = [] top_language_data = [] lang, top_lang = get_lang() for l in lang: language_data.append([l[0], l[1], l[1]]) for t in top_lang: top_language_data.append([t[0], t[1], t[1]]) return render_template("lang1.html", language_data = language_data, top_language_data = top_language_data) |
This is where I’m using the data. I am re-shaping the data in a list. You may note that we are repeating the 2nd element twice. Like this:
1 2 |
for l in lang: language_data.append([l[0], l[1], l[1]]) |
Why are we doing this? Because Google Charts wants us to send the data separately. So in the above, the first value is the language, the second is its percentage usage, and the third is the same thing repeated. It’s repeated because we need it for tooltips, and each item we use needs to be passed in, even if it is repeated.
Let’s look at the template (html) file to see how the data is used.
1 2 3 4 5 6 7 8 9 |
{% extends "index.html" %} {% block graphs %} <!--Load the AJAX API--> <script type="text/javascript" src="https://www.google.com/jsapi"></script> <script type="text/javascript">// Load the Visualization API and the piechart package. google.load('visualization', '1.0', {'packages':['corechart']}); // Set a callback to run when the Google Visualization API is loaded. google.setOnLoadCallback(drawChart); |
All this is setup to run the Google Charts Javascript.
1 2 3 4 5 6 7 |
function drawChart() { // Create the data table. var data = new google.visualization.DataTable(); data.addColumn('string', 'Languages'); data.addColumn('number', 'Usage'); data.addColumn({type: 'number', role: 'tooltip'}); |
This is the function to draw the chart. As you can see, we are drawing 3 columns. The first is a string called Languages. Then a number called Usage. Finally, the tooltip data, which is the repeat of the previous value. Hopefully, now you can see why we needed to pass the data like this.
1 |
data.addRows( {{ language_data | safe }} ); |
Here, we pass in the data we read. Again {{ }} is a Jinja command, and the data in here will be replaced by what our Flask server sent it. We’ve already talked about safe: Google doesn’t like it, like Twitter.
Here is an example of how the above field will be filled in real time:
1 |
data2.addRows( [['English', 6, 6], ['Japanese', 1, 1], ['Thai', 5, 5]] ); |
1 2 3 4 5 6 7 8 9 |
// Set chart options var options = {'title':'Most used languages on Twitter: All Tweets', 'width':800, 'height':600, sliceVisibilityThreshold: .01, is3D: true}; // Instantiate and draw our chart, passing in some options. var chart = new google.visualization.PieChart(document.getElementById('chart_div')); chart.draw(data, options); |
Finally, we set some options and draw the graph. The main option is the is3D flag, which just makes the graph look cool 🙂
And there you go. A basic version of our app, running on our machine. Have a play with it.