Building a Twitter Realtime Search

Introduction

Twitter is great and all, but it's missing something I always wanted: a realtime search tool.

As you can imagine, the volume of tweets sent worldwide at any given moment is gigantic. A couple of years ago, when I did some research into creating such a tool, I stumbled upon some third-party companies that offered to deliver the estimated 500 million tweets a day to your network: the Twitter Firehose. Today, only Gnip, which is owned by Twitter, offers this service. Back then, the maxed-out version cost about $8,000 a day... so, yeah.

Node.js

For this tutorial you'll need node.js. If you're not familiar with it, it's basically a program that executes JavaScript files, mainly used for serving web pages. This very website runs on node.js. You can get it here.

While node.js itself is great, it's really the community that makes it shine. NPM is the name of the package manager that comes with node.js. The popularity of NPM and node.js led to a huge number of ready-to-use packages, which ultimately makes developing with node.js feel light and fast.

We will make heavy use of other people's packages and fly through the programming part at light-speed. You will, however, need some rudimentary knowledge of web development with node.js. I recommend Modulus' Absolute Beginners Guide if you're new to the topic.

You'll also need a Twitter account to obtain the keys needed to access the Twitter API. But I'll cover that in detail when we get to it.

Structure

+---------+           +------+           +---------+
|         |           |      |           |         |
| Twitter | <-Query-- | Node | <-Query-- | Web     |
|   API   | ---Data-> |  JS  | ---Data-> | Browser |
|         |           |      |           |         |
+---------+           +------+           +---------+

We're building a typical node.js web application with three main components.

The first one is the Twitter API. It receives our requests and sends back the data we want (lots of tweets!). The second is our web browser. It submits our search requests and displays the returned data in a readable form. The third one is node.js, which acts as a bridge between the other two, passing data back and forth as it becomes available.

You should be familiar with the following typical node.js file layout. It's what we'll end up with:

myproject/
    app.js
    package.json
    node_modules
    public/
        index.html
        style.css

Basically, we'll need to create 4 text files: package.json, app.js, index.html, and style.css. The other files will be created for us as we go along. Technically, the style.css file is not needed, but a little color goes a long way. Let's get going.

Create the project folder (name it as you want) and cd into it.

$ cd ~/
$ mkdir myproject
$ cd myproject

We'll need a package.json file to start the project. Create the file in your project folder and fill it with the following:

{
  "name": "myproject",
  "version": "0.1.0"
}

The Packages

Node.js will do most of the work in our app. It's the instance that sends our queries to Twitter and receives the tweets as they come in.

We'll need three packages provided by NPM. The first one is Express. It makes creating a webserver easy. The second one is Twitter, which helps us communicate with the Twitter API. And finally Socket.io, which lets us send queries from our browser to node.js and the incoming tweets from node.js back to the browser.

$ npm install express --save
$ npm install twitter --save
$ npm install socket.io --save

Adding the --save flag tells npm to save the package's name and version number into our package.json file. (Since npm 5 this happens by default, so the flag is optional on newer versions.) The packages are now installed in the node_modules folder, which should have appeared in your project folder.

Twitter API Access

To use the Twitter API you'll need to provide a set of four unique keys with your requests. They're basically random-looking strings of letters and digits. Twitter requires this authentication so they can manage the impact on their servers.

Obtaining the keys is easy. Just log into apps.twitter.com using your Twitter account. Click on the "Create New App" button and fill out the form. You can name the application whatever you want. Once you've created the app, click on "Keys and Access Tokens".

The first two are the Consumer Key (API Key) and the Consumer Secret (API Secret) which should be visible on your screen. Copy and save them somewhere for later.

To retrieve the second pair, scroll to the bottom and click the "Create my access token" button. Now, under "Your Access Token" you'll find the Access Token and the Access Token Secret. Copy and save them as well.

app.js

This is the app file. It's what node.js will be executing for us. Create it inside your project folder with the editor of your choice. Let's go over it in detail.

var _twitter = require('twitter');
var express = require('express');
var app = express();

var server = require('http').Server(app);
var io = require('socket.io')(server);

First, we load all our packages so we can work with them and create the Express app. We also initialize the http server (server) and the socket we need (io) for the node.js-to-browser communication.

app.use('/', express.static(__dirname + '/public'));

We tell our app to make the public folder (which we haven't created yet) available to the world. The public folder will soon contain our index.html file which will hold the logic for the "browser" component we established above.

var twitter = new _twitter({
    // Replace each placeholder with the matching key you saved earlier.
    consumer_key: 'YOUR_CONSUMER_KEY',
    consumer_secret: 'YOUR_CONSUMER_SECRET',
    access_token_key: 'YOUR_ACCESS_TOKEN',
    access_token_secret: 'YOUR_ACCESS_TOKEN_SECRET'
});

We create a new twitter object using the Twitter package. This object will communicate with Twitter's servers. Enter your saved keys here.
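
Keys typed directly into app.js are easy to publish by accident, for example when sharing the source. One alternative is to read them from environment variables. This is a minimal sketch, not part of the original app: the loadTwitterKeys helper and the variable names (TWITTER_CONSUMER_KEY and so on) are made up for illustration.

```javascript
// Hypothetical helper: reads the four keys from an environment-like object.
// The environment variable names are assumptions; pick whatever you like.
function loadTwitterKeys(env) {
    var names = {
        consumer_key: 'TWITTER_CONSUMER_KEY',
        consumer_secret: 'TWITTER_CONSUMER_SECRET',
        access_token_key: 'TWITTER_ACCESS_TOKEN_KEY',
        access_token_secret: 'TWITTER_ACCESS_TOKEN_SECRET'
    };
    var keys = {};
    Object.keys(names).forEach(function (option) {
        var value = env[names[option]];
        // Fail early with a readable message instead of a failed Twitter login.
        if (!value) {
            throw new Error('Missing environment variable ' + names[option]);
        }
        keys[option] = value;
    });
    return keys;
}

// In app.js this could replace the literal object:
// var twitter = new _twitter(loadTwitterKeys(process.env));
```

The returned object uses the same four option names as the literal object in app.js, so it can be passed to the constructor unchanged.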

var currentstream = 0;

Each keyword we search for opens a new stream to Twitter. To avoid being penalized by Twitter, we need to close the old stream whenever we open a new one, so we'll keep the currently open stream in this variable.

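io.on('connection', function (socket) {

    // Tell the browser we are ready to receive queries.
    socket.emit('init');

    // The browser sends a 'query' message for each new search term.
    socket.on('query', function (query) {
        twitter.stream('statuses/filter', {track: query}, function (stream) {
            // Close the stream of the previous search, if there is one.
            if (currentstream) {
                currentstream.destroy();
            }
            // Forward each incoming tweet to the browser.
            stream.on('data', function (tweet) {
                socket.emit('tweet', tweet);
            });
            // Log stream errors to the terminal.
            stream.on('error', function (error) {
                console.log(error);
            });
            currentstream = stream;
        });
    });
});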

The io object's on function is the heart of our app. It is run as soon as a browser connects to the app. Passed in is a socket object which references the connection just established. The app then emits an 'init' message to the browser using the socket's emit function, telling the browser that it is ready to receive queries.

The socket object's on function tells our app to listen for messages from the client. In this case our app listens for a 'query' message from the client, which signals that the user wants to search for a new keyword. You can think of a socket message as a key-value pair where the value is optional.
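
The "key plus optional value" idea can be tried without a browser. The sketch below uses node.js' built-in EventEmitter as a stand-in for a socket. It is not socket.io itself, it only shares the on/emit shape, and the fakeSocket and received names are made up for this example.

```javascript
var EventEmitter = require('events');

// Stand-in for a socket: an EventEmitter has the same on/emit shape.
var fakeSocket = new EventEmitter();
var received = [];

// A message with a value: the name is the key, the argument is the value.
fakeSocket.on('query', function (query) {
    received.push('query:' + query);
});

// A message without a value, like our 'init': the argument is undefined.
fakeSocket.on('init', function (value) {
    received.push('init:' + value);
});

fakeSocket.emit('init');
fakeSocket.emit('query', 'love');

console.log(received); // [ 'init:undefined', 'query:love' ]
```

The important difference is direction: an EventEmitter delivers the message to listeners in the same process, while a socket.io socket delivers it to the listeners registered on the other end of the connection.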

When a 'query' message from the client is received, the app uses the twitter object to open a stream with Twitter by sending the search request to the Twitter API's 'statuses/filter' endpoint.

Once the new stream is open, the app checks if a stream from a previous search is still open. If so, it closes that stream. Either way, it then saves the new stream in the currentstream variable.
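
The close-then-remember pattern can be looked at in isolation. This is a minimal sketch using fake streams, so it runs without Twitter keys. makeFakeStream and replaceStream are hypothetical helpers written for this example; the real stream object comes from the twitter package.

```javascript
var EventEmitter = require('events');

// Hypothetical stand-in for a Twitter stream: emits events and can be destroyed.
function makeFakeStream(name) {
    var stream = new EventEmitter();
    stream.name = name;
    stream.destroyed = false;
    stream.destroy = function () { stream.destroyed = true; };
    return stream;
}

var currentstream = null;

// Close whatever is open, then remember the new stream.
function replaceStream(stream) {
    if (currentstream) {
        currentstream.destroy();
    }
    currentstream = stream;
}

var first = makeFakeStream('love');
var second = makeFakeStream('pizza');
replaceStream(first);
replaceStream(second);

console.log(first.destroyed, second.destroyed); // true false
```

Because there is a single currentstream variable for the whole server, a search from one browser tab also ends the stream of every other tab. That matches the goal of keeping only one stream open towards Twitter.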

At this point, our app has done its job and now waits for Twitter's response. Incoming data from Twitter (the tweets we are searching for) is then sent to the browser using the socket and a 'tweet' message. The browser will receive that 'tweet' message and display its contents to the user. If there's an error, it is printed to your terminal.

server.listen(3000);

This final line tells the node.js server to start up and listen for requests coming in at port 3000. This means after we run this file, we can browse to localhost:3000 to reach the app.

Public Files

To display something in the browser, we'll need to create the remaining files. Change directory into your project folder and create the public folder and the empty index.html and style.css files. We'll flesh them out in the next step.

$ cd ~/myproject
$ mkdir public && cd public
$ touch index.html
$ touch style.css

index.html

This is our HTML file, which is displayed to users when they browse to our webserver.

<!DOCTYPE html>
<html>
<head>
    <meta charset='utf-8'>
    <meta name='viewport' content='width=device-width,initial-scale=1.0' />
    <link rel='stylesheet' href='/style.css'>
</head>

We establish the boilerplate HTML5 meta values and link the style.css file.

<body>
<form>
    <input id="query">
    <input type="submit" id="submit" style="display:none">
</form>

This is our little search bar at the top of the page. We hide the submit button for cosmetics.

<div id="container"></div>
<script src="https://code.jquery.com/jquery-2.1.4.min.js"></script>
<script src="/socket.io/socket.io.js"></script>

We define the container div which will contain all the tweets. This div and the search-form above are the only HTML elements we need.

Next, we load jQuery from its CDN. It will make our life a bit easier when displaying the tweets using JavaScript.

Lastly, we import socket.io's JavaScript file, which is provided by the package. It enables us to communicate with node.js via a socket. Let's get to the JavaScript part of our webpage.

<script>
var socket = io.connect('http://localhost:3000');
var regexp = /((ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?)/gi;

We create and open a socket to our node.js server using the io object that became available when we imported socket.io.js above. It is the counterpart to the io object in our app.js file.

Then, we define a regular expression which we'll use to format the incoming tweets.

socket.on('init', function () {
    socket.emit('query', 'love');
});

This is the function that listens for the 'init' message coming from the node.js server. Remember: it tells us that node.js is ready to process queries. To start streaming tweets right away, we'll send a 'query' message to node.js using "love" as the search term.

socket.on('tweet', function (data) {
    var text = data.text.replace(regexp, '<a href="$1" target="_blank">$1</a>');
    var usr = "<span style='color:#4183C4'>" + '@' + data.user.screen_name + "</span>";
    $('#container').prepend(usr + " : " + text + "<br>");
});

This function listens for a 'tweet' message coming from node.js. When the callback runs, a new tweet has arrived and we'll need to display it on the screen.

The data object passed into the callback contains a whole block of data about that individual Tweet. But we'll focus on only two things here: the username (data.user.screen_name) and the actual text of the Tweet (data.text). We'll display the Tweet in the following schema:

<@> <username> : <text>

As we know, tweets often contain links to websites. But we receive them in plain text, so they're not clickable when displayed in our browser. The regular expression we defined previously is just for that purpose.

We match everything in the Tweet that looks like a link and wrap it in an anchor tag to make it clickable. This is done in the first line of the function body. The final string is saved in the text variable.

Next, we construct a span element for the username. We'll give it a nice color and save it in the usr variable.

What's left is to combine usr and text and put them into our container element. We use jQuery to insert the Tweet at the beginning of the container.
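
To see what the formatting produces without opening a browser, the same steps can be run in node.js on a hand-written object. This is only a sketch: formatTweet and sampleTweet are made up for this example, and a real Tweet object carries many more fields than the two used here.

```javascript
// Same regular expression as in index.html.
var regexp = /((ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?)/gi;

// Hypothetical helper: builds the HTML for one tweet, as the 'tweet' handler does.
function formatTweet(data) {
    // $1 is the whole matched URL; wrap it in an anchor tag.
    var text = data.text.replace(regexp, '<a href="$1" target="_blank">$1</a>');
    var usr = "<span style='color:#4183C4'>" + '@' + data.user.screen_name + "</span>";
    return usr + " : " + text + "<br>";
}

// Made-up sample with only the two fields the page uses.
var sampleTweet = {
    text: 'Pizza time https://example.com/menu',
    user: { screen_name: 'alice' }
};

console.log(formatTweet(sampleTweet));
```

One thing to be aware of: the pattern takes every non-whitespace character after the protocol, so punctuation that directly follows a URL (a closing period, for example) ends up inside the link.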

$("form").submit(function(e) {
    e.preventDefault();
    var query = $('#query').val();
    socket.emit('query', query);
});

The submit function of our form is run when we hit the enter key after typing a search term. It uses the socket we connected to node.js to emit a 'query' message. With the message, we send the keyword typed in the text field. Node.js will receive and then submit the keyword to the Twitter API.

</script>
</body>
</html>

style.css

html {
    font:14px Helvetica Neue;
    background-color:#eee;
}

a {
    color:#999;
}

input {
    margin:0.5em;
}

#container {
    line-height:1.2em;
    padding:0.5em;
}

This is the little style.css file. Nothing fancy here, just a bit of color and spacing adjustments.

Final Steps

With all our files written, we're ready to go. Move into your project directory and start up our app.js using node.js.

$ cd ~/myproject
$ node app.js

Browse to localhost:3000 using your favourite browser and you'll see a continuous stream of people all over the world expressing their love for seemingly random things. You can search for a single keyword, or for several at once by separating them with commas.
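
Typed queries tend to contain stray spaces or a trailing comma. A small cleanup step before emitting the 'query' message keeps the keyword list tidy. This is a sketch with a hypothetical normalizeQuery helper; the app above sends the text exactly as typed.

```javascript
// Hypothetical helper: tidies the text typed into the search box.
function normalizeQuery(input) {
    return input
        .split(',')
        // Remove whitespace around each keyword.
        .map(function (keyword) { return keyword.trim(); })
        // Drop empty entries caused by doubled or trailing commas.
        .filter(function (keyword) { return keyword.length > 0; })
        .join(',');
}

console.log(normalizeQuery(' pizza , jogging,, ')); // pizza,jogging
```

An empty result means there is nothing to search for, so the form handler could simply skip the emit in that case.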

Twitter restricts how often you can issue requests in a certain timeframe, so spamming will result in a message printed out by node.js in your console, blocking your stream for a few seconds.
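
If you ever add automatic reconnecting, retrying immediately after such a message only makes things worse. A common approach is to wait longer after each failed attempt. The sketch below shows just the delay calculation. The backoffDelay helper and the 5 and 60 second values are illustrative assumptions, not numbers taken from Twitter's documentation, and the app above does not retry at all.

```javascript
// Hypothetical helper: how long to wait before retry number `attempt` (0-based).
// The delay doubles with every attempt and is capped at maxMs.
function backoffDelay(attempt, baseMs, maxMs) {
    var delay = baseMs * Math.pow(2, attempt);
    return Math.min(delay, maxMs);
}

// Illustrative values: start at 5 seconds, never wait longer than 60.
[0, 1, 2, 3, 4].forEach(function (attempt) {
    console.log(attempt, backoffDelay(attempt, 5000, 60000));
});
```

With these values the waits are 5, 10, 20 and 40 seconds, after which every further attempt waits the full 60 seconds.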

Fun stuff to search for are mundane things like eating pizza or jogging. Especially interesting are trending topics and viral videos shared at the moment. Or even philosophical things like comparing the number of tweets sent containing "love" versus "hate". You can do some market research or find people talking about your hobbies. For most people (me included) it's a fun toy. But it might be of greater value to others.

The source code is available here. Since it's good practice to not ship the node_modules folder with node.js projects, you'll have to run $ npm install once before using the app. Insert your keys and you should be good to go.

Further Thoughts

From a user's perspective this app isn't a good app. There's no way of stopping the tweets coming in once a search has started. Tweets might run by way too fast. And it's cumbersome to start and stop the node.js server every time you want to use it. There are projects out there which turn this little program into a conventional app, so that's always an option.
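
The first two complaints could be eased in the browser with a small gate in front of the display code. This is one possible sketch, not something the app above contains: createTweetGate is a hypothetical helper that lets at most one tweet through per interval and nothing while paused. The current time is passed in as a number so the logic can be tried outside a browser.

```javascript
// Hypothetical helper: lets at most one tweet through every `intervalMs`
// milliseconds, and nothing at all while paused.
function createTweetGate(intervalMs) {
    var lastShown = -Infinity;
    var paused = false;
    return {
        pause: function () { paused = true; },
        resume: function () { paused = false; },
        // `now` is a timestamp in milliseconds, e.g. Date.now().
        shouldShow: function (now) {
            if (paused || now - lastShown < intervalMs) {
                return false;
            }
            lastShown = now;
            return true;
        }
    };
}

var gate = createTweetGate(500);
console.log(gate.shouldShow(0));    // true
console.log(gate.shouldShow(200));  // false
console.log(gate.shouldShow(600));  // true
gate.pause();
console.log(gate.shouldShow(2000)); // false
```

In index.html the 'tweet' handler would only prepend a tweet when gate.shouldShow(Date.now()) returns true, and a pause button would call gate.pause() and gate.resume(). Tweets that arrive in between are simply dropped.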

If you have questions or want to share your thoughts, feel free to use the comment section below. Have fun!

Developed by Maximilian Schmidt