Show your best - Interactive campaign

  • Role: Technical director & Interactive producer
  • Digital production: The Rumpus Room, London
  • Agency: Cole & Weber United, Seattle
  • Client: The International Olympic Committee
  • Live date: nov 2011

Flux of MEME - 1st year final report

So it is now time to present the results obtained during the first year of research and development on the Flux of Meme project, and I was glad to fly to Milan for the presentation at Telecom Italia last friday 30th. Thanks-a-mil to Laurent-Walter Goix and Carlo Alberto Licciardi at Telecom for the constant support, reviews and recommendations: it immensely helped to achieve this result. And thanks-two-mils to Giuseppe Serra and Marco Bertini (also with the help of Federico Frappi) at the Media Integration and Communication Center for the help provided in the definition and fine-tweaking of algorithms. Looking forward to starting Flux phase 2!

This is a quick keynote that highlights the main elements of this geo-clustering and topic extraction tool, using twitter as a main data source but willing to expand to proper context-based data heterogeneous sources.
[slideshare id=9492782&doc=fom110930-110930171037-phpapp02]

Twitter geo-located clustering and topic analysis, now opensource!

A year has passed since the beginning of the trial of Flux of MEME, the project I have presented during the Working Capital tour, and it is now time to analyze what has been learned and show what has been developed to conclude this R&D phase and deliver results to Telecom Italia.

the initial idea

It’s worthwhile giving a quick description of the context: Twitter is a company formed in 2006 which has received several rounds of funding by venture capitals over the past few years, this leading to today's valuation of $1.2B, still during the summer of 2009 the service was not yet mature and widespread as it may look now. At that time the development of the Twitter API had just started, this probably being one of the few sources, if not the only one, for geo-referenced data. The whole concept of communication in the form of public gossip, mediated by a channel that accepts 140 characters per message, was appearing in the world of social networks for the first time.
This lead to the base idea of crunching this data stream, which most importantly include the geographical source, then summarize the content, so as to analyze the space-time evolution of the concepts described and, ultimately, make a prediction of how they could migrate in space and time.

A practical use

It could allow you to control and curb the trend of potentially risky situations (such as social network analysis has been useful during the recent riots in London) or even define marketing strategies targeted to the local context.

The implementation

A consistent initial phase of research allowed to have an overview on different aspects: the ability to capture the information from Twitter, the structure of captured data, the ability to obtaining geo-located information, the classification of languages of the tweets, the enrichment of content through discovery of related information, the possible functions for spatial clustering, the algorithms for topic extraction, the definition of views useful for an operator and finally the ability to perform a trend analysis on the information extracted. All of this has resulted in a substantial amount of programming code, its outcome being a demonstrator for the validity of the initial theory.

space-time evolution of the concept "earthquake" in a limited subset of data captured during the period May 2011"

distribution of groups of tweets source languages ​​over Switzerland and northern Italy

The future of the project

The development done so far has had two important results: firstly, it allowed to demonstrate the validity of the initial idea, and secondly it has revealed the requirements needed by the system to be fully functional. The main problem lays in the architecture implemented for the demonstrator, which at the moment relies on a limited amount of data (for obvious reasons of availability of resources): this immediately proved the necessity of scaling up the application environment in a more complex architecture for distributed computing  The market and/or Telecom Italia will eventually decide if this second phase of development can be faced.

References

Configuring NGINX and CodeIgniter on Ubuntu Oneiric Ocelot on Amazon EC2

Few days ago I started the server setup for a web project @therumpusroom_ and, after receiving the traffic estimates, I thought a single Apache server was not enough to handle the expected load of visitors. For several reasons I want to avoid using a load balancer and multiple Apache instances, hence the decision to implement Nginx with MySql running on a separate dedicated server.

The whole infrastructure lives on Amazon Web Services and the web application - still under development - will rely on CodeIgniter. I have read quite a lot of articles on-line and stolen bits and pieces of configuration files, but none of them entirely reflected what I needed. I feel it is quite a common configuration hence I am writing down here the required steps and some code snippets, both for my personal records and also hoping it can be helpful for someone else with similar issues.

The premise: implement a CodeIgniter installation on Amazon EC2 with a dedicated DB server and content delivery network for rich media distribution.

Pre-requisites / specs: Ubuntu 11.10 Oneiric Ocelot 64bit with Nginx web server running on a large instance on Amazon EC2, dedicated MySQL server on Amazon RDS and Cloudfront CDN.

The steps:

1. choose your Ubuntu installation

I ended up choosing Oneiric Ocelot 64bit, I am always too tempted to try the latest, anyhow you can always find your own Ubuntu AMI using the super helpful AMI locator for EC2

2. start a basic NGINX installation

I used this guide on Linode to configure Nginx and PHP-FastCGI on Ubuntu 11.04 (Natty) as a starting point, just be aware of the following:

  • ignore the hostname configuration: it did not work for me and it is not relevant to make the web server work properly
  • start with the suggested config for nginx, but keep in mind you will need to finalize it later

also the init.d/php-fastcgi script in the Linode guide gave errors and was not working properly for me, so I have created a simpler version (you may need to manually create pid/socket folders before running the script the first time):

PHP_SCRIPT=/usr/bin/php-fastcgi
PID_DIR=/var/run/php-fastcgi
PID_FILE=/var/run/php-fastcgi/php-fastcgi.pid
SOCKET_FILE=/var/run/php-fastcgi/php-fastcgi.socket
RET_VAL=0

case "$1" in
    start)
      $PHP_SCRIPT
      RET_VAL=$?
  ;;
    stop)
      rm $PID_FILE
      rm $SOCKET_FILE
      killall -9 php5-cgi
      RET_VAL=$?
  ;;
    restart)
      rm $PID_FILE
      rm $SOCKET_FILE
      killall -9 php5-cgi
      $PHP_SCRIPT
      RET_VAL=$?
  ;;
    status)
      echo "php-fastcgi running with PID `cat $PID_FILE`"
  ;;
    *)
      echo "Usage: php-fastcgi {start|stop|restart|status}"
      RET_VAL=1
  ;;
esac
exit $RET_VAL

by this time you should be able to execute some test php code to chech that your FastCGI script is working properly and receiving parameters from the web server just using the default site already enabled.

3. setup CodeIgniter

now the interesting part: setting up CodeIgniter with correct locations is not straight forward. There is an interesting thread on the official CodeIgniter Forum, pointing the right way but unfortunately it does not entirely solve the problem.

After downloading CodeIgniter and extracting the archive in the document root, the first important step required to see at least the welcome screen is to setup the configuration file so as to receive parameters from the web server, under /application/config/config.php

$config['uri_protocol'] = 'REQUEST_URI';

and finally setup the Nginx "virtual host" to serve the correct directories and path infos used by CodeIgniter controllers to receive parameters: in my setup I have the CodeIgniter application folder also serving the main static contents (under /application/public with subfolders: css, img, js). I started from a config file find on gist then tweaked to reflect my specific needs. Here is the code:

server {
    server_name project.staging.example.com;
    access_log /home/ubuntu/repo/staging/logs/access.log;
    error_log /home/ubuntu/repo/staging/logs/error.log;

    root /home/ubuntu/repo/staging/webdev;
    index index.php index.html;

    location ~* ^/(css|img|js)/(.+)$ {
        root /home/ubuntu/repo/staging/webdev/application/public;
    }

    location / {
        try_files $uri $uri/ @rewrites;
    }

    location @rewrites {
        if (!-e $request_filename)
        {
            rewrite ^/(.*)$ /index.php/$1 last;
            break;
        }
    }

    location /(application|system) {
        internal;
    }

   location ~ \.php {
        include /etc/nginx/fastcgi_params;

        fastcgi_split_path_info ^(.+\.php)(/.+)$;
        fastcgi_param PATH_INFO $fastcgi_path_info;
        fastcgi_param PATH_TRANSLATED $document_root$fastcgi_path_info;
        fastcgi_param SCRIPT_NAME $fastcgi_script_name;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass unix:/var/run/php-fastcgi/php-fastcgi.socket;
        fastcgi_index index.php;
    }
}

this should be all, hope this helps and feel free to drop a line below for questions.

wordpress and google+ vanity url

I could not resist, I had to have my google+ vanity url on my wordpress installation, and this is pretty straight forward using a well-known redirection plugin, just pick your profile id from google+ (as in figure below) and setup a 301 redirect from <your_address>/+ to your profile, that's it!

if you want to say hello: http://tom.londondroids.com/+