What if Twitter wasn't the fastest one…

We have recently done some comparative analysis of propagation dynamics on Twitter and on FriendFeed. As our case study we chose the news about the rescue operations following the San José mining accident (which left 33 men trapped 700 metres (2,300 ft) below ground for 69 days) [I’m not describing the research now, since we have a paper under submission, so stay tuned for more about the research itself].
As a side product of this research we had the opportunity to monitor the audience exposed to the miners’ rescue news on both FriendFeed and Twitter. We were therefore able to observe how fast a specific news item spread through both networks and, since we are observing the same news, we can assume that the different propagation speeds are related to the different propagation mechanisms at work in the two systems.
Even considering the huge difference in absolute numbers (Twitter has a larger number of users), the line of FriendFeed-based propagation (Esposti FF, i.e. exposed users on FriendFeed) is steeper and shows a less linear progression than Twitter’s line (Esposti TW).


This seems to suggest that propagation based largely on the interactions of the people you follow is faster than propagation based mostly on explicit re-sharing practices (ReTweets).
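One way to compare two propagation curves despite the difference in absolute numbers is to normalise each curve by its own final audience. The sketch below only illustrates that idea: the exposure times are invented, and `cumulative_share` is a hypothetical helper, not the procedure used in the paper under submission.

```python
from bisect import bisect_right


def cumulative_share(exposure_times, checkpoints):
    """For each checkpoint, return the share of the final audience already reached."""
    times = sorted(exposure_times)
    total = len(times)
    return [bisect_right(times, t) / total for t in checkpoints]


# Hypothetical data: hours elapsed before each user was first exposed to the news.
ff_times = [0.5, 1, 1, 1.5, 2, 2, 2.5, 3, 6, 10]
tw_times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

checkpoints = [2, 4, 6, 8, 10]
ff_curve = cumulative_share(ff_times, checkpoints)
tw_curve = cumulative_share(tw_times, checkpoints)

for hour, ff, tw in zip(checkpoints, ff_curve, tw_curve):
    print(f"after {hour:>2}h  FF {ff:.0%}  TW {tw:.0%}")
```

In this made-up example the first curve reaches most of its audience early and then flattens (steep and non-linear), while the second grows by the same amount at every checkpoint (linear).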

As a side effect of a study we recently completed on the online propagation of news about the rescue of the Chilean miners trapped in the San José mine, we were able to measure how fast a news item propagates within the FriendFeed and Twitter networks. Since the starting news item was the same, we can hypothesise that the differences in propagation we observed are attributable to the propagation mechanisms of the two systems. Even taking into account the huge difference in absolute numbers, the FriendFeed propagation curve looks much steeper and less linear than Twitter’s.

This difference seems to indicate that a propagation mechanism based on the interactions of one’s contacts (as happens on FriendFeed), rather than on explicit propagation practices, is more effective in terms of speed.

When followers are not enough

We have just gathered a brand new dataset of FriendFeed data (you can download it in the Data section, where it is named the 2010-a dataset). Since it is considerably larger than our previous database, we decided to test a few more hypotheses on information propagation in SNSs. A key idea in discussions of how information spreads online is that being well connected is essential to any propagation strategy. This point can be roughly summarised as: the more followers you have, the more people you can inform. We have already challenged this assumption before, and we wanted to test it more deeply.
We therefore analysed the relationship between the actual number of followers and the average audience of every user. We defined the average audience as the average number of users exposed to the messages sent by a specific user during our sample time. Because of the technical structure of Friendfeed, users who are able to start the most engaging discussions have a better chance of reaching an actual audience larger than the simple list of their followers.
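The definition of average audience can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used for the analysis: the `(author, exposed_users)` layout, the user names, and the follower counts are all hypothetical.

```python
from collections import defaultdict


def average_audience(entries):
    """entries: iterable of (author, exposed_users) pairs, one pair per message."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for author, exposed in entries:
        totals[author] += len(set(exposed))  # count each exposed user once per message
        counts[author] += 1
    return {author: totals[author] / counts[author] for author in totals}


# Hypothetical sample: alice has many followers, bob starts engaging discussions.
followers = {"alice": 500, "bob": 40}
entries = [
    ("alice", ["u1", "u2"]),
    ("alice", ["u1", "u2", "u3", "u4"]),
    ("bob", ["u1", "u2", "u3", "u4", "u5", "u6"]),
]

audience = average_audience(entries)
by_followers = sorted(followers, key=followers.get, reverse=True)
by_audience = sorted(audience, key=audience.get, reverse=True)
print(by_followers, by_audience)
```

Comparing the two rankings is what reveals users whose audience is out of proportion to their follower count.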

Followers /Avg Users

As the graph shows (it includes only the top 20 users by number of followers), there can be a huge difference between a user’s followers and the actual audience that user can engage. Interestingly, the user with the largest average audience ranks only 18th by number of followers.
As we said before, when we are dealing with social phenomena and user engagement (as in online propagation), followers are not enough.
Starting from the latest dataset of FriendFeed data we have acquired, we began testing some hypotheses about how to define the communicative capacity of users within this kind of network. One of the most common assumptions (more so in the past than now) concerns the number of followers: this value is often treated as an indicator of a user’s communicative capacity. Put bluntly, the idea is that someone in contact with many other people is able to reach a large mass of users.
To check this assumption we decided to look at the relationship between the number of followers and the users’ average audience. By average audience we mean the number of users who were exposed to the messages posted by a specific user during our sample period (2 months: August-September 2010).
Followers /Avg Users

Given the nature of FriendFeed, the audience grows beyond the direct followers the more the user is able to start discussions that propagate and involve friends of friends, and so on.
As the image shows [it plots the ratio between followers and average audience for the 20 users with the most followers in the Italian FriendFeed network (public accounts only)], a high number of followers does not necessarily mean a high average audience. Indeed, the user who reaches the largest average audience in absolute terms ranks only eighteenth by number of followers.
In short, once again, when we talk about social networks the numbers can easily deceive.

About Twitter and the (missing) conversation

Sysomos recently released some data showing that only 29% of tweets actually produce a reaction (6% retweets and 23% replies). These data were, of course, followed by a series of comments about the social side of Twitter or its broadcasting nature.
The idea behind this reasoning is that if people do not chat on Twitter, then the social dimension is lost in favour of an endless series of individual messages addressed to the masses. Faced with this reasoning, however, it is perhaps worth reflecting on the data proposed by the Sysomos research.

Visualising Italian Friendfeed Network

Recently we posted on Friendfeed a visualisation of Italian Friendfeed users extracted from the data we collected in 2009. Since the map started an interesting debate (you can read it here, in Italian), we thought we would write a short post explaining how the map was made and what its limits and possibilities are.

The map is based on a network of 8024 nodes and 244542 edges. The map shows only the nodes with more than 147 followers, but the statistical values were computed on the whole network.
We collected the data in September 2009, starting from all the public messages posted on Friendfeed (you can read more about this in our SBP10 paper).
We processed the network with Gephi, and the map shows the indegree value as node size and the betweenness centrality value as node colour.
The final result is rather interesting: on one side it shows a group of huge nodes with many followers, but at the same time it shows that there is no simple correlation between the number of followers and the betweenness centrality value. Since the BC value is often used to identify relevant nodes or hidden hubs, this can be read as: the quality of your connections matters more than their number.
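To make the two metrics concrete, here is a small pure-Python sketch of in-degree and (unnormalised) betweenness centrality on a toy directed graph. It is only an illustration: the values on the map were computed with Gephi, the toy graph is invented, and reading an edge `u -> v` as "u follows v" is an assumption made for this example. Betweenness is computed with Brandes' algorithm for unweighted graphs.

```python
from collections import deque


def in_degree(graph):
    """Number of incoming edges per node (followers, if u -> v means 'u follows v')."""
    degree = {v: 0 for v in graph}
    for targets in graph.values():
        for w in targets:
            degree[w] += 1
    return degree


def betweenness(graph):
    """Unnormalised betweenness centrality (Brandes' algorithm, unweighted, directed)."""
    bc = {v: 0.0 for v in graph}
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}   # number of shortest paths from s
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                    # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:                    # accumulate dependencies, farthest nodes first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


# Hypothetical toy network: "hub" has the most followers, "bridge" sits on the paths.
graph = {
    "f1": ["hub"], "f2": ["hub"], "f3": ["hub"], "hub": [],
    "a": ["bridge"], "c": ["bridge"], "bridge": ["b"], "b": [],
}
followers = in_degree(graph)
bc = betweenness(graph)
print(followers["hub"], bc["hub"], followers["bridge"], bc["bridge"])
```

In this toy graph the node with the most followers lies on no shortest path at all, while a node with fewer followers does, which is the kind of mismatch between the two values that the map makes visible.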
Nevertheless, a final remark is needed. Metrics like betweenness centrality work really well in traditional networks, but they fail to grasp the new conversational nature of the Friendfeed network (and the same could be said about Twitter). In Friendfeed, conversations often exist outside the network made by the following/follower structure. When I come across a message originally produced outside my network only because a friend of mine comments on it, what emerges is an actual network based more on social behaviour than on the underlying set of connections.
We need new social metrics.

July RoundUp

July has been a month packed with SIGSNA activities. At the beginning of the month we were in Oxford for the Research Methods festival [reported here]; after that we had just a few days at home before flying to Gothenburg for the International Sociological Association Conference (ISA 2010). Because of the highly interdisciplinary approach of the SIGSNA project we have to move through many different conferences, so it might be hard to follow the thread of all our presentations, but that’s the best part of it: being able to share our research and our ideas with so many colleagues from a large variety of disciplines. During the ISA conference we presented at the RC51 Sociocybernetics session. Sociocybernetics is an interesting sociological approach rooted in systems theory and complexity theory; nowadays it offers a good theoretical background for a sociological approach to internet studies. What’s really cool is that I won the “Walter Buckley Memorial Award for Excellence in Presenting Sociocybernetics”!!
Here you can check the slides I used during my presentation:

As soon as I made my way back to Bologna I attended the International Visual Sociology Association Conference, held here in Bologna. Visual Sociology is a rather recent and fascinating field of research, and I really wanted to show some visual hints we had from our research. So I presented a brief discussion analysing the top 100 most commented pictures posted on the Italian Friendfeed during our sampling time. I’m happy to say that we had a great panel there, together with some friends also presenting on UGC/SNS pictures (Agnese, Marina, Alessandra, Stefania, and Fatima, and many thanks to Giovanni and Laura, chairs of the session).
Here you can see the slides I presented during my talk:

So what’s next? July was really full of stuff and we recently received the news that the SIGSNA research has been authorised to use some of the computing resources of the CINECA supercomputer centre. I can clearly see a huge amount of work just ahead.

SIGSNA data available

In the Data section you can now download the data. We monitored the activity of Friendfeed from September 6, 2009, 00:00 AM to September 19, 2009, 12:00 PM. The service was monitored at a rate of about 1 to 2 updates every second (depending on network traffic). This is the dataset we are using for our research and it is now available to you. Have fun!

Language identification and source analysis

Even during the Christmas holidays the #SIGSNA team was working on our dataset of conversations. After the positive reaction to our first public presentation we decided to improve our dataset before moving any further. We worked on two main issues: 1) we tried to improve the performance of our language identification system and 2) we tried to identify all the original sources of the Friendfeed entries we collected.
In detail:
1) We were pretty satisfied with the original performance of our language identification system (SLide). The version we used in our first descriptive analysis of the FF network worked fine, with a very low error rate. Nevertheless, the rate of unidentified posts was still too high. This is because every language identification system works better with very long “normal” texts than with very small pieces of text, often written in an informal style. It is much easier to identify a book chapter than a couple of comments like “wow! great” or “cool!” on a single Friendfeed entry. Our team did a great job on that issue and now we’re running a new version of SLide (ver. 0.6) that works much better than the old one, especially on very short text entries.
2) During our first public presentation in Urbino we received a very good question (thank you again Adriano!): how many entries in Friendfeed are posted directly as Friendfeed messages and how many are imported into Friendfeed from external services? As soon as we finished the new version of SLIde we decided to go and look for external services in our dataset. We analysed only the entries (since comments obviously come from Friendfeed) and we discovered something very interesting: the largest number of messages in Friendfeed comes from Twitter! The synchronisation between the two services works just fine (as many of you probably know), so a large number of users seem to have a single microblogging feed (the Twitter one) which is imported into Friendfeed.
Friendfeed itself is the second source of entries (not surprisingly), and Tumblr and Facebook are in third and fourth place.
Another interesting aspect is that half of the entries come from about a thousand different services: Google News, research query feeds, minor news services and many more.
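A source breakdown like the one described above can be sketched as a simple tally. This is a minimal illustration, not our processing pipeline: the `source` field name and the sample entries are hypothetical, and the counts are invented.

```python
from collections import Counter


def source_shares(entries):
    """Return (source, count, share) tuples, most frequent source first."""
    counts = Counter(entry["source"] for entry in entries)
    total = sum(counts.values())
    return [(source, n, n / total) for source, n in counts.most_common()]


# Hypothetical entries; comments are excluded because they always originate in Friendfeed.
entries = [
    {"id": 1, "source": "twitter"},
    {"id": 2, "source": "twitter"},
    {"id": 3, "source": "twitter"},
    {"id": 4, "source": "friendfeed"},
    {"id": 5, "source": "friendfeed"},
    {"id": 6, "source": "tumblr"},
]

shares = source_shares(entries)
for source, n, share in shares:
    print(f"{source:<12}{n:>3}  {share:.0%}")
```

On a real dataset the interesting part is the tail of this list: summing the shares beyond the first few rows shows how much of the traffic comes from minor services.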
Friendfeed is really a complex galaxy of information feeds coming from everywhere.
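Going back to the short-text problem in point 1, a toy example may help show why very short comments are hard to identify. The sketch below uses character trigrams, a common baseline for language identification; it is not how SLide works (its internals are not described here), and the two training sentences, the `min_hits` threshold, and the function names are all assumptions made for illustration.

```python
def trigrams(text):
    """All overlapping three-character sequences of the lowercased text."""
    text = text.lower()
    return [text[i:i + 3] for i in range(len(text) - 2)]


# Tiny hypothetical training samples; a real system would use large corpora.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and this is a great day",
    "it": "la volpe veloce salta sopra il cane pigro e questa è una bella giornata",
}
PROFILES = {lang: set(trigrams(sample)) for lang, sample in SAMPLES.items()}


def identify(text, min_hits=3):
    """Return the best-matching language, or None when the evidence is too thin."""
    scores = {
        lang: sum(gram in profile for gram in trigrams(text))
        for lang, profile in PROFILES.items()
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best, best_score), (_, runner_up) = ranked[0], ranked[1]
    if best_score < min_hits or best_score == runner_up:
        return None  # counted as "unidentified"
    return best


print(identify("this is great"))
print(identify("questa è una bella giornata"))
print(identify("cool!"))
```

A five-character comment yields only three trigrams, so there is almost nothing to match against a profile, whereas a full sentence offers dozens of chances. This is the same effect, on a small scale, that makes a book chapter easier to identify than a one-word comment.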

First public presentation

We’re happy to announce that on Friday we gave the first public presentation of the #SIGSNA project. We presented our paper, titled “SOCIAL NETWORK PRACTICES, AN EMPIRICAL DESCRIPTION OF FRIENDFEED”, at the conference “Le reti socievoli” [this could be translated as “the friendly networks”], hosted by the University of Urbino “Carlo Bo”. We are very happy with the conference and still quite excited by the results: we had the chance to share perspectives and experiences with many interesting people and to see what SNS researchers in Italy are doing.
[SlideShare Presentation]