After the SBP10 conference (the pic is there to prove that we were really there) we are back at work. SBP10 was an exciting and interesting experience for many reasons. First, the conference was very well organized: many interesting presentations, good time management, good food and an extraordinary cross-fertilization session at the end of the conference. We got a good idea of where the research on many relevant topics is going, and we had the opportunity (thanks again for the fruitful cross-fertilization session!) to get in touch with many interesting researchers. Now we just have to see if anything will come out of that.
Preconference work
Live update from Bethesda, MD. We arrived in Bethesda yesterday and we are still trying to recover from the jet lag. This morning, after a great breakfast at the local pancake house, we are doing some work on a conversation ranking system based on our Friendfeed data. Being able to rank online conversations is undoubtedly a great challenge, and we are currently working on a model that combines relational aspects (followers and popularity of the people taking part in the conversation) with conversational aspects (how people took part in the conversation, with what degree of involvement, how widely the conversation spread across the network, and so on). We are pretty excited about this model and we hope to be able to share something more soon.
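To make the idea concrete, here is a minimal sketch of how relational and conversational signals could be combined into a single ranking score. It is not the #SIGSNA model: the field names, the log scaling and the weight alpha are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from math import log1p
from typing import List


@dataclass
class Conversation:
    # Every field here is a hypothetical name, not the project's schema.
    entry_id: str
    follower_counts: List[int]    # followers of each participant
    comments_per_user: List[int]  # comments written by each participant
    reach: int                    # users the conversation spread to


def relational_score(conv: Conversation) -> float:
    """Average log-scaled popularity of the participants."""
    if not conv.follower_counts:
        return 0.0
    total = sum(log1p(f) for f in conv.follower_counts)
    return total / len(conv.follower_counts)


def conversational_score(conv: Conversation) -> float:
    """Involvement (comments per participant) plus spread, both log-scaled."""
    participants = len(conv.comments_per_user)
    involvement = sum(conv.comments_per_user) / participants if participants else 0.0
    return log1p(involvement) + log1p(conv.reach)


def rank(conversations: List[Conversation], alpha: float = 0.5) -> List[Conversation]:
    """Sort conversations by a weighted mix of the two scores.

    alpha is an arbitrary illustrative weight: 1.0 uses only the
    relational score, 0.0 only the conversational one.
    """
    def score(conv: Conversation) -> float:
        return alpha * relational_score(conv) + (1 - alpha) * conversational_score(conv)

    return sorted(conversations, key=score, reverse=True)
```

Calling rank() on a list of Conversation objects returns them from highest to lowest score. The log scaling is one way to stop a single very popular participant from dominating the result; other choices would be just as defensible.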
In the meantime, tomorrow we’re presenting our project and our data here; you can follow us on Twitter.
Social Computing, Behavioral Modeling, & Prediction
Tomorrow we are going to Bethesda to attend the Social Computing, Behavioral Modeling, & Prediction Conference. We are going to give a poster presentation with a very short speech focused on the preliminary results of our research. You can either read the article in the Data&Papers section or take a look at the poster.
SIGSNA data available
In the Data section you can now download the data. We monitored the activity of Friendfeed from September 6, 2009, 00:00 AM to September 19, 2009, 12:00 PM. The service was monitored at a rate of about 1 to 2 updates every second (depending on network traffic). This is the dataset we are using for our research and it is now available to you. Have fun!
Language identification and source analysis
Even during the Christmas holidays the #SIGSNA team was working on our dataset of conversations. After the positive reaction to our first public presentation we decided to improve our dataset before taking any further steps. We worked on two main issues: 1) we tried to improve the performance of our language identification system and 2) we tried to identify all the original sources of the Friendfeed entries we’ve collected.
In detail:
1) We were pretty satisfied with the original performance of our language identification system (SLide). The version we used in our first descriptive analysis of the FF network worked fine, with a very low error rate. Nevertheless, the rate of unidentified posts was still too high. This is because every language identification system works better with long “normal” texts than with very short pieces of text, often written in an informal style. It is much easier to identify the language of a book chapter than that of a couple of comments like “wow! great” or “cool!” on a single Friendfeed entry. Our team did a great job on that issue and now we’re running a new version of SLide (ver. 0.6) that works much better than the old one, especially on very short text entries.
2) During our first public presentation in Urbino we received a very good question (thank you again, Adriano!): how many entries in Friendfeed are posted directly as Friendfeed messages, and how many are imported into Friendfeed from external services? As soon as we finished the new version of SLide we decided to go and look for external services in our dataset. We analyzed only the entries (since comments obviously come from Friendfeed) and we discovered something very interesting: the largest number of messages in Friendfeed comes from Twitter! The synchronization between the two services works just fine (as many of you probably know), so a large number of users seem to have a single microblogging feed (the Twitter one) which is imported into Friendfeed.
Friendfeed itself is the second source of entries (not surprising), and Tumblr and Facebook are in third and fourth place.
Another interesting aspect is that half of the entries come from a thousand different services: Google News, research query feeds, minor news services and many more.
Friendfeed is really a complex galaxy of information feeds coming from everywhere.
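A source breakdown like the one described above can be computed with a simple frequency count. The sketch below assumes each entry is a dictionary with a "service" key naming where it came from; that field name and the toy rows are made up for illustration and are not taken from the #SIGSNA dataset.

```python
from collections import Counter


def source_shares(entries):
    """Return (service, count, share) tuples, most frequent service first."""
    counts = Counter(entry["service"] for entry in entries)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(service, n, n / total) for service, n in counts.most_common()]


# Toy rows only; these are not real figures from the dataset.
sample_entries = [
    {"id": 1, "service": "twitter"},
    {"id": 2, "service": "twitter"},
    {"id": 3, "service": "twitter"},
    {"id": 4, "service": "friendfeed"},
    {"id": 5, "service": "friendfeed"},
    {"id": 6, "service": "tumblr"},
]

for service, n, share in source_shares(sample_entries):
    print(f"{service}: {n} entries ({share:.0%})")
```

Only entries are counted here, not comments, which mirrors the choice described above of leaving comments out because they always originate on Friendfeed.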
First public presentation
We’re happy to announce that on Friday we gave the first public presentation of the #SIGSNA project. We presented our paper titled “SOCIAL NETWORK PRACTICES, AN EMPIRICAL DESCRIPTION OF FRIENDFEED” at the conference “Le reti socievoli” [this could be translated as “the friendly networks”], hosted by the University of Urbino “Carlo Bo”. We are very happy about the conference and still quite excited about the results. We had the chance to share perspectives and experiences with many interesting people and to see what SNS researchers in Italy are doing.
[SlideShare Presentation]