Oct 25, 2012
 

Recently Zygis and I had a project related to Last.fm and the sampling of online social networks.

What to do?

  1. Crawl Last.fm with breadth-first search (BFS) and track how the average node degree changes over exploration time.
  2. Start the BFS from the RJ node
  3. Use multithreading

What was done?

  1. BFS over Last.fm was implemented
  2. Multithreading was implemented to speed up the BFS
  3. The output was written in .gdf format
  4. Gephi was used to visualize the results
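
The steps above can be sketched roughly as follows. This is not our actual crawler, only a minimal illustration: `fetch_friends` and `TOY_GRAPH` are hypothetical stand-ins for a real Last.fm friends lookup, and `crawl_bfs` and `to_gdf` are names made up for this example. The BFS runs level by level, the lookups of one level are done in parallel threads, and the result is written in the simplest form of the .gdf format that Gephi reads (it assumes user names contain no commas).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy data standing in for Last.fm; a real crawler would
# replace fetch_friends with an HTTP request for a user's friend list.
TOY_GRAPH = {
    "RJ": ["a", "b", "c"],
    "a": ["RJ", "b"],
    "b": ["RJ", "a", "d"],
    "c": ["RJ"],
    "d": ["b"],
}


def fetch_friends(user):
    return TOY_GRAPH.get(user, [])


def crawl_bfs(seed, fetch, max_nodes=60000, workers=8):
    """Level-by-level BFS; the lookups of each level run in parallel threads."""
    adjacency = {}        # explored user -> list of friends
    seen = {seed}         # users already queued or explored
    frontier = [seed]     # users to explore on the current level
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier and len(adjacency) < max_nodes:
            # Do not explore more users than the remaining budget allows.
            batch = frontier[: max_nodes - len(adjacency)]
            results = list(pool.map(fetch, batch))  # keeps the input order
            next_frontier = []
            for user, friends in zip(batch, results):
                adjacency[user] = friends
                for friend in friends:
                    if friend not in seen:
                        seen.add(friend)
                        next_frontier.append(friend)
            frontier = next_frontier
    return adjacency


def to_gdf(adjacency):
    """Serialize the crawled graph as GDF text (undirected, no attributes)."""
    nodes = set(adjacency)
    edges = set()
    for user, friends in adjacency.items():
        for friend in friends:
            nodes.add(friend)
            edges.add(tuple(sorted((user, friend))))  # store each edge once
    lines = ["nodedef>name VARCHAR"]
    lines += sorted(nodes)
    lines.append("edgedef>node1 VARCHAR,node2 VARCHAR")
    lines += [f"{a},{b}" for a, b in sorted(edges)]
    return "\n".join(lines) + "\n"


adjacency = crawl_bfs("RJ", fetch_friends, max_nodes=10, workers=2)
gdf_text = to_gdf(adjacency)
```

Saving `gdf_text` to a file with a `.gdf` extension should give something Gephi can open. Threads fit here because a crawler spends most of its time waiting for the network, not computing.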

Findings:

  1. Multithreading speeds up the execution of BFS 🙂 (obviously)
  2. The average node degree after exploring 60 000 nodes is 125 (the real one is around 13). BFS tends to reach high-degree nodes first, which likely explains the overestimate.
  3. Gephi is a very nice and easy to use tool to visualize the graph (if the number of nodes is less than 10 000 :))
  4. RJ is a friend of many hubs in the Last.fm graph
  5. Depth of the exploration: 3.5
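
To illustrate the second finding, here is a small sketch of how a partial BFS can overestimate the average degree. The graph is a made-up toy (one seed, a few hubs, many leaves), not Last.fm data, and `build_toy_graph` and `bfs_average_degree` are hypothetical helper names used only for this example.

```python
from collections import deque


def build_toy_graph(hubs=10, leaves_per_hub=5):
    """Seed linked to `hubs` hub nodes; each hub has `leaves_per_hub` leaves."""
    graph = {"seed": []}
    for h in range(hubs):
        hub = f"hub{h}"
        graph[hub] = ["seed"]
        graph["seed"].append(hub)
        for i in range(leaves_per_hub):
            leaf = f"leaf{h}_{i}"
            graph[leaf] = [hub]
            graph[hub].append(leaf)
    return graph


def bfs_average_degree(graph, seed, budget):
    """Average degree of the first `budget` nodes visited by BFS from `seed`."""
    seen = {seed}
    queue = deque([seed])
    degrees = []
    while queue and len(degrees) < budget:
        node = queue.popleft()
        degrees.append(len(graph[node]))
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return sum(degrees) / len(degrees)


graph = build_toy_graph()
true_avg = sum(len(friends) for friends in graph.values()) / len(graph)
sampled_avg = bfs_average_degree(graph, "seed", budget=11)
print(f"true average degree:   {true_avg:.2f}")
print(f"after 11 of {len(graph)} nodes: {sampled_avg:.2f}")
```

In this toy graph the first 11 visited nodes are the seed and the hubs, so the sampled average is well above the true one; it only comes down once the low-degree leaves are explored too.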

Code:

is here.