Recently Zygis and I had a project related to Last.fm and the sampling of online social networks.
What to do?
- Crawl Last.fm with breadth-first search (BFS) and track how the average node degree changes over exploration time.
- Start the BFS from the RJ node
- Use multithreading
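To make the task concrete, here is a minimal sketch of a BFS that records the running average node degree after each explored node. It is not the project code: `bfs_average_degree` and `get_friends` are hypothetical names, and `get_friends` stands in for whatever call returns a user's friend list (here it is just a lookup in a small in-memory dictionary).

```python
from collections import deque


def bfs_average_degree(start, get_friends, max_nodes):
    """Explore breadth-first from `start` and return the running average
    degree, one value per explored node.

    `get_friends` is a hypothetical callable: node -> list of neighbours.
    """
    seen = {start}            # nodes already queued or explored
    queue = deque([start])    # FIFO queue gives breadth-first order
    degree_sum = 0
    explored = 0
    averages = []

    while queue and explored < max_nodes:
        node = queue.popleft()
        friends = get_friends(node)

        # Update the running average with this node's degree.
        explored += 1
        degree_sum += len(friends)
        averages.append(degree_sum / explored)

        # Queue neighbours that have not been seen yet.
        for friend in friends:
            if friend not in seen:
                seen.add(friend)
                queue.append(friend)

    return averages


# Toy graph used only for illustration; "RJ" is the start node.
toy_graph = {
    "RJ": ["a", "b", "c"],
    "a": ["RJ"],
    "b": ["RJ", "c"],
    "c": ["RJ", "b"],
}
averages = bfs_average_degree("RJ", toy_graph.__getitem__, max_nodes=10)
```

Plotting `averages` against its index gives the "average degree over exploration time" curve.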
What was done?
- BFS over Last.fm was implemented
- Multithreading was implemented to speed up the BFS
- Output is written in .gdf format
- Gephi was used to visualize the results
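The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the actual crawler. It assumes a level-by-level BFS in which the friend lists of one level are fetched in parallel with a thread pool, and it writes the minimal GDF layout (a `nodedef>` section followed by an `edgedef>` section). The names `threaded_bfs`, `to_gdf` and `get_friends` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def threaded_bfs(start, get_friends, max_depth, workers=8):
    """Level-by-level BFS; friend lists of one level are fetched in parallel.

    `get_friends` is a hypothetical callable: node -> list of neighbours.
    Returns the set of discovered nodes and the set of undirected edges.
    """
    seen = {start}
    edges = set()
    frontier = [start]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(max_depth):
            if not frontier:
                break
            # pool.map keeps input order, so results line up with frontier.
            results = list(pool.map(get_friends, frontier))
            next_frontier = []
            for node, friends in zip(frontier, results):
                for friend in friends:
                    # Store each undirected edge once, in sorted order.
                    edges.add(tuple(sorted((node, friend))))
                    if friend not in seen:
                        seen.add(friend)
                        next_frontier.append(friend)
            frontier = next_frontier
    return seen, edges


def to_gdf(nodes, edges):
    """Serialize nodes and edges as GDF text (assumes names contain no commas)."""
    lines = ["nodedef>name VARCHAR"]
    lines += sorted(nodes)
    lines.append("edgedef>node1 VARCHAR,node2 VARCHAR")
    lines += [f"{a},{b}" for a, b in sorted(edges)]
    return "\n".join(lines) + "\n"


# Toy graph used only for illustration.
toy_graph = {
    "RJ": ["a", "b"],
    "a": ["RJ", "c"],
    "b": ["RJ"],
    "c": ["a"],
}
nodes, edges = threaded_bfs("RJ", toy_graph.__getitem__, max_depth=2)
gdf_text = to_gdf(nodes, edges)
```

Threads help here because a crawler spends most of its time waiting on network responses. Only the fetching runs in parallel, and the bookkeeping stays in the main thread, so no locks are needed.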
Findings:
- Multithreading speeds up the execution of BFS 🙂 (obviously)
- The average node degree after exploring 60 000 nodes is 125. (The real one is around 13.)
- Gephi is a very nice, easy-to-use tool for visualizing the graph (if the number of nodes is less than 10 000 :))
- RJ is a friend of many hubs in the Last.fm graph
- Depth of the exploration – 3,5
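A likely reason for the gap between the measured and the real average degree is that BFS reaches high-degree nodes early, so a partial crawl over-represents hubs. The sketch below illustrates the effect on a synthetic graph. It does not use Last.fm data, and the graph shape (a few fully connected hubs, each with many single-friend leaves) and all function names are assumptions made for illustration.

```python
from collections import deque


def build_hub_graph(hubs=5, leaves_per_hub=100):
    """Synthetic graph: hubs are all connected to each other, and every hub
    also has `leaves_per_hub` leaves whose only neighbour is that hub."""
    hub_names = [f"hub{i}" for i in range(hubs)]
    graph = {h: [other for other in hub_names if other != h] for h in hub_names}
    for i, hub in enumerate(hub_names):
        for j in range(leaves_per_hub):
            leaf = f"leaf{i}_{j}"
            graph[leaf] = [hub]
            graph[hub].append(leaf)
    return graph


def true_average_degree(graph):
    """Average degree over every node in the graph."""
    return sum(len(neighbours) for neighbours in graph.values()) / len(graph)


def bfs_sample_average_degree(graph, start, sample_size):
    """Average degree over the first `sample_size` nodes explored by BFS."""
    seen = {start}
    queue = deque([start])
    degrees = []
    while queue and len(degrees) < sample_size:
        node = queue.popleft()
        degrees.append(len(graph[node]))
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return sum(degrees) / len(degrees)


graph = build_hub_graph()
real = true_average_degree(graph)
partial = bfs_sample_average_degree(graph, "hub0", sample_size=5)
full = bfs_sample_average_degree(graph, "hub0", sample_size=len(graph))
```

In this toy graph the first few explored nodes are all hubs, so `partial` comes out far above `real`. Once the whole graph is explored, `full` matches `real`.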
Code:
is here.