irBlogs is a standard collection of Persian weblogs, suitable for studying Persian social networks and for evaluating graph mining and blog retrieval algorithms. Some characteristics of the collection are:
irBlogs Collection
irBlogs contains more than 5,000,000 posts and a relations graph covering more than 600,000 Persian weblogs. It can be used in applications such as information retrieval, the study of the Persian language in online social networks, and even graph theory algorithms. In addition, 45 queries and their relevance judgments were created by different users with the UTIRE evaluation system. Different weblog retrieval algorithms were used to build the judgment pool, and in total 24,339 weblogs were judged by the users (about 540 weblogs per query on average).
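The pooling step mentioned above can be illustrated with a minimal sketch. It assumes the common approach of taking the union of the top-ranked results from each retrieval algorithm for a query; the function name `build_judgment_pool`, the `depth` parameter, and the sample weblog IDs are hypothetical and are not taken from the irBlogs tools or the UTIRE system.

```python
from typing import Dict, List, Set


def build_judgment_pool(runs: Dict[str, List[str]], depth: int) -> Set[str]:
    """Return the union of the top-`depth` weblogs from each retrieval run."""
    pool: Set[str] = set()
    for ranked_blogs in runs.values():
        # Only the first `depth` results of each run enter the pool.
        pool.update(ranked_blogs[:depth])
    return pool


# Hypothetical ranked results for one query from three retrieval algorithms.
runs_for_query = {
    "algo_a": ["blog1", "blog2", "blog3"],
    "algo_b": ["blog2", "blog4", "blog5"],
    "algo_c": ["blog1", "blog4", "blog6"],
}

pool = build_judgment_pool(runs_for_query, depth=2)
print(sorted(pool))  # ['blog1', 'blog2', 'blog4']
```

Weblogs in the pool would then be shown to assessors, so each weblog is judged once per query even if several algorithms return it.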
Download
You can download the whole irBlogs collection here (2.9 GB; a Google Drive link is also provided). To obtain the password, contact dbrg@ut.ac.ir
Copyright
irBlogs was created by crawling Iranian weblogs. All rights to the collection and its tools are reserved by the Database Research Group of the University of Tehran. If you use this collection, please cite [1].
Citation
[1] https://www.sciencedirect.com/science/article/abs/pii/S0747563215302533?via%3Dihub
If you have any problems accessing the dataset, contact us at dbrg@ut.ac.ir