FoafBot Crawler Info
Purpose of the Crawler
I am currently running an experimental webcrawler for the purposes of retrieving and analysing FOAF files.
Status
The crawler is at a very early stage of development; the public version currently only retrieves and parses robots.txt files.
In the near future it will also retrieve FOAF files, in addition to certain other files that can provide references to additional FOAF files. This page will be kept updated with the current status of the crawler.
Bot Info
The following information is currently provided by the Bot.
- Honours the robots.txt standard. If the robot does not appear to be honouring this, please contact me at [email protected]. To specifically exclude this robot, use the User-agent value "FoafHarvester"; case is unimportant.
- Current info contained in the HTTP request headers of the public version:
- Accept - text/plain;text/xml;application/rdf+xml;application/rss+xml;text/*
- User-Agent - FoafHarvester (http://www.benmeadowcroft.com/foaf/crawler.shtml)
- Referer - http://www.benmeadowcroft.com/foaf/crawler.shtml
- From - [email protected]
- Written in C# for the .NET platform
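For site owners who want to check an exclusion rule before publishing it, the following is a minimal sketch in Python. It is not the crawler's own C# code. It assumes the User-agent token is FoafHarvester, as sent in the User-Agent header listed above, and it uses Python's standard `urllib.robotparser`, whose matching may differ in detail from the crawler's. The names `ROBOTS_TXT`, `CRAWLER_HEADERS`, `is_allowed` and `build_request`, the `example.org` URLs and the From address are hypothetical placeholders.

```python
from urllib.request import Request
from urllib.robotparser import RobotFileParser

# Example robots.txt that excludes only this crawler from the whole site.
ROBOTS_TXT = """\
User-agent: FoafHarvester
Disallow: /
"""

# Header values copied from the list above, except From, which is a placeholder.
CRAWLER_HEADERS = {
    "Accept": "text/plain;text/xml;application/rdf+xml;application/rss+xml;text/*",
    "User-Agent": "FoafHarvester (http://www.benmeadowcroft.com/foaf/crawler.shtml)",
    "Referer": "http://www.benmeadowcroft.com/foaf/crawler.shtml",
    "From": "crawler-owner@example.org",  # hypothetical address
}


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def build_request(url: str) -> Request:
    """Build (but do not send) a request carrying the headers listed above."""
    return Request(url, headers=CRAWLER_HEADERS)


if __name__ == "__main__":
    # The rule matches regardless of case; other robots are unaffected.
    print(is_allowed(ROBOTS_TXT, "foafharvester", "http://example.org/foaf.rdf"))
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", "http://example.org/foaf.rdf"))
    print(build_request("http://example.org/robots.txt").header_items())
```

A narrower rule, such as `Disallow: /private/`, can be tested the same way by editing `ROBOTS_TXT`.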
Results
This crawler is gathering information in order to analyse FOAF files. If any data is publicly released at a later date, it will be linked to from here.