On Blog Pinging (again)
Dear Internet,
When a major ping service has problems, I will remind you that Blog Pings are stupid. Some companies rely upon blog pings for their crawling system. What happens when a ping service is down? Your posts don’t get indexed. Oh Snap.
People have noticed that if you want to get higher rankings in some search engines, you should find everyone who has linked to you, and send Pings for them. Ping-spoofing. It really feels like email-spoofing to me. Maybe SPF could be extended to prevent this...
Anyways, at $work we do not currently accept pings from anyone. Instead we crawl every site in our system, every 30 minutes. It’s not that hard, honestly, thanks to memcached and conditional HTTP requests. (Oh, and lots of bandwidth.)
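For anyone who hasn't used conditional HTTP requests, here is a minimal sketch of the idea in Python. It is not the crawler we run at $work: the function names are made up for illustration, and a plain dict stands in for memcached. The crawler remembers the ETag and Last-Modified values a server sent last time and sends them back, so an unchanged feed costs a tiny 304 response instead of a full download.

```python
import urllib.error
import urllib.request


def conditional_headers(cached):
    """Build request headers from the validators saved on the last fetch."""
    headers = {}
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers


def fetch_if_changed(url, cache):
    """Return the feed body, or None if the server answered 304 Not Modified.

    `cache` is any dict-like store (hypothetical stand-in for memcached)
    mapping a URL to the validators seen on the previous fetch.
    """
    request = urllib.request.Request(
        url, headers=conditional_headers(cache.get(url, {}))
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            body = response.read()
            # Remember the validators so the next request can be conditional.
            cache[url] = {
                "etag": response.headers.get("ETag"),
                "last_modified": response.headers.get("Last-Modified"),
            }
            return body
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return None  # unchanged since the last crawl
        raise
```

Calling `fetch_if_changed(url, cache)` every 30 minutes with the same `cache` means only feeds that actually changed get downloaded and re-parsed.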
Blog Pings have fundamental problems. Until they are both reliable and secure, I hope they will die. (By secure I mean knowing that the person who sent the ping is the person who owns the feed.) FeedTree tries to go down this path. It has many problems, the foremost being that it uses Java. If you want everyone in the world to support your protocol, you need support for everyone still using C, Perl, Python, PHP, and Ruby. The other problem with FeedTree is that it makes crypto signing of pings optional. Rule number one of making a spec: nothing is optional, because anything that is won’t be implemented by half the software out there.
Have a happy Monday.
-Paul
January 31st, 2006 at 12:04 am
Hey, thanks for noticing our project! A couple of follow-up notes on FeedTree:
[It's also worth noting that sometimes you can't sign things. When there's no authoritative publisher pushing feed content to the FeedTree network (what we call a "conventional" or "legacy" feed), FeedTree subscribers self-organize into a collaborative polling scheme (staggering their requests, and sharing new events with one another). The thing is, none of them is really authoritative, so it's not meaningful for them to sign their content; as such, if you use FeedTree, you'll see that these shared events carry no signature and can't be authenticated.]
If, in the worst (best?) case, FeedTree should become so popular that spammers start to try to fill it with spam pings, there are a couple of properties of the network that should make this a losing proposition, long-term:
January 31st, 2006 at 2:29 pm
I don’t think proactive pinging is anything like email spoofing. Spoofing has two victims: those misled by the spoofed email, and the person or business whose email was spoofed.
Proactive pinging, on the other hand, rewards blogs that have been mentioned on other blogs by giving them proper credit for the links, and drives additional traffic to the linker’s previously unpinged blog. Blog search engines also benefit because it adds additional link data into their systems, allowing them to rank sites properly.
February 2nd, 2006 at 8:10 am
“Anyways, At $work we do not currently accept pings from anyone. Instead we crawl every site in our system, every 30 minutes. Its not that hard, honestly, thanks to memcached and conditional http requests. (Oh, and lots of bandwidth).”
This does help Bloglines maintain a fresh index, but there are various ways your algorithm could be improved that would save others bandwidth and respect established protocols such as the RSS 2.0 channel-level ttl element. Intelligent spacing of visits, based on historical frequency of posting, would also not be a bad thing. For systems with lots of feeds, a deluge of 10 requests per second, twice an hour, on top of normal traffic loads can be crippling. We’ve had to start sending 304s to Bloglines during peak hours.
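To make the two suggestions concrete, here is a rough Python sketch, not a description of how Bloglines or any other crawler actually schedules visits. The function names, the "half the average gap between posts" heuristic, and the 30-minute and 24-hour bounds are all assumptions picked for illustration. The only part taken from the RSS 2.0 spec is that the channel-level ttl element is a number of minutes.

```python
import xml.etree.ElementTree as ET
from datetime import timedelta


def channel_ttl_minutes(rss_text):
    """Return the RSS 2.0 channel-level <ttl> in minutes, or None if absent."""
    ttl = ET.fromstring(rss_text).findtext("channel/ttl")
    if ttl is None or not ttl.strip().isdigit():
        return None
    return int(ttl.strip())


def poll_interval(ttl_minutes, post_times,
                  minimum=timedelta(minutes=30),
                  maximum=timedelta(hours=24)):
    """Pick how long to wait before the next visit to a feed.

    `post_times` is a list of datetimes for past posts, oldest first.
    Assumed heuristic: poll at half the average gap between posts,
    clamped to [minimum, maximum], and never more often than the ttl.
    """
    gaps = [later - earlier
            for earlier, later in zip(post_times, post_times[1:])]
    if gaps:
        interval = sum(gaps, timedelta()) / len(gaps) / 2
    else:
        interval = minimum  # no history yet, fall back to the default
    interval = min(max(interval, minimum), maximum)
    if ttl_minutes is not None:
        # The publisher's ttl always wins over our own estimate.
        interval = max(interval, timedelta(minutes=ttl_minutes))
    return interval
```

A blog that posts every four hours would be visited every two hours under this heuristic, and a feed that declares a ttl of 60 would never be fetched more than once an hour.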