followthemedia.com - a knowledge base for media professionals
All Things Digital

To ACAP or Not To ACAP, That Is The Question As Google, Yahoo And MSN Open Up Transparency On How They Use Robots.Txt

When it comes to Google and newspaper web sites, publishers around the world seem to be of two minds. Some hold that the traffic Google sends to their sites by linking to a headline and a few words of a story is worth gold in added traffic and thus, hopefully, higher advertising rates. Others cry out about copyright infringement: the Belgians have already launched a second lawsuit against the search engine, asking for some €49 million ($77 million).

The World Association of Newspapers (WAN) took that bull by the horns and led a consortium that developed and last year released the Automated Content Access Protocol (ACAP), a machine-interpretable language that lets a web site block indexing of specific pages, or an entire site. It extends what was available from the robots.txt command developed in 1994 to block content on a server and the robots meta tags that were developed to allow page-by-page blockage.
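For readers who have never looked inside a robots.txt file, the sketch below shows the kind of site-level blocking described above. It uses Python's standard urllib.robotparser module; the site name, the paths and the "ExampleBot" crawler name are all invented for illustration and do not come from any publisher or search engine mentioned here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imagined newspaper site (not a real publisher's file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /archive/
Disallow: /subscribers/
"""


def can_crawl(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    base = "https://example-newspaper.test"  # assumed, illustrative host name
    for path in ("/news/today.html", "/archive/2007/story.html"):
        print(path, can_crawl(ROBOTS_TXT, "ExampleBot", base + path))
```

Each rule is a simple allow-or-block decision for a path prefix, which is the limitation publishers point to when they argue for something more expressive.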

And as WAN held its annual convention this week in Gothenburg, Sweden with some 1800 senior media executives from around the world attending, it was a great opportunity to encourage publishers to sign up.  Marc Bide, project chairman, told FTM that he was “extremely encouraged” by the foot traffic visiting the ACAP booth and that once publishers really understood what ACAP was all about he believed they were very enthusiastic for the project. He says that currently some 200 publishers in around 30 countries are using the protocol. Newspaper Direct chose the Congress to announce that it, too, is adopting ACAP.

But there’s a big problem: the major search engines (Google, Yahoo and MSN) have not implemented it, and if they’re not using ACAP then what’s the point? It’s the subject of ongoing ACAP discussions with the search engines, who say they believe the robots.txt system works just fine. Google has stated in the past, “We believe search engines are of real benefit to publishers because they drive valuable traffic to their websites. If publishers do not want their websites to appear in search results, technical standards like robots.txt and metatags enable them automatically to prevent the indexation of their content. These Internet standards are nearly universally accepted and are honored by all reputable search engines.”

Publishers, on the other hand, say robots.txt is just a blocking tool that says either “yes” or “no,” whereas ACAP communicates automatically with the search engines, telling the search engine robots what they can do with each page of copy: publish it entirely, publish only extracts, or not touch it at all.

There may, however, have been a bit of a breakthrough this week. Google, Yahoo and MSN came together to post publicly how each supports robots.txt and the robot meta tag, and ACAP’s Bide says that is “great news” because one thing the publishers have wanted is to see the transparency of how the search engines use those protocols. He said with that transparency “it enables us to properly identify the gaps” between how ACAP works and how the search engines utilize robots, and he believes it is “a very useful step” in ACAP and the search engines eventually coming together.

On the other hand there may be a message there, too, from the search engines to ACAP – that the major search engines are united in their approach to ACAP, it’s not just Google alone any more, and all need to be satisfied if they are to go forward.

Google Project Manager Prashanth Koppula said in a posting on Google’s Webmaster Central blog, “In the spirit of making the lives of webmasters simpler, we’re releasing detailed documentation about how we implement robots exclusion protocol (REP). This will provide a common implementation for webmasters and make it easier for any publisher to know how their REP directives will be handled by three major search providers -- making REP more intuitive and friendly to even more publishers on the web.”
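The page-by-page side of REP is the robots meta tag, which sits in a page's HTML head. As a rough sketch only, the Python below reads the directives out of such a tag using the standard html.parser module. The sample page, the class and the function names are made up for this illustration; noindex and nosnippet are standard robots meta values, but how any given search engine acts on them is a matter for that engine's own documentation.

```python
from html.parser import HTMLParser

# Hypothetical article page carrying a robots meta tag (illustrative only).
PAGE = (
    '<html><head><meta name="robots" content="noindex, nosnippet">'
    "</head><body>Story text</body></html>"
)


class RobotsMetaParser(HTMLParser):
    """Collects the comma-separated directives from <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                token = token.strip().lower()
                if token:
                    self.directives.add(token)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


def may_index(html: str) -> bool:
    """A page is indexable here unless it carries a 'noindex' directive."""
    return "noindex" not in robots_directives(html)


if __name__ == "__main__":
    print(sorted(robots_directives(PAGE)), may_index(PAGE))
```

A crawler that honors the tag would treat this sample page as off limits for indexing, while a page with no robots meta tag at all would be treated as open.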

There was no mention of ACAP, and by its absence the announcement seemed to be saying that Google, Yahoo and MSN don’t think there really is a need for ACAP now that this transparency has been published.

One has the impression in talking to Bide that WAN’s feet are not fixed in cement – that talks are continuing on trying to reduce the differences between what ACAP is today and what the search engines say is acceptable, and ACAP is willing to make revisions accordingly.  

At an early morning session at the Congress Wednesday, Gavin O’Reilly, WAN President and chairman of the media consortium that developed ACAP, said there are ongoing discussions with the search engines, who don’t really like ACAP because, “Up to this point they have had unbridled use of our content. The status quo has suited them. It hasn’t suited many publishers.”

The ACAP consortium has been disappointed by the slow uptake since ACAP was launched. Part of the reason may be unfavorable blog comments, but O’Reilly says that “drive-by shootings” from the bloggers should not deter implementing ACAP. Just one publisher at the meeting voiced skepticism of ACAP, urging colleagues to read the blog sites. “We have had a lot of drive-by shootings from bloggers, because it doesn’t suit their à la carte notion of the world,” O’Reilly responded.

And he intimated that the fact more publishers haven’t signed up is a signal to the search engines that even the newspaper industry itself is divided on whether ACAP is necessary. “Ultimately for ACAP to work we need the involvement of the search engines. How can we get the search engines to recognize ACAP if we don’t have every publisher on side?” he pleaded.

So why should publishers sign up? Bide, at the digital media roundtable discussion earlier in the week, said it was a way “of demonstrating that you support the idea of the right to your own content.” In other words, you can still allow the search engines complete access if you choose, but if you wish to place some restriction then with ACAP you can.

 



related ftm content:

With ACAP and Google Now Discussing How Their Systems Can Communicate It Begs The Question Why All Of This Wasn’t Worked Out Before ACAP’s Launch
Is Google supportive of the new ACAP protocol developed by various media groups that enables web sites to block indexing of specific pages, or an entire site? There seem to be different answers depending on who at Google is speaking, but CEO Eric Schmidt has told Australia’s ITWire that the issues are only technical. “At present it doesn’t fit with the way our systems operate. It’s not that we don’t want them to be able to control their information.”

The War Of Words Heats Up Between Google And Newspaper Publishers Wanting To Protect Their Online News Copy
Google continues to say that robots.txt gives newspaper publishers all the protection they need to stop Google accessing their online news, but the publishers, who have developed their own new coding system to give them more control, are getting ever angrier that Google won’t play ball.

Have You Installed ACAP On Your Website – The Protocol That Can Control Yahoo And Google News Searches? No Problem, Neither Search Engine Is Using It Yet
Is the ability of Google News to search a news web site’s content and list that site among search results good or bad for newspapers? Is it good that Google can publish on its news site the first paragraph or so of a news item, fully credited to the referenced site, and link the reader to that site? Much of the news media seems to believe all of that infringes upon copyright, limiting profits, and so they have come up with a new protocol that controls what the search engines can and cannot do.



ftm resources

no resources added as of June 5, 2008



copyright ©2004-2008 ftm partners, unless otherwise noted