Limiting the power of the Googlebots
Monday, December 3, 2007
A coalition of online publishers is joining forces in an effort to gain more control over when, and how much of, their content appears on search engines, Time reports.
Currently, websites can exert some control over which of their pages search engines access by publishing a text file known as robots.txt. These files contain a set of instructions to the web crawlers that search engines use to map and index the web, allowing a website to block indexing of individual pages, specific directories or the entire site.
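For example, a minimal robots.txt file using the standard User-agent and Disallow directives might look like the sketch below (the paths are purely illustrative):

User-agent: *
Disallow: /private/
Disallow: /drafts/unpublished-story.html

Any compliant crawler reading this file would skip the listed directory and page while remaining free to index the rest of the site.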
The coalition wants to extend the set of commands publishers can place in robots.txt files, expanding the control they have over their content by, for instance, limiting how long search engines may retain copies in their indexes or telling crawlers not to follow any of the links that appear on a page.
The publishers say this would let them better express terms and conditions on access to and use of their content. In particular, they are concerned about their information remaining in search engine indexes long after it has been locked away on their own sites, and about excerpts and headlines being reused without permission.
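A rough sketch of what such extended instructions might look like appears below. The directive names here are hypothetical, invented for illustration alongside the existing robots.txt syntax; no search engine has committed to honouring these exact fields:

# Hypothetical extended directives -- illustrative only, not a published standard
User-agent: *
Disallow: /subscribers-only/
# limit how long cached copies of indexed pages may be retained
Max-retention: 30d
# do not follow links found on pages in this section
Nofollow: /news/

The idea is that the same simple, crawler-readable format publishers already use to block pages could also carry usage terms such as retention limits and link-following restrictions.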