Opera Software has led a project to create a search engine that
tracks how web pages are structured on the World Wide Web. When
released publicly in the coming months, this engine will help browser
makers and standards bodies work towards a more standards-driven and
compatible web, the company says.
Opera has announced results from its MAMA (Metadata Analysis and
Mining Application) search engine, a brainchild of Opera engineers
that indexes the markup, style, scripting and technology used to
create web pages. The MAMA search engine scours 3.5 million Web
pages, and the resulting data can answer questions such as “Can I get
a sampling of Web pages that have more than 100 hyperlinks?” or “What
does an average Web page look like?” For Web developers, that is a
dream come true.
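A query like the 100-hyperlink example comes down to counting
elements on each page and filtering on the count. The sketch below is
not MAMA's implementation, which this article does not describe. It
only illustrates the idea using Python's standard html.parser module.
The names LinkCounter, count_hyperlinks and pages_with_many_links, and
the URL-to-markup input format, are assumptions made for illustration.

```python
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Counts <a> elements that carry an href attribute."""

    def __init__(self):
        super().__init__()
        self.links = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.links += 1


def count_hyperlinks(html):
    """Return the number of hyperlinks found in one page's markup."""
    parser = LinkCounter()
    parser.feed(html)
    parser.close()
    return parser.links


def pages_with_many_links(pages, threshold=100):
    """Return the URLs whose markup has more than `threshold` hyperlinks.

    `pages` maps a URL to its HTML source (an assumed input format).
    """
    return [url for url, html in pages.items()
            if count_hyperlinks(html) > threshold]
```

Named anchors without an href are deliberately not counted here; a
real index would have to settle on its own definition of a hyperlink.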
MAMA will help web developers find examples of usage of features and
functions, look at trends and gather data to justify technology to
their clients or managers, says Snorre M. Grimsby, vice president of
Quality Assurance at Opera Software. It will also encourage
standards bodies to take into account developers’ input on how the
web is actually being built, which should eventually raise the
quality and interoperability of specifications, the Web and browsers,
he adds.
MAMA can also respond to queries ranging from the general, such as
“How many sites use CSS (Cascading Style Sheets)?” (80.4 percent of
MAMA’s URLs), “How many markup errors does the average Web page
have?” (47) and “How many characters does an average Web page have?”
(16,400), to the more specific, such as “Which country uses
XMLHttpRequest, a critical component of AJAX, the most?” (Norway,
with 10.2 percent, within MAMA’s URL set).
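An aggregate figure such as the share of pages using CSS can be
pictured as a yes/no test applied to each page, followed by a
percentage over the whole set. The sketch below assumes one possible
definition of "uses CSS" (a style element, a style attribute, or a
link element with rel="stylesheet"); the criteria behind MAMA's own
figure are not given in this article, and the names CssDetector,
uses_css and percent_using_css are hypothetical.

```python
from html.parser import HTMLParser


class CssDetector(HTMLParser):
    """Flags a page that shows any of three assumed signs of CSS use."""

    def __init__(self):
        super().__init__()
        self.uses_css = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Valueless attributes come through as None, so fall back to "".
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "style" or "style" in attrs:
            self.uses_css = True
        elif tag == "link" and "stylesheet" in rel:
            self.uses_css = True


def uses_css(html):
    """Return True if the markup matches the assumed CSS criteria."""
    detector = CssDetector()
    detector.feed(html)
    detector.close()
    return detector.uses_css


def percent_using_css(pages):
    """Return the percentage of pages (a list of HTML strings) using CSS."""
    if not pages:
        return 0.0
    hits = sum(1 for html in pages if uses_css(html))
    return 100.0 * hits / len(pages)
```

Changing the definition inside handle_starttag changes the resulting
percentage, which is why a headline number like this is only
meaningful alongside the criteria used to produce it.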
MAMA is up to the task of tackling vague questions that do not have
easy answers, like “How many sites are mobile-ready?” or “How
prevalent is Web 2.0?” says Grimsby. Defining a page as being “Web
2.0” can involve a variety of criteria, including the use of
microformats, RSS, JSON (JavaScript Object Notation) and AJAX, among
numerous others. For indistinct questions like these, MAMA can
provide the more complex answers they require.
For more information on Opera’s MAMA project, please visit:
http://dev.opera.com/articles/view/mama/.