Question About Choosing a Structured Data Technology

This week’s question comes from Jitendra Vyas.

What is your opinion of schema.org? Should we use it? It’s supported by Google, Yahoo, Bing and Yandex.

Sherpa Emily Lewis answers:

Answering this question is really about goals: your goals as a developer, your employer’s or client’s goals, and project goals. But before answering that question, let’s first talk about what schema.org is and isn’t.

Schema.org is the structured data initiative from Google, Yahoo!, Bing and Yandex. Structured data is simply the addition of attributes and values to HTML that describe web content for machines. In the case of schema.org, you add defined microdata to your HTML5 markup, which the search engines parse in order to contextually index the content. This allows the search engines to provide a richer search results display.

Schema.org is not a way to boost search results rankings. And structured data, itself, is about far more than search. It’s about machine-readability that makes data — your web content — extensible for sharing, re-mixing and re-use, whether by search engines, other applications, or even browsers and assistive technologies.

This is why structured data technologies like RDFa and microformats were introduced. HTML5 microdata, which is the basis of schema.org, is simply a new addition to the structured data arsenal.

So, now, back to the question: Should you use schema.org? If you are already interested in structured data and the potential it offers, you should make your decision about which technology you use — microformats, schema.org’s microdata or RDFa — based on goals.

Are you looking to get higher on SERPs? More click-throughs? Because schema.org is backed by the leading search engines, it isn’t unreasonable to assume that it is the route to follow. However, simply following the schema.org vocabularies won’t give you better rankings.

What schema.org may offer you, though, are better results listings. For example, Google’s Rich Snippets displays search results for web content marked up with structured data a bit differently than other results. This can make a result stand out from the crowd, and may encourage a visitor to click through. 

But you don’t have to use HTML5 microdata and schema.org vocabularies to benefit from Rich Snippets. Google supports microformats, RDFa and microdata.

So then the question becomes: which structured data technology should you use? Here’s where you have to consider your goals. Implementing any structured data technology will be an investment of time and resources. From understanding the syntax to deciding on vocabularies to establishing internal coding practices, you need to decide which technology is best for you and your project.

I, personally, favor microformats because they are comparatively easy to learn and implement, and they work with all versions of HTML. The majority of microformats rely on the well-known class attribute and simple vocabularies that describe the most common web content. Even within a large team of developers, they are easy to learn and incorporate into coding practices because they are familiar. And because microformats have been around for seven years, there are many more tools available, including creators and validators, that make development more efficient.
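To make the class-based approach concrete, here is a minimal sketch of hCard-style markup and of how a machine might read it. The person, organization and URL are made up for illustration, and ClassTextCollector is a hypothetical helper written for this example, not a real microformats parser.

```python
from html.parser import HTMLParser

# Hypothetical hCard markup: the names and URL are invented for illustration.
HCARD = """
<div class="vcard">
  <a class="fn url" href="http://example.com/">Jane Doe</a>
  <span class="org">Example Co.</span>
</div>
"""


class ClassTextCollector(HTMLParser):
    """Collect the text of elements whose class list contains a wanted name.

    Simplification: assumes every start tag has a matching end tag,
    so void elements such as <img> are not handled.
    """

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.stack = []   # wanted class names carried by each open element
        self.found = {}   # class name -> collected text

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self.stack.append([c for c in classes if c in self.wanted])

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Credit the text to every open element that carries a wanted class.
        for names in self.stack:
            for name in names:
                self.found[name] = self.found.get(name, "") + text


hcard_parser = ClassTextCollector(["fn", "org"])
hcard_parser.feed(HCARD)
print(hcard_parser.found)  # {'fn': 'Jane Doe', 'org': 'Example Co.'}
```

The markup itself is ordinary HTML with a few agreed-upon class names, which is a large part of why it feels familiar to most developers.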

Microdata, which is what schema.org uses, is a bit more complicated than microformats. It utilizes new attributes introduced in HTML5 (itemscope, itemtype and itemprop), which may not be familiar and could take longer to learn. Schema.org also offers narrowly-defined vocabularies, which can steepen the learning curve as you work out which vocabulary is right for your content. Lastly, microdata is supported only in HTML5, so if you are working with legacy HTML that utilizes an earlier DOCTYPE, you will have validation errors.
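For comparison, here is a rough sketch of the same kind of content marked up with the three microdata attributes and the schema.org Person vocabulary, plus a small reader for it. The person is made up, and MicrodataCollector is a hypothetical, simplified helper that handles only a single flat item; it does not reflect how any search engine actually parses microdata.

```python
from html.parser import HTMLParser

# Hypothetical markup: the person described is invented for illustration.
MICRODATA = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Jane Doe</span>
  <span itemprop="jobTitle">Web Developer</span>
</div>
"""


class MicrodataCollector(HTMLParser):
    """Read one flat microdata item: its type and its text properties.

    Simplification: no nested items, and every start tag is assumed
    to have a matching end tag (no void elements such as <meta>).
    """

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.props = {}
        self.stack = []   # itemprop name (or None) for each open element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # itemscope is a valueless attribute; its presence starts an item.
        if "itemscope" in attrs:
            self.item_type = attrs.get("itemtype")
        self.stack.append(attrs.get("itemprop"))

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack and self.stack[-1]:
            self.props[self.stack[-1]] = text


microdata_parser = MicrodataCollector()
microdata_parser.feed(MICRODATA)
print(microdata_parser.item_type)  # http://schema.org/Person
print(microdata_parser.props)      # {'name': 'Jane Doe', 'jobTitle': 'Web Developer'}
```

Note that the property names (name, jobTitle) only mean something relative to the chosen itemtype, which is where the time spent choosing the right vocabulary comes in.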

RDFa is the most complex of the three, primarily because it is the most extensible and customizable. It works with any XML-based language and allows you to create your own vocabulary to describe content. This can be extremely useful for organizations with unique content, such as laboratories and universities. It also can mean a longer adoption process, not only to learn the more complex syntax, but to define your vocabularies.
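As a loose illustration of the custom-vocabulary idea, the sketch below uses RDFa 1.1 Lite style attributes (vocab, typeof and property) with an invented laboratory vocabulary. The http://example.org/lab# address and its terms are assumptions made up for this example, and RdfaLiteCollector is a hypothetical helper that covers only this simple case; it is not a conformant RDFa processor.

```python
from html.parser import HTMLParser

# Hypothetical custom vocabulary: example.org/lab# and its terms are invented.
RDFA = """
<div vocab="http://example.org/lab#" typeof="Experiment">
  <span property="principalInvestigator">Dr. Jane Doe</span>
  <span property="sampleCount">42</span>
</div>
"""


class RdfaLiteCollector(HTMLParser):
    """Read one flat RDFa Lite item, expanding terms against the vocab address.

    Simplification: one vocab, no prefixes, no nesting, and every
    start tag is assumed to have a matching end tag.
    """

    def __init__(self):
        super().__init__()
        self.vocab = ""
        self.item_type = None
        self.props = {}
        self.stack = []   # property term (or None) for each open element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "vocab" in attrs:
            self.vocab = attrs["vocab"] or ""
        if attrs.get("typeof"):
            # A bare term is resolved by appending it to the vocabulary address.
            self.item_type = self.vocab + attrs["typeof"]
        self.stack.append(attrs.get("property"))

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack and self.stack[-1]:
            self.props[self.vocab + self.stack[-1]] = text


rdfa_parser = RdfaLiteCollector()
rdfa_parser.feed(RDFA)
print(rdfa_parser.item_type)  # http://example.org/lab#Experiment
print(rdfa_parser.props)
```

Every term expands to a full address under a vocabulary you control, which is what makes RDFa so flexible, and also why someone on the team has to define and maintain that vocabulary.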

To know which of these three to implement, you first have to weigh your goals for structured data against your realities. What is the expected ROI from structured data, and how will you measure it? How large is your team and how long will it take to adopt a new technology? What version(s) of HTML are you using? Do you need tools for publishing and validating?

Knowing your goals will tell you which technology makes the most sense for you.