SDL Tridion and Web Analytics (aka How to Identify Unique Content Instances?)

SDL Tridion can help you manage your Web site's (the hot new, soon-to-be dated term is digital channel) content. You may want to measure your content's performance, but first you'll need to "identify" your content.
Let's look specifically at identifying unique component presentations and links, which might equate to querystring variables at the end of href attributes in anchors (e.g. <a href="/somepath.html?id=some-unique-identifier">). You can then use some server logging or your analytics package to interpret the results. The syntax doesn't matter so much as knowing how Tridion Building Blocks relate to your final markup.
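As a rough illustration of the querystring idea (not Tridion code), here is a minimal Python sketch that appends an identifier to an href while keeping any querystring and fragment already on the link. The function name add_tracking_id and the default parameter name "id" are my own assumptions for the example; your analytics package may expect something else.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_tracking_id(href: str, tracking_id: str, param: str = "id") -> str:
    """Return href with param=tracking_id added to its querystring.

    Hypothetical helper for illustration only; "id" is an assumed
    parameter name, not something Tridion or an analytics package defines.
    """
    parts = urlsplit(href)
    # Keep existing parameters, but drop any earlier copy of our own parameter.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    query.append((param, tracking_id))
    # Rebuild the URL; path and fragment are left untouched.
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_tracking_id("/somepath.html", "some-unique-identifier"))
```

In a real implementation this logic would more likely live in your templates or render-side code; the point is only that the identifier travels with the link.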
Let's take an analytics package-agnostic approach to understand why Component IDs are not enough, what you can use for unique identifiers, and some surprises you might encounter in implementing Web Analytics with SDL Tridion.

I've seen the following conversation a few times this year.

Customer asks, "How do I do Web analytics with SDL Tridion?"

Consultant explains, "By tracking Component Presentations, fields, and positions for article-type content."

"..." [slight pause as more pressing work takes place]

"So... how do I do Web analytics with SDL Tridion?"

Analytics is just one of those deceptively simple tasks that needs a good deal of input, working sessions, and buy-in to get it right.

Component ID is Not Enough

I see confusion between SDL Tridion Components, the design-independent text and media, and Component Presentations (CPs), especially when organizations skip training or are new to the terminology. This is especially challenging when previous implementations specifically used "component" to describe the visible content instances (now CPs) on the site.

Since authors can use the same Tridion component across multiple pages with different presentations (component templates), Component ID is not enough.

A Component is not the same as a Component Presentation.

Here's the real reason why Component ID is not enough: if you have multiple templates (summary-detail, anyone?), you'll want to identify the variations separately. Of course, there are always exceptions.

Recommendation on terms: find a convenient, neutral term or adjective to clear up confusion about components. I've seen "module," "render-side component," or "(Dynamic or D)CP" help.

Tridion Unique Identifiers

Here's the list of Tridion items and characteristics you might consider mapping against to create "some-unique-identifier" for analytics. Since names aren't necessarily unique, consider identifiers or WebDAV urls as needed. Your final output will likely be a combination of these.
  • Page name, ID, or url (typically already understood by analytics packages)
  • Component name or ID (either the component on the page or linked-to component)
  • Template Name or ID
  • "Position" of the CP on the page (possibly via the TemplateRepeatIndex variable in DWT or render-side code)
  • "Position" of a link in a CP
Not necessarily typical or even a good practice, but you could possibly have the same CP on the same page multiple times. Use position to correctly identify these CPs or the links inside them.

Implementation tip: you might add these in DWT or from code TBBs. Simply using @@ID@@ will render the Component Template ID in preview or publish (thanks, KnewKnow), but the Component ID in Template Builder. See an example on StackOverflow for how to add a Component Template ID to the package to get it to match.
For advanced options, also consider:
  • Author-selected Keywords (by ID, name, or key value)
  • Field names (and Embedded Schema)
  • "Region" or Schema (see below)
  • I've also seen customers consider organization for both analytics as well as SEO (by basically rendering folder or structure group paths in the final markup)
Bonus: Re-usable External Links
For easier tracking, control, and management for links, consider implementing all external links as their own components. Jaime Santos Alcón describes a preference for external links as components as well as other link approaches in this StackOverflow answer.
Determine what you need to measure, then add the unique identifiers as needed to pages, component presentations, or individual links.
  • You need at least the Component and Template (assuming this Component Presentation is only used once on a page).
  • If you re-use CPs on the same page, add position.
  • Add the page information if for some reason your analytics package can't determine the page (unlikely, I hope).
  • Remember that Tridion identifiers are unique to a CMS environment. Even if some original IDs match after a database copy, over time your Tridion Content Manager identifiers (tcm-ids) will differ between the DTAP environments.
Depending on your content model, you might also need embedded schema fields in case a single component creates multiple "content types" (e.g. a single FAQs schema with multiple Q&A embedded schema fields--the business might want to know how often users expand or click on individual Q&A's, which means you need to track fields).
Understand that authors can change templates and move positions. If a particular identifier needs to stay the same, consider a separate component to represent links, or "regions" baked into your templates.
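To make the minimum above concrete (Component plus Template, with position only when a CP repeats on a page), here is a small Python sketch. The function name, the underscore separator, and the sample values are assumptions for illustration; use whatever format your analytics package and look-up logic can handle.

```python
from typing import Optional


def cp_identifier(
    component_id: str,
    template_id: str,
    position: Optional[int] = None,
) -> str:
    """Combine Component and Template (and optionally position) into one value.

    Hypothetical format: the parts are joined with "_" because tcm-ids
    already contain ":" and "-".
    """
    parts = [component_id, template_id]
    if position is not None:
        # Only needed when the same CP appears more than once on a page.
        parts.append(str(position))
    return "_".join(parts)


print(cp_identifier("123", "456"))              # CP used once on the page
print(cp_identifier("123", "456", position=2))  # second copy of the same CP
```

If the identifier has to survive template or position changes, this composite is the wrong tool; that's where the separate components or template "regions" mentioned above come in.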

Surprises

Campaigns Differ

Here's where assumptions go wrong and why context matters. I wrote previously on how the classic Summary-Detail setup doesn't work when modeling "campaign" CPs. These frequently change position, design, and page placement, which means you may want to track not only the CP behavior but also the non-changing regions that hold these CPs.

Containers and Regions

For example, you might not want to rely on Component IDs if you use "container" components (i.e. content injection), especially if authors frequently change the containers themselves. If the business is looking to create a "heat map" of sorts to visually see how visitors interact with a page over time, consider identifying Schemas instead of Components. The identifier on a "sale of the week" might need to track activity within a particular part of the page rather than the actual components.

Marketing may be more interested in the end-user activity over time for a region or links in a specific area, instead of an individual CP. In this case, using a component ID doesn't make sense, especially if it constantly changes.

Position Doesn't Work for Large Lists

Position works well to distinguish among a few component presentations or links. In navigation or changing lists, however, you might want to divide parts up even more, into regional sections or columns. It's hard for the business to understand what the 50th link means. Separate components or separating a single component with templating could help here (e.g. change the analytics output at every new embedded schema or maybe every 5th link).
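The "every 5th link" idea might look something like the following Python sketch, which maps a link's position to a group label instead of reporting the raw position. The label format ("links-46-50") and the function name are made up for the example.

```python
def link_bucket(position: int, bucket_size: int = 5) -> str:
    """Map a 1-based link position to a group label such as "links-46-50".

    Hypothetical labeling scheme; the business should pick names that
    mean something to them (a column, a section, and so on).
    """
    if position < 1 or bucket_size < 1:
        raise ValueError("position and bucket_size must be 1 or greater")
    # First position in the group this link falls into.
    start = ((position - 1) // bucket_size) * bucket_size + 1
    end = start + bucket_size - 1
    return f"links-{start}-{end}"


print(link_bucket(50))  # the "50th link" is reported as part of a group
```

Grouping by embedded schema or by a named column would follow the same pattern, with the label coming from the content instead of arithmetic.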

On Page Weight

Business-friendly identifiers work well for SEO purposes, but you might want to keep page weight low, especially if you're putting an analytics querystring value on every single link. With the right look-up logic, you could reduce your link querystrings and thus page weight. For example, you can use just the second part of a tcm-id (i.e. just the "123" in tcm:5-123) to identify a component. You might infer the schema based on the component as well.
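Here is a minimal Python sketch of pulling out that second part. The regular expression reflects my assumption about the general shape of a tcm-id (publication ID, item ID, then an optional item type and optional version); check it against the identifiers in your own environment before relying on it.

```python
import re

# Assumed shape: tcm:<publication>-<item>[-<item type>][-v<version>]
TCM_URI = re.compile(r"^tcm:(\d+)-(\d+)(?:-(\d+))?(?:-v(\d+))?$")


def short_item_id(tcm_uri: str) -> str:
    """Return only the item part of a tcm-id, e.g. "123" for "tcm:5-123"."""
    match = TCM_URI.match(tcm_uri)
    if match is None:
        raise ValueError(f"Not a recognizable tcm-id: {tcm_uri!r}")
    return match.group(2)


print(short_item_id("tcm:5-123"))
```

Keep in mind that the short value drops the publication and item type, so the look-up logic on the reporting side has to know which publication and which kind of item it is dealing with.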

tl;dr

So there you go: one way of looking at analytics through Tridion-colored lenses. In summary:
  1. For analytics, Component IDs are not enough, unless you use each component only once (e.g. external link components).
  2. For content, measure Component + Component Template and Position (or nth Component with each Component Template, see update below).
  3. For links, measure the above plus Fields and Position.
  4. Confirm with the business on how they want to measure content. Sometimes it's not specific components, but the type of content (schema) and/or activity in certain regions on a page over time. We haven't even touched on multi-step or paths forward/backwards type measurements. :-)
If you're looking for more drag-and-drop functionality and examples, head over to SDLTridionWorld's eXtensions for the template building blocks that match your package including:
One last tip for implementation: look at the Core Service if you need a mapping between these identifiers and the business-friendly terms. A cheap quick-and-dirty alternative is to "template" a report, literally (see some generic XSLT and C# examples at rendering some CM-side information).

Update (June 8, 2013): Ant P on TRex had a similar functional requirement for unique identifiers for Component Presentations on a page (but for CSS instead of analytics). He found that identifying the ordinal number for a given component with a certain template works just as well.

See his approach on TRex which uses some code, a context variable, and a rather elegant way to call this in DWT (TemplateName@@item_name@@).
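To show the ordinal idea outside of Tridion, here is a Python sketch of how I read it: walk the Component Presentations on a page in order and number each one among the CPs that share its template. This is not Ant P's code; the function name, the sample IDs, and the "TemplateName + number" label format are assumptions for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def template_ordinals(cps: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Label each (component_id, template_name) pair with its ordinal per template.

    Hypothetical helper: the second element of each result is the template
    name followed by a 1-based count, e.g. "Teaser2" for the second Teaser CP.
    """
    counts: Counter = Counter()
    labelled = []
    for component_id, template_name in cps:
        counts[template_name] += 1  # nth CP seen so far with this template
        labelled.append((component_id, f"{template_name}{counts[template_name]}"))
    return labelled


page = [("123", "Teaser"), ("124", "Teaser"), ("123", "Banner")]
for component_id, label in template_ordinals(page):
    print(component_id, label)
```

Compared with a plain position on the page, this ordinal stays the same when authors add or move CPs that use other templates.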
