January 25, 2009

Optimizing Tagging UI for People & Search

Overview/Intro

One of my areas of focus around social tools in the workplace (enterprise 2.0) is social bookmarking. Sadly, it does not have the reach it should compared with wikis and blogs (most enterprise-focused wikis now include collective voice pages (blogs), and enterprise blog tools have collaborative document pages (wikis)). I focus a lot of my attention these days on what happens inside the organization’s firewall, as that is where there is incredible untapped potential for these tools to make a huge difference.

One of the things I see on a regular basis is tagging interfaces on a wide variety of social tools, not just in social bookmarking. This is good, but also problematic, as it leads to a need for a central tagging repository (more on this in a later piece). It is good because emergent and connective tag terms can be used to link items across tools and services, but that requires consistency and identity (identity is a must for tagging on any platform, and it is left out of many tagging instances, which greatly decreases the value of tagging; this is also for another piece). One place where tools and services differ, and which leads to problems of use and adoption, is the tagging user interface (UI).

Multi-term Tag Intro

multiterm tag construction

The multi-term tag is one of the more helpful elements in tagging, as it provides the capability to use related terms. Keeping the related tag terms together gives these multi-term tags depth of understanding. But the interfaces for doing this are more complex and confusing than they should be, for human as well as machine consumption.

In the instance illustrated, the tag is composed of two related terms: social and network. When the tool references the tag, it is looking at both parts as a tag set, which has a distinct meaning. The individual terms can easily be used for searches seeking either of those terms, but, knowing the composition of the set, it is relatively easy for the service to offer up "social network" when a person seeks just social or network in a search query.
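A minimal sketch of that lookup, assuming tags are stored as whole multi-term strings and indexed by their component terms. The function names and sample tags are hypothetical, not taken from any particular tool:

```python
from collections import defaultdict

def build_term_index(tags):
    """Map each individual term to the full multi-term tags that contain it."""
    index = defaultdict(set)
    for tag in tags:
        normalized = tag.lower()
        for term in normalized.split():
            index[term].add(normalized)
    return index

def suggest(index, query):
    """Return, in sorted order, the full tags that contain the query term."""
    return sorted(index.get(query.lower(), set()))

# Hypothetical sample tags for illustration only.
index = build_term_index(["social network", "social bookmarking", "enterprise search"])
print(suggest(index, "social"))   # both tags containing "social"
print(suggest(index, "network"))  # only "social network"
```

Because the tag set is kept intact, a search for either single term can surface the whole tag without the service guessing at how terms relate.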

One common hindrance to social bookmarking adoption is that those familiar with it, and fans of it for enterprise use, point to Delicious, which has a couple of huge drawbacks. The first, compound or disconnected multi-term tags, is a deep drawback for most regular potential users (the second is the lack of privacy for shared group items). Delicious breaks a basic construct in user-focused design: tools should embrace human methods of interaction, rather than humans embracing tech constraints. Delicious is quite popular with those of us who are malleable enough to adapt our approach to a technology, but we are a quite thin percentage of the people who could potentially use these tools. Testing this concept takes very little time to prove.

So, what are the options? Glad you asked. But, first a quick additional excursion into why this matters.

Conceptual Models Missing in Social Tool Adoption

One common hindrance for social tool adoption is that most people intended to use the tools are missing the conceptual model for what these tools do, the value they offer, and how to personally benefit from that value. There are even change costs involved in moving from a tool that may not work for someone to something that has potential for drastically improved value. The "what it does", "what value it has", and "what situations" questions are high enough hurdles to cross, but they can be crossed with some ease by people who have deep knowledge of how to bridge these conceptual model gaps.

What the tools must not do is increase hurdles for adoption by introducing foreign conceptual models into the understanding process. The Delicious model of multi-term tagging adds a very large conceptual barrier for many, and it becomes problematic for even considering adoption. Optimally, Delicious should not be used alone as a means to introduce social bookmarking or tagging.

We must remove the barriers to entry to these powerful offerings as much as we can as designers and developers. We know the value, we know the future, but we need to extend this. It must be done now, as later is too late and these tools will be written off as just as complex and cumbersome as their predecessors.

If you are a buyer of these tools and services, this is your guideline for the minimum of what you should accept. There is much you should not accept. On this front, you need to push back. It is your money you are spending on the products, implementation, and people helping encourage adoption. Not pushing back on what is not acceptable will greatly hinder adoption and increase the cost of the additional people needed to ease the change and adoption processes. Neither of these costs should be acceptable to you.

Multi-term Tag UI Options

Compound Terms

I am starting with what we know to be problematic for broad adoption for input. But, compound terms also create problems for search as well as click retrieval. There are two UI interaction patterns that happen with compound multi-term tags. The first is the terms are mashed together as a compound single word, as shown in this example from Delicious.

Tag sample from Delicious

The problem here is the mashing of the string of terms "architecture is politics" into one compound term, "architectureispolitics". Outside of Germanic languages this is problematic, and the compound term makes a quick scan of the terms by a person far more difficult. It also complicates search, as the terms need to be broken apart for even SQL LIKE search options to work well. The biggest problem is for humans, as this is not natural in most language contexts. A look at misunderstood URLs makes the point easier to understand (Top Ten Worst URLs).
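To illustrate the search side of this, here is a small sketch using Python's built-in sqlite3 module. The table layout and the extra sample tags are assumptions made for illustration, not how any particular service stores its tags:

```python
import sqlite3

# In-memory table of tags; the schema and sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (tag TEXT)")
conn.executemany(
    "INSERT INTO tags VALUES (?)",
    [("architectureispolitics",), ("geopolitics",), ("politics",)],
)

def find(pattern):
    """Return tags matching a SQL LIKE pattern, in alphabetical order."""
    rows = conn.execute(
        "SELECT tag FROM tags WHERE tag LIKE ? ORDER BY tag", (pattern,)
    )
    return [row[0] for row in rows]

exact = find("politics")        # misses the compound tag entirely
substring = find("%politics%")  # finds it, but also any word that contains it
print(exact)
print(substring)
```

A pattern without wildcards never reaches the compound tag, while the wildcard pattern has no notion of term boundaries, so it cannot tell a deliberate "politics" term from a longer word that happens to contain it.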

The second, an emergent model for compound multi-term tags, uses a term delimiter. These delimiters are often underlines ( _ ), dots ( . ), or hyphens ( - ). A multi-term tag such as "enterprise search" becomes "enterprise.search", "enterprise_search", or "enterprise-search".

While these help visually, they are less than optimal for reading. Algorithmically, this initially looks to be a simple solution, but it becomes more problematic. Some tools and services try to normalize the terms to identify similar and relevant items, which requires a little bit of work. The terms can be separated at their delimiters and used as properly separated terms, but since the systems are compound-term centric, more often than not the terms are compressed and have problems similar to those of the other approach.

Another reason this is problematic is that term delimiters can often have semantic relevance for tribal differentiation. This first surfaced when talking to social computing researchers using Delicious a few years ago. They pointed out that social.network, social_network, and social-network had quite different communities using the tags, and those communities often did not agree on the underlying foundations for what the term meant. The people in the various communities self-identified and stuck to their tribe's use of the term, differentiated by delimiter.

The discovery that these variations were not fungible was an eye opener and quickly had me looking at other similar situations. I found this was not a one-off situation, but one with a fair amount of occurrence. When removing the delimiters between the terms the technologies removed the capability of understanding human variance and tribes. This method also breaks recommendation systems badly as well as hindering the capability of augmenting serendipity.
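One way to normalize delimited tags without throwing away the variants is to group the raw tags under their separated terms rather than overwriting them. The following is a minimal sketch under that assumption; the function names and the delimiter set (dot, underline, hyphen) are illustrative choices:

```python
import re
from collections import defaultdict

# Dots, underlines, and hyphens are treated as term delimiters here.
DELIMITERS = re.compile(r"[._-]+")

def split_terms(raw_tag):
    """Split a delimited compound tag into its lowercase component terms."""
    return tuple(term for term in DELIMITERS.split(raw_tag.lower()) if term)

def group_variants(raw_tags):
    """Group raw tags by their separated terms, keeping every raw variant."""
    groups = defaultdict(set)
    for raw in raw_tags:
        groups[split_terms(raw)].add(raw)
    return groups

groups = group_variants(
    ["social.network", "social_network", "social-network", "enterprise.search"]
)
variants = sorted(groups[("social", "network")])
print(variants)
```

The separated terms are available for search, while the raw variants stay distinct, so a recommendation system can still treat social.network and social_network as belonging to different communities.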

So how do these tribes identify without these markers? Often they use additional tags to identify. The social computing researchers add "social computing", marketing types add "marketing", etc. The tools then filter by co-occurrence of tags to surface relevant information (yes, the ability to use co-occurrence is another tool essential). These additional tags help improve the service on the whole through disambiguation.
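A rough sketch of co-occurrence filtering, assuming each bookmarked item simply carries a set of tags. The data structure, function names, and sample bookmarks are all hypothetical:

```python
from collections import Counter

# Hypothetical bookmarks, each with a set of tags.
bookmarks = [
    {"title": "Tie strength study", "tags": {"social network", "social computing"}},
    {"title": "Campaign playbook", "tags": {"social network", "marketing"}},
    {"title": "Intranet search tips", "tags": {"enterprise search"}},
]

def cooccurring_tags(items, tag):
    """Count the other tags that appear on items carrying `tag`."""
    counts = Counter()
    for item in items:
        if tag in item["tags"]:
            counts.update(item["tags"] - {tag})
    return counts

def filter_by_cooccurrence(items, tag, co_tag):
    """Return titles of items tagged with both `tag` and `co_tag`."""
    return [i["title"] for i in items if {tag, co_tag} <= i["tags"]]

counts = cooccurring_tags(bookmarks, "social network")
marketing_view = filter_by_cooccurrence(bookmarks, "social network", "marketing")
print(counts)
print(marketing_view)
```

The counts show which identifying tags travel with "social network", and the filter narrows the same tag to one community's items.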

Disconnected Multi-term Tags

The use of distinct and disconnected term tags is often the intent for space-delimited sites like Delicious, but the emergent approach of mashing terms together surfaced out of need. Delicious did not intend to create mashed terms or delimited terms; Joshua Schachter created a great tool and the community adapted it to their needs. Tagging services are not new, as they have been around for more than two decades already, but how they are built, how they are used, and the platforms they run on are quite different now. The common web interface for tagging has been single terms as tags, with many tags applied to an object. What made folksonomy different from previous tagging was the inclusion of identity and a collective (not collaborative) voice that intelligent semantics can be applied to.

The downside of disconnected terms in tagging is the lack of certainty about relevance between the terms, which leads to ambiguity. This discussion has been going on for more than a decade and builds upon semantic understanding in natural language processing. Did the tagger intend a relationship between social & network or not? Tags outside the context of natural language constructs are difficult to interpret without some other construct for sense making around them. Additionally, the computational power needed to parse and pair potentially relevant terms is something that becomes prohibitive at scale.
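As a rough illustration of why pairing is costly, the sketch below enumerates every unordered pair of single-term tags on one item. This brute-force enumeration is an assumption about how a system might look for candidate multi-term tags, not a description of any real service:

```python
from itertools import combinations
from math import comb

def candidate_pairs(tags):
    """Every unordered pair of single-term tags that might belong together."""
    return list(combinations(sorted(set(tags)), 2))

pairs = candidate_pairs(["research", "social", "network", "blog"])
print(len(pairs))    # number of pairs to evaluate for 4 tags
print(comb(20, 2))   # number of pairs for an item with 20 tags
```

The number of candidate pairs grows quadratically with the number of tags on an item, and each pair still has to be judged for relevance across every tagged item in the system.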

Quoted Multi-term Tags

One of the methods that surfaced early in tagging interfaces was the quoted multi-term tag. The input becomes 'research "social network" blog' so that the terms social network are bound together in the tool as one tag. The biggest problem is still on the human input side of things, as this is yet again not a natural language construct. Systematically, the downside is that these break along single terms, quotes included, in many of the systems that have employed this method.

What begins with a simple helpful prompt...:

 SlideShare Tag Input UI

Can still often end up breaking as follows (from SlideShare):

SlideShare quoted multi-term tag parsing
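The difference between a quote-aware parse and a plain whitespace split can be sketched as follows. The plain split is only one plausible way this kind of breakage can arise; it is an assumption for illustration, not a description of any service's actual code:

```python
import shlex

raw = 'research "social network" blog'

# A plain whitespace split ignores the quotes and breaks the tag apart.
naive_tags = raw.split()

# shlex.split honors the double quotes and keeps the bound terms together.
# It raises ValueError on unbalanced quotes, which a real input form
# would need to handle.
quoted_tags = shlex.split(raw)

print(naive_tags)
print(quoted_tags)
```

The naive result leaves stray quote characters attached to single terms, which matches the kind of broken output described above.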

Comma Delimited Tags

Non-space delimiters between tags allow multi-term tags to exist, and with relative ease. Well, that is relative ease for those writing Western European languages that commonly use commas as a string separator. This method allows the system to grasp that there are multi-term tags, and the humans can input the information in a format that may be natural for them. Using natural language constructs helps ease adoption. It also helps provide a solid base for building a synonym repository in and/or around the tagging tools.

Ma.gnolia comma separated multi-term tag output

While this is not optimal for all people because of variance in language constructs globally, it is a method that works well for a quasi-homogeneous population of people tagging. This also takes out much of the ambiguity computationally for information retrieval, which lowers computational resources needed for discernment.
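A minimal sketch of comma-delimited parsing, assuming the input arrives as one string from a single text field. The function name and the clean-up rules (trimming, collapsing repeated spaces, dropping empties and duplicates) are illustrative choices:

```python
def parse_comma_tags(raw):
    """Split on commas, tidy whitespace, and drop empty or repeated tags."""
    tags = []
    for piece in raw.split(","):
        tag = " ".join(piece.split())  # trim ends and collapse inner spaces
        if tag and tag not in tags:
            tags.append(tag)
    return tags

parsed = parse_comma_tags("research, social network,  blog, ,enterprise  search")
print(parsed)
```

Since the comma alone marks the tag boundary, spaces inside a tag need no special treatment and "social network" survives as one tag.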

Text Box Per Tag

Lastly, the final option for input is the text box per tag. This allows for multi-term tags in one text box. Pressing the tab key after entering a tag jumps the person using this interface down to the next empty text box, ready for the next tag. I first started seeing this a few years ago in tagging interface tools developed in Central Europe and Asia. The Yahoo! Bookmarks 2 UI adopted this in a slightly different implementation than I had seen before, but it works much the same (it is shown here).

Yahoo! Bookmarks 2 text box per tag

There are many variations of this type of interface surfacing, and they are having rather good adoption rates with people unfamiliar with tagging. This approach tied to facets has been deployed in Knowledge Plaza by Whatever s/a and works wonderfully.

All of the benefits of comma delimited multi-term tag interfaces apply, but with the added benefit of having this interface work internationally. International usage not only helps build synonym resources but eases language translation as well, which is particularly helpful for capturing international variance on business or emergent terms.

Summary

This content has come from more than four years of research and discussions with people using tools, both inside the enterprise and on the consumer web. As enterprises move more quickly toward more cost-effective tools for capturing and connecting information, they are aware of the value not only of social tools, but of tools that get out of the way and allow humans to capture, share, and interact in a manner that is as natural as possible, with the tools getting smart rather than humans having to adopt technology patterns.

This is a syndicated version of the same post at Optimizing Tagging UI for People & Search :: Personal InfoCloud that has moderated comments available.




This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike License.