While at the WWW Conference in Banff for the Tagging and Metadata for Social Information Organization Workshop, I was chatting with Jennifer Trant about folksonomies validating and identifying gaps in taxonomy. She pointed out that at least 70% of the tag terms people submitted in Steve Museum were not in the taxonomy after cleaning up the contributions for misspellings and errant terms. The formal paper (linked to in her blog post on the research, more steve ... tagger prototype preliminary analysis) indicates the percentage may even be higher, but 70% is a comfortable and conservative number.
In my discussions with enterprise organizations and other clients that are looking to evaluate their existing tagging services, I have been finding that 30 percent to nearly 70 percent of the terms used in tagging are not in their taxonomy. One firm I chatted with, which had just completed updating the taxonomy (second round) for their intranet, found the social bookmarking tool on their intranet turned up nearly 45 percent new or unaccounted-for terms. This firm knew they were not capturing all possibilities with their taxonomy update, but did not realize there was that large of a gap. In building their taxonomy they had harvested the search terms and had used tools that analyzed all the content on their intranet and offered the terms up. What they found in the folksonomy were common synonyms that were not used in search and were not in their content. They found vernacular, terms that were not official for their organization (sometimes competitors' trademarked brand names), emergent terms, and some misunderstandings of what documents were.
In other informal talks these stories are not uncommon. It is not that the taxonomies are poorly done, but vast resources are needed to capture all the variants in traditional ways. A line needs to be drawn somewhere.
The difference between the taxonomy or other formal categorization structure and what people actually call things (as expressed in bookmarking the item to make it easy to refind the item) is normally above 30 percent. But what organization is comfortable with that level of inefficiency at the low end? What about 70 percent of an organization's information, documents, and media not being easily found by how people think of it?
I have yet to find any organization, be it enterprise or non-profit, that is comfortable with that type of inefficiency on their intranet or internet. The good part is the cost is relatively low for capturing what people actually call things by using a social bookmarking tool or other folksonomy-related tool. The analysis and making use of what is found in a folksonomy is the same cost as building a taxonomy, but a large part of the resource-intensive work is done in the folksonomy through data capture. The skills needed to build understanding from a folksonomy will lean a little more on the analytical and quantitative skills side than traditional taxonomy development. This is because the volume of information supplied can be orders of magnitude higher than the volume of research using traditional methods.
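The gap percentages discussed here come down to simple set arithmetic: take the distinct terms people used as tags, remove the ones already in the taxonomy, and see what share is left. Below is a minimal sketch of that comparison in Python. The function name, the normalization (trimming and lowercasing only), and the sample terms are all my own assumptions for illustration. They are not from Steve Museum or any client data, and real cleanup for misspellings and errant terms would need far more care.

```python
def taxonomy_gap(tags, taxonomy):
    """Return (share, missing): the fraction of distinct tag terms that are
    not in the taxonomy, and the sorted list of those missing terms.

    Hypothetical helper: normalization here is only trim + lowercase.
    """
    def normalize(term):
        return term.strip().lower()

    # Distinct, non-empty tag terms as people entered them
    tag_terms = {normalize(t) for t in tags if t.strip()}
    known = {normalize(t) for t in taxonomy}

    missing = tag_terms - known
    share = len(missing) / len(tag_terms) if tag_terms else 0.0
    return share, sorted(missing)


# Made-up sample data, for illustration only
sample_tags = ["Budget", "budget", "expense report", "TPS form",
               "xerox", "holiday schedule"]
sample_taxonomy = ["budget", "expense report", "photocopy", "leave calendar"]

share, missing = taxonomy_gap(sample_tags, sample_taxonomy)
print(f"{share:.0%} of tag terms are not in the taxonomy: {missing}")
```

In this made-up sample, three of the five distinct tag terms are missing from the taxonomy, including a brand name used as vernacular ("xerox") where the taxonomy has the official term ("photocopy").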
I got flooded with e-mail yesterday about Folksonomies: Tidying Up? in the January 2006 DLIB, and yes, I agree that by using Flickr as a base for much of their analysis they made a mess of their conclusions. Please go see Explaining and Showing Broad and Narrow Folksonomies to begin to get an understanding of why Flickr is not a great example of folksonomy. Showing tag distributions when tagging is limited by the tool (Flickr only permits one of each tag and does not allow identification of the person tagging, unless the API is used) is rather pointless. The central focus of a folksonomy is personal refindability, and it is from that point that we derive great value.
I would love to see this research redone with a better understanding of folksonomy and run on broad folksonomy tools like del.icio.us, Furl, Shadows, etc.
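To show why tag distributions only make sense on a broad folksonomy, here is a small hedged sketch in Python. It assumes tagging data arrives as (person, tag) pairs for a single object. That data shape, the function name, and the sample people and tags are hypothetical and are not taken from del.icio.us, Flickr, or any tool's API.

```python
from collections import Counter


def tag_distribution(taggings):
    """Count how many distinct people applied each tag to one object.

    taggings: iterable of (person, tag) pairs (an assumed data shape).
    A person repeating the same tag is only counted once.
    """
    unique = {(person, tag.strip().lower()) for person, tag in taggings}
    return Counter(tag for _, tag in unique)


# Broad folksonomy: many people tag the same object in their own words
broad = [("ann", "design"), ("bob", "design"), ("cat", "Design"),
         ("bob", "css"), ("ann", "webdesign")]

# Narrow folksonomy: each tag appears once on the object
narrow = [("owner", "design"), ("owner", "css")]

broad_counts = tag_distribution(broad)
narrow_counts = tag_distribution(narrow)

print(broad_counts.most_common())
print(narrow_counts.most_common())
```

With the broad sample, "design" stands out because three people chose it. With the narrow sample every tag has a count of one, so there is no distribution to analyze.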
I am glad to have my newest book in my hands finally. It is Personal, Portable, Pedestrian: Mobile Phones in Japanese Life edited by Mizuko Ito, Daisuke Okabe, and Misa Matsuda. Last year's trip to Amsterdam floored me at how far behind we in the U.S. are with mobile (as well as broadband, which was really amazing). In talking with others on my trip they were pointing out how much farther ahead Japan is than Europe. What do I mean by farther ahead? The trends in personal usage of mobile devices, meaning how people interact with and use their mobile devices (text, web, information interaction, etc.), are two or three years ahead of the U.S. I have been watching the trends I read about in magazine articles and heard in conversation flow from other countries and, after years, bubble up in U.S. culture. My interest in the Personal InfoCloud draws me toward deeper understandings of personal devices around the world.
Ethnographic insights interest me, particularly for cultures and interactions with technology that I cannot witness first hand. This book has come highly recommended, and having met Mimi this past Spring, I really have been looking forward to this book.
Wired has an overview of Personal, Portable, Pedestrian. You may also want to look at the publisher's (MIT Press) page for Personal, Portable, Pedestrian.
SmartMobs announces It is official, there are more cellphone lines than landlines in the U.S. I was thinking about this in the past couple weeks. We have already started seeing text and data uses tipping our mobile hands (it is about time we started getting to where much of the rest of the globe has already been).
Now if I could just keep my finger on the number of data-enabled phones and the lesser number of laptop/desktop internet connections for the globe. Every time I see this number I forget to mark it or grab it.
[Hat tip Anne]
One of the things that I am still mulling over that came out of the Social Software in the Academy Workshop is the relationship between academic cites and interested parties (non-academics researching, thinking deeply, and writing about a subject). Over the past year I have had some of the work I have posted on my web sites cited in academic papers. These papers have ranged from general coursework to graduate theses.
In the academic realm these cites in others' works give credibility and ranking. In the realm of the professional or "interested party" these cites mean little (other than stroking one's ego). These cites do not translate to higher salary, but they may have some relationship to credibility in a subject area.
Another aspect is finding a way to tie into academic work around these subjects. There are often wonderful academic-related gatherings (conferences, symposia, etc.) around these subject matters, but these are foreign to the "interested party". There is a chasm between the academic and professional worlds that should be narrowed or at a minimum bridged in a better way. At SSAW there were some projects I found out about that I would love to follow, or even contribute to in some form (advisement, contributor, etc.).
I have a feeling I will be mulling this for some while, and will be writing about it again.
I have been following Wade Roush's continuousblog since its inception a few weeks ago. Continuousblog is focussing on the convergence that is finally taking place in the information technology realm. I had a wonderful conversation with Wade last week and have been enjoying watching his 10,000 Brainiacs evolve in 10,000 Brainiacs, Part 1; 10,000 Brainiacs, Part 2; 10,000 Brainiacs, Part 3; and soon to be 10,000 Brainiacs, Part 4.
Wade's concept of "continuous computing" fits quite nicely in line with the Personal InfoCloud as we have access to many different devices throughout our lives (various operating systems, desktops, laptops, PDA, mobile phone, television/dvr, etc., as well as nearly continuous connectivity). The Personal InfoCloud focusses on designing and developing with the focus on the person and their use of the information as well as the reuse of the information. It is good to see we have one more in the camp that actually sees the future as what is happening today and is sending the wake-up call out that we need to be addressing this now, as it is only going to become more prevalent.
Today I finished reading the Malcolm McCullough book, Digital Ground. This was one of the most readable books on interaction design by way of examining the impact of pervasive computing on people and places. McCullough is an architect by training and does an excellent job using the architecture role in design and development of the end product.
The following quote in the preface frames the remainder of the book very well:
My claims about architecture are indirect because the design challenge of pervasive computing is more directly a question of interaction design. This growing field studies how people deal with technology - and how people deal with each other, through technology. As a consequence of pervasive computing, interaction design is poised to become one of the main liberal arts of the twenty-first century. I wrote this book because I ran into many people who believe that. If you share this belief, or if you just wonder what interaction design is in the first place, you may find some substance here in this book.
This book was not only interesting to me, it was one of the best interaction books I have read. I personally found it better than the Cooper books, only for the reason that McCullough gets into mobile and pervasive computing and how that changes interaction design. Once these current interaction modes are included, the role of interaction design changes quite a bit, from preparing an interface for a transaction done solely on a desktop or laptop to one that must encompass portability and remote usage and the various social implications. I have a lot of frustration with Flash-based sites that are only designed for the desktop and are completely worthless on a handheld, which is often where the information is more helpful to me.
McCullough brings in "place" to help frame the differing uses for information and the interaction design that is needed. McCullough includes home and work as the usual first and second places, as well as the third place, which is the social environment. McCullough then brings in a fourth place, "Travel and Transit", which is where many Americans find themselves for an hour or so each day. How do people interact with news, advertisements, directions, entertainment, etc. in this place? How does interaction design change for this fourth place, as few digital information resources seem to think about this mode when designing their sites or applications?
Not only was the main content of Digital Ground informative and well thought out, but the end notes are fantastic. The notes and annotations could be a stand-alone work of their own, albeit slightly incongruous.
I have posted my thoughts on Tools to Manage Information On Your Personal Hard Drive for Mac OS X in particular. I have posted this on my Personal Info Cloud site. This is the first piece of content that I am not posting in both places. This may become a trend as I am spending a fair amount of time thinking through ideas related to the Personal Info Cloud in one place. The Personal Info Cloud has an RSS feed and I will be posting notices that new info has been added there as it happens.
Project Oxygen has progressed quite well since we last looked in (Oxygen and Portolano - November 2001). Project Oxygen is a pervasive computing system that is enabled through handhelds. The system has the user's information and media follow them on their network and uses hardware (video, speakers, computers, etc.) nearest the user to perform the needed or desired tasks. Project Oxygen also assists communication by setting the language of the voicemail to match the caller's known language. The site includes videos and many details.
Project Oxygen seems to rely on the local network's infrastructure rather than the person's own device. This creates a mix of Personal Info Cloud by using the personal device, but relies on the Local Info Cloud using the local network to extract information. The network also assists to find hardware and external media, but the user does not seem to have control over the information they have found. The user's own organization of the information is important for them so it is associated and categorized in a manner that is easy for them to recall and then reuse. When the user drifts away from the local network is their access to the information lost?
This project does seem to get an incredible amount of pervasive computing right. It would be great to work in an environment that was Project Oxygen enabled.
This week's New York Times Circuits article, Now Where Was I? New Ways to Revisit Web Sites, covers the Keep the Found Things Found research project at the University of Washington. The program is summarized:
The classic problem of information retrieval, simply put, is to help people find the relatively small number of things they are looking for (books, articles, web pages, CDs, etc.) from a very large set of possibilities. This classic problem has been studied in many variations and has been addressed through a rich diversity of information retrieval tools and techniques.
This topic is at the heart of the Personal Information Cloud. How does a person keep the information they found attracted to themselves once they have found it? Keeping the found information at hand to use when the need for it arises is a regular struggle. The Personal Information Cloud is the rough cloud of information that follows the user. Users have spent much time and effort to draw information they desire close to themselves (Model of Attraction). Once they have the information, is the information in a format that is easy for the user or consumer of the information to use or even reuse?
MIT's Technology Review discusses Randolph Wang's wireless PDA for personal information storage (registration for TR may be required). This brief description (I could find no longer or more explicit description at Wang's Princeton pages or by searching CiteSeer) is very interesting to me.
One PC at work, another at home, a laptop on the plane, and a personal digital assistant in the taxicab: keeping all that data current and accessible can be a major headache. Randolph Wang, a Princeton University computer scientist, hopes to relieve the pain with one mobile device. Designed to provide anytime, anywhere access to all your files, the device stores some data, but its main job is to wirelessly retrieve files from Internet-connected computers and deliver them to any computer you have access to. Wang's prototype is a PDA with both cellular and Wi-Fi connections, but the key is his software, which grabs and displays the most current data stored on multiple computers. Wang has tested his prototype with more than 40 university and home computers on and around the Princeton campus. He eventually wants to shrink the device down to the size of a wristwatch to make carrying it a snap.
This is really getting to a personal information cloud that follows the user. This really is getting to the ideal. Imagine having everything of interest always with you and always available to use. Wang's solution seems to solve one of the ultimate problems, synching. The synching portion of this seems to stem from PersonalRAID: Mobile Storage for Distributed and Disconnected Computers, which was presented at a USENIX conference. I really look forward to finding out more about this product.
The Media Lab Europe's research lab for Human Connectedness really has some great things in progress. The most newsworthy of late has been tunA, which is a wireless sharing of your personal music device, which extends your personal info cloud and creates a local info cloud for others. tunA was covered in the Wired News article Users Fish for Music a couple weeks ago.
The group's focus tends to be connecting people by digital tools using aural and visual presentation methods. There are some very intriguing applications that could come out of this research.
Peter Van Dijck relaunched Guide to Ethnography Wiki. This is a very good resource for understanding ethnographic studies and research.
Department of Informatics, University of Sussex, Interact Lab, HCI papers provides offerings in: Pervasive Environments and Ubiquitous Computing - Shared Interaction Spaces; Playing and Learning - Tangibles & Virtual Environments - Collaborative Learning; Theory & Conceptual Frameworks; Technology Mediated Communication; and Interactive Art. [hat tip Anne]
I stumbled across the London School of Economics, Information Science Working Papers, which has some interesting offerings. I want to come back to this and read more of these. Mobile Services: Functional Diversity and Overload (PDF) and Mobility: an Extended Perspective (PDF) have been interesting and are sitting in my research directory for more mulling.
Matt picks up on the failure of navigation and points to conversations similar to ones I had with Stewart that turned me to look for something other than navigation as a means to build information structures. Each user approaches information with two of their own receptors, cognitive and sensory receptors. The cognitive elements include vocabulary and rhetoric (essentially writing style). The sensory include visual elements, which include color, texture, and layout. Layout includes the visual structure and context given through proximity. These two seem to have parallels to Andrew Dillon's semantic spatial model, but I want to know more about his model.
Matt discusses the problems with navigation consistency at the BBC sites. Here is where navigation gets in the way, as browsing structures is a better term and less restrictive. The user needs a means to find other information that is related or provides context to the information they see on their screens. If there is some attraction to the information in front of the user, they often believe that which they seek will be close by if the information is grouped with like information. It is much like a market where produce is grouped together, as they are like products.
iSociety Mobile Phones and Everyday Life is a report that looks at how mobile devices impact everyday life. Looking at how we work with mobile devices today will help us set a framework for the future.
A healthy questioning of evaluation techniques in Henry Lieberman's Tyranny of Evaluation
There is a new design research repository on the Web, InformeDesign. This contains research for the Interior Design field. At times it is good to break out of one's normal shell and see what other fields are doing and researching.
In First Monday: The Institutional Design of Open Source Programming: Implications for Addressing Complex Public Policy and Management Problems by Charles M. Schweik and Andrei Semenov. More simply put, what lessons can be derived from understanding the Open Source distributed methods that will help collaboration on intellectual issues, such as public policy, and understanding collaboration to better solve management issues? The physically distributed model is getting a test in many smaller organizations, including one information architecture non-profit that has recently opened its virtual doors.
In First Monday, A Gendered World: Students and Instructional Technologies by Indhu Rajagopal with Nis Bojin offers a good insight into some gender differences in learning with technology. I want to come back and read this in full.
This began with a discussion with Stewart on Peterme and the encouragement of Lane in another discussion to look for a metaphor other than navigation that could better explain what we do on the Web. Seeing Stewart walk by at SXSW after I had seen some of Josh Davis' visual plays, I combined the discussion with Stewart with the magnetic attraction Josh showed, which began my thinking about a metaphor of attraction. Magnetism seems like what happens when we put a search term in Google: it attracts information that is drawn to the term onto your screen.
Come see where else this metaphor can go in this poorly written draft of the metaphor of attraction. This is posted to begin a collaboration to dig back and move forward, if that is where this is to go. The writing will improve and the ideas will jell into a better presentation over the next few weeks.
One navigation method that I find less and less is offering similar links based on what the user has clicked to. Often I would like to read the archives of a regular columnist in a magazine. I should not have to search to find the archives, as that method often provides chaff along with the goal of my search. Storage and metadata can greatly assist the navigation approach.
I personally find navigation and search combinations on a site create a higher probability that I will find the information that I am searching for.