Thursday, 30 September 2010

Four key concepts in Web 3.0 - 1. Semantic content

The years of Web 2.0 were very exciting. This was not just because I think we were genuinely seeing novelty, difference and potential, but because we were capturing it in definitions. The fact that those definitions were different depending on whose opinion one asked was great fun. This enabled me to conduct one of my favourite activities - bar arguments - on a regular basis. Web 2.0 felt like a big step forward.

I have had several conversations recently about what's coming next. We know the themes of the moment: social, mobile etc. But it's time to do some future gazing. OK, so Tim Berners-Lee has already had a go at defining what Web 3.0 might be. It's probably fair to accept that he has a certain 'authority' in these things, but I'd like to stick my neck out and make a few suggestions of my own.

Actually that's a bit of a lie. These are, of course, not my own opinions, but those of the people I meet, follow and read. Just call me Boswell. For the sake of your eyes, Dear Reader, this will be separated into four sections over the next few weeks.

Berners-Lee has championed the concept of the semantic web as being core to Web 3.0 and this has to be one of the major memes of the next few years. Structured, parametric data has been the mainstay of databases and data definition for decades (centuries?) - we all understand that to store data and to make it useful we have to describe it. That description has to be consistent, well maintained and generally rather inflexible if it is to allow us to use data to generate insight.

This rigidity has caused huge overheads in data management. The inflexibility has meant that stored data does not reflect the changing environment and has often led to enormously complex taxonomies and heavyweight tagging systems which generally sacrifice speed for structure. It also requires data experts. Time and time again businesses have seen how completely unreliable ordinary users are when it comes to defining and entering data properly. Even content professionals struggle to tag well (witness the eternal battle between data/content managers and authors and journalists).

Parametric based systems are unwieldy - watch techies wince when faced with data migration from one platform to another, or try to match business rules with taxonomy structures. The end result is that in all but the simplest systems precision is far from perfect, good enough is often scarily poor and flexible non-linear thinking is hindered.

The semantic web is not a new idea. I remember seeing a prototype semantic thesaurus in the mid/late 90s. But it has been severely held back by technical limitations and the inability of machines to index and catalogue accurately. It has also been held back by the lack of systems that can read crowds and adapt definitions according to their behaviour. However, I'm seeing more and more examples of this being done well by people who are brilliant.

It has to be the future. True semantic systems will be agnostic of the form or shape of content but will be able to recognise context and meaning. The RBI Search guys did some tremendous work in this when they launched Zibb and the various other off-shoot products. By the end they were automatically recognising the difference between B2B and B2C, the difference between news and other types of content (indeed the last time I asked they were differentiating between 15 different types of content) and were doing incredibly sophisticated things with context and meaning.

Google have blazed their path over the last decade. Love them or hate them, one has to admire the brilliance of their machine led comprehension of content and data and the range of content types over which we now expect them to operate.

Semantic based systems will arrive, probably soon and with an enormous impact. I'd like to postulate systems that will incorporate content of whatever form (be it text, audio or video). They will recognise, index and tag new content and place it in context with existing records. That context will be driven by a combination of algorithm driven logic and actual user activity (it will probably still have a taxonomical basis as well - at least for a while). Whole pages can be generated on the fly according to an individual's preferences and behaviour. Any distinction between content and databases will become less important or even irrelevant. The distinction between sites will disappear.

This gives content producers startling advantages. The costs of keeping and managing data will tumble. Complexity will be replaced by flexibility and new data sources will be incorporated into existing tools with little or no additional work. They will be able to initiate individual conversations with their customers in which the customer will have enormous control and equality. These aren't new ideas, but technology and our abilities to deploy that technology are maturing enough to deliver this. Business models will evolve to be more value based and certainly more focused on the micro than the macro.

This is a fundamental basis for changing the web and making it even more useful. In combination with the next few themes I think it will provide the framework for as radical a change as that ushered in by Web 2.0 (whatever that was) and perhaps even by the web itself.

Next time: platforms

BTW has everybody seen ? Semantic wonderfulness in action.

1 comment:

  1. Nice post Graham. I believe that semantic is where things are heading. My worry is though that too many information providing businesses are too heavily invested and geared up (in terms of money, infrastructure and business practices) in the macro and heavy taxonomy way of doing things to understand and shift their business models to more micro-payment, app-like models that success in the semantic world will require.