Looking for use cases for a semantically enhanced CMS

Day is participating as an industrial partner in the Interactive Knowledge project, which aims to provide an open source technology platform for semantically enhanced content management systems.

We are starting to collect use cases for a semantically enhanced CMS – although I’m not 100% sure what semantically enhanced means (and I assume that means different things to different people), I have started with use cases like the following:

When I drop an image of a house in my content, the system allows me to see images of similar houses, and pages that talk about houses.

When I start writing a new piece of content, the system optionally shows me similar content that’s already in the repository, even if written in other languages.

The system allows me to formulate queries like “recent pages that talk about houses to rent in the French part of Switzerland”.

If you have additional ideas for such use cases, or examples of systems that provide such features, I’m all ears!

11 Responses to Looking for use cases for a semantically enhanced CMS

  1. Berthold Wagner says:

    I would think this is less about providing you, the author, with info (e.g. showing you similar houses) and more about “autotagging” your content appropriately. This would then enable content consumers to “find” it, and you as the author just as well.

    However, efforts around tagging have fallen far short of expectations so far. Possibly tags are a dead end because they are too inflexible. In the end the system still doesn’t “understand” the content and the implied concepts.

    For a CMS the goal must be to provide the “right” content for a request, and the issues start with agreeing on what is right. I am not impressed by the results we achieve today.

  2. bdelacretaz says:

    Thanks Berthold! Autotagging is certainly useful if that can work somewhat reliably, and this is something that we’ll want to explore in the IKS project.

    In general I like tags very much, especially when qualified with some form of namespacing – one could imagine using tags like house/automatic and house/verified to indicate the source/reliability of tags while allowing similarities to be discovered.
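As a rough illustration of the namespaced tags idea, here is a minimal Python sketch. Only the `house/automatic` and `house/verified` tags come from the comment above; the item names and the `parse_tag` and `group_by_concept` helpers are hypothetical, made up for this example.

```python
from collections import defaultdict


def parse_tag(tag):
    """Split a tag like 'house/verified' into (concept, source)."""
    concept, _, source = tag.partition("/")
    # Tags without a qualifier get a placeholder source (an assumption).
    return concept, source or "unqualified"


def group_by_concept(items):
    """Map each concept to {item name: set of sources that assigned it}."""
    index = defaultdict(dict)
    for name, tags in items.items():
        for tag in tags:
            concept, source = parse_tag(tag)
            index[concept].setdefault(name, set()).add(source)
    return index


# Hypothetical content items and their tags.
items = {
    "villa.jpg": ["house/automatic"],
    "rental-listing.html": ["house/verified", "switzerland/verified"],
    "team-photo.jpg": ["people/automatic"],
}

index = group_by_concept(items)

# Similarity ignores the qualifier: both items are about houses.
related = sorted(index["house"])

# Reliability uses the qualifier: keep only human-verified tags.
verified_only = sorted(
    name for name, sources in index["house"].items() if "verified" in sources
)

print(related)
print(verified_only)
```

The point of the split is that the concept part drives similarity, while the qualifier can be used to filter by how much the tag is trusted.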

  3. NN says:

    I actually need (and incompetently try to build, because I could not find such a beast) such a system right now. It’s an integration of RDF into JCR/RDBMS.
    In science you can annotate (not tag) the output of programs with RDF, automatically or manually. Then, based on knowledge of input types, output types and the program (a transformation of “content”), the CMS can suggest other programs to further process the output.
    It’s not a folksonomy approach, because many ontologies are available that
    a) bring in a common vocabulary
    b) are actually meaningful and hierarchically constructed (isA, hasA, partOf), so inference is possible
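The suggestion mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the type names, the program names, and the `ancestors` and `suggest_programs` helpers are all hypothetical, and a plain dictionary stands in for a real ontology and RDF store.

```python
# Hypothetical isA hierarchy: child type -> parent type.
IS_A = {
    "ProteinSequence": "Sequence",
    "DnaSequence": "Sequence",
    "Sequence": "Data",
    "Alignment": "Data",
}

# Hypothetical programs: name -> (accepted input type, produced output type).
PROGRAMS = {
    "translate": ("DnaSequence", "ProteinSequence"),
    "align": ("Sequence", "Alignment"),
    "render_tree": ("Alignment", "Image"),
}


def ancestors(type_name):
    """Return the type itself followed by all of its isA ancestors."""
    chain = [type_name]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def suggest_programs(output_type):
    """List programs whose input type matches the output type or an ancestor."""
    accepted = set(ancestors(output_type))
    return sorted(
        name for name, (input_type, _) in PROGRAMS.items() if input_type in accepted
    )


# A ProteinSequence isA Sequence, so "align" is suggested even though
# no program declares ProteinSequence as its input type.
print(suggest_programs("ProteinSequence"))
print(suggest_programs("DnaSequence"))
```

The inference step is the walk up the isA chain: without it, only exact type matches would be found.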

  4. Chris says:

    Semantically enhanced – what does that mean?

    The Greek word “Semantikos” means significant. So, “semantically enhanced” might be translated to “making more sense” – to the user, I guess.

    Instead of thinking about new and cool features, one should first ask: what makes more sense? More sense to the actual users of the system? The right answer in many cases will not be adding new features but doing the things the system already does – the right way round.

    I’ve seen, used and tried out many CMSes. Most of them have cool features and completing standard tasks works quite well ootb – as long as you use them the way the developers meant you should. But as soon as you leave the beaten track, things rapidly get quirky and annoying.

    One might leave the beaten track by doing strange things like:
    – using another UI language than English,
    – managing some more downloads than the ones in the demo site,
    – not only creating but also maintaining a multilingual website,
    – working with dozens of users on multiple sites,
    – upgrading the system after some needed customization,
    – wanting to update documents on a page

    And so on.

    My advice to the developers of a semantically enhanced CMS: Ask the users.

    Look over their shoulders in the real world. Watch them sweat blood when completing simple tasks. You won’t have to look out for new ideas anymore for a long time.

  5. bdelacretaz says:

    @chris, thanks for your comment, I see your point!

    Totally agree that the basics have to be covered before thinking about fancy semantic features (and BTW I think my employer’s products are pretty good w.r.t. the features that you mention).

    In the context of the IKS project, “semantically enhanced” goes more in the direction of more intelligent search and tagging features, context extraction to create relationships between (multimedia) content items, etc.

    The goal in collecting use cases is to make these concepts more concrete and more understandable by users, and to define requirements for semantic components that will hopefully be usable as add-ons for a wide range of CMSes.

    I think this matches your “look over their shoulders” point of view, but in this particular case it goes beyond the basic use cases, towards making the sometimes esoteric “semantic features” natural to the average user.

  6. bdelacretaz says:

    See http://www.jroller.com/robertburrelldonkin/entry/semantic_content_repositories for additional thoughts about this, by Robert Burrell Donkin.

    And yes Robert, brainstorming at ApacheCon next week would be cool – I’ll be back from an IKS meeting which takes place on Monday and Tuesday morning.

  7. Sally Khudairi says:

    Bertrand, I have a bunch of names for you. Please see me during ApacheCon :-)

  8. Andreas Hartmann says:

    From my POV, “semantically enhancing” a CMS would mean to enable, encourage, or even force the content authors to add semantic information to the content, and establish standardized ways to access the content based on semantic information. A basic example: Use FOAF (instead of XHTML or whatever) as the native format to store data on people.

    As so often, it’s probably not mainly a technical issue. XML-based CMSs like Apache Lenya have been around for many years; they make it easy to acquire and store data in a structured, semantically unambiguous way. But most organizations are not aware of the advantages they could gain by leveraging these features. I’ve seen many websites where contact information, FAQs etc. are stored in XHTML pages. This is “dead” content: it can’t be automatically queried or processed in a semantic way.

    Maybe a valid goal would be to agree on some standard output and query formats / protocols, and implement them in as many CMSs as possible, showcasing the possibilities and increasing the awareness for these issues?
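To make the FOAF example above a little more concrete, here is a minimal Python sketch that serializes a contact record as Turtle using the FOAF vocabulary. The `foaf:Person`, `foaf:name` and `foaf:mbox` terms are part of FOAF; the `person_to_turtle` and `escape_literal` helpers, the example URI and the contact details are hypothetical, and a real system would more likely use an RDF library than string formatting.

```python
FOAF = "http://xmlns.com/foaf/0.1/"


def escape_literal(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def person_to_turtle(person_uri, name, email):
    """Render one person as a small Turtle document (illustrative only)."""
    lines = [
        f"@prefix foaf: <{FOAF}> .",
        "",
        f"<{person_uri}> a foaf:Person ;",
        f'    foaf:name "{escape_literal(name)}" ;',
        f"    foaf:mbox <mailto:{email}> .",
    ]
    return "\n".join(lines)


# Hypothetical contact record, of the kind often buried in an XHTML page.
turtle = person_to_turtle(
    "http://example.org/people/jane",
    "Jane Example",
    "jane@example.org",
)
print(turtle)
```

Stored or exported this way, the same contact data can be queried by any tool that understands RDF, rather than only being displayed.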

  9. bdelacretaz says:

    @andreas, thanks for your comments! I don’t like “forcing” things on people, so I’d rather lean towards “encouraging”, which usually works when people see the benefits of adding that semantic data.

    I don’t think using FOAF or similar formats as native storage is required, as long as you can get such “semantically useful” formats out of the system.

    About agreeing on some standard output, I’m currently looking at Linked Data (http://en.wikipedia.org/wiki/Linked_Data) which looks promising.

    • Andreas Hartmann says:

      Hi Bertrand,

      thanks for your reply! By “forcing” people, I was referring to guidelines in organisations, ensuring that content is acquired/tagged in a way that allows semantic processing.

      In my experience, the actual content authors are often too consumed with the job at hand to invest any thought in future opportunities to leverage their content. So semantic content acquisition should be as natural and straightforward as possible. The Firedocs editor (http://www.firedocs.org) is a very good example: A browser-based WYSIWYG authoring environment for structured content (arbitrary XML), based on XSLT. Similar products have been available for some time, but I have to admit that this is the first (open source) product that I (as a techie) prefer to an XML source editor.

  10. Lina Wolf says:

    One important use case for a semantic CMS is the metadata tagging of real-life print documents or their digitised images. Some important viewers like the DFG Viewer are based on a CMS (here TYPO3). Better semantic support could improve these.
