Announcing the Release of the CMIS Connector for SharePoint

I’m pleased to announce that we have released to web (RTW) the Content Management Interoperability Services (CMIS) Connector for SharePoint.  The CMIS Connector for SharePoint ships as part of the SharePoint 2010 Administration Toolkit. It provides a CMIS interface on top of SharePoint, as well as a CMIS Consumer Web Part that can be used to display content from other CMIS-enabled repositories.

You can download the SharePoint 2010 Administration Toolkit today and start taking advantage of these new capabilities in SharePoint Server 2010, either by building your own Composite Content Applications that talk to SharePoint through CMIS or by configuring SharePoint to interoperate with other ECM repositories through the CMIS Consumer Web Part.

Microsoft has been involved in defining the CMIS specification since the beginning and has invested significant resources to ensure that our customers are able to take advantage of support for CMIS in SharePoint 2010 just months after releasing the latest version of our platform.  We are excited about the opportunities that the CMIS standard will open up within the industry and look forward to seeing more ECM vendors deliver support for CMIS in their upcoming product releases.

For further reading on CMIS, visit these sites:

Ryan Duguid
Senior Product Manager
Microsoft Corporation

Introducing Enterprise Metadata Management

Hi there, my name is Pat Miller, and I am the development lead for the Enterprise Metadata / Taxonomy features in SharePoint 2010.  I’ve been working on the ECM team and its forebears for the better part of 11 years now, first with NCompass Labs, which was acquired by Microsoft in 2001, then on the Content Management Server team, then with the CMS team as part of MOSS 2007.  This is the first of many blog posts on the Enterprise Metadata Management (EMM) system in the 2010 release.  This post is an overview of the system; future posts will drill into specific areas like event receivers, field editing and search refinements.

First, some background.  At one point during the development of Content Management Server 2002, we spent some time with the folks that run the Microsoft.com set of websites.  One of the things they were very keen on was this taxonomy system that they had built.  It seemed fairly useful, and we considered implementing something like it, but didn’t have the time, and there was a general concern that no one would actually do the work of tagging data.  During the development of MOSS 2007, we were spending most of our time rewriting our feature set to run on top of SharePoint, and once again, taxonomy fell off the list of things we were willing to tackle (and still, people would consistently say that people just don’t tag).

Around this time, people started tagging things in their own world.  The rise of digital cameras and mp3 players brought a huge amount of data that, for the most part, had to be marked up with metadata in order to be searchable.  Some metadata was added to the files automatically (things like date, size, camera model, etc.), but specific user information wasn’t there.  You quickly learned that if you categorized the images (either through folder location or tags) you could navigate your way through tens of thousands of files (images, music, etc.) the way that works for you personally, rather than relying on default information like the date the picture was taken.  People became more familiar with the concept of navigating their content via metadata – "Let’s listen to all my Pearl Jam albums, I feel like listening to Electronica, find me photos of Dad".  It’s only a small step from that to wanting to impose some sort of hierarchy – find me photos of my whole family, my extended family, I want to listen to all classical music, or perhaps just from the Baroque period.  Tagging all that data really unlocked a lot of potential.

Perhaps the landscape had changed…

We decided to run with it in the 2010 release.  There were a few main tenets that we tried to let guide us:

  1. No one (well, almost no one) applies metadata for the sheer joy of it.  It’s always for a purpose.
  2. #1 means that the system has to exist for the end user’s benefit.  What can you do if you have this rich metadata applied?
  3. In order for #2 to be realized, the metadata has to be present, which means that applying consistent metadata needs to be as easy and ubiquitous as possible.

To that end, we set out to enable a bunch of new user scenarios for SharePoint 2010.

We started out the release with a blank sheet of paper and some very knowledgeable people in the information management space.  We also found that most people started twitching uncontrollably when the word "ontology" was mentioned.  ‘Tagging’ was fine, ‘metadata’ was OK, at ‘taxonomy’ they started looking for an exit.  Telling people that a taxonomy was just a hierarchy calmed them down, but the whole ontology thing was too much of a stretch.  It also complicated things considerably, and we could still get a huge amount of value out of a taxonomy, so this was our starting point.

Some features were very obvious – filtering list views based on hierarchy inclusion, search refinement, etc.  Some were a small step from this – if you have a consistent vocabulary across an enterprise, you can start to do some interesting things.  You can match areas of expertise to specific content or workflows.  You can start to relate content in totally different systems based on something with more context than a simple string.  What if you could relate your analytics content to your taxonomy system and get a real-time view of what topics people are viewing instead of simply guessing based on their position in a URL namespace?  How about overlaying your security model with your metadata so that certain people had rights to view content based on the metadata applied to it?  How about we get down to business, focus our resources, and ship a compelling collection of features?

To that end, we came up with the following components in the system:

We call the taxonomy repository itself the Term Store.  Some companies have very strict, top-down taxonomies, so some term stores might have very few people allowed to edit them.  That means we have to support multiple term stores.

The taxonomy system needs to be able to support a complex enterprise.  A simple flat list of strings isn’t going to be sufficient.  To that end, we support the following concepts and behaviors:

  • Terms - A term is the central object in the taxonomy system.  It’s the concept itself.  It’s very hard to come up with a name for a concept and have it be sufficiently descriptive and not too vague.  Term is what we came up with.
  • Labels - Terms have to be known by a bunch of different names.  When someone types "check" it should be the same thing as someone that types "cheque".  "USA" and "United States" and "United States of America" are all referring to the same term.  We call these names labels.
  • Default Label - It’s a whole lot easier if one label is the default.  You can find it through any of its synonyms, but we’ll display the default label in most circumstances.
  • Termset - A collection of related terms in a hierarchy is a termset.  Things like "locations" and "products".
  • Term Reuse - This is a key point of the system.  If you have two termsets "Capital Cities" and "Locations", the term "London" and all of its synonyms, etc. should be the same in both.  We don’t allow a term to have two parents in the same termset, but it can have two parents in different termsets.
  • Homographs – A homograph is a word that is spelled the same as another word but has a different meaning.  You should be able to have a hierarchy that has "Paris" existing in both France and Texas.  To keep things a bit more sane for the user, we don’t allow homographs to have the same parent.
  • Multiple language support - A given term has a bunch of meaning associated with it.  The translations belong to the term in the same way that synonyms do.  If a term doesn’t have a translation, we use the default language.
  • Groups - Groups in the taxonomy system are simply collections of termsets that share a common security assignment.  Termsets and terms aren’t ACL’d, groups are.
  • Deprecated terms - if a term shouldn’t be used any more, it can be deprecated.  This doesn’t remove it from the system, you just can’t apply it to new content moving forward.
  • Terms that are unavailable for tagging - this is slightly different from deprecated terms.  A deprecated term is deprecated in all occurrences in the taxonomy and isn’t shown to the user when tagging.  Unavailable terms are only unavailable in a specific termset, and are still displayed when browsing the hierarchy at tagging time.  The purpose of this is to allow things to be hierarchical without allowing people to tag with the wrong term.  For example, the Capital Cities termset might contain continents so that people can find a particular city, but the continents would be marked as unavailable for tagging (with respect to Capital Cities) because they should not be selectable at tagging time.
  • Merging terms - at times, you might get multiple terms in the system that really are the same thing.  They might be in the same termset, or they might be in different termsets.  When you merge them, you get a single term with all of the properties, and this new term will be reused in every termset in which the original terms existed.
  • Open Termsets - There are times when a highly managed taxonomy makes sense.  You shouldn’t be able to add random countries to the list of known countries.  However, you probably don’t want to give taxonomy editing permissions to everyone that is creating a new codeword.  Open termsets allow content editors to add new terms to a hierarchy at content authoring time.  It’s a bit of a meeting point between bottom up folksonomies and top down taxonomies.
  • Keywords - The degenerate case of a folksonomy is simply a flat list of strings.  They have no extra semantic meaning.  This is the enterprise keywords termset.  Terms here don’t have a hierarchy, definitions, synonyms or translations.  However, it is possible to move a keyword into a managed termset and add this additional data.
  • Local termsets - The taxonomy field type gives you all sorts of useful features, but you probably don’t want "places to order food from" to wind up in your enterprise taxonomy.  Local termsets are only visible within a single site collection.
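To make the vocabulary above a little more concrete, here is a small in-memory sketch of terms, labels, default labels, language fallback, reuse and the homograph rule. It is written in Python purely for illustration and is not the SharePoint taxonomy API: the names `Term`, `TermSet`, `add_label`, `default_label`, `find` and `make_term`, the "en" fallback language, and the sample data are all assumptions made for this example.

```python
from typing import Dict, List, Optional


class Term:
    """A concept known by one or more labels per language (hypothetical model)."""

    def __init__(self) -> None:
        # labels[lang] holds every name for the term; the first entry is the default label.
        self.labels: Dict[str, List[str]] = {}

    def add_label(self, lang: str, name: str, default: bool = False) -> None:
        names = self.labels.setdefault(lang, [])
        if default:
            names.insert(0, name)
        else:
            names.append(name)

    def default_label(self, lang: str, fallback_lang: str = "en") -> str:
        # No translation for this language: fall back to the default language.
        names = self.labels.get(lang) or self.labels[fallback_lang]
        return names[0]

    def matches(self, text: str) -> bool:
        wanted = text.casefold()
        return any(wanted == name.casefold()
                   for names in self.labels.values() for name in names)


class TermSet:
    """A hierarchy of terms; the parent is stored per term set so terms can be reused."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.parent_of: Dict[Term, Optional[Term]] = {}

    def add(self, term: Term, parent: Optional[Term] = None) -> None:
        if term in self.parent_of:
            raise ValueError("a term has only one parent within a term set")
        label = term.default_label("en")
        for other, other_parent in self.parent_of.items():
            # Homographs are allowed, but not under the same parent.
            if other_parent is parent and other.matches(label):
                raise ValueError("homographs cannot share a parent")
        self.parent_of[term] = parent

    def find(self, text: str) -> Optional[Term]:
        # Any label, default or synonym, resolves to the same term.
        return next((t for t in self.parent_of if t.matches(text)), None)


def make_term(default_name: str, *synonyms: str) -> Term:
    term = Term()
    term.add_label("en", default_name, default=True)
    for synonym in synonyms:
        term.add_label("en", synonym)
    return term


usa = make_term("United States", "USA", "United States of America")
usa.add_label("fr", "États-Unis", default=True)
london = make_term("London")
france, texas = make_term("France"), make_term("Texas")
paris_fr, paris_tx = make_term("Paris"), make_term("Paris")

locations = TermSet("Locations")
for root in (usa, france, texas, london):
    locations.add(root)
locations.add(paris_fr, parent=france)   # two "Paris" terms are fine:
locations.add(paris_tx, parent=texas)    # they have different parents

capitals = TermSet("Capital Cities")
capitals.add(london)                     # reuse: the very same term object

print(locations.find("usa").default_label("en"))                 # United States
print(usa.default_label("fr"))                                   # États-Unis
print(london.default_label("fr"))                                # London
print(capitals.find("London") is locations.find("London"))       # True
```

The point of the sketch is that a label is only a way to reach a term. Because both termsets hold the same `london` object, adding a synonym or translation to it in one place shows up everywhere the term is reused.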

OK, that’s a nice set of features in the taxonomy system.  What do we want to do with all those terms and termsets?

The next set of features involve integrating the taxonomy system with SharePoint.  The primary place this happens is in the new managed metadata field type.  Think of it as a choice field that went to the gym.  It’s much more powerful.  The metadata field type is a normal field that can be applied to any content type (list or document library).  However it has a few nice things associated with it:

  • Termset binding - You can specify what termset a field should be bound to.  You can have lots of fields bound to the same termset.  When you update the termset, all of the bound fields use the changes immediately.
  • Path or node display - You can choose to display the default label of the term by itself "Paris" or its path "Europe > France > Paris".
  • Multi-lingual rendering - If a given term has been translated to a given language, when your UI is set to that language, the term translations are displayed.
  • Content type syndication – This isn’t a taxonomy feature per se, but it’s part of the enterprise metadata feature set.  We allow a term store to have a site collection defined as its "hub".  On that hub you can publish content types, and these content types will be pushed out to all consuming site collections.  This means that in addition to having a consistent vocabulary across your enterprise, you can have a consistent set of content types using all that goodness.
  • Rich editing - when you are applying a term to an item, you can search across the entire termset (including synonyms) or view the tree itself.  It makes it possible to choose from thousands of choices, which would normally break lookup and choice fields.
  • Editing support in the rich client applications - the document information panel in the Office client applications allows for applying terms.
  • Offline editing in the rich client applications - when you edit in the rich client applications, a copy of the bound termsets is cached locally.  You can tag on the plane.
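Termset binding and path or node display can be illustrated with a short sketch. It assumes that a bound field stores a reference to a term rather than a copy of its label, which is one plausible way to get the "bound fields use the changes immediately" behavior; the post does not say how the product implements this. The names `Node`, `BoundField`, `render` and the identifiers `t1` to `t3` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    label: str
    parent_id: Optional[str] = None


# Hypothetical shared term set, keyed by made-up term identifiers.
term_set: Dict[str, Node] = {
    "t1": Node("Europe"),
    "t2": Node("France", parent_id="t1"),
    "t3": Node("Paris", parent_id="t2"),
}


@dataclass
class BoundField:
    """A field bound to a term set; it stores a term id, never a copied label."""
    term_set: Dict[str, Node]
    show_path: bool = False

    def render(self, term_id: str) -> str:
        # Walk from the term up to the root, collecting labels along the way.
        parts: List[str] = []
        current: Optional[str] = term_id
        while current is not None:
            node = self.term_set[current]
            parts.append(node.label)
            current = node.parent_id
        if not self.show_path:
            return parts[0]
        return " > ".join(reversed(parts))


city_field = BoundField(term_set)                    # node display
region_field = BoundField(term_set, show_path=True)  # path display

print(city_field.render("t3"))    # Paris
print(region_field.render("t3"))  # Europe > France > Paris

# Updating the shared term set is visible to every bound field at once.
term_set["t2"].label = "French Republic"
print(region_field.render("t3"))  # Europe > French Republic > Paris
```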

Once data is in SharePoint, other SharePoint features can deliver additional goodness:

  • Better listview filtering - not only can you filter in the normal "everything with value X" way, but you can also do inclusive filtering, displaying everything tagged with X or a child of X.
  • Better metadata navigation behavior - The metadata navigation feature allows you to navigate through libraries using hierarchies other than the folder hierarchy.  The termset is one of the allowed hierarchy types, meaning that you can browse your libraries along multiple axes.  You can now free your data from the tyranny of the URL or folder namespace.
  • Routing and policy - The document routing feature can direct your content based on the metadata applied to it.  Taxonomy fields can even be used to create folder hierarchies at the routing destination.  Retention policies can be driven off of taxonomy fields as well.
  • File open / save - Can’t remember exactly where your document is stored in a large library?  You can use the taxonomy field to filter the open dialog display.
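The difference between normal and inclusive filtering is easy to show in a few lines. The sketch below is a plain Python illustration rather than SharePoint code: the `PARENT` map, the sample items, and the function names `descendants` and `filter_items` are assumptions, and labels are used as keys only to keep the example short.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical term set: child label -> parent label (None marks a root).
PARENT: Dict[str, Optional[str]] = {
    "Europe": None,
    "France": "Europe",
    "Paris": "France",
    "United Kingdom": "Europe",
    "London": "United Kingdom",
}

# Hypothetical library items as (file name, tagged term).
ITEMS: List[Tuple[str, str]] = [
    ("Louvre guide.docx", "Paris"),
    ("EU overview.docx", "Europe"),
    ("Tube map.pdf", "London"),
    ("Wine regions.xlsx", "France"),
]


def descendants(term: str, parent: Dict[str, Optional[str]]) -> Set[str]:
    """Return the term itself plus every term below it in the hierarchy."""
    result = {term}
    changed = True
    while changed:
        changed = False
        for child, its_parent in parent.items():
            if its_parent in result and child not in result:
                result.add(child)
                changed = True
    return result


def filter_items(items: List[Tuple[str, str]], term: str,
                 parent: Dict[str, Optional[str]],
                 inclusive: bool = False) -> List[str]:
    allowed = descendants(term, parent) if inclusive else {term}
    return [name for name, tag in items if tag in allowed]


print(filter_items(ITEMS, "France", PARENT))
# ['Wine regions.xlsx']
print(filter_items(ITEMS, "France", PARENT, inclusive=True))
# ['Louvre guide.docx', 'Wine regions.xlsx']
```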

Now that we have all that nice consistent metadata on our content, we can do a few more things:

  • Content by query Web Part enhancements - You can configure the CBQ to filter based on taxonomy fields, including descendant inclusion.
  • Automatic search refinement - The search system is aware of all taxonomy fields, and if a result set has a sufficient amount of data with the same taxonomy fields, a search refinement will appear, allowing users to filter their data.
  • Power user profile and social tagging - it doesn’t make much sense to have a corporate taxonomy and then do your social tagging using just string matching.  All of the social properties are actually sourced from the taxonomy system, meaning that you won’t get people asking you where a good place to stay in Paris, France is when you are an expert on Paris, Texas.

And since we know that we can’t possibly implement every feature that everyone would want, everything is accessible through our API.  In future blog posts, we’ll go over how to use this API to deliver some compelling features.

Hopefully this is a nice introduction to the work we did around taxonomies and enterprise metadata.  We had a lot of fun coming up with the design and implementation, and hope that it resonates with you.

Thanks for reading.

Pat.Miller at Microsoft.com

Variations: Propagate Pages on Your Terms

Hi everyone,

I’d like to answer a common question about how to modify the behavior the Variations feature in SharePoint 2010 uses when propagating pages. That is, how pages in the source variation site are copied and appear on target variation sites as minor draft versions.

Page propagation is triggered by publishing a page on the source variation site by default. Each time you publish a source page, the Variations Event Receiver adds a work item to the Variations Propagate Pages timer job queue. When the timer job runs, it will begin executing the first 100 page propagation work items. For each work item, Variations will copy the source page to all target sites, creating the page if it does not yet exist, or appending a draft minor version if the target page does already exist.
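The flow described above can be pictured with a simplified model. This is a Python sketch of the behavior as stated in this post, not the actual timer job: the function name `run_timer_job`, the data shapes, and the "draft N" version labels are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Dict, List

BATCH_SIZE = 100  # per the post, a run begins with the first 100 work items


def run_timer_job(queue: Deque[str],
                  targets: Dict[str, Dict[str, List[str]]],
                  batch_size: int = BATCH_SIZE) -> int:
    """Process up to batch_size queued source pages; return how many were handled."""
    handled = 0
    while queue and handled < batch_size:
        page = queue.popleft()
        for site_pages in targets.values():
            # Create the page on the target if it is missing, then add a draft:
            # a new page gets its first draft, an existing page gets one more.
            versions = site_pages.setdefault(page, [])
            versions.append(f"draft {len(versions) + 1}")
        handled += 1
    return handled


# 150 published pages are waiting; the "de" target already has page0.aspx.
queue = deque(f"page{i}.aspx" for i in range(150))
targets = {"fr": {}, "de": {"page0.aspx": ["draft 1"]}}

print(run_timer_job(queue, targets))  # 100
print(len(queue))                     # 50
print(targets["de"]["page0.aspx"])    # ['draft 1', 'draft 2']
print(targets["fr"]["page0.aspx"])    # ['draft 1']
```

The remaining 50 work items stay in the queue in this model; how the real timer job picks up work beyond the first 100 items is not described here.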

In some cases, users might not want changes to a page on the source to necessarily propagate to all targets. That is, users might want to make source-local changes and have the option to make changes globally applicable when they want. This often takes the form of a question like “How can I stop variations from overwriting my target pages every time I publish a source page?” Variations in SharePoint 2010 helps you do this.

In SharePoint 2010, we’ve worked to improve the Variations feature’s server citizenship by moving all Variations operations to the timer service. This way, server administrators can control the frequency with which operations run and better manage server load. The “Update Variations” button now adds a work item to the same Variations Propagate Pages timer job queue that publishing a page does when “Automatic Creation” is enabled. What differentiates “Update Variations” is that you can also use this button to propagate source draft versions without publishing them on the source variation site.

When you run this PowerShell script to enable “On-Demand Page Propagation,” all Variations Propagate Pages timer job work items are filtered and discarded except those added to the queue by the Update Variations button:

Enable On-Demand Page Propagation:

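# Load the SharePoint object model (run this on a SharePoint server).
[void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint")
$site = New-Object Microsoft.SharePoint.SPSite("http://yourserver/sites/abc")
$folder = $site.RootWeb.Lists["Relationships List"].RootFolder
# Set the flag that turns on On-Demand Page Propagation for this site collection.
$folder.Properties.Add("DisableAutomaticPropagation", "True")
$folder.Update()
$site.Dispose()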

Disable On-Demand Page Propagation:

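# Load the SharePoint object model (run this on a SharePoint server).
[void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint")
$site = New-Object Microsoft.SharePoint.SPSite("http://yourserver/sites/abc")
$folder = $site.RootWeb.Lists["Relationships List"].RootFolder
# Remove the flag so that publishing a source page propagates it again.
$folder.Properties.Remove("DisableAutomaticPropagation")
$folder.Update()
$site.Dispose()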
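As a mental model of what the flag changes, the following Python sketch filters a queue of work items by where they came from. It only illustrates the behavior described in this post and is not how SharePoint is implemented: the `WorkItem` type, its `source` tags and the `items_to_run` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkItem:
    page: str
    source: str  # hypothetical origin tag: "publish" or "update_variations"


def items_to_run(queue: List[WorkItem],
                 disable_automatic_propagation: bool) -> List[WorkItem]:
    """Return the work items the timer job would act on."""
    if not disable_automatic_propagation:
        return list(queue)
    # On-demand mode: keep only items queued by the "Update Variations" button.
    return [item for item in queue if item.source == "update_variations"]


queue = [
    WorkItem("home.aspx", "publish"),
    WorkItem("news.aspx", "update_variations"),
    WorkItem("about.aspx", "publish"),
]

print([i.page for i in items_to_run(queue, disable_automatic_propagation=False)])
# ['home.aspx', 'news.aspx', 'about.aspx']
print([i.page for i in items_to_run(queue, disable_automatic_propagation=True)])
# ['news.aspx']
```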

Figure: “Update Variations” in SharePoint 2010 can be used to propagate the current version of a page on demand, provided that the Variations Propagate Pages timer job is enabled.

In MOSS 2007, some users disabled the “Variations Propagate Pages” timer job in Central Admin as a workaround. With the timer job disabled, publishing a source page would not cause SharePoint to copy the source page to any target page. Authors on the source variation site could then use the “Update Variations” button to propagate the current version of the source page on-demand.

Clicking “Update Variations” in MOSS 2007 immediately propagated the current version of the source page to all target pages, “skipping the line” and bypassing the Variations Propagate Pages timer job. So when the timer job was disabled, “Update Variations” could still be used to propagate pages.

However, when the timer job is disabled, publishing pages on the source continues to add work items to the timer job queue if the “Automatic Creation” option is enabled, as it is by default. Over time, the queue can grow and contain hundreds or thousands of work items, all of which would begin to execute if the Variations timer job were re-enabled in the future. If you upgrade to SharePoint 2010 with a backlog of work items, SharePoint will discard these.

With the new “On-Demand Page Propagation” functionality, you can achieve this content distribution model out of the box.

Figure: “Update Variations” in MOSS 2007 works differently under the hood from its counterpart in SharePoint 2010.

Things to keep in mind:

  • “On-Demand Page Propagation” affects the entire site collection; that is, if you enable this setting, no source page will be copied to any target page when the source page is published. Only the “Update Variations” button will cause pages to propagate when the timer job is enabled.
  • Source pages will be copied as draft minor versions to all target variation sites when you use the “Update Variations” button.

Thanks for reading. Happy propagating!

Josh Stickler
Program Manager
