Monday, June 14, 2010

Data's Final Resting Place

I was recently asked where SharePoint documents go when they are deleted. The answer is tiered: like so many other questions, it depends on which data you want to know about.



Documents:

When a user deletes a document from a document library, it is a two-stage process. The document is first sent to the site recycle bin. After 30 days (by default), the document is permanently removed from the site and the underlying database.



If the document is deleted by the user from this recycle bin before the document is removed automatically, it is sent to the Site Collection's recycle bin. At this point, the Site Collection administrator can restore it if necessary.
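The two stages can be sketched as a small state model. This is only an illustration of the flow described above, not SharePoint code: the names Stage, Document, user_delete, expire and admin_restore are made up for the example, and the 30-day value is the default mentioned in this post.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Stage(Enum):
    LIVE = "in the document library"
    SITE_BIN = "site recycle bin"
    SITE_COLLECTION_BIN = "Site Collection recycle bin"
    PURGED = "permanently removed"


RETENTION = timedelta(days=30)  # default retention mentioned in the post


@dataclass
class Document:
    name: str
    stage: Stage = Stage.LIVE
    deleted_at: Optional[datetime] = None


def user_delete(doc, now):
    """A user delete moves the document one stage further along."""
    if doc.stage is Stage.LIVE:
        doc.stage = Stage.SITE_BIN
        doc.deleted_at = now
    elif doc.stage is Stage.SITE_BIN:
        doc.stage = Stage.SITE_COLLECTION_BIN


def expire(doc, now):
    """Automatic cleanup: purge a site-bin document once retention has passed."""
    if doc.stage is Stage.SITE_BIN and now - doc.deleted_at >= RETENTION:
        doc.stage = Stage.PURGED


def admin_restore(doc):
    """A Site Collection administrator restores a second-stage document."""
    if doc.stage is Stage.SITE_COLLECTION_BIN:
        doc.stage = Stage.LIVE
        doc.deleted_at = None
```

Calling user_delete twice on the same document walks it from the library to the site recycle bin and then to the Site Collection recycle bin, where admin_restore can bring it back.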

While the document sits in a recycle bin, it is only marked for deletion. If you peek into the database and look at the dbo.AllDocs table, you will find important information in the following fields:


  • ID: a unique identifier that holds the GUID of the document

  • SiteID: another unique identifier that holds the GUID of the Site Collection the document came from

  • DirName: the directory where the document is stored. Even if the document is deleted, this field is still valid. SharePoint uses it as the location to restore the document to.

  • WebId: a unique identifier that holds the GUID of the subweb site in which the document was stored.

  • ListID: the unique identifier that holds the GUID of the list the document came from. Since document libraries are modified lists under the hood, you can query the dbo.AllLists table with this value to get the reference to the list.

  • DeleteTransactionId: another unique identifier, this one holding the delete transaction identifier (not a GUID). This field is set to 0x if the document is not in the recycle bin.

There are other fields that are quite valuable, so take a look at this page for more information.
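To make the DeleteTransactionId check concrete, here is a minimal sketch that uses an in-memory SQLite table as a stand-in for dbo.AllDocs. Everything in it is an assumption for illustration: the table holds only the fields listed above, the GUID-like values and paths are invented, and an empty blob plays the role of 0x. Against a real content database the same idea would be a filter on rows whose DeleteTransactionId is not 0x.

```python
import sqlite3

# Stand-in for a content database: only the fields named in this post.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE AllDocs (
        ID TEXT,
        SiteID TEXT,
        WebId TEXT,
        ListID TEXT,
        DirName TEXT,
        DeleteTransactionId BLOB
    )
    """
)

# Invented sample rows: one live document and one sitting in a recycle bin.
rows = [
    ("doc-guid-1", "site-guid", "web-guid", "list-guid",
     "sites/team/Shared Documents", b""),            # empty blob stands in for 0x
    ("doc-guid-2", "site-guid", "web-guid", "list-guid",
     "sites/team/Shared Documents", b"\x01\x02\x03\x04"),
]
conn.executemany("INSERT INTO AllDocs VALUES (?, ?, ?, ?, ?, ?)", rows)


def recycled_documents(connection):
    """Return (ID, DirName, ListID) for documents marked for deletion."""
    query = (
        "SELECT ID, DirName, ListID FROM AllDocs "
        "WHERE length(DeleteTransactionId) > 0"
    )
    return connection.execute(query).fetchall()


recycled = recycled_documents(conn)
```

The DirName and ListID returned for each recycled row are the values you would use to work out where the document lived and which list it belonged to.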


Site Collection and Web Site


The way to truly purge the documents from your database without waiting for administrative action is to delete the entire web site. This purges all of the documents in your site, including their references in the Site Collection recycle bin. Gone. Poof.