Thursday, 28 January 2016

Event Queues Part Duo

IMPROVING EVENT QUEUE PROCESSING SPEED

The number one problem is a high number of records in the EventQueue table. As we already know, once an event has been processed by all Sitecore instances, it becomes useless and is eligible for removal.

If the number of records gets too high, the cleanup procedure can fail with a timeout exception, causing MSSQL to bend the knee.

Determine statistics by event type


The following SQL query shows the 'hot' event types and highlights which optimizations to prioritize:

SELECT MIN([Created]) AS [Earliest],
       MAX([Created]) AS [Latest],
       DATEDIFF(day, MIN([Created]), MAX([Created])) AS [Days between min and max],
       COUNT(1) AS [Total],
       [EventType]
  FROM [EventQueue]
 GROUP BY [EventType]
 ORDER BY COUNT(1) DESC -- busiest event types first

'PropertyChangedRemoteEvent' is usually the winner for the core database.
Let's see how we can reduce these numbers.

Reducing number of events

Switching the index property store to the file system

The Sitecore Content Search engine stores indexing metadata in the database property store.

Content Search needs reliable storage for its metadata (e.g. the last indexed item) to avoid re-indexing data that has already been indexed.

This data does not need to be shared between servers, so there is no need to sync its changes.

Using database properties can be costly in terms of performance in this case.
The article 'Sitecore Content Search may cause performance issues due to excessive updates of the EventQueue table' describes how to change the property store.

The #420602 performance optimization (not raising events when code is executed under EventDisabler) is included in 7.2 Update-5.

Logging in to Sitecore interfaces with the 'Remember Me' flag

If the 'Remember Me' flag is not selected when logging in to Sitecore interfaces, a sliding expiration policy is applied, and the user ticket is prolonged on every request.

Since the Sitecore Client security mechanism has to track and share information about simultaneously logged-in users between servers, it has to use properties as well =\

Every ticket prolongation updates a value in the Properties table and thus raises a 'property changed' event.

The #443748 performance optimization, introduced in CMS 7.2 Update 6, noticeably reduces the number of database property calls.

To sum up, always select the 'Remember Me' flag.
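
To check how much property traffic remains after changing login habits or upgrading, a query along these lines can be run against the core database. It is only a sketch: it assumes the type name appears in [EventType], as in the statistics query above, and it groups rows by the hour in which they were created.

```sql
-- Hourly count of property-change events in the EventQueue.
-- Assumption: the event type name is stored in [EventType].
SELECT DATEADD(hour, DATEDIFF(hour, 0, [Created]), 0) AS [Hour],
       COUNT(1) AS [Total]
  FROM [EventQueue]
 WHERE [EventType] LIKE '%PropertyChangedRemoteEvent%'
 GROUP BY DATEADD(hour, DATEDIFF(hour, 0, [Created]), 0)
 ORDER BY [Hour]
```

A steady stream of rows during working hours, with very little at night, would suggest that logged-in users are the main source.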

Reducing number of publish operations

Every publish operation produces 'publish:begin' and 'publish:end' events, and also publishes the 'languages' items.

Even when there are no actual content items to publish, a set of events is produced and the system language items are published.

If you are not going to add more languages to your solution, you can comment out the 'AddLanguagesToQueue' processor in the publish pipeline.

You can also consider reducing the frequency of PublishingAgent executions.

Write less data into EventQueue

MSSQL performs better when rows contain less data.

The #422510 optimization, which allows specifying which changes are added to the event data, is available from CMS 7.2 Update 3.

In short: whenever an item is saved, a list of modified fields with their values is added to the EventQueue.

If you update a field with 500 KB of HTML text, it is serialized and sent to the database.

Needless to say, the database server would appreciate it if we wrote less data.

I will write a separate article describing in more detail how to configure this.
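
In the meantime, a rough way to see which event types carry the heaviest payloads is to measure the stored data per row. The sketch below assumes the serialized event is kept in an [InstanceData] column; adjust the name if your schema differs.

```sql
-- Payload size per event type, largest average first.
-- Assumption: serialized event data lives in an [InstanceData] column.
SELECT [EventType],
       COUNT(1) AS [Total],
       AVG(DATALENGTH([InstanceData])) AS [Avg bytes],
       MAX(DATALENGTH([InstanceData])) AS [Max bytes]
  FROM [EventQueue]
 GROUP BY [EventType]
 ORDER BY [Avg bytes] DESC
```

Event types with a large maximum but a modest average usually point to a few oversized field values rather than a general problem.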

Aggressive cleanup policy

The stock 'CleanupEventQueue' agent was improved to perform cleanup more aggressively starting from 7.2 Update 4 (#392673).

For earlier CMS versions, one can use a reworked stock cleanup task.

How to pick the optimal interval


The interval to keep should be longer than the longest-running operation that uses the EventQueue (e.g. the onPublishEnd indexing strategy).

For example, content indexes are populated with freshly published data when publish:end is raised.

One should keep the EventQueue rows produced by publishing until indexing has finished on all servers.
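
Before settling on an interval, it can help to see how many rows a candidate value would keep and how many it would remove. The query below is a sketch only: @IntervalToKeepHours is a hypothetical variable introduced for illustration, and it assumes [Created] is stored in UTC (switch to GETDATE() if your rows use local time).

```sql
-- Rows older than, and within, a candidate interval to keep.
-- @IntervalToKeepHours is a hypothetical value for illustration (4 hours here).
-- Assumption: [Created] holds UTC timestamps.
DECLARE @IntervalToKeepHours int = 4;

SELECT SUM(CASE WHEN [Created] <  DATEADD(hour, -@IntervalToKeepHours, GETUTCDATE())
                THEN 1 ELSE 0 END) AS [Older than interval],
       SUM(CASE WHEN [Created] >= DATEADD(hour, -@IntervalToKeepHours, GETUTCDATE())
                THEN 1 ELSE 0 END) AS [Within interval]
  FROM [EventQueue]
```

This query only reads data; the actual removal is still left to the cleanup agent.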

Wednesday, 27 January 2016

How to share information between Sitecore Instances

What is the challenge?

Let's say one wants a set of Sitecore instances to modify/update the same data (key-value pairs), e.g. the number of people who have filled in a form, the last task execution time, and so on.
The storage must be persistent, so that an application recycle does not cause any data loss.
The storage must handle high load, so in-memory caching is required.

Can we achieve that without customization?
Can we just use OOB Sitecore functionality?

Technical sketch

  • It makes sense to create a (key-value) table in the database.
  • The caching layer must respect data changes and have a defined size limit.
  • All interactions should go through a database provider, so one can handle a change of database engine and create tests if needed.


How to do it inside Sitecore?

Sitecore CMS has a Database Properties mechanism that encapsulates everything mentioned above:

Sitecore.Configuration.Factory.GetDatabase("core").Properties["customKey"] = value;

This mechanism is used internally, e.g. to maintain Sitecore Client user tickets, and publishing and indexing metadata.

The Database Properties logic is inside the 'Sitecore.Data.DataProviders.Sql.SqlDataProvider' class ('GetPropertyCore', 'SetPropertyCore', and 'RemovePropertyCore' methods), so please feel free to check the exact implementation with any reverse-engineering tool.


Implementation details

Each Sitecore database has a 'Properties' table that represents the key-value storage.
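
A quick way to look at what the storage currently holds is to query the table directly. This is a minimal sketch that assumes the table exposes [Key] and [Value] columns; check your own schema before relying on it.

```sql
-- Peek at the first stored properties, ordered by key.
-- Assumption: the table has [Key] and [Value] columns.
SELECT TOP 20 [Key], [Value]
  FROM [Properties]
 ORDER BY [Key]
```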

Interactions with the database are done through a DataProvider (defined in web.config under the dataProviders node).

A 'database property changed' event is added to the EventQueue once a property value is changed. As a result, the other Sitecore instances evict the modified property from their cache and reload it from the database on the next call. The following SQL can be used to check what is written into the EventQueue:

SELECT * FROM [EventQueue] WHERE [InstanceType] LIKE '%PropertyChangedRemoteEvent%'

The property cache size is controlled by the hidden 'Caching.DefaultPropertyCacheSize' setting and equals 500KB by default.

Summary

Sitecore provides a key-value storage synchronized across all instances out of the box.
The price of forwarding a modification from one instance to the others is an extra row in the EventQueue.
Frequent property modifications can produce a large number of EventQueue entries, so please use the feature wisely.