logo Professional
Database
Developers
Aldex Software Ltd.
SQL Server Replication

Introduction:

This page is intended to provide a low-tech overview of SQL Server Replication, concentrating on the concepts and reasons for replication.


What is replication for?

Replication gives you the ability to have the same data at more than one location. This is useful in a number of different sceanrios.Replicated pine trees

The connections between sites may be....

  • Of poor quality
  • Slow (eg low bandwidth)
  • Expensive
  • Temporary (eg a dial-up user)

For example you may have several locations within a country, or between countries, but the WAN links between them are fairly slow and are in any case frequently overloaded. However you still need to be able to use a responsive and reliable database at each site.

Another example is the mobile salesforce who needs to be able to access the database from their laptop when they are at a client's site.

Replication can also be used to partition or consoidate data. For example at your Birmingham office you only want to hold Birmingham data, you do not want Glasgow or Manchester data, whilst at your London head office you need consolidated data from all of your sites.

Another big use for replication is to spread the processing load across more than one server in order to make individual servers more responsive. Typically you might have a second server + database that is used for reporting or analysis purposes so that these activities do not reduce the performance of your live database.

The last main use for replication is in backup/disaster recovery/high availability situations.



Replication comes in several flavours...

There are three fundamental types of SQL Server replication.

Snapshot replication is the simplest, but generally the least useful. In effect it gives you a read only copy of the database and each time you replicate you get a complete new copy sent over. This is similar in concept to restoring a database from a backup onto a different server. Snapshot replication tends to be used with smallish data sets that are relatively static. For example if you published a new price list every 6 months and the prices between updates were fixed. Snapshot replication is also used with the other two types of replication in order to give a common initial starting point.

Transactional replication commences with a snapshot replication and then only transfers data that has changed since the last synchronisation. This is good for read-only situations where information flows out from the centre on a frequent basis, such as distributing price or stock lists where the individual items are changing daily or for where you need a seperate server for reporting and analysis purposes.

There are also two related forms of Transactional Replication which allow data to flow in both directions (ie users at a replicated site can update their copy of the database and the changed data will appear on the central server as well as on all the other replicated servers). These are called Immediately Updating and Queued Updating Transactional Replication. They are mainly used where database consistency needs to be high, i.e. where the data held in different replicated databases must all be in sync with each other. Such scenarios are typically found in booking systems where you must ensure that you do not book the same holiday/flights/parts/room/etc. to different people at the same time. If you make a booking it is essential that that booking is quickly reflected in all of the other databases so that the same item cannot be double booked. Updating transactional replication requires a high quality infrastructure with reliable connections between all of the sites.

Merge replication is used where updates need to be applied at remote sites as well as at the central site. With merge replication each databases can function independently of each other (ie autonomously) so that you always have full database functionality, including the ability to update the database, even if there is no connection back to the central server. The downside being that each site will tend to be slightly out of synch and hence all of the sites may have slightly different versions of data (low consistency). Additionally we need to consider and resolve conflicts, when updates for the same item occur at two different locations and then conflict when they are merged together.

Merge replication is the most complex of the three but it is also a powerful and flexible option.


Replication is complicated...

Unfortunately replication is not straightforward; it requires a good underlying knowledge of SQL Server and database design and there are a large number of decisions, settings and options that need to be correctly selected and configured for your specific situation before replication will work reliably and consistently. You need to consider issues such as latency (the time lag before a change is synchronised to other servers), autonomy (how independent a server is) and consistency (the degree of synchronisation between the databases). There are also a number of components outside of SQL Server including firewalls, the Distributed Transaction Coordinator (MSDTC), security account settings, permissions, etc. that can all prevent replication from working correctly.However, when set up correctly, and in a suitable environment, SQL Server Replication is a scalable and very reliable solution to many otherwise difficult to solve problems.

Some of the common decisions that need to be made include:

  • Transactional, Merge or Snapshot replication?
  • Which server/disk drives should we place the distributor on?
  • What publications do we need? All tables in one publication or several? Articles?
  • Do we need partitioning? Horizontal and/or Vertical Filters? What about dynamic Filters?
  • Where should the snapshots be placed?
  • What profile should the agents run under?
  • Push or Pull subscriptions? Anonymous subscriptions?
  • Immediately Updating or Queued? Or neither? Or Merge?
  • Update frequency?
  • How should we set the firewall up?
  • Do we need to enter data at multiple sites?
  • What conflict resolver should we use? How do we write a custom conflict resolver?
  • What latency is acceptable?
  • Availability? When are the servers taken down for maintenance?
  • How do we design our database to allow for replication?
  • What expiry time should we set for our subscriptions

What to do if you have a problem with replication...

SQL Server Replication is often seen as a bit of a black art and something that even experienced SQL administrators steer clear of. We appear to be one of the few companies willing to provide assistance with SQL Replication problems. If you need such help then please contact us. You can also look at our SQL Server Replication Support page.

Separator
Copyright ©2004, Aldex Software Ltd.

logo
Return to front page