by Robert Martin, Lead Principal Consultant
Given the complexity of integrating Microsoft Exchange and IBM Domino environments in a cohesive coexistence solution, it’s not surprising that one thing Messaging Engineers struggle with is providing for Highly Available and Load Balanced designs in their coexistence infrastructure.
In the following post, I will describe the various techniques for providing both High Availability and Load Balancing in a coexistence design utilizing the Binary Tree CMT for Coexistence toolset. I’ll discuss the major components of coexistence and how each component can be configured to provide the necessary level of availability and throughput required by the customer. The major components are:
In some instances, providing for one need (such as load balancing) will naturally follow through with the other (high availability). In other instances, this will not be the case, and each need will have to be configured and will operate independently of the other.
Load Balancing and Fault Tolerance of Mail and Calendar Message Routing
This post will discuss setting up a load-balanced solution within a single Domino Domain. I will discuss leveraging it for multiple Domino Domains in a future post. Because CMT for Coexistence utilizes native transport protocols for both IBM Domino and Microsoft Exchange, the techniques for providing load balancing differ significantly depending on the direction of message flow. The following describes how to configure load balancing in each direction:
Domino to Exchange Message Load Balancing
In most coexistence scenarios, the message traffic requirements and the duration that coexistence must be maintained do not warrant the use of more than one CMT for Coexistence Domino server acting as the messaging coexistence gateway. Other scenarios, however, may require the deployment of more than one CMT for Coexistence Domino server to serve a Domino Domain.
An example of one such instance may be when the Domino environment is geographically separated by slow or saturated WAN links. If the corresponding Exchange servers are deployed to mirror the Domino environment, it may not make sense for a message from one site to traverse a slow WAN link in order to cross the coexistence gateway, only to have to traverse the same WAN link to be relayed to the intended recipient. This is extremely inefficient and a large amount of traffic can quickly overwhelm the link. Optimally, the message traffic should be relayed to a coexistence server within the same site as illustrated below.
Another scenario may be that the high volume of mail traffic necessitates the deployment of multiple Messaging Coexistence servers. Although a properly sized and configured Domino Coexistence server can easily process and relay more than 20,000 messages per hour, extremely large organizations can quickly reach this level of messaging traffic once they’ve migrated a large number of their users to Microsoft Exchange. In this scenario, a single Domino Coexistence gateway is not enough to handle the amount of mail traffic generated. Message traffic from mail servers in the environment needs to be balanced across multiple coexistence servers as illustrated below:
Both scenarios involve deploying multiple Coexistence servers within the Domino Domain and then directing each subset of mail servers to use one specific coexistence server and not the others.
Because CMT for Coexistence requires that Notes routing (NRPC) be used between Domino mail and coexistence servers, deploying multiple coexistence servers requires creating multiple Foreign Domain documents. Since there is no field within a Foreign Domain document to restrict its use to a particular mail server, and because the Mail Server field within that document does not support Domino Cluster or Server Group names, we must limit which Foreign Domain document is replicated to any given mail server.
Limiting the replication of a Foreign Domain document is accomplished by setting the $Readers field within the document to allow only specific servers to have read access to it. There are three (3) other standard readers fields on the Foreign Domain document that may allow a user or server to read the document. They are the:

- DocumentAccess field
- LocalAdmin field
- ListOwner field
The DocumentAccess field is set to [NetModifier] by default. The LocalAdmin and ListOwner fields correspond to the Owner and Administrators fields on the Administration tab. Because all servers within a Domino Domain are members of the LocalDomainServers group by default, and that group is assigned the [NetModifier] role, it is necessary to change the DocumentAccess field to some other value to keep the document from replicating to all servers in the Domain. Unfortunately, there is no direct way to edit this field, as it is not included on the Foreign Domain document form. A simple action agent can be written to set this field to another value, such as the name of a central distribution hub server.
The following steps describe how to deploy two (2) different Foreign Domain documents within a Domino Domain in support of the mail flow shown in the illustration above:
1. Create an empty Domino Directory based on the Domino Directory template on the local workstation. This database will be used to create the Foreign Domain document. The agent to modify the DocumentAccess field of the document will be run against this database and not the production Directory. Creating all the Foreign Domain documents in this “staging” directory will allow you to verify all the readers fields prior to copying them into production.
2. Create a Foreign Domain document for each coexistence server. All documents should have the same Foreign Domain name.
3. Be sure that both documents contain information on the Calendar Information tab, as each mail server in the environment will only have one document in its local Domino Directory.
4. Verify that the Owners and Administrators fields on the Administration tab are correctly set to those users or user groups that will have authority to administer the Foreign Domain documents.
5. In the Document Properties dialog for each Foreign Domain document, clear the “All readers and above” checkbox, then select the servers or server group that the selected Foreign Domain document should replicate to.
6. Create a simple action agent to modify the DocumentAccess field of the Foreign Domain document. In the below example, ACME_APPS/ACME is the central distribution hub server that all other servers replicate their Domino Directory from. It does not contain any mail files or applications that route to the Exchange environment represented by the Foreign Domain documents.
7. Run the agent against the Foreign Domain documents you’ve created.
8. Verify that the DocumentAccess field is correctly set in each document.
9. Once you have verified that the $Readers, DocumentAccess, Owner, and Administrators fields are correctly set, copy the documents from the “staging” database to the production Domino Directory. Depending on the replication topology of the Domino environment, you may only need to copy these documents to a central distribution server, or you may have to selectively copy the documents to specific servers in order to ensure that they can properly replicate throughout the environment.
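The simple action agent referenced in step 6 can be as small as a one-line formula agent set to run on selected documents. This is only a sketch; the hierarchical name CN=ACME_APPS/O=ACME corresponds to the example hub server above and should be replaced with the name of your own distribution server:

```
REM {Sketch: replace the default [NetModifier] role in DocumentAccess with the hub server's name};
FIELD DocumentAccess := "CN=ACME_APPS/O=ACME";
```

Running this against the staged Foreign Domain documents leaves the hub server, plus the entries in the $Readers, Owners, and Administrators fields, as the only principals able to read and replicate them.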
It is important to recognize that this solution provides only for load balancing. Since each Domino Mail server receives only one of the defined Foreign Domain documents, it is only aware of the Domino Coexistence server defined in that document. If that Domino Coexistence server is unavailable, the downstream mail server has no reference to the other coexistence server to fail over to, and mail routing to Exchange is interrupted.
In situations where fault tolerance is required, a second coexistence server is usually set up and running, but most traffic flows through only one “primary” coexistence server at any given time. Details on setting up Fault Tolerance will be described in the next post.
The above Foreign Domain document settings address how downstream Domino Mail servers can load balance their NRPC-based mail traffic between multiple Domino Coexistence servers, but what about the Domino Coexistence servers themselves? How can we load balance traffic from the Domino Coexistence servers to the Exchange Hub Transport servers?
There are three (3) techniques we can employ to perform this type of load balancing:
DNS ‘A’ Record Round-Robining: With this technique, we define an ‘A’ record within DNS with an arbitrary host name we want to use for connecting to the Exchange Hub Transport servers. We assign that ‘A’ record the IP address of one of the Hub Transport servers. Then, we create another ‘A’ record with exactly the same hostname, but this time assign it the IP address of a different Hub Transport server. We keep adding ‘A’ records until all the Hub Transport servers we wish to use for Domino to Exchange mail routing are included. The hostname we used in creating the ‘A’ records is then defined within the SMTP Connection Document as the target SMTP MTA relay host. Using this method, the Domino Coexistence server will get a different IP address each time it queries DNS for the hostname. This method does have a couple of drawbacks, however. First, unless intelligent DNS is used, the IP address returned by the DNS server is not guaranteed to be available if that Hub Transport server is experiencing connectivity issues. Second, the Domino server will cache the IP address for the hostname it queries from DNS for a period of time. This means that subsequent connections to the hostname will be made to the same cached IP address until that cached entry times out.
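As a rough sketch, the round-robin entries might look like the following zone-file fragment. The hostname coexsmtp.acme.com and the addresses are illustrative assumptions, not values from the product documentation:

```
; Two 'A' records sharing one hostname; the DNS server rotates the order of answers
coexsmtp.acme.com.    IN    A    10.0.1.10    ; first Hub Transport server
coexsmtp.acme.com.    IN    A    10.0.1.11    ; second Hub Transport server
```

The name coexsmtp.acme.com would then be entered as the SMTP MTA relay host in the SMTP Connection Document.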
For a more reliable load balancing technique we have to use the next method:
DNS ‘MX’ Record: Using this technique, we define ‘MX’ records within DNS under an arbitrary host name that we will use in connecting to the Exchange Hub Transport servers. Instead of multiple ‘A’ records each resolving to a different Hub Transport server, we create one ‘MX’ record per Hub Transport server, all under the same hostname. An ‘MX’ record set under a single hostname can encapsulate any number of Hub Transport server hostnames and accepts a preference value for each. Like the DNS ‘A’ Record technique, we define the hostname of the ‘MX’ records within the SMTP Connection Document’s SMTP MTA relay host field. The advantage of this method is that the Domino server will attempt to connect to subsequent entries in the ‘MX’ record set if the first connection attempt fails, meaning that fault tolerance is also provided. Unfortunately, the Domino Coexistence server will still cache a successful connection and will only re-query when that cache times out. To get the most robust load balancing, the third method must be employed:
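A zone-file sketch of this approach, again using the illustrative name coexsmtp.acme.com and hypothetical Hub Transport hostnames, might look like this:

```
; MX entries under one hostname; equal preference values (10) distribute connections,
; and a surviving entry is tried if the first connection attempt fails
coexsmtp.acme.com.          IN    MX    10    hubtransport01.acme.com.
coexsmtp.acme.com.          IN    MX    10    hubtransport02.acme.com.
hubtransport01.acme.com.    IN    A     10.0.1.10
hubtransport02.acme.com.    IN    A     10.0.1.11
```

Giving one entry a lower preference value would make it the preferred target, with the higher-numbered entries used as fallbacks.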
Network Load Balancing (NLB): This technique involves setting up either a software or hardware load balancing device in front of the Hub Transport servers involved with Domino to Exchange mail routing. Although this technique requires the most infrastructure setup, it also provides the most robust load balancing solution, as every aspect of how message traffic is relayed from the Domino Coexistence servers to the Hub Transport servers can be controlled. The state of each target Hub Transport server is monitored by the load balancing device, so no connection attempts are made to a down server. Since all Hub Transport servers are represented as a single virtual IP address and hostname, any caching the Domino server does with this information has no effect on the load balancing. Configuration of the SMTP Connection document is similar to the above two methods: the SMTP MTA relay host field is populated with the IP address or hostname assigned to the Network Load Balanced cluster.
Hopefully this post helps you understand the various methods involved in load balancing message traffic from the Domino to the Exchange messaging environment. As I mentioned above, the next post will discuss how Fault Tolerance is configured in the environment, particularly in traffic from downstream Domino Mail servers to the Domino Coexistence server. I will also discuss methods of combining Load Balancing and Fault Tolerance within a single implementation.
7/21/2011 9:30:00 AM