This section lists some common issues that you might encounter when using Network Load Balancing (NLB) clusters.
What problem are you having?
- After installing Network
Load Balancing and restarting a cluster host, a message appears:
"The system has detected an IP address conflict with another system
on the network..."
- There is no response when
you use ping to access the cluster's IP address from an outside
network.
- There is no response when
you use ping to access a host's dedicated IP addresses from another
cluster host.
- When attempting to use
Network Load Balancing Manager to connect to a host in your
cluster, you receive the error "Host unreachable."
- When using Telnet or
attempting to browse a computer outside the cluster from a cluster
host, there is no response.
- When invoking the Network
Load Balancing remote control commands from a computer outside the
cluster, there is no response from one or more cluster
hosts.
- There is no reply when you
use the dedicated IP address of a host to specify it as a target
for a remote control command. However, specifying the host by its
priority (ID) works.
- Connectivity to the cluster
is denied to some users, but not all.
- You cannot view or change
the Network Load Balancing properties by using net config and
Windows Management Instrumentation (WMI).
- An unusual number of TCP
connections to the cluster's IP address are being reset by the
server or the client.
- Virtual Private Network
(VPN) calls fail when you make a change that causes convergence
(such as adding a host, removing a host, or draining a
host).
- After the cluster hosts
start, they begin converging, but they never complete
convergence.
- The cluster moves in and
out of a converged state.
- After the cluster hosts
start, Network Load Balancing reports that convergence has
finished, but more than one host is a default host.
- Network Load Balancing is
not load balancing applications, and the default host handles all
network traffic.
- Traffic alternates
unexpectedly between the cluster hosts, and it breaks TCP
connections.
- Network traffic does not
appear to load balance evenly among the cluster hosts.
- When you are using Network
Load Balancing with Microsoft Internet Security and Acceleration
(ISA) Server, one cluster host logs blocked packets that are
directed to the dedicated Internet Protocol (IP) address of another
host.
- You are unable to create a
Network Load Balancing cluster in a 64-bit version
environment.
After installing Network Load Balancing and restarting a cluster host, a message appears: "The system has detected an IP address conflict with another system on the network..."
- Cause: The same IP address already
exists on the network.
- Solution: Choose a new IP address, or
remove the duplicate address.
- Cause: You have configured different
cluster operation modes (Unicast or Multicast) on the
hosts, which causes two different MAC addresses to map to the same
IP address.
- Solution: Ensure that all hosts are
configured with the same cluster operation mode.
- Cause: You configured the cluster's IP
address before NLB was bound to the network adapter.
- Solution: Remove the cluster's IP
address from TCP/IP properties, enable NLB on the proper adapter,
and then configure the cluster's IP address.
- Cause: You added the cluster's IP
address to a network adapter that has not been enabled for NLB.
- Solution: Remove the cluster's IP
address from the incorrect adapter's TCP/IP properties, enable NLB
on the proper adapter, and then configure the cluster's IP
address.
For more information about enabling NLB, see Installing Network Load Balancing.
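The mixed-mode cause above is easier to see once the cluster MAC addresses are written out. The following is a minimal sketch, assuming the commonly documented convention that NLB derives the cluster MAC address from the cluster IPv4 address with an 02-BF prefix in unicast mode and an 03-BF prefix in multicast mode. That convention is background knowledge rather than something this topic states, and the function name `nlb_cluster_mac` is hypothetical, so verify the result against the address shown in your cluster's properties.

```python
import ipaddress


def nlb_cluster_mac(cluster_ip: str, mode: str) -> str:
    """Return the conventional NLB cluster MAC for an IPv4 cluster address.

    Assumption: unicast uses the prefix 02-BF and multicast uses 03-BF,
    followed by the four octets of the cluster IP address.
    """
    octets = ipaddress.IPv4Address(cluster_ip).packed  # 4 bytes of the address
    if mode == "unicast":
        mac = bytes([0x02, 0xBF]) + octets
    elif mode == "multicast":
        mac = bytes([0x03, 0xBF]) + octets
    else:
        raise ValueError(f"unknown cluster operation mode: {mode!r}")
    return "-".join(f"{b:02X}" for b in mac)


# Two hosts that share a cluster IP but use different modes advertise
# two different MAC addresses for that one IP address.
print(nlb_cluster_mac("10.0.0.50", "unicast"))    # 02-BF-0A-00-00-32
print(nlb_cluster_mac("10.0.0.50", "multicast"))  # 03-BF-0A-00-00-32
```

When every host uses the same mode, both calls would use the same `mode` argument and return the same address, which is the state the solution asks for.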
There is no response when you use ping to access the cluster's IP address from an outside network.
Verify that you can use ping to access the dedicated IP addresses for the cluster hosts from a computer outside the router. If this test fails, and you are using multiple network adapters, the issue is not related to NLB. If you are using a single network adapter for the dedicated and cluster IP addresses, consider the following causes:
- Cause: If you are using multicast
support, you might find that your router has difficulty resolving
the primary IP address into a multicast media access control (MAC)
address by using the Address Resolution Protocol (ARP).
- Solution: Verify that you can use
ping to access the cluster from a client on the cluster's
subnet and to access the cluster hosts' dedicated IP addresses from
a computer outside the router. If these tests work properly, the
router is probably at fault. You should be able to add a static ARP
entry to the router to circumvent the issue. You can also turn off
NLB multicast support and use a unicast network address without a
hub.
- Cause: When using NLB in multicast or
unicast mode, routers need to accept proxy ARP responses
(IP-to-network address mappings that are received with a different
network source address in the Ethernet frame).
- Solution: Make sure that your router
has proxy ARP support turned on. Alternatively, if you want to keep
proxy ARP support disabled, add a static ARP entry to the router.
- Cause: Internet Control Message
Protocol (ICMP) traffic to the cluster is blocked by a router or
firewall.
- Solution: Allow ICMP traffic through
the router or firewall. Be aware that this may expose your system
to additional security risk.
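The ping checks in this issue form a small decision procedure. The sketch below only restates that procedure as code; it does not send any packets. The function name, parameter names, and result strings are hypothetical, and the booleans are the results of ping tests that you run yourself.

```python
def diagnose_cluster_ping(cluster_from_subnet: bool,
                          dedicated_from_outside: bool,
                          multiple_adapters: bool) -> str:
    """Map the ping results described above to the conclusion the text draws.

    cluster_from_subnet:    ping to the cluster IP from a client on the
                            cluster's subnet succeeded
    dedicated_from_outside: ping to the hosts' dedicated IPs from a computer
                            outside the router succeeded
    multiple_adapters:      the hosts use more than one network adapter
    """
    if not dedicated_from_outside:
        if multiple_adapters:
            return "not NLB-related"
        return "consider the listed causes"
    if cluster_from_subnet:
        # Both tests pass, yet the cluster IP is unreachable from outside.
        return "router suspected"
    return "inconclusive"


print(diagnose_cluster_ping(True, True, False))
```

A "router suspected" result corresponds to the static ARP entry workaround described in the first solution.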
There is no response when using ping to access a host's dedicated IP addresses from another cluster host.
- Cause: When using NLB in multicast or
unicast mode, routers need to accept proxy ARP responses
(IP-to-network address mappings that are received with a different
network source address in the Ethernet frame).
- Solution: Make sure that your router
has proxy ARP support turned on. Alternatively, if you want to keep
proxy ARP support disabled, add a static ARP entry to the router.
- Cause: Internet Control Message
Protocol (ICMP) traffic to the cluster is blocked by a router or
firewall.
- Solution: Allow ICMP traffic through
the firewall or router. Be aware that this may expose your system
to additional security risk.
When attempting to use Network Load Balancing Manager to connect to a host in your cluster, you receive the error "Host unreachable."
- Cause: Internet Control Message
Protocol (ICMP) traffic to the host is either blocked by a router or
firewall or disabled on the host's network adapter.
- Solution: Enable ICMP on the host's
network adapter or allow ICMP traffic through the firewall or
router. Be aware that this may expose your system to additional
security risk. You can also use NLB Manager's /noping
option.
When using Telnet or attempting to browse a computer outside the cluster from a cluster host, there is no response.
- Cause: Verify that you can use
ping to access the computer outside the cluster. If this
test is successful, you might not have listed the host's dedicated
IP address first in the TCP/IP properties.
- Solution: If ping fails to access the
computer outside of the cluster, refer to the ping issues
described earlier in this Troubleshooting topic.
When invoking the Network Load Balancing remote control commands from a computer outside the cluster, there is no response from one or more cluster hosts.
- Cause: Remote control commands are not
being sent to the cluster's IP address.
- Solution: Commands must be sent to the
cluster's primary IP address, which was assigned in the Network
Load Balancing Properties dialog box. Be sure that you send
remote commands to the correct IP address.
- Cause: The remote control traffic is
being encrypted by Internet Protocol security (IPSec). NLB remote
control commands do not work correctly when the computer that sends
them has IPSec configured to encrypt the remote control traffic.
- Solution: Disable IPSec.
For more information, see the Internet Protocol Security (IPSec) Help content.
- Cause: NLB UDP control ports are
protected incorrectly by a firewall. By default, remote control
commands are sent to UDP ports 1717 and 2504 at the cluster IP
address.
- Solution: Be sure that these ports
have not been blocked incorrectly by a router or firewall. You can
also change the port number by modifying the corresponding NLB
parameter.
There is no reply when you use the dedicated IP address of a host to specify it as a target for a remote control command. However, specifying the host by its priority (ID) works.
- Cause: None of the hosts have a
dedicated IP address.
- Solution: Assign a dedicated IP
address to each host. For more information, see Configure Network Load
Balancing Host Parameters.
Connectivity to the cluster is denied to some users, but not all.
- Cause: An application that is being
load balanced is not responding.
- Solution: This is an
application-specific issue that is not related to NLB. Refer to
your application's documentation to correct this issue. You may
need to stop and restart the application.
- Cause: If your cluster is configured
for unicast mode, a switch might have learned the NLB network
adapter's MAC address.
- Solution: Clear the switch's port to
MAC address mapping.
- Cause: The cluster's IP address was
not added to TCP/IP on one or more of the hosts.
- Solution: If you do not use NLB
Manager to configure your cluster, you must manually configure
TCP/IP with the cluster's IP address.
- Cause: A host is leaving the cluster
because of a drainstop or stop command, but
convergence did not complete correctly.
- Solution: Wait for the convergence to
complete. If the convergence does not complete, see the following
issue later in this Troubleshooting topic:
After the cluster hosts start, they begin converging, but they never complete convergence
You cannot view or change the Network Load Balancing properties by using net config and Windows Management Instrumentation (WMI).
- Cause: To view or change Network
Load Balancing properties, you must be a member of the
Administrators group.
- Solution: Log on as a user who is in
the local Administrators group of the computer that is running
NLB.
An unusual number of TCP connections to the cluster's IP address are being reset by the server or the client.
- Cause: HTTP keep-alive values are
enabled on the NLB hosts, and clients that use keep-alive values are
connecting to the cluster.
- Solution: Disable HTTP keep-alive
values. For more information about HTTP keep-alive values and
Internet Information Services (IIS), refer to the IIS documentation
set.
To view the IIS documentation set from your desktop, install IIS, then click Start, click Run, and type the following command in the Open text box:
%windir%\help\iisrv.chm
- Cause: Low system resources on the
server are causing TCP to reject the connections.
- Solution: Free system resources, for
example by adding system memory or closing unnecessary
applications.
- Cause: The cluster has diverged into
two separately converged clusters, which causes more than one node
to claim ownership of every connection.
- Solution: Remove the two clusters,
then recreate a single cluster.
Virtual Private Network (VPN) calls fail when you make a change that causes convergence (such as adding a host, removing a host, or draining a host).
- Cause: When using NLB to load balance
VPN traffic, you must configure the port rules that govern the
ports handling the VPN traffic (TCP port 1723 for PPTP/GRE and UDP
port 500 for IPSec/L2TP) to use either Single or
Network affinity.
- Solution: Configure the port rules
that govern ports 500 and 1723 to use Single or
Network affinity. For more information, see Network Load Balancing
Manager Properties.
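The affinity requirement above can be checked mechanically once the port rules are written down. The following is a minimal sketch, assuming a hypothetical `PortRule` record that you fill in by hand from the Network Load Balancing Properties dialog box; it does not read the rules from NLB. The port numbers and affinity names come from the cause and solution above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortRule:
    """Hypothetical record of one port rule, transcribed by hand."""
    start_port: int
    end_port: int
    protocol: str  # "TCP", "UDP", or "Both"
    affinity: str  # "None", "Single", or "Network"


# Ports named in this topic for VPN traffic.
VPN_PORTS = (("TCP", 1723), ("UDP", 500))


def vpn_affinity_problems(rules):
    """Return a list of messages for VPN ports lacking Single/Network affinity."""
    problems = []
    for protocol, port in VPN_PORTS:
        covering = [r for r in rules
                    if r.start_port <= port <= r.end_port
                    and r.protocol in (protocol, "Both")]
        if not covering:
            problems.append(f"{protocol} {port}: no port rule covers this port")
        for rule in covering:
            if rule.affinity not in ("Single", "Network"):
                problems.append(
                    f"{protocol} {port}: affinity is {rule.affinity}, "
                    "expected Single or Network")
    return problems


rules = [PortRule(1723, 1723, "TCP", "None"),
         PortRule(500, 500, "UDP", "Single")]
for message in vpn_affinity_problems(rules):
    print(message)
```

An empty result means that every rule covering ports 500 and 1723 already uses Single or Network affinity.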
After the cluster hosts start, they begin converging, but they never complete convergence.
- Cause: The cluster hosts were
configured with a different number of port rules or with
incompatible port rules, which prevents convergence.
- Solution: Open the Network Load
Balancing Properties dialog box on each cluster host and verify
that all hosts have identical port rules.
- Cause: You have a bad network adapter
or cable.
- Solution: Use the ping command
to test connectivity. Enter the host's fully qualified domain name.
You can also learn more about the issue by using the ping
command to contact your domain controller by IP address and other
network servers by name and by IP address.
- Cause: Duplex settings on a switch or
hub are mismatched.
- Solution: Confirm that the duplex
settings in each of your switches and hubs are configured
appropriately.
- Cause: The dedicated IP address that
you used for one of the hosts already exists on the network.
- Solution: Choose a new IP address, or
remove the duplicate address.
- Cause: Your cluster contains hosts
that are running Windows 2000.
- Solution: Your cluster must be running
Windows Server 2008 on all hosts. An NLB cluster environment
that contains hosts with Windows Server 2003 and Windows
Server 2008 is supported only when performing a rolling
upgrade to Windows Server 2008. Mixing Windows
Server 2003 and Windows Server 2008 in the same cluster
is not supported for long periods of time.
- Cause: You have configured different
cluster operation modes (unicast and multicast) on the hosts.
- Solution: Use NLB Manager to ensure
that all hosts are configured with the same cluster operation
mode.
Note: You can also view the Windows event logs to check for errors and warnings. For more information, see Installing Network Load Balancing.
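Comparing port rules by eye across several hosts is error-prone, so the first solution in this issue can be supported with a small comparison script. This is a sketch under stated assumptions: each rule is a hand-transcribed tuple of (start port, end port, protocol, filtering mode, affinity), the host names are placeholders, and the function name `port_rule_mismatches` is hypothetical. It does not query NLB itself.

```python
def port_rule_mismatches(rules_by_host):
    """Compare every host's port rules with those of the first host by name.

    rules_by_host maps a host name to a list of rule tuples.
    Returns {host: {"missing": [...], "extra": [...]}} for differing hosts.
    """
    hosts = sorted(rules_by_host)
    if not hosts:
        return {}
    reference = set(rules_by_host[hosts[0]])
    report = {}
    for host in hosts[1:]:
        current = set(rules_by_host[host])
        if current != reference:
            report[host] = {
                "missing": sorted(reference - current),
                "extra": sorted(current - reference),
            }
    return report


rules_by_host = {
    "host1": [(80, 80, "TCP", "Multiple", "Single")],
    "host2": [(80, 80, "TCP", "Multiple", "Single"),
              (443, 443, "TCP", "Multiple", "Single")],
}
print(port_rule_mismatches(rules_by_host))
```

Any host that appears in the report has a rule set that differs from the reference host and is a candidate for blocking convergence.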
The cluster moves in and out of a converged state.
- Cause: Heartbeats are being missed due
to intermittent network connectivity caused by a bad network
adapter or cable or other network problems.
- Solution: Use the ping command
to test connectivity. Enter the host's fully qualified domain name.
You can also learn more about the issue by using the ping
command to contact your domain controller by IP address and other
network servers by name and by IP address.
After the cluster hosts start, Network Load Balancing reports that convergence has finished, but more than one host is a default host.
- Cause: The cluster hosts have become
members of different subnets, so all the hosts are not accessible
on the same network.
- Solution: Be sure that all cluster
hosts can communicate with each other.
- Cause: A layer-three switch is being
used.
- Solution: Put a layer-two switch
between the hosts and the layer-three switch.
- Cause: A break in a redundant switch
caused the cluster to separate into two clusters, creating two
default hosts.
- Solution: Remove the two clusters,
then create a single cluster.
- Cause: Your switch is configured to
reject broadcast packets.
- Solution: Configure your switch to
accept broadcast packets (be aware that this might introduce
certain security risks), or configure your NLB cluster to use
multicast mode.
- Cause: One host is unable to send or
receive heartbeats.
- Solution: Use the ping command
to test connectivity to each of the hosts. Enter each host's
fully qualified domain name.
- Cause: A host is plugged into the
wrong port on the switch.
- Solution: Use the correct port on the
switch.
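Most of the causes above come down to the hosts being split into groups that cannot exchange heartbeats. The sketch below illustrates why that produces more than one default host. It assumes, as background not stated in this topic, that the default host is the member with the lowest host priority (ID) among the hosts that converged together. The function name and the example IDs are hypothetical.

```python
def default_hosts(partitions):
    """Return the default host ID elected by each group of hosts.

    partitions is a list of lists of host priority IDs; the hosts in one
    inner list can exchange heartbeats with each other but not with hosts
    in the other lists.
    Assumption: within a group, the lowest ID becomes the default host.
    """
    return [min(group) for group in partitions if group]


# Healthy cluster: all four hosts see each other, so one default host.
print(default_hosts([[1, 2, 3, 4]]))    # [1]

# Split cluster: each half converges on its own and elects its own default.
print(default_hosts([[1, 2], [3, 4]]))  # [1, 3]
```

Restoring connectivity between the groups, as the solutions describe, is what brings the result back to a single default host.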
Network Load Balancing is not load balancing applications, and the default host handles all the network traffic.
- Cause: A port rule is missing. By
default, NLB directs all incoming network traffic that is not
governed by port rules to the default host—this ensures that
applications that you do not want load balanced behave
properly.
- Solution: To load balance an
application across the cluster, create a port rule on every cluster
host for the TCP/IP ports that are serviced by the application.
- Cause: You added a second host to a
single host cluster, but the second host is not configured
correctly. The cluster never converges and the original host
continues to handle all of the traffic.
- Solution: Carefully review (and if
necessary, correct) each of the settings on the second host—for
example, the cluster IP address, dedicated IP address, and port
rules.
- Cause: If your cluster is configured
for unicast mode, a switch might have learned the NLB network
adapter's MAC address.
- Solution: Clear the switch's port to
MAC address mapping.
- Cause: A proxy server is sending all
connections that are using a single IP address to your cluster in
single affinity mode.
- Solution: Configure your proxy server
to use multiple IP addresses.
Traffic alternates unexpectedly between the cluster hosts, and it breaks TCP connections.
- Cause: Unicast network addresses are
causing issues with the switching hub. If you are using a switching
hub to interconnect the cluster hosts, you must use NLB multicast
support. Otherwise, the switch can behave erratically when the same
unicast network address is used on multiple switch ports.
- Solution: Check that you have selected
multicast support in the Network Load Balancing Properties
dialog box. If you do not want to use multicast support, you can
interconnect the cluster hosts with a hub or coaxial cable instead
of with a switch.
Network traffic does not appear to load balance evenly among the cluster hosts.
- Cause: The network traffic is coming
from a limited number of IP addresses, possibly due to the setting
on a proxy server.
- Solution: Configure your proxy server
to use multiple IP addresses.
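The effect of a proxy that presents a single source IP address can be illustrated with a toy model. This is a sketch only: it uses SHA-256 as a stand-in for the mapping from client address to host and does not reproduce the algorithm that NLB actually uses. The function names and the example addresses are hypothetical. The point it shows is that any deterministic mapping keyed on the client IP address sends all traffic from one address to one host.

```python
import hashlib
from collections import Counter


def pick_host(client_ip: str, host_count: int) -> int:
    """Map a client IP address to a host index (stand-in hash, not NLB's)."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big") % host_count


def distribution(client_ips, host_count):
    """Count how many connections each host index receives."""
    return Counter(pick_host(ip, host_count) for ip in client_ips)


# 1000 connections that all arrive from one proxy address land on one host.
behind_proxy = ["203.0.113.10"] * 1000
print(distribution(behind_proxy, 4))

# 1000 connections from distinct addresses spread across the hosts.
direct = [f"198.51.{i // 250}.{i % 250 + 1}" for i in range(1000)]
print(distribution(direct, 4))
```

Configuring the proxy to use multiple IP addresses moves the traffic from the first pattern toward the second.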
When you are using Network Load Balancing with Microsoft Internet Security and Acceleration (ISA) Server, one cluster host logs blocked packets that are directed to the dedicated Internet Protocol (IP) address of another host.
- Cause: One of the cluster hosts is
configured with a host priority identifier equal to 1.
- Solution: Do not configure any cluster
host with a host priority identifier of 1. Use numbers that are
greater than 1. For more information, see Configure Network Load
Balancing Host Parameters.
You are unable to create a Network Load Balancing cluster in a 64-bit version environment.
- Cause: You might not be running the
appropriate NLB version for your environment. NLB cannot form a
cluster when the 32-bit version of NLB is used on a 64-bit version
computer. This issue might have gone undetected because 32-bit NLB
components (nlb.exe, wlbs.exe, and nlbmgr.exe) appear to run
correctly in the 64-bit version environment.
- Solution: If you plan to use a 64-bit
version computer environment, you must use the 64-bit NLB
version.