Hi,
I hope someone out there can point me in the right direction with a frustrating issue I have been having with my router hardware and our company network.
I’ll start with a bit of background history to get you up to speed with our configuration etc.
My company comprises 30ish users, all running XP Service Pack 3 apart from two users on Vista Service Pack 2. (These machines have had the IPv6 protocol disabled on the NIC, as recommended by Zen support.) We use a Compaq Proliant ML370GS server (192.168.16.100) running Windows Server 2003 (no ISA server yet, but we plan to install one in the New Year) and a 3Com Baseline 2948-SFP Plus switch. A second server (192.168.16.2) runs alongside the Proliant and terminates the VPN clients logging in remotely; this box is due to become the ISA server. The main server acts as the DHCP and DNS server. The preferred DNS address points back to 192.168.16.100.
Up until a few days ago the ADSL+ service, provided by Zen, was working fantastically. However, when I replaced the existing Netgear DG834GT router with a Draytek Vigor 2820n router, the entire network and Internet browsing came to a grinding halt. 30ish complaining users and 1 stressed-out admin!!!
Typically, web pages would not load unless they were constantly refreshed by pressing F5, and on some sites even this didn't help. I noticed that when downloading large files, e.g. MS Office Service Pack 2, the download speed was not impaired; we were getting upwards of 5 Mbps. Users were also reporting that emails with attachments were not being sent or received, or that there were huge delays in emails being sent or received. A good example was an email that was received by the Exchange server with a time stamp of 15:16 and was finally delivered to the user's inbox at 17:16. This particular email had no attachments either.
I pre-configured the Vigor in an isolated network environment to make sure it was set up correctly before going live with it. The plan was simply to unplug the RJ11 and RJ45 cables from the Netgear and plug them straight into the Vigor for a seamless swap-over. The first thing I noticed was that the Vigor's admin console was unusably slow when I logged in from the server. However, I could log in from my workstation and navigate the options relatively quickly.
The few port forwarding rules we use were set up OK, as was the VPN access, which terminates on the 192.168.16.2 server.
The Proliant has two NICs bonded together as a team using HP's utility, and both are plugged into the 3Com switch. The second server has just one LAN cable to the switch, and the Vigor is also connected to the switch. So the overview is:
1. ADSL+ to Vigor
2. Vigor to switch
3. Switch to Proliant
4. Proliant to switch
5. Switch to users
All the users’ firewalls are disabled by default and IE8 is the browser of choice. Needless to say, all machines have the latest service pack and Microsoft updates installed.
After I spent an hour on the Draytek support line, at 75p a minute, they assured me that the router had been configured correctly. They even remoted in, checked all the settings and surmised that the fault lay within the LAN.
A further call to Zen support highlighted that when pinging known addresses there were definite delays in the packets being sent. Running pathping from the server with the Vigor in place, the 1st hop (the router) was instant, followed by 212 ms to the 2nd hop, 220 ms to the 3rd and 176 ms to the 4th, with three asterisks for the 5th, where it was failing to resolve the DNS address. Below are the results when using the Netgear router. (Obviously there are no issues with this test, and unfortunately I didn’t capture the Vigor results!)
>pathping 212.23.3.100
Tracing route to dns.lb.mbr-roch.zen.net.uk [212.23.3.100]
over a maximum of 30 hops:
0 james-pc05.JAMES.LOCAL [192.168.16.134]
1 192.168.16.254
2 losubs.subs.dsl1.kp-leeds.zen.net.uk [62.3.85.17]
3 ae0-112.cr1.kp-leeds.zen.net.uk [62.3.85.177]
4 lotze-ae2-0.hq.zen.net.uk [62.3.80.69]
5 epictetus-ge-0-0-0-11.hq.zen.net.uk [62.3.82.66]
6 dns.lb.mbr-roch.zen.net.uk [212.23.3.100]
7 dns.lb.mbr-roch.zen.net.uk [212.23.3.100]
Computing statistics for 175 seconds...
            Source to Here   This Node/Link
Hop  RTT    Lost/Sent = Pct  Lost/Sent = Pct  Address
  0                                           james-pc05.JAMES.LOCAL [192.168.16.134]
                                0/ 100 =  0%   |
  1    0ms     0/ 100 =  0%     0/ 100 =  0%  192.168.16.254
                                0/ 100 =  0%   |
  2   37ms     0/ 100 =  0%     0/ 100 =  0%  losubs.subs.dsl1.kp-leeds.zen.net.uk [62.3.85.17]
                                0/ 100 =  0%   |
  3   38ms     0/ 100 =  0%     0/ 100 =  0%  ae0-112.cr1.kp-leeds.zen.net.uk [62.3.85.177]
                                0/ 100 =  0%   |
  4   40ms     0/ 100 =  0%     0/ 100 =  0%  lotze-ae2-0.hq.zen.net.uk [62.3.80.69]
                                0/ 100 =  0%   |
  5   39ms     0/ 100 =  0%     0/ 100 =  0%  epictetus-ge-0-0-0-11.hq.zen.net.uk [62.3.82.66]
                                0/ 100 =  0%   |
  6   38ms     0/ 100 =  0%     0/ 100 =  0%  dns.lb.mbr-roch.zen.net.uk [212.23.3.100]
                                0/ 100 =  0%   |
  7   38ms     0/ 100 =  0%     0/ 100 =  0%  dns.lb.mbr-roch.zen.net.uk [212.23.3.100]
Trace complete.
So the diagnostic path I have followed so far is:
- Reset the Vigor to factory defaults and updated it to the latest firmware, 3.3.3 Append A.
- Set the MTU to 1440.
- Isolated the network and brought each PC online in sequence to see if a rogue PC was dragging the network down.
- Pointed a standalone PC at Zen’s DNS address to bypass the server's DNS settings. (No noticeable positive effect.)
- Asked an independent person using a PC outside our network to telnet to our port 25, which failed with the Vigor but succeeded with the Netgear.
- Monitored all traffic using the Wireshark packet sniffer to look for excessive dropped packets or collisions.
- Diagnosed the network using Look@Lan.
- Diagnosed the switch with 3Com's utilities.
- Invited Draytek (Expensive) to check hardware settings.
- Invited Zen (Thorough) to walk through the logical approach.
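For anyone wanting to repeat the port 25 test from the list above without telnet, here is a minimal Python sketch. It only checks whether a TCP connection can be opened, which is all the telnet test did. The function name `port_open` and the address in the usage comment are placeholders for illustration, not anything from our setup.

```python
import socket


def port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout seconds."""
    try:
        # create_connection performs the TCP handshake; we close straight away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example use from a PC outside the network (203.0.113.10 is a placeholder
# documentation address; substitute the router's real WAN address):
#   print(port_open("203.0.113.10", 25))
```

A result of False with the Vigor in place and True with the Netgear would match what the telnet test showed, and would point at the port forwarding or firewall on the router rather than at the mail server.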
I am thinking that the fault has got to be on the LAN side, and I am of the opinion that this is a DNS issue. My colleague disagrees but, like me, is at a loss to spot the ball!
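One way to test the DNS theory would be to time the name lookup separately from the TCP connection on an affected PC. The sketch below is a rough illustration in Python, assuming Python is available on the workstation; `time_lookup` and `time_connect` are hypothetical helper names. It uses whatever resolver the operating system is configured with, so on a client it exercises the 192.168.16.100 DNS path.

```python
import socket
import time


def time_lookup(name, port=80):
    """Resolve name through the OS resolver; return (seconds, sorted IPv4 addresses)."""
    start = time.perf_counter()
    infos = socket.getaddrinfo(name, port, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    elapsed = time.perf_counter() - start
    # Each entry's fifth field is the (address, port) pair.
    addresses = sorted({info[4][0] for info in infos})
    return elapsed, addresses


def time_connect(ip, port=80, timeout=5.0):
    """Return the seconds taken to open (and immediately close) a TCP connection."""
    start = time.perf_counter()
    with socket.create_connection((ip, port), timeout=timeout):
        pass
    return time.perf_counter() - start


# Example use (the host name is only an illustration):
#   seconds, addresses = time_lookup("www.zen.co.uk")
#   print(seconds, addresses, time_connect(addresses[0]))
```

If the lookup takes seconds while the connect takes milliseconds, that supports the DNS theory. If lookups are quick but connects or page loads still stall, the problem more likely lies elsewhere, for example in the MTU or the router itself.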
After taking two days' worth of grumpy users' comments, I reluctantly recommissioned the Netgear router and scuttled back to my dark office, where I will remain in hiding until it's safe to come out again!
I have exhausted my options and am at a complete loss as to what is causing this issue, apart from the obvious question, ‘is the router faulty?’, or could it be a ‘large wood and some up-close trees’ moment?
Any help or advice would be received with the utmost gratitude!