This isn't a cert prep question per se; it's actually a production network issue. I've been banging my head against the wall for over a month trying to figure out what the issue could be, and after a bunch of research I've come up empty-handed, so I figured I would ask here. Maybe @donpezet will be able to shed some light on it.
Without delving too much into the details, we were trying to add a new location to an existing multi-site Cisco network infrastructure. For cost reasons, the customer went with a Sonicwall firewall for the new site, though the other locations and the central data center hub are all Cisco equipment (routers). The network architecture is a very simple hub-and-spoke design, with the central DC router being the hub. The other locations all have VPN tunnels (with Tunnel interfaces on both sides) going over the Internet. They use EIGRP for interior routing, and each site uses its own Internet WAN for the default gateway. Simple enough.
Due to some limitations I read about online with integrating Cisco<->Sonicwall VPNs, we did not use tunnel interfaces for this VPN and instead set up a simple IPSec link on both sides to handle traffic between them. No problems there; everything came up as expected. We set up the necessary ACLs on both sides to handle the traffic, and I confirmed everything was working properly. That's when things started to get weird.
The primary issue was that the other sites weren't able to reach the new one, while the data center subnets (going through the hub directly) and Azure traffic (using a VPN to the hub with hard-coded static routes to all site networks) could. All other site-to-site traffic worked fine. I began troubleshooting and noticed that the subnet for the new site was not in the DC router's routing table, whereas all the other sites' subnets were. As a result, it was not being distributed via EIGRP. I tried various troubleshooting steps, but nothing would get it to show up, so I gave up for the day and did some research that night.
I came back the next day and, voila, the route for the new location was in the routing table and being distributed properly. Nothing had changed config-wise. I chalked it up to some delay issue and figured we were good to go. I forgot to mention that this was being done in our office prior to deployment on-premises for the customer, but outside of the site's WAN address changing, everything else was equal.
We went to install it on-site, and once again the routing issue came back. No route in the table, no EIGRP distribution to the other sites, but the network was reachable to and from the data center. In other words, if the data center network is 192.168.0.0/24 and the new site is 192.168.1.0/24, I could ping between those subnets, but not between the new site and the other locations through the DC. Once again I troubleshot it and waited to see if it would magically fix itself, but nope.
Eventually, I gave up and just set static routes on the other sites to direct traffic to the DC router if it was destined for the new site's network. While not ideal, it's a simple and small network and it works fine. The fact that it does work is what is confusing. If the route to the new site isn't in the DC router's routing table, how is it routing the traffic there in the first place? Why did it show up once, and even stay up after I made some general network optimizations (taking the tunnel up and down a few times in the process), but then disappear again, never to show back up? I even tried setting the route on the DC router statically, mimicking the settings from when it did show up, but that didn't work either.
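To make the workaround concrete, here is a rough Python sketch of how I picture the route lookup on one of the other sites, before and after the static route was added. This is only a toy longest-prefix-match model, not router config: the `lookup` and `net` helpers, the next-hop labels, and the host address 192.168.1.10 are all made up for illustration, and the subnets are the example ones from above.

```python
import ipaddress


def net(cidr):
    """Shorthand for building a network object from CIDR notation."""
    return ipaddress.ip_network(cidr)


def lookup(table, destination):
    """Return the next hop of the longest-prefix match, or None if nothing matches."""
    addr = ipaddress.ip_address(destination)
    matches = [(prefix, hop) for prefix, hop in table if addr in prefix]
    if not matches:
        return None
    # The most specific (longest) prefix wins, as in a real routing table.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Hypothetical table on an existing spoke site BEFORE the workaround:
# it has its own default route and the DC subnet learned via EIGRP,
# but nothing for the new site's subnet.
spoke_before = [
    (net("0.0.0.0/0"), "local Internet WAN"),
    (net("192.168.0.0/24"), "tunnel to DC"),
]

# AFTER the workaround: a static route for the new site pointing at the DC.
spoke_after = spoke_before + [(net("192.168.1.0/24"), "tunnel to DC")]

print(lookup(spoke_before, "192.168.1.10"))  # falls through to the default route
print(lookup(spoke_after, "192.168.1.10"))   # now follows the static route
```

If that model is right, it at least accounts for the symptom on the other sites: with no specific route for the new subnet, their traffic follows the default route out their own Internet WAN instead of heading to the DC. It does not explain what the DC router itself is doing, which is the part I'm stuck on.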
Anyway, like I said, this has been a puzzle to me for a couple of months. It just doesn't make any sense, so I figured I would run it by people here to see if anyone has any ideas.
Thanks.