
As I discussed in a previous post, simply redirecting a user to a “friendly” 404 page isn’t the best option. First, the user might not remember what they clicked/typed to get them to the page and also, simply clicking the back button might not be an option, especially if they submitted form data. Fortunately, as F5 LTMs are “Strategic Points of Control,” we can use them to better handle the situation.

 

First off, let’s determine the desired behavior when a user request induces an error code. In our case, let’s choose 404 as the error code to watch for. If we detect this error code being sent to the user, let’s redirect them to our Home Page (www.sample.com) rather than simply keeping them at an error page. To make their experience better, let’s also hold the user at a custom page for a few seconds while explaining their issue as well as the request that caused the problem.

 

Since we’re looking for error codes sent by the servers, we’ll need to run our commands from within the “HTTP_RESPONSE” event. As you’ll see from the DevCentral wiki page for HTTP_RESPONSE, there are examples for using the “HTTP::status” command to detect error codes and redirect users. For some, the rule below is perfectly acceptable.

 

when HTTP_RESPONSE {
   if { [HTTP::status] eq "404" } {
      HTTP::redirect "http://www.sample.com"
   }
}

 

Unfortunately, that rule would result in the user being sent to the redirect page without any explanation as to what they did wrong. So, we’re going to beef the rule up a bit. As you’ll recall from this post, we can set variables from the HTTP_REQUEST event and then reference them from our HTTP_RESPONSE event in order to show the user the link that caused the error.

Here’s a nice sample rule I just whipped up. We’re using “HTTP::respond” so the response comes directly from LTM. Also, I’m setting a variable “delay” to the number of seconds to keep the user at the hold page.

 

when HTTP_REQUEST {
   # Save the request details so HTTP_RESPONSE can show them to the user
   set hostvar [HTTP::host]
   set urivar [HTTP::uri]
   # Number of seconds to hold the user at the custom page
   set delay 4
}

when HTTP_RESPONSE {
   if { [HTTP::status] eq "404" } {
      # Answer from LTM itself; the meta refresh sends the user home after $delay seconds
      HTTP::respond 200 content "<html><head><title>Custom Error Page</title><meta http-equiv='refresh' content='$delay;url=http://www.sample.com/'></head><body><h2>Unfortunately, your request for $hostvar$urivar caused a 404 error. After $delay seconds, you'll automatically be redirected to our home page. If you feel you've tried a valid link, please contact webmaster@sample.com. Sorry for this inconvenience.</h2></body></html>" "Content-Type" "text/html"
   }
}

 

So, with that rule, the user requests a page that causes a 404 error. LTM will detect the 404 error, and instead of sending it to the user, it will respond with a 200 status code and HTML showing the user the link they requested as well as apologizing and telling them to contact your webmaster if there’s an issue. I was too lazy to use the HTML to make the e-mail address clickable, maybe next time. Also, by using “Meta Refresh,” we’re holding the user at the page for 4 seconds and then sending them to our home page. As you can see, HTTP::respond is a very powerful command. It’s pretty cool being able to use LTM to send HTML to a user.

 

 

Conserving public IP addresses has always been a good idea. Naturally, it’s become more important lately but that’s neither here nor there as far as this post goes.

Let’s assume you’re managing a website powered by an F5 BIG-IP LTM. You’ve got the following setup:

1. Virtual Server with IP Address 1.1.1.1 and listening on port 80.

2. A Pool called “pool_webservers” containing web servers 10.1.1.1:80, 10.1.1.2:80, 10.1.1.3:80, 10.1.1.4:80, and 10.1.1.5:80.

3. A DNS record “www.sample.com” pointing to the Virtual Server’s IP Address of 1.1.1.1

While the site is working fine, you’d like to be able to access individual web servers from an external network. This way, if a customer tells you your site isn’t working, you can test each server individually to try and narrow it down. Also, perhaps you’re releasing code to individual servers and would like to make sure it looks good.

This is a very common requirement for sites. Unfortunately, since your servers are using non-internet routable addresses from the 10.1.1.0 network, you can’t hit them externally.

People frequently deal with such an issue by doing one of the following:

1. Assign public IP addresses to each server and create DNS records accordingly.

In this case, DNS might look like this: www1.sample.com=1.1.1.2, www2.sample.com=1.1.1.3, etc.

2. Create NATs on a public-facing router or Load Balancer to translate the public IPs to the server’s private ones.

In this case, DNS would look the same as above.

Unless you’re using port translation (1.1.1.2:80 = server 1, 1.1.1.2:1080 = server 2, etc.), you’re using a Public IP address for each server you’d like to access. Since larger sites typically have far more than 5 servers, it’s easy to chew up Public Addresses quickly.

Fortunately, we can use iRules to “route” requests to the proper web servers without using a single additional Public IP Address. From the DevCentral iRules Commands page, you’ll notice a command called “HTTP::host.” When a user types “www.sample.com” into their browser, their HTTP request contains a “Host” header carrying the host name they requested (www.sample.com), and HTTP::host returns that value.

If you’ll remember our layout, we have a Virtual Server at 1.1.1.1:80 served by “pool_webservers” with members 10.1.1.1:80, 10.1.1.2:80, 10.1.1.3:80, 10.1.1.4:80, and 10.1.1.5:80. http://www.sample.com points to 1.1.1.1 and is how users access the site. Now, we’d like the ability to target individual pool members from outside the network. Typically, this would require a public IP address for each web server but with iRules, we’re all set.

First, we’re going to create additional DNS records. Fortunately, they’re all going to point at the same 1.1.1.1 address as the other ones. Our DNS zone for “sample.com” now looks like this:

www IN A 1.1.1.1

www1 IN A 1.1.1.1

www2 IN A 1.1.1.1

www3 IN A 1.1.1.1

www4 IN A 1.1.1.1

www5 IN A 1.1.1.1

Now, it’s time to put together our iRule. As I was extremely inspired by Joe Pruitt’s recent post comparing iRule Control Statements, I thought I’d give multiple examples of how to accomplish our goal.

First, we’ll go with a simple “else, if” rule.

when HTTP_REQUEST {
   if { [string tolower [HTTP::host]] eq "www1.sample.com" } {
      pool pool_webservers member 10.1.1.1 80
   } elseif { [string tolower [HTTP::host]] eq "www2.sample.com" } {
      pool pool_webservers member 10.1.1.2 80
   } elseif { [string tolower [HTTP::host]] eq "www3.sample.com" } {
      pool pool_webservers member 10.1.1.3 80
   } elseif { [string tolower [HTTP::host]] eq "www4.sample.com" } {
      pool pool_webservers member 10.1.1.4 80
   } elseif { [string tolower [HTTP::host]] eq "www5.sample.com" } {
      pool pool_webservers member 10.1.1.5 80
   }
}

Well, that was painless enough. If a user’s host header is “www1.sample.com,” we’re sending them to 10.1.1.1:80. Simply bind that iRule to our 1.1.1.1:80 Virtual Server and we’re set. You might also notice I’m using “string tolower.” That just converts the value to lowercase so I don’t have to support users inputting combinations of upper and lower case characters. Most browsers automatically convert the host header to lowercase but not all. If you read either “control statement” post above, you’ll notice that if/elses are hardly the most efficient method for doing something like this.

Now, we’ll try a “switch statement.”

when HTTP_REQUEST {
   switch -glob [string tolower [HTTP::host]] {
      "www1.sample.com" { pool pool_webservers member 10.1.1.1 80 }
      "www2.sample.com" { pool pool_webservers member 10.1.1.2 80 }
      "www3.sample.com" { pool pool_webservers member 10.1.1.3 80 }
      "www4.sample.com" { pool pool_webservers member 10.1.1.4 80 }
      "www5.sample.com" { pool pool_webservers member 10.1.1.5 80 }
      default { pool pool_webservers }
   }
}

This is a much cleaner, more efficient option. As you’ll notice, I used “-glob” with switch. Glob allows you to use wildcards and also look for patterns. If you read the above post comparing control statements, you’ll notice -glob isn’t as efficient as just using switch. Since we aren’t doing any pattern/wildcard matching here, you could easily leave off the -glob. I like to use it just in case I decide to add such enhancements later. I also used a “default” statement so requests not matching the other statements would go to our normal pool.

My personal preference is to use “classes/data groups.” A class is essentially a list that can be searched or matched. Typically, you have the field you’re matching and a value you can record should that value be matched. In version 10, the class features were greatly enhanced.

For our sample rule, our class could look like this:

class host_headers {
   {
      "www1.sample.com" { "10.1.1.1" }
      "www2.sample.com" { "10.1.1.2" }
      "www3.sample.com" { "10.1.1.3" }
      "www4.sample.com" { "10.1.1.4" }
      "www5.sample.com" { "10.1.1.5" }
   }
}

In this case, “www1.sample.com” is what we’re matching against and “10.1.1.1” is the value we’d like to return. If simply using “class match,” we can ignore/omit the value on the right. If using “class search -value,” then we’re trying to return it. Here’s an example:

when HTTP_REQUEST {
   if { [class match [string tolower [HTTP::host]] eq host_headers] } {
      # Look up the pool member IP stored as the value for this host name
      set hostvar [class search -value host_headers eq [string tolower [HTTP::host]]]
      pool pool_webservers member $hostvar 80
   }
}

The first thing we did was compare the host header to our class/datagroup called “host_headers.” If there’s a match, we set a variable called “hostvar” to the corresponding value. If the user requested “www1.sample.com,” for instance, the corresponding value in the class is “10.1.1.1.” So, now that “hostvar” = 10.1.1.1, we reference the variable in our pool command. So, the pool command essentially became “pool pool_webservers member 10.1.1.1 80.”

Joe’s “Comparing iRule Control Statements” showed that using classes was ridiculously efficient. Using classes can make it a bit more difficult to understand what an iRule does as it requires reading the rule and then reading the class contents. With that said, it’s very efficient and minimizes the amount of text within the rule. The ability to extract a value is very nice too.

To “complicate” things a bit, let’s assume you don’t want people outside of your IP space to access individual servers. If you’re releasing new code or price updates, there’s a fair chance you don’t want people hitting the system being worked on. To accomplish this, let’s create an address-type “data group/class” containing the IP Address or Network we’d like to allow access. Let’s assume this class is called “allowed_access.”

when HTTP_REQUEST {
   # Only act on the server-specific host names; everything else uses the default pool
   if { [class match [string tolower [HTTP::host]] eq host_headers] } {
      if { [class match [IP::client_addr] eq allowed_access] } {
         set hostvar [class search -value host_headers eq [string tolower [HTTP::host]]]
         pool pool_webservers member $hostvar 80
      } else {
         HTTP::respond 403 content "You're not allowed!"
      }
   }
}

Now, if a user requests one of our “specific server host-headers,” but doesn’t match the allowed IP addresses class, we’re going to respond with an HTTP 403. If they do match both conditions, the rule should operate normally.
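If you’d like to reason about that decision off-box before binding anything to a Virtual Server, here’s a minimal Python sketch of the same logic. It is not an iRule and doesn’t talk to LTM; the dictionary and network list are hypothetical stand-ins for the “host_headers” and “allowed_access” data groups, and 192.0.2.0/24 is just an assumed example of “your IP space.”

```python
import ipaddress

# Hypothetical stand-in for the "host_headers" data group (host name -> pool member IP)
HOST_HEADERS = {
    "www1.sample.com": "10.1.1.1",
    "www2.sample.com": "10.1.1.2",
    "www3.sample.com": "10.1.1.3",
    "www4.sample.com": "10.1.1.4",
    "www5.sample.com": "10.1.1.5",
}

# Hypothetical stand-in for the "allowed_access" data group (assumed example network)
ALLOWED_ACCESS = [ipaddress.ip_network("192.0.2.0/24")]


def route(host, client_ip):
    """Return ("member", ip), ("reject", 403) or ("pool", None) for a request."""
    member = HOST_HEADERS.get(host.lower())
    if member is None:
        # Not a server-specific host name: normal load balancing
        return ("pool", None)
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in ALLOWED_ACCESS):
        # Allowed client asking for a specific server
        return ("member", member)
    # Server-specific host name from an address outside the allowed list
    return ("reject", 403)
```

Walking a few host/IP combinations through a function like this is a cheap way to catch the case that’s easy to miss: a normal request for www.sample.com should fall through to the pool rather than hit the lookup.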

While my examples used iRules to target specific servers using host headers, it shouldn’t stop there. Let’s say you’re administering tons of different sites similar to the following:

http://www.sample.com = main company page

http://www.domain.com = a domain registrar site you’re hosting

http://www.social.com = you’ve jumped on the social networking bandwagon and are hosting a Facebook variant

http://www.dating.com = self explanatory

It’s fair to assume you’d have different web servers hosting these sites. Typically, you’d also have a different Virtual Server and a corresponding public IP for each one. That’s not always necessary though. Using our switch statement from above, we can change our pool command a bit.

when HTTP_REQUEST {
   switch -glob [string tolower [HTTP::host]] {
      "www.sample.com" { pool pool_sample }
      "www.domain.com" { pool pool_domain }
      "www.social.com" { pool pool_social }
      "www.dating.com" { pool pool_dating }
      default { pool pool_default }
   }
}

One of the more popular e-mail/forum signatures I see is “with iRules, you can.” I think this is a great example. Since LTM is a “Strategic Point of Control,” it can extract information such as the Host Header, or a Requested URI, and react to it.

It shouldn’t surprise anyone that I enjoy new technical challenges. While I think I’ve become pretty decent at writing iRules, I’m constantly reminded of how much more I have to learn.

Yesterday, someone posted a question on DevCentral that I couldn’t initially answer. They were running an online forum and wanted to keep a user from posting spam. Their idea was to search the post when it was submitted and if it contained a “blocked word,” prevent the post from being made. Unfortunately, the vast majority of my experience with iRules has been around inspecting HTTP GET requests and responses. In order to accomplish what this user wanted, the iRule would have to search the Payload of an HTTP POST which was new to me.

 

Fortunately, there were plenty of examples on DevCentral where people did something similar.  One of the most popular examples is for Sanitizing Credit Card Numbers. That iRule searches the response payload for strings that match credit card patterns. In this case, we’re searching the request data instead.

 

While the vast majority of rules I’ve seen only care about requests and responses, this was such an awesome reason to look at the payload, I thought I had to learn and also had to share it. Thanks to DevCentral user Hoolio’s posts as well as the awesome wiki, I had a relatively easy time learning. Yet another great reason for leveraging your “Strategic Points of Control.” I’m curious to know what other uses for inspecting request/response data people could think of.

 

Here’s the code I ended up recommending.

 

when HTTP_REQUEST {

   # Only check POST requests
   if { [HTTP::method] eq "POST" } {

      # Default amount of request payload to collect (in bytes)
      set collect_length 2048

      # Check for a non-existent Content-Length header
      if { [HTTP::header Content-Length] eq "" } {

         # Use default collect length of 2k for POSTs without a Content-Length header
         set collect_length $collect_length

      } elseif { [HTTP::header Content-Length] == 0 } {

         # Don't try to collect a payload if there isn't one
         unset collect_length

      } elseif { [HTTP::header Content-Length] > $collect_length } {

         # Payload is larger than the default, so only collect the default amount
         set collect_length $collect_length

      } else {

         # Collect the actual payload length
         set collect_length [HTTP::header Content-Length]

      }

      # If the POST Content-Length isn't 0, collect (a portion of) the payload
      if { [info exists collect_length] } {

         # Trigger collection of the request payload
         HTTP::collect $collect_length
      }
   }
}

when HTTP_REQUEST_DATA {
   # Define a string-type datagroup called dg_blocked containing words to be blocked
   if { [matchclass [HTTP::payload] contains dg_blocked] } {
      HTTP::respond 403 content "Blocked"
   }
}
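The collect-length decision is the fiddly part of that rule, so here’s a small Python sketch of it that can be run anywhere. It only models the logic and is not an iRule; the function names are mine, and the word list is a hypothetical stand-in for whatever you’d put in “dg_blocked.”

```python
DEFAULT_COLLECT = 2048  # bytes, same default as the rule above

# Hypothetical contents for the dg_blocked data group
BLOCKED_WORDS = ["casino", "cheap pills"]


def collect_length(content_length_header, default=DEFAULT_COLLECT):
    """Return how many bytes to collect, or None when there is no payload."""
    if content_length_header == "":
        # No Content-Length header: fall back to the default
        return default
    length = int(content_length_header)
    if length == 0:
        # Nothing to collect
        return None
    # Never collect more than the default amount
    return min(length, default)


def is_blocked(payload, blocked=BLOCKED_WORDS):
    """True when the collected payload contains any blocked word (case-sensitive)."""
    return any(word in payload for word in blocked)
```

One thing the sketch makes obvious: since only the first 2048 bytes are collected, a blocked word that appears later in a long post wouldn’t be seen. If that matters for your forum, the default is the number to revisit, keeping in mind that collecting more payload costs more memory on the LTM.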


 

I’ve only recently started to look at the blog statistics provided by wordpress. One of my favorite data points is the “searches” through which users find my blog. One of the most popular searches pertains to having your F5 LTM use an iRule to send users to a maintenance page if all servers in a pool are down. Since I hate the idea of F5 customers being unable to leverage their device for a very common scenario like this, I thought I’d write a post.

 

As a reminder, there are plenty of examples of this exact scenario on devcentral.f5.com.

 

First, the easy way…simply utilize a “fallback host” in an HTTP Profile attached to your Virtual Server. If LTM is unable to connect to a pool member to serve a request, it’ll send the customer a redirect.

 

http://support.f5.com/kb/en-us/solutions/public/6000/500/sol6510.html?sr=11563781

 

As you’ll notice, the post above also illustrates how to use an iRule for a similar task.

when LB_FAILED {
   if { [active_members [LB::server pool]] < 1 } {
      HTTP::fallback "http://www.sample.com/redirect.html"
   }
}

 

I rewrote the example rule a bit, but the point is there. If the pool to which the user was attached has less than 1 active member, then utilize a fallback. It’s important to note that the pool members are only inactive if they’ve failed their health checks. So, if you’re using a tcp port check and a pool member is throwing 500s for every request, it’ll remain up. In order to combat this, you can either use a better health check, or build additional logic into your rule.

 

when HTTP_RESPONSE {
   if { [HTTP::status] eq "500" } {
      HTTP::redirect "http://www.sample.com/redirect.html"
   }
}

 

Now, rather than looking at the number of active pool members, we’re redirecting users if their pool member sent a 500. The negative to this method is that the pool might have other members that aren’t serving 500s, which is why reselecting a pool member might be the better option. I’ll touch on that in another post.

 

As everyone knows, retail is an extremely seasonal industry. Retail E-Commerce is no different so when building an environment to support a retail site, architects and engineers have to plan for the highest demand. Let’s pretend cloud computing doesn’t exist or isn’t feasible in this case.

 

You’ve got a site that has an average daily peak of 50Mbps but on Black Friday, the peak is 1.2Gbps. Besides Black Friday, no other day of the year exceeds 200Mbps. Naturally ISPs can provide burstable ethernet so you’re only paying for what you use, but switches, load balancers, etc might not provide the same capability. So, you might have to build (and buy) an infrastructure that supports 10 Gbps to provide for your “peak” growth as that 1.2Gbps number might grow at 40% a year or more.
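To put a number on that growth, here’s a quick back-of-the-envelope sketch in Python. It assumes the 1.2Gbps peak compounds at a flat 40% every year, which is just the figure from the paragraph above and not a forecast; the function name is mine.

```python
def years_until_exceeds(peak_mbps, annual_growth, capacity_mbps):
    """Count the years of compound growth before the peak exceeds capacity."""
    years = 0
    peak = float(peak_mbps)
    while peak <= capacity_mbps:
        peak *= 1 + annual_growth
        years += 1
    return years


# 1.2 Gbps peak, 40% yearly growth, 10 Gbps of built capacity
print(years_until_exceeds(1200, 0.40, 10000))
```

Under those assumptions the peak stays under 10Gbps for six years of growth (roughly 9Gbps) and crosses it in the seventh, so a 10Gbps build is sized for a number you’ll only see for about an hour a year, and not for several years.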

 

Before building out this environment though, it might be beneficial to learn more about your “peak” demand. For instance, let’s say the peak happens at midnight on Black Friday and that it’s sustained from 12:00 – 12:50 AM. High demand continues the rest of the day, but never exceeds 500Mbps. Why are so many people hitting your site from 12:00 – 12:50 AM? Let’s assume the marketing people tell us that they release some sort of promotion allowing shoppers huge discounts starting at 12:00 AM and going throughout the day. Unfortunately, there’s only enough inventory for 100 of each discounted item, so shoppers hit the site as soon as they’re available.

 

Before this conversation, we were planning on building an infrastructure to support that 1.2Gbps (and beyond) number that’s only hit once per year, and for only an hour. Now that we know more about why that time period is so popular, it’s time to determine whether it’s “cost-effective.” Let’s say we’re spending $1M extra to support demand that exceeds 1Gbps. If we want to avoid that spend, what options do we have to keep our traffic spikes under 1Gbps? What if the promotions are released the night before Thanksgiving? What if different promotions were released each hour during the day? What if there was enough inventory to assure all customers the items they want? What if promotions were e-mailed to different customers at different times? Obviously a marketing group would be better able to answer these questions than I, but there’s a decent chance that such methods could eliminate the short (duration), large (size) spike. Perhaps rather than a 1.2Gbps spike from 12:00 – 12:50 AM, we see a 500Mbps spike from 11:00 PM – 3:00 AM. Assuming profitability isn’t tied to when folks are buying goods, such a change in traffic spikes would allow us to delay a large expense for at least another year.
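As a sanity check on that last scenario, here’s some rough arithmetic in Python. It assumes the same total amount of data gets transferred either way and that traffic is perfectly flat inside each window, which real traffic never is, so treat the result as an average and not a peak. The function name is mine.

```python
def sustained_rate_mbps(peak_mbps, peak_minutes, spread_minutes):
    """Average rate if the data moved during the peak window is spread over a longer one."""
    return peak_mbps * peak_minutes / spread_minutes


# 1.2 Gbps for 50 minutes, spread over the 4 hours from 11:00 PM to 3:00 AM
print(sustained_rate_mbps(1200, 50, 240))
```

That works out to an average of 250Mbps, so a 500Mbps spike over the longer window leaves room for the traffic being twice as lumpy as the average.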

 

Naturally, retail is a great arena for public cloud. What happens, though, when all retailers are on public cloud? Wouldn’t the cloud provider have to have a huge hardware footprint to support Black Friday for all of its retail customers? At any rate, supporting seasonal demand is definitely a challenge, but it poses some interesting opportunities.

As I discussed in my post about “Strategic Points of Control,” F5 LTMs are in a great position to capture and report on information. I’ve recently encountered several issues where I needed to log the systems sending HTTP 404/500 responses and the URLs for which they were triggered. While this information can be obtained from a packet capture, I find it much easier to simply leverage iRules to log the information.

 

If you don’t know too much about iRules, I’d encourage you to head over to DevCentral and do some reading. One of the first things you’ll learn is that there are several “events” in which an iRule can inspect and react to traffic. Each event has different commands that can be used. While some commands can be used in multiple events, some may not.

 

As an example, HTTP::host and HTTP::uri can be used in the HTTP_REQUEST event, but not in the HTTP_RESPONSE event. Since an HTTP Error Response sent by a server would occur in the HTTP_RESPONSE event (between server and LTM), we can’t simply log the value of HTTP::host or HTTP::uri as those commands aren’t usable in the HTTP_RESPONSE context. Fortunately, variables can be set in one event and referenced in another, which allows us to still access the proper information.

 

Here’s an overview of what we’re trying to accomplish:

 

1. A client makes a request to a Virtual Server on the LTM.

2. The LTM sends this request to a pool member.

3. If the pool member (server) responds with an HTTP Status code of 500, we want to log the Pool Member’s IP, the requested HTTP Host and URI, and the Client’s IP address.

 

We’ll be using the “HTTP::status” command to check for 500s. Since this command needs to be executed within the HTTP_RESPONSE event which doesn’t have access to HTTP::host or HTTP::uri, we’ll need to use variables.

From the HTTP_REQUEST event, we’ll utilize said variables to track the value of HTTP::host, HTTP::uri, and IP::client_addr.

The HTTP_REQUEST event in our iRule will look something like this:

when HTTP_REQUEST {

set hostvar [HTTP::host]

set urivar [HTTP::uri]

set ipvar [IP::client_addr] }

Now, we’ll check the HTTP status code from within the HTTP_RESPONSE event and if it’s a 500, we’ll log the value of the variables above.

when HTTP_RESPONSE {
   if { [HTTP::status] eq 500 } {
      log local0. "$ipvar requested $hostvar $urivar and received a 500 from [IP::server_addr]"
   }
}

 

Now, whenever a 500 is sent, you can simply check your LTM logs and you’ll see the client who received it, the server that sent it, and the URL that caused it. This is a fairly vanilla implementation. I’ve had several situations in which I needed to also report on the value of a JSESSIONID cookie so our app folks could also check their logs. In a situation like that, you’d simply set and call another variable.

From HTTP_REQUEST:

set appvar [HTTP::cookie JSESSIONID]

From HTTP_RESPONSE:

log local0. "session id was $appvar"

 

This was a good example of how easily iRules can be leveraged to report on issues. Unfortunately though, this isn’t always a scalable option which is why I thought I’d talk about a product I’ve really enjoyed using.

The folks behind Extrahop call it an “Application Delivery Assurance” product. Since both co-founders came from F5, they have a great handle on Application Delivery and the challenges involved. Since I’m typically only concerned with HTTP traffic nowadays, I use Extrahop to track response times, alert on error responses, and also to baseline our environment. As an F5 user, I’m very pleased to see the product’s help section making recommendations on BIG-IP settings to tune if certain issues are seen.

I’d definitely encourage you to go check out some product literature. Since it’s not always fun to arrange a demo and talk to sales folks, they offer free analysis via www.networktimeout.com. Simply upload a packet capture, it’ll be run through an Extrahop unit, and you can see the technology in action.

 

 

As I discussed in this post, sending an HTTP GET for a page on a server to which you load balance traffic is one of the better health checks available. If you use the right page, it can be an extremely light-weight, yet highly reliable check.

In order to properly utilize these health checks, you need to know enough about the application you’re supporting to understand how it behaves when it fails.

In my case, I send traffic to a pool of Apache Servers running mod_weblogic. From there, the traffic is sent to application instances.

 

Using an F5 BIG-IP LTM as an example, there are several configurable parameters when defining a health check.

 

1. Interval (How often the check is sent)

2. Timeout (How long does the resource have to respond)

3. Send String (The request you’re sending the resource)

4. Receive String (What response causes the health check to pass?)

5. Receive Disable String (What response causes the resource to be marked disabled rather than down?)

 

There are several more, but let’s concentrate on the typical ones.

The default interval is 5 seconds while the timeout is 16. I’ve always been ok with that.

For our send string, let’s do “GET /login.jsp HTTP/1.1\r\nHost: \r\nConnection: Close\r\n\r\n”

So, we’re sending an HTTP GET for /login.jsp using HTTP/1.1 and an empty host header. We’re also closing out the connection so it doesn’t have to sit idle on the server.

For our receive string, let’s do “HTTP/1\.(0|1) (2)”

So, we’re considering a response starting with 2 using HTTP 1.0 or 1.1 as a success. Typically, a server will respond with a 200 when all is well so this is pretty typical.

 

Unfortunately for me, our resource actually sends an HTTP 301 (Permanent Redirect) when a user tries loading the login page. This happens fairly often, especially if you’re sending a health check for “/” and the resource redirects you to a different directory. Since we consider this permanent redirect to be normal behavior, we’ll modify our receive string to “HTTP/1\.(0|1) (2|3)” Now, we’re including all 3xx responses as well. Since a failed resource will usually time out or send a 404/500 when it fails, this should work well.

 

As I mentioned before, my LTM sends traffic to Apache which then sends it to our App instances via mod_weblogic. So, what happens when the app instances are down? I’d expect a 404 or 500 from Apache, right? Sure, as long as your application folks haven’t configured it to send an HTTP 302 (Temporary Redirect) so users go to a custom error page when the App Instances are down.

 

So, here’s what we’ve seen:

 

1. During normal conditions, the resource returns a 301 for its health check.

2. If application instances are down, the resource returns a 302 for its health check.

 

Naturally, we need to modify our Receive String

 

HTTP/1\.(0|1) (2|3)

to

HTTP/1\.(0|1) (2|301)

 

We’re still allowing any 2xx response, but of the 3xx responses we’re now only allowing 301s.
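Before putting a receive string on a production monitor, I like to try it against a few status lines. Here’s a small Python sketch that does that for the two receive strings above. Python’s regex engine isn’t the one the BIG-IP monitor uses, so this only checks the pattern logic, and the sample status lines are made up for illustration.

```python
import re

OLD_RECEIVE = r"HTTP/1\.(0|1) (2|3)"    # any 2xx or 3xx
NEW_RECEIVE = r"HTTP/1\.(0|1) (2|301)"  # any 2xx, or exactly 301


def passes(receive_string, response):
    """True when the receive string is found anywhere in the response."""
    return re.search(receive_string, response) is not None


# Made-up status lines for illustration
samples = [
    "HTTP/1.1 200 OK",
    "HTTP/1.1 301 Moved Permanently",
    "HTTP/1.1 302 Found",
    "HTTP/1.1 404 Not Found",
    "HTTP/1.1 500 Internal Server Error",
]

for line in samples:
    print(line, "| old:", passes(OLD_RECEIVE, line), "| new:", passes(NEW_RECEIVE, line))
```

The 302 line is the one to watch: it passes the old receive string and fails the new one, which is exactly the behavior we want when the app instances are down.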

 

We’ve done what we wanted to. We’ve configured a health check that accurately determines the system’s health. As you’ve noticed though, it required trial and error, and a lot of testing. When determining a health check strategy, it’s critical that either you or an application owner understands their application’s behavior while it’s working, and even more importantly, when it’s not. Also, it’s not always wise to “set and forget” these checks. If, for instance, our application folks changed the “/login.jsp” redirect from a 301 to a 302, the check would fail, and we’d have to come up with a new strategy.

 

 

If you have any familiarity with performance monitoring in a large environment, you’ve likely heard of Gomez. In a similar fashion, if you have experience with application delivery or load balancing, you’ve likely heard of F5. While F5 helps you deliver applications as efficiently as possible, Gomez typically helps you measure and monitor them.

Like most hosted monitoring services, Gomez provides the ability to test a website from multiple locations, multiple browsers, and multiple networks. While these capabilities give a site owner a view into when and where issues occur, they don’t 100% show what users are seeing. Obviously if DNS or routing isn’t working, Gomez will see it, just like your customers would. Unfortunately though, Gomez can’t replicate every single browser, network connection, and machine from which a client might hit your site.

To solve this problem, Gomez recommends “Real-User” monitoring. In order to leverage this technology, site owners must insert client-side JavaScript into their web pages. Unfortunately, if you’re using Gomez, you’re likely monitoring a fairly large site so having to integrate this JavaScript could get very complicated.

Luckily for F5 users, Gomez is a Technology Alliance Partner which makes this problem quite a bit easier to solve. Since F5s are “Strategic Points of Control” that see the client requests and application responses, it’s easy enough to leverage them for the Real-User monitoring.

Joe Pruitt wrote a series of articles on how to leverage iRules to obtain real-user monitoring without having to make application changes.

Part 1 is here.

Part 2 is here.

Part 3 is here.

Throughout the series, Joe discusses how to link client requests to a Gomez account and allows site owners to view stats on a Page, Data Center, or Account basis. While it’s a fairly “complex” iRule, it’s an amazing example of utilizing “network scripting” to allow leveraging an amazing monitoring technology.

While a typical Gomez implementation gives you visibility into how your site is performing for their probes, real-user monitoring shows you how it’s performing for your actual customers. This is a huge win for both designers and troubleshooters. Imagine being able to see that 10% of your users are having issues with a particular page and only in a particular Data Center. Talk about expediting the troubleshooting process. Also, if you can see that your page load times are exceeding SLAs but only for mobile users, you’ve quickly identified a page that might be a candidate for mobile optimization.

Performance monitoring has obviously come a long way in the last few years. Once upon a time, it was adequate to load separate pages. Now, transactional monitoring is typically a requirement. Again, a simple Gomez implementation does allow you to monitor that your systems are handling transactions but it doesn’t tell you that your users are really completing them.

With most monitoring vendors, you pay extra to have a site tested from multiple locations. By utilizing real-user monitoring, you’ve turned every one of your visitors into a monitoring probe and are able to gather and act upon the data they’re generating for you. In my opinion, the biggest win with real-user monitoring is that you’re 100% seeing issues before your customers report them…as long as the user can get to your F5s anyways.

For a while now, F5 has been referring to their BIG-IP products as “Strategic Points of Control.” When I first heard that phrase, I didn’t really understand what they were trying to say and assumed it was “marketing speak.” As I’ve gotten better at leveraging F5 technologies to solve my very complicated requirements, I’ve begun understanding what they meant.

I was going to write a blog post about “Strategic Points of Control” a couple months ago, but Lori MacVittie had already beaten me to it.

She defines Strategic Points of Control as “Locations within the data center architecture at which traffic (data) is aggregated, forcing all data to traverse the point of control.”

I think that’s a great definition so I’ll happily use it here.  For our example, let’s assume we’re hosting an E-Commerce site. Naturally, traffic traverses our F5 LTMs on its way to our application instances. This means the F5s are not only a point of failure, but also a point of control. They see all inbound and outgoing content for this application. Since F5 does a wonderful job of building L7 visibility into their devices, LTM becomes a great candidate for altering or reporting on the traffic flowing through it. Of course, just because it can, doesn’t mean it should.

Someone posted a question on DevCentral (F5’s User Community) wondering when it was prudent to use iRules. Naturally, most of us answered “it depends.”

While almost everyone appreciates the flexibility of iRules, some fear that they might be used when they shouldn’t be.

I recently worked on a project that required us to ensure an HTTP application only used HTTPS. Since this application was being fronted by an F5 LTM pair, it made sense to terminate the SSL there and send cleartext between the F5 and application. Sometimes it’s as easy as making an HTTPS Virtual Server and applying an SSL profile containing the proper cert, but I wasn’t that lucky. This particular application sent redirects to the user based on how it was being accessed. If it was being hit over HTTP, it sent redirects specifying http. If it was being hit over HTTPS, it sent redirects specifying https. In this case, even though we were using HTTPS between the client and LTM, the application would still see traffic over HTTP since we weren’t re-encrypting the data between LTM and the application. Naturally, this would cause a user to stop using SSL as soon as they clicked a link.

Fortunately, since LTM sees the traffic between itself and the application, it can see these redirects and rewrite them. By using “redirect rewrite,” I was able to rewrite the redirects sent by the application to use https. Unfortunately, this application also had JavaScript buttons that, when clicked, would cause the user to send a GET request specifying HTTP. Again, since LTM is a “strategic point of control” and sees the traffic, I simply wrote an iRule to redirect all HTTP requests for this Virtual Server to HTTPS.
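As a rough sketch of the kind of “http-to-https” rule described above (not the exact rule from this project), something like the following could be attached to the port 80 Virtual Server. It assumes the HTTPS Virtual Server answers on the same hostname and the default port 443, so any port in the Host header is dropped.

```tcl
when HTTP_REQUEST {
    # Assumption: this rule is attached only to the HTTP (port 80) Virtual Server.
    # getfield strips any ":port" suffix from the Host header before redirecting.
    HTTP::redirect "https://[getfield [HTTP::host] ":" 1][HTTP::uri]"
}
```

Because the redirect is issued by LTM, the request never has to reach an application instance just to be sent back over HTTPS.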

After creating the iRule for the redirect, I let the application team know that we were ready for them to start testing. They were somewhat surprised that I was able to make the application use HTTPS without them making any changes. One of them actually said, “awesome, I like when it’s easy like this and we don’t have to hack crap together.” With a huge smile on my face, I said, “that’s pretty much exactly what I just did.”

It only took me about 10 minutes to brush up on “redirect rewrites” and since I had written plenty of “http-to-https” iRules, this was extremely easy. At the end of the day though, I used iRules to fix an application “issue.” While this is one of the best features of iRules, it demonstrates their potential use as a mitigation tool. What if I was the only person to have a good understanding of iRules or how we were using LTM to handle the redirects for this application? If someone accidentally altered or removed that iRule, the application would start having issues. If the application code was rewritten to only use HTTPS, there really wouldn’t be any concerns. Of course, there are a ton of application instances and by making the change on the F5s, we keep traffic from having to get to the apps just to be redirected and also are able to make a change in only one place.

One of the most enjoyable posts I’ve made dealt with using iRules to generate Heatmaps to illustrate site visitors. Even though I tested this iRule and got it working well, I ended up choosing not to use it. Because my site leverages Akamai’s DSA product, we have access to very similar information through their portals. By using their site to track this info, I essentially traded one Strategic Point of Control for another. Obviously I saved myself a performance hit on our F5s, but it really came down to whether tracking users like this was a proper use of my LTMs.  The answer, as always, is that “it depends.” For sites that don’t have Akamai or some other product that also has visibility into information like this, F5 might be your best option.

Assuming you’re using Akamai and have an F5 deployment, you’ll run into several areas of overlapping technologies:

1. Using Context to handle different users…differently.

2. Protecting application resources by throttling users based on whether cookies exist.

3. Web Application Firewalling

4. Redirects

5. Limiting access to a site to certain geographic areas/types of users

6. Compression, Caching, Acceleration

The list could easily go on, but it demonstrates some potential challenges an architect might face. Since both Akamai and F5s are strategic points of control, which should you use? I think the most accepted rule is “the closer to the user, the better.” In reality, it comes down to a cost/benefit comparison. While making these decisions in Akamai-land both limits traffic to your infrastructure and accelerates the user experience, there’s a price for that. Assuming you already have capacity on your LTM, it would be free (save labor) to use it instead, whereas Akamai would likely charge for each feature.

I’ve often spoken about how valuable learning from failures is to an IT professional’s development. The challenge is how best to limit these failures to a controlled environment in which business impact is minimized. Due to the complexities of IT environments, it’s not always easy to notice a misconfiguration when it happens, thus exposing businesses to potentially prolonged issues.

Fortunately, a lot of systems provide logging capabilities. Pretty much every network vendor allows SNMP trapping and syslogging from their devices. The challenge is configuring these properly and making sure you’re always watching them.

Here’s an example iRule that limits access to certain domains:

when HTTP_REQUEST {
    # Reject any request whose Host header is not in the dg_host data group
    if { ! [class match [HTTP::host] equals dg_host] } {
        reject
        # Record who was blocked and what they asked for
        log local0. "[IP::client_addr] went to [HTTP::host][HTTP::uri] and was rejected."
    }
}

The “log local0.” line is only executed if a user hits the Virtual Server with a Host header that doesn’t exist in our data group of allowed hosts. This is similar to logging blocks on a router ACL.

While viewing my log entries, I quickly noticed I was blocking people trying to go to “Domain.com”, “DOMAIN.COM”, and “domain.com.”. Since the users were still trying to go to the proper domain, I modified my statement from

if { ! [class match [HTTP::host] equals dg_host] } {

to

if { ! [class match [string tolower [HTTP::host]] equals dg_host] } {

“string tolower” converts the specified string to lowercase. The reason I hadn’t initially done this was that most browsers automatically lowercase the Host header when they submit a request. By logging the blocks for my rule, I was able to see exactly what was getting blocked so I could make a change.
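Lowercasing alone would not match a host sent with a trailing dot, such as “domain.com.”. Here is a minimal sketch of a rule that normalizes both cases before the lookup. The variable name “host” is my own, and it assumes every entry in dg_host is stored in lowercase without a trailing dot or port.

```tcl
when HTTP_REQUEST {
    # Normalize the Host header: lowercase it, then strip any trailing dot.
    # Assumption: dg_host entries are lowercase with no trailing dot or port.
    set host [string trimright [string tolower [HTTP::host]] "."]

    if { ! [class match $host equals dg_host] } {
        # Log the header exactly as the client sent it, so odd variants stay visible.
        log local0. "[IP::client_addr] went to [HTTP::host][HTTP::uri] and was rejected."
        reject
    }
}
```

Logging the original header rather than the normalized one keeps the log useful for spotting new variants that the normalization doesn’t cover yet.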

Since LTMs are typically placed at “Strategic Points of Control” within a network, they can control and report on traffic. In this case, we’re logging the user’s IP address, Host header, and requested URI.

A typical log entry might look like “1.1.1.1 went to www.domain.com/index.html and was rejected.”

This is actually a relatively simple logging statement. While troubleshooting a recent issue where certain users weren’t accepting cookies from my LTM, I decided to add [HTTP::header "User-Agent"] to my logging. That quickly pointed out that the users having issues were on Google Droids, which told me I needed to check our mobile-adaptive logic. If I had added User-Agent logging to my iRule above, I’d have quickly discovered which browsers don’t convert the Host header to lowercase.
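For illustration, here is roughly how the User-Agent could be appended to the rejection log in the rule above. The wording of the log message is just an example; [HTTP::header User-Agent] returns the value of that request header.

```tcl
when HTTP_REQUEST {
    if { ! [class match [string tolower [HTTP::host]] equals dg_host] } {
        # Include the User-Agent so blocked requests can be grouped by browser or device.
        log local0. "[IP::client_addr] went to [HTTP::host][HTTP::uri] and was rejected. User-Agent: [HTTP::header User-Agent]"
        reject
    }
}
```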

You can easily comment out logging commands in an iRule until you need them. By logging at different points of your rule, you can quickly see at which steps you’re having issues.
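One alternative to commenting log lines in and out is to wrap them in a debug flag. This is only a sketch: the variable name “static::debug” is my own choice, and it assumes a BIG-IP version that supports the static:: namespace (v10 and later).

```tcl
when RULE_INIT {
    # Hypothetical toggle: set to 1 to enable debug logging, 0 to silence it.
    set static::debug 1
}

when HTTP_REQUEST {
    if { $static::debug } {
        log local0. "Request from [IP::client_addr] for [HTTP::host][HTTP::uri]"
    }

    if { ! [class match [string tolower [HTTP::host]] equals dg_host] } {
        if { $static::debug } {
            log local0. "[HTTP::host] not found in dg_host, rejecting"
        }
        reject
    }
}
```

With the flag set to 0, the rule behaves the same but stays quiet, so the log statements can stay in place for the next round of troubleshooting.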
