HTTP Content checks are very similar to HTTP checks, except that in addition to checking the status code and response time, they look for a specific string in the data returned in the response.
HTTP is the transfer protocol for web pages and is also used by some other web-based technologies. It is a request-response protocol: the client, usually a browser, sends a request to a server, and the server sends back a response, ideally with the information the user wanted. The response includes a header component and a content component. The content component is the part people see in their browser; it is the part that contains the HTML.
The HTTP Content check makes an HTTP GET request for the target URL and verifies that the response code is between 200 and 399, that the response time is within the configured threshold, and that the content string or regex configured in the check matches the page. This check type supports both positive and negative checks: it can be configured to pass if the page includes the content, or to pass if the page does not include the content.
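The pass conditions described above can be sketched as a small function. This is only an illustration of the logic, not the service's actual implementation; the function name, parameter names, and the 5-second default threshold are assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class CheckResult:
    passed: bool
    reason: str


def evaluate_content_check(status, elapsed_s, body, pattern,
                           threshold_s=5.0, must_contain=True, use_regex=False):
    """Apply the three conditions to one already-received HTTP response."""
    # 1. The status code must be between 200 and 399.
    if not 200 <= status <= 399:
        return CheckResult(False, f"status {status} is outside 200-399")
    # 2. The response time must be within the configured threshold.
    if elapsed_s > threshold_s:
        return CheckResult(False, f"response took {elapsed_s:.2f}s")
    # 3. The content must (or must not) include the string or regex match.
    if use_regex:
        found = re.search(pattern, body) is not None
    else:
        found = pattern in body
    if found != must_contain:
        expected = "present" if must_contain else "absent"
        return CheckResult(False, f"content was expected to be {expected}")
    return CheckResult(True, "ok")
```

Calling it with `must_contain=False` models the negative ("Does not contain") form of the check.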
HTTP Content checks are a key piece of a good website monitoring strategy. Use them any time you need to monitor whether a web page is responding with specific content. In most cases, the content you look for should come from a part of the page that doesn't change. This is particularly important when you are monitoring dynamic sites (especially blogs) and sites with content management systems. Footer sections of pages are often a good choice.
Optionally, instead of looking for an exact match string in the returned content, you can use a regular expression (regex).
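As an illustration of when a regex helps, consider a footer whose text is stable except for the year. The footer wording and the pattern below are hypothetical, and the example uses Python's `re` syntax; the regex dialect accepted by the check itself may differ.

```python
import re

# Hypothetical footer text: stable wording, but the year changes annually.
FOOTER_PATTERN = re.compile(r"Copyright \d{4} Example Corp")


def footer_present(body):
    """Return True if the footer appears in the body with any four-digit year."""
    return FOOTER_PATTERN.search(body) is not None
```

An exact-match string such as "Copyright 2024 Example Corp" would start failing when the year rolls over, while the pattern keeps matching.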
Negative content checks are useful for monitoring dynamic content blocks for errors. For example, if you have a feed or block on your page that shows the latest stories, that part of the page might have errors or problems that are independent of the rest of the page loading properly. In those situations you can use the "Does not contain" configuration for HTTP Content checks to make sure that the page does not include "Database connection error" or "0 New Stories." HTTP Content checks are also useful for monitoring status pages. For example, some servers have a status page that lists "OK" for several components of the services on that server, and an "Error" status for a specific service if it has problems. You can configure an HTTP Content check to pass if the content returned by the page does not include the word "Error."
To set up an HTTP Content check,
This check doesn't care what the rest of the data returned by the page looks like. In fact, the data doesn't even need to be HTML. It just has to contain the string or regex match the check is looking for. This means that the check is useful for checking XML or JSON content as well as standard web pages. It can also look for HTML tags, but the system does some filtering to protect against XSS attacks, so there are some specific strings that you can't use for this check. You can typically work around this by using a part of the string that doesn't look like an XSS attack.
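Because the match runs against the raw response text, whitespace matters when the target is JSON. The sketch below uses a made-up JSON payload and Python's `re` syntax to show why a whitespace-tolerant regex can be more robust than an exact string; all names and values are assumptions for the example.

```python
import re

# The same hypothetical JSON payload, serialized two different ways.
BODY_COMPACT = '{"status":"ok","queue":3}'
BODY_PRETTY = '{\n  "status": "ok",\n  "queue": 3\n}'

EXACT = '"status":"ok"'
TOLERANT = re.compile(r'"status"\s*:\s*"ok"')


def exact_match(body):
    """Plain substring match: sensitive to how the JSON was serialized."""
    return EXACT in body


def tolerant_match(body):
    """Regex match that allows optional whitespace around the colon."""
    return TOLERANT.search(body) is not None
```

The exact string matches only the compact form, while the regex matches both, so a formatting change on the server would not cause a false failure.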
This check limits the data received from the server to 3MB. If your URL returns more than 3MB of data, the check will fail.
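If you want to test a URL against a similar limit yourself, a size-capped read might look like the sketch below. It assumes "3MB" means 3 * 1024 * 1024 bytes, which the text above does not specify, and `read_limited` is a hypothetical helper that works on any file-like object.

```python
import io

MAX_BODY_BYTES = 3 * 1024 * 1024  # assumption: "3MB" interpreted as 3 MiB


def read_limited(stream, limit=MAX_BODY_BYTES, chunk_size=64 * 1024):
    """Read a file-like stream in chunks; raise ValueError past the limit."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of stream
        total += len(chunk)
        if total > limit:
            raise ValueError("response body exceeds size limit")
        chunks.append(chunk)
    return b"".join(chunks)


# Usage with an in-memory stream standing in for a response body.
small_body = read_limited(io.BytesIO(b"hello"))
```

Reading in chunks means an oversized response is rejected without first buffering all of it in memory.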
When following redirects, there is a limit of 4 redirects, after which the check fails with the error "Too many redirects".
The threshold timeout applies to the entire redirect chain (every request and response), not just the final HTTP request.
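The two redirect rules above can be sketched together: one counter for redirects and one deadline shared by every hop. This is an illustration only. The `fetch` callable is a hypothetical stand-in that returns `(status, location)` for a URL, and the sketch assumes that up to 4 redirects are followed and the 5th triggers the error, which is one reading of the limit.

```python
import time

MAX_REDIRECTS = 4
REDIRECT_CODES = (301, 302, 303, 307, 308)


def follow_redirects(url, fetch, threshold_s, max_redirects=MAX_REDIRECTS):
    """Follow redirects under one shared deadline.

    fetch(url, timeout_s) is a hypothetical callable returning
    (status_code, location_or_None).
    """
    deadline = time.monotonic() + threshold_s
    redirects = 0
    while True:
        # The remaining budget shrinks with every hop in the chain.
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("threshold exceeded across redirect chain")
        status, location = fetch(url, remaining)
        if status in REDIRECT_CODES and location:
            redirects += 1
            if redirects > max_redirects:
                raise RuntimeError("Too many redirects")
            url = location
            continue
        return url, status
```

In practice this means a slow first hop leaves less time for the final page, so a long redirect chain can fail the threshold even when each individual response is fast.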
IPv6 URLs require bracket formatting, such as http://[2606:c700:4020:11::53:4a3b]/
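If you generate check URLs from a list of hosts, a small helper can add the brackets only where they are needed. `url_for_host` is a hypothetical name; the sketch relies only on Python's standard `ipaddress` and `urllib.parse` modules.

```python
import ipaddress
from urllib.parse import urlsplit


def url_for_host(host, scheme="http", path="/"):
    """Build a URL, wrapping IPv6 literals in square brackets."""
    try:
        is_v6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        is_v6 = False  # not an IP literal, e.g. a hostname
    netloc = f"[{host}]" if is_v6 else host
    return f"{scheme}://{netloc}{path}"


# Round trip: the parser strips the brackets again to recover the address.
ipv6_url = url_for_host("2606:c700:4020:11::53:4a3b")
parsed_host = urlsplit(ipv6_url).hostname
```

Without the brackets, the colons in the address would be ambiguous with the port separator.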
SSL/TLS certificates are not validated by this check, so it works with expired and self-signed certificates. Add an SSL check to verify your certificates and get warnings before they expire.
SSLv3 and TLS 1.0 are not supported.
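To reproduce this behavior when testing a URL locally, you can build a TLS context that skips certificate validation. This is a sketch using Python's standard `ssl` module, not the check's own code, and `lenient_tls_context` is a hypothetical helper name. Use it only for diagnostics, since it disables the protection that certificate validation provides.

```python
import ssl


def lenient_tls_context():
    """Return a TLS context that accepts expired and self-signed certificates."""
    ctx = ssl.create_default_context()
    # Hostname checking must be disabled before verify_mode can be CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx
```

The resulting context can be passed to `urllib.request.urlopen(url, context=...)` to fetch a page the same way regardless of certificate state.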