IsValid and RegEx Issues

While writing some new documentation on ColdFusion MX 7 features you may have missed (for the soon-to-be-released Fusion Authority Quarterly Update), I came across an interesting issue with the IsValid function.

This function is used to test if a ColdFusion variable is of a particular data type or if it matches a particular kind of value (email address, credit card, etc.). In many ways this is a single function that can stand in for most of the standard "IS" functions (IsStruct, IsQuery, etc.). One notable exception is that the function does no existence checking (IsDefined).

One feature of the function is the ability to use a regular expression as a validator. This makes the IsValid function a perfect replacement for using REFind to do regular expression pattern validation, with a single exception: REFind is case sensitive and has a NoCase version (REFindNoCase) to deal with case-insensitive matches, while IsValid is only case sensitive.

For many, this may mean that using IsValid in place of REFind for regular expression pattern validation is not an option. To those I say, there is a simple solution.

ColdFusion MX 7 has some advanced options for regular expressions which include a simple prefix for any expression to make it case insensitive. To make IsValid in regular expression mode case insensitive, just do the following:

IsValid('regex', variable, '(?i)pattern')

The (?i) syntax at the start of a regular expression pattern makes the entire match case insensitive. Other advanced regular expression prefixes can be used with the function as well, which can greatly expand its capabilities.
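As a minimal sketch of the difference, the snippet below runs the same pattern with and without the (?i) prefix. The variable names, the sample value and the pattern itself are made up for illustration; the pattern is anchored with ^ and $ so the result does not depend on whether IsValid looks for a full or a partial match.

```cfml
<!--- Hypothetical input value and pattern, for illustration only --->
<cfset productCode = "abc123">

<!--- Without (?i): the uppercase-only character class should reject lowercase input --->
<cfset strictMatch = IsValid("regex", productCode, "^[A-Z]{3}[0-9]{3}$")>

<!--- With (?i): the same pattern is matched without regard to case --->
<cfset looseMatch = IsValid("regex", productCode, "(?i)^[A-Z]{3}[0-9]{3}$")>

<cfoutput>
Case sensitive: #strictMatch#<br>
Case insensitive: #looseMatch#
</cfoutput>
```

The first test should fail and the second should pass for the lowercase sample value.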

Note that we are talking about ColdFusion-style regular expressions here. The documentation for the IsValid function incorrectly states that it uses JavaScript-style regular expression syntax. This is not the case.

There are a few other quirks with the function that you're almost never going to run into, but I suggest you keep an eye on the IsValid LiveDocs entry for any changes and clarifications.

Abort is an Error

I've been trying to use application.cfc more in my work and I've found another little 'gotcha'. Let's walk through this one step by step. A LOT of people use CFABORT in their code to control the flow of their page. The logical assumption is that when the CFABORT is called, the page stops dead in its tracks. If the showerror attribute is set on the CFABORT, an exception is thrown, which makes it act like a slightly cut-down version of the CFTHROW tag. But when you have not set the showerror attribute, you don't expect an error, right?

Guess what. There is a specific situation where the use of a simple CFABORT will be considered an error and this is (or should be) a common situation. When you have an application.cfc with an onRequest() and an onError() method and the template being called by the onRequest() has a CFABORT in it, the onError will detect that usage of the CFABORT as an error.

The error message will be supremely unhelpful as it simply says that there's an error in the onRequest method of the application.cfc. You have to look through the stack trace and tag context to see that the error was on a specific line and the last tag to fire was a CFABORT. But even then, you would not think this was the problem as there is no logical reason for it to be the problem.

The solution is to either not use CFABORT for page control when using these two methods or to remove one of those methods from your application.cfc. Either one can be removed and the error is gone.
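A stripped-down setup along these lines should be enough to see the behavior described above. This is only a sketch: the application name, the file name page.cfm and the output text are hypothetical, and the two files are shown in one block for brevity.

```cfml
<!--- Application.cfc (hypothetical minimal example) --->
<cfcomponent output="false">
  <cfset this.name = "AbortDemo">

  <!--- onRequest includes the requested template itself --->
  <cffunction name="onRequest" returntype="void" output="true">
    <cfargument name="targetPage" type="string" required="true">
    <cfinclude template="#arguments.targetPage#">
  </cffunction>

  <!--- onError reports which event it was called for --->
  <cffunction name="onError" returntype="void" output="true">
    <cfargument name="exception" type="any" required="true">
    <cfargument name="eventName" type="string" required="true">
    <cfoutput>onError fired during: #arguments.eventName#</cfoutput>
  </cffunction>
</cfcomponent>

<!--- page.cfm (hypothetical template requested by the browser) --->
<cfoutput>Before the abort</cfoutput>
<cfabort>
```

Requesting page.cfm with both methods in place is the combination that triggers onError; commenting out either onRequest or onError should make the plain CFABORT behave as expected again.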

So what has this taught me? Not that there is a strange combination of code in ColdFusion. Not that there is a potential bug in how onError handles error events. This has taught me that there are just not a lot of people using the application.cfc template. If there were, this problem would have appeared much earlier. Maybe it has and I missed it, but somehow I doubt that.

Pseudo-memory leak

For the last few weeks I've been having some problems with House of Fusion. The memory for the JRun.exe has been going through the roof and I didn't know why. The code was tight, nothing had really changed on the site, so what was up? The answer was Yahoo.
In the last 3 weeks Yahoo has ramped up their indexing of sites. For a site as large as House of Fusion, this can take quite a bit of time. I've logged 2-4 Yahoo bot hits per second at some times.
So how was Yahoo the problem? Because of client variables. Not DB client variables and not even the dreaded registry client variables. Just simple cookie-based client variables. It seems that when a client variable is set, a memory structure is also set up by CF. Each bot hit is assumed to be its own session, as the bot does not accept cookies. This means each bot hit generates a memory structure of about 1 KB. That is not really a lot, but when you have tens of thousands of hits from bots a day, it adds up.
I'm still waiting on word from Macromedia as to when a client memory structure times out, but this seems to be the issue.

So what's the solution? There are 4.
1. Increase your RAM. If you can do this, then ramp up your memory as high as you can. This is not a perfect solution but it saves throwing time at the problem and gives you a 'buffer' against problems of this sort.
2. Set a robots.txt with a Crawl-delay setting. Mine is set to 1 second but you can set yours to something higher.
3. Set a different cfapplication for the most common bots. I use a simple regular expression to find keywords that only exist in bot user agents:
<CFIF REFindNoCase('Slurp|Googlebot|BecomeBot|msnbot|Mediapartners-Google|ZyBorg|RufusBot|EMonitor', cgi.http_user_agent)>
<CFAPPLICATION name="FusionA" clientmanagement="no" sessionmanagement="no" setclientcookies="no" setdomaincookies="no" clientstorage="Cookie">
<CFELSE>
<CFAPPLICATION name="FusionA" clientmanagement="yes" sessionmanagement="no" setclientcookies="yes" setdomaincookies="no" clientstorage="Cookie">
</CFIF>
This will make sure that a client structure is NOT created for one of these bots.
4. Use the same regex to clean out the client structure after the bot finishes the page. Use structclear(client) to remove the data in the onRequestEnd.cfm, the onRequestEnd method of the application.cfc or in the template itself.
The bottom line is that while bots are great for indexing your content, they can wreak havoc on your system when a lot of memory is assigned to what is essentially a 'dead session'.
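Option 4 could be sketched roughly as follows, reusing the bot pattern from option 3. This assumes the check lives in onRequestEnd.cfm and that client management is enabled for the application; adjust the list of user agent keywords to the bots you actually see.

```cfml
<!--- onRequestEnd.cfm: runs after each page has finished processing --->
<cfif REFindNoCase('Slurp|Googlebot|BecomeBot|msnbot|Mediapartners-Google|ZyBorg|RufusBot|EMonitor', cgi.http_user_agent)>
  <!--- The request came from a known bot, so discard its client data --->
  <cfset StructClear(client)>
</cfif>
```

The same test could go in the onRequestEnd method of an application.cfc instead. Compared with option 3, this keeps a single cfapplication tag but still creates the client structure for every bot hit before clearing it.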

Be careful updating

I recently added the post-ColdFusion 7.01 hotfix to my server. I thought this was a great idea, except I made a single major mistake: I had never updated the server to 7.01. The result was over 10,000 error emails between the hours of 3am and 11am. I'm not sure if the errors were all from bots or from people as well, but the bottom line is the same. I had to remove the hotfix and reboot. Everything was good from there. I'll add it back in after I actually update to 7.01.

I'm sure no one else will do such a foolish thing, but if you start getting a stream of "java.lang.NoClassDefFoundError" errors saying "coldfusion/util/IPAddressUtils null
The error occurred on line ...", then you know the reason.

CF 7.01 Hotfix 1
