Posts for April 2008

How to: Create an ASP.NET CAPTCHA Control (part 1)

As I explained in my previous post, I developed a CAPTCHA ASP.NET control for this blog. In the next few posts, I will explain the steps involved in doing this, and how you can develop your own CAPTCHA control.

There are several variations on CAPTCHA tests, the most common of which requires the user to type the characters displayed in an image. The idea is that only a human will be able to read these characters, so if the challenge response is correct, it is most likely a "real human" submitting the data. Since modern OCR software can be quite effective, it is necessary to make the characters hard to read by distorting their shape or adding noise or lines. Of course, these measures also make the CAPTCHA harder for a human to read. For my CAPTCHA control, I decided to emphasize ease of use for the end user. Therefore, the generated images should be easy to read.

When deciding which characters to display on the image, there are generally two approaches: generate them randomly, or choose from a predefined set of words. I chose the latter approach, since an actual word is easier for a human to recognize. I therefore store a list of English words, from which I select one at random whenever I need to generate a CAPTCHA.

Step one: Creating the basic control
I have chosen to implement the CAPTCHA as a UserControl, so that the look of the control, or individual parts of it, can be changed later if I need to. I created a UserControl and placed an image tag and a textbox on it. These are the essential parts of the CAPTCHA control.

The basic control implementation does the following: Whenever the control is shown, a word is selected randomly for the challenge. A unique, random URL for the CAPTCHA image is also generated. The purpose of using a unique URL is to ensure that the browser does not display an old CAPTCHA image because it caches it locally.

The selected word is stored in Session state. Like the URL, it is exposed as a public static property that is populated on demand. This makes sure that the image-rendering code will be able to get the correct word, and the encapsulation ensures that I can change the storage if necessary. This is the implementation of these two properties:

    /// <summary>
    /// Gets the captcha URL.
    /// </summary>
    /// <value>The captcha URL.</value>
    public static string CaptchaUrl
    {
        get
        {
            // Generate the URL on first access; the random number makes it unique.
            if (MyContext.Session[CaptchaUrlKey] == null)
                MyContext.Session[CaptchaUrlKey] = String.Format("/captcha/{0}.ashx", rand.Next());
            return (string)MyContext.Session[CaptchaUrlKey];
        }
    }

    /// <summary>
    /// Gets the captcha word.
    /// </summary>
    /// <value>The captcha word.</value>
    public static string CaptchaWord
    {
        get
        {
            if (MyContext.Session[CaptchaWordKey] == null)
            {
                string listWords = Settings.User["CaptchaWords"];
                var words = listWords.Split(',');
                // Random.Next(max) excludes max, so pass the full length
                // to make the last word in the list selectable as well.
                MyContext.Session[CaptchaWordKey] = words[rand.Next(words.Length)].Trim();
            }
            return (string)MyContext.Session[CaptchaWordKey];
        }
    }

When the control is displayed, the image on the control is databound to the CaptchaUrl property, so it will display the image containing the correct word. The request the browser sends for the image is handled by a separate HTTP handler (which we will discuss in a later post), which outputs the generated image.

On postback, the control checks the text the user has entered, and if it matches the generated word, a public property called "IsValid" is set to true. This tells the page or control that hosts our CAPTCHA that the user has passed the CAPTCHA test. After the check, the word and URL are reset, so a new CAPTCHA will be generated if the control is shown again.
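
The check-and-reset step described above could be sketched roughly as follows. This is not the control's actual code: the class name CaptchaCheck, the key values, the case-insensitive comparison, and the use of a plain dictionary in place of Session state are all assumptions made to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

public static class CaptchaCheck
{
    // Assumed key names; the real control keeps its own constants.
    public const string CaptchaWordKey = "CaptchaWord";
    public const string CaptchaUrlKey = "CaptchaUrl";

    // Compares the user's input with the stored word, then clears the stored
    // word and URL so that the next display generates a fresh challenge.
    // The dictionary is a stand-in for Session state.
    public static bool Validate(IDictionary<string, object> session, string userInput)
    {
        object stored;
        session.TryGetValue(CaptchaWordKey, out stored);
        string expected = stored as string;

        // Ignoring case and surrounding whitespace is an assumption,
        // chosen here in the spirit of keeping the test easy for humans.
        bool isValid = expected != null
            && userInput != null
            && String.Equals(expected, userInput.Trim(), StringComparison.OrdinalIgnoreCase);

        // Reset regardless of the outcome, so a word can only be tried once.
        session.Remove(CaptchaWordKey);
        session.Remove(CaptchaUrlKey);
        return isValid;
    }
}
```

Because the stored values are removed whether or not the answer was correct, a second attempt against the same word always fails, and the next time the control is shown a new word and URL are generated.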

A slightly better approach would be to implement the control as a .NET Validator control, so that it could take part in the page validation along with other validator controls. This would eliminate the need for the other controls on the page to be aware of the CAPTCHA. Doing this would not be much more work; one would simply need to inherit from the abstract BaseValidator class and implement the necessary methods.
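
As a rough illustration of the validator idea, a minimal sketch might look like the following. The class name CaptchaValidator and the session key are hypothetical, and the sketch reads the word directly from Session rather than through the CaptchaWord property shown earlier; it only compiles inside an ASP.NET Web Forms project.

```csharp
using System;
using System.Web;
using System.Web.UI.WebControls;

// Hypothetical validator; the class name and the session key are assumptions.
public class CaptchaValidator : BaseValidator
{
    private const string CaptchaWordKey = "CaptchaWord"; // assumed key name

    // Called by the page during validation. Returns true when the text in
    // the control named by ControlToValidate matches the stored word.
    protected override bool EvaluateIsValid()
    {
        string input = GetControlValidationValue(ControlToValidate);

        var session = HttpContext.Current != null ? HttpContext.Current.Session : null;
        string expected = session != null ? session[CaptchaWordKey] as string : null;

        if (input == null || expected == null)
            return false;

        return String.Equals(input.Trim(), expected, StringComparison.OrdinalIgnoreCase);
    }
}
```

With something like this in place, the hosting page would only need to point ControlToValidate at the CAPTCHA textbox and check Page.IsValid as it does for any other validator.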

Hacking ASP.NET: Trace information

All ASP.NET developers probably know about the trace feature in ASP.NET. Provided you have enabled tracing in web.config (using <trace enabled="true" /> in the system.web element), requesting the URL /trace.axd will give you a nice list of trace information for the most recent requests.
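
For reference, a web.config fragment that enables tracing might look like this. The attribute values are only examples, not recommendations from the original setup.

```xml
<configuration>
  <system.web>
    <!-- enabled:      turns request tracing on
         requestLimit: number of requests kept for trace.axd
         pageOutput:   false keeps trace output off the rendered pages
         localOnly:    true restricts trace.axd to requests made on the server -->
    <trace enabled="true" requestLimit="50" pageOutput="false" localOnly="true" />
  </system.web>
</configuration>
```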

I have often thought about putting this wealth of information to better use, perhaps by building more detailed reports from the trace information. This could be useful during testing. Unfortunately, as far as I can tell, the only way to get at the information is to request trace.axd. There seems to be no supported programmatic way of doing this.

So I set about finding out how this could be done. At first I thought about creating a screen scraper that requests trace.axd and collects the information. But this would be impractical, especially when large amounts of data need to be collected.

A better approach seemed to be to find out how ASP.NET actually stores this information. Since trace.axd is actually an IHttpHandler (System.Web.TraceHandlers.TraceHttpHandler), the natural starting point was using Reflector to view the internals of this class. It did not take long to figure out that the HttpRuntime class has a static internal property named Profile, of the type System.Web.Util.Profiler, which is itself internal. This is the class responsible for collecting the trace information, and it has a GetData method. This method returns the current trace information as an IList containing DataSets.

Armed with this information, I wrote a small class that uses reflection to obtain the profiling data. The class looks like this:

    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Data;
    using System.Linq;
    using System.Reflection;
    using System.Web;

    namespace dr.TraceAnalyzer
    {
        /// <summary>
        /// Proof-of-concept class for accessing trace data using reflection.
        /// </summary>
        public class TraceData
        {
            /// <summary>
            /// Data
            /// </summary>
            private IList data = null;

            /// <summary>
            /// Gets the trace data in its raw list-of-datasets representation.
            /// </summary>
            public IList Data
            {
                get
                {
                    if (data == null)
                        GetCurrentData();
                    return data;
                }
            }

            /// <summary>
            /// Returns the response time for each request stored in the trace data.
            /// </summary>
            public IEnumerable<double> RequestResponseTimes
            {
                get
                {
                    GetCurrentData();
                    var sets = from d in Data.Cast<DataSet>()
                               select d;
                    // The last row of each trace table holds the total time for the request.
                    return from set in sets
                           let traceTable = set.Tables["Trace_Trace_Information"]
                           where traceTable != null && traceTable.Rows.Count > 0
                           select (double) traceTable.Rows[traceTable.Rows.Count - 1]["Trace_From_First"];
                }
            }

            /// <summary>
            /// Gets the current data from the Profiler instance's GetData method.
            /// </summary>
            /// <returns>The current trace data as a list of DataSets.</returns>
            public IList GetCurrentData()
            {
                var profiler = GetProfiler();
                Type profilerType = profiler.GetType();
                MethodInfo method = profilerType.GetMethod("GetData", BindingFlags.Instance | BindingFlags.NonPublic);
                return data = (IList) method.Invoke(profiler, null);
            }

            /// <summary>
            /// Use reflection to get the Profiler instance.
            /// </summary>
            /// <returns>The Profiler instance held by HttpRuntime.</returns>
            private object GetProfiler()
            {
                Type runtimeType = typeof (HttpRuntime);
                PropertyInfo profileProperty = runtimeType.GetProperty("Profile",
                                                                       BindingFlags.NonPublic | BindingFlags.Static);
                if (profileProperty != null)
                {
                    return profileProperty.GetValue(null, null);
                }

                throw new ApplicationException("Reflection to get profiler instance failed.");
            }
        }
    }

I have yet to decide what I am going to use the trace data for. But an obvious way to use it would be to present some of the collected performance data as a graph. For now, I have added a property, RequestResponseTimes, that returns a list of the total time taken for each request stored in the trace data.
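
As a small illustration of what could be done with those response times, the sketch below reduces a sequence of them to a one-line summary. The ResponseTimeSummary class is hypothetical and not part of the TraceData class above, and it assumes the values are expressed in seconds.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper for summarizing a sequence of response times.
public static class ResponseTimeSummary
{
    // Builds a short text summary: request count, average, and slowest request.
    public static string Summarize(IEnumerable<double> responseTimes)
    {
        var times = responseTimes.ToList();
        if (times.Count == 0)
            return "No requests traced.";

        // InvariantCulture keeps the decimal separator a dot on every server.
        return String.Format(CultureInfo.InvariantCulture,
            "{0} requests, average {1:F3} s, slowest {2:F3} s",
            times.Count, times.Average(), times.Max());
    }
}
```

Inside a web application with tracing enabled, it could be fed directly from the property, for example ResponseTimeSummary.Summarize(new TraceData().RequestResponseTimes).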


And, please remember to disable tracing when putting your site into production ;-)