Technology


20
Jul 10

Better web application interface markup: lessons from theme frameworks

Every time I start a new web application project, I spend a while (re)thinking what the layout structure should be in terms of CSS and HTML (e.g. semantic naming, organizing CSS markup).

Recently, I did a project using the Thematic theme framework for WordPress, and that got me thinking about how custom web application interfaces could be improved. As an individual developer, I rarely have the luxury of focusing on the details of the layout. In contrast, the developers building theme frameworks have spent years thinking about how to create a generic, extensible structure for web application interfaces.

I had a look at Thematic, WP-Framework and Theme Hybrid (see more frameworks in this Smashing Magazine article). I use Thematic as the example here, since it is what I ended up using in the WordPress project. Here is how Thematic does its HTML layout:

An overview of the HTML code (based on Thematic)

I think the most interesting parts of Thematic for web application design are the HTML structure, the use of ids and classes, and the CSS files. The hook-and-filter system (overview here) is less interesting from the web application point of view, since you will most likely be writing all of the code from scratch for your own web applications.

<body class="wordpress y2010 m06 d02 h03 home blog not-singular windows firefox ff3">
<div id="wrapper">
   <!-- Header -->
   <div id="header">
      <div id="branding">
         <div id="blog-title"><span>...</span></div>
         ...
      </div>
      <div id="access">
         <div class="menu"><ul>...</ul></div>          
      </div>
   </div>
   <!-- Main -->
   <div id="main">
       <div id="container">
          <div id="content"><div>...</div></div>
       </div>
       <div id="primary"><ul>...</ul></div>
   </div>
   <!-- Footer -->
   <div id="footer">
      <div id="siteinfo">...</div>
   </div>
</div>
</body>

1. Body and wrapper

In all of the frameworks, the body classes contain various information such as the platform (windows) and the browser (firefox, ff3).

Benefits:
Easier to specify per-browser CSS fixes (if needed). Because the browser and OS are exposed as body classes, adjustments can be scoped to them in CSS, e.g. “.windows xxx.yyy { … }”.

Separation between body and content wrapper. In the framework, the content is wrapped in a wrapper layer (id=wrapper), so that the body has only one child element. I would imagine this is to make it easier to use CSS to adjust the padding and background of the pages.
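
The body classes described above could also be generated on the client. The following is a minimal sketch, not how Thematic does it (Thematic builds the classes on the server); the function name bodyClasses and the detection rules are assumptions made for illustration, and the user agent sniffing is deliberately simplistic.

```javascript
// Hypothetical helper: derive Thematic-style body classes from a user agent
// string and a date. Only a few platforms and one browser are recognised.
function bodyClasses(userAgent, date) {
  var classes = [];
  var pad = function (n) { return (n < 10 ? "0" : "") + n; };

  // Date classes, mirroring the "y2010 m06 d02 h03" classes in the sample markup
  classes.push("y" + date.getFullYear());
  classes.push("m" + pad(date.getMonth() + 1));
  classes.push("d" + pad(date.getDate()));
  classes.push("h" + pad(date.getHours()));

  // Platform class
  if (/Windows/.test(userAgent)) {
    classes.push("windows");
  } else if (/Macintosh/.test(userAgent)) {
    classes.push("mac");
  } else if (/Linux/.test(userAgent)) {
    classes.push("linux");
  }

  // Browser class plus a major-version class such as "ff3"
  var firefox = /Firefox\/(\d+)/.exec(userAgent);
  if (firefox) {
    classes.push("firefox", "ff" + firefox[1]);
  }

  return classes.join(" ");
}
```

In a browser, the result could be appended with something like document.body.className += " " + bodyClasses(navigator.userAgent, new Date()).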

2. Heading

The heading consists of two sub-divs, branding and access.

There are only three subelements to the content wrapper: heading, main and footer. Each of these divs is full-size and positioned relatively.

Benefits:
Easy to add repeating backgrounds. One can easily apply a CSS background-image to create a repeating and consistent heading.

Easy positioning of the elements. The heading consists of the branding and access divs. This makes it easy to add new heading elements in the branding, while keeping the menu (=access div) separate, with its own background.

Standard menu HTML. All of the frameworks appear to be using the Superfish menu by default, which is based on jQuery.

3. Main content

The main content consists of a container with a content sub-div and a primary sidebar.

The main content div has a fixed size. This is then subdivided into two divs, container (for the content) and primary (for the primary menu block).

Benefits:
Semantic markup; floats to reposition. Having these separate divs means that switching from “menu on the right” to “menu on the left” is simple, since one can change the CSS float directions to reverse the positions of the subcontainers.
Easy to add new content areas. One could also add more subdivs for additional content areas, and create multiple columns relatively easily by positioning the subdivs within the main content div.
Easy to add sub-items. The primary menu is a div, with each block having its own unordered list (ul). In practice, this leads to two levels of lists – one for the item blocks themselves, the other for the sub-items (e.g. the links in an Archives block).

4. Footer

The footer consists of a single subdiv, siteinfo.

Again, the top-level div only specifies the margin, while the inner divs are positioned within it. This makes it easy to add a background to the footer div.

Does this work?

I’m currently using this approach in two web applications I built recently. The markup seems a lot cleaner and more standard, and I have found that re-theming the same basic HTML is much nicer than reinventing the wheel. In short, I think this is a good approach. Let me know if you have improvements via a comment.

Somewhat related: What about other techniques, such as CSS Grid frameworks or Haml/Sass?

I’m not yet convinced that I need to use a CSS Grid framework (see e.g. this Stack Overflow discussion).

As for Haml, I’m pretty sure that I am fine with regular HTML without any syntactic sugar.

Sass seems to be a real improvement over CSS (variables, nesting, mixins etc.), but on the other hand I haven’t had the time to set it up with my non-Ruby environment.


2
Jun 10

Kohana3: automatically collect internationalization strings

I started implementing i18n for my upcoming KO3 application, and wrote a quick patch so that I don’t need to manually find and type translation strings.

What this code does

The code below checks whether a translation exists for each requested string. If it does not, the string is added to the translation table with the English original as its value. The updated translation table is saved into /application/i18n/languagename.php, and the old file is renamed so that its name contains the current date and time.
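
The collect-on-miss idea is independent of PHP. The following is a minimal sketch of the same logic in JavaScript, shown only to make the flow easier to follow; the names translations, missing, translate and mergedTable are made up for illustration and are not part of Kohana.

```javascript
// Hypothetical in-memory translation table for the current language
var translations = { "Hello": "Hei" };
// Strings requested at run time that had no translation yet
var missing = {};

function translate(text) {
  if (Object.prototype.hasOwnProperty.call(translations, text)) {
    return translations[text];
  }
  // Not translated yet: remember it and fall back to the original string
  missing[text] = text;
  return text;
}

// Build the table that would be written back to the translation file.
// Existing translations are copied last, so they win over placeholders.
function mergedTable() {
  var merged = {};
  var key;
  for (key in missing) {
    if (Object.prototype.hasOwnProperty.call(missing, key)) {
      merged[key] = missing[key];
    }
  }
  for (key in translations) {
    if (Object.prototype.hasOwnProperty.call(translations, key)) {
      merged[key] = translations[key];
    }
  }
  return merged;
}
```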

Hope this helps!

How to set it up

First set the language using i18n::lang('xy-xx') in bootstrap.php.

Also, add the following as the last line in bootstrap.php:

// Write the updated language file, if necessary
i18n::write();

Finally, add the file /application/classes/i18n.php which overrides i18n::get():

<?php
/**
 * A patch for the Internationalization (i18n) class.
 *
 * @package    I18n
 * @author Mikito Takada
 */
class I18n extends Kohana_I18n {
   // Cache of missing strings
   protected static $_cache_missing = array();
   /**
    * Returns translation of a string. If no translation exists, the original
    * string will be returned.
    *
    * @param   string   text to translate
    * @return  string
    */
   public static function get($string)
   {
      if ( ! isset(I18n::$_cache[I18n::$lang]))
      {
         // Load the translation table
         I18n::load(I18n::$lang);
      }
      // Return the translated string if it exists
      if(isset(I18n::$_cache[I18n::$lang][$string]))
      {         
         return I18n::$_cache[I18n::$lang][$string];
      } else {
         // Translated string does not exist
         // Store the original string as missing - still makes sense to store the English string so that loading the untranslated file will work.
         I18n::$_cache_missing[I18n::$lang][$string] = $string;
         return $string;
      }
   }
 
   public static function write()
   {
      // something new must be added for anything to happen
      if(!empty(I18n::$_cache_missing)) {
         $contents = '<?php defined(\'SYSPATH\') or die(\'No direct script access.\');
/**
 * Translation file in language: '.I18n::$lang.'
 * Automatically generated from previous translation file.
 */
return '.var_export(array_merge(I18n::$_cache_missing[I18n::$lang], I18n::$_cache[I18n::$lang]), true).';';
 
         // save string to file
         $savepath = APPPATH.'/i18n/';
         $filename = I18n::$lang.'.php';
         // check that the path exists
         if(!file_exists($savepath)) {
            // if not, create directory
            mkdir($savepath, 0777, true);
         }
         // rename the old file - if the file size is different.
         if(file_exists($savepath.$filename) && ( filesize($savepath.$filename) != strlen($contents) ) ) {
            $result = rename($savepath.$filename, $savepath.I18n::$lang.'_'.date('Y_m_d_H_i_s').'.php');
            if(!$result) {
               // Rename failed! Don't write the file.
               return;
            }
         }
         // save the file
         file_put_contents($savepath.$filename, $contents);
      }
   }
}

Caveats and notes

There are two things that have to be taken into account:

  • First, this would obviously be inefficient for a production site, since actual files are being rewritten on each request that finds new translation strings.
  • Why this is not a problem: My recommendation is that you shouldn’t run this code in production mode, since there is no point and it is very easy to remove the code after development is completed.
  • Second, this approach is less comprehensive than using something like the gettext tools that are available – those tools scan all of the source code, while my approach depends on run-time detection of new strings. This means that a small percentage of strings will not be found automatically (e.g. rare errors that never get triggered).
  • Why this is not a problem: This approach will still get the vast majority of the strings without requiring any manual hunting for strings, so I think it’ll save you quite a bit of time.

26
Jan 10

Organizing Javascript code

Javascript is an interesting language because it is flexible and surprisingly powerful. Once you grow from having one static page into having a large number of pages in a dynamic web application, you need conventions on how to organize your code.

In my opinion, there are two parts to the problem:

  1. the easy part, which is to apply JS programming patterns
  2. the hard part, which is doing so in a consistent and maintainable manner

The easy part: Namespaces, modules and commenting conventions

Namespace and module pattern

Javascript has no explicit syntax for advanced language features such as namespaces and private variables, but these can be implemented using very simple programming patterns.

A lot has been written on the namespace and module pattern, so I will only illustrate it here.

I like to add a “placeholder” function, shown below, to prevent me from making errors while modifying the module (IE6 freaks out if there is a comma after the last item). This used to happen when I moved or added methods and forgot to check whether the commas were correct.

The example code below is adapted from (http://yuiblog.com/blog/2007/06/12/module-pattern/):

// create a namespace
YAHOO.namespace("myProject");
// Assign the return value of an anonymous function to the namespace
/** @namespace */
YAHOO.myProject.myModule = function () {
	/** @private */
	var myPrivateVar = "I can be accessed only from within YAHOO.myProject.myModule.";
 
	/** @private */
	var myPrivateMethod = function () {
		YAHOO.log("I can be accessed only from within YAHOO.myProject.myModule");
	}
 
        /** @scope YAHOO.myProject.myModule */
	return  {
		/** describe myPublicProperty here */
		myPublicProperty: "I'm accessible as YAHOO.myProject.myModule.myPublicProperty.",
		/** describe myPublicMethod here */
		myPublicMethod: function () {
			YAHOO.log("I'm accessible as YAHOO.myProject.myModule.myPublicMethod.");
 
			//Within myProject, I can access "private" vars and methods:
			YAHOO.log(myPrivateVar);
			YAHOO.log(myPrivateMethod());
 
			//The native scope of myPublicMethod is myProject; we can
			//access public members using "this":
			YAHOO.log(this.myPublicProperty);
		},
		/**
		* A placeholder function to prevent errors due to not having a comma after the last function in the return statement for this module.
		*/
		placeholder: function () {}
};
 
}(); // the parens here cause the anonymous function to execute and return
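
The example above depends on the YUI library for YAHOO.namespace and YAHOO.log. The same pattern also works without any library; here is a minimal sketch, where the myProject namespace object and the counter module are hypothetical names chosen for illustration.

```javascript
// Hypothetical namespace object; reuse it if another file already created it
var myProject = myProject || {};

myProject.counter = (function () {
  // Private: only reachable through the closure
  var count = 0;

  // Public interface
  return {
    increment: function () {
      count += 1;
      return count;
    },
    current: function () {
      return count;
    }
  };
}());
```

Calling myProject.counter.increment() changes the private count, while myProject.counter.count stays undefined because the variable is never exposed.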

Commenting conventions. Pick a commenting tool and stick with it. Comment the intent of the code and any gotchas as well as the expected arguments and purpose of methods.

The two main JavaScript commenting tools are: JsDoc-Toolkit (self-contained Java) and YuiDoc (Python).
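
As a rough illustration of what such comments look like, here is a small function documented with JsDoc-style tags. The function formatPrice is a made-up example, and the exact tag names differ slightly between tools, so check the documentation of the tool you pick.

```javascript
/**
 * Formats a price for display.
 *
 * Gotcha: the amount is rounded to two decimals, so do not feed the
 * result back into calculations.
 *
 * @param {number} amount   The price in the main unit of the currency.
 * @param {string} currency Currency code to append, e.g. "EUR".
 * @returns {string} The formatted price, e.g. "12.50 EUR".
 */
function formatPrice(amount, currency) {
  return amount.toFixed(2) + " " + currency;
}
```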

File names and splitting code into files. Put all the hard-to-reuse code into its own namespace (e.g. Framework.ModuleName.ModulePart.ApplicationName) in a separate file (application.js or custom.js). This way you keep the library clean of any application-specific stuff. It is often hard to avoid implementing some things in a one-off manner, so you might as well organize all the single-use stuff in its own file.

That is, if writing your own extensions to an existing framework, you will most likely have:

  1. code that is fully reusable (specialized or pre-configured widgets, utilities etc.)
  2. code that is hard or pointless to reuse (page- or application-specific messages, formatting and utilities)

The hard part: structuring code for reuse, logical naming conventions

Structuring code for reuse. Encourage reuse by splitting code into configuration, implementation and customization.

  1. Configuration is anything that might need to change on each instantiation of your widget. It should be a single section within a module (a private object written in object literal, JSON-like notation). Make sure you have good, documented defaults, because this makes it easier to use the code again.
  2. Implementation is the main code. It should NEVER contain hardcoded IDs, messages or other things that will eventually need to be overridden.
  3. Customization is a set of page-specific blocks of code that alter the configuration. You only override the few things that are needed for the specific functionality on the page.

Here is an example (from http://www.wait-till-i.com/2008/05/23/script-configuration/):

myProject.myModule = function(){
// CONFIGURATION
  var config = {
    CSS:{
     classes:{
       hover:'hover',
       active:'current',
       jsEnabled:'js'
     },
     ids:{
       container:'maincontainer'
     }
    },
    timeout:2000,
    userID:'chrisheilmann'
  };
 
  // IMPLEMENTATION
  function init(){ };
  // make init and config public
  return {
    init:init,
    config:config
  };
}();
 
// CUSTOMIZATION
// This makes it possible to override stuff before calling init
myProject.myModule.config.CSS.ids.container = 'header';
myProject.myModule.config.userID = 'alanwhite';
myProject.myModule.init();

Naming conventions. I don’t mean deciding whether or not to use camelCase, but rather how the names of variables and widgets should be organized logically. It is reasonably simple to stop using global variables and functions to avoid name conflicts, but what is harder is to come up with a logical naming convention that allows any code to be reused anywhere.

In particular, when you create new widgets, you often need to be able to connect them to one another in some manner. Being able to identify the type (class) of a new widget and retrieve/replace its data using a standard interface takes some planning and a lot of discipline, but it makes reusing the widgets much easier.

ID and name attribute naming convention. I use “data[widgetName][recordIndex][fieldName]” for input names (because PHP will parse this into arrays automatically) and “widgetName_fieldName_recordIndex” for ID attributes (because it makes string comparisons easier).
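
The convention above is easy to get wrong when the strings are typed by hand, so it can help to generate them. A minimal sketch follows; the helper names inputName, inputId and parseInputId are hypothetical, and the parser assumes that widget and field names contain no underscores.

```javascript
// name attribute: data[widgetName][recordIndex][fieldName]
function inputName(widgetName, recordIndex, fieldName) {
  return "data[" + widgetName + "][" + recordIndex + "][" + fieldName + "]";
}

// id attribute: widgetName_fieldName_recordIndex
function inputId(widgetName, fieldName, recordIndex) {
  return widgetName + "_" + fieldName + "_" + recordIndex;
}

// Split an id back into its parts
// (assumes widget and field names contain no underscores)
function parseInputId(id) {
  var parts = id.split("_");
  return {
    widgetName: parts[0],
    fieldName: parts[1],
    recordIndex: Number(parts[2])
  };
}
```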

Widget instance naming convention. I use YAHOO.ApplicationName.WidgetInstanceName for widget instances, and use WidgetInstanceName to create any related tags with IDs.

I am still looking for a good logical naming convention for widgets, but it seems this topic is not as commonly discussed. If you have any tips, please leave a comment!

Reference stuff

http://stackoverflow.com/questions/211795/are-there-any-coding-standards-for-javascript

http://ajaxian.com/archives/maintainable-javascript-videos-are-now-available

http://yuiblog.com/blog/2007/06/12/module-pattern/

http://www.wait-till-i.com/2008/05/23/script-configuration/


14
Jan 10

MVC frameworks: stack vs glue, and how to pick the right one

MVC, Model-View-Controller, is all the rage in web development these days. With regard to MVC, I think the right question to ask is not whether you should use an MVC framework but rather which framework fits the kinds of problems you are likely to encounter while developing your web application.

MVC frameworks: full stack vs. glue frameworks

There are two basic options with MVC frameworks: glue frameworks and full stack frameworks.

  1. “Glue frameworks” are collections of components and libraries which you can use to build an application. Most of the functionality is optional and often you must make decisions regarding how you will structure control flow within your application.
  2. “Full stack frameworks” are integrated sets of components in which most of the components and libraries are mandatory and most of the design decisions have been made for you in advance.

How full stack frameworks sometimes get in the way: an example

Most frameworks are optimized for the 0-3 month project. All the basic functionality you need is there, but the expectation seems to be that you are creating a custom version of a blog or a CMS and that the most complex bits will be the views (custom code) and the models (posts, users, tags, comments). However, the more complexity you are dealing with in terms of business logic, integration, regulatory validation and scaling requirements, the more likely it is that you will need manual optimizations and supplementary mechanisms that are hard to fit into a full stack framework.

Here is one example of how a full stack framework can end up creating more complexity than a glue framework. In CakePHP (a full stack framework), if you want to show multiple “flash” messages, you have to actively work around the existing mechanism, because it does not support multiple flash messages easily. Yes, you can set one message per action and you can set custom styles, but setting more than one message becomes unnecessarily complex (see this article and have a look at the comments as well) because by default, the Session flash messages overwrite each other rather than appending to the existing set of messages.

This is a very trivial example – and you can use the existing Session classes to create most of the functionality necessary to fix this. But if you do that, you will end up with two “flash” message functionalities: the default one and the one you built to make it easy to set more than one message of each type.

A glue framework might offer the Session functionality without directly supporting flash messages. This does mean that you have to write your own, but you’ll end up with a slightly simpler codebase. These kinds of small things add up if you have a larger project.
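
To give a sense of how little code "writing your own" involves, here is a minimal sketch of flash messages that append rather than overwrite. It is not CakePHP code; the session argument is assumed to be a plain object that the framework persists between requests, and the function names addFlash and consumeFlash are made up for illustration.

```javascript
// Queue a message of the given type on the session
function addFlash(session, type, message) {
  session.flash = session.flash || {};
  session.flash[type] = session.flash[type] || [];
  // Append instead of overwriting, so several messages of one type can coexist
  session.flash[type].push(message);
}

// Return all pending messages and clear them, so each is shown only once
function consumeFlash(session) {
  var pending = session.flash || {};
  session.flash = {};
  return pending;
}
```

A view would call consumeFlash once per request and render one list per message type.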

Now, the key question to me when deciding which to use is:

What kind of complexity/problems are you tackling in your application?

The optimal choice of framework depends on covering the 80% case effectively (without compromising constraints like performance).

Here are some questions to consider when picking the right framework for your project:

  1. Complex views – What are the most probable uses of the system? Make sure you can cover the technical functionality needed to implement the most important client goals in a way that delights or at least does not annoy your customer. What is key here is the user experience you can deliver, not just technical sophistication.
  2. Complex data structures – What data structure requirements are most likely to change? If possible, pick an architecture that makes the most likely changes relatively cheap to make. This may imply a tradeoff of some kind with other functionality.
  3. Complex business logic – What are the key concepts and processes, and how are these likely to change? How easy is it to create exceptions to the rule and how can rules be verified and tested?
  4. Complex integration – What are the key connecting technologies, and how well are they supported by the framework? Ideally, you can use pre-existing and tested code to integrate key areas.
  5. Strict performance requirements – What is the everyday workload going to be? What mechanisms will allow you to identify bottlenecks in the system? Is there a possibility to optimize functionality by scaling horizontally or by bypassing some unnecessary functionality?
  6. Scaling requirements – How can you partition your workload? How well do the core mechanisms lend to custom enhancements to fulfill scalability requirements?
  7. Strict schedule requirements – What is the minimum feature set? How well does the framework support rapid implementation and testing of the minimum feature set? What are the main dependencies, and are there pre-existing modules or plugins for them?

If what you are building is not particularly complex, or the primary complexity is in building the views, then a full-stack framework is likely to be a great choice – particularly if you have a very limited schedule. An example of this would be a brochure website, which has most of its complexity in the views.

The larger and more complex the application, the less likely you are to obtain significant productivity gains from using a framework. This is simply because having a framework can at best save you a few hundred hours of work (which should be more than what it would take for you to replicate the functionality you use). If the project is big, the time saved doing this becomes less significant compared to the time needed to build the rest of the application (i.e. the useful parts).



19
Dec 09

Implementing agile development

“Agile development” is used to describe a wide variety of development practices. Claiming that one follows an “agile” development methodology is easy. But declaring that your practices are “agile” is just about as useful as declaring yourself the winner – saying you are the winner doesn’t make it so. Thus the question is, how can one ensure that what is called “agile development” by some company actually delivers what is expected?

On a surface level I believe it is rather easy to agree with the Agile values. Working in a small company, whether or not the right things are done comes down mostly to a question of self-discipline: there are a number of time-consuming and less exciting things that have a high payoff, but are not done because each person (this includes you, reader) has a preferred way of working. In most cases this is good: second-guessing oneself is hardly productive. But in order to actually benefit from best practice one must not only be aware of it but also try to implement it as well as possible – blindingly obvious, but somehow frequently ignored.

McDonald et al. (2008) make an interesting reference to Auguste Comte’s Law of Three Stages from 1830. The law proposes that a scientific discipline progresses through three stages: theological (belief-based knowledge), metaphysical (philosophy-based knowledge) and positive (based on scientific reasoning). The authors (each with a SW development background of at least 20 years) propose that the Agile movement and manifesto represent a move from belief-based practices towards the second, philosophical stage. This is interesting because it implies that they see Agile (with a big A) as still more of a philosophy than a full methodology – or more importantly, that doing Agile well requires that you pick and define an overall framework within which you apply the Agile principles.

In a post on his website, Scott Ambler asks five questions to help determine whether a team is Agile:

  1. Is the team doing developer regression testing, or better yet taking a test-driven approach to development?
  2. Are stakeholders active participants in development?
  3. Is the team producing high-quality, working software on a regular basis?
  4. Is the team working in a highly collaborative, self-organizing manner within an effective governance framework?
  5. Is the team improving their process on a regular basis?

Some further questions

I think these are very good questions to identify whether the Agile principles are used in developing software. But if you are thinking about implementing Agile practices, you ought to also be able to answer further questions about the underlying process:

1. What are you doing to detect, predict and prevent defects? “Regression testing” is not an answer: what part of your testing is automated? What is the role of practices such as code reviews, test scenario development and design validation in your process? Who is responsible for quality assurance and how are defects tracked?

2. How do you manage and communicate requirements? Who talks to the stakeholders? How do you disseminate the information? How do you track fulfilled and unfulfilled requirements? How do requirement changes influence planning and scheduling?

3. How are you ensuring that quality and security concerns are addressed and that lessons learned from previous projects are remembered?

These are just a few process-related questions off the top of my head.

Why can’t you answer?

If you can’t answer, or aren’t doing one or more of these practices, ask why not. My answer to “why not?” is that this is a result of a lack of appreciation of whatever the unimplemented best practices could bring, such as: automated testing, active stakeholder participation, better scheduling or controls, self-organization or process improvement accountability. While these are best practices, that does not mean that adopting them is always in the personal interest of the participants of the software development process. For example:

1) Active stakeholder participation may not be valued because it is considered to be customer interference rather than a method of increasing the focus on customer value, or it may not be valued because a consultancy may prefer to have a costly billable change process.

2) Better scheduling or controls may be seen as overhead and micromanagement by developers.

3) Increasing accountability for mistakes may be uncomfortable and thus seen as undesirable.

To me, the key question is how we can raise the perceived value of these practices in order to create the internal motivation to implement these processes more stringently – and hence hopefully realize the benefits of best practice in our organization.

Example: TDD

For example, let’s look at why TDD might not be valued.

Regression testing / test-driven development may not be seen as valuable because of beliefs such as “the real work of development is implementation” and “quality is achieved by molding software into shape by correcting defects” (McDonald et al. 2008). What could be done to create the internal motivation to implement TDD?

One part would be to examine these beliefs and examine the benefits of defect prevention via TDD. In particular, it may be that the negative impact of the ripple effect of flaws on dependent functionality and the cost in customer satisfaction are underestimated.

Second, there is a larger question of whether quality ought to take precedence over other potential priorities – because focusing on quality does not mean following best practices only when they are convenient, and throwing them out when the going gets tough. Would you trade intangibles such as quality for tangibles such as lines of code, number of features or maintaining a release schedule? A quality-oriented organization ought to choose quality over features or schedule and be confident that this will still lead to a better result than prioritizing features or schedule.

Third, there ought to be an increase in emphasis on the test and quality assurance results as a deliverable. Are your test results a deliverable, and do you actually deliver them? Can you make them a deliverable? Making it clear that the results of an intangible activity are still deliverable, visible work will increase their perceived value. An internal report may carry less weight than a report going to the customer; sending the customer a report (even if it is only an aggregate) will likely help shift the emphasis from an internal, optional activity to a deliverable, required activity.


1
Dec 09

Tip: Netbeans 6.x scanning performance fix

I use Netbeans as my primary editor. It’s a great editor with pretty much all the features I would want built in (zero-configuration Mercurial support, code completion+navigation, unit testing support and more).

The only caveat has been the performance – I was so frustrated I installed (and ultimately rejected) the vast majority of other (PHP) editors. There is a simple fix if you have memory to spare: increase the heap size in the startup options. The Java VM default heap size seems to be way too small, causing a lot of swapping within the Java VM (while not using all the memory available from the OS perspective).

Fixing Netbeans scanning performance

Open etc/netbeans.conf under the Netbeans installation directory (e.g. C:\Program Files\Netbeans 6.x\etc\netbeans.conf). If you are on Windows 7, you need to start your text editor with Administrator privileges (right click and select Run as administrator).

Change the line: netbeans_default_options="-J-client -J-Xverify:none …."

by adding the following to the single line (AFTER “-J-client -J-Xverify:none”):

-J-Xmx2048m -J-XX:+UseConcMarkSweepGC -J-XX:+CMSClassUnloadingEnabled -J-XX:+CMSPermGenSweepingEnabled

The -Xmx option sets the maximum heap size available to the Java virtual machine to 2048 MB, replacing the default value (which is chosen by a “best guess” method). The other three options are explained in the conf file.

DONE! Ever since I made this tweak (about two months ago) there have been no additional pauses at all. My theory is that the Netbeans scanning code was exhausting the heap, and a majority of the time “scanning” was instead spent dealing with the heap exhaustion. Now I will not say that it is reasonable for an IDE to consume 2 gigabytes of memory – but if that’s what it takes to fix performance, fine.


3
Nov 09

Setting up SPF, SenderID and DKIM on Centos 5.3 using sendmail

The four biggest email providers, Gmail, AOL, Hotmail and Yahoo (in this order according to Comscore), all implement some form of anti-spam technique.

The main technologies are reverse DNS checking, SPF, SenderID, Domainkeys and DKIM. I will discuss all of these here and provide my tips on setting up SPF, SenderID and DKIM. Please keep in mind that I am not an expert on email servers – but I hope this helps someone! It took me about half a day to figure this all out, so it is probably worth doing to improve your email delivery rates.

This blog entry from Dave Zohrom provides a nice discussion of sending email to a general audience (part 2 of a 3-post series).

Introduction

SPF or Sender Policy Framework

SPF is easy to set up. It uses DNS TXT records to allow the email providers to check which servers are authorized to send email for a particular domain. The SPF project has done a great job at making this simple to set up. Just go to their homepage and use the setup wizard to generate the appropriate TXT records (the text field underneath “Deploying SPF”). Then change the values in your DNS records.

One thing to note is that if you have multiple servers and send email from a server with a different hostname (e.g. “mail.example.com”), you need to set up a record for that server as well. See the common mistakes page on the SPF site. Also, if you have hostnames that are not supposed to send mail, you ought to indicate this as well.

Another thing is that if you are using Google Apps for Your Domain to send email, then you will need to add “include:aspmx.googlemail.com” so that mail sent through Google also validates. See this answer for more.
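
To make the record format concrete, here is a minimal sketch that assembles SPF TXT record values. The helper spfRecord is made up for illustration; the "a", "mx", "include:", "~all" and "-all" terms are standard SPF syntax, but which of them you need depends on your own setup, so treat the two example records as assumptions rather than recommendations.

```javascript
// Hypothetical helper: assemble an SPF TXT record value from its parts
function spfRecord(mechanisms, includes, policy) {
  var parts = ["v=spf1"].concat(mechanisms);
  includes.forEach(function (domain) {
    parts.push("include:" + domain);
  });
  parts.push(policy); // "~all" is a soft fail, "-all" is a hard fail
  return parts.join(" ");
}

// A domain that sends from its own A and MX hosts and through Google Apps
var sendingDomain = spfRecord(["a", "mx"], ["aspmx.googlemail.com"], "~all");

// A hostname that is not supposed to send mail at all
var silentHost = spfRecord([], [], "-all");
```

Here sendingDomain holds "v=spf1 a mx include:aspmx.googlemail.com ~all" and silentHost holds "v=spf1 -all"; each string would go into the TXT record of the corresponding hostname.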

SenderID from Microsoft

SenderID is a variant of SPF, which for most practical cases is the same as SPF. Just set up SPF and this should produce a validation pass for SenderID as well. The semantics of the validation are a bit different, but this does not seem to be a major practical problem. See testing tips below to make sure that this is also true in your case.

DomainKeys

DomainKeys is an older version of DKIM (DomainKeys Identified Mail) developed by Yahoo. Despite having very similar names, these ARE NOT the same!

Both DomainKeys and DKIM store public key information in DNS records and sign the message headers of every email sent. The recipient can then verify the signature.

DomainKeys was deprecated in 2007, but some email providers may still be using it. However, these are a shrinking minority and Yahoo does support the newer DKIM. Because of this I did not add DomainKeys support but opted only to use DKIM.

If you do want DomainKeys, you can add it to sendmail or Postfix using the dk-milter project code, but the unofficial RPM release is not maintained anymore, which means you will need to install it from source (available via SourceForge).

DKIM, or DomainKeys Identified Mail

DKIM on the other hand seems to be gaining momentum. It is used by Gmail, Yahoo and AOL and many others, and also works by publishing public keys via DNS TXT records and by signing the emails at the email server.
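As a rough illustration of what gets published in DNS, the sketch below builds the name of the TXT record that holds the public key and splits a record value into its tags. The selector "mail" and the key value are placeholders I made up; a real record contains your own base64-encoded public key.

```python
def dkim_record_name(selector, domain):
    """DNS name under which the DKIM public key for a selector is published."""
    return f"{selector}._domainkey.{domain}"


def parse_tag_list(value):
    """Split a 'tag=value; tag=value' string into a dictionary."""
    tags = {}
    for item in value.split(";"):
        item = item.strip()
        if not item:
            continue
        # Split on the first "=" only, since base64 keys may end in "=".
        name, _, val = item.partition("=")
        tags[name.strip()] = val.strip()
    return tags


# Placeholder record value; "BASE64PUBLICKEY==" stands in for a real key.
example_value = "v=DKIM1; k=rsa; p=BASE64PUBLICKEY=="

print(dkim_record_name("mail", "example.com"))
print(parse_tag_list(example_value))
```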

To set up DKIM, you need an additional filter which takes the completed email and adds the DKIM signature to it before it is sent out.

Postfix and sendmail support “milters”, which is apparently short for “mail filters”. There is a DKIM-milter package available for CentOS in the EPEL repositories (see the CentOS page for 3rd party repos).

Setting up DKIM-milter with sendmail

For Postfix, use these instructions from All About LAMP. If you want to use sendmail as I did, here are my additional tips:

Steps 1-7 as in the linked tutorial.

Step 8. Configure dkim-milter

Open configuration file /etc/mail/dkim-milter/dkim-filter.conf and use the following configuration:

Canonicalization simple
Domain example.com
KeyFile /some/path/to/whatever-your-keyfile-was
Selector name-of-the-selector
SignatureAlgorithm rsa-sha256
Socket inet:8891@localhost
Syslog Yes
Userid dkim-milter

NOTE: you will be configuring dkim-milter to use the loopback interface instead of a socket file. I was unable to get dkim-milter to work via the socket file with sendmail. If you get it working, let me know.

You may also want to set up the following:

SubDomains Yes
SyslogSuccess Yes
X-Header Yes

The X-Header and Syslog options are useful for debugging. See the config file; each option should be documented there.

Step 9. Change the default init.d script to use the loopback interface

The default init script uses a socket file, and this needs to be changed. Open /etc/init.d/dkim-milter and change/comment the line:

SOCKET=local:/var/run/${name}/${name}.sock

to:

SOCKET=inet:8891@localhost

Step 10. Configure sendmail to use dkim-milter

First, a few reminders about sendmail configuration:

1. Sendmail configuration files DO NOT USE # for comments. Instead, “dnl” (delete through newline) at the beginning of a line is used to comment it out.
2. Sendmail configuration is built from the *.mc script files using the M4 macro processor.
3. You need to install the sendmail-cf package for dependencies and install the m4 macro processor separately.
4. In the configuration, the opening quote is a grave accent ` and the closing quote is a straight single quote '.

To configure sendmail, open the “/etc/mail/submit.mc” file (which contains the settings for message sending; in older sendmail versions this config was in sendmail.mc).

10.1 Edit submit.mc by adding the following entry to it:

INPUT_MAIL_FILTER(`dk-filter', `S=inet:8891@localhost')dnl

(for example just before “FEATURE(`msp', `[127.0.0.1]')dnl”). Make sure that there are no define(`confINPUT_MAIL_FILTERS', `…')dnl lines after this; if there are, you will need to add dkim-milter manually to the INPUT_MAIL_FILTERS list.

10.2 Build and install a new submit.cf:

m4 /etc/mail/submit.mc > submit.cf

Tip: use m4 -d /etc/mail/submit.mc to debug first.

10.3 Restart sendmail

service sendmail restart

Testing tips

Some possible errors:

  1. Errors getting the sendmail configuration generated: Check whether the dependencies (m4 and sendmail-cf) are installed and paths are correct.
  2. Errors starting sendmail: Make sure you replaced the correct files (submit.mc to submit.cf and sendmail.mc to sendmail.cf) and if you modified sendmail.mc in addition to submit.mc, make sure you have regenerated both.
  3. Errors starting dkim-milter: check the permissions on the key file
  4. Sendmail seems to ignore the dkim-milter, no X-dkim-milter header is in the mail even after you enabled the X-Header option in dkim-filter.conf:
    1. Check /var/log/maillog.
    2. Sendmail seems to default to ignoring the milter if it cannot connect to it. This was the problem I ran into when using sockets; after switching to the loopback interface everything started working.
    3. Also, check whether your mail is sent from a sender for whom mail is supposed to be signed! If you haven’t set up your hostname, this may lead to email not being signed (e.g. hostname -F hostname plus /etc/hosts plus /etc/sysconfig/network). See, for example, this tutorial for configuring sendmail.
    4. Also, check if you need to configure masquerading for Sendmail, see http://www.sendmail.org/m4/masquerading_relaying.html
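Since sendmail silently skips a milter it cannot connect to, it can help to check the connection yourself before digging through the logs. This is a minimal sketch that only tests whether something accepts TCP connections on the loopback port configured above (8891); it does not speak the milter protocol, and the function name is my own.

```python
import socket


def milter_reachable(host="127.0.0.1", port=8891, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if milter_reachable():
    print("dkim-milter port is accepting connections")
else:
    print("nothing is listening on port 8891, is dkim-milter running?")
```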

Sending email from the console using only sendmail

Create a file with the following content:

To: "Recipient name" <john.doe@example.com>
From: "Sender name" <admin@example.com>
Reply-To: admin@example.com
Subject: Hello world

This is the content of the message, end it with a line containing only a period as sendmail expects this.
.

Then cat the file and pipe to sendmail -t:

cat message.txt |sendmail -t
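If you would rather script the test, the same message can be built with Python's standard email package and piped to sendmail. This is a sketch under assumptions: the function names are mine, and /usr/sbin/sendmail is only a common location for the binary, so adjust the path for your system. The -i flag tells sendmail not to treat a lone period as the end of the message, so no terminating "." line is needed here.

```python
import subprocess
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Build a plain-text message with the same headers as the file above."""
    msg = EmailMessage()
    msg["To"] = recipient
    msg["From"] = sender
    msg["Reply-To"] = sender
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_with_sendmail(msg, sendmail_path="/usr/sbin/sendmail"):
    """Pipe the message to 'sendmail -t', which reads recipients from the headers."""
    subprocess.run([sendmail_path, "-t", "-i"], input=msg.as_bytes(), check=True)


message = build_message(
    "admin@example.com",
    "john.doe@example.com",
    "Hello world",
    "This is the content of the message.",
)
print(message.as_string())
# To actually send it: send_with_sendmail(message)
```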

Checking all of the technologies mentioned above

Port25.com offers a free service which checks SPF, SenderID, DomainKeys and DKIM: http://port25.com/domainkeys/

Quote:

A reply email will be sent back to you with an analysis of the message’s authentication status. The report will perform the following checks: SPF, SenderID, DomainKeys, DKIM and SpamAssassin.

Additional resources:

How to Setup DKIM for Postfix in Fedora using dkim-milter

How to Prevent Web Server Emails from being Marked as SPAM

DomainKey Implementor’s Tools and Library for email servers & clients

Sendmail DKIM

Setting up DKIM, SPF, Domainkeys DNS, Regular DNS on CentOS 5.3 at Pacificrack.com

How to manually install DKIM-Filter with Sendmail


14
Oct 09

How to set up a LAN DNS server using MaraDNS under Windows 7

Are you tired of using 192.168.0.x to refer to the computers within your LAN? Setting up a DNS server and getting domain names for your local computers is surprisingly easy – even on Windows.

0. Preliminary setup: make sure each computer gets a constant IP address

Before setting up the DNS server, you need to ensure that each computer or virtual machine gets a fixed IP address. Otherwise the IP address of the computer may change on each reboot.

This should be done on your router, which is usually located either at 192.168.0.1 or 192.168.1.1 (check by opening the Command Prompt and running “ipconfig”, which shows your current IP address and default gateway). Check your router manual for instructions on how to log in. Most routers have an HTTP-based configuration system which can be accessed when connected to the router.

In my case (I have an Asus RT-N11), the address was 192.168.1.1. Make sure that the computers you want to set up IPs for are connected, then find the “Status” or “DHCP Leases” listing on the router web interface. This listing will contain the MAC addresses for each host. Here is mine:

Host Name       MAC Address       IP Address      Lease
--------------------------------------------------------------
HOST1        00-11-22-33-F0-AC 192.168.1.5     60016 secs.
HOST2        00-11-33-44-F2-2C 192.168.1.6     85074 secs.

A MAC address is a unique identifier given to each network adapter. It allows you to set up fixed IPs.

Find the router functionality which allows you to “Assign IP Addresses Manually”. This should enable you to specify a MAC address and a corresponding fixed IP address. Do this for each of the network adapters you wish to have a fixed IP address. On the Asus RT-N11 this was under “IP Config” -> “DHCP Server”.

Add a MAC address <-> IP address pair for each computer.

1. Get MaraDNS

MaraDNS is a free, lightweight and relatively easy-to-configure DNS server for Windows and Linux. Download it from here and unzip it to some folder.

2. Configure MaraDNS

Open “secret.txt” and change the value to something else (random characters).

The MaraDNS configuration is in the “mararc” file in the same directory. DNS servers have two sets of functionality. They can function as an “Authoritative name server” or a “Recursive/caching name server”.

Authoritative name servers specify IP addresses for domain names. Recursive name servers store information from authoritative name servers and pass on queries in a recursive manner.

We will be configuring both authoritative and recursive functionality in MaraDNS.

2.1 Authoritative configuration

We will configure the server to provide authoritative answers for the LAN domain names. Pick any domain; I chose “local.com” (note though that you will not be able to access the actual “local.com” website if you pick an existing domain name).

Add configuration lines to “mararc” like these:

csv2 = {}
csv2["local.com."] = "db.lan.txt"

Where local.com is the domain name you picked, and db.lan.txt is the name of the second configuration file which we will be creating next (change it if you want to give the second configuration file a different name).

Create a new file named “db.lan.txt” in the same directory as MaraDNS.

For each of the computers you want to resolve to a name, add a line to “db.lan.txt”. For example, for two machines, one “dev.local.com” and the other “blog.local.com”, add the following lines:

dev.%       192.168.1.4 ~
blog.%        192.168.1.6 ~

Done!
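If you have more than a couple of machines, the lines for “db.lan.txt” can be generated instead of typed. This is a small sketch that follows the same line format as above; the function name is hypothetical, and the IP check is only there to catch typos before MaraDNS sees them.

```python
import ipaddress


def csv2_lines(hosts):
    """Turn a {hostname: ip} mapping into lines for the db.lan.txt zone file.

    Each line has the form "name.% ip ~", where % stands for the zone name
    (e.g. local.com.) and ~ ends the record.
    """
    lines = []
    for name, ip in hosts.items():
        ipaddress.ip_address(ip)  # raises ValueError if the address is malformed
        lines.append(f"{name}.% {ip} ~")
    return lines


zone = csv2_lines({"dev": "192.168.1.4", "blog": "192.168.1.6"})
print("\n".join(zone))
```

Paste the printed lines into “db.lan.txt”, or write them to the file directly.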

2.2 Recursive configuration

We will set up MaraDNS to ask your default name servers about all other domains, so that every other domain name still resolves to its correct IP address.

Find out your ISP’s DNS server addresses. These are likely to be listed either on the router status page, or in the details of your network adapter.

Now add your ISP’s DNS servers as upstream servers in “mararc”:

upstream_servers = {}
upstream_servers["."] = "xxx.xxx.xxx.xxx, yyy.yyy.yyy.yyy"

Where xxx.xxx.xxx.xxx and yyy.yyy.yyy.yyy are your ISP’s DNS servers.

Done!

3. Run MaraDNS and test it using askmara.exe

Double-click “runmara.bat”, and leave the server running.

Open a command prompt, navigate to the MaraDNS directory and try running:

askmara.exe Agoogle.com.

and

askmara.exe Ablog.local.com.

You should get replies like this:

# Querying the server with the IP 127.0.0.1
# Question: Agoogle.com.
google.com. +300 a 74.125.67.100
google.com. +300 a 74.125.53.100
google.com. +300 a 74.125.45.100
# NS replies:
# AR replies:

and:

# Querying the server with the IP 127.0.0.1
# Question: Ablog.local.com.
blog.local.com. +86400 a 192.168.1.6
# NS replies:
#local.com. +86400 ns synth-ip-7f000001.local.com.
# AR replies:
#synth-ip-7f000001.local.com. +86400 a 127.0.0.1

If you get problems with the first query, you messed up the recursive DNS settings (are your ISP DNS server addresses correct?), and if you get an error with the second query, you messed up the authoritative settings.

4. Change MaraDNS to reply to queries from your LAN

Close the MaraDNS window to shut down the server, and change the first two lines of “mararc” to something like:

ipv4_bind_addresses = "192.168.1.2"
recursive_acl = "192.168.1.0/24"

Where 192.168.1.2 is the IP address of the computer on which the server will be running and the “192.168.1” part of recursive_acl is the same as on your network (might be 192.168.0.0/24).
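If the /24 notation is unfamiliar: it means that the first 24 bits (the “192.168.1” part) must match, so every address from 192.168.1.0 to 192.168.1.255 is allowed. The sketch below illustrates that check with Python's ipaddress module. It only shows what the notation means and is not how MaraDNS implements the ACL; the function name is my own.

```python
import ipaddress


def allowed_by_acl(client_ip, acl="192.168.1.0/24"):
    """Return True if client_ip falls inside the network described by acl."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(acl)


print(allowed_by_acl("192.168.1.5"))   # a host on the 192.168.1.x LAN
print(allowed_by_acl("192.168.0.5"))   # a host on a different subnet
```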

Start MaraDNS again, and leave it running.

5. Set up your router to hand out your new DNS server

Open your router’s web interface and find the DHCP server settings. There should be an option to set a DNS server. Enter the IP address of the computer on which the DNS server will be running.

For each of your computers, reconnect to the network so that it picks up the new DNS settings (e.g. by disabling and re-enabling the network adapter in Windows, or by using “ifconfig eth0 down”/“ifconfig eth0 up” on Linux).

That’s it, you should now be able to refer to your LAN computers by their domain names.
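To confirm that a computer is really using the new server, you can ask its normal system resolver for the LAN names. A minimal sketch, assuming the example names from section 2.1 (substitute your own); the function name is hypothetical.

```python
import socket


def resolve_names(names):
    """Resolve each name through the system resolver; None means it failed."""
    results = {}
    for name in names:
        try:
            results[name] = socket.gethostbyname(name)
        except OSError:
            results[name] = None
    return results


# Example call, using the names configured earlier:
# for name, ip in resolve_names(["dev.local.com", "blog.local.com"]).items():
#     print(name, "->", ip or "not resolved")
```

If a name comes back as “not resolved”, check that the computer actually received the new DNS server address from the router.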


19
Sep 09

Going virtual

Virtualization is awesome in software development and testing. It uses more resources (than having separate physical machines), but significantly reduces the hassles of setting up and running different operating systems on a single machine. Being able to start a new system without rebooting, having virtual servers and being able to store snapshots of different systems has been really useful.

What is virtualization?

Three-sentence definition from Wikipedia: “Virtualization is performed on a given hardware platform by host software (a control program), which creates a simulated computer environment, a virtual machine, for its guest software. The guest software is not limited to user applications; many hosts allow the execution of complete operating systems. The guest software executes as if it were running directly on the physical hardware”.

This basically means you can run multiple virtual computers on a single physical machine, each with its own separate operating system and access to the network.

How I use virtualization

I use virtualization for the following:

1. Development servers: Running Linux in a virtual machine to provide pre-configured web servers for testing and doing development work. The main alternative would be to run a web server on the local computer (e.g. using Uniform Server). However, the key advantage in using virtualization is that the server installation is practically identical to the deployment installation – this is important since many dependencies of web applications such as PDF and Excel generation may work slightly differently on different operating systems. By using virtualization, these differences are eliminated.

2. Testing IE6. Internet Explorer 6 is unfortunately still used in many corporate environments and still has a significant market share. It is an insecure and deeply broken browser, but unfortunately it still needs to be tested. Instead of keeping it installed on a separate Windows XP system, I keep a virtual machine around with IE6 and use it for testing purposes.

3. Trying out beta software and server software. In particular, I try out alternative Linux server setups, which is often easier on fresh virtual machines than on a single server. If there are any problems, I can just revert the whole system to the previous snapshot.

One interesting website related to this is Bitnami, which offers ready-made virtual machines for various open source software tools. Worth trying if you want a particular web application working with minimal hassle.

4. Low-hassle deployment of temporary/personal stuff. Also known as “poor-man’s-Xen-virtualization”. I am planning on doing a small BIND DNS client setup soon to provide hostnames for the home LAN. I will probably build and test out the configuration on my main computer, then just copy it to an older computer.

While there is a performance loss from deploying on consumer grade hardware with a consumer-oriented virtualization solution, it also allows you to use almost any hardware to host virtual machines (more lightweight hypervisors like Xen have rather specific hardware requirements).

And I am thinking about:

5. Work desktop virtualization (looking into this). I move a lot between computers and hate having to spend time installing and customizing the different applications I use. In the near future, I am planning to move my work setup from a physical machine to a virtual machine, meaning that when I change from one physical machine to another, the only thing I need to do is copy the virtual machine and have all my programs and settings remain intact. There are two things holding me back: one, I would prefer to have my files stored outside VMs (I am a bit worried about losing them if the virtual machine gets corrupted somehow) and two, graphics cards are not supported on current virtualization solutions which is bad for graphics performance.


9
Sep 09

PhpDocumentor 1.4.3 gotchas

Here are three minor gotchas:

  1. To ignore a directory, use -i path/relative/to/the/src/root/ with a “/” at the end (or “\” for Windows). You MUST have that trailing slash, otherwise the directive is treated differently (as a filename match).
  2. To ignore multiple directories, you cannot use multiple -i directives. If you do, they will overwrite one another, and only the last one will be applied. Instead, use a comma to separate the paths: -i first/path/,second/path/
  3. If you happen to use 1.4.3 and use the “HTML:frames:earthli” template, you will notice that the images and css files won’t load because they are missing one character at the end for some reason. Solution: rename or copy the template from a previous version to “./PhpDocumentor/Converters/HTML/frames/earthli”.

Really, this is simple but the documentation is rather unclear as to how this works (particularly the point about the significance of the last slash character). I found some comments in the bug tracker which helped me find the correct syntax (here). Both of these things are implied in the documentation, but I figure writing this might save someone else some time.
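If you call PhpDocumentor from a build script, the first two gotchas are easy to guard against in code. The sketch below (with a hypothetical function name) adds the trailing slash where it is missing and joins all directories into a single -i option.

```python
def build_ignore_option(directories, sep="/"):
    """Build one '-i' option that ignores all of the given directories.

    A trailing separator is added where missing, and the paths are joined
    with commas so that only a single -i directive is passed.
    """
    paths = [d if d.endswith(sep) else d + sep for d in directories]
    return ["-i", ",".join(paths)]


print(build_ignore_option(["first/path", "second/path/"]))
```

On Windows, pass sep="\\" to get backslashes instead.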