Saturday, October 30, 2010

exploring Google App Engine

Started to explore the Google App Engine, a cloud computing platform that enables developers to plug into the Google platform and services. It supports Python and Java (with some limitations) Web app frameworks.

Here are some links that I found useful to begin with:

Many books are already available on this topic:

O'Reilly: Programming Google App Engine (Nov 2009), Using Google App Engine (May 2009)
Apress: Developing with Google App Engine (Feb 2009), Beginning Java Google App Engine (Dec 2009)

It is a pity that PHP is not supported so far, but people have found ways to get around it by using Quercus, a pure Java implementation of PHP.

Friday, October 29, 2010

move posts from blogsome to Blogger

I used to like the free WordPress blog service provided by blogsome and occasionally updated my technical notes at woshiadai.blogsome.com. But its online editor is really painful to use and way behind the Blogger editor.

Today, after trying and failing to set up the SyntaxHighlighter JavaScript code highlighter on blogsome, I decided to move from blogsome to Blogger.

But how do you export existing posts and import them into Blogger? After a bit of Google searching, here is what you should do (assuming your blogsome blog is foo.blogsome.com):

1. export existing blogsome posts in RSS 2.0 format in an XML file

- log into blogsome as admin
- go to manage->posts and write down the id of your latest post, say 125; this is the total number of posts you have in your blogsome blog
- go to options->reading->syndication feeds, change "show most recent" from x (the default is 10) to 125, and change "for each article, show" to full text
- open a new browser window and enter http://foo.blogsome.com/feed/; you will get all existing posts in RSS 2.0 format. Save the current page as feed.xml
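
Before converting, it may be worth checking that the saved feed.xml really contains every post. Here is a minimal Java sketch, assuming the feed is plain RSS 2.0 with one item element per post. The class and method names (FeedItemCounter, countItems) are made up for illustration and are not part of any blogsome or Blogger tool.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper: counts the <item> elements (posts) in an RSS 2.0 feed.
final class FeedItemCounter {

    // Parses the stream as XML and returns the number of <item> elements.
    static int countItems(InputStream rss) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(rss);
            // In RSS 2.0 each post is one <item> inside <channel>.
            return doc.getElementsByTagName("item").getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse feed as XML", e);
        }
    }

    // Convenience overload for feeds that are already in memory.
    static int countItems(String rssXml) {
        return countItems(new ByteArrayInputStream(rssXml.getBytes(StandardCharsets.UTF_8)));
    }
}
```

Calling FeedItemCounter.countItems(new java.io.FileInputStream("feed.xml")) on the saved file should give the number of posts you expected (125 in the example above); a smaller number suggests the "show most recent" setting was not saved.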

2. convert exported blogsome feed.xml to a format that Blogger can recognize/accept

Blogger has some strange behavior: if you directly import this feed.xml, the slide bar keeps going back and forth and Blogger gives a very hard-to-understand error (Blogger, please fix this bug, or at least provide a more meaningful explanation of this error):

"Sorry, the import failed due to a server error. The error code is bX-tjg9ds"

I also tried to export the blogsome posts in Atom format, which is supported by Google, but no luck: still the same error. Then I googled the error code and found that someone already has a solution!

Simply go to the wordpress2blogger conversion tool site, use "choose file" to upload your feed.xml, click "convert", and then save the result as feed-converted.xml

3. import feed-converted.xml to Blogger

Blogger has a detailed tutorial on that already, so I won't repeat it here. Just follow the instructions there. Unless you need to hide some posts, you should select "automatically publish all imported posts" on the import page.

That's it, you have a new life on Blogger!

References:

Firefox: 50000000 downloads

Firefox is celebrating 50 million downloads now

I have been using it for a long time now; it is quite neat and fast. You don’t have to worry about IE security holes and non-standard M$ tags anymore.

BTW, this is a test post using BlogJet. The only thing I found annoying with this desktop blog publishing tool is its poor support for encodings, e.g. I cannot post Simplified Chinese content.

check if a database foo exists in MySQL

Problem: I want to see if a database named “foo” exists on a MySQL server

At first sight this should not be a tough problem. However, I googled the topic and did not find an official answer, only several quick&dirty solutions. Please let me know if there are better and cleaner ways to do it.

Setup: MySQL v4.1.9 + mysql-connector-java v3.1.6 + J2SE v1.4.2_06

Solution 1: Catch the exception for “unknown database %s” (I used this one finally).

If you create a new Connection to a URL that points to a non-existent database in MySQL, an SQLException is thrown with the message “Unknown database ‘%s’”, where %s is the database name. The exception has an error code of 1049 (int) and an SQL state of 42000 (String). Check the MySQL error code list for the full list of possible errors.

So, I just check the SQLException error code and/or SQL state to confirm that this particular exception occurred, which tells me whether a database with the given name exists.

        // assume the database exists until MySQL reports otherwise
        boolean isExistRepos = true;

        try{
            connection = DriverManager.getConnection(reposURL, "root", "rootpassword");
        }catch (SQLException se){
            // walk the exception chain until "unknown database" is found
            while(se != null && isExistRepos){
                String logMessage = "\n\n An SQL Error Occurred: "
                                  + se.getMessage() + "\n\t"
                                  + "Error Code: " + se.getErrorCode()
                                  + "\n\t" + "SQLState: "
                                  + se.getSQLState() + "\n";
                System.err.println(logMessage);

                //repos does not exist and the connection cannot be set up
                //MySQL error list (http://dev.mysql.com/doc/mysql/en/error-handling.html)
                //#Error: 1049 SQLSTATE: 42000 (ER_BAD_DB_ERROR)
                //Message: Unknown database '%s'
                //"42000".equals(...) also copes with a null SQL state
                if((se.getErrorCode() == 1049) && "42000".equals(se.getSQLState()))
                    isExistRepos = false;
                se = se.getNextException();
            }
        }

Solution 2: Call mysql command line client using Runtime class.

I did not try it, but here is the general idea. MySQL comes with a command line client program called “mysql” that lets users interact with the database server. You can type in “show databases” and a list of existing databases will be presented.

In Java, we can use Runtime.getRuntime().exec("mysql -u root -prootpassword") to get a handle on a Process object (note that there is no space after -p; with a space, the client prompts for the password and treats the next word as a database name). Then we can use the process's OutputStream and InputStream to send “show databases” and parse the output to see if the database foo is in the list of databases.
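
Here is a rough sketch of what Solution 2 could look like. Like the text above, it is untested against a real server. It assumes a newer Java than the J2SE 1.4 in my setup (ProcessBuilder, generics and try-with-resources), assumes the mysql client is on the PATH, and passes the statement with the client's -e option instead of typing it into the process's input. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper around the mysql command line client.
final class MysqlClientCheck {

    // Parses the output of: mysql -N -B -e "SHOW DATABASES"
    // (-N drops the column header, -B prints one plain value per line).
    static List<String> parseDatabaseList(String output) {
        List<String> names = new ArrayList<>();
        for (String line : output.split("\\r?\\n")) {
            String name = line.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Runs the client and reports whether dbName is among the listed databases.
    static boolean databaseExists(String user, String password, String dbName)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(
                "mysql", "-u", user, "--password=" + password,
                "-N", "-B", "-e", "SHOW DATABASES");
        // Let warnings and errors go to our own stderr so they do not pollute the list.
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = pb.start();

        StringBuilder out = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        if (process.waitFor() != 0) {
            throw new IOException("mysql client exited with an error");
        }
        return parseDatabaseList(out.toString()).contains(dbName);
    }
}
```

With this in place, MysqlClientCheck.databaseExists("root", "rootpassword", "foo") would answer the original question, at the cost of depending on an external program and putting the password on the command line.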

Solution 3: Check if the directory corresponding to the database exists in MySQL data directory.

I just found that for each database there is a directory with the same name in the MySQL data directory, e.g. C:\Program Files\MySQL\MySQL Server 4.1\data in my case. So, you can just check whether the directory foo exists there to tell whether the corresponding database exists. I am not sure about MySQL internals, so this solution may be neither stable nor portable.
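
The directory check itself is only a couple of lines. This is a sketch under the same caveat: it assumes the server keeps one directory per database directly under its data directory, which I have only observed on my own installation. The class and method names are made up.

```java
import java.io.File;

// Hypothetical helper: looks for a per-database directory under the MySQL data directory.
final class DataDirCheck {

    // Returns true if dataDir contains a directory named dbName.
    static boolean databaseDirExists(File dataDir, String dbName) {
        return new File(dataDir, dbName).isDirectory();
    }
}
```

For my setup the call would be DataDirCheck.databaseDirExists(new File("C:\\Program Files\\MySQL\\MySQL Server 4.1\\data"), "foo"). It only works when the Java program runs on the same machine as the server and can read that directory.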

create a new database foo in MySQL using JDBC

Problem: I want to create a new database in MySQL using JDBC

Usually when people work with JDBC, they need a Connection object to the destination database. But since the database we want to create does not exist yet, where do we get a connection from?

Setup: MySQL v4.1.9 + mysql-connector-java v3.1.6 + J2SE v1.4.2_06

Solution 1: Create a Connection to “mysql” admin database and use it to create the new database.

There are two preloaded databases when you install MySQL: mysql and test. mysql is the admin database that keeps metadata like access control information. We can just create a Connection object to mysql admin database and use it to create another new database.
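
A minimal sketch of Solution 1, assuming a newer Java than my J2SE 1.4 setup (try-with-resources) and a MySQL JDBC driver on the classpath; with an old driver you may need to call Class.forName on the driver class first. The class name, the method names and the conservative name check are my own illustration, not part of JDBC or MySQL.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper: creates a database through a connection to the "mysql" admin database.
final class CreateDatabaseSketch {

    // Builds the CREATE DATABASE statement. Only letters, digits and underscores
    // are accepted here, a deliberately strict rule to keep the SQL safe.
    static String createDatabaseSql(String dbName) {
        if (dbName == null || !dbName.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("unsupported database name: " + dbName);
        }
        return "CREATE DATABASE `" + dbName + "`";
    }

    // Connects to the preloaded "mysql" database and issues CREATE DATABASE from there.
    static void createDatabase(String host, String user, String password, String dbName)
            throws SQLException {
        String adminUrl = "jdbc:mysql://" + host + ":3306/mysql";
        try (Connection conn = DriverManager.getConnection(adminUrl, user, password);
             Statement stmt = conn.createStatement()) {
            stmt.executeUpdate(createDatabaseSql(dbName));
        }
    }
}
```

Calling CreateDatabaseSketch.createDatabase("localhost", "root", "rootpassword", "foo") would then create foo, provided the account has the privilege to create databases.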

Solution 2: Use Runtime.exec(command) to call mysql command line client.

Solution 3: Create a directory with the same name as the new database in the MySQL data directory, e.g., C:\Program Files\MySQL\MySQL Server 4.1\data, and the MySQL server will “think” a new database has been created. This method might risk corrupting metadata, although I have tried it before and observed no abnormal behavior.

Represent inner classes in UML

Problem: How to represent inner classes in the class diagram in UML?

I am a TA for the Software Methodology class this quarter and we use lots of UML diagrams. When grading students’ homework, I ran into the problem of how to represent inner classes in a class diagram.

After consulting the UML 1.5 spec and other resources, I found two ways.

Solution 1: (from Holub Associates: UML Reference Card)

Nesting, Inner Class. Identifies nesting (containment) relationships in all diagrams. In a class diagram: an “inner” class whose definition is nested within another class definition. Typically puts the inner class in the name space of the outer class.

Solution 2: Use a package symbol (from UML 1.5 spec, chapter 3 UML Notations (PDF), 3.48.2)

“Note that nested notation is not the correct way to show a class declared within another class. Such a declared class is not a structural part of the enclosing class but merely has scope within the namespace of the enclosing class, which acts like a package toward the inner class. Such a namescope containment may be shown by placing a package symbol in the upper right corner of the class symbol. A tool can allow a user to click on the package symbol to open the set of elements declared within it. The “anchor notation” (a cross in a circle on the end of a line) may also be used on a line between two class boxes to show that the class with the anchor icon declares the class on the other end of the line.”
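
To make the spec's point concrete on the code side, here is a tiny Java example (the names Outer and Inner are placeholders). The inner class is declared inside the outer class and is reached through the outer class's namespace, which is what the package symbol or anchor notation is meant to express.

```java
// Outer plays the role of the enclosing class, i.e. the namespace for Inner.
final class Outer {

    // Inner is declared within Outer and is referred to from outside as Outer.Inner.
    final class Inner {
        String describe() {
            return "Inner is declared within " + Outer.class.getSimpleName();
        }
    }
}
```

From outside, the class is written as Outer.Inner and an instance is created with new Outer().new Inner(). Reflection reports the same declaring relationship: Outer.Inner.class.getDeclaringClass() is Outer.class.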

Using Jakarta Commons CLI

Several examples from Jakarta Commons Cookbook by Timothy M. O’Brien, O’Reilly, Nov 2004

Example 1: Parsing a Simple Command Line

import org.apache.commons.cli.CommandLineParser;
import org.apache.commons.cli.BasicParser;
import org.apache.commons.cli.Options;
import org.apache.commons.cli.CommandLine;

public static void main(String[] args) throws Exception {

    // Create a Parser
    CommandLineParser parser = new BasicParser( );
    Options options = new Options( );
    options.addOption("h", "help", false, "Print this usage information");
    options.addOption("v", "verbose", false, "Print out VERBOSE information" );
    options.addOption("f", "file", true, "File to save program output to");

    // Parse the program arguments
    CommandLine commandLine = parser.parse( options, args );

    // Set the appropriate variables based on supplied options
    boolean verbose = false;
    String file = "";

    if( commandLine.hasOption('h') ) {
        System.out.println( "Help Message" );
        System.exit(0);
    }

    if( commandLine.hasOption('v') ) {
        verbose = true;
    }

    if( commandLine.hasOption('f') ) {
        file = commandLine.getOptionValue('f');
    }
}

Example 2: Using OptionGroup

Example 3: Print Usage Info

import org.apache.commons.cli.CommandLineParser;
import org.apache.commons.cli.BasicParser;
import org.apache.commons.cli.Options;
import org.apache.commons.cli.OptionBuilder;
import org.apache.commons.cli.OptionGroup;
import org.apache.commons.cli.CommandLine;
import org.apache.commons.cli.HelpFormatter;

public class SomeApp {
    private static final String USAGE = "[-h] [-v] [-f <file> | -m <email>]";
    private static final String HEADER =
        "SomeApp - A fancy and expensive program, Copyright 2010 Blah.";
    private static final String FOOTER =
        "For more instructions, see our website at: http://www.blah123.org";

    public static void main(String[] args) throws Exception {

        // Create a Parser
        CommandLineParser parser = new BasicParser( );
        Options options = new Options( );
        options.addOption("h", "help", false, "Print this usage information");
        options.addOption("v", "verbose", false, "Print out VERBOSE information" );

        // -f and -m are mutually exclusive, so they go into one OptionGroup
        OptionGroup optionGroup = new OptionGroup( );
        optionGroup.addOption( OptionBuilder.hasArg(true).withArgName("file")
                                            .withLongOpt("file").create('f') );
        optionGroup.addOption( OptionBuilder.hasArg(true).withArgName("email")
                                            .withLongOpt("email").create('m') );
        options.addOptionGroup( optionGroup );

        // Parse the program arguments
        try {
            CommandLine commandLine = parser.parse( options, args );

            if( commandLine.hasOption('h') ) {
                printUsage( options );
                System.exit(0);
            }

            // … do important stuff …
        } catch( Exception e ) {
            System.out.println( "You provided bad program arguments!" );
            printUsage( options );
            System.exit(1);
        }
    }

    private static void printUsage(Options options) {
        HelpFormatter helpFormatter = new HelpFormatter( );
        helpFormatter.setWidth( 80 );
        helpFormatter.printHelp( USAGE, HEADER, options, FOOTER );
    }
}

About BitKeeper not free anymore

Recently, I have been working on an assessment paper about various existing SCM systems. Basically, we see feature comparison matrices all the time as a marketing tool to sell SCM-X; however, people generally don’t explicitly state the scenarios behind the feature comparisons. For example, you can claim SCM-X does commits 20% faster than SCM-Y, but how big the commit size, change size, etc. are is not clearly explained. We want to fill the void.

Back to the topic: my reading about BitKeeper started from its feature comparison matrix with Subversion: http://www.bitkeeper.com/Comparisons.Subversion.html. Then, on the Subversion site, developers jumped out and tried to debunk BitKeeper’s false claims: http://subversion.tigris.org/bitmover-svn.html. Then it turned out that BitMover, BitKeeper’s company, is going to withdraw this free product because of many reverse-engineering attempts in the open-source world. Although Linus himself enjoyed using BitKeeper a lot for kernel development, it will happen soon; kerneltrap.org has detailed coverage of that, along with Linus’s original email. Apparently, Linus does not like Subversion at all (see the P.S. part), and Karl Fogel, on behalf of the Subversion team, responded to Linus’ comments on Subversion.

Watching a war like this is quite a fresh experience for me: previously I thought all those open-source developers were shy, gentle, silent people with long hair or no hair. Well, it is my first time watching the flames between them.

Interestingly enough, there is a survey about “my favorite FOSS source control system” at the side bar of the editorial.

Total votes: 2393

Subversion: 37%

CVS: 24%

Darcs: 10%

Depends on project: 7%

GNU Arch: 6%, not listed: 6%

Bazaar-NG: 3%

Bazaar: 1%

Vesta: 0%, Codeville: 0%

Subversion experience

My Experiences With Subversion by Simon Tatham

1. Introduction

When I’m not at work, I’m a free software developer. I maintain a variety of published projects, ranging from fairly major things like PuTTY to tiny little Unix utilities; and I have almost as wide a variety of unpublished projects as well, ranging from half-finished major programs to my personal .bashrc. Until November 2004, all these projects were stored in CVS, along with probably 90% of the other free software in the world.

Then I migrated to Subversion. This took a fair amount of thought and effort to do well, and shortly afterwards I was asked by a colleague if I could write something about my experiences. He was probably expecting something more like a couple of paragraphs, but I thought, hey, why not do the job right?

This article is not a rant. In general, I have found Subversion to be linearly superior to CVS and I certainly don’t regret migrating to it. The article is just an attempt to share my experiences: things to watch out for, how to get the most out of Subversion, that sort of thing.

More >>>

BitKeeper Testdrive

BitKeeper test drive documentation from their website.

A discontinued tutorial for BitKeeper by Zac. It is quite interesting to see the reason for discontinuing the tutorial:

“(DISCONTINUED) A guide for starting into the world of Bit Keeper, a really cool source control system. This is the document I wish I had when I was getting started.. [UPDATE] Cancelled to do me finding out what the company behind the product is really like. Too bad, because it seems like it’s actually a pretty good product.”

I think people who enjoy free lunches should not complain too much when the free lunches suddenly stop. Everyone has a family to support and needs money. But if I were the vendor, I would rather state my intentions clearly at the very beginning than give the impression that I want to make money from a user base that was attracted by the free service.

Notes for by Craig Grannell

1. CSS shorthand for boxes:

margin: 10px - applies to all edges
margin: 10px 20px - 10px applies to the top and bottom edges; 20px applies to the left and right edges
margin: 10px 20px 30px - 10px applies to the top, 20px to the left and right, 30px to the bottom
margin: 10px 20px 30px 40px - clockwise order starting from the top (top, right, bottom, and left)
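
The expansion rule is easy to get wrong from memory, so here it is spelled out as a small Java sketch. It is only an illustration of the rule, not how any browser implements it: it assumes the values are separated by whitespace, as CSS requires, and the class and method names are made up.

```java
// Hypothetical helper: expands a CSS box shorthand into its four edges.
final class MarginShorthand {

    // Returns the values in the order {top, right, bottom, left}.
    static String[] expand(String shorthand) {
        String trimmed = shorthand.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("no values given");
        }
        String[] v = trimmed.split("\\s+");
        switch (v.length) {
            case 1: // one value: all four edges
                return new String[] {v[0], v[0], v[0], v[0]};
            case 2: // two values: top/bottom, then left/right
                return new String[] {v[0], v[1], v[0], v[1]};
            case 3: // three values: top, left/right, bottom
                return new String[] {v[0], v[1], v[2], v[1]};
            case 4: // four values: clockwise from the top
                return new String[] {v[0], v[1], v[2], v[3]};
            default:
                throw new IllegalArgumentException("expected 1 to 4 values: " + shorthand);
        }
    }
}
```

For example, MarginShorthand.expand("10px 20px 30px") gives top 10px, right 20px, bottom 30px, left 20px.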

2. Applying styles to a Web page:

a) use a link tag:
[link rel="stylesheet" type="text/css" href="mystylesheet.css" /]

b) use import:
[style type="text/css"]
@import url(mystylesheet.css);
[/style]

c) embed styles in HTML document:
[head]
[style type="text/css" media="all"]
p{
color: black;
}

navigation p{

color: blue;
font-weight: bold;
font-size: 120%;
}
[/style]
[/head]

d) inline styles:
[p style="color: red;"]This is painted in Red.[/p]

3. DOCTYPE declarations:

a) XHTML strict:
[!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"]

b) XHTML transitional: good for deprecated tags
[!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"]

c) XHTML frameset:
[!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd"]

4. Go to top of a page:

a) [a id="top" name="top"][/a]
[a href="#top"]Back to top[/a]

b) some browsers ignore empty elements, so we need to put a single
space in between; if we use XHTML Strict, we need to put elements into
a block element such as [p] or [div].
[div id="topOfPageAnchor"]
[a id="top" name="top"] [/a][/div]

define CSS
div#topOfPageAnchor{
position: absolute;
top: 0;
left: 0;
height: 0;
}

c) use Javascript:
[div id="topOfPageAnchor"]
[a id="top" name="top"] [/a][/div]

[a href="#top" onclick="javascript: scrollTo(0,0);"]Top of page[/a]

5. Attaching Javascript:

a) External
[script type="text/javascript" src="javascriptfile.js"][/script]

b) Internal
[script type="text/javascript"] script content [/script]

6. Toggling div visibility with Javascript:

[div][a href="#" title="show text" onclick="swap('hiddenDiv');return false;"]show text[/a][/div]
[div id="hiddenDiv" style="display: none;"][p]hello[/p][/div]

7. pseudo-class selectors:

For anchors

a{color: #3366cc;}
a:link{color: #3366cc;}
a:visited{color:#777700;}
a:active{color:#cc00ff;}
a:hover{color:#0066ff;}

Useful resources:

CSS hacks for different Web browsers
W3C Markup Validation
W3C Link Checker
W3C CSS Validation
iCapture shows your page in Safari
CSS switcher demo

Experiment with Perforce

Perforce is one of the popular SCM systems used by many IT companies. It uses the RCS ,v file format for its history archive files. The network communication model is client/server over TCP/IP. I am not sure about the detailed protocol Perforce uses.

Some quick points after reading Perforce user guide and administration guide:

1) Configuration and setup is a bit complicated. The document is not very straightforward about the setup process, meaning of basic concepts and usage of the environment variables. After reading the whole user guide, you finally get the basic ideas about various elements of Perforce and how they should be used, but not at the very beginning.

2) Too many commands and options. There are just too many commands and options to get a job done. And the worst part is that some commands have different names from their CVS or Subversion counterparts, which are well known to SCM users. Perforce also uses its own terms for well-understood concepts, e.g. integrate vs. merge.

3) Perforce provides support for a wide range of platforms and offers multi-language API for 3rd-party developers. The Web client interface is really cool and it should save a lot of time for beginners.

4) Perforce reuses RCS history archive file format and maybe this is why it does not support directory versioning.

5) Internally, Perforce keeps a mapping between file extensions and file types to distinguish binary files from text files. Binary files are stored whole, while text files are stored as deltas.

6) Transactions are well supported in Perforce: either all changed files in a changelist get submitted or none of them do.

7) Branching is basically a copy operation, and Perforce keeps track of the branching internally so that the history information is preserved.

8) In Perforce, a job records what needs to be done and a changelist stores the actual changes. They can be linked to represent the scenario of a bug report and how the bug is fixed.

Concepts in Perforce:

Depot: repository on Perforce server, it is a directory called “depot” in the perforce root directory on the server.

Workspace: isolated place where users get their job done. Users define a mapping between depot directory structure and workspace directory structure.

View: a mapping between files in depot and files in workspace.

Changelist: similar in spirit to the changeset concept. It records changes made as a transaction. There is a default changelist, and users can also define individual changelists called “numbered changelists”.

Job: a description of problems that should be solved, e.g. a bug report. Jobs are linked to changelists that actually make changes happen and fix the problems.

Label: similar to a label in RCS, CVS and Subversion. Use a label to make a logical group of files, e.g. alpha release, bug fix, etc.

Branch: copy of files at a different directory. There are two approaches for branching: based on file specifications (from-files to-files) or branch specifications (name the branch mapping).

Perforce User Guide (2005.1)
Perforce Admin Guide

MDA and Software Factories

I just read some recent articles and literature about the idea of “building software at a higher level (than raw source code)” such as MDA, software workbenches, software factories, etc. It seems that people are starting or have started (e.g. with tools like OptimalJ from Compuware) to use models as an efficient way of building software, leaving the complexity issues to automation (as much as possible). It also appears to me that domain-specific languages might emerge as the major programming languages for specific domains. Actually, lots of XML documents have been serving as configuration files, preferences and options for quite some time; those are in some sense DSLs too.

Here are some interesting projects and articles from Eclipse and VS.NET 2005:

1) EMF+GEF+GMF: those are the modeling frameworks in Eclipse. EMF is quite similar in spirit to MDA proposed by OMG, GEF provides the framework to write graphical editors for editing EMF models. GMF (http://www.eclipse.org/gmf/) is a newly proposed project that aims to add automatic generation of GEF editors for EMF models, which bridges the gap between EMF and GEF.

2) Other MDA-related projects at Eclipse: GMT (http://www.eclipse.org/gmt/) is a set of research tools for generative model transformation. This one seems quite interesting because we are thinking about model transformation as well for our wizard-based design tools. MDDi (http://www.eclipse.org/proposals/eclipse-mddi/index.html) seems quite interesting as well, but it is still in its infancy.

3) MS has been pushing software modeling in VS.NET 2005, and there are ideas and tools around “software factories” (http://lab.msdn.microsoft.com/teamsystem/workshop/sf/default.aspx). They have a series of articles about software factories by Jack Greenfield. He also wrote a book (http://www.wiley.com/WileyCDA/WileyTitle/productCd-0471202843.html).

A quick comparison of UML 1.4 and 2.0

From UML Bible by Tom Pender

“The UML authors of both 1.4 and 2.0 endeavored to uphold the four-layer meta-model (M0–M3) architecture, an approach that supports the distribution of concepts across many levels of abstraction. The layering supports the specification of concepts for different purposes such as object modeling (UML) and data modeling (CWM), customization for different domains, and ultimately for different implementations.

In the UML 1.4 architecture:

The MOF provides the foundation for the UML

The UML defines a Foundation package as well as the Behavioral and Model Management features. Together these packages define all of the essential features needed to create the modeling elements used to build UML diagrams.

In the UML 2.0 architecture:

The new architecture defines an Infrastructure and a Superstructure

The Infrastructure redefines the highest level of the architecture used to create the MOF and all other MDA components.

The Superstructure is the UML portion of the architecture. The Superstructure derives all of its components from both the Infrastructure and the MOF.

The Superstructure is organized according to the three types of diagrams defined by UML, that is structural (Class, Object, and so on), behavioral (Sequence, Timing, State Machine, and the like), and supplemental (information flows, profiles, and templates).

Diagram changes from 1.4 to 2.0:

2.0 replaced the Collaboration diagram with a more limited Communication diagram.

2.0 added two new interaction diagrams: the Interaction Overview diagram and the Timing diagram.

2.0 added the Protocol State Machine.

2.0 added the Composite Structure diagram

2.0 isolated the Activity diagram with its own semantics separate from the State Machine.”

Requirement Management Tools

From a Gartner report: Agile Requirements Definition and Management Will Benefit Application Development

Principal Tools:

IBM Rational RequisitePro

Borland CaliberRM

Serena Requirements & Traceability Management

Telelogic Doors

Less Known Tools:

Apptero – Apptero 2004

Axure Software Solutions Rapid Prototyper

Compuware Reconcile (with QACenter, DevPartner)

Goda Software Analyst Pro

iRise Application Simulator

MKS Requirements 2005 (with Integrity Manager)

Sofea Profesy

SpeeDev – SpeeDev RM

SteelTrace Catalyze

TCP Integral Requisite Analyzer

Others:

3SL Cradle

UGS Teamcenter

ViewSet Pace

Vitech Core

requirement engineering tools

iRise
-iRise Studio: requirement development environment
-iRise Manager: middleware to manage requirements
-iRise Reader: client

RequisitePro

-Dynamic integration between Word and requirement database
-Integration with other IBM software development tools, for coding, testing, and maintenance
-Distributed access to requirement repository (not sure about the detailed mechanisms)
-Traceability and coverage analysis (among requirements)
-Impact of requirement changes (among requirements)
-Query capabilities based on attributes
-Requirement audit trail
-Central requirement repos
-Controlled access to repos
-Customizable requirement structure (define requirement types, each type has a unique set of attributes)
-Requirement project templates (not sure what is diff with req types)

RequisitePro has several ways to edit requirements:
1) Use Word plugin: insert new requirements directly by selecting the plugin toolbar or the right-click menu.
2) use requirement editor
3) use requirement viewer

Because all requirements are stored in the database, searching and revision control are easily taken care of.

Coverage analysis uses queries to create views that show the relationships between requirements, between requirements and use cases, etc.

Change impact analysis: my understanding of how this works is as follows. Links are set up between, for example, a requirement and a use case. When the text of the requirement changes, the link is marked as a suspect link.

RequisiteWeb: a web interface to RequisitePro
see demo (http://www3.software.ibm.com/ibmdl/pub/software/rational/web/demos/viewlets/reqpro/RequisiteWeb_V2002_viewlet.html)

In RequisitePro we can create four types of items:

1)Package: just like a folder to group things together
2)Document: 5 templates are provided: glossary, requirement management plan, supplementary requirement specification, use case specification, vision
3)View: attribute matrix, traceability matrix, traceability tree (in or out)
4)Requirement: can have the attributes of type, priority, status, difficulty, stability, origin, contact, enhancement-request, defect (the last two are connected with the ClearQuest tool)

CaliberRM from Borland

http://info.borland.com/techpubs/caliber_rm/

Features and functions are quite similar to RequisitePro.
It has a document factory to generate Word documents from Word templates.
Import and export of project data support many different formats like Word doc, ASCII files, etc.

Requirement types and attributes can be customized.

Advanced version history and traceability; integrates with SCM tools like Borland StarTeam, Merant PVCS, Microsoft SourceSafe, and Rational ClearCase.

The tool integrates with a line of Borland requirement engineering tools, e.g. Estimate Professional for workload/cost estimation and Datamart+BusinessObjects for advanced query/analysis to support the decision making process, as well as Mercury Quality Center/TestDirector.

Telelogic DOORS

1)DOORS:
2)DOORS XT: distributed requirement management
3)DOORS/Analyst: using UML diagrams to draw requirement models, a visual modeling environment
4)DOORSnet: enable Web access to the requirement management functions

DOORS:
=change tracking
=traceability
=scalability
=inherent small-scale test environment, integration with Mercury Quality Center/TestDirector for large-scale testing
=integration with other telelogic products

Serena Requirements Traceability Management(RTM)

Requirements authoring using Word, visual models defined using Serena Composer, or the Web client

A unique polling feature that allows people to vote to make decisions

Built on Oracle DB

On-line ability to collaborate, in real-time with live data while keeping a history of that collaboration.
A true data repository that provides full traceability; with no ‘limitations’ or special proprietary scripting languages
User interfaces, Word and Web, that are industry standards and commonly used in every organization
The hybrid capability of viewing requirements data in both document form and object form
True baselining and versioning capability with full history of all changes
User designed forms to capture and manage any engineering information such as defects, enhancement requests and change requests, to name a few
icAdvisor for requirements correctness

Documents and Papers

Increasing Business/IT Relevance and Adaptability: Adopting Requirements Visualization
(a META Group White Paper)

META group estimates that 60%-80% of project failures can be attributed directly to poor
requirements gathering, analysis, and management.

config Tomcat for JNDI

I had a hard time configuring Tomcat to use JNDI to access MySQL, so here is how to do it. Basically, we need to add a datasource resource, then define a resource link in server.xml.

…

…
type="javax.sql.DataSource"
password="1234ge"
driverClassName="com.mysql.jdbc.Driver"
maxIdle="2"
maxWait="5000"
username="guozheng"
url="jdbc:mysql://localhost:3306/testdb"
maxActive="4"/>

…
name="localhost">
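
Once the resource is configured, the web application looks it up through JNDI and asks it for connections. Here is a minimal sketch of that lookup. The resource name "jdbc/testdb" used in the usage note is only an assumption, since the name attribute did not survive in the snippet above, and the class and method names are made up. Tomcat binds web application resources under java:comp/env/, which is why the prefix is added.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

// Hypothetical helper: fetches a pooled connection from a JNDI datasource.
final class JndiDataSourceLookup {

    // Full JNDI name of a resource as seen from inside a web application.
    static String jndiName(String resourceName) {
        return "java:comp/env/" + resourceName;
    }

    // Looks up the DataSource and borrows one connection from it.
    // This only works inside the container, where the JNDI context is set up.
    static Connection openConnection(String resourceName)
            throws NamingException, SQLException {
        Context ctx = new InitialContext();
        DataSource ds = (DataSource) ctx.lookup(jndiName(resourceName));
        return ds.getConnection();
    }
}
```

Inside a servlet, JndiDataSourceLookup.openConnection("jdbc/testdb") would then return a connection to testdb; remember to close it so that it goes back to the pool.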

jdbc connections to DB

MySQL (http://www.mysql.com) mm.mysql-2.0.2-bin.jar
Class.forName( "org.gjt.mm.mysql.Driver" );
cn = DriverManager.getConnection( "jdbc:mysql://MyDbComputerNameOrIP:3306/myDatabaseName", sUsr, sPwd );

PostgreSQL (http://www.de.postgresql.org) pgjdbc2.jar
Class.forName( "org.postgresql.Driver" );
cn = DriverManager.getConnection( "jdbc:postgresql://MyDbComputerNameOrIP/myDatabaseName", sUsr, sPwd );

Oracle (http://www.oracle.com/ip/deploy/database/oracle9i/) classes12.zip
Class.forName( "oracle.jdbc.driver.OracleDriver" );
cn = DriverManager.getConnection( "jdbc:oracle:thin:@MyDbComputerNameOrIP:1521:ORCL", sUsr, sPwd );

Sybase (http://jtds.sourceforge.net) jconn2.jar
Class.forName( "com.sybase.jdbc2.jdbc.SybDriver" );
cn = DriverManager.getConnection( "jdbc:sybase:Tds:MyDbComputerNameOrIP:2638", sUsr, sPwd );
// (Default-Username/Password: "dba"/"sql")

Microsoft SQLServer (http://jtds.sourceforge.net)
Class.forName( "net.sourceforge.jtds.jdbc.Driver" );
cn = DriverManager.getConnection( "jdbc:jtds:sqlserver://MyDbComputerNameOrIP:1433/master", sUsr, sPwd );

Microsoft SQLServer (http://www.microsoft.com)
Class.forName( "com.microsoft.jdbc.sqlserver.SQLServerDriver" );
cn = DriverManager.getConnection( "jdbc:microsoft:sqlserver://MyDbComputerNameOrIP:1433;databaseName=master", sUsr, sPwd );

ODBC
Class.forName( "sun.jdbc.odbc.JdbcOdbcDriver" );
Connection cn = DriverManager.getConnection( "jdbc:odbc:" + sDsn, sUsr, sPwd );

DB2
Class.forName( "COM.ibm.db2.jdbc.net.DB2Driver" );
String url = "jdbc:db2://192.9.200.108:6789/SAMPLE";
cn = DriverManager.getConnection( url, sUsr, sPwd );
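To show how one of these entries is used end to end, here is a small sketch. The class and method names are my own, and it assumes a JDBC 4 driver on the classpath, which registers itself so the Class.forName call can be dropped. The ping method needs a reachable database, so treat it as an illustration rather than something to paste in blindly.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class JdbcSketch {
    // Builds a MySQL URL in the same shape as the MySQL entry above.
    static String mysqlUrl(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    // Opens a connection, runs "SELECT 1" and returns the value read back.
    // try-with-resources closes the result set, statement and connection in reverse order.
    static int ping(String url, String user, String password) throws SQLException {
        try (Connection cn = DriverManager.getConnection(url, user, password);
             Statement st = cn.createStatement();
             ResultSet rs = st.executeQuery("SELECT 1")) {
            rs.next();
            return rs.getInt(1);
        }
    }
}
```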

A dive into Web app frameworks

I started to look for a good Web application framework for the last piece of my dissertation. I am quite a newbie in the world of Web app development and was quickly swamped by the vast number of existing frameworks. Most of these use an MVC architecture: MySQL/PostgreSQL as the RDBMS, Hibernate/iBatis as the ORM layer, Spring to implement business logic, and a presentation framework such as Spring MVC, Struts/Struts2 (a merge of Struts and WebWork2), JSF, Tapestry, etc.

I took a look at Matt Raible’s AppFuse project, which is a ready-to-use framework that provides customized stacks of existing frameworks at various layers. It’s quite cool, but I would still need quite a lot of time to learn it.

Then I read comparisons of many existing frameworks and noticed a new trend favoring much simpler, easy-to-use frameworks such as Wicket, Stripes, and Click. After reading their documentation, quick start guides, and sample applications, I decided to try Wicket.

Wicket is a bit different from those MVC frameworks in that it cleanly separates UI design (in HTML) from application implementation (in Java code). Also, unlike frameworks that use a lot of XML configuration files, Wicket does not use any XML configuration or Java annotations. Judging from its rich set of live examples, its Ajax integration is quite good.

So I set out to learn and use Wicket. First, I tried to follow its online example code. However, that code is quite outdated and its Maven pom.xml file did not even work properly. I had to use the Maven Eclipse plugin (mvn eclipse:eclipse) to import the code into Eclipse. After many attempts to fix the pom file, I finally gave up. There appears to be a Wicket plugin for Eclipse (Wicket Bench from Laughing Panda), but this plugin is quite primitive and showed many errors in my Eclipse 3.2.2.

Then, luckily, I found a Wicket module for Netbeans 5.5 (a module is what Netbeans calls a plugin). It has well-documented steps for developing Wicket applications, and everything is made so simple in Netbeans. I did not even have to worry about libraries, dependency configuration, etc. The bundled Tomcat 5.5.17 is launched automatically when you run the Web application, and the browser page opens automatically as well. Nice features like the HTTP monitor and log messages are also available to help.

I think I now need to explore Netbeans 5.5 further. It’s really quite a sharp and neat tool, in some respects better than the Eclipse/MyEclipse 5.1.1 I am using now.

A note to configure Tomcat and use log4j 1.2 and commons-logging:
1. Shut down Tomcat if it is currently running.
2. Download the Commons Logging package from the Apache web site (unless you already have it).
3. Copy the commons-logging.jar file from the distribution into your Tomcat common/lib directory.
4. Download the Log4j package from the Apache web site (unless you already have it).
5. Copy the log4j.jar file from the distribution into your Tomcat common/lib directory.
6. Create a log4j.properties file in your Tomcat common/classes directory (see next section).
7. Restart Tomcat.

Sample log4j.properties file:
#
# Configures Log4j as the Tomcat system logger
#

#
# Configure the logger to output info level messages into a rolling log file.
#
log4j.rootLogger=INFO, R

#
# To continue using the "catalina.out" file (which grows forever),
# comment out the above line and uncomment the next.
#
#log4j.rootLogger=ERROR, A1

#
# Configuration for standard output ("catalina.out").
#
log4j.appender.A1=org.apache.log4j.ConsoleAppender
log4j.appender.A1.layout=org.apache.log4j.PatternLayout
#
# Print the date in ISO 8601 format
#
log4j.appender.A1.layout.ConversionPattern=%d [%t] %-5p %c - %m%n

#
# Configuration for a rolling log file ("tomcat.log").
#
log4j.appender.R=org.apache.log4j.DailyRollingFileAppender
log4j.appender.R.DatePattern='.'yyyy-MM-dd
#
# Edit the next line to point to your logs directory.
# The last part of the name is the log file name.
#
log4j.appender.R.File=/usr/local/tomcat/logs/tomcat.log
log4j.appender.R.layout=org.apache.log4j.PatternLayout
#
# Print the date in ISO 8601 format
#
log4j.appender.R.layout.ConversionPattern=%d [%t] %-5p %c - %m%n

#
# Application logging options
#
#log4j.logger.org.apache=DEBUG
#log4j.logger.org.apache=INFO
#log4j.logger.org.apache.struts=DEBUG
#log4j.logger.org.apache.struts=INFO

Some Hibernate Lessons

I have started using Hibernate in my project, and it turns out to be more difficult for a beginner than I thought, especially when the domain model is a bit complicated. But I guess it is worthwhile to learn it and fix the problems along the way, because it would be even more problematic to brew a layer like that by myself.

So, here are some of the lessons I learned.

1) Inheritance: I chose the third strategy, subclass per table (“table per subclass” in Hibernate’s terms), because it is the most beautiful solution to my problem. The lesson I learned is that I don’t need to add an extra ID to the subclass, since it uses the parent class’s ID anyway. In the database schema, however, the subclass table should still have an ID column.

2) One-to-many mapping: this is a bit similar to 1), except that now it is a foreign key reference that you don’t need to put into the POJO definition at the many-to-one end. For example, say we have a parent p and a child c; p contains a set of c and c contains a single p, so this is a bi-directional one-to-many mapping. In the table for c, there should be a column like p_id that references the id column in the table for p. But you don’t need a property called p_id in the c mapping file; otherwise, you will see a duplicated mapping error. You don’t need an attribute p_id with a getter and setter in the c POJO either; otherwise, you will see a null value returned from the c.getPId() method.
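To make lesson 2) concrete, here is a rough sketch of what the two POJOs could look like. The class names Parent and Child and the addChild helper are my own illustration, not taken from my project, and the Hibernate mapping files are left out. The point is that the child holds a reference to its parent object and has no separate p_id field.

```java
import java.util.HashSet;
import java.util.Set;

class Parent {
    private Long id;
    private final Set<Child> children = new HashSet<>();

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }
    Set<Child> getChildren() { return children; }

    // Keeps both ends of the bi-directional association in sync.
    void addChild(Child c) {
        children.add(c);
        c.setParent(this);
    }
}

class Child {
    private Long id;
    // The foreign key column is represented only by this reference;
    // there is deliberately no p_id field, getter or setter.
    private Parent parent;

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }
    Parent getParent() { return parent; }
    void setParent(Parent parent) { this.parent = parent; }
}
```

If you need the parent’s key from a child, go through the reference, e.g. c.getParent().getId().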

Here are two links for examples:

1-to-many

subclass per table

Windows 2003 Standard Server

Microsoft has started offering a lot of software to students for FREE at DreamSpark. I got a copy of Windows 2003 Standard Server and installed it over the weekend. It is much faster and more memory efficient than my old Windows 2000 server. But it is quite annoying to configure it as a daily workstation. Here are some tips that I want to note down.

1) Add a new user: Start -> Run…, type lusrmgr.msc and this opens the user and group management. Select Users, right click and select New User, then create a new user.

If you want to give the new user admin rights, right click on the newly created user. Select Properties, open the Member Of tab, click Add…, type Administrators in the “Enter the object names to select” textbox and hit the Check Names button (it will fill in the full name of the admin group), then click OK.

2) IE security tuning: IE’s default security profile is set to High, which makes it nearly unusable because it keeps bugging you to add virtually every website to the trusted zone. Here is how to set a lower security level: open IE, go to Tools, then Internet Options…, select the Security tab, change “Security Level for this zone” from High to Medium, and confirm the change.

3) Install Kaspersky 6 desktop version: this version of Kaspersky is meant for desktop computers only; there is a server version, but it is quite expensive. I have successfully installed KAV 6 on Windows 2000 server before, but installing it on Windows 2003 server SP2 is much trickier.

Modify the KAV msi file using the Orca MSI editor: open the KAV msi file in Orca, do a Ctrl-F search for “MsiNTProductType=1”, replace each occurrence with “MsiNTProductType>=1”, and save the msi file. Then you can install it without any problem.

After installation and adding the license file, when you restart, Windows 2003 server will show a blue screen. Press the famous F8 and select “safe mode with command line”. I tried to select GUI safe mode, but it just didn’t respond to keyboard.

On the command line, type the “edit” command, which gives you a basic text editor. Create a file called kav.reg with the following content:
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\kl1]
"Start"=dword:00000001

Save it and exit the editor. Then, type “regedit /s kav.reg” to import this change to the registry. Reboot the machine using “shutdown /r” command.

Finally, no more blue screen!

4) Hardware drivers: driver support is not so good on 2003 server. I could not find a driver for my RAID IDE-SATA converter card. I should have used software to back up the drivers before upgrading. A good one is called Driver Genius.

5) Adding the “show desktop” icon to the task bar: a newly created user has no show desktop icon. You can copy a file called Show Desktop.scf from C:\Documents and Settings\Administrator\Application Data\Microsoft\Internet Explorer\Quick Launch\ to C:\Documents and Settings\<username>\Application Data\Microsoft\Internet Explorer\Quick Launch\.

Or you can directly create a new Show Desktop.scf in C:\Documents and Settings\<username>\Application Data\Microsoft\Internet Explorer\Quick Launch\ with the following content:

[Shell]
Command=2
IconFile=explorer.exe,3
[Taskbar]
Command=ToggleDesktop

XHTML and CSS notes

Stylin with CSS

XHTML Structure

DOCTYPE declarations:

Strict:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

Transitional:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">

Frameset:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
"http://www.w3.org/TR/html4/frameset.dtd">

XML namespace declaration:

Content type declaration:

Symbols such as <, &:

http://htmlhelp.com/reference/html40/entities/

A simple template:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">

<head>
<meta http-equiv="Content-type" content="text/html; charset=iso-8859-1" />

<title>Title for your Web page</title>

</head>

<body>

</body>

</html>

Four ways of CSS declaration:

embed a <style></style> section in the head element
link to an external CSS file: <link href="mystylesheetprint.css" media="screen" rel="stylesheet" type="text/css" />; for print, just change to media="print"

in-line style, specified on each tag, e.g. <p style="font-size: 25pt; color: red;">

use @import in a <style></style> section in the head element. The only downside is that IE6 might show the so-called FOUC (Flash of Unstyled Content) problem, meaning the content is momentarily displayed without CSS formatting. See more at http://bluerobot.com/web/css/fouc.asp

CSS syntax-related stuff:

Contextual selector:
this limits a style to elements nested inside a given ancestor, e.g. p em{color:green;} makes only those <em> elements within <p> elements green.

Class selector:
e.g. p.green{color:green;} <p class="green">green text</p>. Note that if an element has multiple classes whose rules conflict (with equal specificity), the rule declared last in the CSS file wins.

Id selector:
p#green{color:green;} <p id="green">green text</p>
The difference between id and class selectors is that an id value is unique to one element, and ids are usually used in JavaScript as well. So, for styles unique to one element, use an id selector; for styles that can be shared among different elements, use a class selector.

Attribute selector:
selects elements based on the existence or value of an attribute. Use [href^="foo"] to match link targets that start with "foo" ([href|="foo"] matches only "foo" itself or values beginning with "foo-"). The following example adds a PDF icon after links to PDF files:
a[href$=".pdf"] {
background:transparent url(images/iconpdf.gif) no-repeat scroll right center;
padding-right:18px;
}

Unit measurement:

try to use relative units: em (the font size of the element, traditionally the width of the letter M), ex (the height of the lowercase letter x), or percentage.

Colors:

there are several ways to specify a color: #RRGGBB, rgb(R%, G%, B%), or a color name. There are only 16 color names in the spec: aqua, black, blue, fuchsia, gray, green, lime, maroon, navy, olive, purple, red, silver, teal, white, yellow.

Fonts:

serif, sans-serif, monospace, fantasy, cursive, e.g. p{font-family:sans-serif;}

Round corners:

Website Performance

This is a very nice book by the creator of YSlow, an add-on to Firebug. It lists 14 rules for high-performance websites. Very useful.

DavidHerron.com also has many resources for high performance websites.

What happens when you open a URL in your Web browser

Here is a nice article touching on this topic.
Since it is not that long, I simply copy and paste it here:

Testing Page Load Speed
Posted at 2:27 PM
One of the most problematic tasks when working on a Web browser is getting an accurate measurement of how long you’re taking to load Web pages. In order to understand why this is tricky, we’ll need to understand what exactly browsers do when you ask them to load a URL.
So what happens when you go to a URL like cnn.com? Well, the first step is to start fetching the data from the network. This is typically done on a thread other than the main UI thread.
As the data for the page comes in, it is fed to an HTML tokenizer. It’s the tokenizer’s job to take the data stream and figure out what the individual tokens are, e.g., a start tag, an attribute name, an attribute value, an end tag, etc. The tokenizer then feeds the individual tokens to an HTML parser.
The parser’s job is to build up the DOM tree for a document. Some DOM elements also represent subresources like stylesheets, scripts, and images, and those loads need to be kicked off when those DOM nodes are encountered.
In addition to building up a DOM tree, modern CSS2-compliant browsers also build up separate rendering trees that represent what is actually shown on your screen when painting. It’s important to note two things about the rendering tree vs. the DOM tree.
(1) If stylesheets are still loading, it is wasteful to construct the rendering tree, since you don’t want to paint anything at all until all stylesheets have been loaded and parsed. Otherwise you’ll run into a problem called FOUC (the flash of unstyled content problem), where you show content before it’s ready.
(2) Image loads should be kicked off as soon as possible, and that means they need to happen from the DOM tree rather than the rendering tree. You don’t want to have to wait for a CSS file to load just to kick off the loads of images.
There are two options for how to deal with delayed construction of the render tree because of stylesheet loads. You can either block the parser until the stylesheets have loaded, which has the disadvantage of keeping you from parallelizing resource loads, or you can allow parsing to continue but simply prevent the construction of the render tree. Safari does the latter.
External scripts must block the parser by default (because they can document.write). An exception is when defer is specified for scripts, in which case the browser knows it can delay the execution of the script and keep parsing.
What are some of the relevant milestones in the life of a loading page as far as figuring out when you can actually reliably display content?
(1) All stylesheets have loaded.
(2) All data for the HTML page has been received.
(3) All data for the HTML page has been parsed.
(4) All subresources have loaded (the onload handler time).
Benchmarks of page load speed tend to have one critical flaw, which is that all they typically test is (4). Take, for example, the aforementioned cnn.com. Frequently cnn.com is capable of displaying virtually all of its content at about the 350ms mark, but because it can’t finish parsing until an external script that wants to load an advertisement has completed, the onload handler typically doesn’t fire until the 2-3 second mark!
A browser could clearly optimize for only overall page load speed and show nothing until 2-3 seconds have gone by, thus enabling a single layout and paint. That browser will likely load the overall page faster, but feel literally 10 times slower than the browser that showed most of the page at the 300 ms mark, but then did a little more work as the remaining content came in.
Furthermore benchmarks have to be very careful if they measure only for onload, because there’s no rule that browsers have to have done any layout or painting by the time onload fires. Sure, they have to have parsed the whole page in order to find all the subresources, and they have to have loaded all of those subresources, but they may have yet to lay out the objects in the rendering tree.
It’s also wise to wait for the onload handler to execute before laying out anyway, because the onload handler could redirect you to another page, in which case you don’t really need to lay out or paint the original page at all, or it could alter the DOM of the page (and if you’d done a layout before the onload, you’d then see the changes that the onload handler made happen in the page, such as flashy DHTML menu initialization).
Benchmarks that test only for onload are thus fundamentally flawed in two ways, since they don’t measure how quickly a page is initially displayed and they rely on an event (onload) that can fire before layout and painting have occurred, thus causing those operations to be omitted from the benchmark.
i-bench 4 suffers from this problem. i-bench 5 actually corrected the problem by setting minimal timeouts to scroll the page to the offsetTop of a counter element on the page. In order to compute offsetTop browsers must necessarily do a layout, and by setting minimal timers, all browsers paint as well. This means i-bench 5 is doing an excellent job of providing an accurate assessment of overall page load time.
Because tests like i-bench only measure overall page load time, there is a tension between performing well on these sorts of tests and real-world perception, which typically involves showing a page as soon as possible.
A naive approach might be to simply remove all delays and show the page as soon as you get the first chunk of data. However, there are drawbacks to showing a page immediately. Sure, you could try to switch to a new page immediately, but if you don’t have anything meaningful to show, you’ll end up with a "flashy" feeling, as the old page disappears and is replaced by a blank white canvas, and only later does the real page content come in. Ideally transitions between pages should be smooth, with one page not being replaced by another until you can know reliably that the new page will be reasonably far along in its life cycle.
In Safari 1.2 and in Mozilla-based browsers, the heuristic for this is quite simple. Both browsers use a time delay, and are unwilling to switch to the new page until that time threshold has been exceeded. This setting is configurable in both browsers (in the former using WebKit preferences and in the latter using about:config).
When I implemented this algorithm (called "paint suppression" in Mozilla parlance) in Mozilla I originally used a delay of 1 second, but this led to the perception that Mozilla was slow, since you frequently didn’t see a page until it was completely finished. Imagine for example that a page is completely done except for images at the 50ms mark, but that because you’re a modem user or DSL user, the images aren’t finished until the 1 second mark. Despite the fact that all the readable content could have been shown at the 50ms mark, this delay of 1 second in Mozilla caused you to wait 950 more ms before showing anything at all.
One of the first things I did when working on Chimera (now Camino) was lower this delay in Gecko to 250ms. When I worked on Firefox I made the same change. Although this negatively impacts page load time, it makes the browser feel substantially faster, since the user clicks a link and sees the browser react within 250ms (which to most users is within a threshold of immediacy, i.e., it makes them feel like the browser reacted more or less instantly to their command).
Firefox and Camino still use this heuristic in their latest releases. Safari actually uses a delay of one second like older Mozilla builds used to, and so although it is typically faster than Mozilla-based browsers on overall page load, it will typically feel much slower than Firefox or Camino on network connections like cable modem/modem/DSL.
However, there is also a problem with the straight-up time heuristic. Suppose that you hit the 250ms mark but all the stylesheets haven’t loaded or you haven’t even received all the data for a page. Right now Firefox and Camino don’t care and will happily show you what they have so far anyway. This leads to the "white flash" problem, where the browser gets flashy as it shows you a blank white canvas (because it doesn’t yet know what the real background color for the page is going to be, it just fills in with white).
So what I wanted to achieve in Safari was to replicate the rapid response feel of Firefox/Camino, but to temper that rapid response when it would lead to gratuitous flashing. Here’s what I did.
(1) Create two constants, cMinimumLayoutThreshold and cTimedLayoutDelay. At the moment the settings for these constants are 250ms and 1000ms respectively.
(2) Don’t allow layouts/paints at all if the stylesheets haven’t loaded and if you’re not over the minimum layout threshold (250ms).
(3) When all data is received for the main document, immediately try to parse as much as possible. When you have consumed all the data, you will either have finished parsing or you’ll be stuck in a blocked mode waiting on an external script.
If you’ve finished parsing or if you at least have the body element ready and if all the stylesheets have loaded, immediately lay out and schedule a paint for as soon as possible, but only if you’re over the minimum threshold (250ms).
(4) If stylesheets load after all data has been received, then they should schedule a layout for as soon as possible (if you’re below the minimum layout threshold, then schedule the timer to fire at the threshold).
(5) If you haven’t received all the data for the document, then whenever a layout is scheduled, you set it to the nearest multiple of the timed layout delay time (so 1000ms, 2000ms, etc.).
(6) When the onload fires, perform a layout immediately after the onload executes.
This algorithm completely transforms the feel of Safari over DSL and modem connections. Page content usually comes screaming in at the 250ms mark, and if the page isn’t quite ready at the 250ms, it’s usually ready shortly after (at the 300-500ms mark). In the rare cases where you have nothing to display, you wait until the 1 second mark still. This algorithm makes "white flashing" quite rare (you’ll typically only see it on a very slow site that is taking a long time to give you data), and it makes Safari feel orders of magnitude faster on slower network connections.
Because Safari waits for a minimum threshold (and waits to schedule until the threshold is exceeded), benchmarks won’t be adversely affected as long as you typically beat the minimum threshold. Otherwise the overall page load speed will degrade slightly in real-world usage, but I believe that to be well-worth the decrease in the time required to show displayable content.
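To help myself follow the scheduling rules in the article, here is a small sketch of steps (3) to (5) as a single function. This is only my reading of the description, not Safari’s actual code: the class and method names are made up, "nearest multiple" is interpreted as the next multiple after the current time, and the stylesheet check from step (2) is left out.

```java
class LayoutScheduler {
    static final long MINIMUM_LAYOUT_THRESHOLD_MS = 250; // cMinimumLayoutThreshold in the article
    static final long TIMED_LAYOUT_DELAY_MS = 1000;      // cTimedLayoutDelay in the article

    // Returns the time (in ms since the load started) at which a layout
    // requested at nowMs should actually run.
    static long scheduledLayoutTime(long nowMs, boolean allDataReceived) {
        if (allDataReceived) {
            // Steps (3) and (4): as soon as possible, but never before the minimum threshold.
            return Math.max(nowMs, MINIMUM_LAYOUT_THRESHOLD_MS);
        }
        // Step (5): data is still arriving, so snap to the next multiple of the timed delay
        // (assumption: "nearest multiple" means the next one after nowMs).
        return (nowMs / TIMED_LAYOUT_DELAY_MS + 1) * TIMED_LAYOUT_DELAY_MS;
    }
}
```

So a page that is fully received at 100ms is laid out at 250ms, while a page that is still streaming in at 300ms waits until the 1 second mark.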

HTTP Notes

From Best Practices for Speeding Up Your Web Site

1. Redirection

HTTP/1.1 301 Moved Permanently
Location: http://example.com/newuri
Content-Type: text/html

2. About ETag

Entity tags (ETags) are a mechanism that web servers and browsers use to determine whether the component in the browser’s cache matches the one on the origin server. (An "entity" is another word for what I’ve been calling a "component": images, scripts, stylesheets, etc.) ETags were added to provide a mechanism for validating entities that is more flexible than the last-modified date. An ETag is a string that uniquely identifies a specific version of a component. The only format constraints are that the string be quoted. The origin server specifies the component’s ETag using the ETag response header.
      HTTP/1.1 200 OK
      Last-Modified: Tue, 12 Dec 2006 03:03:59 GMT
      ETag: "10c24bc-4ab-457e1c1f"
      Content-Length: 12195
Later, if the browser has to validate a component, it uses the If-None-Match header to pass the ETag back to the origin server. If the ETags match, a 304 status code is returned reducing the response by 12195 bytes for this example.
      GET /i/yahoo.gif HTTP/1.1
      Host: us.yimg.com
      If-Modified-Since: Tue, 12 Dec 2006 03:03:59 GMT
      If-None-Match: "10c24bc-4ab-457e1c1f"
      HTTP/1.1 304 Not Modified
The problem with ETags is that they typically are constructed using attributes that make them unique to a specific server hosting a site. ETags won’t match when a browser gets the original component from one server and later tries to validate that component on a different server, a situation that is all too common on Web sites that use a cluster of servers to handle requests. By default, both Apache and IIS embed data in the ETag that dramatically reduces the odds of the validity test succeeding on web sites with multiple servers.
The ETag format for Apache 1.3 and 2.x is inode-size-timestamp. Although a given file may reside in the same directory across multiple servers, and have the same file size, permissions, timestamp, etc., its inode is different from one server to the next.
IIS 5.0 and 6.0 have a similar issue with ETags. The format for ETags on IIS is Filetimestamp:ChangeNumber. A ChangeNumber is a counter used to track configuration changes to IIS. It’s unlikely that the ChangeNumber is the same across all IIS servers behind a web site.
The end result is ETags generated by Apache and IIS for the exact same component won’t match from one server to another. If the ETags don’t match, the user doesn’t receive the small, fast 304 response that ETags were designed for; instead, they’ll get a normal 200 response along with all the data for the component. If you host your web site on just one server, this isn’t a problem. But if you have multiple servers hosting your web site, and you’re using Apache or IIS with the default ETag configuration, your users are getting slower pages, your servers have a higher load, you’re consuming greater bandwidth, and proxies aren’t caching your content efficiently. Even if your components have a far future Expires header, a conditional GET request is still made whenever the user hits Reload or Refresh.
If you’re not taking advantage of the flexible validation model that ETags provide, it’s better to just remove the ETag altogether. The Last-Modified header validates based on the component’s timestamp. And removing the ETag reduces the size of the HTTP headers in both the response and subsequent requests. This Microsoft Support article describes how to remove ETags. In Apache, this is done by simply adding the following line to your Apache configuration file:
      FileETag none
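Here is a tiny sketch of why server-specific ETags defeat validation. It is only an illustration: the class and method names are mine, the ETag format is modeled on the inode-size-timestamp shape described above rather than on Apache’s exact encoding, and real If-None-Match handling also has to deal with lists of tags, weak validators and "*".

```java
class ETagSketch {
    // Formats a quoted ETag as hex inode-size-timestamp, the shape described above.
    static String serverSpecificETag(long inode, long size, long mtimeSeconds) {
        return "\"" + Long.toHexString(inode) + "-" + Long.toHexString(size)
                + "-" + Long.toHexString(mtimeSeconds) + "\"";
    }

    // Returns 304 when the tag sent in If-None-Match equals the server's current ETag,
    // otherwise 200 (meaning the full component is sent again).
    static int statusFor(String ifNoneMatch, String currentETag) {
        return currentETag.equals(ifNoneMatch) ? 304 : 200;
    }
}
```

Two servers holding the same file with the same size and timestamp but different inodes produce different tags, so a browser that got the file from one server receives a full 200 response from the other.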

3. HTTP Status Code
Those codes can be found here

Javascript Notes

Found several nice short tutorials for JS:

by Sergio Pereira

by fallenlord blog

Updated High Performance Website Tips

Latest presentation from Yahoo! Exceptional Performance Group


Regex for javascript to detect a URL

var regex = /\b([\d\w\.\/\+\-\?\:]*)((ht|f)tp(s|)\:\/\/|[\d\d\d|\d\d]\.[\d\d\d|\d\d]\.|www\.|\.tv|\.ac|\.com|\.edu|\.gov|\.int|\.mil|\.net|\.org|\.biz|\.info|\.name|\.pro|\.museum|\.co)([\d\w\.\/\%\+\-\=\&\?\:\\\"\'\,\|\~\;]*)\b/gi;

Online tester for javascript regex:

http://www.regular-expressions.info/javascriptexample.html

Short tutorial for regex in javascript:

http://www.regular-expressions.info/javascript.html
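For comparison, here is a rough sketch of the same idea in Java. It is not a port of the regex above: the pattern is a deliberately simplified one of my own that only recognizes text starting with http://, https://, ftp:// or www., and the class and method names are made up. It also does not HTML-escape the URL, which a real linkifier should do.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Linkify {
    // Scheme or "www." followed by URL characters; the last character must be a
    // word character or "/" so that trailing punctuation stays outside the link.
    private static final Pattern URL = Pattern.compile(
            "\\b(?:(?:https?|ftp)://|www\\.)[\\w\\-./%+=&?:~;,#]*[\\w/]",
            Pattern.CASE_INSENSITIVE);

    // Wraps every detected URL in an <a> tag and leaves the rest of the text untouched.
    static String linkify(String text) {
        Matcher m = URL.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());
            String url = m.group();
            // Bare "www." addresses need a scheme to be usable as an href.
            String href = url.regionMatches(true, 0, "www.", 0, 4) ? "http://" + url : url;
            out.append("<a href=\"").append(href).append("\">").append(url).append("</a>");
            last = m.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }
}
```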

Firefox 3 Beta

I run three browsers on my computer, with Opera the most frequently used one. Firefox sometimes consumes too much memory, either because of its memory leak problem or its cache implementation.

Firefox 3 betas have been out for quite some time, and I tried the latest 3.0pre from the nightly trunk builds. It is much faster and more polished, and I really love it. The only problem is that most plugins are not officially compatible with the Firefox 3 betas right now.

There are several FF plugins that I cannot live without:

Greasemonkey:

this animal allows you to write customized scripts to control the actual rendering of any page. One of the most useful scripts is Linkify ting, which automatically turns text URLs into clickable links by adding <a> tags. But the original script has bugs and cannot handle many types of URLs. So, what you can do is open that script and replace "var regex = blah blah" with the following:

var regex = /\b([\d\w\.\/\+\-\?\:]*)((ht|f)tp(s|)\:\/\/|[\d\d\d|\d\d]\.[\d\d\d|\d\d]\.|www\.|\.tv|\.ac|\.com|\.edu|\.gov|\.int|\.mil|\.net|\.org|\.biz|\.info|\.name|\.pro|\.museum|\.co)([\d\w\.\/\%\+\-\=\&\?\:\\\"\'\,\|\~\;]*)\b/gi;

There are also many scripts on UserScripts.org including many hacks on iGoogle and Gmail. There are also tutorials like Dive into Greasemonkey and books like Greasemonkey Hacks if you want to write your own scripts. It is really a lot of fun.

Firebug:

this is the ultimate toolbox for Web developers. It includes many powerful features to detect JavaScript errors and to design your XHTML and CSS files. There is another add-on to Firebug called YSlow, from the Yahoo! Exceptional Performance group, that evaluates a Web site based on several performance-improving guidelines.

Del.icio.us:

this is the Web-based bookmark solution. Although they have a new version that integrates with the Firefox bookmark manager, I still prefer the classic version that adds two buttons to the browser toolbar.

So, how do you get these plugins working with the FF 3.0 betas? It is quite straightforward, actually.

1. Manually download the XPI file: instead of clicking the install button on the plugin website (actually, those buttons are greyed out and you cannot click them at all), go to the bottom of the plugin page where there is a link for "advanced details", expand that section and click on "complete version history", then download the xpi file manually using "save as…".

2. Open the XPI file with WinRAR and edit the install.rdf file: in the target application element, there is an element called <em:maxVersion>; just change its value to 3.0pre. Save install.rdf and put it back into the XPI file.

3. In FF3, use "open file…" to open the modified XPI file. Bingo! You’ve got those old buddies back.

For the del.icio.us plugin, there is a bit of extra work due to the signature files. FF3 will show a "signing could not be verified" error. Simply open the XPI file with WinRAR, delete the META-INF folder, and save the XPI file. Here we go, the del.icio.us buttons are back too!
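If you have many plugins to patch, the edit in step 2 can be scripted. Here is a minimal sketch that rewrites the maxVersion value in the text of an install.rdf file. The class and method names are mine, it assumes the element form described above (not an attribute form), and unpacking and repacking the XPI archive is left out.

```java
import java.util.regex.Matcher;

class InstallRdfPatch {
    // Replaces the content of every <em:maxVersion> element with newMaxVersion.
    static String withMaxVersion(String installRdf, String newMaxVersion) {
        return installRdf.replaceAll(
                "(<em:maxVersion>)[^<]*(</em:maxVersion>)",
                // quoteReplacement guards against "$" or "\" in the version string
                "$1" + Matcher.quoteReplacement(newMaxVersion) + "$2");
    }
}
```

For example, withMaxVersion(rdfText, "3.0pre") leaves everything else in the file unchanged.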

Java Garbage Collection

A short article about different types of garbage collectors in Java

I never knew that Java could have memory leak problems, since I believed that the GC would do all the work of reclaiming unreachable objects. But it actually can, due to "unintentional object retention": objects that are still reachable but no longer needed. This is a good article explaining the problem and suggesting ways to deal with it using things like WeakHashMap (finally I understand what this class is useful for).
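Here is a small sketch of the WeakHashMap idea as I understand it. The MetadataCache class is a made-up example: it attaches a note to arbitrary objects without keeping those objects alive. With a plain HashMap the map itself would hold on to every key forever, which is exactly the unintentional retention problem.

```java
import java.util.Map;
import java.util.WeakHashMap;

class MetadataCache {
    // Keys are held weakly: once a key is no longer strongly reachable from
    // anywhere else, its entry becomes eligible for removal after a GC run.
    // Caveat: a value must not strongly reference its own key, or the entry never goes away.
    private final Map<Object, String> notes = new WeakHashMap<>();

    void annotate(Object key, String note) { notes.put(key, note); }

    String noteFor(Object key) { return notes.get(key); }

    int size() { return notes.size(); }
}
```

When exactly an entry disappears depends on the garbage collector, so code should never rely on the timing.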

OpenSUSE 10.3

It’s been quite some time since I last used my OpenSUSE install (I have a dual-boot machine and do most of my research projects with Windows tools). Now I am looking for a job, and UNIX familiarity seems to be a precious resource. So… I switched back to OpenSUSE. I have been using SUSE since version 8. I tried Ubuntu again and again, and its device driver problems bit me again and again, while OpenSUSE always came to my rescue.

Here are some things I did to set up the OpenSUSE 10.3 box as my daily development machine.

1) Install Sun J2SE. The default gcj is cool, but I prefer Sun’s JDK. Here is a reasonably good tutorial for setting up J2SE: http://fedorasolved.org/browser-solutions/sun-jdk. Besides using update-alternatives to add java, I also added javac in a similar manner.

There is one thing that I found incorrect in this tutorial: if you add its java.sh to /etc/profile.d/, it somehow messes up the PATH variable, and gdm cannot even start correctly. So instead I added these lines to my ~/.profile:
export JAVA_HOME=/opt/jdk1.6.0_06
export PATH=$PATH:$JAVA_HOME/bin
export MANPATH=$MANPATH:$JAVA_HOME/man

Install Netbeans (as of this writing, 6.1 is still at RC2). Netbeans 6 has caught up with Eclipse very quickly, and it does a great job of integrating lots of useful things into nice packages. Since it also has good support for C/C++ and Ruby development, I decided to use it on OpenSUSE.

However, no matter what I did, I always ran into problems during installation. First, even though only one installation wizard was running, it always said “there is another installation instance running, are you sure you want to start a new instance?” Yes, of course. Then the installation wizard crashed and Bug-Buddy, a GUI-based bug reporting application, showed up. I found someone who had the same problem; his solution was to uninstall Bug-Buddy and run the installation again. It worked!!!

BTW, you can choose to download and install the JDK together with Netbeans. In this case, you still need to add the Java plugin to Firefox and update the alternatives for java and javac.

Install Aptana Studio. This is my current Web application IDE. It supports Ajax, PHP, RoR, AIR, and iPhone development, and covers all the major Ajax libraries you will find today. And its community version is FREE! You can forget about Dreamweaver, Frontpage, M$ Expression Web, etc. Well, if you are into ASP.NET development, you probably still need VS.NET though.

Add several software update sources (I installed OpenSUSE using a live Gnome CD): OpenSUSE non-open source and open source, Mozilla, Gnome stable, etc. With these update sources, you can install software that is not available on the live Gnome installation CD.

Update Firefox to the latest stable version, which is still 2.0.0.x as of now. I have been using FF3pre, code-named Minefield, on Windows for some time now, and it simply blew me away with its performance and new features like the smart URL address field.

Install these plugins for FF: Greasemonkey, BetterGmail2, Firebug, Session Manager, del.icio.us.

Update pidgin, the IM application for Linux, to the latest stable version. Activate two plugins, “History” and “Conversation Colors”. The default font size is 12, which is too big for me, so I changed it to 10. The default theme is quite ugly, especially the system tray icon, but you can find a Pidgin theme made by Embrace on gnome-look.

Download this Human Pidgin Theme (judging by its colors, I guess it was made for Ubuntu) from here.

Then log in as root. Unpack the package and you will see two directories: pidgin and Style. Use pidgin to replace the /usr/share/pixmaps/pidgin directory, which contains the original theme. Make a backup first if you want to be able to roll back later. The Style directory is a bit confusing; what it actually does is add a background image. To use it, simply copy pidgin_bg.jpg and .gtkrc-2.0 to your home directory.
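If you prefer not to do the backup and replace by hand, the step can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the helper name replace_theme_dir and the ".orig" backup suffix are made up, and for the real pixmaps directory the script would have to run as root.

```python
import shutil
from pathlib import Path


def replace_theme_dir(new_theme, target, backup_suffix=".orig"):
    """Back up `target` once, then replace it with a copy of `new_theme`.

    Hypothetical helper; returns the path of the backup directory.
    """
    new_theme, target = Path(new_theme), Path(target)
    backup = target.with_name(target.name + backup_suffix)
    if target.exists():
        if backup.exists():
            # A backup of the original already exists, so just drop
            # whatever theme is currently installed.
            shutil.rmtree(target)
        else:
            # First run: keep the original theme around for rollback.
            shutil.move(str(target), str(backup))
    shutil.copytree(new_theme, target)
    return backup
```

For the theme above, the call would be something like replace_theme_dir("pidgin", "/usr/share/pixmaps/pidgin"). To roll back, delete the new directory and rename the ".orig" copy back.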

Now, start pidgin and you will see a more polished interface.

Install new login window themes, Gnome application window themes, and mouse cursor themes. I like the Mac look but don’t have the money right now to buy a Macbook Pro, so I can at least tune my OpenSUSE to look similar.

Application Themes: first go to http://art.gnome.org/themes/gtk2/ in Firefox. Then open the “Appearance Preferences” window (computer->control center->Appearance, under look and feel) and drag a theme from the page onto it; the theme gets installed automatically. Finally, choose the newly installed theme. I am using the Glossy P theme, which looks like the Mac style.

Login Themes: log in as root first. Type “gdmsetup” in a terminal to open the gdm settings. In the “Local” tab you will see the currently installed login themes. Choose one you like.

Or go to http://art.gnome.org/themes/gdm_greeter/, download a new login theme, and install it in the gdmsetup window. It says you can drag and drop to install, but somehow that did not work for me, so I chose “Add…” to install the theme manually. I am using the “Sunergos Blue GDM Theme” and I like it a lot: simple and beautiful.

Mouse themes: you can download a theme pack from gnome-look.org and unpack it into ~/.icons. For example, if your theme pack is named Obsidian, unpacking it into the .icons directory gives you ~/.icons/Obsidian. Rename that to ~/.icons/default, then log out and log back in to see the new cursors. I am using a theme called “ShereKhanX” to match the Mac window theme.
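The unpack-and-rename routine for mouse themes is easy to script as well. Below is a rough Python sketch, assuming the downloaded pack is an archive that shutil can unpack (zip, tar.gz, tar.bz2) and that it contains a single top-level directory named after the theme. The function name install_default_cursor_theme and its arguments are hypothetical.

```python
import shutil
from pathlib import Path


def install_default_cursor_theme(archive, theme_name, icons_dir=None):
    """Unpack a cursor theme pack into ~/.icons and make it the default.

    Hypothetical helper: `theme_name` must match the top-level directory
    inside the archive, e.g. "Obsidian".
    """
    icons_dir = Path(icons_dir) if icons_dir else Path.home() / ".icons"
    icons_dir.mkdir(parents=True, exist_ok=True)
    shutil.unpack_archive(str(archive), str(icons_dir))

    default = icons_dir / "default"
    # Remove a previously installed default theme, if there is one.
    if default.is_symlink():
        default.unlink()
    elif default.exists():
        shutil.rmtree(default)

    (icons_dir / theme_name).rename(default)
    return default
```

For the example above the call would look like install_default_cursor_theme("Obsidian.tar.bz2", "Obsidian"), where the archive file name is only a guess at what the download is called. You still need to log out and back in afterwards.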

You can also customize splash screen, but I will leave it for later.