NodeJS for everyday things

HEADS UP! This article was written for an older version of node. More up-to-date information may be available elsewhere.

Everyday things:

Those little programs you write quickly to get something done like counting pages in a text document.

Page counting

Recently I was writing an essay and I needed to calculate a page count. My text editor was great at giving me a word count. I used the result and divided it by 350 to ball park my page count. This became problematic because I'm lousy at doing math in my head. Next I decided to chain some Unix commands together to do the job. I played with Bash, Unix's wc and cut commands I came up with this:

#!/usr/bin/env bash
let x
=$(wc -w $1 | cut -d\  -f 1)/350
echo
"Page count for $1 is $x";

This was nice. It was short. It worked but I wanted to procrastinate a little more. I started thinking about the lack of nuance in this approach. What was wc really counting? How was it handling punctuation and new lines? I could have tracked down the source code for wc but that seemed a little excessive. I pondered on - process the file with sed before piping it to wc; wrap it in a function for processing multiple files; keep a running total and pipe stats into an SQLite database. I found myself running down the rabbit and out to the sea. It is truly impressive what you can conjure up from the Unix command line and build into a nice little complicated shell script. Sometimes it is easier to visualize something in one language. I thought of Node.

The page count problems are basically a simple analysis of text with some accounting. Could I quickly write a program using node to do something so mundane? Yes and it was surprisingly straight forward. I fired up node-repl and started playing around before typing up this:

word-count.js
#!/usr/bin/env node
/**
 * Calculate the approximate the number of "pages" in a text document based on
 * a word count of 350 words per page.
 */


var sys = require('sys'),
    fs
= require('fs');

/* Include some instructions for when I forget how this works. */
function USAGE(message) {
 
var error_code = 0;
  sys
.puts("\n  USAGE: pagecount FILENAME\n\n" +
   
"  Show the estimated page count of a file based on 350 words per page.\n" +
   
"  FILENAME should be the name of a utf-8 encoded text file.\n" +
   
"\b\n" +
   
"  Example:\n\t\tpagecount MyFile.txt\n\n  Estimates the page count of MyFile.txt\n");

 
if (message !== undefined) {
    sys
.puts(message);
    error_code
= 1;
 
}
  process
.exit(error_code);
};

/* PageCount() analyze the file and displays the results */
function PageCount(filename) {
  fs
.stat(filename, function (stat_error, stat) {
   
if (stat_error) {
      USAGE
("ERROR: " + filename + ", " + stat_error);
   
}

   
if (stat.isFile()) {
      fs
.readFile(filename, 'utf8', function (read_error, content) {
       
var subtotal_words = 0;
       
if (read_error) {
          USAGE
("ERROR: Can't read " + filename + ". " + read_error);
       
}
       
/* Replace all non-letter characters with a single space. */
        subtotal_words
= content.replace(/\W+|\s+/gm,' ').split(' ').length;
        page_count
= (subtotal_words/350);
       
if (Number(page_count).toFixed(0) <= 1) {
          sys
.puts(filename + "(" + subtotal_words + " words): " +
           
Number(page_count * 100).toFixed(0) + "% of one page.");
       
} else {
          sys
.puts(filename + "(" + subtotal_words + " words):" +
           
Number(page_count).toFixed(2) + " pages.");
       
}
     
});
   
} else {
      USAGE
("ERROR: " + filename + " is not a file.");
   
}
 
});
};

if (process.argv.length < 3) {
  USAGE
();
}

/* For each file I want to tally up call PageCount() */
for (var i = 2; i < process.argv.length; i += 1) {
 
PageCount(process.argv[i]);
}

pagecount reports the plain text version of this essay is 636 words or 1.82 pages including the source code examples. It's a little longer than my shell script. On the other hand it is easier to read and with a little modification I can embed it as a web service or put it into a web page.


View the discussion thread.blog comments powered byDisqus