Plain Text Reading and Processing with NodeJS, Ruby, and PHP

Hello coders!

I once wondered which one has better performance: NodeJS, Ruby, or PHP? So I decided to run a little test. I don't expect this test to say anything conclusive about overall performance, but for this specific context it tells me which one is fastest. The write-up is divided into 3 parts: preparation, result, and conclusion. I hope my little research here will help you in your development. For your information, this test uses NodeJS (0.8.9), Ruby (2.0.0p0), and PHP (5.4.17).

Preparation

The first step was creating a dummy text file. I made a simple plain-text file out of repeated lorem ipsum text. The file contains around 77,000 lines and is almost 47 megabytes in size, which I thought was adequate for the test. If you want to use the file, you can download it here: https://dl.dropboxusercontent.com/u/22086299/lipsum.txt
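
If you would rather build a similar file yourself, here is a minimal sketch in JavaScript. The helper name buildDummyText, the sample paragraph, and the output file name are assumptions for illustration; this is not how the original file was produced.

```javascript
// Hypothetical helper: put the same paragraph on `lines` separate lines.
function buildDummyText(paragraph, lines) {
  var rows = [];
  for (var i = 0; i < lines; i++) {
    rows.push(paragraph);
  }
  // Join with newlines and finish with a trailing newline.
  return rows.join('\n') + '\n';
}

// Placeholder paragraph (assumption); a longer one gives a bigger file.
var paragraph = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit.';
var text = buildDummyText(paragraph, 3);
console.log(text.split('\n').length - 1); // number of lines generated

// To write a large file to disk, something like this could be used:
// require('fs').writeFileSync('my-lipsum.txt', buildDummyText(paragraph, 77000));
```

The file size depends on the paragraph length: roughly 600 characters per line over 77,000 lines lands near the 47 megabytes mentioned above.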

Let’s write some very simple code. I started with the NodeJS version.

var fs = require('fs');

fs.readFile('lipsum.txt', 'utf8', function(err, data) {
  if (err) {
    return console.log(err);
  }

  // Timing starts once the file is already in memory, so the numbers
  // below cover splitting and counting, not the disk read itself.
  var fs_starttime = Date.now(),
      start_time = fs_starttime,
      splitted = data.split(' '),
      // Object without a prototype, used as a word -> count map.
      words = Object.create(null),
      end_time, elapsed, fs_endtime, fs_elapsed;

  console.log('Total words: ' + splitted.length);
  fs_endtime = Date.now();
  fs_elapsed = fs_endtime - fs_starttime;
  console.log('Counting time: ' + fs_elapsed);

  // Tally how many times each (lowercased) word occurs.
  for (var i = 0; i < splitted.length; i++) {
    var key = splitted[i].toLowerCase();
    words[key] = words[key] + 1 || 1;
  }

  end_time = Date.now();
  elapsed = end_time - start_time;
  console.log('Elapsed time: ' + elapsed);
});
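
The counting loop is the core of all three scripts. Here is a minimal sketch of the same idea on a small in-memory string, using a Map for the tally. The function name countWords and the sample sentence are hypothetical and not part of the benchmark, and Map needs a much newer NodeJS than the 0.8.9 used in this test.

```javascript
// Count how often each word occurs, splitting on single spaces and
// lowercasing the keys, like the NodeJS script in this post.
function countWords(text) {
  const counts = new Map();
  for (const raw of text.split(' ')) {
    const key = raw.toLowerCase();
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Sample input (assumption). 'constructor' is included on purpose: it is
// also the name of a built-in property on ordinary objects.
const sample = 'Lorem ipsum dolor sit amet lorem Ipsum constructor';
const counts = countWords(sample);
console.log(counts.get('lorem'));
console.log(counts.get('constructor'));
console.log(counts.size);
```

A Map only contains the keys you put into it, so a word like "constructor" is counted like any other word instead of colliding with an inherited property.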

Next, I wrote Ruby code.

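reader = File.read('lipsum.txt')

# Timing starts once the file is already in memory.
fs_starttime = Time.now
start_time = Time.now
# split(' ') splits on any run of whitespace, including newlines.
data = reader.split(' ')
puts 'Words count: ' + data.size.to_s
fs_endtime = Time.now
fs_elapsed = fs_endtime - fs_starttime
puts 'Counting time: ' + fs_elapsed.to_s

# Tally how many times each word occurs.
words = Hash.new
# Stop at data.size - 1; going up to data.size would read past the
# last element and add a nil key to the hash.
0.upto(data.size - 1) do |idx|
  words[data[idx]] = (words[data[idx]].nil?) ? 1 : words[data[idx]] + 1
end
end_time = Time.now
elapsed = end_time - start_time
puts 'Elapsed time: ' + elapsed.to_s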

And the last one, I wrote PHP code.

<?php
// Lift the memory limit so the whole file fits in memory.
ini_set('memory_limit', '-1');
$file = file_get_contents('lipsum.txt');

// Timing starts once the file is already in memory.
$fs_starttime = microtime(true);
$start_time = microtime(true);
$words = explode(' ', $file);
echo 'Words count: ' . count($words) . PHP_EOL;
$fs_endtime = microtime(true);
$fs_elapsed = $fs_endtime - $fs_starttime;
echo 'Counting time: ' . $fs_elapsed . PHP_EOL;

// Tally how many times each word occurs.
$uwords = array();
for($i=0;$i<count($words);$i++) {
  $key = $words[$i];
  $uwords[$key] = (isset($uwords[$key])) ? $uwords[$key] + 1 : 1;
}
$end_time = microtime(true);
$elapsed = $end_time - $start_time;
echo 'Elapsed time: ' . $elapsed . PHP_EOL;

With the preparation done, the next step was to look at the results.

Result

I executed each script 3 times to see whether the results were stable. Let’s start with the result of the NodeJS code.

NodeJS File Reading and Performance Result

Across the 3 runs, the NodeJS code took 2.992 sec on average.
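
The average here is simply the arithmetic mean of the elapsed times from the 3 runs. As a minimal sketch, the repeated runs could be automated like this instead of being done by hand. The helper averageRuntimeMs is hypothetical, and process.hrtime.bigint() needs a much newer NodeJS than 0.8.9, so this is not how the numbers in this post were produced.

```javascript
// Hypothetical helper: run `fn` a number of times and return the mean
// elapsed time in milliseconds.
function averageRuntimeMs(fn, runs) {
  let totalNs = 0n;
  for (let i = 0; i < runs; i++) {
    const start = process.hrtime.bigint(); // nanosecond timestamp
    fn();
    totalNs += process.hrtime.bigint() - start;
  }
  // Convert the nanosecond total to a mean in milliseconds.
  return Number(totalNs) / runs / 1e6;
}

// Tiny stand-in workload (assumption); the real test used the 47 MB file.
let calls = 0;
const avg = averageRuntimeMs(function () {
  calls++;
  'lorem ipsum dolor sit amet'.split(' ');
}, 3);
console.log('Average over ' + calls + ' runs: ' + avg.toFixed(3) + ' ms');
```

Three runs is a small sample, so it is worth looking at the individual times as well as the mean.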

Now, let’s see the result of the Ruby code.

Ruby File Reading and Performance Result

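With the code above, Ruby detected 6,879,744 words in total. The Ruby code needed 4.954 sec on average to split the text and count the words (the timers start after the file has been read). Keep in mind that Ruby's split(' ') treats any run of whitespace, newlines included, as a separator, so its word count is not directly comparable with a split on single space characters.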
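
To see how much the splitting rule matters for the word count, here is a small sketch in JavaScript. The sample text is made up; the second split only approximates the whitespace-based behaviour of Ruby's split(' ') and is not a port of it.

```javascript
// Made-up sample with newlines between some of the words.
const sampleText = 'lorem ipsum\ndolor sit\namet consectetur\n';

// Split on single space characters only: newlines stay inside the pieces.
const bySingleSpace = sampleText.split(' ');

// Split on any run of whitespace after trimming both ends.
const byWhitespace = sampleText.trim().split(/\s+/);

console.log(bySingleSpace.length);
console.log(byWhitespace.length);
console.log(JSON.stringify(bySingleSpace));
```

With the single-space rule, "ipsum\ndolor" counts as one word, which is why a text with many line breaks yields a lower total than a whitespace-based split.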

This is the result of the PHP code.

PHP File Reading and Performance Result

The word count detected by PHP was the same as NodeJS, but the performance was significantly different. PHP needed 6.129 sec on average to split the text and count the words.

Conclusion

As you can see from the results above, NodeJS had the best performance in this test!

I had actually heard about this from many people, but this is the first time I have seen it for myself. It’s pretty fast, and I believe it could be very helpful for you in developing a well-performing web application. But you need to remember this line:

Node.js is a platform built on Chrome’s JavaScript runtime for easily building fast, scalable network applications. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient, perfect for data-intensive real-time applications that run across distributed devices. – taken from NodeJS website

That means NodeJS is built to solve network scaling problems, not to run computation-heavy programs. A NodeJS application runs your JavaScript on a single thread, so heavy computation or blocking code will significantly reduce what NodeJS can actually do. If you insist on developing a large application with heavy computation, I suggest you use a message-passing platform like RabbitMQ.
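
To make the single-thread point concrete: while a long loop runs, NodeJS cannot handle any other event. The sketch below is not the RabbitMQ approach suggested above. It is a lighter, in-process technique swapped in for illustration: the work is cut into chunks, and setImmediate gives the event loop a turn between chunks. The function name countWordsInChunks and the chunk size are hypothetical, and setImmediate and Map need a newer NodeJS than 0.8.9.

```javascript
// Count words in slices of `chunkSize`, yielding to the event loop between
// slices so timers and I/O callbacks can run while the count is in progress.
function countWordsInChunks(words, chunkSize, done) {
  const counts = new Map();
  let index = 0;

  function processChunk() {
    const end = Math.min(index + chunkSize, words.length);
    for (; index < end; index++) {
      const key = words[index].toLowerCase();
      counts.set(key, (counts.get(key) || 0) + 1);
    }
    if (index < words.length) {
      setImmediate(processChunk); // let pending events run first
    } else {
      done(counts);
    }
  }

  processChunk();
}

// Made-up input; chunk size 2 forces several turns of the event loop.
const wordList = 'Lorem ipsum dolor sit amet lorem'.split(' ');
countWordsInChunks(wordList, 2, function (counts) {
  console.log('lorem: ' + counts.get('lorem'));
});
console.log('counting started'); // printed before the count finishes
```

This keeps the process responsive but does not make the count itself any faster; for truly heavy work, moving it out of the NodeJS process is still the safer choice.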

Well, that’s it for today. I hope you enjoy it. Thanks for reading, guys!
