HTTP proxy in Rails

February 1st, 2010 by admin

Put this in the file where the connection to an outside service is made. This setup only works for an HTTP proxy that doesn’t require a login.

ENV['HTTP_PROXY'] = "http://proxyserver:port"
ENV['HTTPS_PROXY'] = "http://proxyserver:port"
ENV['http_proxy'] = "http://proxyserver:port"
ENV['https_proxy'] = "http://proxyserver:port"
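To check that the setting is picked up, a minimal sketch with Ruby’s standard Net::HTTP may help. It assumes a made-up proxy at proxyserver:8080 and uses the documentation-only address 192.0.2.10 as a stand-in for the outside service; http_for is a helper name invented for this example, and no connection is opened.

```ruby
require 'net/http'

# Hypothetical proxy host and port; replace them with your own.
ENV['http_proxy'] = 'http://proxyserver:8080'

# Builds a Net::HTTP object for the given host without opening a connection.
# By default Net::HTTP looks up its proxy in the environment.
def http_for(host, port = 80)
  Net::HTTP.new(host, port)
end

http = http_for('192.0.2.10')
puts "proxy in use: #{http.proxy?}"
puts "proxy: #{http.proxy_address}:#{http.proxy_port}"
```

If a no_proxy variable is set and matches the target host, the proxy is skipped for that host.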

Clear log file

January 8th, 2010 by admin

This command empties a log file without deleting and recreating it, which avoids problems with file permissions. It only works on Unix-like systems.

cat /dev/null > log/development.log
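From inside Ruby the same effect can be had with File.truncate. A small sketch, where clear_log is a helper name made up for this example:

```ruby
# Empties the file at path without deleting it, so owner and
# permissions stay untouched. Returns the new size, which is 0.
def clear_log(path)
  File.truncate(path, 0)
  File.size(path)
end

# Only touch the log if it is actually there.
clear_log('log/development.log') if File.exist?('log/development.log')
```

In a Rails project the log:clear rake task does much the same for the files in log/.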

Update String Column With REPLACE Command (MySQL)

June 29th, 2009 by admin

Sometimes the wrong data gets into the database and you have to update a record. That’s no biggie: you run a simple update command and everything is back on track. In other cases the error is not in one record but in many.

Alarm.

I don’t know why exactly, but when I have to do the same thing twice, I’m already looking for a way to automate it. So this is what happened.

I added a new domain name to the DNS tables dns_soa and dns_rr. This is done through a Rails application I’m working on, but the thing I forgot was to check for spaces. So instead of having ‘domainname.com’ in my tables, ‘domainname.com ’ (with an extra space at the end) was inserted. The space has to go, and this is how I did it.

First check what you are going to do:

SELECT REPLACE(origin,'domainname ','domainname')
FROM dns_soa
WHERE origin like 'domainname %';

Nothing has changed in the tables yet, so be sure to check the result. After carefully checking that this was the result I wanted, I ran the actual update.

UPDATE dns_soa
SET origin = REPLACE(origin,'domainname ','domainname')
WHERE origin like 'domainname %';
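To keep such values out of the table in the first place, the application can trim the input before saving it. A minimal sketch in plain Ruby; normalize_origin is a hypothetical helper, not something from the application described here:

```ruby
# Hypothetical helper: removes leading and trailing whitespace from a
# domain name before it is stored, so 'domainname.com ' never reaches
# the table. nil is turned into an empty string.
def normalize_origin(value)
  value.to_s.strip
end

puts normalize_origin('domainname.com ').inspect
```

In a Rails model a helper like this could be called from a before_validation callback.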

Sortable Tables

June 19th, 2009 by admin

One of the best things I’ve found recently is a brilliant jQuery plugin called Tablesorter. This plugin lets you do all kinds of things that we’ve become so accustomed to doing in the ubiquitous Microsoft Excel. The things I do with tabular data are:

  • Sort
  • Filter
  • Find

These are the basic things that allow for easy representation of data the way you like it.

The setup of the plugin is quite easy, although you have to get it right. Everyone knows that debugging JavaScript is a bitch (although Firebug helps a lot). I’ve integrated this in a Rails project and the results are beyond my expectations. It handles tables with 2000 records just fine. Sorting and finding are quasi-instant. All in all, a nice user experience.

Permalinks, .htaccess and mod_rewrite

June 17th, 2009 by admin

Paste the code below into your Apache website.com.vhost file.

<Directory /var/www/website.com/web>
  Options FollowSymLinks
  AllowOverride Indexes AuthConfig Limit FileInfo Options
  Order allow,deny
  Allow from all
</Directory>

This allows you to use .htaccess files, where you can paste the following code that WordPress requires for the pretty URLs it uses as permalinks. Options is not really needed for mod_rewrite, but is commonly used by code inserted in .htaccess.

<IfModule mod_rewrite.c>
  RewriteEngine On
  RewriteBase /
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteRule . /index.php [L]
</IfModule>
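What the two conditions and the rule decide can be modelled in a few lines of Ruby. This is only a simplified sketch of the logic, not how Apache implements it; rewrite_target, docroot and request_path are names made up for the example, and query strings are ignored.

```ruby
require 'tmpdir'
require 'fileutils'

# Simplified model of the rules above: a request that matches an existing
# file (!-f) or directory (!-d) is served as is, anything else is handed
# to /index.php.
def rewrite_target(docroot, request_path)
  relative = request_path.sub(%r{\A/}, '')
  return request_path if relative.empty? # the pattern "." needs at least one character

  full = File.join(docroot, relative)
  return request_path if File.file?(full) || File.directory?(full)

  '/index.php'
end

Dir.mktmpdir do |docroot|
  FileUtils.touch(File.join(docroot, 'style.css'))
  puts rewrite_target(docroot, '/style.css')          # existing file: /style.css
  puts rewrite_target(docroot, '/2009/06/some-post/') # no such file: /index.php
end
```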

Take care now.

How to use tar

June 17th, 2009 by admin

I need it frequently, but I always, and I really mean always, forget the options. So without further ado: how to tar and untar (with gzip as a bonus).

The following command compresses all files and folders in the current directory into an archive with the filename you specify and stores it in the parent directory. This is important, as otherwise the archive would be written into the very directory that is being archived.

tar -pczf ../.tar.gz *

The following command will untar everything into the current directory. What I like to do is create a directory, copy the tar.gz file into it and run the command there.

tar -xzvf .tar.gz

Hope this helps.

Tree-based XML parser in Ruby

April 12th, 2007 by admin

Yesterday I wrote an article about the event-driven stream parser in Ruby. While that solution is very fast and has a very small memory footprint, it comes at the cost of expressiveness: you have to write a decent amount of code to get it working. Luckily for us there is an alternative.

For most applications I would recommend using a tree-based approach. This is a lot easier and the code is more concise. Navigation is a blast because you can use XPath path expressions. Observe the following code block, which does exactly the same as the event-driven parser. (We reuse the Playlist and Song classes from the previous article.)

require 'rexml/document'

# Open the file in a block so that it is closed once the tree is built
playlistxml = File.open('playlist.xml') do |file|
  REXML::Document.new(file)
end

playlist = Playlist.new
playlistxml.elements.each("playlist/entry") { |song|
  title  = song.elements["title"].text
  artist = song.elements["artist"].text
  album  = song.elements["album"].text
  playlist.append(Song.new(title,artist,album))
}
puts playlist.to_s
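XPath can also filter while navigating. The sketch below is an illustration on a shortened inline playlist rather than playlist.xml; titles_by is a helper name made up for the example, and it assumes the artist name contains no apostrophe, since the name is pasted straight into the path expression.

```ruby
require 'rexml/document'

xml = <<~XML
  <playlist>
    <entry><title>Talisman</title><artist>Air</artist></entry>
    <entry><title>Saudade Pt. 2</title><artist>Arsenal</artist></entry>
  </playlist>
XML

# Returns the titles of all songs whose <artist> equals the given name.
def titles_by(doc, artist)
  path = "/playlist/entry[artist='#{artist}']/title"
  REXML::XPath.match(doc, path).map(&:text)
end

doc = REXML::Document.new(xml)
puts titles_by(doc, 'Air')
```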

Event-driven XML Parser in Ruby

April 11th, 2007 by admin

This article shows how to write an event-driven XML parser in Ruby. Event-driven XML parsers are typically used when speed is important or large amounts of data are in play. One of the best-known implementations is the SAX2 parser for Java. No Java here though, all Ruby baby.

To build an XML parser, you need three things:

  • An XML file
  • A listener to handle the XML parsing
  • A main program that binds everything together

This is not hard, but like so many things you need to know how to do it.

An XML file

In order to explain the XML parser we, of course, need an XML file. I’m not going to start on XML schemas and the like, because that is outside the scope of this article. It is just a simple file that holds a playlist of songs.

<playlist>
  <entry>
    <title>La Femme d'Argent</title>
    <artist>Air</artist>
    <album>Moon Safari</album>
  </entry>
  <entry>
    <title>Talisman</title>
    <artist>Air</artist>
    <album>Moon Safari</album>
  </entry>
  <entry>
    <title>Saudade Pt. 2</title>
    <artist>Arsenal</artist>
    <album>Outsides</album>
  </entry>
  <entry>
    <title>Don't Cry For Louie</title>
    <artist>Vaya Con Dios</artist>
    <album>Vaya Con Dios</album>
  </entry>
</playlist>

The listener

The most important thing that you need to understand is that an event-driven parser does not keep track of any sort of element tree. This is in contrast with tree-like parsers (also called DOM parsers), which first load the complete XML file into memory and allow the user to easily find or iterate over certain elements. The biggest disadvantages of DOM parsers are their memory footprint and slower speed compared to event-driven parsers.

For small files like the one in this article, I would always prefer a DOM parser, but imagine a file containing one million entries (with a file size of more than 100 MB). Now we get a whole different story. The computer will struggle to get it all into memory and processing will be dead slow (believe me, I speak from experience). This is where the stream parser shines: a very small memory footprint and super fast processing. The toll you have to pay is that you need to keep track of which elements you have encountered and act accordingly. Don’t worry, we will tackle this.

The listener is the class that does all the hard work. It contains three important methods.

class XMLListener
  def tag_start(name, attrs)
  end
  def text(text)
  end
  def tag_end(name)
  end
end

  • tag_start is called when a new XML element is encountered (e.g. <entry>). The name parameter holds the tag name (e.g. for <title> the name is title) and attrs holds the element’s attributes.
  • text is called when an XML element contains text (e.g. for <artist>Air</artist> the text is Air).
  • tag_end is called when an XML element is closed (e.g. </entry>).
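To see in which order these three callbacks fire, a tiny listener that only records the events can help. This is a sketch for illustration; EventLogger is a name made up here, and the XML is passed as a string without any whitespace between the tags.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Records every callback in the order the stream parser fires it.
class EventLogger
  include REXML::StreamListener # no-op defaults for all other events
  attr_reader :events

  def initialize
    @events = []
  end

  def tag_start(name, attrs)
    @events << "tag_start #{name}"
  end

  def text(text)
    @events << "text #{text}"
  end

  def tag_end(name)
    @events << "tag_end #{name}"
  end
end

logger = EventLogger.new
REXML::Document.parse_stream('<entry><artist>Air</artist></entry>', logger)
puts logger.events
# tag_start entry
# tag_start artist
# text Air
# tag_end artist
# tag_end entry
```

With an indented file, text is also called for the whitespace between the elements, which is worth keeping in mind when buffering text.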

To show you how to use the parser, let’s imagine that we want to objectify the file. Therefore we need two classes, Playlist and Song.

The Song class holds the data for a song (title, artist, album).

class Song
  # Generates the readers and writers for title, artist and album
  attr_accessor :title, :artist, :album

  def initialize(title, artist, album)
    @title = title
    @artist = artist
    @album = album
  end

  def to_s
    "@song{title=#{@title}, artist=#{@artist}, album=#{@album}}"
  end
end

The Playlist class contains a queue of Songs.

class Playlist
  def initialize
    @songs = Array.new
  end

  def append(song)
    @songs.push(song)
    self
  end
  def delete_first
    @songs.shift
  end
  def delete_last
    @songs.pop
  end
  def [](index)
    @songs[index]
  end

  def to_s
    # join puts ", " between songs only, so nothing trails the last one
    "@songs{" + @songs.map(&:to_s).join(", ") + "}"
  end
end

It is always a good idea to test the functionality of your classes by writing a small unit test. This may seem trivial, but it is good practice to always do it (or at least try).

require 'test/unit'

class TestPlaylist < Test::Unit::TestCase
  def test_append
    playlist = Playlist.new
    assert_equal("@songs{}", playlist.to_s)
    s1 = Song.new('title1', 'artist1', 'album1')
    playlist.append(s1)
    assert_equal(s1, playlist[0])
  end
  def test_delete
    list = Playlist.new
    s1 = Song.new('title1', 'artist1', 'album1')
    s2 = Song.new('title2', 'artist2', 'album1')
    s3 = Song.new('title3', 'artist3', 'album2')
    s4 = Song.new('title4', 'artist4', 'album3')
    list.append(s1).append(s2).append(s3).append(s4)
    assert_equal(s1, list[0])
    assert_equal(s3, list[2])
    assert_nil(list[9])
    assert_equal(s1, list.delete_first)
    assert_equal(s2, list.delete_first)
    assert_equal(s4, list.delete_last)
    assert_equal(s3, list.delete_last)
    assert_nil(list.delete_last)
  end
end

To test this, you can put everything sequentially in one file and run it. Now that we have created the data objects, it is time to focus on the real task at hand: parsing the XML file and producing a Playlist containing multiple Songs.

I’ll begin by giving you the complete listener class and then explain each method one by one.

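require 'rexml/streamlistener'

class PlaylistXMLListener
  # Supplies empty handlers for events we ignore (comments, XML declarations, ...)
  include REXML::StreamListener
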
  def initialize
    @textbuffer = '' # a buffer for the text extraction
    @element = ''    # to keep track of which element we are currently processing

    @playlist_tag = 'playlist'
    @entry_tag = 'entry'
    @title_tag = 'title'
    @artist_tag = 'artist'
    @album_tag = 'album'

    @playlist = Playlist.new
  end

  def tag_start(name, attrs)
    if name == @entry_tag
      @song = Song.new('','','')
      @element = @entry_tag
    elsif name == @title_tag
      @element = @title_tag
    elsif name == @artist_tag
      @element = @artist_tag
    elsif name == @album_tag
      @element = @album_tag
    end
  end

  def text(text)
    @textbuffer = text
  end

  def tag_end(name)
    if name == @entry_tag
      @playlist.append(@song) # append song to playlist
    elsif name == @title_tag
      @song.title = @textbuffer
    elsif name == @artist_tag
      @song.artist = @textbuffer
    elsif name == @album_tag
      @song.album = @textbuffer
    end
    # Clear the buffer any time we close
    @textbuffer = ''
    @element = ''
  end

  def playlist
    @playlist
  end
end

The initialize method contains some bookkeeping things.

  • Two buffers, @element and @textbuffer, to keep track of which element we are processing at the moment and the text that is in that element.
  • Some _tag variables that represent the names of the xml elements.
  • The @playlist variable holds an instance of the Playlist class.

The tag_start method is called every time a new xml element is encountered. What happens here is that we create a new Song if we come across an <entry> element. Besides that we update @element to correctly represent the element we are processing at the moment.

The text method updates the @textbuffer every time it is called.

The tag_end method is called every time an XML element is closed. Here the title, artist and album attributes of the @song variable are assigned the text from the buffer each time the corresponding closing tag is encountered. When we run across </entry>, we append the song to the @playlist variable. This way all the songs end up in the playlist.

At the end I also put in a method that returns the @playlist variable, so we can do things with the objectified data of course.
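The same buffer technique can be tried out in isolation. The following is a compact, self-contained sketch and not the listener from this article: EntryCollector is a made-up name, it stores each entry as a Hash instead of a Song, and it parses an inline string instead of playlist.xml.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Collects every <entry> as a Hash, using a text buffer that is
# cleared on every opening and closing tag.
class EntryCollector
  include REXML::StreamListener
  FIELDS = %w[title artist album].freeze
  attr_reader :entries

  def initialize
    @entries = []
    @buffer = ''.dup
  end

  def tag_start(name, _attrs)
    @current = {} if name == 'entry'
    @buffer = ''.dup # forget whitespace seen before this tag
  end

  def text(text)
    @buffer << text # append, in case the text arrives in pieces
  end

  def tag_end(name)
    if name == 'entry'
      @entries << @current
    elsif FIELDS.include?(name)
      @current[name] = @buffer.strip
    end
    @buffer = ''.dup
  end
end

xml = <<~XML
  <playlist>
    <entry>
      <title>Talisman</title>
      <artist>Air</artist>
      <album>Moon Safari</album>
    </entry>
  </playlist>
XML

collector = EntryCollector.new
REXML::Document.parse_stream(xml, collector)
p collector.entries
```

Appending to the buffer instead of overwriting it is a defensive choice made for this sketch; the listener in the article overwrites the buffer on every text call.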

Putting it all together

Now that the hard work of writing the objects and the playlist XML parser is done, we can tie everything together and run the code.

require 'rexml/document'
require 'rexml/parsers/streamparser'

listener = PlaylistXMLListener.new
# The block form closes the file as soon as parsing is finished
File.open('playlist.xml') do |source|
  REXML::Document.parse_stream(source, listener)
end
puts listener.playlist.to_s

As you can see, this step is really easy. We first require two files from the REXML library that ships with Ruby, which give us REXML::Document and REXML::Parsers::StreamParser, and then create an instance of the listener that will be called by the StreamParser. Next we open the File that contains the XML data. Then we call the class method parse_stream on REXML::Document, which uses the listener to parse the data in the source as a stream. Lastly we print the playlist that is stored in the listener to standard output.

That’s all folks. I hope you can use this stuff in your own projects and see you next time.