If you're looking for scripting access to client-side JavaScript, or for a screen-scraping mechanism that captures content as rendered in the browser, this will be of interest to you.

I first noticed Ruby about three years ago. Since then I've stumbled onto Ruby on Rails only occasionally, finding it dispersed sparsely, but heralded proudly, within the development community. Until recently I pretty much ignored Ruby and stuck with the traditional LAMP platform, relying on PHP for server-side scripting.

Something I've wanted to do for a long time is automate web browsing tasks. While I've used Perl's Mechanize library, my most pressing need was to capture content produced by client-side JavaScript. My research uncovered two possible solutions.

The first was JSSh, a Firefox extension that provides a TCP/IP JavaScript shell server for Mozilla, which I found over at Ideas for Dozens: Telnet to JavaScript. JSSh accepts telnet connections and gives them an interface to Mozilla's JavaScript environment. While JavaScript Window objects are exposed as objects in JSSh, there seem to be limitations: these objects do not appear to inherit everything a browser Window object offers. The basic Math, Array and other objects are present, but what I needed was the Window.setTimeout() method. Maybe I don't fully understand JSSh's functionality, but if it has more features, they're not well documented. For certain limited applications, JSSh offers great flexibility by giving any telnet-capable application access to JavaScript, and it is nonetheless very cool.

My next tangent led me to Watir, an automated IE screen scraper written in Ruby. With Ruby and the Watir and BeautifulSoup libraries, I was able to build a fully functional automated screen scraper in a couple of hours (it should have been minutes, had I already been familiar with Ruby).

The class has three functions: it opens a specific page, logs in if required, and then monitors the contents of a specific HTML tag. When the content changes, it raises an alarm.

On initialization:
  • Open desired web page in a hidden IE window
  • Log in if redirected to the login page
  • Hold the contents of a single specific HTML tag in a Class variable
On updates:
  • Wait a specified delay interval
  • Refresh the page
  • Raise alarm and open a visible IE window if content has changed
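The update cycle above can be sketched roughly as follows. This is not the original class, only a minimal illustration of the wait, refresh and compare loop. The names `TagMonitor`, `fetch`, `sleeper`, `check` and `watch` are hypothetical, and the page access is passed in as a callable so the sketch runs without a browser; the login step and the visible-window alarm are left out.

```ruby
# Hypothetical sketch: watches one piece of page content and reports changes.
# `fetch` is any callable that reloads the page and returns the tag's text.
class TagMonitor
  attr_reader :last_content

  # delay   - seconds to wait before each re-check (assumed default: 60)
  # sleeper - callable used to wait; injectable so the loop can be tested
  def initialize(fetch, delay: 60, sleeper: Kernel.method(:sleep))
    @fetch = fetch
    @delay = delay
    @sleeper = sleeper
    @last_content = @fetch.call # initial snapshot of the tag's contents
  end

  # One update cycle: wait, re-fetch, compare.
  # Returns true if the content differs from the previous snapshot.
  def check
    @sleeper.call(@delay)
    current = @fetch.call
    changed = current != @last_content
    @last_content = current
    changed
  end

  # Polls until a change is seen, then yields and returns the new content.
  # Returns nil if max_checks cycles pass without any change.
  def watch(max_checks: nil)
    checks = 0
    loop do
      return nil if max_checks && checks >= max_checks
      checks += 1
      next unless check
      yield @last_content if block_given?
      return @last_content
    end
  end
end

# Possible wiring with Watir (an assumption: current watir gem, not the
# IE-only version used in this post; URL and element id are placeholders):
#
#   require 'watir'
#   browser = Watir::Browser.new
#   browser.goto('http://example.com/status')
#   fetch = -> { browser.refresh; browser.div(id: 'status').text }
#   TagMonitor.new(fetch, delay: 30).watch { |text| puts "Changed: #{text}" }
```

Passing the fetch step in as a lambda keeps the comparison logic separate from the browser driver, so the same loop could sit on top of Watir or any other way of reading the tag.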
OK, I guess I'm a Ruby fan now too. I've been reading the Ruby documentation ever since. With Apple backing it, I'm sure Ruby on Rails is destined for even more popularity.
