Installing MagentoCommerce – problems

MagentoCommerce is an open-source enterprise level ecommerce system suitable for high end ecommerce websites. We’re in the throes of developing a large solution for a local company, and the developers involved in provisioning the site are installing local copies of Magento on their Apache servers to test with.

Here’s a list of issues that I’ve come across with installing Magento, and how I’ve solved them. For the record I’m using Magento 1.3.2.2 (released 19.07.2009)

1. Can’t get past the second install step

This is the step that asks you for your database password. If the database you’re specifying doesn’t exist then you can’t proceed. I’m sure there’s a bug here because no error message is displayed. Would love to dig into this but our project is on a tight time frame. MAKE SURE YOU CREATE THE DATABASE YOURSELF FIRST (no tables required).
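
As a rough sketch of creating that empty database up front: the helper name create_db_sql is made up for illustration, and the database name "magento" and MySQL user "root" are assumptions borrowed from the reset command further down this post.

```shell
# Hypothetical helper: print the SQL that creates an empty database.
# IF NOT EXISTS makes it safe to run more than once.
create_db_sql() {
  printf 'CREATE DATABASE IF NOT EXISTS `%s`;\n' "$1"
}

# Usage (assumed names; mysql will prompt for the password):
#   create_db_sql magento | mysql -u root -p
```

Once this has been run, the second install step should have a database to connect to.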

2. Can’t log in after installation

If you go to the /admin directory after installing and you can’t log in, it’s probably because of the URL that you used to install it. I tried installing it to http://localhost/, http://127.0.0.1/ and http://192.168.2.10/ (my IP), all with the same result. After some googling, it seemed to be something to do with Magento refusing to set a cookie for a top-level domain (or somesuch). The workaround is to use a host name that has a dot in it, e.g. http://localhost.localdomain/ (or in my case, http://bob-desktop.local). Add this to your hosts file; you may have to restart your browser for it to pick it up.
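
Here is a small sketch of producing that hosts file line. The helper name hosts_entry is hypothetical, and it assumes the host name should point at the loopback address 127.0.0.1 on a single development machine.

```shell
# Hypothetical helper: print a hosts-file line for a host name,
# refusing names without a dot since those are the ones that cause trouble.
hosts_entry() {
  case "$1" in
    *.*) printf '127.0.0.1\t%s\n' "$1" ;;
    *)   echo "host name needs a dot in it: $1" >&2; return 1 ;;
  esac
}

# Usage (needs root to append to the hosts file):
#   hosts_entry localhost.localdomain | sudo tee -a /etc/hosts
```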

3. Firefox keeps asking you to download a PHTML file.

After doing a completely fresh install of Apache2, PHP5, mysql, Magento etc on a fresh machine (Ubuntu 9.04, Firefox 3.0.11) Firefox kept asking me what I wanted to do with the PHTML file when accessing the freshly untarred Magento directory. It took me ages to find an answer to this, but it was as simple as clearing your cache in Firefox. I dicked around with the Apache configuration, permissions, .htaccess files etc and finally found a comment about the Firefox cache.

4. AllowOverride All

Magento uses a .htaccess file in the root directory. I noticed that the default for /var/www is AllowOverride None which prevents Apache from looking at the content of .htaccess files. While this didn’t cause me trouble, I did set it to “AllowOverride All” while trying to solve #3. The file is /etc/apache2/sites-enabled/000-default, line 11. Note there may be security implications of doing this if you’re the administrator of a shared hosting environment and as such you should read about the AllowOverride directive.
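
The line number may well differ on your machine, so here is a quick sketch for finding the directive. The helper name show_allowoverride is made up, and the default path is simply the file mentioned above.

```shell
# Hypothetical helper: list every AllowOverride directive in an Apache
# config file, with line numbers. Pass a different path if yours differs.
show_allowoverride() {
  grep -n 'AllowOverride' "${1:-/etc/apache2/sites-enabled/000-default}"
}

# Usage:
#   show_allowoverride
#   show_allowoverride /path/to/another/vhost/file
```

After changing the value you would need to reload Apache (e.g. sudo /etc/init.d/apache2 reload) for it to take effect.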

Installation requirements:

From a stock Ubuntu 9.04 machine I had to install the following packages to support Magento:

  • libapache2-mod-php5
  • php5-curl
  • php5-mcrypt
  • php5-mysql
  • php5-gd
  • mysql-server
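
For convenience, the list above can be collapsed into one install command. This sketch only prints the command so you can review it before running it. The package names are the Ubuntu 9.04 ones from the list, and the PACKAGES variable is just an illustrative name.

```shell
# Packages from the list above (Ubuntu 9.04, php5-era names).
PACKAGES="libapache2-mod-php5 php5-curl php5-mcrypt php5-mysql php5-gd mysql-server"

# Print the command rather than run it, so it can be checked first.
echo "sudo apt-get install $PACKAGES"
```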

Also, I have this command on standby to reset the Magento environment back to scratch (deletes the magento directory, drops the database, untars the source file and creates the empty database again)

# The && makes sure nothing is deleted unless the cd succeeded
cd /var/www && rm -Rf magento && tar xvfz /home/bob/magento-1.3.2.2.tar.gz &&
   chmod -R a+rw magento ; mysqladmin -uroot -proot -f drop magento ;
   mysqladmin -uroot -proot create magento

That’s all for now.

Screen scraping with jQuery

During the course of my job I often find myself faced with the task of migrating information from an existing website to our own content management system.  In the past my approach to this task has been to assess the source code of the existing site and see whether it’s feasible to use a combination of curl, regular expressions and string manipulation.  Sometimes this is straightforward, but this method is becoming less and less viable as it’s too labour-intensive.

I’ve been using jQuery a lot recently and it occurred to me that I could use jQuery’s selectors to target the information that I’m interested in on a web page, and then use Ajax to POST it to my own script, which would be waiting to do something useful with the data, e.g. validate it and save it in a database.  For educational purposes I was keen to keep this completely client-side if possible (except for a script to receive the information).  See later on for a server-side solution.

The situation I was up against was a page that had a heap of data in a table (about 90 items), but the table was interspersed with random images to split it up and make it more pleasing to the eye.  Fortunately for me, all of the data that I wanted was neatly wrapped in <div class="information"></div> tags.  Selecting these div tags with jQuery is really easy by using $('div.information').

My first problem was that in order to use jQuery, the web page you’re looking at has to be using it.  Fortunately there’s a quick bookmarklet called jQuerify that allows you to load jQuery onto any web page.  Once you’ve got that then you can write further bookmarklets of your own to do stuff.

So, my evil evil plan was to combine a jQuery selector, jQuery’s each() construct, and jQuery’s ajax support to post the content of each div to a “scraper” script, like so:

$('div.information').each(function(){
  $.post('http://localhost/scraper.php',{
    data: this.innerHTML
  });
});

I loaded my source page, clicked the jQuerify bookmarklet and then pasted the code above into the Firebug console (what, oh you’ll need that…) and it was flawless … except that the browser security model stepped in and prevented the ajax call because the XMLHttpRequest object is not allowed to post information from one domain to another.  I was stuck – I googled around for a while looking for workarounds, and investigated the use of JSONP but the transport method seemed more geared towards retrieving information than posting it.

So, I was stuck with a simple question: “How can I get information from one site to another by using the browser?” – the simplest answer to this question is of course to have a form on the source website, that when submitted posts to the target.  Thanks to the power of JavaScript, modifying the DOM of a loaded web page is a doddle.  Therefore it should be simple to create a form on the page after it has loaded (client side, remember), create and populate some form fields with data and then submit the form to my scraper script.

Suddenly my intentions had outgrown a bookmarklet, but I would still need one for jQuerify and one for my “Scraper Utils”.  My new bookmarklet simply asked jQuery to load a local JavaScript file in exactly the same way that jQuery was loaded in the first place:

javascript:$.getScript('http://localhost/scraper.js');

Now I had the freedom of writing chunk loads of stuff in my local scraper.js file.

var Scraper = {};
Scraper.createForm = function()
{
  var form = document.createElement('form');
  form.setAttribute('method', 'POST');
  form.setAttribute('action', 'http://localhost/scraper.php');
  document.getElementsByTagName('body')[0].appendChild(form);
  return form;
}
 
Scraper.createSubmitButton = function()
{
  var button = document.createElement('input');
  button.setAttribute('type', 'submit');
  return button;
}
 
Scraper.createFormField = function(name)
{
  var field = document.createElement('textarea');
  field.setAttribute('name', name);
  field.setAttribute('rows', 10);
  field.setAttribute('cols', 50);
  return field;
}		
 
var ScraperForm = Scraper.createForm();
$('div.information').each(function(){
  var field = ScraperForm.appendChild(Scraper.createFormField('data[]'));
  field.value = this.innerHTML;
});
// Create a submit button so that the form can be posted:
ScraperForm.appendChild(Scraper.createSubmitButton());

You can see here that I’ve set up a few functions, createForm(), createFormField(), createSubmitButton() and then at the bottom I wrap them all together with the $('div.information').each(...) construct.  The end result of this is that when I click my bookmarklet that includes the scraper.js script, a form is created at the bottom of the page and a textarea for each div.information is created that holds the innerHTML from that div.

Then, by clicking the Submit button, the browser posts all of that information across to http://localhost/scraper.php where I then collect the information from $_POST['data'] and poke it into a database.

It’s pretty rough and ready but could easily be extended to do other things, like allowing you to specify the selector and target URL for the post when you click the Scraper bookmarklet.

Server Side Solution

On my travels I also came across the “PHP Simple HTML DOM Parser” which claims a similar ability like so:

// Create DOM from URL or file
$html = file_get_html('http://www.google.com/');
 
// Find all images
foreach($html->find('img') as $element)
       echo $element->src . '<br/>';
 
// Find all links
foreach($html->find('a') as $element)
       echo $element->href . '<br/>';

You can get a hold of this from Sourceforge at the PHP Simple HTML DOM Parser website.

Ubuntu and Lightning – not working, application seems buggy?

I’ve switched over to running Ubuntu 8.10 full time at work now.  The only things I miss are TortoiseSVN and the application integration with the desktop (Thunderbird/W32 for example lets you drag attachments onto the desktop).  Oh, and I miss TimeSnapper (classic – free download) too, but will get off my chuff and work out an alternative using Xwd.

Anyway, at work we use the Lightning calendar plugin for Thunderbird, with the Google Calendar provider in order to collaborate on a calendar.  For the most part this works well as when not in the office you can fall back to Google Calendar.

I went down the path of installing Lightning into Thunderbird (download the XPI, browse to it etc…) but after the installation Lightning seemed broken.  The UI was mostly there but it looked buggy and nothing worked.  After hunting around for a reason, I came across this thread that suggested that the problem was that the libstdc++5 package had to be installed.

I was skeptical, but after reading half a dozen “me too” posts where the problem had been fixed I got stuck in.

  1. Uninstall the Lightning plugin from Thunderbird
  2. Open a terminal, and run this command: sudo aptitude install libstdc++5
  3. Reinstall Lightning from the XPI you downloaded

Then things came to life nicely.  I was disappointed that the state of Lightning without libstdc++5 appeared to be a buggy application rather than a specific error.

Goodbye Telecom

Next week I’ll be porting my Telecom phone number to a VoIP provider (2talk) and then getting naked DSL provisioned on my line.  This setup should leave me with full speed broadband, the benefits of a VoIP line (I’ll blog about that later) and all the features that 2talk provide for my voice line for only $85 per month.

Considering my broadband/phone bill currently sits at about $120 per month this will be a nice saving, plus an excuse to play with some technology :)   I’ll be using Snap for the naked DSL, only because Orcon (my current provider) haven’t unbundled in Dunedin yet and it looks like it’ll be quite a while.

If you don’t mind a bit of downtime on your phone line or broadband service, the steps are:

  1. Port your phone number to 2talk.
  2. Get naked DSL provisioned on your home line.
  3. Configure your system so that your phone rings at home by connecting to 2talk’s servers

You’ll need some hardware, and some charges (termination charges with your current broadband provider for example) may apply.

More on fail2ban

A while ago I blogged about an SSH attack – this had been going on unnoticed for some time.  Taking my typical fire-and-forget approach (gently forced by a busy family life) I simply installed fail2ban and did nothing else.  Finally I was in a position where I had to research fail2ban a little more to figure out how to make it work.

What is fail2ban?

It’s a Python script (that runs as a daemon) which monitors log files in your /var/log directory.  It monitors them for specific entries, for example “Failed password”, and then updates iptables rules to deny network access for the offending IP for a configured amount of time.

A good example of this is that if you try to ssh into my system three times unsuccessfully, you won’t be able to try again for 10 minutes.  This is sufficient to make automated brute force attacks useless.

Do you need it?

If you have a public-facing server with the ability to log into it (including web applications even) then you need this.  If you’re curious to see if you’ve been targeted for attacks, try running these commands as root on your server:

grep 'Failed password' /var/log/auth.log | grep sshd | awk '{print $1,$2}' | sort | uniq -c
zcat -f /var/log/auth.log.* | grep 'Failed password' | grep sshd | awk '{print $1,$2}' | sort | uniq -c

The first command examines your current auth.log file and the second examines your rotated auth.log.* files (zcat -f reads both the compressed and the uncompressed ones). Both print the number of failed attempts per day. In my recent history (prior to configuring fail2ban properly) I had over 6,000 failed SSH login attempts on a single day just after Christmas.
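
If you’d rather see who is knocking than when, here is a variation that counts failures per source IP instead of per day. It’s only a sketch: the function name failed_by_ip is made up, and it assumes the usual sshd log wording of “Failed password for … from <IPv4 address> port …”.

```shell
# Hypothetical helper: count failed SSH password attempts per source IP,
# busiest first. Takes one or more log files as arguments.
failed_by_ip() {
  grep 'Failed password' "$@" | grep sshd |
    grep -oE 'from [0-9]+(\.[0-9]+){3}' | awk '{print $2}' |
    sort | uniq -c | sort -rn
}

# Usage (as root):
#   failed_by_ip /var/log/auth.log
```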

What next?
The steps are:

bob@server:~$ sudo apt-get install iptables fail2ban
bob@server:~$ sudo /etc/init.d/fail2ban start

Now, you can check to see if it’s working by “pinging” the service:

bob@server:~$ sudo fail2ban-client ping
Server replied: pong

And you can get information on what’s currently been banned by examining the ssh “jail” – a jail is the term used to describe the configuration and current blacklist for access from remote hosts:

bob@server:~$ sudo fail2ban-client status
Status
|- Number of jail:	1
`- Jail list:		ssh
bob@server:~$ sudo fail2ban-client status ssh
Status for the jail: ssh
|- filter
|  |- File list:	/var/log/auth.log
|  |- Currently failed:	0
|  `- Total failed:	52
`- action
   |- Currently banned:	0
   |  `- IP list:
   `- Total banned:	7

To test everything is working, simply try to log into your system incorrectly three times.  When you’ve done this and you look at the results of “fail2ban-client status ssh” you will see your remote IP in the list.  To unblock your IP, simply restart the fail2ban daemon (i.e. sudo /etc/init.d/fail2ban restart)
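
The three-attempts-then-ten-minutes behaviour mentioned earlier maps onto the maxretry and bantime settings of the jail. Below is a sketch of an override, assuming the Debian/Ubuntu packaging where the jail is called “ssh”, uses the “sshd” filter, and local changes go in /etc/fail2ban/jail.local. The function name ssh_jail_config is made up, and your packaged defaults may differ from these values.

```shell
# Hypothetical helper: print a jail.local override for the ssh jail.
# maxretry = failures allowed before a ban; bantime = ban length in seconds.
ssh_jail_config() {
  cat <<'EOF'
[ssh]
enabled  = true
port     = ssh
filter   = sshd
logpath  = /var/log/auth.log
maxretry = 3
bantime  = 600
EOF
}

# Usage (needs root), then restart the daemon to pick it up:
#   ssh_jail_config | sudo tee /etc/fail2ban/jail.local
#   sudo /etc/init.d/fail2ban restart
```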

These pages were very useful when reading about fail2ban:

Random Images

Found these today on facebook, too good not to share….

#1: A Stein-smashingly good time!

#2: I have days like these …

#3: Just too true

#4: Fandom

#5: White-belly-making-cereal