Content tagged with: alias

Eric's picture

In this tutorial I'll show how you can set up a server to parse email with a PHP script. This tutorial assumes that your server is already configured to receive email (I wrote this using a virtual machine running Postfix).

The first thing you'll need to do is configure an alias to direct email to a PHP script (instead of a mailbox). I added the following entry to the bottom of my /etc/aliases file and then ran the "newaliases" command to refresh my aliases database:

phpscript: "|php -q /usr/local/bin/email.php"

The above entry will pipe email sent to phpscript@MYDOMAIN to the designated PHP script.

And here's the script:

#!/usr/bin/php
<?php

// fetch data from stdin
$data = file_get_contents("php://stdin");

// normalize line endings so the split below also works for CRLF input
$data = str_replace("\r\n", "\n", $data);

// extract the body
// NOTE: a properly formatted email's first empty line defines the separation between the headers and the message body
// (array_pad guards against a message that has no body at all)
list($data, $body) = array_pad(explode("\n\n", $data, 2), 2, '');

// explode on new line
$data = explode("\n", $data);

// define a list of known headers
$patterns = array(
  'Return-Path',
  'X-Original-To',
  'Delivered-To',
  'Received',
  'To',
  'Message-Id',
  'Date',
  'From',
  'Subject',
);

// define a variable to hold parsed headers
$headers = array();

// remember which header matched last, so continuation lines can be attached to it
$last_match = null;

// loop through data
foreach ($data as $data_line) {

  // for each line, assume a match does not exist yet
  $pattern_match_exists = false;

  // check for lines that start with white space
  // NOTE: if a line starts with a white space, it signifies a continuation of the previous header
  if ((substr($data_line, 0, 1) == ' ' || substr($data_line, 0, 1) == "\t") && $last_match) {

    // append to last header
    $headers[$last_match][] = $data_line;
    continue;
  }

  // loop through patterns
  foreach ($patterns as $pattern) {

    // create preg regex (header names are case-insensitive, hence the "i" flag)
    $preg_pattern = '/^' . preg_quote($pattern, '/') . ': (.*)$/i';

    // execute preg and check if a match exists
    if (preg_match($preg_pattern, $data_line, $matches)) {
      $headers[$pattern][] = $matches[1];
      $pattern_match_exists = true;
      $last_match = $pattern;

      // a line can only be one header, so stop looking
      break;
    }
  }

  // check if a pattern did not match for this line
  if (!$pattern_match_exists) {
    $headers['UNMATCHED'][] = $data_line;

    // keep any continuation lines with the unmatched header they belong to
    $last_match = 'UNMATCHED';
  }
}

?>

At this point in the code, the body of the message will be contained in the $body variable and the headers will be in $headers.
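If you want to check the script without sending real mail, you can imitate what the alias does: start a command and write a message to its standard input. The following is only a rough sketch. The function name pipe_to_command is made up for this example, and it assumes proc_open() is enabled in your PHP CLI.

```php
<?php

/**
 * Run a shell command and write $message to its stdin, the way the
 * alias pipes incoming mail to the script.
 * Hypothetical helper for local testing, not part of the script itself.
 * Returns array($stdout, $stderr).
 */
function pipe_to_command($command, $message) {
  $descriptors = array(
    0 => array('pipe', 'r'), // the child reads the message from here
    1 => array('pipe', 'w'), // the child's standard output
    2 => array('pipe', 'w'), // the child's standard error
  );

  $process = proc_open($command, $descriptors, $pipes);
  if (!is_resource($process)) {
    throw new RuntimeException("Could not start: $command");
  }

  // send the message, then close stdin so the child sees end-of-file
  fwrite($pipes[0], $message);
  fclose($pipes[0]);

  // collect whatever the child printed (fine for small test messages)
  $stdout = stream_get_contents($pipes[1]);
  $stderr = stream_get_contents($pipes[2]);
  fclose($pipes[1]);
  fclose($pipes[2]);
  proc_close($process);

  return array($stdout, $stderr);
}
```

For example, pipe_to_command('php -q /usr/local/bin/email.php', "To: phpscript\nSubject: test\n\nHello\n") would feed a small hand-written message to the script. The script prints nothing as written, so you would need to add something like print_r($headers) at the end while testing.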

Here is an example of the parsed headers (using print_r()):

Array
(
    [UNMATCHED] => Array
        (
            [0] => From root@Eric-Centos.localdomain  Sun Jan 10 21:49:50 2010
        )

    [Return-Path] => Array
        (
            [0] => <root@Eric-Centos.localdomain>
        )

    [X-Original-To] => Array
        (
            [0] => phpscript
        )

    [Delivered-To] => Array
        (
            [0] => phpscript@Eric-Centos.localdomain
        )

    [Received] => Array
        (
            [0] => by Eric-Centos.localdomain (Postfix, from userid 0)
            [1] => id 4D03F30131; Sun, 10 Jan 2010 21:49:50 -0500 (EST)
        )

    [To] => Array
        (
            [0] => phpscript@Eric-Centos.localdomain
        )

    [Subject] => Array
        (
            [0] => This is the subject
        )

    [Message-Id] => Array
        (
            [0] => <20100111024950.4D03F30131@Eric-Centos.localdomain>
        )

    [Date] => Array
        (
            [0] => Sun, 10 Jan 2010 21:49:50 -0500 (EST)
        )

    [From] => Array
        (
            [0] => root@Eric-Centos.localdomain (root)
        )

)
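Notice that the Received header above is split across two array entries, and that only headers from the known list get their own key. If that does not suit you, one alternative is sketched below. It is a variation on the script rather than the script itself: the function name parse_raw_email is hypothetical, it accepts any "Name: value" header, and it joins continuation lines onto the value they continue.

```php
<?php

/**
 * Split a raw message into headers and body.
 * Hypothetical function; unlike the script above it accepts any
 * "Name: value" header and joins continuation lines onto the value
 * they belong to.
 * Returns array($headers, $body), where $headers maps a header name
 * to a list of values (one entry per occurrence of that header).
 */
function parse_raw_email($raw) {
  // treat CRLF and LF line endings the same way
  $raw = str_replace("\r\n", "\n", $raw);

  // the first empty line separates the headers from the body
  $parts = explode("\n\n", $raw, 2);
  $body = isset($parts[1]) ? $parts[1] : '';

  $headers = array();
  $name = null; // header that a continuation line would belong to

  foreach (explode("\n", $parts[0]) as $line) {
    if ($line === '') {
      continue;
    }

    if (($line[0] === ' ' || $line[0] === "\t") && $name !== null) {
      // continuation: append to the most recent value of that header
      $last = count($headers[$name]) - 1;
      $headers[$name][$last] .= ' ' . trim($line);
    }
    elseif (preg_match('/^([^\s:]+):\s*(.*)$/', $line, $matches)) {
      // a new header of the form "Name: value"
      $name = $matches[1];
      $headers[$name][] = $matches[2];
    }
    else {
      // e.g. the mbox "From " line that precedes the real headers
      $name = 'UNMATCHED';
      $headers[$name][] = $line;
    }
  }

  return array($headers, $body);
}
```

Because it takes a string instead of reading stdin, you can call it on a hand-written message: list($headers, $body) = parse_raw_email($raw);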

Now, you have all the email headers and message body parsed. You can do whatever your heart desires with the data, like insert it into a database or even create nodes!
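As a rough illustration of the database idea, here is a sketch that stores the sender, subject and body using PDO. Everything specific in it is an assumption for the example: the function names, the "messages" table and its columns, and the use of SQLite (the CREATE TABLE statement is written for SQLite and would need adjusting for another database).

```php
<?php

/**
 * Pull the first thing that looks like an address out of a header value
 * such as "<root@host>" or "root@host (root)".
 * Deliberately loose; this is not a full address parser.
 */
function extract_address($value) {
  if (preg_match('/[^\s<>()",;]+@[^\s<>()",;]+/', $value, $matches)) {
    return $matches[0];
  }
  return null;
}

/**
 * Store one parsed message and return the new row id.
 * $headers uses the layout produced by the script above
 * (header name => array of lines).
 * The table and column names are made up for this example.
 */
function store_message(PDO $db, array $headers, $body) {
  $db->exec('CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY,
    sender TEXT,
    subject TEXT,
    body TEXT
  )');

  // join multi-line headers back into a single string
  $from = isset($headers['From']) ? implode(' ', $headers['From']) : '';
  $subject = isset($headers['Subject']) ? implode(' ', $headers['Subject']) : '';

  // a prepared statement keeps untrusted mail content out of the SQL text
  $statement = $db->prepare('INSERT INTO messages (sender, subject, body) VALUES (?, ?, ?)');
  $statement->execute(array(extract_address($from), $subject, $body));

  return (int) $db->lastInsertId();
}
```

With a connection such as $db = new PDO('sqlite::memory:'), the end of the script could then call store_message($db, $headers, $body). Mail content comes from strangers, so it should always go through bound parameters like this rather than being pasted into the query.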

In the future, I will try to elaborate more on how to improve SEO in Drupal. But for right now, here are some notes on what I have done with this site.

1. Install the Google Analytics module (http://drupal.org/project/google_analytics). You'll need to create a Google account if you have not already done so. This will monitor your visitors and web traffic.

2. Configure your URLs
- Enable clean URLs. This uses Apache mod_rewrite so that paths appear as a directory structure (node/67) instead of a query string (?q=node/67).
- Enable the path module so you can rename URLs to whatever you like. Instead of node/#, you can make them more descriptive.
- Install the Pathauto module (http://drupal.org/project/pathauto). This module can be configured to automatically create a URL path alias based off of taxonomy, node title, menu structure, etc. I find it useful to configure path aliases based off menu structure and node titles. For example, here is a sample menu structure and the aliases I would use:

Home
>> My Hobbies
   >> Photography
      >> node/67
      >> node/68

My-Hobbies/Photography/Hiking
My-Hobbies/Photography/Ralphie-the-Cat

3. Install the XML Sitemap module (http://drupal.org/project/xmlsitemap). This module allows you to generate an XML sitemap that can be submitted to search engines automatically. You can see mine here: (http://thedrupalblog.com/sitemap.xml). I set my site to submit the sitemap to each available search engine. I also recommend signing up for the Google Webmaster Tools (http://www.google.com/webmasters/tools). This will allow you to monitor and configure the way Google analyzes your XML sitemap.

4. Install the Global Redirect module (http://drupal.org/project/globalredirect). This module checks whether a path alias exists and redirects the user as necessary. For instance, if a user went to node/#, this module would redirect them to the more search-engine-friendly URL alias.

5. Use the taxonomy module. It's a great way to categorize your content. For this site, I use free tagging, so I do not have to maintain a definitive list of terms. It enables me to type in comma-separated lists of terms that relate to my content. This also allows your users to click on your taxonomy terms and view content that has been tagged with the same term.
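To make steps 2 and 4 a little more concrete, here is a small sketch of the two ideas: building an alias from a menu trail and a node title, and deciding whether a requested path should be redirected to its alias. This is not the code of Pathauto or Global Redirect, just an illustration of what they do for you; all three function names are made up.

```php
<?php

/**
 * Turn a piece of text into a URL segment: every run of characters
 * other than letters and digits becomes a single hyphen.
 */
function alias_segment($text) {
  return trim(preg_replace('/[^A-Za-z0-9]+/', '-', $text), '-');
}

/**
 * Build an alias from a menu trail plus the node title,
 * e.g. array('My Hobbies', 'Photography') and 'Ralphie the Cat'.
 */
function build_alias(array $menu_trail, $title) {
  $segments = array_map('alias_segment', $menu_trail);
  $segments[] = alias_segment($title);
  return implode('/', $segments);
}

/**
 * Given the requested path and a map of system path => alias, return
 * the alias to redirect to, or null when no redirect is needed.
 */
function redirect_target($requested_path, array $aliases) {
  if (isset($aliases[$requested_path]) && $aliases[$requested_path] !== $requested_path) {
    return $aliases[$requested_path];
  }
  return null;
}
```

build_alias(array('My Hobbies', 'Photography'), 'Ralphie the Cat') gives My-Hobbies/Photography/Ralphie-the-Cat. Assuming node/68 is the Ralphie page, redirect_target('node/68', array('node/68' => 'My-Hobbies/Photography/Ralphie-the-Cat')) returns that alias, while a request that already uses the alias returns null.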

Additional Notes
- If you are using automatic path aliases via Pathauto, be careful when editing your nodes; your aliases may be updated, which can affect your menu structure and page links.
- I think it's important to set up Pathauto immediately after installing Drupal. That way, when you're ready to start adding content to your site, you'll already be following a standard naming convention.
- I had some difficulty getting XML Sitemap and Pathauto to work together. At first, not all of my pages and taxonomy terms were showing up in the sitemap. I found a module called Module Weight (http://drupal.org/project/moduleweight) which helped alleviate some of my headache.