Setting up a PHP development environment on your local system

This is an old post I wrote on Google Docs, before VMware Server was free for general users. I thought it could still be relevant to developers who want to test in a Linux environment. VMware Player, I believe, is lighter than VMware Server.

There are two stages to setting up the development and testing environment. The preferred work environment is Linux. If you are already on Linux, you can safely skip Stage I and proceed directly to Stage II; just cross-check that you have the LAMP (Linux + Apache + MySQL + PHP) stack installed.

Pre Installation Software Download Links
STAGE I: Setting up Ubuntu on VMware.
  • Download the Ubuntu/Xubuntu/Kubuntu Desktop edition ISO, whichever you like. They are all the same underneath, just with different desktop environments (GNOME/Xfce/KDE respectively). If you are comfortable with the command line, you can also install the Server edition, which has no X Window System and only a command line.
  • Download and install VMware Player for your system.
  • Download qemu for your operating system and extract it into any directory you like.
  • Open a command line, go to the bin folder of qemu and type
qemu-img create -f vmdk ubuntu.vmdk 3G
  • Here we are creating a 3GB virtual hard drive in VMware format. This file will act as the hard drive on which VMware installs the OS. The file is named ubuntu.vmdk here, but you can name it whatever you wish.
  • Now create the main vmx file (the VMware configuration file). Change the file paths, such as the ide0:0.filename and ide1:0.fileName entries, to match your system.

    #!/usr/bin/vmware
    config.version = "8"
    virtualHW.version = "4"
    ide0:0.present = "TRUE"
    ide0:0.filename = "ubuntu.vmdk"
    # The amount of RAM you want to allot to the Operating system. For Desktop use 512 and server just 256.
    memsize = "512"
    MemAllowAutoScaleDown = "FALSE"
    ide1:0.present = "TRUE"

    #ide1:0.fileName = "auto detect"
    #ide1:0.deviceType = "cdrom-raw"

    ide1:0.fileName = "ubuntu-7.04-server-i386.iso"
    ide1:0.deviceType = "cdrom-image"

    ide1:0.autodetect = "TRUE"
    floppy0.present = "FALSE"
    ethernet0.present = "TRUE"
    usb.present = "TRUE"
    sound.present = "TRUE"
    displayName = "Ubuntu LAMP Server"
    guestOS = "ubuntu"
    nvram = "ubuntu-server-three.nvram"
    MemTrimRate = "-1"

    ide0:0.redo = ""
    ethernet0.addressType = "generated"
    uuid.location = "56 4d ce 99 e0 d2 2b bf-73 47 ac 62 65 13 57 86"
    uuid.bios = "56 4d ce 99 e0 d2 2b bf-73 47 ac 62 65 13 57 86"

    tools.syncTime = "TRUE"
    ide1:0.startConnected = "TRUE"

    uuid.action = "create"

    checkpoint.vmState = "ubuntu-lamp-server.vmss"

    isolation.tools.hgfs.disable = "TRUE"
    virtualHW.productCompatibility = "hosted"
    tools.upgrade.policy = "manual"

    tools.remindInstall = "TRUE"

    usb.autoConnect.device0 = ""
  • After setting all the paths correctly, save the vmx file (call it ubuntu.vmx) and, with VMware Player installed, double-click the file.
  • If all the paths are set correctly, VMware Player will boot the virtual machine and show the Ubuntu installation menu. This is a million times easier than installing Windows.
  • Towards the end of the installation (server or desktop) the installer will ask whether you want to install a LAMP server. Select it and let it install.
  • Reboot the system, and you are good to go. You have successfully installed Ubuntu on VMware.
STAGE II: Check the configuration setup of Apache/MySQL/PHP

Now we'll check whether Apache and PHP have been installed successfully. Go to the /var/www folder; you should see an apache2-default folder there. If you do, Apache appears to be installed. Open the browser and go to http://localhost; you should see the apache2-default folder listed. If you receive a page-not-found error, Apache is not running or not installed properly.

If things look good, the next step is to check the PHP installation. Create a file named index.php and type the following into it.

 <?php phpinfo(); ?> 

Place the file in /var/www and refresh http://localhost in the browser. That should give you loads of PHP information on the screen, in blue and purple and laid out in one huge table. If that happens, you are good to go! If not, something is wrong.

MySQL Check
Open the command line and type

mysql -uroot

If it opens with a mysql> prompt, all is good; otherwise something is wrong. The command works without a password because the MySQL root password is blank at this point.

If all looks good, go on to Stage III.

STAGE III: Applications to setup.

Download
  • JOOMLA
  • WORDPRESS
Extract them into /var/www and access them at http://localhost/joomla and http://localhost/wordpress respectively, assuming you extracted them into folders with those names.

To set them up, open index.php in the browser or read the readme file, and you should be good to go.

The demise of MyGadgetBuilder

Hello all, my previous blog, MyGadgetBuilder.com, has been officially shut down. I'll not be renewing the domain name any more, so someone else can go ahead and claim it. It was a nice learning experience. Being a developer with some design skills, putting up the site was a piece of cake, but getting people to post on the site was kind of painful.

People had concerns about copyright, data protection and all kinds of other issues. Also, things and people change as time passes; everyone gets busy with their own profession and has little time for off-track exploration of their old love (electronics). I don't blame them, priorities do change; it just took me some time to realize :) Better late than never.

Well, it's all for the good. Henceforth MGB will point to this blog (until the subscription runs out). Instead, visit Instructables, a nice site that already existed along the lines of the world I was planning to build.

Cloud Computing on VMWare or Xen

What is Cloud Computing (CC)?
A lot has already been said and blogged about cloud computing. Just to give you a one-liner about it:

It's the availability of system resources to allow scalability of the system as and when required by the application.

As and when required?

Consider a build server running in our VM environment. During its peak, with code instrumentation and all kinds of code analysis, this server needs the power of a quad-core machine with 4GB RAM. So when you perform the hardware requirement analysis for this server, you consider all these factors and build a VM that fulfils them.

Over time, your system usage reports show a small spike in resource usage every 3-4 hours (every commit build) and a peak of 2-2.5 hours every 24 hours (nightly builds). The rest of the time the system is pretty much idle, doing nothing while holding on to its precious RAM allocation and dedicated CPUs.

If we move this setup into a CC environment, the system will use the bare minimum of resources when idle and grab all it can from the cluster during peak cycles. This lets other workloads use the resources that would otherwise sit idle during non-peak cycles.
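To put a rough number on "pretty much idle", here is a small sketch of the arithmetic. It uses only the 2 to 2.5 hour nightly-build figure from above; the short per-commit spikes are left out because their duration is not stated, and the class and method names are made up for illustration.

```java
// Sketch: what share of a 24-hour day the nightly-build peak occupies.
// Only the 2 to 2.5 hour figure comes from the post; the per-commit
// spikes are ignored because the post does not say how long they last.
class PeakShare {
    /** Fraction of a 24-hour day spent at peak load. */
    static double peakFraction(double peakHoursPerDay) {
        return peakHoursPerDay / 24.0;
    }

    public static void main(String[] args) {
        for (double hours : new double[] {2.0, 2.5}) {
            double percent = 100.0 * peakFraction(hours);
            System.out.println(hours + " h of peak load is about "
                    + Math.round(percent) + "% of the day");
        }
    }
}
```

By this estimate the nightly peak covers roughly a tenth of the day or less, which is why a dedicated quad-core, 4GB allocation is wasteful for the remaining hours.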

VMware has DRS (Distributed Resource Scheduler), which helps VMs reach a cloud-like state by allocating resources dynamically.

Xen also has something similar, but is it available to a small start-up (read: free)? Yes, Nimbus@UC is something we can look at. I will dig further into it and see what we can get out of it. So let us see how we can leverage the existing hardware for maximum virtual resource utilization.

Update: Read this article http://highscalability.com/eucalyptus-build-your-own-private-ec2-cloud

Virtualization: The Practical Implementation - I

I'm helping my friend Nitin at his firm, Star4ce Technologies, put up a virtualization server as their in-house development environment. To begin with, we went ahead and ordered some hardware from Newegg.

Here is a brief on the hardware we've picked:
  • Intel Core 2 Quad Q9300 Yorkfield 2.5GHz LGA 775 95W Quad-Core Processor
  • ASUS P5K-E LGA 775 Intel P35 ATX Intel Motherboard
  • (5) Seagate Barracuda 7200.11 ST3500320AS 500GB 7200 RPM SATA 3.0Gb/s Hard Drive
  • (2) CORSAIR 4GB (2 x 2GB) 240-Pin DDR2 800 (PC2 6400) SDRAM Dual Channel
  • LITE-ON 20X DVD±R DVD Burner with LightScribe SATA
  • Antec 850W ATX12V / EPS12V Power Supply
  • Antec P182 Gun Metal Black 0.8mm cold rolled steel ATX Mid Tower Computer Case
  • ZOTAC GeForce 7300GT 256MB 128-bit GDDR2 PCI Express x16
  • Logitech USB + PS/2 Cordless Standard Desktop EX110 Mouse Included
The bill comes to around $1,500.00 as of the day this article was written. Hopefully we will receive the components by next week and can go ahead and assemble the system.

A few afterthoughts about the hardware choices

Intel Core 2 Quad Q9300
The 45nm process consumes less power than its predecessors, and the chip can be overclocked to 3.2GHz.

ASUS P5K-E
It's not a high-end server board, but it can serve in a low-end server. We did look at a few other motherboards but opted for this one, as the others had SLI and were geared more towards gaming PCs.

Why no Server Motherboards
This is an interesting find we dug into. We considered server motherboards for the system, as they easily support 32GB of RAM, compared to the 8GB maximum of the board mentioned above. The price was not that different, approximately $150.00 higher (with two physical processor sockets), but they needed fully buffered (FB-DIMM) RAM, which took its toll on the price of the system. That RAM goes for approximately $250.00 for 4GB to $550.00 for 8GB. Now that's a whopping high number.

So we are back to reality with a lower-end system. Recall the first article: it was aimed at start-ups and low-budget development teams.

Seagate Barracuda 7200.11 ST3500320AS

32MB cache, and I trust Seagate. Five drives: one for the OS and four in RAID 01.
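As a quick sanity check on what the four data drives yield, here is a sketch of the capacity arithmetic. It assumes "RAID 01" means RAID 0+1 (two striped sets mirrored against each other), where usable space is half the raw total; the class and method names are illustrative only.

```java
// Sketch: usable capacity of a RAID 0+1 array (assumed meaning of "RAID 01").
// Half of the drives hold the mirror copy, so only half the raw space is usable.
class RaidCapacity {
    /** Usable capacity in GB for an even number of equally sized drives. */
    static int usableGb(int driveCount, int driveSizeGb) {
        if (driveCount < 4 || driveCount % 2 != 0) {
            throw new IllegalArgumentException("this sketch expects an even count of at least 4 drives");
        }
        return (driveCount / 2) * driveSizeGb;
    }

    public static void main(String[] args) {
        // Four 500GB Barracudas, as in the parts list above.
        System.out.println("Usable GB: " + usableGb(4, 500));
    }
}
```

Under that assumption the four 500GB drives give 1000GB of usable, mirrored storage, with the fifth drive kept separate for the OS.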

CORSAIR 4GB (2 x 2GB) 240-Pin DDR2 800
Nice & reliable.

ZOTAC GeForce 7300GT 256MB
There was no onboard video, and this was the lowest-priced and best option.

So we wait and watch for the parts to arrive. I'll post intermittently about the progress, issues and accomplishments along the way.

UPDATE: The shipping was quick; the parts should arrive this afternoon.

DATETIME vs TIMESTAMP vs DATE & TIME - II

OK, the tests have been run and the results are out. I know we are all excited to see them, and I'm equally eager to print them!

The test did a simple SELECT * from each of the tables.
public void fetchAll() throws Exception {
    String SQL1 = "SELECT * FROM dateandtime";
    String SQL2 = "SELECT * FROM datetime";
    String SQL3 = "SELECT * FROM timestamps";

    long start = 0;
    long end = 0;

    System.out.println("ONE");
    start = new Date().getTime();
    selectQuery(SQL1);
    end = new Date().getTime();
    System.out.println(" SQL 1 - dateandtime " + (end - start));

    System.out.println("TWO");
    start = new Date().getTime();
    selectQuery(SQL2);
    end = new Date().getTime();
    System.out.println(" SQL 2 - datetime " + (end - start));

    System.out.println("THREE");
    start = new Date().getTime();
    selectQuery(SQL3);
    end = new Date().getTime();
    System.out.println(" SQL 3 - timestamps " + (end - start));
}
The time to fetch tended to decrease with subsequent runs.
SQL 1 - dateandtime 4526 ms
SQL 2 - datetime 2852 ms
SQL 3 - timestamps 3577 ms

SQL 1 - dateandtime 4168 ms
SQL 2 - datetime 2467 ms
SQL 3 - timestamps 3073 ms

SQL 1 - dateandtime 4080 ms
SQL 2 - datetime 2346 ms
SQL 3 - timestamps 3130 ms

SQL 1 - dateandtime 3949 ms
SQL 2 - datetime 2419 ms
SQL 3 - timestamps 3043 ms
So it looks like DATETIME wins on fetch speed.
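To compare the three tables on a single number each, the four runs above can be averaged. The sketch below does only that; the timings are copied from the listing above, and the class and constant names are made up for illustration.

```java
import java.util.Arrays;

// Sketch: average the four timing runs reported above for each table.
class FetchAverages {
    // Timings in milliseconds, copied from the four runs listed in the post.
    static final long[] DATE_AND_TIME = {4526, 4168, 4080, 3949};
    static final long[] DATETIME = {2852, 2467, 2346, 2419};
    static final long[] TIMESTAMPS = {3577, 3073, 3130, 3043};

    /** Arithmetic mean of the given timings, or NaN for an empty array. */
    static double mean(long[] millis) {
        return Arrays.stream(millis).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        System.out.println("dateandtime mean ms: " + mean(DATE_AND_TIME));
        System.out.println("datetime mean ms:    " + mean(DATETIME));
        System.out.println("timestamps mean ms:  " + mean(TIMESTAMPS));
    }
}
```

The means work out to 4180.75 ms for dateandtime, 2521 ms for datetime and 3205.75 ms for timestamps, so the ordering is the same as in every individual run.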

Duplicate file finder

Disk space is cheap. From the 1.2GB hard drive in my first computer to a "spare" 500GB external hard drive, cheap data storage has come a long way. I have taken a lot of photographs ever since I got my first digital camera, and I store a lot of them too (locally). Since 2006 I have accumulated 201,608 photos and some videos. My camera's photo number counter has reset twice!

A few days back I decided to hand over my drive to my brother, since he was leaving soon. I dumped all the stuff onto my two laptops and forgot about it. Recently my wife asked me to pick out some nice pictures so we could print them. That's when I realized I had a lot of duplicate photos. The simplest idea was to find duplicate file names and delete them, but that was not possible: since the counter had already reset twice, I technically had three different files with the same name, and at least 10,000 of them!

MD5 to the rescue. Since all the photos and movies are binary files, MD5 seemed ideal to me...

MD5 digests have been widely used in the software world to provide some assurance that a transferred file has arrived intact. For example, file servers often provide a pre-computed MD5 checksum for the files, so that a user can compare the checksum of the downloaded file to it.

So I fired up Eclipse and churned out a program to scan my HDD, compare MD5 checksums and find all the duplicates.

The method below generates the MD5 checksum for any file

// Imports needed by this method:
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

private static String generateMD5(String path) throws IOException {
    // try-with-resources closes the stream even if reading fails
    try (InputStream is = new FileInputStream(path)) {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        byte[] buffer = new byte[8192];
        int read;
        // read() returns -1 at end of file
        while ((read = is.read(buffer)) != -1) {
            digest.update(buffer, 0, read);
        }
        // Pad to 32 hex digits so leading zeros are not dropped
        return String.format("%032x", new BigInteger(1, digest.digest()));
    } catch (NoSuchAlgorithmException e) {
        // MD5 is a standard JDK algorithm, so this should not happen
        throw new IllegalStateException(e);
    }
}
Then go ahead and get a list of all the files to scan. The code for this step (the generateFileMap method that main calls) is missing from this post.
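The main method calls generateFileMap(root, filePaths), but the post does not show its body. What follows is only a guess at what such a helper might look like: the name and the two arguments come from the call in main, while the recursive walk, the class name FileLister and the handling of unreadable directories are my assumptions.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the missing helper; not the original code.
class FileLister {
    /** Recursively adds the path of every regular file under root to paths. */
    static List<String> generateFileMap(String root, List<String> paths) {
        File[] entries = new File(root).listFiles();
        if (entries == null) {
            // root is not a directory, or it could not be read
            return paths;
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                generateFileMap(entry.getPath(), paths);
            } else if (entry.isFile()) {
                paths.add(entry.getPath());
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        String root = args.length > 0 ? args[0] : ".";
        for (String path : generateFileMap(root, new ArrayList<String>())) {
            System.out.println(path);
        }
    }
}
```

Note that this sketch follows symbolic links to directories, so a link that points back up the tree would make it loop; a real scan of a photo archive may want to guard against that.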

And finally run the main method.

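// Needs the java.io (File, FileWriter) and java.util (List, ArrayList, SortedMap, TreeMap) imports
public static void main(String[] args) throws Exception {
    List<String> filePaths = new ArrayList<String>();
    File file = new File("/home/varun/workbench/duplicates.csv");
    // Maps each checksum to the first path seen with that checksum
    SortedMap<String, String> duplicates = new TreeMap<String, String>();
    filePaths = generateFileMap("/mnt/datastorage/photos", filePaths);
    // try-with-resources flushes and closes the CSV file when done
    try (FileWriter fw = new FileWriter(file)) {
        for (String path : filePaths) {
            String hash = generateMD5(path);
            if (duplicates.containsKey(hash)) {
                // Same checksum seen before: write "duplicate,original"
                fw.write(path + "," + duplicates.get(hash) + "\n");
            } else {
                duplicates.put(hash, path);
            }
        }
    }
}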

So go ahead and run this. You can also extend it to find duplicate files of any kind. For reference, the average file size in my collection is 1.5 - 2.0 MB.

The output file can be viewed in OpenOffice, Google Docs or Excel. Each row shows a duplicate file followed by the file it is a duplicate of.

/mnt/datastorage/Photos/2008/Photos/Halloween NYC 2007/DSC01958.JPG /mnt/datastorage/Photos/2008/Photos/SORT ME/DSC01958.JPG
/mnt/datastorage/Photos/2008/Photos/Halloween NYC 2007/DSC01959.JPG /mnt/datastorage/Photos/2008/Photos/SORT ME/DSC01959.JPG
/mnt/datastorage/Photos/2008/Photos/Halloween NYC 2007/DSC01960.JPG /mnt/datastorage/Photos/2008/Photos/SORT ME/DSC01960.JPG


UPDATE: I also ran the program with the SHA algorithm; here are the comparison times.
  • Time for MD5 858,953ms (14.32 minutes)
  • Time for SHA 1,191,656ms (19.86 minutes)
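Switching algorithms only requires a different name passed to MessageDigest.getInstance. The sketch below shows that on an in-memory byte array rather than a file. The post does not say which SHA variant was used, so "SHA-1" here is an assumption, and the class and method names are illustrative.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch: the same hashing routine with the algorithm name as a parameter.
class DigestCompare {
    /** Hex digest of data under the named algorithm, zero-padded to full width. */
    static String hex(String algorithm, byte[] data) {
        try {
            byte[] sum = MessageDigest.getInstance(algorithm).digest(data);
            // Two hex digits per byte; padding keeps leading zeros
            return String.format("%0" + (sum.length * 2) + "x", new BigInteger(1, sum));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("unknown digest algorithm: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        byte[] sample = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println("MD5   " + hex("MD5", sample));
        System.out.println("SHA-1 " + hex("SHA-1", sample));
    }
}
```

From the timings above, the SHA run took roughly 1.4 times as long as the MD5 run over the same files.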

DATETIME vs TIMESTAMP vs DATE & TIME

I'm starting off this project and wanted to study some data retrieval optimizations. Date and time are the two most decisive factors in how my app processes information. The aggregation, classification, sorting and grouping of data are all based on date and time.
  • Daily reports
  • Weekly reports
  • Every day at 00:00 hours.
  • Every year on this date
So there is a huge amount of chronological processing. We might need to process the data by date alone, by time alone, or by both date and time. So the question was born: "What is the optimal way of storing this information: DATE & TIME, DATETIME or TIMESTAMP?" Here is what my initial study found.

From the MySQL manual...

Storage Requirements for Date and Time Types

Data Type    Storage Required
DATE         3 bytes
TIME         3 bytes
DATETIME     8 bytes
TIMESTAMP    4 bytes
YEAR         1 byte

The storage requirements shown in the table arise from the way that MySQL represents temporal values:

  • DATE: A three-byte integer packed as DD + MM×32 + YYYY×16×32

  • TIME: A three-byte integer packed as DD×24×3600 + HH×3600 + MM×60 + SS

  • DATETIME: Eight bytes:

    • A four-byte integer packed as YYYY×10000 + MM×100 + DD

    • A four-byte integer packed as HH×10000 + MM×100 + SS

  • TIMESTAMP: A four-byte integer representing seconds UTC since the epoch ('1970-01-01 00:00:00' UTC)

  • YEAR: A one-byte integer
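The packing formulas quoted above are easy to try out by hand. Here is a small sketch that applies them exactly as written; the class and method names are made up, and this only mirrors the arithmetic from the manual excerpt, not MySQL's actual on-disk code.

```java
// Sketch: the integer packing formulas from the manual excerpt above.
class TemporalPacking {
    /** DATE: DD + MM*32 + YYYY*16*32 */
    static int packDate(int year, int month, int day) {
        return day + month * 32 + year * 16 * 32;
    }

    /** TIME: DD*24*3600 + HH*3600 + MM*60 + SS */
    static int packTime(int days, int hours, int minutes, int seconds) {
        return days * 24 * 3600 + hours * 3600 + minutes * 60 + seconds;
    }

    /** DATETIME, date half: YYYY*10000 + MM*100 + DD */
    static int packDatetimeDate(int year, int month, int day) {
        return year * 10000 + month * 100 + day;
    }

    /** DATETIME, time half: HH*10000 + MM*100 + SS */
    static int packDatetimeTime(int hours, int minutes, int seconds) {
        return hours * 10000 + minutes * 100 + seconds;
    }

    public static void main(String[] args) {
        // Example value chosen for illustration: 2008-06-15 13:45:30
        System.out.println("DATE     " + packDate(2008, 6, 15));
        System.out.println("TIME     " + packTime(0, 13, 45, 30));
        System.out.println("DATETIME " + packDatetimeDate(2008, 6, 15)
                + " / " + packDatetimeTime(13, 45, 30));
    }
}
```

For 2008-06-15 the DATE formula gives 1028303, which fits comfortably in three bytes (maximum 16,777,215), while the DATETIME halves are simply the readable numbers 20080615 and 134530.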

So in terms of storage, DATETIME takes 8 bytes, TIMESTAMP 4 bytes, and DATE & TIME together 6 bytes (3 each). Ideally TIMESTAMP is good enough, if it fits my needs.

8 bytes > 6 bytes > 4 bytes

Storage is getting cheaper by the day, so let's ignore this for the time being; we'll revisit the storage factor a bit later.

Since I have to fetch information and process it, I decided to run some tests in MySQL. Below is the schema of the database.
CREATE DATABASE datetest;

USE datetest;

DROP TABLE IF EXISTS dateandtime;

DROP TABLE IF EXISTS datetime;

DROP TABLE IF EXISTS timestamps;

CREATE TABLE dateandtime (
timeonly TIME,
dateonly DATE,
counter INTEGER,
salary DECIMAL(10,2),
PRIMARY KEY (timeonly, dateonly));

CREATE TABLE datetime (
dateandtime DATETIME,
counter INTEGER,
salary DECIMAL(10,2),
PRIMARY KEY (dateandtime));

CREATE TABLE timestamps (
timestamps TIMESTAMP,
counter INTEGER,
salary DECIMAL(10,2),
PRIMARY KEY (timestamps));

I added approximately 100,000,000 records to each table, and will run further tests on them. I have yet to write the test cases; once I'm done I'll put the files up.
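With roughly 100,000,000 rows per table, the per-value sizes from the manual add up. The sketch below multiplies them out for the temporal columns only; it ignores the counter and salary columns, index overhead and any engine-specific row overhead, so treat the figures as a lower bound. The class name is illustrative.

```java
// Sketch: raw bytes used by the temporal column(s) alone across 100,000,000 rows.
class TemporalStorage {
    static final long ROWS = 100_000_000L;

    /** Total bytes for a column (or column pair) of the given per-row size. */
    static long totalBytes(int bytesPerRow) {
        return ROWS * bytesPerRow;
    }

    public static void main(String[] args) {
        System.out.println("DATETIME    (8 bytes): " + totalBytes(8) + " bytes");
        System.out.println("DATE + TIME (6 bytes): " + totalBytes(6) + " bytes");
        System.out.println("TIMESTAMP   (4 bytes): " + totalBytes(4) + " bytes");
    }
}
```

That is 800 million bytes for DATETIME against 400 million for TIMESTAMP, a difference of about 400MB per hundred million rows before indexes are counted.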

I found another interesting post that you might want to look at.
http://www.scribd.com/doc/2565263/The-top-20-design-tips-for-MySQL-Enterprise-data-architects
