S3 Amazon Simple Storage Service

Posted on March 31, 2010 at 2:59 am in

So yeah, I’ve been looking around for tools I can use with Amazon S3.
I originally used S3Fox, a useful browser plugin for Firefox. It works great for small jobs like copying single
files and small directories of files. However, if you want something a little more robust, the best tool
out there is s3cmd.

I’ve only been using it for a bit, but it seems to do everything this programmer could want.
It’s a command line tool for Linux. It’s pretty good so far: I’ve copied 4900 images with it in a single bound.
It died when I tried copying over 100000 files, but that was probably the operating system
whimpering (most likely the shell’s limit on how long an argument list can get once * is expanded) and not s3cmd. I switched to s3cmd put a* s3://bucket/directory/ instead, to send
all of the files that start with the letter a first and split things up. That worked fine.
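
If you’d rather script the splitting than type a*, b*, c* by hand, here is a minimal sketch of the same idea in Python. It only builds the argument lists and does not run anything. The helper names batch_by_first_char and build_put_commands are made up for this example, and the bucket path is a placeholder.

```python
from collections import defaultdict


def batch_by_first_char(filenames):
    """Group filenames by first character, like running put a*, put b*, ..."""
    batches = defaultdict(list)
    for name in filenames:
        batches[name[0]].append(name)
    # Sort by the leading character so the batches come out in a stable order.
    return dict(sorted(batches.items()))


def build_put_commands(filenames, destination):
    """Return one s3cmd argument list per batch, ready for subprocess.run()."""
    return [
        ["s3cmd", "put", *names, destination]
        for names in batch_by_first_char(filenames).values()
    ]


if __name__ == "__main__":
    files = ["apple.jpg", "avocado.jpg", "banana.jpg"]
    for cmd in build_put_commands(files, "s3://bucket/directory/"):
        print(" ".join(cmd))
```

A single letter can still hold too many files, so splitting into fixed-size chunks instead would be a reasonable variation.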

Installation was a breeze on CentOS Linux.
# yum install s3cmd
# s3cmd --configure

Then once that’s all done you can start running these commands.
s3cmd ls
s3cmd put * s3://bucketname/directory/
s3cmd get s3://bucketname/directory/file

I was a little disappointed that s3cmd --help didn’t list the commands. You have to sort of look around on the website. I’ve listed a few common commands here; there are others on the website. I use S3Fox for making directories and buckets, but here are the s3cmd commands for buckets.

s3cmd mb s3://bucketname         # makes a bucket
s3cmd rb s3://buckettoremove     # removes a bucket

Check out here for more information and documentation.

The man pages for Ubuntu actually have better documentation than the original website.

Finally, I was having issues running s3cmd from PHP’s shell_exec on certain boxes/configurations.

Here’s a good thread about how to deal with these issues


ubuntu apache2 and htaccess files.

Posted on March 20, 2010 at 11:10 pm in

If you are just trying to set up an apache2 server straight out of the box and want to add .htaccess files, the way to do it is kind of different than what the Apache instructions describe.

Here’s the thread I found through google on it.

http://ubuntuforums.org/showthread.php?t=47669

Follow orlando_nicks advice (posted 5 years ago). LOL.
The post is pretty old so I’ll cut & paste it here just in case it goes missing someday.

Found the solution!! Apache2 in general, or it might be specifically to Ubuntu, is configured slightly differently than Apache1.x.. at least from what I’ve seen in the default installs in RH, FC, Mandrake, etc (I’m not Linux or Apache expert).

I went to /etc/apache2/sites-available and edited the file default
There you’ll find:


NameVirtualHost *
<VirtualHost *>
ServerAdmin admin@site.com

DocumentRoot /var/www/
<Directory />
Options FollowSymLinks
AllowOverride None
</Directory>
<Directory /var/www/>
Options Indexes FollowSymLinks MultiViews
AllowOverride All
Order allow,deny
allow from all
# This directive allows us to have apache2's default start page
# in /apache2-default/, but still have / go to the right place
# Commented out for Ubuntu
#RedirectMatch ^/$ /apache2-default/
</Directory></code>

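Oh, and yay for the code tag not working. The line that matters is AllowOverride All inside the <Directory /var/www/> block, which is what lets .htaccess files take effect. Restart Apache after changing it.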


Ok ubuntu installation again!

Posted on March 19, 2010 at 5:05 pm in

So yeah I’m setting up another slicehost server.

Following these instructions:

https://help.ubuntu.com/community/ImageMagick

http://articles.slicehost.com/2008/4/25/ubuntu-hardy-installing-apache-and-php5

However, they didn’t work on the first try. It turns out I hadn’t run the following command first: sudo apt-get update

Once I ran that, everything was AOK. Next I wanted to install convert/ImageMagick. I followed these instructions:

https://help.ubuntu.com/community/ImageMagick

Everything AOK so far. This box doesn’t need MySQL however so I haven’t installed it.


Useless blog is useless?

Posted on March 11, 2010 at 2:31 pm in

I put this blog up mainly so I can keep track of ideas and various procedures I deal with in running websites and programming. It’s an online diary that I don’t mind sharing with everyone. Unfortunately, since WordPress is so popular, I’m constantly getting spam comments. In total I’ve had 213 spam comments (probably more) and 0 real comments, and each one of those generates an email. So for now I’ve disabled comments on this site.


Commonly used linux commands

Posted on February 22, 2010 at 7:17 pm in

Here’s a list of useful linux commands for anyone running a website on a LAMP box.

df – report file system disk space usage (ex: df -h)
chmod – change permissions to make files/directories readable/writable/executable (ex: chmod 777 dir)
chown – change ownership of a file (ex: chown me file.txt)
whoami – determine what user you are logged in as/running under.
which – determine the location of an executable (ex: which convert)
cd – change your current directory, where tilde ~ is your home directory. (ex: cd ~/)
ls – list files in current directory (ex: ls -la)
cp – copy files (ex: cp * directory)
mv – move/rename files (ex: mv * directory)

Those are some of the shell commands. Here are some other misc commands I commonly use:
crontab – edit my cron jobs (ex: crontab -e)
php – run php on the command line (ex: php -f filename)
sudo – super user do (ex: sudo command)
wget – web get (ex: wget http://www.google.com )
uptime – server uptime
ps – list processes (ex: ps aux)

Other commands to retrieve information about hardware:
cat /proc/cpuinfo
cat /proc/meminfo
dmesg

Those are some of the basic commands needed to run a website on a Linux Apache MySQL PHP box.
I’ll post about dealing with and running Apache and MySQL next time I have to install Apache.
:D

Running MySQL and Apache requires a different set of skills. I’ll post about simple MySQL administration/client uses and commands in my next post.


An excellent blog article on how to set up CutyCapt

Posted on February 17, 2010 at 5:22 am in

Check this article out if you want to know how to set up CutyCapt on Ubuntu Linux:

http://mattaustin.me.uk/2009/01/thummer-website-thumbnail-generator-cutycapt-django/

It’s the best one around and saved me hours if not days of work trying to figure out why certain headers/libraries didn’t exist.


Hadoop: Handling of Binary Data

Posted on June 14, 2009 at 7:42 pm in

Several large companies have recently implemented Hadoop on their cloud based computing platforms such as aws.amazon.com. Now small players with large amounts of data to be processed have access to thousands of computers for data processing purposes.

Binary data processing could include anything from converting, scaling and watermarking images, to converting video to a web-ready format such as .flv, to parsing other custom binary formats or converting WAV to MP3.

However, there is currently no one easy way to process binary data with Hadoop. In order to read/write binary data with Hadoop you need to build a custom jar and extend the Hadoop Java classes for reading and writing the data.
I’ve implemented a binary version of these classes and have several examples of how to use the classes FileInputFormat, FileOutputFormat, RecordReader and RecordWriter.

Click here to view my products using Hadoop with binary data readers.

My layman’s explanation of Map/Reduce:

Map/Reduce is a 2 step process.

  1. Performing a map function on a key,value pair
  2. Performing a reduce function on a collection of key,value pairs.

Each piece of data is referenced by a key=>value
Map( key, value ) => Array( (key1, value1), (key2,value2) )
Reduce( ((key1,value1), (key1,value2)) ) => ( key1, value3 )

Typically you would run the Map and Reduce functions in a distributed network on many CPUs.
So you could have many map functions running concurrently, and once all the data has been mapped it gets reduced. Again, the reduce process happens in a distributed network on many CPUs.
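
To make the notation above concrete, here is a tiny single-machine word count in Python. It’s only a sketch of the concept, not Hadoop’s API: map_fn, shuffle, reduce_fn and map_reduce are names I picked for the example, and a real cluster would run the map and reduce calls on many machines at once.

```python
from collections import defaultdict


def map_fn(key, value):
    """Map: (line number, line text) -> list of (word, 1) pairs."""
    return [(word.lower(), 1) for word in value.split()]


def shuffle(pairs):
    """Group mapped values by key so each reduce call sees a single key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_fn(key, values):
    """Reduce: (word, [1, 1, ...]) -> (word, total count)."""
    return key, sum(values)


def map_reduce(records):
    """Run every map call, group the output by key, then reduce each group."""
    mapped = [pair for key, value in records for pair in map_fn(key, value)]
    return dict(reduce_fn(key, values) for key, values in shuffle(mapped).items())


if __name__ == "__main__":
    lines = [(1, "the cat"), (2, "the dog")]
    print(map_reduce(lines))
```

Each map_fn call only looks at its own record and each reduce_fn call only looks at its own key, which is what lets a framework spread them across many CPUs.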

Additional reading:
Get familiar with the map/reduce concept
Read more about Hadoop implementation of map reduce.


MySQL Insert

Posted on June 14, 2009 at 4:15 pm in

Ok, this is a simple post on something that really is annoying about MySQL and Navicat 8.

Basically, any time you execute a long run of single-row INSERT statements, it’s slow as heck on MySQL.
Very very very very slow.
I honestly think there is a bug or a very poor implementation issue that should be fixed.

For instance I was inserting about 22,000 rows using an SQL file produced by a tool called Navicat 8.
It took about 5 minutes to do that. That’s really not an acceptable amount of time when you have users like me waiting on that data being inserted.

So in light of this, after a bit of googling, I found an actual page on the MySQL site about the
speed of INSERT statements.

Specifically, it mentions that you can insert more than one row per INSERT statement.
OK? So let’s try that out:

BOOM, 22,000 records inserted in less than one SECOND.

Now that’s what I’m talking about. I’ve taken 22,000 inserts and made them take about the same amount of time as 1 single insert. It’s incredible.

    Here’s an actual example of multiple-row inserts per single INSERT statement.
LOCK TABLES table_a WRITE, table_b WRITE;  -- makes the inserts even faster
INSERT INTO table_a VALUES (1,23),(2,34),(4,33); -- insert 3 rows into table_a
INSERT INTO table_b VALUES (8,26),(6,29); -- insert 2 rows into table_b
...
UNLOCK TABLES;
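
Here’s a rough sketch of doing the same batching from code instead of by search and replace. It’s in Python and uses the built-in sqlite3 module as a stand-in so it runs anywhere; that is my assumption for the example, not what I used against MySQL. insert_in_batches is a made-up helper name, and with a MySQL driver the placeholder would typically be %s rather than ?.

```python
import sqlite3


def insert_in_batches(conn, table, rows, batch_size=100):
    """Insert rows with one multi-row INSERT per batch instead of one per row.

    The table name is interpolated into the SQL, so it must come from
    trusted code, never from user input.
    """
    rows = list(rows)
    if not rows:
        return 0
    width = len(rows[0])
    row_placeholder = "(" + ",".join("?" * width) + ")"
    statements = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        sql = f"INSERT INTO {table} VALUES " + ",".join([row_placeholder] * len(batch))
        # Flatten [(1, 23), (2, 34)] into [1, 23, 2, 34] to match the placeholders.
        params = [value for row in batch for value in row]
        conn.execute(sql, params)
        statements += 1
    conn.commit()
    return statements


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_a (id INTEGER, val INTEGER)")
    sent = insert_in_batches(conn, "table_a", [(i, i * 2) for i in range(250)])
    print(sent, "statements sent")
```

Keeping the batch size modest keeps each statement under the database’s limits on statement size and bound parameters.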

I’m not sure who to blame more here, MySQL 5 or Navicat 8.

For MySQL 5 there shouldn’t be such a big difference in speed between many single-row inserts and one insert with many rows at once.

Navicat 8 has a poorly implemented Export Wizard for SQL files. The alternative to the Navicat 8 export is mysqldump --user=username --password database table, which writes multi-row inserts by default and is what most people use when their GUI tool breaks down like this.

Originally I was going to blame PHP for my problems, but after writing and rewriting a function to load the SQL files from disk I found that the speed issues were really MySQL’s problem, worsened by the SQL queries produced by Navicat 8.

Here’s a simple function I wrote for reading a SQL file from disk and installing it to whatever database you specify in the settings.

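	/**
	 * Simple function for installing SQL files.
	 * http://dev.mysql.com/doc/refman/5.0/en/insert.html
	 *
	 * **WARNING**
	 * A single insert with many values can be orders of magnitude faster
	 * than many single-row inserts (22,000 rows: minutes vs. under a second).
	 * Use text search and replace to turn multiple inserts into a single insert.
	 * **WARNING**
	 *
	 * @param string $filename An SQL file to read in.
	 */
	function fast_install( $filename )
	{
		// DB_SERVER, DB_USERNAME, DB_PASSWORD and DATABASE are constants from your settings.
		$mysqli = new mysqli( DB_SERVER, DB_USERNAME, DB_PASSWORD, DATABASE );
		if ( $mysqli->connect_error ) {
			die( $mysqli->connect_error );
		}

		$content = file_get_contents( $filename );
		if ( $content === false ) {
			die( "Could not read $filename" );
		}

		// Raise the packet limit to 16 MB. This needs the SUPER privilege and only
		// affects connections opened afterwards, so it is best set once in my.cnf.
		$mysqli->query( "SET GLOBAL max_allowed_packet=16*1024*1024" ) or die( $mysqli->error );

		$mysqli->multi_query( $content ) or die( $mysqli->error );

		// Drain every result so all statements run and later errors are reported.
		do {
			if ( $result = $mysqli->store_result() ) {
				$result->free();
			}
		} while ( $mysqli->more_results() && $mysqli->next_result() );

		if ( $mysqli->errno ) {
			die( $mysqli->error );
		}

		$mysqli->close();
	}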

The origins of life.

Posted on May 13, 2009 at 4:02 pm in

The origins of life

Found this interesting article about experiments that synthesize ribonucleotides in a “natural” manner.

According to Sutherland, these laboratory conditions resembled those of the life-originating “warm little pond” hypothesized by Charles Darwin if the pond “evaporated, got heated, and then it rained and the sun shone.”

It’s very interesting and I think it could also apply to the surface of comets or anywhere you would have radiation and these precursor materials.


Setup MySQL 5.1 on CentOS 5.3

Posted on April 22, 2009 at 4:05 pm in

Yes, I’ve done it. Finally, with much pain and suffering, I’ve set up MySQL on a CentOS 5.3 slicehost.com box. It took me about a day and a half to set everything up, from Apache/MySQL 5.1/PHP to a few other pieces of software. The main problem is incompatibility between MySQL 5.1 and PHP.

Currently there are two blog posts out there with 2 different methods of setting up MySQL 5.1 and the rest of the LAMP software.

http://gravityfx.org/2009/04/08/how-to-install-php-529-andmysql-5133-on-centos-53-final/

http://consciencespeaks.blogspot.com/2009/01/setting-up-lamp-server.html

I was able to get the gravityfx.org version working by using the remi repositories. The only change I had to make was:


yum --enablerepo=remi install mysql-server

and leaving out the mysql package, since it installed packages/software from the CentOS base that interfered with remi’s packages.

Anyway, either way it’s not the default setup, and the PHP that got installed was missing some important libraries for tools such as phpMyAdmin.

The main problem for all of the installations is that PHP is not compatible with the MySQL 5.1 libraries. It works fine with libmysqlclient 15 but not with libmysqlclient 16, which is bundled with MySQL 5.1.

I would recommend anyone installing software on linux to just stick with the defaults as they are what works.

