coding

Web based chat and shout boxes.

July 14, 2010 - 1:37 pm

Here are some of the chat and shout box solutions I’ve found or dealt with lately while looking for the right fit for the websites I’m working on.


JavaScript Libraries and Tools.

May 22, 2010 - 3:07 pm

Here are some JavaScript libraries and tools that I find useful:

http://jquery.com/

http://www.bitrepository.com/ajax-login-modal-box.html

http://tablesorter.com/docs/


Hadoop: Handling of Binary Data

June 14, 2009 - 7:42 pm

Several large companies have recently implemented Hadoop on their cloud-based computing platforms, such as aws.amazon.com. Now small players with large amounts of data to process have access to thousands of computers for data processing.

Binary data processing can include everything from converting, scaling, and watermarking images to converting video to a web-ready format such as .flv, parsing other custom binary formats, converting WAV to MP3, and so on.

However, there is currently no single easy way to process binary data with Hadoop. To read or write binary data with Hadoop you need to build a custom jar and extend the Hadoop Java classes for reading and writing the data.
I’ve implemented a binary version of these classes and have several examples of how to use the classes FileInputFormat, FileOutputFormat, RecordReader and RecordWriter.

Click here to view my products using Hadoop with binary data readers.

My layman’s explanation of Map/Reduce:

Map/Reduce is a 2 step process.

  1. Performing a map function on a key,value pair
  2. Performing a reduce function on a collection of key,value pairs.

Each piece of data is referenced by a key => value pair:
Map( key, value ) => Array( (key1, value1), (key2,value2) )
Reduce( ((key1,value1), (key1,value2)) ) => ( key1, value3 )

Typically you would run the Map and Reduce functions on a distributed network across many CPUs.
So you could have many map functions running concurrently, and once all the data has been mapped it gets reduced. Again, the reduce step runs on a distributed network across many CPUs.
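
As a rough illustration of the two steps, here is a minimal single-process sketch in Python that counts words. The names map_fn, reduce_fn and map_reduce are made up for this example and are not Hadoop APIs. Hadoop would run the same two steps spread across many machines instead of in one loop.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_fn(key: str, value: str) -> List[Tuple[str, int]]:
    """Map one (filename, text) pair to a list of (word, 1) pairs."""
    return [(word.lower(), 1) for word in value.split()]


def reduce_fn(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce all the values collected for one key to a single pair."""
    return key, sum(values)


def map_reduce(records: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Run the map step, group the results by key, then run the reduce step."""
    grouped = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            grouped[out_key].append(out_value)
    return dict(reduce_fn(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input: two tiny "files" keyed by name.
    print(map_reduce([("a.txt", "the cat"), ("b.txt", "the dog")]))
    # prints {'the': 2, 'cat': 1, 'dog': 1}
```

The grouping by key in the middle is the part the framework does for you between the map and the reduce.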

Additional reading:
Get familiar with the map/reduce concept
Read more about Hadoop implementation of map reduce.

MySQL Insert

June 14, 2009 - 4:15 pm

Ok, this is a simple post on something that really is annoying about MySQL and Navicat 8.

Basically, any time you execute a long series of single-row INSERT statements, it’s slow as heck on MySQL.
Very very very very slow.
I honestly think there is a bug / very poor implementation issue that should be fixed.

For instance I was inserting about 22,000 rows using an SQL file produced by a tool called Navicat 8.
It took about 5 minutes to do that. That’s really not an acceptable amount of time when you have users like me waiting on that data being inserted.

So in light of this, after a bit of googling, I found an actual page on the MySQL site covering the speed of INSERT statements.

Specifically, it mentions that you can insert more than one row per INSERT statement.
OK? So let’s try that out:

BOOM 22,000 records inserted in less than one SECOND.

Now that’s what I’m talking about. I’ve taken 22,000 inserts and made them take about the same amount of time as 1 single insert. It’s incredible.

Here’s an actual example of multi-row inserts in a single INSERT statement.
LOCK TABLES table_a WRITE, table_b WRITE;  -- makes the inserts even faster
INSERT INTO table_a VALUES (1,23),(2,34),(4,33); -- insert 3 rows into table_a
INSERT INTO table_b VALUES (8,26),(6,29); -- insert 2 rows into table_b
...
UNLOCK TABLES;
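
If the rows come from application code rather than an SQL file, the same idea can be sketched in Python. This is only an illustration under a few assumptions: it uses the standard library's sqlite3 with an in-memory database as a stand-in for MySQL (SQLite accepts the same multi-row VALUES syntax), the helper names batched_inserts and load_rows are made up, and the batch size of 100 is an arbitrary choice. It shows how the statements get batched, not how fast MySQL runs them.

```python
import sqlite3
from typing import Iterator, List, Sequence, Tuple


def batched_inserts(table: str, rows: Sequence[Sequence],
                    batch_size: int = 100) -> Iterator[Tuple[str, List]]:
    """Yield (sql, params) pairs, each inserting up to batch_size rows."""
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        # One "(?,?)" group per row, with one placeholder per column.
        one_row = "(" + ",".join("?" * len(chunk[0])) + ")"
        sql = f"INSERT INTO {table} VALUES " + ",".join([one_row] * len(chunk))
        params = [value for row in chunk for value in row]
        yield sql, params


def load_rows(conn: sqlite3.Connection, table: str, rows: Sequence[Sequence],
              batch_size: int = 100) -> int:
    """Insert all rows in one transaction and return the statement count."""
    statements = 0
    with conn:  # commits on success, rolls back on error
        for sql, params in batched_inserts(table, rows, batch_size):
            conn.execute(sql, params)
            statements += 1
    return statements


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_a (id INTEGER, score INTEGER)")
    rows = [(i, i * 2) for i in range(22000)]
    count = load_rows(conn, "table_a", rows)
    total = conn.execute("SELECT COUNT(*) FROM table_a").fetchone()[0]
    print(f"{total} rows in {count} statements")
```

The table name is pasted straight into the SQL text, so it should only ever come from your own code, never from user input.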

I’m not sure who to blame more here, MySQL 5 or Navicat 8.

For MySQL 5 there shouldn’t be such a big difference in speed between many single-row inserts and one insert with many rows at once.

Navicat 8 has a poorly implemented Export Wizard for SQL files. The alternative to the Navicat 8 export is mysqldump --user=username --password database table, which writes multi-row INSERT statements by default and is what most people use when their GUI tool breaks down like this.

Originally I was going to blame PHP for my problems, but after writing and rewriting a function to load the SQL files from disk I found that the speed issues were really MySQL’s problem, worsened by the SQL queries produced by Navicat 8.

Here’s a simple function I wrote for reading a SQL file from disk and installing it to whatever database you specify in the settings.

	/**
	 * Simple function for installing SQL files.
	 * http://dev.mysql.com/doc/refman/5.0/en/insert.html
	 *
	 * **WARNING**
	 * A single insert with many values is far faster
	 * than many single-row inserts due to the way MySQL is coded.
	 * Use text search and replace to merge multiple inserts into a single insert.
	 * **WARNING**
	 *
	 * @param string $filename An SQL file to read in.
	 */
	function fast_install( $filename )
	{
		$mysqli = new mysqli( DB_SERVER, DB_USERNAME, DB_PASSWORD, DATABASE );
		if ( $mysqli->connect_error ) {
			die( $mysqli->connect_error );
		}

		$content = file_get_contents( $filename );
		if ( $content === false ) {
			die( "Unable to read $filename" );
		}

		// Set a 16 meg limit on the query. This needs the SUPER privilege and
		// only applies to connections opened after it has been set.
		$mysqli->query( "SET GLOBAL max_allowed_packet=16*1024*1024" ) or die( $mysqli->error );

		$mysqli->multi_query( $content ) or die( $mysqli->error );

		// Step through every result so all statements run and errors surface.
		while ( $mysqli->more_results() ) {
			$mysqli->next_result() or die( $mysqli->error );
		}

		$mysqli->close();
	}
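
The search-and-replace step mentioned in the comment above can also be scripted. Here is a minimal sketch in Python, under some loud assumptions: the SQL file has one complete INSERT statement per line, each ending in ");", and no value contains the text " VALUES (". The function name merge_inserts and the max_rows cap are made up for this example, and it is not a real SQL parser.

```python
import re
from typing import Iterable, List

# Group 1 is everything up to VALUES, group 2 is the row list "(...)".
INSERT_RE = re.compile(r"^(INSERT INTO .+? VALUES)\s*(\(.*\));\s*$", re.IGNORECASE)


def merge_inserts(lines: Iterable[str], max_rows: int = 1000) -> List[str]:
    """Merge consecutive single-row INSERTs for the same table into one."""
    out: List[str] = []
    prefix = None
    rows: List[str] = []

    def flush() -> None:
        """Write out the pending multi-row INSERT, if there is one."""
        nonlocal prefix, rows
        if rows:
            out.append(f"{prefix} {','.join(rows)};")
        prefix, rows = None, []

    for line in lines:
        match = INSERT_RE.match(line.strip())
        if match is None:
            # Anything that is not an INSERT is passed through unchanged.
            flush()
            if line.strip():
                out.append(line.rstrip("\n"))
            continue
        if match.group(1) != prefix or len(rows) >= max_rows:
            flush()
            prefix = match.group(1)
        rows.append(match.group(2))
    flush()
    return out


if __name__ == "__main__":
    dump = [
        "INSERT INTO table_a VALUES (1,23);",
        "INSERT INTO table_a VALUES (2,34);",
        "INSERT INTO table_b VALUES (8,26);",
    ]
    for statement in merge_inserts(dump):
        print(statement)
    # prints:
    # INSERT INTO table_a VALUES (1,23),(2,34);
    # INSERT INTO table_b VALUES (8,26);
```

The max_rows cap keeps any one statement from growing past the max_allowed_packet limit that the PHP function sets.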

Setup MySQL 5.1 on CentOS 5.3

April 22, 2009 - 4:05 pm

Yes, I’ve done it. Finally, with much pain and suffering, I’ve set up MySQL on a CentOS 5.3 slicehost.com box. It took me about a day and a half to set everything up: Apache, MySQL 5.1, PHP, and a few other pieces of software. The main problem is incompatibility between MySQL 5.1 and PHP.

Currently there are two blog posts out there with 2 different methods of setting up MySQL 5.1 and the rest of the LAMP software.

http://gravityfx.org/2009/04/08/how-to-install-php-529-andmysql-5133-on-centos-53-final/

http://consciencespeaks.blogspot.com/2009/01/setting-up-lamp-server.html

I was able to get the gravityfx.org version working by using the remi repository. The only change I had to make was:


yum --enablerepo=remi install mysql-server

leaving out the mysql package, since it pulled in packages from the CentOS base repository that interfered with remi’s packages.

Either way, it’s not the default setup, and the PHP it installed was missing some important libraries for tools such as phpMyAdmin.

The main problem with all of these installations is that PHP is not compatible with the MySQL 5.1 libraries. It works fine with libmysqlclient 15 but not with libmysqlclient 16, which is bundled with MySQL 5.1.

I would recommend that anyone installing software on Linux just stick with the defaults, as they are what works.

Hadoop: Buggy on Windows.

April 16, 2009 - 10:39 am

So yeah, running Hadoop on a Windows box has a few bugs / issues that will definitely frustrate you as a developer.

If you suddenly start getting strange errors from Hadoop and can’t fix them by restarting Hadoop with stop-all.sh/start-all.sh, the next best thing is just to clean that sucker up.

How do you clean up the hadoop distributed filesystem?

1. bin/stop-all.sh
2. delete the temporary files/directories in c:\tmp
3. bin/hadoop namenode -format
4. bin/start-all.sh
5. bin/hadoop dfs -put local_dir dfs_dir

Of course that means you just formatted/removed all of the data you put into the DFS, but if all you’re doing is testing and development it shouldn’t be a big deal to put that data right back.
Here’s an example of an error that’s nearly unfixable and will just pop up and ruin your day for no reason:

09/04/13 14:48:35 INFO dfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/Adminestrator/input/better.PNG could only be replicated to 0 nodes, instead of 1

Bummer.

CURL for C++

March 25, 2009 - 7:16 pm

OK, I’ve been having a few issues setting up CURL with Visual Studio 2008. I may have made some mistakes, so I’m basically writing this down here to keep track of what I’ve done so far.

1. C/C++ tab / Command Line / Additional Options:
-DCURL_STATICLIB

2. Linker, General, Additional Library Directories: Location of libcurl.lib

3. Linker, Input, Additional Dependencies: libcurl.lib Winmm.lib
Winmm.lib provides a timer function that one of libcurl’s functions needs.

That’s all so far for getting it set up with Visual Studio 2008.
