phpbb MySQL queries; need optimizations

This one really hit me, but none would be much worried about these small things.

explain SELECT ban_ip, ban_userid, ban_email  FROM phpbb_banlist WHERE ban_ip IN ('5bcd7c03', '5bcd7cff', '5bcdffff', '5bffffff') OR ban_userid = -1;
+----+-------------+---------------+------+----------------+------+---------+------+------+-------------+
| id | select_type | table         | type | possible_keys  | key  | key_len | ref  | rows | Extra       |
+----+-------------+---------------+------+----------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | phpbb_banlist | ALL  | ban_ip_user_id | NULL | NULL    | NULL |  257 | Using where | 
+----+-------------+---------------+------+----------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> show status like '%cost%';
+-----------------+-----------+
| Variable_name   | Value     |
+-----------------+-----------+
| Last_query_cost | 54.998609 | 
+-----------------+-----------+
1 row in set (0.02 sec)

What was that, no indexes? well we should take a look, for forum like phpbb having so many users and loyal supporters, it should not be the case. Lets see the structure.

show create table phpbb_banlist;
CREATE TABLE `phpbb_banlist` (
  `ban_id` mediumint(8) unsigned NOT NULL auto_increment,
  `ban_userid` mediumint(8) NOT NULL default '0',
  `ban_ip` varchar(8) NOT NULL default '',
  `ban_email` varchar(255) default NULL,
  PRIMARY KEY  (`ban_id`),
  KEY `ban_ip_user_id` (`ban_ip`,`ban_userid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Continue reading “phpbb MySQL queries; need optimizations”

Tunning Mysql; Rephrase queries Continued

Hey, as of late, I had the opportunity to check the optimization for a heavily loaded phpbb system. It also shows lac of sql optimization in the code.

SELECT DISTINCT t.topic_id, t.topic_title, t.topic_poster, t.topic_first_post_id, t.forum_id, t.topic_replies, u.user_id, u.username, s.post_id, s.post_text
FROM phpbb_posts p, phpbb_topics t, phpbb_users u, phpbb_posts_text s
WHERE p.topic_id = t.topic_id AND t.topic_poster = u.user_id AND t.topic_first_post_id = s.post_id AND t.forum_id != 2 AND u.user_id != -1
ORDER BY p.post_time desc LIMIT 7;

This is one thing that is tough, and cannot be optimized without modifying the core code, and my exposure with the said script (phpbb) is too low that digging into the system will take many hours. I would recommend to design databases with the end user experience of much priority. For such things like `latest posts`, `latest news`, `new users` and other such things, keep a single table with almost the following structure

Continue reading “Tunning Mysql; Rephrase queries Continued”

Tunning Mysql; Rephrase queries

The following is an excerpt from mysql slow log on a server which I was assigned to do MySQL tunning and optimizing. The following log was after the proper indexes were applied. Each query was taking about .9 seconds (query time is 0, because slow-query-time in mysql settings was 1, and that means logs will have an accuracy of seconds to the integer only)

# User@Host: articlec_bioeric[articlec_bioeric] @ www.jijutm.com []
# Query_time: 0  Lock_time: 0  Rows_sent: 1  Rows_examined: 3
select a.*,p.pen_name from ahm_articles a,ahm_penname p where publish<>0 and publish<>2 and publish_date<='2009-01-24' and nick='1073534' or a.id='1073534' and a.author=p.id order by date desc limit 0,30;
# User@Host: articlec_bioeric[articlec_bioeric] @ www.jijutm.com []
# Query_time: 0  Lock_time: 0  Rows_sent: 1  Rows_examined: 3
select a.*,p.pen_name from ahm_articles a,ahm_penname p where publish<>0 and publish<>2 and publish_date<='2009-01-24' and nick='1073534' or a.id='1073534' and a.author=p.id order by date desc limit 0,30;

Continue reading “Tunning Mysql; Rephrase queries”

Consider Query Cache performance when designing tables

I have seen many web applications with poor database design. As of recent to increase performance, every body is utilizing some sort of caching. The primary one being query cache inside database server itself. Most of the developers are not aware of this, and do not make use of this. For example lets consider an article site or a blog application. The final element article or blog will need a view count, and adding a view or hit count to the same table is what 99% are doing. This is a field which should be separated from the semi static content. Keeping the count separate and never using this count table in joins will help cache the query. Also when the table is updated, index updates will be faster.

For showing top ten etc, it would be better to use a trigger to aggregate the top ten into another table. Also for aggregate pages like front portion etc, even better would be to create a single table which will hold information from different parts of the system which will be updated using triggers, and configuration directives in db. Then the page building code will need to just read out the aggregated data and merge it into the template to output the page.

gzip nusoap requests

Recently for a project involving travelport xml api integration, we badly needed the soap requests to be gzipped, since the technical support people suggested. Also we were aware of the benefits of using gzEncoded data when transmitting through the Internet. For the same we checked the wid internet searching with all sort of combinations of gzip nusoap request. Speaking from the inner view, we already had developed a handful of classes to abstract the SubmitXML api provided by Galelio Travelport. And frankly were reluctant to trash all the code already written.
At this point, we swam through the whole code of nusoap.php; should say thanks to easyeclipse, and made some minor tweaks here and there. The following was all that was required, though I doubt if this would be a generic solution, this definitely serves our purpose and the versions we used are 0.7.3/Revision: 1.114.

in class soap_transport_http added property gzipRequests (line 2148)
var $gzipRequests = true; // gzip any requests..

in function buildPayload at the top (line 2799)
if($this->gzipRequests){
$data = gzencode($data);
$this->setHeader("Content-Encoding",'gzip');
}

in function sendRequest (about line 2874), just in case the request is using curl
if($this->gzipRequests){
$data = gzencode($data);
}

Responsible use of Indexes

At several stages, I have been asked to benchmark web applications. I always tend to do the light weight stuff at the start, and when it gets into my mind, that the system which is being tested is working some what like it is expected, I steer my course and go for bigger feats. For instance the RatingHQ, was run with a test of hundred thousand registered users and about the same amount of items in each of the sections before releasing for public beta.

Having said the above, the most important matter is that seldom do the programmer check his / her code more than the unit or functional tests. And code thus created would crumple when subjected to higher loads. Several times I have seen code written which simply works, but will break, and take the server that hosts it too down to it knees.
Continue reading “Responsible use of Indexes”

Reduce Bandwidth usage in PHP

While searching on the net with the title as search terms, I landed on a page by Anthony Williams, which provoked me to probe more into optimizing all our javascripts too. Since our latest and greatest wordpress implementation at keralaonline, was getting 10K hits per day. I thought to try it out.
By combining the ideas of Anthony Williams and jsmin-php, I wrote a piece, and with the help of a RewriteRule, the famous mod_rewrite module of apache, I could optimize my page fetches a lot. It may take quite a few days before I can come out with any sort of bench marks. But still I am sure there is quite a change.
Continue reading “Reduce Bandwidth usage in PHP”

Get Post Image for WordPress

Get Post Image is a plugin for WordPress 2.0 and higher that allows you to retrieve images contained in posts and display them in a custom manner.

Though the plugin helps you do all that what is said, it had a small bug (or was it intentional), will need to ask Andrew. The problem was identified only after we implemented it in the Asianet Portal – Kerala Online. With about 10K hits a day, the whole process was loading our web server which was a Dual Xeon with 4G RAM. I found it really baffling, and went on a wild hunt, and identified a lot of about 15 to 20 internal requests per external request of a page.

The hunt finally came to the place where the problem was on line no 252,
$imagesize = @getimagesize($img_url);,
which was changed to
$imagesize = @getimagesize($img_path);.

And the server started to breathe again.

PHP; Towards a 5.3 release

Straight from the php internals discussion:

Items on the list as the key features of this release

1) namespaces
Here we need to make sure that the current state is now in a coherent state. I think Derick still has some issues with the recent change by Greg, but even his criticism did not sound all to loud. So I think we are in a good state here?

2) late static binding
Etienne had some questions recently, which were met by criticism by Stas. However all others agreed with the change. So I guess we are solid here too?

3) re2c
Rui recently came to the list with notes on the ZE MB feature.

4) windows support
Ever since Edin disappeared it has become clear that we have a bus factor issue with windows support. The windows team is working to rectify this situation. We need to make sure that the infrastructure to deliver windows binaries for PHP 5.3 as well as PECL extension is in place before we can release 5.3.

5) BC issues
Well this is a bit of an “anti-feature” in the sense its not a task anyone is dedicated to. The point is that we need to make sure that we understand any BC issues we currently have, so that we can either correct them or document them.

These are the 5 areas we the RMs would hope that people focus on.

Thats actually 3 focus areas too many for my taste, but goes to show that we might want to release more often (given that the list for 5.4 is already cramming up).

On top of this we also have a few other changes that are of quite some importance, but that to me will not stop a release if they do not make it (for the extensions we feel that they will be available via PECL for those who really need them now in the worst case). But these are big features non the less that could warrant a new minor release on their own alone if it would be for even bigger stuff:

1) intl extension
Last discussion ended without a decision on the class naming. I specifically remember Derick taking issue with intl ext “invading” the date ext namespace. Stas however arguing that the intl ext is supposed to bring some forwards compatibility to PHP 6 and therefore naturally will need to span the namespaces of other extensions, that are planned to be expanded for PHP 6.

2) phar extension
I guess we are pretty solid here?

3) E_DEPRECATED
Here we just need to make sure that we actually mark only the things as deprecated that we actually want to deprecate. This ties in a bit with the BC issues point above.

4) __callStatic

5) Garbage Collection

So is anything missing? Please everybody take time to review the todo list, make sure that all items are on the list, make sure that the information is as up to date as possible. Finally anyone who’s name is on this list (or who will add himself) should get in contact with Johannes and myself within 1 week (thats July 9th) to explain the state of the todo item and when he can finish the item and what the general impact the feature has on the release. Also please bring up any issues, especially for the above 6 points, to the list and try to focus on solving the issues in a timely manner. We RMs will try to moderate as much as possible, but understand that at some point we will have to have to go with one approach (or in the worst case we might have to push a feature to 5.4).

After July 9th, we will then publish a tentative release plan within 3 days afterwards. The tentative schedule will probably try to move us quickly towards a feature freeze together with a first alpha. Depending on our discoveries we will schedule beta and RC releases (obviously subject to continued review).

protoype.js: Deep Category Select; Ajax

I was experimenting with the prototype.js library, thanks to all who have contributed towards this, and the wonderful documentations avaliable as download, as well as online references.

For a multilevel hierarchical selector of category, where the top levels should not be selectable, the existing ui elements were not enough to show off, with out incurring ambiguity. This led me to do the basic tests, and finalized the said widget. It does not support much now, though I may be working on a extension which will support multi-select.

See the Deep Category Select, in action where it is embedded into an example.

Category Selector by jiju-saturn