MySQL Backups and a lot more

A couple of years back, I asked several people, through groups and posts, about the possibility of backing up a MySQL database programmatically from client-side applications that could connect to remote servers running MySQL. There were a lot of suggestions:

* a PHP script on the server to take the backup server-side
* application code to select the data and do a round trip
* mysqldump from the client machine with the server parameters

And finally, from all those valuable suggestions from all the helpful people, I derived my own hybrid method, which I would summarize in a few words. It is a method I have adopted for web applications in PHP as well as for custom single-machine applications in Visual Basic. Why it can be termed hybrid becomes clear once you go into the structure; though it may seem a bit complicated, it serves the purpose and is worth the headache.

Continue reading “MySQL Backups and a lot more”

Optimizing Indexes

Designing a database as part of a big project needs a lot of experience. There are several factors which should be considered before finalizing a database schema. Once the project rolls out, there is no going back: other than adding a new index or another field, any major change will affect the whole code base.

Indexing is the most important method you should try first for speeding up queries. Other techniques are also available, but generally the one thing that makes the most difference is the proper use of indexes. On the MySQL mailing list, people often ask for help in making a query run faster. In a surprisingly large number of cases, there are no indexes on the tables in question, and adding indexes often solves the problem immediately. It doesn’t always work like that, because optimization isn’t always simple. Nevertheless, if you don’t use indexes, in many cases you’re just wasting your time trying to improve performance by other means. Use indexing first to get the biggest performance boost and then see what other techniques might be helpful.
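
As a quick sketch of that pattern (the customers table and its column are hypothetical, purely for illustration): EXPLAIN shows a full table scan, an index is added, and the same query becomes an index lookup.

-- before: the query has no usable index and scans the whole (hypothetical) table
explain select * from customers where last_name = 'Smith';
-- add an index on the column used in the where clause
alter table customers add index idx_last_name (last_name);
-- after: the same explain should now show idx_last_name being used
explain select * from customers where last_name = 'Smith';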

One should take absolute care to optimize the fields, keeping in mind that the larger the index, the slower the operations. The most common example of a mistake is adding a timestamp field and using it to select data in chronological order. If the total number of rows is just a handful, this won't be a problem at all, but if there are millions of rows, the timestamp field may have a cardinality of, say, 50% of the total rows, making the index very large and operations slow. If, instead of the timestamp, the data is stored in two fields, with the date in one and the time in another, the cardinality of the time column stays within 86400 (the number of seconds in a day) and the cardinality of the date column also stays under control, making operations faster by an appreciable amount. For varchar fields, creating partial (prefix) indexes instead of a full index helps reduce the size of the index file and hence gives the server breathing space.
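
A rough sketch of that layout (the access_log table and its columns are invented here just to illustrate the point, not taken from any real schema):

-- split date/time columns keep each index small and bounded
create table access_log (
    id int unsigned not null auto_increment,
    log_date date not null,
    log_time time not null,            -- at most 86400 distinct values
    page varchar(255) not null,
    primary key (id),
    key idx_log_date (log_date),
    key idx_page (page(20))            -- partial (prefix) index on the varchar
) engine=InnoDB;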

Calculations? Try and use SQL

Hmm.. yes.. I literally meant it. For several things you can use SQL calculations directly; though the queries and functions shown here are tested only with MySQL, they should work with most other SQL servers too. The catch is that you should have the SQL server on the same machine where your webserver and PHP interpreter are running. In simple terms, if you connect to MySQL on www.jijutm.com, SQL can be used to a great extent to do calculations.

Some time back I read on another site where two dates were calculated for the current month, using PHP's mktime and date functions, to be passed into an SQL query for a range selection. I am not up for an argument, but it seems more efficient to do the date calculations directly in the SQL query than to pre-calculate them, unless they need to be printed in the output. Even then, the calculation can be done by SQL and the result returned.
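
A minimal sketch of that idea, assuming a hypothetical orders table with an order_date column:

-- rows for the current month, with the boundaries computed by mysql itself
select id, amount, order_date
  from orders
 where order_date >= date_format(curdate(), '%Y-%m-01')
   and order_date <  date_format(curdate() + interval 1 month, '%Y-%m-01');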

SQL calculations, along with SQL session variables, will help a programmer achieve a very high level of optimization for the scripts, thereby reducing webserver and interpreter load.

Date calculations can use date_format, date_sub, unix_timestamp, from_unixtime and other related functions. Aggregate functions like sum and avg, along with numeric functions such as abs, complement these. By combining them, almost any calculation can be achieved with a single SQL query.
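
For instance, a per-day total for the last week could be pushed entirely into MySQL (again assuming the hypothetical orders table from above):

-- daily totals for the last 7 days, computed entirely inside mysql
select date_format(order_date, '%Y-%m-%d') as day,
       sum(amount) as total
  from orders
 where order_date >= date_sub(curdate(), interval 7 day)
 group by day
 order by day;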

Use variables in SQL

What was that?

Yes.. I mean use variables in SQL. This can be quite interesting. There may be a lot of people who will comment that this kills readability, or that it does not bring even a ray of benefit; well, I am not up for an argument, and I have tested the method only with MySQL. In several situations it has proven to be much faster, and recently we have started sending the whole batch as a single operation, letting MySQL handle the query splitting.

Suppose we need to insert first into a login table, to generate the user_id, which is auto-increment, and then into a set of related tables. Normally we would insert into the first table, then use mysql_insert_id() to get the last inserted id, and use it to process the rest of the tables. Now think what happens if we used:


-- parent row; the auto_increment id is generated here
insert into [table1] (fieldnames) values (values);
-- capture the generated id in a session variable
select last_insert_id() into @ii;
-- reuse the captured id as the foreign key in the child tables
insert into [table2] (fkId,fieldnames) values (@ii,values);
insert into [table3] (fkId,fieldnames) values (@ii,values);
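
Filled in with a hypothetical login/profile/settings trio of tables (names and columns invented purely for illustration), the whole batch could be sent as one multi-statement call:

insert into login (username, password_hash) values ('jdoe', sha1('secret'));
select last_insert_id() into @ii;          -- @ii now holds the new user_id
insert into profile (user_id, full_name) values (@ii, 'John Doe');
insert into settings (user_id, theme) values (@ii, 'default');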

There are several other uses for selecting into a variable, but the limitation is that the query must return no more than a single row, with one variable for each selected field.
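
For example (again with the hypothetical login table), a couple of scalars can be pulled into variables in one statement, as long as only one row comes back:

-- an aggregate query returns exactly one row, so it can feed two variables
select count(*), max(user_id) into @total, @latest from login;
select @total, @latest;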

Any comments?