Tags: excel
A quick look at using dates in Mysql
By admin on Jul 24, 2009 | In Daily Commute - Standard hints/tips | Send feedback »
The manipulation of dates in sql is often involved, and Mysql is no exception. I'm going to go over some of the more common requirements when playing with dates in Mysql, but advise that a lot more detail is available in the manual (link :-
date_time_functions_link )
The two main data types you'll probably deploy in Mysql (although there are others) are DATE and DATETIME. If you need to record time of day as well as the date then you'll probably be using DATETIME. If you then want to select records for a certain day accessing a DATETIME field then the function DATE can be very useful:-
Code:
mysql> select dateposted from entries where date(dateposted) = '2008-04-14'; |
Remember always that when inserting (or selecting from) Mysql DATE fields that the default format is 'YYYY-MM-DD'; when importing from a csv file created by Excel remember to first change the format accordingly on the initial Excel field. Another point (which upsets sql purists) is that if the date value being inserted into a DATETIME field is invalid, often the date will be set to '0000-00-00', rather than the insert being rejected. Here's what the mysql manual used to say on the subject:
The MySQL server only performs basic checking on the validity of a date: days 00-31, months 00-12, years 1000-9999. Any date not within this range will revert to 0000-00-00. Please note that this still allows you to store invalid dates such as 2002-04-31. It allows web applications to store data from a form without further checking. To ensure a date is valid, perform a check in your application.
To show the year part of the date use YEAR(
select curdate() will return the current date in yyyy-mm-dd format whereas
select now() will return the current date in a longer datetime format.
For various possibilities in formatting the date output, you can use DATE_FORMAT, for example:
Code:
mysql> select DATE_FORMAT('2009-10-04 22:23:00', '%W %M %Y'); | |
+------------------------------------------------+ | |
| DATE_FORMAT('2009-10-04 22:23:00', '%W %M %Y') | | |
+------------------------------------------------+ | |
| Sunday October 2009 | | |
+------------------------------------------------+ |
Other options include %m (month numeric), %b (abbreviated month name), %e (day of the month numeric), %D (day of month with suffix 1st, 2nd etc). Here's another sql example of DATE_FORMAT:
Code:
mysql> select DATE_FORMAT('2009-10-04 22:23:00', '%e %b %y'); | |
+------------------------------------------------+ | |
| DATE_FORMAT('2009-10-04 22:23:00', '%e %b %y') | | |
+------------------------------------------------+ | |
| 4 Oct 09 | | |
+------------------------------------------------+ |
The DATE_FORMAT function can be very useful for finding the first day of a month eg:
Code:
SELECT DATE_FORMAT('2004-01-20' ,'%Y-%m-01'); |
Once you have this, you can find out the last day of the month, first day of following month etc by using DATE_ADD and DATE_SUB (see below).
To find out the date 50 days from now, use the ADDDATE function, normally using format ADDDATE( date , Number_of_days ), as follows:
Code:
mysql> select curdate(), adddate(curdate(), 50); | |
+------------+------------------------+ | |
| curdate() | adddate(curdate(), 50) | | |
+------------+------------------------+ | |
| 2009-07-21 | 2009-09-09 | | |
+------------+------------------------+ |
You can use PERIOD_DIFF (p1, p2) to find the number of months between 2 dates p1 and p2, so long as p1 and p2 are in format YYMM or YYYYMM.
Code:
mysql> select period_diff(200907,200901); | |
+----------------------------+ | |
| period_diff(200907,200901) | | |
+----------------------------+ | |
| 6 | | |
+----------------------------+ |
select period_diff(date_format(now(), '%Y%m'), date_format(hiredate, '%Y%m')) as months from emp;
It has to be said however, that the normal method of date arithmetics in Mysql is done with the functions date_sub and date_add.
Code:
mysql> select date_sub('2005-01-01 00:05:00', INTERVAL 1 DAY); | |
+-------------------------------------------------+ | |
| date_sub('2005-01-01 00:05:00', INTERVAL 1 DAY) | | |
+-------------------------------------------------+ | |
| 2004-12-31 00:05:00 | | |
+-------------------------------------------------+ |
Code:
mysql> select DATE_ADD('2010-12-31 23:58:00', INTERVAL 1 DAY); | |
+-------------------------------------------------+ | |
| DATE_ADD('2010-12-31 23:58:00', INTERVAL 1 DAY) | | |
+-------------------------------------------------+ | |
| 2011-01-01 23:58:00 | | |
+-------------------------------------------------+ |
Date Issues importing from an Excel csv file
By admin on Jun 10, 2009 | In Daily Commute - Standard hints/tips | Send feedback »
I wanted to import the following data from a spreadsheet into my database table:
Code:
orderid customerid orderdate city | |
800 15 15/02/09 London | |
801 22 22/02/09 Glasgow | |
802 15 20/03/09 London | |
803 24 21/03/09 Aberdeen | |
804 76 13/04/09 Grimsby |
After saving the spreadsheet as csv (comma-delimited) my first attempt was using the LOAD DATA syntax as follows:
Code:
LOAD DATA INFILE 'C:\\ordertest.csv' INTO TABLE testorders | |
LINES terminated BY '\n'; |
As I’d read that this seldom worked, I was not surprised to find problems in the data inserted (although it did at least make some attempt at populating the table from the file).
So, I decided to use the Import facility in phpMyAdmin, after first selecting the necessary table. A useful option here was to specify the number of records to skip from start of file – this allowed me to skip the header record (field names) that I didn’t want to import. However the default options supplied slipped me up. The “Fields Terminated By” box has a default of “;” when we need “,”. Anyway, after correcting this by entering the comma, I was able to import the records. Only one hitch – the date columns all showed 00/00/00 – this is because MySQL uses a different date fomat to Microsoft Excel.
The solution to this? I went back into the original Excel sheet and changed the format of the date field to yyyy-mm-dd before saving as a csv file (comma delimited) and going through the process again. In the end I was able to import the data as I wanted.
Other workarounds could have included using OpenOffice to create the spreadsheet and csv file, using ODBC queries on Mysql, using Navicat, or using a php program to read and import the data. I hope to look at these options in future articles.
Incidentally, in the past when transferring data from spreadsheets into databases I’ve used the following technique. I use the Excel concatenate function to build up a sql Insert statement from the data on one line, and then copy the function down. For instance, to insert the first two fields only I would do the following in Excel and then copy the sql statements across.
Code:
=CONCATENATE("INSERT INTO testorders (orderid, customerid) VALUES (",A2,",",B2,")") |
This results in the following:
Code:
INSERT INTO testorders (orderid, customerid) VALUES (840,25) |
Remember though to be careful regarding different data types; for example you’ll have to include the apostrophe marks in the concatenation when the field is varchar.
Finding duplicates
By admin on Mar 2, 2009 | In Daily Commute - Standard hints/tips | Send feedback »
One very common requirement when working with data, is finding duplicate entries. The sql language is very powerful in doing this, yet many people prefer to use Excel to find duplicates. My opinion is this is fine for relatively small data-sets (less than 10,000 rows). You can use conditional formatting (for example using =COUNTIF(a:A,a2) > 1) or you can use the Filter/Advanced menu options to filter and copy to another location (you take out the duplicates by checking Unique records only). More modern versions of Excel now have a special command in the Data Menu: Data>Remove duplicates.
In Mysql (and indeed other versions of sql) the most common way of finding duplicate field entries is by using the GROUP BY clause. For example, to bring back job entries from the emp table that have more than 1 employee recorded against that particular job:
Code:
select job, count(*) | |
-> from emp | |
-> group by job | |
-> having count(*) > 1; |
If you wanted to bring back jobs recorded more than 3 times in the table you merely change the having clause:
Code:
select job, count(*) | |
-> from emp | |
-> group by job | |
-> having count(*) > 3; |
Incidentally, if you merely wanted to bring back distinct field(s) in a table, you can use GROUP BY without performing a count.
Now this is all very well, but supposing you have a requirement to bring back all fields in a record where the id field occurs as a duplicate. This can be achieved by using the IN clause in a subquery:
Code:
mysql> select a.* from emp a | |
-> where a.job in | |
-> (select b.job | |
-> from emp b | |
-> group by b.job having count(*) >1); |
All very well so far, but let's introduce a second table. Suppose we want to select records that occur twice within the two tables (and it's possible that the records appear within the same table)?
We can do this by using the UNION operator to bring back all records in both tables (remember to use UNION ALL, otherwise duplicates would be suppressed). We put this in an inline view, and then using GROUP BY to select only duplicates.
select empno, ename, job
from
(select empno, ename, job
from emp
UNION ALL
select empno, ename, job
from emp_temp) as empt
group by empno, ename, job having count(*) > 1;
I'm not sure how you would approach that in Excel! Easy enough in Mysql though!
Have fun!