Category: Daily Commute - Standard hints/tips
Using Mysql to calculate average (mean, mode or median)
By admin on Sep 28, 2010 | In Daily Commute - Standard hints/tips, Showroom - Examples of Mysql usage | Send feedback »
To calculate the mean (or average) we can utilise the AVG function within Mysql.
Code:
select avg(sal) as mean_sal | |
from emp; |
mean_sal
========
2073.9286
Remember though that the function will not work on a NULL field. If you want to include NULL values (eg assume zero value) use COALESCE to default the NULL to zero.
Code:
select avg(coalesce(sal,0)) as mean_sal | |
from emp; |
We can now produce the mean figure per grouped column, for example by department
Code:
select deptno, avg(coalesce(sal,0)) as mean_sal | |
from emp | |
group by deptno; |
deptno mean_sal
====== ========
10 2916.6667
20 2175.0000
30 1568.3333
The mode is the value that occurs most frequently in a given set of data.
If I want to find out from the PRODUCT table, what is the mode of the FULLPRICE field for a certain productgroup (in this case 'CA'), I can use a subquery to find the mode.
The subquery will calculate the various counts for each Fullprice, while the main part of the query will choose the price that occurs the most.
Code:
SELECT FullPrice FROM product | |
WHERE productgroupcode = 'CA' | |
GROUP BY FullPrice | |
HAVING count(*) >= all (SELECT count(*) | |
FROM product | |
WHERE productgroupcode = 'CA' | |
GROUP BY FullPrice) |
FullPrice
=========
14
There is another common requirement in finding an average, which is called the median.
The median is the value of the middle item in a set of ordered elements.
To calculate the median we can use a self join (join the table to itself), and then use the HAVING clause to filter those rows where the number of
times (P.FULLPRICE = Q.FULLPRICE) is greater than or equal to the number of times (Q.FULLPRICE > P.FULLPRICE) :
Code:
SELECT avg(FullPrice) | |
FROM ( select q.FullPrice | |
FROM product q, product p | |
WHERE p.productgroupcode = q.productgroupcode | |
AND q.productgroupcode = 'CA' | |
GROUP BY q.FullPrice | |
HAVING sum(case when q.FullPrice = p.FullPrice then 1 else 0 end) | |
>= abs(sum(sign(q.FullPrice - p.FullPrice))) | |
) t |
avg(FullPrice)
=========
14.0000
We can show the individual workings with this code that moves the SUM calculations into the SELECT list:
Code:
select q.FullPrice, p.FullPrice, | |
sum(case when q.FullPrice = p.FullPrice then 1 else 0 end) as cnt1, | |
abs(sum(sign(q.FullPrice - p.FullPrice))) as cnt2 | |
FROM product q, product p | |
WHERE p.productgroupcode = q.productgroupcode | |
AND q.productgroupcode = 'CA' | |
group by q.FullPrice |
FullPrice --- cnt1 --- cnt2
10 --- 25 --- 130
11 --- 4 --- 38
12 --- 1 --- 16
14 --- 64 --- 56
15 --- 1 --- 2
20 --- 1 --- 4
24 --- 9 --- 24
26 --- 1 --- 12
29 --- 9 --- 48
39 --- 25 --- 120
630 --- 1 --- 30
Finally, here I will list the records from the PRODUCT table so you can see the whole set we worked upon.
Code:
SELECT PRODUCTID, FULLPRICE | |
FROM product | |
WHERE PRODUCTGROUPCODE = 'CA' | |
ORDER BY FULLPRICE |
PRODUCTID --- FULLPRICE
12175 ---- 10
12014 ---- 10
13297 ---- 10
10002 ---- 10
10003 ---- 10
10838 ---- 11
13109 ---- 11
13107 ---- 12
13299 ---- 14
12177 ---- 14
14030 ---- 14
13621 ---- 14
12016 ---- 14
11003 ---- 14
11057 ---- 14
11056 ---- 14
10001 ---- 15
12949 ---- 20
13360 ---- 24
13470 ---- 24
10955 ---- 24
13359 ---- 26
13298 ---- 29
12015 ---- 29
12176 ---- 29
13024 ---- 39
10863 ---- 39
13023 ---- 39
10862 ---- 39
10993 ---- 39
12141 ---- 630
Useful links
By admin on Feb 16, 2010 | In Daily Commute - Standard hints/tips, Formula 1 - More advanced stuff | Send feedback »
Sorry, I've been away for a while. Hope to write a useful post fairly soon. In the meantime, here are a couple of things I've found out today. To look for useful twitter posts regarding Mysql, search on #Mysql (I know, fairly basic, but I'm not yet a Twitter person).
Second I found mysql commands you may not know
It's in Spanish, and if I get some time I'll translate some of the entries. Very interesting!
A quick look at using dates in Mysql
By admin on Jul 24, 2009 | In Daily Commute - Standard hints/tips | Send feedback »
The manipulation of dates in sql is often involved, and Mysql is no exception. I'm going to go over some of the more common requirements when playing with dates in Mysql, but advise that a lot more detail is available in the manual (link :-
date_time_functions_link )
The two main data types you'll probably deploy in Mysql (although there are others) are DATE and DATETIME. If you need to record time of day as well as the date then you'll probably be using DATETIME. If you then want to select records for a certain day accessing a DATETIME field then the function DATE can be very useful:-
Code:
mysql> select dateposted from entries where date(dateposted) = '2008-04-14'; |
Remember always that when inserting (or selecting from) Mysql DATE fields that the default format is 'YYYY-MM-DD'; when importing from a csv file created by Excel remember to first change the format accordingly on the initial Excel field. Another point (which upsets sql purists) is that if the date value being inserted into a DATETIME field is invalid, often the date will be set to '0000-00-00', rather than the insert being rejected. Here's what the mysql manual used to say on the subject:
The MySQL server only performs basic checking on the validity of a date: days 00-31, months 00-12, years 1000-9999. Any date not within this range will revert to 0000-00-00. Please note that this still allows you to store invalid dates such as 2002-04-31. It allows web applications to store data from a form without further checking. To ensure a date is valid, perform a check in your application.
To show the year part of the date use YEAR(
select curdate() will return the current date in yyyy-mm-dd format whereas
select now() will return the current date in a longer datetime format.
For various possibilities in formatting the date output, you can use DATE_FORMAT, for example:
Code:
mysql> select DATE_FORMAT('2009-10-04 22:23:00', '%W %M %Y'); | |
+------------------------------------------------+ | |
| DATE_FORMAT('2009-10-04 22:23:00', '%W %M %Y') | | |
+------------------------------------------------+ | |
| Sunday October 2009 | | |
+------------------------------------------------+ |
Other options include %m (month numeric), %b (abbreviated month name), %e (day of the month numeric), %D (day of month with suffix 1st, 2nd etc). Here's another sql example of DATE_FORMAT:
Code:
mysql> select DATE_FORMAT('2009-10-04 22:23:00', '%e %b %y'); | |
+------------------------------------------------+ | |
| DATE_FORMAT('2009-10-04 22:23:00', '%e %b %y') | | |
+------------------------------------------------+ | |
| 4 Oct 09 | | |
+------------------------------------------------+ |
The DATE_FORMAT function can be very useful for finding the first day of a month eg:
Code:
SELECT DATE_FORMAT('2004-01-20' ,'%Y-%m-01'); |
Once you have this, you can find out the last day of the month, first day of following month etc by using DATE_ADD and DATE_SUB (see below).
To find out the date 50 days from now, use the ADDDATE function, normally using format ADDDATE( date , Number_of_days ), as follows:
Code:
mysql> select curdate(), adddate(curdate(), 50); | |
+------------+------------------------+ | |
| curdate() | adddate(curdate(), 50) | | |
+------------+------------------------+ | |
| 2009-07-21 | 2009-09-09 | | |
+------------+------------------------+ |
You can use PERIOD_DIFF (p1, p2) to find the number of months between 2 dates p1 and p2, so long as p1 and p2 are in format YYMM or YYYYMM.
Code:
mysql> select period_diff(200907,200901); | |
+----------------------------+ | |
| period_diff(200907,200901) | | |
+----------------------------+ | |
| 6 | | |
+----------------------------+ |
select period_diff(date_format(now(), '%Y%m'), date_format(hiredate, '%Y%m')) as months from emp;
It has to be said however, that the normal method of date arithmetics in Mysql is done with the functions date_sub and date_add.
Code:
mysql> select date_sub('2005-01-01 00:05:00', INTERVAL 1 DAY); | |
+-------------------------------------------------+ | |
| date_sub('2005-01-01 00:05:00', INTERVAL 1 DAY) | | |
+-------------------------------------------------+ | |
| 2004-12-31 00:05:00 | | |
+-------------------------------------------------+ |
Code:
mysql> select DATE_ADD('2010-12-31 23:58:00', INTERVAL 1 DAY); | |
+-------------------------------------------------+ | |
| DATE_ADD('2010-12-31 23:58:00', INTERVAL 1 DAY) | | |
+-------------------------------------------------+ | |
| 2011-01-01 23:58:00 | | |
+-------------------------------------------------+ |