Using Mysql to calculate average (mean, mode or median)

To calculate the mean (or average) we can utilise the AVG function within Mysql.

Code:

select avg(sal) as mean_sal
from emp;

mean_sal
========
2073.9286

Remember though that the function will not work on a NULL field. If you want to include NULL values (eg assume zero value) use COALESCE to default the NULL to zero.

Code:

select avg(coalesce(sal,0)) as mean_sal
from emp;

We can now produce the mean figure per grouped column, for example by department

Code:

select deptno, avg(coalesce(sal,0)) as mean_sal
from emp
group by deptno;

deptno mean_sal
====== ========
10 2916.6667
20 2175.0000
30 1568.3333

The mode is the value that occurs most frequently in a given set of data.

If I want to find out from the PRODUCT table, what is the mode of the FULLPRICE field for a certain productgroup (in this case 'CA'), I can use a subquery to find the mode.
The subquery will calculate the various counts for each Fullprice, while the main part of the query will choose the price that occurs the most.

Code:

SELECT FullPrice FROM product
WHERE productgroupcode = 'CA'
GROUP BY FullPrice
HAVING count(*) >= all (SELECT count(*)
                        FROM product
                        WHERE productgroupcode = 'CA'
                        GROUP BY FullPrice)

FullPrice
=========
14

There is another common requirement in finding an average, which is called the median.
The median is the value of the middle item in a set of ordered elements.
To calculate the median we can use a self join (join the table to itself), and then use the HAVING clause to filter those rows where the number of
times (P.FULLPRICE = Q.FULLPRICE) is greater than or equal to the number of times (Q.FULLPRICE > P.FULLPRICE) :

Code:

SELECT avg(FullPrice)
FROM ( select q.FullPrice
      FROM product q, product p
      WHERE p.productgroupcode = q.productgroupcode
      AND q.productgroupcode = 'CA'
      GROUP BY q.FullPrice
      HAVING sum(case when q.FullPrice = p.FullPrice then 1 else 0 end)
                     >= abs(sum(sign(q.FullPrice - p.FullPrice)))
      ) t

avg(FullPrice)
=========
14.0000

We can show the individual workings with this code that moves the SUM calculations into the SELECT list:

Code:

select q.FullPrice, p.FullPrice,
       sum(case when q.FullPrice = p.FullPrice then 1 else 0 end) as cnt1,
       abs(sum(sign(q.FullPrice - p.FullPrice))) as cnt2
      FROM product q, product p
      WHERE p.productgroupcode = q.productgroupcode
      AND q.productgroupcode = 'CA'
group by q.FullPrice

FullPrice --- cnt1 --- cnt2
10 --- 25 --- 130
11 --- 4 --- 38
12 --- 1 --- 16
14 --- 64 --- 56
15 --- 1 --- 2
20 --- 1 --- 4
24 --- 9 --- 24
26 --- 1 --- 12
29 --- 9 --- 48
39 --- 25 --- 120
630 --- 1 --- 30

Finally, here I will list the records from the PRODUCT table so you can see the whole set we worked upon.

Code:

SELECT PRODUCTID, FULLPRICE
FROM product
WHERE PRODUCTGROUPCODE = 'CA'
ORDER BY FULLPRICE

PRODUCTID --- FULLPRICE
12175 ---- 10
12014 ---- 10
13297 ---- 10
10002 ---- 10
10003 ---- 10
10838 ---- 11
13109 ---- 11
13107 ---- 12
13299 ---- 14
12177 ---- 14
14030 ---- 14
13621 ---- 14
12016 ---- 14
11003 ---- 14
11057 ---- 14
11056 ---- 14
10001 ---- 15
12949 ---- 20
13360 ---- 24
13470 ---- 24
10955 ---- 24
13359 ---- 26
13298 ---- 29
12015 ---- 29
12176 ---- 29
13024 ---- 39
10863 ---- 39
13023 ---- 39
10862 ---- 39
10993 ---- 39
12141 ---- 630

Useful links

Sorry, I've been away for a while. Hope to write a useful post fairly soon. In the meantime, here are a couple of things I've found out today. To look for useful twitter posts regarding Mysql, search on #Mysql (I know, fairly basic, but I'm not yet a Twitter person).
Second I found mysql commands you may not know
It's in Spanish, and if I get some time I'll translate some of the entries. Very interesting!

Creating websites with Perl (cgi)

The normal way to produce a website that accesses data from a Mysql database is via the scripting language PHP. I wondered if it were possible to use Perl instead, and found that with the use of a couple of modules, it's not so hard to get started. (Here's a brief introduction to using modules with web programming.
Here's the Perl I wrote (calling the file empnew.cgi). I made sure the program was executable, and placed the file in a directory on my web server (make sure your path points to this directory):


#!/usr/bin/perl
use CGI;
use HTML::Template;
use DBI;
use strict;

my $dsn = 'dbi:mysql:one:localhost:3306:mysql_socket=/opt/lampp/var/mysql/mysqld.sock';
# the database here is called 'one'

my $CGI = CGI->new();
my $user = 'root';
my $pass = 'passwordx';


my $dbh = DBI->connect($dsn, $user, $pass)
or die "Can’t connect to the DB: $DBI::errstr\n";

my $sth = $dbh->prepare("select empno, ename, job, sal from emp order by ename");
$sth->execute;



# Prepare a data structure for HTML::Template
my $rows;
push @{$rows}, $_ while $_ = $sth->fetchrow_hashref();

#
# initiate the template and substitute the values:

my $template = HTML::Template->new(filename => 'empnew.tmpl',
die_on_bad_params => 0, );
# call param to fill in the loop with the loop data by reference.
$template->param(ROWS => $rows);


print $CGI->header();
# print the template
print $template->output();

$dbh->disconnect();

The module CGI deals quickly and efficiently with http headers.

The module HTML::Template that is called, allows you to state the Html layout in a separate file. In our example this calls empnew.tmpl

Code:

<!-- empnew.tmpl -->
<html>
<head>
<title>Employee Listing</title>
</head>
<body>
<h1>Employees</h1>
<table border=1>
            <tr>
              <td bgcolor="#ffff00"><font COLOR="#800500"><b>Emp Number</b></font></td>
              <td bgcolor="#ffff00"><font COLOR="#800500"><b>Emp Name</b></font></td>
              <td bgcolor="#ffff00"><font COLOR="#800500"><b>Job</b></font></td>
              <td bgcolor="#ffff00"><font COLOR="#800500"><b>Salary</b></font></td>
            </tr>
 
<!-- TMPL_LOOP NAME=ROWS -->
   <tr>
      <td><!-- TMPL_VAR NAME=EMPNO --></td>
      <td><!-- TMPL_VAR NAME=ENAME --></td>
      <td><!-- TMPL_VAR NAME=JOB --></td>
      <td><!-- TMPL_VAR NAME=SAL --></td>
   </tr>
<!-- /TMPL_LOOP -->
</table>
</body>
</html>

There are basically 3 new kinds of tags to be aware of when making a .tmpl file for access by HTML::Template. These are

Code:

* VARIABLES defined by <tmp_var>
* LOOPS defined by <tmpl_loop>
* CONDITIONALS you can use <tmpl_if> <tmpl_unless> or <tmpl_else>
 
Variable tags such as <TMPL_VAR NAME=FOO> will be replaced by whatever value is assigned to parameters when calling the HTMP::Template to the parameter FOO

The other tags can be standard html formatting stuff. Here's a good introduction to using the HTML::Template

Finally, in order to access data from the Mysql database we need to use the DBI module. You can see from the Perl script that we connect to the database using parameters given (not a good idea from the security pov to have the password in this script in real life - better to bring in as a parameter from a file that can't be seen by public eyes), prepare the sql script, and then execute. For more information on starting with DBI see this tutorial

Once this is setup, it's fairly simple to change the layout by editing the tmpl file directly without needing to change the program code.

I hope this has been of interest! Good luck with your web programming!