yablog: 2011/08

2011/08/21

postgres tips: enable plpgsql for implementing stored procedure

This entry is just a memo for me 8^)
If we want to implement stored procedure to any database, it is necessary to type below command.

createlang -d dbname plpgsql

2011/08/16

python tips: draw beta distribution with matplotlib

ラベル: machine learning, python

$Beta(\mu | a, b) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\mu^{a-1}(1-\mu)^{b-1}$

import numpy as np
import numpy as np
import pylab as pl
import scipy.special as ss

def beta(a, b, mew):
    e1 = ss.gamma(a + b)
    e2 = ss.gamma(a)
    e3 = ss.gamma(b)
    e4 = mew ** (a - 1)
    e5 = (1 - mew) ** (b - 1)
    return (e1/(e2*e3)) * e4 * e5

def plot_beta(a, b):
    Ly = []
    Lx = []
    mews = np.mgrid[0:1:100j]
    for mew in mews:
        Lx.append(mew)
        Ly.append(beta(a, b, mew))
    pl.plot(Lx, Ly, label="a=%f, b=%f" %(a,b))

def main():
    plot_beta(0.1, 0.1)
    plot_beta(1, 1)
    plot_beta(2, 3)
    plot_beta(8, 4)
    pl.xlim(0.0, 1.0)
    pl.ylim(0.0, 3.0)
    pl.legend()
    pl.show()

if __name__ == "__main__":
    main()

2011/08/13

postgres tips: calculate degree with stored procedure

ラベル: postgres, tips

-- User-Defined-Function for calculate degree
-- ------------------------------------------

-- This function calculates cos from vector(0,1)

-- .. code-block:: sql
--    :linenos:

CREATE OR REPLACE FUNCTION GET_COS
    (v2_x FLOAT, v2_y FLOAT)
RETURNS FLOAT AS
$$
DECLARE
    v1_x FLOAT;
    v1_y FLOAT;
    v1_v2 FLOAT;
    length_v1 FLOAT;
    length_v2 FLOAT;
    cos FLOAT;
BEGIN
    v1_x := 0.;
    v1_y := 1.;
    v1_v2 := (v1_x * v2_x) + (v1_y * v2_y);
    length_v1 := sqrt(power(v1_x,2) + power(v1_y,2));
    length_v2 := sqrt(power(v2_x,2) + power(v2_y,2));
    cos := v1_v2 / (length_v1 * length_v2);
    RETURN cos;
END;
$$ language plpgsql;

-- This function calculates sin from vector(0,1)

-- .. code-block:: sql
--    :linenos:

CREATE OR REPLACE FUNCTION GET_SIN
    (v2_x FLOAT, v2_y FLOAT)
RETURNS FLOAT AS
$$
DECLARE
    v1_x FLOAT;
    v1_y FLOAT;
    det FLOAT;
    length_v1 FLOAT;
    length_v2 FLOAT;
    sin FLOAT;
BEGIN
    v1_x := 0.;
    v1_y := 1.;
    det := (v1_x * v2_y) - (v1_y * v2_x);
    length_v1 := sqrt(power(v1_x,2) + power(v1_y,2));
    length_v2 := sqrt(power(v2_x,2) + power(v2_y,2));
    sin := det / (length_v1 * length_v2);
    RETURN sin;
END;
$$ language plpgsql;

-- This function calculates degree from vector(0,1)

-- .. code-block:: sql
--    :linenos:

CREATE OR REPLACE FUNCTION GET_DEGREE
    (x FLOAT, y FLOAT)
RETURNS FLOAT AS
$$
DECLARE
    sin FLOAT;
    cos FLOAT;
    degree FLOAT;
BEGIN
    sin := GET_SIN(x, y);
    cos := GET_COS(x, y);
    IF sin >= 0 AND cos >= 0 THEN
        degree := degrees(acos(cos));
    ELSEIF sin >= 0 AND cos < 0 THEN
        degree := degrees(acos(cos));
    ELSEIF sin < 0 AND cos < 0 THEN
        degree := 180. - degrees(asin(sin));
    ELSEIF sin < 0 AND cos >= 0 THEN
        degree := 360. + degrees(asin(sin));
    END IF;
    RETURN degree;
END;
$$ language plpgsql;

memo for my low-memory brain: calculate intersection from two lines in 2-D.

ラベル: math

This is my memo for basic geometry-formula.
Now, I want to calculate intersection of two lines(Line1 and Line2).

Line1: passes point (x1=0, y1=9) and (x2=2, y2=5).
Line2: passes point (x3=1, y3=1) and (x4=3, y4=0).

Firstly, we can express these lines as following:
Line1: $(y_1 - y_2)x - (x_1 - x_2)y = (x_2y_1 - x_1y_2)$
Line2: $(y_3 - y_4)x - (x_3 - x_4)y = (x_4y_3 - x_3y_4)$

Secondly, we can get intersection (x,y) by solving below linear algebra:
$A\begin{pmatrix}x \\ y\end{pmatrix} = \begin{pmatrix}l_1 \\ l_2\end{pmatrix}$

$a_{11} = y_1 - y_2$
$a_{12} = y_3 - y_4$
$a_{21} = x_2 - x_1$
$a_{22} = x_4 - x_3$
$l_1 = (x_2y_1 - x_1y_2)$
$l_2 = (x_4y_3 - x_3y_4)$
$A = \begin{pmatrix}a_{11} & a_{12} \\ a_{21} & a_{22}\end{pmatrix}$

Finally, using inverse matrix, we can easily calculate intersection:
$A^{-1} = \frac{1}{|detA|}\begin{pmatrix}a_{22} & -a_{12} \\ -a_{21} & a_{11}\end{pmatrix} = \frac{1}{a_{11}a_{22}-a_{12}a_{21}}\begin{pmatrix}a_{22} & -a_{12} \\ -a_{21} & a_{11}\end{pmatrix}$
$\begin{pmatrix}x \\ y\end{pmatrix} = \begin{pmatrix}l_1 \\ l_2\end{pmatrix}A^{-1}$

sphinx tips: how to easily write a document for several scripts?

ラベル: postgres, sphinx, tips

Today, I explain how to write a sphinx document for several scripts in an easy way. Assume that you are now writing postgres scripts and need to write document for them. Now, you have two postgres scripts named sample{1,2}.sql and these scripts are written like below.

-- Sample1.sql
-- -----------

-- This SQL does something!!
-- A definition of sample_table shown below.

-- =============== =========== ============
-- Column Name     Type        Description
-- =============== =========== ============
-- foo             TEXT        foo?
-- bar             TEXT        bar?
-- baz             FLOAT       baz?
-- =============== =========== ============

-- :NOTE: just return 10 records
-- :TODO: just return 11 records

-- .. code-block:: sql
--    :linenos:

SELECT
    *
FROM
    sample_table
LIMIT 10;

-- Sample2.sql
-- -----------

-- This SQL extract data from columns named foo, bar...

-- :NOTE: just return 10 records
-- :TODO: just return 11 records

-- .. code-block:: sql
--    :linenos:

SELECT
    foo,
    bar
FROM
    sample_table
LIMIT 11;

As you can see above, there are rst formatted comments in sql scripts. If you write this kind of comments in sql scripts, you can easily convert them to html/latex-styled pdf! Below shell scripts can convert above sql scripts into one rst document.

#!/bin/sh
# convert_sql2rst.sh

output_rst="sample.rst"

if [ -e ${output_rst} ]; then
    rm ${output_rst}
fi

echo "=======================" >> ${output_rst}
echo "This is sample chapter!" >> ${output_rst}
echo "=======================" >> ${output_rst}
echo "" >> ${output_rst}

for sql_file in `ls *.sql`
do
    echo "converting ${sql_file} to rst-formatted document..."
    sed ${sql_file} -e 's/^/    /g' | \
    sed -e 's/^    -- //g' >> ${output_rst}
    echo "" >> ${output_rst}
    echo "" >> ${output_rst}
done

Now, put all of above scripts into same sphinx-root directory and edit your index.rst for including ${output_rst}. Next, do this:

yaboo@maniac:~/Projects/SqlProject$ sh convert_sql2rst.sh 
converting sample.sql to rst-formatted document...
converting sample2.sql to rst-formatted document...
yaboo@maniac:~/Projects/SqlProject$ make html
sphinx-build -b html -d _build/doctrees   . _build/html
Making output directory...
Running Sphinx v1.0.7
loading pickled environment... not yet created
building [html]: targets for 2 source files that are out of date
updating environment: 2 added, 0 changed, 0 removed
reading sources... [100%] sample                                                
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [100%] sample                                                 
writing additional files... genindex search
copying static files... done
dumping search index... done
dumping object inventory... done
build succeeded.

Build finished. The HTML pages are in _build/html.

Finally, you can get document like this.

2011/08/11

python tips: draw poisson pdf graph with matplotlib

ラベル: machine learning, python

this is a code for drawing a graph of poisson pdf with matplotlib.

#!/usr/bin/env python                                                           
#coding: utf-8                                                                  
                                                                                
import pylab as pl                                                              
import numpy as np                                                              
                                                                                
def get_factorial(n):                                                           
    if n == 0:                                                                  
        return 1                                                                
    else:                                                                       
        return n * get_factorial(n-1)                                           
                                                                                
def get_probability(p, n, r):                                                   
    mew = float(n) * p                                                          
    denom = (mew ** r) * np.exp(-1 * mew)                                       
    molec = get_factorial(r)                                                    
                                                                                
    return denom/float(molec)                                                   
                                                                                
def plot_poisson_distrib(p, n, lt, lab):                                        
    xl = [nr for nr in range(0, 20, 1)]                                         
    yl = [get_probability(p, n, nr) for nr in range(0, 20, 1)]                  
    pl.plot(xl, yl, lt, label=lab)                                              
                                                                                
def main():                                                                     
    p = 0.02                                                                    
                                                                                
    mew = 1.0                                                                   
    plot_poisson_distrib(p, (mew / p), 'b-', 'P(%f,%f)' %(mew,p))               
                                                                                
    mew = 2.5                                                                   
    plot_poisson_distrib(p, (mew / p), 'r-', 'P(%f,%f)' %(mew,p))               
                                                                                
    mew = 5.0                                                                   
    plot_poisson_distrib(p, (mew / p), 'g-', 'P(%f,%f)' %(mew,p))               
                                                                                
    pl.ylabel('p(x)')                                                           
    pl.xlabel('x')                                                              
    pl.legend()                                                                 
    pl.show()                                                                   
                                                                                
if __name__ == "__main__":                                                      
    main()

and output of above code is:

python tips: draw binary pdf graph with matplotlib

ラベル: machine learning, python

this is a code for drawing a graph of binary probability distribution with matplotlib.

#!/usr/bin/env python                                                           
#coding: utf-8                                                                  
                                                                                
import pylab as pl                                                              
                                                                                
def combination(n, r):                                                          
    if (r == 0) or (n == r):                                                    
        return 1.0                                                              
                                                                                
    denom = reduce(lambda x, y: x*y, [i for i in range(1, n+1)])                
    mol_l = reduce(lambda x, y: x*y, [i for i in range(1, n-r+1)])              
    mol_r = reduce(lambda x, y: x*y, [i for i in range(1, r+1)])                
    return float(denom / (mol_l * mol_r))                                       
                                                                                
def get_probability(p, n, r):                                                   
    q = 1 - p                                                                   
    return combination(n, r) * (p ** r) * (q ** (n-r))                          
                                                                                
def plot_bin_distrib(p, n, lt, lab):                                            
    xl = [nr for nr in range(0, n+1, 1)]                                        
    yl = [get_probability(p, n, nr) for nr in range(0, n+1, 1)]                 
    pl.plot(xl, yl, lt, label=lab)                                              
                                                                                
def main():                                                                     
    p = 1.0/3.0                                                                 
    n = 10                                                                      
    plot_bin_distrib(p, n, 'b-', 'Bin(%s,%f)' %(n,p))                           
    n = 20                                                                      
    plot_bin_distrib(p, n, 'r-', 'Bin(%s,%f)' %(n,p))                           
    n = 30                                                                      
    plot_bin_distrib(p, n, 'g-', 'Bin(%s,%f)' %(n,p))                           
    n = 40                                                                      
    plot_bin_distrib(p, n, 'y-', 'Bin(%s,%f)' %(n,p))                           
                                                                                
    pl.ylabel('p(x)')                                                           
    pl.xlabel('x')                                                              
    pl.legend()                                                                 
    pl.show()                                                                   
                                                                                
if __name__ == "__main__":                                                      
    main()

and here is output from above code.

2011/08/10

python tips: use of chain generator

ラベル: python, tips

The book titled "Expert Python Programming" introduces pipeline of generators.
This concept is very useful for dividing loop codes to small parts.
Below is my example code that prints digit pyramid.

def change_value():
    for step in range(1, 10):
        yield step

def create_box(steps):
    for step in steps:
        yield (' ').join( str(step) for i in range(2 * step - 1) )

def create_pyramid(height):
    boxes = create_box(change_value())
    for step, box in enumerate(boxes):
        if step == height:
            break
        else:
            print ('').join(' ' for i in range((height - step)*2)) + box

Result of above code is ...

>>> create_pyramid(9)
                  1
                2 2 2
              3 3 3 3 3
            4 4 4 4 4 4 4
          5 5 5 5 5 5 5 5 5
        6 6 6 6 6 6 6 6 6 6 6
      7 7 7 7 7 7 7 7 7 7 7 7 7
    8 8 8 8 8 8 8 8 8 8 8 8 8 8 8
  9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9

2011/08/07

postgres tips: stored procedure for calculating distance on the earth

ラベル: postgres, tips

CREATE FUNCTION GET_DISTANCE
    (alat FLOAT, alon FLOAT, lat FLOAT, lon FLOAT)
RETURNS FLOAT AS
$$
DECLARE
    radius_earth FLOAT;
    radian_lat FLOAT;
    radian_lon FLOAT;
    distance_v FLOAT;
    distance_h FLOAT;
    distance FLOAT;
BEGIN
    -- Insert earth radius
    SELECT INTO radius_earth 6378.137;

    -- Calculate difference between lat and alat
    SELECT INTO radian_lat radians(lat - alat);

    -- Calculate difference between lon and alon
    SELECT INTO radian_lon radians(lon - alon);

    -- Calculate vertical distance
    SELECT INTO distance_v (radius_earth * radian_lat);

    -- Calculate horizontal distance
    SELECT INTO distance_h (cos(radians(alat)) * radius_earth * radian_lon);

    -- Calculate distance(km) and convert it to distance(meter)
    SELECT INTO distance sqrt(pow(distance_h,2) + pow(distance_v,2)) * 1000;

    -- Returns distance
    RETURN DISTANCE;
END;
$$ language plpgsql;

> createlang plpgsql mytestdb
> psql -d mytestdb
mytestdb=# CREATE FUNCTION GET_DISTANCE
mytestdb-#     (alat FLOAT, alon FLOAT, lat FLOAT, lon FLOAT)
mytestdb-# RETURNS FLOAT AS
mytestdb-# $$
mytestdb$# DECLARE
mytestdb$#     radius_earth FLOAT;
mytestdb$#     radian_lat FLOAT;
mytestdb$#     radian_lon FLOAT;
mytestdb$#     distance_v FLOAT;
mytestdb$#     distance_h FLOAT;
mytestdb$#     distance FLOAT;
mytestdb$# BEGIN
mytestdb$#     -- Insert earth radius
mytestdb$#     SELECT INTO radius_earth 6378.137;
mytestdb$#
mytestdb$#     -- Calculate difference between lat and alat
mytestdb$#     SELECT INTO radian_lat radians(lat - alat);
mytestdb$#
mytestdb$#     -- Calculate difference between lon and alon
mytestdb$#     SELECT INTO radian_lon radians(lon - alon);
mytestdb$#
mytestdb$#     -- Calculate vertical distance
mytestdb$#     SELECT INTO distance_v (radius_earth * radian_lat);
mytestdb$#
mytestdb$#     -- Calculate horizontal distance
mytestdb$#     SELECT INTO distance_h (cos(radians(alat)) * radius_earth * radian_lon);
mytestdb$#
mytestdb$#     -- Calculate distance(km) and convert it to distance(meter)
mytestdb$#     SELECT INTO distance sqrt(pow(distance_h,2) + pow(distance_v,2)) * 1000;
mytestdb$#
mytestdb$#     -- Returns distance
mytestdb$#     RETURN DISTANCE;
mytestdb$# END;
mytestdb$# $$ language plpgsql;
CREATE FUNCTION
mytestdb=# select GET_DISTANCE(34.701909, 135.4949770, 35.681382, 139.766084);
   get_distance
------------------
 405807.810663345
(1 行)

latex tips: how to set up UTF-8 latex environment in Ubuntu 10.04LTS/amd64

ラベル: latex, sphinx, tips, ubuntu

I show you utf-8 latex installation steps for utf-8 encodings in Ubuntu-10.04 LTS (amd64) 2011/08/07. I am not sure this is the most effective way, but this can work very well on Ubuntu-10.04 LTS ;-) If you through below steps, you can compile utf-8 encoded tex files and also compile sphinx document!! (And..., if you want to convert sphinx document to pdf format and get ExtBabel error when uses "make all-pdf-ja" command, you should also check here)

Step1. install dvipsk-ja (default package is broken at 2011.06.11 JST)

> sudo add-apt-repository ppa:cosmos-door/dvipsk-ja
> sudo aptitude update
> sudo aptitude upgrade
> sudo aptitude install dvipsk-ja

Step2. install packages

> sudo aptitude install texlive texlive-math-extra texlive-latex-extra texlive-latex-extra-doc texlive-fonts-extra texlive-fonts-extra-doc texlive-fonts-recommended texlive-fonts-recommended-doc texlive-formats-extra texlive-latex-recommended texlive-latex-recommended texlive-extra-utils texlive-font-utils texlive-doc-ja ptex-bin jbibtex-bin mendexk okumura-clsfiles latex-cjk-japanese cmap-adobe-japan1 cmap-adobe-japan2 cmap-adobe-cns1 cmap-adobe-gb1 gs-cjk-resource ghostscript xdvik-ja dvi2ps dvi2ps-fontdesc-morisawa5 jmpost latexmk latex-mk pybliographer yatex

Step3. update latex environment

> updmap
> sudo mktexlsr
> sudo updmap-sys
> sudo dpkg-reconfigure ptex-jisfonts
> sudo jisftconfig add

Step4. get iso image of texlive2010 (this includes platex for utf8)

Step5. mount iso image of texlive2010

sudo mount -o loop texlive2010-20100826.iso /mnt3

Step6. install texlive2010

> sudo ./install-tl
...
Enter command: O [options]
Enter command: L [create symlinks in standard directories]
New value for binary directory [/usr/local/bin]: Enter
New value for man directory    [/usr/local/man]: Enter
New value for info directory   [/usr/local/info]: Enter
Enter command: R
Enter command: I
Installing to: /usr/local/texlive/2010
Installing [0001/2133, time/total: ??:??/??:??]: 12many [376k]
Installing [0002/2133, time/total: 00:00/00:00]: 2up [66k]
Installing [0003/2133, time/total: 00:00/00:00]: ANUfinalexam [3k]
Installing [0004/2133, time/total: 00:00/00:00]: AkkTeX [16k]
Installing [0005/2133, time/total: 00:00/00:00]: Asana-Math [433k]
Installing [0006/2133, time/total: 00:00/00:00]: ESIEEcv [137k]
Installing [0007/2133, time/total: 00:01/18:36]: FAQ-en [4943k]
Installing [0008/2133, time/total: 00:01/03:12]: HA-prosper [266k]
Installing [0009/2133, time/total: 00:01/03:03]: IEEEconf [188k]
Installing [0010/2133, time/total: 00:01/02:58]: IEEEtran [1325k]
Installing [0011/2133, time/total: 00:01/02:28]: MemoirChapStyles [669k]
Installing [0012/2133, time/total: 00:02/04:32]: SIstyle [338k]
Installing [0013/2133, time/total: 00:02/04:22]: SIunits [284k]
....
Installing [2129/2133, time/total: 03:33/03:33]: zhmetrics [66k]
Installing [2130/2133, time/total: 03:33/03:33]: zhspacing [165k]
Installing [2131/2133, time/total: 03:33/03:33]: ziffer [3k]
Installing [2132/2133, time/total: 03:33/03:33]: zwgetfdate [242k]
Installing [2133/2133, time/total: 03:33/03:33]: zwpagelayout [520k]
Time used for installing the packages: 03:33
running mktexlsr /usr/local/texlive/2010/texmf-dist /usr/local/texlive/2010/texmf
mktexlsr: Updating /usr/local/texlive/2010/texmf-dist/ls-R...
mktexlsr: Updating /usr/local/texlive/2010/texmf/ls-R...
mktexlsr: Done.
writing fmtutil.cnf data to /usr/local/texlive/2010/texmf-var/web2c/fmtutil.cnf
writing updmap.cfg to /usr/local/texlive/2010/texmf-config/web2c/updmap.cfg
writing language.dat data to /usr/local/texlive/2010/texmf-var/tex/generic/config/language.dat
writing language.def data to /usr/local/texlive/2010/texmf-var/tex/generic/config/language.def
writing language.dat.lua data to /usr/local/texlive/2010/texmf-var/tex/generic/config/language.dat.lua
running mktexlsr /usr/local/texlive/2010/texmf-var
mktexlsr: Updating /usr/local/texlive/2010/texmf-var/ls-R...
mktexlsr: Done.
running updmap-sys...done
re-running mktexlsr /usr/local/texlive/2010/texmf-var
mktexlsr: Updating /usr/local/texlive/2010/texmf-var/ls-R...
mktexlsr: Done.
pre-generating all format files (fmtutil-sys --all), be patient...done
running path adjustment actions
finished with path adjustment actions
running package specific postactions
finished with package specific postactions

 See
   /usr/local/texlive/2010/index.html
 for links to documentation.  The TeX Live web site (http://tug.org/texlive/)
 contains updates and corrections.

 TeX Live is a joint project of the TeX user groups around the world;
 please consider supporting it by joining the group best for you. The
 list of groups is available on the web at http://tug.org/usergroups.html.

 Add /usr/local/texlive/2010/texmf/doc/man to MANPATH, if not dynamically determined.
 Add /usr/local/texlive/2010/texmf/doc/info to INFOPATH.

 Most importantly, add /usr/local/texlive/2010/bin/x86_64-linux
 to your PATH for current and future sessions.

 Welcome to TeX Live!
Logfile: /usr/local/texlive/2010/install-tl.log

Step7. check platex can corporate with utf-8 encoded text (see -kanji=STRING option)

> platex --help
Usage: ptex [option] texfile
     : ptex [option] &format texfile

-fmt=NAME               use NAME instead of program name or %&format.
-halt-on-error          stop processing at the first error
[-no]-file-line-error   disable/enable file:line:error style messages
-ini                    be iniptex.
-interaction=STRING     set interaction mode (STRING=batchmode|nonstopmode|
                          scrollmode|errorstopmode)
-ipc                    send DVI output to a socket as well as the usual
                          output file
-ipc-start              as -ipc, and also start the server at the other end
-jobname=STRING         set the job name to STRING
-kanji=STRING           set Japanese encoding (STRING=euc|jis|sjis|utf8)
-kpathsea-debug=NUMBER  set path searching debugging flags according to
                          the bits of NUMBER
[-no]-mktex=FMT         disable/enable mktexFMT generation (FMT=tex/tfm)
-mltex                  enable MLTeX extensions such as \charsubdef
-output-comment=STRING  use STRING for DVI file comment instead of date
-output-directory=DIR   use DIR as the directory to write files to
[-no]-parse-first-line  disable/enable parsing of the first line of the
                          input file
-progname=STRING        set program (and fmt) name to STRING
-recorder               enable filename recorder
[-no]-shell-escape      disable/enable \write18{SHELL COMMAND}
-src-specials           insert source specials into the DVI file
-src-specials=WHERE     insert source specials in certain places of
                          the DVI file. WHERE is a comma-separated value
                          list: cr display hbox math par parend vbox
-translate-file=TCXNAME use the TCX file TCXNAME
-help                   print this message and exit.
-version                print version information and exit.

Email bug reports to ptex-staff@ml.asciimw.jp.

2011/08/04

hive tips: convert default separator

ラベル: hadoop, tips

hive's default separator is like '^A'.
to convert '^A' to ',', use below sed script.

/opt/hadoop/bin/hadoop dfs -cat /user/test/* | sed 's/[Ctrl-V][Ctrl-A]/,/g' > ${outputfile}

2011/08/03

machine learning: curve-fitting

ラベル: machine learning, python

#coding: utf-8                                                                  
                                                                                
import numpy as np                                                              
from pylab import *                                                             
import sys                                                                      
                                                                                
M = 9                                                                           
def y(x, wlist):                                                                
    ret = wlist[0]                                                              
    for i in range(1, M+1):                                                     
        ret += wlist[i] * (x ** i)                                              
    return ret                                                                  
                                                                                
def estimate(xlist, tlist, lam):                                                
    A = []                                                                      
    for i in range(M+1):                                                        
        for j in range(M+1):                                                    
            temp = (xlist**(i+j)).sum()                                         
            if i == j:                                                          
                temp += lam                                                     
            A.append(temp)                                                      
    A = array(A).reshape(M+1, M+1)                                              
                                                                                
    T = []                                                                      
    for i in range(M+1):                                                        
        T.append(((xlist**i) * tlist).sum())                                    
    T = array(T)                                                                
                                                                                
    wlist = np.linalg.solve(A, T)                                               
    return wlist                                                                
                                                                                
def rms(xlist, tlist, wlist):                                                   
    E_w = 0                                                                     
    N = len(xlist)                                                              
                                                                                
    for n in range(0, N):                                                       
        sum = 0                                                                 
        for j in range(0, M+1):                                                 
            sum += wlist[j] * (xlist[n] ** j)                                   
        E_w += ((sum - tlist[n]) ** 2)/2                                        
                                                                                
    return str(np.sqrt(2 * E_w / N))                                            

def example1():                                                                 
    # number of training data                                                   
    N = 10                                                                      
                                                                                
    # generate training data                                                    
    xlist = np.linspace(0, 1, N) # extract N-points                             
    tlist = np.sin(2 * np.pi * xlist) + np.random.normal(0, 0.2, xlist.size)    
                                                                                
    # estimate parametaer w                                                     
    wlist = estimate(xlist, tlist, np.exp(-18.0))                               
    #print wlist                                                                
    print "E_RMS for training data: %s" %  rms(xlist, tlist, wlist)             
                                                                                
    N = 100                                                                     
    xlist2 = np.linspace(0, 1, N) # extract N-points                            
    tlist2 = np.sin(2 * np.pi * xlist2) + np.random.normal(0, 0.2, xlist2.size) 
    print "E_RMS for test data: %s" % rms(xlist2, tlist2, wlist)                
                                                                                
    # generate original data                                                    
    xs = np.linspace(0, 1, 1000)                                                
                                                                                
    # ideal and model                                                           
    ideal = np.sin(2 * np.pi * xs)                                              
    model = [y(x, wlist) for x in xs]                                           
                                                                                
    # plot training data and original data                                      
    plot(xlist, tlist, 'bo')                                                    
    plot(xlist2, tlist2, 'rd')                                                  
    plot(xs, ideal, 'g-')                                                       
    plot(xs, model, 'r-')                                                       
    xlim(0.0, 1.0)                                                              
    ylim(-1.5, 1.5)                                                             
    show()                                                                      
                                                                                
if __name__ == "__main__":                                                      
    example1()