yablog: numpy

Showing posts with label numpy. Show all posts

2012/07/23

python tips: calculate distance between N points and M points using numpy array

This is memory consuming way of calculate distance between N points and M points. Let l_lat and l_lon are vectors for N points and v_lat and v_lon are for M points. Using numpy array, distance calculation can be written as follows.

#!/usr/bin/env python                                                                                                                                                                                        
#coding:utf-8                                                                                                                                                                                                

import numpy

M = 3
N = 5

# points1                                                                                                                                                                                                    
l_lat = numpy.arange(N)
l_lon = numpy.arange(N)
m_l_lat = numpy.tile(l_lat, (M, 1))
m_l_lon = numpy.tile(l_lon, (M, 1))

# points2                                                                                                                                                                                                    
v_lat = numpy.arange(M)
v_lon = numpy.arange(M)
m_v_lat_t = numpy.tile(v_lat, (N, 1)).T
m_v_lon_t = numpy.tile(v_lon, (N, 1)).T

distance = numpy.sqrt((m_v_lat_t - m_l_lat) ** 2 + (m_v_lon_t - m_l_lat) ** 2)
print distance

Although code generates lat and lon with numpy.arange(), in real case you should fill each vector with appropriate lat and lon values. Below is an output of previous code.

~$ python test_calc_distance.py 
[[ 0.          1.41421356  2.82842712  4.24264069  5.65685425]
 [ 1.41421356  0.          1.41421356  2.82842712  4.24264069]
 [ 2.82842712  1.41421356  0.          1.41421356  2.82842712]]

2012/05/31

statistics tips: my covariance matrix implementation on python

ラベル: math, numpy, python, statistics

#!/usr/bin/env python                                                                                                                
#coding: utf-8                                                                                                                       
                                                                                                                                     
import numpy as np                                                                                                                   
from numpy.random import *                                                                                                           
                                                                                                                                     
def get_covariance_matrix(A):                                                                                                        
    N, M = A.shape                                                                                                                   
    ret = np.reshape(np.zeros(N*M), (N,M))                                                                                           
                                                                                                                                     
    for n in range(0, N):                                                                                                            
        for m in range(0, M):                                                                                                        
            ret[n][m] = calc_covariance(A, n, m)                                                                                     
                                                                                                                                     
    return ret                                                                                                                       
                                                                                                                                     
def calc_covariance(A, n, m):                                                                                                        
    nominator = np.sum((A[n] - np.average(A[n])) * (A[m] - np.average(A[m])))                                                        
    denominator = A.shape[0] - 1                                                                                                     
    return nominator/denominator                                                                                                     
                                                                                                                                     
def test():                                                                                                                          
    D = 4                                                                                                                            
    A = np.reshape([randint(100) for i in range(D**2)], (D, D))                                                                      
                                                                                                                                     
    print("# numpy.cov() output:")                                                                                                   
    print(np.cov(A))                                                                                                                 
                                                                                                                                     
    print("# my covariance matrix:")                                                                                                 
    print(get_covariance_matrix(A))                                                                                                  
                                                                                                                                     
    print("# my precision matrix:")                                                                                                  
    print(get_covariance_matrix(A) ** -1)  
                                                                                                                                     
if __name__ == "__main__":                                                                                                           
    test()

yaboo@overtones:~$ python kyoubunsan.py
# numpy.cov() output:
[[  226.91666667   -26.83333333   -39.33333333   499.75      ]
 [  -26.83333333   328.33333333   629.          -257.83333333]
 [  -39.33333333   629.          1432.66666667  -882.        ]
 [  499.75        -257.83333333  -882.          1976.25      ]]                                                                      
# my covariance matrix:
[[  226.91666667   -26.83333333   -39.33333333   499.75      ]
 [  -26.83333333   328.33333333   629.          -257.83333333]
 [  -39.33333333   629.          1432.66666667  -882.        ]
 [  499.75        -257.83333333  -882.          1976.25      ]]
# my precision matrix:                                                                                                               
[[ 0.0044069  -0.03726708 -0.02542373  0.002001  ]
 [-0.03726708  0.00304569  0.00158983 -0.00387847]
 [-0.02542373  0.00158983  0.000698   -0.00113379]
 [ 0.002001   -0.00387847 -0.00113379  0.00050601]]

2012/01/25

python tips: array computation with numpy II

ラベル: numpy, python, tips

As previous post shows, for-loop calculation in array computations needs more processing time than vectorized calculation. There are some techniques for fasten calculation over arrays. Sample code shown below is one of them.

>>> from numpy import *
>>> 
>>> A = array([[1,2,3], [4,5,6], [7,8,9]])
>>> 
>>> def somefunc(x):
...     x1 = x
...     x2 = x - 3
...     return where(x < 7, x1, x2)
... 
>>> somefunc(A)
array([[1, 2, 3],
       [4, 5, 6],
       [4, 5, 6]])
>>>

python tips: array computation with numpy

ラベル: numpy, python, tips

I compared performance of array computations. Listed below are results and codes of some experiments which introduced in the book named 'Python Scripting for Computational Science (3rd edition)'.

>>> from numpy import *
>>> import time
>>> 
>>> a = linspace(0, 1, 1E+07)
>>> t0 = time.clock()
>>> b = 3*a - 1
>>> 
>>> t1 = time.clock()
>>> 
>>> for i in xrange(a.size):
...     b[i] = 3*a[i] -1
... 
>>> t2 = time.clock()
>>> 
>>> print '3*a-1: %g sec, loop: %g sec' %(t1-t0, t2-t1)
3*a-1: 0.05 sec, loop: 15.46 sec