Dear all,
I am starting with pvFMM and encountering an issue while trying to change the kernel evaluation. I tried to replace the evaluation of rinv = RSQRT_INTRIN
by another intrinsic, provided, rsqrt_approx_intrin
(which should be more accurate, up to my understanding).
I first try to run the program, without any modification (downloaded 5 march 2018), which goes through the example1
test smoothly.
./bin/exemple1 -N 4096
Maximum Absolute Error:2.47002e-05
Maximum Relative Error:4.5345e-09
Then, changing the evaluation of the rinv
intrinsic in the kernel evaluation makes the algorithm blow up (after make clean
everything).
./bin/exemple1 -N 4096
Maximum Absolute Error:3.07983e+09
Maximum Relative Error:565400
and runing the potential kernel only, gives me an error in the precomputation of the boundary conditions:
Cheb_Integ::Failed to converge.[6.93278e-09,-0.975,-0.975,-0.975]
Cheb_Integ::Failed to converge.[6.40075e-10,-0.975,-0.647222,-0.975]
Cheb_Integ::Failed to converge.[2.07315e-09,-0.975,-0.647222,-0.647222]
Cheb_Integ::Failed to converge.[2.08389e-09,-0.975,-0.647222,-0.319444]
...
Cheb_Integ::Failed to converge.[4.40053e-10,-0.975,0.00833333,0.663889]
Cheb_Integ::Failed to converge.[3.17001e-09,-0.975,0.00833333,0.991667]
Cheb_Integ::Failed to converge.[2.80505e-09,-0.975,0.00833333,1.31944]
Cheb_Integ::Failed to converge.[4.60546e-10,-0.975,0.00833333,1.64722]
Cheb_Integ::Failed to converge.[7.76104e-10,-0.975,0.336111,-0.975]
Cheb_Integ::Failed to converge.[8.40628e-11,-0.975,0.336111,-0.647222]
Cheb_Integ::Failed to converge.[1.85214e-09,-0.975,0.336111,-0.319444]
Cheb_Integ::Failed to converge.[4.40053e-10,-0.975,0.336111,0.00833333]
Here is the git diff
of the downloaded code. I checked the kernel and the value only differs from each other by 1e-4 to 1e-6.
Do you have any solution to that issue?
diff --git a/include/kernel.txx b/include/kernel.txx
index 7867086..5abd5df 100755
--- a/include/kernel.txx
+++ b/include/kernel.txx
@@ -1088,7 +1088,7 @@ void laplace_poten_uKernel(Matrix<Real_t>& src_coord, Matrix<Real_t>& src_value,
for(int i=0;i<NWTN_ITER;i++){
nwtn_scal=2*nwtn_scal*nwtn_scal*nwtn_scal;
}
- const Real_t OOFP = 1.0/(4*nwtn_scal*const_pi<Real_t>());
+ const Real_t OOFP = 1.0/(4*const_pi<Real_t>());
size_t src_cnt_=src_coord.Dim(1);
size_t trg_cnt_=trg_coord.Dim(1);
@@ -1110,7 +1110,7 @@ void laplace_poten_uKernel(Matrix<Real_t>& src_coord, Matrix<Real_t>& src_value,
r2=add_intrin(r2,mul_intrin(dy,dy));
r2=add_intrin(r2,mul_intrin(dz,dz));
- Vec_t rinv=RSQRT_INTRIN(r2);
+ Vec_t rinv=rsqrt_approx_intrin(r2);
tv=add_intrin(tv,mul_intrin(rinv,sv));
}
Vec_t oofp=set_intrin<Vec_t,Real_t>(OOFP);
EDIT
When using the potential kernel, by removing the scale invariance boolean and the dbl_layer kernel, i managed to reduce the error:
@@ -1405,8 +1405,8 @@ void laplace_grad(T* r_src, int src_cnt, T* v_src, int dof, T* r_trg, int trg_cn
template<class T> const Kernel<T>& LaplaceKernel<T>::potential(){
- static Kernel<T> potn_ker=BuildKernel<T, laplace_poten<T,1>, laplace_dbl_poten<T,1> >("laplace" , 3, std::pair<int,int>(1,1),
- NULL,NULL,NULL, NULL,NULL,NULL, NULL,NULL, &laplace_vol_poten<T>);
+ static Kernel<T> potn_ker=BuildKernel<T, laplace_poten<T,1> >("laplace" , 3, std::pair<int,int>(1,1),
+ NULL,NULL,NULL, NULL,NULL,NULL, NULL,NULL,NULL,false);
return potn_ker;
}
template<class T> const Kernel<T>& LaplaceKernel<T>::gradient(){
@@ -1418,8 +1418,8 @@ template<class T> const Kernel<T>& LaplaceKernel<T>::gradient(){
template<> inline const Kernel<double>& LaplaceKernel<double>::potential(){
typedef double T;
- static Kernel<T> potn_ker=BuildKernel<T, laplace_poten<T,2>, laplace_dbl_poten<T,2> >("laplace" , 3, std::pair<int,int>(1,1),
- NULL,NULL,NULL, NULL,NULL,NULL, NULL,NULL, &laplace_vol_poten<double>);
+ static Kernel<T> potn_ker=BuildKernel<T, laplace_poten<T,2> >("laplace" , 3, std::pair<int,int>(1,1),
+ NULL,NULL,NULL, NULL,NULL,NULL, NULL,NULL,NULL,false);
return potn_ker;
}
template<> inline const Kernel<double>& LaplaceKernel<double>::gradient(){
gives
Maximum Absolute Error:21.2006
Maximum Relative Error:0.0541196
PS: I am runing on mac OS and compiled with
./configure MPICXX=mpic++ CXX=icpc CC=icc F77=ifort CXXFLAGS="-mavx -g -std=c++11" CFLAGS="-mavx -g" FFLAGS="-mavx -g" --with-openmp-flag="qopenmp" --with-fftw-include="${FFTW_INC}" --with-fftw-lib="-mkl" --with-blas="-mkl" --with-lapack="-mkl" --disable-doxygen-doc --disable-doxygen-dot --disable-doxygen-html
where FFTW_INC
refers to the the include dir of MKL/fftw