Hi,
I compiled OGS with MKL support in order to use PardisoLU from the Eigen library.
Unfortunately, the solver always fails with a very unspecific message:
info: ------------------------------------------------------------------
info: *** Eigen solver computation
info: -> scale
info: -> solve with PardisoLU
error: Failed during Eigen linear solver initialization
info: ------------------------------------------------------------------
info: [time] Linear solver took 2.27617 s.
error: Newton: The linear solver failed.
info: [time] Solving process #0 took 2.35626 s in time step #1
error: The nonlinear solver failed in time step #1 at t = 5000 s for process #0.
info: [time] Output of timestep 1 took 0.00990845 s.
error: Time stepper cannot reduce the time step size further. at TimeLoop.cpp, line 667
info: OGS terminated on 2020-04-09 00:06:05+020
error: OGS terminated with error
I thought debug mode might help, but it does not give any further information.
Does anyone have an idea what might be the cause?
For future reference: @joergbuchwald and I solved the issue. The problem occurred because of a second MKL installation that came from a package manager. One has to make sure to link against the native Intel MKL, i.e. that all MKL paths in CMake point to that version.
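One way to check which MKL a build actually picked up is to look at where the shared libraries of the binary resolve. The following is a minimal sketch, assuming a Linux system where the output of `ldd` on the ogs executable has been captured as text. The helper names, the sample output and the /opt/intel/mkl prefix are illustrative assumptions, not part of OGS.

```python
import re

# Matches lines such as
#   libmkl_core.so => /opt/intel/mkl/lib/intel64/libmkl_core.so (0x...)
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)")


def mkl_libraries(ldd_output):
    """Map each MKL library name found in `ldd` output to its resolved path."""
    libs = {}
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if match and "mkl" in match.group(1).lower():
            libs[match.group(1)] = match.group(2)
    return libs


def outside_prefix(libs, prefix):
    """Return the names of libraries that do not resolve below `prefix`.

    Unresolved libraries ("=> not found") are reported as well, because
    their "path" does not start with the prefix.
    """
    root = prefix.rstrip("/") + "/"
    return sorted(name for name, path in libs.items() if not path.startswith(root))


# Hypothetical `ldd` output: one MKL library comes from a distribution
# package, the other from the Intel installation.
SAMPLE = """\
    libmkl_core.so => /usr/lib/x86_64-linux-gnu/libmkl_core.so (0x00007f0000000000)
    libmkl_intel_lp64.so => /opt/intel/mkl/lib/intel64/libmkl_intel_lp64.so (0x00007f0000100000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0000200000)
"""

suspicious = outside_prefix(mkl_libraries(SAMPLE), "/opt/intel/mkl")
print(suspicious)
```

Any name printed here would be a candidate for the mixed-installation problem described above. An empty list only means that all MKL libraries resolve below the given prefix; it does not prove that the build is correct.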
Hi,
I am a novice at OGS6. I ran into a problem similar to yours, but I am not sure it is the same one.
I did not compile OGS myself; I just downloaded the zip from GitLab to run a prj file I wrote. I tried to solve the problem as you described, but it did not work.
Could you give me more details on how to deal with this problem, or any other suggestions?
Thank you
info: ------------------------------------------------------------------
info: [time] Linear solver took 3.51127 s.
info: Convergence criterion, component 0: |dx|=1.5533e-07, |x|=3.5545e+04, |dx|/|x|=4.3699e-12
info: Convergence criterion, component 1: |dx|=2.8061e+08, |x|=1.9345e+24, |dx|/|x|=1.4505e-16
info: Convergence criterion, component 2: |dx|=6.8381e-02, |x|=1.1999e+14, |dx|/|x|=5.6991e-16
info: Convergence criterion, component 3: |dx|=2.5835e-02, |x|=5.8525e+13, |dx|/|x|=4.4143e-16
info: [time] Iteration #50 took 3.94437 s.
info: [time] Solving process #0 took 200.756 s in time step #1
error: The nonlinear solver failed in time step #1 at t = 4.32e+06 s for process #0.
info: [time] Output of timestep 1 took 0.208888 s.
critical: E:/gitlab/builds/XBgsxgtH/0/ogs/ogs/ProcessLib/TimeLoop.cpp:735 ProcessLib::TimeLoop::solveUncoupledEquationSystems()
error: Time stepper cannot reduce the time step size further.
info: OGS terminated on 2021-03-29 22:13:36+0800.
error: OGS terminated with error.
Here is Rui’s prj file. I helped him update it: THM_prj.zip (367.2 KB)
Hi Rui_Feng,
first of all, your problem is not related to this topic: you are not using the PardisoLU solver, and in your case it is not the direct solver that failed.
I do not think there is a serious issue with your model, as the relative error is already quite low. The main problem is your convergence setting: you are using absolute tolerances with an extremely low value for the pressure (second entry), which cannot be met numerically because the relative error is already at 10^-16. If you use relative tolerances instead (<reltols> instead of <abstols>) with a threshold of e.g. 10^-10, it should work.
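The argument can be checked against the numbers in the log. The sketch below is a simplified stand-in for a per-component convergence test, not the OGS implementation. The norms are copied from the log, the relative threshold is the 10^-10 suggested above, and the absolute tolerances are hypothetical, since the values in the prj file are not quoted in this thread.

```python
import sys

# (|dx|, |x|) for components 0..3, copied from the log above.
NORMS = [
    (1.5533e-07, 3.5545e+04),
    (2.8061e+08, 1.9345e+24),
    (6.8381e-02, 1.1999e+14),
    (2.5835e-02, 5.8525e+13),
]


def relative_ok(norms, reltol):
    """Per component: is |dx| / |x| at or below the relative tolerance?"""
    return [dx / x <= reltol for dx, x in norms]


def absolute_ok(norms, abstols):
    """Per component: is |dx| at or below its absolute tolerance?"""
    return [dx <= tol for (dx, _), tol in zip(norms, abstols)]


def resolution_floor(x_norm):
    """Rough size of the smallest change representable at magnitude |x|
    in double precision (machine epsilon times |x|)."""
    return sys.float_info.epsilon * x_norm


# Hypothetical absolute tolerances, chosen only for illustration.
ABSTOLS = [1e-6, 1e-6, 1e-6, 1e-6]

print("relative:", relative_ok(NORMS, 1e-10))
print("absolute:", absolute_ok(NORMS, ABSTOLS))
print("floor for component 1:", resolution_floor(NORMS[1][1]))
```

With these numbers every component passes the relative test. For the second entry, the estimated resolution floor at |x| = 1.9345e+24 is larger than the logged |dx| = 2.8061e+08, so further iterations cannot reduce |dx| enough to satisfy a small absolute tolerance.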
I encountered the same issue on envinf2 with the latest master. From the solution above, it is not clear to me what has to be set in CMake…
Below are the MKL-related options I set in CMake:
MKL_DIR: /opt/intel/mkl
MKL_LIB_CORE: /opt/intel/mkl/lib/intel64/libmkl_core.so
MKL_LIB_INTEL: /opt/intel/mkl/lib/intel64/libmkl_intel_lp64.so
MKL_LIB_THREAD: /opt/intel/mkl/lib/intel64/libmkl_gnu_thread.so
For me, it was enough to set the MKL_DIR only, without sourcing mklvars.
If I do source /opt/intel/mkl/bin/mklvars.sh intel64 prior to cmake configure (I think, setting MKL_LIB_CORE, MKL_LIB_INTEL and MKL_LIB_THREAD will likely have the same effect) I get the above-mentioned error. The main difference seems to be that if the paths to the shared libraries are set prior to cmake configure, BLAS uses MKL which seems to cause the error in my case.
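To see whether a configure run ended up with BLAS entries pointing at MKL, as described above, one could list the relevant entries of CMakeCache.txt in the build directory. The following is a minimal sketch that relies only on the NAME:TYPE=VALUE layout of cache entries. The helper names are illustrative, and the entry EXAMPLE_BLAS_LIBRARY in the sample is hypothetical; the real variable names depend on the CMake find modules in use.

```python
def parse_cmake_cache(text):
    """Parse CMakeCache.txt content into a {name: value} dictionary.

    Comment lines (starting with '#' or '//') and blank lines are skipped.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "//")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        name = key.split(":", 1)[0]  # drop the ":TYPE" suffix
        entries[name] = value
    return entries


def mkl_related(entries):
    """Keep entries whose name or value mentions MKL (case-insensitive)."""
    return {
        name: value
        for name, value in entries.items()
        if "mkl" in name.lower() or "mkl" in value.lower()
    }


# Sample cache content. The MKL_* entries mirror the options quoted above;
# EXAMPLE_BLAS_LIBRARY is a hypothetical name used only for illustration.
SAMPLE_CACHE = """\
# This is the CMakeCache file.
//Build type
CMAKE_BUILD_TYPE:STRING=Release
MKL_DIR:PATH=/opt/intel/mkl
MKL_LIB_CORE:FILEPATH=/opt/intel/mkl/lib/intel64/libmkl_core.so
EXAMPLE_BLAS_LIBRARY:FILEPATH=/opt/intel/mkl/lib/intel64/libmkl_intel_lp64.so
"""

for name, value in sorted(mkl_related(parse_cmake_cache(SAMPLE_CACHE)).items()):
    print(f"{name} = {value}")
```

On a real build, the text would come from reading CMakeCache.txt in the build directory. Comparing the listing from a working and a failing configuration could show which entries differ.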
I am experiencing the same problem right now on envinf1. The last command from @renchao.lu sadly did not make a difference for me. I also tried removing the paths for MKL_LIB_CORE, MKL_LIB_INTEL and MKL_LIB_THREAD, but this didn’t work either. Any help would be appreciated.
I think it is about the MKL paths for the BLAS library. So, you could try to delete them and reconfigure.
To be on the safe side: clean up the build directory first.
@joergbuchwald if you mean the three MKL variables I listed above: I cleared my build directory and rebuilt OGS twice, once with their paths set and once with the paths deleted during configuration. Neither attempt worked.