Hello everyone,
does anyone have experience using MUMPS (via PETSc) with OGS?
The parameter set below does not work: OGS falls back to the default solver.
<linear_solver>
    <name>general_linear_solver</name>
    <petsc>
        <parameters>-mat_type aij -pc_type lu -pc_factor_mat_solver_package mumps -ksp_view</parameters>
    </petsc>
</linear_solver>
Do you have any suggestions on how to correctly activate and parametrize it?
Thanks!
Best,
Tuanny
PS: Please disregard my last post “MUMPS solver - Parameters”.
Dear all,
the simulation runs with
<linear_solver>
    <name>general_linear_solver</name>
    <petsc>
        <prefix>hc</prefix>
        <parameters>-hc_mat_type aij -hc_pc_type lu -hc_pc_factor_mat_solver_type mumps -hc_ksp_view</parameters>
    </petsc>
</linear_solver>
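This seems to be the simplest parameter combination that works. Compared with my first attempt, it adds a <prefix> (with every option prefixed accordingly) and uses -pc_factor_mat_solver_type instead of -pc_factor_mat_solver_package.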
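Since every option in the working <parameters> string carries the hc prefix, it is easy to forget one when adding more options by hand. The following is only a minimal Python sketch of how such a string could be assembled; the helper name prefixed_petsc_parameters is hypothetical and not part of OGS or PETSc, and the option names are the ones from the snippet above.

```python
def prefixed_petsc_parameters(prefix, options):
    """Build a PETSc options string in which every option carries the prefix.

    `options` maps unprefixed option names (without the leading dash) to a
    value, or to None for flags that take no value (e.g. ksp_view).
    This is a hypothetical helper for illustration only.
    """
    parts = []
    for name, value in options.items():
        # "-pc_type" with prefix "hc" becomes "-hc_pc_type"
        parts.append(f"-{prefix}_{name}")
        if value is not None:
            parts.append(str(value))
    return " ".join(parts)


params = prefixed_petsc_parameters(
    "hc",
    {
        "mat_type": "aij",
        "pc_type": "lu",
        "pc_factor_mat_solver_type": "mumps",
        "ksp_view": None,
    },
)
print(params)
```

The printed string can be pasted into the <parameters> element, with the same prefix given in <prefix>.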
Example output (from -hc_ksp_view):
KSP Object: (hc_) 1 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: (hc_) 1 MPI processes
type: lu
out-of-place factorization
tolerance for zero pivot 2.22045e-14
matrix ordering: nd
factor fill ratio given 0., needed 0.
Factored matrix follows:
Mat Object: 1 MPI processes
type: mumps
rows=26874, cols=26874
package used to perform factorization: mumps
total: nonzeros=5792610, allocated nonzeros=5792610
MUMPS run parameters:
SYM (matrix type): 0
PAR (host participation): 1
ICNTL(1) (output for error): 6
ICNTL(2) (output of diagnostic msg): 0
ICNTL(3) (output for global info): 0
ICNTL(4) (level of printing): 0
ICNTL(5) (input mat struct): 0
ICNTL(6) (matrix prescaling): 7
ICNTL(7) (sequential matrix ordering):7
ICNTL(8) (scaling strategy): 77
ICNTL(10) (max num of refinements): 0
ICNTL(11) (error analysis): 0
ICNTL(12) (efficiency control): 1
ICNTL(13) (sequential factorization of the root node): 0
ICNTL(14) (percentage of estimated workspace increase): 20
ICNTL(18) (input mat struct): 0
ICNTL(19) (Schur complement info): 0
ICNTL(20) (RHS sparse pattern): 0
ICNTL(21) (solution struct): 0
ICNTL(22) (in-core/out-of-core facility): 0
ICNTL(23) (max size of memory can be allocated locally):0
ICNTL(24) (detection of null pivot rows): 0
ICNTL(25) (computation of a null space basis): 0
ICNTL(26) (Schur options for RHS or solution): 0
ICNTL(27) (blocking size for multiple RHS): -32
ICNTL(28) (use parallel or sequential ordering): 1
ICNTL(29) (parallel ordering): 0
ICNTL(30) (user-specified set of entries in inv(A)): 0
ICNTL(31) (factors is discarded in the solve phase): 0
ICNTL(33) (compute determinant): 0
ICNTL(35) (activate BLR based factorization): 0
ICNTL(36) (choice of BLR factorization variant): 0
ICNTL(38) (estimated compression rate of LU factors): 333
CNTL(1) (relative pivoting threshold): 0.01
CNTL(2) (stopping criterion of refinement): 1.49012e-08
CNTL(3) (absolute pivoting threshold): 0.
CNTL(4) (value of static pivoting): -1.
CNTL(5) (fixation for null pivots): 0.
CNTL(7) (dropping parameter for BLR): 0.
RINFO(1) (local estimated flops for the elimination after analysis):
[0] 8.52752e+08
RINFO(2) (local estimated flops for the assembly after factorization):
[0] 7.85459e+06
RINFO(3) (local estimated flops for the elimination after factorization):
[0] 8.52752e+08
INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization):
[0] 72
INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization):
[0] 72
INFO(23) (num of pivots eliminated on this processor after factorization):
[0] 26874
RINFOG(1) (global estimated flops for the elimination after analysis): 8.52752e+08
RINFOG(2) (global estimated flops for the assembly after factorization): 7.85459e+06
RINFOG(3) (global estimated flops for the elimination after factorization): 8.52752e+08
(RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0)
INFOG(3) (estimated real workspace for factors on all processors after analysis): 5792610
INFOG(4) (estimated integer workspace for factors on all processors after analysis): 261736
INFOG(5) (estimated maximum front size in the complete tree): 399
INFOG(6) (number of nodes in the complete tree): 1394
INFOG(7) (ordering option effectively use after analysis): 5
INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100
INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 5792610
INFOG(10) (total integer space store the matrix factors after factorization): 261736
INFOG(11) (order of largest frontal matrix after factorization): 399
INFOG(12) (number of off-diagonal pivots): 0
INFOG(13) (number of delayed pivots after factorization): 0
INFOG(14) (number of memory compress after factorization): 0
INFOG(15) (number of steps of iterative refinement after solution): 0
INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 72
INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 72
INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 72
INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 72
INFOG(20) (estimated number of entries in the factors): 5792610
INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 63
INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 63
INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0
INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1
INFOG(25) (after factorization: number of pivots modified by static pivoting): 0
INFOG(28) (after factorization: number of null pivots encountered): 0
INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 5792610
INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 65, 65
INFOG(32) (after analysis: type of analysis done): 1
INFOG(33) (value used for ICNTL(8)): 7
INFOG(34) (exponent of the determinant if determinant is requested): 0
INFOG(35) (after factorization: number of entries taking into account BLR factor compression - sum over all processors): 5792610
INFOG(36) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - value on the most memory consuming processor): 0
INFOG(37) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - sum over all processors): 0
INFOG(38) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - value on the most memory consuming processor): 0
INFOG(39) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - sum over all processors): 0
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: mpiaij
rows=26874, cols=26874
total: nonzeros=1054062, allocated nonzeros=1054062
total number of mallocs used during MatSetValues calls=0
using I-node (on process 0) routines: found 11493 nodes, limit used is 5
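To check from a saved log whether MUMPS was actually picked up, one can look for the "package used to perform factorization" line in the output above. A minimal Python sketch, assuming the log contains that line in the same wording as printed here; factorization_package is a hypothetical helper name, not an OGS or PETSc function.

```python
import re


def factorization_package(ksp_view_text):
    """Return the factorization package named in a -ksp_view log, or None.

    Assumes the log contains a line of the form
    'package used to perform factorization: <name>' as in the output above.
    """
    match = re.search(
        r"package used to perform factorization:\s*(\S+)", ksp_view_text
    )
    return match.group(1) if match else None


# Shortened excerpt of the log shown above
sample = """\
PC Object: (hc_) 1 MPI processes
type: lu
Factored matrix follows:
Mat Object: 1 MPI processes
type: mumps
package used to perform factorization: mumps
"""
print(factorization_package(sample))
```

A return value of None would mean that the log does not contain a factorization line at all.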
Best,
Tuanny
This is great news! Thanks for finding the solution. Now more interesting solvers can be used.
The prefixed PETSc solver settings, and which of them takes priority, do indeed need some clarification.
Hi Dima, indeed there are many options! We need to experiment a bit more to see where and how we can optimize the calculations. Here is a link with additional parameters:
https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MATSOLVERMUMPS.html
Thanks indeed. Now we can use a direct solver in a PETSc environment. This is great.