sandialabs / albany
Sandia National Laboratories' Albany multiphysics code
License: Other
Matlab and Objective-C are often misidentified due to their common use of the .m extension. This is a common issue, so I'm creating a meta-issue that we can point all such reports to.
As a developer, I want to incorporate a system to track stories in a backlog
Right now the Discretization calls some PUMI functions that behave like MPI_Init; I should move those out to the main function instead. We may have multiple Discretizations at some point.
There is a lot of code duplication in Albany_SolverFactory.cpp, in particular having to do with creating a linearSolverBuilder for different physics that have their own model evaluators. This should be cleaned up.
Dave and Coleman have done some work for the crystal plasticity model, and Chris S. has also worked on YAML.
This is actually a Trilinos issue, as the relevant code is in Piro at this point, but it's really only used by Albany. There are bits of code that assume adaptivity only makes the mesh bigger, which cause serious failures when the mesh gets smaller.
I'll try to code a patch while I still remember where that code is, this issue can track our progress.
CMake code and a subdirectory still remain from this now-obsolete component.
The following problem has been occurring for Schwarz tests in an all-debug build of Trilinos/Albany for at least 1 year, possibly from the beginning of when Schwarz was created. It may be related to Trilinos issue #184: trilinos/Trilinos#184 .
The code in question is this function:
Teuchos::RCP<Thyra::VectorSpaceBase const>
LCM::SchwarzMultiscale::get_g_space(int l) const
{
  assert(0 <= l && l < num_responses_total_);

  std::vector<Teuchos::RCP<Thyra::VectorSpaceBase const>> vs_array;

  // Create the product space for the lth response by concatenating the
  // lth response from all the models.
  for (auto m = 0; m < num_models_; ++m) {
    vs_array.push_back(
        Thyra::createVectorSpace<ST, LO, GO, KokkosNode>(
            apps_[m]->getResponse(l)->responseMapT()));
  }

  return Thyra::productVectorSpace(vs_array);
}
which gives the following when run:
p=0: *** Caught standard std::exception of type 'Teuchos::DanglingReferenceError' :
/home/amota/LCM/trilinos-install-serial-gcc-debug/include/Teuchos_RCPNode.hpp:605:
Throw number = 1
Throw test that evaluated to true: true
Error, an attempt has been made to dereference the underlying object
from a weak smart pointer object where the underling object has already
been deleted since the strong count has already gone to zero.
Context information:
RCP type:
Teuchos::RCP<Thyra::DefaultProductVectorSpace const>
RCP address: 0x306d850
RCPNode type:
Teuchos::RCPNodeTmpl<Thyra::DefaultProductVectorSpace,
Teuchos::DeallocDelete<Thyra::DefaultProductVectorSpace > >
RCPNode address: 0x2abcb20
insertionNumber: 18227
RCP ptr address: 0x2a20e20
Concrete ptr address: 0x2a20e20
NOTE: To debug issues, open a debugger and set a breakpoint in the function where
the RCPNode object is first created to determine the context where the object
first gets created. Each RCPNode object is given a unique insertionNumber to
allow setting breakpoints in the code. For example, in GDB one can:
1. Open the debugger (GDB) and run the program again to get updated object addresses.
2. Set a breakpoint in the RCPNode insertion routine when the desired RCPNode is
   first inserted. To break when the RCPNode with insertionNumber==3 is added, do:
   (gdb) b 'Teuchos::RCPNodeTracer::addNewRCPNode( [TAB] ' [ENTER]
   (gdb) cond 1 insertionNumber==3 [ENTER]
   (gdb) run [ENTER]
When run in the debugger as instructed above, the code stops at the return line of the function.
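The failure mode is the classic dangling weak reference: the product space appears to hold only weak handles to the factor spaces, which are strongly owned only by the local vs_array that dies when get_g_space returns. The following self-contained sketch reproduces the same lifetime bug with std::shared_ptr/std::weak_ptr standing in for Teuchos::RCP (all names here are illustrative, not Albany or Trilinos code):

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Illustrative stand-in for a vector space; not an Albany/Trilinos type.
struct Space { int dim; };

// Mimics the shape of get_g_space: the strong owners (factors) are local to
// the function, and the returned "product" holds only weak references.
std::vector<std::weak_ptr<Space>> make_product_space() {
  std::vector<std::shared_ptr<Space>> factors;  // strong owners, function-local
  factors.push_back(std::make_shared<Space>(Space{3}));

  std::vector<std::weak_ptr<Space>> product(factors.begin(), factors.end());
  return product;  // factors die here, so every entry of product dangles
}
```

Dereferencing any entry of the returned vector is exactly the "underlying object has already been deleted since the strong count has already gone to zero" error above; the fix is to ensure the returned object (or the enclosing class) holds strong references to the factor spaces.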
Improve usability and the ability to collaborate, and automate the process on a select few clusters. This can be accomplished by adding testing to the clusters.
As both a user and a developer, I would like the ability to write restart files and then output solver (and state) information at each iteration. This would help in understanding the source of numerical instabilities in the iterative process.
albanyLib uses symbols from albanySTK (e.g., Albany::STKDiscretization::getOwnedDOF), which in turn uses symbols from albanyLib (e.g., createEpetraCommFromTeuchosComm). There are probably more.
What is the best way to untangle this circular dependency?
Using the Elasticity2DTriangles inputT.xml input file as an example:
https://github.com/gahansen/Albany/blob/master/examples/LCM/Elasticity2DTriangles/inputT.xml
It appears that the regression comparison for the "Solution Average" response function is not actually checked when running the problem; i.e., I can change the Test Value on this line:
https://github.com/gahansen/Albany/blob/master/examples/LCM/Elasticity2DTriangles/inputT.xml#L109
and the number of failed comparisons at the end of the Albany run will still be 0.
Is this expected behavior? I seem to recall this behaving differently a few years ago.
Cheers,
Brian
I have been making an effort to remove Epetra member variables in PHAL::Workset, as a part of the conversion of ATO to Tpetra/Thyra. This includes in particular responses and their derivatives (workset.g, workset.dgdp, and workset.overlapped_dgdp). I was able to successfully remove the Epetra versions of g and overlapped_dgdp in PHAL::Workset and replace them with Tpetra analogs throughout Albany. I am having issues with workset.dgdp, however. The only place where workset.dgdp appears is:
src/evaluators/PHAL_SeparableScatterScalarResponse_Def.hpp
responses/Albany_FieldManagerScalarResponseFunction.cpp
The switch from workset.dgdp to workset.dgdpT should be trivial. However, when I make the change, the ATO::Constraint2D_adj test fails. @mperego, can you please have a look at this, as the author of distributed responses?
In multiple cases of special use of Albany, users have had to code alternatives to the Main_Solve+Application driver system. A wiki page will document our efforts to establish an organized way to compose different solver architectures from Albany components.
After a clean Albany build, test 28 fails:
[...]
99% tests passed, 1 tests failed out of 119
Total Test time (real) = 2410.31 sec
The following tests FAILED:
28 - MPNIQuad2D_Tpetra (Failed)
Errors while running CTest
In more detail:
UpdateCTestConfiguration from :/home/nschloe/software/albany/build/DartConfiguration.tcl
UpdateCTestConfiguration from :/home/nschloe/software/albany/build/DartConfiguration.tcl
Test project /home/nschloe/software/albany/build
Constructing a list of tests
Done constructing a list of tests
Checking test dependency graph...
Checking test dependency graph end
test 28
Start 28: MPNIQuad2D_Tpetra
28: Test command: /usr/bin/mpiexec "-np" "4" "/home/nschloe/software/albany/build/src/AlbanyT" "inputT.xml"
28: Test timeout computed to be: 9.99988e+06
28: Teuchos::GlobalMPISession::GlobalMPISession(): started processor with name wind and rank 1!
28: Teuchos::GlobalMPISession::GlobalMPISession(): started processor with name wind and rank 0!
28: Teuchos::GlobalMPISession::GlobalMPISession(): started processor with name wind and rank 2!
28: Teuchos::GlobalMPISession::GlobalMPISession(): started processor with name wind and rank 3!
28: TmplSTKMeshStruct:: Creating 2D mesh of size 20x20 elements and scaled to 1x1
28: Heat Problem Num MeshSpecs: 1
28: Heat Problem Num MeshSpecs: 1
28: Heat Problem Num MeshSpecs: 1
28: Heat Problem Num MeshSpecs: 1
28: Field Dimensions: Workset=50, Vertices= 4, Nodes= 4, QuadPts= 4, Dim= 2
28: Field Dimensions: Workset=50, Vertices= 4, Nodes= 4, QuadPts= 4, Dim= 2
28: Field Dimensions: Workset=50, Vertices= 4, Nodes= 4, QuadPts= 4, Dim= 2
28: Field Dimensions: Workset=50, Vertices= 4, Nodes= 4, QuadPts= 4, Dim= 2
28: Field Dimensions: Workset=50, Vertices= 4, Nodes= 4, QuadPts= 4, Dim= 2
28: Field Dimensions: Workset=50, Vertices= 4, Nodes= 4, QuadPts= 4, Dim= 2
28: Field Dimensions: Workset=50, Vertices= 4, Nodes= 4, QuadPts= 4, Dim= 2
28: Field Dimensions: Workset=50, Vertices= 4, Nodes= 4, QuadPts= 4, Dim= 2
28: STKDisc: 100 elements on Proc 0
28: STKDisc: nodeset NodeSet0 has size 6 on Proc 0.
28: STKDisc: nodeset NodeSet1 has size 6 on Proc 0.
28: STKDisc: nodeset NodeSet2 has size 21 on Proc 0.
28: STKDisc: nodeset NodeSet3 has size 0 on Proc 0.
28: STKDisc: nodeset NodeSet99 has size 1 on Proc 0.
28: STKDisc: sideset SideSet0 has size 5 on Proc 0.
28: STKDisc: sideset SideSet1 has size 5 on Proc 0.
28: STKDisc: sideset SideSet2 has size 20 on Proc 0.
28: STKDisc: sideset SideSet3 has size 0 on Proc 0.
28:
28: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
28: Sacado ParameterLibrary has been initialized:
28: Library of all registered parameters:
28: DBC on NS NodeSet0 for DOF T: Supports AD = 1, Supports_Analytic = 0
28: DBC on NS NodeSet1 for DOF T: Supports AD = 1, Supports_Analytic = 0
28: DBC on NS NodeSet2 for DOF T: Supports AD = 1, Supports_Analytic = 0
28: DBC on NS NodeSet3 for DOF T: Supports AD = 1, Supports_Analytic = 0
28: Quadratic Nonlinear Factor: Supports AD = 1, Supports_Analytic = 0
28: Thermal Conductivity KL Random Variable 0: Supports AD = 1, Supports_Analytic = 0
28: Thermal Conductivity KL Random Variable 1: Supports AD = 1, Supports_Analytic = 0
28: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
28:
28: Number of parameter vectors = 1
28: Number of parameters in parameter vector 0 = 2
28: Number of distributed parameters vectors = 0
28: Number of response vectors = 1
28: ************************************************************************
28: -- Status Test Results --
28: **...........OR Combination ->
28: **...........AND Combination ->
28: **...........F-Norm = 4.850e-01 < 1.000e-08
28: (Length-Scaled Two-Norm, Absolute Tolerance)
28: ??...........WRMS-Norm = 1.000e+12 < 1
28: **...........Number of Iterations = 0 < 10
28: ************************************************************************
28:
28: ************************************************************************
28: -- Nonlinear Solver Step 0 --
28: ||F|| = 1.019e+01 step = 0.000e+00 dx = 0.000e+00
28: ************************************************************************
28:
28:
28: ******* WARNING *******
28: MueLu::AmesosSmoother: "Amesos-klu" is not available. Using "Klu" instead
28: multigrid algorithm = sa
28: smoother: type = MueLu symmetric Gauss-Seidel
28: smoother: pre or post = both
28: coarse: type = Amesos-KLU
28: number of equations = 1
28: coarse: max size = 2000 [default]
28: max levels = 10 [default]
28:
28: Clearing old data (if any)
28:
28: ******* WARNING *******
28: Hierarchy::ReplaceCoordinateMap: matrix and coordinates maps are same, skipping...
28: Using default factory (MueLu::AmalgamationFactory) for building 'UnAmalgamationInfo'.
28: Level 0
28: Max coarse size (<= 2000) achieved
28:
28: p=0: *** Caught standard std::exception of type 'MueLu::Exceptions::RuntimeError' :
28:
28: /home/nschloe/software/trilinos/privateTrilinos/packages/muelu/src/Smoothers/MueLu_DirectSolver_def.hpp:146:
28:
28: Throw number = 3
28:
28: Throw test that evaluated to true: true
28:
28: Direct solver for Tpetra was not constructed
28: during request for data " PreSmoother" on level 0 by factory NoFactory
28:
28: p=1: *** Caught standard std::exception of type 'MueLu::Exceptions::RuntimeError' :
28:
28: /home/nschloe/software/trilinos/privateTrilinos/packages/muelu/src/Smoothers/MueLu_DirectSolver_def.hpp:146:
28:
28: Throw number = 3
28:
28: Throw test that evaluated to true: true
28:
28: Direct solver for Tpetra was not constructed
28: during request for data " PreSmoother" on level 0 by factory NoFactory
28:
28: p=2: *** Caught standard std::exception of type 'MueLu::Exceptions::RuntimeError' :
28:
28: /home/nschloe/software/trilinos/privateTrilinos/packages/muelu/src/Smoothers/MueLu_DirectSolver_def.hpp:146:
28:
28: Throw number = 3
28:
28: Throw test that evaluated to true: true
28:
28: Direct solver for Tpetra was not constructed
28: during request for data " PreSmoother" on level 0 by factory NoFactory
28:
28: p=3: *** Caught standard std::exception of type 'MueLu::Exceptions::RuntimeError' :
28:
28: /home/nschloe/software/trilinos/privateTrilinos/packages/muelu/src/Smoothers/MueLu_DirectSolver_def.hpp:146:
28:
28: Throw number = 3
28:
28: Throw test that evaluated to true: true
28:
28: Direct solver for Tpetra was not constructed
28: during request for data " PreSmoother" on level 0 by factory NoFactory
28: =================================================================================================================================================
28:
28: TimeMonitor results over 4 processors
28:
28: Timer Name MinOverProcs MeanOverProcs MaxOverProcs MeanOverCallCounts
28: -------------------------------------------------------------------------------------------------------------------------------------------------
28: > Albany Fill: Jacobian 0.01765 (1) 0.01767 (1) 0.01768 (1) 0.01767 (1)
28: > Albany Fill: Jacobian Export 0.001342 (1) 0.001653 (1) 0.001796 (1) 0.001653 (1)
28: > Albany Fill: Residual 0.007272 (1) 0.007329 (1) 0.00739 (1) 0.007329 (1)
28: Albany: ***Total Time*** 0.2639 (1) 0.264 (1) 0.264 (1) 0.264 (1)
28: Albany: **Total Fill Time** 0.02518 (2) 0.02523 (2) 0.02528 (2) 0.01262 (2)
28: Albany: Setup Time 0.1873 (1) 0.1874 (1) 0.1875 (1) 0.1874 (1)
28: MueLu: Hierarchy: Setup (total) 0.002472 (1) 0.002576 (1) 0.002733 (1) 0.002576 (1)
28: MueLu: Hierarchy: Setup (total, level=0) 0.002424 (1) 0.002528 (1) 0.002683 (1) 0.002528 (1)
28: NOX Total Preconditioner Construction 0.03764 (1) 0.03766 (1) 0.03768 (1) 0.03766 (1)
28: Phalanx: Evaluator 0: Gather Solution<Residual> 9.608e-05 (2) 9.698e-05 (2) 9.894e-05 (2) 4.849e-05 (2)
28: -------------------------------------------------------------------------------------------------------------------------------------------------
28: Phalanx: Evaluator 10: HeatEqResid 0.0006161 (2) 0.0006354 (2) 0.0006518 (2) 0.0003177 (2)
28: Phalanx: Evaluator 11: Gather Solution<Jacobian> 0.000526 (2) 0.0005313 (2) 0.0005381 (2) 0.0002657 (2)
28: Phalanx: Evaluator 12: Scatter<Jacobian> 0.001784 (2) 0.001794 (2) 0.001812 (2) 0.000897 (2)
28: Phalanx: Evaluator 13: Gather Coordinate Vector 5.627e-05 (2) 5.99e-05 (2) 6.509e-05 (2) 2.995e-05 (2)
28: Phalanx: Evaluator 14: MapToPhysicalFrame 0.0003619 (2) 0.0003652 (2) 0.0003719 (2) 0.0001826 (2)
28: Phalanx: Evaluator 15: ComputeBasisFunctions<Jacobian> 0.002013 (2) 0.002025 (2) 0.002034 (2) 0.001012 (2)
28: Phalanx: Evaluator 16: DOFInterpolation Jacobian 0.0007648 (2) 0.0007775 (2) 0.0007868 (2) 0.0003887 (2)
28: Phalanx: Evaluator 17: DOFInterpolation Jacobian 0.0007839 (2) 0.0007962 (2) 0.0008049 (2) 0.0003981 (2)
28: Phalanx: Evaluator 18: DOFGradInterpolation Jacobian 0.001594 (2) 0.001629 (2) 0.001671 (2) 0.0008146 (2)
28: Phalanx: Evaluator 19: Thermal Conductivity 0.0006123 (2) 0.0006338 (2) 0.000653 (2) 0.0003169 (2)
28: -------------------------------------------------------------------------------------------------------------------------------------------------
28: Phalanx: Evaluator 1: Scatter<Residual> 4.792e-05 (2) 4.846e-05 (2) 4.911e-05 (2) 2.423e-05 (2)
28: Phalanx: Evaluator 20: QuadraticSource 0.0004139 (2) 0.0004179 (2) 0.0004249 (2) 0.000209 (2)
28: Phalanx: Evaluator 21: HeatEqResid 0.004164 (2) 0.004187 (2) 0.004231 (2) 0.002094 (2)
28: Phalanx: Evaluator 2: Gather Coordinate Vector 7.2e-05 (2) 7.355e-05 (2) 7.701e-05 (2) 3.678e-05 (2)
28: Phalanx: Evaluator 3: MapToPhysicalFrame 0.0003679 (2) 0.000375 (2) 0.0003872 (2) 0.0001875 (2)
28: Phalanx: Evaluator 44: DBC on NS NodeSet0 for DOF T 3.719e-05 (1) 3.862e-05 (1) 4.101e-05 (1) 3.862e-05 (1)
28: Phalanx: Evaluator 45: DBC on NS NodeSet0 for DOF T 7.796e-05 (1) 8.279e-05 (1) 9.203e-05 (1) 8.279e-05 (1)
28: Phalanx: Evaluator 48: DBC on NS NodeSet1 for DOF T 7.868e-06 (1) 8.225e-06 (1) 8.821e-06 (1) 8.225e-06 (1)
28: Phalanx: Evaluator 49: DBC on NS NodeSet1 for DOF T 5.698e-05 (1) 5.949e-05 (1) 6.7e-05 (1) 5.949e-05 (1)
28: Phalanx: Evaluator 4: ComputeBasisFunctions<Residual> 0.002193 (2) 0.002223 (2) 0.002243 (2) 0.001112 (2)
28: -------------------------------------------------------------------------------------------------------------------------------------------------
28: Phalanx: Evaluator 52: DBC on NS NodeSet2 for DOF T 5.96e-06 (1) 6.258e-06 (1) 6.914e-06 (1) 6.258e-06 (1)
28: Phalanx: Evaluator 53: DBC on NS NodeSet2 for DOF T 5.007e-06 (1) 5.507e-05 (1) 0.0002031 (1) 5.507e-05 (1)
28: Phalanx: Evaluator 56: DBC on NS NodeSet3 for DOF T 5.007e-06 (1) 5.722e-06 (1) 6.914e-06 (1) 5.722e-06 (1)
28: Phalanx: Evaluator 57: DBC on NS NodeSet3 for DOF T 4.053e-06 (1) 5.531e-05 (1) 0.0002081 (1) 5.531e-05 (1)
28: Phalanx: Evaluator 5: DOFInterpolation 0.0001428 (2) 0.0001435 (2) 0.000144 (2) 7.173e-05 (2)
28: Phalanx: Evaluator 60: Dirichlet Aggregator 0 (1) 7.153e-07 (1) 9.537e-07 (1) 7.153e-07 (1)
28: Phalanx: Evaluator 61: Dirichlet Aggregator 0 (1) 2.384e-07 (1) 9.537e-07 (1) 2.384e-07 (1)
28: Phalanx: Evaluator 6: DOFInterpolation 0.0001409 (2) 0.0001413 (2) 0.0001421 (2) 7.066e-05 (2)
28: Phalanx: Evaluator 7: DOFGradInterpolation 0.000294 (2) 0.0002961 (2) 0.0002978 (2) 0.0001481 (2)
28: Phalanx: Evaluator 8: Thermal Conductivity 0.0003879 (2) 0.0003957 (2) 0.0004091 (2) 0.0001979 (2)
28: -------------------------------------------------------------------------------------------------------------------------------------------------
28: Phalanx: Evaluator 9: QuadraticSource 3.505e-05 (2) 3.505e-05 (2) 3.505e-05 (2) 1.752e-05 (2)
28: Thyra::DefaultModelEvaluatorWithSolveFactory<double>::evalModel(...) 0.02554 (2) 0.02559 (2) 0.02563 (2) 0.01279 (2)
28: Thyra::NOXNonlinearSolver::solve 0.0676 (1) 0.06766 (1) 0.0677 (1) 0.06766 (1)
28: =================================================================================================================================================
28: --------------------------------------------------------------------------
28: mpiexec noticed that the job aborted, but has no info as to the process
28: that caused that situation.
28: --------------------------------------------------------------------------
1/1 Test #28: MPNIQuad2D_Tpetra ................***Failed 2.79 sec
0% tests passed, 1 tests failed out of 1
Total Test time (real) = 2.80 sec
The following tests FAILED:
28 - MPNIQuad2D_Tpetra (Failed)
Errors while running CTest
@jewatkins , can you please do this, e.g., for the Ride machine, for both the P100s and K80s?
The external Albany CDash dashboard:
http://my.cdash.org/index.php?project=Albany
should have its Trilinos configuration script updated
to build the new Trilinos package "MiniTensor". I
believe this is as simple as adding a line like
-DTrilinos_ENABLE_MiniTensor:BOOL=ON \
somewhere in the following:
The following FELIX tests failed in last night builds due to the change in default workset size:
FO_GIS_GisUnstructured
FO_GIS_GisCoupledThickness
FO_GIS_GisManifold
FO_GIS_GisRestartUnstructured
FO_GIS_GisAdjointSensitivity
FO_GIS_GisWedgeAdjointSensitivity
FO_GIS_GisAdjointSensitivityBasalFriction
FO_GIS_GisAdjointSensitivityStiffeningBasalFriction
FO_GIS_GisSensSMBwrtBeta
FO_GIS_GisSensSMBwrtBetaRestart
It looks like the errors are all the same:
Test output
Albany_IOSS: Loading STKMesh from Exodus file ../ExoMeshes/gis_unstruct_2d.exo
Using decomposition method 'RIB' on 4 processors.
p=0: *** Caught standard std::exception of type 'Teuchos::Exceptions::InvalidParameterType' :
Error! An attempt was made to access parameter "Workset Size" of type "int"
in the parameter (sub)list "Albany Parameters->Discretization"
using the incorrect type "Albany::AbstractMeshStruct::{unnamed type#1}"!
Throw number = 1
See http://cdash.sandia.gov/CDash-2-3-0/viewTest.php?onlydelta&buildid=47874 (SRN CDash).
I've created a Wiki page here:
https://github.com/gahansen/Albany/wiki/Formatting-Tools
which describes clang-format, a tool that automatically formats C++ files, and xmllint, a tool that automatically formats XML files.
Based on discussion in the weekly phone call, possible uses of these tools include:
- clang-format styles that accurately match our own style(s) (an existing example is src/LCM/.clang-format)
- using clang-format to also clean up the styling of it, and xmllint to clean up its indentation
- applying clang-format to most of the source
- applying xmllint to most of the input.xml files
As always, this issue is here to gather feedback on this. We may start to iterate on a .clang-format file in the near future.
@gahansen more or less owns this code and we're fairly sure it no longer builds. Based on discussion this afternoon we will soon remove Hydride.
There is code in Albany to use STK Adapt and Rebalance, and it would simplify Albany if we removed it. If we instead choose to keep it, then we need to bring it up to date and start testing it because it likely does not work like it used to.
This issue is to gather opinions from Albany users, in particular if you object to its removal. Although we don't have a specific date, it is likely that within the next month a decision will be made based on feedback.
Is anyone opposed to turning off Epetra tests in a CUDA build of Albany? Are Epetra tests meant to execute correctly on GPUs, in particular when UVM is on? I am wary of this usage and would be inclined to run only Tpetra tests in a CUDA build of Albany, unless I am missing something.
There is currently a conflict in the application of Dirichlet and Schwarz boundary conditions. If both are defined for a specific degree of freedom, both are applied to it. Thus the last specified BC prevails.
This needs to be changed so that if both Schwarz and Dirichlet BCs are applied to the same DOF, the Schwarz BC is ignored and the Dirichlet BC takes precedence.
This is needed to improve convergence for matrix-free GMRES + Schwarz (= monolithic Schwarz), and will require changes in Piro. I will need some help from Trilinos experts, e.g., @rppawlo , on how to hook this up, as it is nontrivial.
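A minimal sketch of the intended precedence rule (the type and function names here are hypothetical, not the actual Albany BC machinery): instead of letting the last-applied condition win, record the condition per DOF and never let a Schwarz BC overwrite a Dirichlet one.

```cpp
#include <map>

// Hypothetical BC bookkeeping; not actual Albany code.
enum class BCKind { None, Schwarz, Dirichlet };

// Apply a BC to a DOF with Dirichlet taking precedence over Schwarz,
// regardless of the order in which the two are specified.
void apply_bc(std::map<int, BCKind>& table, int dof, BCKind kind) {
  BCKind& current = table[dof];  // value-initialized to None on first access
  if (current != BCKind::Dirichlet) {
    current = kind;  // a Schwarz BC never overwrites an existing Dirichlet BC
  }
}
```

With this rule, applying Dirichlet then Schwarz (or Schwarz then Dirichlet) to the same DOF both leave the Dirichlet BC in effect, which is the behavior requested above.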
Both commit 9116dba and issue #48 report solutions being affected by workset size. My main concern is that if they are affected by workset size, then there is likely an error being introduced all the time, so even the results in the current "gold" files may be wrong. This is a meta-issue to track progress in finding and fixing such errors. My guess is most of this comes from the "last" workset, which is often a fixed size but whose last few elements are fake to make up the difference. There is probably logic in evaluators assuming the workset is all real cells.
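A sketch of the suspected failure mode (the names here are illustrative, not the actual Albany data structures): if worksets are built at a fixed size with the tail padded by fake cells, any evaluator that loops over the allocated workset size instead of the number of valid cells touches the padding and contaminates the result.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative workset bookkeeping; not actual Albany code.
struct Workset {
  int size;   // allocated (fixed) workset size
  int valid;  // number of real cells; cells [valid, size) are padding
};

std::vector<Workset> build_worksets(int num_elements, int workset_size) {
  std::vector<Workset> out;
  for (int start = 0; start < num_elements; start += workset_size) {
    out.push_back({workset_size, std::min(workset_size, num_elements - start)});
  }
  return out;
}

// A correct evaluator loops over ws.valid; looping over ws.size would also
// visit the fake padding cells in the last workset.
int count_real_cells(const std::vector<Workset>& worksets) {
  int n = 0;
  for (const Workset& ws : worksets) n += ws.valid;
  return n;
}
```

For example, 101 elements at workset size 50 yield three worksets whose last workset has only one real cell; an evaluator assuming 50 real cells there would process 49 fake ones.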
Longer sets of tests allow more rigorous verification of physics.
I was thinking that if someone never uses classic PDE evaluators/problems (like Navier-Stokes, Helmholtz, CahnHill...), it would be neat to pack them in an optional sub-library, something like albanyPDE or albanyClassicPDE or whatever. They would be compiled only upon request from the configuration. I understand that these problems are used for testing, so perhaps this sub-library should be forced to be enabled if examples are enabled.
Not that it would save hours of compilation time, but there are quite a few evaluators for these classes, and when I switch branches or change something in the main library directory, it increases my compilation time by a time > epsilon.
My two cents.
A few of the ATO tests began to fail last night with the change in default workset size, due to having been baselined against a gold file generated with workset size = 50 and tight exodiff tolerances. I attempted to fix the tests by setting the workset size = 50 in the input file (adding the line under "Discretization"), but it does not appear to work: when I print the workset size in Albany::IOSSSTKMeshStruct, it is > 50. I have verified that commit 2eb2559 (change in workset size default value) is what broke this test. @ibaned , can you please look into why the workset size specified in the input file does not override the default value in the code for these tests?
This has been broken for awhile.
Newer GCC versions (6.2.0, for one) warn about usage of a deprecated variant of addDependentField. Basically, it seems the right thing to do is to declare your dependent fields with const data types, i.e., const ScalarT instead of just ScalarT. This is a big job to do throughout Albany, but fairly straightforward.
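The following minimal analog (a hypothetical Field template, not the real Phalanx API) shows what declaring dependent fields with const data types buys: a field declared with element type const ScalarT can be read in an evaluator, but any accidental write is rejected at compile time.

```cpp
#include <cassert>

// Hypothetical minimal analog of a Phalanx-style field; not the real API.
template <typename T>
struct Field {
  T* data;
  T& operator()(int i) const { return data[i]; }
};

// A dependent (read-only) field is declared with element type const ScalarT.
// Reads compile fine; a write such as `f(i) = 0;` would fail to compile
// because the element type is const.
template <typename ScalarT>
double sum_dependent(Field<const ScalarT> f, int n) {
  double s = 0;
  for (int i = 0; i < n; ++i) s += f(i);
  return s;
}
```

The deprecation warning above is GCC nudging code toward this pattern: evaluators that only consume a field should declare it const so the compiler can enforce the read-only contract.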
Want to improve the development cycle.
There seems to be one remaining issue in getting ATO ported to Tpetra/Thyra from Epetra/EpetraExt, which affects tests with distributed parameters/responses:
When running these problems with the AlbanyT executable, the following exception is thrown:
p=0: *** Caught standard std::exception of type 'std::logic_error' :
/home/ikalash/LCM/Trilinos/build-debug-no-dtk-ikt/install/include/Thyra_ModelEvaluatorBase_def.hpp:2102:
Throw number = 1
Throw test that evaluated to true: !deriv.isSupportedBy(derivSupport)
Thyra::ModelEvaluatorBase::OutArgs::assert_supports(OUT_ARG_DgDp,j,l):
model = 'Piro::NOXSolver':
Error, The argument DgDp(0,0) = Derivative{derivMultiVec=DerivativeMultiVector{multiVec=Thyra::TpetraMultiVector<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >{rangeDim=651,domainDim=1},orientation=DERIV_MV_BY_COL}}
is not supported!
The supported types include DerivativeSupport{DERIV_LINEAR_OP}!
The source of the problem is that, in ATOT::Solver, when
*(ret.responses_outT) = ret.modelT->createOutArgs();
is called in ATOT::Solver::CreateSubSolver, the code does not go into the Albany::ModelEvaluatorT::createOutArgsImpl routine, as it should, so the right supports do not get set. This is different from the behavior with the EpetraExt model evaluator version of the code.
After looking at the Thyra::ModelEvaluator class with @agsalin , it seems the issue is that, for the code to go into the Albany::ModelEvaluatorT::createOutArgsImpl routine when createOutArgs is called, the call to createOutArgs needs to happen in ATOT::Solver::evalModelImpl.
A fix to the issue would be to redesign the ATOT::Solver class so that the code that sets ret.responses_outT and ret.response_inT (lines 1716-1780 in ATOT_Solver.cpp) is called in ATOT::Solver::evalModelImpl rather than ATOT::Solver::CreateSubSolver. I have asked @jrobbin to look into such a redesign.
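For background, Thyra's model evaluator follows the non-virtual interface idiom: the public createOutArgs() is expected to forward to the virtual createOutArgsImpl() override. A minimal self-contained sketch of that idiom (illustrative class names and return types, not the actual Thyra hierarchy) shows the dispatch that the issue above reports is not happening:

```cpp
#include <cassert>
#include <string>

// Illustrative NVI sketch; not the actual Thyra class hierarchy.
class ModelEvaluatorBase {
 public:
  virtual ~ModelEvaluatorBase() = default;
  // Public non-virtual entry point: dispatches to the derived override.
  std::string createOutArgs() const { return createOutArgsImpl(); }

 private:
  virtual std::string createOutArgsImpl() const { return "base supports"; }
};

class AlbanyModelEvaluator : public ModelEvaluatorBase {
 private:
  std::string createOutArgsImpl() const override { return "albany supports"; }
};
```

In this sketch, calling createOutArgs() through a base reference still reaches the derived override; when that fails to happen, as described above, it suggests createOutArgs is being invoked at the wrong point (or on the wrong object) in the solver setup, which is what the proposed move into ATOT::Solver::evalModelImpl addresses.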
I will do this, once @lxmota has a chance to clean up the algol CDash tests I created to use the modules on the various LCM machines.
Currently in MechanicsProblem there are several methods to register and request output for fields and state variables. This is not optimal and causes confusion, and makes it nearly impossible to provide a standard interface.
As a developer, I want a class to manage output in order to simplify input syntax, provide finer-grained control of output frequency, and possibly add some in-code post-processing capabilities (e.g., computing invariants of the stress tensor, or volume-averaging integration point quantities).
At the Albany Users Meeting there was some discussion of moving this repository to a GitHub Organization. The major benefit would be the existence of teams and the related permissions hierarchy. Unfortunately, the user @albany exists, meaning we cannot create an Organization by that name. Other names considered:
That last one is the idea of moving the Albany repository under the existing trilinos Organization.
According to this page, we would still have decent autonomy within the Albany repository. However, two downsides of being under trilinos are that new members can only be added by trilinos owners, and that teams would show up as trilinos/LCM instead of something like AlbanySandia/LCM.
If you have an opinion on this, please post it here.
@agsalin @gahansen
@jtostie, in commit 55f111b you left a comment
THE INTREPID REALSPACE TOOLS AND FUNCTION SPACE TOOLS NEED TO BE REMOVED
I was wondering what the reasoning behind this comment is.
The reason I'm bringing it up now is that I'm having issues getting the (now Intrepid2) RealSpaceTools to work with const ScalarT arguments, and I'm wondering whether, instead of trying to fix them, I should be replacing them with something else.
If distributed responses / parameters are meant to work without Epetra in Albany, the ALBANY_EPETRA_EXE ifdef guards around the relevant functions should be removed. I am concerned if this is not done, there may be issues for physics that are being ported to Tpetra and that utilize this capability (e.g., ATO).
Some Albany FELIX Epetra tests with a CUDA KokkosNode (on GPUs) run to completion, then throw a cudaFree(m) error:
terminate called after throwing an instance of 'std::runtime_error'
what(): cudaFree(m) error( cudaErrorCudartUnloading): driver shutting down /home/ikalash/nightlyCDash/build/TrilinosInstall/include/Sacado_DynamicArrayTraits.hpp:172
Traceback functionality not available
[ride11:96460] *** Process received signal ***
[ride11:96460] Signal: Aborted (6)
[ride11:96460] Signal code: (-6)
[ride11:96460] [ 0] [0x100000050478]
[ride11:96460] [ 1] /lib64/libc.so.6(abort+0x280)[0x10000e680d70]
[ride11:96460] [ 2] /home/projects/pwr8-rhel73-lsf/gcc/5.4.0/lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x1f4)[0x10000e4eb174]
[ride11:96460] [ 3] /home/projects/pwr8-rhel73-lsf/gcc/5.4.0/lib64/libstdc++.so.6(+0xa7b44)[0x10000e4e7b44]
[ride11:96460] [ 4] /home/projects/pwr8-rhel73-lsf/gcc/5.4.0/lib64/libstdc++.so.6(+0xa61c8)[0x10000e4e61c8]
[ride11:96460] [ 5] /home/projects/pwr8-rhel73-lsf/gcc/5.4.0/lib64/libstdc++.so.6(__gxx_personality_v0+0x2b8)[0x10000e4e7078]
[ride11:96460] [ 6] /home/projects/pwr8-rhel73-lsf/gcc/5.4.0/lib64/libgcc_s.so.1(+0xbc54)[0x10000e87bc54]
[ride11:96460] [ 7] /home/projects/pwr8-rhel73-lsf/gcc/5.4.0/lib64/libgcc_s.so.1(_Unwind_Resume+0x174)[0x10000e87c844]
[ride11:96460] [ 8] /home/ikalash/nightlyCDash/build/AlbBuild/src/Albany(_ZN6Kokkos4Impl25cuda_internal_error_throwE9cudaErrorPKcS3_i+0x5e0)[0x15b9c400]
[ride11:96460] [ 9] /home/ikalash/nightlyCDash/build/AlbBuild/src/Albany(_ZN6Sacado3Fad4DFadIdED1Ev+0x68)[0x126a70a8]
[ride11:96460] [10] /lib64/libc.so.6(+0x435a4)[0x10000e6835a4]
[ride11:96460] [11] /lib64/libc.so.6(exit+0x24)[0x10000e6835f4]
[ride11:96460] [12] /lib64/libc.so.6(+0x24708)[0x10000e664708]
[ride11:96460] [13] /lib64/libc.so.6(__libc_start_main+0xc4)[0x10000e6648f4]
[ride11:96460] *** End of error message ***
terminate called after throwing an instance of 'std::runtime_error'
what(): cudaFree(m) error( cudaErrorCudartUnloading): driver shutting down /home/ikalash/nightlyCDash/build/TrilinosInstall/include/Sacado_DynamicArrayTraits.hpp:172
Traceback functionality not available
(see http://cdash.sandia.gov/CDash-2-3-0/testDetails.php?test=2628151&build=47923 on the Sandia SRN CDash for more details). This seems to be related to UVM not playing nicely with statics; perhaps the cleanup of statics depends on the link order when UVM is enabled.
Catching appropriate flags and passing them to NOX. Known issue with Schwarz right now: the material model wants to cut the time step, but Schwarz is not honoring that. (NEED TO VERIFY)
Revisit algorithms employed for adaptive time stepping. May link with Tempus task.
I'm trying to fix undefined references in Albany revealed by adding
IF(UNIX)
set(CMAKE_SHARED_LINKER_FLAGS "-Wl,--no-undefined")
ENDIF()
In QCAD, I find that a mathVector is used all over the place, but nowhere defined. The code comments say the definition has moved to QCAD_MathVector.hpp, which doesn't exist.
@jrobbin : I noticed after Dan's change to the default workset size in Albany that a couple of ATO test solutions (2Matl and RegHeaviside_3D) are workset-size dependent: the solutions differ in the last few decimal places with a workset size != 50 (50 was the old default workset size), causing the tests to fail. We should understand why this is happening, to make sure there is no bug underlying these differences. Could you please have a look when you have the chance?
The Modified Schwarz implementation in Albany displays oscillatory and slow convergence compared to a Matlab implementation that shows monotonic and fast convergence. The root cause is not known.
In general, improve the default solver parameters.
Specifically, for coupled problems with potentially disparate units, figure out how to scale appropriately.
Currently, the default workset size in Albany is 50, which is quite small. Is anyone opposed to increasing the default value? This is of particular interest for a build with a CUDA KokkosNode: many of the tests time out when run on a GPU because the workset size is too small, meaning the GPU does not have enough work to do relative to the copies that are performed.
If people agree to increase the default workset size, what is a reasonable value to use? Should we have an #ifdef Kokkos_ENABLE_Cuda that sets the default workset size (e.g., workset size = -1, which puts everything in 1 workset) to an even larger value when running on GPUs? One possible issue with this is that a problem may not fit on a GPU with workset size = -1, but I would think this would not happen for the ctest tests, as these are supposed to be small. @jewatkins , any thoughts?
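As a minimal sketch of the #ifdef idea (KOKKOS_ENABLE_CUDA is the macro Kokkos defines in CUDA builds; the function name and the numeric defaults below are purely illustrative, not values anyone has agreed on):

```cpp
// Hypothetical sketch: choose a default workset size at compile time.
// -1 is taken to mean "put all elements in a single workset", per the
// discussion above; 1000 is an illustrative CPU default, not a proposal.
inline int defaultWorksetSize() {
#ifdef KOKKOS_ENABLE_CUDA
  return -1;   // maximize work per kernel launch on the GPU
#else
  return 1000; // larger than the current default of 50
#endif
}
```

The caveat from above still applies: with -1, a large problem may not fit in GPU memory, so any compile-time default would likely need a runtime override.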
Tempus is a new time integration package in Trilinos.
Hello!
I am compiling Albany at this SHA: da7c212
using the gcc compiler: powerpc64-bgq-linux-g++ (BGQ-V1R2M4-160620) 4.7.2
on a BlueGene/Q system
When compiling, I receive the following three warnings:
(1) J2MiniSolver.hpp:85:8: warning: inline function ‘void LCM::J2MiniKernel<EvalT, Traits>::operator()(int, int) const [with EvalT = PHAL::AlbanyTraits::DistParamDeriv; Traits = PHAL::AlbanyTraits]’ used but never defined [enabled by default]
(2) ParallelNeohookeanModel.hpp:64:8: warning: inline function ‘void LCM::NeohookeanKernel<EvalT, Traits>::operator()(int, int) const [with EvalT = PHAL::AlbanyTraits::Residual; Traits = PHAL::AlbanyTraits]’ used but never defined [enabled by default]
(3) CrystalPlasticityModel.hpp:81:8: warning: inline function ‘void LCM::CrystalPlasticityKernel<EvalT, Traits>::operator()(int, int) const [with EvalT = PHAL::AlbanyTraits::Jacobian; Traits = PHAL::AlbanyTraits]’ used but never defined [enabled by default]
and then the gcc compiler seg-faults with the error message:
ConstitutiveModelInterface.cpp: In lambda function:
ConstitutiveModelInterface.cpp:12:1: internal compiler error: in get_expr_operands, at tree-ssa-operands.c:1035
I think that this may be a known issue with the gcc 4.7 compiler:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53137
When I comment out code related to these three constitutive models (J2MiniSolver, CrystalPlasticity, and ParallelNeohookean) in the ConstitutiveModelInterface, the Albany build completes successfully.
This is really a machine-specific issue, as the Albany source is valid C++11 and is known to compile and run successfully with more recent compilers; however, it is very unlikely that the compiler on the system I am using will be updated anytime in the near future.
I see two potential options:
(1) I carry on commenting out models that trigger this compiler bug, or perhaps keep a second branch of Albany in parallel to this one
(2) Optionally enable these constitutive models via CMake
Does anyone have any thoughts on this? Thanks for the input!
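For what it's worth, option (2) could be sketched roughly as follows. Everything here is hypothetical: the macro ALBANY_ENABLE_MINISOLVER_MODELS (assumed to be set from CMake), the factory type, and the registration function are illustrative stand-ins, not Albany's actual registration code.

```cpp
// Hypothetical sketch: compile the constitutive models that trip the
// gcc 4.7 internal compiler error only when explicitly enabled via a
// CMake-controlled macro (ALBANY_ENABLE_MINISOLVER_MODELS is assumed).
#include <functional>
#include <map>
#include <string>

using ModelFactory = std::function<int()>; // placeholder for a real factory

std::map<std::string, ModelFactory> buildModelRegistry() {
  std::map<std::string, ModelFactory> registry;
  registry["Neohookean"] = [] { return 0; }; // always built
#ifdef ALBANY_ENABLE_MINISOLVER_MODELS
  // These three trigger the gcc 4.7 ICE, so they are opt-in.
  registry["J2 MiniSolver"]       = [] { return 1; };
  registry["CrystalPlasticity"]   = [] { return 2; };
  registry["Parallel Neohookean"] = [] { return 3; };
#endif
  return registry;
}
```

The advantage over option (1) is that no second branch is needed: the default build on a modern compiler enables everything, and the BlueGene/Q build simply turns the flag off.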
There was discussion at this week's call to consider rewriting history and/or separating out the examples/ directory. Here is some data to guide decisions:
12:13:53:westley:~/src/Albany$ du -hs .git
849M .git
12:13:55:westley:~/src/Albany$ du -hs .
1.7G .
12:13:58:westley:~/src/Albany$ du -hs examples/
827M examples/
What this means is that the entire history, under some degree of compression, is 849 MB, and the current version of examples/ without compression is 827 MB. The entire Albany/ directory is 1.7 GB; subtracting out the .git/ history and examples/, we are left with about 30 MB of "everything else", meaning all C++ code, CMake code, dashboard scripts, etc.