I am working on a parallel code. My main function has a loop over time steps, and in each step I copy a class object using its assignment operator. At the 4th step, a "double free or corruption" error occurs on one of the processors while the others run fine; the error arises in the std::set and std::map members. Below are the relevant part of the code and the main loop.
class Mesh
{
public:
    const Mesh &operator=(const Mesh &mesh);
    std::set ghostSet;
    std::map localIndex;
};
Assignment operator:
const Mesh &Mesh::operator=(const Mesh &mesh)
{
    std::set().swap(ghostSet); /// BUG here
    std::map().swap(localIndex); /// BUG sometimes here
    for(auto const &it : mesh.localIndex)
        localIndex[it.first] = it.second;
    for(auto const &it : mesh.ghostSet)
        ghostSet.insert(it);
    return *this;
}
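For comparison, here is a minimal sketch of the same assignment written with the containers' own copy assignment instead of swap-and-reinsert. The element types (std::size_t) and the driver function run_steps are assumptions for illustration only; the real template arguments are not visible in the code above. This is not claimed to fix the crash: glibc reports heap corruption when memory is freed, which may be far from the code that actually damaged it.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>

// Assumed element types: the post does not show the real template arguments.
class Mesh
{
public:
    Mesh &operator=(const Mesh &mesh);
    std::set<std::size_t> ghostSet;
    std::map<std::size_t, std::size_t> localIndex;
};

Mesh &Mesh::operator=(const Mesh &mesh)
{
    // Guard against self-assignment, then let each container
    // replace its own contents with a copy of the source.
    if (this != &mesh)
    {
        ghostSet = mesh.ghostSet;
        localIndex = mesh.localIndex;
    }
    return *this;
}

// Hypothetical driver mirroring the time loop in main.
// Returns the size of the copy's ghost set after the last step.
std::size_t run_steps(std::size_t steps)
{
    Mesh ms, ms_gh;
    for (std::size_t t = 0; t != steps; t++)
    {
        ms.ghostSet.insert(t);     // stand-in for "some operation to ms"
        ms.localIndex[t] = 2 * t;
        ms_gh = ms;                // copy once per time step
        ms_gh.ghostSet.erase(t);   // stand-in for "some operation to ms_gh"
    }
    return ms_gh.ghostSet.size();
}
```

Because both members are standard containers, the compiler-generated copy assignment would behave the same way as this hand-written version.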
main function:
int main(int argc, char *argv[])
{
    Mesh ms, ms_gh;
    /// Some operation to ms;
    for(size_t t = 0; t != 10; t++)
    {
        /// Some operation to ms;
        ms_gh = ms;
        /// Some operation to ms_gh;
    }
}
#0 0x00002aaab2405207 in raise () from /lib64/libc.so.6
#1 0x00002aaab24068f8 in abort () from /lib64/libc.so.6
#2 0x00002aaab2447cc7 in __libc_message () from /lib64/libc.so.6
#3 0x00002aaab2450429 in _int_free () from /lib64/libc.so.6
#4 0x000000000041bfba in __gnu_cxx::new_allocator >::deallocate (this=07fffffff8b50, __p=0x7131c0)
at /usr/include/c++/4.8.2/ext/new_allocator.h:110
#5 0x000000000041835c in std::_Rb_tree, std::ess, std::allocator >::_M_put_node (this=0x7fffffff8b50, __p=0x7131c0)
at /usr/include/c++/4.8.2/bits/stl_tree.h:374
#6 0x000000000041276e in std::_Rb_tree, std::ess, std::allocator >::_M_destroy_node (this=0x7fffffff8b50, __p=0x7131c0)
at /usr/include/c++/4.8.2/bits/stl_tree.h:422
#7 0x000000000040c8ad in std::_Rb_tree, std::ess, std::allocator >::_M_erase (this=0x7fffffff8b50, __x=0x7131c0)
at /usr/include/c++/4.8.2/bits/stl_tree.h:1127
#8 0x000000000040c88a in std::_Rb_tree, std::ess, std::allocator >::_M_erase (this=0x7fffffff8b50, __x=0x72f410)
at /usr/include/c++/4.8.2/bits/stl_tree.h:1125
#9 0x000000000040c88a in std::_Rb_tree, std::ess, std::allocator >::_M_erase (this=0x7fffffff8b50, __x=0x72b760)
at /usr/include/c++/4.8.2/bits/stl_tree.h:1125
#10 0x000000000040c88a in std::_Rb_tree, std::ess, std::allocator >::_M_erase (this=0x7fffffff8b50, __x=0x70fce0)
at /usr/include/c++/4.8.2/bits/stl_tree.h:1125
#11 0x00000000004080c4 in std::_Rb_tree, std::ess, std::allocator >::~_Rb_tree (this=0x7fffffff8b50, __in_chrg=)
at /usr/include/c++/4.8.2/bits/stl_tree.h:671
#12 0x0000000000407bbc in std::set, std::allocator ::~set (this=0x7fffffff8b50,
__in_chrg=) at /usr/include/c++/4.8.2/bits/stl_set.h:90
#13 0x0000000000405003 in Mesh::operator= (this=0x7fffffffa8a0, mesh=...)
at mesh.cpp:73
#14 0x000000000048eb98 in DynamicMesh::reattach_ghost (mpi_comm=1140850688,
ms=..., cn=..., ms_gh=..., gh=..., cn_gh=..., ale=..., t=4)
at dynamicMesh.cpp:273
In this case, frame #13 of the backtrace corresponds to the line that swaps the std::set.
My question is why this error does not appear at the first time step, and why it does not appear on all processors. Moreover, the crash sometimes occurs in the std::map line instead.
Additionally, the code runs successfully on my macOS and Linux laptops, but it fails on the HPC cluster.