snoopyxdy's blog

https://github.com/DoubleSpout

 
 
 

MongoDB replica set error: RS102 too stale to catch up

2012-05-17 15:01:30 | Category: node

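Yesterday I finished the posts on setting up and repairing a MongoDB replica set. This morning, while adding replica set support to the Node.js framework rrestjs, I stopped one of the members, node 30. After a hearty lunch I came back, started mongod on node 30 again, and it reported an error.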
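The error was: error RS102 too stale to catch up.
Read literally: "replica sets error 102, too stale to catch up". In other words, this node has been out of the replica set for so long that it can no longer keep up with the rest of the cluster. Put plainly, it is like a student who took so much sick leave that he can no longer follow the class. So what now? Get the teacher to give some private lessons!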

Let's look at the official explanation first:
URL: http://www.mongodb.org/display/DOCS/Resyncing+a+Very+Stale+Replica+Set+Member

MongoDB writes operations to an oplog.  For replica sets this data is stored in collection local.oplog.rs.  This is a capped collection and wraps when full "RRD"-style.  Thus, it is important that the oplog collection is large enough to buffer a good amount of writes when some members of a replica set are down.  If too many writes occur, the down nodes, when they resume, cannot catch up.  In that case, a full resync would be required.

In v1.8+, you can run db.printReplicationInfo() to see the status of the oplog on both the current primary and the overly stale member. This should show you their times, and if their logs have an overlapping time range. If the time ranges don't overlap, there is no way for the stale secondary to recover and catch up (except for a full resync).

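In short: MongoDB records operations in a log called the oplog, which replica set members store in local.oplog.rs. It is a fixed-size (capped) collection that behaves RRD-style. (What is RRD-style? RRD stands for Round Robin Database, a ring-shaped database in which new entries overwrite the oldest ones.) So the oplog collection has to be large enough that, when a member of the set dies, the operations it still needs are not flushed out. If too many operations are written to the oplog while a node is down, that node cannot sync when it restarts, and a full resync is required. (Damn, what a trap!)
In version 1.8 and later, you can run db.printReplicationInfo() to see the state of the oplog on both the current primary and the overly stale member. It shows their time ranges and whether those ranges overlap. If they do not overlap, the stale secondary cannot recover and catch up, short of a full resync. (Still a trap!)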
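
To make the "oplog wraps, so the stale member can no longer catch up" idea concrete, here is a minimal toy model. It is only a sketch, not MongoDB's actual implementation: the ToyOplog class, the canCatchUp function, and the use of plain sequence numbers in place of oplog timestamps are all assumptions made for illustration.

```javascript
// Toy model of a capped, ring-buffer-style oplog (NOT MongoDB's real code).
// Each operation gets an increasing sequence number standing in for its timestamp.
class ToyOplog {
  constructor(capacity) {
    this.capacity = capacity; // max entries the "capped collection" can hold
    this.entries = [];        // oldest entry first
  }

  // Append an operation; once full, the oldest entry is discarded (the "wrap").
  append(seq) {
    this.entries.push(seq);
    if (this.entries.length > this.capacity) {
      this.entries.shift();
    }
  }

  // Oldest and newest sequence numbers still held, or null when empty.
  window() {
    if (this.entries.length === 0) return null;
    return { first: this.entries[0], last: this.entries[this.entries.length - 1] };
  }
}

// A member can catch up only if the last operation it applied
// is still inside the source's oplog window.
function canCatchUp(sourceWindow, lastAppliedSeq) {
  return sourceWindow !== null && lastAppliedSeq >= sourceWindow.first;
}

const primary = new ToyOplog(5);
for (let seq = 1; seq <= 3; seq++) primary.append(seq);

const lastAppliedBeforeShutdown = 3; // the secondary stops here
console.log(canCatchUp(primary.window(), lastAppliedBeforeShutdown)); // true

// Heavy writes on the primary while the secondary is down.
for (let seq = 4; seq <= 20; seq++) primary.append(seq);

console.log(primary.window()); // { first: 16, last: 20 }
console.log(canCatchUp(primary.window(), lastAppliedBeforeShutdown)); // false
```

On a real replica set the two windows being compared are the time ranges that db.printReplicationInfo() reports on the primary and on the stale member.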

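After node 30 went down I ran a lot of load tests, so its operation log had long since fallen completely out of sync. Here are the solutions the official docs offer: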

What to do on a RS102 sync error

If one of your members has been offline and is now too far behind to catch up, you will need to resync. There are a number of ways to do this.

  • Perform a full resync. If you stop the failed mongod, delete all data in the dbpath (including subdirectories), and restart it, it will automatically resynchronize itself. Obviously it would be better/safer to back up the data first. If disk space is adequate, simply move it to a backup location on the machine if appropriate. Resyncing may take a long time if the database is huge or the network slow – even idealized one terabyte of data would require three hours to transmit over gigabit ethernet.

or

  • Copy data from another member: You can copy all the data files from another member of the set IF you have a snapshot of that member's data file's. This can be done in a number of ways. The simplest is to stop mongod on the source member, copy all its files, and then restart mongod on both nodes. The Mongo fsync and lock feature is another way to achieve this if you are using EBS or a SAN. On a slow network, snapshotting all the datafiles from another (inactive) member to a gziped tarball is a good solution. Also similar strategies work well when using SANs and services such as Amazon Elastic Block Service snapshots.

or

  • Find a member with older data: Note: this is only possible (and occurs automatically) in v1.8+. If another member of the replica set has a large enough oplog or is far enough behind that the stale member can sync from it, the stale member can bootstrap itself from this member.
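
The "three hours for one terabyte over gigabit ethernet" figure in the first option can be sanity-checked with a little arithmetic. The sketch below assumes a decimal terabyte (10^12 bytes) and a hypothetical efficiency factor of 0.75 to stand in for protocol and disk overhead; neither number comes from the docs.

```javascript
// Transfer time in hours for `bytes` over a link of `bitsPerSecond`,
// scaled by an assumed efficiency factor (1.0 = no overhead at all).
function transferHours(bytes, bitsPerSecond, efficiency = 1.0) {
  const seconds = (bytes * 8) / (bitsPerSecond * efficiency);
  return seconds / 3600;
}

const ONE_TB = 1e12;  // decimal terabyte, in bytes (assumption)
const GIGABIT = 1e9;  // link speed in bits per second

// Perfect link: 8e12 bits / 1e9 bits per second = 8000 s.
console.log(transferHours(ONE_TB, GIGABIT).toFixed(2));       // "2.22"

// Assumed 75% effective throughput: close to the three hours quoted.
console.log(transferHours(ONE_TB, GIGABIT, 0.75).toFixed(2)); // "2.96"
```

So even the idealized number is a bit over two hours, and any realistic overhead pushes it toward three.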

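What to do about an RS102 error?
If one of your members has been offline for too long and cannot catch up, you have to resync. There are 3 ways to do it:
1. Perform a full resync: stop the failed mongod, delete everything under dbpath, including subdirectories, then restart it. It will resync automatically. Obviously it is safer to back up the data first. A full resync can take a long time: 1 TB of data may take 3 hours or more.
2. Copy data from another member: not much to say here, you simply copy a fresh set of data files from another member.
3. Find a member with older data: since version 1.8 this happens automatically. I am running 2.0, so there is nothing to do by hand here, and in effect this option is off the table.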

So let's play it straight and go with the first one.
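
The "back up first, then empty dbpath" part of the first option could look roughly like the Node.js sketch below. The function name moveDataAside and the example paths are hypothetical. It assumes mongod on the stale member has already been stopped, that the backup directory lies outside dbpath, and that both are on the same filesystem (fs.renameSync does not move across devices).

```javascript
const fs = require('fs');
const path = require('path');

// Move everything inside `dbpath` (files and subdirectories) into `backupDir`,
// leaving `dbpath` empty so that mongod performs a full resync on restart.
// Assumptions: mongod is already stopped, `backupDir` is outside `dbpath`,
// and both directories live on the same filesystem.
function moveDataAside(dbpath, backupDir) {
  fs.mkdirSync(backupDir, { recursive: true });
  const moved = [];
  for (const name of fs.readdirSync(dbpath)) {
    fs.renameSync(path.join(dbpath, name), path.join(backupDir, name));
    moved.push(name);
  }
  return moved; // names of the entries that were moved
}

// Hypothetical usage (paths are placeholders, not taken from this setup):
// moveDataAside('/data/db', '/data/db-backup');
```

Moving rather than deleting keeps the old data around in case something goes wrong during the resync. After that, restarting mongod with the same replica set options is what triggers the automatic resync.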
The speed was decent: a bit over 100,000 documents were synced in about 5 minutes.