Stream load writes fail with version errors
willwang704 posted 2021-06 Views: 3618 Replies: 8

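Our team is currently testing Doris. During bulk writes we see a large number of version errors that cause the writes to fail. We write via stream load. It may be that some part of our configuration or usage is wrong, but I could not find an explanation of the "version" concept or any related best practices on the official site, so I'd like to ask: is there anything we can tune for this kind of error?

Application error:

BE log errors: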

W0617 16:28:59.049198 12814 task_worker_pool.cpp:702] publish version error, retry. [transaction_id=1087629, error_tablets_size=1]
W0617 16:29:00.049214 12811 engine_publish_version_task.cpp:73] could not find related rowset for tablet 10854 txn id 1166323
W0617 16:29:00.049360 12811 task_worker_pool.cpp:702] publish version error, retry. [transaction_id=1166323, error_tablets_size=1]
W0617 16:29:00.049361 12810 engine_publish_version_task.cpp:73] could not find related rowset for tablet 10854 txn id 1038439
W0617 16:29:00.049432 12814 engine_publish_version_task.cpp:73] could not find related rowset for tablet 10854 txn id 1087629
W0617 16:29:00.049435 12810 task_worker_pool.cpp:702] publish version error, retry. [transaction_id=1038439, error_tablets_size=1]
W0617 16:29:00.049494 12814 task_worker_pool.cpp:702] publish version error, retry. [transaction_id=1087629, error_tablets_size=1]
W0617 16:29:01.049463 12811 task_worker_pool.cpp:715] publish version failed. signature:1166323, error_code=-914
W0617 16:29:01.049563 12810 task_worker_pool.cpp:715] publish version failed. signature:1038439, error_code=-914
W0617 16:29:01.049604 12814 task_worker_pool.cpp:715] publish version failed. signature:1087629, error_code=-914
W0617 16:29:10.394752 12813 engine_publish_version_task.cpp:73] could not find related rowset for tablet 10854 txn id 1087629
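
To see at a glance which tablets and transactions are involved, the warnings above can be grouped by tablet. The following is a minimal sketch, assuming the log lines keep exactly the wording shown in this excerpt; `group_missing_rowsets` and the regular expression are illustrative names, not part of Doris.

```python
import re
from collections import defaultdict

# Matches the BE warning text as it appears in the excerpt above (assumed format).
ROWSET_MISSING = re.compile(
    r"could not find related rowset for tablet (\d+) txn id (\d+)"
)


def group_missing_rowsets(lines):
    """Map each tablet id to the sorted, distinct txn ids reported for it."""
    by_tablet = defaultdict(set)
    for line in lines:
        match = ROWSET_MISSING.search(line)
        if match:
            tablet_id, txn_id = match.groups()
            by_tablet[int(tablet_id)].add(int(txn_id))
    return {tablet: sorted(txns) for tablet, txns in by_tablet.items()}


# Lines copied from the excerpt above.
sample = [
    "W0617 16:29:00.049214 12811 engine_publish_version_task.cpp:73] could not find related rowset for tablet 10854 txn id 1166323",
    "W0617 16:29:00.049361 12810 engine_publish_version_task.cpp:73] could not find related rowset for tablet 10854 txn id 1038439",
    "W0617 16:29:00.049432 12814 engine_publish_version_task.cpp:73] could not find related rowset for tablet 10854 txn id 1087629",
    "W0617 16:29:01.049463 12811 task_worker_pool.cpp:715] publish version failed. signature:1166323, error_code=-914",
]
summary = group_missing_rowsets(sample)
print(summary)
```

In the excerpt, all three failing transactions point at the same tablet, 10854.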
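8 replies in total. Last reply by 简简单单XTlife, 2021-06
#9 简简单单XTlife replied 2021-06
#5 willwang704 replied:
The stream load failure error is what the screenshot in the first post shows: it reports a socket connection failure. But the cluster network is stable and monitored, so I suspect I'm using some part of Doris incorrectly.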

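Check whether an overly long JSON payload is the cause on your side. I'm hitting the same error, and so far I have no good solution. If this turns out to be your cause as well and you find a fix, please let me know.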

#8 简简单单XTlife replied 2021-06
#7 Ling缪 replied:
Or you could try restarting the FE or BE.

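I ran into this error too. After repeated testing on my side, the cause is a JSON string that is too long. I haven't found a good solution yet; the only workaround is to split the jsonArray data and insert it in segments, but that makes HTTP calls much more frequent, and socket timeouts occasionally occur.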
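
One way to keep the number of HTTP calls down while still bounding payload size is to split by serialized byte size rather than by a fixed record count. The following is a minimal sketch of that idea; `split_json_array` and `max_bytes` are hypothetical names, and the size at which the error appears is not stated in this thread, so the limit has to be found by testing.

```python
import json


def split_json_array(records, max_bytes):
    """Yield JSON array strings, each at most max_bytes when UTF-8 encoded.

    A single record that is larger than max_bytes is still emitted,
    alone in its own batch.
    """
    batch, size = [], 2  # 2 bytes for the enclosing "[" and "]"
    for record in records:
        encoded = json.dumps(record, ensure_ascii=False)
        record_bytes = len(encoded.encode("utf-8"))
        cost = record_bytes + (1 if batch else 0)  # +1 for the "," separator
        if batch and size + cost > max_bytes:
            yield "[" + ",".join(batch) + "]"
            batch, size = [], 2
            cost = record_bytes
        batch.append(encoded)
        size += cost
    if batch:
        yield "[" + ",".join(batch) + "]"


# Illustrative data only; the limit here is tiny so the split is visible.
records = [{"id": i, "msg": "x" * 10} for i in range(5)]
chunks = list(split_json_array(records, max_bytes=70))
for chunk in chunks:
    print(len(chunk.encode("utf-8")), chunk)
```

Each yielded string would then be sent as the body of its own stream load request, presumably with the same JSON settings as the original load (for example `strip_outer_array`, if that is how the array was being loaded).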

#7 Ling缪 replied 2021-06

Or you could try restarting the FE or BE.

#6 Ling缪 replied 2021-06

How frequently are you running imports?

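#5 willwang704 replied 2021-06
#4 Ling缪 replied:
What error message did you get when the stream load failed? It looks like your INFO log contains a lot of timed-out transactions.

The stream load failure error is what the screenshot in the first post shows: it reports a socket connection failure. But the cluster network is stable and monitored, so I suspect I'm using some part of Doris incorrectly.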

#4 Ling缪 replied 2021-06

What error message did you get when the stream load failed? It looks like your INFO log contains a lot of timed-out transactions.

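#3 willwang704 replied 2021-06
#2 Ling缪 replied:
In the be.INFO log, look at the context around error_code=-914 and check whether there are any messages about too many data versions.

I didn't find any messages about too many data versions. The surrounding context is mostly log lines like these: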

I0617 16:32:33.282379 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment e34122ae0da49a73-2977cec2ad1c2d85
I0617 16:32:33.282385 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment 9c40fbbc7953e10d-40ca6fd2de155aa9
I0617 16:32:33.282392 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment 3449792e0edadcd9-9b9ecde5da127d8c
I0617 16:32:33.282398 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment 47466aceac439bba-9c1f7ee73a703d89
I0617 16:32:33.282416 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment ca4524368366fdae-7ddbfb7373552a86
I0617 16:32:33.282423 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment 4e49bf8ba8eabdeb-cf4fef6aa6966baf
I0617 16:32:33.282433 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment ad44c4acd572cd56-2c13d754e0c9aaa7
I0617 16:32:33.282439 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment e84582737c9d5e4f-4717c4719c439cae
I0617 16:32:33.282445 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment 12492a9dd5bffd63-57a7851a842502bf
W0617 16:32:33.409693 12814 task_worker_pool.cpp:715] publish version failed. signature:1166323, error_code=-914
W0617 16:32:33.409720 12811 task_worker_pool.cpp:715] publish version failed. signature:1038439, error_code=-914
W0617 16:32:33.410053 12810 task_worker_pool.cpp:715] publish version failed. signature:1087629, error_code=-914
I0617 16:32:33.410305 12811 task_worker_pool.cpp:275] finish task success.
I0617 16:32:33.410332 12811 task_worker_pool.cpp:260] remove task info. type=PUBLISH_VERSION, signature=1038439, queue_size=2
I0617 16:32:33.410346 12814 task_worker_pool.cpp:275] finish task success.
I0617 16:32:33.410372 12814 task_worker_pool.cpp:260] remove task info. type=PUBLISH_VERSION, signature=1166323, queue_size=1
I0617 16:32:33.410410 12810 task_worker_pool.cpp:275] finish task success.
I0617 16:32:33.410423 12810 task_worker_pool.cpp:260] remove task info. type=PUBLISH_VERSION, signature=1087629, queue_size=0
I0617 16:32:34.282572 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment 7b4589fa5e831abe-6c338b4573c7e7a2
I0617 16:32:34.282598 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment a7467b2fbce95218-6abae64b2c3bd79a
I0617 16:32:34.282603 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment 2e423c1ba8832b3c-58cb2c8cbd62e88a
I0617 16:32:34.282608 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment 9c4c2c2ba32a8b98-9975b3e281cb1593
I0617 16:32:34.282614 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment 89457be17b4052fc-c1bf49d4ad2e9596
I0617 16:32:34.282619 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment 5d448ac472f91f5a-390cc3f8339a1188
I0617 16:32:34.282624 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment 28452bde1038403a-e65d4b9f86581da1
I0617 16:32:34.282629 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment dc403adadb2968a2-6000d19fd94883bc
#2 Ling缪 replied 2021-06

In the be.INFO log, look at the context around error_code=-914 and check whether there are any messages about too many data versions.
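
The context lookup suggested here can be done with `grep -C` or with a few lines of Python. The following is a rough sketch; `find_with_context` is an illustrative helper, and it reads all lines into memory, which may not suit a very large be.INFO file.

```python
def find_with_context(lines, needle="error_code=-914", before=3, after=3):
    """Return (line_number, context_lines) for every line containing needle.

    line_number is 1-based; context_lines includes the matching line
    plus up to `before` lines above it and `after` lines below it.
    """
    lines = [line.rstrip("\n") for line in lines]
    hits = []
    for index, line in enumerate(lines):
        if needle in line:
            start = max(0, index - before)
            end = min(len(lines), index + after + 1)
            hits.append((index + 1, lines[start:end]))
    return hits


# Lines taken from the log excerpts in this thread.
sample_log = [
    "I0617 16:32:33.282445 12676 fragment_mgr.cpp:580] FragmentMgr cancel worker going to cancel timeout fragment 12492a9dd5bffd63-57a7851a842502bf",
    "W0617 16:32:33.409693 12814 task_worker_pool.cpp:715] publish version failed. signature:1166323, error_code=-914",
    "I0617 16:32:33.410305 12811 task_worker_pool.cpp:275] finish task success.",
]
hits = find_with_context(sample_log, before=1, after=1)
for line_number, context in hits:
    print(f"match at line {line_number}:")
    for context_line in context:
        print("   ", context_line)
```

To run it against a real log, pass an open file object instead of `sample_log`; the location of be.INFO depends on the deployment and is not given in this thread.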
