
Comments (3)

wgzhao commented on July 23, 2024

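This is the result of my local test, which reads the 311 files in the specified directory. To keep it short, I have truncated runs of similar log output.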

2024-03-13 13:32:58.801 [        main] INFO  VMInfo               - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2024-03-13 13:32:58.810 [        main] INFO  Engine               -
{
	"setting":{
		"speed":{
			"byte":-1,
			"channel":8
		},
		"errorLimit":{
			"record":0,
			"percentage":0.02
		}
	},
	"content":{
		"reader":{
			"name":"ftpreader",
			"parameter":{
				"column":[
					"*"
				],
				"protocol":"ftp",
				"host":"127.0.0.1",
				"port":"21",
				"username":"wgzhao",
				"password":"*****",
				"skipDelimiter":true,
				"path":"/home/wgzhao/ftptest/ftpreader"
			}
		},
		"writer":{
			"name":"ftpwriter",
			"parameter":{
				"column":[
					"*"
				],
				"protocol":"ftp",
				"host":"127.0.0.1",
				"port":"21",
				"username":"wgzhao",
				"password":"*****",
				"path":"/home/wgzhao/ftptest/ftpwriter",
				"fileName":"101-测试",
				"writeMode":"truncate",
				"compress":"gz",
				"skipDelimiter":true
			}
		}
	}
}

2024-03-13 13:32:58.823 [        main] INFO  JobContainer         - The jobContainer begins to process the job.
2024-03-13 13:32:58.856 [       job-0] WARN  StorageWriterUtil    - The item encoding is empty, uses [UTF-8] as default.
2024-03-13 13:32:58.856 [       job-0] WARN  StorageWriterUtil    - The item delimiter is empty, uses [,] as default.
2024-03-13 13:32:58.874 [       job-0] INFO  JobContainer         - The Reader.Job [ftpreader] perform prepare work .
2024-03-13 13:32:58.916 [       job-0] INFO  FtpReader$Job        - 您即将读取的文件数为: [311]
2024-03-13 13:32:58.916 [       job-0] INFO  JobContainer         - The Writer.Job [ftpwriter] perform prepare work .
2024-03-13 13:32:58.917 [       job-0] INFO  StandardFtpHelperImpl - current working directory:/home/wgzhao/ftptest/ftpwriter
2024-03-13 13:32:58.938 [       job-0] INFO  FtpWriter$Job        - The current writeMode is truncate, begin to cleanup all files with prefix [101-测试] under [/home/wgzhao/ftptest/ftpwriter].
2024-03-13 13:32:58.938 [       job-0] INFO  FtpWriter$Job        - The following file(s) will be deleted: [/home/wgzhao/ftptest/ftpwriter/101-测试__20240313_133157_793_tdwnz8ut.txt, /home/wgzhao/ftptest/ftpwriter/101-测试__20240313_133157_783_4rmxunsq.txt, /home/wgzhao/ftptest/ftpwriter/101-测试__20240313_133157_817_44veuzg4.txt, /home/wgzhao/ftptest/ftpwriter/101-测试__20240313_133157_821_byy3pc11.txt, /home/wgzhao/ftptest/ftpwriter/101-测试__20240313_133157_821_gcfcw8vm.txt, /home/wgzhao/ftptest/ftpwriter/101-测试__20240313_133157_790_3xvm4f94.txt, /home/wgzhao/ftptest/ftpwriter/101-测试__20240313_133157_821_5t4agz8d.txt, /home/wgzhao/ftptest/ftpwriter/101-测试__20240313_133157_817_gu145tv1.txt].
2024-03-13 13:32:58.938 [       job-0] INFO  StandardFtpHelperImpl - current working directory:/home/wgzhao/ftptest/ftpwriter
2024-03-13 13:32:58.940 [       job-0] INFO  JobContainer         - Job set Channel-Number to 8 channel(s).
2024-03-13 13:32:58.956 [       job-0] INFO  JobContainer         - The Reader.Job [ftpreader] is divided into [311] task(s).
2024-03-13 13:32:58.956 [       job-0] INFO  StorageWriterUtil    - Begin to split...
2024-03-13 13:32:58.963 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133258_962_x27b8yt1]
2024-03-13 13:32:58.964 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133258_963_2n52sntt]
2024-03-13 13:32:58.964 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133258_964_e623a1s8]
2024-03-13 13:32:58.964 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133258_964_hgva8zyv]
2024-03-13 13:32:58.964 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133258_964_68uqewte]
2024-03-13 13:32:58.964 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133258_964_xwreawz9]
2024-03-13 13:32:58.965 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133258_964_6f3q3fyb]
2024-03-13 13:32:58.965 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133258_965_g3xqcuz4]
2024-03-13 13:32:58.965 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133258_965_u4ux9ywb]
2024-03-13 13:32:58.965 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133258_965_cdqacs9e]
2024-03-13 13:32:58.965 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133258_965_hb98t8ag]
2024-03-13 13:32:58.965 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133258_965_nshwrf8v]
2024-03-13 13:32:58.966 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133258_966_ct0fr1vx]
......
2024-03-13 13:32:59.000 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133259_000_ayxeax1h]
2024-03-13 13:32:59.001 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133259_001_upb2bgwv]
2024-03-13 13:32:59.001 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133259_001_9v2acmmw]
2024-03-13 13:32:59.001 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133259_001_bfmfes78]
2024-03-13 13:32:59.001 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133259_001_efg8bsz3]
2024-03-13 13:32:59.001 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133259_001_w8yu94xs]
2024-03-13 13:32:59.001 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133259_001_wtv1v4qp]
2024-03-13 13:32:59.001 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133259_001_utvqbwwa]
2024-03-13 13:32:59.001 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133259_001_uen8emwr]
2024-03-13 13:32:59.002 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133259_001_fexsrc16]
2024-03-13 13:32:59.002 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133259_002_5rb40g5w]
2024-03-13 13:32:59.002 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133259_002_9bsxftnm]
2024-03-13 13:32:59.002 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133259_002_1pqh18dv]
2024-03-13 13:32:59.002 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133259_002_fpppg6ep]
2024-03-13 13:32:59.002 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133259_002_n4vx5rq6]
2024-03-13 13:32:59.002 [       job-0] INFO  StorageWriterUtil    - split write file name:[101-测试__20240313_133259_002_huh199rz]
2024-03-13 13:32:59.010 [       job-0] INFO  StorageWriterUtil    - Finished split.
2024-03-13 13:32:59.010 [       job-0] INFO  JobContainer         - The Writer.Job [ftpwriter] is divided into [311] task(s).
2024-03-13 13:32:59.079 [       job-0] INFO  JobContainer         - The Scheduler launches [1] taskGroup(s).
2024-03-13 13:32:59.092 [ taskGroup-0] INFO  TaskGroupContainer   - The taskGroupId=[0] started [8] channels for [311] tasks.
2024-03-13 13:32:59.095 [ taskGroup-0] INFO  Channel              - The Channel set byte_speed_limit to -1, No bps activated.
2024-03-13 13:32:59.095 [ taskGroup-0] INFO  Channel              - The Channel set record_speed_limit to -1, No tps activated.
2024-03-13 13:32:59.116 [writer-0-284] INFO  FtpWriter$Task       - begin do write...
2024-03-13 13:32:59.116 [writer-0-284] INFO  FtpWriter$Task       - write to file : [/home/wgzhao/ftptest/ftpwriter/101-测试__20240313_133259_007_g1c7ysae.txt]
2024-03-13 13:32:59.116 [reader-0-284] INFO  FtpReader$Task       - reading file : [/home/wgzhao/ftptest/ftpreader/000851.SZ-300660.SZ.csv]
2024-03-13 13:32:59.116 [writer-0-284] INFO  StandardFtpHelperImpl - current working directory:/home/wgzhao
2024-03-13 13:32:59.117 [writer-0-284] INFO  StandardFtpHelperImpl - current working directory:/home/wgzhao/ftptest/ftpwriter
2024-03-13 13:32:59.118 [writer-0-279] INFO  FtpWriter$Task       - begin do write...
2024-03-13 13:32:59.118 [writer-0-182] INFO  FtpWriter$Task       - begin do write...
2024-03-13 13:32:59.119 [writer-0-182] INFO  FtpWriter$Task       - write to file : [/home/wgzhao/ftptest/ftpwriter/101-测试__20240313_133258_995_uqn2751q.txt]
2024-03-13 13:32:59.119 [writer-0-279] INFO  FtpWriter$Task       - write to file : [/home/wgzhao/ftptest/ftpwriter/101-测试__20240313_133259_007_25dn2gmv.txt]
2024-03-13 13:32:59.120 [writer-0-182] INFO  StandardFtpHelperImpl - current working directory:/home/wgzhao
.....
2024-03-13 13:33:03.463 [writer-0-205] INFO  FtpWriter$Task       - begin do write...
2024-03-13 13:33:03.463 [reader-0-173] WARN  StorageReaderUtil    - Uses [,] as delimiter by default
2024-03-13 13:33:03.463 [writer-0-205] INFO  FtpWriter$Task       - write to file : [/home/wgzhao/ftptest/ftpwriter/101-测试__20240313_133258_998_f4c0mn89.txt]
2024-03-13 13:33:03.463 [writer-0-205] INFO  StandardFtpHelperImpl - current working directory:/home/wgzhao
2024-03-13 13:33:03.463 [writer-0-205] INFO  StandardFtpHelperImpl - current working directory:/home/wgzhao/ftptest/ftpwriter
2024-03-13 13:33:03.463 [writer-0-171] INFO  FtpWriter$Task       - end do write
2024-03-13 13:33:03.464 [writer-0-173] INFO  FtpWriter$Task       - begin do write...
2024-03-13 13:33:03.464 [writer-0-173] INFO  FtpWriter$Task       - write to file : [/home/wgzhao/ftptest/ftpwriter/101-测试__20240313_133258_994_s8z8tsdr.txt]
2024-03-13 13:33:03.464 [writer-0-173] INFO  StandardFtpHelperImpl - current working directory:/home/wgzhao
2024-03-13 13:33:03.464 [writer-0-173] INFO  StandardFtpHelperImpl - current working directory:/home/wgzhao/ftptest/ftpwriter
2024-03-13 13:33:03.465 [writer-0-173] INFO  FtpWriter$Task       - end do write
2024-03-13 13:33:03.465 [reader-0-205] INFO  FtpReader$Task       - reading file : [/home/wgzhao/ftptest/ftpreader/600575.SH-000677.SZ.csv]
2024-03-13 13:33:03.466 [reader-0-205] WARN  StorageReaderUtil    - Uses [,] as delimiter by default
2024-03-13 13:33:03.466 [writer-0-205] INFO  FtpWriter$Task       - end do write
2024-03-13 13:33:03.467 [ reader-0-42] INFO  FtpReader$Task       - reading file : [/home/wgzhao/ftptest/ftpreader/688018.SH-000632.SZ.csv]
2024-03-13 13:33:03.467 [ reader-0-42] WARN  StorageReaderUtil    - Uses [,] as delimiter by default
2024-03-13 13:33:03.468 [ writer-0-42] INFO  FtpWriter$Task       - end do write
2024-03-13 13:33:03.499 [writer-0-158] INFO  FtpWriter$Task       - end do write
2024-03-13 13:33:05.096 [       job-0] INFO  AbstractScheduler    - The scheduler has completed all tasks.
2024-03-13 13:33:05.096 [       job-0] INFO  JobContainer         - The Writer.Job [ftpwriter] perform post work.
2024-03-13 13:33:05.096 [       job-0] INFO  JobContainer         - The Reader.Job [ftpreader] perform post work.
2024-03-13 13:33:05.103 [       job-0] INFO  StandAloneJobContainerCommunicator - Total 225262 records, 23043447 bytes | Speed 3.66MB/s, 37543 records/s | Error 0 records, 0 bytes |  All Task WaitWriterTime 0.974s |  All Task WaitReaderTime 0.156s | Percentage 100.00%
2024-03-13 13:33:05.103 [       job-0] INFO  JobContainer         -
Job start  at             : 2024-03-13 13:32:58
Job end    at             : 2024-03-13 13:33:05
Job took secs             :                  6s
Average   bps             :            3.66MB/s
Average   rps             :          37543rec/s
Number of rec             :              225262
Failed record             :                   0
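
As a quick sanity check on the summary above, the reported averages are consistent with dividing the totals by the reported 6 seconds. This is only a sketch: it assumes 1 MB = 1024 * 1024 bytes and that the rate is truncated to a whole number of records, neither of which is confirmed by the log itself.

```python
# Figures copied from the job summary above.
total_records = 225262
total_bytes = 23043447
elapsed_secs = 6  # "Job took secs" as reported

# Assumption: records/s is truncated, and MB means 1024 * 1024 bytes.
records_per_sec = total_records // elapsed_secs
mb_per_sec = total_bytes / elapsed_secs / (1024 * 1024)

print(f"{records_per_sec}rec/s")
print(f"{mb_per_sec:.2f}MB/s")
```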

from addax.

Bear-big-code commented on July 23, 2024

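It generally works fine when a folder holds around 1,000 to 2,000 files. In my case, though, 80,000 to 200,000 new files arrive every day. The files are small, only a few hundred bytes each, and they are stored in one folder per day.
Does this support syncing in batches? For example, splitting 200,000 files into 200 runs of 1,000 files each?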


wgzhao commented on July 23, 2024

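In the current design, each file maps to one task, that is, one thread. Reading tens of thousands or even hundreds of thousands of files in one run means starting that many threads at once; when I simulated reading 150,000 files locally, the process simply exited. If your file names follow a pattern, you can use a wildcard in the path item to select one batch of files at a time. That should work around your problem for now, along these lines: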

{
          "parameter": {
            "path": "/home/wgzhao/ftptest/ftpreader/100*.csv"
          }
}

