
Comments (19)

PhanLe1010 avatar PhanLe1010 commented on May 28, 2024 6

It is int64. Probably no danger?

You are right. If we are writing at 50k IOPS, it would take 5,849,424 years to overflow. None of us will see that day :)))
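
For reference, a quick back-of-the-envelope check of that arithmetic (not from the original thread, just the math behind the figure, assuming a constant 50k writes/sec):

package main

import (
	"fmt"
	"math"
)

func main() {
	const iops = 50_000                       // assumed sustained write rate
	const secondsPerYear = 365.25 * 24 * 3600

	// Seconds until an int64 counter incremented once per write overflows,
	// converted to years.
	seconds := float64(math.MaxInt64) / iops
	fmt.Printf("~%.0f years until overflow\n", seconds/secondsPerYear) // roughly 5.8 million years
}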


PhanLe1010 avatar PhanLe1010 commented on May 28, 2024 4

Update: a small improvement to the revision counter logic

Current implementation

We are currently converting the int64 to a string and then to a slice of bytes: https://github.com/longhorn/longhorn-engine/blob/be5e072e02cd959ab371a04264957624223b01fb/pkg/replica/revision_counter.go#L45-L46

The strconv.FormatInt function seems to eat noticeable CPU time in profiling.
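
For contrast, the current string-based approach at those lines boils down to roughly this (a simplified sketch, not the exact code):

	// Format the counter as a decimal string and copy its bytes into the
	// buffer that is written to the revision counter file.
	counterBytes := []byte(strconv.FormatInt(counter, 10))
	copy(revisionCounterBuf, counterBytes)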

Modification:

Using LittleEndian to encode and decode the int64:

// Needs "encoding/binary" (plus the existing "fmt" and pkg/errors imports).
// revisionCounterBuf is assumed to be the pre-allocated buffer the revision
// counter file is written from, defined elsewhere in the package.
func (r *Replica) writeRevisionCounter(counter int64) error {
	if r.revisionFile == nil {
		return fmt.Errorf("BUG: revision file wasn't initialized")
	}

	copy(revisionCounterBuf, int64ToBytes(counter))
	_, err := r.revisionFile.WriteAt(revisionCounterBuf, 0)
	if err != nil {
		return errors.Wrap(err, "failed to write to revision counter file")
	}
	return nil
}

// int64ToBytes encodes the counter as 8 little-endian bytes.
func int64ToBytes(n int64) []byte {
	bytes := make([]byte, 8)
	binary.LittleEndian.PutUint64(bytes, uint64(n))
	return bytes
}

// bytesToInt64 decodes 8 little-endian bytes back into an int64.
func bytesToInt64(bytes []byte) int64 {
	return int64(binary.LittleEndian.Uint64(bytes))
}

Result

A small improvement in write performance, maybe 1-2%.
However, the implementation cost is big: we would need to migrate from the old string decoding to the new binary decoding.
I don't think this is a good idea.


PhanLe1010 avatar PhanLe1010 commented on May 28, 2024 2

This is potentially the 3rd bottleneck:

cc @shuo-wu @ejweber @derekbit


derekbit avatar derekbit commented on May 28, 2024 2

This is potentially the 3rd bottleneck:

cc @shuo-wu @ejweber @derekbit

We already know the input data []byte needs to be aligned for direct I/O.
Can we always allocate an aligned buffer for the data on the caller side to prevent the extra data copy in the two functions?
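
A minimal sketch of what caller-side aligned allocation could look like (assuming a 4096-byte alignment requirement; allocateAligned is a hypothetical helper, not an existing Longhorn function):

package main

import (
	"fmt"
	"unsafe"
)

const alignSize = 4096 // assumed direct I/O alignment requirement

// allocateAligned over-allocates and slices the buffer so that its backing
// array starts on an alignSize boundary, avoiding a later copy into a
// separately allocated aligned buffer.
func allocateAligned(size int) []byte {
	buf := make([]byte, size+alignSize)
	off := int(uintptr(unsafe.Pointer(&buf[0])) % alignSize)
	if off != 0 {
		off = alignSize - off
	}
	return buf[off : off+size]
}

func main() {
	b := allocateAligned(1 << 20)
	fmt.Println(uintptr(unsafe.Pointer(&b[0]))%alignSize == 0) // true
}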


PhanLe1010 avatar PhanLe1010 commented on May 28, 2024 2

We already know the input data []byte needs to be aligned for direct I/O.
Can we always allocate an aligned buffer for the data on the caller side to prevent the extra data copy in the two functions?

Agree with this idea. I will give it a try to see if it can improve the read speed

@derekbit Looks like the performance got worse, so we will not do this:

IOPS (Read/Write)
        Random:          45,395 / 19,553
    Sequential:          62,778 / 33,106

Bandwidth in KiB/sec (Read/Write)
        Random:        714,421 / 348,190
    Sequential:        976,918 / 353,571
                                        

Latency in ns (Read/Write)
        Random:        821,898 / 455,709
    Sequential:        814,332 / 456,455


PhanLe1010 avatar PhanLe1010 commented on May 28, 2024 2

Update:

There is potentially another bottleneck in the read flow in the replica. For every read, we have to make a system call, os.(*File).Stat, to get the size of the snapshot file here. This system call is eating noticeable CPU. Do you think we can cache the snapshot size value so we don't have to issue a system call for every read? @shuo-wu @derekbit @ejweber

[profiling screenshot]
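
A rough sketch of the caching idea (illustration only, with hypothetical names; not Longhorn's actual code):

package replica // hypothetical package, for illustration

import (
	"io"
	"os"
)

// snapshotFile caches the snapshot file size once at open time, so reads do
// not need to call Stat() on every request. Snapshots are immutable, so the
// cached size stays valid.
type snapshotFile struct {
	f    *os.File
	size int64 // cached at open time
}

func openSnapshot(path string) (*snapshotFile, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	fi, err := f.Stat()
	if err != nil {
		f.Close()
		return nil, err
	}
	return &snapshotFile{f: f, size: fi.Size()}, nil
}

func (s *snapshotFile) ReadAt(p []byte, off int64) (int, error) {
	// Bound the read using the cached size instead of re-Stat'ing the file.
	if off >= s.size {
		return 0, io.EOF
	}
	if remaining := s.size - off; int64(len(p)) > remaining {
		n, err := s.f.ReadAt(p[:remaining], off)
		if err == nil {
			err = io.EOF // short read at end of file
		}
		return n, err
	}
	return s.f.ReadAt(p, off)
}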


PhanLe1010 avatar PhanLe1010 commented on May 28, 2024 1

We already know the input data []byte needs to be aligned for direct I/O.
Can we always allocate an aligned buffer for the data on the caller side to prevent the extra data copy in the two functions?

Agree with this idea. I will give it a try to see if it can improve the read speed


derekbit avatar derekbit commented on May 28, 2024 1

To side track a little bit: do you think there is a danger of integer overflow as we keep increasing the counter here https://github.com/longhorn/longhorn-engine/blob/be5e072e02cd959ab371a04264957624223b01fb/pkg/replica/revision_counter.go#L143 ?

If we keep writing long enough, it might become negative.

It is int64. Probably no danger?


PhanLe1010 avatar PhanLe1010 commented on May 28, 2024 1

@shuo-wu This is the CPU usage of strconv.FormatInt that we were talking about in the sync-up. It is small but noticeable.

Screenshot from 2024-04-25 12-25-24


shuo-wu avatar shuo-wu commented on May 28, 2024 1

For every read, we have to make a system call, os.(*File).Stat, to get the size of the snapshot file here.

Good catch! I am fine with using the cached size. The snapshot immutability should be guaranteed by the ctime/checksum mechanism


PhanLe1010 avatar PhanLe1010 commented on May 28, 2024 1

Do you think we should backport this to v1.6.x and v1.5.x? @shuo-wu @ejweber @derekbit


ejweber avatar ejweber commented on May 28, 2024 1

Do you think we should backport this to v1.6.x and v1.5.x? @shuo-wu @ejweber @derekbit

I don't think there is much danger here, especially around compatibility, as the changes are all self-contained within the engine. On the other hand, the connection count will increase (and possibly CPU utilization as well) as a tradeoff for the performance gain. Maybe that's surprising in a patch update?

I lean slightly towards backporting, but I'm also fine with the gains being associated with a new minor version of Longhorn. I'll defer to whatever you decide.


PhanLe1010 avatar PhanLe1010 commented on May 28, 2024

To side track a little bit: do you think there is a danger of integer overflow as we keep increasing the counter here https://github.com/longhorn/longhorn-engine/blob/be5e072e02cd959ab371a04264957624223b01fb/pkg/replica/revision_counter.go#L143 ?

If we keep writing long enough, it might become negative.


PhanLe1010 avatar PhanLe1010 commented on May 28, 2024

As discussed with @shuo-wu in the sync-up, we reran some of the Broadcom tests. The setup is the same as in our initial Broadcom report in the SUSE data center.

Case 1: With CPU/RAM constraints

  1. Random read IOPS test
    • IOPS: similar
    • CPU usage: similar
  2. Sequential read IOPS test
    • IOPS: similar
    • CPU usage: similar
  3. Random write IOPS test
    • IOPS: increased from 2988 to 3150
    • CPU usage: similar
  4. Sequential write IOPS test
    • IOPS: similar
    • CPU usage: similar

Case 2: No CPU/RAM constraints, and workloads are NOT rate-limited

With master-head engine:

IOPS (Read/Write)
        Random:          26,603 / 28,501
    Sequential:          49,915 / 43,092

Bandwidth in KiB/sec (Read/Write)
        Random:      1,587,307 / 505,325
    Sequential:      1,827,467 / 535,291
                                        

Latency in ns (Read/Write)
        Random:        272,445 / 260,084
    Sequential:        273,663 / 257,785

With the PR:

IOPS (Read/Write)
        Random:          26,946 / 33,542
    Sequential:          50,484 / 51,998

Bandwidth in KiB/sec (Read/Write)
        Random:      1,595,310 / 541,211
    Sequential:      1,611,390 / 528,907
                                        

Latency in ns (Read/Write)
        Random:        274,136 / 262,201
    Sequential:        277,235 / 252,450

Note: in this case, read performance somehow doesn't increase. I believe it is something weird with tgt on Photon RT OS. Even with tgt + a local file backend, it still gives the same read performance as tgt + Longhorn => the bottleneck is at tgt.

Conclusion

  • Does the increased number of connections between the engine and replicas hurt a resource-constrained environment?
    -> Looks like no


PhanLe1010 avatar PhanLe1010 commented on May 28, 2024

Btw, this article is very helpful in understanding a common mistake in Go concurrency: https://eli.thegreenplace.net/2019/go-internals-capturing-loop-variables-in-closures/
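
For anyone following along, the pitfall that article covers is the classic loop-variable capture in goroutines (Go 1.22 later changed the loop-variable scoping, but code built with older Go versions is affected):

package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup

	// Buggy pattern (pre-Go 1.22): every goroutine captures the same loop
	// variable i, so they frequently all print the final value.
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			fmt.Println("shared:", i)
		}()
	}
	wg.Wait()

	// Fix: pass i as an argument so each goroutine gets its own copy.
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			fmt.Println("copy:", n)
		}(i)
	}
	wg.Wait()
}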


PhanLe1010 avatar PhanLe1010 commented on May 28, 2024

After testing, the snapshot file size caching idea doesn't seem to improve performance in practice even though it sounds good in theory. CPU usage is also relatively the same. Let's abandon it. Sorry for the noise.


longhorn-io-github-bot avatar longhorn-io-github-bot commented on May 28, 2024

Pre Ready-For-Testing Checklist

  • Where is the reproduce steps/test steps documented?
    The reproduce steps/test steps are at: #8436 (comment)

  • Is there a workaround for the issue? If so, where is it documented?
    The workaround is at:

  • Does the PR include the explanation for the fix or the feature?

  • Does the PR include deployment change (YAML/Chart)? If so, where are the PRs for both YAML file and Chart?
    The PR for the YAML change is at:
    The PR for the chart change is at:

  • Have the backend code been merged (Manager, Engine, Instance Manager, BackupStore etc) (including backport-needed/*)?
    The PR is at: longhorn/longhorn-engine#1085

  • Which areas/issues this PR might have potential impacts on?
    Area Volume Read/Write
    Issues

  • If labeled: require/LEP Has the Longhorn Enhancement Proposal PR submitted?
    The LEP PR is at

  • If labeled: area/ui Has the UI issue filed or ready to be merged (including backport-needed/*)?
    The UI issue/PR is at

  • If labeled: require/doc Has the necessary document PR submitted or merged (including backport-needed/*)?
    The documentation issue/PR is at

  • If labeled: require/automation-e2e Has the end-to-end test plan been merged? Have QAs agreed on the automation test case? If only test case skeleton w/o implementation, have you created an implementation issue (including backport-needed/*)
    The automation skeleton PR is at
    The automation test case PR is at
    The issue of automation test case implementation is at (please create by the template)

  • If labeled: require/automation-engine Has the engine integration test been merged (including backport-needed/*)?
    The engine automation PR is at

  • If labeled: require/manual-test-plan Has the manual test plan been documented?
    The updated manual test plan is at

  • If the fix introduces the code for backward compatibility Has a separate issue been filed with the label release/obsolete-compatibility?
    The compatibility issue is filed at


PhanLe1010 avatar PhanLe1010 commented on May 28, 2024

Test Plan:

  1. Create a cluster of 3 worker nodes with a similarly large spec, e.g. Equinix Metal m3.small.x86, Ubuntu 22.04, kernel 5.15.0-101-generic, 20 Gbps network
  2. Deploy Longhorn v1.6.1
  3. Run kbench
  4. Upgrade Longhorn to master-head and upgrade the engine to master-head
  5. Run kbench again
  6. Verify that you see better IOPS results
  7. See the issue description for an example


roger-ryao avatar roger-ryao commented on May 28, 2024

Verified on master-head 20240508

The test steps
#8436 (comment)

Result Passed

  1. In the AWS EC2 t2.xlarge environment, I did not observe significant differences in IOPS.
    Therefore, I decided to build a cluster in a local virtual machine environment to conduct this test.
    The test results are as follows.

v1.6.1

=========================
FIO Benchmark Summary
For: test_device
CPU Idleness Profiling: disabled
Size: 30G
Quick Mode: disabled
=========================
IOPS (Read/Write)
        Random:           10,329 / 4,684
    Sequential:          17,677 / 10,204

Bandwidth in KiB/sec (Read/Write)
        Random:        413,252 / 141,958
    Sequential:        461,723 / 170,358
                                        

Latency in ns (Read/Write)
        Random:        431,773 / 553,568
    Sequential:        357,682 / 547,949

master-head

=========================
FIO Benchmark Summary
For: test_device
CPU Idleness Profiling: disabled
Size: 30G
Quick Mode: disabled
=========================
IOPS (Read/Write)
        Random:           11,024 / 5,082
    Sequential:          18,149 / 10,672

Bandwidth in KiB/sec (Read/Write)
        Random:        415,402 / 137,600
    Sequential:        445,611 / 167,325
                                        

Latency in ns (Read/Write)
        Random:        409,800 / 525,464
    Sequential:        342,326 / 525,112

Here is the summary table

                   IOPS (Random Read/Write)    IOPS (Sequential Read/Write)
v1.6.1-1st         10,329 / 4,684              17,677 / 10,204
master-head-1st    11,024 / 5,082              18,149 / 10,672

                   Bandwidth KiB/sec (Random Read/Write)    Bandwidth KiB/sec (Sequential Read/Write)
v1.6.1-1st         413,252 / 141,958                        461,723 / 170,358
master-head-1st    415,402 / 137,600                        445,611 / 167,325

                   Latency ns (Random Read/Write)    Latency ns (Sequential Read/Write)
v1.6.1-1st         431,773 / 553,568                 357,682 / 547,949
master-head-1st    409,800 / 525,464                 342,326 / 525,112

Hi @PhanLe1010
I think we can close this ticket. If there are any concerns regarding the test results, please let me know.
