Comments (7)

krizhanovsky avatar krizhanovsky commented on May 25, 2024

Hi @egraldlo ,

thanks for your interest in the study.

Firstly, you should have seen a very small number of conflicts with the overlap argument of tfw_func() equal to 0, right? (Otherwise, if overlap is non-zero, there is a data dependency producing transactional conflicts.) A CPU transaction succeeds only if all of its data stays in the L1d cache. Theoretically, during the benchmark you can see transaction aborts if the data is evicted from the cache by some other task. CPU caches also have associativity, and nobody knows how they calculate their hash functions to store data, so there could be some pathological cases... However, I was also confused by the fact that data overlapping (actually, the absence of data overlapping) doesn't seem to affect transaction aborts. I asked the question on the Intel forum https://software.intel.com/en-us/forums/intel-isa-extensions/topic/488911 (see the 2nd question) and it seems it was left unanswered.

from blog.

egraldlo avatar egraldlo commented on May 25, 2024

Thanks @krizhanovsky. Recently I have been studying RTM, and I ran your code on an Intel(R) Xeon(R) CPU E5-2643 v4 @ 3.4GHz with GCC 4.8.5. When the overlap is zero, the number of conflicts is actually not small: when sz_size of the transaction is 64 and overlap is zero, the conflict rate is 99%. That is a weird result, and I think it does not conform to the principle of RTM.
Apart from this, I have three other questions that I would like to discuss with you :)

  1. STM is slower than the spinlock approach, and the results of the code show that RTM is also no faster than a spinlock, am I right?
  2. I am confused that some aborts happen even when a single thread runs. The abort count is small, but in my opinion there should be no aborts at all in a single-threaded run. Does some CPU interrupt affect it?
  3. When should we invoke the _xabort(status) function in our code? I will show the question in the following comment, thanks very much.

egraldlo avatar egraldlo commented on May 25, 2024

Hi @krizhanovsky, this is question 3 from above:
I used the _xabort(status) function in my code in order to release the buffer in the cache. I check whether the transaction is successful: if it succeeds, I call _xend(), otherwise I call _xabort(status). Can I use _xabort(status) in this situation?

https://gist.github.com/egraldlo/ff456b0fd25d38ac2124a451c82853c1

Can I use _xabort(status) like this? It's a single-threaded case: I check the result of transaction_func() myself and call _xabort(status). I think it's an issue with the usage of _xabort(status): I don't know whether the call can stay inside the (status == _XBEGIN_STARTED) branch. In my opinion, if status equals _XBEGIN_STARTED, it will not abort the transaction, so is this code wrong?

If transaction_func() returns false and I printf the status in this code, why is the status value 0xff000001?

krizhanovsky avatar krizhanovsky commented on May 25, 2024

Hi @egraldlo,

when sz_size of the transaction is 64, overlap is zero, the conflicts rate is 99%

Does 64 mean cache lines or bytes? In the study I saw 100% aborts on single-threaded transactions with 256 cache lines. In my case it meant that only roughly 1/2 of L1d can be used for a transaction. Please check the cache parameters (and don't forget about associativity!) for your CPU.
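
To make the "1/2 of L1d" arithmetic concrete, here is a small sketch. The geometry (32 KiB, 8-way, 64-byte lines) and the simple modulo set indexing are assumptions for illustration, not values taken from this thread; the names L1dGeometry and l1d_fraction are hypothetical. Check the real numbers for your CPU.

```cpp
#include <cstddef>

// Assumed L1d geometry: 32 KiB, 8-way set associative, 64-byte lines.
// These defaults are illustrative; substitute the values for your CPU.
struct L1dGeometry {
    std::size_t size_bytes = 32 * 1024;
    std::size_t ways = 8;
    std::size_t line_bytes = 64;

    // Total number of cache lines the L1d can hold.
    std::size_t lines() const { return size_bytes / line_bytes; }
    // Number of sets; each set holds `ways` lines.
    std::size_t sets() const { return lines() / ways; }
    // Set selected by an address, ASSUMING plain modulo indexing.
    std::size_t set_of(std::size_t addr) const {
        return (addr / line_bytes) % sets();
    }
};

// Fraction of the L1d covered by a transaction touching n_lines distinct lines.
double l1d_fraction(const L1dGeometry& g, std::size_t n_lines) {
    return static_cast<double>(n_lines) / static_cast<double>(g.lines());
}
```

With these assumed numbers the cache holds 512 lines, so 256 lines is exactly half of it. Addresses 4096 bytes apart land in the same set, so more than 8 such lines cannot coexist no matter how small the transaction is overall. That is the kind of associativity effect to keep in mind.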

the performance of STM is slower than SPINLOCK way, the results of the codes show that performance of RTM is also not better than SPINLOCK, am I right?

I saw that TSX can be significantly faster than a spinlock on short and small transactions. Please check the graphs for transaction size and transaction time.

I am so confused about that some aborts happens when single thread runs, the abort number is small, but in my opinion, it must be no aborts in a single thread situation, some interrupt in CPU affects it?

Yeah, maybe so. I thought that it could make sense to try TSX in kernel mode with preemption disabled. Or use taskset(1) to bind the test program to a CPU and assign interrupts to other CPUs.
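
If you prefer to do the binding from inside the test program rather than with taskset(1), a minimal Linux-only sketch could look like this. The helper names first_allowed_cpu and pin_to_cpu are made up for the example; only the sched_getaffinity/sched_setaffinity calls and the CPU_* macros come from glibc.

```cpp
#include <sched.h>  // sched_getaffinity, sched_setaffinity, CPU_* macros (Linux/glibc)

// First CPU the calling thread is currently allowed to run on, or -1 on error.
int first_allowed_cpu() {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0) return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &set)) return cpu;
    }
    return -1;
}

// Restrict the calling thread to a single CPU; returns true on success.
bool pin_to_cpu(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set) == 0;
}
```

Calling pin_to_cpu(first_allowed_cpu()) before the benchmark loop keeps the thread on one core. It does not move interrupts away from that core; that part is a separate system-level setting and is not shown here.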

check the transaction whether it's successful, if the transaction success, I use _xend(), otherwise I use _xabort(status). Can I use the _xabort(status) in this situation?

It depends on what transaction_func() is. If it executes some checks and returns false if they fail, then you can abort the transaction. This is just the case for https://github.com/natsys/blog/blob/master/tsx.cc#L208 - we check the condition and abort the transaction if the check fails.

I don't know whether it can remain in (status == _XBEGIN_STARTED) area. In my opinion, if the status equals _XBEGIN_STARTED, it will not abort the transaction, so this code is not right?

When you call _xbegin(), the CPU starts a transaction. It can abort at any time: just after the if statement, inside transaction_func(), or just before _xend(). In any case, if a transaction aborts, the CPU jumps to just after the _xbegin() call, setting a different result in the status variable. It makes sense to abort a transaction only while the transaction is in progress, so your code looks good to me.

I hope this helps.
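
The control flow described above can be sketched as follows. This is an illustration for x86 with GCC or Clang, not the code from tsx.cc or from the gist: transaction_func() here is a hypothetical stand-in, try_once() and cpu_has_rtm() are names invented for the example, and the target("rtm") attribute is used so that the snippet does not depend on the -mrtm flag.

```cpp
#include <immintrin.h>  // _xbegin, _xend, _xabort, _XBEGIN_STARTED
#include <cpuid.h>      // __get_cpuid_count (GCC/Clang, x86 only)

// True if CPUID reports RTM support (leaf 7, subleaf 0, EBX bit 11).
// Executing _xbegin() on a CPU without RTM raises an illegal instruction.
bool cpu_has_rtm() {
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) return false;
    return ((ebx >> 11) & 1u) != 0;
}

// Hypothetical check executed inside the transaction: modifies the data
// and reports whether the result is acceptable.
static bool transaction_func(int* data) {
    *data += 1;
    return *data < 100;
}

// Runs one transaction attempt. Returns _XBEGIN_STARTED if it committed,
// otherwise the abort status delivered by the CPU.
__attribute__((target("rtm")))
unsigned try_once(int* data) {
    unsigned status = _xbegin();
    if (status == _XBEGIN_STARTED) {
        // We are inside the transaction here, so _xabort() is meaningful.
        if (!transaction_func(data))
            _xabort(0xff);  // rolls back; execution resumes after _xbegin()
        _xend();            // commit
    }
    // On any abort we arrive here with status != _XBEGIN_STARTED.
    return status;
}
```

Note that everything written inside an aborted transaction is discarded, including the increment done by transaction_func(). The same applies to a counter or flag updated between _xbegin() and _xabort(): after the abort it holds its old value, so such bookkeeping has to be done on the abort path outside the transaction.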

egraldlo avatar egraldlo commented on May 25, 2024

@krizhanovsky, thanks for your detailed answers!

I saw that TSX can be significantly faster than a spinlock on short and small transactions. Please check graph for transaction size and transaction time.

Yes, but that's only in the no-data-overlapping situation: TSX can be faster only when the threads work on independent locations. In the data-overlapping situation, even when the overlapping area is very small, TSX is slow.

It depends on what transaction_func() is. If it executes some checks and returns false if they fail, then you can abort the transaction. This is just the case for https://github.com/natsys/blog/blob/master/tsx.cc#L208 - we check the condition and abort the transaction if the check fails.

Actually, when the transaction fails, _xabort(status) is not called. I added a printf in front of this line https://github.com/natsys/blog/blob/master/tsx.cc#L208 and it didn't print anything. I think the error could be in the way I use _xabort(status); maybe we must pass the right status value. Which status value must we use in different situations? The manual says: "Abort the current transaction. When no transaction is active this is a no-op. status must be a 8bit constant, that is included in the status code returned by _xbegin"

krizhanovsky avatar krizhanovsky commented on May 25, 2024

Hi @egraldlo,

you're welcome, I hope the answers are useful for you.

yes, but it's just in the no-data-overlapping situation. only when it's in the case of independent location. TSX can be faster. However, in the data overlapping situation, even though the overlapping area size is too small, TSX is slow.

Yes, you're right. The graph is the best result I could get at that time. I'm planning to get back to the research to explore locking in more complex cases, e.g. tree updates.

Actually, when the transaction fail, the _xabort(status) did not be called

I meant failed checks in transaction_func(), not the TSX transaction itself.

I add a printf in front of this line

printf() is quite a complex function that can involve a system call with a context switch, so I believe it isn't the best way to debug TSX. The Intel PCM tool will probably do a better job for you.

I think the error could be the way I use _xabort(status), maybe we must transfer the right status value. which status value we must use in different situation?

I didn't get it. I see only _xabort(0xff); in the code you provided and it looks fine...
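
Regarding the 0xff000001 status asked about earlier in the thread: the abort status is a bit field, and a value like that can be decoded by hand. The sketch below mirrors the _XABORT_* bit positions from immintrin.h with local constants so that it builds without any TSX support; the names kExplicit, abort_code and describe_abort are invented for the example.

```cpp
#include <cstdint>
#include <string>

// Local copies of the RTM abort status bits (same positions as the
// _XABORT_* macros in immintrin.h).
constexpr std::uint32_t kExplicit = 1u << 0;  // aborted by _xabort()
constexpr std::uint32_t kRetry    = 1u << 1;  // a retry may succeed
constexpr std::uint32_t kConflict = 1u << 2;  // memory conflict with another thread
constexpr std::uint32_t kCapacity = 1u << 3;  // transactional buffer overflow
constexpr std::uint32_t kDebug    = 1u << 4;  // debug breakpoint hit
constexpr std::uint32_t kNested   = 1u << 5;  // abort in a nested transaction

// The 8-bit constant passed to _xabort(), stored in bits 31:24
// (what _XABORT_CODE() extracts).
std::uint8_t abort_code(std::uint32_t status) {
    return static_cast<std::uint8_t>((status >> 24) & 0xffu);
}

// Human-readable list of the flags set in an abort status.
std::string describe_abort(std::uint32_t status) {
    std::string out;
    auto add = [&out](const char* name) {
        if (!out.empty()) out += '|';
        out += name;
    };
    if (status & kExplicit) add("explicit");
    if (status & kRetry)    add("retry");
    if (status & kConflict) add("conflict");
    if (status & kCapacity) add("capacity");
    if (status & kDebug)    add("debug");
    if (status & kNested)   add("nested");
    if (out.empty()) out = "other";
    return out;
}
```

For 0xff000001 this gives the explicit flag with abort code 0xff, which is what one would expect to see after _xabort(0xff) has actually executed inside a transaction.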

egraldlo avatar egraldlo commented on May 25, 2024

Hi @krizhanovsky, I really appreciate that you plan to use RTM for trees, it will be great, but I am worried about the performance. :(

I meant failed checks in transaction_func(), not the TSX transaction itself.

Yes, I check in transaction_func() and return false, but I can't get _xabort(status) to be called. I use a counter to trace the _xabort(status) calls, and it stays at zero.
