When the operation order was changed from cw > hc > hw to hw > cw > hc, pe


I Have a question about triplet-attention HOT 5 CLOSED

landskape-ai commented on July 20, 2024

I Have a question

from triplet-attention.

Comments (5)

digantamisra98 commented on July 20, 2024

Ideally that shouldn't be the case, since all three operations are computed in parallel and independently of one another. Can you provide a reproducible experiment where we can observe that? Also, are you sure the seeds were the same between the two runs and that there is no other source of randomness?
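As a rough illustration of the independence argument, the sketch below evaluates the three branches in two different orders with the same weights and checks that the averaged outputs agree up to floating-point rounding. It is a simplified stand-in, not the repository code: `Gate` uses a plain `nn.Conv2d` without batch norm, and the helper names `branch_hw`, `branch_hc`, and `branch_cw` are hypothetical.

```python
import torch
import torch.nn as nn


class ZPool(nn.Module):
    def forward(self, x):
        # Stack max and mean over dim 1 into a 2-channel map.
        return torch.cat(
            (x.max(dim=1, keepdim=True)[0], x.mean(dim=1, keepdim=True)), dim=1
        )


class Gate(nn.Module):
    # Assumption: simplified AttentionGate with a plain Conv2d and no batch norm.
    def __init__(self, kernel_size=7):
        super().__init__()
        self.compress = ZPool()
        self.conv = nn.Conv2d(
            2, 1, kernel_size, padding=(kernel_size - 1) // 2, bias=False
        )

    def forward(self, x):
        return x * torch.sigmoid(self.conv(self.compress(x)))


# Hypothetical helpers: each branch reads only the input x and its own gate.
def branch_hw(gates, x):
    return gates["hw"](x)


def branch_hc(gates, x):
    return gates["hc"](x.permute(0, 3, 2, 1)).permute(0, 3, 2, 1)


def branch_cw(gates, x):
    return gates["cw"](x.permute(0, 2, 1, 3)).permute(0, 2, 1, 3)


torch.manual_seed(0)
gates = nn.ModuleDict({"cw": Gate(), "hc": Gate(), "hw": Gate()}).eval()
x = torch.randn(2, 8, 16, 16)

with torch.no_grad():
    # Order A: cw, hc, hw
    a_cw, a_hc, a_hw = branch_cw(gates, x), branch_hc(gates, x), branch_hw(gates, x)
    out_a = (a_cw + a_hc + a_hw) / 3
    # Order B: hw, cw, hc
    b_hw, b_cw, b_hc = branch_hw(gates, x), branch_cw(gates, x), branch_hc(gates, x)
    out_b = (b_hw + b_cw + b_hc) / 3

# Same weights and same input: the results match up to rounding.
print(torch.allclose(out_a, out_b, atol=1e-6))
```

Note that floating-point addition is not associative, so changing the order of the final sum can alter the last few bits of the output even though the branches themselves are unchanged.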


byunsunghun commented on July 20, 2024

Thank you for your response.
The code below is the version I modified.
When I tested the original code and the modified code on custom data, the modified code showed a slight improvement. I wonder whether there is a cause I'm not aware of.

The experimental model was YOLOv8m, with the module combined with the bottleneck block; the same combination was used in all experiments.

import torch
import torch.nn as nn
from ultralytics.nn.modules import Conv  # assumption: the Ultralytics (YOLOv8) Conv block


class ZPool(nn.Module):
    """Concatenate the max and mean over dim 1 into a 2-channel map."""

    def forward(self, x):
        return torch.cat(
            (torch.max(x, 1)[0].unsqueeze(1), torch.mean(x, 1).unsqueeze(1)), dim=1
        )


class AttentionGate(nn.Module):
    def __init__(self):
        super().__init__()
        kernel_size = 7
        self.compress = ZPool()
        self.conv = Conv(
            2, 1, k=kernel_size, s=1, p=(kernel_size - 1) // 2, act=False
        )

    def forward(self, x):
        x_compress = self.compress(x)
        x_out = self.conv(x_compress)
        scale = torch.sigmoid(x_out)  # attention weights in (0, 1)
        return x * scale


class TripletAttention(nn.Module):
    def __init__(self, no_spatial=False):  # no_spatial is accepted but unused here
        super().__init__()
        self.cw = AttentionGate()
        self.hc = AttentionGate()
        self.hw = AttentionGate()

    def forward(self, x):
        # Branch order used in this experiment: hw, then hc, then cw.
        x_hw = self.hw(x)
        # Swap C and W so the gate pools over W, then swap back.
        x_hc = self.hc(x.permute(0, 3, 2, 1).contiguous()).permute(0, 3, 2, 1).contiguous()
        # Swap C and H so the gate pools over H, then swap back.
        x_cw = self.cw(x.permute(0, 2, 1, 3).contiguous()).permute(0, 2, 1, 3).contiguous()
        x_out = 1/3 * (x_hw + x_hc + x_cw)
        return x_out


digantamisra98 commented on July 20, 2024

@byunsunghun Sorry for the late response. From your snippet, I don't see anything obvious that would explain the improvement you mentioned. However, as I stated, unfixed seeds or any other source of randomness between the two experiments can cause variance in performance. It would be best to run each setting with multiple seeds and compare the mean and variance of the results.
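A multi-seed comparison along these lines could be sketched as follows. The function names `run_multi_seed` and `summarize` are hypothetical, and `placeholder_train_and_eval` only stands in for a real training run that returns a metric such as mAP; the numbers it produces are meaningless.

```python
import random
import statistics
from typing import Callable, Dict, Iterable, List


def run_multi_seed(
    train_and_eval: Callable[[str, int], float],
    variant: str,
    seeds: Iterable[int],
) -> List[float]:
    """Run one variant once per seed and collect the resulting metric."""
    return [train_and_eval(variant, seed) for seed in seeds]


def summarize(scores: List[float]) -> Dict[str, float]:
    """Mean and sample standard deviation (requires at least two scores)."""
    return {
        "mean": statistics.mean(scores),
        "std": statistics.stdev(scores),
        "n": len(scores),
    }


def placeholder_train_and_eval(variant: str, seed: int) -> float:
    # Placeholder only: replace with real training and evaluation.
    rng = random.Random(f"{variant}-{seed}")
    return rng.uniform(0.0, 1.0)


seeds = [0, 1, 2, 3, 4]
for variant in ("original_order", "modified_order"):
    scores = run_multi_seed(placeholder_train_and_eval, variant, seeds)
    print(variant, summarize(scores))
```

If the gap between the two means is small compared with the per-setting standard deviation, the observed improvement is plausibly within seed-to-seed noise.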


byunsunghun commented on July 20, 2024

Thank you for your reply. The seeds were fixed in both experiments. We will separately investigate whether other sources of randomness in the code cause the performance difference.
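For reference, a minimal sketch of pinning the common PyTorch randomness sources before each run is shown here. `seed_everything` is a hypothetical helper name, not part of this repository, and it does not cover everything: NumPy-based augmentation, data loader workers, and some GPU kernels may need separate handling.

```python
import random

import torch


def seed_everything(seed: int, deterministic: bool = True) -> None:
    """Seed Python and PyTorch RNGs; optionally request deterministic cuDNN."""
    random.seed(seed)
    torch.manual_seed(seed)           # seeds the CPU generator (and CUDA devices)
    torch.cuda.manual_seed_all(seed)  # explicit; a no-op when CUDA is unavailable
    if deterministic:
        # Trade some speed for repeatable convolution algorithm selection.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    # If NumPy is used (e.g. for augmentation), also call np.random.seed(seed).


# Re-seeding reproduces the same random draws.
seed_everything(0)
first = torch.rand(3)
seed_everything(0)
second = torch.rand(3)
print(torch.equal(first, second))
```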

Happy new year!


digantamisra98 commented on July 20, 2024

Keep me posted, happy new year to you too!
