Code Monkey home page Code Monkey logo

Comments (7)

markwragg avatar markwragg commented on August 16, 2024 1

This all sounds great to me. I will switch to Working Set monitoring to workaround the issue. I like the idea of this being automatic when the process names are over a certain length.

I completely agree the problem here is our process names and I have added a backlog item to get them made shorter, but I think it can be a tricky process to rename services in Service Fabric if you need to also maintain state as I think Service Fabric will treat the renamed services as new ones.

I think the workaround and future mitigation is good, feel free to close this issue if you agree there isn't anything more to do here.

from service-fabric-observer.

markwragg avatar markwragg commented on August 16, 2024

We were previously using Fabric Observer 3.1.23 and have just upgraded to 3.1.26. The issue was surfaced to us due to the new default monitor of the service fabric logs directory, but I can see the same problem was occurring in 3.1.23, however it generated smaller log files as it was logged as just a single line (again process name has been changed for privacy):

2022-06-16 00:00:00.3767--WARN--Unable to get private working set for process examp.proc.xxx.value.somename.nameofprocessis.verylongindeed.process with id 22956. Returning 0.

from service-fabric-observer.

GitTorre avatar GitTorre commented on August 16, 2024

Thank you for this report.

I am curious why the process names need to be so long. That is of course your choice (and not really any of my business), but you can see this approach causes issues that are unrelated to FO, besides its logging stack traces (which I can change in the next release to be less verbose, but it is quite useful in debugging, as you can see. The older message is useless if the goal is to understand what went wrong).

For now, I recommend that you disable Private Working Set monitoring by AppObserver. This will force FO to use a different approach, Working Set (Private + Shared), which does not employ a perf counter. It will use a Win32 API call. The value will be larger since it is private memory plus memory used by shared components, but you will be accurately informed when your service processes are using too much memory as their private memory is what would account for dynamic growth, especially when they approach "leaking" or over-use that exceeds your threshold. This could be due to some bug in user code, for example (which is one of the things you are monitoring). You can do this today with a version-less, parameter-only application upgrade (so, no need to redeploy FO):

$appParams = @{ "AppObserverMonitorPrivateWorkingSet" = "false";  }
Start-ServiceFabricApplicationUpgrade -ApplicationName fabric:/FabricObserver -ApplicationParameter $appParams -ApplicationTypeVersion 3.1.26 -UnMonitoredAuto

In the meantime, I will investigate how to work around this problem. Do note that FO can be configured to manage its own log folders so you could decrease the amount of time it waits to clean them up, for example (default is 3 days). See Settings.xml (this requires that you redeploy FO or do a configuration-only upgrade where you will need to bump the versions of both Code and Config):

    <!-- Optional: Maximum Number of days to store Archived observer log files on disk. 
            This setting spans all ObserverLogger instances (ObserverBase.ObserverLogger), including plugins and ObserverManager. 
            The default value is 0. An empty value or 0 means there is no lifetime constraint. They are immortal files. -->
    <Parameter Name="MaxArchivedLogFileLifetimeDays" Value="3" />

But the real issue here is related to process naming convention you have employed. The other stuff are side effects of this decision.

Let me know how turning off Private Working Set monitoring works out for you. I will also look into why there is a restriction of process name length that is less than MAX_PATH characters on Windows. This seems odd to me. I will do some tests and see if there is a way to work around this restriction. For now, you have a mitigation to try.

Thanks again for the detailed report. I'm sorry you are running into issues.

from service-fabric-observer.

GitTorre avatar GitTorre commented on August 16, 2024

There is a 64 max character limit on InstanceName for perf counters in .NET Core 3.1.

From .NET Core source code:

-or- instanceName is longer than 127 characters.

However, in fact, it looks like the max length for InstanceName in .NET Core 3.1 is actually 64. So, the docs and code comments are confusing. Suffice it to say, you must keep your process names under 64 characters to monitor private working set with .NET Core 3.1 PeformanceCounter class.

        // Exceptions:
        //   T:System.InvalidOperationException:
        //     categoryName is an empty string (""). -or- counterName is an empty string ("").
        //     -or- The read/write permission setting requested is invalid for this counter.
        //     -or- The category specified does not exist (if readOnly is true). -or- The category
        //     specified is not a .NET Framework custom category (if readOnly is false). -or-
        //     The category specified is marked as multi-instance and requires the performance
        //     counter to be created with an instance name. -or- instanceName is longer than
        //     127 characters. -or- categoryName and counterName have been localized into different
        //     languages.

https://docs.microsoft.com/en-us/dotnet/api/system.diagnostics.performancecounter.instancename?view=dotnet-plat-ext-6.0

You really should consider decreasing the length of your service process names to satisfy the requirements of the underlying platform. Or, just use Working Set (Private + Shared) if you must keep your naming conventions.

For 3.2.x releases, I have added code to provide Working Set (Private + Shared) when FO encounters process names that exceed the max supported InstanceName size. FO will log that it did this only once (to limit log size) until it restarts. Let me know if you think this is a good idea. I could also truncate the name to 64 characters and then the Private Working Set routine will not fail. However, if there are multiple processes that exceed the max length and are the same up to 64 characters, then this hack may not work (though FO already runs names through a lookup on process id to get the internal name (InstanceName) from the system in the case where you have multiple services processes of the same name, so it may in fact work..). Not sure I will do that, however. It's kind of hacky,

from service-fabric-observer.

GitTorre avatar GitTorre commented on August 16, 2024

The return full working set impl when encountering a process name that exceeds 64 chars is now in place in develop (v3.2.0) and net6 (v3.2.1) branches, both of which are tested (per usual). Handled exception output will no longer include stack. This will decrease the size of logs when they happen over and over again.

from service-fabric-observer.

markwragg avatar markwragg commented on August 16, 2024

FYI I've noticed that despite having MaxArchivedLogFileLifetimeDays set to 3 (which it was already by default) I'm seeing 8 days of logs under the C:\fabric_observer_logs\Utilitie directory (the current log and 7 days of ones with a date stamp).

Is this expected?

from service-fabric-observer.

GitTorre avatar GitTorre commented on August 16, 2024

FYI I've noticed that despite having MaxArchivedLogFileLifetimeDays set to 3 (which it was already by default) I'm seeing 8 days of logs under the C:\fabric_observer_logs\Utilitie directory (the current log and 7 days of ones with a date stamp).

Is this expected?

OK. For this, Utilities types that log are not constructing Logger instances with max archive days specified, so it defaults to 7. For observers, it defaults to 3 as ObserverBase ctor creates the observer loggers using data from Settings.xml (so, 3 days default). This setting only applies to observer loggers, not loggers created in FabricObserver.Extensibility.

The default for any logger instance is actually 7 days if no max archive days are specified in logger instance construction (and that only applies to observers). Sorry about the confusion.

In Logger.cs:

var target = new FileTarget
{
    Name = targetName,
    ConcurrentWrites = true,
    EnableFileDelete = true,
    FileName = file,
    Layout = "${longdate}--${uppercase:${level}}--${message}",
    OpenFileCacheTimeout = 5,
    ArchiveEvery = FileArchivePeriod.Day,
    ArchiveNumbering = ArchiveNumberingMode.DateAndSequence,
    MaxArchiveDays = MaxArchiveFileLifetimeDays <= 0 ? 7 : MaxArchiveFileLifetimeDays,
    AutoFlush = true
};

from service-fabric-observer.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.