Comments (7)
This all sounds great to me. I will switch to Working Set monitoring to workaround the issue. I like the idea of this being automatic when the process names are over a certain length.
I completely agree the problem here is our process names and I have added a backlog item to get them made shorter, but I think it can be a tricky process to rename services in Service Fabric if you need to also maintain state as I think Service Fabric will treat the renamed services as new ones.
I think the workaround and future mitigation is good, feel free to close this issue if you agree there isn't anything more to do here.
from service-fabric-observer.
We were previously using Fabric Observer 3.1.23 and have just upgraded to 3.1.26. The issue was surfaced to us due to the new default monitor of the service fabric logs directory, but I can see the same problem was occurring in 3.1.23, however it generated smaller log files as it was logged as just a single line (again process name has been changed for privacy):
2022-06-16 00:00:00.3767--WARN--Unable to get private working set for process examp.proc.xxx.value.somename.nameofprocessis.verylongindeed.process with id 22956. Returning 0.
from service-fabric-observer.
Thank you for this report.
I am curious why the process names need to be so long. That is of course your choice (and not really any of my business), but you can see this approach causes issues that are unrelated to FO, besides its logging stack traces (which I can change in the next release to be less verbose, but it is quite useful in debugging, as you can see. The older message is useless if the goal is to understand what went wrong).
For now, I recommend that you disable Private Working Set monitoring by AppObserver. This will force FO to use a different approach, Working Set (Private + Shared), which does not employ a perf counter. It will use a Win32 API call. The value will be larger since it is private memory plus memory used by shared components, but you will be accurately informed when your service processes are using too much memory as their private memory is what would account for dynamic growth, especially when they approach "leaking" or over-use that exceeds your threshold. This could be due to some bug in user code, for example (which is one of the things you are monitoring). You can do this today with a version-less, parameter-only application upgrade (so, no need to redeploy FO):
$appParams = @{ "AppObserverMonitorPrivateWorkingSet" = "false"; }
Start-ServiceFabricApplicationUpgrade -ApplicationName fabric:/FabricObserver -ApplicationParameter $appParams -ApplicationTypeVersion 3.1.26 -UnMonitoredAuto
In the meantime, I will investigate how to work around this problem. Do note that FO can be configured to manage its own log folders so you could decrease the amount of time it waits to clean them up, for example (default is 3 days). See Settings.xml (this requires that you redeploy FO or do a configuration-only upgrade where you will need to bump the versions of both Code and Config):
<!-- Optional: Maximum Number of days to store Archived observer log files on disk.
This setting spans all ObserverLogger instances (ObserverBase.ObserverLogger), including plugins and ObserverManager.
The default value is 0. An empty value or 0 means there is no lifetime constraint. They are immortal files. -->
<Parameter Name="MaxArchivedLogFileLifetimeDays" Value="3" />
But the real issue here is related to process naming convention you have employed. The other stuff are side effects of this decision.
Let me know how turning off Private Working Set monitoring works out for you. I will also look into why there is a restriction of process name length that is less than MAX_PATH characters on Windows. This seems odd to me. I will do some tests and see if there is a way to work around this restriction. For now, you have a mitigation to try.
Thanks again for the detailed report. I'm sorry you are running into issues.
from service-fabric-observer.
There is a 64 max character limit on InstanceName for perf counters in .NET Core 3.1.
From .NET Core source code:
-or- instanceName is longer than 127 characters.
However, in fact, it looks like the max length for InstanceName in .NET Core 3.1 is actually 64. So, the docs and code comments are confusing. Suffice it to say, you must keep your process names under 64 characters to monitor private working set with .NET Core 3.1 PeformanceCounter class.
// Exceptions:
// T:System.InvalidOperationException:
// categoryName is an empty string (""). -or- counterName is an empty string ("").
// -or- The read/write permission setting requested is invalid for this counter.
// -or- The category specified does not exist (if readOnly is true). -or- The category
// specified is not a .NET Framework custom category (if readOnly is false). -or-
// The category specified is marked as multi-instance and requires the performance
// counter to be created with an instance name. -or- instanceName is longer than
// 127 characters. -or- categoryName and counterName have been localized into different
// languages.
You really should consider decreasing the length of your service process names to satisfy the requirements of the underlying platform. Or, just use Working Set (Private + Shared) if you must keep your naming conventions.
For 3.2.x releases, I have added code to provide Working Set (Private + Shared) when FO encounters process names that exceed the max supported InstanceName size. FO will log that it did this only once (to limit log size) until it restarts. Let me know if you think this is a good idea. I could also truncate the name to 64 characters and then the Private Working Set routine will not fail. However, if there are multiple processes that exceed the max length and are the same up to 64 characters, then this hack may not work (though FO already runs names through a lookup on process id to get the internal name (InstanceName) from the system in the case where you have multiple services processes of the same name, so it may in fact work..). Not sure I will do that, however. It's kind of hacky,
from service-fabric-observer.
The return full working set impl when encountering a process name that exceeds 64 chars is now in place in develop (v3.2.0) and net6 (v3.2.1) branches, both of which are tested (per usual). Handled exception output will no longer include stack. This will decrease the size of logs when they happen over and over again.
from service-fabric-observer.
FYI I've noticed that despite having MaxArchivedLogFileLifetimeDays
set to 3 (which it was already by default) I'm seeing 8 days of logs under the C:\fabric_observer_logs\Utilitie directory (the current log and 7 days of ones with a date stamp).
Is this expected?
from service-fabric-observer.
FYI I've noticed that despite having
MaxArchivedLogFileLifetimeDays
set to 3 (which it was already by default) I'm seeing 8 days of logs under the C:\fabric_observer_logs\Utilitie directory (the current log and 7 days of ones with a date stamp).Is this expected?
OK. For this, Utilities types that log are not constructing Logger instances with max archive days specified, so it defaults to 7. For observers, it defaults to 3 as ObserverBase ctor creates the observer loggers using data from Settings.xml (so, 3 days default). This setting only applies to observer loggers, not loggers created in FabricObserver.Extensibility.
The default for any logger instance is actually 7 days if no max archive days are specified in logger instance construction (and that only applies to observers). Sorry about the confusion.
In Logger.cs:
var target = new FileTarget
{
Name = targetName,
ConcurrentWrites = true,
EnableFileDelete = true,
FileName = file,
Layout = "${longdate}--${uppercase:${level}}--${message}",
OpenFileCacheTimeout = 5,
ArchiveEvery = FileArchivePeriod.Day,
ArchiveNumbering = ArchiveNumberingMode.DateAndSequence,
MaxArchiveDays = MaxArchiveFileLifetimeDays <= 0 ? 7 : MaxArchiveFileLifetimeDays,
AutoFlush = true
};
from service-fabric-observer.
Related Issues (20)
- With support for .Net 3.1 ending tin 6 months, can it be migrated to .Net 6 (LTS) HOT 6
- [WORK ITEM] Docs
- [BUG] net6 branch: fix typo in README.md HOT 1
- [BUG] Deployed fabricobserver failed to start HOT 2
- [BUG] FO can put itself into Warning with an Infinite TTL Health Report.
- [BUG] Linux: FabricSystemObserver does not report results for some services.
- [FEATURE REQUEST] Add RG Monitoring support (Memory). Windows-only.
- [FEATURE REQUEST] Add Private Bytes monitoring to AppObserver. Windows-only.
- [FEATURE REQUEST] Enable Dump on Warning for Windows user service processes.
- [FEATURE REQUEST] Add support to AppObserver for monitoring multiple code packages.
- [BUG] AppObserver does not monitor GuestExecutable services.
- [BUG] ClusterObserver not building HOT 2
- Is per service disk space monitoring possible? HOT 12
- [BUG]: Concurrent monitoring: CPU data deletion mistiming can lead to no results. HOT 2
- [BUG] log directory calculation in FabricObserver.Extensibility/Utilities/DataTableFileLogger.cs, ConfigureLogger is faulty HOT 4
- [BUG] Allocated File Handles metric no longer being logged by AppObserver in FabricObserver 3.2.8 HOT 2
- [BUG] Child process monitoring does not work when parent procs have only one descendant.
- Informing the customer when a plugin is incorrect[BUG]
- [FEATURE REQUEST] Make telemetry parameters configurable in application parameters HOT 15
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from service-fabric-observer.