
flatfile's Issues

Fixed length with datetime decimal

Hi,

I use fixed length for attributes like following:

[FixedLengthFile]
public class MyFile
{
    [FixedLengthField(1, 1)]
    public string LineType { get; set; }

    [FixedLengthField(2, 13, Padding = Padding.Right, PaddingChar = ' ')]
    public string FormatVersion { get; set; }

    // @ToDo: What to write here? ... I miss DecimalPlaces = 2 ...
    [FixedLengthField(15, 15, PaddingChar = '0')]
    public decimal Amount { get; set; }
}

How can I use fixed length with decimal values? The specification allows 2 decimal places, and I need to specify that somewhere...
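If the library has no built-in equivalent of DecimalPlaces, one workaround is to map the raw digits as a string and expose a computed decimal (AmountRaw is just an illustrative name, not part of the library):

// Workaround sketch for an implied two-decimal-place amount: map the raw
// zero-padded digits and apply the scale yourself.
[FixedLengthField(15, 15, PaddingChar = '0')]
public string AmountRaw { get; set; }

// Not mapped; e.g. "000000000012345" -> 123.45m
public decimal Amount => decimal.Parse(AmountRaw) / 100m;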

Fails to parse delimited row if last field is empty

I've got this class:

    [DelimitedFile(Delimiter = "|", Quotes = "\"", HasHeader = true)]
    class SchemaRow
    {
        [DelimitedField(1)] public int Ordinal { get; set; }
        [DelimitedField(2)] public Guid GUID { get; set; }
        [DelimitedField(3)] public string KeyItem { get; set; }
        [DelimitedField(4)] public string SNLxlKeyField { get; set; }
        [DelimitedField(5)] public string ProductCaption { get; set; }
        [DelimitedField(6)] public string DataType { get; set; }
        [DelimitedField(7, NullValue = "|")] public string Magnitude { get; set; }
        [DelimitedField(8, NullValue = "|")] public int? Length { get; set; }
    }

It blows up parsing this row where the last 2 fields happen to be empty. I can't seem to find a value to set on the NullValue attribute parameter to make this work. Rows where the last field is populated but the second-to-last field is empty work fine.

5|40F7CE96-DC46-4EFC-AAC4-C76199C1E769|1403|132092|End of Fiscal Period Date|smalldatetime||
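For reference, splitting that row on the delimiter does yield all eight fields, with the last two empty, so the data itself looks well-formed (plain string.Split shown only to illustrate the expected result, not the library):

// Illustration only: the sample row contains eight fields, the last two empty.
var row = "5|40F7CE96-DC46-4EFC-AAC4-C76199C1E769|1403|132092|End of Fiscal Period Date|smalldatetime||";
var parts = row.Split('|');
// parts.Length == 8; parts[6] == "" (Magnitude); parts[7] == "" (Length)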

Support multi-record fixed width file engine

I've been playing around with some ideas for a multi-record file engine for fixed width files, something similar to the MultiRecordEngine in FileHelpers.

There are a couple of things to consider (mostly related to reading, haven't thought much about writing yet)

  1. The nice separation between fixed and delimited makes it hard to mix them both in a multi-engine as FileHelpers does. My thought is to initially only support fixed width files as they more commonly require multiple record types.
  2. FileHelpers returns an ArrayList with all the parsed records, putting the burden on the user to loop through the list and determine what to do with each record one by one. That makes it easy on us, but harder on the user. I think we can do better with something like one of the following:
    1. Require the user to still supply a single T where T : interface that serves as a simple marker interface. That allows us to return List<T> rather than ArrayList. The user still needs to go through the list item by item (or use a LINQ query with OfType<>), but with the added benefit of more LINQ functionality available on the returned list.
    2. Return List<object>. Less runtime type safety, but doesn't require the user to create a marker interface.
    3. Return a Dictionary<Type, List<T>> that contains a List<T> for each record type passed to the multi-record engine. The user can then access results[typeof(SomeRecord)] to get the results for that type.
    4. Similar to option 3, but abstract the dictionary away and require the user to call GetResults<T>() on the engine after reading is complete. This returns the appropriate list to the user and, I feel, is more polished from their perspective.
    5. Do some dynamic magic and return an expando object containing a public List<T> for each record type, automatically named something like MyRecordList. This has a certain "cool factor" to it, but ultimately I think I favor option 4 as it requires less guessing on the user's part.

I think I'm going to take a crack at implementing a fixed-width file engine (and factory) that takes a param array of types to parse and implements the GetResults<T> concept. However, before I go too far down the rabbit hole, I wanted to see what thoughts you had.
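As a rough shape for option 4 (all names hypothetical, just to make the idea concrete):

// Hypothetical shape of option 4: results keyed by record type, exposed
// through a generic accessor after reading completes.
public interface IFixedLengthMultiEngine
{
    void Read(Stream stream);

    // Returns the parsed records of the requested type,
    // or an empty list if none were encountered.
    IReadOnlyList<T> GetResults<T>() where T : class;
}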

Thanks.

Attribute mapping with multiple record types

Is there a way to use attribute mapping with multiple record types?

Sample

[FixedLengthFile]
public abstract class MyDataBase
{
    protected MyDataBase(char lineType)
    {
        if (lineType <= 0) throw new ArgumentOutOfRangeException(nameof(lineType));

        _lineType = lineType;
    }

    private readonly char _lineType;

    [FixedLengthField(1, 1)]
    public char LineType
    {
        get { return _lineType; }
        set
        {
            if (value != _lineType)
                throw new NotSupportedException($"The value must be '{_lineType}'.");
        }
    }
}

[FixedLengthFile]
public sealed class MyDataHeader : MyDataBase
{
    public MyDataHeader()
        : base('H')
    { }

    [FixedLengthField(2, 13, Padding = Padding.Right, PaddingChar = ' ')]
    public string FormatVersion { get; set; }

    [FixedLengthField(15, 22, Padding = Padding.Right, PaddingChar = ' ')]
    public string Filename { get; set; }

    [FixedLengthField(37, 6, PaddingChar = '0')]
    public int JobId { get; set; }

    [FixedLengthField(43, 10, PaddingChar = '0')]
    public int NoOfTransactionLines { get; set; }

    [FixedLengthField(53, 12, PaddingChar = '0')]
    public int SumOfPoints { get; set; }
}

[FixedLengthFile]
public sealed class MyDataTransaction : MyDataBase
{
    public MyDataTransaction()
        : base('T')
    { }

    [FixedLengthField(2, 6, PaddingChar = '0')]
    public int JobId { get; set; }

    [FixedLengthField(8, 10, PaddingChar = '0')]
    public int TransactionEntryNo { get; set; }

    [FixedLengthField(18, 2, PaddingChar = '0')]
    public int TransactionType { get; set; }

    [FixedLengthField(20, 10, PaddingChar = '0')]
    public int Points { get; set; }
}

Now I have no idea how to read this. Could be something like:

var factory = new FixedLengthFileEngineFactory();
using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(FixedFileSample)))
{
    var flatFile = factory.GetEngine<MyDataBase>();

    var records = flatFile.Read<MyDataBase>(stream).ToArray();
}

But I cannot make the types MyDataHeader and MyDataTransaction known to the engine class. The C# XML serializer supports this with [XmlIncludeAttribute].
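For what it's worth, the multi-record engine discussed elsewhere in these issues takes an array of record types plus a type-selector delegate, so something along these lines might work with the attribute-mapped classes above (a sketch only, assuming GetEngine accepts attribute-mapped types directly):

// Sketch: let the first character of each line pick the record type.
var factory = new FixedLengthFileEngineFactory();
var engine = factory.GetEngine(
    new[] { typeof(MyDataHeader), typeof(MyDataTransaction) },
    line =>
    {
        if (string.IsNullOrEmpty(line)) return null;
        switch (line[0])
        {
            case 'H': return typeof(MyDataHeader);
            case 'T': return typeof(MyDataTransaction);
            default:  return null;
        }
    });

using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(FixedFileSample)))
{
    engine.Read(stream);
    var header = engine.GetRecords<MyDataHeader>().FirstOrDefault();
    var transactions = engine.GetRecords<MyDataTransaction>().ToList();
}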

Truncate a "Fixed Length" content when it exceeds the Length property value

Hey!
First off, very good library my friend! :)
Well, I have a request/observation and I'm not sure if this is the right place to do it, but here it goes:
A "Fixed Length" field by definition should not violate its length, either you pad it with something or you truncate it in order to achieve the expected length...however the current code is allowing strings with larger values to go through :(

Class: FlatFile.FixedLength.Implementation.FixedLengthLineBuilder
Method: TransformFieldValue
Block:
if (lineValue.Length >= field.Length)
{
    return lineValue;
}

Can this be changed? :)
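A minimal change along those lines (sketch only) would be to truncate instead of returning the over-long value:

// Sketch: truncate to the declared field length instead of
// letting an over-long value pass through unchanged.
if (lineValue.Length >= field.Length)
{
    return lineValue.Substring(0, field.Length);
}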

Thanks!

Allow gaps with [FixedLengthField]

I am parsing using IFlatFileMultiEngine and [FixedLengthField] attributes.

There are a number of fields I don't care about. I thought I could just define the fields I need, simply skipping the irrelevant ones, like this:

[FixedLengthField(1, 10)]
public string ClientId { get; set; }

[FixedLengthField(100, 50)]
public string Address { get; set; }

I'd expect the 'gap' (data in positions 11..99) to be ignored. However, it looks like the 'index' parameter isn't used in this case, and Address ends up starting at position 11 rather than 100.
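In the meantime, a workaround under the current behaviour seems to be declaring an explicit filler property for each gap (Filler1 is just an illustrative name):

[FixedLengthField(1, 10)]
public string ClientId { get; set; }

// Filler covering positions 11..99 (length 89) that is simply ignored.
[FixedLengthField(11, 89)]
public string Filler1 { get; set; }

[FixedLengthField(100, 50)]
public string Address { get; set; }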

Feature Request - Dual Padding

I've run into a processing case where the file is following a non-standard approach of RPad and LPad.
This could be worked around with a type converter, but I was wondering if there is a built-in way to support this case.

Thanks!

Implement DelimitedFileMultiEngine following pattern of FlatFileMultiEngine

Objective:
Allow Delimited Files with multiple record types to be loaded via FlatFile.

Context:
We were happily using FlatFile to load all of our position-dependent files with multiple record types, but then we tried to implement a delimited file with multiple record types and found we couldn't. So we figured we would have a stab at implementing this by following the approach used by the existing fixed-length multi-engine.

Disclaimers:
We need this to meet project deadlines.
I have never contributed to an open-source / GitHub project before, so while I have attempted to follow convention and figure out what the rules were, please let me know of any oversights / mistakes.

Need ability to parse a single-record string

I can parse a file with lots of records. If I have a string representing just one record, there should be a shortcut to parse it. (I could use a MemoryStream, but that's really overkill here.)
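Today the closest thing seems to be wrapping the string in a MemoryStream, roughly (MyRecord and singleLine are placeholders):

// Current workaround: wrap the single record in a stream just to parse it.
using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(singleLine)))
{
    var record = engine.Read<MyRecord>(stream).FirstOrDefault();
}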

Roadmap for .Net Core Support

Hi,
I really love this library, specifically the fluent syntax for attribute mapping.

I wanted to know if there are any plans for .NET Core 3.1 support?

If there is already a plan, I would love to contribute to it.

Fixed Length multiple record issue

For a fixed length file with multiple records, how do you keep track of the order that records appear?

For example, a file with a header record type and a detail record type:

Header
Detail
Detail
Detail
Header
Detail
Header
Detail
Detail

In this case the detail lines belong to the header rows above them.
GetRecords returns all records of a given type; how can we keep track of which detail lines belong to which headers?
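One workaround, absent direct library support, is to split the raw file into header-led groups before parsing, so each group can be fed to the engine on its own (plain pre-processing; the 'H' discriminator below is only an assumption for the example):

// Group raw lines so each group starts at a header line; the details that
// follow a header stay attached to it.
var groups = new List<List<string>>();
foreach (var line in File.ReadLines(path))
{
    if (line.StartsWith("H") || groups.Count == 0)
        groups.Add(new List<string>());
    groups[groups.Count - 1].Add(line);
}
// Each inner list can then be joined and parsed as its own small "file".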

Null Field

\Implementation\FixedLengthLineBuilder.cs

protected override string TransformFieldValue(IFixedFieldSettingsContainer field, string lineValue)

This doesn't check for null fields and instead pads the field no matter what.

I ended up needing to add the following before the padding output to the above function:

if (field.IsNullable && lineValue == field.NullValue)
    return lineValue;

Ability to ignore fixed width sections

Let's say I have a fixed-width document with 7000 columns and I want to skip arbitrary sections, say 750-950, 1000-4000, and so on. Is there a better way to skip these sections than to create a single string Ignored { get; } property and basically skip these sections via known length offsets?

Feature Request - Type Conversion Errors

Using FixedLengthFileEngineFactory.

When there is a type conversion problem like "is not a valid value for Int32", it doesn't get handled by "handleEntryReadError:"; instead, the exception escapes unhandled.


at System.ComponentModel.BaseNumberConverter.ConvertFrom(ITypeDescriptorContext context, CultureInfo culture, Object value)
at System.ComponentModel.TypeConverter.ConvertFromString(String text)
at FlatFile.Core.Extensions.TypeChangeExtensions.Convert(String input, Type targetType) in TypeChangeExtensions.cs:line 13
at FlatFile.Core.Base.LineParserBase`2.GetFieldValueFromString(TFieldSettings fieldSettings, String memberValue) in LineParserBase.cs:line 44
at FlatFile.FixedLength.Implementation.FixedLengthLineParser.ParseLine[TEntity](String line, TEntity entity) in FixedLengthLineParser.cs:line 23
at FlatFile.Core.Base.FlatFileEngine`2.TryParseLine[TEntity](String line, Int32 lineNumber, TEntity& entity) in FlatFileEngine.cs:line 122
at FlatFile.Core.Base.FlatFileEngine`2.d__8`1.MoveNext() in FlatFileEngine.cs:line 93

Fixed Layout Missing Padding

When a property is null, the resulting column is never padded. This is true whether or not AllowNull was used with a non-null value. I would actually expect the output to be padded in both of those scenarios.

Discuss variable length last field in fixed length record

I need to handle a record that has a variable length last field, while the previous fields in the record are all fixed length.

We can fairly easily handle this if I add an Unbounded or VariableLength boolean property to the FieldSettings and Attributes. Then if that value is set on the field, the fixed length line parser can read to the end of the string (or to the specified field length, whichever comes first).
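For the first option, the parser change would be small; roughly (IsVariableLength and the zero-based start offset are hypothetical names):

// Sketch: when the field is flagged as variable length, take whatever
// remains of the line, capped at the declared length if one is set.
int remaining = line.Length - start;
int take = field.IsVariableLength
    ? Math.Min(field.Length, remaining)
    : field.Length;
string rawValue = line.Substring(start, take);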

The other option is to extend the file engine factory to allow passing a custom line parser, or a Func<> to select the line parser on a per-line basis; alternatively, we can register a number of line parsers with the engine itself (each tied to a particular layout or Type) and let the FixedLengthLineParserFactory determine the appropriate line parser on a per-line basis.

Any thoughts on any of those approaches? Something like the second option is a more complex approach, but also adds a ton of flexibility without needing to constantly add support for edge cases (like this) to the library.

Null Value not Working for Delimited Attribute Map

e.g. File with one row where Age is null
FirstName,LastName,Age,Date of Birth
John,Doh,,09/06/1947

Here's the map:
[DelimitedFile(Delimiter = ",", Quotes = "\"", HasHeader = true)]
public class PersonRecordWithAttributeMap
{
    [DelimitedField(1)]
    public string FirstName { get; set; }

    [DelimitedField(2)]
    public string LastName { get; set; }

    [DelimitedField(3, NullValue = "", Name = "Age")]
    public int? Age { get; set; }
}

For the Age column, NullValue = "" produces a parse error. I've also tried NullValue = null, with the same behaviour. It only seems to work if the raw file uses a magic string to represent null.

Missing GetEngine<T> for delimited files

It's probably just me being thick, so apologies in advance.
With a fixed-width file, I can use attributes to describe the mappings on the object. I can then use factory.GetEngine, which uses the attributes to understand the layout.

With a delimited file, I can also use attributes - but I cannot do factory.GetEngine (I can see this was recently removed).
Does this mean that with delimited files I cannot use attributes? Do I have to manually create the layout when working with delimited files - as opposed to how it works with fixed width files? Or am I just missing something?

Thanks in advance.

Rethinking the engine initialization code. Refer to #24

This is a continuation of the conversation from #24 (comment).

// Init multi record file engine
var engine =
    EngineFactory.GetEngine(
        new[]
        {
            typeof (SettlementHeaderRecord),
            typeof (TransactionRecord),
            typeof (TransactionDetailRecord),
            typeof (SettlementTrailerRecord)
        },
        s =>
        {
            if (String.IsNullOrEmpty(s) || s.Length < 1) return null;
            switch (s[0])
            {
                case 'H':
                    return typeof(SettlementHeaderRecord);
                case 'D':
                    return typeof(TransactionRecord);
                case 'P':
                    return typeof(TransactionDetailRecord);
                case 'V':
                    return typeof(SettlementTrailerRecord);
            }
            return null;
        });
  1. It's rather hard to read.
  2. It's rather hard to extend in the future.
  3. Once we have a lot of engines, we will have a lot of copy-paste code (one alternative shape is sketched below).
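One possible direction (purely a sketch with a hypothetical builder API) would be to let each record type register its own discriminator, so the factory assembles the selector itself:

// Hypothetical builder-style registration: each record type carries its own
// line predicate, so the big switch statement disappears.
var engine = EngineFactory.GetEngine(builder => builder
    .AddRecord<SettlementHeaderRecord>(line => line.StartsWith("H"))
    .AddRecord<TransactionRecord>(line => line.StartsWith("D"))
    .AddRecord<TransactionDetailRecord>(line => line.StartsWith("P"))
    .AddRecord<SettlementTrailerRecord>(line => line.StartsWith("V")));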

FixedLength - Overall Line Length

I might be missing a setting, but is there a way to verify the overall length of the line?

I was setting up some test cases for errors, and one of the ones I tried was making a line way longer than it is supposed to be, but it doesn't seem to hit any catches.

I would expect it to at least hit the truncateIfExceeded = false path.

Strong name assemblies

I realized this cannot be used in projects that require strong-named assemblies. Are you accepting pull requests so I can make one?

Feature Request - More detail in ITypeConverter

Great work here!

Using FixedLengthFile/FixedLengthField.

It would be helpful if ITypeConverter instances had access to things like:

  • Destination Object Type (not the property)
  • Line Number
  • Column Number

Right now I'm not seeing a way to provide the specific details about the exact thing causing the problem.

Within ITypeConverter I can only see something like:

  • String to Bool and the value was X

HandleEntryReadError:

  • This gets me the line text and the intentionally thrown error.

This feature request might be somewhat specific to FixedLengthFile with lots of fields - it can sometimes be a pain to figure out where the problem is when you have hundreds of characters of text smashed together, let alone to provide a detailed/automatic report to a data source provider, say via email.

Thanks for the help!

converting datetime with layout

I did not find any property to instruct how to convert dates in my flat file to a property of type DateTime, and when I tried to read the file an exception was thrown converting 20170712 to DateTime.

public class BDIHeader
{
    public int Tipo { get; set; }
    public string NomeArquivo { get; set; }
    public string Origem { get; set; }
    public int Destino { get; set; }
    public DateTime DataGeracao { get; set; }
    public DateTime DataPregao { get; set; }
    public string HoraMinuto { get; set; }
}

public sealed class BDIHeaderLayout : FixedLayout<BDIHeader>
{
    public BDIHeaderLayout()
    {
        this.WithMember(x => x.Tipo, c => c.WithLength(2))
            .WithMember(x => x.NomeArquivo, c => c.WithLength(8))
            .WithMember(x => x.Origem, c => c.WithLength(8))
            .WithMember(x => x.Destino, c => c.WithLength(4))
            .WithMember(x => x.DataGeracao, c => c.WithLength(8))
            .WithMember(x => x.DataPregao, c => c.WithLength(8))
            .WithMember(x => x.HoraMinuto, c => c.WithLength(4));
    }
}
public void Read()
{
    var factory = new FixedLengthFileEngineFactory();
    using (var stream = new FileInfo(enderecoArquivoBDI).Open(FileMode.Open, FileAccess.Read, FileShare.Read))
    {
        // If using attribute mapping, pass an array of record types
        // rather than layout instances
        var layouts = new ILayoutDescriptor<IFixedFieldSettingsContainer>[]
        {
            new BDIHeaderLayout(), new BDIIndiceLayout()
        };

        var flatFile = factory.GetEngine(layouts,
            line =>
            {
                // For each line, return the proper record type.
                // The mapping for this line will be loaded based on that type.
                // In this simple example, the first two characters determine
                // the record type.
                if (String.IsNullOrEmpty(line) || line.Length < 1) return null;
                switch (line.Substring(0, 2))
                {
                    case "00":
                        return typeof(BDIHeader);
                    //case "01":
                    //    return typeof(BDIIndice);
                    //case "02":
                    //    return typeof(BDINegociosPapelLayout);
                    //case "99":
                    //    return typeof(BDITrailerLayout);
                }
                return null;
            });

        flatFile.Read(stream);

        var header = flatFile.GetRecords<BDIHeader>().FirstOrDefault();
        //var indices = flatFile.GetRecords<BDIIndice>().ToList();
        //var negocios = flatFile.GetRecords<BDINegociosPapelLayout>();
        //var trailer = flatFile.GetRecords<BDITrailer>().FirstOrDefault();
    }
}

line being read
00BDIN9999BOVESPA 999920170713201707131807
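If there is no built-in format option, one workaround is to map the raw yyyyMMdd text and convert it outside the layout (the Raw property name is only illustrative, not library API):

// Workaround sketch: map the raw text with c.WithLength(8) and convert it
// yourself (requires using System.Globalization).
public string DataGeracaoRaw { get; set; }

public DateTime DataGeracao =>
    DateTime.ParseExact(DataGeracaoRaw, "yyyyMMdd", CultureInfo.InvariantCulture);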

Be able to stringify the mapping for saving

It would be nice to be able to export and read in a mapping, so a user can map files, save that mapping to a database, and pull it up later. A simple JSON format might work.

factory.GetEngine<MyRecord>(); for FixedLengthFileEngineFactory does not work!

Hello,

I'm creating a FixedLengthFileEngineFactory as below (I copied the code from issue #30), but I can no longer do this:

var flatFile = factory.GetEngine<MyRecord>();

It seems like the GetEngine() call no longer has a parameterless overload and now wants a layout descriptor, like this:

var container = new FieldsContainer();
var descriptor = new LayoutDescriptorBase(container);
var flatFile = factory.GetEngine(descriptor);

Doing the above fails to write the "MyRecords" file to the stream, though, and I get exceptions when writing to the stream! I'm using the latest version, 0.2.51.0.

Any idea what's missing here, and why I'm not able to do it like the sample below shows?

Thank you!

-- Sample code below --

[FixedLengthFile]
class MyRecord
{
    [FixedLengthField(1, 4)]
    public string Prefix { get; set; }
}

var factory = new FixedLengthFileEngineFactory();
var flatFile = factory.GetEngine<MyRecord>();
return flatFile.Read(stream);

Question before PR

I need the line number to get passed around in FixedLengthFileMultiEngine.cs, in the context of:

/// <summary>
/// The type selector function used to determine the layout for a given line
/// </summary>
readonly Func<string, Type> typeSelectorFunc;

current usage:

var type = typeSelectorFunc(line);

updated usage:

var type = typeSelectorFunc(line, lineNumber);
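The corresponding field declaration would presumably change as well:

/// <summary>
/// The type selector function used to determine the layout for a given line,
/// now also given the current line number.
/// </summary>
readonly Func<string, int, Type> typeSelectorFunc;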

Is that something that you would accept? Asking so I know how I need to consume this library going forward. As of now, NuGet works, but I will need to approach this differently if it's something you would not want to include.

The use case here is switching record types based on detecting whether we're parsing the first or last line of a file.

Thanks! Great library!

FixedLength with DateTime

I use the Fixed length attributes to define my structure.

I have a given format of yyyyMMdd hh:mm. How can I specify this for a DateTime property? I found no attribute and no sample showing how to map DateTime types.

Example with multiple fixed record types

You mention attributes in this example, but it's not clear how to use them:

// If using attribute mapping, pass an array of record types
// rather than layout instances

Could you clarify what this might look like, specifically when using delimited attributes?

delimited layout, wrong parsing when multiple fields are empty at the end of the row

If I have multiple empty fields at the end of a row, I get the separator as the value.

Explained better here:
https://stackoverflow.com/questions/52017242/flatfile-library-delimited-layout-wrong-parsing-when-multiple-fields-are-empty

I don't know if I'm missing some attribute configuration...

I downloaded the code; debugging, it seems to be a problem in the DelimitedLineParser class, in this check:

if (line.Length > linePosition + delimiterSize)

At first glance, could the following line be a solution?

if (line.Length >= linePosition + delimiterSize)

Need ability to load file with "complex" Master/Detail structure

I have a file with a structure like this:
HEADER
RECORD TYPE1
RECORD TYPE2
RECORD TYPE3
RECORD TYPE3
RECORD TYPE1
RECORD TYPE2
RECORD TYPE4
RECORD TYPE5
TAIL
....
in which there are many groups that start with HEADER and end with TAIL.
This is an example
Test.txt

and these are the specs: CBI-RND-001 6_05_ENG.pdf

I could use the "Read from stream with multiple fixed record types" approach, but when I get the list of RECORD TYPEx records I don't know which group each record was read in.
Is it possible at least to have the line number of each record?

Thank you
Regards

Use tags + releases for nuget releases

I see a new NuGet release has been created, but I can't tell what's changed, as there are no tags/release notes and the changelog hasn't been updated in ages.
