beancount-import's Introduction

Beancount-import is a tool for semi-automatically importing financial data from external data sources into the Beancount bookkeeping system, as well as merging and reconciling imported transactions with each other and with existing transactions.

Key features

  • Pluggable data source architecture, including existing support for OFX (cash, investment, and retirement accounts), Mint.com, Amazon.com, and Venmo.

  • Supports beancount importers, so it's easier to write your own, and existing beancount and fava users can hop right on with no hassle.

  • Robustly associates imported transactions with the source data, to automatically avoid duplicates.

  • Automatically predicts unknown legs of imported transactions based on a learned classifier (currently decision tree-based).

  • Sophisticated transaction matching/merging system that can semi-automatically combine and reconcile both manually entered and imported transactions from independent sources.

  • Easy-to-use, powerful web-based user interface.

Basic operation

From the data source modules, beancount-import obtains a list of pending imported transactions. (Balance and price entries may also be provided.) Depending on the external data source, pending transactions may fully specify all of the Beancount accounts (e.g. an investment transaction from an OFX source where shares of a stock are bought using cash in the same investment account), or may have some postings to unknown accounts, indicated by the special account name Expenses:FIXME. For example, pending transactions obtained from bank account/credit card account data (e.g. using the Mint.com data source) always have exactly two postings, one to the known Beancount account corresponding to the bank account from which the data was obtained, and the other to an unknown account.

For each pending transaction, beancount-import attempts to find matches to both existing transactions and to other pending transactions, and computes a set of candidate merged transactions. For each unknown account posting, Beancount-import predicts the account based on a learned classifier. Through a web interface, the user can view the pending transactions, select the original transaction or one of the merged candidates, and confirm or modify any predicted accounts. The web interface shows the lines in the journal that would be added or removed for each candidate. Once the user accepts a candidate, the candidate is inserted or merged into the Beancount journal, and the user is then presented with the next pending entry.

The imported transactions include metadata fields on the transaction and on the postings that serve several purposes:

  • indicating to the data source module which entries in its external representation have already been imported and should not be imported again;
  • indicating which postings are cleared, meaning they have been confirmed by the authoritative source, which also constrains matching (a cleared posting can only match an uncleared posting);
  • providing necessary information for training the classifier used for predicting unknown accounts;
  • providing information to the user that may be helpful for identifying and understanding the transaction.

Installation

  1. Ensure you have activated a suitable Python 3 virtualenv if desired.

  2. To install the most recent published package from PyPI, simply type:

    pip install beancount-import

    Alternatively, to install from a clone of the repository, type:

    pip install .

    or for development:

    pip install -e .

    The published PyPI package includes a pre-built copy of the frontend, and no further building is required. When installing from the git repository, the frontend is built automatically by the above installation commands, but Node.js is required. If you don't already have it installed, follow the instructions in the frontend directory to install it.

Demo

To see Beancount-import in action on test data, refer to the instructions in the examples directory.

Data sources

Data sources are defined by implementing the Source interface defined by the beancount_import.source module.

The data sources provide a way to import and reconcile already-downloaded data. To retrieve financial data automatically, you can use the finance_dl package. You can also use any other mechanism, including manually downloading the data from a financial institution's website, provided that it is in the format required by the data source.

The currently supported data sources, and the details of configuring each one, are described in the individual data source documentation.

Usage

To run Beancount-import, create a Python script that invokes the beancount_import.webserver.main function. Refer to the examples fresh and manually_entered.
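
A minimal sketch of such a script, modeled on the scripts in the examples directory. The file names, the directory layout, and the build_options and run helpers are assumptions made for illustration. The keyword arguments passed to beancount_import.webserver.main follow the example scripts as best recalled, so check them against the version you have installed.

```python
#!/usr/bin/env python3
import os
import sys


def build_options(journal_dir, data_dir):
    """Returns keyword arguments for beancount_import.webserver.main.

    All file names are placeholders; adjust them to your own layout.
    """
    accounts = os.path.join(journal_dir, 'accounts.beancount')
    return dict(
        journal_input=os.path.join(journal_dir, 'journal.beancount'),
        ignored_journal=os.path.join(journal_dir, 'ignored.beancount'),
        default_output=os.path.join(journal_dir, 'transactions.beancount'),
        # New open and balance directives are routed by account regex.
        open_account_output_map=[('.*', accounts)],
        balance_account_output_map=[('.*', accounts)],
        price_output=os.path.join(journal_dir, 'prices.beancount'),
        data_sources=[
            dict(
                module='beancount_import.source.ofx',
                ofx_filenames=[os.path.join(data_dir, 'checking.ofx')],
            ),
        ],
    )


def run(argv):
    # Imported lazily so build_options can be inspected without the package.
    import beancount_import.webserver
    here = os.path.dirname(os.path.abspath(__file__))
    beancount_import.webserver.main(
        argv, **build_options(here, os.path.join(here, 'data')))
```

To start the web server, end the script with a call to run(sys.argv[1:]).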

Errors

Any errors either from Beancount itself or one of the data sources are shown in the Errors tab. It is usually wise to manually resolve any errors, either using the built-in editor or an external editor, before proceeding, as some errors may result in incorrect behavior. Balance errors, however, are generally safe to ignore.

Viewing candidates

Select the Candidates tab to view the current pending imported entry, along with all proposed matches with existing and other pending transactions. The original unmatched entry is always listed last, and the proposal that includes the most matched postings is listed first. The list with checkboxes at the top indicates which existing or pending transactions are used in each proposed match; the current pending transaction is always listed first. If many incorrect matches were found, you can deselect the checkboxes to filter the matches.

You can select one of the proposed entries by clicking on it, or using the up/down arrow keys. To accept a proposed entry as is, you can press Enter or double click it. This immediately modifies the journal to reflect the change, and also displays the relevant portion of the journal in the Journal tab, so that you may easily make manual edits.

Specifying unknown accounts

If a proposed entry includes unknown accounts, they are highlighted with a distinctive background color and labeled with a group number. The account shown is the one that was automatically predicted, or Expenses:FIXME if automatic prediction was not possible (e.g. because of lack of training data). There are several ways to correct any incorrectly-predicted accounts:

  • To change an individual account, you can Shift+click on it, type in the new account name, and then press Enter. If you press Escape while typing in the account name, the account will be left unchanged. A fuzzy matching algorithm is used for autocompletion: if you type "ex:co", for example, it will match any accounts for which there is a subsequence of 2 components, where the first starts (case-insensitively) with "ex" and the second starts with "co", such as an Expenses:Drinks:Coffee account.
  • To change all accounts within a proposed entry that share the same group number, you can click on one of the accounts without holding shift, or press the digit key corresponding to the group number. Once you type in an account and press Enter, the specified account will be substituted for all postings in the group.
  • To change all accounts within a proposed entry, you can click the Change account button or press the a key. Once you type in an account and press Enter, the specified account will be substituted for all unknown accounts in the current entry.
  • If you wish to postpone specifying the correct account, you can click the Fixme later button or press the f key. This will substitute the original unknown account names for all unknown accounts in the current entry. If you then accept this entry, the transaction including these FIXME accounts will be added to your journal, and the next time you start Beancount-import the transaction will be treated as a pending entry.
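
The autocompletion rule described in the first item can be sketched as follows. This is an illustration of the stated rule (query components must prefix-match a subsequence of the account's components, case-insensitively), not the frontend's actual code; the function name fuzzy_match is hypothetical.

```python
def fuzzy_match(query: str, account: str) -> bool:
    """Returns True if each query component is a case-insensitive prefix of
    some account component, with the matches occurring in order."""
    query_parts = [p.lower() for p in query.split(':')]
    account_parts = [p.lower() for p in account.split(':')]
    pos = 0
    for q in query_parts:
        # Advance to the next account component that starts with q.
        while pos < len(account_parts) and not account_parts[pos].startswith(q):
            pos += 1
        if pos == len(account_parts):
            return False
        pos += 1
    return True


accounts = ['Expenses:Drinks:Coffee', 'Expenses:Food', 'Assets:Checking']
matches = [a for a in accounts if fuzzy_match('ex:co', a)]
```

Here matches contains only Expenses:Drinks:Coffee, because "co" has to match a component that comes after the one matched by "ex".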

Viewing associated source data

Data sources may indicate that additional source data is associated with particular candidate entries, typically based on the metadata fields and/or links that are included in the transaction. For example, the beancount_import.source.amazon data source associates the order invoice HTML page with the transaction, and the beancount_import.source.google_purchases data source associates the purchase details HTML page. Other possible source data types include PDF statements and receipt images.

You can view any associated source data for the currently selected candidate by selecting the Source data tab.

Changing the narration, payee, links or tags

To modify the narration of an entry, you can click on it, click the Narration button, or press the n key. This actually lets you modify the payee, links, and tags as well. If you introduce a syntax error in the first line of the transaction, the text box will be highlighted in red and focus will remain until you either correct it or press Escape, which will revert the first line of the transaction back to its previous value.

Checking for uncleared postings

The Uncleared tab displays the list of postings to accounts for which there is an authoritative source and which have not been cleared. Normally, postings are marked as cleared by adding the appropriate source-specific metadata fields that associate it with the external data representation, such as an ofx_fitid field in the case of the OFX source.

This list may be useful for finding discrepancies that need manual correction. Typical causes of uncleared postings include:

  1. The source data for the posting has not yet been downloaded.
  2. The transaction is a duplicate of another transaction already in the journal, and needs to be manually merged/deleted.
  3. The posting is from before the earliest date for which source data was imported, and no earlier data is available. Such postings can be ignored by adding a cleared_before: <date> metadata field to the open directive for the account or one of its ancestor accounts.
  4. The source data is missing or cannot be imported, but the posting was manually verified. Such postings can be ignored by adding a cleared: TRUE metadata field to them.
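
The rules above can be summarized in a short sketch. This is not the project's implementation: the JournalPosting class, the open_meta mapping (account name to the metadata of its open directive), and the source_keys parameter are hypothetical names used only to show how cleared, cleared_before and source-specific fields such as ofx_fitid interact.

```python
import datetime
from dataclasses import dataclass


@dataclass
class JournalPosting:
    account: str
    date: datetime.date
    meta: dict


def cleared_before_date(account, open_meta):
    """Looks up cleared_before on the account, then on each ancestor."""
    parts = account.split(':')
    while parts:
        meta = open_meta.get(':'.join(parts), {})
        if 'cleared_before' in meta:
            return meta['cleared_before']
        parts.pop()
    return None


def is_uncleared(posting, open_meta, source_keys=('ofx_fitid',)):
    """Returns True if the posting would still be listed as uncleared."""
    if posting.meta.get('cleared') is True:
        return False  # manually verified (cause 4)
    if any(key in posting.meta for key in source_keys):
        return False  # associated with the external data
    cutoff = cleared_before_date(posting.account, open_meta)
    if cutoff is not None and posting.date < cutoff:
        return False  # predates the available source data (cause 3)
    return True
```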

Skipping and ignoring imported entries

If you are presented with a pending entry that you don't wish to import, you have several options:

  1. You can skip past it by selecting a different transaction in the Pending tab, or skip to the next pending entry by clicking the corresponding button or pressing the ] key. This skips it in the current session, but it remains as a pending entry and will be included again if you restart beancount-import.

  2. You can click on the button labeled Fixme later or press the f key to reset all unknown accounts, and then accept the candidate. This will add the transaction to your journal, but with the unknown accounts left as Expenses:FIXME. This is useful for transactions for which you don't know how to assign an account, or which you expect to match to another transaction that will be generated from data that hasn't yet been downloaded. Any transactions in the journal with Expenses:FIXME accounts will be included at the end of the list of pending entries the next time you start beancount-import.

  3. You can click on the button labeled Ignore or press the i key to add the selected candidate to the special "ignored" journal file. This is useful for transactions that are erroneous, such as actual duplicates. Entries that are ignored will not be presented again if you restart beancount-import. However, if you manually delete them from the "ignored" journal file, they will return as pending entries.

Usage with a reverse proxy

If you want to run Beancount-import with features like TLS or authentication, then you can run it behind a reverse proxy that provides this functionality. For instance, an NGINX location configuration like the following can route traffic to a local instance of Beancount-import:

location /some/url/prefix/ {
    proxy_pass_header Server;
    proxy_set_header Host $http_host;
    proxy_redirect off;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Scheme $scheme;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "Upgrade";
    proxy_pass http://localhost:8101/;
}

Replace /some/url/prefix/ with your desired URL path (retaining the trailing slash), or even just / to make Beancount-import available at the URL root.

Use with an existing Beancount journal

If you start using Beancount-import with an existing beancount journal containing transactions that are also referenced in the external data supplied to a data source, the data source will not know to skip those transactions, because they will not have the requisite metadata indicating the association. Therefore, they will all be presented to you as new pending imported transactions.

However, the matching mechanism will very likely have determined the correct match to an existing transaction, which will be presented as the default option. Accepting these matches will simply have the effect of inserting the relevant metadata into your journal so that the transactions are considered "cleared" and won't be imported again next time you run Beancount-import. It should be a relatively quick process to do this even for a large number of transactions.

Development

For development of this package, make sure to install Beancount-import using the pip install -e . command rather than the pip install . command. If you previously ran the pip install command without the -e option, you can simply re-run the pip install -e . command.

Testing

You can run the tests using the pytest command.

Many of the tests are "golden" tests, which work by creating a textual representation of some state and comparing it with the contents of a particular file in the testdata/ directory. If you change one of these tests or add a new one, you can have the tests automatically generate the output by setting the environment variable BEANCOUNT_IMPORT_GENERATE_GOLDEN_TESTDATA=1, e.g.:

BEANCOUNT_IMPORT_GENERATE_GOLDEN_TESTDATA=1 pytest

Make sure to commit or at least stage any changes you've made to the relevant testdata files prior to running the tests with this environment variable set. That way you can manually verify any changes between the existing output and the new output using git diff.
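
The golden-test pattern can be sketched as follows. This is a generic illustration rather than the helper used in this repository; the function name check_golden is hypothetical, and only the environment variable name comes from the text above.

```python
import os


def check_golden(actual: str, golden_path: str) -> None:
    """Compares actual with the golden file, or rewrites the file when
    BEANCOUNT_IMPORT_GENERATE_GOLDEN_TESTDATA=1 is set."""
    if os.environ.get('BEANCOUNT_IMPORT_GENERATE_GOLDEN_TESTDATA') == '1':
        os.makedirs(os.path.dirname(golden_path) or '.', exist_ok=True)
        with open(golden_path, 'w', encoding='utf-8') as f:
            f.write(actual)
        return
    with open(golden_path, 'r', encoding='utf-8') as f:
        expected = f.read()
    assert actual == expected, 'output differs from %s' % golden_path
```

Because regeneration overwrites the golden file unconditionally, reviewing the result with git diff is the only safeguard against accepting a wrong output.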

Web frontend

The web frontend source code is in the frontend/ directory. Refer to the README.md file there for how to rebuild and run the frontend after making changes.

Basic workflow

Simple expense transaction from Mint.com data source

Suppose the user has purchased a coffee at Starbucks on 2016-08-09 using a credit card, and has set up Mint.com to retrieve the transaction data for this credit card.

Given the following CSV entry:

"Date","Description","Original Description","Amount","Transaction Type","Category","Account Name","Labels","Notes"
"8/10/2016","Starbucks","STARBUCKS STORE 12345","2.45","debit","Coffee Shops","My Credit Card","",""

and the following open account directive:

1900-01-01 open Liabilities:Credit-Card  USD
  mint_id: "My Credit Card"

the Mint data source will generate the following pending transaction:

2016-08-10 * "STARBUCKS STORE 12345"
  Liabilities:Credit-Card             -2.45 USD
    date: 2016-08-10
    source_desc: "STARBUCKS STORE 12345"
  Expenses:FIXME                       2.45 USD
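
The mapping from the CSV row to this pending transaction can be sketched with the standard library alone. This is a simplified illustration, not the Mint data source itself: the function pending_from_mint_row and the mint_id_to_account mapping are hypothetical, the currency is assumed to be USD, and the result is plain text rather than Beancount data structures.

```python
import csv
import datetime
import io
from decimal import Decimal

CSV_TEXT = '''"Date","Description","Original Description","Amount","Transaction Type","Category","Account Name","Labels","Notes"
"8/10/2016","Starbucks","STARBUCKS STORE 12345","2.45","debit","Coffee Shops","My Credit Card","",""
'''


def pending_from_mint_row(row, mint_id_to_account, currency='USD'):
    """Formats one CSV row as a two-posting pending transaction."""
    date = datetime.datetime.strptime(row['Date'], '%m/%d/%Y').date()
    amount = Decimal(row['Amount'])
    if row['Transaction Type'] == 'debit':
        amount = -amount  # a debit reduces the known account
    account = mint_id_to_account[row['Account Name']]
    # Only the Original Description is used; Description and Category are ignored.
    desc = row['Original Description']
    return '\n'.join([
        '%s * "%s"' % (date, desc),
        '  %s  %s %s' % (account, amount, currency),
        '    date: %s' % date,
        '    source_desc: "%s"' % desc,
        '  Expenses:FIXME  %s %s' % (-amount, currency),
    ])


row = next(csv.DictReader(io.StringIO(CSV_TEXT)))
text = pending_from_mint_row(row, {'My Credit Card': 'Liabilities:Credit-Card'})
print(text)
```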

The user might manually specify that the unknown account is Expenses:Coffee. The web interface will then show the updated changeset:

+2016-08-10 * "STARBUCKS STORE 12345"
+  Liabilities:Credit-Card             -2.45 USD
+    date: 2016-08-10
+    source_desc: "STARBUCKS STORE 12345"
+  Expenses:Coffee                      2.45 USD

If the Expenses:Coffee account does not already exist, Beancount-import will additionally include an open directive in the changeset:

+2016-08-10 * "STARBUCKS STORE 12345"
+  Liabilities:Credit-Card             -2.45 USD
+    date: 2016-08-10
+    source_desc: "STARBUCKS STORE 12345"
+  Expenses:Coffee                      2.45 USD
+2016-08-10 open Expenses:Coffee USD

Once the user accepts this change, the changeset is applied to the journal. The presence of the date and source_desc metadata fields indicates to the Mint data source that the Liabilities:Credit-Card posting is cleared. The combination of the words in the source_desc, the source account of Liabilities:Credit-Card, and the target account of Expenses:Coffee serves as a training example for the classifier. A subsequent pending transaction with a source_desc field containing the word STARBUCKS is likely to be automatically classified as Expenses:Coffee. Note that while in this case the narration matches the source_desc field, the narration has no effect on the automatic prediction. The user must not delete or modify these metadata fields, but additional metadata fields may be added.
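
To make the shape of a training example concrete, here is a small sketch. The feature naming (the account: and desc: prefixes) and the function training_example are assumptions for illustration; the sketch shows only which inputs contribute, not how the decision tree classifier is trained.

```python
def training_example(source_desc, source_account, target_account):
    """Builds a (features, label) pair from a cleared posting.

    The narration is deliberately not an input: only the words of
    source_desc and the source account are used as features.
    """
    features = {'account:' + source_account}
    features.update('desc:' + word for word in source_desc.split())
    return features, target_account


features, label = training_example(
    'STARBUCKS STORE 12345', 'Liabilities:Credit-Card', 'Expenses:Coffee')
```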

Mint.com has its own heuristics for computing the Description and Category fields from the Original Description provided by the financial institution. However, these are ignored by the Mint data source as they are not stable (can change if the data is re-downloaded) and not particularly reliable.

Match to a manually entered transaction

Considering the same transaction as shown in the previous example, suppose the user has already manually entered the transaction prior to running the import:

2016-08-09 * "Coffee"
  Liabilities:Credit-Card             -2.45 USD
  Expenses:Coffee

When running Beancount-import, the user will be presented with two candidates:

 2016-08-09 * "Coffee"
   Liabilities:Credit-Card             -2.45 USD
+    date: 2016-08-10
+    source_desc: "STARBUCKS STORE 12345"
   Expenses:Coffee


+2016-08-10 * "STARBUCKS STORE 12345"
+  Liabilities:Credit-Card             -2.45 USD
+    date: 2016-08-10
+    source_desc: "STARBUCKS STORE 12345"
+  Expenses:FIXME                       2.45 USD

The user should select the first one; selecting the second one would yield a duplicate transaction (although it could later be diagnosed as an uncleared transaction). In general, the Expenses:FIXME account in the second candidate would actually be some other, possibly incorrect, predicted account, but it is clearly indicated as a prediction that can be changed.

As is typically the case, the date on the manually entered transaction (likely the date on which the transaction actually occurred) is not exactly the same as the date provided by the bank. To handle this discrepancy, Beancount-import allows matches between postings that are up to 5 days apart. The date metadata field allows the posting to be reliably matched to the corresponding entry in the CSV file, even though the overall transaction date differs.
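
A minimal sketch of the date-tolerant comparison, assuming that "up to 5 days apart" is inclusive and that two postings must agree on account and amount. The PostingInfo class and the may_match function are hypothetical; the real matching system considers more than this.

```python
import datetime
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class PostingInfo:
    account: str
    amount: Decimal
    date: datetime.date
    cleared: bool


def may_match(a, b, max_days=5):
    """Returns True if postings a and b are candidates for merging."""
    if a.cleared and b.cleared:
        return False  # a cleared posting can only match an uncleared one
    if a.account != b.account or a.amount != b.amount:
        return False
    return abs((a.date - b.date).days) <= max_days


manual = PostingInfo('Liabilities:Credit-Card', Decimal('-2.45'),
                     datetime.date(2016, 8, 9), cleared=False)
imported = PostingInfo('Liabilities:Credit-Card', Decimal('-2.45'),
                       datetime.date(2016, 8, 10), cleared=True)
```

With these values, may_match(manual, imported) is true: the dates differ by one day, and only the imported posting is cleared.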

Note that even though this transaction was manually entered, once it is matched with the pending transaction and the source_desc and date metadata fields are added, it functions as a training example exactly the same as in the previous example.

Credit card payment transaction

Suppose the user pays the balance of a credit card using a bank account, and Mint.com is set up to retrieve the transactions from both the bank account and the credit card.

Given the following CSV entries:

"Date","Description","Original Description","Amount","Transaction Type","Category","Account Name","Labels","Notes"
"11/27/2013","Transfer from My Checking","CR CARD PAYMENT ALEXANDRIA VA","66.88","credit","Credit Card Payment","My Credit Card","",""
"12/02/2013","National Federal Des","NATIONAL FEDERAL DES:TRNSFR","66.88","debit","Transfer","My Checking","",""

and the following open account directives:

1900-01-01 open Liabilities:Credit-Card  USD
  mint_id: "My Credit Card"

1900-01-01 open Assets:Checking  USD
  mint_id: "My Checking"

the Mint data source will generate 2 pending transactions, and for the first one will present two candidates:

+2013-11-27 * "CR CARD PAYMENT ALEXANDRIA VA"
+  Liabilities:Credit-Card             66.88 USD
+    date: 2013-11-27
+    source_desc: "CR CARD PAYMENT ALEXANDRIA VA"
+  Assets:Checking                    -66.88 USD
+    date: 2013-12-02
+    source_desc: "NATIONAL FEDERAL DES:TRNSFR"


+2013-11-27 * "CR CARD PAYMENT ALEXANDRIA VA"
+  Liabilities:Credit-Card             66.88 USD
+    date: 2013-11-27
+    source_desc: "CR CARD PAYMENT ALEXANDRIA VA"
+  Expenses:FIXME                     -66.88 USD

Note that the Expenses:FIXME account in the second transaction will actually be whichever account was predicted automatically. If there have been prior similar transactions, it is likely to be correctly predicted as Assets:Checking.

The user should accept the first candidate to import both transactions at once. In that case, both postings are considered cleared, and the new transaction will result in two training examples for automatic prediction, corresponding to each of the two combinations of source_desc, source account, and target account.

However, if the user accepts the second candidate (perhaps because the transaction hasn't yet been posted to the checking account and the pending transaction derived from the checking account data is not yet available), and either leaves the account as Expenses:FIXME, manually specifies Assets:Checking, or relies on the automatic prediction to choose Assets:Checking, then when importing the transaction from the checking account, the user will be presented with the following candidates and will have another chance to accept the match:

 2013-11-27 * "CR CARD PAYMENT ALEXANDRIA VA"
   Liabilities:Credit-Card             66.88 USD
     date: 2013-11-27
     source_desc: "CR CARD PAYMENT ALEXANDRIA VA"
   Assets:Checking                    -66.88 USD
+    date: 2013-12-02
+    source_desc: "NATIONAL FEDERAL DES:TRNSFR"


+2013-12-02 * "NATIONAL FEDERAL DES:TRNSFR"
+  Assets:Checking                    -66.88 USD
+    date: 2013-12-02
+    source_desc: "NATIONAL FEDERAL DES:TRNSFR"
+  Expenses:FIXME                      66.88 USD

License

Copyright (C) 2014-2018 Jeremy Maitin-Shepard.

Distributed under the GNU General Public License, Version 2.0 only. See LICENSE file for details.

beancount-import's People

Contributors

ankurdave, bayesianmind, camhashemi, carljm, cookie223, delphij, dependabot[bot], dumbpy, hnws, jakobwenzel, jasonhuhx, jbms, jktomer, jonboylecoding, m-d-brown, marklodato, mattmattmatt, mbafford, mikej, moritzj29, naskooskov, nnooney, patbakdev, scanta2, tiraelsedai, witten, xentac, zacchiro, zburatorul, zhou13

beancount-import's Issues

Transaction [Cost]

I am trying to create an import file for an account, but I'm having trouble being able to import a cost for a transaction. Not sure whether this is a beancount specific issue or if it's related to beancount-import specifically (which I think it is).

YYYY-MM-DD [txn|Flag] [[Payee] Narration]
   [Flag] Account       Amount [{Cost}] [@ Price]
   [Flag] Account       Amount [{Cost}] [@ Price]

I'm trying to add a Cost to my transactions from this code

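# Imports inferred from the names used below; module paths are assumed.
import collections

from beancount.core.data import EMPTY_SET, Posting, Transaction
from beancount.core.flags import FLAG_OKAY
from beancount_import.matching import FIXME_ACCOUNT

# account_entry is one parsed row of the source file (an AccountEntry).
transaction = Transaction(
    meta=None,
    date=account_entry.date,
    flag=FLAG_OKAY,
    payee=None,
    narration=account_entry.source_desc,
    tags=EMPTY_SET,
    links=EMPTY_SET,
    postings=[
        # Known account, with the metadata that marks the posting as cleared.
        Posting(
            account=account_entry.account,
            units=account_entry.value,
            cost=None,
            price=None,
            flag=None,
            meta=collections.OrderedDict(
                source_desc=account_entry.source_desc,
                date=account_entry.date,
            )),
        # Unknown account carrying the cost and price.
        Posting(
            account=FIXME_ACCOUNT,
            units=-account_entry.amount,
            cost=account_entry.cost,
            price=account_entry.rate,
            flag=None,
            meta=None,
        ),
        # Unknown account for the fee.
        Posting(
            account=FIXME_ACCOUNT,
            units=-account_entry.fee,
            cost=None,
            price=None,
            flag=None,
            meta=None,
        ),
    ])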

The resulting object contains the correct cost, but the cost shows up as an empty {} when the transaction is displayed as a candidate:

+2020-01-01 * ""
+  Assets:Account      589.03000 USD
+    date: 2020-01-01
+    source_desc: ""
+  1Expenses:FIXME    -500.00 EUR {} @ 1.17512 USD
+  2Expenses:FIXME    1.47000 USD

This is the output for print(transaction) which shows the actual cost properly

Transaction(meta=None, date=datetime.date(2020, 01, 01), flag='*', payee=None, narration='', tags=frozenset(), links=frozenset(),
postings=[Posting(account='Assets:Account', units=589.03000 USD, cost=None, price=None, flag=None, meta=OrderedDict([('source_desc', ''), ('date', datetime.date(2020, 01, 01))])),
 Posting(account='Expenses:FIXME', units=-500.00 EUR, cost=0.8509769215058887602968207502 USD, price=1.17512 USD, flag=None, meta=None),
 Posting(account='Expenses:FIXME', units=1.47000 USD, cost=None, price=None, flag=None, meta=None)])

I create the cost the same way as the rate, amount and value, so I'm not sure why this isn't working:

entries.append(
    AccountEntry(
        account=account,
        date=date,
        source_desc=source,
        amount=Amount(number=number, currency=currency),
        filename=filename,
        line=line_i + 1,
        value=Amount(number=number_value, currency=currency_value),
        fee=None,
        type_=type_,
        rate=Amount(number=rate_number, currency=rate_currency),
        cost=Amount(number=cost_number, currency=cost_currency)))

Previously ignored balance assertions are presented for import again

I provide a minimal example in which upon restart beancount-import presents import candidates that are already in the ignored.journal file. Specifically this seems to happen with balance assertions.

To reproduce:

  1. Run python3 run.py.
  2. Repeatedly press i to ignore all import candidates.
  3. Press Ctrl-C to stop the server.
  4. Run python3 run.py again and observe that the same balance assertion candidate is presented.

I have also observed this issue with price assertions. Is this the expected behaviour?
ignore_test.tar.gz

Price Format Support

I was loading PayPal data when I noticed that price parsing doesn't handle the other common number format. A dot before the decimals (the default) works, but with a comma before the decimals the comma is simply dropped:
XX.XX = XX.XX
XX,XX = XXXX
Example:
loaded data: "price": "55,33\u00a0EUR"
output: 5533 EUR

This happens in /beancount_import/amount_parsing.py, inside def parse_amount(x), at line 42: number = D(m.group(2))

def parse_amount(x):
    """Parses a number and currency."""
    if not x:
        return None
    sign, amount_str = parse_possible_negative(x)
    m = re.fullmatch(r'([\$€£])?((?:[0-9](?:,?[0-9])*|(?=\.))(?:\.[0-9]+)?)(?:\s+([A-Z]{3}))?', amount_str)
    if m is None:
        raise ValueError('Failed to parse amount from %r' % amount_str)
    if m.group(1):
        currency = {'$': 'USD', '€': 'EUR', '£': 'GBP'}[m.group(1)]
    elif m.group(3):
        currency = m.group(3)
    else:
        raise ValueError('Failed to determine currency from %r' % amount_str)
    number = D(m.group(2))
    return Amount(number * sign, currency)

Is it possible to add a check that replaces , with .? I'm not sure what the best method would be, but I'm currently using this:

    value = m.group(2)
    if ',' in value:
        value = value.replace(',','.')
    number = D(value)
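
Note that replacing every comma with a dot would also turn a value such as 1,234.56 into an invalid number. A more defensive normalization could look like the sketch below. It is a heuristic, not part of beancount-import: the function normalize_number is hypothetical, and a lone comma followed by exactly three digits (for example 1,234) is ambiguous and is treated here as a thousands separator.

```python
from decimal import Decimal


def normalize_number(text: str) -> Decimal:
    """Parses a number that uses either '.' or ',' as the decimal separator."""
    last_comma = text.rfind(',')
    last_dot = text.rfind('.')
    if last_comma > last_dot:
        digits_after = len(text) - last_comma - 1
        if last_dot == -1 and digits_after == 3:
            # Ambiguous case such as "1,234": assume thousands separators.
            return Decimal(text.replace(',', ''))
        # Decimal comma: any dots are thousands separators.
        return Decimal(text.replace('.', '').replace(',', '.'))
    # Decimal dot (or no separator): any commas are thousands separators.
    return Decimal(text.replace(',', ''))
```

With this helper, "55,33" parses as 55.33, while "1,234.56" and "1.234,56" both parse as 1234.56.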

Cannot run example

Listening at http://127.0.0.1:8101
Traceback (most recent call last):
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/site-packages/beancount_import/webserver.py", line 489, in _handle_reconciler_loaded
    loaded_reconciler = loaded_future.result()
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/site-packages/beancount_import/thread_helpers.py", line 13, in wrapper
    f.set_result(fn(*args, **kwargs))
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/site-packages/beancount_import/reconcile.py", line 380, in __init__
    self._load_sources()
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/site-packages/beancount_import/reconcile.py", line 434, in _load_sources
    for spec in self.reconciler.options['data_sources']
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/site-packages/beancount_import/reconcile.py", line 434, in <listcomp>
    for spec in self.reconciler.options['data_sources']
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/site-packages/beancount_import/source/__init__.py", line 318, in load_source
    m = importlib.import_module(source_spec.pop('module'))
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/site-packages/beancount_import/source/ofx.py", line 434, in <module>
    from beancount.ingest.importers.ofx import parse_ofx_time, find_child
ModuleNotFoundError: No module named 'beancount.ingest.importers.ofx'
> /Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/site-packages/beancount_import/source/ofx.py(434)<module>()
-> from beancount.ingest.importers.ofx import parse_ofx_time, find_child
(Pdb)

Does not work on Windows with non-Latin letters

Listening at http://127.0.0.1:8101
Traceback (most recent call last):
  File "C:\Python37\lib\site-packages\beancount_import\webserver.py", line 488, in _handle_reconciler_loaded
    loaded_reconciler = loaded_future.result()
  File "C:\Python37\lib\concurrent\futures\_base.py", line 425, in result
    return self.__get_result()
  File "C:\Python37\lib\concurrent\futures\_base.py", line 384, in __get_result
    raise self._exception
  File "C:\Python37\lib\site-packages\beancount_import\thread_helpers.py", line 13, in wrapper
    f.set_result(fn(*args, **kwargs))
  File "C:\Python37\lib\site-packages\beancount_import\reconcile.py", line 380, in __init__
    self._load_sources()
  File "C:\Python37\lib\site-packages\beancount_import\reconcile.py", line 434, in _load_sources
    for spec in self.reconciler.options['data_sources']
  File "C:\Python37\lib\site-packages\beancount_import\reconcile.py", line 434, in <listcomp>
    for spec in self.reconciler.options['data_sources']
  File "C:\Python37\lib\site-packages\beancount_import\source\__init__.py", line 319, in load_source
    return m.load(source_spec, log_status=log_status)  # type: ignore
  File "C:\Python37\lib\site-packages\beancount_import\source\ofx.py", line 1373, in load
    return OfxSource(log_status=log_status, **spec)
  File "C:\Python37\lib\site-packages\beancount_import\source\ofx.py", line 1329, in __init__
    ParsedOfxFile(self.source_fitids, filename))
  File "C:\Python37\lib\site-packages\beancount_import\source\ofx.py", line 1127, in __init__
    contents = f.read()
  File "C:\Python37\lib\encodings\cp1251.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x98 in position 5539: character maps to <undefined>
> c:\python37\lib\encodings\cp1251.py(23)decode()
-> return codecs.charmap_decode(input,self.errors,decoding_table)[0]

The journal file is encoded in UTF-8.
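The traceback shows `f.read()` decoding through `cp1251`, which suggests the file is opened with the locale's default encoding rather than an explicit one. A minimal sketch of the likely remedy, assuming the OFX file is UTF-8 (an assumption; OFX files may declare other encodings) and using a hypothetical helper name `read_text`:

```python
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Read a text file with an explicit encoding, not the locale default."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


# "И" is encoded in UTF-8 as the bytes D0 98; 0x98 has no mapping in cp1251,
# which is consistent with the byte reported in the traceback.
sample = "Иван: оплата"
fd, path = tempfile.mkstemp(suffix=".ofx")
try:
    with os.fdopen(fd, "wb") as f:
        f.write(sample.encode("utf-8"))
    assert read_text(path) == sample
finally:
    os.remove(path)
```

Passing `encoding=` makes the result independent of the Windows code page, so the same file reads identically on every platform.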

The workaround is to use the Linux subsystem, which I used to do on my desktop. That is no longer available, and my laptop doesn't have Windows 10 Pro, so I'd like to fix the real issue.

P.S. Great software, by the way. I've really enjoyed using it over the last couple of months.

Generic CSV importer

Hi. I've recently opened an account in a bank that doesn't support OFX export.

I'd like to write a general CSV importer, but I'm having a hard time even starting. Where do I look? venmo.py is probably the closest to plain CSV, but cutting it back to my needs is still a bit of a stretch.

The CSV structure looks sufficient to uniquely match transactions, and it includes a description and so on:

Account type; Account number; Currency; Date (dd.mm.yyyy); Reference; Description; Credit; Debit

Reference is somewhat unique and looks like this:

CRD_3629XM - operation with card

A102705190005380 - operation with account

HOLD - you guessed it

Obviously, as a lazy person, I'd like to just have a working solution in some upcoming release, but pointers on how to write it myself would be nice too.
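As a starting point, the parsing half of such an importer can be sketched with the standard library alone. This is only a sketch under stated assumptions: the column order follows the header quoted above, amounts use a decimal point, the sign convention is credit minus debit, and the sample row and the name `parse_rows` are hypothetical. Turning the parsed rows into beancount-import pending entries is not shown.

```python
import csv
import datetime
import io
from decimal import Decimal

# Column names taken from the header described above.
FIELDS = ["Account type", "Account number", "Currency", "Date",
          "Reference", "Description", "Credit", "Debit"]

# Hypothetical sample row, for illustration only.
SAMPLE = "Checking; 12345; EUR; 27.05.2019; CRD_3629XM; Coffee shop; ; 3.50\n"


def parse_rows(text):
    """Yield one dict per well-formed row, with a typed date and signed amount."""
    for record in csv.reader(io.StringIO(text), delimiter=";"):
        if len(record) != len(FIELDS):
            continue  # skip blank or malformed lines
        row = dict(zip(FIELDS, (cell.strip() for cell in record)))
        credit = Decimal(row["Credit"] or "0")
        debit = Decimal(row["Debit"] or "0")
        yield {
            "account": row["Account number"],
            "currency": row["Currency"],
            "date": datetime.datetime.strptime(row["Date"], "%d.%m.%Y").date(),
            "reference": row["Reference"],
            "description": row["Description"],
            "amount": credit - debit,  # assumed sign convention
        }


rows = list(parse_rows(SAMPLE))
```

Because the reference is only "somewhat unique", a duplicate-detection key would probably need to combine it with the date and amount.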

Tornado uncaught exception when running example

Hi, I'm trying to run the manually_entered example (the same thing happens with the fresh example), but I can't find a way around this error, which is printed repeatedly to the console:

ERROR:tornado.application:Uncaught exception GET /BEANCOUNT_IMPORT_SECRET_KEY_f31e30e228b14bf2e354100c5305bfdab40d608f/websocket (127.0.0.1)
HTTPServerRequest(protocol='http', host='localhost:8101', method='GET', uri='/BEANCOUNT_IMPORT_SECRET_KEY_f31e30e228b14bf2e354100c5305bfdab40d608f/websocket', version='HTTP/1.1', remote_ip='127.0.0.1')
Traceback (most recent call last):
  File "/Users/rbharvey/.virtualenvs/ledger/lib/python3.6/site-packages/tornado/websocket.py", line 952, in _accept_connection
    open_result = handler.open(*handler.open_args, **handler.open_kwargs)
  File "/Users/rbharvey/.virtualenvs/ledger/lib/python3.6/site-packages/beancount_import/webserver.py", line 267, in open
    self.set_nodelay(True)
  File "/Users/rbharvey/.virtualenvs/ledger/lib/python3.6/site-packages/tornado/websocket.py", line 561, in set_nodelay
    assert self.stream is not None
AssertionError

In the browser, it sort of flickers between what appears to be the web UI and the following error message: Server connection closed, waiting to reconnect.

[Paypal] SEND_MONEY_RECEIVED with fundingSource set breaks import

Hi everyone,

I'm facing some issues with SEND_MONEY_RECEIVED PayPal transactions. In my case, all the transactions have a fundingSource entry in the JSON, therefore the import breaks at this line.

assert 'fundingSource' not in data

Since the only function of the line is to make sure NO fundingSource is set, I wonder what the purpose is. Removing the line, my imports work fine and I don't see any obvious problem yet.

For reference, I use transaction data from Paypal DE (#142) so maybe Paypal is doing some things differently in Germany...

Price fetch reads yesterday's price

This is what I got in the prices_output_map file:
2018-10-31 price FAS 56.4500000 USD
And
(pyenv) [xxxx@av beancount]$ bean-price -e USD:yahoo/FAS
I got output of
2018-10-31 price FAS 58.5499999999999971578290569595992565155029296875 USD
According to the attached chart (image not preserved), 58.55 is the close price for 10-31 and 56.45 is the close price for 10-30.

I spent some time trying to fix this, however I cannot find any obvious place this price is fetched.
Would you please take a look? Or please let me know which code should I look at so I can attempt a fix?

journal_editor breaks when matched transaction has a comment

My journal contains:

2018-03-21 * "things"
  Assets:Cash:Bank1                             2572.64 USD
  ;; Comment
  Expenses:Purchases:Gifts                     -2572.64 USD

My generated entry looks like:

2018-03-21 * ""
  Assets:Cash:Bank1                             2572.64 USD
  Expenses:FIXME                               -2572.64 USD

When I try to import, I get an AssertionError like this:

Traceback (most recent call last):
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/webserver.py", line 499, in _handle_reconciler_loaded
    self.get_next_candidates(new_pending=True)
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/webserver.py", line 508, in get_next_candidates
    self.skip_ids)
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/reconcile.py", line 873, in get_next_candidates
    pending), i, new_skip_ids
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/reconcile.py", line 841, in _make_candidates_from_import_result
    sources=self.sources,
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/reconcile.py", line 234, in __init__
    candidate.update_associated_data(self.sources)
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/reconcile.py", line 202, in update_associated_data
    diff = self.staged_changes.get_diff()
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/journal_editor.py", line 904, in get_diff
    new_posting=new_posting)
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/journal_editor.py", line 684, in compute_posting_changes
    builder.match_metadata(old_posting.meta)
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/journal_editor.py", line 592, in match_metadata
    assert meta['lineno'] == self.orig_lineno + 1
AssertionError

When I remove the comment, all is well.

OFX commodity import wants to repeatedly add existing transactions

I suspect the problem is that the importer isn't looking at the right account, but I haven't had enough time to look through the code myself to see.

I have these accounts in my beancount file:

2001-01-01 open Assets:Retirement:Employer
  ofx_org: "Vanguard"
  ofx_broker_id: "vanguard.com"
  account_id: "123456"
  ofx_account_type: "securities_and_cash"
  div_income_account: "Income:Vanguard:Dividends"
  capital_gains_account: "Income:Vanguard:Capital-Gains"
2001-01-01 open Assets:Retirement:Employer:PreTax:VIRSX         VIRSX
2001-01-01 open Assets:Retirement:Employer:PreTax:Cash          USD
2001-01-01 open Assets:Retirement:Employer:Match:VIRSX          VIRSX
2001-01-01 open Assets:Retirement:Employer:Match:Cash           USD

Here's a simple OFX file with a single transaction:

example.zip

If I use the above account definitions with that OFX file, beancount_import wants to add the following transaction:

2019-01-15 * "BUYMF - PRETAX"
  Assets:Retirement:Employer:PreTax:VIRSX  15.12443 VIRSX {22.18 USD}
    date: 2019-01-15
    ofx_fitid: "1141201901150022251001668EEE"
    ofx_type: "BUYMF"
  Assets:Retirement:Employer:PreTax:Cash  -335.46 USD
    ofx_fitid: "1141201901150022251001668EEE"

That's all well and good, but if I add it and then exit and restart the importer, it wants to add the transaction again even though it's already there. (Worse, if the beancount file changes and the importer reloads it, the importer will immediately offer to re-add the same transaction.)

I would expect the importer to see that the OFX transaction already has a matching beancount transaction and not try to add it again. It works properly with my other, cash-only accounts (a mix of bank accounts and credit cards). The only problems are with my two retirement accounts.

Error handling for missing metadata fields

When I'm trying to import a new OFX account, and I'm in the process of setting all of the necessary Beancount-import OFX metadata in my Beancount accounts file, I'll often get errors like these when I don't yet have the metadata quite right:

 File "/usr/lib/python3.7/site-packages/beancount_import/source/ofx.py", line 529, in get_account_by_key
    raise KeyError('%s: must specify %s' % (account.account, key))
KeyError: 'Assets:MyBank: must specify capital_gains_account'
> /usr/lib/python3.7/site-packages/beancount_import/source/ofx.py(529)get_account_by_key()
-> raise KeyError('%s: must specify %s' % (account.account, key))
(Pdb) 

So, getting unceremoniously dumped into a pdb shell because I'm missing a capital_gains_account entry! This is okay (but not ideal) when running Beancount-import locally in an interactive shell. It's super not great when running Beancount-import persistently on a remote web server with nothing interactive!

To be fair, I was warned of this potential issue (#14 (comment)):

I think more generally some other changes may be helpful to make the beancount-import webserver more convenient as a persistent, multi-user service. For example, currently while it will reload the journal automatically there is no way to reload the import data, and failure to parse the import data often leads to terminating the program.

So I'm opening this ticket to start brainstorming solutions. The existing errors tab in the front-end seems like a nice way to present errors to the user. Would it be possible for errors that currently drop into a pdb shell or exit the program to surface as recoverable errors in the UI instead?
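One possible direction, sketched here under assumptions rather than taken from the beancount-import code, is to validate the metadata up front and collect messages instead of raising on the first missing key. The function name `find_missing_metadata`, the dictionary layout, and the choice of which keys count as required are all hypothetical; only the key names and the message format come from the examples in this thread.

```python
# Keys seen in the account examples in this thread; which of them are
# actually mandatory for a given account type is an assumption.
REQUIRED_KEYS = ("ofx_org", "ofx_broker_id", "account_id",
                 "ofx_account_type", "capital_gains_account")


def find_missing_metadata(accounts, required_keys=REQUIRED_KEYS):
    """Return a list of error strings instead of raising on the first problem.

    `accounts` maps an account name to its metadata dictionary.
    """
    errors = []
    for name, meta in accounts.items():
        for key in required_keys:
            if key not in meta:
                errors.append("%s: must specify %s" % (name, key))
    return errors


accounts = {
    "Assets:MyBank": {
        "ofx_org": "MyBank",
        "ofx_broker_id": "mybank.example",
        "account_id": "123456",
        "ofx_account_type": "securities_and_cash",
    },
}
errors = find_missing_metadata(accounts)
```

A list like `errors` could then be handed to whatever mechanism feeds the errors tab, leaving the server running while the user fixes the journal.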

Failed assertion for Fidelity accounts

Importing ofx files from my Fidelity 401k account for BUY transactions is problematic.

In ofx.py:1040 there's this assertion:

if raw.trantype in STOCK_BUY_SELL_TYPES:
    assert abs(total + fee_total +
               (units * unitprice)) < TOLERANCE, abs(
                   total + fee_total + (units * unitprice))

The unitprice in the OFX file is not accurate, so the tolerance is not met. The units and total are correct, and they match the statement.

The cost_spec is computed on line 933, using unitprice, before we determine if a fee was charged.

What I think should happen is that total and fee_total should be computed before the cost_spec initialization, and instead of using "number_per", we should use "number_total" with -(total+fee_total) as its value, and ignore potentially problematic unitprice values.
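The idea can be sketched as follows. This is a variant of the proposal, not the code in ofx.py: it keeps the per-unit price when it is consistent with the total and falls back to a total cost otherwise. The value of `TOLERANCE`, the function name `choose_cost`, and the sample numbers are assumptions made for illustration.

```python
from decimal import Decimal

TOLERANCE = Decimal("0.01")  # assumed value; the real constant lives in ofx.py


def choose_cost(total, fee_total, units, unitprice, tolerance=TOLERANCE):
    """Return ("per_unit", price) if the reported price is consistent,
    otherwise ("total", cost) computed from the total and the fees."""
    residual = abs(total + fee_total + units * unitprice)
    if residual < tolerance:
        return ("per_unit", unitprice)
    return ("total", -(total + fee_total))


# Hypothetical BUY: 4 units, 100.00 paid, no fee.
consistent = choose_cost(Decimal("-100.00"), Decimal("0"),
                         Decimal("4"), Decimal("25.00"))
# Same purchase, but with an inaccurate unit price in the file.
inconsistent = choose_cost(Decimal("-100.00"), Decimal("0"),
                           Decimal("4"), Decimal("24.90"))
```

With the second set of inputs the residual is 0.40, so the sketch reports a total cost of 100.00 instead of failing an assertion.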

Matching engine failing to join transactions

I am running an import with --fuzzy_match_days 7. For some reason it's not managing to match the top pending transaction with the existing bottom one. This is just an example, there are many similar instances.

2018-02-08 * "Amazon.com" "Order"
  amazon_account: "XXX"
  amazon_order_id: "112-4181632-8076222"
  Expenses:FIXME:A   79.99 USD
    amazon_item_condition: "New"
    amazon_item_description: "Anker PowerCore Speed 20000 PD, 20100mAh Portable Charger & 30W Power Delivery Wall Charger Bundle, Input & Output Type C Power Bank for Nexus 5X 6P, LG G5, iPhone 8 / X and Macbooks"
    amazon_item_quantity: 1
    amazon_seller: "AnkerDirect"
    shipped_date: 2018-02-09
  Expenses:FIXME:A  -10.00 USD
    amazon_invoice_description: "Your Coupon Savings"
  Expenses:FIXME:A    6.21 USD
    amazon_invoice_description: "Sales Tax"
  Expenses:FIXME    -76.20 USD
    amazon_credit_card_description: "MasterCard ending in 7754"
    transaction_date: 2018-02-09
2018-02-13 * "Debit Card Purchase 02/09 0"
  Assets:Bank:Citibank                                  -76.20 USD
    memo: "AMAZON MKTPLACE PMTS   AMZN.COM/BILL WA"
  Expenses:Electronics                                   76.20 USD

Matching runs forever / Webserver unresponsive

Hi,

I'm trying to import a Paypal transaction, but it reliably breaks the Webserver. It is a transaction containing quite a number of items. If I reduce the number of items, the Webserver remains responsive. Increasing the number of items included in the JSON increases the time needed for processing enormously. It seems to scale in a highly non-linear way.

I attached the problematic JSON, personal information was removed: Paypal_test_json.txt

I tried tracking the time consuming method in the code. I found that _get_valid_posting_matches is called over and over again. I printed the length of the matches returned and it fluctuates heavily, increasing up to 6000?! My beancount file has about 200 transactions at the moment...

def _get_valid_posting_matches(

Let's see how long the processing will take, I'm afraid I will have to leave it running overnight...

I would really appreciate some hints on debugging! Unfortunately, I don't understand the code well enough yet.

As a workaround I can probably just remove the items from the JSON by hand?

Can't run examples on PyPI version

Hello, thank you for writing and maintaining this software! Just wanted to note that I was unable to run the examples using the version available on PyPI with this error:

(my-finances) $ ./run.py 
Listening at http://127.0.0.1:8101
Traceback (most recent call last):
  File "/home/vkurup/.pyenv/versions/3.9.0/envs/my-finances/lib/python3.9/site-packages/beancount_import/webserver.py", line 493, in _handle_reconciler_loaded
    loaded_reconciler = loaded_future.result()
  File "/home/vkurup/.pyenv/versions/3.9.0/lib/python3.9/concurrent/futures/_base.py", line 433, in result
    return self.__get_result()
  File "/home/vkurup/.pyenv/versions/3.9.0/lib/python3.9/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/home/vkurup/.pyenv/versions/3.9.0/envs/my-finances/lib/python3.9/site-packages/beancount_import/thread_helpers.py", line 13, in wrapper
    f.set_result(fn(*args, **kwargs))
  File "/home/vkurup/.pyenv/versions/3.9.0/envs/my-finances/lib/python3.9/site-packages/beancount_import/reconcile.py", line 380, in __init__
    self._load_sources()
  File "/home/vkurup/.pyenv/versions/3.9.0/envs/my-finances/lib/python3.9/site-packages/beancount_import/reconcile.py", line 433, in _load_sources
    sources = self.sources = [
  File "/home/vkurup/.pyenv/versions/3.9.0/envs/my-finances/lib/python3.9/site-packages/beancount_import/reconcile.py", line 434, in <listcomp>
    load_source(spec, log_status=self.reconciler.log_status)
  File "/home/vkurup/.pyenv/versions/3.9.0/envs/my-finances/lib/python3.9/site-packages/beancount_import/source/__init__.py", line 318, in load_source
    m = importlib.import_module(source_spec.pop('module'))
  File "/home/vkurup/.pyenv/versions/3.9.0/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'beancount_import.source.generic_importer_source'

I was able to successfully get it to work by downloading the master branch:

pip install git+git://github.com/jbms/beancount-import@master#egg=beancount-import

(in case that helps someone else until the current version is released to PyPI)

Entries with 0 unitprice

A Vanguard OFX I'd like to import has entries where the UNITPRICE is 0 (which seems like a bug on their part):

<REINVEST>
    <INVTRAN>
        <FITID>XXX</FITID>
        <DTTRADE>20190614160000.000[-5:EST]</DTTRADE>
        <DTSETTLE>20190614160000.000[-5:EST]</DTSETTLE>
        <MEMO>DIVIDEND REINVEST</MEMO>
    </INVTRAN>
    <SECID>
        <UNIQUEID>XXX</UNIQUEID>
        <UNIQUEIDTYPE>CUSIP</UNIQUEIDTYPE>
    </SECID>
    <INCOMETYPE>DIV</INCOMETYPE>
    <TOTAL>-543.95</TOTAL>
    <SUBACCTSEC>CASH</SUBACCTSEC>
    <UNITS>7.599</UNITS>
    <UNITPRICE>0.0</UNITPRICE>
</REINVEST>

Which results in errors like this:

Traceback (most recent call last):
  File "miniconda3/lib/python3.6/site-packages/beancount_import/webserver.py", line 493, in _handle_reconciler_loaded
    loaded_reconciler = loaded_future.result()
  File "miniconda3/lib/python3.6/concurrent/futures/_base.py", line 398, in result
    return self.__get_result()
  File "miniconda3/lib/python3.6/concurrent/futures/_base.py", line 357, in __get_result
    raise self._exception
  File "miniconda3/lib/python3.6/site-packages/beancount_import/thread_helpers.py", line 13, in wrapper
    f.set_result(fn(*args, **kwargs))
  File "miniconda3/lib/python3.6/site-packages/beancount_import/reconcile.py", line 396, in __init__
    all_source_results = self._prepare_sources()
  File "miniconda3/lib/python3.6/site-packages/beancount_import/reconcile.py", line 515, in _prepare_sources
    source.prepare(self.editor, source_results)
  File "miniconda3/lib/python3.6/site-packages/beancount_import/source/ofx.py", line 1394, in prepare
    state.get_accounts_and_entries()
  File "miniconda3/lib/python3.6/site-packages/beancount_import/source/ofx.py", line 1249, in get_accounts_and_entries
    statement.get_entries(self)
  File "miniconda3/lib/python3.6/site-packages/beancount_import/source/ofx.py", line 1043, in get_entries
    total + fee_total + (units * unitprice))
AssertionError: 543.950

Should I use ignore_transaction_regexp to skip these? Fix the entries by hand? Or should the importer solve for UNITPRICE when it is zero? In this case, UNITPRICE should be 71.5817871
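The suggested value can be checked by solving for the unit price from TOTAL and UNITS. A minimal sketch, assuming no fees apply to this transaction and using a hypothetical helper name `effective_unit_price`:

```python
from decimal import Decimal


def effective_unit_price(total, units, unitprice):
    """Return the reported unit price, or derive it from total and units
    when the file reports zero."""
    if unitprice != 0:
        return unitprice
    if units == 0:
        raise ValueError("cannot derive a unit price from zero units")
    # TOTAL is negative for a purchase, so negate it to get a positive price.
    return -total / units


price = effective_unit_price(Decimal("-543.95"), Decimal("7.599"), Decimal("0.0"))
# price * 7.599 reproduces the 543.95 total to within rounding.
```

The derived price agrees with the 71.5817871 quoted above once rounded to seven decimal places.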

Paypal issue downloading transactions

Not sure what happened, but downloading PayPal transactions stopped working for me. It still worked around 3 months ago.

The command I'm using:
python -m finance_dl.cli --config-module paypal_finance_dl_config --config paypal

The error message, which I assume has to do with the JSON response:

 --connect=http://127.0.0.1:53380 --session-id=be26bf0a500b64bd1208b16fa9765955
2020-09-27 19:09:38,338 paypal.py:136 [INFO] Finding username field
2020-09-27 19:09:38,358 paypal.py:139 [INFO] Entering username
2020-09-27 19:09:38,427 paypal.py:142 [INFO] Finding password field
2020-09-27 19:09:38,969 paypal.py:145 [INFO] Entering password
2020-09-27 19:09:42,974 paypal.py:149 [INFO] Logged in
2020-09-27 19:09:42,975 paypal.py:175 [INFO] Getting transaction list
2020-09-27 19:09:42,976 paypal.py:163 [INFO] Getting CSRF token
[16860:14784:0927/190945.479:ERROR:device_event_log_impl.cc(208)] [19:09:45.479] Bluetooth: bluetooth_adapter_winrt.cc:1074 Getting Default Adapter failed.
Traceback (most recent call last):
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\scrape_lib.py", line 402, in retry
    return func()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\scrape_lib.py", line 422, in fetch
    scraper.run()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\paypal.py", line 246, in run
    self.save_transactions()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\paypal.py", line 189, in save_transactions
    transaction_list = self.get_transaction_list()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\paypal.py", line 184, in get_transaction_list
    j = resp.json()
  File "D:\Documenten\coding\Import Beancount\beancount-import\env\lib\site-packages\requests\models.py", line 898, in json
    return complexjson.loads(self.text, **kwargs)
  File "D:\Documenten\coding\Import Beancount\beancount-import\env\lib\site-packages\simplejson\__init__.py", line 525, in loads
    return _default_decoder.decode(s)
  File "D:\Documenten\coding\Import Beancount\beancount-import\env\lib\site-packages\simplejson\decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "D:\Documenten\coding\Import Beancount\beancount-import\env\lib\site-packages\simplejson\decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
  File "C:\Users\Dieter\AppData\Local\Programs\Python\Python38-32\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Dieter\AppData\Local\Programs\Python\Python38-32\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\cli.py", line 91, in <module>
    main()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\cli.py", line 87, in main
    module.run(**spec)
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\paypal.py", line 250, in run
    scrape_lib.run_with_scraper(Scraper, **kwargs)
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\scrape_lib.py", line 424, in run_with_scraper
    retry(fetch)
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\scrape_lib.py", line 402, in retry
    return func()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\scrape_lib.py", line 422, in fetch
    scraper.run()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\paypal.py", line 246, in run
    self.save_transactions()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\paypal.py", line 189, in save_transactions
    transaction_list = self.get_transaction_list()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\paypal.py", line 184, in get_transaction_list
    j = resp.json()
  File "D:\Documenten\coding\Import Beancount\beancount-import\env\lib\site-packages\requests\models.py", line 898, in json
    return complexjson.loads(self.text, **kwargs)
  File "D:\Documenten\coding\Import Beancount\beancount-import\env\lib\site-packages\simplejson\__init__.py", line 525, in loads
    return _default_decoder.decode(s)
  File "D:\Documenten\coding\Import Beancount\beancount-import\env\lib\site-packages\simplejson\decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "D:\Documenten\coding\Import Beancount\beancount-import\env\lib\site-packages\simplejson\decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
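"Expecting value: line 1 column 1 (char 0)" means the response body did not start with a JSON value at all; it may have been empty or an HTML page, though which one is not visible from the traceback. A small sketch of a guard that makes this easier to diagnose, using the standard `json` module (the traceback above goes through simplejson) and a hypothetical helper name `parse_json_response`:

```python
import json


def parse_json_response(text, status_code=None):
    """Parse a response body as JSON, raising a descriptive error otherwise."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # Show the start of the body so the real cause is visible in the log.
        preview = text[:200].strip() or "<empty body>"
        raise RuntimeError(
            "Expected JSON but got something else (status=%r): %s"
            % (status_code, preview)) from exc


ok = parse_json_response('{"transactions": []}')
```

Logging the status code and the first part of the body usually shows whether the site returned a login page, an error page, or nothing.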

The OFX balance is not reliable if DTEND < DTASOF

The OFX 2.2 specification states that DTASOF is the balance date and BALAMT the balance amount. The DTEND flag is the (exclusive) end of the transactions in the file. So when DTEND is less than DTASOF there is no way we can reliably calculate a balance (at midnight as requested by beancount) since there may be (actual) transactions between DTEND and DTASOF that will (and should) not be part of the OFX file since only transactions between DTSTART and DTEND should be listed.

On the other hand, if DTEND is missing or >= DTASOF, we can calculate the balance as of (the day part of) DTASOF by deducting all transactions with a date >= (the day part of) DTASOF. A good example is test_suncorp, where in my opinion the balance is wrong: there is a transaction in the file on the balance date, and it should be deducted.

In ofx.py, I have added a dictionary key check_balance to the OFX source specification. This parameter, a callable, should evaluate to True or False for an OFX source file. The default is False, which keeps the old behavior of always adding a balance without checking that it is correct.
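The rule described above can be sketched in a few lines. This is an illustration under assumptions, not the change made in ofx.py: the function names are hypothetical, time zones are ignored, and the sample amounts are invented.

```python
import datetime
from decimal import Decimal


def balance_is_reliable(dtasof, dtend=None):
    """True when no transactions can be missing between DTEND and DTASOF."""
    return dtend is None or dtend >= dtasof


def balance_at_start_of_day(balamt, dtasof, transactions):
    """Deduct transactions dated on or after the DTASOF day from BALAMT.

    `transactions` is an iterable of (datetime, amount) pairs.
    """
    day = dtasof.date()
    later = sum((amount for when, amount in transactions
                 if when.date() >= day), Decimal(0))
    return day, balamt - later


dtasof = datetime.datetime(2019, 6, 14, 16, 0)
transactions = [
    (datetime.datetime(2019, 6, 13, 12, 0), Decimal("-50.00")),
    (datetime.datetime(2019, 6, 14, 9, 0), Decimal("-20.00")),
]
day, balance = balance_at_start_of_day(Decimal("1000.00"), dtasof, transactions)
```

Here the 20.00 withdrawal on the balance date is already included in BALAMT, so the balance at the start of that day is 1020.00.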

Javascript `TypeError: o.filename is null` when a beancount error has no filename

I have a custom beancount plugin that generates errors where the filename and line number are not specified. (I think they're set to None in Python.) (The error is about a global constraint violation, so it doesn't really make sense to tie it to a particular line or file.)

This kind of error seems to crash the beancount-import UI: when I load http://localhost:8101/#errors+pending the page is blank, and in the Firefox developer console I see

TypeError: o.filename is null
    render http://localhost:8101/#errors+pending:103
    Io http://localhost:8101/#errors+pending:40
    Po http://localhost:8101/#errors+pending:40
    ha http://localhost:8101/#errors+pending:40
    pa http://localhost:8101/#errors+pending:40
    Xa http://localhost:8101/#errors+pending:40
    qa http://localhost:8101/#errors+pending:40
    $a http://localhost:8101/#errors+pending:40
    Ua http://localhost:8101/#errors+pending:40
    ya http://localhost:8101/#errors+pending:40
    enqueueSetState http://localhost:8101/#errors+pending:40

Let me know if more detail or a reproducible example would be helpful. I'm being kind of lazy here because I'm guessing the cause of the error will be obvious.

Following `beancount.ingest`/`beangulp` workflow

Hi --

Thanks for the awesome package!

I am looking to leverage beancount-import in the identify -> extract -> archive and generate -> test workflow from beancount.ingest/beangulp. It seems like with the new support of beancount importers, this is far more achievable.

Two specific questions:

  • Are there thoughts for how to best facilitate the workflow between beancount.ingest/beangulp, e.g. something simple like just replacing the extract step with beancount-import or something more dedicated built into beancount-import?
  • Is there a way to follow the same workflow with beancount-import sources? The beancount-import ofx source is the most full-featured I have seen. Seems a bit duplicative to rewrite it all as a beancount importer. Would be great to have a means to identify, generate (that generates a "default" to be tested against, like with beancount ofx importer), and test.

Example fails on Windows because unable to rename file

On Windows 10, using beancount-import installed 12 Dec 2019, the "manually_entered" example fails with the following traceback if the journal is edited and "Save" clicked:

Traceback (most recent call last):
  File "C:\...\lib\site-packages\beancount_import\webserver.py", line 380, in on_message_set_file_contents
    os.rename(f.name, filename)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\...\examples\manually_entered\transactions.beancount.tmp_hclhyvn' -> 'C:\...\examples\manually_entered\transactions.beancount'
[the ellipses are mine]

The following patch to webserver.py seems to solve the problem:

380c380,381
<                 os.rename(f.name, filename)
---
>                 tmpname = f.name
>             os.replace(tmpname, filename)
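The patch amounts to two changes: close the temporary file before moving it, and use `os.replace`, which overwrites an existing destination on Windows as well. A self-contained sketch of that pattern, with a hypothetical function name `atomic_write` (the surrounding webserver.py code is not reproduced here):

```python
import os
import tempfile


def atomic_write(filename, contents):
    """Write `contents` to `filename` via a temporary file in the same directory."""
    dirname = os.path.dirname(os.path.abspath(filename))
    with tempfile.NamedTemporaryFile(
            "w", encoding="utf-8", dir=dirname,
            prefix=os.path.basename(filename) + ".tmp_",
            delete=False) as f:
        f.write(contents)
        tmpname = f.name
    # The temporary file is closed here, so Windows no longer holds it open.
    os.replace(tmpname, filename)
```

Keeping the temporary file in the same directory as the target means the final move stays on one filesystem.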

Feature: Accept design update PR?

Would you accept a PR that adds a bit of CSS here and there? This is a great tool and I appreciate the keyboard shortcuts. I'd like to give the UI a little lift – would you be interested in that?

cleared_before: <date> does not work for securities_and_cash account

I have tried with both an ofx imported from Vanguard and one from Chase. I have transactions in my ledger that predate the earliest date I can download via ofx. The cleared_before directive works for the Chase checking account (cash_only), but I can't get it to work for the Vanguard account. Any transaction older than "date" is still marked as uncleared.

Parsing/matching error - postings with only currency (no amount)

Problem

Having a posting entry with a currency, but no amount, causes beancount-import to raise an error and drop into the debugger. The file validates fine from beancount's perspective.

Problematic beancount data:

2020-09-12 ! "Chase"
  Assets:Checking                    -100.00 USD
  Liabilities:Credit-Cards:Chase             USD

bean-check and fava accept the file, but beancount-import raises the error shown under "beancount-import error" below.

Fixed data

(works fine with beancount-import and bean-check and fava)

2020-09-12 ! "Chase"
  Assets:Checking                    -100.00 USD
  Liabilities:Credit-Cards:Chase

beancount-import error

  File "beancount-import/beancount_import/webserver.py", line 502, in _handle_reconciler_loaded
    loaded_reconciler = loaded_future.result()
  File "python3.7/concurrent/futures/_base.py", line 428, in result
    return self.__get_result()
  File "python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "beancount-import/beancount_import/thread_helpers.py", line 13, in wrapper
    f.set_result(fn(*args, **kwargs))
  File "beancount-import/beancount_import/reconcile.py", line 397, in __init__
    self._preprocess_entries()
  File "beancount-import/beancount_import/reconcile.py", line 442, in _preprocess_entries
    posting_db.add_transaction(entry)
  File "beancount-import/beancount_import/matching.py", line 258, in add_transaction
    transaction, self.is_cleared):
  File "beancount-import/beancount_import/matching.py", line 779, in get_matchable_postings
    (p for p, _ in weighted_postings), is_cleared):
  File "beancount-import/beancount_import/matching.py", line 724, in get_aggregate_posting_candidates
    posting.units.number > ZERO),
TypeError: '>' not supported between instances of 'type' and 'decimal.Decimal'
> beancount-import/beancount_import/matching.py(724)get_aggregate_posting_candidates()
-> posting.units.number > ZERO),

Comments

In my case this is just a typo when copying an existing record and editing manually, so I have no problem fixing the source data, but I imagine most people would turn to bean-check to validate their journal if they do encounter problems, so beancount-import would benefit from handling this the same way.
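One way the matching code could tolerate such postings is to check that the number is a real Decimal before comparing it. The sketch below is only an illustration, not the project's code: `MISSING`, `Amount`, and `Posting` are simplified stand-ins for Beancount's own types, and `is_positive` is a hypothetical helper name.

```python
from collections import namedtuple
from decimal import Decimal


class MISSING:
    """Stand-in for the sentinel class used when a posting omits its number."""


# Simplified stand-ins for the Beancount data types (assumption).
Amount = namedtuple("Amount", "number currency")
Posting = namedtuple("Posting", "account units")

ZERO = Decimal(0)


def has_concrete_number(posting):
    """Return True only if the posting carries an actual Decimal number."""
    units = posting.units
    return units is not None and isinstance(units.number, Decimal)


def is_positive(posting):
    """Compare against ZERO only when a number is present."""
    return has_concrete_number(posting) and posting.units.number > ZERO
```

With a guard like this, a posting such as `Liabilities:Credit-Cards:Chase USD` would simply be skipped by the comparison instead of raising a TypeError.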

Parse 'OTHERVEST' OFX INV401KSOURCE

In my 401(k) OFX statements, I noticed there was a statement inside an INV401KSOURCE that was not correctly handled by beancount-import: "OTHERVEST".

Could this be added to ofx.py? I believe it just needs to be added to inv401k_account_keys, but I'm not certain.

Add payee shortcut to frontend

Just as pressing n allows editing the narration, there should be a "press p" shortcut to edit the payee (adding one first if it is missing).
Looking at transaction_line_editor.tsx, I think this should be trivial, given that payeeStart and payeeEnd indexes are already tracked in TransactionLineParseResult and the editNarration method is there for reference.
That being said, I have rarely worked on frontends, and setting this up would take me a lot of effort. 😞
It would be great if someone could add it.

Import from bean-extract / beancount.ingest?

So, I've spent hours writing a bean-extract compatible parser using beancount.ingest.importer.ImporterProtocol

Can one somehow string those things together?


  • I do NOT mean parsing a written-out .beancount file,
    but "run bean-extract" and use that result as data_source - similar to Fava?

something along the lines of the usual:

# beancount.config
from importers import my_importer
configured_my_importer = my_importer.Importer(
  currency='USD',
)
CONFIG = [
  configured_my_importer,
]
# ....
# ....
# ....
    data_sources = [
        dict(
            module='beancount_import.source.beancount-ingest',
            class=configured_my_importer,
        )
# or
    data_sources = [
        dict(
            module='beancount_import.source.beancount-extract',
            config='path/to/beancount.config', # parse&read CONFIG from there
            dirs='documents/',
        )

or however that's gonna map out... ;-)

(think that could proxy-close #18 as well...)

FR: Show where an existing transaction is

In the "Candidates" panel, beancount-import helpfully shows me the diff it is going to apply.

Sometimes I want to make additional changes with my text editor. But to do that I need to know which file the transaction is (and ideally which line); I don't have a fast way to do that. It would be nice if beancount-import showed the filename and line numbers whenever it showed a diff.
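For reference, Beancount records `filename` and `lineno` in each parsed entry's metadata, so the information is available in principle. A minimal sketch of how a location label could be built from that metadata; `format_location` is a hypothetical helper, not an existing beancount-import function.

```python
def format_location(meta):
    """Build a 'file:line' label from an entry's metadata dictionary."""
    filename = meta.get("filename")
    lineno = meta.get("lineno")
    if filename is None:
        return "<unknown location>"
    if lineno is None:
        return filename
    return f"{filename}:{lineno}"
```

A label in this form can be pasted straight into most editors, for example as `vim +123 file` after splitting on the colon.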

Using 'e' to accept and edit causes RuntimeError: Journal file modified concurrently

I'm not sure if I'm using this correctly in a headless system via tmux. Here's how I first started beancount-import:

beancount-import --editor vim --journal_input personal.beancount --journal_output mint.beancount --mint_data transactions.csv --account_output '.*' accounts.beancount

I then see the first transaction. I want to change the name/description. So I press 'e' and nothing happens; my window is frozen. (In another session I can see that vim is running in the background.)

I tried creating a wrapper script, tmux-vim, to launch vim in a separate pane instead, but when I do that, vim is not a child of beancount-import, so beancount-import moves on to the next transaction while I change the file in the background with vim. When I save it, I end up receiving a RuntimeError on the next transaction.

Considering improvements to ML example generation

Bottom line up front:
Unknown account prediction started failing hard for me recently. I got it back up with some hacks and want to discuss potential better solutions.

I've been a happy beancount-import user for around a year now. Recently the unknown account prediction accuracy took a nosedive and I've been investigating why. I learned how the training code builds training examples in two ways:

  • Type 1 features are metadata about the unknown account. These are great for sources like the Amazon source. The Amazon source knows price and metadata about the posting, but doesn't know which account to map it to:
  Expenses:FIXME   9.01 USD
    amazon_item_description: "Some item"
    amazon_item_quantity: 1
    amazon_seller: "Amazon.com Services LLC"
    shipped_date: 2021-01-06
  Expenses:FIXME   0.91 USD
    amazon_invoice_description: "Sales Tax"
  Liabilities:CC:Amazon     -9.92 USD
    amazon_credit_card_description: "Visa ending in 1234"
  • Type 2 features are generated when there are exactly 2 non-ignored postings. This is the main type that I actually use, though I could change that.

Example from my plaid importer:

2021-01-07 * "Caffe" "CAFFE LUSSO COFFEE gosq.com"
  Liabilities:CC:CitiCash     -44 USD
    category: "Food and Drink, Restaurants, Coffee Shop"
    date: 2021-01-07
    plaid_transaction_id: "redacted"
    source_desc: "CAFFE LUSSO COFFEE gosq.com"
  Expenses:FIXME   44 USD

I commented out Type 1 feature generation and my accuracy went way up for type 2 predictions. I'm not sure if this is because the type 2 features are getting drowned out by the type 1?

Fundamentally, I think the biggest issue is that the way we're generating examples is potentially quite different from the actual inference task: I have several sources where the import has no prediction whatsoever, but these still get examples generated for them. We're lacking the context about which postings were offered by the source as FIXME and corrected by the user.
This could probably be corrected by adding a new API for sources to optionally call during prepare().
This API would let the source generate examples. Sources already run through the journal, looking for already imported postings and suppressing those results. Instead of suppressing them, they could call this new API to report that they would have imported the posting as FIXME, but that it has since been assigned a real account. That is then a high-quality training example.
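To make the proposal concrete, here is a rough sketch of what such an API could look like. Everything in it is hypothetical: `ExampleCollector`, `add_example`, and `TrainingExample` do not exist in beancount-import, and the feature dictionary is just the metadata a source would have attached to the FIXME posting.

```python
from collections import namedtuple

# Hypothetical record of one corrected posting.
TrainingExample = namedtuple("TrainingExample", "features target_account")

FIXME_ACCOUNT = "Expenses:FIXME"


class ExampleCollector:
    """Hypothetical object a source could be handed during prepare()."""

    def __init__(self):
        self.examples = []

    def add_example(self, features, target_account):
        """Record that a posting offered as FIXME was corrected by the user."""
        if target_account == FIXME_ACCOUNT:
            # Still unknown, so there is nothing to learn from it yet.
            return
        self.examples.append(TrainingExample(dict(features), target_account))
```

A source would call `add_example` at the point where it currently suppresses an already-imported posting, passing the same features it would have attached at import time.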

I may try modifying my plaid source to put the metadata on the FIXME posting. This would make it use Type 1 features, which may work better.

2021-01-07 * "Caffe" "CAFFE LUSSO COFFEE gosq.com"
  Liabilities:CC:CitiCash     -44 USD
    date: 2021-01-07
    plaid_transaction_id: "redacted"
  Expenses:FIXME   44 USD
    category: "Food and Drink, Restaurants, Coffee Shop"
    source_desc: "CAFFE LUSSO COFFEE gosq.com"

Another option I was considering is separating type 1 and type 2 classifiers, but that would be a fairly involved change since it also affects inference.

Amazon invoices have gotten much less useful :(

Amazon invoices have changed recently, and no longer show any summaries on a per-shipment basis. This means pre-/post-tax adjustments and sales tax can no longer be calculated per shipment 😢

The changes were even made retroactively; see e.g. this example of a previously-downloaded invoice vs. the newly-generated invoice for the same order.

old-good-invoice.pdf
new-bad-invoice.pdf

I have complained to Amazon but I doubt it's going to change anything, so I plan to fix up the parser to handle the eviscerated statements this weekend; mainly opening this issue to have something to reference from the pull request.

Some work left to do on the Schwab importer

I wanted to collect a few loose ends here:

  • Add commodity directives for every new commodity that appears in transactions.
  • Handle bonds, and US Treasuries specifically. Bond symbols are CUSIPs, which are not valid Beancount symbols. Quantity * Price != Amount (-Fee).
  • Allow users to specify if commodity transactions should be in subaccount, or stay in top-level account like OFX import does it.
  • Option expiration, bond maturation.
  • Options are sold in lots of 100, even though the price is reported for 1.
  • Add/skip Income account posting for capital gains (no need when selling short, but needed when buying to close).

@carljm
Comments and suggestions are welcome.
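As a small illustration of the lots-of-100 item above: the cash amount of an option trade is the per-share price times the number of contracts times the lot size. This is a sketch with hypothetical names (`option_gross_value`, `multiplier`), ignoring fees and sign conventions.

```python
from decimal import Decimal


def option_gross_value(contracts, price_per_share, multiplier=100):
    """Gross value of an option trade quoted per share but traded in lots."""
    return contracts * price_per_share * multiplier
```

For example, 2 contracts quoted at 1.50 USD correspond to 300.00 USD of cash, not 3.00 USD, which is why a naive Quantity * Price check fails for these rows.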

`beancount-import` doesn't react to ctrl-C until the next web request

This happens both with master and the version I had installed using pip (probably the latest, 1.3.3).

I press ctrl-C in the terminal once I'm done importing. But the program doesn't quit until I cause some sort of web request to be issued. For example if I go over to the Editor tab and choose a different file, that's enough for it to actually catch it.

Does this happen to anyone else?

falsifian moth beancount $ uname -a
OpenBSD moth.falsifian.org 6.9 GENERIC.MP#490 amd64

ofx importer assumes one org per file

ofxclient has a "combined download" option where it downloads all your accounts into a single OFX file. This is pretty convenient if you have a lot of accounts, compared to downloading them one by one.

But the ofx importer in beancount-import can't handle transactions from multiple orgs in a single file, due to the code here: https://github.com/jbms/beancount-import/blob/master/beancount_import/source/ofx.py#L1133

It just looks for the first ORG element and then assumes that is the correct org for all transactions in the file.

Seems like fixing this might require a more sophisticated parsing of the OFX file format, looking at nesting etc, rather than just "find org" and then "find all transactions."
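A rough sketch of the nesting-aware approach, under loud assumptions: it assumes the combined file contains several `OFX` blocks, each with its own sign-on `ORG`, and that the data is well-formed XML. The `COMBINED` wrapper element and the sample data are made up for illustration, and real OFX 1.x files are SGML and would need a more tolerant parser.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Made-up sample: two OFX blocks from different institutions in one file.
SAMPLE = """<COMBINED>
  <OFX>
    <SIGNONMSGSRSV1><SONRS><FI><ORG>BankA</ORG></FI></SONRS></SIGNONMSGSRSV1>
    <BANKMSGSRSV1><STMTTRNRS><STMTRS>
      <BANKACCTFROM><ACCTID>111</ACCTID></BANKACCTFROM>
    </STMTRS></STMTTRNRS></BANKMSGSRSV1>
  </OFX>
  <OFX>
    <SIGNONMSGSRSV1><SONRS><FI><ORG>BrokerB</ORG></FI></SONRS></SIGNONMSGSRSV1>
    <INVSTMTMSGSRSV1><INVSTMTTRNRS><INVSTMTRS>
      <INVACCTFROM><ACCTID>222</ACCTID></INVACCTFROM>
    </INVSTMTRS></INVSTMTTRNRS></INVSTMTMSGSRSV1>
  </OFX>
</COMBINED>"""


def accounts_by_org(xml_text):
    """Map each ORG to the account IDs found inside the same OFX block."""
    root = ET.fromstring(xml_text)
    result = defaultdict(list)
    for ofx in root.iter("OFX"):
        # Look up the org within this block only, not the first one in the file.
        org = ofx.findtext(".//ORG", default="")
        for acct in ofx.iter("ACCTID"):
            result[org].append(acct.text)
    return dict(result)
```

The point is only that the org lookup is scoped to the enclosing block, so each statement is paired with its own institution.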

Feature: Group candidates by payee

I typically have a large number of transactions for any given payee, all of which will be categorized as the same account (e.g. all purchases at the grocery store are classified as Expenses:Groceries).

Currently, all those purchases are spread out, so I have to pay extra attention when importing.

If instead I knew I'd get all the transactions from the same store one after the other, I could be a bit more aggressive with my hammering of the Enter key.
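The grouping itself could be as simple as a stable sort keyed on where each payee first appears, which keeps the original order within each payee. A minimal sketch, assuming pending entries are represented as `(payee, transaction)` pairs; that representation and the name `group_by_payee` are illustrative, not how beancount-import stores them.

```python
def group_by_payee(pending):
    """Reorder (payee, transaction) pairs so each payee's entries are adjacent."""
    first_seen = {}
    for index, (payee, _) in enumerate(pending):
        first_seen.setdefault(payee, index)
    # sorted() is stable, so entries of one payee keep their relative order.
    return sorted(pending, key=lambda item: first_seen[item[0]])
```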

beancount_import 1.3.3 doesn't contain `generic_importer_source`

Hi! Thanks for this project, it helps a lot.
Here's a thing I noticed: beancount-import 1.3.3 (the latest version at the time of writing), as published on PyPI, doesn't seem to contain generic_importer_source. I get an error when specifying it as a data source.

See for example:

wget https://files.pythonhosted.org/packages/cb/b3/a4fbc28c957c8ff5ce5686ec71d8df301b4d61a73b839fa6f7b4960b2ff5/beancount-import-1.3.3.zip
unzip beancount-import-1.3.3.zip
ls beancount-import-1.3.3/beancount_import/source
amazon_invoice.py             __init__.py           stockplanconnect.py
amazon_invoice_sanitize.py    link_based_source.py  stockplanconnect_statement.py
amazon_invoice_test.py        mint.py               ultipro_google.py
amazon.py                     mint_test.py          ultipro_google_statement.py
amazon_test.py                ofx.py                venmo.py
description_based_source.py   ofx_sanitize.py       venmo_sanitize.py
google_purchases.py           ofx_test.py           venmo_test.py
google_purchases_sanitize.py  paypal.py             waveapps.py
google_purchases_test.py      paypal_sanitize.py    waveapps_test.py
healthequity.py               paypal_test.py
healthequity_test.py          source_test.py

If I install the project from the master branch, everything works as expected:

(install from master branch as per README instructions)
ls ~/.local/lib/python3.9/site-packages/beancount_import/source 
amazon_invoice.py                healthequity_test.py  schwab_csv.py
amazon_invoice_sanitize.py       __init__.py           schwab_csv_test.py
amazon_invoice_test.py           link_based_source.py  source_test.py
amazon.py                        mint.py               stockplanconnect.py
amazon_test.py                   mint_test.py          stockplanconnect_statement.py
description_based_source.py      ofx.py                ultipro_google.py
generic_importer_source.py       ofx_sanitize.py       ultipro_google_statement.py
generic_importer_source_test.py  ofx_test.py           venmo.py
google_purchases.py              paypal.py             venmo_sanitize.py
google_purchases_sanitize.py     paypal_sanitize.py    venmo_test.py
google_purchases_test.py         paypal_test.py        waveapps.py
healthequity.py                  __pycache__           waveapps_test.py

I guess the generic importer source will only be available in a future release of the project; it might be worth mentioning this in the README. Thanks!
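A quick way to check whether an installed copy includes the module, without unpacking the archive, is to ask Python's import machinery. A small sketch; `has_module` is a hypothetical helper name.

```python
import importlib.util


def has_module(dotted_name):
    """Return True if the named module can be located in this environment."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of the dotted name is not installed.
        return False
```

Calling `has_module("beancount_import.source.generic_importer_source")` should then return False for the 1.3.3 package listed above and True for an install from master.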

beancount_import matching broken for `securities_and_cash` ofx account?

I just tried beancount-import with an investment account for the first time, and I'm having trouble getting it to match any of my existing transactions.

After some fiddling, I learned that if I take a transaction generated by beancount-import and manually delete the metadata, beancount-import won't realize it's the same transaction, and will try to generate it again.

(Some numbers below are replaced with XXX for privacy.)

As an example, with the below stripped-down beancount file and run_beancount_import.py, beancount-import generates the following new directives:

2020-12-30 * "SELLSTOCK - BANK MONTREAL QUEBEC"
  Assets:Brokerage:BMO         -13 BMO {} @ 76.03 USD
    date: 2020-12-30
    ofx_fitid: "XXX"
    ofx_memo: "BANK MONTREAL QUEBEC"
    ofx_type: "SELLSTOCK"
  Income:Capital-gains:BMO
  Assets:Brokerage:Cash     988.37 USD
    ofx_fitid: "XXX"
  Expenses:Fees               0.02 USD

2020-12-30 open Assets:Brokerage:BMO                            BMO

2020-12-30 open Income:Capital-gains:BMO                        USD

2020-12-30 open Assets:Brokerage:Cash                           USD

If I edit that transaction so that it instead reads:

2020-12-30 * "SELLSTOCK - BANK MONTREAL QUEBEC"
  Assets:Brokerage:BMO         -13 BMO {} @ 76.03 USD
  Income:Capital-gains:BMO
  Assets:Brokerage:Cash     988.37 USD
  Expenses:Fees               0.02 USD

then suddenly beancount-import won't match it any more, and wants to add another copy of the transaction. I was not able to get into a situation where beancount-import is willing to augment a transaction I've already entered by adding in the appropriate metadata.

Here is my main.beancount with account ID censored:

2000-01-01 open Income:Capital-gains
2000-01-01 open Income:Dividends
2000-01-01 open Income:Interest
2000-01-01 open Expenses:Fees

1792-04-02 commodity USD
  cusip: "9999101"

2000-01-01 open Assets:Brokerage
  ofx_org: ""
  ofx_broker_id: "Wells Fargo Advisors"
  ofx_account_type: "securities_and_cash"
  account_id: "XXX"
  capital_gains_account: Income:Capital-gains
  fees_account: Expenses:Fees
  div_income_account: Income:Dividends
  interest_income_account: Income:Interest

run_beancount_import.py contains:

import beancount_import.webserver

def run_reconcile():
    data_sources = [
        {
            "module": "beancount_import.source.ofx",
            "ofx_filenames": ("export.ofx",)
        },
    ]

    beancount_import.webserver.main(
        argv = (),
        journal_input = "main.beancount",
        ignored_journal = "main.beancount",
        default_output = "main.beancount",
        open_account_output = "main.beancount",
        balance_account_output = "main.beancount",
        data_sources = data_sources,
    )

if __name__ == "__main__":
    run_reconcile()

If any details from export.ofx would be useful, let me know.

High-ish idle CPU usage

When I've got Beancount-import running, I notice that it appears to idle at around 3% (on an arbitrary machine) as viewed in top. This is when the front-end is not open in a browser. Just based on a cursory code inspection, my guess is that this may be due to the check for journal modifications every 100 milliseconds in webserver.py:

       self.check_modification_timer = tornado.ioloop.PeriodicCallback(
            self._check_modification, 100)

Is there any reason this check is so frequent? Just to get fast updates to the user?

How does deduplication work?

I have a simple issue. In my transactions.beancount (which is included by my journal), I have this:

2018-07-26 * "Narration"
  Assets:Cash:Bank1                               -1547.95 USD
  Assets:Cash:Bank2                                1547.95 USD

In my importer's output, I have this:

2018-07-26 * "Payee" "Narration"
  txn_id: "ID" 
  Assets:Cash:Bank1                               -1547.95 USD 
    source_desc: "Other bank name" 
  Expenses:FIXME                                   1547.95 USD

However, beancount-import doesn't suggest the existing transaction as a match for the imported one.

I had a look around the code to see if I could figure out how to fix it myself but I couldn't find where it's implemented. How can I debug this?

FR: automatically remove "!" flags

I often flag transactions or individual postings with "!" if I enter them before my bank tells me about them. It would be nice if beancount-import removed these when I import the corresponding transactions.

I might try implementing this myself.
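The transformation itself is small. Here is a sketch using simplified stand-ins for Beancount's `Transaction` and `Posting` tuples (the real ones have more fields); `clear_pending_flags` is a hypothetical name, and where this would hook into the merge step is left open.

```python
from collections import namedtuple

# Simplified stand-ins for the Beancount data types (assumption).
Posting = namedtuple("Posting", "account flag")
Transaction = namedtuple("Transaction", "flag narration postings")


def clear_pending_flags(txn):
    """Turn a '!' transaction into '*' and drop '!' flags from its postings."""
    postings = [
        p._replace(flag=None) if p.flag == "!" else p for p in txn.postings
    ]
    flag = "*" if txn.flag == "!" else txn.flag
    return txn._replace(flag=flag, postings=postings)
```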

Adding an OFX converter boosts up the usability of beancount importer

Half a year ago I stumbled upon this utility and I was impressed. I wanted to use it for a lot of European financial institutions where I had accounts. But then it got difficult. The pluggable architecture of beancount-import actually leads to a lock-in. If I write a converter I can only use it with beancount-import. So after some thought I switched to another idea: if I just write an OFX converter I can use the beancount-import OFX converter. And then I found ofxstatement which gave me exactly all I needed. I have created the following ofxstatement plugins:

These (or any other ofxstatement) plugins can be invoked using the convert2ofx routine in ofx.py. There is also an example.

Import of OFX CHECKNUM tag fails for a non-numeric value

When processing my OFX file it fails with this error:

Traceback (most recent call last):
File "C:\Users\gjpau\AppData\Local\Programs\Python\Python38\lib\site-packages\beancount\core\number.py", line 96, in D
return Decimal(_CLEAN_NUMBER_RE.sub('', strord))
decimal.InvalidOperation: [<class 'decimal.ConversionSyntax'>]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
...
File "c:\users\gjpau\documents\github\beancount-import\beancount_import\source\ofx.py", line 1394, in prepare
state.get_accounts_and_entries()
File "c:\users\gjpau\documents\github\beancount-import\beancount_import\source\ofx.py", line 1249, in get_accounts_and_entries
statement.get_entries(self)
File "c:\users\gjpau\documents\github\beancount-import\beancount_import\source\ofx.py", line 843, in get_entries
posting_meta[CHECK_KEY] = D(stripped_checknum)
File "C:\Users\gjpau\AppData\Local\Programs\Python\Python38\lib\site-packages\beancount\core\number.py", line 104, in D
raise ValueError("Impossible to create Decimal instance from {!s}: {}".format(
ValueError: Impossible to create Decimal instance from 8OOMBV: [<class 'decimal.ConversionSyntax'>]

c:\users\gjpau\appdata\local\programs\python\python38\lib\site-packages\beancount\core\number.py(104)D()
-> raise ValueError("Impossible to create Decimal instance from {!s}: {}".format(

When I comment out all CHECKNUM tags, the import succeeds.

The failing value is: 08OOMBV

I have read the latest OFX specification, which gives the check number the format A-12. However, the A just means any character, not necessarily a digit.


Check (or other reference) number, A-12


Character fields are identified with a data type of “A-n”, where n is the maximum number of allowed
Unicode characters.
Note: n refers to the number of characters in the resultant string. Each multi-byte or encoded
character counts as a single character. UTF-8 encodes “high” Latin-1 characters (decimal 128-
255) using two bytes, and double-byte characters using three bytes. In addition, XML encodes
ampersands, less-than symbols, greater-than symbols, and spaces (where required) using multicharacter escape strings (see section 2.3.1.1). Therefore, an element of type A-40 may require
more than 40 bytes in a UTF-8-encoded XML stream.
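Given that the field is free-form text, one tolerant approach is to convert the value to a number only when it consists solely of digits and otherwise keep it as a string. A minimal sketch; `parse_checknum` is a hypothetical helper, not the code in ofx.py.

```python
from decimal import Decimal


def parse_checknum(raw):
    """Return a Decimal for purely numeric check numbers, else the raw text."""
    stripped = raw.strip()
    if stripped.isascii() and stripped.isdigit():
        return Decimal(stripped)
    return stripped
```

With this, `08OOMBV` would be stored as text, while an ordinary check number such as `0123` would still become a number.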

Tornado - async IO throwing NotImplementedError

Used to work fine until I reinstalled Windows. I used to have 3.7.something Python. It could be some other dependency.

Steps to reproduce:

  1. Fresh windows
  2. Install VS build tools
  3. Get Python 3.8.1
  4. python -m pip install --upgrade pip
  5. pip install beancount-import
  6. Run beancount-import
Traceback (most recent call last):
  File "C:\Users\Tirae\OneDrive\Ledger\run_win.py", line 33, in <module>
    run_reconcile(sys.argv[1:])
  File "C:\Users\Tirae\OneDrive\Ledger\run_win.py", line 19, in run_reconcile
    beancount_import.webserver.main(
  File "C:\Users\Tirae\AppData\Local\Programs\Python\Python38-32\lib\site-packages\beancount_import\webserver.py", line 737, in main
    http_server.add_sockets(sockets)
  File "C:\Users\Tirae\AppData\Local\Programs\Python\Python38-32\lib\site-packages\tornado\tcpserver.py", line 165, in add_sockets
    self._handlers[sock.fileno()] = add_accept_handler(
  File "C:\Users\Tirae\AppData\Local\Programs\Python\Python38-32\lib\site-packages\tornado\netutil.py", line 279, in add_accept_handler
    io_loop.add_handler(sock, accept_handler, IOLoop.READ)
  File "C:\Users\Tirae\AppData\Local\Programs\Python\Python38-32\lib\site-packages\tornado\platform\asyncio.py", line 99, in add_handler
    self.asyncio_loop.add_reader(fd, self._handle_events, fd, IOLoop.READ)
  File "C:\Users\Tirae\AppData\Local\Programs\Python\Python38-32\lib\asyncio\events.py", line 501, in add_reader
    raise NotImplementedError
NotImplementedError
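A commonly documented workaround for this error on Windows with Python 3.8 is to switch asyncio back to the selector event loop before Tornado starts, because the proactor loop that became the default in 3.8 does not implement `add_reader`. A minimal sketch, assuming it is called at the top of the launcher script (run_win.py here) before `beancount_import.webserver.main`; the helper name is hypothetical, and newer Tornado releases may make this unnecessary.

```python
import asyncio
import sys


def use_selector_event_loop_on_windows():
    """Select the selector-based event loop policy on Windows; no-op elsewhere."""
    if sys.platform == "win32" and sys.version_info >= (3, 8):
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
        return True
    return False
```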

The balance assertions of the mint source are too early.

I often get balance failures because beancount evaluates balance assertions at the beginning of the day, while the account balance reported by beancount-import seems to be calculated at the end of the day.
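If the reported balance really is an end-of-day value, the usual adjustment is to date the assertion one day later, since an assertion dated D is checked before any of D's transactions. A small sketch with a hypothetical helper name:

```python
import datetime


def assertion_date_for_end_of_day_balance(balance_date):
    """Date a balance assertion so it covers all transactions on balance_date."""
    return balance_date + datetime.timedelta(days=1)
```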

HTTPS

This is a feature request to support HTTPS, not in beancount-import directly, but in any reverse proxy that's wrapping it. Let me back up a bit: My use case is that I'd like to securely host beancount-import on an HTTPS web server so that both I and my partner can collaborate on processing our shared pending transactions. I realize that the typical use case is to run beancount-import locally and then connect to localhost via HTTP, but I'd prefer to run it on an actual web server if possible to aid collaboration.

I'm wrapping beancount-import with nginx to provide the HTTPS termination, and then proxy passing requests on to the local beancount-import web server. The problem appears to be that frontend/server_connection.ts includes ws://, hard-coded:

"ws://" + window.location.host + "/" + secretKey + "/websocket"

This results in a SecurityError: The operation is insecure. console error in Firefox when using an HTTPS URL. This is because Firefox prevents HTTPS web pages from accessing non-secured web sockets. So in this case, what I need is for the front-end to access the web socket via a secured wss:// URL. I imagine one way to support this properly would be to fork on ws:// or wss:// based on the current page's protocol (HTTP or HTTPS, respectively).

Before I try to do this, I wanted to get a read on whether this is something you'd entertain. Is this use case too far outside the bounds of how you see beancount-import being used? Thanks!
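The selection logic I have in mind is tiny. It is sketched here in Python purely for illustration; the real change would live in the TypeScript front-end and read the scheme from the page's location. The function name and parameters are hypothetical.

```python
def websocket_url(page_scheme, host, secret_key):
    """Pick wss:// for pages served over HTTPS and ws:// otherwise."""
    ws_scheme = "wss" if page_scheme == "https" else "ws"
    return f"{ws_scheme}://{host}/{secret_key}/websocket"
```

Plain-HTTP localhost setups would keep getting the same `ws://` URL as today.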

generic_importer_source examples fail on relative path

Freshly checked out master. Same issue for the manually_entered examples.
@dumbPy

eugeniu@home:~/beancount-import/examples/fresh$ python3 run.py 
Listening at http://127.0.0.1:8101
../data/importers/creditcard.csv
Traceback (most recent call last):
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/webserver.py", line 502, in _handle_reconciler_loaded
    loaded_reconciler = loaded_future.result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
    raise self._exception
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/thread_helpers.py", line 13, in wrapper
    f.set_result(fn(*args, **kwargs))
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/reconcile.py", line 382, in __init__
    self._load_sources()
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/reconcile.py", line 435, in _load_sources
    sources = self.sources = [
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/reconcile.py", line 436, in <listcomp>
    load_source(spec, log_status=self.reconciler.log_status)
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/source/__init__.py", line 319, in load_source
    return m.load(source_spec, log_status=log_status)  # type: ignore
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/source/generic_importer_source.py", line 160, in load
    return ImporterSource(log_status=log_status, **spec)
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/source/generic_importer_source.py", line 45, in __init__
    files = [get_file(f) for f in
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/source/generic_importer_source.py", line 45, in <listcomp>
    files = [get_file(f) for f in
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount/ingest/cache.py", line 136, in get_file
    raise e
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount/ingest/cache.py", line 132, in get_file
    assert path.isabs(filename), (
AssertionError: Path should be absolute in order to guarantee a single call. ../data/importers/creditcard.csv
> /home/eugeniu/.local/lib/python3.8/site-packages/beancount/ingest/cache.py(132)get_file()
-> assert path.isabs(filename), (
(Pdb) 
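The assertion comes from beancount's ingest cache, which requires absolute paths. One possible fix in the example scripts is to resolve the data file paths against the script's directory before passing them on. A minimal sketch with a hypothetical helper; in run.py the base directory would typically be `os.path.dirname(os.path.abspath(__file__))`.

```python
import os


def to_absolute(filenames, base_dir):
    """Resolve each filename against base_dir and return absolute paths."""
    return [os.path.abspath(os.path.join(base_dir, f)) for f in filenames]
```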
