jbms / beancount-import

Web UI for semi-automatically importing external data into beancount

License: GNU General Public License v2.0

Python 83.49% HTML 0.02% TypeScript 15.26% JavaScript 0.69% CSS 0.54%

beancount

beancount-import's Issues

HTTPS

This is a feature request to support HTTPS, not in beancount-import directly, but in any reverse proxy that's wrapping it. Let me back up a bit: my use case is that I'd like to securely host beancount-import on an HTTPS web server so that both my partner and I can collaborate on processing our shared pending transactions. I realize that the typical use case is to run beancount-import locally and then connect to localhost via HTTP, but I'd prefer to run it on an actual web server if possible to aid collaboration.

I'm wrapping beancount-import with nginx to provide the HTTPS termination, and then proxy passing requests on to the local beancount-import web server. The problem appears to be that frontend/server_connection.ts includes ws://, hard-coded:

"ws://" + window.location.host + "/" + secretKey + "/websocket"

This results in a "SecurityError: The operation is insecure." console error in Firefox when using an HTTPS URL, because Firefox prevents HTTPS web pages from accessing non-secured web sockets. So in this case, what I need is for the front end to access the web socket via a secured wss:// URL. I imagine one way to support this properly would be to choose ws:// or wss:// based on the current page's protocol (HTTP or HTTPS, respectively).
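A minimal sketch of that scheme selection, for illustration only. The real front end is TypeScript, where the same check would look at window.location.protocol; the Python function name websocket_url and the example host are hypothetical.

```python
from urllib.parse import urlsplit


def websocket_url(page_url: str, secret_key: str) -> str:
    """Build the websocket URL, matching the security of the page URL."""
    parts = urlsplit(page_url)
    # An HTTPS page must use a secured websocket (wss); plain HTTP keeps ws.
    scheme = "wss" if parts.scheme == "https" else "ws"
    return f"{scheme}://{parts.netloc}/{secret_key}/websocket"


if __name__ == "__main__":
    print(websocket_url("https://ledger.example.org/", "KEY"))
    print(websocket_url("http://localhost:8101/", "KEY"))
```

With this approach the local HTTP use case keeps working unchanged, and only pages served over HTTPS switch to wss://.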

Before I try to do this, I wanted to get a read on whether this is something you'd entertain. Is this use case too far outside the bounds of how you see beancount-import being used? Thanks!

Generic CSV importer

Hi. I've recently opened an account in a bank that doesn't support OFX export.

I'd like to write a general CSV importer, but I'm having a hard time even starting. Where do I look? I found venmo.py, which is probably the closest to plain CSV, but cutting it down to my needs is still a bit of a stretch.

The CSV structure looks sufficient to uniquely match transactions, and it includes a description and so on:

Account type; Account number; Currency; Date (dd.mm.yyyy); Reference; Description; Credit; Debit

Reference is somewhat unique and looks like this:

CRD_3629XM - operation with card

A102705190005380 - operation with account

HOLD - you guessed it

Obviously as a lazy person I'd like to just have a working solution in some next release, but steering me in the direction on how to write it myself would be nice too.
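As a starting point, here is a rough sketch of just the parsing step for a file with the columns listed above. It is not a beancount-import source. It assumes the file is semicolon-delimited with no header row and uses a decimal point in amounts; the field names and the sample row are made up for illustration.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal

# Column order taken from the issue text; the names themselves are hypothetical.
FIELDS = ["account_type", "account_number", "currency", "date",
          "reference", "description", "credit", "debit"]


def parse_rows(text: str) -> list:
    """Parse semicolon-separated bank rows into plain dictionaries."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=FIELDS,
                            delimiter=";", skipinitialspace=True)
    rows = []
    for raw in reader:
        credit = Decimal(raw["credit"].strip() or "0")
        debit = Decimal(raw["debit"].strip() or "0")
        rows.append({
            "date": datetime.strptime(raw["date"].strip(), "%d.%m.%Y").date(),
            "currency": raw["currency"].strip(),
            "reference": raw["reference"].strip(),
            "description": raw["description"].strip(),
            # Positive for money in, negative for money out.
            "amount": credit - debit,
        })
    return rows


if __name__ == "__main__":
    sample = "Current;12345;USD;26.07.2018;CRD_3629XM;Coffee shop;;4.50\n"
    print(parse_rows(sample))
```

The reference field could then serve as part of the key used to recognize already-imported rows, though HOLD entries would need separate handling since that value is not unique.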

Tornado - async IO throwing NotImplementedError

Used to work fine until I reinstalled Windows. I used to have 3.7.something Python. It could be some other dependency.

Steps to reproduce:

Fresh windows
Install VS build tools
Get Python 3.8.1
python -m pip install --upgrade pip
pip install beancount-import
Run beancount-import

Traceback (most recent call last):
  File "C:\Users\Tirae\OneDrive\Ledger\run_win.py", line 33, in <module>
    run_reconcile(sys.argv[1:])
  File "C:\Users\Tirae\OneDrive\Ledger\run_win.py", line 19, in run_reconcile
    beancount_import.webserver.main(
  File "C:\Users\Tirae\AppData\Local\Programs\Python\Python38-32\lib\site-packages\beancount_import\webserver.py", line 737, in main
    http_server.add_sockets(sockets)
  File "C:\Users\Tirae\AppData\Local\Programs\Python\Python38-32\lib\site-packages\tornado\tcpserver.py", line 165, in add_sockets
    self._handlers[sock.fileno()] = add_accept_handler(
  File "C:\Users\Tirae\AppData\Local\Programs\Python\Python38-32\lib\site-packages\tornado\netutil.py", line 279, in add_accept_handler
    io_loop.add_handler(sock, accept_handler, IOLoop.READ)
  File "C:\Users\Tirae\AppData\Local\Programs\Python\Python38-32\lib\site-packages\tornado\platform\asyncio.py", line 99, in add_handler
    self.asyncio_loop.add_reader(fd, self._handle_events, fd, IOLoop.READ)
  File "C:\Users\Tirae\AppData\Local\Programs\Python\Python38-32\lib\asyncio\events.py", line 501, in add_reader
    raise NotImplementedError
NotImplementedError
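A commonly cited cause of this kind of traceback: Python 3.8 made the Proactor event loop the default on Windows, and that loop does not implement add_reader, which Tornado releases of that time relied on. Assuming that is what is happening here, a possible workaround is to switch to the selector-based policy before starting the server. The helper name below is hypothetical.

```python
import asyncio
import sys


def use_selector_event_loop_on_windows() -> bool:
    """Select the selector-based asyncio policy on Windows with Python 3.8+.

    Returns True if the policy was changed, False on other platforms.
    """
    if sys.platform == "win32" and sys.version_info >= (3, 8):
        # The attribute only exists on Windows, so it is accessed only here.
        asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
        return True
    return False


if __name__ == "__main__":
    print(use_selector_event_loop_on_windows())
```

In a launcher script such as run_win.py, this would be called before beancount_import.webserver.main is invoked.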

How does deduplication work?

I have a simple issue. In my transactions.beancount (which is included by my journal), I have this:

2018-07-26 * "Narration"
  Assets:Cash:Bank1                               -1547.95 USD
  Assets:Cash:Bank2                                1547.95 USD

In my importer's output, I have this:

2018-07-26 * "Payee" "Narration"
  txn_id: "ID" 
  Assets:Cash:Bank1                               -1547.95 USD 
    source_desc: "Other bank name" 
  Expenses:FIXME                                   1547.95 USD

However, beancount-import's suggestions don't include the duplicate.

I had a look around the code to see if I could figure out how to fix it myself but I couldn't find where it's implemented. How can I debug this?

Tornado uncaught exception when running example

Hi, I'm trying to run the manually_entered example (same thing happens for fresh example too), but can't find a way around this error which is repeatedly printed to the console:

ERROR:tornado.application:Uncaught exception GET /BEANCOUNT_IMPORT_SECRET_KEY_f31e30e228b14bf2e354100c5305bfdab40d608f/websocket (127.0.0.1)
HTTPServerRequest(protocol='http', host='localhost:8101', method='GET', uri='/BEANCOUNT_IMPORT_SECRET_KEY_f31e30e228b14bf2e354100c5305bfdab40d608f/websocket', version='HTTP/1.1', remote_ip='127.0.0.1')
Traceback (most recent call last):
  File "/Users/rbharvey/.virtualenvs/ledger/lib/python3.6/site-packages/tornado/websocket.py", line 952, in _accept_connection
    open_result = handler.open(*handler.open_args, **handler.open_kwargs)
  File "/Users/rbharvey/.virtualenvs/ledger/lib/python3.6/site-packages/beancount_import/webserver.py", line 267, in open
    self.set_nodelay(True)
  File "/Users/rbharvey/.virtualenvs/ledger/lib/python3.6/site-packages/tornado/websocket.py", line 561, in set_nodelay
    assert self.stream is not None
AssertionError

In the browser, it sort of flickers between what appears to be the web UI and the following error message: Server connection closed, waiting to reconnect.

High-ish idle CPU usage

When I've got Beancount-import running, I notice that it appears to idle at around 3% (on an arbitrary machine) as viewed in top. This is when the front-end is not open in a browser. Just based on a cursory code inspection, my guess is that this may be due to the check for journal modifications every 100 milliseconds in webserver.py:

       self.check_modification_timer = tornado.ioloop.PeriodicCallback(
            self._check_modification, 100)

Is there any reason this check is so frequent? Just to get fast updates to the user?

Balance assertions from the mint source are dated too early

I often get balance failures because beancount evaluates balance assertions at the beginning of the day, while the account balance from beancount-import seems to be calculated at the end of the day.
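To make the mismatch concrete: beancount checks a balance assertion before any transactions on its date, so a balance observed at the end of day D would need to be asserted on D + 1. A minimal sketch of that shift follows; the function name is hypothetical and this is not how the mint source is known to be implemented.

```python
from datetime import date, timedelta


def assertion_date(balance_as_of: date) -> date:
    """Date for a balance assertion given an end-of-day balance date."""
    # Beancount evaluates assertions at the start of the day, so an
    # end-of-day balance belongs to the following day's assertion.
    return balance_as_of + timedelta(days=1)


if __name__ == "__main__":
    print(assertion_date(date(2018, 10, 31)))
```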

Add payee shortcut to frontend

Just like pressing n allows editing the narration, there should be a "press p to add or edit the payee" shortcut.
From transaction_line_editor.tsx, I think it should be trivial, given that the payeeStart and payeeEnd indexes are already tracked in TransactionLineParseResult and there's an editNarration method for reference.
That being said, I have rarely worked on front ends, and setting one up would take me a lot of effort. 😞
It would be great if someone could add it.

Amazon invoices have gotten much less useful :(

Amazon invoices have changed recently, and no longer show any summaries on a per-shipment basis. This means pre-/post-tax adjustments and sales tax can no longer be calculated per shipment 😢

The changes were even made retroactively; see e.g. this example of a previously-downloaded invoice vs. the newly-generated invoice for the same order.

old-good-invoice.pdf
new-bad-invoice.pdf

I have complained to Amazon but I doubt it's going to change anything, so I plan to fix up the parser to handle the eviscerated statements this weekend; mainly opening this issue to have something to reference from the pull request.

Javascript `TypeError: o.filename is null` when a beancount error has no filename

I have a custom beancount plugin that generates errors where the filename and line number are not specified. (I think they're set to None in Python.) (The error is about a global constraint violation, so it doesn't really make sense to tie it to a particular line or file.)

This kind of error seems to crash the beancount-import UI: when I load http://localhost:8101/#errors+pending the page is blank, and in the Firefox developer console I see

TypeError: o.filename is null
    render http://localhost:8101/#errors+pending:103
    Io http://localhost:8101/#errors+pending:40
    Po http://localhost:8101/#errors+pending:40
    ha http://localhost:8101/#errors+pending:40
    pa http://localhost:8101/#errors+pending:40
    Xa http://localhost:8101/#errors+pending:40
    qa http://localhost:8101/#errors+pending:40
    $a http://localhost:8101/#errors+pending:40
    Ua http://localhost:8101/#errors+pending:40
    ya http://localhost:8101/#errors+pending:40
    enqueueSetState http://localhost:8101/#errors+pending:40

Let me know if more detail or a reproducible example would be helpful. I'm being kind of lazy here because I'm guessing the cause of the error will be obvious.

Transaction [Cost]

I am trying to create an import file for an account, but I'm having trouble being able to import a cost for a transaction. Not sure whether this is a beancount specific issue or if it's related to beancount-import specifically (which I think it is).

YYYY-MM-DD [txn|Flag] [[Payee] Narration]
   [Flag] Account       Amount [{Cost}] [@ Price]
   [Flag] Account       Amount [{Cost}] [@ Price]

I'm trying to add a Cost to my transactions from this code

transaction = Transaction(
            meta=None,
            date=account_entry.date,
            flag=FLAG_OKAY,
            payee=None,
            narration=account_entry.source_desc,
            tags=EMPTY_SET,
            links=EMPTY_SET,
            postings=[
                Posting(
                    account=account_entry.account,
                    units=account_entry.value,
                    cost=None,
                    price=None,
                    flag=None,
                    meta=collections.OrderedDict(
                        source_desc=account_entry.source_desc,
                        date=account_entry.date,
                    )),
                Posting(
                    account=FIXME_ACCOUNT,
                    units=-account_entry.amount,
                    cost=account_entry.cost,
                    price=account_entry.rate,
                    flag=None,
                    meta=None,
                ),
                Posting(
                    account=FIXME_ACCOUNT,
                    units=-account_entry.fee,
                    cost=None,
                    price=None,
                    flag=None,
                    meta=None,
                ),
            ])

The printed transaction object shows the actual cost properly, but in the rendered transaction the cost shows up as just {}:

+2020-01-01 * ""
+  Assets:Account      589.03000 USD
+    date: 2020-01-01
+    source_desc: ""
+  1Expenses:FIXME    -500.00 EUR {} @ 1.17512 USD
+  2Expenses:FIXME    1.47000 USD

This is the output for print(transaction) which shows the actual cost properly

Transaction(meta=None, date=datetime.date(2020, 01, 01), flag='*', payee=None, narration='', tags=frozenset(), links=frozenset(), 
postings=[Posting(account='Assets:Account', units=589.03000 USD, cost=None, price=None, flag=None, meta=OrderedDict([('source_desc', ''), ('date', datetime.date(2020, 01, 01))])), Posting(account='Expenses:FIXME', units=-500.00 EUR, cost=0.8509769215058887602968207502 USD, price=1.17512 USD, flag=None, meta=None),
 Posting(account='Expenses:FIXME', units=1.47000 USD, cost=None, price=None, flag=None, meta=None)])

I create the cost the same way as rate, amount or value, so I'm not sure why this isn't working:

entries.append(  AccountEntry(
                            account=account,
                            date=date,
                            source_desc=source,
                            amount=Amount(number=number, currency=currency),
                            filename=filename,
                            line=line_i + 1,
                            value=Amount(number=number_value, currency=currency_value),
                            fee=None, type_=type_,
                            rate=Amount(number=rate_number, currency=rate_currency),
                            cost=Amount(number=cost_number, currency=cost_currency)))

Price fetch reads yesterday's price

This is what I got in the prices_output_map file:
2018-10-31 price FAS 56.4500000 USD
And
(pyenv) [xxxx@av beancount]$ bean-price -e USD:yahoo/FAS
I got output of
2018-10-31 price FAS 58.5499999999999971578290569595992565155029296875 USD
According to

58.55 is 10-31's close price and 56.45 is 10-30's close.

I spent some time trying to fix this, but I cannot find any obvious place where this price is fetched.
Would you please take a look? Or please let me know which code I should look at so I can attempt a fix.

Failed assertion for Fidelity accounts

Importing ofx files from my Fidelity 401k account for BUY transactions is problematic.

In ofx.py:1040 there's this assertion:
if raw.trantype in STOCK_BUY_SELL_TYPES:
    assert abs(total + fee_total +
               (units * unitprice)) < TOLERANCE, abs(
                   total + fee_total + (units * unitprice))

The unitprice is not "correct" in the ofx file, so the tolerance is not met. The units and total are correct and match the statement.

The cost_spec is computed on line 933, using unitprice, before we determine if a fee was charged.

What I think should happen is that total and fee_total should be computed before the cost_spec initialization, and instead of using "number_per", we should use "number_total" with -(total+fee_total) as its value, and ignore potentially problematic unitprice values.
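The arithmetic behind that proposal can be sketched as follows. This only illustrates the idea of deriving the cost from the cash amounts and is not the code in ofx.py; the function names and the numbers are hypothetical. The sign convention follows the assertion above, where total is negative for a buy.

```python
from decimal import Decimal


def total_cost(total: Decimal, fee_total: Decimal) -> Decimal:
    """Total cost of a buy, taken from cash amounts instead of unitprice."""
    return -(total + fee_total)


def implied_unit_price(units: Decimal, total: Decimal,
                       fee_total: Decimal) -> Decimal:
    """Per-unit price implied by the totals, for comparison with unitprice."""
    return total_cost(total, fee_total) / units


if __name__ == "__main__":
    # Hypothetical buy: 4 units for 100.00 in cash, no fee.
    print(total_cost(Decimal("-100.00"), Decimal("0")))
    print(implied_unit_price(Decimal("4"), Decimal("-100.00"), Decimal("0")))
```

Using the total this way means the assertion holds by construction, and a rounded or otherwise unreliable unitprice in the file no longer matters.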

generic_importer_source examples fail on relative path

Freshly checked out master. Same issue for the manually_entered examples.
@dumbPy

eugeniu@home:~/beancount-import/examples/fresh$ python3 run.py 
Listening at http://127.0.0.1:8101
../data/importers/creditcard.csv
Traceback (most recent call last):
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/webserver.py", line 502, in _handle_reconciler_loaded
    loaded_reconciler = loaded_future.result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
    raise self._exception
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/thread_helpers.py", line 13, in wrapper
    f.set_result(fn(*args, **kwargs))
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/reconcile.py", line 382, in __init__
    self._load_sources()
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/reconcile.py", line 435, in _load_sources
    sources = self.sources = [
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/reconcile.py", line 436, in <listcomp>
    load_source(spec, log_status=self.reconciler.log_status)
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/source/__init__.py", line 319, in load_source
    return m.load(source_spec, log_status=log_status)  # type: ignore
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/source/generic_importer_source.py", line 160, in load
    return ImporterSource(log_status=log_status, **spec)
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/source/generic_importer_source.py", line 45, in __init__
    files = [get_file(f) for f in
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount_import/source/generic_importer_source.py", line 45, in <listcomp>
    files = [get_file(f) for f in
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount/ingest/cache.py", line 136, in get_file
    raise e
  File "/home/eugeniu/.local/lib/python3.8/site-packages/beancount/ingest/cache.py", line 132, in get_file
    assert path.isabs(filename), (
AssertionError: Path should be absolute in order to guarantee a single call. ../data/importers/creditcard.csv
> /home/eugeniu/.local/lib/python3.8/site-packages/beancount/ingest/cache.py(132)get_file()
-> assert path.isabs(filename), (
(Pdb)
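The assertion comes from beancount's ingest cache, which requires absolute paths. One possible workaround in an example's run.py, sketched under the assumption that the data paths are meant to be relative to the script's own directory, is to resolve them before building the source spec. The helper name is hypothetical.

```python
import os


def absolute_data_path(script_file: str, relative_path: str) -> str:
    """Resolve relative_path against the directory containing script_file."""
    base = os.path.dirname(os.path.abspath(script_file))
    return os.path.normpath(os.path.join(base, relative_path))


if __name__ == "__main__":
    # In run.py, __file__ would be passed as the first argument.
    print(absolute_data_path(__file__, "../data/importers/creditcard.csv"))
```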

Parsing/matching error - postings with only currency (no amount)

Problem

Having a posting entry with a currency, but no amount, causes beancount-import to raise an error and drop into the debugger. The file validates fine from beancount's perspective.

Problematic beancount data:

2020-09-12 ! "Chase"
  Assets:Checking                    -100.00 USD
  Liabilities:Credit-Cards:Chase             USD

bean-check and fava accept the file, but beancount-import raises an error:

Fixed data

(works fine with beancount-import and bean-check and fava)

2020-09-12 ! "Chase"
  Assets:Checking                    -100.00 USD
  Liabilities:Credit-Cards:Chase

beancount-import error

  File "beancount-import/beancount_import/webserver.py", line 502, in _handle_reconciler_loaded
    loaded_reconciler = loaded_future.result()
  File "python3.7/concurrent/futures/_base.py", line 428, in result
    return self.__get_result()
  File "python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "beancount-import/beancount_import/thread_helpers.py", line 13, in wrapper
    f.set_result(fn(*args, **kwargs))
  File "beancount-import/beancount_import/reconcile.py", line 397, in __init__
    self._preprocess_entries()
  File "beancount-import/beancount_import/reconcile.py", line 442, in _preprocess_entries
    posting_db.add_transaction(entry)
  File "beancount-import/beancount_import/matching.py", line 258, in add_transaction
    transaction, self.is_cleared):
  File "beancount-import/beancount_import/matching.py", line 779, in get_matchable_postings
    (p for p, _ in weighted_postings), is_cleared):
  File "beancount-import/beancount_import/matching.py", line 724, in get_aggregate_posting_candidates
    posting.units.number > ZERO),
TypeError: '>' not supported between instances of 'type' and 'decimal.Decimal'
> beancount-import/beancount_import/matching.py(724)get_aggregate_posting_candidates()
-> posting.units.number > ZERO),

Comments

In my case this is just a typo when copying an existing record and editing manually, so I have no problem fixing the source data, but I imagine most people would turn to bean-check to validate their journal if they do encounter problems, so beancount-import would benefit from handling this the same way.
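The TypeError message suggests that the number of such a posting is a placeholder class rather than a Decimal. A guard along the following lines would avoid the comparison; this is a sketch only, with a stand-in MISSING class and a hypothetical helper name, and it is not the actual code in matching.py.

```python
from decimal import Decimal

ZERO = Decimal(0)


class MISSING:
    """Stand-in for a 'number not given' placeholder (an assumption here)."""


def is_positive(number) -> bool:
    """True only for an actual Decimal greater than zero."""
    # A placeholder class is not a Decimal, so it is never compared to ZERO.
    return isinstance(number, Decimal) and number > ZERO


if __name__ == "__main__":
    print(is_positive(Decimal("100.00")), is_positive(MISSING))
```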

Paypal issue downloading transactions

Not sure what happened, but downloading PayPal transactions stopped working for me. This still worked around 3 months ago.
The command I'm using:
python -m finance_dl.cli --config-module paypal_finance_dl_config --config paypal

The error message, which I assume has to do with the JSON response:

 --connect=http://127.0.0.1:53380 --session-id=be26bf0a500b64bd1208b16fa9765955
2020-09-27 19:09:38,338 paypal.py:136 [INFO] Finding username field
2020-09-27 19:09:38,358 paypal.py:139 [INFO] Entering username
2020-09-27 19:09:38,427 paypal.py:142 [INFO] Finding password field
2020-09-27 19:09:38,969 paypal.py:145 [INFO] Entering password
2020-09-27 19:09:42,974 paypal.py:149 [INFO] Logged in
2020-09-27 19:09:42,975 paypal.py:175 [INFO] Getting transaction list
2020-09-27 19:09:42,976 paypal.py:163 [INFO] Getting CSRF token
[16860:14784:0927/190945.479:ERROR:device_event_log_impl.cc(208)] [19:09:45.479] Bluetooth: bluetooth_adapter_winrt.cc:1074 Getting Default Adapter failed.
Traceback (most recent call last):
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\scrape_lib.py", line 402, in retry
    return func()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\scrape_lib.py", line 422, in fetch
    scraper.run()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\paypal.py", line 246, in run
    self.save_transactions()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\paypal.py", line 189, in save_transactions
    transaction_list = self.get_transaction_list()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\paypal.py", line 184, in get_transaction_list
    j = resp.json()
  File "D:\Documenten\coding\Import Beancount\beancount-import\env\lib\site-packages\requests\models.py", line 898, in json
    return complexjson.loads(self.text, **kwargs)
  File "D:\Documenten\coding\Import Beancount\beancount-import\env\lib\site-packages\simplejson\__init__.py", line 525, in loads
    return _default_decoder.decode(s)
  File "D:\Documenten\coding\Import Beancount\beancount-import\env\lib\site-packages\simplejson\decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "D:\Documenten\coding\Import Beancount\beancount-import\env\lib\site-packages\simplejson\decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
  File "C:\Users\Dieter\AppData\Local\Programs\Python\Python38-32\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Dieter\AppData\Local\Programs\Python\Python38-32\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\cli.py", line 91, in <module>
    main()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\cli.py", line 87, in main
    module.run(**spec)
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\paypal.py", line 250, in run
    scrape_lib.run_with_scraper(Scraper, **kwargs)
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\scrape_lib.py", line 424, in run_with_scraper
    retry(fetch)
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\scrape_lib.py", line 402, in retry
    return func()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\scrape_lib.py", line 422, in fetch
    scraper.run()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\paypal.py", line 246, in run
    self.save_transactions()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\paypal.py", line 189, in save_transactions
    transaction_list = self.get_transaction_list()
  File "d:\documenten\coding\import beancount\finance-dl\finance_dl\paypal.py", line 184, in get_transaction_list
    j = resp.json()
  File "D:\Documenten\coding\Import Beancount\beancount-import\env\lib\site-packages\requests\models.py", line 898, in json
    return complexjson.loads(self.text, **kwargs)
  File "D:\Documenten\coding\Import Beancount\beancount-import\env\lib\site-packages\simplejson\__init__.py", line 525, in loads
    return _default_decoder.decode(s)
  File "D:\Documenten\coding\Import Beancount\beancount-import\env\lib\site-packages\simplejson\decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "D:\Documenten\coding\Import Beancount\beancount-import\env\lib\site-packages\simplejson\decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Some work left to do on the Schwab importer

I wanted to collect a few loose ends here:

Add commodity directives for every new commodity that appears in transactions.
Handle bonds, and US Treasuries specifically. Bond symbols are CUSIPs, which are not valid Beancount symbols. Quantity * Price != Amount (-Fee).
Allow users to specify if commodity transactions should be in subaccount, or stay in top-level account like OFX import does it.
~~Option expiration~~, bond maturation.
~~Option sold in lots of 100, even though price reported for 1.~~
~~Add/skip Income account posting for capital gains (no need when selling short, but needed when buying to close).~~

@carljm
Comments and suggestions are welcome.

Updating schwab_csv.py with BankingEntryTypes "DEPOSIT"

line 150 of /sources/schwab_csv.py

class BankingEntryType(enum.Enum):
    DEPOSIT = "DEPOSIT"
    ACH = "ACH"
    INTADJUST = "INTADJUST"
    TRANSFER = "TRANSFER"
    VISA = "VISA"
    ATM = "ATM"
    CHECK = "CHECK"

How do I configure listening host/port

I'm looking for how to change the host/port that beancount-import listens on.

Cannot run example

Listening at http://127.0.0.1:8101
Traceback (most recent call last):
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/site-packages/beancount_import/webserver.py", line 489, in _handle_reconciler_loaded
    loaded_reconciler = loaded_future.result()
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/concurrent/futures/_base.py", line 425, in result
    return self.__get_result()
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/site-packages/beancount_import/thread_helpers.py", line 13, in wrapper
    f.set_result(fn(*args, **kwargs))
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/site-packages/beancount_import/reconcile.py", line 380, in __init__
    self._load_sources()
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/site-packages/beancount_import/reconcile.py", line 434, in _load_sources
    for spec in self.reconciler.options['data_sources']
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/site-packages/beancount_import/reconcile.py", line 434, in <listcomp>
    for spec in self.reconciler.options['data_sources']
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/site-packages/beancount_import/source/__init__.py", line 318, in load_source
    m = importlib.import_module(source_spec.pop('module'))
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/site-packages/beancount_import/source/ofx.py", line 434, in <module>
    from beancount.ingest.importers.ofx import parse_ofx_time, find_child
ModuleNotFoundError: No module named 'beancount.ingest.importers.ofx'
> /Users/Wong/.pyenv/versions/3.7.3/lib/python3.7/site-packages/beancount_import/source/ofx.py(434)<module>()
-> from beancount.ingest.importers.ofx import parse_ofx_time, find_child
(Pdb)

Matching engine failing to join transactions

I am running an import with --fuzzy_match_days 7. For some reason it's not managing to match the top pending transaction with the existing bottom one. This is just an example, there are many similar instances.

2018-02-08 * "Amazon.com" "Order"
  amazon_account: "XXX"
  amazon_order_id: "112-4181632-8076222"
  Expenses:FIXME:A   79.99 USD
    amazon_item_condition: "New"
    amazon_item_description: "Anker PowerCore Speed 20000 PD, 20100mAh Portable Charger & 30W Power Delivery Wall Charger Bundle, Input & Output Type C Power Bank for Nexus 5X 6P, LG G5, iPhone 8 / X and Macbooks"
    amazon_item_quantity: 1
    amazon_seller: "AnkerDirect"
    shipped_date: 2018-02-09
  Expenses:FIXME:A  -10.00 USD
    amazon_invoice_description: "Your Coupon Savings"
  Expenses:FIXME:A    6.21 USD
    amazon_invoice_description: "Sales Tax"
  Expenses:FIXME    -76.20 USD
    amazon_credit_card_description: "MasterCard ending in 7754"
    transaction_date: 2018-02-09

2018-02-13 * "Debit Card Purchase 02/09 0"
  Assets:Bank:Citibank                                  -76.20 USD
    memo: "AMAZON MKTPLACE PMTS   AMZN.COM/BILL WA"
  Expenses:Electronics                                   76.20 USD

`beancount-import` doesn't react to ctrl-C until the next web request

This happens both with master and the version I had installed using pip (probably the latest, 1.3.3).

I press ctrl-C in the terminal once I'm done importing. But the program doesn't quit until I cause some sort of web request to be issued. For example if I go over to the Editor tab and choose a different file, that's enough for it to actually catch it.

Does this happen to anyone else?

falsifian moth beancount $ uname -a
OpenBSD moth.falsifian.org 6.9 GENERIC.MP#490 amd64

ofx importer assumes one org per file

ofxclient has a "combined download" option that downloads all your accounts into a single OFX file. This is pretty convenient if you have a lot of accounts, compared to downloading them one by one.

But the ofx importer in beancount-import can't handle transactions from multiple orgs in a single file, due to the code here: https://github.com/jbms/beancount-import/blob/master/beancount_import/source/ofx.py#L1133

It just looks for the first ORG element and then assumes that is the correct org for all transactions in the file.

Seems like fixing this might require a more sophisticated parsing of the OFX file format, looking at nesting etc, rather than just "find org" and then "find all transactions."

OFX commodity import wants to repeatedly add existing transactions

I suspect the problem is that the importer isn't looking at the right account, but I haven't had enough time to look through the code myself to see.

I have these accounts in my beancount file:

2001-01-01 open Assets:Retirement:Employer
  ofx_org: "Vanguard"
  ofx_broker_id: "vanguard.com"
  account_id: "123456"
  ofx_account_type: "securities_and_cash"
  div_income_account: "Income:Vanguard:Dividends"
  capital_gains_account: "Income:Vanguard:Capital-Gains"
2001-01-01 open Assets:Retirement:Employer:PreTax:VIRSX         VIRSX
2001-01-01 open Assets:Retirement:Employer:PreTax:Cash          USD
2001-01-01 open Assets:Retirement:Employer:Match:VIRSX          VIRSX
2001-01-01 open Assets:Retirement:Employer:Match:Cash           USD

Here's a simple OFX file with a single transaction:

example.zip

If I use the above account definitions with that OFX file, beancount_import wants to add the following transaction:

2019-01-15 * "BUYMF - PRETAX"
  Assets:Retirement:Employer:PreTax:VIRSX  15.12443 VIRSX {22.18 USD}
    date: 2019-01-15
    ofx_fitid: "1141201901150022251001668EEE"
    ofx_type: "BUYMF"
  Assets:Retirement:Employer:PreTax:Cash  -335.46 USD
    ofx_fitid: "1141201901150022251001668EEE"

That's all well and good, but if I add it and then exit and restart the importer, it wants to add the transaction again even though it's already there. (Worse, if the beancount file changes and the importer reloads it, the importer will immediately offer to re-add the same transaction.)

I would expect the importer to see that the OFX transaction already has a matching beancount transaction and not try to add it again. It works properly with my other, cash-only accounts (a mix of bank accounts and credit cards). The only problems are with my two retirement accounts.

Previously ignored balance assertions are presented for import again

I provide a minimal example in which upon restart beancount-import presents import candidates that are already in the ignored.journal file. Specifically this seems to happen with balance assertions.

To reproduce:

run python3 run.py
repeatedly hit i to ignore all import candidates.
Ctrl-C to stop server
run.py again and observe same balance assertion candidate.

I have also observed this issue with price assertions. Is this the expected behaviour?
ignore_test.tar.gz

[Paypal] SEND_MONEY_RECEIVED with fundingSource set breaks import

Hi everyone,

I'm facing some issues with SEND_MONEY_RECEIVED PayPal transactions. In my case, all the transactions have a fundingSource entry in the JSON, therefore the import breaks at this line.

beancount-import/beancount_import/source/paypal.py

Line 547 in dca3406

assert 'fundingSource' not in data

Since the only function of the line is to make sure NO fundingSource is set, I wonder what the purpose is. Removing the line, my imports work fine and I don't see any obvious problem yet.

For reference, I use transaction data from Paypal DE (#142) so maybe Paypal is doing some things differently in Germany...

Using 'e' to accept and edit causes RuntimeError: Journal file modified concurrently

I'm not sure if I'm using this correctly in a headless system via tmux. Here's how I first started beancount-import:

beancount-import --editor vim --journal_input personal.beancount --journal_output mint.beancount --mint_data transactions.csv --account_output '.*' accounts.beancount

I then see the first transaction. I want to change the name/description. So I press 'e' and nothing happens; my window is frozen. (In another session I can see that vim is running in the background.)

I tried creating a wrapper script, tmux-vim, to launch vim in a separate pane instead, but since vim is then not a child of beancount-import, beancount-import moves on to the next transaction while I change the file in the background with vim. When I save it, I end up receiving a RuntimeError on the next transaction.

Example fails on Windows because unable to rename file

On Windows 10, using beancount-import installed 12 Dec 2019, the "manually_entered" example fails with the following traceback if the journal is edited and "Save" clicked:

Traceback (most recent call last):
File "C:\...\lib\site-packages\beancount_import\webserver.py", line 380, in on_message_set_file_contents
os.rename(f.name, filename)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\...\examples\manually_entered\transactions.beancount.tmp_hclhyvn' -> 'C:\...\examples\manually_entered\transactions.beancount'
[the ellipses are mine]

The following patch to webserver.py seems to solve the problem:

380c380,381
<                 os.rename(f.name, filename)
---
>                 tmpname = f.name
>             os.replace(tmpname, filename)
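A minimal sketch of the pattern this patch relies on: write to a temporary file, close it, and only then swap it into place with os.replace. The helper name atomic_write is hypothetical and is not taken from webserver.py; the point is only that the handle is closed before the rename, which is what Windows requires.

```python
import os
import tempfile


def atomic_write(filename: str, contents: str) -> None:
    """Write contents to filename via a temporary file in the same directory.

    Hypothetical helper for illustration; not the code in webserver.py.
    """
    directory = os.path.dirname(os.path.abspath(filename))
    fd, tmpname = tempfile.mkstemp(
        dir=directory, prefix=os.path.basename(filename) + ".tmp_")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(contents)
        # The file handle is closed at this point, so Windows no longer
        # reports the temporary file as being used by another process.
        os.replace(tmpname, filename)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmpname):
            os.remove(tmpname)
        raise
```

Unlike os.rename, os.replace also overwrites an existing destination on Windows, so the same code path works on both platforms.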

Import of OFX CHECKNUM tag fails for a non-numeric value

When processing my OFX file it fails with this error:

Traceback (most recent call last):
File "C:\Users\gjpau\AppData\Local\Programs\Python\Python38\lib\site-packages\beancount\core\number.py", line 96, in D
return Decimal(_CLEAN_NUMBER_RE.sub('', strord))
decimal.InvalidOperation: [<class 'decimal.ConversionSyntax'>]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
...
File "c:\users\gjpau\documents\github\beancount-import\beancount_import\source\ofx.py", line 1394, in prepare
state.get_accounts_and_entries()
File "c:\users\gjpau\documents\github\beancount-import\beancount_import\source\ofx.py", line 1249, in get_accounts_and_entries
statement.get_entries(self)
File "c:\users\gjpau\documents\github\beancount-import\beancount_import\source\ofx.py", line 843, in get_entries
posting_meta[CHECK_KEY] = D(stripped_checknum)
File "C:\Users\gjpau\AppData\Local\Programs\Python\Python38\lib\site-packages\beancount\core\number.py", line 104, in D
raise ValueError("Impossible to create Decimal instance from {!s}: {}".format(
ValueError: Impossible to create Decimal instance from 8OOMBV: [<class 'decimal.ConversionSyntax'>]

c:\users\gjpau\appdata\local\programs\python\python38\lib\site-packages\beancount\core\number.py(104)D()
-> raise ValueError("Impossible to create Decimal instance from {!s}: {}".format(

When I comment out all CHECKNUM tags the import succeeds.

The failing tag is: 08OOMBV

I have read the latest OFX specification, which gives the format of this field as A-12. However, the A just means any UTF-8 character, not a number.

Check (or other reference) number, A-12

Character fields are identified with a data type of “A-n”, where n is the maximum number of allowed
Unicode characters.
Note: n refers to the number of characters in the resultant string. Each multi-byte or encoded
character counts as a single character. UTF-8 encodes “high” Latin-1 characters (decimal 128-
255) using two bytes, and double-byte characters using three bytes. In addition, XML encodes
ampersands, less-than symbols, greater-than symbols, and spaces (where required) using multicharacter escape strings (see section 2.3.1.1). Therefore, an element of type A-40 may require
more than 40 bytes in a UTF-8-encoded XML stream.
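One possible way to handle this, sketched under the assumption that the check number can be stored either as a number or as a plain string: only convert to Decimal when the value consists purely of ASCII digits, and otherwise keep the text as is. The function name parse_checknum is hypothetical and is not part of ofx.py.

```python
import re
from decimal import Decimal
from typing import Optional, Union


def parse_checknum(raw: str) -> Optional[Union[Decimal, str]]:
    """Return a Decimal for all-digit check numbers, else the raw string.

    Hypothetical helper for illustration; not the code in ofx.py.
    """
    stripped = raw.strip()
    if not stripped:
        return None
    if re.fullmatch(r"[0-9]+", stripped):
        # Purely numeric: keep the existing numeric representation.
        return Decimal(stripped)
    # OFX type A-12 allows arbitrary characters, so fall back to text.
    return stripped
```

With this approach a value such as 08OOMBV would be kept as a string instead of raising ValueError.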

Matching runs forever / Webserver unresponsive

Hi,

I'm trying to import a Paypal transaction, but it reliably breaks the Webserver. It is a transaction containing quite a number of items. If I reduce the number of items, the Webserver remains responsive. Increasing the number of items included in the JSON increases the time needed for processing enormously. It seems to scale in a highly non-linear way.

I attached the problematic JSON, personal information was removed: Paypal_test_json.txt

I tried tracking the time consuming method in the code. I found that _get_valid_posting_matches is called over and over again. I printed the length of the matches returned and it fluctuates heavily, increasing up to 6000?! My beancount file has about 200 transactions at the moment...

beancount-import/beancount_import/matching.py

Line 1438 in dca3406

def _get_valid_posting_matches(

Let's see how long the processing will take, I'm afraid I will have to leave it running overnight...

I would really appreciate some hints on debugging! Unfortunately, I don't understand the code well enough yet.

As a workaround I can probably just remove the items from the JSON by hand?

beancount_import matching broken for `securities_and_cash` ofx account?

I just tried beancount-import with an investment account for the first time, and I'm having trouble getting it to match any of my existing transactions.

After some fiddling, I learned that if I take a transaction generated by beancount-import and manually delete the metadata, beancount-import won't realize it's the same transaction, and will try to generate it again.

(Some numbers below are replaced with XXX for privacy.)

As an example, with the below stripped-down beancount file and run_beancount_import.py, beancount-import generates the following new directives:

2020-12-30 * "SELLSTOCK - BANK MONTREAL QUEBEC"
  Assets:Brokerage:BMO         -13 BMO {} @ 76.03 USD
    date: 2020-12-30
    ofx_fitid: "XXX"
    ofx_memo: "BANK MONTREAL QUEBEC"
    ofx_type: "SELLSTOCK"
  Income:Capital-gains:BMO
  Assets:Brokerage:Cash     988.37 USD
    ofx_fitid: "XXX"
  Expenses:Fees               0.02 USD

2020-12-30 open Assets:Brokerage:BMO                            BMO

2020-12-30 open Income:Capital-gains:BMO                        USD

2020-12-30 open Assets:Brokerage:Cash                           USD

If I edit that transaction so that it instead reads:

2020-12-30 * "SELLSTOCK - BANK MONTREAL QUEBEC"
  Assets:Brokerage:BMO         -13 BMO {} @ 76.03 USD
  Income:Capital-gains:BMO
  Assets:Brokerage:Cash     988.37 USD
  Expenses:Fees               0.02 USD

then suddenly beancount-import won't match it any more, and wants to add another copy of the transaction. I was not able to get into a situation where beancount-import is willing to augment a transaction I've already entered by adding in the appropriate metadata.

Here is my main.beancount with account ID censored:

2000-01-01 open Income:Capital-gains
2000-01-01 open Income:Dividends
2000-01-01 open Income:Interest
2000-01-01 open Expenses:Fees

1792-04-02 commodity USD
  cusip: "9999101"

2000-01-01 open Assets:Brokerage
  ofx_org: ""
  ofx_broker_id: "Wells Fargo Advisors"
  ofx_account_type: "securities_and_cash"
  account_id: "XXX"
  capital_gains_account: Income:Capital-gains
  fees_account: Expenses:Fees
  div_income_account: Income:Dividends
  interest_income_account: Income:Interest

run_beancount_import.py contains:

import beancount_import.webserver

def run_reconcile():
    data_sources = [
        {
            "module": "beancount_import.source.ofx",
            "ofx_filenames": ("export.ofx",)
        },
    ]

    beancount_import.webserver.main(
        argv = (),
        journal_input = "main.beancount",
        ignored_journal = "main.beancount",
        default_output = "main.beancount",
        open_account_output = "main.beancount",
        balance_account_output = "main.beancount",
        data_sources = data_sources,
    )

if __name__ == "__main__":
    run_reconcile()

If any details from export.ofx would be useful, let me know.

Price Format Support

I was loading PayPal data when I noticed that price parsing doesn't account for a second number format: a dot as the decimal separator (the default, parsed correctly) versus a comma as the decimal separator (the comma is simply dropped).
XX.XX = XX.XX
XX,XX = XXXX
example:
loaded data: "price": "55,33\u00a0EUR"
output: 5533 EUR

This happens in /beancount_import/amount_parsing.py
inside def parse_amount(x)
at line 42 number = D(m.group(2))

def parse_amount(x):
    """Parses a number and currency."""
    if not x:
        return None
    sign, amount_str = parse_possible_negative(x)
    m = re.fullmatch(r'([\$€£])?((?:[0-9](?:,?[0-9])*|(?=\.))(?:\.[0-9]+)?)(?:\s+([A-Z]{3}))?', amount_str)
    if m is None:
        raise ValueError('Failed to parse amount from %r' % amount_str)
    if m.group(1):
        currency = {'$': 'USD', '€': 'EUR', '£': 'GBP'}[m.group(1)]
    elif m.group(3):
        currency = m.group(3)
    else:
        raise ValueError('Failed to determine currency from %r' % amount_str)
    number = D(m.group(2))
    return Amount(number * sign, currency)

Would it be possible to add a check that replaces , with . before parsing? I'm not sure what the best method would be, but I'm currently using this:

    value = m.group(2)
    if ',' in value:
        value = value.replace(',','.')
    number = D(value)
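Note that replacing every comma would break amounts that use the comma as a thousands separator (for example 1,234.56). A slightly more careful sketch, assuming that a trailing comma followed by one or two digits is a decimal comma and that any other comma is a grouping separator; parse_localized_number is a hypothetical name, and the heuristic cannot resolve genuinely ambiguous inputs:

```python
from decimal import Decimal


def parse_localized_number(text: str) -> Decimal:
    """Parse a number that may use either '.' or ',' as decimal separator.

    Hypothetical helper for illustration; the heuristic is an assumption.
    """
    text = text.strip()
    last_comma = text.rfind(",")
    last_dot = text.rfind(".")
    if last_comma > last_dot:
        fraction = text[last_comma + 1:]
        if len(fraction) in (1, 2) and fraction.isdigit():
            # Decimal comma: drop grouping dots, turn the comma into a dot.
            whole = text[:last_comma].replace(".", "").replace(",", "")
            return Decimal(whole + "." + fraction)
    # Otherwise treat every comma as a thousands separator.
    return Decimal(text.replace(",", ""))
```

For the example above, the number part 55,33 would be parsed as 55.33 rather than 5533.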

Following `beancount.ingest`/`beangulp` workflow

Hi --

Thanks for the awesome package!

I am looking to leverage beancount-import in the identify -> extract -> archive and generate -> test workflow from beancount.ingest/beangulp. It seems like with the new support of beancount importers, this is far more achievable.

Two specific questions:

Are there thoughts for how to best facilitate the workflow between beancount.ingest/beangulp, e.g. something simple like just replacing the extract step with beancount-import or something more dedicated built into beancount-import?
Is there a way to follow the same workflow with beancount-import sources? The beancount-import ofx source is the most full-featured I have seen. Seems a bit duplicative to rewrite it all as a beancount importer. Would be great to have a means to identify, generate (that generates a "default" to be tested against, like with beancount ofx importer), and test.

cleared_before: <date> does not work for securities_and_cash account

I have tried with both an ofx imported from Vanguard and one from Chase. I have transactions in my ledger that predate the earliest date I can download via ofx. The cleared_before directive works for the Chase checking account (cash_only), but I can't get it to work for the Vanguard account. Any transaction older than "date" is still marked as uncleared.

Handle variations in Schwab account name format in accounts map

See changes removed from #120

journal_editor breaks when matched transaction has a comment

My journal contains:

2018-03-21 * "things"
  Assets:Cash:Bank1                             2572.64 USD
  ;; Comment
  Expenses:Purchases:Gifts                     -2572.64 USD

My generated entry looks like:

2018-03-21 * ""
  Assets:Cash:Bank1                             2572.64 USD
  Expenses:FIXME                               -2572.64 USD

When I try to import, I get an AssertionError like this:

Traceback (most recent call last):
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/webserver.py", line 499, in _handle_reconciler_loaded
    self.get_next_candidates(new_pending=True)
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/webserver.py", line 508, in get_next_candidates
    self.skip_ids)
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/reconcile.py", line 873, in get_next_candidates
    pending), i, new_skip_ids
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/reconcile.py", line 841, in _make_candidates_from_import_result
    sources=self.sources,
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/reconcile.py", line 234, in __init__
    candidate.update_associated_data(self.sources)
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/reconcile.py", line 202, in update_associated_data
    diff = self.staged_changes.get_diff()
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/journal_editor.py", line 904, in get_diff
    new_posting=new_posting)
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/journal_editor.py", line 684, in compute_posting_changes
    builder.match_metadata(old_posting.meta)
  File "/home/me/.local/lib/python3.7/site-packages/beancount_import/journal_editor.py", line 592, in match_metadata
    assert meta['lineno'] == self.orig_lineno + 1
AssertionError

When I remove the comment, all is well.

Feature: Group candidates by payee

I typically have a large number of transactions for any given payee, all of which will be categorized as the same account (e.g. all purchases at the grocery store are classified as Expenses:Groceries).

Currently, all those purchases are spread out among other candidates, so I have to pay extra attention when importing.

If instead I knew I'd get all the transactions from the same store one after the other, I could be a bit more aggressive with my hammering of the Enter key.
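A rough sketch of the ordering being requested, assuming candidates can be represented as dictionaries with a "payee" key (that shape is an assumption for illustration, not how beancount-import stores candidates). Payees are ordered by first appearance, and candidates keep their original relative order within each payee:

```python
from typing import Any, Dict, List


def group_by_payee(candidates: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Reorder candidates so that entries with the same payee are adjacent.

    Hypothetical helper for illustration only.
    """
    first_seen: Dict[str, int] = {}
    for index, candidate in enumerate(candidates):
        # Remember where each payee first appears.
        first_seen.setdefault(candidate["payee"], index)
    # sorted() is stable, so order within a payee is preserved.
    return sorted(candidates, key=lambda c: first_seen[c["payee"]])
```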

Error handling for missing metadata fields

When I'm trying to import a new OFX account, and I'm in the process of setting all of the necessary Beancount-import OFX metadata in my Beancount accounts file, I'll often get errors like these when I don't yet have the metadata quite right:

 File "/usr/lib/python3.7/site-packages/beancount_import/source/ofx.py", line 529, in get_account_by_key
    raise KeyError('%s: must specify %s' % (account.account, key))
KeyError: 'Assets:MyBank: must specify capital_gains_account'
> /usr/lib/python3.7/site-packages/beancount_import/source/ofx.py(529)get_account_by_key()
-> raise KeyError('%s: must specify %s' % (account.account, key))
(Pdb)

So, getting unceremoniously dumped into a pdb shell because I'm missing a capital_gains_account entry! This is okay (but not ideal) when running Beancount-import locally in an interactive shell. It's super not great when running Beancount-import persistently on a remote web server with nothing interactive!

To be fair, I was warned of this potential issue (#14 (comment)):

I think more generally some other changes may be helpful to make the beancount-import webserver more convenient as a persistent, multi-user service. For example, currently while it will reload the journal automatically there is no way to reload the import data, and failure to parse the import data often leads to terminating the program.

So I'm opening this ticket to start brainstorming solutions. The existing errors tab in the front-end seems like a nice way to present errors to the user. Would it be possible to just take the cases of errors that (currently) result in a pdb shell or exiting the program to instead manifest as recoverable errors in the UI?
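As a starting point for discussion, here is a sketch of the "collect instead of raise" idea: validate the account metadata up front and return a list of messages that a UI could display, rather than raising KeyError on the first missing key. The function name and the dictionary-based account representation are assumptions for illustration and do not reflect the actual ofx.py data structures.

```python
from typing import Iterable, List, Mapping


def find_missing_metadata(
        accounts: Mapping[str, Mapping[str, str]],
        required_keys: Iterable[str]) -> List[str]:
    """Return one error message per missing metadata key.

    Hypothetical helper for illustration; not part of beancount-import.
    """
    required = list(required_keys)
    errors: List[str] = []
    for account_name, meta in accounts.items():
        for key in required:
            if key not in meta:
                # Same wording as the KeyError shown above.
                errors.append("%s: must specify %s" % (account_name, key))
    return errors
```

An empty list would mean the import can proceed; a non-empty list could be surfaced in the errors tab without stopping the server.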

Considering improvements to ML example generation

Bottom line up front:
Unknown account prediction started failing hard for me recently. I got it back up with some hacks and want to discuss potential better solutions.

I've been a happy beancount-import user for around a year now. Recently the unknown account prediction accuracy took a nosedive and I've been investigating why. I learned how the training code builds training examples in two ways:

Type 1 features are metadata about the unknown account. These are great for sources like the Amazon source. The Amazon source knows price and metadata about the posting, but doesn't know which account to map it to:

  Expenses:FIXME   9.01 USD
    amazon_item_description: "Some item"
    amazon_item_quantity: 1
    amazon_seller: "Amazon.com Services LLC"
    shipped_date: 2021-01-06
  Expenses:FIXME   0.91 USD
    amazon_invoice_description: "Sales Tax"
  Liabilities:CC:Amazon     -9.92 USD
    amazon_credit_card_description: "Visa ending in 1234"

Type 2 features are generated when there are exactly 2 non-ignored postings. This is the main type that I actually use, though I could change that.

Example from my plaid importer:

2021-01-07 * "Caffe" "CAFFE LUSSO COFFEE gosq.com"
  Liabilities:CC:CitiCash     -44 USD
    category: "Food and Drink, Restaurants, Coffee Shop"
    date: 2021-01-07
    plaid_transaction_id: "redacted"
    source_desc: "CAFFE LUSSO COFFEE gosq.com"
  Expenses:FIXME   44 USD

I commented out Type 1 feature generation and my accuracy went way up for type 2 predictions. I'm not sure if this is because the type 2 features are getting drowned out by the type 1?

Fundamentally, I think the biggest issue is that the way we're generating examples is potentially quite different from the actual inference task: I have several sources where the import has no prediction whatsoever, but these still get examples generated for them. We're lacking the context about which postings were offered by the source as FIXME and corrected by the user.
This could probably be corrected by adding a new API for sources to optionally call during prepare().
This API would let the source generate examples. Sources already run through the journal, looking for already imported postings and suppressing those results. Instead of suppressing, they could call this new API telling us they would've imported it as FIXME, but it has since been set to a specific account by the user. That is then a high quality training example.

I may try modifying my plaid source to put the metadata on the FIXME posting. This would make it use Type 1 features, which may work better.

2021-01-07 * "Caffe" "CAFFE LUSSO COFFEE gosq.com"
  Liabilities:CC:CitiCash     -44 USD
    date: 2021-01-07
    plaid_transaction_id: "redacted"
  Expenses:FIXME   44 USD
    category: "Food and Drink, Restaurants, Coffee Shop"
    source_desc: "CAFFE LUSSO COFFEE gosq.com"

Another option I was considering is separating type 1 and type 2 classifiers, but that would be a fairly involved change since it also affects inference.

FR: Show where an existing transaction is

In the "Candidates" panel, beancount-import helpfully shows me the diff it is going to apply.

Sometimes I want to make additional changes with my text editor. But to do that I need to know which file the transaction is (and ideally which line); I don't have a fast way to do that. It would be nice if beancount-import showed the filename and line numbers whenever it showed a diff.

Parse 'OTHERVEST' OFX INV401KSOURCE

In my 401(k) OFX statements, I noticed there was a statement inside an INV401KSOURCE that was not correctly handled by beancount-import: "OTHERVEST".

Could this be added to ofx.py? I believe it just needs to be added to inv401k_account_keys, but I'm not certain.

beancount_import 1.3.3 doesn't contain `generic_import_source`

Hi! Thanks for this project, it helps a lot.
Here's a thing I noticed - beancount-import 1.3.3 (latest version at time of writing), as published on pypi doesn't seem to contain generic_import_source. I get an error when specifying it as a data source.

See for example:

wget https://files.pythonhosted.org/packages/cb/b3/a4fbc28c957c8ff5ce5686ec71d8df301b4d61a73b839fa6f7b4960b2ff5/beancount-import-1.3.3.zip
unzip beancount-import-1.3.3.zip
ls beancount-import-1.3.3/beancount_import/source
amazon_invoice.py             __init__.py           stockplanconnect.py
amazon_invoice_sanitize.py    link_based_source.py  stockplanconnect_statement.py
amazon_invoice_test.py        mint.py               ultipro_google.py
amazon.py                     mint_test.py          ultipro_google_statement.py
amazon_test.py                ofx.py                venmo.py
description_based_source.py   ofx_sanitize.py       venmo_sanitize.py
google_purchases.py           ofx_test.py           venmo_test.py
google_purchases_sanitize.py  paypal.py             waveapps.py
google_purchases_test.py      paypal_sanitize.py    waveapps_test.py
healthequity.py               paypal_test.py
healthequity_test.py          source_test.py

If I install the project from the master branch, everything works as expected:

(install from master branch as per README instructions)
ls ~/.local/lib/python3.9/site-packages/beancount_import/source 
amazon_invoice.py                healthequity_test.py  schwab_csv.py
amazon_invoice_sanitize.py       __init__.py           schwab_csv_test.py
amazon_invoice_test.py           link_based_source.py  source_test.py
amazon.py                        mint.py               stockplanconnect.py
amazon_test.py                   mint_test.py          stockplanconnect_statement.py
description_based_source.py      ofx.py                ultipro_google.py
generic_importer_source.py       ofx_sanitize.py       ultipro_google_statement.py
generic_importer_source_test.py  ofx_test.py           venmo.py
google_purchases.py              paypal.py             venmo_sanitize.py
google_purchases_sanitize.py     paypal_sanitize.py    venmo_test.py
google_purchases_test.py         paypal_test.py        waveapps.py
healthequity.py                  __pycache__           waveapps_test.py

I guess the generic importer source will only be available in a future released version of the project, might be worth specifying this in the Readme. Thanks!

Entries with 0 unitprice

A Vanguard OFX I'd like to import has entries where the UNITPRICE is 0 (which seems like a bug on their part):

<REINVEST>
    <INVTRAN>
        <FITID>XXX</FITID>
        <DTTRADE>20190614160000.000[-5:EST]</DTTRADE>
        <DTSETTLE>20190614160000.000[-5:EST]</DTSETTLE>
        <MEMO>DIVIDEND REINVEST</MEMO>
    </INVTRAN>
    <SECID>
        <UNIQUEID>XXX</UNIQUEID>
        <UNIQUEIDTYPE>CUSIP</UNIQUEIDTYPE>
    </SECID>
    <INCOMETYPE>DIV</INCOMETYPE>
    <TOTAL>-543.95</TOTAL>
    <SUBACCTSEC>CASH</SUBACCTSEC>
    <UNITS>7.599</UNITS>
    <UNITPRICE>0.0</UNITPRICE>
</REINVEST>

Which results in errors like this:

Traceback (most recent call last):
  File "miniconda3/lib/python3.6/site-packages/beancount_import/webserver.py", line 493, in _handle_reconciler_loaded
    loaded_reconciler = loaded_future.result()
  File "miniconda3/lib/python3.6/concurrent/futures/_base.py", line 398, in result
    return self.__get_result()
  File "miniconda3/lib/python3.6/concurrent/futures/_base.py", line 357, in __get_result
    raise self._exception
  File "miniconda3/lib/python3.6/site-packages/beancount_import/thread_helpers.py", line 13, in wrapper
    f.set_result(fn(*args, **kwargs))
  File "miniconda3/lib/python3.6/site-packages/beancount_import/reconcile.py", line 396, in __init__
    all_source_results = self._prepare_sources()
  File "miniconda3/lib/python3.6/site-packages/beancount_import/reconcile.py", line 515, in _prepare_sources
    source.prepare(self.editor, source_results)
  File "miniconda3/lib/python3.6/site-packages/beancount_import/source/ofx.py", line 1394, in prepare
    state.get_accounts_and_entries()
  File "miniconda3/lib/python3.6/site-packages/beancount_import/source/ofx.py", line 1249, in get_accounts_and_entries
    statement.get_entries(self)
  File "miniconda3/lib/python3.6/site-packages/beancount_import/source/ofx.py", line 1043, in get_entries
    total + fee_total + (units * unitprice))
AssertionError: 543.950

Should I use ignore_transaction_regexp to skip these? Fix the entries by hand? Or should the importer solve for UNITPRICE when it is zero? In this case, UNITPRICE should be 71.5817871
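For the third option, the unit price can be recovered from the other fields, assuming the importer's check is that total + fee_total + units * unitprice is (approximately) zero, as the assertion in the traceback suggests. A minimal sketch with a hypothetical function name:

```python
from decimal import Decimal


def infer_unitprice(total: Decimal, fee_total: Decimal,
                    units: Decimal) -> Decimal:
    """Solve total + fee_total + units * unitprice == 0 for unitprice.

    Hypothetical helper for illustration; not the code in ofx.py.
    """
    if units == 0:
        raise ValueError("cannot infer a unit price for zero units")
    return -(total + fee_total) / units


# Values from the REINVEST entry above, assuming no fees.
price = infer_unitprice(Decimal("-543.95"), Decimal("0"), Decimal("7.599"))
```

With these inputs the result agrees with the 71.5817871 figure quoted above to within rounding.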

Adding an OFX converter boosts up the usability of beancount importer

Half a year ago I stumbled upon this utility and I was impressed. I wanted to use it for a lot of European financial institutions where I had accounts. But then it got difficult. The pluggable architecture of beancount-import actually leads to a lock-in. If I write a converter I can only use it with beancount-import. So after some thought I switched to another idea: if I just write an OFX converter I can use the beancount-import OFX converter. And then I found ofxstatement which gave me exactly all I needed. I have created the following ofxstatement plugins:

These (or any other ofxstatement) plugins can be invoked using the convert2ofx routine in ofx.py. There is also an example.

The OFX balance is not reliable if DTEND < DTASOF

The OFX 2.2 specification states that DTASOF is the balance date and BALAMT the balance amount. The DTEND flag is the (exclusive) end of the transactions in the file. So when DTEND is less than DTASOF there is no way we can reliably calculate a balance (at midnight as requested by beancount) since there may be (actual) transactions between DTEND and DTASOF that will (and should) not be part of the OFX file since only transactions between DTSTART and DTEND should be listed.

On the other hand, if DTEND is missing or >= DTASOF we can calculate the balance as of (the day part of) DTASOF by deducting all transactions with a date >= (the day part of) DTASOF. A good example is the test_suncorp where in my opinion the balance is wrong since on the balance date there is a transaction in the file and that should be deducted.

In ofx.py, I have added a dictionary key check_balance to the OFX source specification. This parameter, a callable, should evaluate to True or False for an OFX source file. The default is False, the old behavior which adds a balance any time without checking it is correct.
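The rule described above can be sketched as follows. Both function names and the (datetime, amount) transaction representation are assumptions for illustration; they are not the signature of the check_balance callable mentioned above.

```python
from datetime import datetime
from decimal import Decimal
from typing import List, Optional, Tuple


def balance_is_reliable(dtend: Optional[datetime], dtasof: datetime) -> bool:
    """A balance is usable only if DTEND is missing or not before DTASOF."""
    return dtend is None or dtend >= dtasof


def balance_at_start_of_day(
        balamt: Decimal, dtasof: datetime,
        transactions: List[Tuple[datetime, Decimal]]) -> Decimal:
    """Back out transactions dated on or after the day of DTASOF.

    Gives the balance at midnight at the start of that day, which is
    when a beancount balance assertion applies.
    """
    day = dtasof.date()
    later = sum(
        (amount for when, amount in transactions if when.date() >= day),
        Decimal(0))
    return balamt - later
```

For example, with BALAMT 100 as of 17:00 on a day that already contains a -25 transaction, the balance at the start of that day would be 125.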

Feature: Accept design update PR?

Would you accept a PR that adds a bit of CSS here and there? This is a great tool and I appreciate the keyboard shortcuts. I'd like to give the UI a little lift – would you be interested in that?

FR: automatically remove "!" flags

I often flag transactions or individual postings with "!" if I enter them before my bank tells me about them. It would be nice if beancount-import removed these when I import the corresponding transactions.

I might try implementing this myself.

Import from bean-extract / beancount.ingest?

So, I've spent hours writing a bean-extract compatible parser using beancount.ingest.importer.ImporterProtocol

Can one somehow string those things together?

I do NOT mean parsing a written-out .beancount file,
but "run bean-extract" and use that result as data_source - similar to Fava?

something along the lines of the usual:

# beancount.config
from importers import my_importer
configured_my_importer = my_importer.Importer(
  currency='USD',
)
CONFIG = [
  configured_my_importer,
]
# ....
# ....
# ....
    data_sources = [
        {
            'module': 'beancount_import.source.beancount-ingest',
            'class': configured_my_importer,
        },
    ]
# or
    data_sources = [
        {
            'module': 'beancount_import.source.beancount-extract',
            'config': 'path/to/beancount.config', # parse&read CONFIG from there
            'dirs': 'documents/',
        },
    ]

or however that's gonna map out... ;-)

(think that could proxy-close #18 as well...)

Can't run examples on PyPI version

Hello, thank you for writing and maintaining this software! Just wanted to note that I was unable to run the examples using the version available on PyPI with this error:

(my-finances) $ ./run.py 
Listening at http://127.0.0.1:8101
Traceback (most recent call last):
  File "/home/vkurup/.pyenv/versions/3.9.0/envs/my-finances/lib/python3.9/site-packages/beancount_import/webserver.py", line 493, in _handle_reconciler_loaded
    loaded_reconciler = loaded_future.result()
  File "/home/vkurup/.pyenv/versions/3.9.0/lib/python3.9/concurrent/futures/_base.py", line 433, in result
    return self.__get_result()
  File "/home/vkurup/.pyenv/versions/3.9.0/lib/python3.9/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/home/vkurup/.pyenv/versions/3.9.0/envs/my-finances/lib/python3.9/site-packages/beancount_import/thread_helpers.py", line 13, in wrapper
    f.set_result(fn(*args, **kwargs))
  File "/home/vkurup/.pyenv/versions/3.9.0/envs/my-finances/lib/python3.9/site-packages/beancount_import/reconcile.py", line 380, in __init__
    self._load_sources()
  File "/home/vkurup/.pyenv/versions/3.9.0/envs/my-finances/lib/python3.9/site-packages/beancount_import/reconcile.py", line 433, in _load_sources
    sources = self.sources = [
  File "/home/vkurup/.pyenv/versions/3.9.0/envs/my-finances/lib/python3.9/site-packages/beancount_import/reconcile.py", line 434, in <listcomp>
    load_source(spec, log_status=self.reconciler.log_status)
  File "/home/vkurup/.pyenv/versions/3.9.0/envs/my-finances/lib/python3.9/site-packages/beancount_import/source/__init__.py", line 318, in load_source
    m = importlib.import_module(source_spec.pop('module'))
  File "/home/vkurup/.pyenv/versions/3.9.0/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'beancount_import.source.generic_importer_source'

I was able to successfully get it to work by downloading the master branch:

pip install git+https://github.com/jbms/beancount-import@master#egg=beancount-import

(in case that helps someone else until the current version is released to PyPI)

Does not work on Windows with non-Latin letters

Listening at http://127.0.0.1:8101
Traceback (most recent call last):
  File "C:\Python37\lib\site-packages\beancount_import\webserver.py", line 488, in _handle_reconciler_loaded
    loaded_reconciler = loaded_future.result()
  File "C:\Python37\lib\concurrent\futures\_base.py", line 425, in result
    return self.__get_result()
  File "C:\Python37\lib\concurrent\futures\_base.py", line 384, in __get_result
    raise self._exception
  File "C:\Python37\lib\site-packages\beancount_import\thread_helpers.py", line 13, in wrapper
    f.set_result(fn(*args, **kwargs))
  File "C:\Python37\lib\site-packages\beancount_import\reconcile.py", line 380, in __init__
    self._load_sources()
  File "C:\Python37\lib\site-packages\beancount_import\reconcile.py", line 434, in _load_sources
    for spec in self.reconciler.options['data_sources']
  File "C:\Python37\lib\site-packages\beancount_import\reconcile.py", line 434, in <listcomp>
    for spec in self.reconciler.options['data_sources']
  File "C:\Python37\lib\site-packages\beancount_import\source\__init__.py", line 319, in load_source
    return m.load(source_spec, log_status=log_status)  # type: ignore
  File "C:\Python37\lib\site-packages\beancount_import\source\ofx.py", line 1373, in load
    return OfxSource(log_status=log_status, **spec)
  File "C:\Python37\lib\site-packages\beancount_import\source\ofx.py", line 1329, in __init__
    ParsedOfxFile(self.source_fitids, filename))
  File "C:\Python37\lib\site-packages\beancount_import\source\ofx.py", line 1127, in __init__
    contents = f.read()
  File "C:\Python37\lib\encodings\cp1251.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x98 in position 5539: character maps to <undefined>
> c:\python37\lib\encodings\cp1251.py(23)decode()
-> return codecs.charmap_decode(input,self.errors,decoding_table)[0]

The journal file is encoded in UTF-8.

The workaround is to use the Windows Subsystem for Linux, which I used to do on my desktop. However, that is no longer available to me, and my laptop doesn't have Windows 10 Pro, so I'd like to fix the real issue.

P.S. Great software by the way, I really enjoyed using it in the last couple months.
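
The traceback suggests the file is opened without an explicit encoding, so Python falls back to the Windows code page (cp1251 here). A minimal sketch of the usual fix, assuming the input really is UTF-8; `read_text_utf8` is a hypothetical helper, not a function in beancount-import.

```python
def read_text_utf8(filename: str) -> str:
    """Read a file as UTF-8 regardless of the platform's default encoding."""
    with open(filename, encoding='utf-8') as f:
        return f.read()


# Why the default fails: the Cyrillic letter "И" is the bytes D0 98 in UTF-8,
# and byte 0x98 has no mapping in cp1251.
utf8_bytes = 'И'.encode('utf-8')
try:
    utf8_bytes.decode('cp1251')
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
```

Passing `encoding='utf-8'` to `open` makes the result independent of the machine's locale, which is why the same files load fine under Linux.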