Comments (22)
Hi @lestephane, thanks for detailing your workaround. I've been preoccupied with moving me and my family from Africa to Europe the last few months, but once I'm settled I would really like to get a good solution for this in.
from hledger-flow.
I've tried it a few times today and it seems to work.
from hledger-flow.
> I think it should work if you move --batch-size 200 to just before import
Right you are, my wrapper helper script ordering of those arguments was to blame.
from hledger-flow.
I changed the behaviour intentionally in v0.11.3 to always be consistent about what the base directory is:
#41
My thinking at the time was that it would be less confusing if the base dir were always the same, in the same way that git treats the root of any git repo.
You do make a good case though for the need to import a subset of files.
Should we automatically treat a subdir of the hledger-flow base dir as an indication that we should only import that subdir and below? Possibly.
The other option could be to have another command-line flag. I'm leaning towards the first option.
The part where you get an error when specifying the base dir as . while in a subdir is unexpected; I would have expected it to detect the top-level base dir even in that case.
I'll look at this unexpected error first.
from hledger-flow.
Another reason to provide this partial import mode is that when something goes wrong and I need to run in --verbose mode, I typically also have to include --sequential. If hledger-flow import imports everything every time, a lot of logging will occur, and unless you know what you're looking for, this will get tedious.
I don't have an elegant solution to how we should activate the partial import (yet). But I'm also leaning towards the first option; it's the one that causes the least head scratching.
To take a relatable(?) example:
When I run grep -r PATTERN, the search starts in the working directory and works its way downwards into sub-directories.
When I run grep -r PATTERN . or grep -r PATTERN /some/directory, the search starts in the specified directory and, again, works its way downwards into sub-directories.
The directionality expectation is derived from that. From my prior expectation of how unix tools work, I'd expect hledger-flow import to start importing from the working directory, working its way downwards into sub-directories, and hledger-flow import /some/directory to start importing from the specified directory, working its way downwards into sub-directories.
If you want to do the full import correctly without needing to think about the correct working directory, it's possible to set up a bash alias like so:
hledger_import() {
    # Always import from the top-level base dir, whatever the working directory is.
    hledger-flow import /finance/rootdirectory "$@"
}
alias hlimport="hledger_import"
from hledger-flow.
The workaround I currently use is actually not working either: v0.11.1.2 in sub-account import mode removes all siblings from include files in the parent directory of the directory where I run the import. That's another head scratcher.
$ alias | grep hl
alias hl='${HOME}/.local/bin/hledger-flow-v0.11.1.2'
$ cd ~/Finance/import/personal
$ git diff
(empty)
$ hl import
...
$ git diff
~/Finance/import/personal$ git diff
diff --git a/import/2018-include.journal b/import/2018-include.journal
index 62a832f..3726eb7 100644
--- a/import/2018-include.journal
+++ b/import/2018-include.journal
@@ -1,6 +1,5 @@
### Generated by hledger-flow - DO NOT EDIT ###
!include 2018-opening.journal
-!include business-de/2018-include.journal
!include personal/2018-include.journal
!include 2018-closing.journal
I think it's fair to assume that running an import in a sub-account is unlikely to require a modification of any parent includes, and so all existing parent includes should be left untouched. This keeps the directionality of side-effects pointing downwards, always. If I need to regenerate includes to account for new years, I need to go to the import root to run the import.
from hledger-flow.
Running in a subdirectory isn't something I tried to support until v0.11.3, and then the behaviour I had in mind was to always import everything.
So I can imagine the behaviour is unpredictable in earlier versions. We'll have to change it and document it so that it is part of the supported feature set.
from hledger-flow.
I've tried v0.11.3 just now, but it also imports everything. So my current workaround for partial import is to use v0.11.1.2 for imports, and revert changes made to includes in parent directories.
from hledger-flow.
Another side-effect of the automatic detection of the import directory (in import everything mode) is that hledger-flow attempts to import some directories that exist outside of the import hierarchy. These are directories that I moved out of the way because they were work-in-progress. I don't expect them to cause an import error when I run an import in a subaccount of the root import directory.
~/Finance/import/personal$ hledger-flow --version
hledger-flow 0.12.3.0 linux x86_64 ghc 8.6
~/Finance/import/personal$ hledger-flow import --sequential
Collecting input files...
Found 81 input files in 0.795055531s. Proceeding with import...
I couldn't find the right number of directories between "import" and the input file:
/home/lestephane/Vault/Finance/wip/bisq/account-1/1-in/2019/2019.csv
hledger-flow expects to find input files in this structure:
import/owner/bank/account/filestate/year/trxfile
Have a look at the documentation for a detailed explanation:
https://github.com/apauley/hledger-flow#input-files
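The structure named in the error message can be checked by hand before running an import. The following is only a rough sketch, not how hledger-flow itself validates paths: the helper name has_expected_depth is made up for illustration, the first example path is hypothetical, and the check merely counts the six components (owner/bank/account/filestate/year/trxfile) that are expected after the "import" directory.

```shell
# Succeed when a path has exactly six components after "import/":
# owner/bank/account/filestate/year/trxfile.
# Hypothetical helper, not part of hledger-flow.
has_expected_depth() {
    case "$1" in
        */import/*) rest=${1##*/import/} ;;  # keep what follows the last "/import/"
        import/*)   rest=${1#import/} ;;     # relative path starting at "import/"
        *)          return 1 ;;              # no "import" directory in the path at all
    esac
    # Split the remainder on "/" and require exactly six fields.
    printf '%s\n' "$rest" | awk -F/ '{ exit (NF == 6 ? 0 : 1) }'
}

# A hypothetical file inside the import hierarchy passes the check.
has_expected_depth /finance/import/personal/wallet/cash/1-in/2019/2019.csv && echo "depth ok"
# The work-in-progress file from the error above has no "import" directory, so it fails.
has_expected_depth /home/lestephane/Vault/Finance/wip/bisq/account-1/1-in/2019/2019.csv || echo "unexpected depth"
```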
from hledger-flow.
@lestephane Could you please add --show-options to the output above (actually for all outputs when reporting something)? I'd like to see what hledger-flow is using as the base dir.
Please do it with the latest 0.12.3.1 release; I've made a change to always use an absolute path for the base dir.
from hledger-flow.
~/Vault/Finance/import/personal$ hledger-flow-v0.12.3.1 import --show-options
RuntimeOptions {baseDir = FilePath "/home/lestephane/Vault/Finance/", hfVersion = "hledger-flow 0.12.3.1 linux x86_64 ghc 8.6", hledgerInfo = HledgerInfo {hlPath = FilePath "/home/lestephane/Vault/Finance/hledger", hlVersion = "hledger 1.14.99"}, sysInfo = SystemInfo {os = "linux", arch = "x86_64", compilerName = "ghc", compilerVersion = Version {versionBranch = [8,6], versionTags = []}}, verbose = False, showOptions = True, sequential = False}
Collecting input files...
Found 139 input files in 0.674321691s. Proceeding with import...
I couldn't find the right number of directories between "import" and the input file:
/home/lestephane/Vault/Finance/wip/paypal/account/1-in/2019/2019.csv
...
from hledger-flow.
(Here is my current workaround for anyone interested)
I use the latest hledger-flow import when I need a full import, which is taking longer and longer as the number of files grows on my end (20 seconds for 290 files nowadays, will grow worse for sure). Since Haskell also uses all available cores, that's 20 seconds where the laptop is not responsive. Can't have that.
So if I'm only working in one account subdirectory, and do not want this delay, I run the newest version of hledger-flow that does not have this import-everything-every-time bug (v0.11.1.2), using an alias:
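function _hlimport() {
    # Run the pinned v0.11.1.2 binary, which imports only the current sub-directory.
    PATH="${PATH}:${HOME}/.local/bin" "${HOME}/.local/bin/hledger-flow-v0.11.1.2" import "$@"
    # Find modified yearly and all-years include files in parent directories
    # and restore them from the git index.
    git status -s |
        awk '$1~/^MM?/ && $2~/^(\.\.\/)+([[:digit:]]{4}-include|all-years)\.journal/{print $2}' |
        xargs --verbose --no-run-if-empty --max-lines=1 git checkout --
}
alias hlimport="_hlimport"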
Once the alias is in place, an hlimport invocation imports only the subdirectory I'm in (which is good), but also modifies includes in parent directories (which is bad, see my June 5 comment in this issue).
That's where the git checkout comes in, to restore those parent includes to their values from the git index.
import/personal/wallet/cash$ hlimport
Collecting input files...
Found 37 input files in 0.023231917s. Proceeding with import...
Imported 37 journals in 1.043626395s
git checkout -- ../../../../all-years.journal
git checkout -- ../../../2017-include.journal
git checkout -- ../../../2018-include.journal
git checkout -- ../../../2019-include.journal
git checkout -- ../../../all-years.journal
git checkout -- ../../2017-include.journal
git checkout -- ../../2018-include.journal
git checkout -- ../../2019-include.journal
This trick is only meant to save time for localized work on one account sub-directory at a time.
After each work session you need to commit work on the account before working on another one.
And when the parent includes do have new modifications that need to be kept (additions mostly), then git add or git commit those first. If ever in doubt whether the include files are all correct, just rerun the entire hledger-flow import using the latest release.
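To see which files the awk filter in the alias picks out, it can be run on sample input instead of a real repository. This is a sketch under assumptions: the function name parent_includes and the sample status lines are made up for illustration, and the year is matched with four explicit [0-9] classes rather than an interval expression so that it also works with awk implementations that lack {4} support.

```shell
# Read `git status -s` style lines on stdin and print only modified
# include files that live in a parent directory (paths starting with ../).
# Hypothetical helper for illustration.
parent_includes() {
    awk '$1 ~ /^MM?$/ && $2 ~ /^(\.\.\/)+([0-9][0-9][0-9][0-9]-include|all-years)\.journal$/ { print $2 }'
}

# Sample status output: two parent includes, one local journal, one untracked file.
printf '%s\n' \
    ' M ../../2018-include.journal' \
    ' M ../../../all-years.journal' \
    ' M 3-journal/2019/2019.journal' \
    '?? ../notes.txt' | parent_includes
# prints:
# ../../2018-include.journal
# ../../../all-years.journal
```

Only the first two lines are selected, so only those files would be passed on to git checkout.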
from hledger-flow.
@lestephane You can check out v0.13 for now:
https://github.com/apauley/hledger-flow/releases/tag/v0.13.0.0
It should solve one of your problems: the need to use v0.11.1.2.
I'm still looking at the issue where include files in parent directories are regenerated with just a subset of journals.
Example use: hledger-flow --show-options import --experimental-rundir ./import/gawie/bogart/cheque
I removed the bug label, because the earlier error (Unable to find an hledger-flow import directory at './') was fixed, and I think the behaviour in 0.12.x is correct, even though it prevented you from getting fast feedback. The processing of a subset of files, on the other hand, is currently producing unexpected results (the include files). So it is faster but not 100% correct.
I think fast feedback is an important use case, and I hope to release some more updates to address this.
from hledger-flow.
@lestephane I haven't released anything yet, but there is something that mostly works in the branch rundir-improvements (#78).
You can compile that branch and test it a bit if you'd like.
There have been a lot of annoying corner cases that I fixed as I found them, so please let me know if you find anything else unexpected.
Usage:
hledger-flow import --enable-future-rundir ./import/gawie/bogart
A known issue in that branch: if you're doing a full import (using the top-level base dir) with --enable-future-rundir, it generates unnecessary yearly include files in the base dir.
hledger-flow import --help
Usage: hledger-flow import [DIR] [--enable-future-rundir]
Uses hledger with your own rules and/or scripts to convert electronic statements into categorised journal files
Available options:
DIR The directory to import. Use the base directory for a
full import or a sub-directory for a partial import.
Defaults to the current directory. This behaviour is
changing: see --enable-future-rundir
--enable-future-rundir Enable the future (0.14.x) default behaviour now:
start importing only from the directory that was
given as an argument, or the currect directory.
Previously a full import was always done. This switch
will be removed in 0.14.x
-h,--help Show this help text
from hledger-flow.
Can you confirm that the branch is rundir and not rundir-improvements?
$ git fetch --all
Fetching origin
remote: Enumerating objects: 125, done.
remote: Counting objects: 100% (125/125), done.
remote: Compressing objects: 100% (61/61), done.
remote: Total 125 (delta 60), reused 96 (delta 40), pack-reused 0
Receiving objects: 100% (125/125), 45.55 KiB | 1.17 MiB/s, done.
Resolving deltas: 100% (60/60), completed with 3 local objects.
From https://github.com/apauley/hledger-flow
ced0b70..e8e508b master -> origin/master
* [new branch] rundir -> origin/rundir
* [new tag] v0.13.2.0 -> v0.13.2.0
* [new tag] v0.13.1.0 -> v0.13.1.0
from hledger-flow.
@lestephane The branch is gone, I merged and released it a few hours ago:
https://github.com/apauley/hledger-flow/releases/tag/v0.13.2.0
The known issue I mentioned is also fixed.
Does the new behaviour match what you would expect?
from hledger-flow.
compiling, should I be worried about this warning?
~/GitRepos/hledger-flow$ stack install
Stack has not been tested with GHC versions above 8.6, and using 8.8.3, this may fail <<<
Preparing to install GHC to an isolated location.
This will not interfere with any system-level installation.
ghc-8.8.3: 50.33 MiB / 187.19 MiB ( 26.89%) downloaded...^C
~/GitRepos/hledger-flow$ stack upgrade
Current Stack version: 2.1.3, available download version: 2.1.3
Skipping binary upgrade, you are already running the most recent version
from hledger-flow.
Looking good so far, and I didn't notice any unexpectedly modified include files.
from hledger-flow.
> Looking good so far, and I didn't notice any unexpectedly modified include files.
Great, let's close this issue; I think the main issue is solved. If it isn't, we can re-open.
But for issues that are possibly just related, not exactly the same, I'd prefer a new issue to be opened. We can link to this one though if that happens.
Future releases will still remove the flag and make this behaviour the default.
> compiling, should I be worried about this warning?
> ~/GitRepos/hledger-flow$ stack install
> Stack has not been tested with GHC versions above 8.6, and using 8.8.3, this may fail <<<
No need to worry about this, it compiles successfully despite the warning.
I expect that newer versions of stack will stop complaining.
from hledger-flow.
--enable-future-rundir is now the default behaviour, as of release 0.14.1. The option has been deprecated (it will be removed in a future release).
Specifying the option in the latest release doesn't do anything, other than print a message to the console.
from hledger-flow.
@apauley the ability to specify an arbitrary directory has disappeared. Was that intentional?
$ hlimport import/personal/wallet/cash/
using hledger flow executable: hledger-flow-async-batches-793f882bb22ac7b89a98077ee95b3464bbc5c0e0...
/home/lestephane/.local/bin/hledger-flow-async-batches-793f882bb22ac7b89a98077ee95b3464bbc5c0e0 +RTS -N10 -RTS --show-options import --batch-size 200 import/personal/wallet/cash/
Invalid argument `import/personal/wallet/cash/'
from hledger-flow.
> @apauley the ability to specify an arbitrary directory has disappeared. Was that intentional?
Batch size is an option on the main hledger-flow command, and if you put it after the import subcommand it is interpreted as an option on import.
I think it should work if you move --batch-size 200 to just before import.
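As a sketch of the corrected ordering, a wrapper can assemble the command line so that the global options come before the import subcommand and the directory comes after it. The function name hlimport_cmd is hypothetical, and it only prints the command rather than running it, so the ordering can be inspected first.

```shell
# Print the hledger-flow command line with global options placed
# before the `import` subcommand. Hypothetical helper for illustration.
hlimport_cmd() {
    # --show-options and --batch-size belong to the main hledger-flow command.
    printf 'hledger-flow --show-options --batch-size 200 import'
    # Whatever follows (for example a directory) is an argument to `import`.
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

hlimport_cmd import/personal/wallet/cash/
# prints: hledger-flow --show-options --batch-size 200 import import/personal/wallet/cash/
```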
from hledger-flow.