
mirror's Introduction


Primary Workflow

Mirror is built to support a two-machine (e.g. desktop+laptop) development workflow where you want to run a command line compile/build process on a powerful/dedicated desktop, but still edit files remotely on a laptop.

This is fairly common (see "Comparison to Existing Options" section below), but what makes Mirror unique is that it is two-way: it simultaneously syncs both laptop-to-desktop as well as desktop-to-laptop, in real time.

For my personal use case, this is to facilitate using an IDE on the laptop (e.g. for code completion, navigation, etc.), and IDEs often need local access to the binary artifacts (or build time-generated source code) from the desktop-hosted build process, e.g.:

  • On your laptop, save projectA/foo.java
    • Mirror sends foo.java to the desktop
  • On your desktop, the build system picks up the projectA/foo.java change and creates projectA-snapshot.jar
    • Mirror sends projectA-snapshot.jar back to the laptop
  • On your laptop, the IDE can now use projectA-snapshot.jar for code completion/etc. when editing projectB/bar.java

Granted, the IDE will also do local compilation of projectA/foo.java, so ideally the IDE could use in-workspace references, where the IDE-compiled projectA/foo.java is already/immediately on the IDE classpath for projectB. If you can set up your projects this way (e.g. with m2e or IvyDE), that is generally preferable.

However, for larger/more complex projects, e.g. those with various pre-/post-compilation code generation steps (that are often only performed in the CLI build), the IDE just can't reproduce the build process closely enough to fully compile projectA on its own, and so using CLI-provided projectX-snapshot.jar artifacts is the only way to do local cross-project imports.

This scenario (local edits + remotely-built cross-project artifacts) is what Mirror addresses.

Goals

  • Real-time, two-way sync between a desktop development machine, and an editor-/IDE-only laptop
  • Native file system events (e.g. inotify) fired when files are changed (otherwise IDEs require explicit refreshes/polling)

Non-Goals

  • Unison-style/long-duration disconnected support
    • mirror will automatically re-connect (e.g. if you close your laptop and then go home) and restart syncing when it detects the server is available again (inspired by mosh), but if files have changed on both sides while disconnected, then the last write wins
    • This heuristic is generally fine; it just means mirror is not meant for a use case of "make new changes on the desktop for a few days, make new changes on the laptop for a few days, and then run mirror once per week to intelligently merge your work". Use git or unison for that; mirror is for real-time syncing.
  • Maintain Unix permissions/owner/group
    • Whatever Unix user runs the mirror commands will be the owner/group/etc. of the files
  • Support for huge files
    • The assumption is that most files are source code, plus occasional binary artifacts that are generally below 100 MB
  • Super-efficient diff/transmission logic like rsync
    • Instead we assume a generally fast network connection (as in "faster than a modem", i.e. mirror works fine over a VPN)
    • Basically, if a file changes, mirror retransmits the whole file instead of trying to diff only what changed
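
The "last write wins" rule from the first non-goal can be sketched as a plain comparison of modification times. This is only an illustration, not mirror's actual implementation: the function name pick_winner and the tie-breaking choice (a tie keeps the local copy) are assumptions made for the example.

```shell
# Hypothetical helper: decide which side's copy survives under last-write-wins.
# Both arguments are modification times in milliseconds since the epoch.
pick_winner() {
  local_mtime=$1
  remote_mtime=$2
  if [ "$remote_mtime" -gt "$local_mtime" ]; then
    echo remote
  else
    # Assumption for this sketch: on a tie the local copy is kept.
    echo local
  fi
}
```

Note that nothing is merged: whichever side has the newer timestamp replaces the other, which is why long disconnected edits on both sides are better handled by git or unison.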

Comparison to Existing Options

I looked at several sync options before starting mirror, but didn't find anything that quite fit:

  • X-Forwarding (to just run everything, IDE included, on the desktop) has noticeable lag, even on a LAN
  • rsync is not two-way, nor real-time
  • unison is two-way, but not real-time
  • lsyncd is real-time, but does not officially support two-way (see issue 303)
  • sshfs is too slow and doesn't support inotify
  • NFS and other network file systems don't support inotify
  • SyncThing is a dropbox alternative if you're looking for a backup solution
  • doppleganger (an internal tool) is real-time, but not two-way
    • It also generally assumes you can run the build tool (e.g. gradle) on both laptop and desktop, and for my setup I want to only run it on the desktop
    • Similarly, doppleganger does let you have a git working copy on both laptop/desktop, but so far mirror generally assumes you have a git copy on only one of the desktop or laptop (your choice) and then the other side is just a dumb copy. This means git commands, like git log, tig, etc., will only work on whatever machine you have your git checkout on.

Install

To use the latest release/pre-built jars:

  • Install Java 8
    • You can try running java -version to see if Java is already installed
    • If not, you'll need a platform-specific installation step, like sudo apt install openjdk-8-jre on Ubuntu
  • Install watchman
  • Download the latest mirror and mirror-all.jar to your home directory (or some other directory on your path, e.g. ~/bin)
    • wget -P ~/ https://github.com/stephenh/mirror/releases/latest/download/mirror-all.jar
    • wget -P ~/ https://github.com/stephenh/mirror/releases/latest/download/mirror
  • Make the mirror file executable
    • chmod u+x mirror
  • Copy both to your remote home directory (or some other directory on your path, e.g. ~/bin)
    • scp mirror-all.jar your-desktop.com:~/
    • scp mirror your-desktop.com:~/
  • Start the server-side from the desktop's home directory
    • ./mirror server
  • Start the client-side from the laptop's home directory
    • ./mirror client -h your-desktop.com -l ./code/ -r ./code/
    • Note: Be careful using the tilde (e.g. ~/code), as your shell will resolve that, e.g. to /Users/you/code, and that resolved path on the client might not be valid on the server

This will sync the $HOME/code directory on your two machines.
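
Before starting either side, it can help to confirm that the prerequisites from the steps above are actually on the PATH. The following is a minimal sketch; the helper name missing_tools is made up for this example and is not part of mirror.

```shell
# Hypothetical helper: print the name of every given command that is not on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example usage (run on both the laptop and the desktop):
#   missing_tools java watchman
# No output means both commands were found.
```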

(Note: for Arch Linux users, AUR has the mirror-sync and mirror-sync-git packages.)

Config

By default, mirror will not sync any files in your .gitignore files.

However, you can also configure mirror with extra includes or excludes in addition to the .gitignore, e.g. if for some reason you want to not sync files that are not ignored, or sync files that are actually ignored (e.g. certain build artifacts).

Extra includes and excludes patterns can be passed when starting the client, and follow the .gitignore format, e.g.:

./mirror client --include '*-SNAPSHOT.jar' --include '.classpath' --include '.project' --exclude 'build/'

If you'd like mirror to completely ignore your .gitignore files, i.e. to sync everything, you can use --include '*'.

(There is also a --use-internal-patterns that has useful defaults if you work at the same place I do.)
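
If you pass the same set of patterns every time, a small wrapper can assemble the command line for you. This is a sketch only: the function name client_command is hypothetical, the host and ./code/ roots are the placeholders from the Install section, and it only handles --include patterns. It prints the command rather than running it, so you can inspect it first.

```shell
# Hypothetical helper: print a mirror client command with one --include per pattern.
# Patterns are wrapped in single quotes so the shell does not expand them.
client_command() {
  host=$1
  shift
  cmd="./mirror client -h $host -l ./code/ -r ./code/"
  for pattern in "$@"; do
    cmd="$cmd --include '$pattern'"
  done
  echo "$cmd"
}

# Example usage:
#   client_command your-desktop.com '*-SNAPSHOT.jar' '.classpath' '.project'
```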

Help

Options for both the client and server commands are available by running:

`./mirror help client`

Or:

`./mirror help server`

For example, the client supports --debug-all and --debug-prefixes command line parameters to output debug info for all or subsets of the paths (setting these parameters on the client session will pass them to the client's server session, so that the server will use those options for its own session with that specific client).

Git Usage Note

(Update February 2020: Lately I have been ignoring this advice and passing -i .git to have both sides sync their .git metadata so that staging/index/etc. status is all shared, and it's been working out really well so far.)

In general, mirror will work best if you have just one machine (either the desktop or laptop) have the git (or svn/other SCM) working copy (or working copies if you're syncing a directory like ~/code with multiple repositories), and have the non-git machine just get all of its files via mirror from the git-using machine, and not by also having its own git working copy.

The reason is that if you had git working copies on both the laptop and desktop, you could run into:

  1. Desktop: git checkout master
  2. Laptop: git checkout master
  3. Run mirror, everything syncs fine, because they're on the same branch
  4. On the laptop, run git checkout feature_a
  5. mirror copies all the file changes on the laptop (basically all feature_a's changes) to the server
  6. Now the server git directory thinks it's still on master, but we've written the feature_a files to it

Basically, mirror makes no attempts to keep your two git working copies on the same branch, and instead just naively copies files back/forth.

So, instead, it works better if you only ever run git commands on the desktop, and so any branch changes happen there, and then the laptop just follows/copies wherever the server is at.

The downside of this approach is that you can't use git log, git blame, etc. on the laptop. But the upside is that it's pretty simple, and makes it harder/less likely to accidentally nuke code by either knowingly or unknowingly getting the two machines on different branches, and then having one overwrite the other.

System Watch Limits

Note that if you have a lot of directories, you might have to increase the native file system limits, e.g.

  • For Linux see this readme
    • echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p
    • echo fs.inotify.max_queued_events=50000 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p
  • For Mac, see the watchman docs

Watchman Config

If you have an extremely large number of files that you don't want to sync, .gitignore may not be enough, because mirror and the underlying Watchman tool will still initially load those files into memory, even to decide "oh right, don't sync them".

To have Mirror and watchman fundamentally ignore things, you can create a .watchmanconfig file with the ignore_dirs property set.

See the watchman config.
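
As a rough sketch, a .watchmanconfig with ignore_dirs could be generated like this. The helper name write_watchmanconfig and the directory names in the usage comment are made up for illustration; the sketch assumes the file lives in the root of the synced directory and that the names are paths relative to that root.

```shell
# Hypothetical helper: write a .watchmanconfig into TARGET_DIR that lists
# the remaining arguments under "ignore_dirs".
write_watchmanconfig() {
  target_dir=$1
  shift
  list=""
  for d in "$@"; do
    # Append each name as a quoted JSON string, comma-separated.
    list="${list:+$list, }\"$d\""
  done
  printf '{\n  "ignore_dirs": [%s]\n}\n' "$list" > "$target_dir/.watchmanconfig"
}

# Example usage:
#   write_watchmanconfig ~/code build node_modules
```

The helper does no JSON escaping, so it is only suitable for simple directory names without quotes or backslashes.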

Syncing More than Two Machines

Although not an initial design goal, due to its approach, mirror also supports a hub and spoke model of syncing more than two machines.

E.g. you could have:

  1. Desktop runs the mirror server process (you don't need to start multiple mirror server processes)
  2. Laptop 1 connects to the desktop and syncs its ~/code to the desktop's ~/code
  3. Laptop 2 also connects to the desktop and syncs its ~/code to the desktop's ~/code

Now all three machines will be kept in sync.

Syncing Jar Caches

If you're running most build commands on your desktop, but the IDE on your laptop, your .classpath/etc. files will likely have references to downloaded jars, e.g. in the Maven cache or Gradle cache directories.

Since these files are cached and so don't change, and aren't created very often (only when dependencies are updated), I currently sync these jar caches as needed with a shell script:

rsync -azP <user>@<desktop-host>:.ivy2/sbt/ ~/.ivy2/sbt/
rsync -azP <user>@<desktop-host>:.gradle/caches/ ~/.gradle/caches/

In theory mirror could keep these in sync as well, by just running ~2-3 more invocations of the mirror client, though it would also be nice to pass in multiple sync directories in a single mirror command invocation, e.g.:

mirror \
  --sync ./.ivy2/sbt:./.ivy2/sbt \
  --sync ./.gradle/caches:./.gradle/caches \
  --sync ./code:./code

But currently mirror only supports a single remote/local sync at a time.
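
Until something like that exists, the "several invocations" approach can be sketched as one client command per directory. The helper name sync_commands is hypothetical and the paths in the usage comment are just the ones from the example above; the helper only prints the commands, so each can be started in its own terminal. I have not verified how well several concurrent clients behave for these particular cache directories.

```shell
# Hypothetical helper: print one mirror client command per directory,
# using the same relative path on both the local and the remote side.
sync_commands() {
  host=$1
  shift
  for dir in "$@"; do
    echo "./mirror client -h $host -l $dir -r $dir"
  done
}

# Example usage:
#   sync_commands your-desktop.com ./.ivy2/sbt/ ./.gradle/caches/ ./code/
```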

Secure Communication

mirror currently uses plain text for its communication protocol, as the primary use case is syncing a desktop/laptop that are on an assumed-secure internal network or VPN connection.

The underlying RPC framework, gRPC, supports TLS communication, but mirror currently does not leverage that.

If you need secure communication, e.g. are syncing across the open Internet with sensitive data, you can use SSH tunneling, e.g.:

  • On your client, run ssh -L 49172:localhost:49172 your-remote-host
  • On your remote host, start the server as usual, ./mirror server
  • On your client, run mirror with ./mirror client -h localhost ...

This will have the mirror client send traffic to the localhost:49172 port, which SSH will securely tunnel to your remote host.
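
If you would rather not keep an interactive SSH session open just for the tunnel, OpenSSH can run it in the background. The sketch below only prints the command; the helper name tunnel_command is made up, and the use of -f (go to the background after authentication) and -N (do not run a remote command) is an addition to the steps above rather than something mirror requires.

```shell
# Hypothetical helper: print an ssh command that forwards a local port to the
# same port on the remote host. The port defaults to mirror's 49172.
tunnel_command() {
  remote_host=$1
  port=${2:-49172}
  echo "ssh -f -N -L $port:localhost:$port $remote_host"
}

# Example usage:
#   tunnel_command your-remote-host
# then, once the tunnel is up:
#   ./mirror client -h localhost ...
```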

Running with Docker

A docker image is available at quay.io/stephenh/mirror.

Since the container will have its own filesystem separate from the host's filesystem, usually you'll want to mount some directory into the container to make it available for synchronisation. To mount the current working directory as /data into the container, pass -v $(pwd):/data to docker.

By default docker runs processes as root, which results in all written files owned by root on the host system. To run the process as your current user, pass -u $(id -u):$(id -g) to docker.

To start a mirror server available on port 49172, with the current working directory mounted at /data run:

docker run --rm --init -it -u $(id -u):$(id -g) -v $(pwd):/data -p 49172:49172 \
  quay.io/stephenh/mirror server

To start a mirror client with the current working directory mounted at /data (and syncing the local /data with the remote /data) run:

docker run --rm --init -it -u $(id -u):$(id -g) -v $(pwd):/data \
  quay.io/stephenh/mirror client \
  --local-root /data \
  --remote-root /data \
  --host <SERVER-HOST>

Compiling/Contributing

If you want to hack on mirror locally, you should be able to:

  • Clone this repository
  • Run ./gradlew shadowJar
    • This will download mirror's dependencies and produce an all-in-one jar in build/lib/mirror-all.jar
  • Run ./mirror ... (e.g. either mirror client or mirror server)
    • The mirror script in the base directory should pick up your locally-built build/lib/mirror-all.jar

If you want to use your locally-built jar on your remote host, you'll need to scp the new mirror-all.jar to your remote host, into the same directory as the mirror script, which will then use your new mirror-all.jar instead of the previously-downloaded version.

Todo

  • Configuration via a .mirrorrc file (if necessary)
  • Really easy setup/install process, e.g.:
    • Currently you have to run mirror ... and mirror ... on both sides
    • Ideally you could just run mirror on the client, and it would self-start the desktop-side process by SSH'ing the jar over
    • Maybe the server-side should be an always-running daemon? Same thing with the client-side?
  • Really easy upgrade process
    • Ping repo.joist.ws/mirror-version.txt, and if there is a new version, exit with a code that tells the boot script to redownload the jar
  • Include file hashes to avoid sending files with different modification times
    • See Digest for an experiment; collecting digests does slow down the initial scan, but not egregiously
    • Currently needless syncs don't seem that frequent/expensive (text files are small), so this is low priority
  • Use file hashes to detect renames
    • Right now renaming ~/code/project-a to ~/code/project-b will treat every file as a delete+create
    • So far seems like a limited use case, and also is still handled pretty quickly, so this is also low priority
  • Support svn:ignore
    • Would be easiest by converting it in-memory to .gitignore-format via something like this
    • Or else just suggest users use git-svn for all svn repos
  • Support .git/info/exclude?
    • Not sure how often this is used

mirror's People

Contributors

ajdavis, barrychapman, ccat3z, ediphy-azorab, ianvkoeppe, njam, reasonableperson, stephenh


mirror's Issues

How to keep the file permissions?

Hi @stephenh,

I have a source dir with multiple files that have different permissions. How can I keep the same file permissions on both sides? At the moment the mirror service changes the file permissions to those of the user running the mirror service.

Please help me in fixing this issue.

Thanks and Regards,
Ramesh AR

Syncing .git folder

Hey @stephenh, your project seems very promising to use it with my development docker image to sync files between local and remote environment.

I've read the Git Usage Note, but I think it isn't entirely true. AFAIK git doesn't have repo state outside of the .git folder. It seems to work great on a small test git repo. Everything is in sync: files, stash, .git/config, current branch etc. But on a slightly larger repository it gets out of sync. I'm not familiar with inotify and mirror, but I think there is some bug when multiple files are exchanged the way git does it. I don't think this is a git-only issue. Maybe some events are missing? What do you think about it?

read the host name from ~/.ssh/config

I have ssh configured w/ short names / alias to most of my hosts. It would be awesome if mirror read this config and used these short names to resolve the full name of the host. On the plus side, my ssh config seems to work fine otherwise.

UNAVAILABLE: Keepalive failed. The connection is likely gone

Hi,

I faced this problem. Server and client are running but I get this error. The port 49172 is opened, I can telnet into it. The mirroring was working yesterday.

2019-08-04 15:02:53 INFO Stopping session
2019-08-04 15:02:53 INFO Connected, starting session, version unspecified
2019-08-04 15:04:58 ERROR Error returned from runOneLoop
java.lang.RuntimeException: java.util.concurrent.ExecutionException: io.grpc.StatusRuntimeException: UNAVAILABLE: Keepalive failed. The connection is likely gone
at mirror.MirrorClient.doTimeCheck(MirrorClient.java:249)
at mirror.MirrorClient.logErrorIfTimeOutOfSync(MirrorClient.java:209)
at mirror.MirrorClient.startSession(MirrorClient.java:71)
at mirror.MirrorClient.access$300(MirrorClient.java:27)
at mirror.MirrorClient$SessionStarter.runOneLoop(MirrorClient.java:198)
at mirror.tasks.ThreadBasedTask.run(ThreadBasedTask.java:62)
at mirror.tasks.ThreadBasedTask.lambda$new$0(ThreadBasedTask.java:39)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: io.grpc.StatusRuntimeException: UNAVAILABLE: Keepalive failed. The connection is likely gone
at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:552)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:533)
at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:90)
at mirror.MirrorClient.doTimeCheck(MirrorClient.java:240)
... 7 common frames omitted
Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: Keepalive failed. The connection is likely gone
at io.grpc.Status.asRuntimeException(Status.java:526)
at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:434)
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:678)
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:397)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:459)
at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:63)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:546)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$600(ClientCallImpl.java:467)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:584)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

Files are repeatedly transferred

Every time a "mirror client" command is run, it seems that a large number of files, presumably all files "visible" to mirror (see #22), are transferred one way or another from server to client, updating the ctime on the files.

for example, on the server:

root@www-1:/var/www/html/edge/i/flags# ls -al Finland.png 
-rw-r--r-- 1 root root 3036 Oct 27 21:38 Finland.png
root@www-1:/var/www/html/edge/i/flags# stat Finland.png 
  File: Finland.png
  Size: 3036      	Blocks: 8          IO Block: 4096   regular file
Device: 801h/2049d	Inode: 308285      Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2018-10-27 22:44:48.945791967 +0100
Modify: 2018-10-27 21:38:38.372526215 +0100
Change: 2018-10-27 21:38:38.372526215 +0100
 Birth: -

whereas on the client:

me@devBox:/var/www/html/edge/i/flags$ ls -al Finland.png 
-rw-r--r-- 1 ferenc ferenc 3036 Oct 27 21:38 Finland.png
me@devBox:/var/www/html/edge/i/flags$ stat Finland.png 
  File: Finland.png
  Size: 3036      	Blocks: 8          IO Block: 4096   regular file
Device: 805h/2053d	Inode: 1835299     Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/  ferenc)   Gid: ( 1000/  ferenc)
Access: 2018-10-27 21:38:38.372000000 +0100
Modify: 2018-10-27 21:38:38.372000000 +0100
Change: 2018-10-28 16:48:06.759671332 +0000
 Birth: -

I don't know what the basis is for your file comparison, but surely this can be avoided by using mtime as a point of comparison and writing a synchronised mtime at the same time as writing the file, or even just updating the mtime on the file after the write?

I rely on the mtime to see the latest files I am working on, and don't really want the mtime interfered with by any external process.

I am running linux mint 19(~ubuntu 18 bionic) without watchman installed (see #20)
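
The suggestion above, writing the file and then setting a synchronised mtime, can be illustrated with standard tools. This is a sketch of the idea only, not a description of what mirror does internally; the helper name copy_keep_mtime is made up for the example.

```shell
# Hypothetical helper: copy SRC to DST, then set DST's timestamps from SRC,
# so the copy does not appear newer than the original.
copy_keep_mtime() {
  cp "$1" "$2" && touch -r "$1" "$2"
}

# Example usage:
#   copy_keep_mtime Finland.png /tmp/Finland.png
```

Note that the ctime of the destination will still change, since the kernel updates it on every write or metadata change; only the mtime (and atime) can be carried over this way.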

java.lang.RuntimeException: java.util.concurrent.ExecutionException: com.facebook.watchman.WatchmanException: unknown command watch-project

Hi @stephenh,

I am getting the below error on the client-side.
2020-04-25 16:51:53 INFO Connected, starting session, version 1.3.6-3-gb47dd9f-dirty
2020-04-25 16:51:53 ERROR Exception starting the client
java.lang.RuntimeException: java.util.concurrent.ExecutionException: com.facebook.watchman.WatchmanException: unknown command watch-project
at mirror.watchman.WatchmanImpl.run(WatchmanImpl.java:67)
at mirror.watchman.WatchmanFileWatcher.startWatchAndInitialFind(WatchmanFileWatcher.java:179)
at mirror.watchman.WatchmanFileWatcher.performInitialScan(WatchmanFileWatcher.java:132)
at mirror.MirrorSession.calcInitialState(MirrorSession.java:77)
at mirror.MirrorClient.startSession(MirrorClient.java:88)
at mirror.MirrorClient.access$300(MirrorClient.java:27)
at mirror.MirrorClient$SessionStarter.runOneLoop(MirrorClient.java:198)
at mirror.tasks.ThreadBasedTask.run(ThreadBasedTask.java:62)
at mirror.tasks.ThreadBasedTask.lambda$new$0(ThreadBasedTask.java:39)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: com.facebook.watchman.WatchmanException: unknown command watch-project
at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:531)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:512)
at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:83)
at mirror.watchman.WatchmanImpl.run(WatchmanImpl.java:65)
... 9 common frames omitted
Caused by: com.facebook.watchman.WatchmanException: unknown command watch-project
at com.facebook.watchman.WatchmanConnection$IncomingMessageThread.run(WatchmanConnection.java:257)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
... 1 common frames omitted
2020-04-25 16:51:53 INFO Stopping session

I have downloaded watchman from https://download.copr.fedorainfracloud.org/results/codeblock/watchman/epel-6-x86_64/watchman-2.9.5-1.fc20/watchman-2.9.5-1.el6.x86_64.rpm and installed it on both servers.

Please help me in fixing the issue and let me know if I missed anything.

Thanks and Regards,
Ramesh AR
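
One thing worth checking for this error: to the best of my knowledge the watch-project command was added in watchman 3.1, so a 2.9.5 install would not recognise it. A version check along the following lines could confirm that. The helper name version_at_least is made up, it only compares the major and minor components, and the 3.1 threshold is my assumption rather than something stated in mirror's docs.

```shell
# Hypothetical helper: succeed if ACTUAL is at least REQUIRED,
# comparing only the major and minor components (e.g. 2.9.5 vs 3.1).
version_at_least() {
  actual=$1
  required=$2
  a_major=${actual%%.*}
  r_major=${required%%.*}
  # Treat a missing minor component as 0.
  case $actual in *.*) a_rest=${actual#*.} ;; *) a_rest=0 ;; esac
  case $required in *.*) r_rest=${required#*.} ;; *) r_rest=0 ;; esac
  a_minor=${a_rest%%.*}
  r_minor=${r_rest%%.*}
  if [ "$a_major" -ne "$r_major" ]; then
    [ "$a_major" -gt "$r_major" ]
  else
    [ "$a_minor" -ge "$r_minor" ]
  fi
}

# Example usage:
#   version_at_least "$(watchman --version)" 3.1 || echo "watchman looks too old"
```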

Deadline exceeded after 3 minutes

Hi @stephenh,

I am getting the below error on the client-side.
2020-04-28 19:40:40 INFO Connection status: Status{code=DEADLINE_EXCEEDED, description=deadline exceeded after 179997852647ns, cause=null}
2020-04-28 19:40:40 INFO Stopping session
2020-04-28 19:40:40 ERROR Exception starting the client
java.util.concurrent.ExecutionException: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 179997852647ns
at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:531)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:512)
at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:83)
at mirror.MirrorClient.startSession(MirrorClient.java:124)
at mirror.MirrorClient.access$300(MirrorClient.java:27)
at mirror.MirrorClient$SessionStarter.runOneLoop(MirrorClient.java:198)
at mirror.tasks.ThreadBasedTask.run(ThreadBasedTask.java:62)
at mirror.tasks.ThreadBasedTask.lambda$new$0(ThreadBasedTask.java:39)
at java.lang.Thread.run(Thread.java:748)
Caused by: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 179997852647ns
at io.grpc.Status.asRuntimeException(Status.java:533)
at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:442)
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:700)
at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:399)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:507)
at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:66)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:627)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:515)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:686)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:675)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
... 1 common frames omitted
2020-04-28 19:40:40 INFO Stopping session
2020-04-28 19:40:40 INFO Connected, starting session, version 1.3.6-3-gb47dd9f-dirty

Please help me in fixing this issue.

Thanks and Regards,
Ramesh AR

Do not respect .gitignore content

Hi,

Thanks for the project, it works like a charm over a large code repository (~3G).

I wonder how I can disable all ".gitignore" files (the main one and those in subfolders) and sync all files?

Thanks

Reverting to older files or state is troublesome

I have found 3 use cases that seem to highlight this issue:

  1. I accidentally moved a file from my dev folder instead of copying it. The remote file was deleted on the server - so far this is correct. Then when I tried to move the file back to the original folder, I guess mirror's logic is "that delete operation on the remote outdates the mtime of the old file so I'll delete it on the client".
2018-10-31 15:22:45 INFO  dev/adminBugs.php isLocalNewer
2018-10-31 15:22:45 INFO    l: modTime: 1540672411588 delete: true local: true
2018-10-31 15:22:45 INFO    r: modTime: 1540672410588 data: "initialSyncMarker" local: true
2018-10-31 15:22:45 INFO  Sending (delete) dev/adminBugs.php
---
2018-10-31 15:23:05 INFO  Queueing: path: "dev/adminBugs.php" modTime: 1540672410588 local: true
2018-10-31 15:23:05 INFO  Queueing: path: "dev" modTime: 1540999385424 local: true directory: true executable: true
2018-10-31 15:23:05 INFO  dev/adminBugs.php isRemoteNewer
2018-10-31 15:23:05 INFO    l: modTime: 1540672410588 local: true
2018-10-31 15:23:05 INFO    r: modTime: 1540672411588 delete: true local: true
2018-10-31 15:23:05 INFO  Remote delete dev/adminBugs.php
2018-10-31 15:23:05 INFO  Queueing: path: "dev/adminBugs.php" delete: true local: true

The workaround is to touch the file before copying it back.

  2. I want to revert to an older version of a file so I overwrite it with a backup, but the file change is not propagated to the server:
2018-10-31 15:08:29 INFO  Queueing: path: "edge/adminBugs.php" modTime: 1540672410588 local: true
2018-10-31 15:08:29 INFO  Queueing: path: "edge" modTime: 1540998509051 local: true directory: true executable: true
2018-10-31 15:08:29 INFO  Queueing: path: "edge/.goutputstream-V8SQRZ" delete: true local: true

Again, touching the file cures this but I would prefer to maintain mtime on my files wherever possible.

  3. While dropping a large directory into my dev folder on the client, I ran out of space on the remote and mirror wrote a bunch of 0-length files. Deleting the 0-length files on the remote then deleted the counterparts on the client. Clearing out enough space on the server and then re-copying that directory on the client resulted in the files being deleted from the client because the delete operation is newer than the mtime.
    The workaround was to stop mirror on both ends, copy the files manually on both sides, and restart mirror on both sides.

I would guess there's probably some information in the ext3/ext4 journal that can be relied upon to be sure which is the latest operation, because it's not necessarily the latest mtime that we want to preserve.

Docker image build on Quay stopped working after 2018-11-18

The credentials on Quay to clone the git repo don't work anymore.

Could not clone git repository: Error cloning git repository (exit status 128) Cloning into '/tmp/build_pack268724775'... Warning: Permanently added 'github.com,192.30.253.113' (RSA) to the list of known hosts. Permission denied (publickey). fatal: Could not read from remote repository. Please make sure you have the correct access rights and the repository exists.
https://quay.io/repository/stephenh/mirror/build/74e2adbd-7765-47dd-8b61-9c78602ca1ed

@stephenh could you try to delete and re-add the build trigger? I guess it should solve the problem.
-> https://quay.io/repository/stephenh/mirror?tab=builds

  • Trigger: "Trigger for all branches and tags (default)"
  • Dockerfile: "/Dockerfile"
  • Context: "/"

grpc DEADLINE_EXCEEDED error when syncing

Hi,

I followed the guidelines in the docs this morning to set up mirror, but the client just waits for several minutes before quitting with the following error:

$ ./mirror client -h <my server address> -l ./sync_workspace/ -r ./sync_workspace/
2017-05-01 11:02:44 INFO   Increasing file limit to 9223372036854775807
2017-05-01 11:02:45 INFO   Connected, starting session, version unspecified
2017-05-01 11:02:45 INFO   Client has 1 paths
2017-05-01 11:05:45 SEVERE Error from incoming server stream
io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED
	at io.grpc.Status.asRuntimeException(Status.java:540)
	at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:392)
	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:426)
	at io.grpc.internal.ClientCallImpl.access$100(ClientCallImpl.java:76)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:512)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:429)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:544)
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:52)
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:117)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

2017-05-01 11:05:45 INFO   Stopping session
2017-05-01 11:05:45 SEVERE Error returned from runOneLoop
java.lang.RuntimeException: java.io.IOException: Bad file descriptor
	at mirror.watchman.WatchmanFileWatcher.runOneLoop(WatchmanFileWatcher.java:102)
	at mirror.tasks.ThreadBasedTask.run(ThreadBasedTask.java:59)
	at mirror.tasks.ThreadBasedTask.lambda$new$0(ThreadBasedTask.java:37)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Bad file descriptor
	at jnr.enxio.channels.NativeSocketChannel.read(NativeSocketChannel.java:80)
	at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:59)
	at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109)
	at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
	at com.google.common.io.ByteStreams.read(ByteStreams.java:822)
	at com.facebook.buck.bser.BserDeserializer.readBserBuffer(BserDeserializer.java:123)
	at com.facebook.buck.bser.BserDeserializer.deserializeBserValue(BserDeserializer.java:113)
	at mirror.watchman.WatchmanChannelImpl.read(WatchmanChannelImpl.java:92)
	at mirror.watchman.WatchmanFileWatcher.runOneLoop(WatchmanFileWatcher.java:98)
	... 3 more

and on my remote dev machine, it had the following log before quitting:

$ ./mirror server
2017-05-01 11:02:26 INFO   Listening on 49172, version 1.0.8
2017-05-01 11:02:46 INFO   Starting new session 1 for + /home/<myusername>/./sync_workspace
^C2017-05-01 11:07:56 INFO   Watchman not found, using WatchService instead
2017-05-01 11:07:56 INFO     Note that WatchService is buggy on Linux, and uses polling on Mac.

Any idea on what might be wrong? I think it might be related to watchman, but I have the latest version of watchman (v4.7.0) installed both on my laptop and remote dev machine.

P.S. I masked my server info and username, but we are from the same company. Really like this tool, very helpful!

./gradlew shadowJar produces wrong jar name

It's possibly an artifact of my environment, but the output of the shadowJar task is called mirror.jar on my system, not mirror-all.jar.

It's simple enough to fix up the build scripts, but I don't want to do it if it's only my box it's happening on.

Optimize large (~500K to 1M+) number of paths

Hi there! I just stumbled upon this project today and it seems like exactly what I need to replace my very slow SSHFS setup, so thank you very much for all your work!

I've been playing around with it and settings things up in my environment, where (similarly to your examples) I have a folder containing various code projects, and each of these code projects contains quite a few files. After seeing things work in each project folder individually, my plan was to effectively 'mount' all of them at once and let mirror handle the syncing. I started off with an initial rsync which pulled in most of the data.

What I'm finding is that for a folder with all my projects (mirror reports that the server has 219717 paths), syncing appears to only work one way: my client can make a change and have it reflected on the server, but not the other way round. If I restart the client or server then things do get back in sync during the initial sync that occurs.

So I'm wondering if this is related to the inotify limits that you mention in the readme. Unfortunately I'm in an environment on the server where I can't change those limits. Interestingly though, watchman itself seems to detect the changes that mirror isn't responding to: I set up a trivial trigger to echo files that are changed, and I see them in the watchman log. I'm unsure if there's a way to access more verbose logs from mirror, so at this point I'm at a bit of a dead end. I took a look at some of the source code but couldn't work out where to start without access to a debugger, and my experience debugging java code is a little lacking :(

My workaround for now will likely be to spawn individual clients for each of my project folders as required, as that seems to avoid this problem. But if there is a way to have the single code/ folder picked up from one client, it would make managing those processes a little easier for sure.

logs are not rolling.

Hi @stephenh,

Initially, I tested the bi-directional sync with a small number of files and could see the logs rolling.
Later, I tested the bi-directional sync with more files, and the logs no longer roll.

Please help me in fixing the issue.

Thanks and Regards,
Ramesh AR

mirror doesn't work two-way in all cases

Hi! This project is awesome, exactly what I need.

I'm on Linux, have watchman installed, and I'm trying to sync a directory with just one file currently. Initially, the file exists in desktop and the directory is empty on laptop. mirror is in PATH in both machines.

On desktop, I cd to the directory I want to sync, and run this command: mirror server. On laptop, I also cd to the directory, and run this command: mirror client -h desktop -l . -r .. Initially, the file is synced correctly to laptop.

After starting mirror on both machines, if I edit the file on laptop first and then on desktop, everything works as expected afterwards, and the file syncs in both ways.

But if I make the first edit on desktop, then changes made on laptop don't sync back to desktop. Furthermore, when I Ctrl-C the mirror running on laptop, the file is emptied. Not removed, but emptied.

Option to sync .git directory

I tried adding "--include .git" to the "mirror client" command line but it still seems to ignore the .git directory. I understand your reasons for ignoring it by default but I'd like to sync it. Is there a way to override the default exclude rule for .git?

NullPointerException if "mirror client" executed without remote path argument

I just installed mirror on my macOS 10.13.6 laptop, and "mirror client" fails at startup:

> mirror client
2019-01-19 09:35:09 INFO  Increasing file limit to 9223372036854775807
Exception in thread "main" java.lang.NullPointerException
	at sun.nio.fs.UnixPath.normalizeAndCheck(UnixPath.java:77)
	at sun.nio.fs.UnixPath.<init>(UnixPath.java:71)
	at sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:281)
	at java.nio.file.Paths.get(Paths.java:84)
	at mirror.Mirror$MirrorClientCommand.runIfChecksOkay(Mirror.java:191)
	at mirror.Mirror$BaseCommand.run(Mirror.java:100)
	at mirror.Mirror.main(Mirror.java:55)

This seems to happen no matter what options I pass. "mirror server" succeeds:

> mirror server
2019-01-19 09:36:09 INFO  Increasing file limit to 9223372036854775807
2019-01-19 09:36:10 INFO  Listening on 49172, version unspecified

I installed mirror like:

wget http://repo.joist.ws/mirror-all.jar ~/
wget http://repo.joist.ws/mirror ~/
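The trace above ends in `Paths.get`, which throws a NullPointerException when handed a null string. A minimal sketch of failing earlier with a readable message, assuming the remote path arrives as a possibly-null option value; `ClientArgs` and `requiredPath` are hypothetical names, not part of mirror:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class ClientArgs {
    /** Returns the path for a required option, or fails with a readable message. */
    static Path requiredPath(String optionName, String value) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("Missing required option: " + optionName);
        }
        return Paths.get(value);
    }

    public static void main(String[] args) {
        try {
            // Simulates running the client without -r.
            requiredPath("-r", null);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check like this, a missing option would surface as a usage error instead of a stack trace from inside `sun.nio.fs`.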

Don't drop non-utf8 file paths

I can get mirror working fine as a client without watchman, but with watchman installed I get this:
I am running linux mint 19 (~ubuntu 18 bionic).
Same result whether running as user or root
Same result with whichever version of openjdk-8/9/10/11-jre
A quick google and it appears to be related to character encodings. It may help to mention that I am in the UK and most of my system defaults to UTF-8, but it may be related to some form of internationalisation.
With reference to your notes about WatchService, I notice that JDK-8145981 is now fixed - is WatchService still considered buggy in the latest release and is watchman still recommended/required for stability?

$mirror client -h localhost -l /var/www/html -r /var/www/html
2018-10-28 16:15:39 INFO Connected, starting session, version unspecified
2018-10-28 16:15:41 INFO Watchman root is /var/www/html
2018-10-28 16:15:41 ERROR Exception starting the client
java.nio.charset.MalformedInputException: Input length = 1
at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:816)
at com.facebook.buck.bser.BserDeserializer.deserializeString(BserDeserializer.java:236)
at com.facebook.buck.bser.BserDeserializer.deserializeRecursiveWithType(BserDeserializer.java:332)
at com.facebook.buck.bser.BserDeserializer.deserializeTemplate(BserDeserializer.java:302)
at com.facebook.buck.bser.BserDeserializer.deserializeRecursiveWithType(BserDeserializer.java:338)
at com.facebook.buck.bser.BserDeserializer.deserializeRecursive(BserDeserializer.java:313)
at com.facebook.buck.bser.BserDeserializer.deserializeObject(BserDeserializer.java:276)
at com.facebook.buck.bser.BserDeserializer.deserializeRecursiveWithType(BserDeserializer.java:336)
at com.facebook.buck.bser.BserDeserializer.deserializeRecursive(BserDeserializer.java:313)
at com.facebook.buck.bser.BserDeserializer.deserializeBserValue(BserDeserializer.java:113)
at mirror.watchman.WatchmanChannelImpl.read(WatchmanChannelImpl.java:93)
at mirror.watchman.WatchmanChannelImpl.query(WatchmanChannelImpl.java:87)
at mirror.watchman.WatchmanFileWatcher.startWatchAndInitialFind(WatchmanFileWatcher.java:197)
at mirror.watchman.WatchmanFileWatcher.performInitialScan(WatchmanFileWatcher.java:140)
at mirror.MirrorSession.calcInitialState(MirrorSession.java:78)
at mirror.MirrorClient.startSession(MirrorClient.java:88)
at mirror.MirrorClient.access$300(MirrorClient.java:27)
at mirror.MirrorClient$SessionStarter.runOneLoop(MirrorClient.java:198)
at mirror.tasks.ThreadBasedTask.run(ThreadBasedTask.java:62)
at mirror.tasks.ThreadBasedTask.lambda$new$0(ThreadBasedTask.java:39)
at java.lang.Thread.run(Thread.java:748)
2018-10-28 16:15:41 INFO Stopping session

File and Folder Deletes are not syncing

I'm attempting to use your mirroring tool, and ran into an issue where file and folder deletes are almost never propagated to the other clients or server (occasionally the delete does propagate, but it's so rare I don't know what triggers it). With the --debug-all flag set I can see the delete is always queued, but I rarely if ever see it propagated out to the other clients and/or server.

When I downloaded the source code so I could step through it and see where the issue originates, I found I can't compile it because 16 of the class sources are not checked into source control (like Update.java and InitialSync*.java).

I would have liked to help you pinpoint where the problem is, but without the complete source I can't find where the issue is.

Dropped flows

We are currently seeing 6,300,000 dropped flows. This only appears to happen when some housekeeping is running on the database; I have seen it while it is deleting from flowsv4.

We are running 3.6.181112 - Community Edition on CentOS Linux release 7.5.1804 (Core)
free -h reports

              total        used        free      shared  buff/cache   available
Mem:           3.7G        759M        169M        3.4M        2.8G        2.6G

Database Server version: 5.5.60-MariaDB

In the ntopng GUI, the number of hosts is 5726 and the number of flows is 13282.

In my ntopng.conf, -x and -X are set to -x 9000 -X 20000. At the default levels we saw a high number of dropped packets; it is now at 888,333 Pkts [ 0.08 % ].

I am new to ntopng, so it is very possible I have missed something.

Local folder doesn't appear to sync anymore

I had a folder which synced fine, but something seems to have happened which causes mirror to send no updates (other than during the initial scan).

I verified that watchman is listening to the changes via https://facebook.github.io/watchman/docs/watchman-replicate-subscription.html (from https://github.com/facebook/watchman/blob/master/python/bin/watchman-replicate-subscription), but something must be lost along the way inside mirror which causes it to ignore these changes after the subscription returns info.

It only seems to happen for one folder at the moment (which I unfortunately don't want to share entirely because it has sensitive work data in it), but I can't see anything special about it. And if my memory serves correctly, it did use to sync as of a couple of days ago.

I opened #8 since I got stuck trying to add print statements etc. which I would do to try and track this down further. From browsing the github source, I also lost track of where the watchman subscription is handled in the code to work out why it might be ignored.

I tried using the --debug flag, but it didn't print any DEBUG statements :(

"OutOfMemoryError: Java heap space" error while running mirror

Hi,

My server and client both hit the following error while syncing.
server:
➜ src free -h
              total        used        free      shared  buff/cache   available
Mem:           62Gi       3.7Gi        36Gi       1.0Mi        22Gi        58Gi
Swap:         2.0Gi          0B       2.0Gi

java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid21.hprof ...
Unable to create java_pid21.hprof: Permission denied
Exception in thread "4-SaveToRemote-0" java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOfRange(Arrays.java:3520)
	at com.google.protobuf.ByteString$ArraysByteArrayCopier.copyFrom(ByteString.java:117)
	at com.google.protobuf.ByteString.copyFrom(ByteString.java:353)
	at com.google.protobuf.ByteString.readChunk(ByteString.java:543)
	at com.google.protobuf.ByteString.readFrom(ByteString.java:508)
	at com.google.protobuf.ByteString.readFrom(ByteString.java:476)
	at mirror.NativeFileAccess.read(NativeFileAccess.java:61)
	at mirror.SaveToRemote.sendToRemote(SaveToRemote.java:54)
	at mirror.SaveToRemote.runOneLoop(SaveToRemote.java:36)
	at mirror.tasks.ThreadBasedTask.run(ThreadBasedTask.java:62)
	at mirror.tasks.ThreadBasedTask.lambda$new$0(ThreadBasedTask.java:39)
	at mirror.tasks.ThreadBasedTask$$Lambda$17/93472147.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:748)

client:
➜ src free -h
              total        used        free      shared  buff/cache   available
Mem:           15Gi       5.1Gi       3.1Gi       1.0Gi       7.4Gi       9.1Gi
Swap:         7.8Gi       540Mi       7.3Gi

java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid21.hprof ...
Unable to create java_pid21.hprof: Permission denied
Exception in thread "5-SaveToRemote-0" java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOfRange(Arrays.java:3520)
	at com.google.protobuf.ByteString$ArraysByteArrayCopier.copyFrom(ByteString.java:117)
	at com.google.protobuf.ByteString.copyFrom(ByteString.java:353)
	at com.google.protobuf.ByteString.readChunk(ByteString.java:543)
	at com.google.protobuf.ByteString.readFrom(ByteString.java:508)
	at com.google.protobuf.ByteString.readFrom(ByteString.java:476)
	at mirror.NativeFileAccess.read(NativeFileAccess.java:61)
	at mirror.SaveToRemote.sendToRemote(SaveToRemote.java:54)
	at mirror.SaveToRemote.runOneLoop(SaveToRemote.java:36)
	at mirror.tasks.ThreadBasedTask.run(ThreadBasedTask.java:62)
	at mirror.tasks.ThreadBasedTask.lambda$new$0(ThreadBasedTask.java:39)
	at mirror.tasks.ThreadBasedTask$$Lambda$13/1990160809.run(Unknown Source)
	at java.lang.Thread.run(Thread.java:748)
According to http://javaeesupportpatterns.blogspot.com/2011/08/gc-overhead-limit-exceeded-problem-and.html, there is a leak in the application.

About package name

This tool is really awesome. I want to submit mirror on AUR (Arch User Repository: a community-driven repository for Arch users), but the name (aur/mirror) is already occupied. Is it ok to use your GitHub ID as a prefix to identify this package (stephenh-mirror)? Can you provide a more suitable and unique name?

Instructions for compiling / contributing?

I've been trying to debug an issue with mirror, and am having trouble working out what's going on.

I tried cloning the repo and building a local copy of the project so as to insert some print statements etc. but wasn't able to get it working. I tried:

$ gradle shadowJar

> Task :generateProto UP-TO-DATE
Using TaskInputs.file() with something that doesn't resolve to a File object has been deprecated and is scheduled to be removed in Gradle 5.0. Use TaskInputs.files() instead.


FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':compileJava'.
> invalid flag: --release

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.

* Get more help at https://help.gradle.org

BUILD FAILED in 0s
4 actionable tasks: 1 executed, 3 up-to-date
$ gradle -version                                                                                                              

------------------------------------------------------------
Gradle 4.4.1
------------------------------------------------------------

Build time:   2017-12-20 15:45:23 UTC
Revision:     10ed9dc355dc39f6307cc98fbd8cea314bdd381c

Groovy:       2.4.12
Ant:          Apache Ant(TM) version 1.9.9 compiled on February 2 2017
JVM:          1.8.0_25 (Oracle Corporation 25.25-b02)
OS:           Mac OS X 10.12.6 x86_64

I tried removing/changing the line in build.gradle referencing --release, but it then produced binaries with zsh: exec format error: ./build/libs/mirror-all.jar.

If you have any guidance on how best to build a local copy for debugging purposes, I'd be happy to help contribute and try to fix bugs where I find them. Having trouble since I'm a little rusty with java build systems D:

Support syncing file permissions and ownership

Hello,

Thank you for this great software. I just installed it and it runs easily.

The bidirectional synchronization works great, except that it cannot sync file permissions and ownership. Am I missing something, or is this not supported?

Warm Regards

OSX delay before ready

Thanks for the great tool.

One thing I have noticed when syncing between a Debian machine (server) and OSX (client) is that while Debian will react right away to file changes after start, OSX needs 1-2 minutes before it reacts. Files changed in that time are not synced unless touched again. Is there something I can do to fix this or figure out what is causing it? Or is there a way to detect it and notify via output when OSX is ready?

I am running the docker version

Question: What causes lag between watchman and mirror sending a file?

I've noticed a delay of approximately 50ms between the time when watchman picks up a file change and the time mirror starts sending the corresponding file.

For my tests I am running a build system on both the client and server to see which build completes first. The server is remote and has approx 300% the CPU power of my client and I have a 100Mbit 10baseT connection. Ping time is an average of ~20ms.

Timing as follows:
File is saved
+15ms Local build receives filechange notification from watchman and starts building.
+50ms after local build starts, mirror client logs INFO Sending ....filename (75ms after file is saved)
+4ms mirror server logs Remote update ....filename

I am curious to know why there is such a delay between my local build system noticing the file change and mirror sending it. I was expecting mirror to start sending the file at the same time my local build system picks it up, because they are both using watchman.

I tried an SSH tunnel in the hope that it keeps the connection open and saves connection negotiation time, but it still takes ~4ms between the client sending and the server logging the remote update.

Issues downloading JARs

Cannot resolve repo.joist.ws.

kevin@example:~/dev/mirror$ wget http://repo.joist.ws/mirror-all.jar
--2019-09-04 15:18:06-- http://repo.joist.ws/mirror-all.jar
Resolving repo.joist.ws (repo.joist.ws)... failed: Name or service not known.
wget: unable to resolve host address ‘repo.joist.ws’

kevin@example:~/dev/mirror$ dig @127.0.0.1 repo.joist.ws

; <<>> DiG 9.11.5-P4-5.1+b1-Debian <<>> @127.0.0.1 repo.joist.ws
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 38034
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
; COOKIE: 5547341fd70235dc (echoed)
;; QUESTION SECTION:
;repo.joist.ws.			IN	A

;; AUTHORITY SECTION:
joist.ws.		716	IN	SOA	ns1.dynadot.com. hostmaster.joist.ws. 1567634111 16384 2048 1048576 2560

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Sep 04 15:18:25 PDT 2019
;; MSG SIZE  rcvd: 132

Support running on Windows

This tool looks like something that could be useful to me. I'm mostly developing on my Mac, however the build runs on Windows (VM). I'd like to mirror two folders, as the virtual network share just is too damn slow.

I can't get it to startup, and I'm unsure where to take it from here. watchman.exe is in the same folder as where I'm running this command from.

> java -version
java version "1.8.0_231"
> java -cp mirror-all.jar mirror.Mirror server
Exception in thread "main" java.lang.UnsatisfiedLinkError: The operation completed successfully.

        at jnr.ffi.provider.jffi.AsmRuntime.newUnsatisifiedLinkError(AsmRuntime.java:40)
        at jnr.posix.WindowsLibC$jnr$ffi$0.getrlimit(Unknown Source)
        at jnr.posix.BaseNativePOSIX.getrlimit(BaseNativePOSIX.java:244)
        at jnr.posix.BaseNativePOSIX.getrlimit(BaseNativePOSIX.java:254)
        at mirror.SystemChecks.checkFileDescriptorLimit(SystemChecks.java:51)
        at mirror.SystemChecks.checkLimits(SystemChecks.java:39)
        at mirror.Mirror$BaseCommand.run(Mirror.java:97)
        at mirror.Mirror.main(Mirror.java:55)
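The trace above fails inside `getrlimit`, a POSIX call that has no Windows counterpart. One way such a check could be guarded is sketched below, assuming it is acceptable to skip the file descriptor limit check on Windows; `PlatformChecks` and `supportsRlimit` are hypothetical names, and whether the rest of mirror runs on Windows is not established by this report.

```java
import java.util.Locale;

final class PlatformChecks {
    /** Returns false on Windows, where the POSIX getrlimit call is unavailable. */
    static boolean supportsRlimit(String osName) {
        return !osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    public static void main(String[] args) {
        String osName = System.getProperty("os.name");
        if (supportsRlimit(osName)) {
            System.out.println("Would check the file descriptor limit on " + osName);
        } else {
            System.out.println("Skipping the file descriptor limit check on " + osName);
        }
    }
}
```

Matching on the `windows` prefix rather than the substring `win` avoids accidentally matching names such as Darwin.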

not all files are synched

I have 4 subdirectories under my nominated root containing different branches of essentially the same code, each holding approx. 307MiB. I call them dev, edge, offline and stable. The largest and most unchanging folder, "scripts", containing thousands of directories (thanks, composer!), has been excluded from the sync (approx. 210MiB) using --exclude to keep the load down, and there is a 76MiB .git folder, leaving 21MB in 147 paths (according to the log) to sync.

On running the "mirror client" command, edge and offline will synchronise and immediately exchange files, whereas dev and stable are not touched. Similarly, updating files within edge and offline will trigger uploads, whereas dev and stable don't seem to be monitored.

In troubleshooting, I thought maybe dev and stable are reserved words, so I stopped the mirror client, renamed dev to dv on the server and client, and restarted the mirror client. No change. Then I renamed dv to dev on the client, and that folder was synched to the server, but not all the files. Stop and start the client again, and no synchronisation occurs either as an initial sync or when the filesystem is updated.

I am running linux mint 19(~ubuntu 18 bionic) without watchman installed (see #20).

This is the command line I am running:
mirror client --debug-all --exclude dev/ui --exclude edge/ui --exclude stable/ui/ --exclude offline/ui --exclude dev/log/ --exclude edge/log/ --exclude stable/log/ --exclude offline/log/ --exclude dev/scripts/ --exclude edge/scripts/ --exclude stable/scripts/ --exclude offline/scripts/ --exclude dev/bin --exclude edge/bin --exclude offline/bin --exclude stable/bin/ -h localhost -l /var/www/html -r /var/www/html &

Incidentally, /stable/ui/ is a large directory I don't need to sync, and /dev/ui etc is a symlink to that directory. Similarly with */bin .

I have checked file and directory ownership and permissions, which all seem uniform throughout the nominated tree. All files in the tree (with the exception of a few that need to be owned by www-data) belong to the user running the mirror client executable, with permission level of 644.

Similarly, on the server mirror is executed by root and the files are owned by root with 644.

I have run the client with --debug-all and there is no mention of "dev" or "stable" in the log, so it seems as though mirror is not even considering these directories.

I have plenty of RAM on the client (8GB) but very little on the server (<1GB RAM + <1GB swap). Synchronising without these exceptions resulted in java memory errors on the server, but limiting the directories as mentioned has eradicated these memory errors.

I have tried moving the large trees "scripts" and ".git" away, stopped and started the client but still no sync of dev and stable, so the two large directories seem to be a red herring.

Support very large files

Files over ~500mb / ~1gb give mirror trouble b/c the RPC framework (grpc-java) we use for "really snappy bi-directional streaming" only supports in-memory / on-Java-heap byte[]s for its RPC messages, and doesn't support passing around zero-copy files / memory regions like netty's FileRegion.

So, for now, mirror just can't really send those sort of files unless/maybe you give it a huge heap.

We should probably detect the current size of heap, guess that we can do files ~half that size, and just ignore any file larger than that, which is basically #37.
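A minimal sketch of that heuristic, assuming the maximum heap reported by the JVM is a reasonable stand-in for "the current size of heap"; `LargeFileGuard` and its methods are hypothetical names, not existing mirror code.

```java
final class LargeFileGuard {
    /** Largest file we would attempt to send: half of the maximum heap. */
    static long maxSendableBytes(long maxHeapBytes) {
        return maxHeapBytes / 2;
    }

    /** Returns true if the file is too large to hold on the heap as one message. */
    static boolean shouldSkip(long fileSizeBytes, long maxHeapBytes) {
        return fileSizeBytes > maxSendableBytes(maxHeapBytes);
    }

    public static void main(String[] args) {
        // maxMemory() reflects -Xmx, or the JVM's default when -Xmx is not set.
        long maxHeap = Runtime.getRuntime().maxMemory();
        long oneGiB = 1024L * 1024L * 1024L;
        System.out.println("max heap bytes: " + maxHeap);
        System.out.println("skip a 1 GiB file: " + shouldSkip(oneGiB, maxHeap));
    }
}
```

The factor of one half is only the guess from the text above. The real headroom depends on what else is on the heap at the time.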

Allow clients to specify one-directional sync

I have a slightly unusual use case where I want to sync from my development box to a cloud-side server, and then from that server to a number of docker containers cloud-side. The hub-and-spoke pattern will work fine for me, but in this particular case, if I've deleted a file on my laptop while I was disconnected, I don't want the server to send it back to me if it still has it (likewise if any of the containers happen to have it).

If I could specify sync directions then I could set laptop -> server -> other clients and be sure I never get unexpected files appearing on my box after I've deleted them.

I suppose the main sticking point here might be that during the sync, if the laptop doesn't have the file the server should delete it, which is different/more complex logic than the current 'last write wins' strategy.
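The difference in logic can be sketched for the single case discussed here: a file that exists on the server but not on the client after a reconnect. This is a simplified illustration with hypothetical names (`SyncDirection`, `Mode`, `Action`); the two-way branch ignores the timestamp comparison that a real last-write-wins strategy would also perform.

```java
final class SyncDirection {
    enum Mode { TWO_WAY, CLIENT_TO_SERVER }
    enum Action { SEND_TO_CLIENT, DELETE_ON_SERVER }

    /** Decides what to do with a file that exists on the server but not on the client. */
    static Action onMissingOnClient(Mode mode) {
        // One-way: the client is authoritative, so its absence means "delete".
        // Two-way (simplified): the server's copy is treated as new and sent over.
        return mode == Mode.CLIENT_TO_SERVER ? Action.DELETE_ON_SERVER : Action.SEND_TO_CLIENT;
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + onMissingOnClient(mode));
        }
    }
}
```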

Support running as a daemon / service

Hi, first of all this project looks awesome. But I don't know if it's fitting for my use case, I couldn't figure it out all myself. So here goes my question

I have a directory that needs to be synced between 2 servers. I was considering using lsyncd, but the readme says this project is better at master-master / two-way sync.

I couldn't figure out how to run this software as background daemon or at startup. Actually couldn't even run it successfully.

Currently I've setup unison software. It's simple yet awesome. But downside is it's not realtime or automated. I made crontab entry for every minute to run the script to manually sync it.

If you don't mind could you please guide me through the process?

ignoring directories with many files

Hey @stephenh, thx again for all your work on this tool!

I have been running into an issue when I try to sync a directory with lots of files on the server. For example, if I have a data directory with a bunch of files, then starting the client hangs for a while and I eventually get a DEADLINE_EXCEEDED error. This directory is in the .gitignore file, but even so, it seems that watchman or something on the server is hanging while trying to list all the files in this directory. Would it be possible to ignore this directory wherever it's hanging?

I'd be happy to try to fix this, if you could point me in the right direction?

Thanks!

CentOS docker image

Hi @stephenh,
I see that the current image's base OS is Debian. Can we use CentOS instead of Debian?

Thanks and Regards,
Ramesh AR

BlockingStreamObserver should honor cancelled state

https://github.com/stephenh/mirror/blob/master/src/main/java/mirror/BlockingStreamObserver.java

I don't know this project, and I know too little about the context in which the above class is used.
Nevertheless, if used at the server, you might want to register the state-change hook not only as onReadyHandler, but also as onCancelHandler.

Furthermore, the while loop in onNext should probably be prepared for the connection being canceled as well.
I use something like this:

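        synchronized (lock) {
            while (!delegate.isReady()) {
                try {
                    // Wait until the client is ready or the call is cancelled.
                    while (!delegate.isCancelled() && !delegate.isReady()) {
                        log.debug("slow client, waiting");
                        // Re-check at least every 30 seconds, even without a notify.
                        lock.wait(30000);
                    }

                    // Stop blocking once the client has gone away.
                    if (delegate.isCancelled()) {
                        return;
                    }

                } catch (InterruptedException meh) {
                    // ignore and re-check the ready state
                }
            }
        }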

Note that 'isCancelled' and 'setOnCancelHandler' are defined on ServerCallStreamObserver

I did not dare to fork/change your code, but feel free to incorporate the suggestions or just delete this issue.

./mirror server gives error ./mirror: line 9: java: command not found

I am attempting to set up ./mirror, and have gotten to the step of running

./mirror server

That gives the error

./mirror: line 9: java: command not found

I have replicated this issue on two boxes. If I am making a mistake, please let me know. If this is an issue, please fix and let me know what I can do to help. Thank you!

Unclear

You say start from client/server home directory? What directory is this? You also state you can use /bin

It is a bit confusing as to where to place these files and where to run them from

Issues with international characters

Just fired up the docker image and I'm getting this error over and over. I removed those files and things seem to be working fine now

mirror_1  | java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: <clipped for privacy>/Fran?ais/timeQplus Guide de D?marrage Rapide (2016_09_12 16_41_37 UTC).pdf
mirror_1  |     at sun.nio.fs.UnixPath.encode(UnixPath.java:147)
mirror_1  |     at sun.nio.fs.UnixPath.<init>(UnixPath.java:71)
mirror_1  |     at sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:281)
mirror_1  |     at java.nio.file.Paths.get(Paths.java:84)
mirror_1  |     at mirror.UpdateTree.find(UpdateTree.java:167)
mirror_1  |     at mirror.UpdateTree.addUpdate(UpdateTree.java:108)
mirror_1  |     at mirror.UpdateTree.addLocal(UpdateTree.java:97)
mirror_1  |     at mirror.MirrorSession.lambda$calcInitialState$1(MirrorSession.java:84)
mirror_1  |     at java.util.ArrayList.forEach(ArrayList.java:1257)
mirror_1  |     at mirror.MirrorSession.calcInitialState(MirrorSession.java:84)
mirror_1  |     at mirror.MirrorClient.startSession(MirrorClient.java:88)
mirror_1  |     at mirror.MirrorClient.access$300(MirrorClient.java:27)
mirror_1  |     at mirror.MirrorClient$SessionStarter.runOneLoop(MirrorClient.java:198)
mirror_1  |     at mirror.tasks.ThreadBasedTask.run(ThreadBasedTask.java:62)
mirror_1  |     at mirror.tasks.ThreadBasedTask.lambda$new$0(ThreadBasedTask.java:39)
mirror_1  |     at java.lang.Thread.run(Thread.java:748)
mirror_1  | 2019-09-05 03:35:21 INFO  Stopping session
mirror_1  | 2019-09-05 03:35:21 INFO  Connected, starting session, version unspecified
mirror_1  | 2019-09-05 03:35:21 INFO  Watchman root is /data/
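
The trace above shows `Paths.get` throwing `InvalidPathException` from inside `UpdateTree.find`, after which the session stops and restarts. A likely cause (an assumption, not confirmed in this report) is that the JVM's file name encoding inside the container cannot represent the accented characters. The following is a minimal sketch, not Mirror's actual code, of how a caller could skip an unrepresentable name instead of aborting. `SafePaths` and `tryResolve` are hypothetical names, and a NUL character is used only because it is rejected regardless of locale, which keeps the example deterministic.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

public class SafePaths {

  /**
   * Resolves name against root, or returns empty if the JVM cannot
   * represent the name as a path on this platform.
   */
  public static Optional<Path> tryResolve(Path root, String name) {
    try {
      return Optional.of(root.resolve(name));
    } catch (InvalidPathException e) {
      // Log and skip rather than letting the exception end the whole scan.
      System.err.println("Skipping unrepresentable path: " + e.getMessage());
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    Path root = Paths.get("data");
    // A plain ASCII name resolves normally.
    System.out.println(tryResolve(root, "ok.txt").isPresent());
    // A name containing NUL is rejected by the default file system.
    System.out.println(tryResolve(root, "bad\0name.pdf").isPresent());
  }
}
```

Whether skipping is the right behavior for a sync tool is a design question: a skipped file silently never syncs, so a real fix would probably also surface a warning to the user.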

Provide built-in TLS/SSH security (instead of just SSH tunneling)

Hi there. First of all: thanks for the work you put into this. I have been looking for viable solutions to the same problems you face, and the performance of mirror is pretty great, including over remote connections.

Nevertheless, have you tried xpra instead of x-forwarding? I could imagine that for a local setup this works rather well, although on remote it can also be quite laggy. As for unison not being real time: it now ships with unison-fswatch and has the option repeat=watch. However, it does take longer to pick up changes than mirror (at least in my simple test).

My actual question: Is there any means of authenticating a client / securing the server? I did not see any option for that, so I can only think of VPN or ssh tunneling for now. This of course does involve some extra setup and it would be nice to have an out-of-the-box solution for mirror. What are your thoughts about that?

exclude file pattern issue

Hi @stephenh,

I have given the exclude file pattern as "--exclude='Upload/*'". With this, the service excludes all the files along with the "Upload" dir itself.
I want to exclude the files/subdirs of the 'Upload' dir, but it should still sync the 'Upload' dir.

Please help me fix the issue.

Thanks and Regards,
Ramesh AR
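
The difference between a pattern that matches a directory's contents and one that matches the directory itself can be illustrated with the JDK's own glob matcher. This is only a sketch using `java.nio.file.PathMatcher`; Mirror's `--exclude` handling may follow different rules, and `ExcludeDemo` and `matches` are hypothetical names.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class ExcludeDemo {

  /** Returns true if relativePath matches the given JDK glob pattern. */
  public static boolean matches(String glob, String relativePath) {
    PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
    return matcher.matches(Paths.get(relativePath));
  }

  public static void main(String[] args) {
    // The directory itself is not matched by "Upload/*".
    System.out.println(matches("Upload/*", "Upload"));            // false
    // Direct children are matched.
    System.out.println(matches("Upload/*", "Upload/a.txt"));      // true
    // A single "*" does not cross directory boundaries.
    System.out.println(matches("Upload/*", "Upload/sub/b.txt"));  // false
    // "**" does cross directory boundaries.
    System.out.println(matches("Upload/**", "Upload/sub/b.txt")); // true
  }
}
```

Under these JDK semantics, `Upload/*` leaves the `Upload` directory itself unmatched, so the behavior reported here would come from how the tool applies its patterns rather than from the glob alone.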

Deleting a directory hierarchy leaves empty directory

If you rm -fr foo/ where foo/ is a tree of foo/bar/zaz.txt, the remote side's deleting of zaz.txt will tick its timestamp of foo/bar, and so make it think that its foo/bar is newer than the remote's foo/, and it will re-send mkdir foo/bar back to the remote side.
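
The underlying file system behavior, where deleting a file advances the modification time of its parent directory, can be reproduced in isolation. This is a minimal sketch under the assumption of a typical POSIX-style file system. `DirMtimeDemo` and `deleteTicksParent` are hypothetical names and are not part of Mirror.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

public class DirMtimeDemo {

  /** Returns true if deleting a child file advanced the parent directory's mtime. */
  public static boolean deleteTicksParent() {
    try {
      Path foo = Files.createTempDirectory("foo");
      Path bar = Files.createDirectory(foo.resolve("bar"));
      Path zaz = Files.createFile(bar.resolve("zaz.txt"));

      // Backdate bar so the change is visible regardless of timestamp granularity.
      FileTime old = FileTime.fromMillis(0);
      Files.setLastModifiedTime(bar, old);

      // Deleting the child is a write to the directory, so its mtime moves forward.
      Files.delete(zaz);
      FileTime after = Files.getLastModifiedTime(bar);

      // Clean up the temporary tree.
      Files.delete(bar);
      Files.delete(foo);
      return after.compareTo(old) > 0;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(deleteTicksParent());
  }
}
```

A sync tool that compares directory timestamps therefore has to distinguish a directory that was touched by a child's deletion from one that was genuinely recreated.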
