fsqio's Introduction

Foursquare Fsq.io

All of Foursquare's open source code in a single repo.

All Foursquare code lives in a single repository, an architecture generally called a monorepo. Fsq.io is a subset of that internal monorepo. It holds many of Foursquare's open source projects that previously lived in their own separate GitHub repos. Foursquare contributes to Pants, a build tool specifically designed to work with monorepos. The entire Fsq.io repo is built and tested by Pants.

Deploying directly from our monorepo has some nice advantages, both for consumers of our open source projects and for Foursquare itself. The entire repo is built daily by our CIs, and internal contributions are open sourced automatically without the overhead of publishing. This repo will always contain the latest code that we use internally, and all of the tools can be built just as we use them, directly from HEAD.

Projects include:

  • Fhttp: A request building interface similar to scalaj-http for Finagle http clients.
  • Rogue: A Scala DSL for MongoDB
  • Spindle: A Scala code generator for Thrift
  • Twofishes: A coarse, splitting geocoder and reverse geocoder in Scala
  • and others.

Requirements

  • JDK 1.8 (1.8.0_40 preferred)
  • python2.6+ (2.7 preferred)
  • postgresql
  • mongodb server (required to pass some tests)
  • An increased number of file descriptors (we use 32768)

Internally we use OSX Yosemite or later. Other operating systems may work but are officially unsupported. (Unofficially, if building on Linux, you should install python-dev, build-essential, and libpq-dev in addition to the above.)
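The file descriptor requirement is easy to overlook, so here is a minimal sketch of a pre-flight check. It is not part of Fsq.io; the function name `fd_limit_ok` and the constant `REQUIRED_FDS` are made up for illustration, and the only value taken from this README is 32768.

```python
import resource

# Value quoted in the requirements list above.
REQUIRED_FDS = 32768


def fd_limit_ok(soft_limit, required=REQUIRED_FDS):
    """Return True if the soft file descriptor limit meets the requirement."""
    # RLIM_INFINITY means "no limit", which always satisfies the requirement.
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= required


# Query the limits of the current process (Unix-like systems only).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
if fd_limit_ok(soft):
    print("file descriptor limit looks fine: %d" % soft)
else:
    print("soft limit is %d; this README uses %d" % (soft, REQUIRED_FDS))
```

If the check fails, raising the limit for the current shell (for example with `ulimit -n`) before invoking `./pants` is the usual remedy, subject to the hard limit configured on your machine.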

Pants build system

Pants is a build system that works particularly well for a source code workspace containing many distinct but interdependent pieces.

Pants is similar to make, maven, ant, gradle, sbt, etc., but Pants pursues different design goals. Pants optimizes for:

  • building multiple, dependent things from source
  • building code in a variety of languages
  • speed of build execution

Pants is a true community project, with major contributions from Twitter, Foursquare, and Square, among other companies and independent contributors. It has friendly documentation of its own; this README just touches on how to compile and test the code.

Compiling and Testing

First Run

A good first run is to compile the repo and run every test.

 ./pants compile test src:: test::

Targets and BUILD files

Targets are addressable project-level or dependency-level modules that can be built by Pants. BUILD files are configuration files that define those targets. Each target has a name and can be built by running a Pants task against the target's name and location.

For example, Fsq.io's JVM projects live under src/jvm. You can compile Rogue by running:

 ./pants compile src/jvm/io/fsq/rogue:rogue

Build and Test every project

Adding a :: to a path will glob every target under that location. So to compile every target in Fsq.io:

 ./pants compile src::

Similarly, to run all the tests (after starting the mongodb server):

 ./pants test test::
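The two spec forms used in the commands above (a `path:name` address and a `path::` glob) can be illustrated with a small sketch. This is not how Pants parses addresses internally; `parse_target_spec` is a hypothetical helper that only handles the two forms shown in this README.

```python
def parse_target_spec(spec):
    """Split a target spec into (directory, name, recursive).

    Only the two forms shown in this README are handled:
      'src/jvm/io/fsq/rogue:rogue' -> a single named target
      'src::'                      -> every target under a directory
    """
    if spec.endswith("::"):
        # A trailing '::' globs every target under the given location.
        return spec[:-2], None, True
    if ":" in spec:
        # The text after the last ':' is the target name.
        directory, name = spec.rsplit(":", 1)
        return directory, name, False
    raise ValueError("unsupported spec form in this sketch: %r" % spec)


print(parse_target_spec("src/jvm/io/fsq/rogue:rogue"))
print(parse_target_spec("test::"))
```

Reading a spec this way makes it clear why `./pants compile src::` touches everything under `src`, while `src/jvm/io/fsq/rogue:rogue` builds exactly one target.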

Projects aspirationally have READMEs at the project root.

Acknowledgements

  • Thanks to the great community supporting Pants.
  • Fsq.io is split from commits to our internal monorepo by Sapling, a git porcelain tool by @jsirois.

Discussion

Please open an issue if you have any questions or concerns.

License

Apache License, Version 2.0 (Apache-2.0)

fsqio's People

Contributors

abe-winter-4s, ankita4sq, asherf, eric-arellano, fsqio, iantabolt, jglesner, jimdickinson, jonshea, jvandew, mateor, mrowl, nitay, omerzach, omrihq, onioni, rahulpratapm, slackhappy, toddgardner

fsqio's Issues

Twofishes should understand 3-letter ISO codes for countries

Do a Twofishes lookup on "Melbourne, Australia"
Result: correctly resolves

Do a Twofishes lookup on "Melbourne, AUS"
Result: resolves to "Austin, TX, United States"
Expect: should resolve the same as "Melbourne, Australia"

Twofishes expects 2-letter ISO codes so "Melbourne, AU" works perfectly. But 3-letter codes do not work at all.
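One possible workaround, until Twofishes handles this itself, is to rewrite a trailing 3-letter code into its 2-letter form before sending the query. The sketch below is a hypothetical client-side preprocessing step, not Twofishes code; the mapping is a deliberately tiny sample of ISO 3166-1 codes, and the function names are invented for illustration.

```python
# A small, deliberately incomplete sample of ISO 3166-1 alpha-3 -> alpha-2 codes.
ALPHA3_TO_ALPHA2 = {
    "AUS": "AU",
    "USA": "US",
    "GBR": "GB",
    "DEU": "DE",
    "FRA": "FR",
    "JPN": "JP",
}


def normalize_country_token(token):
    """Map a known 3-letter country code to its 2-letter equivalent."""
    code = token.strip().upper()
    if len(code) == 3 and code in ALPHA3_TO_ALPHA2:
        return ALPHA3_TO_ALPHA2[code]
    # Anything else (full names, 2-letter codes, unknown codes) is left alone.
    return token.strip()


def normalize_query(query):
    """Rewrite the last comma-separated part of a query if it is a known code."""
    parts = [part.strip() for part in query.split(",")]
    if len(parts) > 1:
        parts[-1] = normalize_country_token(parts[-1])
    return ", ".join(parts)


print(normalize_query("Melbourne, AUS"))
```

Only the last comma-separated part is rewritten, so a place name that happens to be three letters long is not touched unless it appears in the country position.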

Add Outcome monad to fsqio?

Hey there,

It'd be awesome to have Foursquare's Outcome class available for use in this repository. If I recall correctly it had few internal dependencies, so I imagine it'd be pretty straightforward to open source.

Thanks in advance!

Add HFileV1 reader/writer code to fsqio

Hello my fine friends!

I'm interested in using quiver at Slack, and I've hacked together the HFileV1 HFileOutputFormat with direct key value pairs on top of hbase 0.94, but as you can imagine, relying on that is a bit unpopular.

I remember that some of this code was reimplemented outside of hbase to avoid the legacy dep. Could we include it here? There is a real dearth of MR-writable file formats out there with an associated static server.

This is a whiskey-bounty eligible request.

<3
johng

hadoop-lzo jar resolution

Hello,

I'm running into an issue using the latest fsqio master where ivy is consistently unable to resolve hadoop-lzo and fails to run serve.py.

[NOT FOUND ] com.hadoop.gplcompression#hadoop-lzo;0.4.19!hadoop-lzo.jar (0ms)
==== maven.twttr.com-maven: tried
https://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.19/hadoop-lzo-0.4.19.jar

I'm not all too familiar with the dependency management aspects of twofishes. I'm running this from an ubuntu 14.04 docker container, and everything has worked historically ... until now. Any recommendations on how to figure out what's going on / solve for this? I would greatly appreciate it.

Strangely, when I git clone the fsqio repo and run serve.py from the command line on an Amazon Linux 1 EC2 instance, it works perfectly, with no dependency issues. Preferably, though, I need to run it from a docker container.

pants build fails: No module named contrib.node.register

I am trying to run the following command line:

time ./src/jvm/io/fsq/twofishes/scripts/parse.py -w --output_prefix_index -r -s -i --yes-i-am-sure -- --create_unmatched_features true

It fails with this message:

Exception message: Failed to load the pants.contrib.node.register backend: No module named contrib.node.register

Attempting to fix this, I installed pantsbuild.pants.contrib.node with this command:

sudo pip install pantsbuild.pants.contrib.node

I have confirmed that the above succeeded. If I run Python and execute import pants.contrib.node.register the import succeeds. It didn't make any difference. I think the needed modules must be installed into the Python environment in my $HOME/.cache directory.

Can you please either modify the fsqio project so that it works, or else tell me what steps I need to take to configure my system such that it will work?

'[Errno 8] nodename nor servname provided, or not known' during pom-resolve on OSX

Some users are failing to resolve the jar dependencies when running Fsq.io's custom jar resolver, pom-resolve, on OSX. The problem is the interface between the Python socket library and OSX DNS resolution/caching.

I have been able to confirm some very frustrating circumstances around this bug, and I don't understand all the details. But it can be fixed by editing the /etc/hosts file. First, find your hostname:

  $ hostname
  <some-name>

Back up your existing /etc/hosts file in case something goes wrong! Then edit your original to read:

  127.0.0.1   <some-name> localhost
  255.255.255.255   broadcasthost
  ::1 localhost
  fe80::1%lo0   localhost

Please replace <some-name> with the value you got from running hostname, without the brackets. Reboot and try to run pom-resolve, hopefully with success!
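To check whether a machine is affected before or after editing /etc/hosts, a quick diagnostic is to ask the Python socket library to resolve the local hostname. The sketch below is an assumption about how to reproduce the symptom, not code from pom-resolve; `check_hostname_resolves` and its `resolver` parameter are hypothetical names.

```python
import socket


def check_hostname_resolves(hostname=None, resolver=socket.gethostbyname):
    """Return (ok, message) describing whether the hostname resolves."""
    name = hostname if hostname is not None else socket.gethostname()
    try:
        address = resolver(name)
    except socket.gaierror as err:
        # On OSX this is where "[Errno 8] nodename nor servname provided,
        # or not known" would surface.
        return False, "%s does not resolve: %s" % (name, err)
    return True, "%s resolves to %s" % (name, address)


if __name__ == "__main__":
    ok, message = check_hostname_resolves()
    print(message)
```

If the message reports a failure for the name printed by `hostname`, the /etc/hosts edit described above is worth trying.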

Twofishes reliance on twofishes.net

Hi!

I've been trying to get twofishes working with docker but your image pulls pre-compiled data from twofishes.net which seems to be down.

Will twofishes be put back up? If not, would it be possible to get a link to the latest pre-compiled data? I am having trouble compiling it myself and I have resorted to using a pre-compiled version that I found in the DSTK EC2 AMI.

Thanks

Twofishes indexer does not handle MongoDB query failures properly.

While importing the data into the DB with "./src/jvm/io/fsq/twofishes/scripts/parse.py -w pwd/data/", the indexer crashed with the following error message. After the crash the indexer was stuck doing nothing, and no more records were processed. The data was downloaded with "./src/jvm/io/fsq/twofishes/scripts/download-world.sh".

8246718 INFO  i.f.t.indexer.output.PrefixIndexer - done with 114000 of 2994876 prefixes
8247980 INFO  i.f.t.indexer.output.PrefixIndexer - done with 115000 of 2994876 prefixes
Exception in thread "main" io.fsq.rogue.RogueException: Mongo query on geocoder [db.name_index.find({ "name" : { "$regex" : "^?" , "$options" : ""} , "excludeFromPrefixIndex" : { "$ne" : true}}).sort({ "pop" : -1}).limit(1000)] failed after 2 ms
        at io.fsq.rogue.adapter.BlockingMongoClientAdapter.runCommand(BlockingMongoClientAdapter.scala:118)
        at io.fsq.rogue.adapter.BlockingMongoClientAdapter.runCommand(BlockingMongoClientAdapter.scala:52)
        at io.fsq.rogue.adapter.MongoClientAdapter.queryRunner(MongoClientAdapter.scala:471)
        at io.fsq.rogue.adapter.MongoClientAdapter.query(MongoClientAdapter.scala:496)
        at io.fsq.rogue.query.QueryExecutor.fetch(QueryExecutor.scala:144)
        at io.fsq.twofishes.indexer.output.PrefixIndexer.getRecordsByPrefix(PrefixIndexer.scala:76)
        at io.fsq.twofishes.indexer.output.PrefixIndexer$$anonfun$writeIndexImpl$2.apply(PrefixIndexer.scala:115)
        at io.fsq.twofishes.indexer.output.PrefixIndexer$$anonfun$writeIndexImpl$2.apply(PrefixIndexer.scala:110)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
        at io.fsq.twofishes.indexer.output.PrefixIndexer.writeIndexImpl(PrefixIndexer.scala:110)
        at io.fsq.twofishes.indexer.output.Indexer$$anonfun$writeIndex$1.apply$mcV$sp(Indexer.scala:54)
        at io.fsq.twofishes.indexer.output.Indexer$$anonfun$writeIndex$1.apply(Indexer.scala:54)
        at io.fsq.twofishes.indexer.output.Indexer$$anonfun$writeIndex$1.apply(Indexer.scala:54)
        at io.fsq.twofishes.util.DurationUtils$.inNanoseconds(DurationUtils.scala:16)
        at io.fsq.twofishes.util.DurationUtils$class.logDuration(DurationUtils.scala:23)
        at io.fsq.twofishes.indexer.output.Indexer.logDuration(Indexer.scala:37)
        at io.fsq.twofishes.indexer.output.Indexer.writeIndex(Indexer.scala:54)
        at io.fsq.twofishes.indexer.output.NameIndexer.writeIndexImpl(NameIndexer.scala:84)
        at io.fsq.twofishes.indexer.output.Indexer$$anonfun$writeIndex$1.apply$mcV$sp(Indexer.scala:54)
        at io.fsq.twofishes.indexer.output.Indexer$$anonfun$writeIndex$1.apply(Indexer.scala:54)
        at io.fsq.twofishes.indexer.output.Indexer$$anonfun$writeIndex$1.apply(Indexer.scala:54)
        at io.fsq.twofishes.util.DurationUtils$.inNanoseconds(DurationUtils.scala:16)
        at io.fsq.twofishes.util.DurationUtils$class.logDuration(DurationUtils.scala:23)
        at io.fsq.twofishes.indexer.output.Indexer.logDuration(Indexer.scala:37)
        at io.fsq.twofishes.indexer.output.Indexer.writeIndex(Indexer.scala:54)
        at io.fsq.twofishes.indexer.output.OutputIndexes.buildIndexes(OutputHFile.scala:28)
        at io.fsq.twofishes.indexer.importers.geonames.GeonamesParser$.writeIndexes(GeonamesParser.scala:144)
        at io.fsq.twofishes.indexer.importers.geonames.GeonamesParser$.main(GeonamesParser.scala:106)
        at io.fsq.twofishes.indexer.importers.geonames.GeonamesParser.main(GeonamesParser.scala)
Caused by: com.mongodb.MongoQueryException: Query failed with error code 2 and error message 'Regular expression is invalid: invalid UTF-8 string' on server 127.0.0.1:27017
        at com.mongodb.operation.FindOperation$1.call(FindOperation.java:521)
        at com.mongodb.operation.FindOperation$1.call(FindOperation.java:510)
        at com.mongodb.operation.OperationHelper.withConnectionSource(OperationHelper.java:431)
        at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:404)
        at com.mongodb.operation.FindOperation.execute(FindOperation.java:510)
        at com.mongodb.operation.FindOperation.execute(FindOperation.java:81)
        at com.mongodb.Mongo.execute(Mongo.java:836)
        at com.mongodb.Mongo$2.execute(Mongo.java:823)
        at com.mongodb.OperationIterable.iterator(OperationIterable.java:47)
        at com.mongodb.OperationIterable.forEach(OperationIterable.java:70)
        at com.mongodb.FindIterableImpl.forEach(FindIterableImpl.java:166)
        at io.fsq.rogue.adapter.BlockingMongoClientAdapter.forEachProcessor(BlockingMongoClientAdapter.scala:162)
        at io.fsq.rogue.adapter.BlockingMongoClientAdapter.forEachProcessor(BlockingMongoClientAdapter.scala:52)
        at io.fsq.rogue.adapter.MongoClientAdapter$$anonfun$query$1.apply(MongoClientAdapter.scala:496)
        at io.fsq.rogue.adapter.BlockingMongoClientAdapter.findImpl(BlockingMongoClientAdapter.scala:272)
        at io.fsq.rogue.adapter.BlockingMongoClientAdapter.findImpl(BlockingMongoClientAdapter.scala:52)
        at io.fsq.rogue.adapter.MongoClientAdapter$$anonfun$queryRunner$1.apply(MongoClientAdapter.scala:472)
        at io.fsq.rogue.util.DefaultQueryLogger.onExecuteQuery(QueryLogger.scala:52)
        at io.fsq.rogue.adapter.BlockingMongoClientAdapter.runCommand(BlockingMongoClientAdapter.scala:113)
        ... 30 more
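The root cause in the trace is MongoDB rejecting a regex built from a prefix that is not valid UTF-8. One plausible way this can happen (an assumption, not confirmed by the trace) is a prefix cut in the middle of a surrogate pair. The sketch below shows the general idea of screening prefixes before querying and skipping the bad ones instead of crashing. It is written in Python for illustration only, whereas the indexer itself is Scala; the function names are hypothetical.

```python
def is_encodable_prefix(prefix):
    """Return True if the prefix can be encoded as valid UTF-8."""
    try:
        prefix.encode("utf-8")
    except UnicodeEncodeError:
        # Lone surrogates, for example, cannot be encoded as UTF-8.
        return False
    return True


def split_prefixes(prefixes):
    """Partition prefixes into (usable, skipped) lists."""
    usable, skipped = [], []
    for prefix in prefixes:
        if is_encodable_prefix(prefix):
            usable.append(prefix)
        else:
            skipped.append(prefix)
    return usable, skipped


# "\ud83d" is half of a surrogate pair, used here as a stand-in for a bad prefix.
usable, skipped = split_prefixes(["mel", "\ud83d", "m\u00fcn"])
print("usable: %d, skipped: %d" % (len(usable), len(skipped)))
```

Logging the skipped prefixes rather than aborting would let the remaining prefixes be processed, which is the behavior this issue is asking for.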

Mapquest API no longer supported

I built twofishes with the docker container--everything went swimmingly. ; ) However, and this is a nitpick, the mapquest API in the demo pages is no longer supported by mapquest and the tiles are just "error tiles" now. All their new stuff is here but I'm not sure what to update to exactly. I don't need this feature personally--just wanted to bring it to your attention if you're interested in maintaining your nice demo app!

Twofishes: ConnectionError building index

Freshly checked out tags/fsqio-2017-02-16-1638, I consistently get this error when building the index:

% ./src/jvm/io/fsq/twofishes/scripts/parse.py -w /opt/twofishes-output
outputting index to /opt/twofishes-output
Are you suuuuuure you want to drop your mongo data? Type "yes" to continue: yes
./pants run src/jvm/io/fsq/twofishes/indexer/importers/geonames:geonames-parser --jvm-run-jvm-options=-Dlogback.configurationFile=src/jvm/io/fsq/twofishes/indexer/data/logback.xml --jvm-run-jvm-program-args=--parse_world --jvm-run-jvm-program-args=true --jvm-run-jvm-program-args=--output_revgeo_index --jvm-run-jvm-program-args=false --jvm-run-jvm-program-args=--output_s2_covering_index --jvm-run-jvm-program-args=false --jvm-run-jvm-program-args=--output_s2_interior_index --jvm-run-jvm-program-args=false --jvm-run-jvm-program-args=--output_prefix_index --jvm-run-jvm-program-args=true --jvm-run-jvm-program-args=--reload_data --jvm-run-jvm-program-args=true --jvm-run-jvm-program-args=--hfile_basepath --jvm-run-jvm-program-args=/opt/twofishes-output

11:28:32 00:00 [main]
               (To run a reporting server: ./pants server)
11:28:32 00:00   [setup]
11:28:32 00:00     [parse]
               Executing tasks in goals: tag -> bootstrap -> imports -> unpack-jars -> validate -> build-spindle -> jvm-platform-validate -> deferred-sources -> gen -> webpack -> pom-resolve -> resources -> compile -> run
11:28:32 00:00   [tag]
11:28:32 00:00     [tag]
11:28:32 00:00   [bootstrap]
11:28:32 00:00     [substitute-aliased-targets]
11:28:32 00:00     [jar-dependency-management]
11:28:32 00:00     [bootstrap-jvm-tools]
11:28:32 00:00     [provide-tools-jar]
11:28:32 00:00   [imports]
11:28:32 00:00     [ivy-imports]
11:28:32 00:00   [unpack-jars]
11:28:32 00:00     [unpack-jars]
11:28:32 00:00   [validate]
11:28:32 00:00     [validate]
11:28:32 00:00   [build-spindle]
11:28:32 00:00     [build-spindle]
11:28:32 00:00       [cache] 
                   No cached artifacts for 1 target.
                   Invalidated 1 target.
11:28:32 00:00       [spindle-build]

11:28:33 00:00 [main]
               (To run a reporting server: ./pants server)
11:28:33 00:00   [setup]
11:28:33 00:00     [parse]
               Executing tasks in goals: tag -> bootstrap -> imports -> unpack-jars -> validate -> build-spindle -> deferred-sources -> jvm-platform-validate -> gen -> webpack -> pom-resolve -> resources -> compile -> bundle
11:28:33 00:00   [tag]
11:28:33 00:00     [tag]
11:28:33 00:00   [bootstrap]
11:28:33 00:00     [substitute-aliased-targets]
11:28:33 00:00     [jar-dependency-management]
11:28:33 00:00     [bootstrap-jvm-tools]
11:28:33 00:00     [provide-tools-jar]
11:28:33 00:00   [imports]
11:28:33 00:00     [ivy-imports]
11:28:33 00:00   [unpack-jars]
11:28:33 00:00     [unpack-jars]
11:28:33 00:00   [validate]
11:28:33 00:00     [validate]
11:28:33 00:00   [build-spindle]
11:28:33 00:00     [build-spindle]
11:28:33 00:00   [deferred-sources]
11:28:33 00:00     [deferred-sources]
11:28:33 00:00   [jvm-platform-validate]
11:28:33 00:00     [jvm-platform-validate]
11:28:33 00:00   [gen]
11:28:33 00:00     [thrift]
11:28:33 00:00     [protoc]
11:28:33 00:00     [antlr]
11:28:33 00:00     [ragel]
11:28:33 00:00     [jaxb]
11:28:33 00:00     [wire]
11:28:33 00:00     [validate-graph]
11:28:33 00:00     [spindle]
11:28:33 00:00   [webpack]
11:28:33 00:00     [webpack-resolve]
11:28:33 00:00     [webpack-gen]
11:28:33 00:00   [pom-resolve]
11:28:33 00:00     [pom-resolve]
                   Invalidated 86 targets.
11:28:33 00:00       [traverse-pom-graph]Exception caught: (<class 'requests.exceptions.ConnectionError'>)
  File "/home/ubuntu/.cache/fsqio/setup/bootstrap/1.2.1rc0/bin/pants", line 11, in <module>
    sys.exit(main())
  File "/home/ubuntu/.cache/fsqio/setup/bootstrap/pants.nkHwgM/install/local/lib/python2.7/site-packages/pants/bin/pants_exe.py", line 44, in main
    PantsRunner(exiter).run()
  File "/home/ubuntu/.cache/fsqio/setup/bootstrap/pants.nkHwgM/install/local/lib/python2.7/site-packages/pants/bin/pants_runner.py", line 57, in run
    options_bootstrapper=options_bootstrapper)
  File "/home/ubuntu/.cache/fsqio/setup/bootstrap/pants.nkHwgM/install/local/lib/python2.7/site-packages/pants/bin/pants_runner.py", line 46, in _run
    return LocalPantsRunner(exiter, args, env, options_bootstrapper=options_bootstrapper).run()
  File "/home/ubuntu/.cache/fsqio/setup/bootstrap/pants.nkHwgM/install/local/lib/python2.7/site-packages/pants/bin/local_pants_runner.py", line 53, in run
    self._maybe_profiled(self._run)
  File "/home/ubuntu/.cache/fsqio/setup/bootstrap/pants.nkHwgM/install/local/lib/python2.7/site-packages/pants/bin/local_pants_runner.py", line 50, in _maybe_profiled
    runner()
  File "/home/ubuntu/.cache/fsqio/setup/bootstrap/pants.nkHwgM/install/local/lib/python2.7/site-packages/pants/bin/local_pants_runner.py", line 95, in _run
    goal_runner_result = goal_runner.run()
  File "/home/ubuntu/.cache/fsqio/setup/bootstrap/pants.nkHwgM/install/local/lib/python2.7/site-packages/pants/bin/goal_runner.py", line 268, in run
    result = self._execute_engine()
  File "/home/ubuntu/.cache/fsqio/setup/bootstrap/pants.nkHwgM/install/local/lib/python2.7/site-packages/pants/bin/goal_runner.py", line 257, in _execute_engine
    result = engine.execute(self._context, self._goals)
  File "/home/ubuntu/.cache/fsqio/setup/bootstrap/pants.nkHwgM/install/local/lib/python2.7/site-packages/pants/engine/legacy_engine.py", line 26, in execute
    self.attempt(context, goals)
  File "/home/ubuntu/.cache/fsqio/setup/bootstrap/pants.nkHwgM/install/local/lib/python2.7/site-packages/pants/engine/round_engine.py", line 224, in attempt
    goal_executor.attempt(explain)
  File "/home/ubuntu/.cache/fsqio/setup/bootstrap/pants.nkHwgM/install/local/lib/python2.7/site-packages/pants/engine/round_engine.py", line 47, in attempt
    task.execute()
  File "/opt/fsqio/src/python/fsqio/pants/pom/pom_resolve.py", line 566, in execute
    global_pinned_versions,
  File "/opt/fsqio/src/python/fsqio/pants/pom/pom_resolve.py", line 415, in resolve_dependency_graphs
    for jar_lib, target_dep_graph in izip(all_jar_libs, dep_graph_iterator):
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 668, in next
    raise value

Exception message: None: Max retries exceeded with url: /webdav/geotools/com/cybozu/labs/langdetect/1.1-20120112/langdetect-1.1-20120112.jar (Caused by redirect)


11:29:13 00:40   [complete]
               FAILURE

FAILURE


11:29:14 00:42   [complete]
               FAILURE

Sometimes the problematic url changes; I have also seen so far:

/webdav/geotools/xml-apis/xml-apis/1.3.04/xml-apis-1.3.04.pom
/webdav/geotools/joda-time/joda-time/2.9.7/joda-time-2.9.7.jar
/webdav/geotools/aopalliance/aopalliance/1.0/aopalliance-1.0.pom
/webdav/geotools/org/apache/hadoop/hadoop-mapreduce-client-core/2.6.0/hadoop-mapreduce-client-core-2.6.0.jar
/maven2/com/github/salat/salat-util_2.11/1.10.0/salat-util_2.11-1.10.0.pom

[error] invalid source release: 1.8

running "./pants pom-resolve" from a clean checkout

               [1/8] Compiling 8 zinc sources in 1 target (src/jvm/io/fsq/spindle/common/thrift/base:base).INFO] killing nailgun server pid=12192

08:22:56 00:01 [zinc]
==== stderr ====

                   ==== stdout ====
                   [info] Compiling 8 Java sources to /home/blackmad/Code/fsqio/.pants.d/compile/zinc/252d64521cf9/src.jvm.io.fsq.spindle.common.thrift.base.base/current/classes...
                   [error] invalid source release: 1.8

08:22:56 00:01 [compile]
compile(src/jvm/io/fsq/spindle/common/thrift/base:base) failed: Zinc compile failed.
FAILURE: Compilation failure: Failed jobs: compile(src/jvm/io/fsq/spindle/common/thrift/base:base)

08:22:57 00:02 [complete]
FAILURE

FAILURE

08:22:57 00:02 [complete]
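An "invalid source release: 1.8" error usually means the compiler being used is older than Java 8, which would conflict with the JDK 1.8 requirement in the README. A quick way to confirm is to look at the output of `java -version`. The helper below is a hypothetical sketch that extracts the major version from such output; the sample strings are typical formats, not output captured from the reporter's machine.

```python
import re


def java_major_version(version_output):
    """Extract the Java major version from `java -version` style output.

    Handles both the legacy scheme ('1.7.0_80' -> 7) and the newer one
    ('11.0.2' -> 11). Returns None if no version string is found.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    if first == 1 and match.group(2) is not None:
        # Legacy numbering: the real major version is the second component.
        return int(match.group(2))
    return first


sample = 'java version "1.7.0_80"'
major = java_major_version(sample)
if major is not None and major < 8:
    print("Java %d found; this repo requires JDK 1.8" % major)
```

In practice the string would come from running `java -version`, which prints to stderr, and the same check applies to whichever JDK Pants is configured to use.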

Error installing buildgen plugin in new repository.

Adding the buildgen from pypi to the plugins list causes some build failures (full stack trace included below).

I have a reproducer here: https://github.com/toddgardner/fsqio-plugin-test

(I think this is due to unpinning the setuptools version in the pants script, from the PRs around: pantsbuild/setup#14, specifically pantsbuild/setup@4eccc24) Not sure what the fix would be here.

(this corresponds to 1 and 3 from #27)

$ ./pants compile
New python executable in /Users/todd/.cache/pants/setup/bootstrap-Darwin-x86_64/pants.xyWThV/install/bin/python
Installing setuptools, pip, wheel...done.
Requirement already satisfied (use --upgrade to upgrade): setuptools<31.0,>=5.4.1 in /Users/todd/.cache/pants/setup/bootstrap-Darwin-x86_64/pants.xyWThV/install/lib/python2.7/site-packages
Collecting pantsbuild.pants==1.2.0
  Downloading pantsbuild.pants-1.2.0.tar.gz (892kB)
    100% |████████████████████████████████| 901kB 845kB/s 
Collecting twitter.common.collections<0.4,>=0.3.1 (from pantsbuild.pants==1.2.0)
Collecting ansicolors==1.0.2 (from pantsbuild.pants==1.2.0)
Collecting setproctitle==1.1.10 (from pantsbuild.pants==1.2.0)
Collecting six<2,>=1.9.0 (from pantsbuild.pants==1.2.0)
  Using cached six-1.10.0-py2.py3-none-any.whl
Collecting packaging==16.7 (from pantsbuild.pants==1.2.0)
  Using cached packaging-16.7-py2.py3-none-any.whl
Collecting pathspec==0.3.4 (from pantsbuild.pants==1.2.0)
Collecting scandir==1.2 (from pantsbuild.pants==1.2.0)
Collecting twitter.common.dirutil<0.4,>=0.3.1 (from pantsbuild.pants==1.2.0)
Collecting requests<2.6,>=2.5.0 (from pantsbuild.pants==1.2.0)
  Using cached requests-2.5.3-py2.py3-none-any.whl
Collecting pystache==0.5.3 (from pantsbuild.pants==1.2.0)
Collecting psutil==4.3.0 (from pantsbuild.pants==1.2.0)
Collecting pex==1.1.13 (from pantsbuild.pants==1.2.0)
  Downloading pex-1.1.13-py2.py3-none-any.whl (105kB)
    100% |████████████████████████████████| 112kB 2.0MB/s 
Collecting docutils<0.13,>=0.12 (from pantsbuild.pants==1.2.0)
Collecting Markdown==2.1.1 (from pantsbuild.pants==1.2.0)
Collecting Pygments==1.4 (from pantsbuild.pants==1.2.0)
Collecting twitter.common.confluence<0.4,>=0.3.1 (from pantsbuild.pants==1.2.0)
Collecting fasteners==0.14.1 (from pantsbuild.pants==1.2.0)
  Using cached fasteners-0.14.1-py2.py3-none-any.whl
Collecting coverage<3.8,>=3.7 (from pantsbuild.pants==1.2.0)
Collecting pytest<2.7,>=2.6 (from pantsbuild.pants==1.2.0)
  Downloading pytest-2.6.4.tar.gz (512kB)
    100% |████████████████████████████████| 522kB 1.2MB/s 
Collecting pytest-cov<1.9,>=1.8 (from pantsbuild.pants==1.2.0)
  Downloading pytest-cov-1.8.1.tar.gz
Collecting futures==3.0.5 (from pantsbuild.pants==1.2.0)
  Using cached futures-3.0.5-py2-none-any.whl
Collecting cffi==1.7.0 (from pantsbuild.pants==1.2.0)
  Using cached cffi-1.7.0-cp27-cp27m-macosx_10_10_intel.whl
Collecting lmdb==0.89 (from pantsbuild.pants==1.2.0)
Collecting pywatchman==1.3.0 (from pantsbuild.pants==1.2.0)
Collecting twitter.common.lang==0.3.9 (from twitter.common.collections<0.4,>=0.3.1->pantsbuild.pants==1.2.0)
Collecting pyparsing (from packaging==16.7->pantsbuild.pants==1.2.0)
  Using cached pyparsing-2.1.10-py2.py3-none-any.whl
Collecting twitter.common.log==0.3.9 (from twitter.common.confluence<0.4,>=0.3.1->pantsbuild.pants==1.2.0)
Collecting monotonic>=0.1 (from fasteners==0.14.1->pantsbuild.pants==1.2.0)
  Using cached monotonic-1.2-py2.py3-none-any.whl
Collecting py>=1.4.25 (from pytest<2.7,>=2.6->pantsbuild.pants==1.2.0)
  Downloading py-1.4.32-py2.py3-none-any.whl (82kB)
    100% |████████████████████████████████| 92kB 2.0MB/s 
Collecting cov-core>=1.14.0 (from pytest-cov<1.9,>=1.8->pantsbuild.pants==1.2.0)
  Downloading cov-core-1.15.0.tar.gz
Collecting pycparser (from cffi==1.7.0->pantsbuild.pants==1.2.0)
Collecting twitter.common.options==0.3.9 (from twitter.common.log==0.3.9->twitter.common.confluence<0.4,>=0.3.1->pantsbuild.pants==1.2.0)
Building wheels for collected packages: pantsbuild.pants, pytest, pytest-cov, cov-core
  Running setup.py bdist_wheel for pantsbuild.pants ... done
  Stored in directory: /Users/todd/Library/Caches/pip/wheels/ce/ff/3a/2d8598b56bdfb9468ea67ee963a03c3b35b22dd597084a7cb1
  Running setup.py bdist_wheel for pytest ... done
  Stored in directory: /Users/todd/Library/Caches/pip/wheels/ca/1c/fe/8b76e537572f91c810910e822cccb178ba3156e432e644ac89
  Running setup.py bdist_wheel for pytest-cov ... done
  Stored in directory: /Users/todd/Library/Caches/pip/wheels/e5/94/08/eab43cda4e17e6ae729ed2cd832d7ca4e3ddca7fa6886ec2b8
  Running setup.py bdist_wheel for cov-core ... done
  Stored in directory: /Users/todd/Library/Caches/pip/wheels/86/e1/c2/9ff8cfe9773ce07003f2c2be096e169af4614c2f634671d49b
Successfully built pantsbuild.pants pytest pytest-cov cov-core
Installing collected packages: twitter.common.lang, twitter.common.collections, ansicolors, setproctitle, six, pyparsing, packaging, pathspec, scandir, twitter.common.dirutil, requests, pystache, psutil, pex, docutils, Markdown, Pygments, twitter.common.options, twitter.common.log, twitter.common.confluence, monotonic, fasteners, coverage, py, pytest, cov-core, pytest-cov, futures, pycparser, cffi, lmdb, pywatchman, pantsbuild.pants
Successfully installed Markdown-2.1.1 Pygments-1.4 ansicolors-1.0.2 cffi-1.7.0 cov-core-1.15.0 coverage-3.7.1 docutils-0.12 fasteners-0.14.1 futures-3.0.5 lmdb-0.89 monotonic-1.2 packaging-16.7 pantsbuild.pants-1.2.0 pathspec-0.3.4 pex-1.1.13 psutil-4.3.0 py-1.4.32 pycparser-2.17 pyparsing-2.1.10 pystache-0.5.3 pytest-2.6.4 pytest-cov-1.8.1 pywatchman-1.3.0 requests-2.5.3 scandir-1.2 setproctitle-1.1.10 six-1.10.0 twitter.common.collections-0.3.9 twitter.common.confluence-0.3.9 twitter.common.dirutil-0.3.9 twitter.common.lang-0.3.9 twitter.common.log-0.3.9 twitter.common.options-0.3.9
You are using pip version 8.1.2, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Exception caught: (<class 'pkg_resources.ContextualVersionConflict'>)
  File "/Users/todd/.cache/pants/setup/bootstrap-Darwin-x86_64/1.2.0/bin/pants", line 11, in <module>
    sys.exit(main())
  File "/Users/todd/.cache/pants/setup/bootstrap-Darwin-x86_64/1.2.0/lib/python2.7/site-packages/pants/bin/pants_exe.py", line 44, in main
    PantsRunner(exiter).run()
  File "/Users/todd/.cache/pants/setup/bootstrap-Darwin-x86_64/1.2.0/lib/python2.7/site-packages/pants/bin/pants_runner.py", line 57, in run
    options_bootstrapper=options_bootstrapper)
  File "/Users/todd/.cache/pants/setup/bootstrap-Darwin-x86_64/1.2.0/lib/python2.7/site-packages/pants/bin/pants_runner.py", line 46, in _run
    return LocalPantsRunner(exiter, args, env, options_bootstrapper=options_bootstrapper).run()
  File "/Users/todd/.cache/pants/setup/bootstrap-Darwin-x86_64/1.2.0/lib/python2.7/site-packages/pants/bin/local_pants_runner.py", line 53, in run
    self._maybe_profiled(self._run)
  File "/Users/todd/.cache/pants/setup/bootstrap-Darwin-x86_64/1.2.0/lib/python2.7/site-packages/pants/bin/local_pants_runner.py", line 50, in _maybe_profiled
    runner()
  File "/Users/todd/.cache/pants/setup/bootstrap-Darwin-x86_64/1.2.0/lib/python2.7/site-packages/pants/bin/local_pants_runner.py", line 59, in _run
    options, build_config = OptionsInitializer(options_bootstrapper, exiter=self._exiter).setup()
  File "/Users/todd/.cache/pants/setup/bootstrap-Darwin-x86_64/1.2.0/lib/python2.7/site-packages/pants/bin/options_initializer.py", line 174, in setup
    backends)
  File "/Users/todd/.cache/pants/setup/bootstrap-Darwin-x86_64/1.2.0/lib/python2.7/site-packages/pants/bin/options_initializer.py", line 83, in _load_plugins
    return load_backends_and_plugins(plugins, working_set, backend_packages)
  File "/Users/todd/.cache/pants/setup/bootstrap-Darwin-x86_64/1.2.0/lib/python2.7/site-packages/pants/bin/extension_loader.py", line 37, in load_backends_and_plugins
    load_plugins(build_configuration, plugins or [], working_set)
  File "/Users/todd/.cache/pants/setup/bootstrap-Darwin-x86_64/1.2.0/lib/python2.7/site-packages/pants/bin/extension_loader.py", line 83, in load_plugins
    aliases = entries['build_file_aliases'].load()()
  File "/Users/todd/.cache/pants/setup/bootstrap-Darwin-x86_64/1.2.0/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2228, in load
    self.require(*args, **kwargs)
  File "/Users/todd/.cache/pants/setup/bootstrap-Darwin-x86_64/1.2.0/lib/python2.7/site-packages/pkg_resources/__init__.py", line 2245, in require
    items = working_set.resolve(reqs, env, installer)
  File "/Users/todd/.cache/pants/setup/bootstrap-Darwin-x86_64/1.2.0/lib/python2.7/site-packages/pkg_resources/__init__.py", line 834, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)

Exception message: (setuptools 21.2.1 (/Users/todd/.cache/pants/setup/bootstrap-Darwin-x86_64/pants.xyWThV/install/lib/python2.7/site-packages), Requirement.parse('setuptools==5.4.1'), set(['pantsbuild.pants']))

pants build fails: Exception in thread "main" java.lang.NoSuchFieldError: NONE

Here's the last part of the log:

INFO: Admin HTTP interface started on port 7655.
read 0 slugs
512   INFO  io.fsq.twofishes.util.Helpers$ - readSlugs took 0 seconds
1738  INFO  org.mongodb.driver.cluster - Cluster created with settings {hosts=[127.0.0.1:27017], mode=SINGLE, requiredClusterType=UNKNOWN, serverSelectionTimeout='30000 ms', maxWaitQueueSize=500}
Exception in thread "main" java.lang.NoSuchFieldError: NONE
        at com.mongodb.casbah.WriteConcern$.<init>(WriteConcern.scala:40)
        at com.mongodb.casbah.WriteConcern$.<clinit>(WriteConcern.scala)
        at com.mongodb.casbah.BaseImports$class.$init$(Implicits.scala:162)
        at com.mongodb.casbah.Imports$.<init>(Implicits.scala:142)
        at com.mongodb.casbah.Imports$.<clinit>(Implicits.scala)
        at com.mongodb.casbah.MongoClient.apply(MongoClient.scala:217)
        at io.fsq.twofishes.indexer.mongo.MongoGeocodeDAO$.<init>(MongoGeocodeDAO.scala:10)
        at io.fsq.twofishes.indexer.mongo.MongoGeocodeDAO$.<clinit>(MongoGeocodeDAO.scala)
        at io.fsq.twofishes.indexer.importers.geonames.GeonamesParser$.main(GeonamesParser.scala:96)
        at io.fsq.twofishes.indexer.importers.geonames.GeonamesParser.main(GeonamesParser.scala)
1848  INFO  org.mongodb.driver.connection - Opened connection [connectionId{localValue:1, serverValue:1}] to 127.0.0.1:27017
1849  INFO  org.mongodb.driver.cluster - Monitor thread successfully connected to server with description ServerDescription{address=127.0.0.1:27017, type=STANDALONE, state=CONNECTED, ok=true, version=ServerVersion{versionList=[2, 0, 4]}, minWireVersion=0, maxWireVersion=0, maxDocumentSize=16777216, roundTripTimeNanos=388042}

Once the build reaches this point, it makes no further progress. I let it run overnight and it didn't continue.

This happens with an absolutely clean start: clone the repo into an empty directory, make a new empty directory and set $HOME to the empty directory.

After cloning I run:

./src/jvm/io/fsq/twofishes/scripts/download-world.sh

I'm using this command line to do the indexes build:

./src/jvm/io/fsq/twofishes/scripts/parse.py -w --output_prefix_index -r -s -i --yes-i-am-sure -- --create_unmatched_features true

I had no trouble building two weeks ago, so I tried doing a git checkout to an earlier version of the repo, and still hit this error. I'm not sure how that's possible; I hope it's some mistake I have made.

Request for more docs on Twofishes autocomplete

I have been trying to use the Twofishes autocomplete feature. I am building JSON requests and trying to specify everything I can to tell Twofishes how to respond. In particular, I am providing latitude/longitude and specifying autocompleteBias of LOCAL.

Here's a sample JSON query (pretty-printed for readability):

{
  "autocomplete": true,
  "autocompleteBias": 2,
  "cc": "US",
  "ll": {
    "lat": 34.01945,
    "lng": -118.49119
  },
  "maxInterpretations": 25,
  "query": "Ca",
  "responseIncludes": [7],
  "woeHint": [7],
  "woeRestrict": [22,7,10,9,8,11,12,13]
}

The above specifies the lat/lng of Santa Monica, and is attempting to autocomplete the city of Calabasas (less than 20 miles away straight-line distance). The results do not include Calabasas but do include results from distant countries.

Even worse, if you send JSON just like the above but with the "query" set to "Sa", Santa Monica isn't in the results, even though the distance from the lat/lng is 0.

Am I doing anything wrong in my query? Is there a better way to do the query? Can you share any documentation on how to best run autocomplete queries?
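
For anyone trying to reproduce this, here is a minimal Python sketch that assembles the same request body. The helper name `build_autocomplete_query` and the constant `AUTOCOMPLETE_BIAS_LOCAL` are made up for illustration; the field names and values are copied from the sample query above. It only builds the JSON and does not send it anywhere.

```python
import json

# The sample query above uses 2 for the LOCAL bias.
AUTOCOMPLETE_BIAS_LOCAL = 2


def build_autocomplete_query(prefix, lat, lng, cc="US", max_interpretations=25):
    """Return the JSON body for an autocomplete request as a string."""
    query = {
        "autocomplete": True,
        "autocompleteBias": AUTOCOMPLETE_BIAS_LOCAL,
        "cc": cc,
        "ll": {"lat": lat, "lng": lng},
        "maxInterpretations": max_interpretations,
        "query": prefix,
        "responseIncludes": [7],
        "woeHint": [7],
        "woeRestrict": [22, 7, 10, 9, 8, 11, 12, 13],
    }
    return json.dumps(query, sort_keys=True)


# Santa Monica coordinates from the sample query, with the "Ca" prefix.
body = build_autocomplete_query("Ca", 34.01945, -118.49119)
print(body)
```

Changing only the `prefix` argument to "Sa" gives the second query described above.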

Instructions for setting up pom-resolve on a new repo?

Hi @mateor,

Following up from Slack, I was wondering if you could give any guidance on setting up pom-resolve and the validation tooling on a new repo; I don't see packages on PyPI for anything but buildgen. It looks like neat stuff! It solves a lot of the problems I was having with Pants deps.

(I was having problems getting buildgen to work, but I think I've root-caused it to changes in the setup script from pantsbuild/setup#15, which cause setuptools version conflicts between what the script installs and what pants requires when fsqio requires it. I'm not clear what the right fix is.)

OSX failures with psycopg or `ld: library not found for -lssl`

There is an OSX dependency on clang and gcc through the postgresql requirement. I have seen machines that haven't updated Homebrew or XCode regularly hit the referenced errors when bootstrapping Pants.

This is easy to hit when you upgrade to OSX Sierra, as well.

If you encounter bootstrap failures with psycopg, clang, xcode, postgresql or ssl, try:

  • Upgrade XCode
  • run:
brew unlink python postgresql
brew uninstall python postgresql
brew install python postgresql
  • reboot
  • run pants again

This is due to upstream Apple and Homebrew dithering; there is not much we can do about it from here.

Twofishes does not seem to recognize streets nor housenumbers at all

Hi,
I installed Twofishes with a world download, following the install instructions.
The geocoder is running and resolves villages, cities, and named places around the world, but it does not recognize any street names or house numbers.

What could have gone wrong during installation? How can I find out? And how can I correct it?

Thanks for any help!

Instructions for installing pom resolve in a new repo

After the last issue I took a stab at it. So far it's working, but I haven't tried putting in the complicated deps.

Copy these files from the fsqio repo:

src/python/fsqio/__init__.py
src/python/fsqio/pants/__init__.py
src/python/fsqio/pants/pom/*

You can delete src/python/fsqio/pants/pom/BUILD unless you're planning on writing tests for or publishing the pom-resolve; this means you don't need to put the deps in your 3rdparty file.

Add this to your pants.ini:

# Note this is effectively requirements.txt for the pants environment, so we need the deps of
# our build dependencies, see: https://github.com/pantsbuild/pants/issues/4001
[GLOBAL]
plugins: +[
    "requests==2.5.3",
    "requests-futures==0.9.4",
   ]
backend_packages: +[
    "fsqio.pants.pom",
  ]
[pom-resolve]
# Fill in these values as appropriate; there's good defaults in fsqio's pants.ini
maven_repos: [
    ...
  ]
global_exclusions = [
    ...
  ]
global_pinned_versions = [
    ...
  ]
local_override_versions = [
    ...
  ]

[cache.pom-resolve]
write_to: ...
read_from: ...

Installing tag validation on a new repo

I took a stab at this as well; it seems straightforward.

Copy these files from the foursquare/fsqio repo to your repo:

src/python/fsqio/__init__.py
src/python/fsqio/pants/__init__.py
src/python/fsqio/pants/register.py
src/python/fsqio/pants/validate.py

Add this to your pants.ini:

[GLOBAL]
pythonpath: +[
    "%(buildroot)s/src/python",
  ]
backend_packages: +[
    "pants.contrib.node",
    "pants.contrib.python.checks",
    "fsqio.pants",
  ]

[tag]
by_prefix:  {
    "3rdparty": ["exempt"],
    "tests": ["tests"],
    "src": ["dependencies_cannot_have:tests"]
  }

`./pants` does not work out of the box - no `requests-sessions`

This is what happens on a fresh clone:

jsirois@gill ~/dev/3rdparty/fsqio (master) $ git log -1
commit 837a4965444334a849f539ed20f8b6a7f2d46af9
Author: John Gallagher <[email protected]>
Date:   Fri Jan 8 14:41:08 2016 -0500

    Update fhttp readme for fsq.io

    (sapling split of 15944090750f351ba99b4bee763c84019a33affc)
jsirois@gill ~/dev/3rdparty/fsqio (master) $ git clean -fdx && ./pants pom-resolve
++++ which python2.7
+++ PYTHON=/home/jsirois/.pyenv/shims/python2.7
+++ PANTS_HOME=/home/jsirois/.cache/pants/setup
+++ PANTS_BOOTSTRAP=/home/jsirois/.cache/pants/setup/bootstrap
+++ VENV_VERSION=13.1.2
+++ VENV_PACKAGE=virtualenv-13.1.2
+++ VENV_TARBALL=virtualenv-13.1.2.tar.gz
+++ FOURSQUARE_REQUIREMENTS=3rdparty/python/requirements.txt
++++ bootstrap_pants
++++ pants_requirement=pantsbuild.pants
+++++ grep -E '^[[:space:]]*pants_version' pants.ini
+++++ sed -E 's|^[[:space:]]*pants_version[:=][[:space:]]*([^[:space:]]+)|\1|'
++++ pants_version=0.0.65
++++ [[ -n 0.0.65 ]]
++++ pants_requirement=pantsbuild.pants==0.0.65
++++ [[ ! -d /home/jsirois/.cache/pants/setup/bootstrap/0.0.65 ]]
++++ echo /home/jsirois/.cache/pants/setup/bootstrap/0.0.65
+++ pants_dir=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65
+++ export PANTSBINARY=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants
+++ PANTSBINARY=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants
+ '[' '' == --help ']'
+ '[' '' == force ']'
+ '[' '' '!=' '' ']'
+ exec ./scripts/upkeep/check.sh
++++ PYTHON=/home/jsirois/.pyenv/shims/python2.7
++++ PANTS_HOME=/home/jsirois/.cache/pants/setup
++++ PANTS_BOOTSTRAP=/home/jsirois/.cache/pants/setup/bootstrap
++++ VENV_VERSION=13.1.2
++++ VENV_PACKAGE=virtualenv-13.1.2
++++ VENV_TARBALL=virtualenv-13.1.2.tar.gz
++++ FOURSQUARE_REQUIREMENTS=3rdparty/python/requirements.txt
+++++ bootstrap_pants
+++++ pants_requirement=pantsbuild.pants
++++++ grep -E '^[[:space:]]*pants_version' pants.ini
++++++ sed -E 's|^[[:space:]]*pants_version[:=][[:space:]]*([^[:space:]]+)|\1|'
+++++ pants_version=0.0.65
+++++ [[ -n 0.0.65 ]]
+++++ pants_requirement=pantsbuild.pants==0.0.65
+++++ [[ ! -d /home/jsirois/.cache/pants/setup/bootstrap/0.0.65 ]]
+++++ echo /home/jsirois/.cache/pants/setup/bootstrap/0.0.65
++++ pants_dir=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65
++++ export PANTSBINARY=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants
++++ PANTSBINARY=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants
+ set -e
+ shopt -s nullglob
+ ran=
+ '[' '' '!=' '' ']'
+ '[' '' '!=' '' ']'
++++ which python2.7
+++ PYTHON=/home/jsirois/.pyenv/shims/python2.7
+++ PANTS_HOME=/home/jsirois/.cache/pants/setup
+++ PANTS_BOOTSTRAP=/home/jsirois/.cache/pants/setup/bootstrap
+++ VENV_VERSION=13.1.2
+++ VENV_PACKAGE=virtualenv-13.1.2
+++ VENV_TARBALL=virtualenv-13.1.2.tar.gz
+++ FOURSQUARE_REQUIREMENTS=3rdparty/python/requirements.txt
++++ bootstrap_pants
++++ pants_requirement=pantsbuild.pants
+++++ grep -E '^[[:space:]]*pants_version' pants.ini
+++++ sed -E 's|^[[:space:]]*pants_version[:=][[:space:]]*([^[:space:]]+)|\1|'
++++ pants_version=0.0.65
++++ [[ -n 0.0.65 ]]
++++ pants_requirement=pantsbuild.pants==0.0.65
++++ [[ ! -d /home/jsirois/.cache/pants/setup/bootstrap/0.0.65 ]]
++++ echo /home/jsirois/.cache/pants/setup/bootstrap/0.0.65
+++ pants_dir=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65
+++ export PANTSBINARY=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants
+++ PANTSBINARY=/home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants
+ '[' -n '' ']'
+ '[' -z /home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants ']'
+ export PYTHONPATH=src/python
+ PYTHONPATH=src/python
+ exec /home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants --print-exception-stacktrace pom-resolve
Traceback (most recent call last):
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/extension_loader.py", line 140, in load_backend
    module = importlib.import_module(backend_module)
  File "/usr/lib64/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/home/jsirois/dev/3rdparty/fsqio/src/python/fsqio/pants/register.py", line 15, in <module>
    from fsqio.pants.pom.pom_resolve import PomResolve
  File "/home/jsirois/dev/3rdparty/fsqio/src/python/fsqio/pants/pom/pom_resolve.py", line 39, in <module>
    from requests_futures.sessions import FuturesSession
ImportError: No module named requests_futures.sessions

Exception caught: (<class 'pants.base.exceptions.BackendConfigurationError'>)
  File "/home/jsirois/.cache/pants/setup/bootstrap/0.0.65/bin/pants", line 11, in <module>
    sys.exit(main())
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/pants_exe.py", line 26, in main
    LocalPantsRunner(exiter).run()
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/pants_runner.py", line 60, in run
    self._maybe_profiled(self._run)
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/pants_runner.py", line 57, in _maybe_profiled
    runner()
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/pants_runner.py", line 65, in _run
    options, build_config = OptionsInitializer(options_bootstrapper, exiter=self.exiter).setup()
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/goal_runner.py", line 128, in setup
    return self._setup_options(self._options_bootstrapper, self._working_set)
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/goal_runner.py", line 102, in _setup_options
    build_configuration = load_plugins_and_backends(plugins, working_set, backend_packages)
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/extension_loader.py", line 36, in load_plugins_and_backends
    load_build_configuration_from_source(build_configuration, additional_backends=backends or [])
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/extension_loader.py", line 126, in load_build_configuration_from_source
    load_backend(build_configuration, backend_package)
  File "/home/jsirois/.cache/pants/setup/bootstrap/pants.4sOId7/install/lib/python2.7/site-packages/pants/bin/extension_loader.py", line 144, in load_backend
    .format(backend=backend_module, error=e))

Exception message: Failed to load the fsqio.pants.register backend: No module named requests_futures.sessions

So either the build script needs to bootstrap a venv with 3rdparty/python/requirements.txt, or instructions need to be added to the README for some out-of-band manual setup.
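
Until one of those happens, a quick preflight check can show whether the modules are importable before running `./pants`. This is only a sketch: `missing_modules` is a hypothetical helper, the module list is taken from the traceback above rather than parsed from requirements.txt, and it is written for Python 3 even though the log shows Pants bootstrapping under Python 2.7.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be imported."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# requests_futures is the module the traceback above failed to import.
required = ["requests", "requests_futures"]
missing = missing_modules(required)
if missing:
    print("Missing Python modules: " + ", ".join(missing))
else:
    print("All required modules are importable.")
```

It has to run with the same interpreter and environment that Pants uses, otherwise the result says nothing about the Pants venv.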

twofishes.net web site needs updates

The twofishes.net web site still links to the now-obsolete "twofishes" repo on GitHub. It should be edited to point to the new "fsqio" repo.

Also, the prebuilt Twofishes .jar file is very old, and should be updated with a recent build. There is a bug in autocomplete that was fixed in Twofishes 0.90.3 that is still present in the prebuilt Twofishes .jar file, and a new build would include the fix.

In order to consume published jars, you need to add too many resolvers

Currently we fetch 3rdparty dependencies from multiple resolvers. Consuming published jars (say, from sbt) requires that the sbt project be aware of all of those resolvers. A couple of these resolvers are only needed for single jars. This issue tracks the need for a better solution.

If you are trying to consume io.fsq artifacts in an sbt project you will need to point it at all the resolvers we use. That list is in the pants.ini file under [pom-resolve] maven_repos. As of the time of this issue, adding these resolvers to an sbt project looks like:

  resolvers += "Sonatype OSS Releases" at "https://oss.sonatype.org/content/repositories/releases"

  resolvers += "geotools" at "http://download.osgeo.org/webdav/geotools"

  resolvers += "maven" at "https://repo.maven.apache.org/maven2"

  resolvers += "twitter" at "https://maven.twttr.com"

  resolvers += "jnegre" at "https://bintray.com/artifact/download/jnegre/maven"

  resolvers += "con-jars" at "http://conjars.org/repos"

  resolvers += "clojars" at "http://clojars.org/repo"

We are considering how to make this easier in the future but for now you will need to be able to fetch jars from all of the above repos.

Twofishes does not handle misspelled city names at all

Do a Twofishes lookup on: "Baton Rouge, LA"
Result: correctly resolves

Do a Twofishes lookup on: "Baton Roug, LA"
Result: resolves to "Los Angeles, CA, United States"

Any misspelling in the city name causes a failure. We first saw this when someone transposed the 'u' and the 'g' to get "Baton Rogue, LA" which won't even trip a spell checker since "Rogue" is a word.

Similarly, "Terra Haute, IN" resolves to the country of India. "Terre Haute, IN" correctly resolves.

The basic problem is that any misspelling in the city name causes the city name to be completely ignored. If you are lucky it will at least get the state right (example: "Minneapolid, MN" resolves to Minnesota, United States). But Los Angeles is over 1800 miles away from Louisiana so it's a very poor result.

I speculate that a misspelled city name is ignored because Twofishes is designed to ignore extra stuff as in "Empire State Building, 350 5th Ave, New York, NY" which resolves to New York, NY, United States.
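
As a possible client-side workaround, and not a description of how Twofishes works, the city name could be fuzzy-matched against a list of known names before the query is sent. The sketch below uses Python's standard `difflib`; `correct_city` and `KNOWN_CITIES` are hypothetical, and a real list would have to come from your own data.

```python
import difflib


def correct_city(name, known_cities, cutoff=0.8):
    """Return the closest known city name, or the input unchanged if none is close."""
    matches = difflib.get_close_matches(name, known_cities, n=1, cutoff=cutoff)
    return matches[0] if matches else name


# A tiny stand-in for a real gazetteer.
KNOWN_CITIES = ["Baton Rouge", "Terre Haute", "Minneapolis", "Los Angeles"]

for typo in ["Baton Roug", "Baton Rogue", "Terra Haute", "Minneapolid"]:
    print(typo, "->", correct_city(typo, KNOWN_CITIES))
```

The cutoff of 0.8 is an arbitrary choice; a lower value catches more misspellings but also risks "correcting" a valid name into a different city.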

Build failure

Build of freshly cloned fsqio crashes with the following messages:

$ ./pants compile test src:: test::
Traceback (most recent call last):
  File "/Users/user/.cache/fsqio/setup/bootstrap/pants.nhTMrG/install/lib/python2.7/site-packages/pants/init/extension_loader.py", line 123, in load_backend
    module = importlib.import_module(backend_module)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/Users/user/.cache/fsqio/setup/bootstrap/pants.nhTMrG/install/lib/python2.7/site-packages/pants/contrib/go/register.py", line 15, in <module> 
    from pants.contrib.go.tasks.go_binary_create import GoBinaryCreate
  File "/Users/user/.cache/fsqio/setup/bootstrap/pants.nhTMrG/install/lib/python2.7/site-packages/pants/contrib/go/tasks/go_binary_create.py", line 15, in <module> 
    from pants.contrib.go.tasks.go_task import GoTask
  File "/Users/user/.cache/fsqio/setup/bootstrap/pants.nhTMrG/install/lib/python2.7/site-packages/pants/contrib/go/tasks/go_task.py", line 15, in <module> 
    from pants.util.process_handler import subprocess
ImportError: cannot import name subprocess
Exception caught: (<class 'pants.base.exceptions.BackendConfigurationError'>)
  File "/Users/user/.cache/fsqio/setup/bootstrap/1.3.1rc1/bin/pants", line 11, in <module> 
    sys.exit(main())
  File "/Users/user/.cache/fsqio/setup/bootstrap/pants.nhTMrG/install/lib/python2.7/site-packages/pants/bin/pants_exe.py", line 44, in main 
    PantsRunner(exiter).run()
  File "/Users/user/.cache/fsqio/setup/bootstrap/pants.nhTMrG/install/lib/python2.7/site-packages/pants/bin/pants_runner.py", line 57, in run
    options_bootstrapper=options_bootstrapper)
  File "/Users/user/.cache/fsqio/setup/bootstrap/pants.nhTMrG/install/lib/python2.7/site-packages/pants/bin/pants_runner.py", line 46, in _run 
    return LocalPantsRunner(exiter, args, env, options_bootstrapper=options_bootstrapper).run()
  File "/Users/user/.cache/fsqio/setup/bootstrap/pants.nhTMrG/install/lib/python2.7/site-packages/pants/bin/local_pants_runner.py", line 37, in run
    self._run()
  File "/Users/user/.cache/fsqio/setup/bootstrap/pants.nhTMrG/install/lib/python2.7/site-packages/pants/bin/local_pants_runner.py", line 43, in _run 
    options, build_config = OptionsInitializer(options_bootstrapper, exiter=self._exiter).setup()
  File "/Users/user/.cache/fsqio/setup/bootstrap/pants.nhTMrG/install/lib/python2.7/site-packages/pants/init/options_initializer.py", line 155, in setup
    global_bootstrap_options.backend_packages)
  File "/Users/user/.cache/fsqio/setup/bootstrap/pants.nhTMrG/install/lib/python2.7/site-packages/pants/init/options_initializer.py", line 82, in _load_plugins
    return load_backends_and_plugins(plugins, working_set, backend_packages)
  File "/Users/user/.cache/fsqio/setup/bootstrap/pants.nhTMrG/install/lib/python2.7/site-packages/pants/init/extension_loader.py", line 36, in load_backends_and_plugins
    load_build_configuration_from_source(build_configuration, backends)
  File "/Users/user/.cache/fsqio/setup/bootstrap/pants.nhTMrG/install/lib/python2.7/site-packages/pants/init/extension_loader.py", line 109, in load_build_configuration_from_source
    load_backend(build_configuration, backend_package)
  File "/Users/user/.cache/fsqio/setup/bootstrap/pants.nhTMrG/install/lib/python2.7/site-packages/pants/init/extension_loader.py", line 127, in load_backend
    .format(backend=backend_module, error=e))

Exception message: Failed to load the pants.contrib.go.register backend: cannot import name subprocess

Both OSX and Fedora 27 are affected by the issue. I don't have access to other systems to test.

Docker image update?

In reference to #17, we discovered that the docker image is probably out-of-date and fresh builds don't include changes that allow the map tiles to render properly. This ticket is to track updating the docker image.

Unable to build indexes

I have had no trouble building indexes from the old twofishes repository. However, I have been completely unable to build indexes from the fsqio repo, despite multiple attempts.

  • parse.py attempts to check MongoDB version using the path: dependencies/mongodb/bin/mongo
    However, I can't find any documentation that shows how to set up the dependencies subdirectory, or any script that sets it up. I solved this problem by simply modifying parse.py, changing the filename to mongo and letting parse.py find it through the PATH environment variable.
  • Index building failed with a JVM garbage collection error. I attempted to solve this problem by modifying parse.py to add these JVM flags:
jvm_args.append("-Xms8G")
jvm_args.append("-Xmx8G")
jvm_args.append("-XX:+UseConcMarkSweepGC")
  • Index building failed with MongoDB reporting a duplicate key error.
6611072 INFO  i.f.t.i.i.geonames.GeonamesParser - imported 10940000 features so far
6615337 INFO  i.f.t.i.i.geonames.GeonamesParser - imported 10950000 features so far
6617326 INFO  i.f.t.i.i.geonames.GeonamesParser$ - finished: parse global features in 6591 secs / 109 mins
6617326 INFO  i.f.t.i.i.geonames.GeonamesParser$ - starting: parse global postal codes
6617329 INFO  i.f.t.i.i.geonames.GeonamesParser - imported 0 postal codes so far
EException in thread "main" com.mongodb.MongoException$DuplicateKey: { "serverUsed" : "127.0.0.1:27017" , "err" : "E11000 duplicate key error index: geocoder.features.$_id_  dup key: { : 144396663052787542 }" , "code" : 11000 , "n" : 0 , "connectionId" : 2 , "ok" : 1.0}

I tried building with different options, I tried re-cloning fsqio, and I made sure that the MongoDB data directory was completely empty before restarting MongoDB and the build. I still got the duplicate key error.

  • I have been trying to build on an Amazon EC2 instance, and therefore under Linux. I am right now trying to get a Mac so I can set up my build environment to be exactly like what Foursquare uses. (I do think it would be desirable to have Linux as a supported build platform.)
  • I would very much appreciate any additional documentation you could give about index building and how to set it up. I don't know what purpose the S2 indexes serve, I don't know for certain how to build in stages... For example: if I build only the reverse geo indexes, and then I want to build the autocomplete indexes... can I just run the index build again with --noreload and the result will be indexes with both reverse geo and autocomplete?
  • I understand that Foursquare uses a Hadoop cluster to build indexes. I would very much appreciate a README file describing how to do that. I haven't used a Hadoop cluster and I don't even know where to start. Do I need to learn Java, Scala, and the whole Hadoop stack just to run a Hadoop build?
  • I would appreciate the dependencies for building to be better documented. For example, is Oracle JDK the only supported JDK? (In the twofishes repo, OpenJDK 7 works just fine. In the fsqio repo, I tried OpenJDK 8 and had mysterious problems so I installed Oracle JDK.) Also, is a specific version of Scala required, or just "anything 2.10 or newer"? Postgres is listed as a requirement.... is that required for index building or just for the mongo2postgis.py script?

I apologize if this is too much stuff for one issue in the issue tracker. If you would prefer I ask these questions in another venue please let me know where; and if you would prefer I split this up into multiple issue tickets, just let me know. Thank you.

Pants buildgen plugin out of date

Hello! I was trying to use the buildgen plugin and it looks like pants development has left it in the dust. For example, the plugin source contains `from pants.build_graph.source_mapper import LazySourceMapper`, which no longer exists within pants (since 2018 or so?). Is there an updated version of buildgen which hasn't been upstreamed yet, or has the project been abandoned?

Start populating the github wiki

We have some internal documentation that I can migrate to the wiki - and we have had multiple consumers write up their experiences consuming plugins.

I think my favorite answer is to just add .md files in the source. I know it is a little non-standard but the contributor list is so small that I don't have much faith in being able to keep a separate wiki from drifting.

The short-term answer is to take the text of issues #27, #28, #29, and #30 and somehow surface them on the Fsq.io wiki tab.

twofishes: export polygons

Hi, I've followed this: https://github.com/foursquare/fsqio/blob/master/src/jvm/io/fsq/twofishes/docs/twofishes_inputs.md#polygons and put my shape files into data/private/polygons/

When I run the build with `./src/jvm/io/fsq/twofishes/scripts/parse.py --world --output_s2_covering_index --output_s2_interior_index --output_prefix_index --output_revgeo_index --yes-i-am-sure` it uses significantly more disk space than without the shapefiles in place; however, the resulting indexes are no bigger.

Is there some other step I need to run to export the polygons as well?

Spindle doesn't support i64 consts?

I have

const i64 MaxTotalAttachmentBytes = 10000000000
const i64 MaxCoverPhotoBytes = 50000000 // 50 MB

in my Thrift and get the following when I compile:

Omers-MacBook-Pro:sigma-monorepo-2point0 omer$ ./pants compile src/thrift/::

18:11:15 00:00 [main]
               (To run a reporting server: ./pants server)
18:11:15 00:00   [setup]
18:11:15 00:00     [parse]
               Executing tasks in goals: bootstrap -> imports -> unpack-jars -> deferred-sources -> gen -> jvm-platform-validate -> resolve -> compile
18:11:15 00:00   [bootstrap]
18:11:15 00:00     [jar-dependency-management]
18:11:15 00:00     [bootstrap-jvm-tools]
18:11:15 00:00   [imports]
18:11:15 00:00     [ivy-imports]
18:11:15 00:00   [unpack-jars]
18:11:15 00:00     [unpack-jars]
18:11:15 00:00   [deferred-sources]
18:11:15 00:00     [deferred-sources]
18:11:15 00:00   [gen]
18:11:15 00:00     [spindle]
18:11:16 00:01     [thrift]
18:11:16 00:01     [protoc]
18:11:16 00:01     [antlr]
18:11:16 00:01     [ragel]
18:11:16 00:01     [jaxb]
18:11:16 00:01     [wire]
18:11:16 00:01   [jvm-platform-validate]
18:11:16 00:01     [jvm-platform-validate]WARN] No default jvm platform is defined.

18:11:16 00:01       [cache]
                   No cached artifacts for 55 targets.
                   Invalidated 55 targets.
18:11:16 00:01   [resolve]
18:11:16 00:01     [ivy]
18:11:16 00:01   [compile]
18:11:16 00:01     [compile-jvm-prep-command]
18:11:16 00:01       [jvm_prep_command]
18:11:16 00:01     [compile-prep-command]
18:11:16 00:01     [compile]
18:11:16 00:01     [zinc]
18:11:16 00:01       [cache]
                   No cached artifacts for 1 target.
                   Invalidated 1 target.
18:11:16 00:01       [isolation-zinc-pool-bootstrap]
                   [1/1] Compiling 2 zinc sources in 1 target (.pants.d/gen/spindle/src/jvm:src.thrift.com.thesigma.merit.merit-scala).
18:11:16 00:01       [compile]

18:11:16 00:01         [zinc]
                       [info] Compiling 1 Scala source and 1 Java source to /Users/omer/code/sigma-monorepo-2point0/.pants.d/compile/zinc/252d64521cf9/.pants.d.gen.spindle.src.jvm.src.thrift.com.thesigma.merit.merit-scala/current/classes...
                       [error] /Users/omer/code/sigma-monorepo-2point0/.pants.d/gen/spindle/src/jvm/com/thesigma/merit/gen/merit.scala:24: integer number too large
                       [error]       val MaxTotalAttachmentBytes: Long = 10000000000
                       [error]                                           ^
                       [error] /Users/omer/code/sigma-monorepo-2point0/.pants.d/gen/spindle/src/jvm/com/thesigma/merit/gen/merit.scala:25: ';' expected but 'val' found.
                       [error]       val MaxCoverPhotoBytes: Long = 50000000
                       [error]       ^
                       [error] two errors found
                       [error] Compile failed at Jul 27, 2017 6:11:16 PM [0.427s]

                   compile(.pants.d/gen/spindle/src/jvm:src.thrift.com.thesigma.merit.merit-scala) failed: Zinc compile failed.
FAILURE: Compilation failure: Failed jobs: compile(.pants.d/gen/spindle/src/jvm:src.thrift.com.thesigma.merit.merit-scala)


18:11:16 00:01   [complete]
               FAILURE

It seems like just having Spindle output 10000000000L instead of 10000000000 would fix the problem.
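
To illustrate the suggestion: Scala treats an unsuffixed integer literal as an `Int`, so 10000000000 is out of range unless it carries the `L` suffix. The sketch below is in Python and is not Spindle's code generator (which is written in Scala); `render_scala_i64` and `fits_in_int32` are hypothetical helpers showing the rendering rule being proposed.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def render_scala_i64(value):
    """Render an i64 constant as a Scala Long literal, always with the L suffix."""
    if not INT64_MIN <= value <= INT64_MAX:
        raise ValueError("value does not fit in a 64-bit signed integer")
    return "{}L".format(value)


def fits_in_int32(value):
    """Return True if the value is representable as a 32-bit signed integer."""
    return INT32_MIN <= value <= INT32_MAX


# The two constants from the Thrift file above.
for constant in (10000000000, 50000000):
    print(render_scala_i64(constant), "fits in Int:", fits_in_int32(constant))
```

Always emitting the suffix for `i64` constants keeps the rule simple, since the suffix is harmless on values that would also fit in an `Int`.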

Scala 2.12 branch

Twitter consumes some of the Twofishes libraries and is interested in helping to get a 2.12 branch started. This is a notice of our intent to do so!

Twofishes: Is it possible to list all cities in a country?

Hi,

If I do a query for New York City, the returned data contains references to NYC's parents. That data allows me to traverse up and see that NYC lies in the state of New York, and that the state of New York lies in the US.

Now, I would like to do the opposite. That is, list all the cities in the US.

It doesn't look like Twofishes can do this, but I thought it was worth asking the question.

Thanks!

twofishes website is offline

Hi,

I wanted to check some documentation on the Twofishes website and realized it is offline. Did you change the domain name, or is it down for maintenance?

thrift map parser incorrectly parsing values with string literal `:`

The thrift value below does not compile. Link to discussion in #code

const map<string, list<string>> NamesToUrls = {
  "myName": [
    "https://helloworld.com",
    "https://foobar.com"
  ]
}
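
As the title suggests, the trouble appears to be the `:` characters inside the string literals. The Python sketch below is not Spindle's parser (which is written in Scala); it only shows how a plain split on `:` breaks a map entry whose values are URLs, and how a split that ignores colons inside double quotes avoids that. `split_top_level_colon` is a hypothetical helper.

```python
from typing import Tuple


def split_top_level_colon(entry: str) -> Tuple[str, str]:
    """Split 'key: value' on the first colon that is outside double quotes."""
    in_quotes = False
    for i, ch in enumerate(entry):
        if ch == '"' and (i == 0 or entry[i - 1] != "\\"):
            # Toggle quote state on every unescaped double quote.
            in_quotes = not in_quotes
        elif ch == ":" and not in_quotes:
            return entry[:i].strip(), entry[i + 1:].strip()
    raise ValueError("no key/value separator found")


entry = '"myName": ["https://helloworld.com", "https://foobar.com"]'

# A plain split also cuts at the colons inside the URLs.
naive_parts = entry.split(":")
print(len(naive_parts), "pieces from the naive split")

key, value = split_top_level_colon(entry)
print(key, "=>", value)
```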

Error

==== stderr ====
                       SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
                       SLF4J: Defaulting to no-operation (NOP) logger implementation
                       SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
                       java.lang.Exception: unable to parse map value '{
  							"myName": [
    							"https://helloworld.com",
    							"https://foobar.com"
 							]
						}'
                       	at io.fsq.spindle.codegen.runtime.MapRenderType.renderValue(RenderType.scala:275)
                       	at io.fsq.spindle.codegen.runtime.ScalaConst.valueOption(ScalaConst.scala:28)
                       	at src.resources.io.fsq.ssp.codegen.scala.$_scalate_$constants_ssp$$anonfun$$_scalate_$render$1.apply(constants.ssp.scala:32)
                       	at src.resources.io.fsq.ssp.codegen.scala.$_scalate_$constants_ssp$$anonfun$$_scalate_$render$1.apply(constants.ssp.scala:29)
                       	at scala.collection.immutable.List.foreach(List.scala:381)
                       	at src.resources.io.fsq.ssp.codegen.scala.$_scalate_$constants_ssp$.$_scalate_$render(constants.ssp.scala:29)
                       	at src.resources.io.fsq.ssp.codegen.scala.$_scalate_$constants_ssp.render(constants.ssp.scala:48)
                       	at org.fusesource.scalate.RenderContext$$anonfun$render$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(RenderContext.scala:391)
                       	at org.fusesource.scalate.RenderContext$$anonfun$render$1$$anonfun$apply$mcV$sp$1.apply(RenderContext.scala:391)
                       	at org.fusesource.scalate.RenderContext$$anonfun$render$1$$anonfun$apply$mcV$sp$1.apply(RenderContext.scala:391)
                       	at org.fusesource.scalate.RenderContext$class.withUri(RenderContext.scala:447)
                       	at org.fusesource.scalate.DefaultRenderContext.withUri(DefaultRenderContext.scala:30)
                       	at org.fusesource.scalate.RenderContext$$anonfun$render$1.apply$mcV$sp(RenderContext.scala:390)
                       	at org.fusesource.scalate.RenderContext$$anonfun$render$1.apply(RenderContext.scala:390)
                       	at org.fusesource.scalate.RenderContext$$anonfun$render$1.apply(RenderContext.scala:390)
                       	at org.fusesource.scalate.RenderContext$class.withAttributes(RenderContext.scala:421)
                       	at org.fusesource.scalate.DefaultRenderContext.withAttributes(DefaultRenderContext.scala:30)
                       	at org.fusesource.scalate.RenderContext$class.render(RenderContext.scala:389)
                       	at org.fusesource.scalate.DefaultRenderContext.render(DefaultRenderContext.scala:30)
                       	at src.resources.io.fsq.ssp.codegen.scala.$_scalate_$record_ssp$.$_scalate_$render(record.ssp.scala:53)
                       	at src.resources.io.fsq.ssp.codegen.scala.$_scalate_$record_ssp.render(record.ssp.scala:103)
                       	at org.fusesource.scalate.layout.NullLayoutStrategy$.layout(LayoutStrategy.scala:43)
                       	at org.fusesource.scalate.TemplateEngine$$anonfun$layout$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(TemplateEngine.scala:559)
                       	at org.fusesource.scalate.TemplateEngine$$anonfun$layout$1$$anonfun$apply$mcV$sp$1.apply(TemplateEngine.scala:559)
                       	at org.fusesource.scalate.TemplateEngine$$anonfun$layout$1$$anonfun$apply$mcV$sp$1.apply(TemplateEngine.scala:559)
                       	at org.fusesource.scalate.RenderContext$class.withUri(RenderContext.scala:447)
                       	at org.fusesource.scalate.DefaultRenderContext.withUri(DefaultRenderContext.scala:30)
                       	at org.fusesource.scalate.TemplateEngine$$anonfun$layout$1.apply$mcV$sp(TemplateEngine.scala:558)
                       	at org.fusesource.scalate.TemplateEngine$$anonfun$layout$1.apply(TemplateEngine.scala:555)
                       	at org.fusesource.scalate.TemplateEngine$$anonfun$layout$1.apply(TemplateEngine.scala:555)
                       	at org.fusesource.scalate.RenderContext$.using(RenderContext.scala:47)
                       	at org.fusesource.scalate.TemplateEngine.layout(TemplateEngine.scala:555)
                       	at org.fusesource.scalate.TemplateEngine.layout(TemplateEngine.scala:587)
                       	at org.fusesource.scalate.TemplateEngine.layout(TemplateEngine.scala:579)
                       	at io.fsq.spindle.codegen.binary.ThriftCodegen$$anonfun$compile$2.apply(ThriftCodegen.scala:233)
                       	at io.fsq.spindle.codegen.binary.ThriftCodegen$$anonfun$compile$2.apply(ThriftCodegen.scala:187)
                       	at scala.collection.immutable.List.foreach(List.scala:381)
                       	at io.fsq.spindle.codegen.binary.ThriftCodegen$.compile(ThriftCodegen.scala:187)
                       	at io.fsq.spindle.codegen.binary.ThriftCodegen$.main(ThriftCodegen.scala:142)
                       	at io.fsq.spindle.codegen.binary.ThriftCodegen.main(ThriftCodegen.scala)
                       	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
                       	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
                       	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                       	at java.lang.reflect.Method.invoke(Method.java:498)
                       	at com.martiansoftware.nailgun.NGSession.run(NGSession.java:280)
                       Caused by: scala.MatchError: [Ljava.lang.String;@68cb85dc (of class [Ljava.lang.String;)
                       	at io.fsq.spindle.codegen.runtime.MapRenderType$$anonfun$2.apply(RenderType.scala:263)
                       	at io.fsq.spindle.codegen.runtime.MapRenderType$$anonfun$2.apply(RenderType.scala:263)
                       	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
                       	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
                       	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
                       	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
                       	at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
                       	at scala.collection.mutable.ArrayOps$ofRef.flatMap(ArrayOps.scala:186)
                       	at io.fsq.spindle.codegen.runtime.MapRenderType.renderValue(RenderType.scala:263)
                       	... 44 more
                       
                       ==== stdout ====
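
The issue title points at `:` inside string values, and the `scala.MatchError` on a `String` array would be consistent with a key/value split that yields more than two pieces. The code at RenderType.scala:263 is not shown here, so that reading is an assumption. The sketch below illustrates the general failure mode in Python, not the Spindle implementation, and the entry is flattened to a single string for simplicity.

```python
# One map entry, flattened to a single line for illustration.
ENTRY = '"myName": "https://helloworld.com"'

def split_naive(entry):
    # Splitting on every ':' also cuts the URL at "https:".
    return [part.strip() for part in entry.split(":")]

def split_first(entry):
    # Splitting only at the first ':' keeps the value intact.
    # Caveat: this still breaks if the key itself contains ':'.
    key, _, value = entry.partition(":")
    return key.strip(), value.strip()

print(split_naive(ENTRY))  # three pieces instead of two
print(split_first(ENTRY))
```

A robust parser would track quoting instead of splitting on a delimiter. The sketch only shows why a URL value can turn a two-element result into three.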

pants build fails: No module named contrib.node.register

Hi. When running:
./src/jvm/io/fsq/twofishes/scripts/parse.py -w /output/dir
I get this exception message:
Failed to load the pants.contrib.node.register backend: No module named contrib.node.register

This is the same as issue #23. However, when I run
rm -rf ~/.cache/fsqio
from the fsqio directory and try the command above again, I get the same exception message.

Need documentation on how to deploy Twofishes

With the original "twofishes" repo, there were a couple of sentences in the documentation which told how to deploy the Twofishes server: A better option is to run "./sbt server/assembly" and then use the resulting server/target/server-assembly-VERSION.jar. Serve that with java -jar JARFILE --hfile_basepath /directory

I can't find any documentation at all in the "fsqio" repo. I'm pretty sure that it will be some sort of pants build command such as "pants bundle" but I haven't figured it out yet. Please add some documentation on how to build a standalone .jar file for Twofishes.

Content Issues

First, thanks to the Foursquare team for opening such a great resource. I've uncovered some content issues. Some relate to TwoFishes itself, while others relate to Geonames.

Countries

  1. East Germany should not be a country. It even shows up on Foursquare.com, which is strange.
  2. Democratic Republic of Congo and Republic of Congo are treated as the same country on foursquare.com and in the latest index build. Strangely, the TwoFishes demo recognizes them as two different countries, as it should.
  3. Cities in American Samoa follow the wrong nomenclature. For example, Pago Pago is labeled as Pago Pago, Eastern District instead of Pago Pago, American Samoa. (The Eastern District is a division within American Samoa.)
  4. To be consistent, either all territories or none should be included. Currently only some are included. Those missing include American Samoa (AS), Antarctica (AQ), Bouvet Island (BV), French Southern Territories (TF), Isle of Man (IM), Tokelau (TK). I omit Western Sahara from this list because it is disputed; however, the same can be said of Antarctica.

Rankings

  1. Madrid, Colombia ranks above Madrid, Spain (even with local bias this seems incorrect)

Consistent Treatment of Woetype

  1. It seems impossible to truly isolate towns/cities because, while most towns/cities are in woetype 7, some are also in woetype 10 (for example Westport, MA). But if we open applications to woetype=7,10, we run into a lot of duplication issues, such as Johannesburg below.

Same city, multiple IDs (these are just a few examples; sometimes they fall within the same woetype, other times they span multiple woetypes)

  1. Two instances of Johannesburg, South Africa (Johannesburg and City of Johannesburg). Nothing online indicates there is a parent administrative area over Johannesburg with the same name.
  2. Westport Township vs Town of Westport vs Westport, SD
  3. Town of Howell, NJ vs Howell, NJ

Bounding Boxes

  1. The bounding boxes for the US and Russia seem to cover the entire globe on the twofishes demo and in the latest index, though the foursquare site seems to correct these two bounding boxes.

Buildgen doesn't work on Spindle?

It looks like buildgen in fsqio isn't updating dependencies in src/thrift:

Omers-MacBook-Pro:fsqio omer$ git status
On branch master
Your branch is up-to-date with 'origin/master'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   src/thrift/io/fsq/twofishes/BUILD

no changes added to commit (use "git add" and/or "git commit -a")
Omers-MacBook-Pro:fsqio omer$ git diff src/thrift/io/fsq/twofishes/BUILD
diff --git a/src/thrift/io/fsq/twofishes/BUILD b/src/thrift/io/fsq/twofishes/BUILD
index 2846792..dbcaa37 100644
--- a/src/thrift/io/fsq/twofishes/BUILD
+++ b/src/thrift/io/fsq/twofishes/BUILD
@@ -3,9 +3,6 @@
 scala_record_library(
   name = 'twofishes',
   dependencies = [
-    '3rdparty:bson',
-    '3rdparty:twitter-util',
-    'src/jvm/io/fsq/spindle/runtime',
   ],
   sources = globs('*.thrift'),
   provides=scala_artifact(
Omers-MacBook-Pro:fsqio omer$ ./pants buildgen

22:31:09 00:00 [main]
               (To run a reporting server: ./pants server)
22:31:09 00:00   [setup]
22:31:09 00:00     [parse]
               Executing tasks in goals: tag -> bootstrap -> imports -> unpack-jars -> validate -> deferred-sources -> build-spindle -> jvm-platform-validate -> gen -> webpack -> map-java-exported-symbols -> map-scala-exported-symbols -> map-jvm-symbol-to-source-tree -> resolve -> map-third-party-jar-symbols -> map-scala-used-symbols -> map-sources-to-addresses-mapper -> map-scala-library-used-addresses -> map-python-exported-symbols -> map-derived-targets -> buildgen
22:31:10 00:01   [tag]
22:31:10 00:01     [tag]
22:31:10 00:01   [bootstrap]
22:31:10 00:01     [substitute-aliased-targets]
22:31:10 00:01     [bootstrap-jvm-tools]
22:31:10 00:01     [provide-tools-jar]
22:31:10 00:01     [global-jar-dependency-management]
22:31:10 00:01   [imports]
22:31:10 00:01     [ivy-imports]
22:31:10 00:01   [unpack-jars]
22:31:11 00:02     [unpack-jars]
22:31:11 00:02   [validate]
22:31:11 00:02     [validate]
22:31:11 00:02   [deferred-sources]
22:31:11 00:02     [deferred-sources]
22:31:11 00:02   [build-spindle]
22:31:11 00:02     [build-spindle]
22:31:11 00:02   [jvm-platform-validate]
22:31:11 00:02     [jvm-platform-validate]
22:31:11 00:02   [gen]
22:31:11 00:02     [antlr-java]
22:31:11 00:02     [antlr-py]
22:31:11 00:02     [jaxb]
22:31:11 00:02     [protoc]
22:31:11 00:02     [ragel]
22:31:11 00:02     [thrift-java]
22:31:11 00:02     [thrift-py]
22:31:11 00:02     [wire]
22:31:11 00:02     [go-thrift]
22:31:11 00:02     [spindle]
22:31:11 00:02     [validate-graph]
22:31:11 00:02   [webpack]
22:31:11 00:02     [webpack-resolve]
22:31:11 00:02     [webpack-gen]
22:31:12 00:03   [map-java-exported-symbols]
22:31:12 00:03     [map-java-exported-symbols]
22:31:12 00:03   [map-scala-exported-symbols]
22:31:12 00:03     [map-scala-exported-symbols]
22:31:12 00:03   [map-jvm-symbol-to-source-tree]
22:31:12 00:03     [map-jvm-symbol-to-source-tree]
22:31:12 00:03   [resolve]
22:31:12 00:03     [ivy]
22:31:13 00:04     [go]
22:31:13 00:04   [map-third-party-jar-symbols]
22:31:13 00:04     [map-third-party-jar-symbols]
22:31:13 00:04   [map-scala-used-symbols]
22:31:13 00:04     [map-scala-used-symbols]
22:31:13 00:04   [map-sources-to-addresses-mapper]
22:31:13 00:04     [map-sources-to-addresses-mapper]
22:31:13 00:04   [map-scala-library-used-addresses]
22:31:13 00:04     [map-scala-library-used-addresses]
22:31:13 00:04   [map-python-exported-symbols]
22:31:13 00:04     [map-python-exported-symbols]
22:31:13 00:04   [map-derived-targets]
22:31:13 00:04     [map-derived-targets]
22:31:13 00:04   [buildgen]
22:31:13 00:04     [go]
22:31:13 00:04     [buildgen]
22:31:13 00:04     [aggregate-targets]
22:31:13 00:04     [scala]WARN] BuildFileManipulator would have added test/thrift/io/fsq/spindle/codegen/parser/test:test as a dependency of test/jvm/io/fsq/spindle/codegen/binary/test:test, but that dependency was already forced with a comment.

22:31:13 00:04     [python]
22:31:14 00:05   [complete]
               SUCCESS

I recall it working there and see there's some code in buildgen referring to Spindle.

"Cambridge, Worcester County, MA" has woeType of 7

I have found a neighborhood marked with a woeType of 7 (TOWN) and it is causing a quirk in the displayType returned for an actual town.

Here is a Twofishes query for Cambridge with location hint set to the center of Massachusetts, requesting 20 interpretations:

http://demo.twofishes.net/static/geocoder.html?query=Cambridge&ll=42.36565,-71.10832&maxInterpretations=20

Interpretation 1 is the famous Cambridge, MA. As it should, it has woeType set to 7 (TOWN). However, it is shown with a displayName of Cambridge, Middlesex County, MA.

Interpretation 14 is Cambridge, Worcester County, MA. This also has woeType set to 7 (TOWN) which I believe is incorrect. The source is qs_neighborhoods.shp and I believe the woeType should be set to 22 (SUBURB).

Wikipedia shows a neighborhood of Worcester, MA called "Cambridge Street":

https://en.wikipedia.org/wiki/Neighborhoods_of_Worcester,_Massachusetts

Perhaps it would be best if the name of the neighborhood was changed to "Cambridge Street" rather than just "Cambridge"? But if I am not mistaken that would be an issue to file on Quattroshapes and not on Twofishes.

If there is only one interpretation, the displayName is Cambridge, MA as expected:

http://demo.twofishes.net/static/geocoder.html?query=Cambridge&ll=42.36565,-71.10832

Therefore it seems plausible that Twofishes is giving the unusual displayName of Cambridge, Middlesex County, MA only when returning two distinct cities with the same name in the same state, and it's adding county to disambiguate. I am hoping that if the woeType of the "Cambridge Street" neighborhood is properly set to 22 (SUBURB) that my users will consistently get Cambridge, MA as the displayName no matter how many interpretations are requested for a city called "Cambridge".

Are all the places from qs_neighborhoods.shp being loaded with woeType of 7 (TOWN)? If so, this could cause multiple related quirks similar to this one.
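
The hypothesis above can be modeled in a few lines. This is a sketch of the reporter's guess only, not Twofishes's actual display-name logic. The field names (`name`, `county`, `state`, `woe_type`) and the rule "add the county when two results share name, state, and woeType" are assumptions for illustration.

```python
from collections import Counter

def display_names(places):
    """Add the county only when name, state, and woeType collide."""
    counts = Counter((p["name"], p["state"], p["woe_type"]) for p in places)
    names = []
    for p in places:
        if counts[(p["name"], p["state"], p["woe_type"])] > 1:
            names.append("{}, {}, {}".format(p["name"], p["county"], p["state"]))
        else:
            names.append("{}, {}".format(p["name"], p["state"]))
    return names

# Both entries marked as woeType 7 (TOWN), as observed in the demo.
observed = [
    {"name": "Cambridge", "county": "Middlesex County", "state": "MA", "woe_type": 7},
    {"name": "Cambridge", "county": "Worcester County", "state": "MA", "woe_type": 7},
]
# The same data with the neighborhood retagged as woeType 22 (SUBURB).
retagged = [dict(observed[0]), dict(observed[1], woe_type=22)]

print(display_names(observed))
print(display_names(retagged))
```

Under this model, retagging the neighborhood removes the collision and the town is shown without its county. Whether Twofishes behaves this way would need to be confirmed against its source.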

Cannot build when Postgres 10+ is present

The build process fails on a newer Ubuntu server image that has Postgres 10.

The specific problem is that the Python code uses psycopg2 and has the version required to be exactly 2.5.3, which fails with Postgres 10. The message is: Error: could not determine PostgreSQL version from '10.0'

I suggest changing the 3rdparty/python/requirements.txt file, and changing the psycopg2 line to specify >= instead of ==. Version 2.6 and newer of psycopg2 work with Postgres 10.
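
The error message suggests the installer cannot parse a two-part version string. The sketch below shows how that kind of failure can arise and how a more lenient pattern avoids it. Both regular expressions are written for this example and are not copied from psycopg2.

```python
import re

# Assumed for illustration: a pattern that insists on major.minor.patch.
STRICT = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")
# A lenient variant that treats the third component as optional.
LENIENT = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?$")

def parse_version(text, pattern):
    match = pattern.match(text)
    if match is None:
        raise ValueError("could not determine PostgreSQL version from %r" % text)
    # A missing component is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())

print(parse_version("9.6.5", STRICT))
print(parse_version("10.0", LENIENT))
try:
    parse_version("10.0", STRICT)
except ValueError as err:
    print("Error: %s" % err)
```

Postgres 10 switched to two-part version numbers, which is why a parser written for the older three-part scheme can reject `10.0`.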

Making twofishes run on https at 8443 port

I am able to run twofishes on my localhost at 8081 successfully. I am trying to run it on https on 8443. I have seen the jetty documentation at link which gives an option for using the java command as follows:
java -jar $JETTY_HOME/start.jar --add-to-startd=https
However, when I try to run the same with the server binary like this:
java -jar server-assembly-0.84.9.jar --add-to-startd=https --hfile_basepath 2015-03-05-20-05-30.753698/
I get the following:

10:02:23.864 [main] INFO  c.f.twofishes.GeocodeFinagleServer$ - starting version 0.84.9
Error: Unknown option --add-to-startd=https
Usage: twofishes [options]

  --host <value>
        bind to specified host (default 0.0.0.0)
  -p <value> | --port <value>
        port to run thrift server on
  -h <value> | --run_http_server <value>
        whether or not to run http/json server on port+1
  --hfile_basepath <value>
        directory containing output hfile for serving
  --preload <value>
        scan the hfiles at startup to prevent a cold start, turn off when testing
  --warmup <value>
        warmup the server at startup to prevent a cold start, turn off when testing
  --max_tokens <value>
        maximum number of tokens to allow geocoding
  --hotfix_basepath <value>
        directory containing hot fix files
  --enable_private_endpoints <value>
        enable private endpoints on server

I know this is configured using serve.py, but is there any way to make it run on https without making code modifications?
Kindly let me know how I can make it run on https.

Return order of reverse geocode requests

When I request this lat/lng with any radius up to 24 meters, there are 6 results and the first result is Dumbo, Brooklyn: http://demo.twofishes.net/?ll=40.7026,-73.993&radius=24

If I make the request with a radius of 25 or greater, there are 7 results (which I expect) but the first result is Brooklyn Heights, Brooklyn: http://demo.twofishes.net/?ll=40.7026,-73.993&radius=25.

Intuitively it seems like Dumbo, Brooklyn should always be the first result for that lat/lng regardless of radius. Is there something in the data itself that is influencing the order here?
