Code Monkey home page Code Monkey logo

Comments (24)

cabol avatar cabol commented on June 29, 2024

@nsweeting can you copy some code, for example ResourceURL.get_admin_url, I need to check out that code, thanks!

BTW, are you running the distributed cache on one node or multiple nodes? If you are running this on multiple nodes, are you sure the cache is started in all nodes? just to check, but again, as much code as you can share would be the best

from nebulex.

nsweeting avatar nsweeting commented on June 29, 2024

@cabol

We have a resource url protocol:

defprotocol App.ResourceURL do
  def get_admin_url(resource)
end

With an implementation done for Product structs.

defimpl App.ResourceURL, for: App.Product do
  def get_admin_url(product) do
    case App.ResourceURLCache.get_or_set(:admin_url, product.account_id) do
      {:ok, base_url} -> "#{base_url}#{product.admin_url_path}"
      _ -> ""
    end
  end
end

Which calls our cache containing the errored get/2 function.

  def get(type, cache_key) do
    case DistCache.get({__MODULE__, type, cache_key}) do
      nil -> {:error, :not_found}
      url -> {:ok, url}
    end
  end

  def get_or_set(type, cache_key) do
    with {:error, :not_found} <- get(type, cache_key) do
      set(type, cache_key)
    end
  end

And finally, the nebulex cache module:

defmodule App.DistCache do
  use Nebulex.Cache, otp_app: :apple: 
end

from nebulex.

cabol avatar cabol commented on June 29, 2024

@nsweeting about the second question:

BTW, are you running the distributed cache on one node or multiple nodes? If you are running this on multiple nodes, are you sure the cache is started in all nodes? just to check

from nebulex.

nsweeting avatar nsweeting commented on June 29, 2024

During the test runs, it was running on a single node.

from nebulex.

cabol avatar cabol commented on June 29, 2024

Ok, but yes, that error looks very weird, let me try to reproduce it, meanwhile, can you try but with the master branch? There have been several improvements, and perhaps we can have better info about the error. Pls let me know your results, stay tuned!

from nebulex.

cabol avatar cabol commented on June 29, 2024

@nsweeting any news? On other hand, help me with this, the Nebulex cache App.DistCache you copied, is it using Nebulex.Adapters.Dist? If so, you need also to define another cache with a local backend, where is that Nebulex cache using Nebulex.Adapters.Local? Or is App.ResourceURLCache your local Nebulex cache? If so, I have some comments about it, but let's confirm that first. BTW, may you copy also the Nebulex config section in your config file ?

from nebulex.

nsweeting avatar nsweeting commented on June 29, 2024

@cabol - here is the config:

config :app, App.LocalCache,
  adapter: Nebulex.Adapters.Local,
  gc_interval: 86_400

config :app, App.DistCache,
  adapter: Nebulex.Adapters.Dist,
  local: App.LocalCache,
  node_picker: Nebulex.Adapters.Dist

No news right now - ALTHOUGH - it did happen once again randomly. I meant to copy the error output, but forgot to :( It was a different error though - related to :badarg or something. Same spot in nebulex though.

So far - this has ONLY happened in tests though. I have yet to see it in our staging or production servers.

I'm hoping it would crop up again, but I have yet to find a way to reproduce it...

from nebulex.

nsweeting avatar nsweeting commented on June 29, 2024

Got hit by the error I described above running the tests just a moment ago.

** (UndefinedFunctionError) function :badarg.exception/1 is undefined (module :badarg is not available)
     code: assert "http://foo.category/baz" = ResourceURL.get_url(category)
     stacktrace:
       :badarg.exception([])
       (nebulex) lib/nebulex/adapters/dist.ex:265: Nebulex.Adapters.Dist.rpc_call/4
       (app) lib/app/support/caches/dist_cache.ex:2: App.DistCache.execute/2
       (app) lib/app/ecommerce/resource_urls/resource_url_cache.ex:17: App.ResourceURLCache.get/2
       (app) lib/app/ecommerce/resource_urls/resource_url_cache.ex:24: App.ResourceURLCache.get_or_set/2
       (app) lib/app/ecommerce/resource_urls/implements/category.ex:3: App.ResourceURL.App.Category.get_url/1
       test/ecommerce/resource_urls/resource_url_test.exs:60: (test)

Different - but the same.

from nebulex.

cabol avatar cabol commented on June 29, 2024

@nsweeting ok, I have some suspicious, one of the possible reason this might be happening is that maybe the local cache for the distributed one is being stopped before the function is completed, maybe a race condition in the test, but I don't know how you are starting and stopping the cache(s) in your tests. But I think we can test it out easily, in your test case, add a try rescue wrap in App.ResourceURL.App.Category.get_url/1 call, and within the rescue section, call for instance App.LocalCache.get("something"), let's check is the function crashes with the same error. Also, check if there is something in your tests, maybe another test case that is stopping, or crashing the app or something similar, fi the tests run asynchronously, that could affect other test cases.

On other hand, change nebulex dep in your mix.exs file and point to master branch or the latest commit; it would be good to see what happens with latests version.

from nebulex.

cabol avatar cabol commented on June 29, 2024

@nsweeting any news?

from nebulex.

cabol avatar cabol commented on June 29, 2024

Heads up! version 1.0.0 was released already, you should try it out!

from nebulex.

nsweeting avatar nsweeting commented on June 29, 2024

Hey @cabol - thanks for the heads up. Will try. I've had it happen randomly once since the last comment. Still have yet to find reproducible steps.

I dont think its the caches being started or not - I have both caches underneath a "Cache Supervisor" that I use to start some other non-nebulex caches. The cache supervisor is started on application start. Nothing in my code stops them at any point either.

In the meantime, feel free to close this if wanted...

from nebulex.

nsweeting avatar nsweeting commented on June 29, 2024

@cabol I upgraded to the latest 1.0.0 release. Had a random test failure. Slightly different than the others though.

     ** (ArgumentError) argument error
     code: assert "http://foo.checkout/baz" = ResourceURL.get_url(checkout)
     stacktrace:
       (nebulex) lib/nebulex/adapters/dist.ex:325: Nebulex.Adapters.Dist.rpc_call/6
       (nebulex) lib/nebulex/cache/object.ex:12: Nebulex.Cache.Object.get/3
       (app) lib/app/support/caches/dist_cache.ex:2: App.DistCache.with_hooks/3
       (app) lib/app/ecommerce/resource_urls/resource_url_cache.ex:17: App.ResourceURLCache.get/2
       (app) lib/app/ecommerce/resource_urls/resource_url_cache.ex:24: App.ResourceURLCache.get_or_set/2
       (app) lib/app/ecommerce/resource_urls/implements/checkout.ex:3: App.ResourceURL.App.Checkout.get_url/1
       test/ecommerce/resource_urls/resource_url_test.exs:12: (test)

from nebulex.

cabol avatar cabol commented on June 29, 2024

@nsweeting I think I have an idea what might be happening. Perhaps is related to the generations manager, maybe at the same time a new generation is pushed a new command is being processed, when a new generation is created, there is a moment (very small) where the ETS tables are removed, hence, if for some reason there is a command being processed, it will fail with ArgumentError. It is very weird and unusual scenario to happen, that's why it's happening randomically for you, after some time. This is just a suspicion, still very weird since in your config you have the gc_interval set to 86_400 , and I suppose your tests finish before that, isn't? BTW, the error happens when you run the unit tests right (if so, this is not the case)? or is it a different kind of test you are running?

from nebulex.

nsweeting avatar nsweeting commented on June 29, 2024

Sorry for the delay. Yes - its configured with the gc_interval: 86_400, and the tests do finish before then. And yeah - it seems to only occur while running tests. I have yet to see the error pop up in staging or production.

from nebulex.

cabol avatar cabol commented on June 29, 2024

@nsweeting any news? in the meantime, I'll close this issue, but feel free to re-open it again!

from nebulex.

nsweeting avatar nsweeting commented on June 29, 2024

No news. Thanks for checking in. I dont recall it happening in a while - but yeah - I'll just reopen if I run across it and have time to dive in a bit. 👍

from nebulex.

nsweeting avatar nsweeting commented on June 29, 2024

@cabol - So I just had it happen on another test run. I think I might have tracked down what the issue is. It has changed a little bit - most likely from recent PR's. So, at the moment, it has transformed into a general argument error:

     ** (ArgumentError) argument error
       ...
       (nebulex) lib/nebulex/adapters/dist.ex:325: Nebulex.Adapters.Dist.rpc_call/6
       (nebulex) lib/nebulex/cache/object.ex:12: Nebulex.Cache.Object.get/3
       ...

Which I believe is as a result of:

def call(node, mod, fun, args, task_sup, timeout) do
{task_sup, node}
|> Task.Supervisor.async(
__MODULE__,
:call,
[node, mod, fun, args, task_sup, timeout]
)
|> Task.await(timeout)
end

I'm assuming Task.Supervisor is raising as a result of the supervisor hitting the max children. Which, by default, would be the number of schedulers. Could easily happen - especially in a concurrent test scenario. Unbound child creation is probably not ideal - but I'm wondering if a better error might be in order?

from nebulex.

cabol avatar cabol commented on June 29, 2024

Ok, I'll take a look at it and try to figure out what's going on. Thanks for the info :) !!

from nebulex.

cabol avatar cabol commented on June 29, 2024

I was checking out the Task.Supervisor options, and apparently the :max_children is :infinity by default – https://hexdocs.pm/elixir/DynamicSupervisor.html#init/1-options

Anyways, I will fix the adapter at least to allow to pass the supervisor available options.

from nebulex.

nsweeting avatar nsweeting commented on June 29, 2024

Sorry - you are correct! Not quite sure where I got that assumption from...

from nebulex.

cabol avatar cabol commented on June 29, 2024

NP, actually it did make sense. Do you think is possible to give me more details about the tests? I'm not sure if maybe there might be a race condition or something, in which the cache is stopped while other test is still running, I don't know. So if you can give me more details or even paste some code related to the tests you are running, that would be very helpful. Thanks!

from nebulex.

cabol avatar cabol commented on June 29, 2024

@nsweeting any news? Did you check the previous suggestion ^^^? Perhaps we should close this issue in the meantime, while we have more info and a clear way to reproduce the error.

from nebulex.

nsweeting avatar nsweeting commented on June 29, 2024

Sorry - no news. The application in question has stabilized, so less day-to-day work is currently occurring. Have not noticed the error in a while though. Will close for now. Sorry - this issue snowballed into a giant thread!

from nebulex.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.