
#2079 Actors: shared resource pool with max limit on resources

JonasL Thu 10 Jan 2013

I'm trying to use a pool of database connections with a max limit of open connections. Incoming web threads access the pool until it's saturated.

This is fine with an actor managing the pool handing out new connections if none is available - until I hit the max limit. Then I want the web-request thread to sleep on some lock until a connection is returned to the pool.

I can't get my head around how to do this in a clean way with actors or the other concurrency tools in Fantom.

With Java I could wait on the object and the returning thread could notify other threads waiting (but then not using actors at all, just simple synchronized access and wait on object if none is available).

I guess I could loop in the calling thread: return some NotAvailable value from the actor, sleep for a random interval in the calling thread, and then send a new message to the actor requesting the resource again. But it just seems wrong, essentially polling an actor.

Any ideas?

brian Thu 10 Jan 2013

I may not be understanding you quite right, but I don't think you need to do anything. Say the ActorPool is assigned a thread pool of 100. You just keep spinning up actors with that pool; then if you have more than 100 actors running, the 101st one will just automatically hibernate until it gets scheduled onto one of the threads.

That is essentially how WispService works when accepting HTTP connections.

JonasL Thu 10 Jan 2013

I'll try again:

It's not about how to spin up more actors (on limited threads); it's a question of many threads accessing fewer resources: somewhere, some threads have to wait until one of the scarce resources becomes available again.

So in the web scenario, the webserver spawns one thread per request. The request needs a db-connection, which is pooled. Say there are at most 10 pooled db-connections: when the 11th web-server thread is spawned there is no available db-connection, since all 10 are in use.

The question was how to suspend the 11th web-server thread until one connection is returned to the pool.

In Java I'd do something like (pseudo-code-ish):

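List<Connection> list;

synchronized(list) {
  while (list.isEmpty()) {   // loop, not if: guards against spurious wakeups
    list.wait();
  }
  Connection conn = list.remove(0);   // take the connection out of the pool
}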

and then some other thread returning a connection:

synchronized(list) {
  list.add(returnedConnection());   // put the connection back into the pool
  list.notify();
}

This wakes up one of the threads waiting for the pooled resource.
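
A runnable version of the pseudo-code above might look like the following minimal sketch. GuardedPool and its method names are made up for illustration, and T stands in for a real connection type. The wait sits in a while loop because Java allows spurious wakeups.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical bounded pool; T stands in for a database connection.
final class GuardedPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();

    GuardedPool(Iterable<T> resources) {
        for (T r : resources) idle.add(r);
    }

    // Blocks the calling thread until a resource is available.
    synchronized T acquire() throws InterruptedException {
        while (idle.isEmpty()) {   // re-check the guard after every wakeup
            wait();
        }
        return idle.removeFirst();
    }

    // Returns a resource to the pool and wakes one waiting thread.
    synchronized void release(T resource) {
        idle.addLast(resource);
        notify();
    }

    // Number of resources currently idle in the pool.
    synchronized int available() {
        return idle.size();
    }
}
```

The max limit is simply the number of resources handed to the constructor: the pool never creates new ones, so an eleventh caller waits until one of the ten is released.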

brian Thu 10 Jan 2013

To do a proper suspend like I think you are talking about requires true continuations, which Java doesn't support. If you want to block in the middle of a body of code like you wrote, you have to block the entire thread and can't release that thread back to the pool. The same would be true in Java. You have to use a heavyweight thread to do that.

How you block in Fantom would be to block on an actor result using Future.get. So if a method made a blocking call out to a connection pool, then the thread would automatically block until the proper actor was scheduled onto a thread pool and returned the result.

But with the actor model what you really want to do is break things into messages that complete. That way you can release the thread back to the pool if you don't have a message to process. The major downside is that you can't write nice sequential logic such as continuations would allow.

JonasL Fri 11 Jan 2013

I think we are talking past each other - I always try to give context to a question, but I think it made it more confusing this time.

This is exactly what I'm after - but done with actors (if possible): http://en.wikipedia.org/wiki/Guarded_suspension

This question touches on the same thing, but no real answer is given on what to do if the resources are limited in supply. http://stackoverflow.com/questions/1645341/proper-way-to-access-shared-resource-in-scala-actors

This does solve it (I think), but it feels like shoehorning the solution into actors: each resource (consumer in the example) needs to be an actor too, and then you pass all of your work into the scheduling actor, which queues it onto the first vacant consumer-actor: http://stackoverflow.com/questions/1007010/can-scala-actors-process-multiple-messages-simultaneously/1008532#1008532 - I guess this is where the true continuations you mention would help.

Actors seem to be the only way to share state in Fantom, so I was curious how to solve this using actors.

Thanks for taking the time to try and understand my confusing questions :)

brian Fri 11 Jan 2013

Well, continuations just let you pause something and use the OS thread for something else. In Java and Fantom, anytime you block, you are still using a real thread. That is sort of orthogonal to this discussion though, I think.

If you try to model stuff like Java, where you want to block a thread until something happens, then the only similar construct is Future.get, which blocks until an actor has processed a result.

Although I think it is better to think in terms of functional units for the actor model. So if you want at most 10 database connections running, then I would design that ActorPool with a thread pool size of 10, and then spin off an actor to process each unit of work that uses a connection. In that case you would probably pass a function/method to the actor, which in turn would call back your method with the mutable db connection. Something like:

static Void work1(Conn db) { }
static Void work2(Conn db) { }

DbActor(pool).send(#work1)
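
To make that suggestion concrete, here is a rough Java analogue (Java rather than Fantom so that it can be run as-is). It is only a sketch under assumptions: Conn and DbWorkers are hypothetical names, a fixed thread pool stands in for the ActorPool, and each worker thread lazily opens its own connection, so the pool size caps the number of connections.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stand-in for a real database connection.
final class Conn {
    private static final AtomicInteger OPENED = new AtomicInteger();
    final int id = OPENED.incrementAndGet();

    // Total number of Conn objects created so far.
    static int openedSoFar() {
        return OPENED.get();
    }
}

// At most maxConnections worker threads exist, and each lazily opens one Conn.
final class DbWorkers implements AutoCloseable {
    private final ExecutorService workers;
    private final ThreadLocal<Conn> conn = ThreadLocal.withInitial(Conn::new);

    DbWorkers(int maxConnections) {
        workers = Executors.newFixedThreadPool(maxConnections);
    }

    // Queue a unit of work; it runs once a worker (and so a connection) is free.
    <R> Future<R> send(Function<Conn, R> work) {
        Callable<R> task = () -> work.apply(conn.get());
        return workers.submit(task);
    }

    @Override
    public void close() {
        workers.shutdown();
    }
}
```

A caller that needs the result blocks on the returned Future with get(), which is the same role Future.get plays in Fantom. Work submitted while all workers are busy simply waits in the executor's queue.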

yliu Sat 12 Jan 2013

If I'm reading correctly you're saying that you want something like this:

Future receive(){

  if(condition){
    process()
    return something
  }

  // not ready yet: schedule a retry in 1ms (Actor.sendLater takes the duration first)
  sendLater(1ms, null)
}

but you're saying that doing this feels wrong?

I think this should be a fine way to poll the actor, because the sendLater is essentially a sleep/wait, so it takes 0% CPU while waiting.

JonasL Tue 15 Jan 2013

So, long discussion.. I've talked to more people about this and googled like crazy. But it all boils down to this: I'm looking for blocking code, and there is no real way to block with actors (and fan). More users than there are resources means someone has to wait or block.

I'll put the requirement that the dbpool should have a max limit on hold. Then a single actor coordinating the resources works fine. I was mostly curious whether it could be done.

@brian I haven't used continuations myself, so I might be wrong on this, but I meant continuation as a way to register a function with the full stacktrace and closure of its sender's context. That way I could make the sender return nothing, and when a resource became available the continuation could be called to finish the rest of the work.

@yliu I'm not following your code fully, but yes, I mean polling the actor holding the shared resources until a resource is returned. The problem I have with this is that if I set too small a window, the actor being polled will be swamped with messages. And if I set it too long, a resource might be returned the next instruction after the wait, causing unnecessary delays.

Manual timing just bugs me; callbacks or notifications relieve me of having to anticipate how long to back off.

KevinKelley Tue 15 Jan 2013

I really think you're trying to do the wrong thing here. But it's not clear what you're trying to do, so hard to say... It sounds like you're trying to implement Java's threading model in terms of Actors, and that just seems so wrong.

But to correct one point:

no real way to block with actors (and fan)

future := actor.send(msg) returns a future; and you can certainly block on its completion with future.get. So if you really do want to block, you can. The whole mind-set thing with the Actor model though, is that you probably shouldn't be blocking: each component of your system will do some work: taking some data and generating some more, then passing it off to another worker. When you need to act on some other worker's output, you can pass a callback, or map out a data-flow and messaging protocol such that it's all well-defined: each actor only understands a certain set of messages, and it won't be activated until the message (containing its input data) is ready; and it needn't block because it can just do its job and then pass off its result to the next stage.

The Java model leads to deep stacks of blocked call-frames in various threads, waiting on each other, having to be manually synchronized, all chewing up processor resources and control-stack. Actors should lead to shallow call-stacks, separation of concerns, complete ownership of internal data (no shared mutable state), and much less chance to mess it up.

JonasL Fri 18 Jan 2013

But to correct one point: no real way to block with actors (and fan)

Sorry, that came out wrong.. I meant as in the actor itself cannot (or should not at least) block waiting for some other resource to become available.

Here's the code snippet that got me thinking about this question (it's by go4, I believe, who posts here sometimes):

https://bitbucket.org/chunquedong/slan/src/05b1ae9e77ca/slanOrm/fan/dataSource/ConnectionPool.fan?at=default

I was just wondering how you would add a max limit on the allowed created connections - with actors.

brian Sat 19 Jan 2013

I think you do it either by capping the actor pool threads, or by having a master actor that manages allocation and de-allocation of the actors/connections. The latter is probably the simplest way to do it.
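
One way the master-actor variant could look, sketched in Java with a single-threaded executor playing the role of the actor's mailbox. PoolCoordinator and its methods are hypothetical names, not an existing API. Requests that cannot be served are parked in a queue and answered when a connection is released, so nothing polls.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical coordinator: one thread processes "acquire" and "release" messages in order.
final class PoolCoordinator<T> implements AutoCloseable {
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    // After construction both deques are touched only by the mailbox thread, so no locks are needed.
    private final Deque<T> idle = new ArrayDeque<>();
    private final Deque<CompletableFuture<T>> waiters = new ArrayDeque<>();

    PoolCoordinator(List<T> resources) {
        idle.addAll(resources);   // the size of this list is the max limit
    }

    // "acquire" message: the reply completes now, or later when a resource is released.
    CompletableFuture<T> acquire() {
        CompletableFuture<T> reply = new CompletableFuture<>();
        mailbox.execute(() -> {
            T free = idle.pollFirst();
            if (free == null) {
                waiters.addLast(reply);   // park the request, not a thread
            } else if (!reply.complete(free)) {
                idle.addFirst(free);      // request was cancelled; keep the resource
            }
        });
        return reply;
    }

    // "release" message: hand the resource straight to the oldest live waiter, if any.
    void release(T resource) {
        mailbox.execute(() -> {
            CompletableFuture<T> next;
            while ((next = waiters.pollFirst()) != null) {
                if (next.complete(resource)) return;   // skips cancelled requests
            }
            idle.addLast(resource);
        });
    }

    @Override
    public void close() {
        mailbox.shutdown();
    }
}
```

A web thread would call acquire().get() to block until a connection is free, or attach a callback with thenAccept to avoid blocking. Callbacks attached this way may run on the coordinator's thread, so they should be short.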

yliu Sat 2 Feb 2013

Sorry for the bump and late reply, haven't had a chance to look at this board for a bit.

I'm sorry for my earlier code. I didn't fully understand what you were having an issue with, but after reviewing that snippet I see now that you're trying to use one actor to distribute the work. So here's an idea I came up with:

Picture this: you have N AtomicBools in an array and N Actors in an array, where N is the max number of connections that you want to have. Each AtomicBool is assigned to one of the N Actors. Each Actor's receive function carries out the process you want to run.

A Master Actor has access to this list of AtomicBools and is the Actor that receives the messages from the network. When the Master Actor receives a message to distribute, it does a compareAndSet on the AtomicBool of the Connection Actor it is going to send the message to, and then sends the message to that Actor. After the Actor has finished processing, it does a getAndSet on the AtomicBool to signal that it is ready to go again.

Here is where it gets tricky/interesting. We can distribute these messages in two ways:

A) We can randomly choose an index of the AtomicBool and check if the selected connection is busy, if so we sendLater with an appropriate BackOff (The All-Cure I've learned is using an Exponential Backoff Algorithm) to handle congestion.

B) We linearly iterate through the List of AtomicBool's, and cycle back to the front, in this case if we run into a busy connection we again sendLater with BackOff to handle congestion.

Alternatively, instead of AtomicBools we can have AtomicInts and check if the Int is under some specified message limit (this means our Connection Actors can have a queue of waiting messages). In this case we still choose one of the above message distribution schemes and back off when we find a Connection Actor is busy.
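
Scheme B with a backoff could be sketched in Java roughly as follows. Everything here is an assumption for illustration: the class name BackoffDispatcher, the 1 ms starting delay, and the 1 second cap. A scheduled executor plays the role of sendLater.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical slot table: busy[i] is true while worker i is processing.
final class BackoffDispatcher implements AutoCloseable {
    private final AtomicBoolean[] busy;
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

    BackoffDispatcher(int slots) {
        busy = new AtomicBoolean[slots];
        for (int i = 0; i < slots; i++) busy[i] = new AtomicBoolean(false);
    }

    // Scan linearly and atomically claim the first free slot, or return -1.
    int tryClaim() {
        for (int i = 0; i < busy.length; i++) {
            if (busy[i].compareAndSet(false, true)) return i;
        }
        return -1;
    }

    // Called by a worker when it has finished processing.
    void free(int slot) {
        busy[slot].set(false);
    }

    // Completes with a slot index, retrying with a doubling delay while all slots are busy.
    CompletableFuture<Integer> claim() {
        CompletableFuture<Integer> result = new CompletableFuture<>();
        attempt(result, 1);
        return result;
    }

    private void attempt(CompletableFuture<Integer> result, long delayMs) {
        int slot = tryClaim();
        if (slot >= 0) {
            result.complete(slot);
            return;
        }
        long next = Math.min(delayMs * 2, 1_000);   // the cap is an arbitrary choice
        timer.schedule(() -> attempt(result, next), delayMs, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

The drawback raised earlier in the thread still applies: a slot freed just after a failed attempt stays idle until the next retry fires, and the longer the backoff has grown, the longer that idle gap can be.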

Things to consider:

- We can set the number of Actors to, say, 100 and limit the ActorPool to 10 threads; this would mean we process any 10 concurrently. This all depends on the type of traffic you expect.
- This only works if each Actor's operation is completely independent, meaning Actors do not depend on any other Actor's Future to complete their operation. If they do, then that's a whole other problem of consensus and serialization/linearizability, and it means you need concurrent data structures in place that promise atomicity.
- Using AtomicInt instead of AtomicBool would balance out the message queues; otherwise the Master Actor's queue might explode under heavy traffic. This decision really depends, again, on how much traffic you're going to experience.

In conclusion, concurrency and multi-core computing is hard. It's either going to be really complex ugly code that uses unwanted polling and makes you start tuning timing to get better performance, or really simple and beautiful code that uses locks/semaphores (All Hail Dijkstra), which has issues of its own.

Also, if I've gone on a tangent, and this was really a question on whether you should poll or use observer pattern, then that question depends on how much traffic you're getting. But for something like network applications, I feel like polling makes more sense because with observer pattern you'll start accumulating copious amounts of event notifications on ALL observers.

Sorry, if that was too long of a post, I spent a bit contemplating this problem.

JonasL Tue 5 Feb 2013

No worries, thanks a lot for taking the time to answer.

I've never really used actors before so I wanted to check (this being the first case of using them) if I'd missed something basic - or if this wasn't a good fit for actors.

In conclusion, concurrency and multi-core computing is hard. It's either going to be really complex ugly code that uses unwanted polling and makes you start tuning timing to get better performance, or really simple and beautiful code that uses locks/semaphores (All Hail Dijkstra), which has issues of its own.

This looks like the conclusion I came up with too - somewhere there needs to be polling or timing.

katox Fri 8 Feb 2013

For a pragmatic solution see 2097.