
Re: BESSRC experience with Spec/Epics




re...,
Guy Jennings wrote:
> 
> On 11/19/02 4:53 PM, 'Tim Mooney' <mooney@aps.anl.gov> wrote:
> 
> >
> > I haven't heard from users directly reporting long CA timeouts with spec,
> > though I have heard of problems thought to be CA related--I think BESSRC
> > may have seen some of this.
> 
> Yes we have, so much so that we have largely given up using epics with spec.

Can you tell me how to make the problems happen on purpose, so I can try to
collect some data on them?

> > I agree that applications should not be expected to fix problems with network
> > hardware, although they should make as much use as they can of error returns
> > and connection-management messages from CA.
> 
> I'm reasonably convinced that our problems are not caused by network
> hardware.

Ok.  Do you have enough information about other beamlines to look for
systematic differences that might give some clues?

> > My understanding is that it's possible for CA to simply not send some messages
> > if its 'send' buffer runs out of space and new messages continue to be added.
> 
> I didn't know this.  Is this on the client?   It would be consistent with
> what we see.  I get the impression that the errors are more likely when spec
> has a large number of epics motors defined and specifically during the burst
> of activity that occurs when leaving spec's 'config' screen.

I've seen this only on the server side, but I don't see how it could not apply
to the client side as well.  I don't know what limits the buffer size on the
client side.

> > It's also possible for CA to get insufficient CPU time to handle all the
> > messages it's intended to handle.  This could mean that a request doesn't
> > get sent, that a sent request doesn't get received, that an acknowledge
> > doesn't get sent, or that a sent acknowledge doesn't get received.
> 
> How much CPU time does it want?  I've seen the problem on a dual 1.5GHz
> Athlon system, which should be adequate for talking to a 25MHz 68040!

I don't know, but if the average CPU load gets up around 90%, some
low-priority client is probably going to get it in the neck.  On the server
side, CA is pretty low priority, and it's not hard to arrange for
higher-priority tasks to starve it.

Tim