From Igor.Soloviev@cern.ch Fri Apr 8 23:20:16 2005
Date: Fri, 8 Apr 2005 11:29:53 +0200
From: Igor Soloviev
To: Serguei Kolos, Sarah Wheeler
Cc: Doris Burckhart-Chromek, Bryan.Caron@cern.ch
Subject: Re: IPC timeout problem

Hi Sarah,

the best parameters found for the RDB server during the 2004 scalability
tests are:

ORBthreadPerConnectionPolicy=0
ORBmaxServerThreadPoolSize=2 # for a dual-processor computer!
ORBscanGranularity=0

Sergei also recommends setting:

ORBthreadPoolWatchConnection=0

Do not set ORBmaxServerThreadPoolSize to a large value. It will only slow
down the server.

If you want to achieve the best performance, you need to run the RDB server
on a powerful dedicated computer. Ideally, no CPU-consuming processes should
run on it while TDAQ is being configured. At a later stage you can reuse it
for something else, e.g. for monitoring.

To estimate the time needed to get the configuration on all your clients,
you can have a look at the results of the 2004 scalability tests:

http://atlas-onlsw.web.cern.ch/Atlas-onlsw/components/configdb/tests-03-2004/03-2004-all.xls
http://atlas-onlsw.web.cern.ch/Atlas-onlsw/components/configdb/tests-03-2004/03-2004-all.htm

As an example, 200 clients (e.g. run controllers) read a partition tree
composed of ~2000 objects in 56 seconds. You should not expect better
results unless you have much more powerful hardware. However, if your
results are much worse, the configuration of the RDB server needs to be
checked.

With the RDB server, better results can only be achieved by using multiple
RDB servers inside a single partition or by executing DAL algorithms on the
RDB server; the current TDAQ release is not ready for either.

Please let me know how many RDB clients you have (or plan to have in the
future) in your configuration and what data they read from the
configuration database. Which applications, apart from the online
infrastructure and Run Control, read the database?

Cheers,
Igor

----- Original Message -----
From: "Serguei Kolos"
To: "Sarah Wheeler"
Cc: "Doris Burckhart-Chromek" ; ; "Igor Soloviev"
Sent: Friday, April 08, 2005 9:54 AM
Subject: Re: IPC timeout problem

> Hi Sarah
>
> You are right, the default IPC timeout is set to 30 seconds. You can
> change it via the TDAQ_IPC_TIMEOUT environment variable, but don't
> forget that the value must be specified in milliseconds, i.e. if you
> want to set it to 1 minute you should use
>
> setenv TDAQ_IPC_TIMEOUT 60000
>
> I also have a question for you: do you know which request has timed
> out? Is it the one sent to the database server, or some other? If it is
> a request to the RDB, you can also try to play with the number of
> threads in the thread pool, i.e. with the second parameter of the ones
> I advised you to use yesterday. Try to increase the number of threads,
> for example to 100:
>
> -ORBmaxServerThreadPoolSize 100
>
> I'll also forward this reply to Igor; he may give you better advice on
> the configuration of the RDB server. By the way, do you have an idea how
> much data each PT reads from the database?
>
> Cheers,
> Sergei
>
>
> Sarah Wheeler wrote:
>
> >Hello again, My test partitions worked (following the advice you gave
> >below) but moving up to larger configurations I'm now getting IPC timeouts
> >from the PTs on Configure. It looks as though there is a timeout set to 30
> >seconds somewhere. How do I increase this please? In the past I remember
> >there was TDAQ_IPC_TIMEOUT. But I guess this no longer exists? I tried
> >changing ActionTimeouts in the DB but this made no difference.
> >
> >Thanks a lot,
> >Sarah
> >
> >On Thu, 7 Apr 2005, Serguei Kolos wrote:
> >
> >>Hi Sarah
> >>
> >>Please try to add the following options to all the RDB servers in your
> >>configuration (you should do this in the online infrastructure segment
> >>of the database):
> >>
> >>-ORBthreadPerConnectionPolicy 0 -ORBmaxServerThreadPoolSize 10
> >>
> >>I believe this should help. Please let me know the results.
> >>
> >>Cheers,
> >>Sergei
> >>
> >>Sarah Wheeler wrote:
> >>
> >>>Hi, Output is attached.
> >>>cheers,
> >>>Sarah
> >>>On Thu, 7 Apr 2005, Serguei Kolos wrote:
> >>>
> >>>>Hi Sarah
> >>>>
> >>>>Thank you for the information. Could you please run the next command and
> >>>>send me its output (when you have PTs dying, of course):
> >>>>
> >>>>>/usr/sbin/lsof | grep rdb_ser
> >>>>
> >>>>Cheers,
> >>>>Sergei
> >>>>
> >>>>>ice17_13|gensetupScripts|31> /usr/sbin/lsof | grep $USER | grep rdb_server
> >>>>>
> >>>>>rdb_serve 20872 swheeler txt REG 239,150 233229 5311858 /global/home/caronb/atlas/tdaq-release/tdaq-01-01-00/installed/i686-rh73-gcc32-opt/bin/rdb_server
> >>>>>rdb_serve 20881 swheeler txt REG 239,150 233229 5311858 /global/home/caronb/atlas/tdaq-release/tdaq-01-01-00/installed/i686-rh73-gcc32-opt/bin/rdb_server
> >>>>>rdb_serve 20889 swheeler txt REG 239,150 233229 5311858 /global/home/caronb/atlas/tdaq-release/tdaq-01-01-00/installed/i686-rh73-gcc32-opt/bin/rdb_server
> >>>>>
> >>>>>>Cheers,
> >>>>>>Sergei
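
For quick reference, the two settings discussed in this thread can be collected in one place. The following is only a sketch under stated assumptions, not a tested TDAQ setup: it uses sh syntax rather than the csh `setenv` shown above, the variable names `timeout_seconds` and `RDB_ORB_OPTIONS` are hypothetical, and the `-ORB... value` command-line form for ORBscanGranularity and ORBthreadPoolWatchConnection is assumed to follow the same pattern Sergei used for the other two options. It does not start rdb_server, since the rest of its command line is not given in the thread.

```shell
# Sketch only: sh syntax; the thread itself uses csh ("setenv NAME value").

# TDAQ_IPC_TIMEOUT must be given in milliseconds, so convert from seconds.
timeout_seconds=60                          # hypothetical helper variable
TDAQ_IPC_TIMEOUT=$((timeout_seconds * 1000))
export TDAQ_IPC_TIMEOUT

# Values Igor reports from the 2004 scalability tests, for a dual-processor
# RDB server host. The command-line spelling of the last two options is an
# assumption based on the form of the first two.
RDB_ORB_OPTIONS="-ORBthreadPerConnectionPolicy 0 -ORBmaxServerThreadPoolSize 2 -ORBscanGranularity 0 -ORBthreadPoolWatchConnection 0"

echo "TDAQ_IPC_TIMEOUT=$TDAQ_IPC_TIMEOUT"
echo "rdb_server options: $RDB_ORB_OPTIONS"
```

As noted earlier in the thread, the options themselves belong on the RDB server entries in the online infrastructure segment of the database; the string above is just a convenient way to keep the recommended values together.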