2006 DAQ/HLT Software Large and Medium Scale Tests


Performed in November/December 2006 on 50 to 950 nodes, with the farm size increasing in steps.
The Draft Test Report is available on the TWIKI pages, where it is being updated with the results of further analysis. A snapshot is available as a PDF document. Its document number in the CERN document server EDMS is ATL-D-TR-0004.

Schedule:
Phase 1: 8 Nov., 400 nodes; 244 nodes were available
Phase 2: 17 Nov., 800 nodes; 397 nodes available since 14 Nov.; 21 Nov. 537 marked as up, 859 assigned
Phase 3: 26 November to 6 December, 1200 nodes - 10 December; 24 Dec. 950 marked as up, 1250 assigned

The tests have been completed.

Communication Organisation The Test Tools System info

Log Book

Aims and Scope

Test Plan Testing Tools  

LXSHARE farm status

atlas-large-scale-tests

Test organisation Preparation Work list Shifter Tools doc Network:  ppt   pdf
TWIKI page Schedule   Settings System info
LST meeting agendas Slides & Doc     Monitoring of the nodes  with lemon
mailing list archive How to participate  

Bug reporting

Testbed  reservation 

Note: Documentation on the usage of databases in LST06 is on the Twiki pages, see
Databases in LST06, DBProxy, DBStressor

Note that this node reservation is preliminary; please contact the people listed below by email if you would like to run tests
and use the nodes.

Schedule
    Special assignments can normally be arranged, depending on requests and needs.

during shared mode:
base schedule
lxb5435 Sarah - lxb5436 Joerg - lxb5437 Lourenco - lxb5438 Hans
Joerg: lxb0226, lxb0227, lxb0229, lxb0231, lxb0233, lxb0235, lxb0237, lxb0238
Sarah: lxb0242, lxb0243, lxb0244, lxb0251
Serge: lxb6511-lxb6520
Doris: lxb5323
Haimo: all available nodes between lxb5431-lxb5434 and lxb6509, lxb6510
the database servers lxmrrm3001-3004 are reserved
reserved for the setup_segment: lxb5306-lxb5317
  
the setup segments to be included are, depending on the release:
    /afs/cern.ch/user/a/atlonl/public/setup/setup-LST-1.6.1.data.xml
    /afs/cern.ch/user/a/atlonl/public/setup/setup-LST-1.6.2.data.xml
the remaining nodes can be used in shared mode
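
The reservation list above uses range notation such as lxb6511-lxb6520 and lxmrrm3001-3004. The following is a minimal Python sketch of how such a range could be expanded into individual host names, for example to check a reservation against the list of nodes that are up. The function name expand_nodes and the accepted notation are assumptions made for illustration; this is not part of the LST06 tooling.

```python
import re

# Accepts "lxb6511-lxb6520" (prefix repeated) or "lxmrrm3001-3004" (prefix given once).
_RANGE = re.compile(r"^([a-z]+)(\d+)-([a-z]+)?(\d+)$")


def expand_nodes(spec):
    """Expand a node range into a list of host names.

    A spec without a dash is treated as a single host and returned unchanged.
    """
    spec = spec.strip()
    match = _RANGE.match(spec)
    if match is None:
        return [spec]
    prefix, first, second_prefix, last = match.groups()
    if second_prefix is not None and second_prefix != prefix:
        raise ValueError(f"mixed host prefixes in range: {spec}")
    low, high = int(first), int(last)
    if high < low:
        raise ValueError(f"descending range: {spec}")
    width = len(first)  # keep zero padding, e.g. lxb0226
    return [f"{prefix}{number:0{width}d}" for number in range(low, high + 1)]


if __name__ == "__main__":
    for host in expand_nodes("lxb5306-lxb5317"):
        print(host)
```

Keeping the digit width of the first host name preserves leading zeros, so a range starting at lxb0226 expands to lxb0226, lxb0227 and so on.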
Specific schedule:


The status of the nodes can be viewed on the lemon monitoring Web page (see the pointer above).
The following commands can be used from lxplus (not from lxbxxxx):

 - To monitor the nodes, you can use ~tkleinw/public/monitoring.pl:
   * monitoring.pl will give you _all_ machines in the atlas_tdaq cluster
   * monitoring.pl --alive will give you all machines that are alive
   * monitoring.pl --disk will give you all machines with their disk sizes
   * monitoring.pl --ram will give you all machines with their memory sizes

Under *lxplus* you can use the command: wassh -c atlas_tdaq <command>
where <command> can be, for example, 'hostname', 'uptime' or "kill -9 <my_hanging_process>"
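
As an illustration of the wassh invocation described above, here is a small Python sketch that assembles the command line for a given remote command. Only the form wassh -c atlas_tdaq <command> comes from this page. The helper names wassh_command and as_shell_line are hypothetical, and passing the remote command as one quoted argument is an assumption about how wassh expects it.

```python
import shlex


def wassh_command(remote_command, cluster="atlas_tdaq"):
    """Build the argument list for: wassh -c <cluster> <command>.

    Assumption: the remote command is handed to wassh as a single argument.
    """
    return ["wassh", "-c", cluster, remote_command]


def as_shell_line(argv):
    """Render an argument list as a shell line, quoting where needed."""
    return " ".join(shlex.quote(argument) for argument in argv)


if __name__ == "__main__":
    # Print the lines only; run them by hand from lxplus.
    for command in ("hostname", "uptime"):
        print(as_shell_line(wassh_command(command)))
```

The sketch only prints the command lines, so it can be tried anywhere without contacting the cluster.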

 - email is sent to 'it-support-atlastdaq2006@cern.ch' when a machine has a problem. This is a
mailing list which has been set up for this purpose only. If you want to get these
emails, please subscribe!
Log Manager: view logs via http://lxmrra3802.cern.ch/ls/lsmanager.php

Status

Phase 1: see LXSHARE farm status

Aims and Scope

The Large Scale Tests 2006 will take place in the 4th quarter of this year, starting at the end of October. The exact date will be decided later and depends on the detailed availability of the test farm. A period of 4 weeks with a cluster increasing in steps from 400 to 800 and finally 1200 nodes has been requested.

The tests will be held at CERN at the LXBATCH facility. The CERN IT support team will be available for the operating system management support of the farm. The Atlas DAQ/HLT infrastructure software, the Level2 and EF algorithms and the Online Databases will be under test.

How to participate

In order to participate in the tests, the functionality of the participating software must have been verified on smaller scale testbeds. Testing tools and farm management and monitoring tools must be available. The same testware, scripts and tools which are envisaged to be run on large scale should have been run already on a smaller scale. Medium scale tests may be necessary prior to the large scale tests.

For each participating part of the tests, provide a description of the tests you want to perform and answer the
following questions:

  1. What is the purpose of the tests, which aspect of the system is
    investigated, what are the objectives (description of the test)?
  2. Why can these tests only be done on a medium/large/very large
    scale?
  3. Which are the operations that are expected to be critical to scale? Are these operations critical for functionality or performance on a large scale?
  4. At which phase of the data taking activity do they take place (list the run control transition or state)?
  5. What are the challenges and/or expected results?
  6. In order to run the tests, testers should be identified from each of the participating systems. Who will prepare the tests, who will participate in the test preparation (be the contact person, participate in meetings), who will run the tests and make the results available as part of the test report?
  7. On which test facility and at which scale will the tests have been run beforehand? Will they have been run on a medium scale on one of the available farms?
  8. Can the tests be automated? Can they run autonomously, i.e. overnight
    or during parts of the weekend? Is the necessary testware being developed?
  9. Which release do you envisage to use?
  10. Are there any specific requirements on system parameters or any
    constraints?
  11. How many nodes do the tests require?
  12. What is your estimate for testing time and do you need to run several iterations?

Test Organization and Approach

The test approach will be similar to that of the June 2006 Large Scale Tests.

Schedule

LST06:
    pre-testing and installation on a small number of nodes: End October 2006
    main testing time: November/December
    then: Analysis and Test Report

MST06: 10 days in September 2006

Short term schedule: 

Slides & Documentation

Start of the activities during the May 2006 TDAQ week at CERN:

    Parallel session for LST06 with individual contributions
    Summary in the Plenary

LST06 meetings: 

    LST meetings agendas including minutes

Medium Scale Tests


last edited: 27/04/07   doris                                                                 