Basic usage

The GWDataFind service allows users to query for the location of files containing data from gravitational-wave detectors.

This package provides functions for making requests to a GWDataFind server, including handling of authorization credentials.

Top-level API

The gwdatafind top-level module provides a number of convenience functions to perform single queries without a persistent connection:

ping

Ping the GWDataFind host to test for life.

find_observatories

Query a GWDataFind host for observatories with available data.

find_types

Query a GWDataFind host for dataset types.

find_times

Query a GWDataFind host for times in which data are available.

find_url

Query a GWDataFind host for the URL of a single filename.

find_urls

Query a GWDataFind host for all URLs for a dataset in an interval.

find_latest

Query a GWDataFind host for the latest file in a given dataset.

For example:

A simple query for URLs associated with Open Data
>>> from gwdatafind import find_urls
>>> urls = find_urls(
...     "L",
...     "L1_GWOSC_O2_4KHZ_R1",
...     1187008880,
...     1187008884,
...     host="datafind.gw-openscience.org",
... )
>>> print(urls)
['file://localhost/cvmfs/gwosc.osgstorage.org/gwdata/O2/strain.4k/frame.v1/L1/1186988032/L-L1_GWOSC_O2_4KHZ_R1-1187008512-4096.gwf']
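
The file name at the end of the returned URL carries metadata of its own. By the usual naming convention for gravitational-wave frame files, the basename has the form OBSERVATORY-DESCRIPTION-GPSSTART-DURATION.extension. The following is a minimal sketch of how that name could be split using only the standard library; parse_frame_url is a hypothetical helper written for this illustration and is not claimed to be part of gwdatafind, and it assumes integer GPS start and duration values.

```python
import os.path
from urllib.parse import urlparse


def parse_frame_url(url):
    """Split a frame-file URL into (observatory, description, gps_start, duration).

    Hypothetical helper for illustration only; assumes the basename follows
    the OBSERVATORY-DESCRIPTION-GPSSTART-DURATION.ext convention.
    """
    basename = os.path.basename(urlparse(url).path)
    stem, _ext = os.path.splitext(basename)
    # the observatory prefix is everything before the first hyphen
    obs, rest = stem.split("-", 1)
    # the GPS start and duration are the last two hyphen-separated fields
    description, start, duration = rest.rsplit("-", 2)
    return obs, description, int(start), int(duration)


# URL copied from the example output above
url = (
    "file://localhost/cvmfs/gwosc.osgstorage.org/gwdata/O2/strain.4k/"
    "frame.v1/L1/1186988032/L-L1_GWOSC_O2_4KHZ_R1-1187008512-4096.gwf"
)
obs, description, start, duration = parse_frame_url(url)
print(obs, description, start, start + duration)
```

Under this reading the file spans GPS times 1187008512 to 1187012608, which contains the requested interval of 1187008880 to 1187008884.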

The above example works well for ‘one-shot’ queries where only a single result is required. For repeated queries that require multiple connections to the same GWDataFind host, see Session usage.

Command-line interface

GWDataFind also provides a command-line interface accessible via module execution (python -m gwdatafind) or the gw_data_find entry point script:

usage: gw_data_find [-h] [-V] [[-p | -w | -y | -a | -f FILE | -T]] [-o OBS]
                    [-t TYPE] [-s GPS] [-e GPS] -r HOST:PORT [-P]
                    [[-l | -W | -n]] [-m MATCH] [-u URL_TYPE] [-g] [-O PATH]

Optional arguments

-V, --version

show version number and exit

Query types

Select one of the following; if none is selected, a query for frame URLs is performed.

-p, --ping

ping the DataFind server

Default: False

-w, --show-observatories

list available observatories

Default: False

-y, --show-types

list available file types

Default: False

-a, --show-times

list available segments

Default: False

-f, --filename

resolve URL(s) for a particular file name

-T, --latest

resolve URL(s) for the most recent file of the specified type

Default: False

Data options

Parameters for your query. Which options are required depends on the query type.

-o, --observatory

observatory(ies) that generated the frame file; use --show-observatories to see what is available.

-t, --type

type of frame file; use --show-types to see what is available.

-s, --gps-start-time

start of GPS time search

-e, --gps-end-time

end of GPS time search

Connection options

Authentication and connection options.

-r, --server

hostname and optional port of server to query (default: None)

-P, --no-proxy

attempt to authenticate without a grid proxy

Default: False

Output options

Parameters for parsing and writing output.

-l, --lal-cache

format output for use as a LAL cache file

-W, --frame-cache

format output for use as a frame cache file

-n, --names-only

display only the basename of each file

-m, --match

return only results that match a regular expression

-u, --url-type

return only URLs with a particular scheme or head such as ‘file’ or ‘gsiftp’

Default: “file”
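
As a rough illustration of what the --match and --url-type options select, the sketch below filters a list of URLs by regular expression and by scheme. It is not the package's implementation: filter_urls is a hypothetical helper, the URLs are invented for the example, and it assumes that the regular expression is searched anywhere in the URL and that the URL type is compared against the URL scheme.

```python
import re
from urllib.parse import urlparse


def filter_urls(urls, match=None, url_type="file"):
    """Keep URLs with the given scheme that also match a regular expression.

    Hypothetical helper mirroring the intent of -m/--match and
    -u/--url-type; the real command-line tool may behave differently.
    """
    pattern = re.compile(match) if match else None
    kept = []
    for url in urls:
        # drop URLs whose scheme differs from the requested type
        if url_type and urlparse(url).scheme != url_type:
            continue
        # drop URLs that do not match the regular expression
        if pattern and not pattern.search(url):
            continue
        kept.append(url)
    return kept


# invented URLs, for illustration only
urls = [
    "file://localhost/data/L-L1_EXAMPLE-1000000000-4096.gwf",
    "gsiftp://example.org/data/L-L1_EXAMPLE-1000000000-4096.gwf",
    "file://localhost/data/H-H1_EXAMPLE-1000000000-4096.gwf",
]
print(filter_urls(urls, match=r"L1_", url_type="file"))
```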

-g, --gaps

check the returned list of URLs or paths to see if the files cover the requested interval; a return value of zero (0) indicates the interval is covered, a value of one (1) indicates that at least one gap exists and the interval is not fully covered, and a value of two (2) indicates that the entire interval is not covered; gaps are printed to stderr

Default: False
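
The three return values described for --gaps can be sketched as a small coverage check. This is only an illustration of the stated rule, not the code used by gw_data_find: gap_exit_code is a hypothetical function, and it assumes that each file is represented by a (start, end) pair of GPS times covering a half-open interval.

```python
def gap_exit_code(start, end, segments):
    """Return 0 if [start, end) is fully covered, 1 if partly, 2 if not at all.

    Hypothetical helper; `segments` is an iterable of (start, end) pairs.
    """
    covered = 0
    cursor = start  # right-most point of the interval counted so far
    for seg_start, seg_end in sorted(segments):
        # clip each segment to the part of the interval not yet counted
        lo = max(seg_start, cursor)
        hi = min(seg_end, end)
        if hi > lo:
            covered += hi - lo
            cursor = hi
    if covered == end - start:
        return 0  # interval is covered
    return 1 if covered > 0 else 2  # some gaps, or nothing covered


print(gap_exit_code(10, 20, [(0, 30)]))   # fully covered
print(gap_exit_code(10, 20, [(10, 15)]))  # gap from 15 to 20
print(gap_exit_code(10, 20, [(30, 40)]))  # nothing covered
```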

-O, --output-file

path to output file, defaults to stdout
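
To show how these options combine, the sketch below assembles a command line that requests the same URLs as the find_urls example earlier on this page, using only flags listed in this section. It builds and prints the command rather than running it, since running it requires gwdatafind to be installed and the server to be reachable; whether extra connection options such as --no-proxy are needed depends on the server and is not covered here.

```python
import shlex

# argument list using the long option names documented above
args = [
    "gw_data_find",
    "--server", "datafind.gw-openscience.org",
    "--observatory", "L",
    "--type", "L1_GWOSC_O2_4KHZ_R1",
    "--gps-start-time", "1187008880",
    "--gps-end-time", "1187008884",
    "--url-type", "file",
]

# print a shell-safe version of the command
print(shlex.join(args))

# To execute it instead (needs gwdatafind installed and network access):
# import subprocess
# subprocess.run(args, check=True)
```

Keeping the arguments as a list avoids shell-quoting mistakes if the command is later passed to subprocess.run.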