Autoplot is an interactive browser for data on the web. In the same way that you browse content stored in HTML, JPEG, and PNG files on the web, Autoplot allows you to interactively browse data stored in CDF, netCDF, ASCII, and many more data file formats.
Autoplot tries to create a sensible plot given a URL or a name of a data file on your computer so you can quickly get your data from one or more data files to a presentation- or publication-ready graphic within minutes. Plots are interactive and annotations are easily added.
Feel free to ask questions on the discussion group autoplot@groups.google.com if you need assistance, or browse the archive there at https://groups.google.com/forum/#!forum/autoplot.
This document is intended to introduce scientists to the software and go over its features. Autoplot should be easy to use for new scientists, but it is quite capable and suitable for those who wish to use it for most of their science products. This document tries to be comprehensive, but one doesn't need to understand later sections to use the software productively.
Autoplot can be found at http://autoplot.org/latest/. There, links are provided for Windows and Mac installations, or a single jar file (Java Archive) can be used with Java directly. Development releases with new features and bug fixes are made about once a week. Though each release is tested with a suite of automatic tests, these development releases may contain new bugs which could interfere with use, and they should not be used in production environments. Production-quality releases are more thoroughly tested and are made monthly.
WebStart was once the easy method for running Java apps, and while a link for WebStart is provided, this is no longer recommended.
On Macs, a .dmg file is provided that contains the Autoplot release and a bundled Java Runtime Environment. This 90 MB download is signed and should run on Macs.
For Windows a .exe file is provided which will install Autoplot.
On Linux machines, the single-jar release can be used as well. We've prepended a bash launch script at its beginning so that if the downloaded jar's executable bit is set, it can be launched like any other executable file.
An easy way to try Autoplot out for the first time is to use a CDF file to look at some data. (CDF files are used often in Heliospheric Physics, and contain named data arrays and connections between them.) If you have a CDF file handy, try typing in its name, or try this one on the Autoplot website: http://autoplot.org/data/autoplot.cdf. This string of characters, "http://autoplot.org/data/autoplot.cdf", is entered in the address bar at the top, just as you would in a web browser to open a new web page. This is an "Autoplot URI" and is used to identify data. Autoplot downloads the file and then provides a GUI to pick a parameter from the file. Note that once the parameter is picked, the URI contains the name of the file and the name of the parameter--one concise string of characters represents data that can be loaded.
This video shows how the address bar is used: URI Address Bar
These data URIs can be kept as bookmarks, under the bookmarks menu. Further, bookmarks are stored in an XML file, and this XML file can be put on a web site and loaded as a set of remote bookmarks. Using this, a workgroup can easily share commonly used data sets.
Autoplot comes with a set of bookmarked URIs under the "Demos" folder. These show
some of the different types of data that can be loaded. For example, "Demo 1"
shows how DST can be loaded from a server at NASA/Goddard. This URI is not a
filename but is instead a set of controls for the "CDAWeb Data Source":
vap+cdaweb:ds=OMNI2_H0_MRG1HR&id=DST1800&timerange=Oct+2016.
It indicates the data source, vap+cdaweb, and then the dataset within that
database, OMNI2_H0_MRG1HR, and then an identifier for which parameter to plot,
DST1800. You can try out these different demo URIs to see some of the sources
Autoplot can use for loading data.
Once the data is loaded, the plot is "live," and you can interactively adjust axes using the mouse. When you right-click (or cmd-click on a single-button mouse), you see a list of available mouse actions. For example, "Box Zoom" will zoom in on the box dragged, and "Display Data" will show the data within a box.
Drag a box around a feature to zoom in on the feature. To zoom in just X or just Y, drag out a range on either axis, and long, narrow boxes will be handled as a zoom in just one direction. Note the "Drag Renderer" feedback changes from a box to a line for long, narrow boxes.
The mouse wheel, if you have one, will also zoom in and out. Also the middle mouse button when pressed will pan the focus. The interface is similar to that of Google Maps.
Control-Z (or Command-Z on a Mac) can be used to undo an operation. A list of states is kept, and you can undo multiple times. Control-Y will redo an operation mistakenly undone.
This video shows navigation: Basic Interaction with Autoplot
A CDF URI is the web address (URL) of a CDF file, which can be a local file (file:/tmp/autoplot.cdf), followed by a question mark and the name of a parameter to load. For example, enter the URI into Autoplot:
http://autoplot.org/data/autoplot.cdf?BGSM
Entering this URI will open up the file using the CDF data source, reading the parameter identified by "BGSM." This data is loaded into Autoplot's internal data representation model, QDataSet, and then displayed.
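The same URI can also be used from Autoplot's Jython script panel. The following is a minimal sketch; plot() and getDataSet() are Autoplot script functions.

```python
# Minimal sketch for Autoplot's Jython script panel.
uri = 'http://autoplot.org/data/autoplot.cdf?BGSM'
plot( uri )              # render the parameter on the canvas
ds = getDataSet( uri )   # load the same data as a QDataSet
print ds                 # print a short summary of the dataset
```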
Pressing the inspect button (hourglass/folder) will open a GUI, called a Data Source Editor, where the parameter is selected. One can also make other selections, such as ignoring CDF ISTP metadata, using keyword constraints to help identify parameters, and subsetting the data using another parameter.
Note the URI given above is more formally stated as:
vap+cdf:http://autoplot.org/data/autoplot.cdf?BGSM
and when the vap+cdf prefix (scheme) is missing, the file extension is used to infer the data source to use.
The third-party libraries used to read most data sources, like CDF, need to have their files local for reading. When a remote URL is used, Autoplot downloads the file to a local cache of files. A progress bar is shown as the file is initially downloaded, and once it's downloaded, Autoplot will use it as long as the remote file is unchanged.
One of the more powerful features of Autoplot is that files can be referred to by web addresses. This means that a URI that works on your desktop will also work on your colleague's desktop. Often an analysis starts with downloading the data and modifying codes to point at local copies; using remote web URIs avoids this and lets Autoplot handle the downloads for you.
This local cache of files is located in your home directory, under "autoplot_data/fscache". The menubar can be used to manage this cache, using [menubar]→Tools→Cache→"Manage Cached Files". Note there is no mechanism to automatically remove files from the cache, so you might periodically remove unused files if the cache is using too much space, or simply use [menubar]→Tools→Cache→"Clear Cache" to remove all downloaded files. (Automatic removal may be added in a future release.)
Note also that the cache will be used when Autoplot is off-line. This allows use of Autoplot when the Internet is not available or is slow.
Often in our field you will come across web sites containing directories of hundreds of files, systematically named so that the file corresponding to any time interval is easily found. Consider
https://www.ngdc.noaa.gov/stp/space-weather/satellite-data/satellite-systems/lanl_geo/data/LANL-97A/mpa/20140101_LANL-97A_MPA_MOMENTS_1Cycle_json_v1.1.0.txt?column=SC_POTEN
which is one of 31 daily files in that directory covering the month of January 2014. The URI
https://www.ngdc.noaa.gov/stp/space-weather/satellite-data/satellite-systems/lanl_geo/data/LANL-97A/mpa/$Y$m$d_LANL-97A_MPA_MOMENTS_1Cycle_json_v$(v,sep).txt?column=SC_POTEN&timerange=2014-01-02
is an aggregation, where the data files covering the specified time are each downloaded, loaded, and combined to make one continuous dataset. The following wildcards can be used:
| Code | Meaning |
| --- | --- |
| $Y | four-digit year |
| $y | two-digit year |
| $m | two-digit month |
| $d | two-digit day of month |
| $j | three-digit day of year |
| $H | two-digit hour |
| $M | two-digit minute |
| $S | two-digit second |
| $(v,sep) | version numbers separated by periods (e.g., 1.1.0) |
Note that GitHub Markdown may insert extra spaces into these codes; remove any extra spaces when using them.
[menubar]→Tools→Aggregate... can be used to create aggregations, where the code will try to guess the aggregation automatically. Often, when Tab is pressed to show available files in a directory, a detected aggregation will also be offered as a completion.
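As a sketch, an aggregation like the one above can also be used from the Jython script panel; changing just the timerange keyword controls which files are downloaded and combined (plot() is an Autoplot script function, and the time ranges below are arbitrary examples).

```python
# Sketch: one aggregation template covers any time; the timerange picks the files.
agg = 'https://www.ngdc.noaa.gov/stp/space-weather/satellite-data/satellite-systems/lanl_geo/data/LANL-97A/mpa/$Y$m$d_LANL-97A_MPA_MOMENTS_1Cycle_json_v$(v,sep).txt?column=SC_POTEN'
plot( agg + '&timerange=2014-01-02' )   # resolves to the single 20140102 file
plot( agg + '&timerange=2014-01' )      # the whole month: combines the 31 daily files
```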
Many URIs configure use of a web service to load data. HAPI servers are an example of this, where a parameter name and time range are all that's needed to retrieve data. There are no files to manage; instead the Data Source writes requests to the server and interprets the returned stream of data into a form (QDataSet) which Autoplot can use for plotting and analysis.
Data servers often support "Time Series Browsing", meaning you can easily browse a long time series of data. For example, the Data Source knows that if vap+cdaweb:ds=OMNI2_H0_MRG1HR&id=DST1800&timerange=2016-12-10 reads the data for 2016-12-10, then vap+cdaweb:ds=OMNI2_H0_MRG1HR&id=DST1800&timerange=2016-12-11 will read the next day's data, and it will load this automatically when you advance to the next day.
The CDAWeb at NASA/Goddard contains roughly 1400 data products from around 40 different missions. Each data product corresponds to CDF files spanning a mission, containing many plottable parameters. Instrument teams produce CDF files and provide them to the CDAWeb which makes the data available. An example URI is:
vap+cdaweb:ds=OMNI2_H0_MRG1HR&id=DST1800&timerange=Oct+2016
which means from the data identified as "OMNI2_H0_MRG1HR" plot the parameter "DST1800", loading data to cover the time range "Oct 2016."
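As a sketch, the data behind this URI can also be examined from the Jython script panel; getDataSet() is an Autoplot script function, and "UNITS" is a standard QDataSet property name.

```python
# Sketch: load the DST parameter and examine the QDataSet that comes back.
ds = getDataSet( 'vap+cdaweb:ds=OMNI2_H0_MRG1HR&id=DST1800&timerange=Oct+2016' )
print ds.length()             # number of hourly records in the month
print ds.property( 'UNITS' )  # units metadata carried along with the data
```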
HAPI is an open interface groups can implement to provide access to their data. What makes HAPI unique is that any group can set up this API, which has been developed by a committee of heliophysics data providers. HAPI URIs look like:
vap+hapi:http://datashop.elasticbeanstalk.com/hapi?id=CASSINI_MAG_HI_RES&parameters=SSO_X,SSO_Y,SSO_Z&timerange=2004-07-01
which says with the HAPI server located at http://datashop.elasticbeanstalk.com/hapi, from the dataset CASSINI_MAG_HI_RES, load the parameters SSO_X,SSO_Y,SSO_Z. Any group can set up a HAPI server using the documentation at http://hapi-server.org/. A list of known HAPI servers is retrieved from https://github.com/hapi-server/servers/blob/master/all.txt, and once a HAPI server is manually entered in the GUI, it will appear in the droplist of available servers.
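As a sketch, the same HAPI URI can be read from the Jython script panel; getDataSet() and unbundle() are Autoplot script functions, and it is assumed here that the three parameters come back bundled into one rank 2 dataset.

```python
# Sketch: read Cassini MAG components from the HAPI server and plot one of them.
uri = 'vap+hapi:http://datashop.elasticbeanstalk.com/hapi?id=CASSINI_MAG_HI_RES&parameters=SSO_X,SSO_Y,SSO_Z&timerange=2004-07-01'
ds = getDataSet( uri )      # likely a rank 2 bundle: records by three components
plot( unbundle( ds, 0 ) )   # plot just the first component (SSO_X)
```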
URIs are a compact method for referring to data, and are used throughout Autoplot.
Autoplot keeps track of everything you have plotted in the file HOME/autoplot_data/bookmarks/history.txt. The URI history dialog ([menubar]→File→"Open URI History") is a tool that tries to sort out the history and provides a search tool for locating lost datasets. For example, if you know that you were plotting a txt file, set the filter to "vap+txt" to see all the matching URIs.
URIs can be:
- Sent in an email to refer to data.
- Used in scripts to load data (see the sketch following this list).
- Passed to Autoplot from IDL, Matlab, and Python to get data for a URI.
- Used in .vap files to refer to data.
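For example, here is a minimal Jython sketch that loads data by URI and writes it back out in another format; getDataSet() and formatDataSet() are Autoplot script functions, and the output file name is made up for illustration.

```python
# Sketch: fetch data by URI, then export it; the output format follows the extension.
ds = getDataSet( 'vap+cdaweb:ds=OMNI2_H0_MRG1HR&id=DST1800&timerange=Oct+2016' )
formatDataSet( ds, '/tmp/dst_oct2016.txt' )   # hypothetical output file name
```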
.vap ("dot vap") files are used to store the layout of the canvas.
A .vap file is created by simply using [menubar]→File→Save. Autoplot URIs are used to refer to data, rather than storing the data within the file. If the URI references within refer to files on the web, the .vap file can be sent to a colleague who can also load the .vap to look at the same data.
One can also embed data within a .vap file. In this case, a .vap.zip file is created, and this file will contain a .vap file with relative references to the data, and the data will also be contained in the .zip file. Note data from servers and public web sites will not be included in the .zip file.
A .vap file is loaded by entering the .vap file name in the address bar, or using [menubar]→File→"Open .vap File". This .vap file can be on a web site, so like data files you don't necessarily have to download the file first.
Note that the open dialog allows a different timerange to be entered, so that one vap can be used to look at data at any time during a mission. At the address bar, the .vap file name can be followed by "?timerange=..." to load the layout with a new timerange.
Now we'll survey the different file types handled. Autoplot has "plug-ins" for data sources, meaning when it launches, it checks to see which plug-ins implement the DataSourceFactory interface, and each of these is associated with a file extension. For example, the ASCIIFileDataSourceFactory is associated with .txt and .dat files. When a URI is entered, the data source is identified using the file extension or an explicit identifier in the URI. So "vap+cdaweb:..." is handled by the plug-in that knows how to get data from CDAWeb. One can explicitly identify a data source to use; for example, "vap+bin:/tmp/ap/data.dat" will use the binary source even though the file has a .dat extension.
Data Sources can also provide a DataSourceEditor, which is a GUI that knows how to construct URIs for a given data source. For example, when a CDF file is entered and the inspect button is pressed, the Data Source Editor for CDF files loads the CDF file and shows what is inside the file.
For each of the data sources, an image of the Data Source Editor is shown, as well as the keywords found in its URIs.
The ASCII Table reader reads in a flat ASCII file with one record per line. Each line of the file is identified as a record or non-record. Autoplot URIs consist of the name of the ASCII file and parameters that specify how to parse the file, listed below. A GUI is also provided that allows the URI to be created graphically; it does not provide access to all the available controls, but is easier to use. A short scripted example using several of these keywords follows the list.
- fixedColumns indicates that an optimized parser should be used because each row of the file has fixed column widths. (By default, rows are split into columns using a delimiter.) The value may take several forms:
- value contains ",": specify column locations as in "0-10,20-34"
- value is an integer: specifies the number of columns in each row; the actual number may be less.
- value is unspecified: the first row is split to determine the column locations.
- columnCount overrides the number of columns.
- column identifies the field in each record to treat as the dependent variable. By default the columns are named field0, field1, field2, etc.
- depend0 identifies a field as the independent variable.
- rank2, when in the URI, indicates that the dataset produced should be a rank 2 dataset, and the columns selected determine the number of elements per row. When the value contains a colon (:) or dash (-), this indicates a range of columns, <first>:<lastExclusive> or <first>-<lastInclusive>.
- value may be empty, meaning use all columns
- 1: means the second column and up.
- -5: means the last five columns.
- 20:24 means the four columns starting at the 21st column.
- field3:field6 means three columns (field3, field4, field5).
- B_x_gsm-B_z_gsm means three columns (B_x_gsm through B_z_gsm, inclusive).
- depend1Labels indicates the first record contains the labels for the rank 2 dataset. (<firstColumn>:<lastColumn exclusive>)
- depend1Values indicates the first record contains the values for the rank 2 dataset (so it is displayed as a spectrogram). (<firstColumn>:<lastColumn exclusive>)
- skip is an integer indicating the number of lines that should be skipped before processing begins.
- fill is the number that indicates missing or fill data.
- timeFormat specifies the time format, based on the Unix date command. "ISO8601" means the times are ISO8601 conformant, or use a template with fields from the wildcard codes above.
- time specifies the field that is the time record. This also sets the independent variable.
- delim identifies the delimiter character. By default, the first record is inspected for commas, tabs and then whitespace.
- comment prefix string that indicates records to ignore.
- fill values to be treated as fill data.
- where adds a constraint on the records, for example with a timerange or containing a string:
- where=field2.lt(100)
- where=field2.eq(1)
- where=field2.eq(rbspa) ordinal data can be used with eq.
- where=field2.within(4+to+40)
- where=field2.matches(Heater+Status+(On%7COff)) The plus is turned into a space, and %7C is a pipe character.
- label concise label for the data
- title one-line title for the data
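Putting several of these keywords together, here is a sketch that reads a hypothetical ASCII file (the file name and field numbers are made up for illustration; plot() is an Autoplot script function).

```python
# Sketch: skip a 3-line header, use field0 as time, plot field4,
# and keep only records where field4 is less than 100.
uri = 'vap+txt:file:/tmp/flux.txt?skip=3&time=field0&column=field4&where=field4.lt(100)'
plot( uri )
```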