Xarray-SQL: Query Xarray Datasets with SQL
Additional Info
| Field | Value |
|---|---|
| Last Updated | April 6, 2026, 18:51 (UTC) |
| Created | April 6, 2026, 18:51 (UTC) |
| appAccessMethod | Python package published on PyPI |
| appAudience | Data Scientists and Programmers |
| appPrereqs | Needs basic familiarity with Python, SQL, and Xarray. |
| appSummary | <p>Xarray-SQL lets one treat Xarray Datasets as if they were SQL tables. This allows data practitioners to join gridded rasters and traditional tabular datasets together.</p><pre class="ql-syntax" spellcheck="false">pip install xarray-sql</pre><pre class="ql-syntax" spellcheck="false">import xarray as xr<br>import xarray_sql as xql<br><br>ds = xr.tutorial.open_dataset('air_temperature')<br><br># The same as a dask-sql Context; i.e. an Apache DataFusion Context.<br>ctx = xql.XarrayContext()<br>ctx.from_dataset('air', ds, chunks=dict(time=24))  # the dataset needs to be chunked!<br><br># data is only materialized when we make a query.<br>result = ctx.sql('''<br>  SELECT "lat", "lon", AVG("air") as air_avg<br>  FROM "air"<br>  GROUP BY "lat", "lon"<br>''')<br># DataFrame()<br># +------+-------+--------------------+<br># | lat  | lon   | air_avg            |<br># +------+-------+--------------------+<br># | 75.0 | 205.0 | 259.88662671232834 |<br># | 75.0 | 207.5 | 259.48268150684896 |<br># | 75.0 | 230.0 | 258.9192123287667  |<br># | 75.0 | 275.0 | 257.07574315068456 |<br># | 75.0 | 322.5 | 250.11792123287654 |<br># | 75.0 | 325.0 | 250.81590068493134 |<br># | 72.5 | 205.0 | 262.74933904109537 |<br># | 72.5 | 207.5 | 262.5384315068488  |<br># | 72.5 | 230.0 | 260.82879452054743 |<br># | 72.5 | 275.0 | 257.3063321917804  |<br># +------+-------+--------------------+<br># Data truncated.</pre> |
| appUrls | ["https://colab.research.google.com/drive/1JAzzsmOvf5LsOz2EOFzKL6M7MtQp9s20"] |
| appVideos | ["https://www.youtube.com/watch?v=AtB_6c-GcJE"] |
| creationMethod | <p>Open source software that is community maintained.</p> |
| creatorEmail | al@merose.com |
| creatorName | Alexander Merose |
| creatorWebsite | https://alex.merose.com |
| dataAuthType | public |
| dataType | Software |
| docsURL | https://alxmrs.github.io/xarray-sql/ |
| issueDate | 2023-09-30 |
| lang | en |
| lastUpdateDate | 2026-03-29 |
| license | other |
| ndp_creator_md5 | 6bac2b7876195a6842b7db2d3585796b |
| otherLicense | Apache 2.0 |
| pocEmail | al@merose.com |
| pocName | Alexander Merose |
| pocWebsite | https://alex.merose.com |
| publisherEmail | |
| publisherName | PyPI |
| publisherWebsite | https://pypi.org/project/xarray_sql/ |
| purpose | <p>Working with several raster datasets alongside tabular datasets usually means planning each step by hand and digging into the physical structure of each data representation. SQL instead provides a high-level logical view of all the data: if you can declare in SQL how datasets relate to one another, you can almost certainly run the query. The typical Python workflow does not offer this property, because one has to manually manipulate dataframes according to their physical layout (in effect, hand-written query plans). Xarray-SQL lets users work with gridded rasters while thinking of them as tables, which we argue is a more accessible way to work with data.</p> |
| status | submitted |
| theme | [] |
| updateFreq | Monthly |
| uploadType | application |
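
The "rasters as tables" idea in the purpose field can be illustrated without the package itself. The following is a minimal sketch using only Python's standard-library `sqlite3`; it is not Xarray-SQL's implementation, and the grid values, the `stations` table, and the `grid_to_rows` helper are all hypothetical stand-ins. It unravels a tiny (time, lat, lon) grid into one row per cell, then joins it against a tabular dataset in a single declarative query.

```python
import itertools
import sqlite3

# Hypothetical 3-D grid indexed as air[time][lat][lon]; values are made up.
times = [0, 1]
lats = [75.0, 72.5]
lons = [205.0, 207.5]
air = [
    [[260.0, 259.0], [262.0, 263.0]],
    [[258.0, 261.0], [264.0, 261.0]],
]


def grid_to_rows(times, lats, lons, values):
    """Unravel the grid into one (time, lat, lon, air) row per cell."""
    for (i, t), (j, lat), (k, lon) in itertools.product(
        enumerate(times), enumerate(lats), enumerate(lons)
    ):
        yield (t, lat, lon, values[i][j][k])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE air (time INTEGER, lat REAL, lon REAL, air REAL)")
conn.executemany(
    "INSERT INTO air VALUES (?, ?, ?, ?)",
    grid_to_rows(times, lats, lons, air),
)

# A small, hypothetical tabular dataset keyed by grid coordinates.
conn.execute("CREATE TABLE stations (name TEXT, lat REAL, lon REAL)")
conn.executemany(
    "INSERT INTO stations VALUES (?, ?, ?)",
    [("A", 75.0, 205.0), ("B", 72.5, 207.5)],
)

# The join and aggregation are declared logically; no manual reshaping.
rows = conn.execute(
    """
    SELECT s.name, AVG(a.air) AS air_avg
    FROM air AS a
    JOIN stations AS s ON a.lat = s.lat AND a.lon = s.lon
    GROUP BY s.name
    ORDER BY s.name
    """
).fetchall()

for name, air_avg in rows:
    print(name, air_avg)
```

Each grid cell becomes a row whose coordinate values act as ordinary columns, so the time average per station reduces to a `JOIN` plus `GROUP BY`. A real raster would be far too large to copy into rows eagerly like this; the appSummary example notes that Xarray-SQL requires a chunked dataset and only materializes data when a query runs.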