sections_fts
478 rows sorted by rowid descending
Link | rowid | title | content | sections_fts | rank |
---|---|---|---|---|---|
278 | 278 | 0.58 (2021-07-14) | New datasette --uds /tmp/datasette.sock option for binding Datasette to a Unix domain socket, see proxy documentation ( #1388 ) "searchmode": "raw" table metadata option for defaulting a table to executing SQLite full-text search syntax without first escaping it, see Advanced SQLite search queries . ( #1389 ) New plugin hook: get_metadata(datasette, key, database, table) , for returning custom metadata for an instance, database or table. Thanks, Brandon Roberts! ( #1384 ) New plugin hook: skip_csrf(datasette, scope) , for opting out of CSRF protection based on the incoming request. ( #1377 ) The menu_links() , table_actions() and database_actions() plugin hooks all gained a new optional request argument providing access to the current request. ( #1371 ) Major performance improvement for Datasette faceting. ( #1394 ) Improved documentation for Running Datasette behind a proxy to recommend using ProxyPreserveHost On with Apache. ( #1387 ) POST requests to endpoints that do not support that HTTP verb now return a 405 error. db.path can now be provided as a pathlib.Path object, useful when writing unit tests for plugins. Thanks, Chris Amico. ( #1365 ) | 10 | |
277 | 277 | 0.58.1 (2021-07-16) | Fix for an intermittent race condition caused by the refresh_schemas() internal function. ( #1231 ) | 10 | |
276 | 276 | 0.59 (2021-10-14) | Columns can now have associated metadata descriptions in metadata.json , see Column descriptions . ( #942 ) New register_commands() plugin hook allows plugins to register additional Datasette CLI commands, e.g. datasette mycommand file.db . ( #1449 ) Adding ?_facet_size=max to a table page now shows the number of unique values in each facet. ( #1423 ) Upgraded dependency httpx 0.20 - the undocumented allow_redirects= parameter to datasette.client is now follow_redirects= , and defaults to False where it previously defaulted to True . ( #1488 ) The --cors option now causes Datasette to return the Access-Control-Allow-Headers: Authorization header, in addition to Access-Control-Allow-Origin: * . ( #1467 ) Code that figures out which named parameters a SQL query takes in order to display form fields for them is no longer confused by strings that contain colon characters. ( #1421 ) Renamed --help-config option to --help-settings . ( #1431 ) datasette.databases property is now a documented API. ( #1443 ) The base.html template now wraps everything other than the <footer> in a <div class="not-footer"> element, to help with advanced CSS customization. ( #1446 ) The render_cell() plugin hook can now return an awaitable function. This means the hook can execute SQL queries. ( #1425 ) register_routes(datasette) plugin hook now accepts an optional datasette argument. ( #1404 ) … | 10 | |
275 | 275 | 0.59.1 (2021-10-24) | Fix compatibility with Python 3.10. ( #1482 ) Documentation on how to use Named parameters with integer and floating point values. ( #1496 ) | 10 | |
274 | 274 | 0.59.2 (2021-11-13) | Column names with a leading underscore now work correctly when used as a facet. ( #1506 ) Applying ?_nocol= to a column no longer removes that column from the filtering interface. ( #1503 ) Official Datasette Docker container now uses Debian Bullseye as the base image. ( #1497 ) Datasette is four years old today! Here's the original release announcement from 2017. | 10 | |
273 | 273 | 0.59.3 (2021-11-20) | Fixed numerous bugs when running Datasette behind a proxy with a prefix URL path using the base_url setting. A live demo of this mode is now available at datasette-apache-proxy-demo.datasette.io/prefix/ . ( #1519 , #838 ) ?column__arraycontains= and ?column__arraynotcontains= table parameters now also work against SQL views. ( #448 ) ?_facet_array=column no longer returns incorrect counts if columns contain the same value more than once. | 10 | |
272 | 272 | 0.59.4 (2021-11-29) | Fixed bug where columns with a leading underscore could not be removed from the interactive filters list. ( #1527 ) Fixed bug where columns with a leading underscore were not correctly linked to by the "Links from other tables" interface on the row page. ( #1525 ) Upgraded dependencies aiofiles , black and janus . | 10 | |
271 | 271 | Other small fixes | Made several performance improvements to the database schema introspection code that runs when Datasette first starts up. ( #1555 ) Label columns detected for foreign keys are now case-insensitive, so Name or TITLE will be detected in the same way as name or title . ( #1544 ) Upgraded Pluggy dependency to 1.0. ( #1575 ) Now using Plausible analytics for the Datasette documentation. explain query plan is now allowed with varying amounts of whitespace in the query. ( #1588 ) New CLI reference page showing the output of --help for each of the datasette sub-commands. This led to several small improvements to the help copy. ( #1594 ) Fixed bug where writable canned queries could not be used with custom templates. ( #1547 ) Improved fix for a bug where columns with an underscore prefix could result in unnecessary hidden form fields. ( #1527 ) | 10 | |
270 | 270 | Faceting | The number of unique values in a facet is now always displayed. Previously it was only displayed if the user specified ?_facet_size=max . ( #1556 ) Facets of type date or array can now be configured in metadata.json , see Facets in metadata.json . Thanks, David Larlet. ( #1552 ) New ?_nosuggest=1 parameter for table views, which disables facet suggestion. ( #1557 ) Fixed bug where ?_facet_array=tags&_facet=tags would only display one of the two selected facets. ( #625 ) | 10 | |
269 | 269 | Plugins and internals | New plugin hook: filters_from_request(request, database, table, datasette) , which runs on the table page and can be used to support new custom query string parameters that modify the SQL query. ( #473 ) Added two additional methods for writing to the database: await db.execute_write_script(sql, block=True) and await db.execute_write_many(sql, params_seq, block=True) . ( #1570 ) The db.execute_write() internal method now defaults to blocking until the write operation has completed. Previously it defaulted to queuing the write and then continuing to run code while the write was in the queue. ( #1579 ) Database write connections now execute the prepare_connection(conn, database, datasette) plugin hook. ( #1564 ) The Datasette() constructor no longer requires the files= argument, and is now documented at Datasette class . ( #1563 ) The tracing feature now traces write queries, not just read queries. ( #1568 ) The query string variables exposed by request.args will now include blank strings for arguments such as foo in ?foo=&bar=1 rather than ignoring those parameters entirely. ( #1551 ) | 10 | |
268 | 268 | 0.60 (2022-01-13) | | 10 | |
267 | 267 | 0.60.1 (2022-01-20) | Fixed a bug where installation on Python 3.6 stopped working due to a change to an underlying dependency. This release can now be installed on Python 3.6, but is the last release of Datasette that will support anything less than Python 3.7. ( #1609 ) | 10 | |
266 | 266 | 0.60.2 (2022-02-07) | Fixed a bug where Datasette would open the same file twice with two different database names if you ran datasette file.db file.db . ( #1632 ) | 10 | |
265 | 265 | 0.61 (2022-03-23) | In preparation for Datasette 1.0, this release includes two potentially backwards-incompatible changes. Hashed URL mode has been moved to a separate plugin, and the way Datasette generates URLs to databases and tables with special characters in their name such as / and . has changed. Datasette also now requires Python 3.7 or higher. URLs within Datasette now use a different encoding scheme for tables or databases that include "special" characters outside of the range of a-zA-Z0-9_- . This scheme is explained here: Tilde encoding . ( #1657 ) Removed hashed URL mode from Datasette. The new datasette-hashed-urls plugin can be used to achieve the same result, see datasette-hashed-urls for details. ( #1661 ) Databases can now have a custom path within the Datasette instance that is independent of the database name, using the db.route property. ( #1668 ) Datasette is now covered by a Code of Conduct . ( #1654 ) Python 3.6 is no longer supported. ( #1577 ) Tests now run against Python 3.11-dev. ( #1621 ) New datasette.ensure_permissions(actor, permissions) internal method for checking multiple permissions at once. ( #1675 ) New datasette.check_visibility(actor, action, resource=None) internal method for checking if a user can see a resource that would otherwise be invisible to unauthenticated users. ( #1678 ) Table and row HTML pages now include a <link rel="alternate" type="application/json+datasette" href="..."> element and return a Link: URL; rel="alternate"; type="applicatio… | 10 | |
264 | 264 | 0.61.1 (2022-03-23) | Fixed a bug where databases with a different route from their name (as used by the datasette-hashed-urls plugin ) returned errors when executing custom SQL queries. ( #1682 ) | 10 | |
263 | 263 | Documentation | Examples in the documentation now include a copy-to-clipboard button. ( #1748 ) Documentation now uses the Furo Sphinx theme. ( #1746 ) Code examples in the documentation are now all formatted using Black. ( #1718 ) Request.fake() method is now documented, see Request object . New documentation for plugin authors: Registering a plugin for the duration of a test . ( #903 ) | 10 | |
262 | 262 | Bug fixes | Don't show the facet option in the cog menu if faceting is not allowed. ( #1683 ) ?_sort and ?_sort_desc now work if the column that is being sorted has been excluded from the query using ?_col= or ?_nocol= . ( #1773 ) Fixed bug where ?_sort_desc was duplicated in the URL every time the Apply button was clicked. ( #1738 ) | 10 | |
261 | 261 | Plugin hooks | New plugin hook: handle_exception() , for custom handling of exceptions caught by Datasette. ( #1770 ) The render_cell() plugin hook is now also passed a row argument, representing the sqlite3.Row object that is being rendered. ( #1300 ) The configuration directory is now stored in datasette.config_dir , making it available to plugins. Thanks, Chris Amico. ( #1766 ) | 10 | |
260 | 260 | Features | Datasette is now compatible with Pyodide . This is the enabling technology behind Datasette Lite . ( #1733 ) Database file downloads now implement conditional GET using ETags. ( #1739 ) HTML for facet results and suggested results has been extracted out into new templates _facet_results.html and _suggested_facets.html . Thanks, M. Nasimul Haque. ( #1759 ) Datasette now runs some SQL queries in parallel. This has limited impact on performance, see this research issue for details. New --nolock option for ignoring file locks when opening read-only databases. ( #1744 ) Spaces in the database names in URLs are now encoded as + rather than ~20 . ( #1701 ) <Binary: 2427344 bytes> is now displayed as <Binary: 2,427,344 bytes> and is accompanied by a tooltip showing "2.3MB". ( #1712 ) The base Docker image used by datasette publish cloudrun , datasette package and the official Datasette image has been upgraded to 3.10.6-slim-bullseye . ( #1768 ) Canned writable queries against immutable databases now show a warning message. ( #1728 ) datasette publish cloudrun has a new --timeout option which can be used to increase the time limit applied by the Google Cloud build environment. Thanks, Tim Sherratt. ( #1717 ) datasette publish cloudrun has new --min-instances and --max-instances options. ( #1779 ) | 10 | |
259 | 259 | 0.62 (2022-08-14) | Datasette can now run entirely in your browser using WebAssembly. Try out Datasette Lite , take a look at the code or read more about it in Datasette Lite: a server-side Python web application running in a browser . Datasette now has a Discord community for questions and discussions about Datasette and its ecosystem of projects. | 10 | |
258 | 258 | Documentation | New tutorial: Cleaning data with sqlite-utils and Datasette . Screenshots in the documentation are now maintained using shot-scraper , as described in Automating screenshots for the Datasette documentation using shot-scraper . ( #1844 ) More detailed command descriptions on the CLI reference page. ( #1787 ) New documentation on Running Datasette using OpenRC - thanks, Adam Simpson. ( #1825 ) | 10 | |
257 | 257 | Plugin hooks and internals | The prepare_jinja2_environment(env, datasette) plugin hook now accepts an optional datasette argument. Hook implementations can also now return an async function which will be awaited automatically. ( #1809 ) Database(is_mutable=) now defaults to True . ( #1808 ) The datasette.check_visibility() method now accepts an optional permissions= list, allowing it to take multiple permissions into account at once when deciding if something should be shown as public or private. This has been used to correctly display padlock icons in more places in the Datasette interface. ( #1829 ) Datasette no longer enforces upper bounds on its dependencies. ( #1800 ) | 10 | |
256 | 256 | Features | Now tested against Python 3.11. Docker containers used by datasette publish and datasette package both now use that version of Python. ( #1853 ) --load-extension option now supports entrypoints. Thanks, Alex Garcia. ( #1789 ) Facet size can now be set per-table with the new facet_size table metadata option. ( #1804 ) The truncate_cells_html setting now also affects long URLs in columns. ( #1805 ) The non-JavaScript SQL editor textarea now increases height to fit the SQL query. ( #1786 ) Facets are now displayed with better line-breaks in long values. Thanks, Daniel Rech. ( #1794 ) The settings.json file used in Configuration directory mode is now validated on startup. ( #1816 ) SQL queries can now include leading SQL comments, using /* ... */ or -- ... syntax. Thanks, Charles Nepote. ( #1860 ) SQL query is now re-displayed when terminated with a time limit error. ( #1819 ) The inspect data mechanism is now used to speed up server startup - thanks, Forest Gregg. ( #1834 ) In Configuration directory mode databases with filenames ending in .sqlite or .sqlite3 are now automatically added to the Datasette instance. ( #1646 ) Breadcrumb navigation display now respects the current user's permissions. ( #1831 ) | 10 | |
255 | 255 | 0.63 (2022-10-27) | See Datasette 0.63: The annotated release notes for more background on the changes in this release. | 10 | |
254 | 254 | 0.63.1 (2022-11-10) | Fixed a bug where Datasette's table filter form would not redirect correctly when run behind a proxy using the base_url setting. ( #1883 ) SQL query is now shown wrapped in a <textarea> if a query exceeds a time limit. ( #1876 ) Fixed an intermittent "Too many open files" error while running the test suite. ( #1843 ) New db.close() internal method. | 10 | |
253 | 253 | 0.63.2 (2022-11-18) | Fixed a bug in datasette publish heroku where deployments failed due to an older version of Python being requested. ( #1905 ) New datasette publish heroku --generate-dir <dir> option for generating a Heroku deployment directory without deploying it. | 10 | |
252 | 252 | 0.63.3 (2022-12-17) | Fixed a bug where datasette --root , when running in Docker, would only output the URL to sign in as root when the server shut down, not when it started up. ( #1958 ) You no longer need to ensure await datasette.invoke_startup() has been called in order for Datasette to start correctly serving requests - this is now handled automatically the first time the server receives a request. This fixes a bug experienced when Datasette is served directly by an ASGI application server such as Uvicorn or Gunicorn. It also fixes a bug with the datasette-gunicorn plugin. ( #1955 ) | 10 | |
251 | 251 | 0.64 (2023-01-09) | Datasette now strongly recommends against allowing arbitrary SQL queries if you are using SpatiaLite . SpatiaLite includes SQL functions that could cause the Datasette server to crash. See SpatiaLite for more details. New default_allow_sql setting, providing an easier way to disable all arbitrary SQL execution by end users: datasette --setting default_allow_sql off . See also Controlling the ability to execute arbitrary SQL . ( #1409 ) Building a location to time zone API with SpatiaLite is a new Datasette tutorial showing how to safely use SpatiaLite to create a location to time zone API. New documentation about how to debug problems loading SQLite extensions . The error message shown when an extension cannot be loaded has also been improved. ( #1979 ) Fixed an accessibility issue: the <select> elements in the table filter form now show an outline when they are currently focused. ( #1771 ) | 10 | |
250 | 250 | 0.64.1 (2023-01-11) | Documentation now links to a current source of information for installing Python 3. ( #1987 ) Incorrectly calling the Datasette constructor using Datasette("path/to/data.db") instead of Datasette(["path/to/data.db"]) now returns a useful error message. ( #1985 ) | 10 | |
249 | 249 | 0.64.2 (2023-03-08) | Fixed a bug with datasette publish cloudrun where deploys all used the same Docker image tag. This was mostly inconsequential as the service is deployed as soon as the image has been pushed to the registry, but could result in the incorrect image being deployed if two different deploys for two separate services ran at exactly the same time. ( #2036 ) | 10 | |
248 | 248 | 0.64.3 (2023-04-27) | Added pip and setuptools as explicit dependencies. This fixes a bug where Datasette could not be installed using Rye . ( #2065 ) | 10 | |
247 | 247 | 0.64.4 (2023-09-21) | Fix for a crashing bug caused by viewing the table page for a named in-memory database. ( #2189 ) | 10 | |
246 | 246 | 0.64.5 (2023-10-08) | Dropped dependency on click-default-group-wheel , which could cause a dependency conflict. ( #2197 ) | 10 | |
245 | 245 | 0.64.6 (2023-12-22) | Fixed a bug where CSV export with expanded labels could fail if a foreign key reference did not correctly resolve. ( #2214 ) | 10 | |
244 | 244 | Changelog | | 10 | |
243 | 243 | datasette package | If you have docker installed (e.g. using Docker for Mac ) you can use the datasette package command to create a new Docker image in your local repository containing the datasette app bundled together with one or more SQLite databases: datasette package mydatabase.db Here's example output for the package command: $ datasette package parlgov.db --extra-options="--setting sql_time_limit_ms 2500" Sending build context to Docker daemon 4.459MB Step 1/7 : FROM python:3.11.0-slim-bullseye ---> 79e1dc9af1c1 Step 2/7 : COPY . /app ---> Using cache ---> cd4ec67de656 Step 3/7 : WORKDIR /app ---> Using cache ---> 139699e91621 Step 4/7 : RUN pip install datasette ---> Using cache ---> 340efa82bfd7 Step 5/7 : RUN datasette inspect parlgov.db --inspect-file inspect-data.json ---> Using cache ---> 5fddbe990314 Step 6/7 : EXPOSE 8001 ---> Using cache ---> 8e83844b0fed Step 7/7 : CMD datasette serve parlgov.db --port 8001 --inspect-file inspect-data.json --setting sql_time_limit_ms 2500 ---> Using cache ---> 1bd380ea8af3 Successfully built 1bd380ea8af3 You can now run the resulting container like so: docker run -p 8081:8001 1bd380ea8af3 This exposes port 8001 inside the container as port 8081 on your host machine, so you can access the application at http://localhost:8081/ You can customize the port that is exposed by the container using the --port option: datasette package mydatabase.db --port 8080 A full list of options can be seen by running datasette package --help : See datasette package for the full list of options for this command. | 10 | |
242 | 242 | Custom metadata and plugins | datasette publish accepts a number of additional options which can be used to further customize your Datasette instance. You can define your own Metadata and deploy that with your instance like so: datasette publish cloudrun --service=my-service mydatabase.db -m metadata.json If you just want to set the title, license or source information you can do that directly using extra options to datasette publish : datasette publish cloudrun mydatabase.db --service=my-service \ --title="Title of my database" \ --source="Where the data originated" \ --source_url="http://www.example.com/" You can also specify plugins you would like to install. For example, if you want to include the datasette-vega visualization plugin you can use the following: datasette publish cloudrun mydatabase.db --service=my-service --install=datasette-vega If a plugin has any Secret configuration values you can use the --plugin-secret option to set those secrets at publish time. For example, using Heroku with datasette-auth-github you might run the following command: $ datasette publish heroku my_database.db \ --name my-heroku-app-demo \ --install=datasette-auth-github \ --plugin-secret datasette-auth-github client_id your_client_id \ --plugin-secret datasette-auth-github client_secret your_client_secret | 10 | |
241 | 241 | Publishing to Fly | Fly is a competitively priced Docker-compatible hosting platform that supports running applications in globally distributed data centers close to your end users. You can deploy Datasette instances to Fly using the datasette-publish-fly plugin. pip install datasette-publish-fly datasette publish fly mydatabase.db --app="my-app" Consult the datasette-publish-fly README for more details. | 10 | |
240 | 240 | Publishing to Vercel | Vercel - previously known as Zeit Now - provides a layer over AWS Lambda to allow for quick, scale-to-zero deployment. You can deploy Datasette instances to Vercel using the datasette-publish-vercel plugin. pip install datasette-publish-vercel datasette publish vercel mydatabase.db --project my-database-project Not every feature is supported: consult the datasette-publish-vercel README for more details. | 10 | |
239 | 239 | Publishing to Heroku | To publish your data using Heroku , first create an account there and install and configure the Heroku CLI tool . You can publish one or more databases to Heroku using the following command: datasette publish heroku mydatabase.db This will output some details about the new deployment, including a URL like this one: https://limitless-reef-88278.herokuapp.com/ deployed to Heroku You can specify a custom app name by passing -n my-app-name to the publish command. This will also allow you to overwrite an existing app. Rather than deploying directly you can use the --generate-dir option to output the files that would be deployed to a directory: datasette publish heroku mydatabase.db --generate-dir=/tmp/deploy-this-to-heroku See datasette publish heroku for the full list of options for this command. | 10 | |
238 | 238 | Publishing to Google Cloud Run | Google Cloud Run allows you to publish data in a scale-to-zero environment, so your application will start running when the first request is received and will shut down again when traffic ceases. This means you only pay for time spent serving traffic. Cloud Run is a great option for inexpensively hosting small, low traffic projects - but costs can add up for projects that serve a lot of requests. Be particularly careful if your project has tables with large numbers of rows. Search engine crawlers that index a page for every row could result in a high bill. The datasette-block-robots plugin can be used to request search engine crawlers omit crawling your site, which can help avoid this issue. You will first need to install and configure the Google Cloud CLI tools by following these instructions . You can then publish one or more SQLite database files to Google Cloud Run using the following command: datasette publish cloudrun mydatabase.db --service=my-database A Cloud Run service is a single hosted application. The service name you specify will be used as part of the Cloud Run URL. If you deploy to a service name that you have used in the past your new deployment will replace the previous one. If you omit the --service option you will be asked to pick a service name interactively during the deploy. You may need to interact with prompts from the tool. Many of the prompts ask for values that can be set as properties for the Google Cloud SDK if you want to avoid the prompts. For example, the default region for the deployed instance can be set using the command: gcloud config set run/region us-central1 You should replace us-central1 with your desired region . Alternately, you can specify the region by setting the CLOUDSDK_RUN_REGION environment… | 10 | |
237 | 237 | datasette publish | Once you have created a SQLite database (e.g. using csvs-to-sqlite ) you can deploy it to a hosting account using a single command. You will need a hosting account with Heroku or Google Cloud . Once you have created your account you will need to install and configure the heroku or gcloud command-line tools. | 10 | |
236 | 236 | Publishing data | Datasette includes tools for publishing and deploying your data to the internet. The datasette publish command will deploy a new Datasette instance containing your databases directly to a Heroku or Google Cloud hosting account. You can also use datasette package to create a Docker image that bundles your databases together with the datasette application that is used to serve them. | 10 | |
235 | 235 | Using YAML for metadata | Datasette accepts YAML as an alternative to JSON for your metadata configuration file. YAML is particularly useful for including multiline HTML and SQL strings. Here's an example of a metadata.yml file, re-using an example from Canned queries . title: Demonstrating Metadata from YAML description_html: |- <p>This description includes a long HTML string</p> <ul> <li>YAML is better for embedding HTML strings than JSON!</li> </ul> license: ODbL license_url: https://opendatacommons.org/licenses/odbl/ databases: fixtures: tables: no_primary_key: hidden: true queries: neighborhood_search: sql: |- select neighborhood, facet_cities.name, state from facetable join facet_cities on facetable.city_id = facet_cities.id where neighborhood like '%' || :text || '%' order by neighborhood; title: Search neighborhoods description_html: |- <p>This demonstrates <em>basic</em> LIKE search The metadata.yml file is passed to Datasette using the same --metadata option: datasette fixtures.db --metadata metadata.yml | 10 | |
234 | 234 | Hiding tables | You can hide tables from the database listing view (in the same way that FTS and SpatiaLite tables are automatically hidden) using "hidden": true : { "databases": { "database1": { "tables": { "example_table": { "hidden": true } } } } } | 10 | |
233 | 233 | Specifying the label column for a table | Datasette's HTML interface attempts to display foreign key references as labelled hyperlinks. By default, it looks for referenced tables that only have two columns: a primary key column and one other. It assumes that the second column should be used as the link label. If your table has more than two columns you can specify which column should be used for the link label with the label_column property: { "databases": { "database1": { "tables": { "example_table": { "label_column": "title" } } } } } | 10 | |
232 | 232 | Setting which columns can be used for sorting | Datasette allows any column to be used for sorting by default. If you need to control which columns are available for sorting you can do so using the optional sortable_columns key: { "databases": { "database1": { "tables": { "example_table": { "sortable_columns": [ "height", "weight" ] } } } } } This will restrict sorting of example_table to just the height and weight columns. You can also disable sorting entirely by setting "sortable_columns": [] You can use sortable_columns to enable specific sort orders for a view called name_of_view in the database my_database like so: { "databases": { "my_database": { "tables": { "name_of_view": { "sortable_columns": [ "clicks", "impressions" ] } } } } } | 10 | |
231 | 231 | Setting a custom page size | Datasette defaults to displaying 100 rows per page, for both tables and views. You can change this default page size on a per-table or per-view basis using the "size" key in metadata.json : { "databases": { "mydatabase": { "tables": { "example_table": { "size": 10 } } } } } This size can still be over-ridden by passing e.g. ?_size=50 in the query string. | 10 | |
230 | 230 | Setting a default sort order | By default Datasette tables are sorted by primary key. You can over-ride this default for a specific table using the "sort" or "sort_desc" metadata properties: { "databases": { "mydatabase": { "tables": { "example_table": { "sort": "created" } } } } } Or use "sort_desc" to sort in descending order: { "databases": { "mydatabase": { "tables": { "example_table": { "sort_desc": "created" } } } } } | 10 | |
229 | 229 | Specifying units for a column | Datasette supports attaching units to a column, which will be used when displaying values from that column. SI prefixes will be used where appropriate. Column units are configured in the metadata like so: { "databases": { "database1": { "tables": { "example_table": { "units": { "column1": "metres", "column2": "Hz" } } } } } } Units are interpreted using Pint , and you can see the full list of available units in Pint's unit registry . You can also add custom units to the metadata, which will be registered with Pint: { "custom_units": [ "decibel = [] = dB" ] } | 10 | |
228 | 228 | Column descriptions | You can include descriptions for your columns by adding a "columns": {"name-of-column": "description-of-column"} block to your table metadata: { "databases": { "database1": { "tables": { "example_table": { "columns": { "column1": "Description of column 1", "column2": "Description of column 2" } } } } } } These will be displayed at the top of the table page, and will also show in the cog menu for each column. You can see an example of how these look at latest.datasette.io/fixtures/roadside_attractions . | 10 | |
227 | 227 | Source, license and about | The three visible metadata fields you can apply to everything, specific databases or specific tables are source, license and about. All three are optional. source and source_url should be used to indicate where the underlying data came from. license and license_url should be used to indicate the license under which the data can be used. about and about_url can be used to link to further information about the project - an accompanying blog entry for example. For each of these you can provide just the *_url field and Datasette will treat that as the default link label text and display the URL directly on the page. | 10 | |
226 | 226 | Per-database and per-table metadata | Metadata at the top level of the JSON will be shown on the index page and in the footer on every page of the site. The license and source is expected to apply to all of your data. You can also provide metadata at the per-database or per-table level, like this: { "databases": { "database1": { "source": "Alternative source", "source_url": "http://example.com/", "tables": { "example_table": { "description_html": "Custom <em>table</em> description", "license": "CC BY 3.0 US", "license_url": "https://creativecommons.org/licenses/by/3.0/us/" } } } } } Each of the top-level metadata fields can be used at the database and table level. | 10 | |
225 | 225 | Metadata | Data loves metadata. Any time you run Datasette you can optionally include a JSON file with metadata about your databases and tables. Datasette will then display that information in the web UI. Run Datasette like this: datasette database1.db database2.db --metadata metadata.json Your metadata.json file can look something like this: { "title": "Custom title for your index page", "description": "Some description text can go here", "license": "ODbL", "license_url": "https://opendatacommons.org/licenses/odbl/", "source": "Original Data Source", "source_url": "http://example.com/" } You can optionally use YAML instead of JSON, see Using YAML for metadata . The above metadata will be displayed on the index page of your Datasette-powered site. The source and license information will also be included in the footer of every page served by Datasette. Any special HTML characters in description will be escaped. If you want to include HTML in your description, you can use a description_html property instead. | 10 | |
224 | 224 | datasette-hashed-urls | If you open a database file in immutable mode using the -i option, you can be assured that the content of that database will not change for the lifetime of the Datasette server. The datasette-hashed-urls plugin implements an optimization where your database is served with part of the SHA-256 hash of the database contents baked into the URL. A database at /fixtures will instead be served at /fixtures-aa7318b , and a year-long cache expiry header will be returned with those pages. This will then be cached by both browsers and caching proxies such as Cloudflare or Fastly, providing a potentially significant performance boost. To install the plugin, run the following: datasette install datasette-hashed-urls Prior to Datasette 0.61 hashed URL mode was a core Datasette feature, enabled using the hash_urls setting. This implementation has now been removed in favor of the datasette-hashed-urls plugin. Prior to Datasette 0.28 hashed URL mode was the default behaviour for Datasette, since all database files were assumed to be immutable and unchanging. From 0.28 onwards the default has been to treat database files as mutable unless explicitly configured otherwise. | 10 | |
223 | 223 | HTTP caching | If your database is immutable and guaranteed not to change, you can gain major performance improvements from Datasette by enabling HTTP caching. This can work at two different levels. First, it can tell browsers to cache the results of queries and serve future requests from the browser cache. More significantly, it allows you to run Datasette behind a caching proxy such as Varnish or use a cache provided by a hosted service such as Fastly or Cloudflare . This can provide incredible speed-ups since a query only needs to be executed by Datasette the first time it is accessed - all subsequent hits can then be served by the cache. Using a caching proxy in this way could enable a Datasette-backed visualization to serve thousands of hits a second while running Datasette itself on extremely inexpensive hosting. Datasette's integration with HTTP caches can be enabled using a combination of configuration options and query string arguments. The default_cache_ttl setting sets the default HTTP cache TTL for all Datasette pages. This is 5 seconds unless you change it - you can set it to 0 if you wish to disable HTTP caching entirely. You can also change the cache timeout on a per-request basis using the ?_ttl=10 query string parameter. This can be useful when you are working with the Datasette JSON API - you may decide that a specific query can be cached for a longer time, or maybe you need to set ?_ttl=0 for some requests for example if you are running a SQL order by random() query. | 10 | |
222 | 222 | Using "datasette inspect" | Counting the rows in a table can be a very expensive operation on larger databases. In immutable mode Datasette performs this count only once and caches the results, but this can still cause server startup time to increase by several seconds or more. If you know that a database is never going to change you can precalculate the table row counts once and store them in a JSON file, then use that file when you later start the server. To create a JSON file containing the calculated row counts for a database, use the following: datasette inspect data.db --inspect-file=counts.json Then later you can start Datasette against the counts.json file and use it to skip the row counting step and speed up server startup: datasette -i data.db --inspect-file=counts.json You need to use the -i immutable mode against the database file here or the counts from the JSON file will be ignored. You will rarely need to use this optimization in everyday use, but several of the datasette publish commands described in Publishing data use this optimization for better performance when deploying a database file to a hosting provider. | 10 | |
221 | 221 | Immutable mode | If you can be certain that a SQLite database file will not be changed by another process you can tell Datasette to open that file in immutable mode . Doing so will disable all locking and change detection, which can result in improved query performance. This also enables further optimizations relating to HTTP caching, described below. To open a file in immutable mode pass it to the datasette command using the -i option: datasette -i data.db When you open a file in immutable mode like this Datasette will also calculate and cache the row counts for each table in that database when it first starts up, further improving performance. | 10 | |
220 | 220 | Performance and caching | Datasette runs on top of SQLite, and SQLite has excellent performance. For small databases almost any query should return in just a few milliseconds, and larger databases (100s of MBs or even GBs of data) should perform extremely well provided your queries make sensible use of database indexes. That said, there are a number of tricks you can use to improve Datasette's performance. | 10 | |
219 | 219 | Streaming all records | The stream all rows option is designed to be as efficient as possible - under the hood it takes advantage of Python 3 asyncio capabilities and Datasette's efficient pagination to stream back the full CSV file. Since databases can get pretty large, by default this option is capped at 100MB - if a table returns more than 100MB of data the last line of the CSV will be a truncation error message. You can increase or remove this limit using the max_csv_mb config setting. You can also disable the CSV export feature entirely using allow_csv_stream . | 10 | |
218 | 218 | URL parameters | The following options can be used to customize the CSVs returned by Datasette. ?_header=off This removes the first row of the CSV file specifying the headings - only the row data will be returned. ?_stream=on Stream all matching records, not just the first page of results. See below. ?_dl=on Causes Datasette to return a content-disposition: attachment; filename="filename.csv" header. | 10 | |
217 | 217 | CSV export | Any Datasette table, view or custom SQL query can be exported as CSV. To obtain the CSV representation of the table you are looking at, click the "this data as CSV" link. You can also use the advanced export form for more control over the resulting file, which looks like this and has the following options: download file - instead of displaying CSV in your browser, this forces your browser to download the CSV to your downloads directory. expand labels - if your table has any foreign key references this option will cause the CSV to gain additional COLUMN_NAME_label columns with a label for each foreign key derived from the linked table. In this example the city_id column is accompanied by a city_id_label column. stream all rows - by default CSV files only contain the first max_returned_rows records. This option will cause Datasette to loop through every matching record and return them as a single CSV file. You can try that out on https://latest.datasette.io/fixtures/facetable?_size=4 | 10 | |
216 | 216 | Row | Every row in every Datasette table has its own URL. This means individual records can be linked to directly. Table cells with extremely long text contents are truncated on the table view according to the truncate_cells_html setting. If a cell has been truncated the full length version of that cell will be available on the row page. Rows which are the targets of foreign key references from other tables will show a link to a filtered search for all records that reference that row. Here's an example from the Registers of Members Interests database: ../people/uk.org.publicwhip%2Fperson%2F10001 Note that this URL includes the encoded primary key of the record. Here's that same page as JSON: ../people/uk.org.publicwhip%2Fperson%2F10001.json | 10 | |
215 | 215 | Table | The table page is the heart of Datasette: it allows users to interactively explore the contents of a database table, including sorting, filtering, Full-text search and applying Facets . The HTML interface is worth spending some time exploring. As with other pages, you can return the JSON data by appending .json to the URL path, before any ? query string arguments. The query string arguments are described in more detail here: Table arguments You can also use the table page to interactively construct a SQL query - by applying different filters and a sort order for example - and then click the "View and edit SQL" link to see the SQL query that was used for the page and edit and re-submit it. Some examples: ../items lists all of the line-items registered by UK MPs as potential conflicts of interest. It demonstrates Datasette's support for Full-text search . ../antiquities-act%2Factions_under_antiquities_act is an interface for exploring the "actions under the antiquities act" data table published by FiveThirtyEight. ../global-power-plants?country_long=United+Kingdom&primary_fuel=Gas is a filtered table page showing every Gas power plant in the United Kingdom. It includes some default facets (configured using its metadata.json ) and uses the datasette-cluster-map plugin to show a map of the results. | 10 | |
214 | 214 | Database | Each database has a page listing the tables, views and canned queries available for that database. If the execute-sql permission is enabled (it's on by default) there will also be an interface for executing arbitrary SQL select queries against the data. Examples: fivethirtyeight.datasettes.com/fivethirtyeight global-power-plants.datasettes.com/global-power-plants The JSON version of this page provides programmatic access to the underlying data: fivethirtyeight.datasettes.com/fivethirtyeight.json global-power-plants.datasettes.com/global-power-plants.json | 10 | |
213 | 213 | Top-level index | The root page of any Datasette installation is an index page that lists all of the currently attached databases. Some examples: fivethirtyeight.datasettes.com global-power-plants.datasettes.com register-of-members-interests.datasettes.com Add /.json to the end of the URL for the JSON version of the underlying data: fivethirtyeight.datasettes.com/.json global-power-plants.datasettes.com/.json register-of-members-interests.datasettes.com/.json | 10 | |
212 | 212 | Pages and API endpoints | The Datasette web application offers a number of different pages that can be accessed to explore the data in question, each of which is accompanied by an equivalent JSON API. | 10 | |
211 | 211 | Custom error pages | Datasette returns an error page if an unexpected error occurs, access is forbidden or content cannot be found. You can customize the response returned for these errors by providing a custom error page template. Content not found errors use a 404.html template. Access denied errors use 403.html . Invalid input errors use 400.html . Unexpected errors of other kinds use 500.html . If a template for the specific error code is not found a template called error.html will be used instead. If you do not provide that template Datasette's default error.html template will be used. The error template will be passed the following context: status - integer The integer HTTP status code, e.g. 404, 500, 403, 400. error - string Details of the specific error, usually a full sentence. title - string or None A title for the page representing the class of error. This is often None for errors that do not provide a title separate from their error message. | 10 | |
210 | 210 | Custom redirects | You can use the custom_redirect(location) function to redirect users to another page, for example in a file called pages/datasette.html : {{ custom_redirect("https://github.com/simonw/datasette") }} Now requests to http://localhost:8001/datasette will result in a redirect. These redirects are served with a 302 Found status code by default. You can send a 301 Moved Permanently code by passing 301 as the second argument to the function: {{ custom_redirect("https://github.com/simonw/datasette", 301) }} | 10 | |
209 | 209 | Returning 404s | To indicate that content could not be found and display the default 404 page you can use the raise_404(message) function: {% if not rows %} {{ raise_404("Content not found") }} {% endif %} If you call raise_404() the other content in your template will be ignored. | 10 | |
208 | 208 | Custom headers and status codes | Custom pages default to being served with a content-type of text/html; charset=utf-8 and a 200 status code. You can change these by calling a custom function from within your template. For example, to serve a custom page with a 418 I'm a teapot HTTP status code, create a file in pages/teapot.html containing the following: {{ custom_status(418) }} <html> <head><title>Teapot</title></head> <body> I'm a teapot </body> </html> To serve a custom HTTP header, add a custom_header(name, value) function call. For example: {{ custom_status(418) }} {{ custom_header("x-teapot", "I am") }} <html> <head><title>Teapot</title></head> <body> I'm a teapot </body> </html> You can verify this is working using curl like this: $ curl -I 'http://127.0.0.1:8001/teapot' HTTP/1.1 418 date: Sun, 26 Apr 2020 18:38:30 GMT server: uvicorn x-teapot: I am content-type: text/html; charset=utf-8 | 10 | |
207 | 207 | Path parameters for pages | You can define custom pages that match multiple paths by creating files with {variable} definitions in their filenames. For example, to capture any request to a URL matching /about/* , you would create a template in the following location: templates/pages/about/{slug}.html A hit to /about/news would render that template and pass in a variable called slug with a value of "news" . If you use this mechanism don't forget to return a 404 if the referenced content could not be found. You can do this using {{ raise_404() }} described below. Templates defined using custom page routes work particularly well with the sql() template function from datasette-template-sql or the graphql() template function from datasette-graphql . | 10 | |
206 | 206 | Custom pages | You can add templated pages to your Datasette instance by creating HTML files in a pages directory within your templates directory. For example, to add a custom page that is served at http://localhost/about you would create a file in templates/pages/about.html , then start Datasette like this: $ datasette mydb.db --template-dir=templates/ You can nest directories within pages to create a nested structure. To create a http://localhost:8001/about/map page you would create templates/pages/about/map.html . | 10 | |
205 | 205 | Custom templates | By default, Datasette uses default templates that ship with the package. You can over-ride these templates by specifying a custom --template-dir like this: datasette mydb.db --template-dir=mytemplates/ Datasette will now first look for templates in that directory, and fall back on the defaults if no matches are found. It is also possible to over-ride templates on a per-database, per-row or per-table basis. The lookup rules Datasette uses are as follows: Index page (/): index.html Database page (/mydatabase): database-mydatabase.html database.html Custom query page (/mydatabase?sql=...): query-mydatabase.html query.html Canned query page (/mydatabase/canned-query): query-mydatabase-canned-query.html query-mydatabase.html query.html Table page (/mydatabase/mytable): table-mydatabase-mytable.html table.html Row page (/mydatabase/mytable/id): row-mydatabase-mytable.html row.html Table of rows and columns include on table page: _table-table-mydatabase-mytable.html _table-mydatabase-mytable.html _table.html Table of rows and columns include on row page: _table-row-mydatabase-mytable.html _table-mydatabase-mytable.html _table.html If a table name has spaces or other unexpected characters in it, the template filename will follow the same rules as our custom <body> CSS classes - for example, a table called "Food Trucks" will attempt to load the following templates: table-mydatabase-Food-Trucks-399138.html table.html You can find out which templates were considered for a specific page by viewing source on that page and looking for an HTML comment at the bottom. The comment will look something like this: <!-- Templates considered: *query-mydb-tz.html, query-mydb.html, query.html… | 10 | |
204 | 204 | Publishing static assets | The datasette publish command can be used to publish your static assets, using the same syntax as above: $ datasette publish cloudrun mydb.db --static assets:static-files/ This will upload the contents of the static-files/ directory as part of the deployment, and configure Datasette to correctly serve the assets from /assets/ . | 10 | |
203 | 203 | Serving static files | Datasette can serve static files for you, using the --static option. Consider the following directory structure: metadata.json static-files/styles.css static-files/app.js You can start Datasette using --static assets:static-files/ to serve those files from the /assets/ mount point: $ datasette -m metadata.json --static assets:static-files/ --memory The following URLs will now serve the content from those CSS and JS files: http://localhost:8001/assets/styles.css http://localhost:8001/assets/app.js You can reference those files from metadata.json like so: { "extra_css_urls": [ "/assets/styles.css" ], "extra_js_urls": [ "/assets/app.js" ] } | 10 | |
202 | 202 | CSS classes on the <body> | Every default template includes CSS classes in the body designed to support custom styling. The index template (the top level page at / ) gets this: <body class="index"> The database template ( /dbname ) gets this: <body class="db db-dbname"> The custom SQL template ( /dbname?sql=... ) gets this: <body class="query db-dbname"> A canned query template ( /dbname/queryname ) gets this: <body class="query db-dbname query-queryname"> The table template ( /dbname/tablename ) gets: <body class="table db-dbname table-tablename"> The row template ( /dbname/tablename/rowid ) gets: <body class="row db-dbname table-tablename"> The db-x and table-x classes use the database or table names themselves if they are valid CSS identifiers. If they aren't, we strip any invalid characters out and append a 6 character md5 digest of the original name, in order to ensure that multiple tables which resolve to the same stripped character version still have different CSS classes. Some examples: "simple" => "simple" "MixedCase" => "MixedCase" "-no-leading-hyphens" => "no-leading-hyphens-65bea6" "_no-leading-underscores" => "no-leading-underscores-b921bc" "no spaces" => "no-spaces-7088d7" "-" => "336d5e" "no $ characters" => "no--characters-59e024" <td> and <th> elements also get custom CSS classes reflecting the database column they are representing, for example: <table> <thead> <tr> <th class="col-id" scope="col">id</th> <th class="col-name" scope="col">name</th> </tr> </thead> <tbody> <tr> <td class="col-id"><a href="...">1</a></td> … | 10 | |
201 | 201 | Custom CSS and JavaScript | When you launch Datasette, you can specify a custom metadata file like this: datasette mydb.db --metadata metadata.json Your metadata.json file can include links that look like this: { "extra_css_urls": [ "https://simonwillison.net/static/css/all.bf8cd891642c.css" ], "extra_js_urls": [ "https://code.jquery.com/jquery-3.2.1.slim.min.js" ] } The extra CSS and JavaScript files will be linked in the <head> of every page: <link rel="stylesheet" href="https://simonwillison.net/static/css/all.bf8cd891642c.css"> <script src="https://code.jquery.com/jquery-3.2.1.slim.min.js"></script> You can also specify a SRI (subresource integrity hash) for these assets: { "extra_css_urls": [ { "url": "https://simonwillison.net/static/css/all.bf8cd891642c.css", "sri": "sha384-9qIZekWUyjCyDIf2YK1FRoKiPJq4PHt6tp/ulnuuyRBvazd0hG7pWbE99zvwSznI" } ], "extra_js_urls": [ { "url": "https://code.jquery.com/jquery-3.2.1.slim.min.js", "sri": "sha256-k2WSCIexGzOj3Euiig+TlR8gA0EmPjuc79OEeY5L45g=" } ] } This will produce: <link rel="stylesheet" href="https://simonwillison.net/static/css/all.bf8cd891642c.css" integrity="sha384-9qIZekWUyjCyDIf2YK1FRoKiPJq4PHt6tp/ulnuuyRBvazd0hG7pWbE99zvwSznI" crossorigin="anonymous"> <script src="https://code.jquery.com/jquery-3.2.1.slim.min.js" integrity="sha256-k2WSCIexGzOj3Euiig+TlR8gA0EmPjuc79OEeY5L45g=" crossorigin="anonymous"></script> Modern browsers will only execute the stylesheet or JavaScript if the SRI hash matches the content served. You can generate hashes using www.srihash.org Items in "extra_js_urls" can specify "module": true if they reference JavaScript that uses JavaScript modules . This configuration: { "extra_js_urls": [ { "url": "https://exa… | 10 | |
200 | 200 | Custom pages and templates | Datasette provides a number of ways of customizing the way data is displayed. | 10 | |
199 | 199 | Upgrading CodeMirror | Datasette bundles CodeMirror for the SQL editing interface, e.g. on this page . Here are the steps for upgrading to a new version of CodeMirror: Download and extract latest CodeMirror zip file from https://codemirror.net/codemirror.zip Rename lib/codemirror.js to codemirror-5.57.0.js (using latest version number) Rename lib/codemirror.css to codemirror-5.57.0.css Rename mode/sql/sql.js to codemirror-5.57.0-sql.js Edit both JavaScript files to make the top license comment a /* */ block instead of multiple // lines Minify the JavaScript files like this: npx uglify-js codemirror-5.57.0.js -o codemirror-5.57.0.min.js --comments '/LICENSE/' npx uglify-js codemirror-5.57.0-sql.js -o codemirror-5.57.0-sql.min.js --comments '/LICENSE/' Check that the LICENSE comment did indeed survive minification Minify the CSS file like this: npx clean-css-cli codemirror-5.57.0.css -o codemirror-5.57.0.min.css Edit the _codemirror.html template to reference the new files git rm the old files, git add the new files | 10 | |
198 | 198 | Releasing bug fixes from a branch | If it's necessary to publish a bug fix release without shipping new features that have landed on main a release branch can be used. Create it from the relevant last tagged release like so: git branch 0.52.x 0.52.4 git checkout 0.52.x Next cherry-pick the commits containing the bug fixes: git cherry-pick COMMIT Write the release notes in the branch, and update the version number in version.py . Then push the branch: git push -u origin 0.52.x Once the tests have completed, publish the release from that branch target using the GitHub Draft a new release form. Finally, cherry-pick the commit with the release notes and version number bump across to main : git checkout main git cherry-pick COMMIT git push | 10 | |
197 | 197 | Alpha and beta releases | Alpha and beta releases are published to preview upcoming features that may not yet be stable - in particular to preview new plugin hooks. You are welcome to try these out, but please be aware that details may change before the final release. Please join discussions on the issue tracker to share your thoughts and experiences with alpha and beta features that you try out. | 10 | |
196 | 196 | Release process | Datasette releases are performed using tags. When a new release is published on GitHub, a GitHub Action workflow will perform the following: Run the unit tests against all supported Python versions. If the tests pass... Build a Docker image of the release and push a tag to https://hub.docker.com/r/datasetteproject/datasette Re-point the "latest" tag on Docker Hub to the new image Build a wheel bundle of the underlying Python source code Push that new wheel up to PyPI: https://pypi.org/project/datasette/ To deploy new releases you will need to have push access to the main Datasette GitHub repository. Datasette follows Semantic Versioning : major.minor.patch We increment major for backwards-incompatible releases. Datasette is currently pre-1.0 so the major version is always 0 . We increment minor for new features. We increment patch for bugfix releases. Alpha and beta releases may have an additional a0 or b0 suffix - the integer component will be incremented with each subsequent alpha or beta. To release a new version, first create a commit that updates the version number in datasette/version.py and the changelog with highlights of the new version. An example commit can be seen here : # Update changelog git commit -m " Release 0.51a1 Refs #1056, #1039, #998, #1045, #1033, #1036, #1034, #976, #1057, #1058, #1053, #1064, #1066" -a git push Referencing the issues that are part of the release in the commit message ensures the name of the release shows up on those issue pages, e.g. here . You can generate the list of issue referen… | 10 | |
195 | 195 | Continuously deployed demo instances | The demo instance at latest.datasette.io is re-deployed automatically to Google Cloud Run for every push to main that passes the test suite. This is implemented by the GitHub Actions workflow at .github/workflows/deploy-latest.yml . Specific branches can also be set to automatically deploy by adding them to the on: push: branches block at the top of the workflow YAML file. Branches configured in this way will be deployed to a new Cloud Run service whether or not their tests pass. The Cloud Run URL for a branch demo can be found in the GitHub Actions logs. | 10 | |
194 | 194 | Running Cog | Some pages of documentation (in particular the CLI reference ) are automatically updated using Cog . To update these pages, run the following command: cog -r docs/*.rst | 10 | |
193 | 193 | Editing and building the documentation | Datasette's documentation lives in the docs/ directory and is deployed automatically using Read The Docs . The documentation is written using reStructuredText. You may find this article on The subset of reStructuredText worth committing to memory useful. You can build it locally by installing sphinx and sphinx_rtd_theme in your Datasette development environment and then running make html directly in the docs/ directory: # You may first need to activate your virtual environment: source venv/bin/activate # Install the dependencies needed to build the docs pip install -e .[docs] # Now build the docs cd docs/ make html This will create the HTML version of the documentation in docs/_build/html . You can open it in your browser like so: open _build/html/index.html Any time you make changes to a .rst file you can re-run make html to update the built documents, then refresh them in your browser. For added productivity, you can use sphinx-autobuild to run Sphinx in auto-build mode. This will run a local webserver serving the docs that automatically rebuilds them and refreshes the page any time you hit save in your editor. sphinx-autobuild will have been installed when you ran pip install -e .[docs] . In your docs/ directory you can start the server by running the following: make livehtml Now browse to http://localhost:8000/ to view the documentation. Any edits you make should be instantly reflected in your browser. | 10 | |
192 | 192 | Prettier | To install Prettier, install Node.js and then run the following in the root of your datasette repository checkout: $ npm install This will install Prettier in a node_modules directory. You can then check that your code matches the coding style like so: $ npm run prettier -- --check > prettier > prettier 'datasette/static/*[!.min].js' "--check" Checking formatting... [warn] datasette/static/plugins.js [warn] Code style issues found in the above file(s). Forgot to run Prettier? You can fix any problems by running: $ npm run fix | 10 | |
191 | 191 | blacken-docs | The blacken-docs command applies Black formatting rules to code examples in the documentation. Run it like this: blacken-docs -l 60 docs/*.rst | 10 | |
190 | 190 | Running Black | Black will be installed when you run pip install -e '.[test]' . To test that your code complies with Black, run the following in your root datasette repository checkout: $ black . --check All done! ✨ 🍰 ✨ 95 files would be left unchanged. If any of your code does not conform to Black you can run this to automatically fix those problems: $ black . reformatted ../datasette/setup.py All done! ✨ 🍰 ✨ 1 file reformatted, 94 files left unchanged. | 10 | |
189 | 189 | Code formatting | Datasette uses opinionated code formatters: Black for Python and Prettier for JavaScript. These formatters are enforced by Datasette's continuous integration: if a commit includes Python or JavaScript code that does not match the style enforced by those tools, the tests will fail. When developing locally, you can verify and correct the formatting of your code using these tools. | 10 | |
188 | 188 | Debugging | Any errors that occur while Datasette is running will display a stack trace on the console. You can tell Datasette to open an interactive pdb debugger session if an error occurs using the --pdb option: datasette --pdb fixtures.db | 10 | |
187 | 187 | Using fixtures | To run Datasette itself, type datasette . You're going to need at least one SQLite database. A quick way to get started is to use the fixtures database that Datasette uses for its own tests. You can create a copy of that database by running this command: python tests/fixtures.py fixtures.db Now you can run Datasette against the new fixtures database like so: datasette fixtures.db This will start a server at http://127.0.0.1:8001/ . Any changes you make in the datasette/templates or datasette/static folder will be picked up immediately (though you may need to do a force-refresh in your browser to see changes to CSS or JavaScript). If you want to change Datasette's Python code you can use the --reload option to cause Datasette to automatically reload any time the underlying code changes: datasette --reload fixtures.db You can also use the fixtures.py script to recreate the testing version of metadata.json used by the unit tests. To do that: python tests/fixtures.py fixtures.db fixtures-metadata.json Or to output the plugins used by the tests, run this: python tests/fixtures.py fixtures.db fixtures-metadata.json fixtures-plugins Test tables written to fixtures.db - metadata written to fixtures-metadata.json Wrote plugin: fixtures-plugins/register_output_renderer.py Wrote plugin: fixtures-plugins/view_name.py Wrote plugin: fixtures-plugins/my_plugin.py Wrote plugin: fixtures-plugins/messages_output_renderer.py Wrote plugin: fixtures-plugins/my_plugin_2.py Then run Datasette like this: datasette fixtures.db -m fixtures-metadata.json --plugins-dir=fixtures-plugins/ | 10 | |
186 | 186 | Running the tests | Once you have done this, you can run the Datasette unit tests from inside your datasette/ directory using pytest like so: pytest You can run the tests faster using multiple CPU cores with pytest-xdist like this: pytest -n auto -m "not serial" -n auto detects the number of available cores automatically. The -m "not serial" skips tests that don't work well in a parallel test environment. You can run those tests separately like so: pytest -m "serial" | 10 | |
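If you are writing new tests, much of Datasette's test suite exercises the application through its internal client rather than a real HTTP server. A minimal sketch of that pattern, assuming pytest-asyncio is installed and noting that this test is illustrative rather than taken from the real suite:

```python
import pytest
from datasette.app import Datasette


@pytest.mark.asyncio
async def test_instance_responds():
    # Start an in-memory Datasette instance - no database files are needed
    ds = Datasette(memory=True)
    # datasette.client sends requests straight to the ASGI app, no network
    response = await ds.client.get("/-/versions.json")
    assert response.status_code == 200
```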
185 | 185 | Setting up a development environment | If you have Python 3.7 or higher installed on your computer (on OS X the quickest way to do this is using homebrew ) you can install an editable copy of Datasette using the following steps. If you want to use GitHub to publish your changes, first create a fork of datasette under your own GitHub account. Now clone that repository somewhere on your computer: git clone git@github.com:YOURNAME/datasette If you want to get started without creating your own fork, you can do this instead: git clone git@github.com:simonw/datasette The next step is to create a virtual environment for your project and use it to install Datasette's dependencies: cd datasette # Create a virtual environment in ./venv python3 -m venv ./venv # Now activate the virtual environment, so pip can install into it source venv/bin/activate # Install Datasette and its testing dependencies python3 -m pip install -e '.[test]' That last line does most of the work: pip install -e means "install this package in a way that allows me to edit the source code in place". The .[test] option means "use the setup.py in this directory and install the optional testing dependencies as well". | 10 | |
184 | 184 | General guidelines | main should always be releasable . Incomplete features should live in branches. This ensures that any small bug fixes can be quickly released. The ideal commit should bundle together the implementation, unit tests and associated documentation updates. The commit message should link to an associated issue. New plugin hooks should only be shipped if accompanied by a separate release of a non-demo plugin that uses them. | 10 | |
183 | 183 | Contributing | Datasette is an open source project. We welcome contributions! This document describes how to contribute to Datasette core. You can also contribute to the wider Datasette ecosystem by creating new Plugins . | 10 | |
182 | 182 | Import shortcuts | The following commonly used symbols can be imported directly from the datasette module: from datasette import Response from datasette import Forbidden from datasette import NotFound from datasette import hookimpl from datasette import actor_matches_allow | 10 | |
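A short, hypothetical plugin showing several of these shortcuts used together - the route and view function here are illustrative, not part of Datasette itself:

```python
from datasette import hookimpl, Response, NotFound


@hookimpl
def register_routes():
    async def show_item(request):
        item_id = request.url_vars["item_id"]
        if item_id == "0":
            # Raising NotFound renders Datasette's standard 404 page
            raise NotFound("Item not found")
        return Response.json({"item_id": item_id})

    return [(r"^/items/(?P<item_id>\d+)$", show_item)]
```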
181 | 181 | Tracing child tasks | If your code uses a mechanism such as asyncio.gather() to execute code in additional tasks, you may find that some of the traces are missing from the display. You can use the trace_child_tasks() context manager to ensure these child tasks are correctly handled. from datasette import tracer with tracer.trace_child_tasks(): results = await asyncio.gather( # ... async tasks here ) This example uses the register_routes() plugin hook to add a page at /parallel-queries which executes two SQL queries in parallel using asyncio.gather() and returns their results. import asyncio from datasette import hookimpl from datasette import tracer from datasette import Response @hookimpl def register_routes(): async def parallel_queries(datasette): db = datasette.get_database() with tracer.trace_child_tasks(): one, two = await asyncio.gather( db.execute("select 1"), db.execute("select 2"), ) return Response.json( { "one": one.single_value(), "two": two.single_value(), } ) return [ (r"/parallel-queries$", parallel_queries), ] Adding ?_trace=1 will show that the trace covers both of those child tasks. | 10 | |
180 | 180 | datasette.tracer | Running Datasette with --setting trace_debug 1 enables trace debug output, which can then be viewed by adding ?_trace=1 to the query string for any page. You can see an example of this at the bottom of latest.datasette.io/fixtures/facetable?_trace=1 . The JSON output shows full details of every SQL query that was executed to generate the page. The datasette-pretty-traces plugin can be installed to provide a more readable display of this information. You can see a demo of that here . You can add your own custom traces to the JSON output using the trace() context manager. This takes a string that identifies the type of trace being recorded, and records any keyword arguments as additional JSON keys on the resulting trace object. The start and end time, duration and a traceback of where the trace was executed will be automatically attached to the JSON object. This example uses trace to record the start, end and duration of any HTTP GET requests made using the function: from datasette.tracer import trace import httpx async def fetch_url(url): with trace("fetch-url", url=url): async with httpx.AsyncClient() as client: return await client.get(url) | 10 | |
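The keyword arguments passed to trace() can be anything JSON-serializable and appear as extra keys on the trace object. Another small sketch, this time for a non-HTTP operation - the cache here is purely illustrative, not part of Datasette:

```python
from datasette.tracer import trace

_cache = {}


def cached_lookup(key):
    hit = key in _cache
    # "cache-lookup" is the trace type; key= and hit= become JSON keys
    with trace("cache-lookup", key=key, hit=hit):
        return _cache.get(key)
```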
179 | 179 | Tilde encoding | Datasette uses a custom encoding scheme in some places, called tilde encoding . This is primarily used for table names and row primary keys, to avoid any confusion between / characters in those values and the Datasette URLs that reference them. Tilde encoding uses the same algorithm as URL percent-encoding , but with the ~ tilde character used in place of % . Any character other than ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz0123456789_- will be replaced by the numeric equivalent preceded by a tilde. For example: / becomes ~2F . becomes ~2E % becomes ~25 ~ becomes ~7E Space becomes + polls/2022.primary becomes polls~2F2022~2Eprimary Note that the space character is a special case: it will be replaced with a + symbol. datasette.utils.tilde_encode(s: str) -> str Returns tilde-encoded string - for example /foo/bar -> ~2Ffoo~2Fbar datasette.utils.tilde_decode(s: str) -> str Decodes a tilde-encoded string, so ~2Ffoo~2Fbar -> /foo/bar | 10 |
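Both helpers are importable from datasette.utils. A quick usage sketch based on the polls/2022.primary example above:

```python
from datasette.utils import tilde_encode, tilde_decode

# Encode a value that contains / and . characters
encoded = tilde_encode("polls/2022.primary")
print(encoded)  # polls~2F2022~2Eprimary

# Decoding reverses the transformation
assert tilde_decode(encoded) == "polls/2022.primary"
```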