sections
478 rows sorted by title
This data as json, CSV (advanced)
id | page | ref | title ▼ | content | breadcrumbs | references |
---|---|---|---|---|---|---|
pages:rowview | pages | rowview | Row | Every row in every Datasette table has its own URL. This means individual records can be linked to directly. Table cells with extremely long text contents are truncated on the table view according to the truncate_cells_html setting. If a cell has been truncated the full length version of that cell will be available on the row page. Rows which are the targets of foreign key references from other tables will show a link to a filtered search for all records that reference that row. Here's an example from the Registers of Members Interests database: ../people/uk.org.publicwhip%2Fperson%2F10001 Note that this URL includes the encoded primary key of the record. Here's that same page as JSON: ../people/uk.org.publicwhip%2Fperson%2F10001.json | ["Pages and API endpoints"] | [{"href": "https://register-of-members-interests.datasettes.com/regmem/people/uk.org.publicwhip%2Fperson%2F10001", "label": "../people/uk.org.publicwhip%2Fperson%2F10001"}, {"href": "https://register-of-members-interests.datasettes.com/regmem/people/uk.org.publicwhip%2Fperson%2F10001.json", "label": "../people/uk.org.publicwhip%2Fperson%2F10001.json"}] |
contributing:contributing-formatting-black | contributing | contributing-formatting-black | Running Black | Black will be installed when you run pip install -e '.[test]' . To test that your code complies with Black, run the following in your root datasette repository checkout: $ black . --check All done! ✨ 🍰 ✨ 95 files would be left unchanged. If any of your code does not conform to Black you can run this to automatically fix those problems: $ black . reformatted ../datasette/setup.py All done! ✨ 🍰 ✨ 1 file reformatted, 94 files left unchanged. | ["Contributing", "Code formatting"] | [] |
contributing:contributing-documentation-cog | contributing | contributing-documentation-cog | Running Cog | Some pages of documentation (in particular the CLI reference ) are automatically updated using Cog . To update these pages, run the following command: cog -r docs/*.rst | ["Contributing", "Editing and building the documentation"] | [{"href": "https://github.com/nedbat/cog", "label": "Cog"}] |
changelog:running-datasette-behind-a-proxy | changelog | running-datasette-behind-a-proxy | Running Datasette behind a proxy | The base_url configuration option is designed to help run Datasette on a specific path behind a proxy - for example if you want to run an instance of Datasette at /my-datasette/ within your existing site's URL hierarchy, proxied behind nginx or Apache. Support for this configuration option has been greatly improved ( #1023 ), and guidelines for using it are now available in a new documentation section on Running Datasette behind a proxy . ( #1027 ) | ["Changelog", "0.51 (2020-10-31)"] | [{"href": "https://github.com/simonw/datasette/issues/1023", "label": "#1023"}, {"href": "https://github.com/simonw/datasette/issues/1027", "label": "#1027"}] |
deploying:deploying-proxy | deploying | deploying-proxy | Running Datasette behind a proxy | You may wish to run Datasette behind an Apache or nginx proxy, using a path within your existing site. You can use the base_url configuration setting to tell Datasette to serve traffic with a specific URL prefix. For example, you could run Datasette like this: datasette my-database.db --setting base_url /my-datasette/ -p 8009 This will run Datasette with the following URLs: http://127.0.0.1:8009/my-datasette/ - the Datasette homepage http://127.0.0.1:8009/my-datasette/my-database - the page for the my-database.db database http://127.0.0.1:8009/my-datasette/my-database/some_table - the page for the some_table table You can now set your nginx or Apache server to proxy the /my-datasette/ path to this Datasette instance. | ["Deploying Datasette"] | [] |
deploying:deploying-openrc | deploying | deploying-openrc | Running Datasette using OpenRC | OpenRC is the service manager on non-systemd Linux distributions like Alpine Linux and Gentoo . Create an init script at /etc/init.d/datasette with the following contents: #!/sbin/openrc-run name="datasette" command="datasette" command_args="serve -h 0.0.0.0 /path/to/db.db" command_background=true pidfile="/run/${RC_SVCNAME}.pid" You then need to configure the service to run at boot and start it: rc-update add datasette rc-service datasette start | ["Deploying Datasette"] | [{"href": "https://www.alpinelinux.org/", "label": "Alpine Linux"}, {"href": "https://www.gentoo.org/", "label": "Gentoo"}] |
deploying:deploying-systemd | deploying | deploying-systemd | Running Datasette using systemd | You can run Datasette on Ubuntu or Debian systems using systemd . First, ensure you have Python 3 and pip installed. On Ubuntu you can use sudo apt-get install python3 python3-pip . You can install Datasette into a virtual environment, or you can install it system-wide. To install system-wide, use sudo pip3 install datasette . Now create a folder for your Datasette databases, for example using mkdir /home/ubuntu/datasette-root . You can copy a test database into that folder like so: cd /home/ubuntu/datasette-root curl -O https://latest.datasette.io/fixtures.db Create a file at /etc/systemd/system/datasette.service with the following contents: [Unit] Description=Datasette After=network.target [Service] Type=simple User=ubuntu Environment=DATASETTE_SECRET= WorkingDirectory=/home/ubuntu/datasette-root ExecStart=datasette serve . -h 127.0.0.1 -p 8000 Restart=on-failure [Install] WantedBy=multi-user.target Add a random value for the DATASETTE_SECRET - this will be used to sign Datasette cookies such as the CSRF token cookie. You can generate a suitable value like so: $ python3 -c 'import secrets; print(secrets.token_hex(32))' This configuration will run Datasette against all database files contained in the /home/ubuntu/datasette-root directory. If that directory contains a metadata.yml (or .json ) file or a templates/ or plugins/ sub-directory those will automatically be loaded by Datasette - see Configuration directory mode for details. You can start the Datasette process running using the following: sudo systemctl daemon-reload sudo systemctl start datasette.service You will need to restart the Datasette service after making changes to its metadata.json configuration or adding a new database file to that directory. You can do that using: sudo systemctl restart datasette.service Once th… | ["Deploying Datasette"] | [] |
sql_queries:sql | sql_queries | sql | Running SQL queries | Datasette treats SQLite database files as read-only and immutable. This means it is not possible to execute INSERT or UPDATE statements using Datasette, which allows us to expose SELECT statements to the outside world without needing to worry about SQL injection attacks. The easiest way to execute custom SQL against Datasette is through the web UI. The database index page includes a SQL editor that lets you run any SELECT query you like. You can also construct queries using the filter interface on the tables page, then click "View and edit SQL" to open that query in the custom SQL editor. Note that this interface is only available if the execute-sql permission is allowed. Any Datasette SQL query is reflected in the URL of the page, allowing you to bookmark them, share them with others and navigate through previous queries using your browser back button. You can also retrieve the results of any query as JSON by adding .json to the base URL. | [] | [] |
contributing:contributing-running-tests | contributing | contributing-running-tests | Running the tests | Once you have done this, you can run the Datasette unit tests from inside your datasette/ directory using pytest like so: pytest You can run the tests faster using multiple CPU cores with pytest-xdist like this: pytest -n auto -m "not serial" -n auto detects the number of available cores automatically. The -m "not serial" skips tests that don't work well in a parallel test environment. You can run those tests separately like so: pytest -m "serial" | ["Contributing"] | [{"href": "https://docs.pytest.org/", "label": "pytest"}, {"href": "https://pypi.org/project/pytest-xdist/", "label": "pytest-xdist"}] |
full_text_search:full-text-search-custom-sql | full_text_search | full-text-search-custom-sql | Searches using custom SQL | You can include full-text search results in custom SQL queries. The general pattern with SQLite search is to run the search as a sub-select that returns rowid values, then include those rowids in another part of the query. You can see the syntax for a basic search by running that search on a table page and then clicking "View and edit SQL" to see the underlying SQL. For example, consider this search for manafort is the US FARA database : /fara/FARA_All_ShortForms?_search=manafort If you click View and edit SQL you'll see that the underlying SQL looks like this: select rowid, Short_Form_Termination_Date, Short_Form_Date, Short_Form_Last_Name, Short_Form_First_Name, Registration_Number, Registration_Date, Registrant_Name, Address_1, Address_2, City, State, Zip from FARA_All_ShortForms where rowid in ( select rowid from FARA_All_ShortForms_fts where FARA_All_ShortForms_fts match escape_fts(:search) ) order by rowid limit 101 | ["Full-text search"] | [{"href": "https://fara.datasettes.com/fara/FARA_All_ShortForms?_search=manafort", "label": "manafort is the US FARA database"}, {"href": "https://fara.datasettes.com/fara?sql=select%0D%0A++rowid%2C%0D%0A++Short_Form_Termination_Date%2C%0D%0A++Short_Form_Date%2C%0D%0A++Short_Form_Last_Name%2C%0D%0A++Short_Form_First_Name%2C%0D%0A++Registration_Number%2C%0D%0A++Registration_Date%2C%0D%0A++Registrant_Name%2C%0D%0A++Address_1%2C%0D%0A++Address_2%2C%0D%0A++City%2C%0D%0A++State%2C%0D%0A++Zip%0D%0Afrom%0D%0A++FARA_All_ShortForms%0D%0Awhere%0D%0A++rowid+in+%28%0D%0A++++select%0D%0A++++++rowid%0D%0A++++from%0D%0A++++++FARA_All_ShortForms_fts%0D%0A++++where%0D%0A++++++FARA_All_ShortForms_fts+match+escape_fts%28%3Asearch%29%0D%0A++%29%0D%0Aorder+by%0D%0A++rowid%0D%0Alimit%0D%0A++101&search=manafort", "label": "View and edit SQL"}] |
plugins:plugins-configuration-secret | plugins | plugins-configuration-secret | Secret configuration values | Any values embedded in metadata.json will be visible to anyone who views the /-/metadata page of your Datasette instance. Some plugins may need configuration that should stay secret - API keys for example. There are two ways in which you can store secret configuration values. As environment variables . If your secret lives in an environment variable that is available to the Datasette process, you can indicate that the configuration value should be read from that environment variable like so: { "plugins": { "datasette-auth-github": { "client_secret": { "$env": "GITHUB_CLIENT_SECRET" } } } } As values in separate files . Your secrets can also live in files on disk. To specify a secret should be read from a file, provide the full file path like this: { "plugins": { "datasette-auth-github": { "client_secret": { "$file": "/secrets/client-secret" } } } } If you are publishing your data using the datasette publish family of commands, you can use the --plugin-secret option to set these secrets at publish time. For example, using Heroku you might run the following command: $ datasette publish heroku my_database.db \ --name my-heroku-app-demo \ --install=datasette-auth-github \ --plugin-secret datasette-auth-github client_id your_client_id \ --plugin-secret datasette-auth-github client_secret your_client_secret This will set the necessary environment variables and add the following to the deployed metadata.json : { "plugins": { "datasette-auth-github": { "client_id": { "$env": "DATASETTE_AUTH_GITHUB_CLIENT_ID" }, "client_secret": { "$env": "DATASETTE_AUTH_GITHUB_CLIENT_SECRET" } } } } | ["Plugins", "Plugin configuration"] | [] |
changelog:secret-plugin-configuration-options | changelog | secret-plugin-configuration-options | Secret plugin configuration options | Plugins like datasette-auth-github need a safe way to set secret configuration options. Since the default mechanism for configuring plugins exposes those settings in /-/metadata a new mechanism was needed. Secret configuration values describes how plugins can now specify that their settings should be read from a file or an environment variable: { "plugins": { "datasette-auth-github": { "client_secret": { "$env": "GITHUB_CLIENT_SECRET" } } } } These plugin secrets can be set directly using datasette publish . See Custom metadata and plugins for details. ( #538 and #543 ) | ["Changelog", "0.29 (2019-07-07)"] | [{"href": "https://github.com/simonw/datasette-auth-github", "label": "datasette-auth-github"}, {"href": "https://github.com/simonw/datasette/issues/538", "label": "#538"}, {"href": "https://github.com/simonw/datasette/issues/543", "label": "#543"}] |
plugins:plugins-installed | plugins | plugins-installed | Seeing what plugins are installed | You can see a list of installed plugins by navigating to the /-/plugins page of your Datasette instance - for example: https://fivethirtyeight.datasettes.com/-/plugins You can also use the datasette plugins command: $ datasette plugins [ { "name": "datasette_json_html", "static": false, "templates": false, "version": "0.4.0" } ] [[[cog from datasette import cli from click.testing import CliRunner import textwrap, json cog.out("\n") result = CliRunner().invoke(cli.cli, ["plugins", "--all"]) # cog.out() with text containing newlines was unindenting for some reason cog.outl("If you run ``datasette plugins --all`` it will include default plugins that ship as part of Datasette::\n") plugins = [p for p in json.loads(result.output) if p["name"].startswith("datasette.")] indented = textwrap.indent(json.dumps(plugins, indent=4), " ") for line in indented.split("\n"): cog.outl(line) cog.out("\n\n") ]]] If you run datasette plugins --all it will include default plugins that ship as part of Datasette: [ { "name": "datasette.actor_auth_cookie", "static": false, "templates": false, "version": null, "hooks": [ "actor_from_request" ] }, { "name": "datasette.blob_renderer", "static": false, "templates": false, "version": null, "hooks": [ "register_output_renderer" ] }, { "name": "datasette.default_magic_parameters", "static": false, "templates": false, "version": null, "hooks": [ "register_magic_parameters" ] }, { "name": "datasette.default_menu_links", "static": false, "templates": false, "version": null, "hooks": [ "menu_links" ] }, { "name": "datasette.default_permissions", "static": false, "templ… | ["Plugins"] | [{"href": "https://fivethirtyeight.datasettes.com/-/plugins", "label": "https://fivethirtyeight.datasettes.com/-/plugins"}] |
custom_templates:customization-static-files | custom_templates | customization-static-files | Serving static files | Datasette can serve static files for you, using the --static option. Consider the following directory structure: metadata.json static-files/styles.css static-files/app.js You can start Datasette using --static assets:static-files/ to serve those files from the /assets/ mount point: $ datasette -m metadata.json --static assets:static-files/ --memory The following URLs will now serve the content from those CSS and JS files: http://localhost:8001/assets/styles.css http://localhost:8001/assets/app.js You can reference those files from metadata.json like so: { "extra_css_urls": [ "/assets/styles.css" ], "extra_js_urls": [ "/assets/app.js" ] } | ["Custom pages and templates", "Custom CSS and JavaScript"] | [] |
metadata:metadata-page-size | metadata | metadata-page-size | Setting a custom page size | Datasette defaults to displaying 100 rows per page, for both tables and views. You can change this default page size on a per-table or per-view basis using the "size" key in metadata.json : { "databases": { "mydatabase": { "tables": { "example_table": { "size": 10 } } } } } This size can still be over-ridden by passing e.g. ?_size=50 in the query string. | ["Metadata"] | [] |
metadata:metadata-default-sort | metadata | metadata-default-sort | Setting a default sort order | By default Datasette tables are sorted by primary key. You can over-ride this default for a specific table using the "sort" or "sort_desc" metadata properties: { "databases": { "mydatabase": { "tables": { "example_table": { "sort": "created" } } } } } Or use "sort_desc" to sort in descending order: { "databases": { "mydatabase": { "tables": { "example_table": { "sort_desc": "created" } } } } } | ["Metadata"] | [] |
internals:internals-response-set-cookie | internals | internals-response-set-cookie | Setting cookies with response.set_cookie() | To set cookies on the response, use the response.set_cookie(...) method. The method signature looks like this: def set_cookie( self, key, value="", max_age=None, expires=None, path="/", domain=None, secure=False, httponly=False, samesite="lax", ): ... You can use this with datasette.sign() to set signed cookies. Here's how you would set the ds_actor cookie for use with Datasette authentication : response = Response.redirect("/") response.set_cookie( "ds_actor", datasette.sign({"a": {"id": "cleopaws"}}, "actor"), ) return response | ["Internals for plugins", "Response class"] | [] |
testing_plugins:testing-plugins-datasette-test-instance | testing_plugins | testing-plugins-datasette-test-instance | Setting up a Datasette test instance | The above example shows the easiest way to start writing tests against a Datasette instance: from datasette.app import Datasette import pytest @pytest.mark.asyncio async def test_plugin_is_installed(): datasette = Datasette(memory=True) response = await datasette.client.get("/-/plugins.json") assert response.status_code == 200 Creating a Datasette() instance like this as useful shortcut in tests, but there is one detail you need to be aware of. It's important to ensure that the async method .invoke_startup() is called on that instance. You can do that like this: datasette = Datasette(memory=True) await datasette.invoke_startup() This method registers any startup(datasette) or prepare_jinja2_environment(env, datasette) plugins that might themselves need to make async calls. If you are using await datasette.client.get() and similar methods then you don't need to worry about this - Datasette automatically calls invoke_startup() the first time it handles a request. | ["Testing plugins"] | [] |
contributing:devenvironment | contributing | devenvironment | Setting up a development environment | If you have Python 3.7 or higher installed on your computer (on OS X the quickest way to do this is using homebrew ) you can install an editable copy of Datasette using the following steps. If you want to use GitHub to publish your changes, first create a fork of datasette under your own GitHub account. Now clone that repository somewhere on your computer: git clone git@github.com:YOURNAME/datasette If you want to get started without creating your own fork, you can do this instead: git clone git@github.com:simonw/datasette The next step is to create a virtual environment for your project and use it to install Datasette's dependencies: cd datasette # Create a virtual environment in ./venv python3 -m venv ./venv # Now activate the virtual environment, so pip can install into it source venv/bin/activate # Install Datasette and its testing dependencies python3 -m pip install -e '.[test]' That last line does most of the work: pip install -e means "install this package in a way that allows me to edit the source code in place". The .[test] option means "use the setup.py in this directory and install the optional testing dependencies as well". | ["Contributing"] | [{"href": "https://docs.python-guide.org/starting/install3/osx/", "label": "is using homebrew"}, {"href": "https://github.com/simonw/datasette/fork", "label": "create a fork of datasette"}] |
metadata:metadata-sortable-columns | metadata | metadata-sortable-columns | Setting which columns can be used for sorting | Datasette allows any column to be used for sorting by default. If you need to control which columns are available for sorting you can do so using the optional sortable_columns key: { "databases": { "database1": { "tables": { "example_table": { "sortable_columns": [ "height", "weight" ] } } } } } This will restrict sorting of example_table to just the height and weight columns. You can also disable sorting entirely by setting "sortable_columns": [] You can use sortable_columns to enable specific sort orders for a view called name_of_view in the database my_database like so: { "databases": { "my_database": { "tables": { "name_of_view": { "sortable_columns": [ "clicks", "impressions" ] } } } } } | ["Metadata"] | [] |
settings:id1 | settings | id1 | Settings | [] | [] | |
settings:id2 | settings | id2 | Settings | The following options can be set using --setting name value , or by storing them in the settings.json file for use with Configuration directory mode . | ["Settings"] | [] |
changelog:signed-values-and-secrets | changelog | signed-values-and-secrets | Signed values and secrets | Both flash messages and user authentication needed a way to sign values and set signed cookies. Two new methods are now available for plugins to take advantage of this mechanism: .sign(value, namespace="default") and .unsign(value, namespace="default") . Datasette will generate a secret automatically when it starts up, but to avoid resetting the secret (and hence invalidating any cookies) every time the server restarts you should set your own secret. You can pass a secret to Datasette using the new --secret option or with a DATASETTE_SECRET environment variable. See Configuring the secret for more details. You can also set a secret when you deploy Datasette using datasette publish or datasette package - see Using secrets with datasette publish . Plugins can now sign values and verify their signatures using the datasette.sign() and datasette.unsign() methods. | ["Changelog", "0.44 (2020-06-11)"] | [] |
changelog:id83 | changelog | id83 | Small changes | We now show the size of the database file next to the download link ( #172 ) New /-/databases introspection page shows currently connected databases ( #470 ) Binary data is no longer displayed on the table and row pages ( #442 - thanks, Russ Garrett) New show/hide SQL links on custom query pages ( #415 ) The extra_body_script plugin hook now accepts an optional view_name argument ( #443 - thanks, Russ Garrett) Bumped Jinja2 dependency to 2.10.1 ( #426 ) All table filters are now documented, and documentation is enforced via unit tests ( 2c19a27 ) New project guideline: master should stay shippable at all times! ( 31f36e1 ) Fixed a bug where sqlite_timelimit() occasionally failed to clean up after itself ( bac4e01 ) We no longer load additional plugins when executing pytest ( #438 ) Homepage now links to database views if there are less than five tables in a database ( #373 ) The --cors option is now respected by error pages ( #453 ) datasette publish heroku now uses the --include-vcs-ignore option, which means it works under Travis CI ( #407 ) datasette publish heroku now publishes using Python 3.6.8 ( 66… | ["Changelog", "0.28 (2019-05-19)"] | [{"href": "https://github.com/simonw/datasette/issues/172", "label": "#172"}, {"href": "https://github.com/simonw/datasette/issues/470", "label": "#470"}, {"href": "https://github.com/simonw/datasette/pull/442", "label": "#442"}, {"href": "https://github.com/simonw/datasette/issues/415", "label": "#415"}, {"href": "https://github.com/simonw/datasette/pull/443", "label": "#443"}, {"href": "https://github.com/simonw/datasette/pull/426", "label": "#426"}, {"href": "https://github.com/simonw/datasette/commit/2c19a27d15a913e5f3dd443f04067169a6f24634", "label": "2c19a27"}, {"href": "https://github.com/simonw/datasette/commit/31f36e1b97ccc3f4387c80698d018a69798b6228", "label": "31f36e1"}, {"href": "https://github.com/simonw/datasette/commit/bac4e01f40ae7bd19d1eab1fb9349452c18de8f5", "label": "bac4e01"}, {"href": "https://github.com/simonw/datasette/issues/438", "label": "#438"}, {"href": "https://github.com/simonw/datasette/issues/373", "label": "#373"}, {"href": "https://github.com/simonw/datasette/issues/453", "label": "#453"}, {"href": "https://github.com/simonw/datasette/pull/407", "label": "#407"}, {"href": "https://github.com/simonw/datasette/commit/666c37415a898949fae0437099d62a35b1e9c430", "label": "666c374"}, {"href": "https://github.com/simonw/datasette/issues/472", "label": "#472"}, {"href": "https://github.com/simonw/datasette/commit/09ef305c687399384fe38487c075e8669682deb4", "label": "09ef305"}, {"href": "https://github.com/simonw/datasette/issues/476", "label": "#476"}] |
changelog:small-changes | changelog | small-changes | Small changes | Databases published using datasette publish now open in Immutable mode . ( #469 ) ?col__date= now works for columns containing spaces Automatic label detection (for deciding which column to show when linking to a foreign key) has been improved. ( #485 ) Fixed bug where pagination broke when combined with an expanded foreign key. ( #489 ) Contributors can now run pip install -e .[docs] to get all of the dependencies needed to build the documentation, including cd docs && make livehtml support. Datasette's dependencies are now all specified using the ~= match operator. ( #532 ) white-space: pre-wrap now used for table creation SQL. ( #505 ) Full list of commits between 0.28 and 0.29. | ["Changelog", "0.29 (2019-07-07)"] | [{"href": "https://github.com/simonw/datasette/issues/469", "label": "#469"}, {"href": "https://github.com/simonw/datasette/issues/485", "label": "#485"}, {"href": "https://github.com/simonw/datasette/issues/489", "label": "#489"}, {"href": "https://github.com/simonw/datasette/issues/532", "label": "#532"}, {"href": "https://github.com/simonw/datasette/issues/505", "label": "#505"}, {"href": "https://github.com/simonw/datasette/compare/0.28...0.29", "label": "Full list of commits"}] |
changelog:id56 | changelog | id56 | Smaller changes | Cascading view permissions - so if a user has view-table they can view the table page even if they do not have view-database or view-instance . ( #832 ) CSRF protection no longer applies to Authentication: Bearer token requests or requests without cookies. ( #835 ) datasette.add_message() now works inside plugins. ( #864 ) Workaround for "Too many open files" error in test runs. ( #846 ) Respect existing scope["actor"] if already set by ASGI middleware. ( #854 ) New process for shipping Alpha and beta releases . ( #807 ) {{ csrftoken() }} now works when plugins render a template using datasette.render_template(..., request=request) . ( #863 ) Datasette now creates a single Request object and uses it throughout the lifetime of the current HTTP request. ( #870 ) | ["Changelog", "0.45 (2020-07-01)"] | [{"href": "https://github.com/simonw/datasette/issues/832", "label": "#832"}, {"href": "https://github.com/simonw/datasette/issues/835", "label": "#835"}, {"href": "https://github.com/simonw/datasette/issues/864", "label": "#864"}, {"href": "https://github.com/simonw/datasette/issues/846", "label": "#846"}, {"href": "https://github.com/simonw/datasette/issues/854", "label": "#854"}, {"href": "https://github.com/simonw/datasette/issues/807", "label": "#807"}, {"href": "https://github.com/simonw/datasette/issues/863", "label": "#863"}, {"href": "https://github.com/simonw/datasette/issues/870", "label": "#870"}] |
changelog:id58 | changelog | id58 | Smaller changes | New internals documentation for Request object and Response class . ( #706 ) request.url now respects the force_https_urls config setting. closes ( #781 ) request.args.getlist() returns [] if missing. Removed request.raw_args entirely. ( #774 ) New datasette.get_database() method. Added _ prefix to many private, undocumented methods of the Datasette class. ( #576 ) Removed the db.get_outbound_foreign_keys() method which duplicated the behaviour of db.foreign_keys_for_table() . New await datasette.permission_allowed() method. /-/actor debugging endpoint for viewing the currently authenticated actor. New request.cookies property. /-/plugins endpoint now shows a list of hooks implemented by each plugin, e.g. https://latest.datasette.io/-/plugins?all=1 request.post_vars() method no longer discards empty values. New "params" canned query key for explicitly setting named parameters, see Canned query parameters . ( #797 ) request.args is now a MultiParams object. Fixed a bug with the datasette plugins command. ( #802 ) Nicer pa… | ["Changelog", "0.44 (2020-06-11)"] | [{"href": "https://github.com/simonw/datasette/issues/706", "label": "#706"}, {"href": "https://github.com/simonw/datasette/issues/781", "label": "#781"}, {"href": "https://github.com/simonw/datasette/issues/774", "label": "#774"}, {"href": "https://github.com/simonw/datasette/issues/576", "label": "#576"}, {"href": "https://latest.datasette.io/-/plugins?all=1", "label": "https://latest.datasette.io/-/plugins?all=1"}, {"href": "https://github.com/simonw/datasette/issues/797", "label": "#797"}, {"href": "https://github.com/simonw/datasette/issues/802", "label": "#802"}, {"href": "https://github.com/simonw/datasette/issues/395", "label": "#395"}, {"href": "https://github.com/simonw/datasette/issues/777", "label": "#777"}, {"href": "https://github.com/simonw/datasette/issues/822", "label": "#822"}, {"href": "https://github.com/simonw/datasette/issues/804", "label": "#804"}, {"href": "https://github.com/simonw/datasette/issues/830", "label": "#830"}, {"href": "https://github.com/simonw/datasette/issues/837", "label": "#837"}] |
changelog:smaller-changes | changelog | smaller-changes | Smaller changes | Wide tables shown within Datasette now scroll horizontally ( #998 ). This is achieved using a new <div class="table-wrapper"> element which may impact the implementation of some plugins (for example this change to datasette-cluster-map ). New debug-menu permission. ( #1068 ) Removed --debug option, which didn't do anything. ( #814 ) Link: HTTP header pagination. ( #1014 ) x button for clearing filters. ( #1016 ) Edit SQL button on canned queries, ( #1019 ) --load-extension=spatialite shortcut. ( #1028 ) scale-in animation for column action menu. ( #1039 ) Option to pass a list of templates to .render_template() is now documented. ( #1045 ) New datasette.urls.static_plugins() method. ( #1033 ) datasette -o option now opens the most relevant page. ( #976 ) datasette --cors option now enables access to /database.db downloads. ( #1057 ) Database file downloads now implement cascading permissions, so you can download a database if you have view-database-download permission even if you do not have permission to access the Datasette instance. ( #1058 ) New documentation on Designing URLs for your plugin . (… | ["Changelog", "0.51 (2020-10-31)"] | [{"href": "https://github.com/simonw/datasette/issues/998", "label": "#998"}, {"href": "https://github.com/simonw/datasette-cluster-map/commit/fcb4abbe7df9071c5ab57defd39147de7145b34e", "label": "this change to datasette-cluster-map"}, {"href": "https://github.com/simonw/datasette/issues/1068", "label": "#1068"}, {"href": "https://github.com/simonw/datasette/issues/814", "label": "#814"}, {"href": "https://github.com/simonw/datasette/issues/1014", "label": "#1014"}, {"href": "https://github.com/simonw/datasette/issues/1016", "label": "#1016"}, {"href": "https://github.com/simonw/datasette/issues/1019", "label": "#1019"}, {"href": "https://github.com/simonw/datasette/issues/1028", "label": "#1028"}, {"href": "https://github.com/simonw/datasette/issues/1039", "label": "#1039"}, {"href": "https://github.com/simonw/datasette/issues/1045", "label": "#1045"}, {"href": "https://github.com/simonw/datasette/issues/1033", "label": "#1033"}, {"href": "https://github.com/simonw/datasette/issues/976", "label": "#976"}, {"href": "https://github.com/simonw/datasette/issues/1057", "label": "#1057"}, {"href": "https://github.com/simonw/datasette/issues/1058", "label": "#1058"}, {"href": "https://github.com/simonw/datasette/issues/1053", "label": "#1053"}] |
metadata:metadata-source-license-about | metadata | metadata-source-license-about | Source, license and about | The three visible metadata fields you can apply to everything, specific databases or specific tables are source, license and about. All three are optional. source and source_url should be used to indicate where the underlying data came from. license and license_url should be used to indicate the license under which the data can be used. about and about_url can be used to link to further information about the project - an accompanying blog entry for example. For each of these you can provide just the *_url field and Datasette will treat that as the default link label text and display the URL directly on the page. | ["Metadata"] | [] |
spatialite:id1 | spatialite | id1 | SpatiaLite | The SpatiaLite module for SQLite adds features for handling geographic and spatial data. For an example of what you can do with it, see the tutorial Building a location to time zone API with SpatiaLite . To use it with Datasette, you need to install the mod_spatialite dynamic library. This can then be loaded into Datasette using the --load-extension command-line option. Datasette can look for SpatiaLite in common installation locations if you run it like this: datasette --load-extension=spatialite --setting default_allow_sql off If SpatiaLite is in another location, use the full path to the extension instead: datasette --setting default_allow_sql off \ --load-extension=/usr/local/lib/mod_spatialite.dylib | [] | [{"href": "https://www.gaia-gis.it/fossil/libspatialite/index", "label": "SpatiaLite module"}, {"href": "https://datasette.io/tutorials/spatialite", "label": "Building a location to time zone API with SpatiaLite"}] |
spatialite:spatial-indexing-latitude-longitude-columns | spatialite | spatial-indexing-latitude-longitude-columns | Spatial indexing latitude/longitude columns | Here's a recipe for taking a table with existing latitude and longitude columns, adding a SpatiaLite POINT geometry column to that table, populating the new column and then populating a spatial index: import sqlite3 conn = sqlite3.connect("museums.db") # Lead the spatialite extension: conn.enable_load_extension(True) conn.load_extension("/usr/local/lib/mod_spatialite.dylib") # Initialize spatial metadata for this database: conn.execute("select InitSpatialMetadata(1)") # Add a geometry column called point_geom to our museums table: conn.execute( "SELECT AddGeometryColumn('museums', 'point_geom', 4326, 'POINT', 2);" ) # Now update that geometry column with the lat/lon points conn.execute( """ UPDATE museums SET point_geom = GeomFromText('POINT('||"longitude"||' '||"latitude"||')',4326); """ ) # Now add a spatial index to that column conn.execute( 'select CreateSpatialIndex("museums", "point_geom");' ) # If you don't commit your changes will not be persisted: conn.commit() conn.close() | ["SpatiaLite"] | [] |
json_api:json-api-special | json_api | json-api-special | Special JSON arguments | Every Datasette endpoint that can return JSON also accepts the following query string arguments: ?_shape=SHAPE The shape of the JSON to return, documented above. ?_nl=on When used with ?_shape=array produces newline-delimited JSON objects. ?_json=COLUMN1&_json=COLUMN2 If any of your SQLite columns contain JSON values, you can use one or more _json= parameters to request that those columns be returned as regular JSON. Without this argument those columns will be returned as JSON objects that have been double-encoded into a JSON string value. Compare this query without the argument to this query using the argument ?_json_infinity=on If your data contains infinity or -infinity values, Datasette will replace them with None when returning them as JSON. If you pass _json_infinity=1 Datasette will instead return them as Infinity or -Infinity which is invalid JSON but can be processed by some custom JSON parsers. ?_timelimit=MS Sets a custom time limit for the query in ms. You can use this for optimistic queries where you would like Datasette to give up if the query takes too long, for example if you want to implement autocomplete search but only… | ["JSON API"] | [{"href": "https://fivethirtyeight.datasettes.com/fivethirtyeight.json?sql=select+%27{%22this+is%22%3A+%22a+json+object%22}%27+as+d&_shape=array", "label": "this query without the argument"}, {"href": "https://fivethirtyeight.datasettes.com/fivethirtyeight.json?sql=select+%27{%22this+is%22%3A+%22a+json+object%22}%27+as+d&_shape=array&_json=d", "label": "this query using the argument"}] |
json_api:json-api-table-arguments | json_api | json-api-table-arguments | Special table arguments | ?_col=COLUMN1&_col=COLUMN2 List specific columns to display. These will be shown along with any primary keys. ?_nocol=COLUMN1&_nocol=COLUMN2 List specific columns to hide - any column not listed will be displayed. Primary keys cannot be hidden. ?_labels=on/off Expand foreign key references for every possible column. See below. ?_label=COLUMN1&_label=COLUMN2 Expand foreign key references for one or more specified columns. ?_size=1000 or ?_size=max Sets a custom page size. This cannot exceed the max_returned_rows limit passed to datasette serve . Use max to get max_returned_rows . ?_sort=COLUMN Sorts the results by the specified column. ?_sort_desc=COLUMN Sorts the results by the specified column in descending order. ?_search=keywords For SQLite tables that have been configured for full-text search executes a search … | ["JSON API", "Table arguments"] | [{"href": "https://www.sqlite.org/fts3.html", "label": "full-text search"}, {"href": "https://www.sqlite.org/fts5.html#full_text_query_syntax", "label": "advanced SQLite FTS syntax"}, {"href": "https://latest.datasette.io/fixtures/facetable?_where=_neighborhood%20like%20%22%c%%22&_where=_city_id=3", "label": "facetable?_where=_neighborhood like \"%c%\"&_where=_city_id=3"}, {"href": "https://latest.datasette.io/fixtures/facetable?_where=_city_id%20in%20(select%20id%20from%20facet_cities%20where%20name%20!=%20%22Detroit%22)", "label": "facetable?_where=_city_id in (select id from facet_cities where name != \"Detroit\")"}, {"href": "https://latest.datasette.io/fixtures/roadside_attractions?_through={%22table%22:%22roadside_attraction_characteristics%22,%22column%22:%22characteristic_id%22,%22value%22:%221%22}", "label": "an example"}] |
metadata:label-columns | metadata | label-columns | Specifying the label column for a table | Datasette's HTML interface attempts to display foreign key references as labelled hyperlinks. By default, it looks for referenced tables that only have two columns: a primary key column and one other. It assumes that the second column should be used as the link label. If your table has more than two columns you can specify which column should be used for the link label with the label_column property: { "databases": { "database1": { "tables": { "example_table": { "label_column": "title" } } } } } | ["Metadata"] | [] |
metadata:specifying-units-for-a-column | metadata | specifying-units-for-a-column | Specifying units for a column | Datasette supports attaching units to a column, which will be used when displaying values from that column. SI prefixes will be used where appropriate. Column units are configured in the metadata like so: { "databases": { "database1": { "tables": { "example_table": { "units": { "column1": "metres", "column2": "Hz" } } } } } } Units are interpreted using Pint , and you can see the full list of available units in Pint's unit registry . You can also add custom units to the metadata, which will be registered with Pint: { "custom_units": [ "decibel = [] = dB" ] } | ["Metadata"] | [{"href": "https://pint.readthedocs.io/", "label": "Pint"}, {"href": "https://github.com/hgrecco/pint/blob/master/pint/default_en.txt", "label": "unit registry"}, {"href": "http://pint.readthedocs.io/en/latest/defining.html", "label": "custom units"}] |
facets:speeding-up-facets-with-indexes | facets | speeding-up-facets-with-indexes | Speeding up facets with indexes | The performance of facets can be greatly improved by adding indexes on the columns you wish to facet by. Adding indexes can be performed using the sqlite3 command-line utility. Here's how to add an index on the state column in a table called Food_Trucks : $ sqlite3 mydatabase.db SQLite version 3.19.3 2017-06-27 16:48:08 Enter ".help" for usage hints. sqlite> CREATE INDEX Food_Trucks_state ON Food_Trucks("state"); Or using the sqlite-utils command-line utility: $ sqlite-utils create-index mydatabase.db Food_Trucks state | ["Facets"] | [{"href": "https://sqlite-utils.datasette.io/en/stable/cli.html#creating-indexes", "label": "sqlite-utils"}] |
writing_plugins:writing-plugins-cookiecutter | writing_plugins | writing-plugins-cookiecutter | Starting an installable plugin using cookiecutter | Plugins that can be installed should be written as Python packages using a setup.py file. The quickest way to start writing one an installable plugin is to use the datasette-plugin cookiecutter template. This creates a new plugin structure for you complete with an example test and GitHub Actions workflows for testing and publishing your plugin. Install cookiecutter and then run this command to start building a plugin using the template: cookiecutter gh:simonw/datasette-plugin Read a cookiecutter template for writing Datasette plugins for more information about this template. | ["Writing plugins"] | [{"href": "https://github.com/simonw/datasette-plugin", "label": "datasette-plugin"}, {"href": "https://cookiecutter.readthedocs.io/en/stable/installation.html", "label": "Install cookiecutter"}, {"href": "https://simonwillison.net/2020/Jun/20/cookiecutter-plugins/", "label": "a cookiecutter template for writing Datasette plugins"}] |
writing_plugins:writing-plugins-static-assets | writing_plugins | writing-plugins-static-assets | Static assets | If your plugin has a static/ directory, Datasette will automatically configure itself to serve those static assets from the following path: /-/static-plugins/NAME_OF_PLUGIN_PACKAGE/yourfile.js Use the datasette.urls.static_plugins(plugin_name, path) method to generate URLs to that asset that take the base_url setting into account, see datasette.urls . To bundle the static assets for a plugin in the package that you publish to PyPI, add the following to the plugin's setup.py : package_data = ( { "datasette_plugin_name": [ "static/plugin.js", ], }, ) Where datasette_plugin_name is the name of the plugin package (note that it uses underscores, not hyphens) and static/plugin.js is the path within that package to the static file. datasette-cluster-map is a useful example of a plugin that includes packaged static assets in this way. | ["Writing plugins"] | [{"href": "https://github.com/simonw/datasette-cluster-map", "label": "datasette-cluster-map"}] |
csv_export:streaming-all-records | csv_export | streaming-all-records | Streaming all records | The stream all rows option is designed to be as efficient as possible - under the hood it takes advantage of Python 3 asyncio capabilities and Datasette's efficient pagination to stream back the full CSV file. Since databases can get pretty large, by default this option is capped at 100MB - if a table returns more than 100MB of data the last line of the CSV will be a truncation error message. You can increase or remove this limit using the max_csv_mb config setting. You can also disable the CSV export feature entirely using allow_csv_stream . | ["CSV export"] | [] |
facets:suggested-facets | facets | suggested-facets | Suggested facets | Datasette's table UI will suggest facets for the user to apply, based on the following criteria: For the currently filtered data are there any columns which, if applied as a facet... Will return 30 or less unique options Will return more than one unique option Will return less unique options than the total number of filtered rows And the query used to evaluate this criteria can be completed in under 50ms That last point is particularly important: Datasette runs a query for every column that is displayed on a page, which could get expensive - so to avoid slow load times it sets a time limit of just 50ms for each of those queries. This means suggested facets are unlikely to appear for tables with millions of records in them. | ["Facets"] | [] |
changelog:v0-28-databases-that-change | changelog | v0-28-databases-that-change | Supporting databases that change | From the beginning of the project, Datasette has been designed with read-only databases in mind. If a database is guaranteed not to change it opens up all kinds of interesting opportunities - from taking advantage of SQLite immutable mode and HTTP caching to bundling static copies of the database directly in a Docker container. The interesting ideas in Datasette explores this idea in detail. As my goals for the project have developed, I realized that read-only databases are no longer the right default. SQLite actually supports concurrent access very well provided only one thread attempts to write to a database at a time, and I keep encountering sensible use-cases for running Datasette on top of a database that is processing inserts and updates. So, as-of version 0.28 Datasette no longer assumes that a database file will not change. It is now safe to point Datasette at a SQLite database which is being updated by another process. Making this change was a lot of work - see tracking tickets #418 , #419 and #420 . It required new thinking around how Datasette should calculate table counts (an expensive operation against a large, changing database) and also meant reconsidering the "content hash" URLs Datasette has used in the past to optimize the performance of HTTP caches. Datasette can still run against immutable files and gains numerous performance benefits from doing so, but this is no longer the default behaviour. Take a look at the new Performance and caching documentation section for details on how to make the most of Datasette against data that you know will be staying read-only and immutable. | ["Changelog", "0.28 (2019-05-19)"] | [{"href": "https://simonwillison.net/2018/Oct/4/datasette-ideas/", "label": "The interesting ideas in Datasette"}, {"href": "https://github.com/simonw/datasette/issues/418", "label": "#418"}, {"href": "https://github.com/simonw/datasette/issues/419", "label": "#419"}, {"href": "https://github.com/simonw/datasette/issues/420", "label": "#420"}] |
pages:tableview | pages | tableview | Table | The table page is the heart of Datasette: it allows users to interactively explore the contents of a database table, including sorting, filtering, Full-text search and applying Facets . The HTML interface is worth spending some time exploring. As with other pages, you can return the JSON data by appending .json to the URL path, before any ? query string arguments. The query string arguments are described in more detail here: Table arguments You can also use the table page to interactively construct a SQL query - by applying different filters and a sort order for example - and then click the "View and edit SQL" link to see the SQL query that was used for the page and edit and re-submit it. Some examples: ../items lists all of the line-items registered by UK MPs as potential conflicts of interest. It demonstrates Datasette's support for Full-text search . ../antiquities-act%2Factions_under_antiquities_act is an interface for exploring the "actions under the antiquities act" data table published by FiveThirtyEight. ../global-power-plants?country_long=United+Kingdom&primary_fuel=Gas is a filtered table page showing every Gas power plant in the United Kingdom. It includes some default facets (configured using its metadata.json ) and uses the datasette-cluster-map plugin to show a map of the results. | ["Pages and API endpoints"] | [{"href": "https://register-of-members-interests.datasettes.com/regmem/items", "label": "../items"}, {"href": "https://fivethirtyeight.datasettes.com/fivethirtyeight/antiquities-act%2Factions_under_antiquities_act", "label": "../antiquities-act%2Factions_under_antiquities_act"}, {"href": "https://global-power-plants.datasettes.com/global-power-plants/global-power-plants?_facet=primary_fuel&_facet=owner&_facet=country_long&country_long__exact=United+Kingdom&primary_fuel=Gas", "label": "../global-power-plants?country_long=United+Kingdom&primary_fuel=Gas"}, {"href": "https://global-power-plants.datasettes.com/-/metadata", "label": "its metadata.json"}, {"href": "https://github.com/simonw/datasette-cluster-map", "label": "datasette-cluster-map"}] |
json_api:id2 | json_api | id2 | Table arguments | The Datasette table view takes a number of special query string arguments. | ["JSON API"] | [] |
testing_plugins:testing-plugins-pytest-httpx | testing_plugins | testing-plugins-pytest-httpx | Testing outbound HTTP calls with pytest-httpx | If your plugin makes outbound HTTP calls - for example datasette-auth-github or datasette-import-table - you may need to mock those HTTP requests in your tests. The pytest-httpx package is a useful library for mocking calls. It can be tricky to use with Datasette though since it mocks all HTTPX requests, and Datasette's own testing mechanism uses HTTPX internally. To avoid breaking your tests, you can return ["localhost"] from the non_mocked_hosts() fixture. As an example, here's a very simple plugin which executes an HTTP response and returns the resulting content: from datasette import hookimpl from datasette.utils.asgi import Response import httpx @hookimpl def register_routes(): return [ (r"^/-/fetch-url$", fetch_url), ] async def fetch_url(datasette, request): if request.method == "GET": return Response.html( """ <form action="/-/fetch-url" method="post"> <input type="hidden" name="csrftoken" value="{}"> <input name="url"><input type="submit"> </form>""".format( request.scope["csrftoken"]() ) ) vars = await request.post_vars() url = vars["url"] return Response.text(httpx.get(url).text) Here's a test for that plugin that mocks the HTTPX outbound request: from datasette.app import Datasette import pytest @pytest.fixture def non_mocked_hosts(): # This ensures httpx-mock will not affect Datasette's own # httpx calls made in the tests by datasette.client: return ["localhost"] async def test_outbound_http_call(httpx_mock): httpx_mock.add_response( url="https://www.example.com/", text="Hello world", ) datasette = Datasette([], memory=True) response = await datasette.client.post( "/-/fetch-url", data={"url": "https://www.example.com/"}, ) assert response.text == "Hello world" outbound_request = httpx_mock.get_request()… | ["Testing plugins"] | [{"href": "https://pypi.org/project/pytest-httpx/", "label": "pytest-httpx"}] |
testing_plugins:id1 | testing_plugins | id1 | Testing plugins | We recommend using pytest to write automated tests for your plugins. If you use the template described in Starting an installable plugin using cookiecutter your plugin will start with a single test in your tests/ directory that looks like this: from datasette.app import Datasette import pytest @pytest.mark.asyncio async def test_plugin_is_installed(): datasette = Datasette(memory=True) response = await datasette.client.get("/-/plugins.json") assert response.status_code == 200 installed_plugins = {p["name"] for p in response.json()} assert ( "datasette-plugin-template-demo" in installed_plugins ) This test uses the datasette.client object to exercise a test instance of Datasette. datasette.client is a wrapper around the HTTPX Python library which can imitate HTTP requests using ASGI. This is the recommended way to write tests against a Datasette instance. This test also uses the pytest-asyncio package to add support for async def test functions running under pytest. You can install these packages like so: pip install pytest pytest-asyncio If you are building an installable package you can add them as test dependencies to your setup.py module like this: setup( name="datasette-my-plugin", # ... extras_require={"test": ["pytest", "pytest-asyncio"]}, tests_require=["datasette-my-plugin[test]"], ) You can then install the test dependencies like so: pip install -e '.[test]' Then run the tests using pytest like so: pytest | [] | [{"href": "https://docs.pytest.org/", "label": "pytest"}, {"href": "https://www.python-httpx.org/", "label": "HTTPX"}, {"href": "https://pypi.org/project/pytest-asyncio/", "label": "pytest-asyncio"}] |
authentication:allowdebugview | authentication | allowdebugview | The /-/allow-debug tool | The /-/allow-debug tool lets you try out different "action" blocks against different "actor" JSON objects. You can try that out here: https://latest.datasette.io/-/allow-debug | ["Authentication and permissions", "Permissions"] | [{"href": "https://latest.datasette.io/-/allow-debug", "label": "https://latest.datasette.io/-/allow-debug"}] |
authentication:logoutview | authentication | logoutview | The /-/logout page | The page at /-/logout provides the ability to log out of a ds_actor cookie authentication session. | ["Authentication and permissions", "The ds_actor cookie"] | [] |
ecosystem:ecosystem | ecosystem | ecosystem | The Datasette Ecosystem | Datasette sits at the center of a growing ecosystem of open source tools aimed at making it as easy as possible to gather, analyze and publish interesting data. These tools are divided into two main groups: tools for building SQLite databases (for use with Datasette) and plugins that extend Datasette's functionality. The Datasette project website includes a directory of plugins and a directory of tools: Plugins directory on datasette.io Tools directory on datasette.io | [] | [{"href": "https://datasette.io/", "label": "Datasette project website"}, {"href": "https://datasette.io/plugins", "label": "Plugins directory on datasette.io"}, {"href": "https://datasette.io/tools", "label": "Tools directory on datasette.io"}] |
internals:internals-multiparams | internals | internals-multiparams | The MultiParams class | request.args is a MultiParams object - a dictionary-like object which provides access to query string parameters that may have multiple values. Consider the query string ?foo=1&foo=2&bar=3 - with two values for foo and one value for bar . request.args[key] - string Returns the first value for that key, or raises a KeyError if the key is missing. For the above example request.args["foo"] would return "1" . request.args.get(key) - string or None Returns the first value for that key, or None if the key is missing. Pass a second argument to specify a different default, e.g. q = request.args.get("q", "") . request.args.getlist(key) - list of strings Returns the list of strings for that key. request.args.getlist("foo") would return ["1", "2"] in the above example. request.args.getlist("bar") would return ["3"] . If the key is missing an empty list will be returned. request.args.keys() - list of strings Returns the list of available keys - for the example this would be ["foo", "bar"] . key in request.args - True or False You can use if key in request.args to check if a key is present. for key in request.args - iterator This lets you loop through every available key. le… | ["Internals for plugins"] | [] |
changelog:the-internal-database | changelog | the-internal-database | The _internal database | As part of ongoing work to help Datasette handle much larger numbers of connected databases and tables (see Datasette Library ) Datasette now maintains an in-memory SQLite database with details of all of the attached databases, tables, columns, indexes and foreign keys. ( #1150 ) This will support future improvements such as a searchable, paginated homepage of all available tables. You can explore an example of this database by signing in as root to the latest.datasette.io demo instance and then navigating to latest.datasette.io/_internal . Plugins can use these tables to introspect attached data in an efficient way. Plugin authors should note that this is not yet considered a stable interface, so any plugins that use this may need to make changes prior to Datasette 1.0 if the _internal table schemas change. | ["Changelog", "0.54 (2021-01-25)"] | [{"href": "https://github.com/simonw/datasette/issues/417", "label": "Datasette Library"}, {"href": "https://github.com/simonw/datasette/issues/1150", "label": "#1150"}, {"href": "https://latest.datasette.io/login-as-root", "label": "signing in as root"}, {"href": "https://latest.datasette.io/_internal", "label": "latest.datasette.io/_internal"}] |
internals:internals-internal | internals | internals-internal | The _internal database | This API should be considered unstable - the structure of these tables may change prior to the release of Datasette 1.0. Datasette maintains an in-memory SQLite database with details of the the databases, tables and columns for all of the attached databases. By default all actors are denied access to the view-database permission for the _internal database, so the database is not visible to anyone unless they sign in as root . Plugins can access this database by calling db = datasette.get_database("_internal") and then executing queries using the Database API . You can explore an example of this database by signing in as root to the latest.datasette.io demo instance and then navigating to latest.datasette.io/_internal . | ["Internals for plugins"] | [{"href": "https://latest.datasette.io/login-as-root", "label": "signing in as root"}, {"href": "https://latest.datasette.io/_internal", "label": "latest.datasette.io/_internal"}] |
internals:internals-utils | internals | internals-utils | The datasette.utils module | The datasette.utils module contains various utility functions used by Datasette. As a general rule you should consider anything in this module to be unstable - functions and classes here could change without warning or be removed entirely between Datasette releases, without being mentioned in the release notes. The exception to this rule is anythang that is documented here. If you find a need for an undocumented utility function in your own work, consider opening an issue requesting that the function you are using be upgraded to documented and supported status. | ["Internals for plugins"] | [{"href": "https://github.com/simonw/datasette/issues/new", "label": "opening an issue"}] |
authentication:authentication-ds-actor | authentication | authentication-ds-actor | The ds_actor cookie | Datasette includes a default authentication plugin which looks for a signed ds_actor cookie containing a JSON actor dictionary. This is how the root actor mechanism works. Authentication plugins can set signed ds_actor cookies themselves like so: response = Response.redirect("/") response.set_cookie( "ds_actor", datasette.sign({"a": {"id": "cleopaws"}}, "actor"), ) Note that you need to pass "actor" as the namespace to .sign(value, namespace="default") . The shape of data encoded in the cookie is as follows: { "a": {... actor ...} } | ["Authentication and permissions"] | [] |
authentication:permissionsdebugview | authentication | permissionsdebugview | The permissions debug tool | The debug tool at /-/permissions is only available to the authenticated root user (or any actor granted the permissions-debug action according to a plugin). It shows the thirty most recent permission checks that have been carried out by the Datasette instance. This is designed to help administrators and plugin authors understand exactly how permission checks are being carried out, in order to effectively configure Datasette's permission system. | ["Authentication and permissions"] | [] |
changelog:the-road-to-datasette-1-0 | changelog | the-road-to-datasette-1-0 | The road to Datasette 1.0 | I've assembled a milestone for Datasette 1.0 . The focus of the 1.0 release will be the following: Signify confidence in the quality/stability of Datasette Give plugin authors confidence that their plugins will work for the whole 1.x release cycle Provide the same confidence to developers building against Datasette JSON APIs If you have thoughts about what you would like to see for Datasette 1.0 you can join the conversation on issue #519 . | ["Changelog", "0.44 (2020-06-11)"] | [{"href": "https://github.com/simonw/datasette/milestone/7", "label": "milestone for Datasette 1.0"}, {"href": "https://github.com/simonw/datasette/issues/519", "label": "the conversation on issue #519"}] |
full_text_search:full-text-search-table-view-api | full_text_search | full-text-search-table-view-api | The table page and table view API | Table views that support full-text search can be queried using the ?_search=TERMS query string parameter. This will run the search against content from all of the columns that have been included in the index. Try this example: fara.datasettes.com/fara/FARA_All_ShortForms?_search=manafort SQLite full-text search supports wildcards. This means you can easily implement prefix auto-complete by including an asterisk at the end of the search term - for example: /dbname/tablename/?_search=rob* This will return all records containing at least one word that starts with the letters rob . You can also run searches against just the content of a specific named column by using _search_COLNAME=TERMS - for example, this would search for just rows where the name column in the FTS index mentions Sarah : /dbname/tablename/?_search_name=Sarah | ["Full-text search"] | [{"href": "https://fara.datasettes.com/fara/FARA_All_ShortForms?_search=manafort", "label": "fara.datasettes.com/fara/FARA_All_ShortForms?_search=manafort"}] |
internals:internals-tilde-encoding | internals | internals-tilde-encoding | Tilde encoding | Datasette uses a custom encoding scheme in some places, called tilde encoding . This is primarily used for table names and row primary keys, to avoid any confusion between / characters in those values and the Datasette URLs that reference them. Tilde encoding uses the same algorithm as URL percent-encoding , but with the ~ tilde character used in place of % . Any character other than ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz0123456789_- will be replaced by the numeric equivalent preceded by a tilde. For example: / becomes ~2F . becomes ~2E % becomes ~25 ~ becomes ~7E Space becomes + polls/2022.primary becomes polls~2F2022~2Eprimary Note that the space character is a special case: it will be replaced with a + symbol. datasette.utils. tilde_encode s : str str Returns tilde-encoded string - for example /foo/bar -> ~2Ffoo~2Fbar datasette.utils. tilde_decode s : str str Decodes a tilde-encoded string, so ~2Ffoo~2Fbar -> /foo/bar | ["Internals for plugins", "The datasette.utils module"] | [{"href": "https://developer.mozilla.org/en-US/docs/Glossary/percent-encoding", "label": "URL percent-encoding"}] |
pages:indexview | pages | indexview | Top-level index | The root page of any Datasette installation is an index page that lists all of the currently attached databases. Some examples: fivethirtyeight.datasettes.com global-power-plants.datasettes.com register-of-members-interests.datasettes.com Add /.json to the end of the URL for the JSON version of the underlying data: fivethirtyeight.datasettes.com/.json global-power-plants.datasettes.com/.json register-of-members-interests.datasettes.com/.json | ["Pages and API endpoints"] | [{"href": "https://fivethirtyeight.datasettes.com/", "label": "fivethirtyeight.datasettes.com"}, {"href": "https://global-power-plants.datasettes.com/", "label": "global-power-plants.datasettes.com"}, {"href": "https://register-of-members-interests.datasettes.com/", "label": "register-of-members-interests.datasettes.com"}, {"href": "https://fivethirtyeight.datasettes.com/.json", "label": "fivethirtyeight.datasettes.com/.json"}, {"href": "https://global-power-plants.datasettes.com/.json", "label": "global-power-plants.datasettes.com/.json"}, {"href": "https://register-of-members-interests.datasettes.com/.json", "label": "register-of-members-interests.datasettes.com/.json"}] |
internals:internals-tracer-trace-child-tasks | internals | internals-tracer-trace-child-tasks | Tracing child tasks | If your code uses a mechanism such as asyncio.gather() to execute code in additional tasks you may find that some of the traces are missing from the display. You can use the trace_child_tasks() context manager to ensure these child tasks are correctly handled. from datasette import tracer with tracer.trace_child_tasks(): results = await asyncio.gather( # ... async tasks here ) This example uses the register_routes() plugin hook to add a page at /parallel-queries which executes two SQL queries in parallel using asyncio.gather() and returns their results. from datasette import hookimpl from datasette import tracer @hookimpl def register_routes(): async def parallel_queries(datasette): db = datasette.get_database() with tracer.trace_child_tasks(): one, two = await asyncio.gather( db.execute("select 1"), db.execute("select 2"), ) return Response.json( { "one": one.single_value(), "two": two.single_value(), } ) return [ (r"/parallel-queries$", parallel_queries), ] Adding ?_trace=1 will show that the trace covers both of those child tasks. | ["Internals for plugins", "datasette.tracer"] | [] |
getting_started:getting-started-glitch | getting_started | getting-started-glitch | Try Datasette without installing anything using Glitch | Glitch is a free online tool for building web apps directly from your web browser. You can use Glitch to try out Datasette without needing to install any software on your own computer. Here's a demo project on Glitch which you can use as the basis for your own experiments: glitch.com/~datasette-csvs Glitch allows you to "remix" any project to create your own copy and start editing it in your browser. You can remix the datasette-csvs project by clicking this button: Find a CSV file and drag it onto the Glitch file explorer panel - datasette-csvs will automatically convert it to a SQLite database (using sqlite-utils ) and allow you to start exploring it using Datasette. If your CSV file has a latitude and longitude column you can visualize it on a map by uncommenting the datasette-cluster-map line in the requirements.txt file using the Glitch file editor. Need some data? Try this Public Art Data for the city of Seattle - hit "Export" and select "CSV" to download it as a CSV file. For more on how this works, see Running Datasette on Glitch . | ["Getting started"] | [{"href": "https://glitch.com/", "label": "Glitch"}, {"href": "https://glitch.com/~datasette-csvs", "label": "glitch.com/~datasette-csvs"}, {"href": "https://glitch.com/edit/#!/remix/datasette-csvs", "label": null}, {"href": "https://github.com/simonw/sqlite-utils", "label": "sqlite-utils"}, {"href": "https://data.seattle.gov/Community/Public-Art-Data/j7sn-tdzk", "label": "Public Art Data"}, {"href": "https://simonwillison.net/2019/Apr/23/datasette-glitch/", "label": "Running Datasette on Glitch"}] |
changelog:url-building | changelog | url-building | URL building | The new datasette.urls family of methods can be used to generate URLs to key pages within the Datasette interface, both within custom templates and Datasette plugins. See Building URLs within plugins for more details. ( #904 ) | ["Changelog", "0.51 (2020-10-31)"] | [{"href": "https://github.com/simonw/datasette/issues/904", "label": "#904"}] |
csv_export:csv-export-url-parameters | csv_export | csv-export-url-parameters | URL parameters | The following options can be used to customize the CSVs returned by Datasette. ?_header=off This removes the first row of the CSV file specifying the headings - only the row data will be returned. ?_stream=on Stream all matching records, not just the first page of results. See below. ?_dl=on Causes Datasette to return a content-disposition: attachment; filename="filename.csv" header. | ["CSV export"] | [] |
contributing:contributing-upgrading-codemirror | contributing | contributing-upgrading-codemirror | Upgrading CodeMirror | Datasette bundles CodeMirror for the SQL editing interface, e.g. on this page . Here are the steps for upgrading to a new version of CodeMirror: Download and extract latest CodeMirror zip file from https://codemirror.net/codemirror.zip Rename lib/codemirror.js to codemirror-5.57.0.js (using latest version number) Rename lib/codemirror.css to codemirror-5.57.0.css Rename mode/sql/sql.js to codemirror-5.57.0-sql.js Edit both JavaScript files to make the top license comment a /* */ block instead of multiple // lines Minify the JavaScript files like this: npx uglify-js codemirror-5.57.0.js -o codemirror-5.57.0.min.js --comments '/LICENSE/' npx uglify-js codemirror-5.57.0-sql.js -o codemirror-5.57.0-sql.min.js --comments '/LICENSE/' Check that the LICENSE comment did indeed survive minification Minify the CSS file like this: npx clean-css-cli codemirror-5.57.0.css -o codemirror-5.57.0.min.css Edit the _codemirror.html template to reference the new files git rm the old files, git add the new files | ["Contributing"] | [{"href": "https://codemirror.net/", "label": "CodeMirror"}, {"href": "https://latest.datasette.io/fixtures", "label": "this page"}, {"href": "https://codemirror.net/codemirror.zip", "label": "https://codemirror.net/codemirror.zip"}] |
installation:upgrading-packages-using-pipx | installation | upgrading-packages-using-pipx | Upgrading packages using pipx | You can upgrade your pipx installation to the latest release of Datasette using pipx upgrade datasette : $ pipx upgrade datasette upgraded package datasette from 0.39 to 0.40 (location: /Users/simon/.local/pipx/venvs/datasette) To upgrade a plugin within the pipx environment use pipx runpip datasette install -U name-of-plugin - like this: % datasette plugins [ { "name": "datasette-vega", "static": true, "templates": false, "version": "0.6" } ] $ pipx runpip datasette install -U datasette-vega Collecting datasette-vega Downloading datasette_vega-0.6.2-py3-none-any.whl (1.8 MB) |████████████████████████████████| 1.8 MB 2.0 MB/s ... Installing collected packages: datasette-vega Attempting uninstall: datasette-vega Found existing installation: datasette-vega 0.6 Uninstalling datasette-vega-0.6: Successfully uninstalled datasette-vega-0.6 Successfully installed datasette-vega-0.6.2 $ datasette plugins [ { "name": "datasette-vega", "static": true, "templates": false, "version": "0.6.2" } ] | ["Installation", "Advanced installation options", "Using pipx"] | [] |
performance:performance-inspect | performance | performance-inspect | Using "datasette inspect" | Counting the rows in a table can be a very expensive operation on larger databases. In immutable mode Datasette performs this count only once and caches the results, but this can still cause server startup time to increase by several seconds or more. If you know that a database is never going to change you can precalculate the table row counts once and store then in a JSON file, then use that file when you later start the server. To create a JSON file containing the calculated row counts for a database, use the following: datasette inspect data.db --inspect-file=counts.json Then later you can start Datasette against the counts.json file and use it to skip the row counting step and speed up server startup: datasette -i data.db --inspect-file=counts.json You need to use the -i immutable mode against the database file here or the counts from the JSON file will be ignored. You will rarely need to use this optimization in every-day use, but several of the datasette publish commands described in Publishing data use this optimization for better performance when deploying a database file to a hosting provider. | ["Performance and caching"] | [] |
settings:using-setting | settings | using-setting | Using --setting | Datasette supports a number of settings. These can be set using the --setting name value option to datasette serve . You can set multiple settings at once like this: datasette mydatabase.db \ --setting default_page_size 50 \ --setting sql_time_limit_ms 3500 \ --setting max_returned_rows 2000 | ["Settings"] | [] |
getting_started:getting-started-your-computer | getting_started | getting-started-your-computer | Using Datasette on your own computer | First, follow the Installation instructions. Now you can run Datasette against a SQLite file on your computer using the following command: datasette path/to/database.db This will start a web server on port 8001 - visit http://localhost:8001/ to access the web interface. Add -o to open your browser automatically once Datasette has started: datasette path/to/database.db -o Use Chrome on OS X? You can run datasette against your browser history like so: datasette ~/Library/Application\ Support/Google/Chrome/Default/History --nolock The --nolock option ignores any file locks. This is safe as Datasette will open the file in read-only mode. Now visiting http://localhost:8001/History/downloads will show you a web interface to browse your downloads data: http://localhost:8001/History/downloads.json will return that data as JSON: { "database": "History", "columns": [ "id", "current_path", "target_path", "start_time", "received_bytes", "total_bytes", ... ], "rows": [ [ 1, "/Users/simonw/Downloads/DropboxInstaller.dmg", "/Users/simonw/Downloads/DropboxInstaller.dmg", 13097290269022132, 626688, 0, ... ] ] } http://localhost:8001/History/downloads.json?_shape=objects will return that data as JSON in a more convenient format: { ... "rows": [ { "start_time": 13097290269022132, "interrupt_reason": 0, "hash": "", "id": 1, "site_url": "", "referrer": "https://www.dropbox.com/downloading?src=index", ... } ] } | ["Getting started"] | [{"href": "http://localhost:8001/", "label": "http://localhost:8001/"}, {"href": "http://localhost:8001/History/downloads", "label": "http://localhost:8001/History/downloads"}, {"href": "http://localhost:8001/History/downloads.json", "label": "http://localhost:8001/History/downloads.json"}, {"href": "http://localhost:8001/History/downloads.json?_shape=objects", "label": "http://localhost:8001/History/downloads.json?_shape=objects"}] |
installation:installation-docker | installation | installation-docker | Using Docker | A Docker image containing the latest release of Datasette is published to Docker Hub here: https://hub.docker.com/r/datasetteproject/datasette/ If you have Docker installed (for example with Docker for Mac on OS X) you can download and run this image like so: docker run -p 8001:8001 -v `pwd`:/mnt \ datasetteproject/datasette \ datasette -p 8001 -h 0.0.0.0 /mnt/fixtures.db This will start an instance of Datasette running on your machine's port 8001, serving the fixtures.db file in your current directory. Now visit http://127.0.0.1:8001/ to access Datasette. (You can download a copy of fixtures.db from https://latest.datasette.io/fixtures.db ) To upgrade to the most recent release of Datasette, run the following: docker pull datasetteproject/datasette | ["Installation", "Advanced installation options"] | [{"href": "https://hub.docker.com/r/datasetteproject/datasette/", "label": "https://hub.docker.com/r/datasetteproject/datasette/"}, {"href": "https://www.docker.com/docker-mac", "label": "Docker for Mac"}, {"href": "http://127.0.0.1:8001/", "label": "http://127.0.0.1:8001/"}, {"href": "https://latest.datasette.io/fixtures.db", "label": "https://latest.datasette.io/fixtures.db"}] |
installation:installation-homebrew | installation | installation-homebrew | Using Homebrew | If you have a Mac and use Homebrew , you can install Datasette by running this command in your terminal: brew install datasette This should install the latest version. You can confirm by running: datasette --version You can upgrade to the latest Homebrew packaged version using: brew upgrade datasette Once you have installed Datasette you can install plugins using the following: datasette install datasette-vega If the latest packaged release of Datasette has not yet been made available through Homebrew, you can upgrade your Homebrew installation in-place using: datasette install -U datasette | ["Installation", "Basic installation"] | [{"href": "https://brew.sh/", "label": "Homebrew"}] |
metadata:metadata-yaml | metadata | metadata-yaml | Using YAML for metadata | Datasette accepts YAML as an alternative to JSON for your metadata configuration file. YAML is particularly useful for including multiline HTML and SQL strings. Here's an example of a metadata.yml file, re-using an example from Canned queries . title: Demonstrating Metadata from YAML description_html: |- <p>This description includes a long HTML string</p> <ul> <li>YAML is better for embedding HTML strings than JSON!</li> </ul> license: ODbL license_url: https://opendatacommons.org/licenses/odbl/ databases: fixtures: tables: no_primary_key: hidden: true queries: neighborhood_search: sql: |- select neighborhood, facet_cities.name, state from facetable join facet_cities on facetable.city_id = facet_cities.id where neighborhood like '%' || :text || '%' order by neighborhood; title: Search neighborhoods description_html: |- <p>This demonstrates <em>basic</em> LIKE search The metadata.yml file is passed to Datasette using the same --metadata option: datasette fixtures.db --metadata metadata.yml | ["Metadata"] | [] |
contributing:contributing-using-fixtures | contributing | contributing-using-fixtures | Using fixtures | To run Datasette itself, type datasette . You're going to need at least one SQLite database. A quick way to get started is to use the fixtures database that Datasette uses for its own tests. You can create a copy of that database by running this command: python tests/fixtures.py fixtures.db Now you can run Datasette against the new fixtures database like so: datasette fixtures.db This will start a server at http://127.0.0.1:8001/ . Any changes you make in the datasette/templates or datasette/static folder will be picked up immediately (though you may need to do a force-refresh in your browser to see changes to CSS or JavaScript). If you want to change Datasette's Python code you can use the --reload option to cause Datasette to automatically reload any time the underlying code changes: datasette --reload fixtures.db You can also use the fixtures.py script to recreate the testing version of metadata.json used by the unit tests. To do that: python tests/fixtures.py fixtures.db fixtures-metadata.json Or to output the plugins used by the tests, run this: python tests/fixtures.py fixtures.db fixtures-metadata.json fixtures-plugins Test tables written to fixtures.db - metadata written to fixtures-metadata.json Wrote plugin: fixtures-plugins/register_output_renderer.py Wrote plugin: fixtures-plugins/view_name.py Wrote plugin: fixtures-plugins/my_plugin.py Wrote plugin: fixtures-plugins/messages_output_renderer.py Wrote plugin: fixtures-plugins/my_plugin_2.py Then run Datasette like this: datasette fixtures.db -m fixtures-metadata.json --plugins-dir=fixtures-plugins/ | ["Contributing"] | [] |
testing_plugins:testing-plugins-pdb | testing_plugins | testing-plugins-pdb | Using pdb for errors thrown inside Datasette | If an exception occurs within Datasette itself during a test, the response returned to your plugin will have a response.status_code value of 500. You can add pdb=True to the Datasette constructor to drop into a Python debugger session inside your test run instead of getting back a 500 response code. This is equivalent to running the datasette command-line tool with the --pdb option. Here's what that looks like in a test function: def test_that_opens_the_debugger_or_errors(): ds = Datasette([db_path], pdb=True) response = await ds.client.get("/") If you use this pattern you will need to run pytest with the -s option to avoid capturing stdin/stdout in order to interact with the debugger prompt. | ["Testing plugins"] | [] |
installation:installation-pip | installation | installation-pip | Using pip | Datasette requires Python 3.7 or higher. The Python.org Python For Beginners page has instructions for getting started. You can install Datasette and its dependencies using pip : pip install datasette You can now run Datasette like so: datasette | ["Installation", "Basic installation"] | [{"href": "https://www.python.org/about/gettingstarted/", "label": "Python.org Python For Beginners"}] |
installation:installation-pipx | installation | installation-pipx | Using pipx | pipx is a tool for installing Python software with all of its dependencies in an isolated environment, to ensure that they will not conflict with any other installed Python software. If you use Homebrew on macOS you can install pipx like this: brew install pipx pipx ensurepath Without Homebrew you can install it like so: python3 -m pip install --user pipx python3 -m pipx ensurepath The pipx ensurepath command configures your shell to ensure it can find commands that have been installed by pipx - generally by making sure ~/.local/bin has been added to your PATH . Once pipx is installed you can use it to install Datasette like this: pipx install datasette Then run datasette --version to confirm that it has been successfully installed. | ["Installation", "Advanced installation options"] | [{"href": "https://pipxproject.github.io/pipx/", "label": "pipx"}, {"href": "https://brew.sh/", "label": "Homebrew"}] |
testing_plugins:testing-plugins-fixtures | testing_plugins | testing-plugins-fixtures | Using pytest fixtures | Pytest fixtures can be used to create initial testable objects which can then be used by multiple tests. A common pattern for Datasette plugins is to create a fixture which sets up a temporary test database and wraps it in a Datasette instance. Here's an example that uses the sqlite-utils library to populate a temporary test database. It also sets the title of that table using a simulated metadata.json configuration: from datasette.app import Datasette import pytest import sqlite_utils @pytest.fixture(scope="session") def datasette(tmp_path_factory): db_directory = tmp_path_factory.mktemp("dbs") db_path = db_directory / "test.db" db = sqlite_utils.Database(db_path) db["dogs"].insert_all( [ {"id": 1, "name": "Cleo", "age": 5}, {"id": 2, "name": "Pancakes", "age": 4}, ], pk="id", ) datasette = Datasette( [db_path], metadata={ "databases": { "test": { "tables": { "dogs": {"title": "Some dogs"} } } } }, ) return datasette @pytest.mark.asyncio async def test_example_table_json(datasette): response = await datasette.client.get( "/test/dogs.json?_shape=array" ) assert response.status_code == 200 assert response.json() == [ {"id": 1, "name": "Cleo", "age": 5}, {"id": 2, "name": "Pancakes", "age": 4}, ] @pytest.mark.asyncio async def test_example_table_html(datasette): response = await datasette.client.get("/test/dogs") assert ">Some dogs</h1>" in response.text Here the datasette() function defines the fixture, which is than automatically passed to the two test functions based on pytest automatically matching their datasette function parameters. The @pytest.fixture(scope="session") line here ensures the fixture is reused for the full pytest execution session. This… | ["Testing plugins"] | [{"href": "https://docs.pytest.org/en/stable/fixture.html", "label": "Pytest fixtures"}, {"href": "https://sqlite-utils.datasette.io/en/stable/python-api.html", "label": "sqlite-utils library"}] |
settings:setting-publish-secrets | settings | setting-publish-secrets | Using secrets with datasette publish | The datasette publish and datasette package commands both generate a secret for you automatically when Datasette is deployed. This means that every time you deploy a new version of a Datasette project, a new secret will be generated. This will cause signed cookies to become invalid on every fresh deploy. You can fix this by creating a secret that will be used for multiple deploys and passing it using the --secret option: datasette publish cloudrun mydb.db --service=my-service --secret=cdb19e94283a20f9d42cca5 | ["Settings"] | [] |
authentication:authentication-root | authentication | authentication-root | Using the "root" actor | Datasette currently leaves almost all forms of authentication to plugins - datasette-auth-github for example. The one exception is the "root" account, which you can sign into while using Datasette on your local machine. This provides access to a small number of debugging features. To sign in as root, start Datasette using the --root command-line option, like this: $ datasette --root http://127.0.0.1:8001/-/auth-token?token=786fc524e0199d70dc9a581d851f466244e114ca92f33aa3b42a139e9388daa7 INFO: Started server process [25801] INFO: Waiting for application startup. INFO: Application startup complete. INFO: Uvicorn running on http://127.0.0.1:8001 (Press CTRL+C to quit) The URL on the first line includes a one-use token which can be used to sign in as the "root" actor in your browser. Click on that link and then visit http://127.0.0.1:8001/-/actor to confirm that you are authenticated as an actor that looks like this: { "id": "root" } | ["Authentication and permissions", "Actors"] | [{"href": "https://github.com/simonw/datasette-auth-github", "label": "datasette-auth-github"}] |
sql_queries:sql-views | sql_queries | sql-views | Views | If you want to bundle some pre-written SQL queries with your Datasette-hosted database you can do so in two ways. The first is to include SQL views in your database - Datasette will then list those views on your database index page. The quickest way to create views is with the SQLite command-line interface: $ sqlite3 sf-trees.db SQLite version 3.19.3 2017-06-27 16:48:08 Enter ".help" for usage hints. sqlite> CREATE VIEW demo_view AS select qSpecies from Street_Tree_List; <CTRL+D> | ["Running SQL queries"] | [] |
spatialite:spatialite-warning | spatialite | spatialite-warning | Warning | The SpatiaLite extension adds a large number of additional SQL functions , some of which are not be safe for untrusted users to execute: they may cause the Datasette server to crash. You should not expose a SpatiaLite-enabled Datasette instance to the public internet without taking extra measures to secure it against potentially harmful SQL queries. The following steps are recommended: Disable arbitrary SQL queries by untrusted users. See Controlling the ability to execute arbitrary SQL for ways to do this. The easiest is to start Datasette with the datasette --setting default_allow_sql off option. Define Canned queries with the SQL queries that use SpatiaLite functions that you want people to be able to execute. The Datasette SpatiaLite tutorial includes detailed instructions for running SpatiaLite safely using these techniques | ["SpatiaLite"] | [{"href": "https://www.gaia-gis.it/gaia-sins/spatialite-sql-5.0.1.html", "label": "a large number of additional SQL functions"}, {"href": "https://datasette.io/tutorials/spatialite", "label": "Datasette SpatiaLite tutorial"}] |
changelog:writable-canned-queries | changelog | writable-canned-queries | Writable canned queries | Datasette's Canned queries feature lets you define SQL queries in metadata.json which can then be executed by users visiting a specific URL. https://latest.datasette.io/fixtures/neighborhood_search for example. Canned queries were previously restricted to SELECT , but Datasette 0.44 introduces the ability for canned queries to execute INSERT or UPDATE queries as well, using the new "write": true property ( #800 ): { "databases": { "dogs": { "queries": { "add_name": { "sql": "INSERT INTO names (name) VALUES (:name)", "write": true } } } } } See Writable canned queries for more details. | ["Changelog", "0.44 (2020-06-11)"] | [{"href": "https://latest.datasette.io/fixtures/neighborhood_search", "label": "https://latest.datasette.io/fixtures/neighborhood_search"}, {"href": "https://github.com/simonw/datasette/issues/800", "label": "#800"}] |
sql_queries:canned-queries-writable | sql_queries | canned-queries-writable | Writable canned queries | Canned queries by default are read-only. You can use the "write": true key to indicate that a canned query can write to the database. See Controlling access to specific canned queries for details on how to add permission checks to canned queries, using the "allow" key. { "databases": { "mydatabase": { "queries": { "add_name": { "sql": "INSERT INTO names (name) VALUES (:name)", "write": true } } } } } This configuration will create a page at /mydatabase/add_name displaying a form with a name field. Submitting that form will execute the configured INSERT query. You can customize how Datasette represents success and errors using the following optional properties: on_success_message - the message shown when a query is successful on_success_redirect - the path or URL the user is redirected to on success on_error_message - the message shown when a query throws an error on_error_redirect - the path or URL the user is redirected to on error For example: { "databases": { "mydatabase": { "queries": { "add_name": { "sql": "INSERT INTO names (name) VALUES (:name)", "write": true, "on_success_message": "Name inserted", "on_success_redirect": "/mydatabase/names", "on_error_message": "Name insert failed", "on_error_redirect": "/mydatabase" } } } } } You can use "p… | ["Running SQL queries", "Canned queries"] | [] |
writing_plugins:writing-plugins-one-off | writing_plugins | writing-plugins-one-off | Writing one-off plugins | The quickest way to start writing a plugin is to create a my_plugin.py file and drop it into your plugins/ directory. Here is an example plugin, which adds a new custom SQL function called hello_world() which takes no arguments and returns the string Hello world! . from datasette import hookimpl @hookimpl def prepare_connection(conn): conn.create_function( "hello_world", 0, lambda: "Hello world!" ) If you save this in plugins/my_plugin.py you can then start Datasette like this: datasette serve mydb.db --plugins-dir=plugins/ Now you can navigate to http://localhost:8001/mydb and run this SQL: select hello_world(); To see the output of your plugin. | ["Writing plugins"] | [{"href": "http://localhost:8001/mydb", "label": "http://localhost:8001/mydb"}] |
writing_plugins:id1 | writing_plugins | id1 | Writing plugins | You can write one-off plugins that apply to just one Datasette instance, or you can write plugins which can be installed using pip and can be shipped to the Python Package Index ( PyPI ) for other people to install. Want to start by looking at an example? The Datasette plugins directory lists more than 90 open source plugins with code you can explore. The plugin hooks page includes links to example plugins for each of the documented hooks. | [] | [{"href": "https://pypi.org/", "label": "PyPI"}, {"href": "https://datasette.io/plugins", "label": "Datasette plugins directory"}] |
writing_plugins:writing-plugins-configuration | writing_plugins | writing-plugins-configuration | Writing plugins that accept configuration | When you are writing plugins, you can access plugin configuration like this using the datasette plugin_config() method. If you know you need plugin configuration for a specific table, you can access it like this: plugin_config = datasette.plugin_config( "datasette-cluster-map", database="sf-trees", table="Street_Tree_List" ) This will return the {"latitude_column": "lat", "longitude_column": "lng"} in the above example. If there is no configuration for that plugin, the method will return None . If it cannot find the requested configuration at the table layer, it will fall back to the database layer and then the root layer. For example, a user may have set the plugin configuration option like so: { "databases: { "sf-trees": { "plugins": { "datasette-cluster-map": { "latitude_column": "xlat", "longitude_column": "xlng" } } } } } In this case, the above code would return that configuration for ANY table within the sf-trees database. The plugin configuration could also be set at the top level of metadata.json : { "title": "This is the top-level title in metadata.json", "plugins": { "datasette-cluster-map": { "latitude_column": "xlat", "longitude_column": "xlng" } } } Now that datasette-cluster-map plugin configuration will apply to every table in every database. | ["Writing plugins"] | [] |
plugin_hooks:plugin-hook-actor-from-request | plugin_hooks | plugin-hook-actor-from-request | actor_from_request(datasette, request) | datasette - Datasette class You can use this to access plugin configuration options via datasette.plugin_config(your_plugin_name) , or to execute SQL queries. request - Request object The current HTTP request. This is part of Datasette's authentication and permissions system . The function should attempt to authenticate an actor (either a user or an API actor of some sort) based on information in the request. If it cannot authenticate an actor, it should return None . Otherwise it should return a dictionary representing that actor. Here's an example that authenticates the actor based on an incoming API key: from datasette import hookimpl import secrets SECRET_KEY = "this-is-a-secret" @hookimpl def actor_from_request(datasette, request): authorization = ( request.headers.get("authorization") or "" ) expected = "Bearer {}".format(SECRET_KEY) if secrets.compare_digest(authorization, expected): return {"id": "bot"} If you install this in your plugins directory you can test it like this: $ curl -H 'Authorization: Bearer this-is-a-secret' http://localhost:8003/-/actor.json Instead of returning a dictionary, this function can return an awaitable function which itself returns either None or a dictionary. This is useful for authentication functions that need to make a database query - for example: from datasette import hookimpl @hookimpl def actor_from_request(datasette, request): async def inner(): token = request.args.get("_token") if not token: return None # Look up ?_token=xxx in sessions table result = await datasette.get_database().execute( "select count(*) from sessions wher… | ["Plugin hooks"] | [{"href": "https://datasette.io/plugins/datasette-auth-tokens", "label": "datasette-auth-tokens"}] |
authentication:authentication-actor-matches-allow | authentication | authentication-actor-matches-allow | actor_matches_allow() | Plugins that wish to implement this same "allow" block permissions scheme can take advantage of the datasette.utils.actor_matches_allow(actor, allow) function: from datasette.utils import actor_matches_allow actor_matches_allow({"id": "root"}, {"id": "*"}) # returns True The currently authenticated actor is made available to plugins as request.actor . | ["Authentication and permissions"] | [] |
settings:setting-allow-csv-stream | settings | setting-allow-csv-stream | allow_csv_stream | Enables the CSV export feature where an entire table (potentially hundreds of thousands of rows) can be exported as a single CSV file. This is turned on by default - you can turn it off like this: datasette mydatabase.db --setting allow_csv_stream off | ["Settings", "Settings"] | [] |
settings:setting-allow-download | settings | setting-allow-download | allow_download | Should users be able to download the original SQLite database using a link on the database index page? This is turned on by default. However, databases can only be downloaded if they are served in immutable mode and not in-memory. If downloading is unavailable for either of these reasons, the download link is hidden even if allow_download is on. To disable database downloads, use the following: datasette mydatabase.db --setting allow_download off | ["Settings", "Settings"] | [] |
settings:setting-allow-facet | settings | setting-allow-facet | allow_facet | Allow users to specify columns they would like to facet on using the ?_facet=COLNAME URL parameter to the table view. This is enabled by default. If disabled, facets will still be displayed if they have been specifically enabled in metadata.json configuration for the table. Here's how to disable this feature: datasette mydatabase.db --setting allow_facet off | ["Settings", "Settings"] | [] |
plugin_hooks:plugin-asgi-wrapper | plugin_hooks | plugin-asgi-wrapper | asgi_wrapper(datasette) | Return an ASGI middleware wrapper function that will be applied to the Datasette ASGI application. This is a very powerful hook. You can use it to manipulate the entire Datasette response, or even to configure new URL routes that will be handled by your own custom code. You can write your ASGI code directly against the low-level specification, or you can use the middleware utilities provided by an ASGI framework such as Starlette . This example plugin adds a x-databases HTTP header listing the currently attached databases: from datasette import hookimpl from functools import wraps @hookimpl def asgi_wrapper(datasette): def wrap_with_databases_header(app): @wraps(app) async def add_x_databases_header( scope, receive, send ): async def wrapped_send(event): if event["type"] == "http.response.start": original_headers = ( event.get("headers") or [] ) event = { "type": event["type"], "status": event["status"], "headers": original_headers + [ [ b"x-databases", ", ".join( datasette.databases.keys() ).encode("utf-8"), ] ], } await send(event) await app(scope, receive, wrapped_send) return add_x_databases_header return wrap_with_databases_header Examples: datasette-cors , datasette-pyinstrument , datasette-total-page-time | ["Plugin hooks"] | [{"href": "https://asgi.readthedocs.io/", "label": "ASGI"}, {"href": "https://www.starlette.io/middleware/", "label": "Starlette"}, {"href": "https://datasette.io/plugins/datasette-cors", "label": "datasette-cors"}, {"href": "https://datasette.io/plugins/datasette-pyinstrument", "label": "datasette-pyinstrument"}, {"href": "https://datasette.io/plugins/datasette-total-page-time", "label": "datasette-total-page-time"}] |
internals:datasette-check-visibility | internals | datasette-check-visibility | await .check_visibility(actor, action=None, resource=None, permissions=None) | actor - dictionary The authenticated actor. This is usually request.actor . action - string, optional The name of the action that is being permission checked. resource - string or tuple, optional The resource, e.g. the name of the database, or a tuple of two strings containing the name of the database and the name of the table. Only some permissions apply to a resource. permissions - list of action strings or (action, resource) tuples, optional Provide this instead of action and resource to check multiple permissions at once. This convenience method can be used to answer the question "should this item be considered private, in that it is visible to me but it is not visible to anonymous users?" It returns a tuple of two booleans, (visible, private) . visible indicates if the actor can see this resource. private will be True if an anonymous user would not be able to view the resource. This example checks if the user can access a specific table, and sets private so that a padlock icon can later be displayed: visible, private = await self.ds.check_visibility( request.actor, action="view-table", resource=(database, table), ) The following example runs three checks in a row, similar to await .ensure_permissions(actor, permissions) . If any of the checks are denied before one of them is explicitly granted then visible will be F… | ["Internals for plugins", "Datasette class"] | [] |
internals:datasette-ensure-permissions | internals | datasette-ensure-permissions | await .ensure_permissions(actor, permissions) | actor - dictionary The authenticated actor. This is usually request.actor . permissions - list A list of permissions to check. Each permission in that list can be a string action name or a 2-tuple of (action, resource) . This method allows multiple permissions to be checked at once. It raises a datasette.Forbidden exception if any of the checks are denied before one of them is explicitly granted. This is useful when you need to check multiple permissions at once. For example, an actor should be able to view a table if either one of the following checks returns True or not a single one of them returns False : await self.ds.ensure_permissions( request.actor, [ ("view-table", (database, table)), ("view-database", database), "view-instance", ], ) | ["Internals for plugins", "Datasette class"] | [] |
internals:datasette-permission-allowed | internals | datasette-permission-allowed | await .permission_allowed(actor, action, resource=None, default=False) | actor - dictionary The authenticated actor. This is usually request.actor . action - string The name of the action that is being permission checked. resource - string or tuple, optional The resource, e.g. the name of the database, or a tuple of two strings containing the name of the database and the name of the table. Only some permissions apply to a resource. default - optional, True or False Should this permission check be default allow or default deny. Check if the given actor has permission to perform the given action on the given resource. Some permission checks are carried out against rules defined in metadata.json , while other custom permissions may be decided by plugins that implement the permission_allowed(datasette, actor, action, resource) plugin hook. If neither metadata.json nor any of the plugins provide an answer to the permission query the default argument will be returned. See Built-in permissions for a full list of permission actions included in Datasette core. | ["Internals for plugins", "Datasette class"] | [] |
internals:datasette-render-template | internals | datasette-render-template | await .render_template(template, context=None, request=None) | template - string, list of strings or jinja2.Template The template file to be rendered, e.g. my_plugin.html . Datasette will search for this file first in the --template-dir= location, if it was specified - then in the plugin's bundled templates and finally in Datasette's set of default templates. If this is a list of template file names then the first one that exists will be loaded and rendered. If this is a Jinja Template object it will be used directly. context - None or a Python dictionary The context variables to pass to the template. request - request object or None If you pass a Datasette request object here it will be made available to the template. Renders a Jinja template using Datasette's preconfigured instance of Jinja and returns the resulting string. The template will have access to Datasette's default template functions and any functions that have been made available by other plugins. | ["Internals for plugins", "Datasette class"] | [{"href": "https://jinja.palletsprojects.com/en/2.11.x/api/#jinja2.Template", "label": "Template object"}, {"href": "https://jinja.palletsprojects.com/en/2.11.x/", "label": "Jinja template"}] |
internals:database-execute | internals | database-execute | await db.execute(sql, ...) | Executes a SQL query against the database and returns the resulting rows (see Results ). sql - string (required) The SQL query to execute. This can include ? or :named parameters. params - list or dict A list or dictionary of values to use for the parameters. List for ? , dictionary for :named . truncate - boolean Should the rows returned by the query be truncated at the maximum page size? Defaults to True , set this to False to disable truncation. custom_time_limit - integer ms A custom time limit for this query. This can be set to a lower value than the Datasette configured default. If a query takes longer than this it will be terminated early and raise a dataette.database.QueryInterrupted exception. page_size - integer Set a custom page size for truncation, over-riding the configured Datasette default. log_sql_errors - boolean Should any SQL errors be logged to the console in addition to being raised as an error? Defaults to True . | ["Internals for plugins", "Database class"] | [] |
internals:database-execute-fn | internals | database-execute-fn | await db.execute_fn(fn) | Executes a given callback function against a read-only database connection running in a thread. The function will be passed a SQLite connection, and the return value from the function will be returned by the await . Example usage: def get_version(conn): return conn.execute( "select sqlite_version()" ).fetchall()[0][0] version = await db.execute_fn(get_version) | ["Internals for plugins", "Database class"] | [] |
internals:database-execute-write | internals | database-execute-write | await db.execute_write(sql, params=None, block=True) | SQLite only allows one database connection to write at a time. Datasette handles this for you by maintaining a queue of writes to be executed against a given database. Plugins can submit write operations to this queue and they will be executed in the order in which they are received. This method can be used to queue up a non-SELECT SQL query to be executed against a single write connection to the database. You can pass additional SQL parameters as a tuple or dictionary. The method will block until the operation is completed, and the return value will be the return from calling conn.execute(...) using the underlying sqlite3 Python library. If you pass block=False this behaviour changes to "fire and forget" - queries will be added to the write queue and executed in a separate thread while your code can continue to do other things. The method will return a UUID representing the queued task. | ["Internals for plugins", "Database class"] | [] |
internals:database-execute-write-fn | internals | database-execute-write-fn | await db.execute_write_fn(fn, block=True) | This method works like .execute_write() , but instead of a SQL statement you give it a callable Python function. Your function will be queued up and then called when the write connection is available, passing that connection as the argument to the function. The function can then perform multiple actions, safe in the knowledge that it has exclusive access to the single writable connection for as long as it is executing. fn needs to be a regular function, not an async def function. For example: def delete_and_return_count(conn): conn.execute("delete from some_table where id > 5") return conn.execute( "select count(*) from some_table" ).fetchone()[0] try: num_rows_left = await database.execute_write_fn( delete_and_return_count ) except Exception as e: print("An error occurred:", e) The value returned from await database.execute_write_fn(...) will be the return value from your function. If your function raises an exception that exception will be propagated up to the await line. If you specify block=False the method becomes fire-and-forget, queueing your function to be executed and then allowing your code after the call to .execute_write_fn() to continue running while the underlying thread waits for an opportunity to run your function. A UUID representing the queued task will be returned. Any exceptions in your code will be silently swallowed. | ["Internals for plugins", "Database class"] | [] |
internals:database-execute-write-many | internals | database-execute-write-many | await db.execute_write_many(sql, params_seq, block=True) | Like execute_write() but uses the sqlite3 conn.executemany() method. This will efficiently execute the same SQL statement against each of the parameters in the params_seq iterator, for example: await db.execute_write_many( "insert into characters (id, name) values (?, ?)", [(1, "Melanie"), (2, "Selma"), (2, "Viktor")], ) | ["Internals for plugins", "Database class"] | [{"href": "https://docs.python.org/3/library/sqlite3.html#sqlite3.Cursor.executemany", "label": "conn.executemany()"}] |
internals:database-execute-write-script | internals | database-execute-write-script | await db.execute_write_script(sql, block=True) | Like execute_write() but can be used to send multiple SQL statements in a single string separated by semicolons, using the sqlite3 conn.executescript() method. | ["Internals for plugins", "Database class"] | [{"href": "https://docs.python.org/3/library/sqlite3.html#sqlite3.Cursor.executescript", "label": "conn.executescript()"}] |
Advanced export
JSON shape: default, array, newline-delimited, object
CREATE TABLE [sections] ( [id] TEXT PRIMARY KEY, [page] TEXT, [ref] TEXT, [title] TEXT, [content] TEXT, [breadcrumbs] TEXT, [references] TEXT );