Authentication and permissions

Datasette does not require authentication by default. Any visitor to a Datasette instance can explore the full data and execute read-only SQL queries. Datasette's plugin system can be used to add many different styles of authentication, such as user accounts, single sign-on or API keys.

Actors

Through plugins, Datasette can support both authenticated users (with cookies) and authenticated API agents (via authentication tokens). The word "actor" is used to cover both of these cases. Every request to Datasette has an associated actor value, available in the code as request.actor. This can be None for unauthenticated requests, or a JSON-compatible Python dictionary for authenticated users or API agents.

The actor dictionary can be any shape - the design of that data structure is left up to the plugins. A useful convention is to include an "id" string, as demonstrated by the "root" actor below.

Plugins can use the actor_from_request(datasette, request) hook to implement custom logic for authenticating an actor based on the incoming HTTP request.

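As a concrete illustration, here is a minimal sketch of a plugin implementing that hook. The hard-coded bearer token and the "bot" actor are hypothetical - a real plugin would check a secret store, database or session system instead:

```python
from datasette import hookimpl


@hookimpl
def actor_from_request(datasette, request):
    # Hypothetical check: treat requests bearing a known token as the
    # "bot" actor. Returning None means "no actor authenticated here",
    # which lets other plugins take their turn.
    if request.headers.get("authorization") == "Bearer secret-token":
        return {"id": "bot"}
    return None
```
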
Using the "root" actor

Datasette currently leaves almost all forms of authentication to plugins - datasette-auth-github for example. The one exception is the "root" account, which you can sign into while using Datasette on your local machine. This provides access to a small number of debugging features. To sign in as root, start Datasette using the --root command-line option, like this:

```
$ datasette --root
http://127.0.0.1:8001/-/auth-token?token=786fc524e0199d70dc9a581d851f466244e114ca92f33aa3b42a139e9388daa7
INFO:     Started server process [25801]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8001 (Press CTRL+C to quit)
```

The URL on the first line includes a one-use token which can be used to sign in as the "root" actor in your browser. Click on that link and then visit http://127.0.0.1:8001/-/actor to confirm that you are authenticated as an actor that looks like this:

```json
{
    "id": "root"
}
```

Permissions

Datasette has an extensive permissions system built-in, which can be further extended and customized by plugins. The key question the permissions system answers is this:

Is this actor allowed to perform this action, optionally against this particular resource?

Actors are described above. An action is a string describing the action the actor would like to perform. A full list is provided below - examples include view-table and execute-sql. A resource is the item the actor wishes to interact with - for example a specific database or table. Some actions, such as permissions-debug, are not associated with a particular resource.

Datasette's built-in view permissions (view-database, view-table etc) default to allow - unless you configure additional permission rules, unauthenticated users will be allowed to access content. Permissions with potentially harmful effects should default to deny. Plugin authors should account for this when designing new plugins - for example, the datasette-upload-csvs plugin defaults to deny so that installations don't accidentally allow unauthenticated users to create new tables by uploading a CSV file.

Defining permissions with "allow" blocks

The standard way to define permissions in Datasette is to use an "allow" block. This is a JSON document describing which actors are allowed to perform a permission. The most basic form of allow block is this (allow demo, deny demo):

```json
{
    "allow": {
        "id": "root"
    }
}
```

This will match any actors with an "id" property of "root" - for example, an actor that looks like this:

```json
{
    "id": "root",
    "name": "Root User"
}
```

An allow block can specify "deny all" using false (demo):

```json
{
    "allow": false
}
```

An "allow" of true allows all access (demo):

```json
{
    "allow": true
}
```

Allow keys can provide a list of values. These will match any actor that has any of those values (allow demo, deny demo):

```json
{
    "allow": {
        "id": ["simon", "cleopaws"]
    }
}
```

This will match any actor with an "id" of either "simon" or "cleopaws".

Actors can have properties that feature a list of values. These will be matched against the list of values in an allow block. Consider the following actor:

```json
{
    "id": "simon",
    "roles": ["staff", "developer"]
}
```

This allow block will provide access to any actor that has "developer" as one of their roles (allow demo, deny demo):

```json
{
    "allow": {
        "roles": ["developer"]
    }
}
```

Note that "roles" is not a concept that is baked into Datasette - it's a convention that plugins can choose to implement and act on.

If you want to provide access to any actor with a value for a specific key, use "*". For example, to match any logged-in user specify the following (allow demo, deny demo):

```json
{
    "allow": {
        "id": "*"
    }
}
```

You can specify that only unauthenticated actors (from anonymous HTTP requests) should be all…

The /-/allow-debug tool

The /-/allow-debug tool lets you try out different "allow" blocks against different "actor" JSON objects. You can try that out here: https://latest.datasette.io/-/allow-debug

Configuring permissions in metadata.json

You can limit who is allowed to view different parts of your Datasette instance using "allow" keys in your Metadata configuration. You can control the following:

- Access to the entire Datasette instance
- Access to specific databases
- Access to specific tables and views
- Access to specific canned queries

If a user cannot access a specific database, they will not be able to access tables, views or queries within that database. If a user cannot access the instance they will not be able to access any of the databases, tables, views or queries.

Controlling access to an instance

Here's how to restrict access to your entire Datasette instance to just the "id": "root" user:

```json
{
    "title": "My private Datasette instance",
    "allow": {
        "id": "root"
    }
}
```

To deny access to all users, you can use "allow": false:

```json
{
    "title": "My entirely inaccessible instance",
    "allow": false
}
```

One reason to do this is if you are using a Datasette plugin - such as datasette-permissions-sql - to control permissions instead.

Controlling access to specific databases

To limit access to a specific private.db database to just authenticated users, use the "allow" block like this:

```json
{
    "databases": {
        "private": {
            "allow": {
                "id": "*"
            }
        }
    }
}
```

Controlling access to specific tables and views

To limit access to the users table in your bakery.db database:

```json
{
    "databases": {
        "bakery": {
            "tables": {
                "users": {
                    "allow": {
                        "id": "*"
                    }
                }
            }
        }
    }
}
```

This works for SQL views as well - you can list their names in the "tables" block above in the same way as regular tables.

Restricting access to tables and views in this way will NOT prevent users from querying them using arbitrary SQL queries, like this for example. If you are restricting access to specific tables you should also use the "allow_sql" block to prevent users from bypassing the limit with their own SQL queries - see Controlling the ability to execute arbitrary SQL.

Controlling access to specific canned queries

Canned queries allow you to configure named SQL queries in your metadata.json that can be executed by users. These queries can be set up to both read and write to the database, so controlling who can execute them can be important. To limit access to the add_name canned query in your dogs.db database to just the root user:

```json
{
    "databases": {
        "dogs": {
            "queries": {
                "add_name": {
                    "sql": "INSERT INTO names (name) VALUES (:name)",
                    "write": true,
                    "allow": {
                        "id": ["root"]
                    }
                }
            }
        }
    }
}
```

Controlling the ability to execute arbitrary SQL

Datasette defaults to allowing any site visitor to execute their own custom SQL queries, for example using the form on the database page or by appending a ?_where= parameter to the table page like this. Access to this ability is controlled by the execute-sql permission.

The easiest way to disable arbitrary SQL queries is using the default_allow_sql setting when you first start Datasette running. You can alternatively use an "allow_sql" block to control who is allowed to execute arbitrary SQL queries.

To prevent any user from executing arbitrary SQL queries, use this:

```json
{
    "allow_sql": false
}
```

To enable just the root user to execute SQL for all databases in your instance, use the following:

```json
{
    "allow_sql": {
        "id": "root"
    }
}
```

To limit this ability for just one specific database, use this:

```json
{
    "databases": {
        "mydatabase": {
            "allow_sql": {
                "id": "root"
            }
        }
    }
}
```

Checking permissions in plugins

Datasette plugins can check if an actor has permission to perform an action using the datasette.permission_allowed(...) method. Datasette core performs a number of permission checks, documented below. Plugins can implement the permission_allowed(datasette, actor, action, resource) plugin hook to participate in decisions about whether an actor should be able to perform a specified action.

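A minimal sketch of both sides of that API. The view function, the "mydatabase" resource and the "developer" role convention here are hypothetical, but the method and hook names are the documented ones:

```python
from datasette import hookimpl
from datasette.utils.asgi import Forbidden


async def my_view(datasette, request):
    # Ask the permissions system a question; default=True mirrors the
    # documented default-allow behaviour of execute-sql.
    if not await datasette.permission_allowed(
        request.actor, "execute-sql", resource="mydatabase", default=True
    ):
        raise Forbidden("execute-sql denied")


@hookimpl
def permission_allowed(actor, action, resource):
    # Participate in the decision: deny execute-sql unless the actor
    # has the (conventional, not built-in) "developer" role.
    if action == "execute-sql":
        if not actor or "developer" not in (actor.get("roles") or []):
            return False
    return None  # None means "no opinion" - defer to other plugins
```
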
actor_matches_allow()

Plugins that wish to implement this same "allow" block permissions scheme can take advantage of the datasette.utils.actor_matches_allow(actor, allow) function:

```python
from datasette.utils import actor_matches_allow

actor_matches_allow({"id": "root"}, {"id": "*"})
# returns True
```

The currently authenticated actor is made available to plugins as request.actor.

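The same function applies the list-matching rules described under "allow" blocks above - a couple of sketched calls re-using the example actors from that section:

```python
from datasette.utils import actor_matches_allow

# A list in the allow block matches any actor with one of those ids:
actor_matches_allow({"id": "simon"}, {"id": ["simon", "cleopaws"]})
# returns True

# List-valued actor properties are matched against allow-block lists:
actor_matches_allow(
    {"id": "simon", "roles": ["staff", "developer"]},
    {"roles": ["developer"]},
)
# returns True
```
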
The permissions debug tool

The debug tool at /-/permissions is only available to the authenticated root user (or any actor granted the permissions-debug action according to a plugin). It shows the thirty most recent permission checks that have been carried out by the Datasette instance. This is designed to help administrators and plugin authors understand exactly how permission checks are being carried out, in order to effectively configure Datasette's permission system.

The ds_actor cookie

Datasette includes a default authentication plugin which looks for a signed ds_actor cookie containing a JSON actor dictionary. This is how the root actor mechanism works. Authentication plugins can set signed ds_actor cookies themselves like so:

```python
response = Response.redirect("/")
response.set_cookie(
    "ds_actor",
    datasette.sign({"a": {"id": "cleopaws"}}, "actor"),
)
```

Note that you need to pass "actor" as the namespace to .sign(value, namespace="default").

The shape of data encoded in the cookie is as follows:

```json
{
    "a": {... actor ...}
}
```

Including an expiry time

ds_actor cookies can optionally include a signed expiry timestamp, after which the cookies will no longer be valid. Authentication plugins may choose to use this mechanism to limit the lifetime of the cookie. For example, if a plugin implements single sign-on against another source it may decide to set short-lived cookies so that if the user is removed from the SSO system their existing Datasette cookies will stop working shortly afterwards.

To include an expiry, add an "e" key to the cookie value containing a base62-encoded integer representing the timestamp when the cookie should expire. For example, here's how to set a cookie that expires after 24 hours:

```python
import time
from datasette.utils import baseconv

expires_at = int(time.time()) + (24 * 60 * 60)

response = Response.redirect("/")
response.set_cookie(
    "ds_actor",
    datasette.sign(
        {
            "a": {"id": "cleopaws"},
            "e": baseconv.base62.encode(expires_at),
        },
        "actor",
    ),
)
```

The resulting cookie will encode data that looks something like this:

```json
{
    "a": {
        "id": "cleopaws"
    },
    "e": "1jjSji"
}
```

The /-/logout page

The page at /-/logout provides the ability to log out of a ds_actor cookie authentication session.

Built-in permissions

This section lists all of the permission checks that are carried out by Datasette core, along with the resource if it was passed.

view-instance

Top level permission - Actor is allowed to view any pages within this instance, starting at https://latest.datasette.io/

Default allow.

view-database

Actor is allowed to view a database page, e.g. https://latest.datasette.io/fixtures

resource - string: the name of the database

Default allow.

view-database-download

Actor is allowed to download a database, e.g. https://latest.datasette.io/fixtures.db

resource - string: the name of the database

Default allow.

view-table

Actor is allowed to view a table (or view) page, e.g. https://latest.datasette.io/fixtures/complex_foreign_keys

resource - tuple of (string, string): the name of the database, then the name of the table

Default allow.

view-query

Actor is allowed to view (and execute) a canned query page, e.g. https://latest.datasette.io/fixtures/pragma_cache_size - this includes executing Writable canned queries.

resource - tuple of (string, string): the name of the database, then the name of the canned query

Default allow.

execute-sql

Actor is allowed to run arbitrary SQL queries against a specific database, e.g. https://latest.datasette.io/fixtures?sql=select+100

resource - string: the name of the database

Default allow. See also the default_allow_sql setting.

permissions-debug

Actor is allowed to view the /-/permissions debug page.

Default deny.

debug-menu

Controls if the various debug pages are displayed in the navigation menu.

Default deny.

Settings

Using --setting

Datasette supports a number of settings. These can be set using the --setting name value option to datasette serve. You can set multiple settings at once like this:

```
datasette mydatabase.db \
    --setting default_page_size 50 \
    --setting sql_time_limit_ms 3500 \
    --setting max_returned_rows 2000
```

Configuration directory mode

Normally you configure Datasette using command-line options. For a Datasette instance with custom templates, custom plugins, a static directory and several databases this can get quite verbose:

```
$ datasette one.db two.db \
    --metadata=metadata.json \
    --template-dir=templates/ \
    --plugins-dir=plugins \
    --static css:css
```

As an alternative to this, you can run Datasette in configuration directory mode. Create a directory with the following structure:

```
# In a directory called my-app:
my-app/one.db
my-app/two.db
my-app/metadata.json
my-app/templates/index.html
my-app/plugins/my_plugin.py
my-app/static/my.css
```

Now start Datasette by providing the path to that directory:

```
$ datasette my-app/
```

Datasette will detect the files in that directory and automatically configure itself using them. It will serve all *.db files that it finds, will load metadata.json if it exists, and will load the templates, plugins and static folders if they are present.

The files that can be included in this directory are as follows. All are optional.

- *.db (or *.sqlite3 or *.sqlite) - SQLite database files that will be served by Datasette
- metadata.json - Metadata for those databases - metadata.yaml or metadata.yml can be used as well
- inspect-data.json - the result of running datasette inspect *.db --inspect-file=inspect-data.json from the configuration directory - any database files listed here will be treated as immutable, so they should not be changed while Datasette is running
- settings.json - settings that would normally be passed using --setting - here they should be stored as a JSON object of key/value pairs
- templates/ - a di…

Settings

The following options can be set using --setting name value, or by storing them in the settings.json file for use with Configuration directory mode.

default_allow_sql

Should users be able to execute arbitrary SQL queries by default? Setting this to off causes permission checks for execute-sql to fail by default.

```
datasette mydatabase.db --setting default_allow_sql off
```

This is one of two ways to achieve this - the other is to add "allow_sql": false to your metadata.json file, as described in Controlling the ability to execute arbitrary SQL. This setting offers a more convenient approach.

default_page_size

The default number of rows returned by the table page. You can over-ride this on a per-page basis using the ?_size=80 query string parameter, provided you do not specify a value higher than the max_returned_rows setting. You can set this default using --setting like so:

```
datasette mydatabase.db --setting default_page_size 50
```

sql_time_limit_ms

By default, queries have a time limit of one second. If a query takes longer than this to run Datasette will terminate the query and return an error. If this time limit is too short for you, you can customize it using the sql_time_limit_ms limit - for example, to increase it to 3.5 seconds:

```
datasette mydatabase.db --setting sql_time_limit_ms 3500
```

You can optionally set a lower time limit for an individual query using the ?_timelimit=100 query string argument:

```
/my-database/my-table?qSpecies=44&_timelimit=100
```

This would set the time limit to 100ms for that specific query. This feature is useful if you are working with databases of unknown size and complexity - a query that might make perfect sense for a smaller table could take too long to execute on a table with millions of rows. By setting custom time limits you can execute queries "optimistically" - e.g. give me an exact count of rows matching this query but only if it takes less than 100ms to calculate.

max_returned_rows

Datasette returns a maximum of 1,000 rows of data at a time. If you execute a query that returns more than 1,000 rows, Datasette will return the first 1,000 and include a warning that the result set has been truncated. You can use OFFSET/LIMIT or other methods in your SQL to implement pagination if you need to return more than 1,000 rows. You can increase or decrease this limit like so:

```
datasette mydatabase.db --setting max_returned_rows 2000
```

num_sql_threads

Maximum number of threads in the thread pool Datasette uses to execute SQLite queries. Defaults to 3.

```
datasette mydatabase.db --setting num_sql_threads 10
```

Setting this to 0 turns off threaded SQL queries entirely - useful for environments that do not support threading such as Pyodide.

allow_facet

Allow users to specify columns they would like to facet on using the ?_facet=COLNAME URL parameter to the table view. This is enabled by default. If disabled, facets will still be displayed if they have been specifically enabled in metadata.json configuration for the table. Here's how to disable this feature:

```
datasette mydatabase.db --setting allow_facet off
```

default_facet_size

The default number of unique rows returned by Facets is 30. You can customize it like this:

```
datasette mydatabase.db --setting default_facet_size 50
```

facet_time_limit_ms

This is the time limit Datasette allows for calculating a facet, which defaults to 200ms:

```
datasette mydatabase.db --setting facet_time_limit_ms 1000
```

facet_suggest_time_limit_ms

When Datasette calculates suggested facets it needs to run a SQL query for every column in your table. The default for this time limit is 50ms to account for the fact that it needs to run once for every column. If the time limit is exceeded the column will not be suggested as a facet. You can increase this time limit like so:

```
datasette mydatabase.db --setting facet_suggest_time_limit_ms 500
```

suggest_facets

Should Datasette calculate suggested facets? On by default, turn this off like so:

```
datasette mydatabase.db --setting suggest_facets off
```

allow_download

Should users be able to download the original SQLite database using a link on the database index page? This is turned on by default. However, databases can only be downloaded if they are served in immutable mode and not in-memory. If downloading is unavailable for either of these reasons, the download link is hidden even if allow_download is on. To disable database downloads, use the following:

```
datasette mydatabase.db --setting allow_download off
```

default_cache_ttl

Default HTTP caching max-age header in seconds, used for Cache-Control: max-age=X. Can be over-ridden on a per-request basis using the ?_ttl= query string parameter. Set this to 0 to disable HTTP caching entirely. Defaults to 5 seconds.

```
datasette mydatabase.db --setting default_cache_ttl 60
```

cache_size_kb

Sets the amount of memory SQLite uses for its per-connection cache, in KB.

```
datasette mydatabase.db --setting cache_size_kb 5000
```

allow_csv_stream

Enables the CSV export feature where an entire table (potentially hundreds of thousands of rows) can be exported as a single CSV file. This is turned on by default - you can turn it off like this:

```
datasette mydatabase.db --setting allow_csv_stream off
```

max_csv_mb

The maximum size of CSV that can be exported, in megabytes. Defaults to 100MB. You can disable the limit entirely by setting this to 0:

```
datasette mydatabase.db --setting max_csv_mb 0
```

truncate_cells_html

In the HTML table view, truncate any strings that are longer than this value. The full value will still be available in CSV, JSON and on the individual row HTML page. Set this to 0 to disable truncation.

```
datasette mydatabase.db --setting truncate_cells_html 0
```

force_https_urls

Forces self-referential URLs in the JSON output to always use the https:// protocol. This is useful for cases where the application itself is hosted using HTTP but is served to the outside world via a proxy that enables HTTPS.

```
datasette mydatabase.db --setting force_https_urls 1
```

template_debug

This setting enables template context debug mode, which is useful to help understand what variables are available to custom templates when you are writing them. Enable it like this:

```
datasette mydatabase.db --setting template_debug 1
```

Now you can add ?_context=1 or &_context=1 to any Datasette page to see the context that was passed to that template. Some examples:

- https://latest.datasette.io/?_context=1
- https://latest.datasette.io/fixtures?_context=1
- https://latest.datasette.io/fixtures/roadside_attractions?_context=1

trace_debug

This setting enables appending ?_trace=1 to any page in order to see the SQL queries and other trace information that was used to generate that page. Enable it like this:

```
datasette mydatabase.db --setting trace_debug 1
```

Some examples:

- https://latest.datasette.io/?_trace=1
- https://latest.datasette.io/fixtures/roadside_attractions?_trace=1

See datasette.tracer for details on how to hook into this mechanism as a plugin author.

base_url

If you are running Datasette behind a proxy, it may be useful to change the root path used for the Datasette instance. For example, if you are sending traffic from https://www.example.com/tools/datasette/ through to a proxied Datasette instance you may wish Datasette to use /tools/datasette/ as its root URL. You can do that like so:

```
datasette mydatabase.db --setting base_url /tools/datasette/
```

Configuring the secret

Datasette uses a secret string to sign secure values such as cookies. If you do not provide a secret, Datasette will create one when it starts up. This secret will reset every time the Datasette server restarts though, so things like authentication cookies will not stay valid between restarts.

You can pass a secret to Datasette in two ways: with the --secret command-line option or by setting a DATASETTE_SECRET environment variable.

```
$ datasette mydb.db --secret=SECRET_VALUE_HERE
```

Or:

```
$ export DATASETTE_SECRET=SECRET_VALUE_HERE
$ datasette mydb.db
```

One way to generate a secure random secret is to use Python like this:

```
$ python3 -c 'import secrets; print(secrets.token_hex(32))'
cdb19e94283a20f9d42cca50c5a4871c0aa07392db308755d60a1a5b9bb0fa52
```

Plugin authors make use of this signing mechanism in their plugins using .sign(value, namespace="default") and .unsign(value, namespace="default").

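A minimal sketch of that signing API as a plugin might use it, assuming a datasette instance is already in scope (for example inside a plugin hook); the "my-plugin" namespace is hypothetical:

```python
# Sign a value under a plugin-specific namespace...
token = datasette.sign({"id": "simon"}, namespace="my-plugin")

# ...and later verify and decode it. Unsigning fails loudly if the
# value was tampered with or was signed under a different namespace.
value = datasette.unsign(token, namespace="my-plugin")
assert value == {"id": "simon"}
```
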
Using secrets with datasette publish

The datasette publish and datasette package commands both generate a secret for you automatically when Datasette is deployed. This means that every time you deploy a new version of a Datasette project, a new secret will be generated. This will cause signed cookies to become invalid on every fresh deploy. You can fix this by creating a secret that will be used for multiple deploys and passing it using the --secret option:

```
datasette publish cloudrun mydb.db --service=my-service --secret=cdb19e94283a20f9d42cca5
```

Metadata

Data loves metadata. Any time you run Datasette you can optionally include a JSON file with metadata about your databases and tables. Datasette will then display that information in the web UI.

Run Datasette like this:

```
datasette database1.db database2.db --metadata metadata.json
```

Your metadata.json file can look something like this:

```json
{
    "title": "Custom title for your index page",
    "description": "Some description text can go here",
    "license": "ODbL",
    "license_url": "https://opendatacommons.org/licenses/odbl/",
    "source": "Original Data Source",
    "source_url": "http://example.com/"
}
```

You can optionally use YAML instead of JSON, see Using YAML for metadata.

The above metadata will be displayed on the index page of your Datasette-powered site. The source and license information will also be included in the footer of every page served by Datasette.

Any special HTML characters in description will be escaped. If you want to include HTML in your description, you can use a description_html property instead.

Per-database and per-table metadata

Metadata at the top level of the JSON will be shown on the index page and in the footer on every page of the site. The license and source are expected to apply to all of your data.

You can also provide metadata at the per-database or per-table level, like this:

```json
{
    "databases": {
        "database1": {
            "source": "Alternative source",
            "source_url": "http://example.com/",
            "tables": {
                "example_table": {
                    "description_html": "Custom <em>table</em> description",
                    "license": "CC BY 3.0 US",
                    "license_url": "https://creativecommons.org/licenses/by/3.0/us/"
                }
            }
        }
    }
}
```

Each of the top-level metadata fields can be used at the database and table level.

Source, license and about

The three visible metadata fields you can apply to everything, specific databases or specific tables are source, license and about. All three are optional.

- source and source_url should be used to indicate where the underlying data came from.
- license and license_url should be used to indicate the license under which the data can be used.
- about and about_url can be used to link to further information about the project - an accompanying blog entry for example.

For each of these you can provide just the *_url field and Datasette will treat that as the default link label text and display the URL directly on the page.

Column descriptions

You can include descriptions for your columns by adding a "columns": {"name-of-column": "description-of-column"} block to your table metadata:

```json
{
    "databases": {
        "database1": {
            "tables": {
                "example_table": {
                    "columns": {
                        "column1": "Description of column 1",
                        "column2": "Description of column 2"
                    }
                }
            }
        }
    }
}
```

These will be displayed at the top of the table page, and will also show in the cog menu for each column. You can see an example of how these look at latest.datasette.io/fixtures/roadside_attractions.

Specifying units for a column

Datasette supports attaching units to a column, which will be used when displaying values from that column. SI prefixes will be used where appropriate. Column units are configured in the metadata like so:

```json
{
    "databases": {
        "database1": {
            "tables": {
                "example_table": {
                    "units": {
                        "column1": "metres",
                        "column2": "Hz"
                    }
                }
            }
        }
    }
}
```

Units are interpreted using Pint, and you can see the full list of available units in Pint's unit registry. You can also add custom units to the metadata, which will be registered with Pint:

```json
{
    "custom_units": [
        "decibel = [] = dB"
    ]
}
```

Setting a default sort order

By default Datasette tables are sorted by primary key. You can over-ride this default for a specific table using the "sort" or "sort_desc" metadata properties:

```json
{
    "databases": {
        "mydatabase": {
            "tables": {
                "example_table": {
                    "sort": "created"
                }
            }
        }
    }
}
```

Or use "sort_desc" to sort in descending order:

```json
{
    "databases": {
        "mydatabase": {
            "tables": {
                "example_table": {
                    "sort_desc": "created"
                }
            }
        }
    }
}
```

Setting a custom page size

Datasette defaults to displaying 100 rows per page, for both tables and views. You can change this default page size on a per-table or per-view basis using the "size" key in metadata.json:

```json
{
    "databases": {
        "mydatabase": {
            "tables": {
                "example_table": {
                    "size": 10
                }
            }
        }
    }
}
```

This size can still be over-ridden by passing e.g. ?_size=50 in the query string.

Setting which columns can be used for sorting

Datasette allows any column to be used for sorting by default. If you need to control which columns are available for sorting you can do so using the optional sortable_columns key:

```json
{
    "databases": {
        "database1": {
            "tables": {
                "example_table": {
                    "sortable_columns": [
                        "height",
                        "weight"
                    ]
                }
            }
        }
    }
}
```

This will restrict sorting of example_table to just the height and weight columns. You can also disable sorting entirely by setting "sortable_columns": []

You can use sortable_columns to enable specific sort orders for a view called name_of_view in the database my_database like so:

```json
{
    "databases": {
        "my_database": {
            "tables": {
                "name_of_view": {
                    "sortable_columns": [
                        "clicks",
                        "impressions"
                    ]
                }
            }
        }
    }
}
```

Specifying the label column for a table

Datasette's HTML interface attempts to display foreign key references as labelled hyperlinks. By default, it looks for referenced tables that only have two columns: a primary key column and one other. It assumes that the second column should be used as the link label. If your table has more than two columns you can specify which column should be used for the link label with the label_column property:

```json
{
    "databases": {
        "database1": {
            "tables": {
                "example_table": {
                    "label_column": "title"
                }
            }
        }
    }
}
```

Hiding tables

You can hide tables from the database listing view (in the same way that FTS and SpatiaLite tables are automatically hidden) using "hidden": true:

```json
{
    "databases": {
        "database1": {
            "tables": {
                "example_table": {
                    "hidden": true
                }
            }
        }
    }
}
```

Using YAML for metadata

Datasette accepts YAML as an alternative to JSON for your metadata configuration file. YAML is particularly useful for including multiline HTML and SQL strings. Here's an example of a metadata.yml file, re-using an example from Canned queries.

```yaml
title: Demonstrating Metadata from YAML
description_html: |-
  <p>This description includes a long HTML string</p>
  <ul>
    <li>YAML is better for embedding HTML strings than JSON!</li>
  </ul>
license: ODbL
license_url: https://opendatacommons.org/licenses/odbl/
databases:
  fixtures:
    tables:
      no_primary_key:
        hidden: true
    queries:
      neighborhood_search:
        sql: |-
          select neighborhood, facet_cities.name, state
          from facetable
            join facet_cities on facetable.city_id = facet_cities.id
          where neighborhood like '%' || :text || '%'
          order by neighborhood;
        title: Search neighborhoods
        description_html: |-
          <p>This demonstrates <em>basic</em> LIKE search
```

The metadata.yml file is passed to Datasette using the same --metadata option:

```
datasette fixtures.db --metadata metadata.yml
```

Introspection

Datasette includes some pages and JSON API endpoints for introspecting the current instance. These can be used to understand some of the internals of Datasette and to see how a particular instance has been configured. Each of these pages can be viewed in your browser. Add .json to the URL to get back the contents as JSON.

/-/metadata

Shows the contents of the metadata.json file that was passed to datasette serve, if any. Metadata example:

```json
{
    "license": "CC Attribution 4.0 License",
    "license_url": "http://creativecommons.org/licenses/by/4.0/",
    "source": "fivethirtyeight/data on GitHub",
    "source_url": "https://github.com/fivethirtyeight/data",
    "title": "Five Thirty Eight",
    "databases": {}
}
```

/-/versions

Shows the version of Datasette, Python and SQLite. Versions example:

```json
{
    "datasette": {
        "version": "0.60"
    },
    "python": {
        "full": "3.8.12 (default, Dec 21 2021, 10:45:09) \n[GCC 10.2.1 20210110]",
        "version": "3.8.12"
    },
    "sqlite": {
        "extensions": {
            "json1": null
        },
        "fts_versions": [
            "FTS5",
            "FTS4",
            "FTS3"
        ],
        "compile_options": [
            "COMPILER=gcc-6.3.0 20170516",
            "ENABLE_FTS3",
            "ENABLE_FTS4",
            "ENABLE_FTS5",
            "ENABLE_JSON1",
            "ENABLE_RTREE",
            "THREADSAFE=1"
        ],
        "version": "3.37.0"
    }
}
```

/-/plugins

Shows a list of currently installed plugins and their versions. Plugins example:

```json
[
    {
        "name": "datasette_cluster_map",
        "static": true,
        "templates": false,
        "version": "0.10",
        "hooks": ["extra_css_urls", "extra_js_urls", "extra_body_script"]
    }
]
```

Add ?all=1 to include details of the default plugins baked into Datasette.

/-/settings

Shows the Settings for this instance of Datasette. Settings example:

```json
{
    "default_facet_size": 30,
    "default_page_size": 100,
    "facet_suggest_time_limit_ms": 50,
    "facet_time_limit_ms": 1000,
    "max_returned_rows": 1000,
    "sql_time_limit_ms": 1000
}
```

/-/databases

Shows currently attached databases. Databases example:

```json
[
    {
        "hash": null,
        "is_memory": false,
        "is_mutable": true,
        "name": "fixtures",
        "path": "fixtures.db",
        "size": 225280
    }
]
```

/-/threads

Shows details of threads and asyncio tasks. Threads example:

```json
{
    "num_threads": 2,
    "threads": [
        {"daemon": false, "ident": 4759197120, "name": "MainThread"},
        {"daemon": true, "ident": 123145319682048, "name": "Thread-1"}
    ],
    "num_tasks": 3,
    "tasks": [
        "<Task pending coro=<RequestResponseCycle.run_asgi() running at uvicorn/protocols/http/httptools_impl.py:385> cb=[set.discard()]>",
        "<Task pending coro=<Server.serve() running at uvicorn/main.py:361> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x10365c3d0>()]> cb=[run_until_complete.<locals>.<lambda>()]>",
        "<Task pending coro=<LifespanOn.main() running at uvicorn/lifespan/on.py:48> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x10364f050>()]>>"
    ]
}
```

/-/actor

Shows the currently authenticated actor. Useful for debugging Datasette authentication plugins.

```json
{
    "actor": {
        "id": 1,
        "username": "some-user"
    }
}
```

/-/messages

The debug tool at /-/messages can be used to set flash messages to try out that feature. See .add_message(request, message, type=datasette.INFO) for details of this feature.

Running SQL queries

Datasette treats SQLite database files as read-only and immutable. This means it is not possible to execute INSERT or UPDATE statements using Datasette, which allows us to expose SELECT statements to the outside world without needing to worry about SQL injection attacks.

The easiest way to execute custom SQL against Datasette is through the web UI. The database index page includes a SQL editor that lets you run any SELECT query you like. You can also construct queries using the filter interface on the tables page, then click "View and edit SQL" to open that query in the custom SQL editor. Note that this interface is only available if the execute-sql permission is allowed.

Any Datasette SQL query is reflected in the URL of the page, allowing you to bookmark them, share them with others and navigate through previous queries using your browser back button. You can also retrieve the results of any query as JSON by adding .json to the base URL.

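To retrieve those JSON results programmatically - a minimal sketch against the public demo instance, using the .json extension mentioned above and the _shape=array option for a plain array of rows:

```python
import json
import urllib.request

# Run a SQL query against the fixtures database and read it as JSON.
url = (
    "https://latest.datasette.io/fixtures.json"
    "?sql=select+sqlite_version()&_shape=array"
)
with urllib.request.urlopen(url) as response:
    rows = json.load(response)

print(rows)  # e.g. [{"sqlite_version()": "3.39.4"}]
```
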
Named parameters

Datasette has special support for SQLite named parameters. Consider a SQL query like this:

```sql
select * from Street_Tree_List
where "PermitNotes" like :notes
and "qSpecies" = :species
```

If you execute this query using the custom query editor, Datasette will extract the two named parameters and use them to construct form fields for you to provide values. You can also provide values for these fields by constructing a URL:

```
/mydatabase?sql=select...&species=44
```

SQLite string escaping rules will be applied to values passed using named parameters - they will be wrapped in quotes and their content will be correctly escaped.

Values from named parameters are treated as SQLite strings. If you need to perform numeric comparisons on them you should cast them to an integer or float first using cast(:name as integer) or cast(:name as real), for example:

```sql
select * from Street_Tree_List
where latitude > cast(:min_latitude as real)
and latitude < cast(:max_latitude as real)
```

Datasette disallows custom SQL queries containing the string PRAGMA (with a small number of exceptions) as SQLite pragma statements can be used to change database settings at runtime. If you need to include the string "pragma" in a query you can do so safely using a named parameter.

Views

If you want to bundle some pre-written SQL queries with your Datasette-hosted database you can do so in two ways. The first is to include SQL views in your database - Datasette will then list those views on your database index page.

The quickest way to create views is with the SQLite command-line interface:

```
$ sqlite3 sf-trees.db
SQLite version 3.19.3 2017-06-27 16:48:08
Enter ".help" for usage hints.
sqlite> CREATE VIEW demo_view AS select qSpecies from Street_Tree_List;
<CTRL+D>
```

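If you'd rather not use the interactive shell, the same view can be created from Python - a sketch using the standard library's sqlite3 module and the CREATE VIEW statement from above:

```python
import sqlite3

conn = sqlite3.connect("sf-trees.db")
# Same statement as the CLI session above; views are stored in the
# database file itself, so Datasette will list this one on startup.
conn.execute(
    "CREATE VIEW demo_view AS select qSpecies from Street_Tree_List"
)
conn.commit()
conn.close()
```
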
Canned queries

As an alternative to adding views to your database, you can define canned queries inside your metadata.json file. Here's an example:

```json
{
    "databases": {
        "sf-trees": {
            "queries": {
                "just_species": {
                    "sql": "select qSpecies from Street_Tree_List"
                }
            }
        }
    }
}
```

Then run Datasette like this:

```
datasette sf-trees.db -m metadata.json
```

Each canned query will be listed on the database index page, and will also get its own URL at:

```
/database-name/canned-query-name
```

For the above example, that URL would be:

```
/sf-trees/just_species
```

You can optionally include "title" and "description" keys to show a title and description on the canned query page. As with regular table metadata you can alternatively specify "description_html" to have your description rendered as HTML (rather than having HTML special characters escaped).

Canned query parameters

Canned queries support named parameters, so if you include those in the SQL you will then be able to enter them using the form fields on the canned query page or by adding them to the URL. This means canned queries can be used to create custom JSON APIs based on a carefully designed SQL statement.

Here's an example of a canned query with a named parameter:

```sql
select neighborhood, facet_cities.name, state
from facetable
  join facet_cities on facetable.city_id = facet_cities.id
where neighborhood like '%' || :text || '%'
order by neighborhood;
```

In the canned query metadata (here Using YAML for metadata as metadata.yaml) it looks like this:

```yaml
databases:
  fixtures:
    queries:
      neighborhood_search:
        sql: |-
          select neighborhood, facet_cities.name, state
          from facetable
            join facet_cities on facetable.city_id = facet_cities.id
          where neighborhood like '%' || :text || '%'
          order by neighborhood
        title: Search neighborhoods
```

Here's the equivalent using JSON (as metadata.json):

```json
{
    "databases": {
        "fixtures": {
            "queries": {
                "neighborhood_search": {
                    "sql": "select neighborhood, facet_cities.name, state\nfrom facetable\n  join facet_cities on facetable.city_id = facet_cities.id\nwhere neighborhood like '%' || :text || '%'\norder by neighborhood",
                    "title": "Search neighborhoods"
                }
            }
        }
    }
}
```

Note that we are using SQLite string concatenation here - the || operator - to add wildcard % characters to the string provided by the user. You can try this canned query out here: https://latest.datasette.io/fixtures/neighborhood_search?text=town

In this example the :text named parameter is automatically extracted from the query using a regular expression. …

Additional canned query options

Additional options can be specified for canned queries in the YAML or JSON configuration.

hide_sql

Canned queries default to displaying their SQL query at the top of the page. If the query is extremely long you may want to hide it by default, with a "show" link that can be used to make it visible. Add the "hide_sql": true option to hide the SQL query by default.

fragment

Some plugins, such as datasette-vega, can be configured by including additional data in the fragment hash of the URL - the bit that comes after a # symbol. You can set a default fragment hash that will be included in the link to the canned query from the database index page using the "fragment" key. This example demonstrates both fragment and hide_sql:

```json
{
    "databases": {
        "fixtures": {
            "queries": {
                "neighborhood_search": {
                    "sql": "select neighborhood, facet_cities.name, state\nfrom facetable join facet_cities on facetable.city_id = facet_cities.id\nwhere neighborhood like '%' || :text || '%' order by neighborhood;",
                    "fragment": "fragment-goes-here",
                    "hide_sql": true
                }
            }
        }
    }
}
```

See here for a demo of this in action.

Writable canned queries

Canned queries by default are read-only. You can use the "write": true key to indicate that a canned query can write to the database. See Controlling access to specific canned queries for details on how to add permission checks to canned queries, using the "allow" key.

```json
{
    "databases": {
        "mydatabase": {
            "queries": {
                "add_name": {
                    "sql": "INSERT INTO names (name) VALUES (:name)",
                    "write": true
                }
            }
        }
    }
}
```

This configuration will create a page at /mydatabase/add_name displaying a form with a name field. Submitting that form will execute the configured INSERT query.

You can customize how Datasette represents success and errors using the following optional properties:

- on_success_message - the message shown when a query is successful
- on_success_redirect - the path or URL the user is redirected to on success
- on_error_message - the message shown when a query throws an error
- on_error_redirect - the path or URL the user is redirected to on error

For example:

```json
{
    "databases": {
        "mydatabase": {
            "queries": {
                "add_name": {
                    "sql": "INSERT INTO names (name) VALUES (:name)",
                    "write": true,
                    "on_success_message": "Name inserted",
                    "on_success_redirect": "/mydatabase/names",
                    "on_error_message": "Name insert failed",
                    "on_error_redirect": "/mydatabase"
                }
            }
        }
    }
}
```

You can use "p…

Magic parameters

Named parameters that start with an underscore are special: they can be used to automatically add values created by Datasette that are not contained in the incoming form fields or query string. These magic parameters are only supported for canned queries: to avoid security issues (such as queries that extract the user's private cookies) they are not available to SQL that is executed by the user as a custom SQL query.

Available magic parameters are (a sketch of their use follows this list):

- _actor_* - e.g. _actor_id, _actor_name: fields from the currently authenticated Actors.
- _header_* - e.g. _header_user_agent: header from the incoming HTTP request. The key should be in lower case and with hyphens converted to underscores e.g. _header_user_agent or _header_accept_language.
- _cookie_* - e.g. _cookie_lang: the value of the incoming cookie of that name.
- _now_epoch: the number of seconds since the Unix epoch.
- _now_date_utc: the date in UTC, e.g. 2020-06-01
- _now_datetime_utc: the ISO 8601 datetime in UTC, e.g. 2020-06-24T18:01:07Z
- _random_chars_* - e.g. …

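Putting one of these to work - a minimal sketch of a writable canned query (the mydatabase/notes schema and add_note name are hypothetical) that stamps each inserted row with the authenticated actor's id via the :_actor_id magic parameter:

```yaml
databases:
  mydatabase:
    queries:
      add_note:
        sql: |-
          INSERT INTO notes (body, created_by)
          VALUES (:body, :_actor_id)
        write: true
```
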
JSON API for writable canned queries

Writable canned queries can also be accessed using a JSON API. You can POST data to them using JSON, and you can request that their response is returned to you as JSON.

To submit JSON to a writable canned query, encode key/value parameters as a JSON document:

```
POST /mydatabase/add_message

{"message": "Message goes here"}
```

You can also continue to submit data using regular form encoding, like so:

```
POST /mydatabase/add_message

message=Message+goes+here
```

There are three options for specifying that you would like the response to your request to return JSON data, as opposed to an HTTP redirect to another page.

- Set an Accept: application/json header on your request
- Include ?_json=1 in the URL that you POST to
- Include "_json": 1 in your JSON body, or &_json=1 in your form encoded body

The JSON response will look like this:

```json
{
    "ok": true,
    "message": "Query executed, 1 row affected",
    "redirect": "/data/add_name"
}
```

The "message" and "redirect" values here will take into account on_success_message, on_success_redirect, on_error_message and on_error_redirect, if they have been set.

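A minimal sketch of calling such an API from Python's standard library, re-using the add_message example and the documented "_json": 1 option; this assumes no cookies are being sent with the request (if your client carries authentication cookies, Datasette's CSRF protection may also apply):

```python
import json
import urllib.request

payload = {"message": "Message goes here", "_json": 1}
req = urllib.request.Request(
    "http://127.0.0.1:8001/mydatabase/add_message",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as response:
    print(json.load(response))
    # e.g. {"ok": True, "message": "Query executed, 1 row affected", ...}
```
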
Pagination

Datasette's default table pagination is designed to be extremely efficient. SQL OFFSET/LIMIT pagination can have a significant performance penalty once you get into multiple thousands of rows, as each page still requires the database to scan through every preceding row to find the correct offset.

When paginating through tables, Datasette instead orders the rows in the table by their primary key and performs a WHERE clause against the last seen primary key for the previous page. For example:

```sql
select rowid, * from Tree_List
where rowid > 200
order by rowid limit 101
```

This represents page three for this particular table, with a page size of 100.

Note that we request 101 items in the limit clause rather than 100. This allows us to detect if we are on the last page of the results: if the query returns fewer than 101 rows we know we have reached the end of the pagination set. Datasette will only return the first 100 rows - the 101st is used purely to detect if there should be another page.

Since the where clause acts against the index on the primary key, the query is extremely fast even for records that are a long way into the overall pagination set.

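The same technique is easy to reproduce directly against SQLite - a sketch of keyset pagination in Python, using the query shape from the example above (the data.db filename and Tree_List table are taken from that example):

```python
import sqlite3

conn = sqlite3.connect("data.db")
page_size = 100
last_seen_rowid = 0

while True:
    # Fetch one extra row so we can detect whether another page exists.
    rows = conn.execute(
        "select rowid, * from Tree_List where rowid > ? "
        "order by rowid limit ?",
        (last_seen_rowid, page_size + 1),
    ).fetchall()
    page, has_more = rows[:page_size], len(rows) > page_size
    if not page:
        break
    last_seen_rowid = page[-1][0]  # the keyset for the next page
    # ... process `page` here ...
    if not has_more:
        break
```
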
Cross-database queries

SQLite has the ability to run queries that join across multiple databases. Up to ten databases can be attached to a single SQLite connection and queried together. Datasette can execute joins across multiple databases if it is started with the --crossdb option:

```
datasette fixtures.db extra_database.db --crossdb
```

If it is started in this way, the /_memory page can be used to execute queries that join across multiple databases.

References to tables in attached databases should be preceded by the database name and a period. For example, this query will show a list of tables across both of the above databases:

```sql
select 'fixtures' as database, * from [fixtures].sqlite_master
union
select 'extra_database' as database, * from [extra_database].sqlite_master
```

Try that out here.

CSV export

Any Datasette table, view or custom SQL query can be exported as CSV. To obtain the CSV representation of the table you are looking at, click the "this data as CSV" link.

You can also use the advanced export form for more control over the resulting file, which looks like this and has the following options:

- download file - instead of displaying CSV in your browser, this forces your browser to download the CSV to your downloads directory.
- expand labels - if your table has any foreign key references this option will cause the CSV to gain additional COLUMN_NAME_label columns with a label for each foreign key derived from the linked table. In this example the city_id column is accompanied by a city_id_label column.
- stream all rows - by default CSV files only contain the first max_returned_rows records. This option will cause Datasette to loop through every matching record and return them as a single CSV file.

You can try that out on https://latest.datasette.io/fixtures/facetable?_size=4

URL parameters

The following options can be used to customize the CSVs returned by Datasette.

- ?_header=off - removes the first row of the CSV file specifying the headings; only the row data will be returned.
- ?_stream=on - stream all matching records, not just the first page of results. See below.
- ?_dl=on - causes Datasette to return a content-disposition: attachment; filename="filename.csv" header.

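These options can be combined - a sketch against the public demo instance, streaming every matching row of the fixtures facetable table with no heading row and saving the result locally:

```bash
curl -o facetable.csv \
  "https://latest.datasette.io/fixtures/facetable.csv?_stream=on&_header=off"
```
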
Streaming all records

The stream all rows option is designed to be as efficient as possible - under the hood it takes advantage of Python 3 asyncio capabilities and Datasette's efficient pagination to stream back the full CSV file.

Since databases can get pretty large, by default this option is capped at 100MB - if a table returns more than 100MB of data the last line of the CSV will be a truncation error message. You can increase or remove this limit using the max_csv_mb config setting. You can also disable the CSV export feature entirely using allow_csv_stream.

Performance and caching

Datasette runs on top of SQLite, and SQLite has excellent performance. For small databases almost any query should return in just a few milliseconds, and larger databases (100s of MBs or even GBs of data) should perform extremely well provided your queries make sensible use of database indexes. That said, there are a number of tricks you can use to improve Datasette's performance.

Immutable mode

If you can be certain that a SQLite database file will not be changed by another process you can tell Datasette to open that file in immutable mode. Doing so will disable all locking and change detection, which can result in improved query performance. This also enables further optimizations relating to HTTP caching, described below. To open a file in immutable mode pass it to the datasette command using the -i option:

```
datasette -i data.db
```

When you open a file in immutable mode like this Datasette will also calculate and cache the row counts for each table in that database when it first starts up, further improving performance.

Using "datasette inspect"

Counting the rows in a table can be a very expensive operation on larger databases. In immutable mode Datasette performs this count only once and caches the results, but this can still cause server startup time to increase by several seconds or more.

If you know that a database is never going to change you can precalculate the table row counts once and store them in a JSON file, then use that file when you later start the server.

To create a JSON file containing the calculated row counts for a database, use the following:

```
datasette inspect data.db --inspect-file=counts.json
```

Then later you can start Datasette against the counts.json file and use it to skip the row counting step and speed up server startup:

```
datasette -i data.db --inspect-file=counts.json
```

You need to use the -i immutable mode against the database file here or the counts from the JSON file will be ignored.

You will rarely need to use this optimization in every-day use, but several of the datasette publish commands described in Publishing data use this optimization for better performance when deploying a database file to a hosting provider.

HTTP caching

If your database is immutable and guaranteed not to change, you can gain major performance improvements from Datasette by enabling HTTP caching. This can work at two different levels. First, it can tell browsers to cache the results of queries and serve future requests from the browser cache. More significantly, it allows you to run Datasette behind a caching proxy such as Varnish or use a cache provided by a hosted service such as Fastly or Cloudflare. This can provide incredible speed-ups since a query only needs to be executed by Datasette the first time it is accessed - all subsequent hits can then be served by the cache.

Using a caching proxy in this way could enable a Datasette-backed visualization to serve thousands of hits a second while running Datasette itself on extremely inexpensive hosting.

Datasette's integration with HTTP caches can be enabled using a combination of configuration options and query string arguments. The default_cache_ttl setting sets the default HTTP cache TTL for all Datasette pages. This is 5 seconds unless you change it - you can set it to 0 if you wish to disable HTTP caching entirely.

You can also change the cache timeout on a per-request basis using the ?_ttl=10 query string parameter. This can be useful when you are working with the Datasette JSON API - you may decide that a specific query can be cached for a longer time, or maybe you need to set ?_ttl=0 for some requests, for example if you are running a SQL order by random() query.

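A quick way to see these headers in action - a sketch against the public demo instance, assuming curl is available (the exact max-age value reflects that instance's default_cache_ttl):

```bash
# Default TTL for this instance:
curl -s -D - -o /dev/null \
  "https://latest.datasette.io/fixtures.json?sql=select+1" \
  | grep -i cache-control

# Override the TTL for this single request:
curl -s -D - -o /dev/null \
  "https://latest.datasette.io/fixtures.json?sql=select+1&_ttl=60" \
  | grep -i cache-control
```
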
datasette-hashed-urls

If you open a database file in immutable mode using the -i option, you can be assured that the content of that database will not change for the lifetime of the Datasette server.

The datasette-hashed-urls plugin implements an optimization where your database is served with part of the SHA-256 hash of the database contents baked into the URL. A database at /fixtures will instead be served at /fixtures-aa7318b, and a year-long cache expiry header will be returned with those pages. This will then be cached by both browsers and caching proxies such as Cloudflare or Fastly, providing a potentially significant performance boost.

To install the plugin, run the following:

```
datasette install datasette-hashed-urls
```

Prior to Datasette 0.61 hashed URL mode was a core Datasette feature, enabled using the hash_urls setting. This implementation has now been removed in favor of the datasette-hashed-urls plugin.

Prior to Datasette 0.28 hashed URL mode was the default behaviour for Datasette, since all database files were assumed to be immutable and unchanging. From 0.28 onwards the default has been to treat database files as mutable unless explicitly configured otherwise.

Datasette

An open source multi-tool for exploring and publishing data

Datasette is a tool for exploring and publishing data. It helps people take data of any shape or size and publish that as an interactive, explorable website and accompanying API.

Datasette is aimed at data journalists, museum curators, archivists, local governments and anyone else who has data that they wish to share with the world. It is part of a wider ecosystem of tools and plugins dedicated to making working with structured data as productive as possible.

Explore a demo, watch a presentation about the project or try Datasette without installing anything using Glitch.

Interested in learning Datasette? Start with the official tutorials.

Support questions, feedback? Join our GitHub Discussions forum.

Contents

- Getting started: Play with a live demo, Follow a tutorial, Datasette in your browser with Datasette Lite, Try Datasette without installing anything using Glitch, Using Datasette on your own computer
- Installation: Basic installation, Datasette Desktop for Mac, Using Homebrew, Using pip, Advanced installation options, Using pipx, Using Docker, A note about extensions
- The Datasette Ecosystem: sqlite-utils, Dogsheep
- CLI reference: datasette --help, datasette serve, datasette --get, datasette serve --help-settings, datasette plugins, datasette install, datasette uninstall, datasette publish, datasette publish cloudrun, datasette publish heroku, datasette package, datasette inspect
- Pages and API endpoints: Top-level index, Database, Table, Row
- Publishing data: datasette publish, Publishing to Google Cloud Run, Publishing to Heroku, Publishing to Vercel, Publishing to Fly, Custom metadata and plugins, datasette package
- Deploying Datasette: Deployment fundamentals, Running Datasette using systemd, Running Datasette using OpenRC, Deploying using buildpacks, Running Datasette behind a proxy, Nginx proxy configuration, Apache proxy configuration
- JSON API: Different shapes, Pagination, Special JSON arguments, Table arguments, Column filter arguments, Special table arguments, Expanding foreign key references, Discovering the JSON for a page
- Running SQL queries: Named parameters, Views, Canned queries, Canned query parameters, Additional canned query options, Writable canned queries, Magic parameters, JSON API for writable canned queries, Pagination, Cross-database queries
- Authentication and permissions: Actors, Using the "root" actor, Permissions, Defining permissions with "allow" blocks, The /-/allow-debug tool, Configuring permissions in metadata.json, Controlling access to an instance, Controlling access to specific databases, Controlling access to specific tables and views, Controlling access to specific canned queries, Controlling the ability to execute arbitrary SQL, Checking permissions in plugins, actor_matches_allow(), The permissions debug tool, The ds_actor cookie, Including an expiry time, Th…

Changelog

0.64.4 (2023-09-21)

- Fix for a crashing bug caused by viewing the table page for a named in-memory database. (#2189)

0.64.3 (2023-04-27)

- Added pip and setuptools as explicit dependencies. This fixes a bug where Datasette could not be installed using Rye. (#2065)

0.64.2 (2023-03-08)

- Fixed a bug with datasette publish cloudrun where deploys all used the same Docker image tag. This was mostly inconsequential as the service is deployed as soon as the image has been pushed to the registry, but could result in the incorrect image being deployed if two different deploys for two separate services ran at exactly the same time. (#2036)