Aug 27, 2020

Selective data deletion: a new feature for data quality management

Selective data deletion lets you delete rows of a Tinybird datasource that match a specified delete condition. This is important for an API-first platform.
Jorge Sancha
Co-founder & CEO

Data deletion operations are pretty common in transactional databases, where your operational data lives. Often, because of a data quality process in your operational database, you will also need to update or delete your analytical data in Tinybird.

We have already talked about how to update your analytical data selectively, but there are other times when you need to delete certain data entirely to keep providing reliable enterprise analytics.

Selective data deletion is pretty common in data reconciliation processes, like making your real-time analyses GDPR-compliant.

Whether the cause is a bug in one of the applications ingesting your operational data, a transient error while operating the production database, or a change in regulation, you might need to delete unneeded data that is influencing your analysis in Tinybird.

A new API endpoint for selective data deletion

Selective data deletion allows you to delete rows of a Tinybird datasource that match a specified delete condition. For an API-first platform like Tinybird, this operation translates into a secured API endpoint that developers can easily integrate into their real-time data quality management flows.

How to delete data selectively in Tinybird

In Tinybird, data is organized in Datasources. Whether you have a CSV file stored locally or remotely accessible via HTTP(S), you can seamlessly ingest it into a datasource and start analyzing it, then build and publish real-time API endpoints.

Data deletion works by sending a POST request to the delete API endpoint, providing the name of one of your datasources in Tinybird and a delete_condition parameter, which is an SQL expression filter.

Let’s say we want to delete all the rows from a transactions datasource for the country ES. We’d send a POST request to the delete endpoint like this:
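As a minimal sketch in Python, the request could be built as shown below. The host (api.tinybird.co), the URL layout (/v0/datasources/<name>/delete), the Bearer authorization header and the column name country are assumptions made for illustration; check the API docs for the exact values in your account. The helper build_delete_request is hypothetical and only constructs the request, it does not send it.

```python
import urllib.parse
import urllib.request

# Assumed values, not stated in this post: API host and URL layout.
API_HOST = "https://api.tinybird.co"


def build_delete_request(datasource: str, delete_condition: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for a selective delete."""
    url = f"{API_HOST}/v0/datasources/{urllib.parse.quote(datasource)}/delete"
    # delete_condition is an SQL expression filter, sent as a form field.
    body = urllib.parse.urlencode({"delete_condition": delete_condition}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )


if __name__ == "__main__":
    # "country" is a hypothetical column of the transactions datasource.
    req = build_delete_request("transactions", "country = 'ES'", "<your token>")
    print(req.get_method(), req.full_url)
    print(req.data.decode())
    # To actually run the deletion, send the request and read the Job response:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode())
```

Building the request separately from sending it makes it easy to inspect the URL and the encoded delete_condition before anything is deleted.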

The auth token used needs to have the DATASOURCES:CREATE scope. That way, your data is protected from applications and users that only have read access to it.

The POST request to the delete API endpoint is asynchronous. It returns a Job response, indicating an ID for the job, the status of the job, the delete_condition and some other metadata.

Although the delete operation runs asynchronously (hence the job response), the job itself waits synchronously for all the mutations to be rewritten and for the data to be deleted from the replicas.

You can periodically poll the job_url with the given ID to check the status of the deletion process. When the job is done, the data matching the SQL expression filter has been removed, and all your pipes and API endpoints will continue running with the remaining data in the datasource.
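The polling loop can be sketched in Python as follows. This is an illustration under assumptions: the Job response is taken to be JSON with a status field, "done" and "error" are assumed to be the terminal status values, and the Bearer authorization header is assumed. The helpers wait_for_job and http_fetcher are hypothetical names, not part of any Tinybird client.

```python
import json
import time
import urllib.request
from typing import Callable, Dict

# Assumed terminal values of the job's "status" field.
TERMINAL_STATUSES = ("done", "error")


def wait_for_job(fetch_job: Callable[[], Dict], interval_s: float = 2.0, max_polls: int = 30) -> Dict:
    """Call fetch_job() until the job reports a terminal status, then return it."""
    for _ in range(max_polls):
        job = fetch_job()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        time.sleep(interval_s)  # wait before asking again
    raise TimeoutError("job did not finish within the polling budget")


def http_fetcher(job_url: str, token: str) -> Callable[[], Dict]:
    """Return a function that GETs job_url and parses the JSON body."""
    def fetch() -> Dict:
        req = urllib.request.Request(job_url, headers={"Authorization": f"Bearer {token}"})
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read().decode("utf-8"))
    return fetch


# Usage, with the job_url taken from the delete response:
# job = wait_for_job(http_fetcher(job_url, "<your token>"))
# print(job["status"])
```

Passing the fetch function in as an argument keeps the loop independent of the HTTP layer, so it can be exercised with a stub before pointing it at a real job.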

Beyond data deletion in data quality management processes

While real-time analytical databases are optimized for SELECTs and INSERTs, we keep fully supporting the other operations needed in data quality management processes. We do that by hiding the complexity of data replication, partition management and mutation rewriting, so you only have to worry about your data engineering flows and not the internals of real-time analytical databases.

We recommend you check our API docs for more information on how to delete data selectively.

What are your main challenges when dealing with large quantities of data? Request access to Tinybird and get started with real-time analytics right away.

