PostgreSQL插件TimescaleDB

网友投稿 1036 2022-05-30

TimescaleDB

TimescaleDB是一个开源的postgresql的时序数据库扩展。基于PostgreSQL的强大能力,TimescaleDB能够提供跨时间和空间的自动分区以及完整的SQL支持。

TimescaleDB既像传统的关系型数据库一样支持全量SQL,又像NoSQL型数据库一样可扩展。它具有如下的优势:

简单易用

PostgreSQL插件之TimescaleDB

PostgreSQL原生支持的所有的SQL接口(包括二级索引,基于非时间的聚合,子查询,JOIN,窗口函数)

无需更改任何连接和使用PostgreSQL的客户端和工具

面向时间的功能、API函数以及优化

强大的数据保存策略的支持

扩展性强

透明的时间/空间分区功能,可进行向上和向外扩展

高数据写入率(包括批处理提交,内存索引,事务支持,对数据回填的支持)

单个节点上大小合适的块(二维数据分区)可确保即使在大数据量下也能快速提取

跨块和服务器的并行操作

可靠性高

基于可靠的PostgreSQL

受益于PostgreSQL 20多年研究(包括流复制,备份)的基础

灵活的管理选项(与现有的PostgreSQL生态系统和工具兼容)

更多关于TimescaleDB的信息,请参考[官方文档](https://docs.timescale.com/latest/main)

系统要求

PostgreSQL 9.6+

使用说明

创建插件

CREATE EXTENSION IF NOT EXISTS timescaledb;

删除插件

DROP EXTENSION IF EXISTS timescaledb;

使用示例1

1、创建普通测试表

postgres=# CREATE TABLE conditions ( postgres(# time TIMESTAMPTZ NOT NULL, postgres(# location TEXT NOT NULL, postgres(# temperature DOUBLE PRECISION NULL, postgres(# humidity DOUBLE PRECISION NULL postgres(# ); CREATE TABLE

2、基于time分区将上一步创建的普通表转换为超表

postgres=# SELECT create_hypertable('conditions', 'time'); create_hypertable ------------------------- (1,public,conditions,t) (1 row) postgres=#

关于超表概念以及如何创建超表,请参考超表和创建超表

3、插入数据并查询

postgres=# INSERT INTO conditions(time, location, temperature, humidity) SELECT now(), to_char(i, 'FM0000'), random()*i, random()*i FROM generate_series(1,10000) i; INSERT 0 10000 postgres=# SELECT * FROM conditions ORDER BY time DESC LIMIT 10; time | location | temperature | humidity -------------------------------+----------+--------------------+-------------------- 2021-02-04 14:25:05.861412+08 | 0001 | 0.8499118449648186 | 0.7081022857469108 2021-02-04 14:25:05.861412+08 | 0002 | 1.032869253930933 | 0.6919497591141521 2021-02-04 14:25:05.861412+08 | 0003 | 2.181961480555376 | 2.798874636756544 2021-02-04 14:25:05.861412+08 | 0004 | 2.0593786791351505 | 3.1111923049135726 2021-02-04 14:25:05.861412+08 | 0005 | 4.631874193357941 | 1.152362876862405 2021-02-04 14:25:05.861412+08 | 0006 | 3.1238939110304145 | 3.5332473749587052 2021-02-04 14:25:05.861412+08 | 0007 | 3.0055947001481975 | 2.681226072554999 2021-02-04 14:25:05.861412+08 | 0008 | 5.97665218304374 | 4.776195580392255 2021-02-04 14:25:05.861412+08 | 0009 | 0.4395944889294263 | 3.79820483119601 2021-02-04 14:25:05.861412+08 | 0010 | 5.213994089740517 | 6.613006901233973 (10 rows) ---针对过去3小时的数据,每15分钟采集度量一次,按照时间和温度降序排序 postgres=# SELECT time_bucket('15 minutes', time) AS fifteen_min, postgres-# location, COUNT(*), postgres-# MAX(temperature) AS max_temp, postgres-# MAX(humidity) AS max_hum postgres-# FROM conditions postgres-# WHERE time > NOW() - interval '3 hours' postgres-# GROUP BY fifteen_min, location postgres-# ORDER BY fifteen_min DESC, max_temp DESC; fifteen_min | location | count | max_temp | max_hum ------------------------+----------+-------+----------------------+------------------------ 2021-02-04 14:15:00+08 | 9974 | 1 | 9959.154515189259 | 8393.882609488966 2021-02-04 14:15:00+08 | 9963 | 1 | 9893.781627907236 | 7968.817036613598 2021-02-04 14:15:00+08 | 9852 | 1 | 9831.658490260146 | 4594.882403274832 2021-02-04 14:15:00+08 | 9962 | 1 | 9668.102999374136 | 9060.332750304518 2021-02-04 14:15:00+08 | 9807 | 1 | 9632.48883539376 | 6923.888752408443 2021-02-04 14:15:00+08 | 9627 | 1 | 9613.157259692176 | 9241.248136652985 2021-02-04 14:15:00+08 | 9800 | 1 | 9590.790872497493 | 2464.9034796248 2021-02-04 14:15:00+08 | 9600 | 1 | 9537.116866999975 | 3404.4305571978384 2021-02-04 14:15:00+08 | 9954 | 1 | 9523.032966159419 | 1555.249976678025 2021-02-04 14:15:00+08 | 9908 | 1 | 9499.432451454542 | 1110.053567264063 2021-02-04 14:15:00+08 | 9806 | 1 | 9481.471221781818 | 8187.133268036299 2021-02-04 14:15:00+08 | 9552 | 1 | 9378.450791195617 | 5158.115086741863 2021-02-04 14:15:00+08 | 9451 | 1 | 9371.378793857662 | 2626.3288166080547 2021-02-04 14:15:00+08 | 9654 | 1 | 9340.183911422057 | 8080.395366505775 2021-02-04 14:15:00+08 | 9419 | 1 | 9337.334558829742 | 1577.535230574912 2021-02-04 14:15:00+08 | 9929 | 1 | 9335.892011237425 | 7598.584466488536 2021-02-04 14:15:00+08 | 9803 | 1 | 9333.235219857552 | 6644.544463200405 2021-02-04 14:15:00+08 | 9899 | 1 | 9327.24793091667 | 1563.9061617953776 2021-02-04 14:15:00+08 | 9750 | 1 | 9327.22125169839 | 3425.3518007559906 2021-02-04 14:15:00+08 | 9914 | 1 | 9326.780445982797 | 7394.6374437081995 2021-02-04 14:15:00+08 | 9872 | 1 | 9302.263030958715 | 1101.1974349270733 2021-02-04 14:15:00+08 | 9581 | 1 | 9277.185015285479 | 6663.525557202216 2021-02-04 14:15:00+08 | 9708 | 1 | 9273.063941424782 | 3175.170956246176 2021-02-04 14:15:00+08 | 9848 | 1 | 9260.316420825136 | 3523.155521805003 2021-02-04 14:15:00+08 | 9632 | 1 | 9259.105259260195 | 227.1827826256217 2021-02-04 14:15:00+08 | 9466 | 1 | 9254.85035699184 | 2616.3724030114145 2021-02-04 14:15:00+08 | 9860 | 1 | 9247.49280995008 | 9379.583724363356 2021-02-04 14:15:00+08 | 9235 | 1 | 9226.944980400396 | 1061.4152797323275 ......

关于time_bucket()等TimescaleDB提供的API函数,请参考API Functions

使用示例2

该示例演示了如何分析纽约市出租车的运营情况,关于示例数据及更多信息介绍,请参考官方Tutorial

1、创建示例数据库及环境

postgres=# CREATE DATABASE nyc_data; CREATE DATABASE postgres=# \c nyc_data You are now connected to database "nyc_data" as user "pg12". nyc_data=# CREATE EXTENSION IF NOT EXISTS timescaledb;

2、创建普通测试表、超表及索引

nyc_data=# CREATE TABLE "rides"( nyc_data(# vendor_id TEXT, nyc_data(# pickup_datetime TIMESTAMP WITHOUT TIME ZONE NOT NULL, nyc_data(# dropoff_datetime TIMESTAMP WITHOUT TIME ZONE NOT NULL, nyc_data(# passenger_count NUMERIC, nyc_data(# trip_distance NUMERIC, nyc_data(# pickup_longitude NUMERIC, nyc_data(# pickup_latitude NUMERIC, nyc_data(# rate_code INTEGER, nyc_data(# dropoff_longitude NUMERIC, nyc_data(# dropoff_latitude NUMERIC, nyc_data(# payment_type INTEGER, nyc_data(# fare_amount NUMERIC, nyc_data(# extra NUMERIC, nyc_data(# mta_tax NUMERIC, nyc_data(# tip_amount NUMERIC, nyc_data(# tolls_amount NUMERIC, nyc_data(# improvement_surcharge NUMERIC, nyc_data(# total_amount NUMERIC nyc_data(# ); CREATE TABLE nyc_data=# SELECT create_hypertable('rides', 'pickup_datetime', 'payment_type', 2, create_default_indexes=>FALSE); create_hypertable -------------------- (1,public,rides,t) (1 row) nyc_data=# CREATE INDEX ON rides (vendor_id, pickup_datetime DESC); CREATE INDEX nyc_data=# CREATE INDEX ON rides (rate_code, pickup_datetime DESC); CREATE INDEX nyc_data=# CREATE INDEX ON rides (passenger_count, pickup_datetime DESC); CREATE INDEX

3、创建辅助表并插入数据

nyc_data=# CREATE TABLE IF NOT EXISTS "payment_types"( nyc_data(# payment_type INTEGER, nyc_data(# description TEXT nyc_data(# ); CREATE TABLE nyc_data=# INSERT INTO payment_types(payment_type, description) VALUES nyc_data-# (1, 'credit card'), nyc_data-# (2, 'cash'), nyc_data-# (3, 'no charge'), nyc_data-# (4, 'dispute'), nyc_data-# (5, 'unknown'), nyc_data-# (6, 'voided trip'); INSERT 0 6 nyc_data=# CREATE TABLE IF NOT EXISTS "rates"( nyc_data(# rate_code INTEGER, nyc_data(# description TEXT nyc_data(# ); CREATE TABLE nyc_data=# INSERT INTO rates(rate_code, description) VALUES nyc_data-# (1, 'standard rate'), nyc_data-# (2, 'JFK'), nyc_data-# (3, 'Newark'), nyc_data-# (4, 'Nassau or Westchester'), nyc_data-# (5, 'negotiated fare'), nyc_data-# (6, 'group ride'); INSERT 0 6

4、向测试表导入测试数据

nyc_data=# \COPY rides FROM nyc_data_rides.csv CSV; COPY 10906860

5、分析处理数据

--- 2016年1月每天都多少次打车行为 nyc_data=# SELECT date_trunc('day', pickup_datetime) as day, COUNT(*) FROM rides GROUP BY day ORDER BY day; day | count ---------------------+-------- 2016-01-01 00:00:00 | 345037 2016-01-02 00:00:00 | 312831 2016-01-03 00:00:00 | 302878 2016-01-04 00:00:00 | 316171 2016-01-05 00:00:00 | 343251 2016-01-06 00:00:00 | 348516 2016-01-07 00:00:00 | 364894 2016-01-08 00:00:00 | 392070 2016-01-09 00:00:00 | 405825 2016-01-10 00:00:00 | 351788 2016-01-11 00:00:00 | 342651 2016-01-12 00:00:00 | 367390 2016-01-13 00:00:00 | 395090 2016-01-14 00:00:00 | 396473 2016-01-15 00:00:00 | 401289 2016-01-16 00:00:00 | 411899 2016-01-17 00:00:00 | 379156 2016-01-18 00:00:00 | 341481 2016-01-19 00:00:00 | 385187 2016-01-20 00:00:00 | 382105 2016-01-21 00:00:00 | 399654 2016-01-22 00:00:00 | 420162 2016-01-23 00:00:00 | 78133 2016-01-24 00:00:00 | 159766 2016-01-25 00:00:00 | 282087 2016-01-26 00:00:00 | 327655 2016-01-27 00:00:00 | 359180 2016-01-28 00:00:00 | 383326 2016-01-29 00:00:00 | 414039 2016-01-30 00:00:00 | 435369 2016-01-31 00:00:00 | 361505 2017-11-17 00:00:00 | 2 (32 rows) ---前7天平均每个顾客的打车费是多少 nyc_data=# SELECT date_trunc('day', pickup_datetime) AS day, avg(fare_amount) nyc_data-# FROM rides WHERE passenger_count = 1 AND pickup_datetime < '2016-01-08' nyc_data-# GROUP BY day ORDER BY day; day | avg ---------------------+--------------------- 2016-01-01 00:00:00 | 12.5464748850129787 2016-01-02 00:00:00 | 12.1129878886746750 2016-01-03 00:00:00 | 12.8262352076841150 2016-01-04 00:00:00 | 11.9116533573721472 2016-01-05 00:00:00 | 11.7534235580737452 2016-01-06 00:00:00 | 11.7824805635293235 2016-01-07 00:00:00 | 11.9498961299166930 (7 rows) ---一个月内每种费率类型的次数是多少 nyc_data=# SELECT rate_code, COUNT(vendor_id) AS num_trips nyc_data-# FROM rides WHERE pickup_datetime < '2016-02-01' nyc_data-# GROUP BY rate_code ORDER BY rate_code; rate_code | num_trips -----------+----------- 1 | 10626315 2 | 225019 3 | 16822 4 | 4696 5 | 33688 6 | 102 99 | 216 (7 rows) ---2016年第一天每5分钟有多少次乘车? nyc_data=# SELECT time_bucket('5 minute', pickup_datetime) AS five_min, count(*) nyc_data-# FROM rides WHERE pickup_datetime < '2016-01-02 00:00' nyc_data-# GROUP BY five_min ORDER BY five_min; five_min | count ---------------------+------- 2016-01-01 00:00:00 | 703 2016-01-01 00:05:00 | 1482 2016-01-01 00:10:00 | 1959 2016-01-01 00:15:00 | 2200 2016-01-01 00:20:00 | 2285 2016-01-01 00:25:00 | 2291 2016-01-01 00:30:00 | 2349 2016-01-01 00:35:00 | 2328 2016-01-01 00:40:00 | 2440 2016-01-01 00:45:00 | 2372 2016-01-01 00:50:00 | 2388 2016-01-01 00:55:00 | 2473 2016-01-01 01:00:00 | 2395 2016-01-01 01:05:00 | 2510 2016-01-01 01:10:00 | 2412 2016-01-01 01:15:00 | 2482 2016-01-01 01:20:00 | 2428 2016-01-01 01:25:00 | 2433 2016-01-01 01:30:00 | 2337 2016-01-01 01:35:00 | 2366 2016-01-01 01:40:00 | 2325 2016-01-01 01:45:00 | 2257 2016-01-01 01:50:00 | 2316 2016-01-01 01:55:00 | 2250 2016-01-01 02:00:00 | 2303 2016-01-01 02:05:00 | 2259 2016-01-01 02:10:00 | 2253 2016-01-01 02:15:00 | 2176 2016-01-01 02:20:00 | 2138 ......

PostgreSQL 云数据库 PostgreSQL 数据库

版权声明:本文内容由网络用户投稿,版权归原作者所有,本站不拥有其著作权,亦不承担相应法律责任。如果您发现本站中有涉嫌抄袭或描述失实的内容,请联系我们jiasou666@gmail.com 处理,核实后本网站将在24小时内删除侵权内容。

上一篇:[转]Blender2.5快捷键
下一篇:excel表格中的折线图突出最高值有什么方法
相关文章