Iceberg tables cannot display partition information with SHOW PARTITIONS
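Summary:
- Iceberg tables cannot display partition information with SHOW PARTITIONS.
- Inserting into the Iceberg table was not faster than inserting into a regular Parquet table (338.963 seconds versus 329.275 seconds for the same source data).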
Startup command
spark-sql \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
  --conf spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog \
  --conf spark.sql.catalog.spark_catalog.type=hive
Iceberg table
CREATE TABLE iceberg.inventory (
  inv_item_sk INT,
  inv_warehouse_sk INT,
  inv_quantity_on_hand INT,
  inv_date_sk INT
) USING iceberg
PARTITIONED BY (inv_date_sk);

INSERT INTO iceberg.inventory SELECT * FROM tpcds_sf1000_withdecimal_withdate_withnulls.inventory;

Insert statement execution time: Time taken: 338.963 seconds
SHOW PARTITIONS returns no partition information for the Iceberg-format table
show partitions spark_catalog.iceberg.inventory;
[INVALID_PARTITION_OPERATION.PARTITION_MANAGEMENT_IS_UNSUPPORTED] The partition command is invalid. Table `spark_catalog`.`iceberg`.`inventory` does not support partition management.; line 1 pos 16;
ShowPartitions [partition#161]
+- ResolvedTable org.apache.iceberg.spark.SparkSessionCatalog@2af1bf5a, iceberg.inventory, spark_catalog.iceberg.inventory, [inv_item_sk#162, inv_warehouse_sk#163, inv_quantity_on_hand#164, inv_date_sk#165]
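
Since SHOW PARTITIONS is rejected, one possible alternative is Iceberg's `partitions` metadata table, which is addressed by appending `.partitions` to the table name. The sketch below assumes the same session catalog configuration as the startup command above and has not been run against this table; the set of columns returned depends on the Iceberg version.

```sql
-- Sketch: list partitions through Iceberg's "partitions" metadata table
-- instead of SHOW PARTITIONS. Assumes the spark_catalog setup shown above.
SELECT * FROM spark_catalog.iceberg.inventory.partitions;

-- Narrower variant: one row per partition value with row and file counts.
-- (partition, record_count, file_count are columns of the metadata table.)
SELECT partition, record_count, file_count
FROM spark_catalog.iceberg.inventory.partitions;
```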
Parquet format
CREATE TABLE iceberg.inventory_parquet (
  inv_item_sk INT,
  inv_warehouse_sk INT,
  inv_quantity_on_hand INT,
  inv_date_sk INT
) USING parquet
PARTITIONED BY (inv_date_sk);

INSERT INTO iceberg.inventory_parquet SELECT * FROM tpcds_sf1000_withdecimal_withdate_withnulls.inventory;

Insert statement execution time: Time taken: 329.275 seconds
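
To put the two timings side by side, the relative difference can be worked out directly in the same spark-sql session. This is only arithmetic on the two figures reported above; each insert was run once here, so the gap may well be within run-to-run variation.

```sql
-- Relative slowdown of the Iceberg insert versus the Parquet insert,
-- using the two "Time taken" values reported above (in seconds).
SELECT round((338.963 - 329.275) / 329.275 * 100, 2) AS iceberg_slowdown_pct;
-- yields 2.94, i.e. the Iceberg insert took about 3% longer
```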