Spark SQL reports twice the actual data volume when counting a Hive table

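The table is stored in ORC format, and the Spark version is 2.2.
The query has the form: select count(*) from table
Before counting, I ran the command that collects table statistics, and the statistics it collected are also twice the actual size.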
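
For reference, the steps described above can be sketched roughly as follows. This is only an illustration: `my_db.my_table` is a placeholder name, and I am assuming the statistics were collected with Spark SQL's `ANALYZE TABLE ... COMPUTE STATISTICS`, which the description does not state explicitly.

```sql
-- Placeholder table name; substitute the real Hive table.
-- Step 1 (assumed command): collect table-level statistics.
ANALYZE TABLE my_db.my_table COMPUTE STATISTICS;

-- Step 2: inspect the collected statistics in the table metadata.
DESCRIBE EXTENDED my_db.my_table;

-- Step 3: count the rows directly.
SELECT count(*) FROM my_db.my_table;
```

Both the statistics from step 2 and the count from step 3 come out at twice the expected value.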
